diff --git a/configure.ac b/configure.ac index 0262bab96..e016db0b9 100644 --- a/configure.ac +++ b/configure.ac @@ -253,6 +253,7 @@ AC_CONFIG_FILES([ dep/src/sockets/Makefile dep/src/zlib/Makefile dep/Makefile + dep/tbb/Makefile doc/Doxyfile doc/Makefile Makefile diff --git a/dep/Makefile.am b/dep/Makefile.am index 2bd06d3a2..f2a9d7c95 100644 --- a/dep/Makefile.am +++ b/dep/Makefile.am @@ -23,5 +23,8 @@ if MANGOS_BUILD_ACE SUBDIRS += ACE_wrappers endif +# Intel's TBB +SUBDIRS += tbb + ## Additional files to include when running 'make dist' # Nothing yet. diff --git a/dep/tbb/CHANGES b/dep/tbb/CHANGES new file mode 100644 index 000000000..e6c71a98f --- /dev/null +++ b/dep/tbb/CHANGES @@ -0,0 +1,678 @@ +TBB 2.2 Update 1 commercial-aligned release + +Changes (w.r.t. TBB 2.2 commercial-aligned release): + +- Incorporates all changes from open-source releases below. +- Documentation was updated. +- TBB scheduler auto-initialization now covers all possible use cases. +- concurrent_queue: made argument types of sizeof used in paddings + consistent with those actually used. +- Memory allocator was improved: supported corner case of user's malloc + calling scalable_malloc (non-Windows), corrected processing of + memory allocation requests during tbb memory allocator startup + (Linux). +- Windows malloc replacement has got better support for static objects. +- In pipeline setups that do not allow actual parallelism, execution + by a single thread is guaranteed, idle spinning eliminated, and + performance improved. +- RML refactoring and clean-up. +- New constructor for concurrent_hash_map allows reserving space for + a number of items. +- Operator delete() added to the TBB exception classes. +- Lambda support was improved in parallel_reduce. +- gcc 4.3 warnings were fixed for concurrent_queue. +- Fixed possible initialization deadlock in modules using TBB entities + during construction of global static objects. +- Copy constructor in concurrent_hash_map was fixed. +- Fixed a couple of rare crashes in the scheduler possible before + in very specific use cases. +- Fixed a rare crash in the TBB allocator running out of memory. +- New tests were implemented, including test_lambda.cpp that checks + support for lambda expressions. +- A few other small changes in code, tests, and documentation. + +------------------------------------------------------------------------ +20090809 open-source release + +Changes (w.r.t. TBB 2.2 commercial-aligned release): + +- Fixed known exception safety issues in concurrent_vector. +- Better concurrency of simultaneous grow requests in concurrent_vector. +- TBB allocator further improves performance of large object allocation. +- Problem with source of text relocations was fixed on Linux +- Fixed bugs related to malloc replacement under Windows +- A few other small changes in code and documentation. + +------------------------------------------------------------------------ +TBB 2.2 commercial-aligned release + +Changes (w.r.t. TBB 2.1 U4 commercial-aligned release): + +- Incorporates all changes from open-source releases below. +- Architecture folders renamed from em64t to intel64 and from itanium + to ia64. +- Major Interface version changed from 3 to 4. Deprecated interfaces + might be removed in future releases. +- Parallel algorithms that use partitioners have switched to use + the auto_partitioner by default. +- Improved memory allocator performance for allocations bigger than 8K. +- Added new thread-bound filters functionality for pipeline. 
+- New implementation of concurrent_hash_map that improves performance + significantly. +- A few other small changes in code and documentation. + +------------------------------------------------------------------------ +20090511 open-source release + +Changes (w.r.t. previous open-source release): + +- Basic support for MinGW32 development kit. +- Added tbb::zero_allocator class that initializes memory with zeros. + It can be used as an adaptor to any STL-compatible allocator class. +- Added tbb::parallel_for_each template function as alias to parallel_do. +- Added more overloads for tbb::parallel_for. +- Added support for exact exception propagation (can only be used with + compilers that support C++0x std::exception_ptr). +- tbb::atomic template class can be used with enumerations. +- mutex, recursive_mutex, spin_mutex, spin_rw_mutex classes extended + with explicit lock/unlock methods. +- Fixed size() and grow_to_at_least() methods of tbb::concurrent_vector + to provide space allocation guarantees. More methods added for + compatibility with std::vector, including some from C++0x. +- Preview of a lambda-friendly interface for low-level use of tasks. +- scalable_msize function added to the scalable allocator (Windows only). +- Rationalized internal auxiliary functions for spin-waiting and backoff. +- Several tests undergo decent refactoring. + +Changes affecting backward compatibility: + +- Improvements in concurrent_queue, including limited API changes. + The previous version is deprecated; its functionality is accessible + via methods of the new tbb::concurrent_bounded_queue class. +- grow* and push_back methods of concurrent_vector changed to return + iterators; old semantics is deprecated. + +------------------------------------------------------------------------ +TBB 2.1 Update 4 commercial-aligned release + +Changes (w.r.t. TBB 2.1 U3 commercial-aligned release): + +- Added tests for aligned memory allocations and malloc replacement. +- Several improvements for better bundling with Intel(R) C++ Compiler. +- A few other small changes in code and documentaion. + +Bugs fixed: + +- 150 - request to build TBB examples with debug info in release mode. +- backward compatibility issue with concurrent_queue on Windows. +- dependency on VS 2005 SP1 runtime libraries removed. +- compilation of GUI examples under XCode* 3.1 (1577). +- On Windows, TBB allocator classes can be instantiated with const types + for compatibility with MS implementation of STL containers (1566). + +------------------------------------------------------------------------ +20090313 open-source release + +Changes (w.r.t. 20081109 open-source release): + +- Includes all changes introduced in TBB 2.1 Update 2 & Update 3 + commercial-aligned releases (see below for details). +- Added tbb::parallel_invoke template function. It runs up to 10 + user-defined functions in parallel and waits for them to complete. +- Added a special library providing ability to replace the standard + memory allocation routines in Microsoft* C/C++ RTL (malloc/free, + global new/delete, etc.) with the TBB memory allocator. + Usage details are described in include/tbb/tbbmalloc_proxy.h file. +- Task scheduler switched to use new implementation of its core + functionality (deque based task pool, new structure of arena slots). +- Preview of Microsoft* Visual Studio* 2005 project files for + building the library is available in build/vsproject folder. +- Added tests for aligned memory allocations and malloc replacement. 
+- Added parallel_for/game_of_life.net example (for Windows only) + showing TBB usage in a .NET application. +- A number of other fixes and improvements to code, tests, makefiles, + examples and documents. + +Bugs fixed: + +- The same list as in TBB 2.1 Update 4 right above. + +------------------------------------------------------------------------ +TBB 2.1 Update 3 commercial-aligned release + +Changes (w.r.t. TBB 2.1 U2 commercial-aligned release): + +- Added support for aligned allocations to the TBB memory allocator. +- Added a special library to use with LD_PRELOAD on Linux* in order to + replace the standard memory allocation routines in C/C++ with the + TBB memory allocator. +- Added null_mutex and null_rw_mutex: no-op classes interface-compliant + to other TBB mutexes. +- Improved performance of parallel_sort, to close most of the serial gap + with std::sort, and beat it on 2 and more cores. +- A few other small changes. + +Bugs fixed: + +- the problem where parallel_for hanged after exception throw + if affinity_partitioner was used (1556). +- get rid of VS warnings about mbstowcs deprecation (1560), + as well as some other warnings. +- operator== for concurrent_vector::iterator fixed to work correctly + with different vector instances. + +------------------------------------------------------------------------ +TBB 2.1 Update 2 commercial-aligned release + +Changes (w.r.t. TBB 2.1 U1 commercial-aligned release): + +- Incorporates all open-source-release changes down to TBB 2.1 U1, + except for: + - 20081019 addition of enumerable_thread_specific; +- Warning level for Microsoft* Visual C++* compiler raised to /W4 /Wp64; + warnings found on this level were cleaned or suppressed. +- Added TBB_runtime_interface_version API function. +- Added new example: pipeline/square. +- Added exception handling and cancellation support + for parallel_do and pipeline. +- Added copy constructor and [begin,end) constructor to concurrent_queue. +- Added some support for beta version of Intel(R) Parallel Amplifier. +- Added scripts to set environment for cross-compilation of 32-bit + applications on 64-bit Linux with Intel(R) C++ Compiler. +- Fixed semantics of concurrent_vector::clear() to not deallocate + internal arrays. Fixed compact() to perform such deallocation later. +- Fixed the issue with atomic when T is incomplete type. +- Improved support for PowerPC* Macintosh*, including the fix + for a bug in masked compare-and-swap reported by a customer. +- As usual, a number of other improvements everywhere. + +------------------------------------------------------------------------ +20081109 open-source release + +Changes (w.r.t. previous open-source release): + +- Added new serial out of order filter for tbb::pipeline. +- Fixed the issue with atomic::operator= reported at the forum. +- Fixed the issue with using tbb::task::self() in task destructor + reported at the forum. +- A number of other improvements to code, tests, makefiles, examples + and documents. + +Open-source contributions integrated: +- Changes in the memory allocator were partially integrated. + +------------------------------------------------------------------------ +20081019 open-source release + +Changes (w.r.t. previous open-source release): + +- Introduced enumerable_thread_specific. This new class provides a + wrapper around native thread local storage as well as iterators and + ranges for accessing the thread local copies (1533). +- Improved support for Intel(R) Threading Analysis Tools + on Intel(R) 64 architecture. 
+- Dependency from Microsoft* CRT was integrated to the libraries using + manifests, to avoid issues if called from code that uses different + version of Visual C++* runtime than the library. +- Introduced new defines TBB_USE_ASSERT, TBB_USE_DEBUG, + TBB_USE_PERFORMANCE_WARNINGS, TBB_USE_THREADING_TOOLS. +- A number of other improvements to code, tests, makefiles, examples + and documents. + +Open-source contributions integrated: + +- linker optimization: /incremental:no . + +------------------------------------------------------------------------ +20080925 open-source release + +Changes (w.r.t. previous open-source release): + +- Same fix for a memory leak in the memory allocator as in TBB 2.1 U1. +- Improved support for lambda functions. +- Fixed more concurrent_queue issues reported at the forum. +- A number of other improvements to code, tests, makefiles, examples + and documents. + +------------------------------------------------------------------------ +TBB 2.1 Update 1 commercial-aligned release + +Changes (w.r.t. TBB 2.1 Gold commercial-aligned release): + +- Fixed small memory leak in the memory allocator. +- Incorporates all open-source-release changes down to TBB 2.1 GOLD, + except for: + - 20080825 changes for parallel_do; + +------------------------------------------------------------------------ +20080825 open-source release + +Changes (w.r.t. previous open-source release): + +- Added exception handling and cancellation support for parallel_do. +- Added default HashCompare template argument for concurrent_hash_map. +- Fixed concurrent_queue.clear() issues due to incorrect assumption + about clear() being private method. +- Added the possibility to use TBB in applications that change + default calling conventions (Windows* only). +- Many improvements to code, tests, examples, makefiles and documents. + +Bugs fixed: + +- 120, 130 - memset declaration missed in concurrent_hash_map.h + +------------------------------------------------------------------------ +20080724 open-source release + +Changes (w.r.t. previous open-source release): + +- Inline assembly for atomic operations improved for gcc 4.3 +- A few more improvements to the code. + +------------------------------------------------------------------------ +20080709 open-source release + +Changes (w.r.t. previous open-source release): + +- operator=() was added to the tbb_thread class according to + the current working draft for std::thread. +- Recognizing SPARC* in makefiles for Linux* and Sun Solaris*. + +Bugs fixed: + +- 127 - concurrent_hash_map::range fixed to split correctly. + +Open-source contributions integrated: + +- fix_set_midpoint.diff by jyasskin +- SPARC* support in makefiles by Raf Schietekat + +------------------------------------------------------------------------ +20080622 open-source release + +Changes (w.r.t. previous open-source release): + +- Fixed a hang that rarely happened on Linux + during deinitialization of the TBB scheduler. +- Improved support for Intel(R) Thread Checker. +- A few more improvements to the code. + +------------------------------------------------------------------------ +TBB 2.1 GOLD commercial-aligned release + +Changes (w.r.t. TBB 2.0 U3 commercial-aligned release): + +- All open-source-release changes down to, and including, TBB 2.0 GOLD + below, were incorporated into this release. + +------------------------------------------------------------------------ +20080605 open-source release + +Changes (w.r.t. 
previous open-source release): + +- Explicit control of exported symbols by version scripts added on Linux. +- Interfaces polished for exception handling & algorithm cancellation. +- Cache behavior improvements in the scalable allocator. +- Improvements in text_filter, polygon_overlay, and other examples. +- A lot of other stability improvements in code, tests, and makefiles. +- First release where binary packages include headers/docs/examples, so + binary packages are now self-sufficient for using TBB. + +Open-source contributions integrated: + +- atomics patch (partially). +- tick_count warning patch. + +Bugs fixed: + +- 118 - fix for boost compatibility. +- 123 - fix for tbb_machine.h. + +------------------------------------------------------------------------ +20080512 open-source release + +Changes (w.r.t. previous open-source release): + +- Fixed a problem with backward binary compatibility + of debug Linux builds. +- Sun* Studio* support added. +- soname support added on Linux via linker script. To restore backward + binary compatibility, *.so -> *.so.2 softlinks should be created. +- concurrent_hash_map improvements - added few new forms of insert() + method and fixed precondition and guarantees of erase() methods. + Added runtime warning reporting about bad hash function used for + the container. Various improvements for performance and concurrency. +- Cancellation mechanism reworked so that it does not hurt scalability. +- Algorithm parallel_do reworked. Requirement for Body::argument_type + definition removed, and work item argument type can be arbitrarily + cv-qualified. +- polygon_overlay example added. +- A few more improvements to code, tests, examples and Makefiles. + +Open-source contributions integrated: + +- Soname support patch for Bugzilla #112. + +Bugs fixed: + +- 112 - fix for soname support. + +------------------------------------------------------------------------ +TBB 2.0 U3 commercial-aligned release (package 017, April 20, 2008) + +Corresponds to commercial 019 (for Linux*, 020; for Mac OS* X, 018) +packages. + +Changes (w.r.t. TBB 2.0 U2 commercial-aligned release): + +- Does not contain open-source-release changes below; this release is + only a minor update of TBB 2.0 U2. +- Removed spin-waiting in pipeline and concurrent_queue. +- A few more small bug fixes from open-source releases below. + +------------------------------------------------------------------------ +20080408 open-source release + +Changes (w.r.t. previous open-source release): + +- count_strings example reworked: new word generator implemented, hash + function replaced, and tbb_allocator is used with std::string class. +- Static methods of spin_rw_mutex were replaced by normal member + functions, and the class name was versioned. +- tacheon example was renamed to tachyon. +- Improved support for Intel(R) Thread Checker. +- A few more minor improvements. + +Open-source contributions integrated: + +- Two sets of Sun patches for IA Solaris support. + +------------------------------------------------------------------------ +20080402 open-source release + +Changes (w.r.t. previous open-source release): + +- Exception handling and cancellation support for tasks and algorithms + fully enabled. +- Exception safety guaranties defined and fixed for all concurrent + containers. +- User-defined memory allocator support added to all concurrent + containers. +- Performance improvement of concurrent_hash_map, spin_rw_mutex. 
+- Critical fix for a rare race condition during scheduler + initialization/de-initialization. +- New methods added for concurrent containers to be closer to STL, + as well as automatic filters removal from pipeline + and __TBB_AtomicAND function. +- The volatile keyword dropped from where it is not really needed. +- A few more minor improvements. + +------------------------------------------------------------------------ +20080319 open-source release + +Changes (w.r.t. previous open-source release): + +- Support for gcc version 4.3 was added. +- tbb_thread class, near compatible with std::thread expected in C++0x, + was added. + +Bugs fixed: + +- 116 - fix for compilation issues with gcc version 4.2.1. +- 120 - fix for compilation issues with gcc version 4.3. + +------------------------------------------------------------------------ +20080311 open-source release + +Changes (w.r.t. previous open-source release): + +- An enumerator added for pipeline filter types (serial vs. parallel). +- New task_scheduler_observer class introduced, to observe when + threads start and finish interacting with the TBB task scheduler. +- task_scheduler_init reverted to not use internal versioned class; + binary compatibility guaranteed with stable releases only. +- Various improvements to code, tests, examples and Makefiles. + +------------------------------------------------------------------------ +20080304 open-source release + +Changes (w.r.t. previous open-source release): + +- Task-to-thread affinity support, previously kept under a macro, + now fully legalized. +- Work-in-progress on cache_aligned_allocator improvements. +- Pipeline really supports parallel input stage; it's no more serialized. +- Various improvements to code, tests, examples and Makefiles. + +Bugs fixed: + +- 119 - fix for scalable_malloc sometimes failing to return a big block. +- TR575 - fixed a deadlock occurring on Windows in startup/shutdown + under some conditions. + +------------------------------------------------------------------------ +20080226 open-source release + +Changes (w.r.t. previous open-source release): + +- Introduced tbb_allocator to select between standard allocator and + tbb::scalable_allocator when available. +- Removed spin-waiting in pipeline and concurrent_queue. +- Improved performance of concurrent_hash_map by using tbb_allocator. +- Improved support for Intel(R) Thread Checker. +- Various improvements to code, tests, examples and Makefiles. + +------------------------------------------------------------------------ +TBB 2.0 U2 commercial-aligned release (package 017, February 14, 2008) + +Corresponds to commercial 017 (for Linux*, 018; for Mac OS* X, 016) +packages. + +Changes (w.r.t. TBB 2.0 U1 commercial-aligned release): + +- Does not contain open-source-release changes below; this release is + only a minor update of TBB 2.0 U1. +- Add support for Microsoft* Visual Studio* 2008, including binary + libraries and VS2008 projects for examples. +- Use SwitchToThread() not Sleep() to yield threads on Windows*. +- Enhancements to Doxygen-readable comments in source code. +- A few more small bug fixes from open-source releases below. + +Bugs fixed: + +- TR569 - Memory leak in concurrent_queue. + +------------------------------------------------------------------------ +20080207 open-source release + +Changes (w.r.t. previous open-source release): + +- Improvements and minor fixes in VS2008 projects for examples. 
+- Improvements in code for gating worker threads that wait for work, + previously consolidated under #if IMPROVED_GATING, now legalized. +- Cosmetic changes in code, examples, tests. + +Bugs fixed: + +- 113 - Iterators and ranges should be convertible to their const + counterparts. +- TR569 - Memory leak in concurrent_queue. + +------------------------------------------------------------------------ +20080122 open-source release + +Changes (w.r.t. previous open-source release): + +- Updated examples/parallel_for/seismic to improve the visuals and to + use the affinity_partitioner (20071127 and forward) for better + performance. +- Minor improvements to unittests and performance tests. + +------------------------------------------------------------------------ +20080115 open-source release + +Changes (w.r.t. previous open-source release): + +- Cleanup, simplifications and enhancements to the Makefiles for + building the libraries (see build/index.html for high-level + changes) and the examples. +- Use SwitchToThread() not Sleep() to yield threads on Windows*. +- Engineering work-in-progress on exception safety/support. +- Engineering work-in-progress on affinity_partitioner for + parallel_reduce. +- Engineering work-in-progress on improved gating for worker threads + (idle workers now block in the OS instead of spinning). +- Enhancements to Doxygen-readable comments in source code. + +Bugs fixed: + +- 102 - Support for parallel build with gmake -j +- 114 - /Wp64 build warning on Windows*. + +------------------------------------------------------------------------ +20071218 open-source release + +Changes (w.r.t. previous open-source release): + +- Full support for Microsoft* Visual Studio* 2008 in open-source. + Binaries for vc9/ will be available in future stable releases. +- New recursive_mutex class. +- Full support for 32-bit PowerMac including export files for builds. +- Improvements to parallel_do. + +------------------------------------------------------------------------ +20071206 open-source release + +Changes (w.r.t. previous open-source release): + +- Support for Microsoft* Visual Studio* 2008 in building libraries + from source as well as in vc9/ projects for examples. +- Small fixes to the affinity_partitioner first introduced in 20071127. +- Small fixes to the thread-stack size hook first introduced in 20071127. +- Engineering work in progress on concurrent_vector. +- Engineering work in progress on exception behavior. +- Unittest improvements. + +------------------------------------------------------------------------ +20071127 open-source release + +Changes (w.r.t. previous open-source release): + +- Task-to-thread affinity support (affinity partitioner) first appears. +- More work on concurrent_vector. +- New parallel_do algorithm (function-style version of parallel while) + and parallel_do/parallel_preorder example. +- New task_scheduler_init() hooks for getting default_num_threads() and + for setting thread stack size. +- Support for weak memory consistency models in the code base. +- Futex usage in the task scheduler (Linux). +- Started adding 32-bit PowerMac support. +- Intel(R) 9.1 compilers are now the base supported Intel(R) compiler + version. +- TBB libraries added to link line automatically on Microsoft Windows* + systems via #pragma comment linker directives. + +Open-source contributions integrated: + +- FreeBSD platform support patches. +- AIX weak memory model patch. + +Bugs fixed: + +- 108 - Removed broken affinity.h reference. 
+- 101 - Does not build on Debian Lenny (replaced arch with uname -m). + +------------------------------------------------------------------------ +20071030 open-source release + +Changes (w.r.t. previous open-source release): + +- More work on concurrent_vector. +- Better support for building with -Wall -Werror (or not) as desired. +- A few fixes to eliminate extraneous warnings. +- Begin introduction of versioning hooks so that the internal/API + version is tracked via TBB_INTERFACE_VERSION. The newest binary + libraries should always work with previously-compiled code when- + ever possible. +- Engineering work in progress on using futex inside the mutexes (Linux). +- Engineering work in progress on exception behavior. +- Engineering work in progress on a new parallel_do algorithm. +- Unittest improvements. + +------------------------------------------------------------------------ +20070927 open-source release + +Changes: + +- Minor update to TBB 2.0 U1 below. +- Begin introduction of new concurrent_vector interfaces not released + with TBB 2.0 U1. + +------------------------------------------------------------------------ +TBB 2.0 U1 commercial-aligned release (package 014, October 1, 2007) + +Corresponds to commercial 014 (for Linux*, 016) packages. + +Changes (w.r.t. previous commercial-aligned release): + +- All open-source-release changes down to, and including, TBB 2.0 GOLD + below, were incorporated into this release. +- Made a number of changes to the officially supported OS list: + Added Linux* OSs: + Asianux* 3, Debian* 4.0, Fedora Core* 6, Fedora* 7, + Turbo Linux* 11, Ubuntu* 7.04; + Dropped Linux* OSs: + Asianux* 2, Fedora Core* 4, Haansoft* Linux 2006 Server, + Mandriva/Mandrake* 10.1, Miracle Linux* 4.0, + Red Flag* DC Server 5.0; + Only Mac OS* X 10.4.9 (and forward) and Xcode* tool suite 2.4.1 (and + forward) are now supported. +- Commercial installers on Linux* fixed to recommend the correct + binaries to use in more cases, with less unnecessary warnings. +- Changes to eliminate spurious build warnings. + +Open-source contributions integrated: + +- Two small header guard macro patches; it also fixed bug #94. +- New blocked_range3d class. + +Bugs fixed: + +- 93 - Removed misleading comments in task.h. +- 94 - See above. + +------------------------------------------------------------------------ +20070815 open-source release + +Changes: + +- Changes to eliminate spurious build warnings. +- Engineering work in progress on concurrent_vector allocator behavior. +- Added hooks to use the Intel(R) compiler code coverage tools. + +Open-source contributions integrated: + +- Mac OS* X build warning patch. + +Bugs fixed: + +- 88 - Fixed TBB compilation errors if both VS2005 and Windows SDK are + installed. + +------------------------------------------------------------------------ +20070719 open-source release + +Changes: + +- Minor update to TBB 2.0 GOLD below. +- Changes to eliminate spurious build warnings. + +------------------------------------------------------------------------ +TBB 2.0 GOLD commercial-aligned release (package 010, July 19, 2007) + +Corresponds to commercial 010 (for Linux*, 012) packages. + +- TBB open-source debut release. + +------------------------------------------------------------------------ +* Other names and brands may be claimed as the property of others. 
diff --git a/dep/tbb/COPYING b/dep/tbb/COPYING new file mode 100644 index 000000000..5af6ed874 --- /dev/null +++ b/dep/tbb/COPYING @@ -0,0 +1,353 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. 
The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. 
+ +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. 
Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. 
If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+                            NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+                     END OF TERMS AND CONDITIONS
+
+            How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License along
+    with this program; if not, write to the Free Software Foundation, Inc.,
+    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) year name of author
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  <signature of Ty Coon>, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
+---------------- END OF Gnu General Public License ----------------
+
+The source code of Threading Building Blocks is distributed under version 2
+of the GNU General Public License, with the so-called "runtime exception,"
+as follows (or see any header or implementation file):
+
+  As a special exception, you may use this file as part of a free software
+  library without restriction. Specifically, if other files instantiate
+  templates or use macros or inline functions from this file, or you compile
+  this file and link it with other files to produce an executable, this
+  file does not by itself cause the resulting executable to be covered by
+  the GNU General Public License. This exception does not however
+  invalidate any other reasons why the executable file might be covered by
+  the GNU General Public License.
diff --git a/dep/tbb/Makefile.am b/dep/tbb/Makefile.am
new file mode 100644
index 000000000..98027104a
--- /dev/null
+++ b/dep/tbb/Makefile.am
@@ -0,0 +1,58 @@
+# Copyright 2005-2009 Intel Corporation. All Rights Reserved.
+#
+# This file is part of Threading Building Blocks.
+#
+# Threading Building Blocks is free software; you can redistribute it
+# and/or modify it under the terms of the GNU General Public License
+# version 2 as published by the Free Software Foundation.
+#
+# Threading Building Blocks is distributed in the hope that it will be
+# useful, but WITHOUT ANY WARRANTY; without even the implied warranty
+# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with Threading Building Blocks; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#
+# As a special exception, you may use this file as part of a free software
+# library without restriction. Specifically, if other files instantiate
+# templates or use macros or inline functions from this file, or you compile
+# this file and link it with other files to produce an executable, this
+# file does not by itself cause the resulting executable to be covered by
+# the GNU General Public License. This exception does not however
+# invalidate any other reasons why the executable file might be covered by
+# the GNU General Public License.
+
+tbb_root = $(srcdir)
+
+include $(tbb_root)/build/common.inc
+
+# Override TBB build variables so the sub-makes build in the Automake build directory
+override work_dir = $(CWD)
+export work_dir
+override tbb_root = $(srcdir)
+export tbb_root
+
+.PHONY: all tbb tbbmalloc
+
+# Workaround: tbb and tbbmalloc do not depend on each other, but both sub-makes
+# regenerate version_string.tmp, so they must not run in parallel under 'make -j'.
+.NOTPARALLEL: tbb tbbmalloc
+
+all: tbb tbbmalloc
+
+tbb:
+	$(MAKE) -r -f $(tbb_root)/build/Makefile.tbb cfg=release tbb_root=$(tbb_root)
+
+tbbmalloc:
+	$(MAKE) -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=release malloc tbb_root=$(tbb_root)
+
+install-exec-local:
+	$(INSTALL) $(work_dir)/lib*.so* $(libdir)
+
+clean-local:
+	-rm -f *.d *.o
+	-rm -f lib*.so*
+	-rm -f *.def *.tmp tbbvars.*
+
diff --git a/dep/tbb/README b/dep/tbb/README
new file mode 100644
index 000000000..67ab8ad2e
--- /dev/null
+++ b/dep/tbb/README
@@ -0,0 +1,11 @@
+Threading Building Blocks - README
+
+See index.html for directions and documentation.
+
+If source is present (./Makefile and src/ directories),
+type 'gmake' in this directory to build and test.
+
+See examples/index.html for runnable examples and directions.
+
+See http://threadingbuildingblocks.org for full documentation
+and software information.
diff --git a/dep/tbb/build/FreeBSD.gcc.inc b/dep/tbb/build/FreeBSD.gcc.inc
new file mode 100644
index 000000000..300453525
--- /dev/null
+++ b/dep/tbb/build/FreeBSD.gcc.inc
@@ -0,0 +1,93 @@
+# Copyright 2005-2009 Intel Corporation. All Rights Reserved.
+#
+# This file is part of Threading Building Blocks.
+#
+# Threading Building Blocks is free software; you can redistribute it
+# and/or modify it under the terms of the GNU General Public License
+# version 2 as published by the Free Software Foundation.
+#
+# Threading Building Blocks is distributed in the hope that it will be
+# useful, but WITHOUT ANY WARRANTY; without even the implied warranty
+# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with Threading Building Blocks; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#
+# As a special exception, you may use this file as part of a free software
+# library without restriction. Specifically, if other files instantiate
+# templates or use macros or inline functions from this file, or you compile
+# this file and link it with other files to produce an executable, this
+# file does not by itself cause the resulting executable to be covered by
+# the GNU General Public License. This exception does not however
+# invalidate any other reasons why the executable file might be covered by
+# the GNU General Public License.
+ +COMPILE_ONLY = -c -MMD +PREPROC_ONLY = -E -x c +INCLUDE_KEY = -I +DEFINE_KEY = -D +OUTPUT_KEY = -o # +OUTPUTOBJ_KEY = -o # +PIC_KEY = -fPIC +WARNING_AS_ERROR_KEY = -Werror +WARNING_KEY = -Wall +DYLIB_KEY = -shared + +TBB_NOSTRICT = 1 + +CPLUS = g++ +CONLY = gcc +LIB_LINK_FLAGS = -shared +LIBS = -lpthread +C_FLAGS = $(CPLUS_FLAGS) + +ifeq ($(cfg), release) + CPLUS_FLAGS = -O2 -DUSE_PTHREAD +endif +ifeq ($(cfg), debug) + CPLUS_FLAGS = -DTBB_USE_DEBUG -g -O0 -DUSE_PTHREAD +endif + +ASM= +ASM_FLAGS= + +TBB_ASM.OBJ= + +ifeq (ia64,$(arch)) +# Position-independent code (PIC) is a must on IA-64, even for regular (not shared) executables + CPLUS_FLAGS += $(PIC_KEY) +endif + +ifeq (intel64,$(arch)) + CPLUS_FLAGS += -m64 + LIB_LINK_FLAGS += -m64 +endif + +ifeq (ia32,$(arch)) + CPLUS_FLAGS += -m32 + LIB_LINK_FLAGS += -m32 +endif + +#------------------------------------------------------------------------------ +# Setting assembler data. +#------------------------------------------------------------------------------ +ASSEMBLY_SOURCE=$(arch)-gas +ifeq (ia64,$(arch)) + ASM=as + TBB_ASM.OBJ = atomic_support.o lock_byte.o log2.o pause.o +endif +#------------------------------------------------------------------------------ +# End of setting assembler data. +#------------------------------------------------------------------------------ + +#------------------------------------------------------------------------------ +# Setting tbbmalloc data. +#------------------------------------------------------------------------------ + +M_CPLUS_FLAGS = $(CPLUS_FLAGS) -fno-rtti -fno-exceptions -fno-schedule-insns2 + +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data. +#------------------------------------------------------------------------------ diff --git a/dep/tbb/build/FreeBSD.inc b/dep/tbb/build/FreeBSD.inc new file mode 100644 index 000000000..82b3daa14 --- /dev/null +++ b/dep/tbb/build/FreeBSD.inc @@ -0,0 +1,81 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. 
+ +ifndef arch + ifeq ($(shell uname -m),i386) + export arch:=ia32 + endif + ifeq ($(shell uname -m),ia64) + export arch:=ia64 + endif + ifeq ($(shell uname -m),amd64) + export arch:=intel64 + endif +endif + +ifndef runtime + gcc_version:=$(shell gcc -v 2>&1 | grep 'gcc version' | sed -e 's/^gcc version //' | sed -e 's/ .*$$//') + os_version:=$(shell uname -r) + os_kernel_version:=$(shell uname -r | sed -e 's/-.*$$//') + export runtime:=cc$(gcc_version)_kernel$(os_kernel_version) +endif + +native_compiler := gcc +export compiler ?= gcc +debugger ?= gdb + +CMD=$(SHELL) -c +CWD=$(shell pwd) +RM?=rm -f +RD?=rmdir +MD?=mkdir -p +NUL= /dev/null +SLASH=/ +MAKE_VERSIONS=sh $(tbb_root)/build/version_info_linux.sh $(CPLUS) $(CPLUS_FLAGS) $(INCLUDES) >version_string.tmp +MAKE_TBBVARS=sh $(tbb_root)/build/generate_tbbvars.sh + +ifdef LD_LIBRARY_PATH + export LD_LIBRARY_PATH := .:$(LD_LIBRARY_PATH) +else + export LD_LIBRARY_PATH := . +endif + +####### Build settings ######################################################## + +OBJ = o +DLL = so + +TBB.DEF = +TBB.DLL = libtbb$(DEBUG_SUFFIX).$(DLL) +TBB.LIB = $(TBB.DLL) +LINK_TBB.LIB = $(TBB.LIB) + +MALLOC.DLL = libtbbmalloc$(DEBUG_SUFFIX).$(DLL) +MALLOC.LIB = $(MALLOC.DLL) + +TBB_NOSTRICT=1 + +TEST_LAUNCHER=sh $(tbb_root)/build/test_launcher.sh diff --git a/dep/tbb/build/Makefile.rml b/dep/tbb/build/Makefile.rml new file mode 100644 index 000000000..1ef95c4fa --- /dev/null +++ b/dep/tbb/build/Makefile.rml @@ -0,0 +1,157 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +tbb_root ?= $(TBB22_INSTALL_DIR) +BUILDING_PHASE=1 +include $(tbb_root)/build/common.inc +DEBUG_SUFFIX=$(findstring _debug,_$(cfg)) + +# default target +default_rml: rml rml_test + +RML_ROOT ?= $(tbb_root)/src/rml +RML_SERVER_ROOT = $(RML_ROOT)/server + +VPATH = $(tbb_root)/src/tbb $(tbb_root)/src/tbb/$(ASSEMBLY_SOURCE) +VPATH += $(RML_ROOT)/server $(RML_ROOT)/client $(RML_ROOT)/test + +include $(tbb_root)/build/common_rules.inc + +#-------------------------------------------------------------------------- +# Define rules for making the RML server shared library and client objects. 
+#-------------------------------------------------------------------------- + +# Object files that make up RML server +RML_SERVER.OBJ = rml_server.$(OBJ) + +# Object files that RML clients need +RML_TBB_CLIENT.OBJ = rml_tbb.$(OBJ) dynamic_link.$(OBJ) +RML_OMP_CLIENT.OBJ = rml_omp.$(OBJ) omp_dynamic_link.$(OBJ) + +RML.OBJ = $(RML_SERVER.OBJ) $(RML_TBB_CLIENT.OBJ) $(RML_OMP_CLIENT.OBJ) +ifeq (windows,$(tbb_os)) +RML_ASM.OBJ = $(if $(findstring intel64,$(arch)),$(TBB_ASM.OBJ)) +endif +ifeq (linux,$(tbb_os)) +RML_ASM.OBJ = $(if $(findstring ia64,$(arch)),$(TBB_ASM.OBJ)) +endif + +RML_TBB_DEP= cache_aligned_allocator_rml.$(OBJ) dynamic_link_rml.$(OBJ) concurrent_vector_rml.$(OBJ) tbb_misc_rml.$(OBJ) +TBB_DEP_NON_RML_TEST= cache_aligned_allocator_rml.$(OBJ) dynamic_link_rml.$(OBJ) $(RML_ASM.OBJ) +TBB_DEP_RML_TEST= $(RML_ASM.OBJ) +ifeq ($(cfg),debug) +RML_TBB_DEP+= spin_mutex_rml.$(OBJ) +TBB_DEP_NON_RML_TEST+= tbb_misc_rml.$(OBJ) +TBB_DEP_RML_TEST+= tbb_misc_rml.$(OBJ) +endif +LIBS += $(LIBDL) + +INCLUDES += $(INCLUDE_KEY)$(RML_ROOT)/include $(INCLUDE_KEY). +T_INCLUDES = $(INCLUDES) $(INCLUDE_KEY)$(tbb_root)/src/test $(INCLUDE_KEY)$(RML_SERVER_ROOT) +WARNING_SUPPRESS += $(RML_WARNING_SUPPRESS) + +# Suppress superfluous warnings for RML compilation +R_CPLUS_FLAGS = $(subst DO_ITT_NOTIFY,DO_ITT_NOTIFY=0,$(CPLUS_FLAGS_NOSTRICT)) $(WARNING_SUPPRESS) \ + $(DEFINE_KEY)TBB_USE_THREADING_TOOLS=0 $(DEFINE_KEY)__TBB_RML_STATIC=1 $(DEFINE_KEY)__TBB_NO_IMPLICIT_LINKAGE=1 + +%.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(R_CPLUS_FLAGS) $(PIC_KEY) $(INCLUDES) $< + +tbb_misc_rml.$(OBJ): version_string.tmp + +RML_TEST.OBJ = test_job_automaton.$(OBJ) test_thread_monitor.$(OBJ) test_rml_tbb.$(OBJ) test_rml_omp.$(OBJ) test_rml_mixed.$(OBJ) + +$(RML_TBB_DEP): %_rml.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(OUTPUTOBJ_KEY)$@ $(R_CPLUS_FLAGS) $(PIC_KEY) $(INCLUDES) $< + +$(RML_TEST.OBJ): %.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(R_CPLUS_FLAGS) $(PIC_KEY) $(T_INCLUDES) $< + +ifneq (,$(RML.DEF)) +rml.def: $(RML.DEF) + $(CMD) "$(CPLUS) $(PREPROC_ONLY) $(RML.DEF) $(filter $(DEFINE_KEY)%,$(CPLUS_FLAGS)) >rml.def 2>$(NUL) || exit 0" + +LIB_LINK_FLAGS += $(EXPORT_KEY)rml.def +$(RML.DLL): rml.def +endif + +$(RML.DLL): BUILDING_LIBRARY = $(RML.DLL) +$(RML.DLL): $(RML_TBB_DEP) $(RML_SERVER.OBJ) $(RML.RES) $(RML_NO_VERSION.DLL) $(RML_ASM.OBJ) + $(LIB_LINK_CMD) $(LIB_OUTPUT_KEY)$(RML.DLL) $(RML_SERVER.OBJ) $(RML_TBB_DEP) $(RML_ASM.OBJ) $(RML.RES) $(LIB_LINK_LIBS) $(LIB_LINK_FLAGS) + +ifneq (,$(RML_NO_VERSION.DLL)) +$(RML_NO_VERSION.DLL): + echo "INPUT ($(RML.DLL))" > $(RML_NO_VERSION.DLL) +endif + +rml: $(RML.DLL) $(RML_TBB_CLIENT.OBJ) $(RML_OMP_CLIENT.OBJ) + +#------------------------------------------------------ +# End of rules for making the RML server shared library +#------------------------------------------------------ + +#------------------------------------------------------ +# Define rules for making the RML unit tests +#------------------------------------------------------ + +add_debug=$(basename $(1))_debug$(suffix $(1)) +cross_suffix=$(if $(crosstest),$(if $(DEBUG_SUFFIX),$(subst _debug,,$(1)),$(call add_debug,$(1))),$(1)) + +RML_TESTS = test_job_automaton.exe test_thread_monitor.exe test_rml_tbb.exe test_rml_omp.exe test_rml_mixed.exe test_rml_omp_c_linkage.exe + +test_rml_tbb.exe: test_rml_tbb.$(OBJ) $(RML_TBB_CLIENT.OBJ) $(TBB_DEP_RML_TEST) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) test_rml_tbb.$(OBJ) $(RML_TBB_CLIENT.OBJ) $(TBB_DEP_RML_TEST) $(LIBS) $(LINK_FLAGS) + +test_rml_omp.exe: test_rml_omp.$(OBJ) 
$(RML_OMP_CLIENT.OBJ) $(TBB_DEP_NON_RML_TEST) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) test_rml_omp.$(OBJ) $(RML_OMP_CLIENT.OBJ) $(TBB_DEP_NON_RML_TEST) $(LIBS) $(LINK_FLAGS) + +test_rml_mixed.exe: test_rml_mixed.$(OBJ) $(RML_TBB_CLIENT.OBJ) $(RML_OMP_CLIENT.OBJ) $(TBB_DEP_RML_TEST) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) test_rml_mixed.$(OBJ) $(RML_TBB_CLIENT.OBJ) $(RML_OMP_CLIENT.OBJ) $(TBB_DEP_RML_TEST) $(LIBS) $(LINK_FLAGS) + +rml_omp_stub.$(OBJ): rml_omp_stub.cpp + $(CPLUS) $(COMPILE_ONLY) $(M_CPLUS_FLAGS) $(WARNING_SUPPRESS) $(T_INCLUDES) $(PIC_KEY) $< + +test_rml_omp_c_linkage.exe: test_rml_omp_c_linkage.$(OBJ) rml_omp_stub.$(OBJ) + $(CONLY) $(C_FLAGS) $(OUTPUT_KEY)$@ test_rml_omp_c_linkage.$(OBJ) rml_omp_stub.$(OBJ) + +test_%.exe: test_%.$(OBJ) $(TBB_DEP_NON_RML_TEST) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $< $(TBB_DEP_NON_RML_TEST) $(LIBS) $(LINK_FLAGS) + +### run_cmd is usually empty +rml_test: $(call cross_suffix,$(RML.DLL)) $(RML_TESTS) + $(run_cmd) ./test_job_automaton.exe + $(run_cmd) ./test_thread_monitor.exe + $(run_cmd) ./test_rml_tbb.exe + $(run_cmd) ./test_rml_omp.exe + $(run_cmd) ./test_rml_mixed.exe + $(run_cmd) ./test_rml_omp_c_linkage.exe + +#------------------------------------------------------ +# End of rules for making the TBBMalloc unit tests +#------------------------------------------------------ + +# Include automatically generated dependences +-include *.d diff --git a/dep/tbb/build/Makefile.tbb b/dep/tbb/build/Makefile.tbb new file mode 100644 index 000000000..9f7484008 --- /dev/null +++ b/dep/tbb/build/Makefile.tbb @@ -0,0 +1,121 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +#------------------------------------------------------------------------------ +# Define rules for making the TBB shared library. 
+#------------------------------------------------------------------------------ + +tbb_root ?= "$(TBB22_INSTALL_DIR)" +BUILDING_PHASE=1 +include $(tbb_root)/build/common.inc +DEBUG_SUFFIX=$(findstring _debug,_$(cfg)) + +#------------------------------------------------------------ +# Define static pattern rules dealing with .cpp source files +#------------------------------------------------------------ +$(warning CONFIG: cfg=$(cfg) arch=$(arch) compiler=$(compiler) os=$(tbb_os) runtime=$(runtime)) + +default_tbb: $(TBB.DLL) +.PHONY: default_tbb tbbvars clean +.PRECIOUS: %.$(OBJ) + +VPATH = $(tbb_root)/src/tbb/$(ASSEMBLY_SOURCE) $(tbb_root)/src/tbb $(tbb_root)/src/old $(tbb_root)/src/rml/client + +CPLUS_FLAGS += $(PIC_KEY) $(DEFINE_KEY)__TBB_BUILD=1 + +ifeq (1,$(TBB_NOSTRICT)) +# GNU 3.2.3 headers have a ISO syntax that is rejected by Intel compiler in -strict_ansi mode. +# The Mac uses gcc, so the list is empty for that platform. +# The files below need the -strict_ansi flag downgraded to -ansi to compile + +KNOWN_NOSTRICT = concurrent_hash_map.o \ + concurrent_queue.o \ + concurrent_vector_v2.o \ + concurrent_vector.o + +endif + +# Object files (that were compiled from C++ code) that gmake up TBB +TBB_CPLUS.OBJ = concurrent_hash_map.$(OBJ) \ + concurrent_queue.$(OBJ) \ + concurrent_vector.$(OBJ) \ + dynamic_link.$(OBJ) \ + itt_notify.$(OBJ) \ + cache_aligned_allocator.$(OBJ) \ + pipeline.$(OBJ) \ + queuing_mutex.$(OBJ) \ + queuing_rw_mutex.$(OBJ) \ + spin_rw_mutex.$(OBJ) \ + spin_mutex.$(OBJ) \ + task.$(OBJ) \ + tbb_misc.$(OBJ) \ + mutex.$(OBJ) \ + recursive_mutex.$(OBJ) \ + tbb_thread.$(OBJ) \ + itt_notify_proxy.$(OBJ) \ + private_server.$(OBJ) \ + rml_tbb.$(OBJ) + +# OLD/Legacy object files for backward binary compatibility +ifeq (,$(findstring $(DEFINE_KEY)TBB_NO_LEGACY,$(CPLUS_FLAGS))) +TBB_CPLUS_OLD.OBJ = \ + concurrent_vector_v2.$(OBJ) \ + concurrent_queue_v2.$(OBJ) \ + spin_rw_mutex_v2.$(OBJ) +endif + +# Object files that gmake up TBB (TBB_ASM.OBJ is platform-specific) +TBB.OBJ = $(TBB_CPLUS.OBJ) $(TBB_CPLUS_OLD.OBJ) $(TBB_ASM.OBJ) + +# Suppress superfluous warnings for TBB compilation +WARNING_KEY += $(WARNING_SUPPRESS) + +CXX_WARN_SUPPRESS = $(RML_WARNING_SUPPRESS) + +include $(tbb_root)/build/common_rules.inc + +ifneq (,$(TBB.DEF)) +tbb.def: $(TBB.DEF) + $(CMD) "$(CPLUS) $(PREPROC_ONLY) $(TBB.DEF) $(INCLUDES) $(filter $(DEFINE_KEY)%,$(CPLUS_FLAGS)) >tbb.def 2>$(NUL) || exit 0" + +LIB_LINK_FLAGS += $(EXPORT_KEY)tbb.def +$(TBB.DLL): tbb.def +endif + +$(TBB.DLL): BUILDING_LIBRARY = $(TBB.DLL) +$(TBB.DLL): $(TBB.OBJ) $(TBB.RES) tbbvars $(TBB_NO_VERSION.DLL) + $(LIB_LINK_CMD) $(LIB_OUTPUT_KEY)$(TBB.DLL) $(TBB.OBJ) $(TBB.RES) $(LIB_LINK_LIBS) $(LIB_LINK_FLAGS) + +ifneq (,$(TBB_NO_VERSION.DLL)) +$(TBB_NO_VERSION.DLL): + echo "INPUT ($(TBB.DLL))" > $(TBB_NO_VERSION.DLL) +endif + +#clean: +# $(RM) *.$(OBJ) *.$(DLL) *.res *.map *.ilk *.pdb *.exp *.manifest *.tmp *.d core core.*[0-9][0-9] + +# Include automatically generated dependences +-include *.d diff --git a/dep/tbb/build/Makefile.tbbmalloc b/dep/tbb/build/Makefile.tbbmalloc new file mode 100644 index 000000000..a6470f809 --- /dev/null +++ b/dep/tbb/build/Makefile.tbbmalloc @@ -0,0 +1,184 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. 
+# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +# default target +default_malloc: malloc malloc_test + +tbb_root ?= $(TBB22_INSTALL_DIR) +BUILDING_PHASE=1 +TEST_RESOURCE = $(TBB.RES) +include $(tbb_root)/build/common.inc +DEBUG_SUFFIX=$(findstring _debug,_$(cfg)) + +MALLOC_ROOT ?= $(tbb_root)/src/tbbmalloc +MALLOC_SOURCE_ROOT ?= $(MALLOC_ROOT) + +VPATH = $(tbb_root)/src/tbb/$(ASSEMBLY_SOURCE) $(tbb_root)/src/tbb $(tbb_root)/src/test +VPATH += $(MALLOC_ROOT) $(MALLOC_SOURCE_ROOT) + +KNOWN_NOSTRICT = test_ScalableAllocator_STL.$(OBJ) test_malloc_compliance.$(OBJ) test_malloc_overload.$(OBJ) + +CPLUS_FLAGS += $(if $(crosstest),$(DEFINE_KEY)__TBBMALLOC_NO_IMPLICIT_LINKAGE=1) + +include $(tbb_root)/build/common_rules.inc + +#------------------------------------------------------ +# Define rules for making the TBBMalloc shared library. +#------------------------------------------------------ + +# Object files that make up TBBMalloc +MALLOC_CPLUS.OBJ = tbbmalloc.$(OBJ) dynamic_link.$(OBJ) +MALLOC_CUSTOM.OBJ += tbb_misc_malloc.$(OBJ) +MALLOC_ASM.OBJ = $(TBB_ASM.OBJ) + +# MALLOC_CPLUS.OBJ is built in two steps due to Intel Compiler Tracker # C69574 +MALLOC.OBJ := $(MALLOC_CPLUS.OBJ) $(MALLOC_ASM.OBJ) $(MALLOC_CUSTOM.OBJ) MemoryAllocator.$(OBJ) itt_notify_proxy.$(OBJ) +MALLOC_CPLUS.OBJ += MemoryAllocator.$(OBJ) +PROXY.OBJ := proxy.$(OBJ) tbb_function_replacement.$(OBJ) +M_CPLUS_FLAGS := $(subst $(WARNING_KEY),,$(M_CPLUS_FLAGS)) $(DEFINE_KEY)__TBB_BUILD=1 +M_INCLUDES = $(INCLUDES) $(INCLUDE_KEY)$(MALLOC_ROOT) $(INCLUDE_KEY)$(MALLOC_SOURCE_ROOT) + +# Suppress superfluous warnings for TBBmalloc compilation +$(MALLOC.OBJ): M_CPLUS_FLAGS += $(WARNING_SUPPRESS) + +itt_notify_proxy.$(OBJ): C_FLAGS += $(PIC_KEY) + +$(PROXY.OBJ): %.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(CPLUS_FLAGS) $(PIC_KEY) $(M_INCLUDES) $< + +$(MALLOC_CPLUS.OBJ): %.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(M_CPLUS_FLAGS) $(PIC_KEY) $(M_INCLUDES) $< + +tbb_misc_malloc.$(OBJ): tbb_misc.cpp version_string.tmp + $(CPLUS) $(COMPILE_ONLY) $(subst -strict_ansi,-ansi,$(M_CPLUS_FLAGS)) $(PIC_KEY) $(OUTPUTOBJ_KEY)$@ $(INCLUDE_KEY). 
$(INCLUDES) $< + +MALLOC_LINK_FLAGS = $(LIB_LINK_FLAGS) +PROXY_LINK_FLAGS = $(LIB_LINK_FLAGS) + +ifneq (,$(MALLOC.DEF)) +tbbmalloc.def: $(MALLOC.DEF) + $(CMD) "$(CPLUS) $(PREPROC_ONLY) $(MALLOC.DEF) $(filter $(DEFINE_KEY)%,$(CPLUS_FLAGS)) >tbbmalloc.def 2>$(NUL) || exit 0" + +MALLOC_LINK_FLAGS += $(EXPORT_KEY)tbbmalloc.def +$(MALLOC.DLL): tbbmalloc.def +endif + +$(MALLOC.DLL): BUILDING_LIBRARY = $(MALLOC.DLL) +$(MALLOC.DLL): $(MALLOC.OBJ) $(MALLOC.RES) $(MALLOC_NO_VERSION.DLL) + $(LIB_LINK_CMD) $(LIB_OUTPUT_KEY)$(MALLOC.DLL) $(MALLOC.OBJ) $(MALLOC.RES) $(LIB_LINK_LIBS) $(MALLOC_LINK_FLAGS) + +ifneq (,$(MALLOCPROXY.DEF)) +tbbmallocproxy.def: $(MALLOCPROXY.DEF) + $(CMD) "$(CPLUS) $(PREPROC_ONLY) $(MALLOCPROXY.DEF) $(filter $(DEFINE_KEY)%,$(CPLUS_FLAGS)) >tbbmallocproxy.def 2>$(NUL) || exit 0" + +PROXY_LINK_FLAGS += $(EXPORT_KEY)tbbmallocproxy.def +$(MALLOCPROXY.DLL): tbbmallocproxy.def +endif + +ifneq (,$(MALLOCPROXY.DLL)) +$(MALLOCPROXY.DLL): BUILDING_LIBRARY = $(MALLOCPROXY.DLL) +$(MALLOCPROXY.DLL): $(PROXY.OBJ) $(MALLOCPROXY_NO_VERSION.DLL) $(MALLOC.DLL) $(MALLOC.RES) + $(LIB_LINK_CMD) $(LIB_OUTPUT_KEY)$(MALLOCPROXY.DLL) $(PROXY.OBJ) $(MALLOC.RES) $(LIB_LINK_LIBS) $(LINK_MALLOC.LIB) $(PROXY_LINK_FLAGS) + +malloc: $(MALLOCPROXY.DLL) +endif + +ifneq (,$(MALLOC_NO_VERSION.DLL)) +$(MALLOC_NO_VERSION.DLL): + echo "INPUT ($(MALLOC.DLL))" > $(MALLOC_NO_VERSION.DLL) +endif + +ifneq (,$(MALLOCPROXY_NO_VERSION.DLL)) +$(MALLOCPROXY_NO_VERSION.DLL): + echo "INPUT ($(MALLOCPROXY.DLL))" > $(MALLOCPROXY_NO_VERSION.DLL) +endif + +malloc: $(MALLOC.DLL) $(MALLOCPROXY.DLL) + +malloc_dll: $(MALLOC.DLL) + +malloc_proxy_dll: $(MALLOCPROXY.DLL) + +.PHONY: malloc malloc_dll malloc_proxy_dll + +#------------------------------------------------------ +# End of rules for making the TBBMalloc shared library +#------------------------------------------------------ + +#------------------------------------------------------ +# Define rules for making the TBBMalloc unit tests +#------------------------------------------------------ + +add_debug=$(basename $(1))_debug$(suffix $(1)) +cross_suffix=$(if $(crosstest),$(if $(DEBUG_SUFFIX),$(subst _debug,,$(1)),$(call add_debug,$(1))),$(1)) + +MALLOC_MAIN_TESTS = test_ScalableAllocator.$(TEST_EXT) test_ScalableAllocator_STL.$(TEST_EXT) test_malloc_compliance.$(TEST_EXT) test_malloc_regression.$(TEST_EXT) +MALLOC_OVERLOAD_TESTS = test_malloc_overload.$(TEST_EXT) test_malloc_overload_proxy.$(TEST_EXT) + +MALLOC_LIB = $(call cross_suffix,$(MALLOC.LIB)) +MALLOC_PROXY_LIB = $(call cross_suffix,$(MALLOCPROXY.LIB)) + +ifeq (windows.gcc,$(tbb_os).$(compiler)) +test_malloc_overload.$(TEST_EXT): LIBS += $(MALLOC_PROXY_LIB) +endif + +test_malloc_overload.$(TEST_EXT): test_malloc_overload.$(OBJ) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $< $(LIBDL) $(LIBS) $(LINK_FLAGS) +test_malloc_overload_proxy.$(TEST_EXT): test_malloc_overload.$(OBJ) $(MALLOC_PROXY_LIB) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $< $(LIBDL) $(MALLOC_PROXY_LIB) $(LIBS) $(LINK_FLAGS) + +test_malloc_whitebox.$(TEST_EXT): test_malloc_whitebox.cpp $(MALLOC_ASM.OBJ) tbb_misc_malloc.$(OBJ) + $(CPLUS) $(OUTPUT_KEY)$@ $(M_CPLUS_FLAGS) $(M_INCLUDES) $^ $(LIBS) $(LINK_FLAGS) + +$(MALLOC_MAIN_TESTS): %.$(TEST_EXT): %.$(OBJ) $(MALLOC_LIB) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $< $(MALLOC_LIB) $(LIBS) $(LINK_FLAGS) + +ifeq (,$(NO_C_TESTS)) +MALLOC_C_TESTS = test_malloc_pure_c.$(TEST_EXT) + +$(MALLOC_C_TESTS): %.$(TEST_EXT): %.$(OBJ) $(MALLOC_LIB) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $^ $(LIBS) $(LINK_FLAGS) +endif + +# 
run_cmd is usually empty +malloc_test: $(call cross_suffix,$(MALLOC.DLL)) $(MALLOC_MAIN_TESTS) $(MALLOC_C_TESTS) $(MALLOC_OVERLOAD_TESTS) test_malloc_whitebox.$(TEST_EXT) $(AUX_TEST_DEPENDENCIES) + $(run_cmd) ./test_malloc_whitebox.$(TEST_EXT) 1:4 + $(run_cmd) $(TEST_LAUNCHER) -l $(call cross_suffix,$(MALLOCPROXY.DLL)) test_malloc_overload.$(TEST_EXT) + $(run_cmd) $(TEST_LAUNCHER) test_malloc_overload_proxy.$(TEST_EXT) + $(run_cmd) $(TEST_LAUNCHER) test_malloc_compliance.$(TEST_EXT) 1:4 + $(run_cmd) ./test_ScalableAllocator.$(TEST_EXT) + $(run_cmd) ./test_ScalableAllocator_STL.$(TEST_EXT) + $(run_cmd) ./test_malloc_regression.$(TEST_EXT) +ifeq (,$(NO_C_TESTS)) + $(run_cmd) ./test_malloc_pure_c.$(TEST_EXT) +endif + +#------------------------------------------------------ +# End of rules for making the TBBMalloc unit tests +#------------------------------------------------------ + +# Include automatically generated dependences +-include *.d diff --git a/dep/tbb/build/Makefile.test b/dep/tbb/build/Makefile.test new file mode 100644 index 000000000..8b9c339fe --- /dev/null +++ b/dep/tbb/build/Makefile.test @@ -0,0 +1,310 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +#------------------------------------------------------------------------------ +# Define rules for making the TBB tests. +#------------------------------------------------------------------------------ +.PHONY: default test_tbb_plain test_tbb_old clean + +default: test_tbb_plain test_tbb_old + +tbb_root ?= $(TBB22_INSTALL_DIR) +BUILDING_PHASE=1 +TEST_RESOURCE = $(TBB.RES) +include $(tbb_root)/build/common.inc +DEBUG_SUFFIX=$(findstring _debug,$(call cross_cfg,_$(cfg))) + +#------------------------------------------------------------ +# Define static pattern rules dealing with .cpp source files +#------------------------------------------------------------ + +VPATH = $(tbb_root)/src/tbb/$(ASSEMBLY_SOURCE) $(tbb_root)/src/tbb $(tbb_root)/src/rml/client $(tbb_root)/src/old $(tbb_root)/src/test $(tbb_root)/src/perf + +CPLUS_FLAGS += $(if $(crosstest),$(DEFINE_KEY)__TBB_NO_IMPLICIT_LINKAGE=1) + +ifeq (1,$(TBB_NOSTRICT)) +# GNU 3.2.3 headers have a ISO syntax that is rejected by Intel compiler in -strict_ansi mode. 
+# The Mac uses gcc 4.0, so the list is empty for that platform. +# The files below need the -strict_ansi flag downgraded to -ansi to compile + +KNOWN_NOSTRICT += \ + test_concurrent_hash_map.o \ + test_concurrent_vector.o \ + test_concurrent_queue.o \ + test_enumerable_thread_specific.o \ + test_handle_perror.o \ + test_cache_aligned_allocator_STL.o \ + test_task_scheduler_init.o \ + test_model_plugin.o \ + test_parallel_do.o \ + test_lambda.o \ + test_eh_algorithms.o \ + test_parallel_sort.o \ + test_parallel_for_each.o \ + test_task_group.o \ + test_tbb_header.o \ + test_combinable.o \ + test_tbb_version.o + +endif + +include $(tbb_root)/build/common_rules.inc + +# Rule for generating executable test +%.$(TEST_EXT): %.$(OBJ) $(TBB.LIB) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $< $(LINK_TBB.LIB) $(LIBS) $(LINK_FLAGS) + +# Rules for generating a test DLL +%.$(DLL).$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(OUTPUTOBJ_KEY)$@ $(CPLUS_FLAGS_NOSTRICT) $(PIC_KEY) $(DEFINE_KEY)_USRDLL $(INCLUDES) $< +%.$(DLL): %.$(DLL).$(OBJ) $(TBB.LIB) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $(PIC_KEY) $< $(LINK_TBB.LIB) $(LIBS) $(LINK_FLAGS) $(DYLIB_KEY) + +# Rules for the tests, which use TBB in a dynamically loadable library +test_model_plugin.$(TEST_EXT): test_model_plugin.$(OBJ) test_model_plugin.$(DLL) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $< $(LIBDL) $(LIBS) $(LINK_FLAGS) + +TASK_CPP_DEPENDENCIES = $(TBB_ASM.OBJ) \ + cache_aligned_allocator.$(OBJ) \ + dynamic_link.$(OBJ) \ + tbb_misc.$(OBJ) \ + tbb_thread.$(OBJ) \ + itt_notify.$(OBJ) \ + mutex.$(OBJ) \ + spin_rw_mutex.$(OBJ) \ + spin_mutex.$(OBJ) \ + private_server.$(OBJ) \ + rml_tbb.$(OBJ) + +ifeq (,$(codecov)) + TASK_CPP_DEPENDENCIES += itt_notify_proxy.$(OBJ) +endif + +# These executables don't depend on the TBB library, but include task.cpp directly +TASK_CPP_DIRECTLY_INCLUDED = test_eh_tasks.$(TEST_EXT) \ + test_task_leaks.$(TEST_EXT) \ + test_task_assertions.$(TEST_EXT) \ + test_assembly.$(TEST_EXT) + +$(TASK_CPP_DIRECTLY_INCLUDED): WARNING_KEY += $(WARNING_SUPPRESS) + +$(TASK_CPP_DIRECTLY_INCLUDED): %.$(TEST_EXT) : %.$(OBJ) $(TASK_CPP_DEPENDENCIES) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $^ $(LIBDL) $(LIBS) $(LINK_FLAGS) + +test_handle_perror.$(TEST_EXT): test_handle_perror.$(OBJ) tbb_misc.$(OBJ) $(TBB_ASM.OBJ) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $^ $(LINK_TBB.LIB) $(LIBS) $(LINK_FLAGS) + +test_tbb_header2.$(OBJ): test_tbb_header.cpp + $(CPLUS) $(COMPILE_ONLY) $(CPLUS_FLAGS_NOSTRICT) $(CXX_ONLY_FLAGS) $(CXX_WARN_SUPPRESS) $(INCLUDES) $(DEFINE_KEY)__TBB_TEST_SECONDARY=1 $< $(OUTPUTOBJ_KEY)$@ + +# Detecting "multiple definition" linker error using the test that covers the whole library +test_tbb_header.$(TEST_EXT): test_tbb_header.$(OBJ) test_tbb_header2.$(OBJ) $(TBB.LIB) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $^ $(LINK_TBB.LIB) $(LIBS) $(LINK_FLAGS) + +# Rules for the tests, which depend on tbbmalloc +test_concurrent_hash_map_string.$(TEST_EXT): test_concurrent_hash_map_string.$(OBJ) + $(CPLUS) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $< $(LINK_TBB.LIB) $(MALLOC.LIB) $(LIBS) $(LINK_FLAGS) + +# These are in alphabetical order +TEST_TBB_PLAIN.EXE = test_assembly.$(TEST_EXT) \ + test_aligned_space.$(TEST_EXT) \ + test_task_assertions.$(TEST_EXT) \ + test_atomic.$(TEST_EXT) \ + test_blocked_range.$(TEST_EXT) \ + test_blocked_range2d.$(TEST_EXT) \ + test_blocked_range3d.$(TEST_EXT) \ + test_compiler.$(TEST_EXT) \ + test_concurrent_queue.$(TEST_EXT) \ + test_concurrent_vector.$(TEST_EXT) \ + test_concurrent_hash_map.$(TEST_EXT) \ + 
test_enumerable_thread_specific.$(TEST_EXT) \ + test_handle_perror.$(TEST_EXT) \ + test_halt.$(TEST_EXT) \ + test_lambda.$(TEST_EXT) \ + test_model_plugin.$(TEST_EXT) \ + test_mutex.$(TEST_EXT) \ + test_mutex_native_threads.$(TEST_EXT) \ + test_rwm_upgrade_downgrade.$(TEST_EXT) \ + test_cache_aligned_allocator_STL.$(TEST_EXT) \ + test_cache_aligned_allocator.$(TEST_EXT) \ + test_parallel_for.$(TEST_EXT) \ + test_parallel_reduce.$(TEST_EXT) \ + test_parallel_sort.$(TEST_EXT) \ + test_parallel_scan.$(TEST_EXT) \ + test_parallel_while.$(TEST_EXT) \ + test_parallel_do.$(TEST_EXT) \ + test_pipeline.$(TEST_EXT) \ + test_pipeline_with_tbf.$(TEST_EXT) \ + test_task_scheduler_init.$(TEST_EXT) \ + test_task_scheduler_observer.$(TEST_EXT) \ + test_task.$(TEST_EXT) \ + test_task_leaks.$(TEST_EXT) \ + test_tbb_thread.$(TEST_EXT) \ + test_tick_count.$(TEST_EXT) \ + test_inits_loop.$(TEST_EXT) \ + test_yield.$(TEST_EXT) \ + test_eh_tasks.$(TEST_EXT) \ + test_eh_algorithms.$(TEST_EXT) \ + test_parallel_invoke.$(TEST_EXT) \ + test_task_group.$(TEST_EXT) \ + test_ittnotify.$(TEST_EXT) \ + test_parallel_for_each.$(TEST_EXT) \ + test_tbb_header.$(TEST_EXT) \ + test_combinable.$(TEST_EXT) \ + test_task_auto_init.$(TEST_EXT) \ + test_tbb_version.$(TEST_EXT) # insert new files right above + +ifdef OPENMP_FLAG + TEST_TBB_PLAIN.EXE += test_tbb_openmp +test_openmp.$(TEST_EXT): test_openmp.cpp + $(CPLUS) $(OPENMP_FLAG) $(OUTPUT_KEY)$@ $(CPLUS_FLAGS) $(INCLUDES) $< $(LIBS) $(LINK_TBB.LIB) $(LINK_FLAGS) +.PHONY: test_tbb_openmp +test_tbb_openmp: test_openmp.$(TEST_EXT) + ./test_openmp.$(TEST_EXT) 1:4 + +endif + +# Run tests that are in TEST_TBB_PLAIN.EXE +# The test are ordered so that simpler components are tested first. +# If a component Y uses component X, then tests for Y should come after tests for X. +# Note that usually run_cmd is empty, and tests run directly +test_tbb_plain: $(TEST_TBB_PLAIN.EXE) + $(run_cmd) ./test_assembly.$(TEST_EXT) + $(run_cmd) ./test_compiler.$(TEST_EXT) + # Yes, 4:8 is intended on the next line. + $(run_cmd) ./test_yield.$(TEST_EXT) 4:8 + $(run_cmd) ./test_handle_perror.$(TEST_EXT) + $(run_cmd) ./test_task_auto_init.$(TEST_EXT) + $(run_cmd) ./test_task_scheduler_init.$(TEST_EXT) 1:4 + $(run_cmd) ./test_task_scheduler_observer.$(TEST_EXT) 1:4 + $(run_cmd) ./test_task_assertions.$(TEST_EXT) + $(run_cmd) ./test_task.$(TEST_EXT) 1:4 + $(run_cmd) ./test_task_leaks.$(TEST_EXT) + $(run_cmd) ./test_atomic.$(TEST_EXT) + $(run_cmd) ./test_cache_aligned_allocator.$(TEST_EXT) + $(run_cmd) ./test_cache_aligned_allocator_STL.$(TEST_EXT) + $(run_cmd) ./test_blocked_range.$(TEST_EXT) 1:4 + $(run_cmd) ./test_blocked_range2d.$(TEST_EXT) 1:4 + $(run_cmd) ./test_blocked_range3d.$(TEST_EXT) 1:4 + $(run_cmd) ./test_parallel_for.$(TEST_EXT) 1:4 + $(run_cmd) ./test_parallel_sort.$(TEST_EXT) 1:4 + $(run_cmd) ./test_aligned_space.$(TEST_EXT) + $(run_cmd) ./test_parallel_reduce.$(TEST_EXT) 1:4 + $(run_cmd) ./test_parallel_scan.$(TEST_EXT) 1:4 + $(run_cmd) ./test_parallel_while.$(TEST_EXT) 1:4 + $(run_cmd) ./test_parallel_do.$(TEST_EXT) 1:4 + $(run_cmd) ./test_inits_loop.$(TEST_EXT) + $(run_cmd) ./test_lambda.$(TEST_EXT) 1:4 + $(run_cmd) ./test_mutex.$(TEST_EXT) 1 + $(run_cmd) ./test_mutex.$(TEST_EXT) 2 + $(run_cmd) ./test_mutex.$(TEST_EXT) 4 + $(run_cmd) ./test_mutex_native_threads.$(TEST_EXT) 1:4 + $(run_cmd) ./test_rwm_upgrade_downgrade.$(TEST_EXT) 4 + # Yes, 4:8 is intended on the next line. 
+ $(run_cmd) ./test_halt.$(TEST_EXT) 4:8 + $(run_cmd) ./test_pipeline.$(TEST_EXT) 1:4 + $(run_cmd) ./test_pipeline_with_tbf.$(TEST_EXT) 1:4 + $(run_cmd) ./test_tick_count.$(TEST_EXT) 1:4 + $(run_cmd) ./test_concurrent_queue.$(TEST_EXT) 1:4 + $(run_cmd) ./test_concurrent_vector.$(TEST_EXT) 1:4 + $(run_cmd) ./test_concurrent_hash_map.$(TEST_EXT) 1:4 + $(run_cmd) ./test_enumerable_thread_specific.$(TEST_EXT) 0:4 + $(run_cmd) ./test_combinable.$(TEST_EXT) 0:4 + $(run_cmd) ./test_model_plugin.$(TEST_EXT) 4 + $(run_cmd) ./test_eh_tasks.$(TEST_EXT) 2:4 + $(run_cmd) ./test_eh_algorithms.$(TEST_EXT) 2:4 + $(run_cmd) ./test_tbb_thread.$(TEST_EXT) + $(run_cmd) ./test_parallel_invoke.$(TEST_EXT) 1:4 + $(run_cmd) ./test_task_group.$(TEST_EXT) 1:4 + $(run_cmd) ./test_ittnotify.$(TEST_EXT) 2:2 + $(run_cmd) ./test_parallel_for_each.$(TEST_EXT) 1:4 + $(run_cmd) ./test_tbb_header.$(TEST_EXT) + $(run_cmd) ./test_tbb_version.$(TEST_EXT) + +CPLUS_FLAGS_DEPRECATED = $(DEFINE_KEY)TBB_DEPRECATED=1 $(subst $(WARNING_KEY),,$(CPLUS_FLAGS_NOSTRICT)) $(WARNING_SUPPRESS) + +TEST_TBB_OLD.OBJ = test_concurrent_vector_v2.$(OBJ) test_concurrent_queue_v2.$(OBJ) test_mutex_v2.$(OBJ) + +TEST_TBB_DEPRECATED.OBJ = test_concurrent_queue_deprecated.$(OBJ) \ + test_concurrent_vector_deprecated.$(OBJ) \ + + +# For deprecated files, we don't mind warnings etc., thus compilation rules are most relaxed +$(TEST_TBB_OLD.OBJ): %.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(CPLUS_FLAGS_DEPRECATED) $(INCLUDES) $< + +%_deprecated.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(OUTPUTOBJ_KEY)$@ $(CPLUS_FLAGS_DEPRECATED) $(INCLUDES) $< + +TEST_TBB_OLD.EXE = $(subst .$(OBJ),.$(TEST_EXT),$(TEST_TBB_OLD.OBJ) $(TEST_TBB_DEPRECATED.OBJ)) + +ifeq (,$(NO_LEGACY_TESTS)) +test_tbb_old: $(TEST_TBB_OLD.EXE) + $(run_cmd) ./test_concurrent_vector_v2.$(TEST_EXT) 1:4 + $(run_cmd) ./test_concurrent_vector_deprecated.$(TEST_EXT) 1:4 + $(run_cmd) ./test_concurrent_queue_v2.$(TEST_EXT) 1:4 + $(run_cmd) ./test_concurrent_queue_deprecated.$(TEST_EXT) 1:4 + $(run_cmd) ./test_mutex_v2.$(TEST_EXT) 1 + $(run_cmd) ./test_mutex_v2.$(TEST_EXT) 2 + $(run_cmd) ./test_mutex_v2.$(TEST_EXT) 4 +else +test_tbb_old: + @echo Legacy tests skipped +endif + +ifneq (,$(codecov)) +codecov_gen: + profmerge + codecov $(if $(findstring -,$(codecov)),$(codecov),) -demang -comp $(tbb_root)/build/codecov.txt +endif + +test_% debug_%: test_%.$(TEST_EXT) $(AUX_TEST_DEPENDENCIES) +ifeq (,$(repeat)) + $(run_cmd) ./$< $(args) +else +ifeq (windows,$(tbb_os)) + for /L %%i in (1,1,$(repeat)) do echo %%i of $(repeat): && $(run_cmd) $< $(args) +else + for ((i=1;i<=$(repeat);++i)); do echo $$i of $(repeat): && $(run_cmd) ./$< $(args); done +endif +endif # repeat +ifneq (,$(codecov)) + profmerge + codecov $(if $(findstring -,$(codecov)),$(codecov),) -demang -comp $(tbb_root)/build/codecov.txt +endif + +time_%: time_%.$(TEST_EXT) $(AUX_TEST_DEPENDENCIES) + $(run_cmd) ./$< $(args) + + +clean_%: + $(RM) $*.$(OBJ) $*.exe $*.$(DLL) $*.$(LIBEXT) $*.res $*.map $*.ilk $*.pdb $*.exp $*.*manifest $*.tmp $*.d + +clean: + $(RM) *.$(OBJ) *.exe *.$(DLL) *.$(LIBEXT) *.res *.map *.ilk *.pdb *.exp *.manifest *.tmp *.d pgopti.* *.dyn core core.*[0-9][0-9] + +# Include automatically generated dependences +-include *.d diff --git a/dep/tbb/build/SunOS.gcc.inc b/dep/tbb/build/SunOS.gcc.inc new file mode 100644 index 000000000..f60073bf3 --- /dev/null +++ b/dep/tbb/build/SunOS.gcc.inc @@ -0,0 +1,99 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. 
+# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +COMPILE_ONLY = -c -MMD +PREPROC_ONLY = -E -x c +INCLUDE_KEY = -I +DEFINE_KEY = -D +OUTPUT_KEY = -o # +OUTPUTOBJ_KEY = -o # +PIC_KEY = -fPIC +WARNING_AS_ERROR_KEY = -Werror +WARNING_KEY = -Wall +DYLIB_KEY = -shared +LIBDL = -ldl + +TBB_NOSTRICT = 1 + +CPLUS = g++ +LIB_LINK_FLAGS = -shared +LIBS = -lpthread -lrt -ldl +C_FLAGS = $(CPLUS_FLAGS) -x c + +ifeq ($(cfg), release) + CPLUS_FLAGS = -O2 -DUSE_PTHREAD +endif +ifeq ($(cfg), debug) + CPLUS_FLAGS = -DTBB_USE_DEBUG -g -O0 -DUSE_PTHREAD +endif + +ASM= +ASM_FLAGS= + +TBB_ASM.OBJ= + +ifeq (ia64,$(arch)) +# Position-independent code (PIC) is a must for IA-64 + CPLUS_FLAGS += $(PIC_KEY) +endif + +ifeq (intel64,$(arch)) + CPLUS_FLAGS += -m64 + LIB_LINK_FLAGS += -m64 +endif + +ifeq (ia32,$(arch)) + CPLUS_FLAGS += -m32 + LIB_LINK_FLAGS += -m32 +endif + +# for some gcc versions on Solaris, -m64 may imply V9, but perhaps not everywhere (TODO: verify) +ifeq (sparc,$(arch)) + CPLUS_FLAGS += -mcpu=v9 -m64 + LIB_LINK_FLAGS += -mcpu=v9 -m64 +endif + +#------------------------------------------------------------------------------ +# Setting assembler data. +#------------------------------------------------------------------------------ +ASSEMBLY_SOURCE=$(arch)-gas +ifeq (ia64,$(arch)) + ASM=ias + TBB_ASM.OBJ = atomic_support.o lock_byte.o log2.o pause.o +endif +#------------------------------------------------------------------------------ +# End of setting assembler data. +#------------------------------------------------------------------------------ + +#------------------------------------------------------------------------------ +# Setting tbbmalloc data. +#------------------------------------------------------------------------------ + +M_CPLUS_FLAGS = $(CPLUS_FLAGS) -fno-rtti -fno-exceptions -fno-schedule-insns2 + +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data. +#------------------------------------------------------------------------------ diff --git a/dep/tbb/build/SunOS.inc b/dep/tbb/build/SunOS.inc new file mode 100644 index 000000000..a3b378ab7 --- /dev/null +++ b/dep/tbb/build/SunOS.inc @@ -0,0 +1,90 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. 
+# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +ifndef arch + arch:=$(shell uname -p) + ifeq ($(arch),i386) + ifeq ($(shell isainfo -b),64) + arch:=intel64 + else + arch:=ia32 + endif + endif + export arch +# For non-IA systems running Sun OS, 'arch' will contain whatever is printed by uname -p. +# In particular, for SPARC architecture it will contain "sparc". +endif + +ifndef runtime + gcc_version:=$(shell gcc -v 2>&1 | grep 'gcc version' | sed -e 's/^gcc version //' | sed -e 's/ .*$$//') + os_version:=$(shell uname -r) + os_kernel_version:=$(shell uname -r | sed -e 's/-.*$$//') + export runtime:=cc$(gcc_version)_kernel$(os_kernel_version) +endif + +native_compiler := suncc +export compiler ?= suncc +# debugger ?= gdb + +CMD=$(SHELL) -c +CWD=$(shell pwd) +RM?=rm -f +RD?=rmdir +MD?=mkdir -p +NUL= /dev/null +SLASH=/ +MAKE_VERSIONS=bash $(tbb_root)/build/version_info_sunos.sh $(CPLUS) $(CPLUS_FLAGS) $(INCLUDES) >version_string.tmp +MAKE_TBBVARS=bash $(tbb_root)/build/generate_tbbvars.sh + +ifeq ($(compiler),suncc) + export TBB_CUSTOM_VARS_SH=CXXFLAGS="-I$(CWD)/../include -library=stlport4 $(CXXFLAGS) -M$(CWD)/../build/suncc.map.pause" + export TBB_CUSTOM_VARS_CSH=CXXFLAGS "-I$(CWD)/../include -library=stlport4 $(CXXFLAGS) -M$(CWD)/../build/suncc.map.pause" +endif + +ifdef LD_LIBRARY_PATH + export LD_LIBRARY_PATH := .:$(LD_LIBRARY_PATH) +else + export LD_LIBRARY_PATH := . +endif + +####### Build settings ######################################################## + +OBJ = o +DLL = so + +TBB.DEF = +TBB.DLL = libtbb$(DEBUG_SUFFIX).$(DLL) +TBB.LIB = $(TBB.DLL) +LINK_TBB.LIB = $(TBB.LIB) + +MALLOC.DLL = libtbbmalloc$(DEBUG_SUFFIX).$(DLL) +MALLOC.LIB = $(MALLOC.DLL) + +MALLOCPROXY.DLL = libtbbmalloc_proxy$(DEBUG_SUFFIX).$(DLL) + +TBB_NOSTRICT=1 + +TEST_LAUNCHER=sh $(tbb_root)/build/test_launcher.sh diff --git a/dep/tbb/build/SunOS.suncc.inc b/dep/tbb/build/SunOS.suncc.inc new file mode 100644 index 000000000..9aac11756 --- /dev/null +++ b/dep/tbb/build/SunOS.suncc.inc @@ -0,0 +1,95 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. 
+# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +COMPILE_ONLY = -c -xMMD -errtags +PREPROC_ONLY = -E -xMMD +INCLUDE_KEY = -I +DEFINE_KEY = -D +OUTPUT_KEY = -o # +OUTPUTOBJ_KEY = -o # +PIC_KEY = -KPIC +DYLIB_KEY = -G +LIBDL = -ldl +# WARNING_AS_ERROR_KEY = -errwarn=%all +WARNING_AS_ERROR_KEY = Warning as error +WARNING_SUPPRESS = -erroff=unassigned,attrskipunsup,badargtype2w,badbinaryopw,wbadasg,wvarhidemem +tbb_strict=0 + +TBB_NOSTRICT = 1 + +CPLUS = CC +CONLY = cc +LIB_LINK_FLAGS = -G -R . -M$(tbb_root)/build/suncc.map.pause +LINK_FLAGS += -M$(tbb_root)/build/suncc.map.pause +LIBS = -lpthread -lrt -R . +C_FLAGS = $(CPLUS_FLAGS) + +ifeq ($(cfg), release) + CPLUS_FLAGS = -mt -xO2 -library=stlport4 -DUSE_PTHREAD $(WARNING_SUPPRESS) +endif +ifeq ($(cfg), debug) + CPLUS_FLAGS = -mt -DTBB_USE_DEBUG -g -library=stlport4 -DUSE_PTHREAD $(WARNING_SUPPRESS) +endif + +ASM= +ASM_FLAGS= + +TBB_ASM.OBJ= + +ifeq (intel64,$(arch)) + CPLUS_FLAGS += -m64 + ASM_FLAGS += -m64 + LIB_LINK_FLAGS += -m64 +endif + +ifeq (ia32,$(arch)) + CPLUS_FLAGS += -m32 + LIB_LINK_FLAGS += -m32 +endif + +# TODO: verify whether -m64 implies V9 on relevant Sun Studio versions +# (those that handle gcc assembler syntax) +ifeq (sparc,$(arch)) + CPLUS_FLAGS += -m64 + LIB_LINK_FLAGS += -m64 +endif + +#------------------------------------------------------------------------------ +# Setting assembler data. +#------------------------------------------------------------------------------ +ASSEMBLY_SOURCE=$(arch)-fbe +#------------------------------------------------------------------------------ +# End of setting assembler data. +#------------------------------------------------------------------------------ + +#------------------------------------------------------------------------------ +# Setting tbbmalloc data. +#------------------------------------------------------------------------------ +M_INCLUDES = $(INCLUDES) -I$(MALLOC_ROOT) -I$(MALLOC_SOURCE_ROOT) +M_CPLUS_FLAGS = $(CPLUS_FLAGS) +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data. 
+#------------------------------------------------------------------------------ diff --git a/dep/tbb/build/codecov.txt b/dep/tbb/build/codecov.txt new file mode 100644 index 000000000..e22f8059a --- /dev/null +++ b/dep/tbb/build/codecov.txt @@ -0,0 +1,7 @@ +src/tbb +src/tbbmalloc +include/tbb +src/rml/server +src/rml/client +src/rml/include +source/malloc diff --git a/dep/tbb/build/common.inc b/dep/tbb/build/common.inc new file mode 100644 index 000000000..4ccb36ade --- /dev/null +++ b/dep/tbb/build/common.inc @@ -0,0 +1,97 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +ifndef tbb_os + ifeq ($(OS), Windows_NT) + export tbb_os=windows + else + OS:=$(shell uname) + ifeq ($(OS),) + $(error "$(OS) is not supported") + else + export tbb_os=$(OS) + ifeq ($(OS), Linux) + export tbb_os=linux + endif + ifeq ($(OS), Darwin) + export tbb_os=macos + endif + endif # OS successfully detected + endif # !Windows +endif # !tbb_os + +ifeq ($(wildcard $(tbb_root)/build/$(tbb_os).inc),) + $(error "$(tbb_os)" is not supported. Add build/$(tbb_os).inc file with os-specific settings ) +endif + +# detect arch and runtime versions, provide common os-specific definitions +include $(tbb_root)/build/$(tbb_os).inc + +ifeq ($(arch),) + $(error Architecture not detected) +endif +ifeq ($(runtime),) + $(error Runtime version not detected) +endif +ifeq ($(wildcard $(tbb_root)/build/$(tbb_os).$(compiler).inc),) + $(error Compiler "$(compiler)" is not supported on $(tbb_os). 
Add build/$(tbb_os).$(compiler).inc file with compiler-specific settings ) +endif + +# Support for running debug tests to release library and vice versa +flip_cfg=$(subst _flipcfg,_release,$(subst _release,_debug,$(subst _debug,_flipcfg,$(1)))) +cross_cfg = $(if $(crosstest),$(call flip_cfg,$(1)),$(1)) + +ifdef BUILDING_PHASE + # Setting default configuration to release + cfg?=release + # No lambas or other C++0x extensions by default for compilers that implement them as experimental features + lambdas ?= 0 + cpp0x ?= 0 + # include compiler-specific build configurations + -include $(tbb_root)/build/$(tbb_os).$(compiler).inc + ifdef extra_inc + -include $(tbb_root)/build/$(extra_inc) + endif +endif +ifneq ($(BUILDING_PHASE),1) + # definitions for top-level Makefiles + origin_build_dir:=$(origin tbb_build_dir) + tbb_build_dir?=$(tbb_root)$(SLASH)build + tbb_build_prefix?=$(tbb_os)_$(arch)_$(compiler)_$(runtime) + work_dir=$(tbb_build_dir)$(SLASH)$(tbb_build_prefix) + ifneq ($(BUILDING_PHASE),0) + work_dir:=$(work_dir) + # assign new value for tbb_root if path is not absolute (the filter keeps only /* paths) + ifeq ($(filter /% $(SLASH)%, $(subst :, ,$(tbb_root)) ),) + ifeq ($(origin_build_dir),undefined) + override tbb_root:=../.. + else + override tbb_root:=$(CWD)/$(tbb_root) + endif + endif + export tbb_root + endif # BUILDING_PHASE != 0 +endif # BUILDING_PHASE != 1 diff --git a/dep/tbb/build/common_rules.inc b/dep/tbb/build/common_rules.inc new file mode 100644 index 000000000..5957af5ed --- /dev/null +++ b/dep/tbb/build/common_rules.inc @@ -0,0 +1,125 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +.PRECIOUS: %.$(OBJ) %.$(DLL).$(OBJ) %.exe + +ifeq ($(tbb_strict),1) + ifeq ($(WARNING_AS_ERROR_KEY),) + $(error WARNING_AS_ERROR_KEY is empty) + endif + # Do not remove line below! 
+ WARNING_KEY += $(WARNING_AS_ERROR_KEY) +endif + +ifndef TEST_EXT + TEST_EXT = exe +endif + +INCLUDES += $(INCLUDE_KEY)$(tbb_root)/src $(INCLUDE_KEY)$(tbb_root)/src/rml/include $(INCLUDE_KEY)$(tbb_root)/include + +CPLUS_FLAGS += $(WARNING_KEY) $(CXXFLAGS) +LINK_FLAGS += $(LDFLAGS) +LIB_LINK_FLAGS += $(LDFLAGS) +CPLUS_FLAGS_NOSTRICT:=$(subst -strict_ansi,-ansi,$(CPLUS_FLAGS)) + +LIB_LINK_CMD ?= $(CPLUS) $(PIC_KEY) +ifeq ($(origin LIB_OUTPUT_KEY), undefined) + LIB_OUTPUT_KEY = $(OUTPUT_KEY) +endif +ifeq ($(origin LIB_LINK_LIBS), undefined) + LIB_LINK_LIBS = $(LIBDL) $(LIBS) +endif + +CONLY ?= $(CPLUS) + +# The most generic rules +%.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(CPLUS_FLAGS) $(CXX_ONLY_FLAGS) $(CXX_WARN_SUPPRESS) $(INCLUDES) $< + +%.$(OBJ): %.c + $(CONLY) $(COMPILE_ONLY) $(C_FLAGS) $(INCLUDES) $< + +%.$(OBJ): %.asm + $(ASM) $(ASM_FLAGS) $< + +%.$(OBJ): %.s + cpp <$< | grep -v '^#' >$*.tmp + $(ASM) $(ASM_FLAGS) -o $@ $*.tmp + rm $*.tmp + +# Rule for generating .E file if needed for visual inspection +%.E: %.cpp + $(CPLUS) $(CPLUS_FLAGS) $(CXX_ONLY_FLAGS) $(INCLUDES) $(PREPROC_ONLY) $< >$@ + +# TODO Rule for generating .asm file if needed for visual inspection +%.asm: %.cpp + $(CPLUS) /c /Fa $(CPLUS_FLAGS) $(CXX_ONLY_FLAGS) $(INCLUDES) $< + +# TODO Rule for generating .s file if needed for visual inspection +%.s: %.cpp + $(CPLUS) -S $(CPLUS_FLAGS) $(CXX_ONLY_FLAGS) $(INCLUDES) $< + +# Customizations + +ifeq (1,$(TBB_NOSTRICT)) +# GNU 3.2.3 headers have a ISO syntax that is rejected by Intel compiler in -strict_ansi mode. +# The Mac uses gcc, so the list is empty for that platform. +# The files below need the -strict_ansi flag downgraded to -ansi to compile + +$(KNOWN_NOSTRICT): %.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(CPLUS_FLAGS_NOSTRICT) $(CXX_ONLY_FLAGS) $(INCLUDES) $< +endif + +$(KNOWN_WARNINGS): %.$(OBJ): %.cpp + $(CPLUS) $(COMPILE_ONLY) $(subst $(WARNING_KEY),,$(CPLUS_FLAGS_NOSTRICT)) $(CXX_ONLY_FLAGS) $(INCLUDES) $< + +tbb_misc.$(OBJ): tbb_misc.cpp version_string.tmp + $(CPLUS) $(COMPILE_ONLY) $(CPLUS_FLAGS_NOSTRICT) $(CXX_ONLY_FLAGS) $(INCLUDE_KEY). $(INCLUDES) $< + +tbb_misc.E: tbb_misc.cpp version_string.tmp + $(CPLUS) $(CPLUS_FLAGS_NOSTRICT) $(CXX_ONLY_FLAGS) $(INCLUDE_KEY). $(INCLUDES) $(PREPROC_ONLY) $< >$@ + +%.res: %.rc version_string.tmp $(TBB.MANIFEST) + rc /Fo$@ $(INCLUDES) $(filter /D%,$(CPLUS_FLAGS)) $< + +tbbvars: + $(MAKE_TBBVARS) + +ifneq (,$(TBB.MANIFEST)) +$(TBB.MANIFEST): + cmd /C "echo #include ^ >tbbmanifest.c" + cmd /C "echo int main(){return 0;} >>tbbmanifest.c" + cl $(C_FLAGS) tbbmanifest.c + +version_string.tmp: $(TBB.MANIFEST) + $(MAKE_VERSIONS) + cmd /C "echo #define TBB_MANIFEST 1 >> version_string.tmp" + +else +version_string.tmp: + $(MAKE_VERSIONS) +endif + diff --git a/dep/tbb/build/detect.js b/dep/tbb/build/detect.js new file mode 100644 index 000000000..b11c95497 --- /dev/null +++ b/dep/tbb/build/detect.js @@ -0,0 +1,129 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. 
+// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + +function doWork() { + var WshShell = WScript.CreateObject("WScript.Shell"); + + var fso = new ActiveXObject("Scripting.FileSystemObject"); + + var tmpExec; + + if ( WScript.Arguments.Count() > 1 && WScript.Arguments(1) == "gcc" ) { + if ( WScript.Arguments(0) == "/arch" ) { + WScript.Echo( "ia32" ); + } + else if ( WScript.Arguments(0) == "/runtime" ) { + WScript.Echo( "mingw" ); + } + return; + } + + //Compile binary + tmpExec = WshShell.Exec("cmd /c echo int main(){return 0;} >detect.c"); + while ( tmpExec.Status == 0 ) { + WScript.Sleep(100); + } + + tmpExec = WshShell.Exec("cl /MD detect.c /link /MAP"); + while ( tmpExec.Status == 0 ) { + WScript.Sleep(100); + } + + if ( WScript.Arguments(0) == "/arch" ) { + //read compiler banner + var clVersion = tmpExec.StdErr.ReadAll(); + + //detect target architecture + var intel64=/AMD64|EM64T|x64/mgi; + var ia64=/IA-64|Itanium/mgi; + var ia32=/80x86/mgi; + if ( clVersion.match(intel64) ) { + WScript.Echo( "intel64" ); + } else if ( clVersion.match(ia64) ) { + WScript.Echo( "ia64" ); + } else if ( clVersion.match(ia32) ) { + WScript.Echo( "ia32" ); + } else { + WScript.Echo( "unknown" ); + } + } + + if ( WScript.Arguments(0) == "/runtime" ) { + //read map-file + var map = fso.OpenTextFile("detect.map", 1, 0); + var mapContext = map.readAll(); + map.Close(); + + //detect runtime + var vc71=/MSVCR71\.DLL/mgi; + var vc80=/MSVCR80\.DLL/mgi; + var vc90=/MSVCR90\.DLL/mgi; + var vc100=/MSVCR100\.DLL/mgi; + var psdk=/MSVCRT\.DLL/mgi; + if ( mapContext.match(vc71) ) { + WScript.Echo( "vc7.1" ); + } else if ( mapContext.match(vc80) ) { + WScript.Echo( "vc8" ); + } else if ( mapContext.match(vc90) ) { + WScript.Echo( "vc9" ); + } else if ( mapContext.match(vc100) ) { + WScript.Echo( "vc10" ); + } else if ( mapContext.match(psdk) ) { + // Our current naming convention assumes vc7.1 for 64-bit Windows PSDK + WScript.Echo( "vc7.1" ); + } else { + WScript.Echo( "unknown" ); + } + } + + // delete intermediate files + if ( fso.FileExists("detect.c") ) + fso.DeleteFile ("detect.c", false); + if ( fso.FileExists("detect.obj") ) + fso.DeleteFile ("detect.obj", false); + if ( fso.FileExists("detect.map") ) + fso.DeleteFile ("detect.map", false); + if ( fso.FileExists("detect.exe") ) + fso.DeleteFile ("detect.exe", false); + if ( fso.FileExists("detect.exe.manifest") ) + fso.DeleteFile ("detect.exe.manifest", false); +} + +if ( WScript.Arguments.Count() > 0 ) { + + try { + doWork(); + } catch( error ) + { + WScript.Echo( "unknown" ); + WScript.Quit( 0 ); + } + +} else { + + WScript.Echo( "/arch or /runtime should be set" ); +} + diff --git a/dep/tbb/build/generate_tbbvars.bat b/dep/tbb/build/generate_tbbvars.bat new file mode 100644 index 000000000..0a2088589 --- /dev/null +++ 
b/dep/tbb/build/generate_tbbvars.bat @@ -0,0 +1,98 @@ +@echo off +REM +REM Copyright 2005-2009 Intel Corporation. All Rights Reserved. +REM +REM This file is part of Threading Building Blocks. +REM +REM Threading Building Blocks is free software; you can redistribute it +REM and/or modify it under the terms of the GNU General Public License +REM version 2 as published by the Free Software Foundation. +REM +REM Threading Building Blocks is distributed in the hope that it will be +REM useful, but WITHOUT ANY WARRANTY; without even the implied warranty +REM of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +REM GNU General Public License for more details. +REM +REM You should have received a copy of the GNU General Public License +REM along with Threading Building Blocks; if not, write to the Free Software +REM Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +REM +REM As a special exception, you may use this file as part of a free software +REM library without restriction. Specifically, if other files instantiate +REM templates or use macros or inline functions from this file, or you compile +REM this file and link it with other files to produce an executable, this +REM file does not by itself cause the resulting executable to be covered by +REM the GNU General Public License. This exception does not however +REM invalidate any other reasons why the executable file might be covered by +REM the GNU General Public License. +REM +if exist tbbvars.bat exit +echo Generating tbbvars.bat +echo @echo off>tbbvars.bat +setlocal +for %%D in ("%tbb_root%") do set actual_root=%%~fD +if x%1==x goto without + +echo SET TBB22_INSTALL_DIR=%actual_root%>>tbbvars.bat +echo SET TBB_ARCH_PLATFORM=%arch%\%runtime%>>tbbvars.bat +echo SET INCLUDE=%%TBB22_INSTALL_DIR%%\include;%%INCLUDE%%>>tbbvars.bat +echo SET LIB=%%TBB22_INSTALL_DIR%%\build\%1;%%LIB%%>>tbbvars.bat +echo SET PATH=%%TBB22_INSTALL_DIR%%\build\%1;%%PATH%%>>tbbvars.bat + +if exist tbbvars.sh goto skipsh +set fslash_root=%actual_root:\=/% +echo Generating tbbvars.sh +echo #!/bin/sh>tbbvars.sh +echo export TBB22_INSTALL_DIR="%fslash_root%">>tbbvars.sh +echo TBB_ARCH_PLATFORM="%arch%\%runtime%">>tbbvars.sh +echo if [ -z "${PATH}" ]; then>>tbbvars.sh +echo export PATH="${TBB22_INSTALL_DIR}/build/%1">>tbbvars.sh +echo else>>tbbvars.sh +echo export PATH="${TBB22_INSTALL_DIR}/build/%1;$PATH">>tbbvars.sh +echo fi>>tbbvars.sh +echo if [ -z "${LIB}" ]; then>>tbbvars.sh +echo export LIB="${TBB22_INSTALL_DIR}/build/%1">>tbbvars.sh +echo else>>tbbvars.sh +echo export LIB="${TBB22_INSTALL_DIR}/build/%1;$LIB">>tbbvars.sh +echo fi>>tbbvars.sh +echo if [ -z "${INCLUDE}" ]; then>>tbbvars.sh +echo export INCLUDE="${TBB22_INSTALL_DIR}/include">>tbbvars.sh +echo else>>tbbvars.sh +echo export INCLUDE="${TBB22_INSTALL_DIR}/include;$INCLUDE">>tbbvars.sh +echo fi>>tbbvars.sh +:skipsh + +if exist tbbvars.csh goto skipcsh +echo Generating tbbvars.csh +echo #!/bin/csh>tbbvars.csh +echo setenv TBB22_INSTALL_DIR "%actual_root%">>tbbvars.csh +echo setenv TBB_ARCH_PLATFORM "%arch%\%runtime%">>tbbvars.csh +echo if (! $?PATH) then>>tbbvars.csh +echo setenv PATH "${TBB22_INSTALL_DIR}\build\%1">>tbbvars.csh +echo else>>tbbvars.csh +echo setenv PATH "${TBB22_INSTALL_DIR}\build\%1;$PATH">>tbbvars.csh +echo endif>>tbbvars.csh +echo if (! $?LIB) then>>tbbvars.csh +echo setenv LIB "${TBB22_INSTALL_DIR}\build\%1">>tbbvars.csh +echo else>>tbbvars.csh +echo setenv LIB "${TBB22_INSTALL_DIR}\build\%1;$LIB">>tbbvars.csh +echo endif>>tbbvars.csh +echo if (! 
$?INCLUDE) then>>tbbvars.csh +echo setenv INCLUDE "${TBB22_INSTALL_DIR}\include">>tbbvars.csh +echo else>>tbbvars.csh +echo setenv INCLUDE "${TBB22_INSTALL_DIR}\include;$INCLUDE">>tbbvars.csh +echo endif>>tbbvars.csh +) +:skipcsh +exit + +:without +set bin_dir=%CD% +echo SET tbb_root=%actual_root%>>tbbvars.bat +echo SET tbb_bin=%bin_dir%>>tbbvars.bat +echo SET TBB_ARCH_PLATFORM=%arch%\%runtime%>>tbbvars.bat +echo SET INCLUDE="%%tbb_root%%\include";%%INCLUDE%%>>tbbvars.bat +echo SET LIB="%%tbb_bin%%";%%LIB%%>>tbbvars.bat +echo SET PATH="%%tbb_bin%%";%%PATH%%>>tbbvars.bat + +endlocal diff --git a/dep/tbb/build/generate_tbbvars.sh b/dep/tbb/build/generate_tbbvars.sh new file mode 100644 index 000000000..1e1b02c58 --- /dev/null +++ b/dep/tbb/build/generate_tbbvars.sh @@ -0,0 +1,132 @@ +#!/bin/bash +# +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +# Script used to generate tbbvars.[c]sh scripts +bin_dir="$PWD" # +cd "$tbb_root" # keep this comments here +tbb_root="$PWD" # to make it unsensible +cd "$bin_dir" # to EOL encoding +[ "`uname`" = "Darwin" ] && dll_path="DYLD_LIBRARY_PATH" || dll_path="LD_LIBRARY_PATH" # +custom_exp="$CXXFLAGS" # +if [ -z "$TBB_CUSTOM_VARS_SH" ]; then # +custom_exp_sh="" # +else # +custom_exp_sh="export $TBB_CUSTOM_VARS_SH" # +fi # +if [ -z "$TBB_CUSTOM_VARS_CSH" ]; then # +custom_exp_csh="" # +else # +custom_exp_csh="setenv $TBB_CUSTOM_VARS_CSH" # +fi # +if [ -z "$1" ]; then # custom tbb_build_dir, can't make with TBB_INSTALL_DIR +[ -f ./tbbvars.sh ] || cat >./tbbvars.sh <./tbbvars.csh <./tbbvars.sh <./tbbvars.csh < + + +

Overview

+This directory contains the internal Makefile infrastructure for Threading Building Blocks. + +

+See below for how to build TBB and how to port TBB +to a new platform, operating system or architecture. +

+ +

Files

+The files here are not intended to be used directly. See below for usage. +
+
Makefile.tbb +
Main Makefile to build the TBB library. + Invoked via 'make tbb' from top-level Makefile. +
Makefile.tbbmalloc +
Main Makefile to build the TBB scalable memory allocator library as well as its tests. + Invoked via 'make tbbmalloc' from top-level Makefile. +
Makefile.test +
Main Makefile to build and run the tests for the TBB library. + Invoked via 'make test' from top-level Makefile. +
common.inc +
Main common included Makefile that includes OS-specific and compiler-specific Makefiles. +
<os>.inc +
OS-specific Makefile for a particular <os>. +
<os>.<compiler>.inc +
Compiler-specific Makefile for a particular <os> / <compiler> combination. +
*.sh +
Infrastructure utilities for Linux*, Mac OS* X, and UNIX*-related systems. +
*.js, *.bat +
Infrastructure utilities for Windows* systems. +
+ +

To Build

+

+To port TBB to a new platform, operating system or architecture, see the porting directions below. +

+ +

Software prerequisites:

+
    +
  1. C++ compiler for the platform, operating system and architecture of interest. + Either the native compiler for your system, or, optionally, the appropriate Intel® C++ compiler, may be used. +
  2. GNU make utility. On Windows*, if a UNIX* emulator is used to run GNU make, + it should be able to run Windows* utilities and commands. On Linux*, Mac OS* X, etc., + shell commands issued by GNU make should execute in a Bourne or BASH compatible shell. +
+ +

+TBB libraries can be built by performing the following steps. +On systems that support only one ABI (e.g., 32-bit), these steps build the libraries for that ABI. +On systems that support both 64-bit and 32-bit libraries, these steps build the 64-bit libraries +(Linux*, Mac OS* X, and related systems) or whichever ABI is selected in the development environment (Windows* systems). +

+
    +
  1. Change to the top-level directory of the installed software. +
  2. If using the Intel® C++ compiler, make sure the appropriate compiler is available in your PATH + (e.g., by sourcing the appropriate iccvars script for the compiler to be used). +
  3. Invoke GNU make using no arguments, for example, 'gmake'. +
+ +
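+
+For illustration only, the steps above amount to a shell session like the following
+(the directory name is a placeholder, and on many systems GNU make is invoked as
+'make' rather than 'gmake'):
+
+    cd /path/to/tbb     # top-level directory of the installed sources (placeholder path)
+    gmake               # build the release and debug TBB libraries for the default ABI
+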

+To build TBB libraries for other than the default ABI (e.g., to build 32-bit libraries on Linux*, Mac OS* X, +or related systems that support both 64-bit and 32-bit libraries), perform the following steps. +

+
    +
  1. Change to the top-level directory of the installed software. +
  2. If using the Intel® C++ compiler, make sure the appropriate compiler is available in your PATH + (e.g., by sourcing the appropriate iccvars script for the compiler to be used). +
  3. Invoke GNU make as follows: 'gmake arch=ia32'. +
+ +
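+
+For illustration only, building the 32-bit libraries on a system whose default ABI is 64-bit
+would look like the following (placeholder path; a 32-bit-capable toolchain and libraries
+must be installed):
+
+    cd /path/to/tbb
+    gmake arch=ia32     # build 32-bit release and debug libraries
+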

The default make target will build the release and debug versions of the TBB library.

+

Other targets are available in the top-level Makefile. You might find the following targets useful: +

    +
  • 'make test' will build and run TBB unit-tests; +
  • 'make examples' will build and run TBB examples; +
  • 'make all' will do all of the above. +
+See also the list of other targets below. +

+ +

+By default, the libraries will be built in sub-directories within the build/ directory. +The sub-directories are named according to the operating system, architecture, compiler and software environment used +(the sub-directory names also distinguish release vs. debug libraries). On Linux*, the software environment comprises +the GCC, libc and kernel version used. On Mac OS* X, the software environment comprises the GCC and OS version used. +On Windows, the software environment comprises the Microsoft* Visual Studio* version used. +See below for how to change the default build directory. +

+ +
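+
+For illustration, the pieces of information that linux.inc combines into the build
+sub-directory name (compiler, libc and kernel versions) can be queried by hand with
+essentially the same commands the Makefiles use; the exact directory name is composed
+by the Makefile infrastructure, so treat this only as a sketch:
+
+    gcc --version | head -n 1    # compiler version (the 'cc<version>' component)
+    getconf GNU_LIBC_VERSION     # libc version (the 'libc<version>' component)
+    uname -r                     # kernel version (the 'kernel<version>' component)
+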

+To perform different build and/or test operations, use the following steps. +

+
    +
  1. Change to the top-level directory of the installed software. +
  2. If using the Intel® C++ compiler, make sure the appropriate compiler is available in your PATH + (e.g., by sourcing the appropriate iccvars script for the compiler to be used). +
  3. Invoke GNU make by using one or more of the following commands. +
    +
    make +
    Default build. Equivalent to 'make tbb tbbmalloc'. +
    make all +
    Equivalent to 'make tbb tbbmalloc test examples'. +
    cd src;make release +
    Build and test release libraries only. +
    cd src;make debug +
    Build and test debug libraries only. +
    make tbb +
    Make TBB release and debug libraries. +
    make tbbmalloc +
    Make TBB scalable memory allocator libraries. +
    make test +
    Compile and run the unit-tests. +
    make examples +
    Build libraries and run all examples, like doing 'make debug clean release' from + the general example Makefile. +
    make compiler={icl, icc} [(above options or targets)] +
    Build and run as above, but use Intel® compilers instead of default, native compilers + (e.g., icl instead of cl.exe on Windows* systems, or icc instead of g++ on Linux* or Mac OS* X systems). +
    make arch={ia32, intel64, ia64} [(above options or targets)] +
    Build and run as above, but build libraries for the selected ABI. + This might be useful for cross-compilation; ensure the proper environment is set before running this command. +
    make tbb_root={(TBB directory)} [(above options or targets)] +
    Build and run as above; for use when invoking 'make' from a directory other than + the top-level directory. +
    make tbb_build_dir={(build directory)} [(above options or targets)] +
    Build and run as above, but place the built libraries in the specified directory, rather than in the default + sub-directory within the build/ directory. This command might fail if the sources are installed in a directory whose path contains spaces. +
    make tbb_build_prefix={(build sub-directory)} [(above options or targets)] +
    Build and run as above, but place the built libraries in the specified sub-directory within the build/ directory, + rather than using the default sub-directory name. +
    make [(above options)] clean +
    Remove any executables or intermediate files produced by the above commands. + Includes build directories, object files, libraries and test executables. +
    +
+ +
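+
+For illustration only, a few typical combined invocations of the targets and options listed above,
+all run from the top-level directory (adjust the options to your system):
+
+    make tbb tbbmalloc                      # same as the default 'make'
+    make compiler=icc arch=intel64 test     # use the Intel compiler and run the unit-tests
+    make tbb_build_prefix=mybuild tbb       # use 'mybuild' as the build/ sub-directory prefix
+    make clean                              # remove what the above commands produced
+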

To Port

+

+This section provides information on how to port TBB to a new platform, operating system or architecture. +A subset or a superset of these steps may be required for porting to a given platform. +

+ +

To port the TBB source code:

+
    +
  1. If porting to a new architecture, create a file that describes the architecture-specific details for that architecture. +
      +
    • Create a <os>_<architecture>.h file in the include/tbb/machine directory + that describes these details. +
        +
      • The <os>_<architecture>.h is named after the operating system and architecture as recognized by + include/tbb/tbb_machine.h and the Makefile infrastructure. +
      • This file defines the implementations of synchronization operations, and also the + scheduler yield function, for the operating system and architecture. +
      • Several examples of <os>_<architecture>.h files can be found in the + include/tbb/machine directory. +
          +
        • A minimal implementation defines the 4-byte and 8-byte compare-and-swap operations, + and the scheduler yield function. See include/tbb/machine/mac_ppc.h + for an example of a minimal implementation. +
        • More complex implementations can also be found in the + include/tbb/machine directory; these provide all the individual variants of synchronization operations that TBB uses. + Such implementations are more verbose but may achieve better performance on a given architecture. +
        • In a given implementation, any synchronization operation that is not defined is implemented, by default, + in terms of 4-byte or 8-byte compare-and-swap. More operations can thus be added incrementally to increase + the performance of an implementation. +
        • In most cases, synchronization operations are implemented as inline assembly code; examples also exist + (e.g., for Intel® Itanium® processors) that use out-of-line assembly code in *.s or *.asm files + (see the assembly code sub-directories in the src/tbb directory). +
        +
      +
    • Modify include/tbb/tbb_machine.h, if needed, to invoke the appropriate + <os>_<architecture>.h file in the include/tbb/machine directory. +
    +
  2. Add an implementation of DetectNumberOfWorkers() in src/tbb/tbb_misc.h, + if needed, that returns the number of cores found on the system. This is used to determine the default + number of threads for the TBB task scheduler. +
  3. Either properly define FillDynamicLinks for use in + src/tbb/cache_aligned_allocator.cpp, + or hardcode the allocator to be used. +
  4. Additional types might be required in the union defined in + include/tbb/aligned_space.h + to ensure proper alignment on your platform. +
  5. Changes may be required in include/tbb/tick_count.h + for systems that do not provide gettimeofday. +
+ +

To port the Makefile infrastructure:

+Modify the appropriate files in the Makefile infrastructure to add a new platform, operating system or architecture as needed. +See the Makefile infrastructure files for examples. +
    +
  1. The top-level Makefile includes common.inc to determine the operating system. +
      +
    • To add a new operating system, add the appropriate test to common.inc, + and create the needed <os>.inc and <os>.<compiler>.inc files (see below). +
    +
  2. The <os>.inc file makes OS-specific settings for a particular <os>. +
      +
    • For example, linux.inc makes settings specific to Linux* systems. +
    • This file performs OS-dependent tests to determine the specific platform and/or architecture, + and sets other platform-dependent values. +
    • Add a new <os>.inc file for each new operating system added. +
    +
  3. The <os>.<compiler>.inc file makes compiler-specific settings for a particular + <os> / <compiler> combination. +
      +
    • For example, linux.gcc.inc makes specific settings for using GCC on Linux* systems, + and linux.icc.inc makes specific settings for using the Intel® C++ compiler on Linux* systems. +
    • This file sets particular compiler, assembler and linker options required when using a particular + <os> / <compiler> combination. +
    • Add a new <os>.<compiler>.inc file for each new <os> / <compiler> combination added. +
    +
+ +
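+
+For illustration, the operating-system and architecture detection that common.inc and a new
+<os>.inc typically wrap in $(shell ...) calls boils down to commands such as the following
+(these particular commands are what linux.inc uses; a new port would substitute its own
+equivalents):
+
+    uname       # operating-system name, used to select the <os>.inc file
+    uname -m    # machine architecture, mapped to ia32/intel64/ia64/sparc in linux.inc
+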
+Up to parent directory +

+Copyright © 2005-2009 Intel Corporation. All Rights Reserved. +

+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are +registered trademarks or trademarks of Intel Corporation or its +subsidiaries in the United States and other countries. +

+* Other names and brands may be claimed as the property of others. + + diff --git a/dep/tbb/build/linux.gcc.inc b/dep/tbb/build/linux.gcc.inc new file mode 100644 index 000000000..05b3b3fff --- /dev/null +++ b/dep/tbb/build/linux.gcc.inc @@ -0,0 +1,107 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +COMPILE_ONLY = -c -MMD +PREPROC_ONLY = -E -x c +INCLUDE_KEY = -I +DEFINE_KEY = -D +OUTPUT_KEY = -o # +OUTPUTOBJ_KEY = -o # +PIC_KEY = -fPIC +WARNING_AS_ERROR_KEY = -Werror +WARNING_KEY = -Wall +WARNING_SUPPRESS = -Wno-parentheses +RML_WARNING_SUPPRESS = -Wno-non-virtual-dtor +DYLIB_KEY = -shared +LIBDL = -ldl + +TBB_NOSTRICT = 1 + +CPLUS = g++ +CONLY = gcc +LIB_LINK_FLAGS = -shared -Wl,-soname=$(BUILDING_LIBRARY) +LIBS = -lpthread -lrt +C_FLAGS = $(CPLUS_FLAGS) + +ifeq ($(cfg), release) + CPLUS_FLAGS = -DDO_ITT_NOTIFY -O2 -DUSE_PTHREAD +endif +ifeq ($(cfg), debug) + CPLUS_FLAGS = -DTBB_USE_DEBUG -DDO_ITT_NOTIFY -g -O0 -DUSE_PTHREAD +endif + +ifneq (0,$(cpp0x)) + CXX_ONLY_FLAGS = -std=c++0x +endif + +ASM= +ASM_FLAGS= + +TBB_ASM.OBJ= + +ifeq (ia64,$(arch)) +# Position-independent code (PIC) is a must on IA-64, even for regular (not shared) executables + CPLUS_FLAGS += $(PIC_KEY) +endif + +ifeq (intel64,$(arch)) + CPLUS_FLAGS += -m64 + LIB_LINK_FLAGS += -m64 +endif + +ifeq (ia32,$(arch)) + CPLUS_FLAGS += -m32 + LIB_LINK_FLAGS += -m32 +endif + +# for some gcc versions on Solaris, -m64 may imply V9, but perhaps not everywhere (TODO: verify) +ifeq (sparc,$(arch)) + CPLUS_FLAGS += -mcpu=v9 -m64 + LIB_LINK_FLAGS += -mcpu=v9 -m64 +endif + +#------------------------------------------------------------------------------ +# Setting assembler data. +#------------------------------------------------------------------------------ +ASSEMBLY_SOURCE=$(arch)-gas +ifeq (ia64,$(arch)) + ASM=as + ASM_FLAGS += -xexplicit + TBB_ASM.OBJ = atomic_support.o lock_byte.o log2.o pause.o ia64_misc.o +endif +#------------------------------------------------------------------------------ +# End of setting assembler data. +#------------------------------------------------------------------------------ + +#------------------------------------------------------------------------------ +# Setting tbbmalloc data. 
+#------------------------------------------------------------------------------ + +M_CPLUS_FLAGS = $(CPLUS_FLAGS) -fno-rtti -fno-exceptions -fno-schedule-insns2 + +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data. +#------------------------------------------------------------------------------ diff --git a/dep/tbb/build/linux.icc.inc b/dep/tbb/build/linux.icc.inc new file mode 100644 index 000000000..9c368cbaa --- /dev/null +++ b/dep/tbb/build/linux.icc.inc @@ -0,0 +1,98 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +COMPILE_ONLY = -c -MMD +PREPROC_ONLY = -E -x c +INCLUDE_KEY = -I +DEFINE_KEY = -D +OUTPUT_KEY = -o # +OUTPUTOBJ_KEY = -o # +PIC_KEY = -fPIC +WARNING_AS_ERROR_KEY = -Werror +WARNING_KEY = -w1 +DYLIB_KEY = -shared +LIBDL = -ldl +export COMPILER_VERSION := ICC: $(shell icc -V &1 | grep 'Version') +#TODO: autodetection of arch from COMPILER_VERSION!! + +TBB_NOSTRICT = 1 + +CPLUS = icpc +CONLY = icc + +ifeq (release,$(cfg)) +CPLUS_FLAGS = -O2 -strict_ansi -DUSE_PTHREAD +else +CPLUS_FLAGS = -O0 -g -strict_ansi -DUSE_PTHREAD -DTBB_USE_DEBUG +endif + +ifneq (,$(codecov)) + CPLUS_FLAGS += -prof-genx +else + CPLUS_FLAGS += -DDO_ITT_NOTIFY +endif + +OPENMP_FLAG = -openmp +LIB_LINK_FLAGS = -shared -i-static -Wl,-soname=$(BUILDING_LIBRARY) +LIBS = -lpthread -lrt +C_FLAGS = $(CPLUS_FLAGS) + +ASM= +ASM_FLAGS= + +TBB_ASM.OBJ= + +ifeq (ia64,$(arch)) +# Position-independent code (PIC) is a must on IA-64, even for regular (not shared) executables + CPLUS_FLAGS += $(PIC_KEY) +endif + +ifneq (00,$(lambdas)$(cpp0x)) + CPLUS_FLAGS += -std=c++0x -D_TBB_CPP0X +endif + +#------------------------------------------------------------------------------ +# Setting assembler data. +#------------------------------------------------------------------------------ +ASSEMBLY_SOURCE=$(arch)-gas +ifeq (ia64,$(arch)) + ASM=ias + TBB_ASM.OBJ = atomic_support.o lock_byte.o log2.o pause.o ia64_misc.o +endif +#------------------------------------------------------------------------------ +# End of setting assembler data. 
+#------------------------------------------------------------------------------ + +#------------------------------------------------------------------------------ +# Setting tbbmalloc data. +#------------------------------------------------------------------------------ + +M_CPLUS_FLAGS = $(CPLUS_FLAGS) -fno-rtti -fno-exceptions + +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data. +#------------------------------------------------------------------------------ + diff --git a/dep/tbb/build/linux.inc b/dep/tbb/build/linux.inc new file mode 100644 index 000000000..d85844501 --- /dev/null +++ b/dep/tbb/build/linux.inc @@ -0,0 +1,108 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +ifndef arch + uname_m:=$(shell uname -m) + ifeq ($(uname_m),i686) + export arch:=ia32 + endif + ifeq ($(uname_m),ia64) + export arch:=ia64 + endif + ifeq ($(uname_m),x86_64) + export arch:=intel64 + endif + ifeq ($(uname_m),sparc64) + export arch:=sparc + endif +endif + +ifndef runtime + #gcc_version:=$(shell gcc -v 2>&1 | grep 'gcc --version' | sed -e 's/^gcc version //' | sed -e 's/ .*$$//') + gcc_version_full=$(shell gcc --version | grep 'gcc'| egrep -o ' [0-9]+\.[0-9]+\.[0-9]+.*' | sed -e 's/^\ //') + gcc_version=$(shell echo "$(gcc_version_full)" | egrep -o '^[0-9]+\.[0-9]+\.[0-9]+\s*' | head -n 1 | sed -e 's/ *//g') + os_version:=$(shell uname -r) + os_kernel_version:=$(shell uname -r | sed -e 's/-.*$$//') + export os_glibc_version_full:=$(shell getconf GNU_LIBC_VERSION | grep glibc | sed -e 's/^glibc //') + os_glibc_version:=$(shell echo "$(os_glibc_version_full)" | sed -e '2,$$d' -e 's/-.*$$//') + export runtime:=cc$(gcc_version)_libc$(os_glibc_version)_kernel$(os_kernel_version) +endif + +native_compiler := gcc +export compiler ?= gcc +debugger ?= gdb + +CMD=sh -c +CWD=$(shell pwd) +RM?=rm -f +RD?=rmdir +MD?=mkdir -p +NUL= /dev/null +SLASH=/ +MAKE_VERSIONS=sh $(tbb_root)/build/version_info_linux.sh $(CPLUS) $(CPLUS_FLAGS) $(INCLUDES) >version_string.tmp +MAKE_TBBVARS=sh $(tbb_root)/build/generate_tbbvars.sh + +ifdef LD_LIBRARY_PATH + export LD_LIBRARY_PATH := .:$(LD_LIBRARY_PATH) +else + export LD_LIBRARY_PATH := . 
+endif + +####### Build settings ######################################################## + +OBJ = o +DLL = so +LIBEXT = so +SONAME_SUFFIX =$(shell grep TBB_COMPATIBLE_INTERFACE_VERSION $(tbb_root)/include/tbb/tbb_stddef.h | egrep -o [0-9.]+) + +def_prefix = $(if $(findstring 32,$(arch)),lin32,$(if $(findstring intel64,$(arch)),lin64,lin64ipf)) +TBB.DEF = $(tbb_root)/src/tbb/$(def_prefix)-tbb-export.def + +EXPORT_KEY = -Wl,--version-script, +TBB.DLL = $(TBB_NO_VERSION.DLL).$(SONAME_SUFFIX) +TBB.LIB = $(TBB.DLL) +TBB_NO_VERSION.DLL=libtbb$(DEBUG_SUFFIX).$(DLL) +LINK_TBB.LIB = $(TBB_NO_VERSION.DLL) + +MALLOC_NO_VERSION.DLL = libtbbmalloc$(DEBUG_SUFFIX).$(DLL) +MALLOC.DEF = $(MALLOC_ROOT)/lin-tbbmalloc-export.def +MALLOC.DLL = $(MALLOC_NO_VERSION.DLL).$(SONAME_SUFFIX) +MALLOC.LIB = $(MALLOC_NO_VERSION.DLL) +LINK_MALLOC.LIB = $(MALLOC_NO_VERSION.DLL) + +MALLOCPROXY_NO_VERSION.DLL = libtbbmalloc_proxy$(DEBUG_SUFFIX).$(DLL) +MALLOCPROXY.DEF = $(MALLOC_ROOT)/$(def_prefix)-proxy-export.def +MALLOCPROXY.DLL = $(MALLOCPROXY_NO_VERSION.DLL).$(SONAME_SUFFIX) +MALLOCPROXY.LIB = $(MALLOCPROXY_NO_VERSION.DLL) + +RML_NO_VERSION.DLL = libirml$(DEBUG_SUFFIX).$(DLL) +RML.DEF = $(RML_SERVER_ROOT)/lin-rml-export.def +RML.DLL = $(RML_NO_VERSION.DLL).1 +RML.LIB = $(RML_NO_VERSION.DLL) + +TBB_NOSTRICT=1 + +TEST_LAUNCHER=sh $(tbb_root)/build/test_launcher.sh diff --git a/dep/tbb/build/macos.gcc.inc b/dep/tbb/build/macos.gcc.inc new file mode 100644 index 000000000..14a90162c --- /dev/null +++ b/dep/tbb/build/macos.gcc.inc @@ -0,0 +1,89 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. 
+ +CPLUS = g++ +CONLY = gcc +COMPILE_ONLY = -c -MMD +PREPROC_ONLY = -E -x c +INCLUDE_KEY = -I +DEFINE_KEY = -D +OUTPUT_KEY = -o # +OUTPUTOBJ_KEY = -o # +PIC_KEY = -fPIC +WARNING_AS_ERROR_KEY = -Werror +WARNING_KEY = -Wall +WARNING_SUPPRESS = +DYLIB_KEY = -dynamiclib +EXPORT_KEY = -Wl,-exported_symbols_list, +LIBDL = -ldl + +LIBS = -lpthread +LINK_FLAGS = +LIB_LINK_FLAGS = -dynamiclib +C_FLAGS = $(CPLUS_FLAGS) + +ifeq ($(cfg), release) + CPLUS_FLAGS = -O2 +else + CPLUS_FLAGS = -g -O0 -DTBB_USE_DEBUG +endif + +CPLUS_FLAGS += -DUSE_PTHREAD + +ifeq (intel64,$(arch)) + CPLUS_FLAGS += -m64 + LINK_FLAGS += -m64 + LIB_LINK_FLAGS += -m64 +endif + +ifeq (ia32,$(arch)) + CPLUS_FLAGS += -m32 + LINK_FLAGS += -m32 + LIB_LINK_FLAGS += -m32 +endif + +ifeq (ppc64,$(arch)) + CPLUS_FLAGS += -arch ppc64 + LINK_FLAGS += -arch ppc64 + LIB_LINK_FLAGS += -arch ppc64 +endif + +ifeq (ppc,$(arch)) + CPLUS_FLAGS += -arch ppc + LINK_FLAGS += -arch ppc + LIB_LINK_FLAGS += -arch ppc +endif + +#------------------------------------------------------------------------------ +# Setting tbbmalloc data. +#------------------------------------------------------------------------------ + +M_CPLUS_FLAGS = $(CPLUS_FLAGS) -fno-rtti -fno-exceptions -fno-schedule-insns2 + +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data. +#------------------------------------------------------------------------------ + diff --git a/dep/tbb/build/macos.icc.inc b/dep/tbb/build/macos.icc.inc new file mode 100644 index 000000000..7507ec07c --- /dev/null +++ b/dep/tbb/build/macos.icc.inc @@ -0,0 +1,75 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +CPLUS = icpc +CONLY = icc +COMPILE_ONLY = -c -MMD +PREPROC_ONLY = -E -x c +INCLUDE_KEY = -I +DEFINE_KEY = -D +OUTPUT_KEY = -o # +OUTPUTOBJ_KEY = -o # +PIC_KEY = -fPIC +WARNING_AS_ERROR_KEY = -Werror +WARNING_KEY = -w1 +DYLIB_KEY = -dynamiclib +EXPORT_KEY = -Wl,-exported_symbols_list, +LIBDL = -ldl +export COMPILER_VERSION := $(shell icc -V &1 | grep 'Version') +#TODO: autodetection of arch from COMPILER_VERSION!! 
+ +OPENMP_FLAG = -openmp +LIBS = -lpthread +LINK_FLAGS = +LIB_LINK_FLAGS = -dynamiclib -i-static +C_FLAGS = $(CPLUS_FLAGS) + +ifeq ($(cfg), release) + CPLUS_FLAGS = -O2 -fno-omit-frame-pointer +else + CPLUS_FLAGS = -g -O0 -DTBB_USE_DEBUG +endif + +CPLUS_FLAGS += -DUSE_PTHREAD + +ifneq (,$(codecov)) + CPLUS_FLAGS += -prof-genx +endif + +ifneq (00,$(lambdas)$(cpp0x)) + CPLUS_FLAGS += -std=c++0x -D_TBB_CPP0X +endif + + +#------------------------------------------------------------------------------ +# Setting tbbmalloc data. +#------------------------------------------------------------------------------ + +M_CPLUS_FLAGS = $(CPLUS_FLAGS) -fno-rtti -fno-exceptions + +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data. +#------------------------------------------------------------------------------ diff --git a/dep/tbb/build/macos.inc b/dep/tbb/build/macos.inc new file mode 100644 index 000000000..4e2f4dbcf --- /dev/null +++ b/dep/tbb/build/macos.inc @@ -0,0 +1,85 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. 
+ +####### Detections and Commands ############################################### +ifndef arch + ifeq ($(shell /usr/sbin/sysctl -n hw.machine),Power Macintosh) + ifeq ($(shell /usr/sbin/sysctl -n hw.optional.64bitops),1) + export arch:=ppc64 + else + export arch:=ppc32 + endif + else + ifeq ($(shell /usr/sbin/sysctl -n hw.optional.x86_64 2>/dev/null),1) + export arch:=intel64 + else + export arch:=ia32 + endif + endif +endif + +ifndef runtime + #gcc_version:=$(shell gcc -v 2>&1 | grep 'gcc version' | sed -e 's/^gcc version //' | sed -e 's/ .*$$//' ) + gcc_version_full=$(shell gcc --version | grep 'gcc'| egrep -o ' [0-9]+\.[0-9]+\.[0-9]+.*' | sed -e 's/^\ //') + gcc_version=$(shell echo "$(gcc_version_full)" | egrep -o '^[0-9]+\.[0-9]+\.[0-9]+\s*' | head -n 1 | sed -e 's/ *//g') + os_version:=$(shell /usr/bin/sw_vers -productVersion) + export runtime:=cc$(gcc_version)_os$(os_version) +endif + +native_compiler := gcc +export compiler ?= gcc +debugger ?= gdb + +CMD=$(SHELL) -c +CWD=$(shell pwd) +RM?=rm -f +RD?=rmdir +MD?=mkdir -p +NUL= /dev/null +SLASH=/ +MAKE_VERSIONS=sh $(tbb_root)/build/version_info_macos.sh $(CPLUS) $(CPLUS_FLAGS) $(INCLUDES) >version_string.tmp +MAKE_TBBVARS=sh $(tbb_root)/build/generate_tbbvars.sh + +####### Build settings ######################################################## + +OBJ=o +DLL=dylib +LIBEXT=dylib + +def_prefix = $(if $(findstring 32,$(arch)),mac32,mac64) + +TBB.DEF = $(tbb_root)/src/tbb/$(def_prefix)-tbb-export.def +TBB.DLL = libtbb$(DEBUG_SUFFIX).$(DLL) +TBB.LIB = $(TBB.DLL) +LINK_TBB.LIB = $(TBB.LIB) + +MALLOC.DEF = $(MALLOC_ROOT)/$(def_prefix)-tbbmalloc-export.def +MALLOC.DLL = libtbbmalloc$(DEBUG_SUFFIX).$(DLL) +MALLOC.LIB = $(MALLOC.DLL) + +TBB_NOSTRICT=1 + +TEST_LAUNCHER=sh $(tbb_root)/build/test_launcher.sh diff --git a/dep/tbb/build/suncc.map.pause b/dep/tbb/build/suncc.map.pause new file mode 100644 index 000000000..a92d08eb1 --- /dev/null +++ b/dep/tbb/build/suncc.map.pause @@ -0,0 +1 @@ +hwcap_1 = OVERRIDE; \ No newline at end of file diff --git a/dep/tbb/build/test_launcher.bat b/dep/tbb/build/test_launcher.bat new file mode 100644 index 000000000..bc52a4414 --- /dev/null +++ b/dep/tbb/build/test_launcher.bat @@ -0,0 +1,36 @@ +@echo off +REM +REM Copyright 2005-2009 Intel Corporation. All Rights Reserved. +REM +REM This file is part of Threading Building Blocks. +REM +REM Threading Building Blocks is free software; you can redistribute it +REM and/or modify it under the terms of the GNU General Public License +REM version 2 as published by the Free Software Foundation. +REM +REM Threading Building Blocks is distributed in the hope that it will be +REM useful, but WITHOUT ANY WARRANTY; without even the implied warranty +REM of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +REM GNU General Public License for more details. +REM +REM You should have received a copy of the GNU General Public License +REM along with Threading Building Blocks; if not, write to the Free Software +REM Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +REM +REM As a special exception, you may use this file as part of a free software +REM library without restriction. Specifically, if other files instantiate +REM templates or use macros or inline functions from this file, or you compile +REM this file and link it with other files to produce an executable, this +REM file does not by itself cause the resulting executable to be covered by +REM the GNU General Public License. 
This exception does not however +REM invalidate any other reasons why the executable file might be covered by +REM the GNU General Public License. +REM + +REM no LD_PRELOAD under Windows +if "%1"=="-l" ( + echo skip + exit +) + +%* diff --git a/dep/tbb/build/test_launcher.sh b/dep/tbb/build/test_launcher.sh new file mode 100644 index 000000000..0f691ba7c --- /dev/null +++ b/dep/tbb/build/test_launcher.sh @@ -0,0 +1,42 @@ +#!/bin/sh +# +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +while getopts "l:" flag # +do # + if [ `uname` != 'Linux' ] ; then # + echo 'skip' # + exit # + fi # + LD_PRELOAD=$OPTARG # + shift `expr $OPTIND - 1` # +done # +# Set stack limit +ulimit -s 10240 # +# Run the command line passed via parameters +export LD_PRELOAD # +./$* # diff --git a/dep/tbb/build/version_info_linux.sh b/dep/tbb/build/version_info_linux.sh new file mode 100644 index 000000000..87d75516e --- /dev/null +++ b/dep/tbb/build/version_info_linux.sh @@ -0,0 +1,42 @@ +#!/bin/sh +# +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. 
This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +# Script used to generate version info string +echo "#define __TBB_VERSION_STRINGS \\" +echo '"TBB:' "BUILD_HOST\t\t"`hostname -s`" ("`uname -m`")"'" ENDL \' +# find OS name in *-release and issue* files by filtering blank lines and lsb-release content out +echo '"TBB:' "BUILD_OS\t\t"`lsb_release -sd 2>/dev/null | grep -ih '[a-z] ' - /etc/*release /etc/issue 2>/dev/null | head -1 | sed -e 's/["\\\\]//g'`'" ENDL \' +echo '"TBB:' "BUILD_KERNEL\t"`uname -srv`'" ENDL \' +echo '"TBB:' "BUILD_GCC\t\t"`g++ -v &1 | grep 'gcc.*version'`'" ENDL \' +[ -z "$COMPILER_VERSION" ] || echo '"TBB:' "BUILD_COMPILER\t"$COMPILER_VERSION'" ENDL \' +echo '"TBB:' "BUILD_GLIBC\t"`getconf GNU_LIBC_VERSION | grep glibc | sed -e 's/^glibc //'`'" ENDL \' +echo '"TBB:' "BUILD_LD\t\t"`ld -v 2>&1 | grep 'version'`'" ENDL \' +echo '"TBB:' "BUILD_TARGET\t$arch on $runtime"'" ENDL \' +echo '"TBB:' "BUILD_COMMAND\t"$*'" ENDL \' +echo "" +echo "#define __TBB_DATETIME \""`date -u`"\"" diff --git a/dep/tbb/build/version_info_macos.sh b/dep/tbb/build/version_info_macos.sh new file mode 100644 index 000000000..d6a40afbb --- /dev/null +++ b/dep/tbb/build/version_info_macos.sh @@ -0,0 +1,39 @@ +#!/bin/sh +# +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. 
+ +# Script used to generate version info string +echo "#define __TBB_VERSION_STRINGS \\" +echo '"TBB:' "BUILD_HOST\t\t"`hostname -s`" ("`arch`")"'" ENDL \' +echo '"TBB:' "BUILD_OS\t\t"`sw_vers -productName`" version "`sw_vers -productVersion`'" ENDL \' +echo '"TBB:' "BUILD_KERNEL\t"`uname -v`'" ENDL \' +echo '"TBB:' "BUILD_GCC\t\t"`gcc -v &1 | grep 'version'`'" ENDL \' +[ -z "$COMPILER_VERSION" ] || echo '"TBB:' "BUILD_COMPILER\t"$COMPILER_VERSION'" ENDL \' +echo '"TBB:' "BUILD_TARGET\t$arch on $runtime"'" ENDL \' +echo '"TBB:' "BUILD_COMMAND\t"$*'" ENDL \' +echo "" +echo "#define __TBB_DATETIME \""`date -u`"\"" diff --git a/dep/tbb/build/version_info_sunos.sh b/dep/tbb/build/version_info_sunos.sh new file mode 100644 index 000000000..16341165a --- /dev/null +++ b/dep/tbb/build/version_info_sunos.sh @@ -0,0 +1,39 @@ +#!/bin/sh +# +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +# Script used to generate version info string +echo "#define __TBB_VERSION_STRINGS \\" +echo '"TBB:' "BUILD_HOST\t"`hostname`" ("`arch`")"'" ENDL \' +echo '"TBB:' "BUILD_OS\t\t"`uname`'" ENDL \' +echo '"TBB:' "BUILD_KERNEL\t"`uname -srv`'" ENDL \' +echo '"TBB:' "BUILD_SUNCC\t"`CC -V &1 | grep 'C++'`'" ENDL \' +[ -z "$COMPILER_VERSION" ] || echo '"TBB: ' "BUILD_COMPILER\t"$COMPILER_VERSION'" ENDL \' +echo '"TBB:' "BUILD_TARGET\t$arch on $runtime"'" ENDL \' +echo '"TBB:' "BUILD_COMMAND\t"$*'" ENDL \' +echo "" +echo "#define __TBB_DATETIME \""`date -u`"\"" diff --git a/dep/tbb/build/version_info_windows.js b/dep/tbb/build/version_info_windows.js new file mode 100644 index 000000000..1d1efb9f8 --- /dev/null +++ b/dep/tbb/build/version_info_windows.js @@ -0,0 +1,136 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + +var WshShell = WScript.CreateObject("WScript.Shell"); + +var tmpExec; + +WScript.Echo("#define __TBB_VERSION_STRINGS \\"); + +//Getting BUILD_HOST +WScript.echo( "\"TBB: BUILD_HOST\\t\\t" + + WshShell.ExpandEnvironmentStrings("%COMPUTERNAME%") + + "\" ENDL \\" ); + +//Getting BUILD_OS +tmpExec = WshShell.Exec("cmd /c ver"); +while ( tmpExec.Status == 0 ) { + WScript.Sleep(100); +} +tmpExec.StdOut.ReadLine(); + +WScript.echo( "\"TBB: BUILD_OS\\t\\t" + + tmpExec.StdOut.ReadLine() + + "\" ENDL \\" ); + +if ( WScript.Arguments(0).toLowerCase().match("gcc") ) { + tmpExec = WshShell.Exec("gcc --version"); + WScript.echo( "\"TBB: BUILD_COMPILER\\t" + + tmpExec.StdOut.ReadLine() + + "\" ENDL \\" ); + +} else { // MS / Intel compilers + //Getting BUILD_CL + tmpExec = WshShell.Exec("cmd /c echo #define 0 0>empty.cpp"); + tmpExec = WshShell.Exec("cl -c empty.cpp "); + while ( tmpExec.Status == 0 ) { + WScript.Sleep(100); + } + var clVersion = tmpExec.StdErr.ReadLine(); + WScript.echo( "\"TBB: BUILD_CL\\t\\t" + + clVersion + + "\" ENDL \\" ); + + //Getting BUILD_COMPILER + if ( WScript.Arguments(0).toLowerCase().match("icl") ) { + tmpExec = WshShell.Exec("icl -c empty.cpp "); + while ( tmpExec.Status == 0 ) { + WScript.Sleep(100); + } + WScript.echo( "\"TBB: BUILD_COMPILER\\t" + + tmpExec.StdErr.ReadLine() + + "\" ENDL \\" ); + } else { + WScript.echo( "\"TBB: BUILD_COMPILER\\t\\t" + + clVersion + + "\" ENDL \\" ); + } + tmpExec = WshShell.Exec("cmd /c del /F /Q empty.obj empty.cpp"); +} + +//Getting BUILD_TARGET +WScript.echo( "\"TBB: BUILD_TARGET\\t" + + WScript.Arguments(1) + + "\" ENDL \\" ); + +//Getting BUILD_COMMAND +WScript.echo( "\"TBB: BUILD_COMMAND\\t" + WScript.Arguments(2) + "\" ENDL" ); + +//Getting __TBB_DATETIME and __TBB_VERSION_YMD +var date = new Date(); +WScript.echo( "#define __TBB_DATETIME \"" + date.toUTCString() + "\"" ); +WScript.echo( "#define __TBB_VERSION_YMD " + date.getUTCFullYear() + ", " + + (date.getUTCMonth() > 8 ? (date.getUTCMonth()+1):("0"+(date.getUTCMonth()+1))) + + (date.getUTCDate() > 9 ? 
date.getUTCDate():("0"+date.getUTCDate())) ); + + +/* + +Original strings + +#define __TBB_VERSION_STRINGS \ +"TBB: BUILD_HOST\t\tvpolin-mobl1 (ia32)" ENDL \ +"TBB: BUILD_OS\t\tMicrosoft Windows XP [Version 5.1.2600]" ENDL \ +"TBB: BUILD_CL\t\tMicrosoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86" ENDL \ +"TBB: BUILD_COMPILER\tIntel(R) C++ Compiler for 32-bit applications, Version 9.1 Build 20070109Z Package ID: W_CC_C_9.1.034 " ENDL \ +"TBB: BUILD_TARGET\t" ENDL \ +"TBB: BUILD_COMMAND\t" ENDL \ + +#define __TBB_DATETIME "Mon Jun 4 10:16:07 UTC 2007" +#define __TBB_VERSION_YMD 2007, 0604 + + + +# The script must be run from two directory levels below this level. +x='"TBB: ' +y='" ENDL \' +echo "#define __TBB_VERSION_STRINGS \\" +echo $x "BUILD_HOST\t\t"`hostname`" ("`../../arch.exe`")"$y +echo $x "BUILD_OS\t\t"`../../win_version.bat|grep -i 'Version'`$y +echo >empty.cpp +echo $x "BUILD_CL\t\t"`cl -c empty.cpp 2>&1 | grep -i Version`$y +echo $x "BUILD_COMPILER\t"`icl -c empty.cpp 2>&1 | grep -i Version`$y +echo $x "BUILD_TARGET\t"$TBB_ARCH$y +echo $x "BUILD_COMMAND\t"$*$y +echo "" +# A workaround for MKS 8.6 where `date -u` crashes. +date -u > date.tmp +echo "#define __TBB_DATETIME \""`cat date.tmp`"\"" +echo "#define __TBB_VERSION_YMD "`date '+%Y, %m%d'` +rm empty.cpp +rm empty.obj +rm date.tmp +*/ diff --git a/dep/tbb/build/version_info_winlrb.js b/dep/tbb/build/version_info_winlrb.js new file mode 100644 index 000000000..67f2a2920 --- /dev/null +++ b/dep/tbb/build/version_info_winlrb.js @@ -0,0 +1,91 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. 
+ +var WshShell = WScript.CreateObject("WScript.Shell"); + +var tmpExec; + +WScript.Echo("#define __TBB_VERSION_STRINGS \\"); + +//Getting BUILD_HOST +WScript.echo( "\"TBB: BUILD_HOST\\t\\t" + + WshShell.ExpandEnvironmentStrings("%COMPUTERNAME%") + + "\" ENDL \\" ); + +//Getting BUILD_OS +tmpExec = WshShell.Exec("cmd /c ver"); +while ( tmpExec.Status == 0 ) { + WScript.Sleep(100); +} +tmpExec.StdOut.ReadLine(); + +WScript.echo( "\"TBB: BUILD_OS\\t\\t" + + tmpExec.StdOut.ReadLine() + + "\" ENDL \\" ); + +var Unknown = "Unknown"; + +WScript.echo( "\"TBB: BUILD_KERNEL\\t" + + Unknown + + "\" ENDL \\" ); + +//Getting BUILD_COMPILER +tmpExec = WshShell.Exec("icc --version"); +while ( tmpExec.Status == 0 ) { + WScript.Sleep(100); +} +var ccVersion = tmpExec.StdErr.ReadLine(); +WScript.echo( "\"TBB: BUILD_GCC\\t" + + ccVersion + + "\" ENDL \\" ); +WScript.echo( "\"TBB: BUILD_COMPILER\\t" + + ccVersion + + "\" ENDL \\" ); + +WScript.echo( "\"TBB: BUILD_GLIBC\\t" + + Unknown + + "\" ENDL \\" ); + +WScript.echo( "\"TBB: BUILD_LD\\t" + + Unknown + + "\" ENDL \\" ); + +//Getting BUILD_TARGET +WScript.echo( "\"TBB: BUILD_TARGET\\t" + + WScript.Arguments(1) + + "\" ENDL \\" ); + +//Getting BUILD_COMMAND +WScript.echo( "\"TBB: BUILD_COMMAND\\t" + WScript.Arguments(2) + "\" ENDL" ); + +//Getting __TBB_DATETIME and __TBB_VERSION_YMD +var date = new Date(); +WScript.echo( "#define __TBB_DATETIME \"" + date.toUTCString() + "\"" ); +WScript.echo( "#define __TBB_VERSION_YMD " + date.getUTCFullYear() + ", " + + (date.getUTCMonth() > 8 ? (date.getUTCMonth()+1):("0"+(date.getUTCMonth()+1))) + + (date.getUTCDate() > 9 ? date.getUTCDate():("0"+date.getUTCDate())) ); + + diff --git a/dep/tbb/build/vsproject/index.html b/dep/tbb/build/vsproject/index.html new file mode 100644 index 000000000..82cad002d --- /dev/null +++ b/dep/tbb/build/vsproject/index.html @@ -0,0 +1,31 @@ + + + +

Overview

+This directory contains the Visual Studio* 2005 solution for building Threading Building Blocks. + + +

Files

+
+
makefile.sln +
Solution file. +
tbb.vcproj +
Library project file. +
tbbmalloc.vcproj +
    Scalable allocator library project file. Allocator sources are expected to be located in the ../../src/tbbmalloc folder. +
tbbmalloc_proxy.vcproj +
Standard allocator replacement project file. +
+ +
+Up to parent directory +

+Copyright © 2005-2009 Intel Corporation. All Rights Reserved. +

+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are +registered trademarks or trademarks of Intel Corporation or its +subsidiaries in the United States and other countries. +

+* Other names and brands may be claimed as the property of others. + + diff --git a/dep/tbb/build/vsproject/makefile.sln b/dep/tbb/build/vsproject/makefile.sln new file mode 100644 index 000000000..2a681d436 --- /dev/null +++ b/dep/tbb/build/vsproject/makefile.sln @@ -0,0 +1,72 @@ +Microsoft Visual Studio Solution File, Format Version 9.00 +# Visual Studio 2005 +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tbb", "tbb.vcproj", "{F62787DD-1327-448B-9818-030062BCFAA5}" + ProjectSection(WebsiteProperties) = preProject + Debug.AspNetCompiler.Debug = "True" + Release.AspNetCompiler.Debug = "False" + EndProjectSection +EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tbbmalloc", "tbbmalloc.vcproj", "{B15F131E-328A-4D42-ADC2-9FF4CA6306D8}" + ProjectSection(WebsiteProperties) = preProject + Debug.AspNetCompiler.Debug = "True" + Release.AspNetCompiler.Debug = "False" + EndProjectSection + ProjectSection(ProjectDependencies) = postProject + {F62787DD-1327-448B-9818-030062BCFAA5} = {F62787DD-1327-448B-9818-030062BCFAA5} + EndProjectSection +EndProject +Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "Solution Items", "Solution Items", "{8898CE0B-0BFB-45AE-AA71-83735ED2510D}" + ProjectSection(WebsiteProperties) = preProject + Debug.AspNetCompiler.Debug = "True" + Release.AspNetCompiler.Debug = "False" + EndProjectSection + ProjectSection(SolutionItems) = preProject + index.html = index.html + EndProjectSection +EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tbbmalloc_proxy", "tbbmalloc_proxy.vcproj", "{02F61511-D5B6-46E6-B4BB-DEAA96E6BCC7}" + ProjectSection(WebsiteProperties) = preProject + Debug.AspNetCompiler.Debug = "True" + Release.AspNetCompiler.Debug = "False" + EndProjectSection + ProjectSection(ProjectDependencies) = postProject + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8} = {B15F131E-328A-4D42-ADC2-9FF4CA6306D8} + EndProjectSection +EndProject +Global + GlobalSection(SolutionConfigurationPlatforms) = preSolution + Debug|Win32 = Debug|Win32 + Debug|x64 = Debug|x64 + Release|Win32 = Release|Win32 + Release|x64 = Release|x64 + EndGlobalSection + GlobalSection(ProjectConfigurationPlatforms) = postSolution + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|Win32.ActiveCfg = Debug|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|Win32.Build.0 = Debug|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|x64.ActiveCfg = Debug|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|x64.Build.0 = Debug|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|Win32.ActiveCfg = Release|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|Win32.Build.0 = Release|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|x64.ActiveCfg = Release|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|x64.Build.0 = Release|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|Win32.ActiveCfg = Debug|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|Win32.Build.0 = Debug|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|x64.ActiveCfg = Debug|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|x64.Build.0 = Debug|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|Win32.ActiveCfg = Release|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|Win32.Build.0 = Release|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|x64.ActiveCfg = Release|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|x64.Build.0 = Release|x64 + {02F61511-D5B6-46E6-B4BB-DEAA96E6BCC7}.Debug|Win32.ActiveCfg = Debug|Win32 + {02F61511-D5B6-46E6-B4BB-DEAA96E6BCC7}.Debug|Win32.Build.0 = Debug|Win32 + 
{02F61511-D5B6-46E6-B4BB-DEAA96E6BCC7}.Debug|x64.ActiveCfg = Debug|x64 + {02F61511-D5B6-46E6-B4BB-DEAA96E6BCC7}.Debug|x64.Build.0 = Debug|x64 + {02F61511-D5B6-46E6-B4BB-DEAA96E6BCC7}.Release|Win32.ActiveCfg = Release|Win32 + {02F61511-D5B6-46E6-B4BB-DEAA96E6BCC7}.Release|Win32.Build.0 = Release|Win32 + {02F61511-D5B6-46E6-B4BB-DEAA96E6BCC7}.Release|x64.ActiveCfg = Release|x64 + {02F61511-D5B6-46E6-B4BB-DEAA96E6BCC7}.Release|x64.Build.0 = Release|x64 + EndGlobalSection + GlobalSection(SolutionProperties) = preSolution + HideSolutionNode = FALSE + EndGlobalSection +EndGlobal diff --git a/dep/tbb/build/vsproject/tbb.vcproj b/dep/tbb/build/vsproject/tbb.vcproj new file mode 100644 index 000000000..1024d7ef7 --- /dev/null +++ b/dep/tbb/build/vsproject/tbb.vcproj @@ -0,0 +1,310 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/dep/tbb/build/vsproject/tbbmalloc.vcproj b/dep/tbb/build/vsproject/tbbmalloc.vcproj new file mode 100644 index 000000000..26cc44b90 --- /dev/null +++ b/dep/tbb/build/vsproject/tbbmalloc.vcproj @@ -0,0 +1,290 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/dep/tbb/build/vsproject/tbbmalloc_proxy.vcproj b/dep/tbb/build/vsproject/tbbmalloc_proxy.vcproj new file mode 100644 index 000000000..57d65f790 --- /dev/null +++ b/dep/tbb/build/vsproject/tbbmalloc_proxy.vcproj @@ -0,0 +1,126 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/dep/tbb/build/vsproject/version_string.tmp b/dep/tbb/build/vsproject/version_string.tmp new file mode 100644 index 000000000..2098d6759 --- /dev/null +++ b/dep/tbb/build/vsproject/version_string.tmp @@ -0,0 +1 @@ +#define __TBB_VERSION_STRINGS "Empty" diff --git a/dep/tbb/build/windows.cl.inc b/dep/tbb/build/windows.cl.inc new file mode 100644 index 000000000..1051ece06 --- /dev/null +++ b/dep/tbb/build/windows.cl.inc @@ -0,0 +1,122 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. 
+# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +#------------------------------------------------------------------------------ +# Define compiler-specific variables. +#------------------------------------------------------------------------------ + + +#------------------------------------------------------------------------------ +# Setting compiler flags. +#------------------------------------------------------------------------------ +CPLUS = cl /nologo +LINK_FLAGS = /link /nologo +LIB_LINK_FLAGS=/link /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO +MS_CRT_KEY = /MD$(if $(findstring debug,$(cfg)),d) +EH_FLAGS = /EHsc /GR + +ifeq ($(cfg), release) + CPLUS_FLAGS = $(MS_CRT_KEY) /O2 /Zi $(EH_FLAGS) /Zc:forScope /Zc:wchar_t + ASM_FLAGS = +ifeq (ia32,$(arch)) + CPLUS_FLAGS += /Oy +endif +endif +ifeq ($(cfg), debug) + CPLUS_FLAGS = $(MS_CRT_KEY) /Od /Ob0 /Zi $(EH_FLAGS) /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG + ASM_FLAGS = /DUSE_FRAME_POINTER +endif + + +COMPILE_ONLY = /c +PREPROC_ONLY = /TC /EP +INCLUDE_KEY = /I +DEFINE_KEY = /D +OUTPUT_KEY = /Fe +OUTPUTOBJ_KEY = /Fo +WARNING_AS_ERROR_KEY = /WX + +ifeq ($(runtime),vc7.1) + WARNING_KEY = /W3 +else + WARNING_KEY = /W4 +endif + +DYLIB_KEY = /DLL +EXPORT_KEY = /DEF: + +ifeq ($(runtime),vc8) + OPENMP_FLAG = /openmp + WARNING_KEY += /Wp64 + CPLUS_FLAGS += /D_USE_RTM_VERSION +endif +ifeq ($(runtime),vc9) + OPENMP_FLAG = /openmp +endif + +ifeq (intel64,$(arch)) + CPLUS_FLAGS += /GS- +endif + + + +CPLUS_FLAGS += /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE \ + /D_WIN32_WINNT=$(_WIN32_WINNT) +C_FLAGS = $(CPLUS_FLAGS) +#------------------------------------------------------------------------------ +# End of setting compiler flags. +#------------------------------------------------------------------------------ + + +#------------------------------------------------------------------------------ +# Setting assembler data. +#------------------------------------------------------------------------------ +ASSEMBLY_SOURCE=$(arch)-masm +ifeq (intel64,$(arch)) + ASM=ml64 + ASM_FLAGS += /DEM64T=1 /c /Zi + TBB_ASM.OBJ = atomic_support.obj +else + ASM=ml + ASM_FLAGS += /c /coff /Zi + TBB_ASM.OBJ = atomic_support.obj lock_byte.obj +endif +#------------------------------------------------------------------------------ +# End of setting assembler data. +#------------------------------------------------------------------------------ + + +#------------------------------------------------------------------------------ +# Setting tbbmalloc data. 
+#------------------------------------------------------------------------------ +M_CPLUS_FLAGS = $(subst $(EH_FLAGS),/EHs-,$(CPLUS_FLAGS)) +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data. +#------------------------------------------------------------------------------ + +#------------------------------------------------------------------------------ +# End of define compiler-specific variables. +#------------------------------------------------------------------------------ diff --git a/dep/tbb/build/windows.gcc.inc b/dep/tbb/build/windows.gcc.inc new file mode 100644 index 000000000..b52d2a75b --- /dev/null +++ b/dep/tbb/build/windows.gcc.inc @@ -0,0 +1,122 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. 
+ +#------------------------------------------------------------------------------ +# Overriding settings from windows.inc +#------------------------------------------------------------------------------ + +SLASH= $(strip \) +OBJ = o +LIBEXT = dll # MinGW allows linking with DLLs directly + +TBB.RES = +MALLOC.RES = +TBB.MANIFEST = +MALLOC.MANIFEST = + +# TODO: do better when/if mingw64 support is added +TBB.DEF = $(tbb_root)/src/tbb/lin32-tbb-export.def +MALLOC.DEF = $(MALLOC_ROOT)/win-gcc-tbbmalloc-export.def + +LINK_TBB.LIB = $(TBB.LIB) + +#------------------------------------------------------------------------------ +# End of overridden settings +#------------------------------------------------------------------------------ +# Compiler-specific variables +#------------------------------------------------------------------------------ + +CPLUS = g++ +COMPILE_ONLY = -c -MMD +PREPROC_ONLY = -E -x c +INCLUDE_KEY = -I +DEFINE_KEY = -D +OUTPUT_KEY = -o # +OUTPUTOBJ_KEY = -o # +PIC_KEY = +WARNING_AS_ERROR_KEY = -Werror +WARNING_KEY = -Wall -Wno-uninitialized +WARNING_SUPPRESS = -Wno-parentheses +DYLIB_KEY = -shared +LIBDL = +EXPORT_KEY = -Wl,--version-script, +LIBS = -lpsapi + +#------------------------------------------------------------------------------ +# End of compiler-specific variables +#------------------------------------------------------------------------------ +# Command lines +#------------------------------------------------------------------------------ + +LINK_FLAGS = -Wl,--enable-auto-import +LIB_LINK_FLAGS = $(DYLIB_KEY) + +ifeq ($(cfg), release) + CPLUS_FLAGS = -O2 +endif +ifeq ($(cfg), debug) + CPLUS_FLAGS = -g -O0 -DTBB_USE_DEBUG +endif +CPLUS_FLAGS += -DUSE_WINTHREAD + +# MinGW specific +CPLUS_FLAGS += -D__MSVCRT_VERSION__=0x0700 -msse -mthreads + +CONLY = gcc +C_FLAGS = $(CPLUS_FLAGS) + +ifeq (intel64,$(arch)) + CPLUS_FLAGS += -m64 + LIB_LINK_FLAGS += -m64 +endif + +ifeq (ia32,$(arch)) + CPLUS_FLAGS += -m32 + LIB_LINK_FLAGS += -m32 +endif + +#------------------------------------------------------------------------------ +# End of command lines +#------------------------------------------------------------------------------ +# Setting assembler data +#------------------------------------------------------------------------------ + +ASM= +ASM_FLAGS= +TBB_ASM.OBJ= +ASSEMBLY_SOURCE=$(arch)-gas + +#------------------------------------------------------------------------------ +# End of setting assembler data +#------------------------------------------------------------------------------ +# Setting tbbmalloc data +#------------------------------------------------------------------------------ + +M_CPLUS_FLAGS = $(CPLUS_FLAGS) -fno-rtti -fno-exceptions + +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data +#------------------------------------------------------------------------------ diff --git a/dep/tbb/build/windows.icl.inc b/dep/tbb/build/windows.icl.inc new file mode 100644 index 000000000..386c5d8a5 --- /dev/null +++ b/dep/tbb/build/windows.icl.inc @@ -0,0 +1,144 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. 
+# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +#------------------------------------------------------------------------------ +# Define compiler-specific variables. +#------------------------------------------------------------------------------ + + +#------------------------------------------------------------------------------ +# Setting default configuration to release. +#------------------------------------------------------------------------------ +cfg ?= release +#------------------------------------------------------------------------------ +# End of setting default configuration to release. +#------------------------------------------------------------------------------ + + +#------------------------------------------------------------------------------ +# Setting compiler flags. 
+#------------------------------------------------------------------------------ +CPLUS = icl /nologo $(VCCOMPAT_FLAG) +LINK_FLAGS = /link /nologo +LIB_LINK_FLAGS= /link /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO +MS_CRT_KEY = /MD$(if $(findstring debug,$(cfg)),d) +EH_FLAGS = /EHsc /GR + +ifeq ($(cfg), release) + CPLUS_FLAGS = $(MS_CRT_KEY) /O2 /Zi $(EH_FLAGS) /Zc:forScope /Zc:wchar_t + ASM_FLAGS = +ifeq (ia32,$(arch)) + CPLUS_FLAGS += /Oy +endif +endif +ifeq ($(cfg), debug) + CPLUS_FLAGS = $(MS_CRT_KEY) /Od /Ob0 /Zi $(EH_FLAGS) /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG + LINK_FLAGS += libmmds.lib /NODEFAULTLIB:libmmdd.lib + ASM_FLAGS = /DUSE_FRAME_POINTER +endif + + +COMPILE_ONLY = /c /QMMD +PREPROC_ONLY = /EP /Tc +INCLUDE_KEY = /I +DEFINE_KEY = /D +OUTPUT_KEY = /Fe +OUTPUTOBJ_KEY = /Fo +WARNING_AS_ERROR_KEY = /WX +WARNING_KEY = /W3 +DYLIB_KEY = /DLL +EXPORT_KEY = /DEF: + +ifeq (intel64,$(arch)) + CPLUS_FLAGS += /GS- +endif + +ifneq (,$(codecov)) + CPLUS_FLAGS += /Qprof-genx +else + CPLUS_FLAGS += /DDO_ITT_NOTIFY +endif + +OPENMP_FLAG = /Qopenmp +CPLUS_FLAGS += /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE \ + /D_WIN32_WINNT=$(_WIN32_WINNT) + +ifeq ($(runtime),vc8) + CPLUS_FLAGS += /D_USE_RTM_VERSION +endif + +C_FLAGS = $(CPLUS_FLAGS) + +ifneq (00,$(lambdas)$(cpp0x)) + CPLUS_FLAGS += /Qstd=c++0x /D_TBB_CPP0X +endif + +VCVERSION:=$(runtime) +VCCOMPAT_FLAG := $(if $(findstring vc7.1, $(VCVERSION)),/Qvc7.1) +ifeq ($(VCCOMPAT_FLAG),) + VCCOMPAT_FLAG := $(if $(findstring vc8, $(VCVERSION)),/Qvc8) +endif +ifeq ($(VCCOMPAT_FLAG),) + VCCOMPAT_FLAG := $(if $(findstring vc9, $(VCVERSION)),/Qvc9) +endif +ifeq ($(VCCOMPAT_FLAG),) + $(error VC version not detected correctly: $(VCVERSION) ) +endif +export VCCOMPAT_FLAG +#------------------------------------------------------------------------------ +# End of setting compiler flags. +#------------------------------------------------------------------------------ + + +#------------------------------------------------------------------------------ +# Setting assembler data. +#------------------------------------------------------------------------------ +ASSEMBLY_SOURCE=$(arch)-masm +ifeq (intel64,$(arch)) + ASM=ml64 + ASM_FLAGS += /DEM64T=1 /c /Zi + TBB_ASM.OBJ = atomic_support.obj +else + ASM=ml + ASM_FLAGS += /c /coff /Zi + TBB_ASM.OBJ = atomic_support.obj lock_byte.obj +endif +#------------------------------------------------------------------------------ +# End of setting assembler data. +#------------------------------------------------------------------------------ + + +#------------------------------------------------------------------------------ +# Setting tbbmalloc data. +#------------------------------------------------------------------------------ +M_CPLUS_FLAGS = $(subst $(EH_FLAGS),/EHs-,$(CPLUS_FLAGS)) +#------------------------------------------------------------------------------ +# End of setting tbbmalloc data. +#------------------------------------------------------------------------------ + +#------------------------------------------------------------------------------ +# End of define compiler-specific variables. +#------------------------------------------------------------------------------ diff --git a/dep/tbb/build/windows.inc b/dep/tbb/build/windows.inc new file mode 100644 index 000000000..400864fe6 --- /dev/null +++ b/dep/tbb/build/windows.inc @@ -0,0 +1,100 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. 
+# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +ifdef tbb_build_dir + test_dir:=$(tbb_build_dir) +else + test_dir:=. +endif + +# TODO give an error if archs don't match +ifndef arch + export arch:=$(shell cmd /C "cscript /nologo /E:jscript $(tbb_root)/build/detect.js /arch $(compiler)") +endif + +ifndef runtime + export runtime:=$(shell cmd /C "cscript /nologo /E:jscript $(tbb_root)/build/detect.js /runtime $(compiler)") +endif + +native_compiler := cl +export compiler ?= cl +debugger ?= devenv /debugexe + +CMD=cmd /C +CWD=$(shell cmd /C echo %CD%) +RM=cmd /C del /Q /F +RD=cmd /C rmdir +MD=cmd /c mkdir +SLASH=\\ +NUL = nul + +OBJ = obj +DLL = dll +LIBEXT = lib + +def_prefix = $(if $(findstring ia32,$(arch)),win32,win64) + +# Target Windows version. Do not increase beyond 0x0500 without prior discussion! +# Used as the value for macro definition option in windows.cl.inc etc.
+_WIN32_WINNT=0x0400 + +TBB.DEF = $(tbb_root)/src/tbb/$(def_prefix)-tbb-export.def +TBB.DLL = tbb$(DEBUG_SUFFIX).$(DLL) +TBB.LIB = tbb$(DEBUG_SUFFIX).$(LIBEXT) +TBB.RES = tbb_resource.res +# On Windows, we use #pragma comment to set the proper TBB lib to link with +# But for cross-configuration testing, need to link explicitly +LINK_TBB.LIB = $(if $(crosstest),$(TBB.LIB)) +TBB.MANIFEST = +ifneq ($(filter vc8 vc9,$(runtime)),) + TBB.MANIFEST = tbbmanifest.exe.manifest +endif + +MALLOC.DEF = $(MALLOC_ROOT)/$(def_prefix)-tbbmalloc-export.def +MALLOC.DLL = tbbmalloc$(DEBUG_SUFFIX).$(DLL) +MALLOC.LIB = tbbmalloc$(DEBUG_SUFFIX).$(LIBEXT) +MALLOC.RES = tbbmalloc.res +MALLOC.MANIFEST = +ifneq ($(filter vc8 vc9,$(runtime)),) +MALLOC.MANIFEST = tbbmanifest.exe.manifest +endif +LINK_MALLOC.LIB = $(MALLOC.LIB) + +MALLOCPROXY.DLL = tbbmalloc_proxy$(DEBUG_SUFFIX).$(DLL) +MALLOCPROXY.LIB = tbbmalloc_proxy$(DEBUG_SUFFIX).$(LIBEXT) + +RML.DEF = $(RML_SERVER_ROOT)/$(def_prefix)-rml-export.def +RML.DLL = irml$(DEBUG_SUFFIX).$(DLL) +RML.LIB = irml$(DEBUG_SUFFIX).$(LIBEXT) +RML.RES = irml.res +ifneq ($(runtime),vc7.1) +RML.MANIFEST = tbbmanifest.exe.manifest +endif + +MAKE_VERSIONS = cmd /C cscript /nologo /E:jscript $(subst \,/,$(tbb_root))/build/version_info_windows.js $(compiler) $(arch) $(subst \,/,"$(CPLUS) $(CPLUS_FLAGS) $(INCLUDES)") > version_string.tmp +MAKE_TBBVARS = cmd /C "$(subst /,\,$(tbb_root))\build\generate_tbbvars.bat" + +TEST_LAUNCHER = $(subst /,\,$(tbb_root))\build\test_launcher.bat diff --git a/dep/tbb/build/winlrb.cl.inc b/dep/tbb/build/winlrb.cl.inc new file mode 100644 index 000000000..618dba5bf --- /dev/null +++ b/dep/tbb/build/winlrb.cl.inc @@ -0,0 +1,66 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. 
+ +include $(tbb_root)/build/windows.cl.inc + +ifeq ($(cfg), debug) + CFG_LETTER = d +else + CFG_LETTER = r +endif + +_CPLUS_FLAGS_HOST := $(CPLUS_FLAGS) /I$(LRB_INC_DIR) $(LINK_FLAGS) /LIBPATH:$(LRB_LIB_DIR) xn_host$(LRB_HOST_ARCH)$(CFG_LETTER).lib + +TEST_EXT = dll +CPLUS_FLAGS += /I$(LRB_INC_DIR) /D__LRB__ +LIB_LINK_FLAGS += /LIBPATH:$(LRB_LIB_DIR) xn_lrb$(LRB_HOST_ARCH)$(CFG_LETTER).lib +LINK_FLAGS = $(LIB_LINK_FLAGS) +OPENMP_FLAG = + +ifdef TEST_RESOURCE +LINK_FLAGS += $(TEST_RESOURCE) + +TEST_LAUNCHER_NAME = harness_lrb_host +AUX_TEST_DEPENDENCIES = $(TEST_LAUNCHER_NAME).exe + +$(TEST_LAUNCHER_NAME).exe: $(TEST_LAUNCHER_NAME).cpp + cl /Fe$@ $< $(_CPLUS_FLAGS_HOST) + +NO_LEGACY_TESTS = 1 +NO_C_TESTS = 1 +TEST_LAUNCHER= +endif # TEST_RESOURCE + +#test_model_plugin.%: +# @echo test_model_plugin is not supported for LRB architecture so far + +ifeq ($(BUILDING_PHASE),0) # examples + export RM = del /Q /F + export LIBS = -shared -lthr -z muldefs -L$(work_dir)_debug -L$(work_dir)_release + export UI = con + export x64 = 64 + export CXXFLAGS = -xR -I..\..\..\include +endif # examples diff --git a/dep/tbb/build/winlrb.icc.inc b/dep/tbb/build/winlrb.icc.inc new file mode 100644 index 000000000..427d06c9d --- /dev/null +++ b/dep/tbb/build/winlrb.icc.inc @@ -0,0 +1,49 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + + +include $(tbb_root)/build/winlrb.cl.inc + +TEST_EXT = so +.PRECIOUS: %.$(TEST_EXT) + +include $(tbb_root)/build/freebsd.gcc.inc + +WARNING_KEY = -w1 +CPLUS = icpc +CONLY = icc +#LIBS = -u _read -lcprts -lthr -lc +#LIBS = -lthr +LIBS = -u _read -lcprts -lthr -limf -lc +LINK_FLAGS = -L$(LRB_LIB_DIR) $(DYLIB_KEY) -lxn$(XN_VER)_lrb64$(CFG_LETTER) +CPLUS_FLAGS += -xR $(PIC_KEY) -I$(LRB_INC_DIR) -DXENSIM +C_FLAGS = $(CPLUS_FLAGS) +LIB_LINK_FLAGS = $(LINK_FLAGS) + +ifeq ($(cfg), release) + # workaround for LRB compiler issues + CPLUS_FLAGS := $(subst -O2,-O0, $(CPLUS_FLAGS)) +endif diff --git a/dep/tbb/build/winlrb.inc b/dep/tbb/build/winlrb.inc new file mode 100644 index 000000000..f72c66fde --- /dev/null +++ b/dep/tbb/build/winlrb.inc @@ -0,0 +1,88 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. 
+# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +ifndef XN_VER +export LRBSDK = $(LARRABEE_CORE_LATEST) +export LRB_LIB_DIR = "$(LRBSDK)lib" +export LRB_INC_DIR = "$(LRBSDK)include" + +# Function $(wildcard pattern) does not work with paths containing spaces! +_lrb_lib = $(shell cmd /C "dir /B "$(LRBSDK)lib\libxn*_lrb64d.so") +export XN_VER = $(patsubst libxn%_lrb64d.so,%,$(_lrb_lib)) + +ifeq (1,$(NETSIM_LRB_32_OVERRIDE)) + export LRB_HOST_ARCH = 32 +else + export LRB_HOST_ARCH = 64 +endif + +export run_cmd = harness_lrb_host.exe + +export UI = con + +endif #XN_VER + +include $(tbb_root)/build/windows.inc + +ifneq (1,$(netsim)) +# Target environment is native LRB or LrbFSim + +export compiler = icc +export arch := lrb + +target_machine = $(subst -,_,$(shell icpc -dumpmachine)) +runtime = $(subst _lrb_,_,$(target_machine)) +# -dumpmachine option does not work in R9 Core SDK 5 +ifeq ($(runtime),) + runtime = x86_64_freebsd +endif +export runtime:=$(runtime)_xn$(XN_VER) + +OBJ = o +DLL = so +LIBEXT = so + +TBB.DEF = +TBB.DLL = libtbb$(DEBUG_SUFFIX).$(DLL) +TBB.LIB = $(TBB.DLL) +LINK_TBB.LIB = $(TBB.DLL) +TBB.RES = + +MALLOC.DEF := +MALLOC.DLL = libtbbmalloc$(DEBUG_SUFFIX).$(DLL) +MALLOC.LIB = $(MALLOC.DLL) +MALLOC.RES = + +MAKE_VERSIONS = cmd /C cscript /nologo /E:jscript $(subst \,/,$(tbb_root))/build/version_info_winlrb.js $(compiler) $(arch) $(subst \,/,"$(CPLUS) $(CPLUS_FLAGS) $(INCLUDES)") > version_string.tmp +MAKE_TBBVARS = cmd /C "$(subst /,\,$(tbb_root))\build\generate_tbbvars.bat" + +ifneq (1,$(XENSIM_ENABLED)) + export run_cmd = rem +endif + +TBB_NOSTRICT = 1 + +endif # lrbfsim diff --git a/dep/tbb/include/index.html b/dep/tbb/include/index.html new file mode 100644 index 000000000..f80c5d491 --- /dev/null +++ b/dep/tbb/include/index.html @@ -0,0 +1,24 @@ + + + +

Overview

+Include files for Threading Building Blocks.
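For orientation, a minimal sketch of how a project that puts dep/tbb/include on its header search path might use these files; the array, its size, and the SquareBody functor are illustrative placeholders, not part of the library:

#include "tbb/task_scheduler_init.h"
#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"

// Hypothetical worker functor: squares each element of a plain array.
struct SquareBody {
    double* data;
    explicit SquareBody( double* d ) : data(d) {}
    void operator()( const tbb::blocked_range<size_t>& r ) const {
        for( size_t i=r.begin(); i!=r.end(); ++i )
            data[i] *= data[i];
    }
};

void square_all( double* data, size_t n ) {
    tbb::task_scheduler_init init;  // start the TBB worker threads
    tbb::parallel_for( tbb::blocked_range<size_t>(0,n), SquareBody(data) );
}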

Directories

+tbb
+    Include files for Threading Building Blocks classes and functions.
+Up to parent directory +

+Copyright © 2005-2009 Intel Corporation. All Rights Reserved. +

+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are +registered trademarks or trademarks of Intel Corporation or its +subsidiaries in the United States and other countries. +
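The first header added below, _concurrent_queue_internal.h, supplies the type-independent machinery behind the public tbb::concurrent_queue container. A rough sketch of how the public interface it supports is typically used; the producer/consumer function names are placeholders, and real code would run them on separate threads:

#include "tbb/concurrent_queue.h"

// One thread enqueues work items, another drains them; the queue
// performs all of the necessary synchronization internally.
void producer( tbb::concurrent_queue<int>& q ) {
    for( int i=0; i<100; ++i )
        q.push(i);
}

void consumer( tbb::concurrent_queue<int>& q ) {
    int item;
    while( q.try_pop(item) ) {
        // process item ...
    }
}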

+* Other names and brands may be claimed as the property of others. + + diff --git a/dep/tbb/include/tbb/_concurrent_queue_internal.h b/dep/tbb/include/tbb/_concurrent_queue_internal.h new file mode 100644 index 000000000..418065dd8 --- /dev/null +++ b/dep/tbb/include/tbb/_concurrent_queue_internal.h @@ -0,0 +1,973 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_concurrent_queue_internal_H +#define __TBB_concurrent_queue_internal_H + +#include "tbb_stddef.h" +#include "tbb_machine.h" +#include "atomic.h" +#include "spin_mutex.h" +#include "cache_aligned_allocator.h" +#include "tbb_exception.h" +#include +#include + +namespace tbb { + +#if !__TBB_TEMPLATE_FRIENDS_BROKEN + +// forward declaration +namespace strict_ppl { +template class concurrent_queue; +} + +template class concurrent_bounded_queue; + +namespace deprecated { +template class concurrent_queue; +} +#endif + +//! For internal use only. +namespace strict_ppl { + +//! @cond INTERNAL +namespace internal { + +using namespace tbb::internal; + +typedef size_t ticket; + +static void* invalid_page; + +template class micro_queue ; +template class micro_queue_pop_finalizer ; +template class concurrent_queue_base_v3; + +//! parts of concurrent_queue_rep that do not have references to micro_queue +/** + * For internal use only. + */ +struct concurrent_queue_rep_base : no_copy { + template friend class micro_queue; + template friend class concurrent_queue_base_v3; + +protected: + //! Approximately n_queue/golden ratio + static const size_t phi = 3; + +public: + // must be power of 2 + static const size_t n_queue = 8; + + //! Prefix on a page + struct page { + page* next; + uintptr_t mask; + }; + + atomic head_counter; + char pad1[NFS_MaxLineSize-sizeof(atomic)]; + atomic tail_counter; + char pad2[NFS_MaxLineSize-sizeof(atomic)]; + + //! Always a power of 2 + size_t items_per_page; + + //! Size of an item + size_t item_size; + + //! number of invalid entries in the queue + atomic n_invalid_entries; + + char pad3[NFS_MaxLineSize-sizeof(size_t)-sizeof(size_t)-sizeof(atomic)]; +} ; + +//! Abstract class to define interface for page allocation/deallocation +/** + * For internal use only. 
+ */ +class concurrent_queue_page_allocator +{ + template friend class micro_queue ; + template friend class micro_queue_pop_finalizer ; +protected: + virtual ~concurrent_queue_page_allocator() {} +private: + virtual concurrent_queue_rep_base::page* allocate_page() = 0; + virtual void deallocate_page( concurrent_queue_rep_base::page* p ) = 0; +} ; + +#if _MSC_VER && !defined(__INTEL_COMPILER) +// unary minus operator applied to unsigned type, result still unsigned +#pragma warning( push ) +#pragma warning( disable: 4146 ) +#endif + +//! A queue using simple locking. +/** For efficient, this class has no constructor. + The caller is expected to zero-initialize it. */ +template +class micro_queue : no_copy { + typedef concurrent_queue_rep_base::page page; + + //! Class used to ensure exception-safety of method "pop" + class destroyer: no_copy { + T& my_value; + public: + destroyer( T& value ) : my_value(value) {} + ~destroyer() {my_value.~T();} + }; + + T& get_ref( page& page, size_t index ) { + return static_cast(static_cast(&page+1))[index]; + } + + void copy_item( page& dst, size_t index, const void* src ) { + new( &get_ref(dst,index) ) T(*static_cast(src)); + } + + void copy_item( page& dst, size_t dindex, const page& src, size_t sindex ) { + new( &get_ref(dst,dindex) ) T( static_cast(static_cast(&src+1))[sindex] ); + } + + void assign_and_destroy_item( void* dst, page& src, size_t index ) { + T& from = get_ref(src,index); + destroyer d(from); + *static_cast(dst) = from; + } + + void spin_wait_until_my_turn( atomic& counter, ticket k, concurrent_queue_rep_base& rb ) const ; + +public: + friend class micro_queue_pop_finalizer; + + atomic head_page; + atomic head_counter; + + atomic tail_page; + atomic tail_counter; + + spin_mutex page_mutex; + + void push( const void* item, ticket k, concurrent_queue_base_v3& base ) ; + + bool pop( void* dst, ticket k, concurrent_queue_base_v3& base ) ; + + micro_queue& assign( const micro_queue& src, concurrent_queue_base_v3& base ) ; + + page* make_copy( concurrent_queue_base_v3& base, const page* src_page, size_t begin_in_page, size_t end_in_page, ticket& g_index ) ; + + void make_invalid( ticket k ) ; +}; + +template +void micro_queue::spin_wait_until_my_turn( atomic& counter, ticket k, concurrent_queue_rep_base& rb ) const { + atomic_backoff backoff; + do { + backoff.pause(); + if( counter&0x1 ) { + ++rb.n_invalid_entries; + throw_bad_last_alloc_exception_v4(); + } + } while( counter!=k ) ; +} + +template +void micro_queue::push( const void* item, ticket k, concurrent_queue_base_v3& base ) { + k &= -concurrent_queue_rep_base::n_queue; + page* p = NULL; + size_t index = k/concurrent_queue_rep_base::n_queue & (base.my_rep->items_per_page-1); + if( !index ) { + try { + concurrent_queue_page_allocator& pa = base; + p = pa.allocate_page(); + } catch (...) { + ++base.my_rep->n_invalid_entries; + make_invalid( k ); + } + p->mask = 0; + p->next = NULL; + } + + if( tail_counter!=k ) spin_wait_until_my_turn( tail_counter, k, *base.my_rep ); + + if( p ) { + spin_mutex::scoped_lock lock( page_mutex ); + if( page* q = tail_page ) + q->next = p; + else + head_page = p; + tail_page = p; + } else { + p = tail_page; + } + + try { + copy_item( *p, index, item ); + // If no exception was thrown, mark item as present. 
+ p->mask |= uintptr_t(1)<n_invalid_entries; + tail_counter += concurrent_queue_rep_base::n_queue; + throw; + } +} + +template +bool micro_queue::pop( void* dst, ticket k, concurrent_queue_base_v3& base ) { + k &= -concurrent_queue_rep_base::n_queue; + if( head_counter!=k ) spin_wait_until_eq( head_counter, k ); + if( tail_counter==k ) spin_wait_while_eq( tail_counter, k ); + page& p = *head_page; + __TBB_ASSERT( &p, NULL ); + size_t index = k/concurrent_queue_rep_base::n_queue & (base.my_rep->items_per_page-1); + bool success = false; + { + micro_queue_pop_finalizer finalizer( *this, base, k+concurrent_queue_rep_base::n_queue, index==base.my_rep->items_per_page-1 ? &p : NULL ); + if( p.mask & uintptr_t(1)<n_invalid_entries; + } + } + return success; +} + +template +micro_queue& micro_queue::assign( const micro_queue& src, concurrent_queue_base_v3& base ) { + head_counter = src.head_counter; + tail_counter = src.tail_counter; + page_mutex = src.page_mutex; + + const page* srcp = src.head_page; + if( srcp ) { + ticket g_index = head_counter; + try { + size_t n_items = (tail_counter-head_counter)/concurrent_queue_rep_base::n_queue; + size_t index = head_counter/concurrent_queue_rep_base::n_queue & (base.my_rep->items_per_page-1); + size_t end_in_first_page = (index+n_itemsitems_per_page)?(index+n_items):base.my_rep->items_per_page; + + head_page = make_copy( base, srcp, index, end_in_first_page, g_index ); + page* cur_page = head_page; + + if( srcp != src.tail_page ) { + for( srcp = srcp->next; srcp!=src.tail_page; srcp=srcp->next ) { + cur_page->next = make_copy( base, srcp, 0, base.my_rep->items_per_page, g_index ); + cur_page = cur_page->next; + } + + __TBB_ASSERT( srcp==src.tail_page, NULL ); + size_t last_index = tail_counter/concurrent_queue_rep_base::n_queue & (base.my_rep->items_per_page-1); + if( last_index==0 ) last_index = base.my_rep->items_per_page; + + cur_page->next = make_copy( base, srcp, 0, last_index, g_index ); + cur_page = cur_page->next; + } + tail_page = cur_page; + } catch (...) { + make_invalid( g_index ); + } + } else { + head_page = tail_page = NULL; + } + return *this; +} + +template +void micro_queue::make_invalid( ticket k ) { + static page dummy = {static_cast((void*)1), 0}; + // mark it so that no more pushes are allowed. 
+ invalid_page = &dummy; + { + spin_mutex::scoped_lock lock( page_mutex ); + tail_counter = k+concurrent_queue_rep_base::n_queue+1; + if( page* q = tail_page ) + q->next = static_cast(invalid_page); + else + head_page = static_cast(invalid_page); + tail_page = static_cast(invalid_page); + } + throw; +} + +template +concurrent_queue_rep_base::page* micro_queue::make_copy( concurrent_queue_base_v3& base, const concurrent_queue_rep_base::page* src_page, size_t begin_in_page, size_t end_in_page, ticket& g_index ) { + concurrent_queue_page_allocator& pa = base; + page* new_page = pa.allocate_page(); + new_page->next = NULL; + new_page->mask = src_page->mask; + for( ; begin_in_page!=end_in_page; ++begin_in_page, ++g_index ) + if( new_page->mask & uintptr_t(1)< +class micro_queue_pop_finalizer: no_copy { + typedef concurrent_queue_rep_base::page page; + ticket my_ticket; + micro_queue& my_queue; + page* my_page; + concurrent_queue_page_allocator& allocator; +public: + micro_queue_pop_finalizer( micro_queue& queue, concurrent_queue_base_v3& b, ticket k, page* p ) : + my_ticket(k), my_queue(queue), my_page(p), allocator(b) + {} + ~micro_queue_pop_finalizer() ; +}; + +template +micro_queue_pop_finalizer::~micro_queue_pop_finalizer() { + page* p = my_page; + if( p ) { + spin_mutex::scoped_lock lock( my_queue.page_mutex ); + page* q = p->next; + my_queue.head_page = q; + if( !q ) { + my_queue.tail_page = NULL; + } + } + my_queue.head_counter = my_ticket; + if( p ) { + allocator.deallocate_page( p ); + } +} + +#if _MSC_VER && !defined(__INTEL_COMPILER) +#pragma warning( pop ) +#endif // warning 4146 is back + +template class concurrent_queue_iterator_rep ; +template class concurrent_queue_iterator_base_v3; + +//! representation of concurrent_queue_base +/** + * the class inherits from concurrent_queue_rep_base and defines an array of micro_queue's + */ +template +struct concurrent_queue_rep : public concurrent_queue_rep_base { + micro_queue array[n_queue]; + + //! Map ticket to an array index + static size_t index( ticket k ) { + return k*phi%n_queue; + } + + micro_queue& choose( ticket k ) { + // The formula here approximates LRU in a cache-oblivious way. + return array[index(k)]; + } +}; + +//! base class of concurrent_queue +/** + * The class implements the interface defined by concurrent_queue_page_allocator + * and has a pointer to an instance of concurrent_queue_rep. + */ +template +class concurrent_queue_base_v3: public concurrent_queue_page_allocator { + //! Internal representation + concurrent_queue_rep* my_rep; + + friend struct concurrent_queue_rep; + friend class micro_queue; + friend class concurrent_queue_iterator_rep; + friend class concurrent_queue_iterator_base_v3; + +protected: + typedef typename concurrent_queue_rep::page page; + +private: + /* override */ virtual page *allocate_page() { + concurrent_queue_rep& r = *my_rep; + size_t n = sizeof(page) + r.items_per_page*r.item_size; + return reinterpret_cast(allocate_block ( n )); + } + + /* override */ virtual void deallocate_page( concurrent_queue_rep_base::page *p ) { + concurrent_queue_rep& r = *my_rep; + size_t n = sizeof(page) + r.items_per_page*r.item_size; + deallocate_block( reinterpret_cast(p), n ); + } + + //! custom allocator + virtual void *allocate_block( size_t n ) = 0; + + //! 
custom de-allocator + virtual void deallocate_block( void *p, size_t n ) = 0; + +protected: + concurrent_queue_base_v3( size_t item_size ) ; + + /* override */ virtual ~concurrent_queue_base_v3() { + size_t nq = my_rep->n_queue; + for( size_t i=0; iarray[i].tail_page==NULL, "pages were not freed properly" ); + cache_aligned_allocator >().deallocate(my_rep,1); + } + + //! Enqueue item at tail of queue + void internal_push( const void* src ) { + concurrent_queue_rep& r = *my_rep; + ticket k = r.tail_counter++; + r.choose(k).push( src, k, *this ); + } + + //! Attempt to dequeue item from queue. + /** NULL if there was no item to dequeue. */ + bool internal_try_pop( void* dst ) ; + + //! Get size of queue; result may be invalid if queue is modified concurrently + size_t internal_size() const ; + + //! check if the queue is empty; thread safe + bool internal_empty() const ; + + //! free any remaining pages + /* note that the name may be misleading, but it remains so due to a historical accident. */ + void internal_finish_clear() ; + + //! throw an exception + void internal_throw_exception() const { + throw std::bad_alloc(); + } + + //! copy internal representation + void assign( const concurrent_queue_base_v3& src ) ; +}; + +template +concurrent_queue_base_v3::concurrent_queue_base_v3( size_t item_size ) { + my_rep = cache_aligned_allocator >().allocate(1); + __TBB_ASSERT( (size_t)my_rep % NFS_GetLineSize()==0, "alignment error" ); + __TBB_ASSERT( (size_t)&my_rep->head_counter % NFS_GetLineSize()==0, "alignment error" ); + __TBB_ASSERT( (size_t)&my_rep->tail_counter % NFS_GetLineSize()==0, "alignment error" ); + __TBB_ASSERT( (size_t)&my_rep->array % NFS_GetLineSize()==0, "alignment error" ); + memset(my_rep,0,sizeof(concurrent_queue_rep)); + my_rep->item_size = item_size; + my_rep->items_per_page = item_size<=8 ? 32 : + item_size<=16 ? 16 : + item_size<=32 ? 8 : + item_size<=64 ? 4 : + item_size<=128 ? 2 : + 1; +} + +template +bool concurrent_queue_base_v3::internal_try_pop( void* dst ) { + concurrent_queue_rep& r = *my_rep; + ticket k; + do { + k = r.head_counter; + for(;;) { + if( r.tail_counter<=k ) { + // Queue is empty + return false; + } + // Queue had item with ticket k when we looked. Attempt to get that item. + ticket tk=k; +#if defined(_MSC_VER) && defined(_Wp64) + #pragma warning (push) + #pragma warning (disable: 4267) +#endif + k = r.head_counter.compare_and_swap( tk+1, tk ); +#if defined(_MSC_VER) && defined(_Wp64) + #pragma warning (pop) +#endif + if( k==tk ) + break; + // Another thread snatched the item, retry. + } + } while( !r.choose( k ).pop( dst, k, *this ) ); + return true; +} + +template +size_t concurrent_queue_base_v3::internal_size() const { + concurrent_queue_rep& r = *my_rep; + __TBB_ASSERT( sizeof(ptrdiff_t)<=sizeof(size_t), NULL ); + ticket hc = r.head_counter; + size_t nie = r.n_invalid_entries; + ticket tc = r.tail_counter; + __TBB_ASSERT( hc!=tc || !nie, NULL ); + ptrdiff_t sz = tc-hc-nie; + return sz<0 ? 0 : size_t(sz); +} + +template +bool concurrent_queue_base_v3::internal_empty() const { + concurrent_queue_rep& r = *my_rep; + ticket tc = r.tail_counter; + ticket hc = r.head_counter; + // if tc!=r.tail_counter, the queue was not empty at some point between the two reads. 
+ return tc==r.tail_counter && tc==hc+r.n_invalid_entries ; +} + +template +void concurrent_queue_base_v3::internal_finish_clear() { + concurrent_queue_rep& r = *my_rep; + size_t nq = r.n_queue; + for( size_t i=0; i +void concurrent_queue_base_v3::assign( const concurrent_queue_base_v3& src ) { + concurrent_queue_rep& r = *my_rep; + r.items_per_page = src.my_rep->items_per_page; + + // copy concurrent_queue_rep. + r.head_counter = src.my_rep->head_counter; + r.tail_counter = src.my_rep->tail_counter; + r.n_invalid_entries = src.my_rep->n_invalid_entries; + + // copy micro_queues + for( size_t i = 0; iarray[i], *this); + + __TBB_ASSERT( r.head_counter==src.my_rep->head_counter && r.tail_counter==src.my_rep->tail_counter, + "the source concurrent queue should not be concurrently modified." ); +} + +template class concurrent_queue_iterator; + +template +class concurrent_queue_iterator_rep: no_assign { +public: + ticket head_counter; + const concurrent_queue_base_v3& my_queue; + typename concurrent_queue_base_v3::page* array[concurrent_queue_rep::n_queue]; + concurrent_queue_iterator_rep( const concurrent_queue_base_v3& queue ) : + head_counter(queue.my_rep->head_counter), + my_queue(queue) + { + for( size_t k=0; k::n_queue; ++k ) + array[k] = queue.my_rep->array[k].head_page; + } + + //! Set item to point to kth element. Return true if at end of queue or item is marked valid; false otherwise. + bool get_item( void*& item, size_t k ) ; +}; + +template +bool concurrent_queue_iterator_rep::get_item( void*& item, size_t k ) { + if( k==my_queue.my_rep->tail_counter ) { + item = NULL; + return true; + } else { + typename concurrent_queue_base_v3::page* p = array[concurrent_queue_rep::index(k)]; + __TBB_ASSERT(p,NULL); + size_t i = k/concurrent_queue_rep::n_queue & (my_queue.my_rep->items_per_page-1); + item = static_cast(static_cast(p+1)) + my_queue.my_rep->item_size*i; + return (p->mask & uintptr_t(1)< +class concurrent_queue_iterator_base_v3 : no_assign { + //! Concurrentconcurrent_queue over which we are iterating. + /** NULL if one past last element in queue. */ + concurrent_queue_iterator_rep* my_rep; + + template + friend bool operator==( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ); + + template + friend bool operator!=( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ); +protected: + //! Pointer to current item + mutable void* my_item; + +public: + //! Default constructor + concurrent_queue_iterator_base_v3() : my_rep(NULL), my_item(NULL) { +#if __GNUC__==4&&__GNUC_MINOR__==3 + // to get around a possible gcc 4.3 bug + __asm__ __volatile__("": : :"memory"); +#endif + } + + //! Copy constructor + concurrent_queue_iterator_base_v3( const concurrent_queue_iterator_base_v3& i ) : my_rep(NULL), my_item(NULL) { + assign(i); + } + + //! Construct iterator pointing to head of queue. + concurrent_queue_iterator_base_v3( const concurrent_queue_base_v3& queue ) ; + +protected: + //! Assignment + void assign( const concurrent_queue_iterator_base_v3& other ) ; + + //! Advance iterator one step towards tail of queue. + void advance() ; + + //! 
Destructor + ~concurrent_queue_iterator_base_v3() { + cache_aligned_allocator >().deallocate(my_rep, 1); + my_rep = NULL; + } +}; + +template +concurrent_queue_iterator_base_v3::concurrent_queue_iterator_base_v3( const concurrent_queue_base_v3& queue ) { + my_rep = cache_aligned_allocator >().allocate(1); + new( my_rep ) concurrent_queue_iterator_rep(queue); + size_t k = my_rep->head_counter; + if( !my_rep->get_item(my_item, k) ) advance(); +} + +template +void concurrent_queue_iterator_base_v3::assign( const concurrent_queue_iterator_base_v3& other ) { + if( my_rep!=other.my_rep ) { + if( my_rep ) { + cache_aligned_allocator >().deallocate(my_rep, 1); + my_rep = NULL; + } + if( other.my_rep ) { + my_rep = cache_aligned_allocator >().allocate(1); + new( my_rep ) concurrent_queue_iterator_rep( *other.my_rep ); + } + } + my_item = other.my_item; +} + +template +void concurrent_queue_iterator_base_v3::advance() { + __TBB_ASSERT( my_item, "attempt to increment iterator past end of queue" ); + size_t k = my_rep->head_counter; + const concurrent_queue_base_v3& queue = my_rep->my_queue; +#if TBB_USE_ASSERT + void* tmp; + my_rep->get_item(tmp,k); + __TBB_ASSERT( my_item==tmp, NULL ); +#endif /* TBB_USE_ASSERT */ + size_t i = k/concurrent_queue_rep::n_queue & (queue.my_rep->items_per_page-1); + if( i==queue.my_rep->items_per_page-1 ) { + typename concurrent_queue_base_v3::page*& root = my_rep->array[concurrent_queue_rep::index(k)]; + root = root->next; + } + // advance k + my_rep->head_counter = ++k; + if( !my_rep->get_item(my_item, k) ) advance(); +} + +template +static inline const concurrent_queue_iterator_base_v3& add_constness( const concurrent_queue_iterator_base_v3& q ) +{ + return *reinterpret_cast *>(&q) ; +} + +//! Meets requirements of a forward iterator for STL. +/** Value is either the T or const T type of the container. + @ingroup containers */ +template +class concurrent_queue_iterator: public concurrent_queue_iterator_base_v3, + public std::iterator { +#if !__TBB_TEMPLATE_FRIENDS_BROKEN + template + friend class ::tbb::strict_ppl::concurrent_queue; +#else +public: // workaround for MSVC +#endif + //! Construct iterator pointing to head of queue. + concurrent_queue_iterator( const concurrent_queue_base_v3& queue ) : + concurrent_queue_iterator_base_v3(queue) + { + } + +public: + concurrent_queue_iterator() {} + + //! Copy constructor + concurrent_queue_iterator( const concurrent_queue_iterator& other ) : + concurrent_queue_iterator_base_v3(other) + { + } + + template + concurrent_queue_iterator( const concurrent_queue_iterator& other ) : + concurrent_queue_iterator_base_v3(add_constness(other)) + { + } + + //! Iterator assignment + concurrent_queue_iterator& operator=( const concurrent_queue_iterator& other ) { + assign(other); + return *this; + } + + //! Reference to current item + Value& operator*() const { + return *static_cast(this->my_item); + } + + Value* operator->() const {return &operator*();} + + //! Advance to next item in queue + concurrent_queue_iterator& operator++() { + this->advance(); + return *this; + } + + //! Post increment + Value* operator++(int) { + Value* result = &operator*(); + operator++(); + return result; + } +}; // concurrent_queue_iterator + + +template +bool operator==( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ) { + return i.my_item==j.my_item; +} + +template +bool operator!=( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ) { + return i.my_item!=j.my_item; +} + +} // namespace internal + +//! 
@endcond + +} // namespace strict_ppl + +//! @cond INTERNAL +namespace internal { + +class concurrent_queue_rep; +class concurrent_queue_iterator_rep; +class concurrent_queue_iterator_base_v3; +template class concurrent_queue_iterator; + +//! For internal use only. +/** Type-independent portion of concurrent_queue. + @ingroup containers */ +class concurrent_queue_base_v3: no_copy { + //! Internal representation + concurrent_queue_rep* my_rep; + + friend class concurrent_queue_rep; + friend struct micro_queue; + friend class micro_queue_pop_finalizer; + friend class concurrent_queue_iterator_rep; + friend class concurrent_queue_iterator_base_v3; +protected: + //! Prefix on a page + struct page { + page* next; + uintptr_t mask; + }; + + //! Capacity of the queue + ptrdiff_t my_capacity; + + //! Always a power of 2 + size_t items_per_page; + + //! Size of an item + size_t item_size; + +private: + virtual void copy_item( page& dst, size_t index, const void* src ) = 0; + virtual void assign_and_destroy_item( void* dst, page& src, size_t index ) = 0; +protected: + __TBB_EXPORTED_METHOD concurrent_queue_base_v3( size_t item_size ); + virtual __TBB_EXPORTED_METHOD ~concurrent_queue_base_v3(); + + //! Enqueue item at tail of queue + void __TBB_EXPORTED_METHOD internal_push( const void* src ); + + //! Dequeue item from head of queue + void __TBB_EXPORTED_METHOD internal_pop( void* dst ); + + //! Attempt to enqueue item onto queue. + bool __TBB_EXPORTED_METHOD internal_push_if_not_full( const void* src ); + + //! Attempt to dequeue item from queue. + /** NULL if there was no item to dequeue. */ + bool __TBB_EXPORTED_METHOD internal_pop_if_present( void* dst ); + + //! Get size of queue + ptrdiff_t __TBB_EXPORTED_METHOD internal_size() const; + + //! Check if the queue is emtpy + bool __TBB_EXPORTED_METHOD internal_empty() const; + + //! Set the queue capacity + void __TBB_EXPORTED_METHOD internal_set_capacity( ptrdiff_t capacity, size_t element_size ); + + //! custom allocator + virtual page *allocate_page() = 0; + + //! custom de-allocator + virtual void deallocate_page( page *p ) = 0; + + //! free any remaining pages + /* note that the name may be misleading, but it remains so due to a historical accident. */ + void __TBB_EXPORTED_METHOD internal_finish_clear() ; + + //! throw an exception + void __TBB_EXPORTED_METHOD internal_throw_exception() const; + + //! copy internal representation + void __TBB_EXPORTED_METHOD assign( const concurrent_queue_base_v3& src ) ; + +private: + virtual void copy_page_item( page& dst, size_t dindex, const page& src, size_t sindex ) = 0; +}; + +//! Type-independent portion of concurrent_queue_iterator. +/** @ingroup containers */ +class concurrent_queue_iterator_base_v3 { + //! Concurrentconcurrent_queue over which we are iterating. + /** NULL if one past last element in queue. */ + concurrent_queue_iterator_rep* my_rep; + + template + friend bool operator==( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ); + + template + friend bool operator!=( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ); +protected: + //! Pointer to current item + mutable void* my_item; + + //! Default constructor + concurrent_queue_iterator_base_v3() : my_rep(NULL), my_item(NULL) {} + + //! Copy constructor + concurrent_queue_iterator_base_v3( const concurrent_queue_iterator_base_v3& i ) : my_rep(NULL), my_item(NULL) { + assign(i); + } + + //! Construct iterator pointing to head of queue. 
+ __TBB_EXPORTED_METHOD concurrent_queue_iterator_base_v3( const concurrent_queue_base_v3& queue ); + + //! Assignment + void __TBB_EXPORTED_METHOD assign( const concurrent_queue_iterator_base_v3& i ); + + //! Advance iterator one step towards tail of queue. + void __TBB_EXPORTED_METHOD advance(); + + //! Destructor + __TBB_EXPORTED_METHOD ~concurrent_queue_iterator_base_v3(); +}; + +typedef concurrent_queue_iterator_base_v3 concurrent_queue_iterator_base; + +//! Meets requirements of a forward iterator for STL. +/** Value is either the T or const T type of the container. + @ingroup containers */ +template +class concurrent_queue_iterator: public concurrent_queue_iterator_base, + public std::iterator { +#if !defined(_MSC_VER) || defined(__INTEL_COMPILER) + template + friend class ::tbb::concurrent_bounded_queue; + + template + friend class ::tbb::deprecated::concurrent_queue; +#else +public: // workaround for MSVC +#endif + //! Construct iterator pointing to head of queue. + concurrent_queue_iterator( const concurrent_queue_base_v3& queue ) : + concurrent_queue_iterator_base_v3(queue) + { + } + +public: + concurrent_queue_iterator() {} + + /** If Value==Container::value_type, then this routine is the copy constructor. + If Value==const Container::value_type, then this routine is a conversion constructor. */ + concurrent_queue_iterator( const concurrent_queue_iterator& other ) : + concurrent_queue_iterator_base_v3(other) + {} + + //! Iterator assignment + concurrent_queue_iterator& operator=( const concurrent_queue_iterator& other ) { + assign(other); + return *this; + } + + //! Reference to current item + Value& operator*() const { + return *static_cast(my_item); + } + + Value* operator->() const {return &operator*();} + + //! Advance to next item in queue + concurrent_queue_iterator& operator++() { + advance(); + return *this; + } + + //! Post increment + Value* operator++(int) { + Value* result = &operator*(); + operator++(); + return result; + } +}; // concurrent_queue_iterator + + +template +bool operator==( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ) { + return i.my_item==j.my_item; +} + +template +bool operator!=( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ) { + return i.my_item!=j.my_item; +} + +} // namespace internal; + +//! @endcond + +} // namespace tbb + +#endif /* __TBB_concurrent_queue_internal_H */ diff --git a/dep/tbb/include/tbb/_tbb_windef.h b/dep/tbb/include/tbb/_tbb_windef.h new file mode 100644 index 000000000..ceb697dc3 --- /dev/null +++ b/dep/tbb/include/tbb/_tbb_windef.h @@ -0,0 +1,84 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_tbb_windef_H +#error Do not #include this file directly. Use "#include tbb/tbb_stddef.h" instead. +#endif /* __TBB_tbb_windef_H */ + +// Check that the target Windows version has all API calls requried for TBB. +// Do not increase the version in condition beyond 0x0500 without prior discussion! +#if defined(_WIN32_WINNT) && _WIN32_WINNT<0x0400 +#error TBB is unable to run on old Windows versions; _WIN32_WINNT must be 0x0400 or greater. +#endif + +#if !defined(_MT) +#error TBB requires linkage with multithreaded C/C++ runtime library. \ + Choose multithreaded DLL runtime in project settings, or use /MD[d] compiler switch. +#elif !defined(_DLL) +#pragma message("Warning: Using TBB together with static C/C++ runtime library is not recommended. " \ + "Consider switching your project to multithreaded DLL runtime used by TBB.") +#endif + +// Workaround for the problem with MVSC headers failing to define namespace std +namespace std { + using ::size_t; using ::ptrdiff_t; +} + +#define __TBB_STRING_AUX(x) #x +#define __TBB_STRING(x) __TBB_STRING_AUX(x) + +// Default setting of TBB_USE_DEBUG +#ifdef TBB_USE_DEBUG +# if TBB_USE_DEBUG +# if !defined(_DEBUG) +# pragma message(__FILE__ "(" __TBB_STRING(__LINE__) ") : Warning: Recommend using /MDd if compiling with TBB_USE_DEBUG!=0") +# endif +# else +# if defined(_DEBUG) +# pragma message(__FILE__ "(" __TBB_STRING(__LINE__) ") : Warning: Recommend using /MD if compiling with TBB_USE_DEBUG==0") +# endif +# endif +#else +# ifdef _DEBUG +# define TBB_USE_DEBUG 1 +# endif +#endif + +#if __TBB_BUILD && !defined(__TBB_NO_IMPLICIT_LINKAGE) +#define __TBB_NO_IMPLICIT_LINKAGE 1 +#endif + +#if _MSC_VER + #if !__TBB_NO_IMPLICIT_LINKAGE + #ifdef _DEBUG + #pragma comment(lib, "tbb_debug.lib") + #else + #pragma comment(lib, "tbb.lib") + #endif + #endif +#endif diff --git a/dep/tbb/include/tbb/aligned_space.h b/dep/tbb/include/tbb/aligned_space.h new file mode 100644 index 000000000..f9a08df5a --- /dev/null +++ b/dep/tbb/include/tbb/aligned_space.h @@ -0,0 +1,55 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_aligned_space_H +#define __TBB_aligned_space_H + +#include "tbb_stddef.h" +#include "tbb_machine.h" + +namespace tbb { + +//! Block of space aligned sufficiently to construct an array T with N elements. +/** The elements are not constructed or destroyed by this class. + @ingroup memory_allocation */ +template +class aligned_space { +private: + typedef __TBB_TypeWithAlignmentAtLeastAsStrict(T) element_type; + element_type array[(sizeof(T)*N+sizeof(element_type)-1)/sizeof(element_type)]; +public: + //! Pointer to beginning of array + T* begin() {return reinterpret_cast(this);} + + //! Pointer to one past last element in array. + T* end() {return begin()+N;} +}; + +} // namespace tbb + +#endif /* __TBB_aligned_space_H */ diff --git a/dep/tbb/include/tbb/atomic.h b/dep/tbb/include/tbb/atomic.h new file mode 100644 index 000000000..8f3517f1e --- /dev/null +++ b/dep/tbb/include/tbb/atomic.h @@ -0,0 +1,397 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_atomic_H +#define __TBB_atomic_H + +#include +#include "tbb_stddef.h" + +#if _MSC_VER +#define __TBB_LONG_LONG __int64 +#else +#define __TBB_LONG_LONG long long +#endif /* _MSC_VER */ + +#include "tbb_machine.h" + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + // Workaround for overzealous compiler warnings + #pragma warning (push) + #pragma warning (disable: 4244 4267) +#endif + +namespace tbb { + +//! Specifies memory fencing. +enum memory_semantics { + //! For internal use only. + __TBB_full_fence, + //! Acquire fence + acquire, + //! Release fence + release +}; + +//! 
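// [Editorial illustration; not part of the upstream TBB header or of this patch]
// aligned_space<T,N> above only reserves suitably aligned raw storage; the caller
// placement-constructs the elements and, for non-trivial types, destroys them.
// A minimal sketch:

#include "tbb/aligned_space.h"
#include <new>

void aligned_space_sketch() {
    tbb::aligned_space<double,4> storage;
    for( double* p = storage.begin(); p != storage.end(); ++p )
        new( p ) double( 0.0 );  // construct in place; double needs no explicit destruction
}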
@cond INTERNAL +namespace internal { + +#if __GNUC__ || __SUNPRO_CC +#define __TBB_DECL_ATOMIC_FIELD(t,f,a) t f __attribute__ ((aligned(a))); +#elif defined(__INTEL_COMPILER)||_MSC_VER >= 1300 +#define __TBB_DECL_ATOMIC_FIELD(t,f,a) __declspec(align(a)) t f; +#else +#error Do not know syntax for forcing alignment. +#endif /* __GNUC__ */ + +template +struct atomic_rep; // Primary template declared, but never defined. + +template<> +struct atomic_rep<1> { // Specialization + typedef int8_t word; + int8_t value; +}; +template<> +struct atomic_rep<2> { // Specialization + typedef int16_t word; + __TBB_DECL_ATOMIC_FIELD(int16_t,value,2) +}; +template<> +struct atomic_rep<4> { // Specialization +#if _MSC_VER && __TBB_WORDSIZE==4 + // Work-around that avoids spurious /Wp64 warnings + typedef intptr_t word; +#else + typedef int32_t word; +#endif + __TBB_DECL_ATOMIC_FIELD(int32_t,value,4) +}; +template<> +struct atomic_rep<8> { // Specialization + typedef int64_t word; + __TBB_DECL_ATOMIC_FIELD(int64_t,value,8) +}; + +template +struct atomic_traits; // Primary template declared, but not defined. + +#define __TBB_DECL_FENCED_ATOMIC_PRIMITIVES(S,M) \ + template<> struct atomic_traits { \ + typedef atomic_rep::word word; \ + inline static word compare_and_swap( volatile void* location, word new_value, word comparand ) {\ + return __TBB_CompareAndSwap##S##M(location,new_value,comparand); \ + } \ + inline static word fetch_and_add( volatile void* location, word addend ) { \ + return __TBB_FetchAndAdd##S##M(location,addend); \ + } \ + inline static word fetch_and_store( volatile void* location, word value ) {\ + return __TBB_FetchAndStore##S##M(location,value); \ + } \ + }; + +#define __TBB_DECL_ATOMIC_PRIMITIVES(S) \ + template \ + struct atomic_traits { \ + typedef atomic_rep::word word; \ + inline static word compare_and_swap( volatile void* location, word new_value, word comparand ) {\ + return __TBB_CompareAndSwap##S(location,new_value,comparand); \ + } \ + inline static word fetch_and_add( volatile void* location, word addend ) { \ + return __TBB_FetchAndAdd##S(location,addend); \ + } \ + inline static word fetch_and_store( volatile void* location, word value ) {\ + return __TBB_FetchAndStore##S(location,value); \ + } \ + }; + +#if __TBB_DECL_FENCED_ATOMICS +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(1,__TBB_full_fence) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(2,__TBB_full_fence) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(4,__TBB_full_fence) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(8,__TBB_full_fence) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(1,acquire) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(2,acquire) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(4,acquire) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(8,acquire) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(1,release) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(2,release) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(4,release) +__TBB_DECL_FENCED_ATOMIC_PRIMITIVES(8,release) +#else +__TBB_DECL_ATOMIC_PRIMITIVES(1) +__TBB_DECL_ATOMIC_PRIMITIVES(2) +__TBB_DECL_ATOMIC_PRIMITIVES(4) +__TBB_DECL_ATOMIC_PRIMITIVES(8) +#endif + +//! Additive inverse of 1 for type T. +/** Various compilers issue various warnings if -1 is used with various integer types. + The baroque expression below avoids all the warnings (we hope). */ +#define __TBB_MINUS_ONE(T) (T(T(0)-T(1))) + +//! Base class that provides basic functionality for atomic without fetch_and_add. +/** Works for any type T that has the same size as an integral type, has a trivial constructor/destructor, + and can be copied/compared by memcpy/memcmp. 
*/ +template +struct atomic_impl { +protected: + atomic_rep rep; +private: + //! Union type used to convert type T to underlying integral type. + union converter { + T value; + typename atomic_rep::word bits; + }; +public: + typedef T value_type; + + template + value_type fetch_and_store( value_type value ) { + converter u, w; + u.value = value; + w.bits = internal::atomic_traits::fetch_and_store(&rep.value,u.bits); + return w.value; + } + + value_type fetch_and_store( value_type value ) { + return fetch_and_store<__TBB_full_fence>(value); + } + + template + value_type compare_and_swap( value_type value, value_type comparand ) { + converter u, v, w; + u.value = value; + v.value = comparand; + w.bits = internal::atomic_traits::compare_and_swap(&rep.value,u.bits,v.bits); + return w.value; + } + + value_type compare_and_swap( value_type value, value_type comparand ) { + return compare_and_swap<__TBB_full_fence>(value,comparand); + } + + operator value_type() const volatile { // volatile qualifier here for backwards compatibility + converter w; + w.bits = __TBB_load_with_acquire( rep.value ); + return w.value; + } + +protected: + value_type store_with_release( value_type rhs ) { + converter u; + u.value = rhs; + __TBB_store_with_release(rep.value,u.bits); + return rhs; + } +}; + +//! Base class that provides basic functionality for atomic with fetch_and_add. +/** I is the underlying type. + D is the difference type. + StepType should be char if I is an integral type, and T if I is a T*. */ +template +struct atomic_impl_with_arithmetic: atomic_impl { +public: + typedef I value_type; + + template + value_type fetch_and_add( D addend ) { + return value_type(internal::atomic_traits::fetch_and_add( &this->rep.value, addend*sizeof(StepType) )); + } + + value_type fetch_and_add( D addend ) { + return fetch_and_add<__TBB_full_fence>(addend); + } + + template + value_type fetch_and_increment() { + return fetch_and_add(1); + } + + value_type fetch_and_increment() { + return fetch_and_add(1); + } + + template + value_type fetch_and_decrement() { + return fetch_and_add(__TBB_MINUS_ONE(D)); + } + + value_type fetch_and_decrement() { + return fetch_and_add(__TBB_MINUS_ONE(D)); + } + +public: + value_type operator+=( D addend ) { + return fetch_and_add(addend)+addend; + } + + value_type operator-=( D addend ) { + // Additive inverse of addend computed using binary minus, + // instead of unary minus, for sake of avoiding compiler warnings. + return operator+=(D(0)-addend); + } + + value_type operator++() { + return fetch_and_add(1)+1; + } + + value_type operator--() { + return fetch_and_add(__TBB_MINUS_ONE(D))-1; + } + + value_type operator++(int) { + return fetch_and_add(1); + } + + value_type operator--(int) { + return fetch_and_add(__TBB_MINUS_ONE(D)); + } +}; + +#if __TBB_WORDSIZE == 4 +// Plaforms with 32-bit hardware require special effort for 64-bit loads and stores. 
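// [Editorial illustration; not part of the upstream TBB header or of this patch]
// The operations defined above in atomic_impl and atomic_impl_with_arithmetic surface
// through tbb::atomic<T>, declared further down in this header. A minimal sketch of a
// shared counter:

#include "tbb/atomic.h"

void atomic_counter_sketch() {
    tbb::atomic<long> counter;
    counter = 0;                                // plain assignment is a release store
    long before = counter.fetch_and_add( 5 );   // returns the value held before the addition (0)
    ++counter;                                  // fetch_and_increment; counter is now 6
    counter.compare_and_swap( 10, 6 );          // store 10 only if the value is still 6
    (void)before;
}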
+#if defined(__INTEL_COMPILER)||!defined(_MSC_VER)||_MSC_VER>=1400 + +template<> +inline atomic_impl<__TBB_LONG_LONG>::operator atomic_impl<__TBB_LONG_LONG>::value_type() const volatile { + return __TBB_Load8(&rep.value); +} + +template<> +inline atomic_impl::operator atomic_impl::value_type() const volatile { + return __TBB_Load8(&rep.value); +} + +template<> +inline atomic_impl<__TBB_LONG_LONG>::value_type atomic_impl<__TBB_LONG_LONG>::store_with_release( value_type rhs ) { + __TBB_Store8(&rep.value,rhs); + return rhs; +} + +template<> +inline atomic_impl::value_type atomic_impl::store_with_release( value_type rhs ) { + __TBB_Store8(&rep.value,rhs); + return rhs; +} + +#endif /* defined(__INTEL_COMPILER)||!defined(_MSC_VER)||_MSC_VER>=1400 */ +#endif /* __TBB_WORDSIZE==4 */ + +} /* Internal */ +//! @endcond + +//! Primary template for atomic. +/** See the Reference for details. + @ingroup synchronization */ +template +struct atomic: internal::atomic_impl { + T operator=( T rhs ) { + // "this" required here in strict ISO C++ because store_with_release is a dependent name + return this->store_with_release(rhs); + } + atomic& operator=( const atomic& rhs ) {this->store_with_release(rhs); return *this;} +}; + +#define __TBB_DECL_ATOMIC(T) \ + template<> struct atomic: internal::atomic_impl_with_arithmetic { \ + T operator=( T rhs ) {return store_with_release(rhs);} \ + atomic& operator=( const atomic& rhs ) {store_with_release(rhs); return *this;} \ + }; + +#if defined(__INTEL_COMPILER)||!defined(_MSC_VER)||_MSC_VER>=1400 +__TBB_DECL_ATOMIC(__TBB_LONG_LONG) +__TBB_DECL_ATOMIC(unsigned __TBB_LONG_LONG) +#else +// Some old versions of MVSC cannot correctly compile templates with "long long". +#endif /* defined(__INTEL_COMPILER)||!defined(_MSC_VER)||_MSC_VER>=1400 */ + +__TBB_DECL_ATOMIC(long) +__TBB_DECL_ATOMIC(unsigned long) + +#if defined(_MSC_VER) && __TBB_WORDSIZE==4 +/* Special version of __TBB_DECL_ATOMIC that avoids gratuitous warnings from cl /Wp64 option. + It is identical to __TBB_DECL_ATOMIC(unsigned) except that it replaces operator=(T) + with an operator=(U) that explicitly converts the U to a T. Types T and U should be + type synonyms on the platform. Type U should be the wider variant of T from the + perspective of /Wp64. */ +#define __TBB_DECL_ATOMIC_ALT(T,U) \ + template<> struct atomic: internal::atomic_impl_with_arithmetic { \ + T operator=( U rhs ) {return store_with_release(T(rhs));} \ + atomic& operator=( const atomic& rhs ) {store_with_release(rhs); return *this;} \ + }; +__TBB_DECL_ATOMIC_ALT(unsigned,size_t) +__TBB_DECL_ATOMIC_ALT(int,ptrdiff_t) +#else +__TBB_DECL_ATOMIC(unsigned) +__TBB_DECL_ATOMIC(int) +#endif /* defined(_MSC_VER) && __TBB_WORDSIZE==4 */ + +__TBB_DECL_ATOMIC(unsigned short) +__TBB_DECL_ATOMIC(short) +__TBB_DECL_ATOMIC(char) +__TBB_DECL_ATOMIC(signed char) +__TBB_DECL_ATOMIC(unsigned char) + +#if !defined(_MSC_VER)||defined(_NATIVE_WCHAR_T_DEFINED) +__TBB_DECL_ATOMIC(wchar_t) +#endif /* _MSC_VER||!defined(_NATIVE_WCHAR_T_DEFINED) */ + +//! Specialization for atomic with arithmetic and operator->. +template struct atomic: internal::atomic_impl_with_arithmetic { + T* operator=( T* rhs ) { + // "this" required here in strict ISO C++ because store_with_release is a dependent name + return this->store_with_release(rhs); + } + atomic& operator=( const atomic& rhs ) { + this->store_with_release(rhs); return *this; + } + T* operator->() const { + return (*this); + } +}; + +//! Specialization for atomic, for sake of not allowing arithmetic or operator->. 
+template<> struct atomic: internal::atomic_impl { + void* operator=( void* rhs ) { + // "this" required here in strict ISO C++ because store_with_release is a dependent name + return this->store_with_release(rhs); + } + atomic& operator=( const atomic& rhs ) { + this->store_with_release(rhs); return *this; + } +}; + +} // namespace tbb + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + #pragma warning (pop) +#endif // warnings 4244, 4267 are back + +#endif /* __TBB_atomic_H */ diff --git a/dep/tbb/include/tbb/blocked_range.h b/dep/tbb/include/tbb/blocked_range.h new file mode 100644 index 000000000..fd20aa0c4 --- /dev/null +++ b/dep/tbb/include/tbb/blocked_range.h @@ -0,0 +1,129 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_blocked_range_H +#define __TBB_blocked_range_H + +#include "tbb_stddef.h" + +namespace tbb { + +/** \page range_req Requirements on range concept + Class \c R implementing the concept of range must define: + - \code R::R( const R& ); \endcode Copy constructor + - \code R::~R(); \endcode Destructor + - \code bool R::is_divisible() const; \endcode True if range can be partitioned into two subranges + - \code bool R::empty() const; \endcode True if range is empty + - \code R::R( R& r, split ); \endcode Split range \c r into two subranges. +**/ + +//! A range over which to iterate. +/** @ingroup algorithms */ +template +class blocked_range { +public: + //! Type of a value + /** Called a const_iterator for sake of algorithms that need to treat a blocked_range + as an STL container. */ + typedef Value const_iterator; + + //! Type for size of a range + typedef std::size_t size_type; + + //! Construct range with default-constructed values for begin and end. + /** Requires that Value have a default constructor. */ + blocked_range() : my_begin(), my_end() {} + + //! Construct range over half-open interval [begin,end), with the given grainsize. + blocked_range( Value begin_, Value end_, size_type grainsize_=1 ) : + my_end(end_), my_begin(begin_), my_grainsize(grainsize_) + { + __TBB_ASSERT( my_grainsize>0, "grainsize must be positive" ); + } + + //! Beginning of range. + const_iterator begin() const {return my_begin;} + + //! One past last value in range. 
+ const_iterator end() const {return my_end;} + + //! Size of the range + /** Unspecified if end() + friend class blocked_range2d; + + template + friend class blocked_range3d; +}; + +} // namespace tbb + +#endif /* __TBB_blocked_range_H */ diff --git a/dep/tbb/include/tbb/blocked_range2d.h b/dep/tbb/include/tbb/blocked_range2d.h new file mode 100644 index 000000000..d0e48b936 --- /dev/null +++ b/dep/tbb/include/tbb/blocked_range2d.h @@ -0,0 +1,97 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_blocked_range2d_H +#define __TBB_blocked_range2d_H + +#include "tbb_stddef.h" +#include "blocked_range.h" + +namespace tbb { + +//! A 2-dimensional range that models the Range concept. +/** @ingroup algorithms */ +template +class blocked_range2d { +public: + //! Type for size of an iteation range + typedef blocked_range row_range_type; + typedef blocked_range col_range_type; + +private: + row_range_type my_rows; + col_range_type my_cols; + +public: + + blocked_range2d( RowValue row_begin, RowValue row_end, typename row_range_type::size_type row_grainsize, + ColValue col_begin, ColValue col_end, typename col_range_type::size_type col_grainsize ) : + my_rows(row_begin,row_end,row_grainsize), + my_cols(col_begin,col_end,col_grainsize) + { + } + + blocked_range2d( RowValue row_begin, RowValue row_end, + ColValue col_begin, ColValue col_end ) : + my_rows(row_begin,row_end), + my_cols(col_begin,col_end) + { + } + + //! True if range is empty + bool empty() const { + // Yes, it is a logical OR here, not AND. + return my_rows.empty() || my_cols.empty(); + } + + //! True if range is divisible into two pieces. + bool is_divisible() const { + return my_rows.is_divisible() || my_cols.is_divisible(); + } + + blocked_range2d( blocked_range2d& r, split ) : + my_rows(r.my_rows), + my_cols(r.my_cols) + { + if( my_rows.size()*double(my_cols.grainsize()) < my_cols.size()*double(my_rows.grainsize()) ) { + my_cols.my_begin = col_range_type::do_split(r.my_cols); + } else { + my_rows.my_begin = row_range_type::do_split(r.my_rows); + } + } + + //! The rows of the iteration space + const row_range_type& rows() const {return my_rows;} + + //! 
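// [Editorial illustration; not part of the upstream TBB header or of this patch]
// blocked_range models the Range concept documented above (copy constructor, splitting
// constructor, empty(), is_divisible()); tbb::parallel_for from tbb/parallel_for.h
// (added elsewhere in this patch) splits it recursively and hands the leaf subranges to
// worker threads. A minimal sketch:

#include "tbb/blocked_range.h"
#include "tbb/parallel_for.h"

struct ScaleBody {
    double* data;
    void operator()( const tbb::blocked_range<size_t>& r ) const {
        for( size_t i = r.begin(); i != r.end(); ++i )
            data[i] *= 2.0;   // each leaf subrange is processed by exactly one worker
    }
};

void scale_sketch( double* a, size_t n ) {
    ScaleBody body = { a };
    tbb::parallel_for( tbb::blocked_range<size_t>( 0, n ), body );
}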
The columns of the iteration space + const col_range_type& cols() const {return my_cols;} +}; + +} // namespace tbb + +#endif /* __TBB_blocked_range2d_H */ diff --git a/dep/tbb/include/tbb/blocked_range3d.h b/dep/tbb/include/tbb/blocked_range3d.h new file mode 100644 index 000000000..6b6742f55 --- /dev/null +++ b/dep/tbb/include/tbb/blocked_range3d.h @@ -0,0 +1,116 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_blocked_range3d_H +#define __TBB_blocked_range3d_H + +#include "tbb_stddef.h" +#include "blocked_range.h" + +namespace tbb { + +//! A 3-dimensional range that models the Range concept. +/** @ingroup algorithms */ +template +class blocked_range3d { +public: + //! Type for size of an iteation range + typedef blocked_range page_range_type; + typedef blocked_range row_range_type; + typedef blocked_range col_range_type; + +private: + page_range_type my_pages; + row_range_type my_rows; + col_range_type my_cols; + +public: + + blocked_range3d( PageValue page_begin, PageValue page_end, + RowValue row_begin, RowValue row_end, + ColValue col_begin, ColValue col_end ) : + my_pages(page_begin,page_end), + my_rows(row_begin,row_end), + my_cols(col_begin,col_end) + { + } + + blocked_range3d( PageValue page_begin, PageValue page_end, typename page_range_type::size_type page_grainsize, + RowValue row_begin, RowValue row_end, typename row_range_type::size_type row_grainsize, + ColValue col_begin, ColValue col_end, typename col_range_type::size_type col_grainsize ) : + my_pages(page_begin,page_end,page_grainsize), + my_rows(row_begin,row_end,row_grainsize), + my_cols(col_begin,col_end,col_grainsize) + { + } + + //! True if range is empty + bool empty() const { + // Yes, it is a logical OR here, not AND. + return my_pages.empty() || my_rows.empty() || my_cols.empty(); + } + + //! True if range is divisible into two pieces. 
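// [Editorial illustration; not part of the upstream TBB header or of this patch]
// The splitting constructor above divides a 2-D range along whichever axis is
// proportionally larger; a body reads the two axes back through rows() and cols().
// A minimal sketch that zero-fills a row-major matrix:

#include "tbb/blocked_range2d.h"
#include "tbb/parallel_for.h"

struct FillBody {
    double* m;       // row-major matrix
    size_t  ld;      // leading dimension (number of columns)
    void operator()( const tbb::blocked_range2d<size_t>& r ) const {
        for( size_t i = r.rows().begin(); i != r.rows().end(); ++i )
            for( size_t j = r.cols().begin(); j != r.cols().end(); ++j )
                m[i*ld + j] = 0.0;
    }
};

void fill_sketch( double* m, size_t rows, size_t cols ) {
    FillBody body = { m, cols };
    tbb::parallel_for( tbb::blocked_range2d<size_t>( 0, rows, 0, cols ), body );
}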
+ bool is_divisible() const { + return my_pages.is_divisible() || my_rows.is_divisible() || my_cols.is_divisible(); + } + + blocked_range3d( blocked_range3d& r, split ) : + my_pages(r.my_pages), + my_rows(r.my_rows), + my_cols(r.my_cols) + { + if( my_pages.size()*double(my_rows.grainsize()) < my_rows.size()*double(my_pages.grainsize()) ) { + if ( my_rows.size()*double(my_cols.grainsize()) < my_cols.size()*double(my_rows.grainsize()) ) { + my_cols.my_begin = col_range_type::do_split(r.my_cols); + } else { + my_rows.my_begin = row_range_type::do_split(r.my_rows); + } + } else { + if ( my_pages.size()*double(my_cols.grainsize()) < my_cols.size()*double(my_pages.grainsize()) ) { + my_cols.my_begin = col_range_type::do_split(r.my_cols); + } else { + my_pages.my_begin = page_range_type::do_split(r.my_pages); + } + } + } + + //! The pages of the iteration space + const page_range_type& pages() const {return my_pages;} + + //! The rows of the iteration space + const row_range_type& rows() const {return my_rows;} + + //! The columns of the iteration space + const col_range_type& cols() const {return my_cols;} + +}; + +} // namespace tbb + +#endif /* __TBB_blocked_range3d_H */ diff --git a/dep/tbb/include/tbb/cache_aligned_allocator.h b/dep/tbb/include/tbb/cache_aligned_allocator.h new file mode 100644 index 000000000..449dcb1ed --- /dev/null +++ b/dep/tbb/include/tbb/cache_aligned_allocator.h @@ -0,0 +1,133 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_cache_aligned_allocator_H +#define __TBB_cache_aligned_allocator_H + +#include +#include "tbb_stddef.h" + +namespace tbb { + +//! @cond INTERNAL +namespace internal { + //! Cache/sector line size. + /** @ingroup memory_allocation */ + size_t __TBB_EXPORTED_FUNC NFS_GetLineSize(); + + //! Allocate memory on cache/sector line boundary. + /** @ingroup memory_allocation */ + void* __TBB_EXPORTED_FUNC NFS_Allocate( size_t n_element, size_t element_size, void* hint ); + + //! Free memory allocated by NFS_Allocate. + /** Freeing a NULL pointer is allowed, but has no effect. + @ingroup memory_allocation */ + void __TBB_EXPORTED_FUNC NFS_Free( void* ); +} +//! 
@endcond + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // Workaround for erroneous "unreferenced parameter" warning in method destroy. + #pragma warning (push) + #pragma warning (disable: 4100) +#endif + +//! Meets "allocator" requirements of ISO C++ Standard, Section 20.1.5 +/** The members are ordered the same way they are in section 20.4.1 + of the ISO C++ standard. + @ingroup memory_allocation */ +template +class cache_aligned_allocator { +public: + typedef typename internal::allocator_type::value_type value_type; + typedef value_type* pointer; + typedef const value_type* const_pointer; + typedef value_type& reference; + typedef const value_type& const_reference; + typedef size_t size_type; + typedef ptrdiff_t difference_type; + template struct rebind { + typedef cache_aligned_allocator other; + }; + + cache_aligned_allocator() throw() {} + cache_aligned_allocator( const cache_aligned_allocator& ) throw() {} + template cache_aligned_allocator(const cache_aligned_allocator&) throw() {} + + pointer address(reference x) const {return &x;} + const_pointer address(const_reference x) const {return &x;} + + //! Allocate space for n objects, starting on a cache/sector line. + pointer allocate( size_type n, const void* hint=0 ) { + // The "hint" argument is always ignored in NFS_Allocate thus const_cast shouldn't hurt + return pointer(internal::NFS_Allocate( n, sizeof(value_type), const_cast(hint) )); + } + + //! Free block of memory that starts on a cache line + void deallocate( pointer p, size_type ) { + internal::NFS_Free(p); + } + + //! Largest value for which method allocate might succeed. + size_type max_size() const throw() { + return (~size_t(0)-internal::NFS_MaxLineSize)/sizeof(value_type); + } + + //! Copy-construct value at location pointed to by p. + void construct( pointer p, const value_type& value ) {new(static_cast(p)) value_type(value);} + + //! Destroy value at location pointed to by p. + void destroy( pointer p ) {p->~value_type();} +}; + +#if _MSC_VER && !defined(__INTEL_COMPILER) + #pragma warning (pop) +#endif // warning 4100 is back + +//! Analogous to std::allocator, as defined in ISO C++ Standard, Section 20.4.1 +/** @ingroup memory_allocation */ +template<> +class cache_aligned_allocator { +public: + typedef void* pointer; + typedef const void* const_pointer; + typedef void value_type; + template struct rebind { + typedef cache_aligned_allocator other; + }; +}; + +template +inline bool operator==( const cache_aligned_allocator&, const cache_aligned_allocator& ) {return true;} + +template +inline bool operator!=( const cache_aligned_allocator&, const cache_aligned_allocator& ) {return false;} + +} // namespace tbb + +#endif /* __TBB_cache_aligned_allocator_H */ diff --git a/dep/tbb/include/tbb/combinable.h b/dep/tbb/include/tbb/combinable.h new file mode 100644 index 000000000..9122ffa8e --- /dev/null +++ b/dep/tbb/include/tbb/combinable.h @@ -0,0 +1,78 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
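// [Editorial illustration; not part of the upstream TBB header or of this patch]
// cache_aligned_allocator above is a drop-in STL allocator: every allocation starts on
// its own cache line (via NFS_Allocate), which avoids false sharing between hot data
// structures at the cost of some padding. A minimal sketch:

#include "tbb/cache_aligned_allocator.h"
#include <vector>

void allocator_sketch() {
    std::vector<double, tbb::cache_aligned_allocator<double> > hot_data;
    hot_data.resize( 1024, 0.0 );   // storage obtained through NFS_Allocate()
}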
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_combinable_H +#define __TBB_combinable_H + +#include "tbb/enumerable_thread_specific.h" +#include "tbb/cache_aligned_allocator.h" + +namespace tbb { +/** \name combinable + **/ +//@{ +//! Thread-local storage with optional reduction +/** @ingroup containers */ + template + class combinable { + private: + typedef typename tbb::cache_aligned_allocator my_alloc; + + typedef typename tbb::enumerable_thread_specific my_ets_type; + my_ets_type my_ets; + + public: + + combinable() { } + + template + combinable( finit _finit) : my_ets(_finit) { } + + //! destructor + ~combinable() { + } + + combinable(const combinable& other) : my_ets(other.my_ets) { } + + combinable & operator=( const combinable & other) { my_ets = other.my_ets; return *this; } + + void clear() { my_ets.clear(); } + + T& local() { return my_ets.local(); } + + T& local(bool & exists) { return my_ets.local(exists); } + + template< typename FCombine> + T combine(FCombine fcombine) { return my_ets.combine(fcombine); } + + template + void combine_each(FCombine fcombine) { my_ets.combine_each(fcombine); } + + }; +} // namespace tbb +#endif /* __TBB_combinable_H */ diff --git a/dep/tbb/include/tbb/compat/ppl.h b/dep/tbb/include/tbb/compat/ppl.h new file mode 100644 index 000000000..998bd0015 --- /dev/null +++ b/dep/tbb/include/tbb/compat/ppl.h @@ -0,0 +1,58 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
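// [Editorial illustration; not part of the upstream TBB header or of this patch]
// combinable<T> above keeps one lazily created T per thread (local()) and reduces the
// per-thread copies afterwards with combine(). A minimal sketch of a per-thread partial
// sum; the body type and function names are illustrative only:

#include "tbb/combinable.h"
#include "tbb/blocked_range.h"
#include "tbb/parallel_for.h"
#include <functional>

static tbb::combinable<long> partial_sums;

struct SumBody {
    const int* data;
    void operator()( const tbb::blocked_range<size_t>& r ) const {
        long& local = partial_sums.local();   // this thread's private slot
        for( size_t i = r.begin(); i != r.end(); ++i )
            local += data[i];
    }
};

long sum_sketch( const int* data, size_t n ) {
    partial_sums.clear();
    SumBody body = { data };
    tbb::parallel_for( tbb::blocked_range<size_t>( 0, n ), body );
    return partial_sums.combine( std::plus<long>() );
}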
+*/ + +#ifndef __TBB_compat_ppl_H +#define __TBB_compat_ppl_H + +#include "../task_group.h" +#include "../parallel_invoke.h" +#include "../parallel_for_each.h" +#include "../parallel_for.h" + +namespace Concurrency { + + using tbb::task_handle; + using tbb::task_group_status; + using tbb::task_group; + using tbb::structured_task_group; + using tbb::missing_wait; + using tbb::make_task; + + using tbb::not_complete; + using tbb::complete; + using tbb::canceled; + + using tbb::is_current_task_group_canceling; + + using tbb::parallel_invoke; + using tbb::strict_ppl::parallel_for; + using tbb::parallel_for_each; + +} // namespace Concurrency + +#endif /* __TBB_compat_ppl_H */ diff --git a/dep/tbb/include/tbb/concurrent_hash_map.h b/dep/tbb/include/tbb/concurrent_hash_map.h new file mode 100644 index 000000000..ea4138fcc --- /dev/null +++ b/dep/tbb/include/tbb/concurrent_hash_map.h @@ -0,0 +1,1262 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_concurrent_hash_map_H +#define __TBB_concurrent_hash_map_H + +#include +#include +#include // Need std::pair +#include // Need std::memset +#include +#include "tbb_stddef.h" +#include "cache_aligned_allocator.h" +#include "tbb_allocator.h" +#include "spin_rw_mutex.h" +#include "atomic.h" +#include "aligned_space.h" +#if TBB_USE_PERFORMANCE_WARNINGS +#include +#endif + +namespace tbb { + +template struct tbb_hash_compare; +template, typename A = tbb_allocator > > +class concurrent_hash_map; + +//! @cond INTERNAL +namespace internal { + //! ITT instrumented routine that loads pointer from location pointed to by src. + void* __TBB_EXPORTED_FUNC itt_load_pointer_with_acquire_v3( const void* src ); + //! ITT instrumented routine that stores src into location pointed to by dst. + void __TBB_EXPORTED_FUNC itt_store_pointer_with_release_v3( void* dst, void* src ); + //! Routine that loads pointer from location pointed to by src without causing ITT to report a race. + void* __TBB_EXPORTED_FUNC itt_load_pointer_v3( const void* src ); + + //! Type of a hash code. + typedef size_t hashcode_t; + //! Node base type + struct hash_map_node_base : no_copy { + //! Mutex type + typedef spin_rw_mutex mutex_t; + //! 
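// [Editorial illustration; not part of the upstream TBB header or of this patch]
// The compat header above lets PPL-style code run on top of TBB: Concurrency::task_group
// and Concurrency::parallel_for are the tbb::task_group and tbb::strict_ppl::parallel_for
// aliases just introduced. A minimal sketch; process_index and background_work are
// placeholder functions:

#include "tbb/compat/ppl.h"

void process_index( size_t /*i*/ ) {}   // per-index work would go here
void background_work() {}               // independent work would go here

void ppl_compat_sketch() {
    // Index-space loop through the PPL-style overload of parallel_for.
    Concurrency::parallel_for( size_t(0), size_t(100), &process_index );

    // Fork/join of an independent task through the task_group alias.
    Concurrency::task_group g;
    g.run( &background_work );
    g.wait();
}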
Scoped lock type for mutex + typedef mutex_t::scoped_lock scoped_t; + //! Next node in chain + hash_map_node_base *next; + mutex_t mutex; + }; + //! Incompleteness flag value + static hash_map_node_base *const rehash_req = reinterpret_cast(size_t(3)); + //! Rehashed empty bucket flag + static hash_map_node_base *const empty_rehashed = reinterpret_cast(size_t(0)); + //! base class of concurrent_hash_map + class hash_map_base { + public: + //! Size type + typedef size_t size_type; + //! Type of a hash code. + typedef size_t hashcode_t; + //! Segment index type + typedef size_t segment_index_t; + //! Node base type + typedef hash_map_node_base node_base; + //! Bucket type + struct bucket : no_copy { + //! Mutex type for buckets + typedef spin_rw_mutex mutex_t; + //! Scoped lock type for mutex + typedef mutex_t::scoped_lock scoped_t; + mutex_t mutex; + node_base *node_list; + }; + //! Count of segments in the first block + static size_type const embedded_block = 1; + //! Count of segments in the first block + static size_type const embedded_buckets = 1< my_mask; + //! Segment pointers table. Also prevents false sharing between my_mask and my_size + segments_table_t my_table; + //! Size of container in stored items + atomic my_size; // It must be in separate cache line from my_mask due to performance effects + //! Zero segment + bucket my_embedded_segment[embedded_buckets]; + + //! Constructor + hash_map_base() { + std::memset( this, 0, pointers_per_table*sizeof(segment_ptr_t) // 32*4=128 or 64*8=512 + + sizeof(my_size) + sizeof(my_mask) // 4+4 or 8+8 + + embedded_buckets*sizeof(bucket) ); // n*8 or n*16 + for( size_type i = 0; i < embedded_block; i++ ) // fill the table + my_table[i] = my_embedded_segment + segment_base(i); + my_mask = embedded_buckets - 1; + __TBB_ASSERT( embedded_block <= first_block, "The first block number must include embedded blocks"); + } + + //! @return segment index of given index in the array + static segment_index_t segment_index_of( size_type index ) { + return segment_index_t( __TBB_Log2( index|1 ) ); + } + + //! @return the first array index of given segment + static segment_index_t segment_base( segment_index_t k ) { + return (segment_index_t(1)<(ptr) > size_t(63); + } + + //! Initialize buckets + static void init_buckets( segment_ptr_t ptr, size_type sz, bool is_initial ) { + if( is_initial ) std::memset(ptr, 0, sz*sizeof(bucket) ); + else for(size_type i = 0; i < sz; i++, ptr++) { + *reinterpret_cast(&ptr->mutex) = 0; + ptr->node_list = rehash_req; + } + } + + //! Add node @arg n to bucket @arg b + static void add_to_bucket( bucket *b, node_base *n ) { + __TBB_ASSERT(b->node_list != rehash_req, NULL); + n->next = b->node_list; + b->node_list = n; // its under lock and flag is set + } + + //! Exception safety helper + struct enable_segment_failsafe { + segment_ptr_t *my_segment_ptr; + enable_segment_failsafe(segments_table_t &table, segment_index_t k) : my_segment_ptr(&table[k]) {} + ~enable_segment_failsafe() { + if( my_segment_ptr ) *my_segment_ptr = 0; // indicate no allocation in progress + } + }; + + //! 
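// [Editorial note; not part of the upstream TBB header or of this patch]
// A worked example of the segment arithmetic above, assuming embedded_block == 1
// (as declared above) and therefore embedded_buckets == 2:
//   buckets 0..1  -> segment 0 (embedded storage), segment_base(0) == 0
//   buckets 2..3  -> segment 1, segment_base(1) == 2, segment_size(1) == 2
//   buckets 4..7  -> segment 2, segment_base(2) == 4, segment_size(2) == 4
//   buckets 8..15 -> segment 3, segment_base(3) == 8, segment_size(3) == 8
// For example, segment_index_of(5) == __TBB_Log2(5|1) == 2, and 5 - segment_base(2) == 1,
// so bucket 5 is the second slot of segment 2.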
Enable segment + void enable_segment( segment_index_t k, bool is_initial = false ) { + __TBB_ASSERT( k, "Zero segment must be embedded" ); + enable_segment_failsafe watchdog( my_table, k ); + cache_aligned_allocator alloc; + size_type sz; + __TBB_ASSERT( !is_valid(my_table[k]), "Wrong concurrent assignment"); + if( k >= first_block ) { + sz = segment_size( k ); + segment_ptr_t ptr = alloc.allocate( sz ); + init_buckets( ptr, sz, is_initial ); +#if TBB_USE_THREADING_TOOLS + // TODO: actually, fence and notification are unnecessary here and below + itt_store_pointer_with_release_v3( my_table + k, ptr ); +#else + my_table[k] = ptr;// my_mask has release fence +#endif + sz <<= 1;// double it to get entire capacity of the container + } else { // the first block + __TBB_ASSERT( k == embedded_block, "Wrong segment index" ); + sz = segment_size( first_block ); + segment_ptr_t ptr = alloc.allocate( sz - embedded_buckets ); + init_buckets( ptr, sz - embedded_buckets, is_initial ); + ptr -= segment_base(embedded_block); + for(segment_index_t i = embedded_block; i < first_block; i++) // calc the offsets +#if TBB_USE_THREADING_TOOLS + itt_store_pointer_with_release_v3( my_table + i, ptr + segment_base(i) ); +#else + my_table[i] = ptr + segment_base(i); +#endif + } +#if TBB_USE_THREADING_TOOLS + itt_store_pointer_with_release_v3( &my_mask, (void*)(sz-1) ); +#else + my_mask = sz - 1; +#endif + watchdog.my_segment_ptr = 0; + } + + //! Get bucket by (masked) hashcode + bucket *get_bucket( hashcode_t h ) const throw() { // TODO: add throw() everywhere? + segment_index_t s = segment_index_of( h ); + h -= segment_base(s); + segment_ptr_t seg = my_table[s]; + __TBB_ASSERT( is_valid(seg), "hashcode must be cut by valid mask for allocated segments" ); + return &seg[h]; + } + + //! Check for mask race + // Splitting into two functions should help inlining + inline bool check_mask_race( const hashcode_t h, hashcode_t &m ) const { + hashcode_t m_now, m_old = m; +#if TBB_USE_THREADING_TOOLS + m_now = (hashcode_t) itt_load_pointer_with_acquire_v3( &my_mask ); +#else + m_now = my_mask; +#endif + if( m_old != m_now ) + return check_rehashing_collision( h, m_old, m = m_now ); + return false; + } + + //! Process mask race, check for rehashing collision + bool check_rehashing_collision( const hashcode_t h, hashcode_t m_old, hashcode_t m ) const { + __TBB_ASSERT(m_old != m, NULL); // TODO?: m arg could be optimized out by passing h = h&m + if( (h & m_old) != (h & m) ) { // mask changed for this hashcode, rare event + // condition above proves that 'h' has some other bits set beside 'm_old' + // find next applicable mask after m_old //TODO: look at bsl instruction + for( ++m_old; !(h & m_old); m_old <<= 1 ); // at maximum few rounds depending on the first block size + m_old = (m_old<<1) - 1; // get full mask from a bit + __TBB_ASSERT((m_old&(m_old+1))==0 && m_old <= m, NULL); + // check whether it is rehashing/ed +#if TBB_USE_THREADING_TOOLS + if( itt_load_pointer_with_acquire_v3(&( get_bucket(h & m_old)->node_list )) != rehash_req ) +#else + if( __TBB_load_with_acquire(get_bucket( h & m_old )->node_list) != rehash_req ) +#endif + return true; + } + return false; + } + + //! Insert a node and check for load factor. @return segment index to enable. 
+ segment_index_t insert_new_node( bucket *b, node_base *n, hashcode_t mask ) { + size_type sz = ++my_size; // prefix form is to enforce allocation after the first item inserted + add_to_bucket( b, n ); + // check load factor + if( sz >= mask ) { // TODO: add custom load_factor + segment_index_t new_seg = segment_index_of( mask+1 ); + __TBB_ASSERT( is_valid(my_table[new_seg-1]), "new allocations must not publish new mask until segment has allocated"); +#if TBB_USE_THREADING_TOOLS + if( !itt_load_pointer_v3(my_table+new_seg) +#else + if( !my_table[new_seg] +#endif + && __TBB_CompareAndSwapW(&my_table[new_seg], 2, 0) == 0 ) + return new_seg; // The value must be processed + } + return 0; + } + + //! Prepare enough segments for number of buckets + void reserve(size_type buckets) { + if( !buckets-- ) return; + bool is_initial = !my_size; + for( size_type m = my_mask; buckets > m; m = my_mask ) + enable_segment( segment_index_of( m+1 ), is_initial ); + } + //! Swap hash_map_bases + void internal_swap(hash_map_base &table) { + std::swap(this->my_mask, table.my_mask); + std::swap(this->my_size, table.my_size); + for(size_type i = 0; i < embedded_buckets; i++) + std::swap(this->my_embedded_segment[i].node_list, table.my_embedded_segment[i].node_list); + for(size_type i = embedded_block; i < pointers_per_table; i++) + std::swap(this->my_table[i], table.my_table[i]); + } + }; + + template + class hash_map_range; + + //! Meets requirements of a forward iterator for STL */ + /** Value is either the T or const T type of the container. + @ingroup containers */ + template + class hash_map_iterator + : public std::iterator + { + typedef Container map_type; + typedef typename Container::node node; + typedef hash_map_base::node_base node_base; + typedef hash_map_base::bucket bucket; + + template + friend bool operator==( const hash_map_iterator& i, const hash_map_iterator& j ); + + template + friend bool operator!=( const hash_map_iterator& i, const hash_map_iterator& j ); + + template + friend ptrdiff_t operator-( const hash_map_iterator& i, const hash_map_iterator& j ); + + template + friend class internal::hash_map_iterator; + + template + friend class internal::hash_map_range; + + void advance_to_next_bucket() { // TODO?: refactor to iterator_base class + size_t k = my_index+1; + while( my_bucket && k <= my_map->my_mask ) { + // Following test uses 2's-complement wizardry + if( k& (k-2) ) // not the beginning of a segment + ++my_bucket; + else my_bucket = my_map->get_bucket( k ); + my_node = static_cast( my_bucket->node_list ); + if( hash_map_base::is_valid(my_node) ) { + my_index = k; return; + } + ++k; + } + my_bucket = 0; my_node = 0; my_index = k; // the end + } +#if !defined(_MSC_VER) || defined(__INTEL_COMPILER) + template + friend class tbb::concurrent_hash_map; +#else + public: // workaround +#endif + //! concurrent_hash_map over which we are iterating. + const Container *my_map; + + //! Index in hash table for current item + size_t my_index; + + //! Pointer to bucket + const bucket *my_bucket; + + //! Pointer to node that has current item + node *my_node; + + hash_map_iterator( const Container &map, size_t index, const bucket *b, node_base *n ); + + public: + //! 
Construct undefined iterator + hash_map_iterator() {} + hash_map_iterator( const hash_map_iterator &other ) : + my_map(other.my_map), + my_index(other.my_index), + my_bucket(other.my_bucket), + my_node(other.my_node) + {} + Value& operator*() const { + __TBB_ASSERT( hash_map_base::is_valid(my_node), "iterator uninitialized or at end of container?" ); + return my_node->item; + } + Value* operator->() const {return &operator*();} + hash_map_iterator& operator++(); + + //! Post increment + Value* operator++(int) { + Value* result = &operator*(); + operator++(); + return result; + } + }; + + template + hash_map_iterator::hash_map_iterator( const Container &map, size_t index, const bucket *b, node_base *n ) : + my_map(&map), + my_index(index), + my_bucket(b), + my_node( static_cast(n) ) + { + if( b && !hash_map_base::is_valid(n) ) + advance_to_next_bucket(); + } + + template + hash_map_iterator& hash_map_iterator::operator++() { + my_node = static_cast( my_node->next ); + if( !my_node ) advance_to_next_bucket(); + return *this; + } + + template + bool operator==( const hash_map_iterator& i, const hash_map_iterator& j ) { + return i.my_node == j.my_node && i.my_map == j.my_map; + } + + template + bool operator!=( const hash_map_iterator& i, const hash_map_iterator& j ) { + return i.my_node != j.my_node || i.my_map != j.my_map; + } + + //! Range class used with concurrent_hash_map + /** @ingroup containers */ + template + class hash_map_range { + typedef typename Iterator::map_type map_type; + Iterator my_begin; + Iterator my_end; + mutable Iterator my_midpoint; + size_t my_grainsize; + //! Set my_midpoint to point approximately half way between my_begin and my_end. + void set_midpoint() const; + template friend class hash_map_range; + public: + //! Type for size of a range + typedef std::size_t size_type; + typedef typename Iterator::value_type value_type; + typedef typename Iterator::reference reference; + typedef typename Iterator::difference_type difference_type; + typedef Iterator iterator; + + //! True if range is empty. + bool empty() const {return my_begin==my_end;} + + //! True if range can be partitioned into two subranges. + bool is_divisible() const { + return my_midpoint!=my_end; + } + //! Split range. + hash_map_range( hash_map_range& r, split ) : + my_end(r.my_end), + my_grainsize(r.my_grainsize) + { + r.my_end = my_begin = r.my_midpoint; + __TBB_ASSERT( !empty(), "Splitting despite the range is not divisible" ); + __TBB_ASSERT( !r.empty(), "Splitting despite the range is not divisible" ); + set_midpoint(); + r.set_midpoint(); + } + //! type conversion + template + hash_map_range( hash_map_range& r) : + my_begin(r.my_begin), + my_end(r.my_end), + my_midpoint(r.my_midpoint), + my_grainsize(r.my_grainsize) + {} +#if TBB_DEPRECATED + //! Init range with iterators and grainsize specified + hash_map_range( const Iterator& begin_, const Iterator& end_, size_type grainsize = 1 ) : + my_begin(begin_), + my_end(end_), + my_grainsize(grainsize) + { + if(!my_end.my_index && !my_end.my_bucket) // end + my_end.my_index = my_end.my_map->my_mask + 1; + set_midpoint(); + __TBB_ASSERT( grainsize>0, "grainsize must be positive" ); + } +#endif + //! 
Init range with container and grainsize specified + hash_map_range( const map_type &map, size_type grainsize = 1 ) : + my_begin( Iterator( map, 0, map.my_embedded_segment, map.my_embedded_segment->node_list ) ), + my_end( Iterator( map, map.my_mask + 1, 0, 0 ) ), + my_grainsize( grainsize ) + { + __TBB_ASSERT( grainsize>0, "grainsize must be positive" ); + set_midpoint(); + } + const Iterator& begin() const {return my_begin;} + const Iterator& end() const {return my_end;} + //! The grain size for this range. + size_type grainsize() const {return my_grainsize;} + }; + + template + void hash_map_range::set_midpoint() const { + // Split by groups of nodes + size_t m = my_end.my_index-my_begin.my_index; + if( m > my_grainsize ) { + m = my_begin.my_index + m/2u; + hash_map_base::bucket *b = my_begin.my_map->get_bucket(m); + my_midpoint = Iterator(*my_begin.my_map,m,b,b->node_list); + } else { + my_midpoint = my_end; + } + __TBB_ASSERT( my_begin.my_index <= my_midpoint.my_index, + "my_begin is after my_midpoint" ); + __TBB_ASSERT( my_midpoint.my_index <= my_end.my_index, + "my_midpoint is after my_end" ); + __TBB_ASSERT( my_begin != my_midpoint || my_begin == my_end, + "[my_begin, my_midpoint) range should not be empty" ); + } +} // namespace internal +//! @endcond + +//! Hash multiplier +static const size_t hash_multiplier = sizeof(size_t)==4? 2654435769U : 11400714819323198485ULL; +//! Hasher functions +template +inline static size_t tbb_hasher( const T& t ) { + return static_cast( t ) * hash_multiplier; +} +template +inline static size_t tbb_hasher( P* ptr ) { + size_t const h = reinterpret_cast( ptr ); + return (h >> 3) ^ h; +} +template +inline static size_t tbb_hasher( const std::basic_string& s ) { + size_t h = 0; + for( const E* c = s.c_str(); *c; c++ ) + h = static_cast(*c) ^ (h * hash_multiplier); + return h; +} +template +inline static size_t tbb_hasher( const std::pair& p ) { + return tbb_hasher(p.first) ^ tbb_hasher(p.second); +} + +//! hash_compare - default argument +template +struct tbb_hash_compare { + static size_t hash( const T& t ) { return tbb_hasher(t); } + static bool equal( const T& a, const T& b ) { return a == b; } +}; + +//! Unordered map from Key to T. +/** concurrent_hash_map is associative container with concurrent access. + +@par Compatibility + The class meets all Container Requirements from C++ Standard (See ISO/IEC 14882:2003(E), clause 23.1). + +@par Exception Safety + - Hash function is not permitted to throw an exception. User-defined types Key and T are forbidden from throwing an exception in destructors. + - If exception happens during insert() operations, it has no effect (unless exception raised by HashCompare::hash() function during grow_segment). + - If exception happens during operator=() operation, the container can have a part of source items, and methods size() and empty() can return wrong results. + +@par Changes since TBB 2.1 + - Replaced internal algorithm and data structure. Patent is pending. 
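// [Editorial illustration; not part of the upstream TBB header or of this patch]
// tbb_hash_compare above is the default hashing policy; a map keyed by a user-defined
// type can supply its own policy with the same two members. PointKey and
// PointKeyHashCompare below are illustrative names only:

struct PointKey {
    int x, y;
};

struct PointKeyHashCompare {
    static size_t hash( const PointKey& k ) {
        // Reuse the library's integer hasher for each coordinate.
        return tbb::tbb_hasher( k.x ) ^ ( tbb::tbb_hasher( k.y ) * 3 );
    }
    static bool equal( const PointKey& a, const PointKey& b ) {
        return a.x == b.x && a.y == b.y;
    }
};
// The policy is passed as the third template argument:
//   tbb::concurrent_hash_map<PointKey, double, PointKeyHashCompare> table;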
+ - Added buckets number argument for constructor + +@par Changes since TBB 2.0 + - Fixed exception-safety + - Added template argument for allocator + - Added allocator argument in constructors + - Added constructor from a range of iterators + - Added several new overloaded insert() methods + - Added get_allocator() + - Added swap() + - Added count() + - Added overloaded erase(accessor &) and erase(const_accessor&) + - Added equal_range() [const] + - Added [const_]pointer, [const_]reference, and allocator_type types + - Added global functions: operator==(), operator!=(), and swap() + + @ingroup containers */ +template +class concurrent_hash_map : protected internal::hash_map_base { + template + friend class internal::hash_map_iterator; + + template + friend class internal::hash_map_range; + +public: + typedef Key key_type; + typedef T mapped_type; + typedef std::pair value_type; + typedef internal::hash_map_base::size_type size_type; + typedef ptrdiff_t difference_type; + typedef value_type *pointer; + typedef const value_type *const_pointer; + typedef value_type &reference; + typedef const value_type &const_reference; + typedef internal::hash_map_iterator iterator; + typedef internal::hash_map_iterator const_iterator; + typedef internal::hash_map_range range_type; + typedef internal::hash_map_range const_range_type; + typedef Allocator allocator_type; + +protected: + friend class const_accessor; + struct node; + typedef typename Allocator::template rebind::other node_allocator_type; + node_allocator_type my_allocator; + HashCompare my_hash_compare; + + struct node : public node_base { + value_type item; + node( const Key &key ) : item(key, T()) {} + node( const Key &key, const T &t ) : item(key, t) {} + // exception-safe allocation, see C++ Standard 2003, clause 5.3.4p17 + void *operator new( size_t /*size*/, node_allocator_type &a ) { + void *ptr = a.allocate(1); + if(!ptr) throw std::bad_alloc(); + return ptr; + } + // match placement-new form above to be called if exception thrown in constructor + void operator delete( void *ptr, node_allocator_type &a ) {return a.deallocate(static_cast(ptr),1); } + }; + + void delete_node( node_base *n ) { + my_allocator.destroy( static_cast(n) ); + my_allocator.deallocate( static_cast(n), 1); + } + + node *search_bucket( const key_type &key, bucket *b ) const { + node *n = static_cast( b->node_list ); + while( is_valid(n) && !my_hash_compare.equal(key, n->item.first) ) + n = static_cast( n->next ); + __TBB_ASSERT(n != internal::rehash_req, "Search can be executed only for rehashed bucket"); + return n; + } + + //! bucket accessor is to find, rehash, acquire a lock, and access a bucket + class bucket_accessor : public bucket::scoped_t { + bool my_is_writer; // TODO: use it from base type + bucket *my_b; + public: + bucket_accessor( concurrent_hash_map *base, const hashcode_t h, bool writer = false ) { acquire( base, h, writer ); } + //! 
find a bucket by masked hashcode, optionally rehash, and acquire the lock + inline void acquire( concurrent_hash_map *base, const hashcode_t h, bool writer = false ) { + my_b = base->get_bucket( h ); +#if TBB_USE_THREADING_TOOLS + // TODO: actually, notification is unnecessary here, just hiding double-check + if( itt_load_pointer_with_acquire_v3(&my_b->node_list) == internal::rehash_req +#else + if( __TBB_load_with_acquire(my_b->node_list) == internal::rehash_req +#endif + && try_acquire( my_b->mutex, /*write=*/true ) ) + { + if( my_b->node_list == internal::rehash_req ) base->rehash_bucket( my_b, h ); //recursive rehashing + my_is_writer = true; + } + else bucket::scoped_t::acquire( my_b->mutex, /*write=*/my_is_writer = writer ); + __TBB_ASSERT( my_b->node_list != internal::rehash_req, NULL); + } + //! check whether bucket is locked for write + bool is_writer() { return my_is_writer; } + //! get bucket pointer + bucket *operator() () { return my_b; } + // TODO: optimize out + bool upgrade_to_writer() { my_is_writer = true; return bucket::scoped_t::upgrade_to_writer(); } + }; + + // TODO refactor to hash_base + void rehash_bucket( bucket *b_new, const hashcode_t h ) { + __TBB_ASSERT( *(intptr_t*)(&b_new->mutex), "b_new must be locked (for write)"); + __TBB_ASSERT( h > 1, "The lowermost buckets can't be rehashed" ); + __TBB_store_with_release(b_new->node_list, internal::empty_rehashed); // mark rehashed + hashcode_t mask = ( 1u<<__TBB_Log2( h ) ) - 1; // get parent mask from the topmost bit + + bucket_accessor b_old( this, h & mask ); + + mask = (mask<<1) | 1; // get full mask for new bucket + __TBB_ASSERT( (mask&(mask+1))==0 && (h & mask) == h, NULL ); + restart: + for( node_base **p = &b_old()->node_list, *n = __TBB_load_with_acquire(*p); is_valid(n); n = *p ) { + hashcode_t c = my_hash_compare.hash( static_cast(n)->item.first ); + if( (c & mask) == h ) { + if( !b_old.is_writer() ) + if( !b_old.upgrade_to_writer() ) { + goto restart; // node ptr can be invalid due to concurrent erase + } + *p = n->next; // exclude from b_old + add_to_bucket( b_new, n ); + } else p = &n->next; // iterate to next item + } + } + +public: + + class accessor; + //! Combines data access, locking, and garbage collection. + class const_accessor { + friend class concurrent_hash_map; + friend class accessor; + void operator=( const accessor & ) const; // Deny access + const_accessor( const accessor & ); // Deny access + public: + //! Type of value + typedef const typename concurrent_hash_map::value_type value_type; + + //! True if result is empty. + bool empty() const {return !my_node;} + + //! Set to null + void release() { + if( my_node ) { + my_lock.release(); + my_node = 0; + } + } + + //! Return reference to associated value in hash table. + const_reference operator*() const { + __TBB_ASSERT( my_node, "attempt to dereference empty accessor" ); + return my_node->item; + } + + //! Return pointer to associated value in hash table. + const_pointer operator->() const { + return &operator*(); + } + + //! Create empty result + const_accessor() : my_node(NULL) {} + + //! Destroy result after releasing the underlying reference. + ~const_accessor() { + my_node = NULL; // my_lock.release() is called in scoped_lock destructor + } + private: + node *my_node; + typename node::scoped_t my_lock; + hashcode_t my_hash; + }; + + //! Allows write access to elements and combines data access, locking, and garbage collection. + class accessor: public const_accessor { + public: + //! 
Type of value + typedef typename concurrent_hash_map::value_type value_type; + + //! Return reference to associated value in hash table. + reference operator*() const { + __TBB_ASSERT( this->my_node, "attempt to dereference empty accessor" ); + return this->my_node->item; + } + + //! Return pointer to associated value in hash table. + pointer operator->() const { + return &operator*(); + } + }; + + //! Construct empty table. + concurrent_hash_map(const allocator_type &a = allocator_type()) + : my_allocator(a) + {} + + //! Construct empty table with n preallocated buckets. This number serves also as initial concurrency level. + concurrent_hash_map(size_type n, const allocator_type &a = allocator_type()) + : my_allocator(a) + { + reserve( n ); + } + + //! Copy constructor + concurrent_hash_map( const concurrent_hash_map& table, const allocator_type &a = allocator_type()) + : my_allocator(a) + { + internal_copy(table); + } + + //! Construction with copying iteration range and given allocator instance + template + concurrent_hash_map(I first, I last, const allocator_type &a = allocator_type()) + : my_allocator(a) + { + reserve( std::distance(first, last) ); // TODO: load_factor? + internal_copy(first, last); + } + + //! Assignment + concurrent_hash_map& operator=( const concurrent_hash_map& table ) { + if( this!=&table ) { + clear(); + internal_copy(table); + } + return *this; + } + + + //! Clear table + void clear(); + + //! Clear table and destroy it. + ~concurrent_hash_map() { clear(); } + + //------------------------------------------------------------------------ + // Parallel algorithm support + //------------------------------------------------------------------------ + range_type range( size_type grainsize=1 ) { + return range_type( *this, grainsize ); + } + const_range_type range( size_type grainsize=1 ) const { + return const_range_type( *this, grainsize ); + } + + //------------------------------------------------------------------------ + // STL support - not thread-safe methods + //------------------------------------------------------------------------ + iterator begin() {return iterator(*this,0,my_embedded_segment,my_embedded_segment->node_list);} + iterator end() {return iterator(*this,0,0,0);} + const_iterator begin() const {return const_iterator(*this,0,my_embedded_segment,my_embedded_segment->node_list);} + const_iterator end() const {return const_iterator(*this,0,0,0);} + std::pair equal_range( const Key& key ) { return internal_equal_range(key, end()); } + std::pair equal_range( const Key& key ) const { return internal_equal_range(key, end()); } + + //! Number of items in table. + size_type size() const { return my_size; } + + //! True if size()==0. + bool empty() const { return my_size == 0; } + + //! Upper bound on size. + size_type max_size() const {return (~size_type(0))/sizeof(node);} + + //! return allocator object + allocator_type get_allocator() const { return this->my_allocator; } + + //! swap two instances. Iterators are invalidated + void swap(concurrent_hash_map &table); + + //------------------------------------------------------------------------ + // concurrent map operations + //------------------------------------------------------------------------ + + //! Return count of items (0 or 1) + size_type count( const Key &key ) const { + return const_cast(this)->lookup(/*insert*/false, key, NULL, NULL, /*write=*/false ); + } + + //! Find item and acquire a read lock on the item. + /** Return true if item is found, false otherwise. 
*/ + bool find( const_accessor &result, const Key &key ) const { + result.release(); + return const_cast(this)->lookup(/*insert*/false, key, NULL, &result, /*write=*/false ); + } + + //! Find item and acquire a write lock on the item. + /** Return true if item is found, false otherwise. */ + bool find( accessor &result, const Key &key ) { + result.release(); + return lookup(/*insert*/false, key, NULL, &result, /*write=*/true ); + } + + //! Insert item (if not already present) and acquire a read lock on the item. + /** Returns true if item is new. */ + bool insert( const_accessor &result, const Key &key ) { + result.release(); + return lookup(/*insert*/true, key, NULL, &result, /*write=*/false ); + } + + //! Insert item (if not already present) and acquire a write lock on the item. + /** Returns true if item is new. */ + bool insert( accessor &result, const Key &key ) { + result.release(); + return lookup(/*insert*/true, key, NULL, &result, /*write=*/true ); + } + + //! Insert item by copying if there is no such key present already and acquire a read lock on the item. + /** Returns true if item is new. */ + bool insert( const_accessor &result, const value_type &value ) { + result.release(); + return lookup(/*insert*/true, value.first, &value.second, &result, /*write=*/false ); + } + + //! Insert item by copying if there is no such key present already and acquire a write lock on the item. + /** Returns true if item is new. */ + bool insert( accessor &result, const value_type &value ) { + result.release(); + return lookup(/*insert*/true, value.first, &value.second, &result, /*write=*/true ); + } + + //! Insert item by copying if there is no such key present already + /** Returns true if item is inserted. */ + bool insert( const value_type &value ) { + return lookup(/*insert*/true, value.first, &value.second, NULL, /*write=*/false ); + } + + //! Insert range [first, last) + template + void insert(I first, I last) { + for(; first != last; ++first) + insert( *first ); + } + + //! Erase item. + /** Return true if item was erased by particularly this call. */ + bool erase( const Key& key ); + + //! Erase item by const_accessor. + /** Return true if item was erased by particularly this call. */ + bool erase( const_accessor& item_accessor ) { + return exclude( item_accessor, /*readonly=*/ true ); + } + + //! Erase item by accessor. + /** Return true if item was erased by particularly this call. */ + bool erase( accessor& item_accessor ) { + return exclude( item_accessor, /*readonly=*/ false ); + } + +protected: + //! Insert or find item and optionally acquire a lock on the item. + bool lookup( bool op_insert, const Key &key, const T *t, const_accessor *result, bool write ); + + //! delete item by accessor + bool exclude( const_accessor &item_accessor, bool readonly ); + + //! Returns an iterator for an item defined by the key, or for the next item after it (if upper==true) + template + std::pair internal_equal_range( const Key& key, I end ) const; + + //! Copy "source" to *this, where *this must start out empty. + void internal_copy( const concurrent_hash_map& source ); + + template + void internal_copy(I first, I last); + + //! Fast find when no concurrent erasure is used. For internal use inside TBB only! + /** Return pointer to item with given key, or NULL if no such item exists. + Must not be called concurrently with erasure operations. 
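The accessor-based operations above combine lookup, insertion, and per-element locking in a single call. The following is an illustrative sketch of typical usage, not part of the patched header; the StringTable typedef and the example function are assumed names:

#include <string>
#include "tbb/concurrent_hash_map.h"

typedef tbb::concurrent_hash_map<std::string, int> StringTable;

int accessor_sketch( StringTable &table ) {
    {   // insert "apple" if absent and hold a write lock while updating it
        StringTable::accessor a;
        table.insert( a, std::string("apple") );    // returns true if the key was new
        a->second += 1;
    }                                               // write lock released here
    int observed = 0;
    {   // read-only lookup; the element stays read-locked while ca is alive
        StringTable::const_accessor ca;
        if( table.find( ca, std::string("apple") ) )
            observed = ca->second;
    }
    table.erase( std::string("apple") );            // erase by key
    return observed;
}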
*/ + const_pointer internal_fast_find( const Key& key ) const { + hashcode_t h = my_hash_compare.hash( key ); +#if TBB_USE_THREADING_TOOLS + hashcode_t m = (hashcode_t) itt_load_pointer_with_acquire_v3( &my_mask ); +#else + hashcode_t m = my_mask; +#endif + node *n; + restart: + __TBB_ASSERT((m&(m+1))==0, NULL); + bucket *b = get_bucket( h & m ); +#if TBB_USE_THREADING_TOOLS + // TODO: actually, notification is unnecessary here, just hiding double-check + if( itt_load_pointer_with_acquire_v3(&b->node_list) == internal::rehash_req ) +#else + if( __TBB_load_with_acquire(b->node_list) == internal::rehash_req ) +#endif + { + bucket::scoped_t lock; + if( lock.try_acquire( b->mutex, /*write=*/true ) ) { + if( b->node_list == internal::rehash_req) + const_cast(this)->rehash_bucket( b, h & m ); //recursive rehashing + } + else lock.acquire( b->mutex, /*write=*/false ); + __TBB_ASSERT(b->node_list!=internal::rehash_req,NULL); + } + n = search_bucket( key, b ); + if( n ) + return &n->item; + else if( check_mask_race( h, m ) ) + goto restart; + return 0; + } +}; + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // Suppress "conditional expression is constant" warning. + #pragma warning( push ) + #pragma warning( disable: 4127 ) +#endif + +template +bool concurrent_hash_map::lookup( bool op_insert, const Key &key, const T *t, const_accessor *result, bool write ) { + __TBB_ASSERT( !result || !result->my_node, NULL ); + segment_index_t grow_segment; + bool return_value; + node *n, *tmp_n = 0; + hashcode_t const h = my_hash_compare.hash( key ); +#if TBB_USE_THREADING_TOOLS + hashcode_t m = (hashcode_t) itt_load_pointer_with_acquire_v3( &my_mask ); +#else + hashcode_t m = my_mask; +#endif + restart: + {//lock scope + __TBB_ASSERT((m&(m+1))==0, NULL); + return_value = false; + // get bucket + bucket_accessor b( this, h & m ); + + // find a node + n = search_bucket( key, b() ); + if( op_insert ) { + // [opt] insert a key + if( !n ) { + if( !tmp_n ) { + if(t) tmp_n = new( my_allocator ) node(key, *t); + else tmp_n = new( my_allocator ) node(key); + } + if( !b.is_writer() && !b.upgrade_to_writer() ) { // TODO: improved insertion + // Rerun search_list, in case another thread inserted the item during the upgrade. + n = search_bucket( key, b() ); + if( is_valid(n) ) { // unfortunately, it did + b.downgrade_to_reader(); + goto exists; + } + } + if( check_mask_race(h, m) ) + goto restart; // b.release() is done in ~b(). + // insert and set flag to grow the container + grow_segment = insert_new_node( b(), n = tmp_n, m ); + tmp_n = 0; + return_value = true; + } else { + exists: grow_segment = 0; + } + } else { // find or count + if( !n ) { + if( check_mask_race( h, m ) ) + goto restart; // b.release() is done in ~b(). TODO: replace by continue + return false; + } + return_value = true; + grow_segment = 0; + } + if( !result ) goto check_growth; + // TODO: the following seems as generic/regular operation + // acquire the item + if( !result->my_lock.try_acquire( n->mutex, write ) ) { + // we are unlucky, prepare for longer wait + internal::atomic_backoff trials; + do { + if( !trials.bounded_pause() ) { + // the wait takes really long, restart the operation + b.release(); + __TBB_ASSERT( !op_insert || !return_value, "Can't acquire new item in locked bucket?" 
); + __TBB_Yield(); + m = my_mask; + goto restart; + } + } while( !result->my_lock.try_acquire( n->mutex, write ) ); + } + }//lock scope + result->my_node = n; + result->my_hash = h; +check_growth: + // [opt] grow the container + if( grow_segment ) + enable_segment( grow_segment ); + if( tmp_n ) // if op_insert only + delete_node( tmp_n ); + return return_value; +} + +template +template +std::pair concurrent_hash_map::internal_equal_range( const Key& key, I end ) const { + hashcode_t h = my_hash_compare.hash( key ); + hashcode_t m = my_mask; + __TBB_ASSERT((m&(m+1))==0, NULL); + h &= m; + bucket *b = get_bucket( h ); + while( b->node_list == internal::rehash_req ) { + m = ( 1u<<__TBB_Log2( h ) ) - 1; // get parent mask from the topmost bit + b = get_bucket( h &= m ); + } + node *n = search_bucket( key, b ); + if( !n ) + return std::make_pair(end, end); + iterator lower(*this, h, b, n), upper(lower); + return std::make_pair(lower, ++upper); +} + +template +bool concurrent_hash_map::exclude( const_accessor &item_accessor, bool readonly ) { + __TBB_ASSERT( item_accessor.my_node, NULL ); + node_base *const n = item_accessor.my_node; + item_accessor.my_node = NULL; // we ought release accessor anyway + hashcode_t const h = item_accessor.my_hash; + hashcode_t m = my_mask; + do { + // get bucket + bucket_accessor b( this, h & m, /*writer=*/true ); + node_base **p = &b()->node_list; + while( *p && *p != n ) + p = &(*p)->next; + if( !*p ) { // someone else was the first + if( check_mask_race( h, m ) ) + continue; + item_accessor.my_lock.release(); + return false; + } + __TBB_ASSERT( *p == n, NULL ); + *p = n->next; // remove from container + my_size--; + break; + } while(true); + if( readonly ) // need to get exclusive lock + item_accessor.my_lock.upgrade_to_writer(); // return value means nothing here + item_accessor.my_lock.release(); + delete_node( n ); // Only one thread can delete it due to write lock on the chain_mutex + return true; +} + +template +bool concurrent_hash_map::erase( const Key &key ) { + node_base *n; + hashcode_t const h = my_hash_compare.hash( key ); + hashcode_t m = my_mask; +restart: + {//lock scope + // get bucket + bucket_accessor b( this, h & m ); + search: + node_base **p = &b()->node_list; + n = *p; + while( is_valid(n) && !my_hash_compare.equal(key, static_cast(n)->item.first ) ) { + p = &n->next; + n = *p; + } + if( !n ) { // not found, but mask could be changed + if( check_mask_race( h, m ) ) + goto restart; + return false; + } + else if( !b.is_writer() && !b.upgrade_to_writer() ) { + if( check_mask_race( h, m ) ) // contended upgrade, check mask + goto restart; + goto search; + } + *p = n->next; + my_size--; + } + { + typename node::scoped_t item_locker( n->mutex, /*write=*/true ); + } + // note: there should be no threads pretending to acquire this mutex again, do not try to upgrade const_accessor! 
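An element can also be erased through an accessor that is already held, which avoids a second lookup. A minimal sketch, reusing the assumed StringTable typedef from the sketch above:

// Erase an element while holding a write lock on it.
void erase_if_zero( StringTable &table, const std::string &key ) {
    StringTable::accessor a;
    if( table.find( a, key ) && a->second == 0 )
        table.erase( a );   // removes the element and releases the accessor
}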
+ delete_node( n ); // Only one thread can delete it due to write lock on the bucket + return true; +} + +template +void concurrent_hash_map::swap(concurrent_hash_map &table) { + std::swap(this->my_allocator, table.my_allocator); + std::swap(this->my_hash_compare, table.my_hash_compare); + internal_swap(table); +} + +template +void concurrent_hash_map::clear() { + hashcode_t m = my_mask; + __TBB_ASSERT((m&(m+1))==0, NULL); +#if TBB_USE_DEBUG || TBB_USE_PERFORMANCE_WARNINGS +#if TBB_USE_PERFORMANCE_WARNINGS + int size = int(my_size), buckets = int(m)+1, empty_buckets = 0, overpopulated_buckets = 0; // usage statistics + static bool reported = false; +#endif + // check consistency + for( segment_index_t b = 0; b <= m; b++ ) { + node_base *n = get_bucket(b)->node_list; +#if TBB_USE_PERFORMANCE_WARNINGS + if( n == internal::empty_rehashed ) empty_buckets++; + else if( n == internal::rehash_req ) buckets--; + else if( n->next ) overpopulated_buckets++; +#endif + for(; is_valid(n); n = n->next ) { + hashcode_t h = my_hash_compare.hash( static_cast(n)->item.first ); + h &= m; + __TBB_ASSERT( h == b || get_bucket(h)->node_list == internal::rehash_req, "Rehashing is not finished until serial stage due to concurrent or unexpectedly terminated operation" ); + } + } +#if TBB_USE_PERFORMANCE_WARNINGS + if( buckets > size) empty_buckets -= buckets - size; + else overpopulated_buckets -= size - buckets; // TODO: load_factor? + if( !reported && buckets >= 512 && ( 2*empty_buckets >= size || 2*overpopulated_buckets > size ) ) { + internal::runtime_warning( + "Performance is not optimal because the hash function produces bad randomness in lower bits in %s.\nSize: %d Empties: %d Overlaps: %d", + typeid(*this).name(), size, empty_buckets, overpopulated_buckets ); + reported = true; + } +#endif +#endif//TBB_USE_DEBUG || TBB_USE_PERFORMANCE_WARNINGS + my_size = 0; + segment_index_t s = segment_index_of( m ); + __TBB_ASSERT( s+1 == pointers_per_table || !my_table[s+1], "wrong mask or concurrent grow" ); + cache_aligned_allocator alloc; + do { + __TBB_ASSERT( is_valid( my_table[s] ), "wrong mask or concurrent grow" ); + segment_ptr_t buckets = my_table[s]; + size_type sz = segment_size( s ? s : 1 ); + for( segment_index_t i = 0; i < sz; i++ ) + for( node_base *n = buckets[i].node_list; is_valid(n); n = buckets[i].node_list ) { + buckets[i].node_list = n->next; + delete_node( n ); + } + if( s >= first_block) // the first segment or the next + alloc.deallocate( buckets, sz ); + else if( s == embedded_block && embedded_block != first_block ) + alloc.deallocate( buckets, segment_size(first_block)-embedded_buckets ); + if( s >= embedded_block ) my_table[s] = 0; + } while(s-- > 0); + my_mask = embedded_buckets - 1; +} + +template +void concurrent_hash_map::internal_copy( const concurrent_hash_map& source ) { + reserve( source.my_size ); // TODO: load_factor? 
+ hashcode_t mask = source.my_mask; + if( my_mask == mask ) { // optimized version + bucket *dst = 0, *src = 0; + for( hashcode_t k = 0; k <= mask; k++ ) { + if( k & (k-2) ) ++dst,src++; // not the beginning of a segment + else { dst = get_bucket( k ); src = source.get_bucket( k ); } + __TBB_ASSERT( dst->node_list != internal::rehash_req, "Invalid bucket in destination table"); + node *n = static_cast( src->node_list ); + if( n == internal::rehash_req ) { // source is not rehashed, items are in previous buckets + bucket_accessor b( this, k ); + rehash_bucket( b(), k ); // TODO: use without synchronization + } else for(; n; n = static_cast( n->next ) ) { + add_to_bucket( dst, new( my_allocator ) node(n->item.first, n->item.second) ); + ++my_size; // TODO: replace by non-atomic op + } + } + } else internal_copy( source.begin(), source.end() ); +} + +template +template +void concurrent_hash_map::internal_copy(I first, I last) { + hashcode_t m = my_mask; + for(; first != last; ++first) { + hashcode_t h = my_hash_compare.hash( first->first ); + bucket *b = get_bucket( h & m ); + __TBB_ASSERT( b->node_list != internal::rehash_req, "Invalid bucket in destination table"); + node *n = new( my_allocator ) node(first->first, first->second); + add_to_bucket( b, n ); + ++my_size; // TODO: replace by non-atomic op + } +} + +template +inline bool operator==(const concurrent_hash_map &a, const concurrent_hash_map &b) { + if(a.size() != b.size()) return false; + typename concurrent_hash_map::const_iterator i(a.begin()), i_end(a.end()); + typename concurrent_hash_map::const_iterator j, j_end(b.end()); + for(; i != i_end; ++i) { + j = b.equal_range(i->first).first; + if( j == j_end || !(i->second == j->second) ) return false; + } + return true; +} + +template +inline bool operator!=(const concurrent_hash_map &a, const concurrent_hash_map &b) +{ return !(a == b); } + +template +inline void swap(concurrent_hash_map &a, concurrent_hash_map &b) +{ a.swap( b ); } + +#if _MSC_VER && !defined(__INTEL_COMPILER) + #pragma warning( pop ) +#endif // warning 4127 is back + +} // namespace tbb + +#endif /* __TBB_concurrent_hash_map_H */ diff --git a/dep/tbb/include/tbb/concurrent_queue.h b/dep/tbb/include/tbb/concurrent_queue.h new file mode 100644 index 000000000..f344a8471 --- /dev/null +++ b/dep/tbb/include/tbb/concurrent_queue.h @@ -0,0 +1,409 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_concurrent_queue_H +#define __TBB_concurrent_queue_H + +#include "_concurrent_queue_internal.h" + +namespace tbb { + +namespace strict_ppl { + +//! A high-performance thread-safe non-blocking concurrent queue. +/** Multiple threads may each push and pop concurrently. + Assignment construction is not allowed. + @ingroup containers */ +template > +class concurrent_queue: public internal::concurrent_queue_base_v3 { + template friend class internal::concurrent_queue_iterator; + + //! Allocator type + typedef typename A::template rebind::other page_allocator_type; + page_allocator_type my_allocator; + + //! Allocates a block of size n (bytes) + /*overide*/ virtual void *allocate_block( size_t n ) { + void *b = reinterpret_cast(my_allocator.allocate( n )); + if( !b ) this->internal_throw_exception(); + return b; + } + + //! Returns a block of size n (bytes) + /*override*/ virtual void deallocate_block( void *b, size_t n ) { + my_allocator.deallocate( reinterpret_cast(b), n ); + } + +public: + //! Element type in the queue. + typedef T value_type; + + //! Reference type + typedef T& reference; + + //! Const reference type + typedef const T& const_reference; + + //! Integral type for representing size of the queue. + typedef size_t size_type; + + //! Difference type for iterator + typedef ptrdiff_t difference_type; + + //! Allocator type + typedef A allocator_type; + + //! Construct empty queue + explicit concurrent_queue(const allocator_type& a = allocator_type()) : + internal::concurrent_queue_base_v3( sizeof(T) ), my_allocator( a ) + { + } + + //! [begin,end) constructor + template + concurrent_queue( InputIterator begin, InputIterator end, const allocator_type& a = allocator_type()) : + internal::concurrent_queue_base_v3( sizeof(T) ), my_allocator( a ) + { + for( ; begin != end; ++begin ) + internal_push(&*begin); + } + + //! Copy constructor + concurrent_queue( const concurrent_queue& src, const allocator_type& a = allocator_type()) : + internal::concurrent_queue_base_v3( sizeof(T) ), my_allocator( a ) + { + assign( src ); + } + + //! Destroy queue + ~concurrent_queue(); + + //! Enqueue an item at tail of queue. + void push( const T& source ) { + internal_push( &source ); + } + + //! Attempt to dequeue an item from head of queue. + /** Does not wait for item to become available. + Returns true if successful; false otherwise. */ + bool try_pop( T& result ) { + return internal_try_pop( &result ); + } + + //! Return the number of items in the queue; thread unsafe + size_type unsafe_size() const {return this->internal_size();} + + //! Equivalent to size()==0. + bool empty() const {return this->internal_empty();} + + //! Clear the queue. not thread-safe. + void clear() ; + + //! Return allocator object + allocator_type get_allocator() const { return this->my_allocator; } + + typedef internal::concurrent_queue_iterator iterator; + typedef internal::concurrent_queue_iterator const_iterator; + + //------------------------------------------------------------------------ + // The iterators are intended only for debugging. 
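As a quick illustration of the non-blocking interface described above (a sketch with assumed names, not part of the patched header): producers call push() and consumers poll with try_pop().

#include "tbb/concurrent_queue.h"

void queue_sketch() {
    // tbb::concurrent_queue aliases strict_ppl::concurrent_queue unless TBB_DEPRECATED is set.
    tbb::concurrent_queue<int> q;
    for( int i = 0; i < 10; ++i )
        q.push( i );                 // thread-safe enqueue
    int item;
    while( q.try_pop( item ) ) {     // non-blocking; returns false when the queue is empty
        // process item
    }
}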
They are slow and not thread safe. + //------------------------------------------------------------------------ + iterator unsafe_begin() {return iterator(*this);} + iterator unsafe_end() {return iterator();} + const_iterator unsafe_begin() const {return const_iterator(*this);} + const_iterator unsafe_end() const {return const_iterator();} +} ; + +template +concurrent_queue::~concurrent_queue() { + clear(); + this->internal_finish_clear(); +} + +template +void concurrent_queue::clear() { + while( !empty() ) { + T value; + internal_try_pop(&value); + } +} + +} // namespace strict_ppl + +//! A high-performance thread-safe blocking concurrent bounded queue. +/** This is the pre-PPL TBB concurrent queue which supports boundedness and blocking semantics. + Note that method names agree with the PPL-style concurrent queue. + Multiple threads may each push and pop concurrently. + Assignment construction is not allowed. + @ingroup containers */ +template > +class concurrent_bounded_queue: public internal::concurrent_queue_base_v3 { + template friend class internal::concurrent_queue_iterator; + + //! Allocator type + typedef typename A::template rebind::other page_allocator_type; + page_allocator_type my_allocator; + + //! Class used to ensure exception-safety of method "pop" + class destroyer: internal::no_copy { + T& my_value; + public: + destroyer( T& value ) : my_value(value) {} + ~destroyer() {my_value.~T();} + }; + + T& get_ref( page& page, size_t index ) { + __TBB_ASSERT( index(static_cast(&page+1))[index]; + } + + /*override*/ virtual void copy_item( page& dst, size_t index, const void* src ) { + new( &get_ref(dst,index) ) T(*static_cast(src)); + } + + /*override*/ virtual void copy_page_item( page& dst, size_t dindex, const page& src, size_t sindex ) { + new( &get_ref(dst,dindex) ) T( static_cast(static_cast(&src+1))[sindex] ); + } + + /*override*/ virtual void assign_and_destroy_item( void* dst, page& src, size_t index ) { + T& from = get_ref(src,index); + destroyer d(from); + *static_cast(dst) = from; + } + + /*overide*/ virtual page *allocate_page() { + size_t n = sizeof(page) + items_per_page*item_size; + page *p = reinterpret_cast(my_allocator.allocate( n )); + if( !p ) internal_throw_exception(); + return p; + } + + /*override*/ virtual void deallocate_page( page *p ) { + size_t n = sizeof(page) + items_per_page*item_size; + my_allocator.deallocate( reinterpret_cast(p), n ); + } + +public: + //! Element type in the queue. + typedef T value_type; + + //! Allocator type + typedef A allocator_type; + + //! Reference type + typedef T& reference; + + //! Const reference type + typedef const T& const_reference; + + //! Integral type for representing size of the queue. + /** Notice that the size_type is a signed integral type. + This is because the size can be negative if there are pending pops without corresponding pushes. */ + typedef std::ptrdiff_t size_type; + + //! Difference type for iterator + typedef std::ptrdiff_t difference_type; + + //! Construct empty queue + explicit concurrent_bounded_queue(const allocator_type& a = allocator_type()) : + concurrent_queue_base_v3( sizeof(T) ), my_allocator( a ) + { + } + + //! Copy constructor + concurrent_bounded_queue( const concurrent_bounded_queue& src, const allocator_type& a = allocator_type()) : + concurrent_queue_base_v3( sizeof(T) ), my_allocator( a ) + { + assign( src ); + } + + //! 
[begin,end) constructor + template + concurrent_bounded_queue( InputIterator begin, InputIterator end, const allocator_type& a = allocator_type()) : + concurrent_queue_base_v3( sizeof(T) ), my_allocator( a ) + { + for( ; begin != end; ++begin ) + internal_push_if_not_full(&*begin); + } + + //! Destroy queue + ~concurrent_bounded_queue(); + + //! Enqueue an item at tail of queue. + void push( const T& source ) { + internal_push( &source ); + } + + //! Dequeue item from head of queue. + /** Block until an item becomes available, and then dequeue it. */ + void pop( T& destination ) { + internal_pop( &destination ); + } + + //! Enqueue an item at tail of queue if queue is not already full. + /** Does not wait for queue to become not full. + Returns true if item is pushed; false if queue was already full. */ + bool try_push( const T& source ) { + return internal_push_if_not_full( &source ); + } + + //! Attempt to dequeue an item from head of queue. + /** Does not wait for item to become available. + Returns true if successful; false otherwise. */ + bool try_pop( T& destination ) { + return internal_pop_if_present( &destination ); + } + + //! Return number of pushes minus number of pops. + /** Note that the result can be negative if there are pops waiting for the + corresponding pushes. The result can also exceed capacity() if there + are push operations in flight. */ + size_type size() const {return internal_size();} + + //! Equivalent to size()<=0. + bool empty() const {return internal_empty();} + + //! Maximum number of allowed elements + size_type capacity() const { + return my_capacity; + } + + //! Set the capacity + /** Setting the capacity to 0 causes subsequent try_push operations to always fail, + and subsequent push operations to block forever. */ + void set_capacity( size_type capacity ) { + internal_set_capacity( capacity, sizeof(T) ); + } + + //! return allocator object + allocator_type get_allocator() const { return this->my_allocator; } + + //! clear the queue. not thread-safe. + void clear() ; + + typedef internal::concurrent_queue_iterator iterator; + typedef internal::concurrent_queue_iterator const_iterator; + + //------------------------------------------------------------------------ + // The iterators are intended only for debugging. They are slow and not thread safe. + //------------------------------------------------------------------------ + iterator unsafe_begin() {return iterator(*this);} + iterator unsafe_end() {return iterator();} + const_iterator unsafe_begin() const {return const_iterator(*this);} + const_iterator unsafe_end() const {return const_iterator();} + +}; + +template +concurrent_bounded_queue::~concurrent_bounded_queue() { + clear(); + internal_finish_clear(); +} + +template +void concurrent_bounded_queue::clear() { + while( !empty() ) { + T value; + internal_pop_if_present(&value); + } +} + +namespace deprecated { + +//! A high-performance thread-safe blocking concurrent bounded queue. +/** This is the pre-PPL TBB concurrent queue which support boundedness and blocking semantics. + Note that method names agree with the PPL-style concurrent queue. + Multiple threads may each push and pop concurrently. + Assignment construction is not allowed. + @ingroup containers */ +template > +class concurrent_queue: public concurrent_bounded_queue { +#if !__TBB_TEMPLATE_FRIENDS_BROKEN + template friend class internal::concurrent_queue_iterator; +#endif + +public: + //! 
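The bounded variant adds capacity control and a blocking pop(). A minimal sketch with assumed names; a real consumer would normally run on its own thread:

#include "tbb/concurrent_queue.h"

void bounded_sketch() {
    tbb::concurrent_bounded_queue<int> q;
    q.set_capacity( 4 );            // try_push() fails once 4 items are pending
    for( int i = 0; i < 4; ++i )
        q.push( i );                // would block if the queue were full
    bool pushed = q.try_push( 99 ); // false: capacity reached
    int item;
    q.pop( item );                  // blocks until an item is available
    (void)pushed;
}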
Construct empty queue + explicit concurrent_queue(const A& a = A()) : + concurrent_bounded_queue( a ) + { + } + + //! Copy constructor + concurrent_queue( const concurrent_queue& src, const A& a = A()) : + concurrent_bounded_queue( src, a ) + { + } + + //! [begin,end) constructor + template + concurrent_queue( InputIterator begin, InputIterator end, const A& a = A()) : + concurrent_bounded_queue( begin, end, a ) + { + } + + //! Enqueue an item at tail of queue if queue is not already full. + /** Does not wait for queue to become not full. + Returns true if item is pushed; false if queue was already full. */ + bool push_if_not_full( const T& source ) { + return try_push( source ); + } + + //! Attempt to dequeue an item from head of queue. + /** Does not wait for item to become available. + Returns true if successful; false otherwise. + @deprecated Use try_pop() + */ + bool pop_if_present( T& destination ) { + return try_pop( destination ); + } + + typedef typename concurrent_bounded_queue::iterator iterator; + typedef typename concurrent_bounded_queue::const_iterator const_iterator; + // + //------------------------------------------------------------------------ + // The iterators are intended only for debugging. They are slow and not thread safe. + //------------------------------------------------------------------------ + iterator begin() {return this->unsafe_begin();} + iterator end() {return this->unsafe_end();} + const_iterator begin() const {return this->unsafe_begin();} + const_iterator end() const {return this->unsafe_end();} +}; + +} + + +#if TBB_DEPRECATED +using deprecated::concurrent_queue; +#else +using strict_ppl::concurrent_queue; +#endif + +} // namespace tbb + +#endif /* __TBB_concurrent_queue_H */ diff --git a/dep/tbb/include/tbb/concurrent_vector.h b/dep/tbb/include/tbb/concurrent_vector.h new file mode 100644 index 000000000..383c04489 --- /dev/null +++ b/dep/tbb/include/tbb/concurrent_vector.h @@ -0,0 +1,1049 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#ifndef __TBB_concurrent_vector_H +#define __TBB_concurrent_vector_H + +#include "tbb_stddef.h" +#include +#include +#include +#include +#include "atomic.h" +#include "cache_aligned_allocator.h" +#include "blocked_range.h" + +#include "tbb_machine.h" + +#if _MSC_VER==1500 && !__INTEL_COMPILER + // VS2008/VC9 seems to have an issue; limits pull in math.h + #pragma warning( push ) + #pragma warning( disable: 4985 ) +#endif +#include /* std::numeric_limits */ +#if _MSC_VER==1500 && !__INTEL_COMPILER + #pragma warning( pop ) +#endif + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) && defined(_Wp64) + // Workaround for overzealous compiler warnings in /Wp64 mode + #pragma warning (push) + #pragma warning (disable: 4267) +#endif + +namespace tbb { + +template > +class concurrent_vector; + + +//! @cond INTERNAL +namespace internal { + + //! Bad allocation marker + static void *const vector_allocation_error_flag = reinterpret_cast(size_t(63)); + //! Routine that loads pointer from location pointed to by src without any fence, without causing ITT to report a race. + void* __TBB_EXPORTED_FUNC itt_load_pointer_v3( const void* src ); + + //! Base class of concurrent vector implementation. + /** @ingroup containers */ + class concurrent_vector_base_v3 { + protected: + + // Basic types declarations + typedef size_t segment_index_t; + typedef size_t size_type; + + // Using enumerations due to Mac linking problems of static const variables + enum { + // Size constants + default_initial_segments = 1, // 2 initial items + //! Number of slots for segment's pointers inside the class + pointers_per_short_table = 3, // to fit into 8 words of entire structure + pointers_per_long_table = sizeof(segment_index_t) * 8 // one segment per bit + }; + + // Segment pointer. Can be zero-initialized + struct segment_t { + void* array; +#if TBB_USE_ASSERT + ~segment_t() { + __TBB_ASSERT( array <= internal::vector_allocation_error_flag, "should have been freed by clear" ); + } +#endif /* TBB_USE_ASSERT */ + }; + + // Data fields + + //! allocator function pointer + void* (*vector_allocator_ptr)(concurrent_vector_base_v3 &, size_t); + + //! count of segments in the first block + atomic my_first_block; + + //! Requested size of vector + atomic my_early_size; + + //! Pointer to the segments table + atomic my_segment; + + //! embedded storage of segment pointers + segment_t my_storage[pointers_per_short_table]; + + // Methods + + concurrent_vector_base_v3() { + my_early_size = 0; + my_first_block = 0; // here is not default_initial_segments + for( segment_index_t i = 0; i < pointers_per_short_table; i++) + my_storage[i].array = NULL; + my_segment = my_storage; + } + __TBB_EXPORTED_METHOD ~concurrent_vector_base_v3(); + + static segment_index_t segment_index_of( size_type index ) { + return segment_index_t( __TBB_Log2( index|1 ) ); + } + + static segment_index_t segment_base( segment_index_t k ) { + return (segment_index_t(1)< + class vector_iterator + { + //! concurrent_vector over which we are iterating. + Container* my_vector; + + //! Index into the vector + size_t my_index; + + //! 
Caches my_vector->internal_subscript(my_index) + /** NULL if cached value is not available */ + mutable Value* my_item; + + template + friend vector_iterator operator+( ptrdiff_t offset, const vector_iterator& v ); + + template + friend bool operator==( const vector_iterator& i, const vector_iterator& j ); + + template + friend bool operator<( const vector_iterator& i, const vector_iterator& j ); + + template + friend ptrdiff_t operator-( const vector_iterator& i, const vector_iterator& j ); + + template + friend class internal::vector_iterator; + +#if !defined(_MSC_VER) || defined(__INTEL_COMPILER) + template + friend class tbb::concurrent_vector; +#else +public: // workaround for MSVC +#endif + + vector_iterator( const Container& vector, size_t index, void *ptr = 0 ) : + my_vector(const_cast(&vector)), + my_index(index), + my_item(static_cast(ptr)) + {} + + public: + //! Default constructor + vector_iterator() : my_vector(NULL), my_index(~size_t(0)), my_item(NULL) {} + + vector_iterator( const vector_iterator& other ) : + my_vector(other.my_vector), + my_index(other.my_index), + my_item(other.my_item) + {} + + vector_iterator operator+( ptrdiff_t offset ) const { + return vector_iterator( *my_vector, my_index+offset ); + } + vector_iterator &operator+=( ptrdiff_t offset ) { + my_index+=offset; + my_item = NULL; + return *this; + } + vector_iterator operator-( ptrdiff_t offset ) const { + return vector_iterator( *my_vector, my_index-offset ); + } + vector_iterator &operator-=( ptrdiff_t offset ) { + my_index-=offset; + my_item = NULL; + return *this; + } + Value& operator*() const { + Value* item = my_item; + if( !item ) { + item = my_item = &my_vector->internal_subscript(my_index); + } + __TBB_ASSERT( item==&my_vector->internal_subscript(my_index), "corrupt cache" ); + return *item; + } + Value& operator[]( ptrdiff_t k ) const { + return my_vector->internal_subscript(my_index+k); + } + Value* operator->() const {return &operator*();} + + //! Pre increment + vector_iterator& operator++() { + size_t k = ++my_index; + if( my_item ) { + // Following test uses 2's-complement wizardry + if( (k& (k-2))==0 ) { + // k is a power of two that is at least k-2 + my_item= NULL; + } else { + ++my_item; + } + } + return *this; + } + + //! Pre decrement + vector_iterator& operator--() { + __TBB_ASSERT( my_index>0, "operator--() applied to iterator already at beginning of concurrent_vector" ); + size_t k = my_index--; + if( my_item ) { + // Following test uses 2's-complement wizardry + if( (k& (k-2))==0 ) { + // k is a power of two that is at least k-2 + my_item= NULL; + } else { + --my_item; + } + } + return *this; + } + + //! Post increment + vector_iterator operator++(int) { + vector_iterator result = *this; + operator++(); + return result; + } + + //! 
Post decrement + vector_iterator operator--(int) { + vector_iterator result = *this; + operator--(); + return result; + } + + // STL support + + typedef ptrdiff_t difference_type; + typedef Value value_type; + typedef Value* pointer; + typedef Value& reference; + typedef std::random_access_iterator_tag iterator_category; + }; + + template + vector_iterator operator+( ptrdiff_t offset, const vector_iterator& v ) { + return vector_iterator( *v.my_vector, v.my_index+offset ); + } + + template + bool operator==( const vector_iterator& i, const vector_iterator& j ) { + return i.my_index==j.my_index && i.my_vector == j.my_vector; + } + + template + bool operator!=( const vector_iterator& i, const vector_iterator& j ) { + return !(i==j); + } + + template + bool operator<( const vector_iterator& i, const vector_iterator& j ) { + return i.my_index + bool operator>( const vector_iterator& i, const vector_iterator& j ) { + return j + bool operator>=( const vector_iterator& i, const vector_iterator& j ) { + return !(i + bool operator<=( const vector_iterator& i, const vector_iterator& j ) { + return !(j + ptrdiff_t operator-( const vector_iterator& i, const vector_iterator& j ) { + return ptrdiff_t(i.my_index)-ptrdiff_t(j.my_index); + } + + template + class allocator_base { + public: + typedef typename A::template + rebind::other allocator_type; + allocator_type my_allocator; + + allocator_base(const allocator_type &a = allocator_type() ) : my_allocator(a) {} + }; + +} // namespace internal +//! @endcond + +//! Concurrent vector container +/** concurrent_vector is a container having the following main properties: + - It provides random indexed access to its elements. The index of the first element is 0. + - It ensures safe concurrent growing its size (different threads can safely append new elements). + - Adding new elements does not invalidate existing iterators and does not change indices of existing items. + +@par Compatibility + The class meets all Container Requirements and Reversible Container Requirements from + C++ Standard (See ISO/IEC 14882:2003(E), clause 23.1). But it doesn't meet + Sequence Requirements due to absence of insert() and erase() methods. + +@par Exception Safety + Methods working with memory allocation and/or new elements construction can throw an + exception if allocator fails to allocate memory or element's default constructor throws one. + Concurrent vector's element of type T must conform to the following requirements: + - Throwing an exception is forbidden for destructor of T. + - Default constructor of T must not throw an exception OR its non-virtual destructor must safely work when its object memory is zero-initialized. + . + Otherwise, the program's behavior is undefined. +@par + If an exception happens inside growth or assignment operation, an instance of the vector becomes invalid unless it is stated otherwise in the method documentation. + Invalid state means: + - There are no guaranties that all items were initialized by a constructor. The rest of items is zero-filled, including item where exception happens. + - An invalid vector instance cannot be repaired; it is unable to grow anymore. + - Size and capacity reported by the vector are incorrect, and calculated as if the failed operation were successful. + - Attempt to access not allocated elements using operator[] or iterators results in access violation or segmentation fault exception, and in case of using at() method a C++ exception is thrown. + . 
+ If a concurrent grow operation successfully completes, all the elements it has added to the vector will remain valid and accessible even if one of subsequent grow operations fails. + +@par Fragmentation + Unlike an STL vector, a concurrent_vector does not move existing elements if it needs + to allocate more memory. The container is divided into a series of contiguous arrays of + elements. The first reservation, growth, or assignment operation determines the size of + the first array. Using small number of elements as initial size incurs fragmentation that + may increase element access time. Internal layout can be optimized by method compact() that + merges several smaller arrays into one solid. + +@par Changes since TBB 2.1 + - Fixed guarantees of concurrent_vector::size() and grow_to_at_least() methods to assure elements are allocated. + - Methods end()/rbegin()/back() are partly thread-safe since they use size() to get the end of vector + - Added resize() methods (not thread-safe) + - Added cbegin/cend/crbegin/crend methods + - Changed return type of methods grow* and push_back to iterator + +@par Changes since TBB 2.0 + - Implemented exception-safety guaranties + - Added template argument for allocator + - Added allocator argument in constructors + - Faster index calculation + - First growth call specifies a number of segments to be merged in the first allocation. + - Fixed memory blow up for swarm of vector's instances of small size + - Added grow_by(size_type n, const_reference t) growth using copying constructor to init new items. + - Added STL-like constructors. + - Added operators ==, < and derivatives + - Added at() method, approved for using after an exception was thrown inside the vector + - Added get_allocator() method. + - Added assign() methods + - Added compact() method to defragment first segments + - Added swap() method + - range() defaults on grainsize = 1 supporting auto grainsize algorithms. 
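To make the fragmentation note above concrete, here is an illustrative sketch (assumed usage, not part of the patched header): reserving capacity up front keeps elements in fewer, larger arrays, while shrink_to_fit()/compact() defragments a vector that grew from a small initial size.

#include "tbb/concurrent_vector.h"

void layout_sketch() {
    tbb::concurrent_vector<double> v;
    v.reserve( 1024 );              // size the first allocation for 1024 elements
    for( int i = 0; i < 1024; ++i )
        v.push_back( 0.0 );         // safe to call concurrently from several threads
    v.shrink_to_fit();              // not thread-safe: merges small segments into one array
}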
+ + @ingroup containers */ +template +class concurrent_vector: protected internal::allocator_base, + private internal::concurrent_vector_base { +private: + template + class generic_range_type: public blocked_range { + public: + typedef T value_type; + typedef T& reference; + typedef const T& const_reference; + typedef I iterator; + typedef ptrdiff_t difference_type; + generic_range_type( I begin_, I end_, size_t grainsize = 1) : blocked_range(begin_,end_,grainsize) {} + template + generic_range_type( const generic_range_type& r) : blocked_range(r.begin(),r.end(),r.grainsize()) {} + generic_range_type( generic_range_type& r, split ) : blocked_range(r,split()) {} + }; + + template + friend class internal::vector_iterator; +public: + //------------------------------------------------------------------------ + // STL compatible types + //------------------------------------------------------------------------ + typedef internal::concurrent_vector_base_v3::size_type size_type; + typedef typename internal::allocator_base::allocator_type allocator_type; + + typedef T value_type; + typedef ptrdiff_t difference_type; + typedef T& reference; + typedef const T& const_reference; + typedef T *pointer; + typedef const T *const_pointer; + + typedef internal::vector_iterator iterator; + typedef internal::vector_iterator const_iterator; + +#if !defined(_MSC_VER) || _CPPLIB_VER>=300 + // Assume ISO standard definition of std::reverse_iterator + typedef std::reverse_iterator reverse_iterator; + typedef std::reverse_iterator const_reverse_iterator; +#else + // Use non-standard std::reverse_iterator + typedef std::reverse_iterator reverse_iterator; + typedef std::reverse_iterator const_reverse_iterator; +#endif /* defined(_MSC_VER) && (_MSC_VER<1300) */ + + //------------------------------------------------------------------------ + // Parallel algorithm support + //------------------------------------------------------------------------ + typedef generic_range_type range_type; + typedef generic_range_type const_range_type; + + //------------------------------------------------------------------------ + // STL compatible constructors & destructors + //------------------------------------------------------------------------ + + //! Construct empty vector. + explicit concurrent_vector(const allocator_type &a = allocator_type()) + : internal::allocator_base(a) + { + vector_allocator_ptr = &internal_allocator; + } + + //! Copying constructor + concurrent_vector( const concurrent_vector& vector, const allocator_type& a = allocator_type() ) + : internal::allocator_base(a) + { + vector_allocator_ptr = &internal_allocator; + try { + internal_copy(vector, sizeof(T), ©_array); + } catch(...) { + segment_t *table = my_segment; + internal_free_segments( reinterpret_cast(table), internal_clear(&destroy_array), my_first_block ); + throw; + } + } + + //! Copying constructor for vector with different allocator type + template + concurrent_vector( const concurrent_vector& vector, const allocator_type& a = allocator_type() ) + : internal::allocator_base(a) + { + vector_allocator_ptr = &internal_allocator; + try { + internal_copy(vector.internal_vector_base(), sizeof(T), ©_array); + } catch(...) { + segment_t *table = my_segment; + internal_free_segments( reinterpret_cast(table), internal_clear(&destroy_array), my_first_block ); + throw; + } + } + + //! 
Construction with initial size specified by argument n + explicit concurrent_vector(size_type n) + { + vector_allocator_ptr = &internal_allocator; + try { + internal_resize( n, sizeof(T), max_size(), NULL, &destroy_array, &initialize_array ); + } catch(...) { + segment_t *table = my_segment; + internal_free_segments( reinterpret_cast(table), internal_clear(&destroy_array), my_first_block ); + throw; + } + } + + //! Construction with initial size specified by argument n, initialization by copying of t, and given allocator instance + concurrent_vector(size_type n, const_reference t, const allocator_type& a = allocator_type()) + : internal::allocator_base(a) + { + vector_allocator_ptr = &internal_allocator; + try { + internal_resize( n, sizeof(T), max_size(), static_cast(&t), &destroy_array, &initialize_array_by ); + } catch(...) { + segment_t *table = my_segment; + internal_free_segments( reinterpret_cast(table), internal_clear(&destroy_array), my_first_block ); + throw; + } + } + + //! Construction with copying iteration range and given allocator instance + template + concurrent_vector(I first, I last, const allocator_type &a = allocator_type()) + : internal::allocator_base(a) + { + vector_allocator_ptr = &internal_allocator; + try { + internal_assign_range(first, last, static_cast::is_integer> *>(0) ); + } catch(...) { + segment_t *table = my_segment; + internal_free_segments( reinterpret_cast(table), internal_clear(&destroy_array), my_first_block ); + throw; + } + } + + //! Assignment + concurrent_vector& operator=( const concurrent_vector& vector ) { + if( this != &vector ) + internal_assign(vector, sizeof(T), &destroy_array, &assign_array, ©_array); + return *this; + } + + //! Assignment for vector with different allocator type + template + concurrent_vector& operator=( const concurrent_vector& vector ) { + if( static_cast( this ) != static_cast( &vector ) ) + internal_assign(vector.internal_vector_base(), + sizeof(T), &destroy_array, &assign_array, ©_array); + return *this; + } + + //------------------------------------------------------------------------ + // Concurrent operations + //------------------------------------------------------------------------ + //! Grow by "delta" elements. +#if TBB_DEPRECATED + /** Returns old size. */ + size_type grow_by( size_type delta ) { + return delta ? internal_grow_by( delta, sizeof(T), &initialize_array, NULL ) : my_early_size; + } +#else + /** Returns iterator pointing to the first new element. */ + iterator grow_by( size_type delta ) { + return iterator(*this, delta ? internal_grow_by( delta, sizeof(T), &initialize_array, NULL ) : my_early_size); + } +#endif + + //! Grow by "delta" elements using copying constuctor. +#if TBB_DEPRECATED + /** Returns old size. */ + size_type grow_by( size_type delta, const_reference t ) { + return delta ? internal_grow_by( delta, sizeof(T), &initialize_array_by, static_cast(&t) ) : my_early_size; + } +#else + /** Returns iterator pointing to the first new element. */ + iterator grow_by( size_type delta, const_reference t ) { + return iterator(*this, delta ? internal_grow_by( delta, sizeof(T), &initialize_array_by, static_cast(&t) ) : my_early_size); + } +#endif + + //! Append minimal sequence of elements such that size()>=n. +#if TBB_DEPRECATED + /** The new elements are default constructed. Blocks until all elements in range [0..n) are allocated. + May return while other elements are being constructed by other threads. 
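Since TBB 2.2 the growth methods return an iterator to the first appended element (unless TBB_DEPRECATED restores the old size-returning form), so a thread can initialize exactly the block it appended. A sketch with assumed names:

#include "tbb/concurrent_vector.h"

void append_block( tbb::concurrent_vector<int> &v, int base ) {
    // Atomically reserve 8 new default-constructed slots; it points at the first of them.
    tbb::concurrent_vector<int>::iterator it = v.grow_by( 8 );
    for( int i = 0; i < 8; ++i, ++it )
        *it = base + i;             // fill only the slots this call appended
}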
*/ + void grow_to_at_least( size_type n ) { + if( n ) internal_grow_to_at_least_with_result( n, sizeof(T), &initialize_array, NULL ); + }; +#else + /** The new elements are default constructed. Blocks until all elements in range [0..n) are allocated. + May return while other elements are being constructed by other threads. + Returns iterator that points to beginning of appended sequence. + If no elements were appended, returns iterator pointing to nth element. */ + iterator grow_to_at_least( size_type n ) { + size_type m=0; + if( n ) { + m = internal_grow_to_at_least_with_result( n, sizeof(T), &initialize_array, NULL ); + if( m>n ) m=n; + } + return iterator(*this, m); + }; +#endif + + //! Push item +#if TBB_DEPRECATED + size_type push_back( const_reference item ) +#else + /** Returns iterator pointing to the new element. */ + iterator push_back( const_reference item ) +#endif + { + size_type k; + void *ptr = internal_push_back(sizeof(T),k); + internal_loop_guide loop(1, ptr); + loop.init(&item); +#if TBB_DEPRECATED + return k; +#else + return iterator(*this, k, ptr); +#endif + } + + //! Get reference to element at given index. + /** This method is thread-safe for concurrent reads, and also while growing the vector, + as long as the calling thread has checked that index<size(). */ + reference operator[]( size_type index ) { + return internal_subscript(index); + } + + //! Get const reference to element at given index. + const_reference operator[]( size_type index ) const { + return internal_subscript(index); + } + + //! Get reference to element at given index. Throws exceptions on errors. + reference at( size_type index ) { + return internal_subscript_with_exceptions(index); + } + + //! Get const reference to element at given index. Throws exceptions on errors. + const_reference at( size_type index ) const { + return internal_subscript_with_exceptions(index); + } + + //! Get range for iterating with parallel algorithms + range_type range( size_t grainsize = 1) { + return range_type( begin(), end(), grainsize ); + } + + //! Get const range for iterating with parallel algorithms + const_range_type range( size_t grainsize = 1 ) const { + return const_range_type( begin(), end(), grainsize ); + } + //------------------------------------------------------------------------ + // Capacity + //------------------------------------------------------------------------ + //! Return size of vector. It may include elements under construction + size_type size() const { + size_type sz = my_early_size, cp = internal_capacity(); + return cp < sz ? cp : sz; + } + + //! Return true if vector is not empty or has elements under construction at least. + bool empty() const {return !my_early_size;} + + //! Maximum size to which array can grow without allocating more memory. Concurrent allocations are not included in the value. + size_type capacity() const {return internal_capacity();} + + //! Allocate enough space to grow to size n without having to allocate more memory later. + /** Like most of the methods provided for STL compatibility, this method is *not* thread safe. + The capacity afterwards may be bigger than the requested reservation. */ + void reserve( size_type n ) { + if( n ) + internal_reserve(n, sizeof(T), max_size()); + } + + //! Resize the vector. Not thread-safe. + void resize( size_type n ) { + internal_resize( n, sizeof(T), max_size(), NULL, &destroy_array, &initialize_array ); + } + + //! Resize the vector, copy t for new elements. Not thread-safe. 
+    void resize( size_type n, const_reference t ) {
+        internal_resize( n, sizeof(T), max_size(), static_cast<const void*>(&t), &destroy_array, &initialize_array_by );
+    }
+
+#if TBB_DEPRECATED
+    //! An alias for shrink_to_fit()
+    void compact() {shrink_to_fit();}
+#endif /* TBB_DEPRECATED */
+
+    //! Optimize memory usage and fragmentation.
+    void shrink_to_fit();
+
+    //! Upper bound on argument to reserve.
+    size_type max_size() const {return (~size_type(0))/sizeof(T);}
+
+    //------------------------------------------------------------------------
+    // STL support
+    //------------------------------------------------------------------------
+
+    //! start iterator
+    iterator begin() {return iterator(*this,0);}
+    //! end iterator
+    iterator end() {return iterator(*this,size());}
+    //! start const iterator
+    const_iterator begin() const {return const_iterator(*this,0);}
+    //! end const iterator
+    const_iterator end() const {return const_iterator(*this,size());}
+    //! start const iterator
+    const_iterator cbegin() const {return const_iterator(*this,0);}
+    //! end const iterator
+    const_iterator cend() const {return const_iterator(*this,size());}
+    //! reverse start iterator
+    reverse_iterator rbegin() {return reverse_iterator(end());}
+    //! reverse end iterator
+    reverse_iterator rend() {return reverse_iterator(begin());}
+    //! reverse start const iterator
+    const_reverse_iterator rbegin() const {return const_reverse_iterator(end());}
+    //! reverse end const iterator
+    const_reverse_iterator rend() const {return const_reverse_iterator(begin());}
+    //! reverse start const iterator
+    const_reverse_iterator crbegin() const {return const_reverse_iterator(end());}
+    //! reverse end const iterator
+    const_reverse_iterator crend() const {return const_reverse_iterator(begin());}
+    //! the first item
+    reference front() {
+        __TBB_ASSERT( size()>0, NULL);
+        return static_cast<T*>(my_segment[0].array)[0];
+    }
+    //! the first item const
+    const_reference front() const {
+        __TBB_ASSERT( size()>0, NULL);
+        return static_cast<const T*>(my_segment[0].array)[0];
+    }
+    //! the last item
+    reference back() {
+        __TBB_ASSERT( size()>0, NULL);
+        return internal_subscript( size()-1 );
+    }
+    //! the last item const
+    const_reference back() const {
+        __TBB_ASSERT( size()>0, NULL);
+        return internal_subscript( size()-1 );
+    }
+    //! return allocator object
+    allocator_type get_allocator() const { return this->my_allocator; }
+
+    //! assign n items by copying t item
+    void assign(size_type n, const_reference t) {
+        clear();
+        internal_resize( n, sizeof(T), max_size(), static_cast<const void*>(&t), &destroy_array, &initialize_array_by );
+    }
+
+    //! assign range [first, last)
+    template<class I>
+    void assign(I first, I last) {
+        clear(); internal_assign_range( first, last, static_cast<is_integer_tag<std::numeric_limits<I>::is_integer> *>(0) );
+    }
+
+    //! swap two instances
+    void swap(concurrent_vector &vector) {
+        if( this != &vector ) {
+            concurrent_vector_base_v3::internal_swap(static_cast<concurrent_vector_base_v3&>(vector));
+            std::swap(this->my_allocator, vector.my_allocator);
+        }
+    }
+
+    //! Clear container while keeping memory allocated.
+    /** To free up the memory, use in conjunction with method compact(). Not thread safe **/
+    void clear() {
+        internal_clear(&destroy_array);
+    }
+
+    //! Clear and destroy vector.
+    ~concurrent_vector() {
+        segment_t *table = my_segment;
+        internal_free_segments( reinterpret_cast<void**>(table), internal_clear(&destroy_array), my_first_block );
+        // base class destructor call should be then
+    }
+
+    const internal::concurrent_vector_base_v3 &internal_vector_base() const { return *this; }
+private:
+    //!
Allocate k items + static void *internal_allocator(internal::concurrent_vector_base_v3 &vb, size_t k) { + return static_cast&>(vb).my_allocator.allocate(k); + } + //! Free k segments from table + void internal_free_segments(void *table[], segment_index_t k, segment_index_t first_block); + + //! Get reference to element at given index. + T& internal_subscript( size_type index ) const; + + //! Get reference to element at given index with errors checks + T& internal_subscript_with_exceptions( size_type index ) const; + + //! assign n items by copying t + void internal_assign_n(size_type n, const_pointer p) { + internal_resize( n, sizeof(T), max_size(), static_cast(p), &destroy_array, p? &initialize_array_by : &initialize_array ); + } + + //! helper class + template class is_integer_tag; + + //! assign integer items by copying when arguments are treated as iterators. See C++ Standard 2003 23.1.1p9 + template + void internal_assign_range(I first, I last, is_integer_tag *) { + internal_assign_n(static_cast(first), &static_cast(last)); + } + //! inline proxy assign by iterators + template + void internal_assign_range(I first, I last, is_integer_tag *) { + internal_assign_iterators(first, last); + } + //! assign by iterators + template + void internal_assign_iterators(I first, I last); + + //! Construct n instances of T, starting at "begin". + static void __TBB_EXPORTED_FUNC initialize_array( void* begin, const void*, size_type n ); + + //! Construct n instances of T, starting at "begin". + static void __TBB_EXPORTED_FUNC initialize_array_by( void* begin, const void* src, size_type n ); + + //! Construct n instances of T, starting at "begin". + static void __TBB_EXPORTED_FUNC copy_array( void* dst, const void* src, size_type n ); + + //! Assign n instances of T, starting at "begin". + static void __TBB_EXPORTED_FUNC assign_array( void* dst, const void* src, size_type n ); + + //! Destroy n instances of T, starting at "begin". + static void __TBB_EXPORTED_FUNC destroy_array( void* begin, size_type n ); + + //! Exception-aware helper class for filling a segment by exception-danger operators of user class + class internal_loop_guide : internal::no_copy { + public: + const pointer array; + const size_type n; + size_type i; + internal_loop_guide(size_type ntrials, void *ptr) + : array(static_cast(ptr)), n(ntrials), i(0) {} + void init() { for(; i < n; ++i) new( &array[i] ) T(); } + void init(const void *src) { for(; i < n; ++i) new( &array[i] ) T(*static_cast(src)); } + void copy(const void *src) { for(; i < n; ++i) new( &array[i] ) T(static_cast(src)[i]); } + void assign(const void *src) { for(; i < n; ++i) array[i] = static_cast(src)[i]; } + template void iterate(I &src) { for(; i < n; ++i, ++src) new( &array[i] ) T( *src ); } + ~internal_loop_guide() { + if(i < n) // if exception raised, do zerroing on the rest of items + std::memset(array+i, 0, (n-i)*sizeof(value_type)); + } + }; +}; + +template +void concurrent_vector::shrink_to_fit() { + internal_segments_table old; + try { + if( internal_compact( sizeof(T), &old, &destroy_array, ©_array ) ) + internal_free_segments( old.table, pointers_per_long_table, old.first_block ); // free joined and unnecessary segments + } catch(...) { + if( old.first_block ) // free segment allocated for compacting. 
Only for support of exceptions in ctor of user T[ype] + internal_free_segments( old.table, 1, old.first_block ); + throw; + } +} + +template +void concurrent_vector::internal_free_segments(void *table[], segment_index_t k, segment_index_t first_block) { + // Free the arrays + while( k > first_block ) { + --k; + T* array = static_cast(table[k]); + table[k] = NULL; + if( array > internal::vector_allocation_error_flag ) // check for correct segment pointer + this->my_allocator.deallocate( array, segment_size(k) ); + } + T* array = static_cast(table[0]); + if( array > internal::vector_allocation_error_flag ) { + __TBB_ASSERT( first_block > 0, NULL ); + while(k > 0) table[--k] = NULL; + this->my_allocator.deallocate( array, segment_size(first_block) ); + } +} + +template +T& concurrent_vector::internal_subscript( size_type index ) const { + __TBB_ASSERT( index < my_early_size, "index out of bounds" ); + size_type j = index; + segment_index_t k = segment_base_index_of( j ); + __TBB_ASSERT( my_segment != (segment_t*)my_storage || k < pointers_per_short_table, "index is being allocated" ); + // no need in __TBB_load_with_acquire since thread works in own space or gets +#if TBB_USE_THREADING_TOOLS + T* array = static_cast( tbb::internal::itt_load_pointer_v3(&my_segment[k].array)); +#else + T* array = static_cast(my_segment[k].array); +#endif /* TBB_USE_THREADING_TOOLS */ + __TBB_ASSERT( array != internal::vector_allocation_error_flag, "the instance is broken by bad allocation. Use at() instead" ); + __TBB_ASSERT( array, "index is being allocated" ); + return array[j]; +} + +template +T& concurrent_vector::internal_subscript_with_exceptions( size_type index ) const { + if( index >= my_early_size ) + internal_throw_exception(0); // throw std::out_of_range + size_type j = index; + segment_index_t k = segment_base_index_of( j ); + if( my_segment == (segment_t*)my_storage && k >= pointers_per_short_table ) + internal_throw_exception(1); // throw std::range_error + void *array = my_segment[k].array; // no need in __TBB_load_with_acquire + if( array <= internal::vector_allocation_error_flag ) // check for correct segment pointer + internal_throw_exception(2); // throw std::range_error + return static_cast(array)[j]; +} + +template template +void concurrent_vector::internal_assign_iterators(I first, I last) { + __TBB_ASSERT(my_early_size == 0, NULL); + size_type n = std::distance(first, last); + if( !n ) return; + internal_reserve(n, sizeof(T), max_size()); + my_early_size = n; + segment_index_t k = 0; + size_type sz = segment_size( my_first_block ); + while( sz < n ) { + internal_loop_guide loop(sz, my_segment[k].array); + loop.iterate(first); + n -= sz; + if( !k ) k = my_first_block; + else { ++k; sz <<= 1; } + } + internal_loop_guide loop(n, my_segment[k].array); + loop.iterate(first); +} + +template +void concurrent_vector::initialize_array( void* begin, const void *, size_type n ) { + internal_loop_guide loop(n, begin); loop.init(); +} + +template +void concurrent_vector::initialize_array_by( void* begin, const void *src, size_type n ) { + internal_loop_guide loop(n, begin); loop.init(src); +} + +template +void concurrent_vector::copy_array( void* dst, const void* src, size_type n ) { + internal_loop_guide loop(n, dst); loop.copy(src); +} + +template +void concurrent_vector::assign_array( void* dst, const void* src, size_type n ) { + internal_loop_guide loop(n, dst); loop.assign(src); +} + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + // Workaround for overzealous compiler warning + #pragma 
warning (push) + #pragma warning (disable: 4189) +#endif +template +void concurrent_vector::destroy_array( void* begin, size_type n ) { + T* array = static_cast(begin); + for( size_type j=n; j>0; --j ) + array[j-1].~T(); // destructors are supposed to not throw any exceptions +} +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + #pragma warning (pop) +#endif // warning 4189 is back + +// concurrent_vector's template functions +template +inline bool operator==(const concurrent_vector &a, const concurrent_vector &b) { + // Simply: return a.size() == b.size() && std::equal(a.begin(), a.end(), b.begin()); + if(a.size() != b.size()) return false; + typename concurrent_vector::const_iterator i(a.begin()); + typename concurrent_vector::const_iterator j(b.begin()); + for(; i != a.end(); ++i, ++j) + if( !(*i == *j) ) return false; + return true; +} + +template +inline bool operator!=(const concurrent_vector &a, const concurrent_vector &b) +{ return !(a == b); } + +template +inline bool operator<(const concurrent_vector &a, const concurrent_vector &b) +{ return (std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end())); } + +template +inline bool operator>(const concurrent_vector &a, const concurrent_vector &b) +{ return b < a; } + +template +inline bool operator<=(const concurrent_vector &a, const concurrent_vector &b) +{ return !(b < a); } + +template +inline bool operator>=(const concurrent_vector &a, const concurrent_vector &b) +{ return !(a < b); } + +template +inline void swap(concurrent_vector &a, concurrent_vector &b) +{ a.swap( b ); } + +} // namespace tbb + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) && defined(_Wp64) + #pragma warning (pop) +#endif // warning 4267 is back + +#endif /* __TBB_concurrent_vector_H */ diff --git a/dep/tbb/include/tbb/enumerable_thread_specific.h b/dep/tbb/include/tbb/enumerable_thread_specific.h new file mode 100644 index 000000000..123a62f00 --- /dev/null +++ b/dep/tbb/include/tbb/enumerable_thread_specific.h @@ -0,0 +1,880 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#ifndef __TBB_enumerable_thread_specific_H +#define __TBB_enumerable_thread_specific_H + +#include "concurrent_vector.h" +#include "tbb_thread.h" +#include "concurrent_hash_map.h" +#include "cache_aligned_allocator.h" +#if __SUNPRO_CC +#include // for memcpy +#endif + +#if _WIN32||_WIN64 +#include +#else +#include +#endif + +namespace tbb { + + //! enum for selecting between single key and key-per-instance versions + enum ets_key_usage_type { ets_key_per_instance, ets_no_key }; + + //! @cond + namespace internal { + + //! Random access iterator for traversing the thread local copies. + template< typename Container, typename Value > + class enumerable_thread_specific_iterator +#if defined(_WIN64) && defined(_MSC_VER) + // Ensure that Microsoft's internal template function _Val_type works correctly. + : public std::iterator +#endif /* defined(_WIN64) && defined(_MSC_VER) */ + { + //! current position in the concurrent_vector + + Container *my_container; + typename Container::size_type my_index; + mutable Value *my_value; + + template + friend enumerable_thread_specific_iterator operator+( ptrdiff_t offset, + const enumerable_thread_specific_iterator& v ); + + template + friend bool operator==( const enumerable_thread_specific_iterator& i, + const enumerable_thread_specific_iterator& j ); + + template + friend bool operator<( const enumerable_thread_specific_iterator& i, + const enumerable_thread_specific_iterator& j ); + + template + friend ptrdiff_t operator-( const enumerable_thread_specific_iterator& i, const enumerable_thread_specific_iterator& j ); + + template + friend class enumerable_thread_specific_iterator; + + public: + + enumerable_thread_specific_iterator( const Container &container, typename Container::size_type index ) : + my_container(&const_cast(container)), my_index(index), my_value(NULL) {} + + //! Default constructor + enumerable_thread_specific_iterator() : my_container(NULL), my_index(0), my_value(NULL) {} + + template + enumerable_thread_specific_iterator( const enumerable_thread_specific_iterator& other ) : + my_container( other.my_container ), my_index( other.my_index), my_value( const_cast(other.my_value) ) {} + + enumerable_thread_specific_iterator operator+( ptrdiff_t offset ) const { + return enumerable_thread_specific_iterator(*my_container, my_index + offset); + } + + enumerable_thread_specific_iterator &operator+=( ptrdiff_t offset ) { + my_index += offset; + my_value = NULL; + return *this; + } + + enumerable_thread_specific_iterator operator-( ptrdiff_t offset ) const { + return enumerable_thread_specific_iterator( *my_container, my_index-offset ); + } + + enumerable_thread_specific_iterator &operator-=( ptrdiff_t offset ) { + my_index -= offset; + my_value = NULL; + return *this; + } + + Value& operator*() const { + Value* value = my_value; + if( !value ) { + value = my_value = &(*my_container)[my_index].value; + } + __TBB_ASSERT( value==&(*my_container)[my_index].value, "corrupt cache" ); + return *value; + } + + Value& operator[]( ptrdiff_t k ) const { + return (*my_container)[my_index + k].value; + } + + Value* operator->() const {return &operator*();} + + enumerable_thread_specific_iterator& operator++() { + ++my_index; + my_value = NULL; + return *this; + } + + enumerable_thread_specific_iterator& operator--() { + --my_index; + my_value = NULL; + return *this; + } + + //! 
Post increment + enumerable_thread_specific_iterator operator++(int) { + enumerable_thread_specific_iterator result = *this; + ++my_index; + my_value = NULL; + return result; + } + + //! Post decrement + enumerable_thread_specific_iterator operator--(int) { + enumerable_thread_specific_iterator result = *this; + --my_index; + my_value = NULL; + return result; + } + + // STL support + typedef ptrdiff_t difference_type; + typedef Value value_type; + typedef Value* pointer; + typedef Value& reference; + typedef std::random_access_iterator_tag iterator_category; + }; + + template + enumerable_thread_specific_iterator operator+( ptrdiff_t offset, + const enumerable_thread_specific_iterator& v ) { + return enumerable_thread_specific_iterator( v.my_container, v.my_index + offset ); + } + + template + bool operator==( const enumerable_thread_specific_iterator& i, + const enumerable_thread_specific_iterator& j ) { + return i.my_index==j.my_index && i.my_container == j.my_container; + } + + template + bool operator!=( const enumerable_thread_specific_iterator& i, + const enumerable_thread_specific_iterator& j ) { + return !(i==j); + } + + template + bool operator<( const enumerable_thread_specific_iterator& i, + const enumerable_thread_specific_iterator& j ) { + return i.my_index + bool operator>( const enumerable_thread_specific_iterator& i, + const enumerable_thread_specific_iterator& j ) { + return j + bool operator>=( const enumerable_thread_specific_iterator& i, + const enumerable_thread_specific_iterator& j ) { + return !(i + bool operator<=( const enumerable_thread_specific_iterator& i, + const enumerable_thread_specific_iterator& j ) { + return !(j + ptrdiff_t operator-( const enumerable_thread_specific_iterator& i, + const enumerable_thread_specific_iterator& j ) { + return i.my_index-j.my_index; + } + + template + class segmented_iterator +#if defined(_WIN64) && defined(_MSC_VER) + : public std::iterator +#endif + { + template + friend bool operator==(const segmented_iterator& i, const segmented_iterator& j); + + template + friend bool operator!=(const segmented_iterator& i, const segmented_iterator& j); + + template + friend class segmented_iterator; + + public: + + segmented_iterator() {my_segcont = NULL;} + + segmented_iterator( const SegmentedContainer& _segmented_container ) : + my_segcont(const_cast(&_segmented_container)), + outer_iter(my_segcont->end()) { } + + ~segmented_iterator() {} + + typedef typename SegmentedContainer::iterator outer_iterator; + typedef typename SegmentedContainer::value_type InnerContainer; + typedef typename InnerContainer::iterator inner_iterator; + + // STL support + typedef ptrdiff_t difference_type; + typedef Value value_type; + typedef typename SegmentedContainer::size_type size_type; + typedef Value* pointer; + typedef Value& reference; + typedef std::input_iterator_tag iterator_category; + + // Copy Constructor + template + segmented_iterator(const segmented_iterator& other) : + my_segcont(other.my_segcont), + outer_iter(other.outer_iter), + // can we assign a default-constructed iterator to inner if we're at the end? + inner_iter(other.inner_iter) + {} + + // assignment + template + segmented_iterator& operator=( const segmented_iterator& other) { + if(this != &other) { + my_segcont = other.my_segcont; + outer_iter = other.outer_iter; + if(outer_iter != my_segcont->end()) inner_iter = other.inner_iter; + } + return *this; + } + + // allow assignment of outer iterator to segmented iterator. 
Once it is + // assigned, move forward until a non-empty inner container is found or + // the end of the outer container is reached. + segmented_iterator& operator=(const outer_iterator& new_outer_iter) { + __TBB_ASSERT(my_segcont != NULL, NULL); + // check that this iterator points to something inside the segmented container + for(outer_iter = new_outer_iter ;outer_iter!=my_segcont->end(); ++outer_iter) { + if( !outer_iter->empty() ) { + inner_iter = outer_iter->begin(); + break; + } + } + return *this; + } + + // pre-increment + segmented_iterator& operator++() { + advance_me(); + return *this; + } + + // post-increment + segmented_iterator operator++(int) { + segmented_iterator tmp = *this; + operator++(); + return tmp; + } + + bool operator==(const outer_iterator& other_outer) const { + __TBB_ASSERT(my_segcont != NULL, NULL); + return (outer_iter == other_outer && + (outer_iter == my_segcont->end() || inner_iter == outer_iter->begin())); + } + + bool operator!=(const outer_iterator& other_outer) const { + return !operator==(other_outer); + + } + + // (i)* RHS + reference operator*() const { + __TBB_ASSERT(my_segcont != NULL, NULL); + __TBB_ASSERT(outer_iter != my_segcont->end(), "Dereferencing a pointer at end of container"); + __TBB_ASSERT(inner_iter != outer_iter->end(), NULL); // should never happen + return *inner_iter; + } + + // i-> + pointer operator->() const { return &operator*();} + + private: + SegmentedContainer* my_segcont; + outer_iterator outer_iter; + inner_iterator inner_iter; + + void advance_me() { + __TBB_ASSERT(my_segcont != NULL, NULL); + __TBB_ASSERT(outer_iter != my_segcont->end(), NULL); // not true if there are no inner containers + __TBB_ASSERT(inner_iter != outer_iter->end(), NULL); // not true if the inner containers are all empty. + ++inner_iter; + while(inner_iter == outer_iter->end() && ++outer_iter != my_segcont->end()) { + inner_iter = outer_iter->begin(); + } + } + }; // segmented_iterator + + template + bool operator==( const segmented_iterator& i, + const segmented_iterator& j ) { + if(i.my_segcont != j.my_segcont) return false; + if(i.my_segcont == NULL) return true; + if(i.outer_iter != j.outer_iter) return false; + if(i.outer_iter == i.my_segcont->end()) return true; + return i.inner_iter == j.inner_iter; + } + + // != + template + bool operator!=( const segmented_iterator& i, + const segmented_iterator& j ) { + return !(i==j); + } + + // empty template for following specializations + template + struct tls_manager {}; + + //! Struct that doesn't use a key + template <> + struct tls_manager { + typedef size_t tls_key_t; + static inline void create_key( tls_key_t &) { } + static inline void destroy_key( tls_key_t & ) { } + static inline void set_tls( tls_key_t &, void * ) { } + static inline void * get_tls( tls_key_t & ) { return (size_t)0; } + }; + + //! 
Struct to use native TLS support directly + template <> + struct tls_manager { +#if _WIN32||_WIN64 + typedef DWORD tls_key_t; + static inline void create_key( tls_key_t &k) { k = TlsAlloc(); } + static inline void destroy_key( tls_key_t &k) { TlsFree(k); } + static inline void set_tls( tls_key_t &k, void * value) { TlsSetValue(k, (LPVOID)value); } + static inline void * get_tls( tls_key_t &k ) { return (void *)TlsGetValue(k); } +#else + typedef pthread_key_t tls_key_t; + static inline void create_key( tls_key_t &k) { pthread_key_create(&k, NULL); } + static inline void destroy_key( tls_key_t &k) { pthread_key_delete(k); } + static inline void set_tls( tls_key_t &k, void * value) { pthread_setspecific(k, value); } + static inline void * get_tls( tls_key_t &k ) { return pthread_getspecific(k); } +#endif + }; + + class thread_hash_compare { + public: + // using hack suggested by Arch to get value for thread id for hashing... +#if _WIN32||_WIN64 + typedef DWORD thread_key; +#else + typedef pthread_t thread_key; +#endif + static thread_key my_thread_key(const tbb_thread::id j) { + thread_key key_val; + memcpy(&key_val, &j, sizeof(thread_key)); + return key_val; + } + + bool equal( const thread_key j, const thread_key k) const { + return j == k; + } + unsigned long hash(const thread_key k) const { + return (unsigned long)k; + } + }; + + // storage for initialization function pointer + template + struct callback_base { + virtual T apply( ) = 0; + virtual void destroy( ) = 0; + // need to be able to create copies of callback_base for copy constructor + virtual callback_base* make_copy() = 0; + // need virtual destructor to satisfy GCC compiler warning + virtual ~callback_base() { } + }; + + template + struct callback_leaf : public callback_base { + typedef Functor my_callback_type; + typedef callback_leaf my_type; + typedef my_type* callback_pointer; + typedef typename tbb::tbb_allocator my_allocator_type; + Functor f; + callback_leaf( const Functor& f_) : f(f_) { + } + + static callback_pointer new_callback(const Functor& f_ ) { + void* new_void = my_allocator_type().allocate(1); + callback_pointer new_cb = new (new_void) callback_leaf(f_); // placement new + return new_cb; + } + + /* override */ callback_pointer make_copy() { + return new_callback( f ); + } + + /* override */ void destroy( ) { + callback_pointer my_ptr = this; + my_allocator_type().destroy(my_ptr); + my_allocator_type().deallocate(my_ptr,1); + } + /* override */ T apply() { return f(); } // does copy construction of returned value. + }; + + template + class ets_concurrent_hash_map : public tbb::concurrent_hash_map { + public: + typedef tbb::concurrent_hash_map base_type; + typedef typename base_type::const_pointer const_pointer; + typedef typename base_type::key_type key_type; + const_pointer find( const key_type &k ) { + return internal_fast_find( k ); + } // make public + }; + + } // namespace internal + //! @endcond + + //! The thread local class template + template , + ets_key_usage_type ETS_key_type=ets_no_key > + class enumerable_thread_specific { + + template friend class enumerable_thread_specific; + + typedef internal::tls_manager< ETS_key_type > my_tls_manager; + + //! The padded elements; padded to avoid false sharing + template + struct padded_element { + U value; + char padding[ ( (sizeof(U) - 1) / internal::NFS_MaxLineSize + 1 ) * internal::NFS_MaxLineSize - sizeof(U) ]; + padded_element(const U &v) : value(v) {} + padded_element() {} + }; + + //! 
A generic range, used to create range objects from the iterators + template + class generic_range_type: public blocked_range { + public: + typedef T value_type; + typedef T& reference; + typedef const T& const_reference; + typedef I iterator; + typedef ptrdiff_t difference_type; + generic_range_type( I begin_, I end_, size_t grainsize = 1) : blocked_range(begin_,end_,grainsize) {} + template + generic_range_type( const generic_range_type& r) : blocked_range(r.begin(),r.end(),r.grainsize()) {} + generic_range_type( generic_range_type& r, split ) : blocked_range(r,split()) {} + }; + + typedef typename Allocator::template rebind< padded_element >::other padded_allocator_type; + typedef tbb::concurrent_vector< padded_element, padded_allocator_type > internal_collection_type; + typedef typename internal_collection_type::size_type hash_table_index_type; // storing array indices rather than iterators to simplify + // copying the hash table that correlates thread IDs with concurrent vector elements. + + typedef typename Allocator::template rebind< std::pair< typename internal::thread_hash_compare::thread_key, hash_table_index_type > >::other hash_element_allocator; + typedef internal::ets_concurrent_hash_map< typename internal::thread_hash_compare::thread_key, hash_table_index_type, internal::thread_hash_compare, hash_element_allocator > thread_to_index_type; + + typename my_tls_manager::tls_key_t my_key; + + void reset_key() { + my_tls_manager::destroy_key(my_key); + my_tls_manager::create_key(my_key); + } + + internal::callback_base *my_finit_callback; + + // need to use a pointed-to exemplar because T may not be assignable. + // using tbb_allocator instead of padded_element_allocator because we may be + // copying an exemplar from one instantiation of ETS to another with a different + // allocator. + typedef typename tbb::tbb_allocator > exemplar_allocator_type; + static padded_element * create_exemplar(const T& my_value) { + padded_element *new_exemplar = 0; + // void *new_space = padded_allocator_type().allocate(1); + void *new_space = exemplar_allocator_type().allocate(1); + new_exemplar = new(new_space) padded_element(my_value); + return new_exemplar; + } + + static padded_element *create_exemplar( ) { + // void *new_space = padded_allocator_type().allocate(1); + void *new_space = exemplar_allocator_type().allocate(1); + padded_element *new_exemplar = new(new_space) padded_element( ); + return new_exemplar; + } + + static void free_exemplar(padded_element *my_ptr) { + // padded_allocator_type().destroy(my_ptr); + // padded_allocator_type().deallocate(my_ptr,1); + exemplar_allocator_type().destroy(my_ptr); + exemplar_allocator_type().deallocate(my_ptr,1); + } + + padded_element* my_exemplar_ptr; + + internal_collection_type my_locals; + thread_to_index_type my_hash_tbl; + + public: + + //! 
Basic types + typedef Allocator allocator_type; + typedef T value_type; + typedef T& reference; + typedef const T& const_reference; + typedef T* pointer; + typedef const T* const_pointer; + typedef typename internal_collection_type::size_type size_type; + typedef typename internal_collection_type::difference_type difference_type; + + // Iterator types + typedef typename internal::enumerable_thread_specific_iterator< internal_collection_type, value_type > iterator; + typedef typename internal::enumerable_thread_specific_iterator< internal_collection_type, const value_type > const_iterator; + + // Parallel range types + typedef generic_range_type< iterator > range_type; + typedef generic_range_type< const_iterator > const_range_type; + + //! Default constructor, which leads to default construction of local copies + enumerable_thread_specific() : my_finit_callback(0) { + my_exemplar_ptr = create_exemplar(); + my_tls_manager::create_key(my_key); + } + + //! construction with initializer method + // Finit should be a function taking 0 parameters and returning a T + template + enumerable_thread_specific( Finit _finit ) + { + my_finit_callback = internal::callback_leaf::new_callback( _finit ); + my_tls_manager::create_key(my_key); + my_exemplar_ptr = 0; // don't need exemplar if function is provided + } + + //! Constuction with exemplar, which leads to copy construction of local copies + enumerable_thread_specific(const T &_exemplar) : my_finit_callback(0) { + my_exemplar_ptr = create_exemplar(_exemplar); + my_tls_manager::create_key(my_key); + } + + //! Destructor + ~enumerable_thread_specific() { + my_tls_manager::destroy_key(my_key); + if(my_finit_callback) { + my_finit_callback->destroy(); + } + if(my_exemplar_ptr) + { + free_exemplar(my_exemplar_ptr); + } + } + + //! returns reference to local, discarding exists + reference local() { + bool exists; + return local(exists); + } + + //! Returns reference to calling thread's local copy, creating one if necessary + reference local(bool& exists) { + if ( pointer local_ptr = static_cast(my_tls_manager::get_tls(my_key)) ) { + exists = true; + return *local_ptr; + } + hash_table_index_type local_index; + typename internal::thread_hash_compare::thread_key my_t_key = internal::thread_hash_compare::my_thread_key(tbb::this_tbb_thread::get_id()); + { + typename thread_to_index_type::const_pointer my_existing_entry; + my_existing_entry = my_hash_tbl.find(my_t_key); + if(my_existing_entry) { + exists = true; + local_index = my_existing_entry->second; + } + else { + + // see if the table entry can be found by accessor + typename thread_to_index_type::accessor a; + if(!my_hash_tbl.insert(a, my_t_key)) { + exists = true; + local_index = a->second; + } + else { + // create new entry + exists = false; + if(my_finit_callback) { + // convert iterator to array index +#if TBB_DEPRECATED + local_index = my_locals.push_back(my_finit_callback->apply()); +#else + local_index = my_locals.push_back(my_finit_callback->apply()) - my_locals.begin(); +#endif + } + else { + // convert iterator to array index +#if TBB_DEPRECATED + local_index = my_locals.push_back(*my_exemplar_ptr); +#else + local_index = my_locals.push_back(*my_exemplar_ptr) - my_locals.begin(); +#endif + } + // insert into hash table + a->second = local_index; + } + } + } + + reference local_ref = (my_locals[local_index].value); + my_tls_manager::set_tls( my_key, static_cast(&local_ref) ); + return local_ref; + } // local + + //! 
Get the number of local copies + size_type size() const { return my_locals.size(); } + + //! true if there have been no local copies created + bool empty() const { return my_locals.empty(); } + + //! begin iterator + iterator begin() { return iterator( my_locals, 0 ); } + //! end iterator + iterator end() { return iterator(my_locals, my_locals.size() ); } + + //! begin const iterator + const_iterator begin() const { return const_iterator(my_locals, 0); } + + //! end const iterator + const_iterator end() const { return const_iterator(my_locals, my_locals.size()); } + + //! Get range for parallel algorithms + range_type range( size_t grainsize=1 ) { return range_type( begin(), end(), grainsize ); } + + //! Get const range for parallel algorithms + const_range_type range( size_t grainsize=1 ) const { return const_range_type( begin(), end(), grainsize ); } + + //! Destroys local copies + void clear() { + my_locals.clear(); + my_hash_tbl.clear(); + reset_key(); + // callback is not destroyed + // exemplar is not destroyed + } + + // STL container methods + // copy constructor + + private: + + template + void + internal_copy_construct( const enumerable_thread_specific& other) { + typedef typename tbb::enumerable_thread_specific other_type; + for(typename other_type::const_iterator ci = other.begin(); ci != other.end(); ++ci) { + my_locals.push_back(*ci); + } + if(other.my_finit_callback) { + my_finit_callback = other.my_finit_callback->make_copy(); + } + else { + my_finit_callback = 0; + } + if(other.my_exemplar_ptr) { + my_exemplar_ptr = create_exemplar(other.my_exemplar_ptr->value); + } + else { + my_exemplar_ptr = 0; + } + my_tls_manager::create_key(my_key); + } + + public: + + template + enumerable_thread_specific( const enumerable_thread_specific& other ) : my_hash_tbl(other.my_hash_tbl) + { // Have to do push_back because the contained elements are not necessarily assignable. + internal_copy_construct(other); + } + + // non-templatized version + enumerable_thread_specific( const enumerable_thread_specific& other ) : my_hash_tbl(other.my_hash_tbl) + { + internal_copy_construct(other); + } + + private: + + template + enumerable_thread_specific & + internal_assign(const enumerable_thread_specific& other) { + typedef typename tbb::enumerable_thread_specific other_type; + if(static_cast( this ) != static_cast( &other )) { + this->clear(); // resets TLS key + my_hash_tbl = other.my_hash_tbl; + // cannot use assign because T may not be assignable. 
+ for(typename other_type::const_iterator ci = other.begin(); ci != other.end(); ++ci) { + my_locals.push_back(*ci); + } + + if(my_finit_callback) { + my_finit_callback->destroy(); + my_finit_callback = 0; + } + if(my_exemplar_ptr) { + free_exemplar(my_exemplar_ptr); + my_exemplar_ptr = 0; + } + if(other.my_finit_callback) { + my_finit_callback = other.my_finit_callback->make_copy(); + } + + if(other.my_exemplar_ptr) { + my_exemplar_ptr = create_exemplar(other.my_exemplar_ptr->value); + } + } + return *this; + } + + public: + + // assignment + enumerable_thread_specific& operator=(const enumerable_thread_specific& other) { + return internal_assign(other); + } + + template + enumerable_thread_specific& operator=(const enumerable_thread_specific& other) + { + return internal_assign(other); + } + + private: + + // combine_func_t has signature T(T,T) or T(const T&, const T&) + template + T internal_combine(typename internal_collection_type::const_range_type r, combine_func_t f_combine) { + if(r.is_divisible()) { + typename internal_collection_type::const_range_type r2(r,split()); + return f_combine(internal_combine(r2, f_combine), internal_combine(r, f_combine)); + } + if(r.size() == 1) { + return r.begin()->value; + } + typename internal_collection_type::const_iterator i2 = r.begin(); + ++i2; + return f_combine(r.begin()->value, i2->value); + } + + public: + + // combine_func_t has signature T(T,T) or T(const T&, const T&) + template + T combine(combine_func_t f_combine) { + if(my_locals.begin() == my_locals.end()) { + if(my_finit_callback) { + return my_finit_callback->apply(); + } + return (*my_exemplar_ptr).value; + } + typename internal_collection_type::const_range_type r(my_locals.begin(), my_locals.end(), (size_t)2); + return internal_combine(r, f_combine); + } + + // combine_func_t has signature void(T) or void(const T&) + template + void combine_each(combine_func_t f_combine) { + for(const_iterator ci = begin(); ci != end(); ++ci) { + f_combine( *ci ); + } + } + }; // enumerable_thread_specific + + template< typename Container > + class flattened2d { + + // This intermediate typedef is to address issues with VC7.1 compilers + typedef typename Container::value_type conval_type; + + public: + + //! 
Basic types + typedef typename conval_type::size_type size_type; + typedef typename conval_type::difference_type difference_type; + typedef typename conval_type::allocator_type allocator_type; + typedef typename conval_type::value_type value_type; + typedef typename conval_type::reference reference; + typedef typename conval_type::const_reference const_reference; + typedef typename conval_type::pointer pointer; + typedef typename conval_type::const_pointer const_pointer; + + typedef typename internal::segmented_iterator iterator; + typedef typename internal::segmented_iterator const_iterator; + + flattened2d( const Container &c, typename Container::const_iterator b, typename Container::const_iterator e ) : + my_container(const_cast(&c)), my_begin(b), my_end(e) { } + + flattened2d( const Container &c ) : + my_container(const_cast(&c)), my_begin(c.begin()), my_end(c.end()) { } + + iterator begin() { return iterator(*my_container) = my_begin; } + iterator end() { return iterator(*my_container) = my_end; } + const_iterator begin() const { return const_iterator(*my_container) = my_begin; } + const_iterator end() const { return const_iterator(*my_container) = my_end; } + + size_type size() const { + size_type tot_size = 0; + for(typename Container::const_iterator i = my_begin; i != my_end; ++i) { + tot_size += i->size(); + } + return tot_size; + } + + private: + + Container *my_container; + typename Container::const_iterator my_begin; + typename Container::const_iterator my_end; + + }; + + template + flattened2d flatten2d(const Container &c, const typename Container::const_iterator b, const typename Container::const_iterator e) { + return flattened2d(c, b, e); + } + + template + flattened2d flatten2d(const Container &c) { + return flattened2d(c); + } + +} // namespace tbb + +#endif diff --git a/dep/tbb/include/tbb/index.html b/dep/tbb/include/tbb/index.html new file mode 100644 index 000000000..fa0596588 --- /dev/null +++ b/dep/tbb/include/tbb/index.html @@ -0,0 +1,28 @@ + + + +

Overview

+Include files for Threading Building Blocks classes and functions. + +
Click here to see all files in the directory. + +

Directories

+
+
machine +
Include files for low-level architecture specific functionality. +
compat +
Include files for source level compatibility with other frameworks. +
+ +
+Up to parent directory +

+Copyright © 2005-2009 Intel Corporation. All Rights Reserved. +

+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are +registered trademarks or trademarks of Intel Corporation or its +subsidiaries in the United States and other countries. +

+* Other names and brands may be claimed as the property of others. + + diff --git a/dep/tbb/include/tbb/machine/ibm_aix51.h b/dep/tbb/include/tbb/machine/ibm_aix51.h new file mode 100644 index 000000000..439011540 --- /dev/null +++ b/dep/tbb/include/tbb/machine/ibm_aix51.h @@ -0,0 +1,52 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_machine_H +#error Do not include this file directly; include tbb_machine.h instead +#endif + +#define __TBB_WORDSIZE 8 +#define __TBB_BIG_ENDIAN 1 + +#include +#include +#include + +extern "C" { + +int32_t __TBB_machine_cas_32 (volatile void* ptr, int32_t value, int32_t comparand); +int64_t __TBB_machine_cas_64 (volatile void* ptr, int64_t value, int64_t comparand); +#define __TBB_fence_for_acquire() __TBB_machine_flush () +#define __TBB_fence_for_release() __TBB_machine_flush () + +} + +#define __TBB_CompareAndSwap4(P,V,C) __TBB_machine_cas_32(P,V,C) +#define __TBB_CompareAndSwap8(P,V,C) __TBB_machine_cas_64(P,V,C) +#define __TBB_CompareAndSwapW(P,V,C) __TBB_machine_cas_64(P,V,C) +#define __TBB_Yield() sched_yield() diff --git a/dep/tbb/include/tbb/machine/linux_common.h b/dep/tbb/include/tbb/machine/linux_common.h new file mode 100644 index 000000000..35bff2592 --- /dev/null +++ b/dep/tbb/include/tbb/machine/linux_common.h @@ -0,0 +1,95 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_machine_H +#error Do not include this file directly; include tbb_machine.h instead +#endif + +#include +#include +#include + +// Definition of __TBB_Yield() +#define __TBB_Yield() sched_yield() + +/* Futex definitions */ +#include + +#if defined(SYS_futex) + +#define __TBB_USE_FUTEX 1 +#include +#include +// Unfortunately, some versions of Linux do not have a header that defines FUTEX_WAIT and FUTEX_WAKE. + +#ifdef FUTEX_WAIT +#define __TBB_FUTEX_WAIT FUTEX_WAIT +#else +#define __TBB_FUTEX_WAIT 0 +#endif + +#ifdef FUTEX_WAKE +#define __TBB_FUTEX_WAKE FUTEX_WAKE +#else +#define __TBB_FUTEX_WAKE 1 +#endif + +#ifndef __TBB_ASSERT +#error machine specific headers must be included after tbb_stddef.h +#endif + +namespace tbb { + +namespace internal { + +inline int futex_wait( void *futex, int comparand ) { + int r = ::syscall( SYS_futex,futex,__TBB_FUTEX_WAIT,comparand,NULL,NULL,0 ); +#if TBB_USE_ASSERT + int e = errno; + __TBB_ASSERT( r==0||r==EWOULDBLOCK||(r==-1&&(e==EAGAIN||e==EINTR)), "futex_wait failed." ); +#endif /* TBB_USE_ASSERT */ + return r; +} + +inline int futex_wakeup_one( void *futex ) { + int r = ::syscall( SYS_futex,futex,__TBB_FUTEX_WAKE,1,NULL,NULL,0 ); + __TBB_ASSERT( r==0||r==1, "futex_wakeup_one: more than one thread woken up?" ); + return r; +} + +inline int futex_wakeup_all( void *futex ) { + int r = ::syscall( SYS_futex,futex,__TBB_FUTEX_WAKE,INT_MAX,NULL,NULL,0 ); + __TBB_ASSERT( r>=0, "futex_wakeup_all: error in waking up threads" ); + return r; +} + +} /* namespace internal */ + +} /* namespace tbb */ + +#endif /* SYS_futex */ diff --git a/dep/tbb/include/tbb/machine/linux_ia32.h b/dep/tbb/include/tbb/machine/linux_ia32.h new file mode 100644 index 000000000..514e3d79d --- /dev/null +++ b/dep/tbb/include/tbb/machine/linux_ia32.h @@ -0,0 +1,253 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. 
This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_machine_H +#error Do not include this file directly; include tbb_machine.h instead +#endif + +#if !__MINGW32__ +#include "linux_common.h" +#endif + +#define __TBB_WORDSIZE 4 +#define __TBB_BIG_ENDIAN 0 + +#define __TBB_release_consistency_helper() __asm__ __volatile__("": : :"memory") + +inline void __TBB_rel_acq_fence() { __asm__ __volatile__("mfence": : :"memory"); } + +#define __MACHINE_DECL_ATOMICS(S,T,X) \ +static inline T __TBB_machine_cmpswp##S (volatile void *ptr, T value, T comparand ) \ +{ \ + T result; \ + \ + __asm__ __volatile__("lock\ncmpxchg" X " %2,%1" \ + : "=a"(result), "=m"(*(T *)ptr) \ + : "q"(value), "0"(comparand), "m"(*(T *)ptr) \ + : "memory"); \ + return result; \ +} \ + \ +static inline T __TBB_machine_fetchadd##S(volatile void *ptr, T addend) \ +{ \ + T result; \ + __asm__ __volatile__("lock\nxadd" X " %0,%1" \ + : "=r"(result), "=m"(*(T *)ptr) \ + : "0"(addend), "m"(*(T *)ptr) \ + : "memory"); \ + return result; \ +} \ + \ +static inline T __TBB_machine_fetchstore##S(volatile void *ptr, T value) \ +{ \ + T result; \ + __asm__ __volatile__("lock\nxchg" X " %0,%1" \ + : "=r"(result), "=m"(*(T *)ptr) \ + : "0"(value), "m"(*(T *)ptr) \ + : "memory"); \ + return result; \ +} \ + +__MACHINE_DECL_ATOMICS(1,int8_t,"") +__MACHINE_DECL_ATOMICS(2,int16_t,"") +__MACHINE_DECL_ATOMICS(4,int32_t,"l") + +static inline int64_t __TBB_machine_cmpswp8 (volatile void *ptr, int64_t value, int64_t comparand ) +{ + int64_t result; +#if __PIC__ + /* compiling position-independent code */ + // EBX register preserved for compliancy with position-independent code rules on IA32 + __asm__ __volatile__ ( + "pushl %%ebx\n\t" + "movl (%%ecx),%%ebx\n\t" + "movl 4(%%ecx),%%ecx\n\t" + "lock\n\t cmpxchg8b %1\n\t" + "popl %%ebx" + : "=A"(result), "=m"(*(int64_t *)ptr) + : "m"(*(int64_t *)ptr) + , "0"(comparand) + , "c"(&value) + : "memory", "esp" +#if __INTEL_COMPILER + ,"ebx" +#endif + ); +#else /* !__PIC__ */ + union { + int64_t i64; + int32_t i32[2]; + }; + i64 = value; + __asm__ __volatile__ ( + "lock\n\t cmpxchg8b %1\n\t" + : "=A"(result), "=m"(*(int64_t *)ptr) + : "m"(*(int64_t *)ptr) + , "0"(comparand) + , "b"(i32[0]), "c"(i32[1]) + : "memory" + ); +#endif /* __PIC__ */ + return result; +} + +static inline int32_t __TBB_machine_lg( uint32_t x ) { + int32_t j; + __asm__ ("bsr %1,%0" : "=r"(j) : "r"(x)); + return j; +} + +static inline void __TBB_machine_or( volatile void *ptr, uint32_t addend ) { + __asm__ __volatile__("lock\norl %1,%0" : "=m"(*(uint32_t *)ptr) : "r"(addend), "m"(*(uint32_t *)ptr) : "memory"); +} + +static inline void __TBB_machine_and( volatile void *ptr, uint32_t addend ) { + __asm__ __volatile__("lock\nandl %1,%0" : "=m"(*(uint32_t *)ptr) : "r"(addend), "m"(*(uint32_t *)ptr) : "memory"); +} + +static inline void __TBB_machine_pause( int32_t delay ) { + for (int32_t i = 0; i < delay; i++) { + __asm__ __volatile__("pause;"); + } + return; +} + +static inline int64_t __TBB_machine_load8 (const volatile void *ptr) { + int64_t result; + if( ((uint32_t)ptr&7u)==0 ) { + // Aligned load + __asm__ __volatile__ ( "fildq %1\n\t" + "fistpq %0" : "=m"(result) : "m"(*(uint64_t *)ptr) : "memory" ); + } else { + // Unaligned load + result = __TBB_machine_cmpswp8((void*)ptr,0,0); + } + return result; +} + +//! 
Handles misaligned 8-byte store +/** Defined in tbb_misc.cpp */ +extern "C" void __TBB_machine_store8_slow( volatile void *ptr, int64_t value ); +extern "C" void __TBB_machine_store8_slow_perf_warning( volatile void *ptr ); + +static inline void __TBB_machine_store8(volatile void *ptr, int64_t value) { + if( ((uint32_t)ptr&7u)==0 ) { + // Aligned store + __asm__ __volatile__ ( "fildq %1\n\t" + "fistpq %0" : "=m"(*(int64_t *)ptr) : "m"(value) : "memory" ); + } else { + // Unaligned store +#if TBB_USE_PERFORMANCE_WARNINGS + __TBB_machine_store8_slow_perf_warning(ptr); +#endif /* TBB_USE_PERFORMANCE_WARNINGS */ + __TBB_machine_store8_slow(ptr,value); + } +} + +template +struct __TBB_machine_load_store { + static inline T load_with_acquire(const volatile T& location) { + T to_return = location; + __asm__ __volatile__("" : : : "memory" ); // Compiler fence to keep operations from migrating upwards + return to_return; + } + + static inline void store_with_release(volatile T &location, T value) { + __asm__ __volatile__("" : : : "memory" ); // Compiler fence to keep operations from migrating upwards + location = value; + } +}; + +template +struct __TBB_machine_load_store { + static inline T load_with_acquire(const volatile T& location) { + T to_return = __TBB_machine_load8((volatile void *)&location); + __asm__ __volatile__("" : : : "memory" ); // Compiler fence to keep operations from migrating upwards + return to_return; + } + + static inline void store_with_release(volatile T &location, T value) { + __asm__ __volatile__("" : : : "memory" ); // Compiler fence to keep operations from migrating downwards + __TBB_machine_store8((volatile void *)&location,(int64_t)value); + } +}; + +template +inline T __TBB_machine_load_with_acquire(const volatile T &location) { + return __TBB_machine_load_store::load_with_acquire(location); +} + +template +inline void __TBB_machine_store_with_release(volatile T &location, V value) { + __TBB_machine_load_store::store_with_release(location,value); +} + +#define __TBB_load_with_acquire(L) __TBB_machine_load_with_acquire((L)) +#define __TBB_store_with_release(L,V) __TBB_machine_store_with_release((L),(V)) + +// Machine specific atomic operations + +#define __TBB_CompareAndSwap1(P,V,C) __TBB_machine_cmpswp1(P,V,C) +#define __TBB_CompareAndSwap2(P,V,C) __TBB_machine_cmpswp2(P,V,C) +#define __TBB_CompareAndSwap4(P,V,C) __TBB_machine_cmpswp4(P,V,C) +#define __TBB_CompareAndSwap8(P,V,C) __TBB_machine_cmpswp8(P,V,C) +#define __TBB_CompareAndSwapW(P,V,C) __TBB_machine_cmpswp4(P,V,C) + +#define __TBB_FetchAndAdd1(P,V) __TBB_machine_fetchadd1(P,V) +#define __TBB_FetchAndAdd2(P,V) __TBB_machine_fetchadd2(P,V) +#define __TBB_FetchAndAdd4(P,V) __TBB_machine_fetchadd4(P,V) +#define __TBB_FetchAndAddW(P,V) __TBB_machine_fetchadd4(P,V) + +#define __TBB_FetchAndStore1(P,V) __TBB_machine_fetchstore1(P,V) +#define __TBB_FetchAndStore2(P,V) __TBB_machine_fetchstore2(P,V) +#define __TBB_FetchAndStore4(P,V) __TBB_machine_fetchstore4(P,V) +#define __TBB_FetchAndStoreW(P,V) __TBB_machine_fetchstore4(P,V) + +#define __TBB_Store8(P,V) __TBB_machine_store8(P,V) +#define __TBB_Load8(P) __TBB_machine_load8(P) + +#define __TBB_AtomicOR(P,V) __TBB_machine_or(P,V) +#define __TBB_AtomicAND(P,V) __TBB_machine_and(P,V) + + +// Those we chose not to implement (they will be implemented generically using CMPSWP8) +#undef __TBB_FetchAndAdd8 +#undef __TBB_FetchAndStore8 + +// Definition of other functions +#define __TBB_Pause(V) __TBB_machine_pause(V) +#define __TBB_Log2(V) __TBB_machine_lg(V) + +// Special 
atomic functions +#define __TBB_FetchAndAddWrelease(P,V) __TBB_FetchAndAddW(P,V) +#define __TBB_FetchAndIncrementWacquire(P) __TBB_FetchAndAddW(P,1) +#define __TBB_FetchAndDecrementWrelease(P) __TBB_FetchAndAddW(P,-1) + +// Use generic definitions from tbb_machine.h +#undef __TBB_TryLockByte +#undef __TBB_LockByte diff --git a/dep/tbb/include/tbb/machine/linux_ia64.h b/dep/tbb/include/tbb/machine/linux_ia64.h new file mode 100644 index 000000000..59347b5cd --- /dev/null +++ b/dep/tbb/include/tbb/machine/linux_ia64.h @@ -0,0 +1,169 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#ifndef __TBB_machine_H +#error Do not include this file directly; include tbb_machine.h instead +#endif + +#include "linux_common.h" +#include + +#define __TBB_WORDSIZE 8 +#define __TBB_BIG_ENDIAN 0 +#define __TBB_DECL_FENCED_ATOMICS 1 + +// Most of the functions will be in a .s file + +extern "C" { + int8_t __TBB_machine_cmpswp1__TBB_full_fence (volatile void *ptr, int8_t value, int8_t comparand); + int8_t __TBB_machine_fetchadd1__TBB_full_fence (volatile void *ptr, int8_t addend); + int8_t __TBB_machine_fetchadd1acquire(volatile void *ptr, int8_t addend); + int8_t __TBB_machine_fetchadd1release(volatile void *ptr, int8_t addend); + int8_t __TBB_machine_fetchstore1acquire(volatile void *ptr, int8_t value); + int8_t __TBB_machine_fetchstore1release(volatile void *ptr, int8_t value); + + int16_t __TBB_machine_cmpswp2__TBB_full_fence (volatile void *ptr, int16_t value, int16_t comparand); + int16_t __TBB_machine_fetchadd2__TBB_full_fence (volatile void *ptr, int16_t addend); + int16_t __TBB_machine_fetchadd2acquire(volatile void *ptr, int16_t addend); + int16_t __TBB_machine_fetchadd2release(volatile void *ptr, int16_t addend); + int16_t __TBB_machine_fetchstore2acquire(volatile void *ptr, int16_t value); + int16_t __TBB_machine_fetchstore2release(volatile void *ptr, int16_t value); + + int32_t __TBB_machine_fetchstore4__TBB_full_fence (volatile void *ptr, int32_t value); + int32_t __TBB_machine_fetchstore4acquire(volatile void *ptr, int32_t value); + int32_t __TBB_machine_fetchstore4release(volatile void *ptr, int32_t value); + int32_t __TBB_machine_fetchadd4acquire(volatile void *ptr, int32_t addend); + int32_t __TBB_machine_fetchadd4release(volatile void *ptr, int32_t addend); + + int64_t __TBB_machine_cmpswp8__TBB_full_fence (volatile void *ptr, int64_t value, int64_t comparand); + int64_t __TBB_machine_fetchstore8__TBB_full_fence (volatile void *ptr, int64_t value); + int64_t __TBB_machine_fetchstore8acquire(volatile void *ptr, int64_t value); + int64_t __TBB_machine_fetchstore8release(volatile void *ptr, int64_t value); + int64_t __TBB_machine_fetchadd8acquire(volatile void *ptr, int64_t addend); + int64_t __TBB_machine_fetchadd8release(volatile void *ptr, int64_t addend); + + int8_t __TBB_machine_cmpswp1acquire(volatile void *ptr, int8_t value, int8_t comparand); + int8_t __TBB_machine_cmpswp1release(volatile void *ptr, int8_t value, int8_t comparand); + int8_t __TBB_machine_fetchstore1__TBB_full_fence (volatile void *ptr, int8_t value); + + int16_t __TBB_machine_cmpswp2acquire(volatile void *ptr, int16_t value, int16_t comparand); + int16_t __TBB_machine_cmpswp2release(volatile void *ptr, int16_t value, int16_t comparand); + int16_t __TBB_machine_fetchstore2__TBB_full_fence (volatile void *ptr, int16_t value); + + int32_t __TBB_machine_cmpswp4__TBB_full_fence (volatile void *ptr, int32_t value, int32_t comparand); + int32_t __TBB_machine_cmpswp4acquire(volatile void *ptr, int32_t value, int32_t comparand); + int32_t __TBB_machine_cmpswp4release(volatile void *ptr, int32_t value, int32_t comparand); + int32_t __TBB_machine_fetchadd4__TBB_full_fence (volatile void *ptr, int32_t value); + + int64_t __TBB_machine_cmpswp8acquire(volatile void *ptr, int64_t value, int64_t comparand); + int64_t __TBB_machine_cmpswp8release(volatile void *ptr, int64_t value, int64_t comparand); + int64_t __TBB_machine_fetchadd8__TBB_full_fence (volatile void *ptr, int64_t value); + + int64_t __TBB_machine_lg(uint64_t value); + void __TBB_machine_pause(int32_t delay); + bool __TBB_machine_trylockbyte( 
volatile unsigned char &ptr ); + int64_t __TBB_machine_lockbyte( volatile unsigned char &ptr ); + + //! Retrieves the current RSE backing store pointer. IA64 specific. + void* __TBB_get_bsp(); +} + +#define __TBB_CompareAndSwap1(P,V,C) __TBB_machine_cmpswp1__TBB_full_fence(P,V,C) +#define __TBB_CompareAndSwap2(P,V,C) __TBB_machine_cmpswp2__TBB_full_fence(P,V,C) + +#define __TBB_FetchAndAdd1(P,V) __TBB_machine_fetchadd1__TBB_full_fence(P,V) +#define __TBB_FetchAndAdd1acquire(P,V) __TBB_machine_fetchadd1acquire(P,V) +#define __TBB_FetchAndAdd1release(P,V) __TBB_machine_fetchadd1release(P,V) +#define __TBB_FetchAndAdd2(P,V) __TBB_machine_fetchadd2__TBB_full_fence(P,V) +#define __TBB_FetchAndAdd2acquire(P,V) __TBB_machine_fetchadd2acquire(P,V) +#define __TBB_FetchAndAdd2release(P,V) __TBB_machine_fetchadd2release(P,V) +#define __TBB_FetchAndAdd4acquire(P,V) __TBB_machine_fetchadd4acquire(P,V) +#define __TBB_FetchAndAdd4release(P,V) __TBB_machine_fetchadd4release(P,V) +#define __TBB_FetchAndAdd8acquire(P,V) __TBB_machine_fetchadd8acquire(P,V) +#define __TBB_FetchAndAdd8release(P,V) __TBB_machine_fetchadd8release(P,V) + +#define __TBB_FetchAndStore1acquire(P,V) __TBB_machine_fetchstore1acquire(P,V) +#define __TBB_FetchAndStore1release(P,V) __TBB_machine_fetchstore1release(P,V) +#define __TBB_FetchAndStore2acquire(P,V) __TBB_machine_fetchstore2acquire(P,V) +#define __TBB_FetchAndStore2release(P,V) __TBB_machine_fetchstore2release(P,V) +#define __TBB_FetchAndStore4acquire(P,V) __TBB_machine_fetchstore4acquire(P,V) +#define __TBB_FetchAndStore4release(P,V) __TBB_machine_fetchstore4release(P,V) +#define __TBB_FetchAndStore8acquire(P,V) __TBB_machine_fetchstore8acquire(P,V) +#define __TBB_FetchAndStore8release(P,V) __TBB_machine_fetchstore8release(P,V) + +#define __TBB_CompareAndSwap1acquire(P,V,C) __TBB_machine_cmpswp1acquire(P,V,C) +#define __TBB_CompareAndSwap1release(P,V,C) __TBB_machine_cmpswp1release(P,V,C) +#define __TBB_CompareAndSwap2acquire(P,V,C) __TBB_machine_cmpswp2acquire(P,V,C) +#define __TBB_CompareAndSwap2release(P,V,C) __TBB_machine_cmpswp2release(P,V,C) +#define __TBB_CompareAndSwap4(P,V,C) __TBB_machine_cmpswp4__TBB_full_fence(P,V,C) +#define __TBB_CompareAndSwap4acquire(P,V,C) __TBB_machine_cmpswp4acquire(P,V,C) +#define __TBB_CompareAndSwap4release(P,V,C) __TBB_machine_cmpswp4release(P,V,C) +#define __TBB_CompareAndSwap8(P,V,C) __TBB_machine_cmpswp8__TBB_full_fence(P,V,C) +#define __TBB_CompareAndSwap8acquire(P,V,C) __TBB_machine_cmpswp8acquire(P,V,C) +#define __TBB_CompareAndSwap8release(P,V,C) __TBB_machine_cmpswp8release(P,V,C) + +#define __TBB_FetchAndAdd4(P,V) __TBB_machine_fetchadd4__TBB_full_fence(P,V) +#define __TBB_FetchAndAdd8(P,V) __TBB_machine_fetchadd8__TBB_full_fence(P,V) + +#define __TBB_FetchAndStore1(P,V) __TBB_machine_fetchstore1__TBB_full_fence(P,V) +#define __TBB_FetchAndStore2(P,V) __TBB_machine_fetchstore2__TBB_full_fence(P,V) +#define __TBB_FetchAndStore4(P,V) __TBB_machine_fetchstore4__TBB_full_fence(P,V) +#define __TBB_FetchAndStore8(P,V) __TBB_machine_fetchstore8__TBB_full_fence(P,V) + +#define __TBB_FetchAndIncrementWacquire(P) __TBB_FetchAndAdd8acquire(P,1) +#define __TBB_FetchAndDecrementWrelease(P) __TBB_FetchAndAdd8release(P,-1) + +#ifndef __INTEL_COMPILER +/* Even though GCC imbues volatile loads with acquire semantics, + it sometimes moves loads over the acquire fence. The + fences defined here stop such incorrect code motion. 
*/ +#define __TBB_release_consistency_helper() __asm__ __volatile__("": : :"memory") +#define __TBB_rel_acq_fence() __asm__ __volatile__("mf": : :"memory") +#else +#define __TBB_release_consistency_helper() +#define __TBB_rel_acq_fence() __mf() +#endif /* __INTEL_COMPILER */ + +// Special atomic functions +#define __TBB_CompareAndSwapW(P,V,C) __TBB_CompareAndSwap8(P,V,C) +#define __TBB_FetchAndStoreW(P,V) __TBB_FetchAndStore8(P,V) +#define __TBB_FetchAndAddW(P,V) __TBB_FetchAndAdd8(P,V) +#define __TBB_FetchAndAddWrelease(P,V) __TBB_FetchAndAdd8release(P,V) + +// Not needed +#undef __TBB_Store8 +#undef __TBB_Load8 + +// Definition of Lock functions +#define __TBB_TryLockByte(P) __TBB_machine_trylockbyte(P) +#define __TBB_LockByte(P) __TBB_machine_lockbyte(P) + +// Definition of other utility functions +#define __TBB_Pause(V) __TBB_machine_pause(V) +#define __TBB_Log2(V) __TBB_machine_lg(V) + diff --git a/dep/tbb/include/tbb/machine/linux_intel64.h b/dep/tbb/include/tbb/machine/linux_intel64.h new file mode 100644 index 000000000..55bca95eb --- /dev/null +++ b/dep/tbb/include/tbb/machine/linux_intel64.h @@ -0,0 +1,139 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#ifndef __TBB_machine_H +#error Do not include this file directly; include tbb_machine.h instead +#endif + +#include "linux_common.h" + +#define __TBB_WORDSIZE 8 +#define __TBB_BIG_ENDIAN 0 + +#define __TBB_release_consistency_helper() __asm__ __volatile__("": : :"memory") + +#ifndef __TBB_rel_acq_fence +inline void __TBB_rel_acq_fence() { __asm__ __volatile__("mfence": : :"memory"); } +#endif + +#define __MACHINE_DECL_ATOMICS(S,T,X) \ +static inline T __TBB_machine_cmpswp##S (volatile void *ptr, T value, T comparand ) \ +{ \ + T result; \ + \ + __asm__ __volatile__("lock\ncmpxchg" X " %2,%1" \ + : "=a"(result), "=m"(*(T *)ptr) \ + : "q"(value), "0"(comparand), "m"(*(T *)ptr) \ + : "memory"); \ + return result; \ +} \ + \ +static inline T __TBB_machine_fetchadd##S(volatile void *ptr, T addend) \ +{ \ + T result; \ + __asm__ __volatile__("lock\nxadd" X " %0,%1" \ + : "=r"(result),"=m"(*(T *)ptr) \ + : "0"(addend), "m"(*(T *)ptr) \ + : "memory"); \ + return result; \ +} \ + \ +static inline T __TBB_machine_fetchstore##S(volatile void *ptr, T value) \ +{ \ + T result; \ + __asm__ __volatile__("lock\nxchg" X " %0,%1" \ + : "=r"(result),"=m"(*(T *)ptr) \ + : "0"(value), "m"(*(T *)ptr) \ + : "memory"); \ + return result; \ +} \ + +__MACHINE_DECL_ATOMICS(1,int8_t,"") +__MACHINE_DECL_ATOMICS(2,int16_t,"") +__MACHINE_DECL_ATOMICS(4,int32_t,"") +__MACHINE_DECL_ATOMICS(8,int64_t,"q") + +static inline int64_t __TBB_machine_lg( uint64_t x ) { + int64_t j; + __asm__ ("bsr %1,%0" : "=r"(j) : "r"(x)); + return j; +} + +static inline void __TBB_machine_or( volatile void *ptr, uint64_t addend ) { + __asm__ __volatile__("lock\norq %1,%0" : "=m"(*(uint64_t *)ptr) : "r"(addend), "m"(*(uint64_t *)ptr) : "memory"); +} + +static inline void __TBB_machine_and( volatile void *ptr, uint64_t addend ) { + __asm__ __volatile__("lock\nandq %1,%0" : "=m"(*(uint64_t *)ptr) : "r"(addend), "m"(*(uint64_t *)ptr) : "memory"); +} + +static inline void __TBB_machine_pause( int32_t delay ) { + for (int32_t i = 0; i < delay; i++) { + __asm__ __volatile__("pause;"); + } + return; +} + +// Machine specific atomic operations + +#define __TBB_CompareAndSwap1(P,V,C) __TBB_machine_cmpswp1(P,V,C) +#define __TBB_CompareAndSwap2(P,V,C) __TBB_machine_cmpswp2(P,V,C) +#define __TBB_CompareAndSwap4(P,V,C) __TBB_machine_cmpswp4(P,V,C) +#define __TBB_CompareAndSwap8(P,V,C) __TBB_machine_cmpswp8(P,V,C) +#define __TBB_CompareAndSwapW(P,V,C) __TBB_machine_cmpswp8(P,V,C) + +#define __TBB_FetchAndAdd1(P,V) __TBB_machine_fetchadd1(P,V) +#define __TBB_FetchAndAdd2(P,V) __TBB_machine_fetchadd2(P,V) +#define __TBB_FetchAndAdd4(P,V) __TBB_machine_fetchadd4(P,V) +#define __TBB_FetchAndAdd8(P,V) __TBB_machine_fetchadd8(P,V) +#define __TBB_FetchAndAddW(P,V) __TBB_machine_fetchadd8(P,V) + +#define __TBB_FetchAndStore1(P,V) __TBB_machine_fetchstore1(P,V) +#define __TBB_FetchAndStore2(P,V) __TBB_machine_fetchstore2(P,V) +#define __TBB_FetchAndStore4(P,V) __TBB_machine_fetchstore4(P,V) +#define __TBB_FetchAndStore8(P,V) __TBB_machine_fetchstore8(P,V) +#define __TBB_FetchAndStoreW(P,V) __TBB_machine_fetchstore8(P,V) + +#define __TBB_Store8(P,V) (*P = V) +#define __TBB_Load8(P) (*P) + +#define __TBB_AtomicOR(P,V) __TBB_machine_or(P,V) +#define __TBB_AtomicAND(P,V) __TBB_machine_and(P,V) + +// Definition of other functions +#define __TBB_Pause(V) __TBB_machine_pause(V) +#define __TBB_Log2(V) __TBB_machine_lg(V) + +// Special atomic functions +#define __TBB_FetchAndAddWrelease(P,V) __TBB_FetchAndAddW(P,V) +#define __TBB_FetchAndIncrementWacquire(P) 
__TBB_FetchAndAddW(P,1) +#define __TBB_FetchAndDecrementWrelease(P) __TBB_FetchAndAddW(P,-1) + +// Use generic definitions from tbb_machine.h +#undef __TBB_TryLockByte +#undef __TBB_LockByte diff --git a/dep/tbb/include/tbb/machine/mac_ppc.h b/dep/tbb/include/tbb/machine/mac_ppc.h new file mode 100644 index 000000000..6d6b1befe --- /dev/null +++ b/dep/tbb/include/tbb/machine/mac_ppc.h @@ -0,0 +1,85 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_machine_H +#error Do not include this file directly; include tbb_machine.h instead +#endif + +#include +#include + +#include // sched_yield + +inline int32_t __TBB_machine_cmpswp4 (volatile void *ptr, int32_t value, int32_t comparand ) +{ + int32_t result; + + __asm__ __volatile__("sync\n" + "0: lwarx %0,0,%2\n\t" /* load w/ reservation */ + "cmpw %0,%4\n\t" /* compare against comparand */ + "bne- 1f\n\t" /* exit if not same */ + "stwcx. %3,0,%2\n\t" /* store new_value */ + "bne- 0b\n" /* retry if reservation lost */ + "1: sync" /* the exit */ + : "=&r"(result), "=m"(* (int32_t*) ptr) + : "r"(ptr), "r"(value), "r"(comparand), "m"(* (int32_t*) ptr) + : "cr0"); + return result; +} + +inline int64_t __TBB_machine_cmpswp8 (volatile void *ptr, int64_t value, int64_t comparand ) +{ + int64_t result; + __asm__ __volatile__("sync\n" + "0: ldarx %0,0,%2\n\t" /* load w/ reservation */ + "cmpd %0,%4\n\t" /* compare against comparand */ + "bne- 1f\n\t" /* exit if not same */ + "stdcx. 
%3,0,%2\n\t" /* store new_value */ + "bne- 0b\n" /* retry if reservation lost */ + "1: sync" /* the exit */ + : "=&b"(result), "=m"(* (int64_t*) ptr) + : "r"(ptr), "r"(value), "r"(comparand), "m"(* (int64_t*) ptr) + : "cr0"); + return result; +} + +#define __TBB_BIG_ENDIAN 1 + +#if defined(powerpc64) || defined(__powerpc64__) || defined(__ppc64__) +#define __TBB_WORDSIZE 8 +#define __TBB_CompareAndSwapW(P,V,C) __TBB_machine_cmpswp8(P,V,C) +#else +#define __TBB_WORDSIZE 4 +#define __TBB_CompareAndSwapW(P,V,C) __TBB_machine_cmpswp4(P,V,C) +#endif + +#define __TBB_CompareAndSwap4(P,V,C) __TBB_machine_cmpswp4(P,V,C) +#define __TBB_CompareAndSwap8(P,V,C) __TBB_machine_cmpswp8(P,V,C) +#define __TBB_Yield() sched_yield() +#define __TBB_rel_acq_fence() __asm__ __volatile__("lwsync": : :"memory") +#define __TBB_release_consistency_helper() __TBB_rel_acq_fence() diff --git a/dep/tbb/include/tbb/machine/windows_ia32.h b/dep/tbb/include/tbb/machine/windows_ia32.h new file mode 100644 index 000000000..69c961a24 --- /dev/null +++ b/dep/tbb/include/tbb/machine/windows_ia32.h @@ -0,0 +1,242 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
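The compare-and-swap primitives declared in the machine headers above (for example __TBB_machine_cmpswp4 on PowerPC, built from lwarx/stwcx. with a retry when the reservation is lost) all follow the same contract: they return the value previously stored at the location, and the swap took effect only if that value equals the comparand. A minimal sketch of how a read-modify-write loop is typically layered on such a primitive; the helper name and the retry loop are illustrative only, not part of the patch:

static inline int32_t example_fetch_and_add_via_cas( volatile int32_t* location, int32_t addend ) {
    int32_t observed, prior;
    do {
        observed = *location;                                    // snapshot the current value
        prior = __TBB_machine_cmpswp4( location, observed + addend, observed );
    } while( prior != observed );                                // another thread intervened; retry
    return observed;                                             // value before the addition
}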
+*/ + +#ifndef __TBB_machine_H +#error Do not include this file directly; include tbb_machine.h instead +#endif + +#if defined(__INTEL_COMPILER) +#define __TBB_release_consistency_helper() __asm { __asm nop } +#elif _MSC_VER >= 1300 +extern "C" void _ReadWriteBarrier(); +#pragma intrinsic(_ReadWriteBarrier) +#define __TBB_release_consistency_helper() _ReadWriteBarrier() +#else +#error Unsupported compiler - need to define __TBB_release_consistency_helper to support it +#endif + +inline void __TBB_rel_acq_fence() { __asm { __asm mfence } } + +#define __TBB_WORDSIZE 4 +#define __TBB_BIG_ENDIAN 0 + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + // Workaround for overzealous compiler warnings in /Wp64 mode + #pragma warning (push) + #pragma warning (disable: 4244 4267) +#endif + +extern "C" { + __int64 __TBB_EXPORTED_FUNC __TBB_machine_cmpswp8 (volatile void *ptr, __int64 value, __int64 comparand ); + __int64 __TBB_EXPORTED_FUNC __TBB_machine_fetchadd8 (volatile void *ptr, __int64 addend ); + __int64 __TBB_EXPORTED_FUNC __TBB_machine_fetchstore8 (volatile void *ptr, __int64 value ); + void __TBB_EXPORTED_FUNC __TBB_machine_store8 (volatile void *ptr, __int64 value ); + __int64 __TBB_EXPORTED_FUNC __TBB_machine_load8 (const volatile void *ptr); +} + +template +struct __TBB_machine_load_store { + static inline T load_with_acquire(const volatile T& location) { + T to_return = location; + __TBB_release_consistency_helper(); + return to_return; + } + + static inline void store_with_release(volatile T &location, T value) { + __TBB_release_consistency_helper(); + location = value; + } +}; + +template +struct __TBB_machine_load_store { + static inline T load_with_acquire(const volatile T& location) { + return __TBB_machine_load8((volatile void *)&location); + } + + static inline void store_with_release(T &location, T value) { + __TBB_machine_store8((volatile void *)&location,(__int64)value); + } +}; + +template +inline T __TBB_machine_load_with_acquire(const volatile T &location) { + return __TBB_machine_load_store::load_with_acquire(location); +} + +template +inline void __TBB_machine_store_with_release(T& location, V value) { + __TBB_machine_load_store::store_with_release(location,value); +} + +//! Overload that exists solely to avoid /Wp64 warnings. 
+inline void __TBB_machine_store_with_release(size_t& location, size_t value) { + __TBB_machine_load_store::store_with_release(location,value); +} + +#define __TBB_load_with_acquire(L) __TBB_machine_load_with_acquire((L)) +#define __TBB_store_with_release(L,V) __TBB_machine_store_with_release((L),(V)) + +#define __TBB_DEFINE_ATOMICS(S,T,U,A,C) \ +static inline T __TBB_machine_cmpswp##S ( volatile void * ptr, U value, U comparand ) { \ + T result; \ + volatile T *p = (T *)ptr; \ + __TBB_release_consistency_helper(); \ + __asm \ + { \ + __asm mov edx, p \ + __asm mov C , value \ + __asm mov A , comparand \ + __asm lock cmpxchg [edx], C \ + __asm mov result, A \ + } \ + __TBB_release_consistency_helper(); \ + return result; \ +} \ +\ +static inline T __TBB_machine_fetchadd##S ( volatile void * ptr, U addend ) { \ + T result; \ + volatile T *p = (T *)ptr; \ + __TBB_release_consistency_helper(); \ + __asm \ + { \ + __asm mov edx, p \ + __asm mov A, addend \ + __asm lock xadd [edx], A \ + __asm mov result, A \ + } \ + __TBB_release_consistency_helper(); \ + return result; \ +}\ +\ +static inline T __TBB_machine_fetchstore##S ( volatile void * ptr, U value ) { \ + T result; \ + volatile T *p = (T *)ptr; \ + __TBB_release_consistency_helper(); \ + __asm \ + { \ + __asm mov edx, p \ + __asm mov A, value \ + __asm lock xchg [edx], A \ + __asm mov result, A \ + } \ + __TBB_release_consistency_helper(); \ + return result; \ +} + +__TBB_DEFINE_ATOMICS(1, __int8, __int8, al, cl) +__TBB_DEFINE_ATOMICS(2, __int16, __int16, ax, cx) +__TBB_DEFINE_ATOMICS(4, __int32, ptrdiff_t, eax, ecx) + +static inline __int32 __TBB_machine_lg( unsigned __int64 i ) { + unsigned __int32 j; + __asm + { + bsr eax, i + mov j, eax + } + return j; +} + +static inline void __TBB_machine_OR( volatile void *operand, __int32 addend ) { + __asm + { + mov eax, addend + mov edx, [operand] + lock or [edx], eax + } +} + +static inline void __TBB_machine_AND( volatile void *operand, __int32 addend ) { + __asm + { + mov eax, addend + mov edx, [operand] + lock and [edx], eax + } +} + +static inline void __TBB_machine_pause (__int32 delay ) { + _asm + { + mov eax, delay + L1: + pause + add eax, -1 + jne L1 + } + return; +} + +#define __TBB_CompareAndSwap1(P,V,C) __TBB_machine_cmpswp1(P,V,C) +#define __TBB_CompareAndSwap2(P,V,C) __TBB_machine_cmpswp2(P,V,C) +#define __TBB_CompareAndSwap4(P,V,C) __TBB_machine_cmpswp4(P,V,C) +#define __TBB_CompareAndSwap8(P,V,C) __TBB_machine_cmpswp8(P,V,C) +#define __TBB_CompareAndSwapW(P,V,C) __TBB_machine_cmpswp4(P,V,C) + +#define __TBB_FetchAndAdd1(P,V) __TBB_machine_fetchadd1(P,V) +#define __TBB_FetchAndAdd2(P,V) __TBB_machine_fetchadd2(P,V) +#define __TBB_FetchAndAdd4(P,V) __TBB_machine_fetchadd4(P,V) +#define __TBB_FetchAndAdd8(P,V) __TBB_machine_fetchadd8(P,V) +#define __TBB_FetchAndAddW(P,V) __TBB_machine_fetchadd4(P,V) + +#define __TBB_FetchAndStore1(P,V) __TBB_machine_fetchstore1(P,V) +#define __TBB_FetchAndStore2(P,V) __TBB_machine_fetchstore2(P,V) +#define __TBB_FetchAndStore4(P,V) __TBB_machine_fetchstore4(P,V) +#define __TBB_FetchAndStore8(P,V) __TBB_machine_fetchstore8(P,V) +#define __TBB_FetchAndStoreW(P,V) __TBB_machine_fetchstore4(P,V) + +// Should define this: +#define __TBB_Store8(P,V) __TBB_machine_store8(P,V) +#define __TBB_Load8(P) __TBB_machine_load8(P) +#define __TBB_AtomicOR(P,V) __TBB_machine_OR(P,V) +#define __TBB_AtomicAND(P,V) __TBB_machine_AND(P,V) + +// Definition of other functions +extern "C" __declspec(dllimport) int __stdcall SwitchToThread( void ); +#define __TBB_Yield() 
SwitchToThread() +#define __TBB_Pause(V) __TBB_machine_pause(V) +#define __TBB_Log2(V) __TBB_machine_lg(V) + +// Use generic definitions from tbb_machine.h +#undef __TBB_TryLockByte +#undef __TBB_LockByte + +#if defined(_MSC_VER)&&_MSC_VER<1400 + static inline void* __TBB_machine_get_current_teb () { + void* pteb; + __asm mov eax, fs:[0x18] + __asm mov pteb, eax + return pteb; + } +#endif + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + #pragma warning (pop) +#endif // warnings 4244, 4267 are back + diff --git a/dep/tbb/include/tbb/machine/windows_intel64.h b/dep/tbb/include/tbb/machine/windows_intel64.h new file mode 100644 index 000000000..a885aa46d --- /dev/null +++ b/dep/tbb/include/tbb/machine/windows_intel64.h @@ -0,0 +1,132 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
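The load_with_acquire / store_with_release helpers defined in windows_ia32.h above support the usual publication idiom: a writer stores its payload and then sets a flag with release semantics, and a reader that observes the flag through an acquire load is guaranteed to also see the payload. A minimal sketch under that assumption; the Message type and function names are illustrative only:

#include <cstddef>

struct Message {
    int    payload;
    size_t ready;      // 0 = not published, 1 = published
};

void publish( Message& m, int value ) {
    m.payload = value;                               // write the data first
    __TBB_store_with_release( m.ready, size_t(1) );  // then release-store the flag
}

bool try_consume( const Message& m, int& out ) {
    if( __TBB_load_with_acquire( m.ready ) ) {       // acquire-load the flag
        out = m.payload;                             // payload is guaranteed visible here
        return true;
    }
    return false;
}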
+*/ + +#ifndef __TBB_machine_H +#error Do not include this file directly; include tbb_machine.h instead +#endif + +#include +#if !defined(__INTEL_COMPILER) +#pragma intrinsic(_InterlockedOr64) +#pragma intrinsic(_InterlockedAnd64) +#pragma intrinsic(_InterlockedCompareExchange) +#pragma intrinsic(_InterlockedCompareExchange64) +#pragma intrinsic(_InterlockedExchangeAdd) +#pragma intrinsic(_InterlockedExchangeAdd64) +#pragma intrinsic(_InterlockedExchange) +#pragma intrinsic(_InterlockedExchange64) +#endif /* !defined(__INTEL_COMPILER) */ + +#if defined(__INTEL_COMPILER) +#define __TBB_release_consistency_helper() __asm { __asm nop } +inline void __TBB_rel_acq_fence() { __asm { __asm mfence } } +#elif _MSC_VER >= 1300 +extern "C" void _ReadWriteBarrier(); +#pragma intrinsic(_ReadWriteBarrier) +#define __TBB_release_consistency_helper() _ReadWriteBarrier() +#pragma intrinsic(_mm_mfence) +inline void __TBB_rel_acq_fence() { _mm_mfence(); } +#endif + +#define __TBB_WORDSIZE 8 +#define __TBB_BIG_ENDIAN 0 + +// ATTENTION: if you ever change argument types in machine-specific primitives, +// please take care of atomic_word<> specializations in tbb/atomic.h +extern "C" { + __int8 __TBB_EXPORTED_FUNC __TBB_machine_cmpswp1 (volatile void *ptr, __int8 value, __int8 comparand ); + __int8 __TBB_EXPORTED_FUNC __TBB_machine_fetchadd1 (volatile void *ptr, __int8 addend ); + __int8 __TBB_EXPORTED_FUNC __TBB_machine_fetchstore1 (volatile void *ptr, __int8 value ); + __int16 __TBB_EXPORTED_FUNC __TBB_machine_cmpswp2 (volatile void *ptr, __int16 value, __int16 comparand ); + __int16 __TBB_EXPORTED_FUNC __TBB_machine_fetchadd2 (volatile void *ptr, __int16 addend ); + __int16 __TBB_EXPORTED_FUNC __TBB_machine_fetchstore2 (volatile void *ptr, __int16 value ); + void __TBB_EXPORTED_FUNC __TBB_machine_pause (__int32 delay ); +} + + +#if !__INTEL_COMPILER +extern "C" unsigned char _BitScanReverse64( unsigned long* i, unsigned __int64 w ); +#pragma intrinsic(_BitScanReverse64) +#endif + +inline __int64 __TBB_machine_lg( unsigned __int64 i ) { +#if __INTEL_COMPILER + unsigned __int64 j; + __asm + { + bsr rax, i + mov j, rax + } +#else + unsigned long j; + _BitScanReverse64( &j, i ); +#endif + return j; +} + +inline void __TBB_machine_OR( volatile void *operand, intptr_t addend ) { + _InterlockedOr64((__int64*)operand, addend); +} + +inline void __TBB_machine_AND( volatile void *operand, intptr_t addend ) { + _InterlockedAnd64((__int64*)operand, addend); +} + +#define __TBB_CompareAndSwap1(P,V,C) __TBB_machine_cmpswp1(P,V,C) +#define __TBB_CompareAndSwap2(P,V,C) __TBB_machine_cmpswp2(P,V,C) +#define __TBB_CompareAndSwap4(P,V,C) _InterlockedCompareExchange( (long*) P , V , C ) +#define __TBB_CompareAndSwap8(P,V,C) _InterlockedCompareExchange64( (__int64*) P , V , C ) +#define __TBB_CompareAndSwapW(P,V,C) _InterlockedCompareExchange64( (__int64*) P , V , C ) + +#define __TBB_FetchAndAdd1(P,V) __TBB_machine_fetchadd1(P,V) +#define __TBB_FetchAndAdd2(P,V) __TBB_machine_fetchadd2(P,V) +#define __TBB_FetchAndAdd4(P,V) _InterlockedExchangeAdd((long*) P , V ) +#define __TBB_FetchAndAdd8(P,V) _InterlockedExchangeAdd64((__int64*) P , V ) +#define __TBB_FetchAndAddW(P,V) _InterlockedExchangeAdd64((__int64*) P , V ) + +#define __TBB_FetchAndStore1(P,V) __TBB_machine_fetchstore1(P,V) +#define __TBB_FetchAndStore2(P,V) __TBB_machine_fetchstore2(P,V) +#define __TBB_FetchAndStore4(P,V) _InterlockedExchange((long*) P , V ) +#define __TBB_FetchAndStore8(P,V) _InterlockedExchange64((__int64*) P , V ) +#define __TBB_FetchAndStoreW(P,V) 
_InterlockedExchange64((__int64*) P , V ) + +// Not used if wordsize == 8 +#undef __TBB_Store8 +#undef __TBB_Load8 + +#define __TBB_AtomicOR(P,V) __TBB_machine_OR(P,V) +#define __TBB_AtomicAND(P,V) __TBB_machine_AND(P,V) + +extern "C" __declspec(dllimport) int __stdcall SwitchToThread( void ); +#define __TBB_Yield() SwitchToThread() +#define __TBB_Pause(V) __TBB_machine_pause(V) +#define __TBB_Log2(V) __TBB_machine_lg(V) + +// Use generic definitions from tbb_machine.h +#undef __TBB_TryLockByte +#undef __TBB_LockByte diff --git a/dep/tbb/include/tbb/mutex.h b/dep/tbb/include/tbb/mutex.h new file mode 100644 index 000000000..a14735f8b --- /dev/null +++ b/dep/tbb/include/tbb/mutex.h @@ -0,0 +1,236 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_mutex_H +#define __TBB_mutex_H + +#if _WIN32||_WIN64 +#include +#if !defined(_WIN32_WINNT) +// The following Windows API function is declared explicitly; +// otherwise any user would have to specify /D_WIN32_WINNT=0x0400 +extern "C" BOOL WINAPI TryEnterCriticalSection( LPCRITICAL_SECTION ); +#endif + +#else /* if not _WIN32||_WIN64 */ +#include +namespace tbb { namespace internal { +// Use this internal TBB function to throw an exception +extern void handle_perror( int error_code, const char* what ); +} } //namespaces +#endif /* _WIN32||_WIN64 */ + +#include +#include "aligned_space.h" +#include "tbb_stddef.h" +#include "tbb_profiling.h" + +namespace tbb { + +//! Wrapper around the platform's native reader-writer lock. +/** For testing purposes only. + @ingroup synchronization */ +class mutex { +public: + //! Construct unacquired mutex. 
+ mutex() { +#if TBB_USE_ASSERT || TBB_USE_THREADING_TOOLS + internal_construct(); +#else + #if _WIN32||_WIN64 + InitializeCriticalSection(&impl); + #else + int error_code = pthread_mutex_init(&impl,NULL); + if( error_code ) + tbb::internal::handle_perror(error_code,"mutex: pthread_mutex_init failed"); + #endif /* _WIN32||_WIN64*/ +#endif /* TBB_USE_ASSERT */ + }; + + ~mutex() { +#if TBB_USE_ASSERT + internal_destroy(); +#else + #if _WIN32||_WIN64 + DeleteCriticalSection(&impl); + #else + pthread_mutex_destroy(&impl); + + #endif /* _WIN32||_WIN64 */ +#endif /* TBB_USE_ASSERT */ + }; + + class scoped_lock; + friend class scoped_lock; + + //! The scoped locking pattern + /** It helps to avoid the common problem of forgetting to release lock. + It also nicely provides the "node" for queuing locks. */ + class scoped_lock : internal::no_copy { + public: + //! Construct lock that has not acquired a mutex. + scoped_lock() : my_mutex(NULL) {}; + + //! Acquire lock on given mutex. + /** Upon entry, *this should not be in the "have acquired a mutex" state. */ + scoped_lock( mutex& mutex ) { + acquire( mutex ); + } + + //! Release lock (if lock is held). + ~scoped_lock() { + if( my_mutex ) + release(); + } + + //! Acquire lock on given mutex. + void acquire( mutex& mutex ) { +#if TBB_USE_ASSERT + internal_acquire(mutex); +#else + mutex.lock(); + my_mutex = &mutex; +#endif /* TBB_USE_ASSERT */ + } + + //! Try acquire lock on given mutex. + bool try_acquire( mutex& mutex ) { +#if TBB_USE_ASSERT + return internal_try_acquire (mutex); +#else + bool result = mutex.try_lock(); + if( result ) + my_mutex = &mutex; + return result; +#endif /* TBB_USE_ASSERT */ + } + + //! Release lock + void release() { +#if TBB_USE_ASSERT + internal_release (); +#else + my_mutex->unlock(); + my_mutex = NULL; +#endif /* TBB_USE_ASSERT */ + } + + private: + //! The pointer to the current mutex to work + mutex* my_mutex; + + //! All checks from acquire using mutex.state were moved here + void __TBB_EXPORTED_METHOD internal_acquire( mutex& m ); + + //! All checks from try_acquire using mutex.state were moved here + bool __TBB_EXPORTED_METHOD internal_try_acquire( mutex& m ); + + //! All checks from release using mutex.state were moved here + void __TBB_EXPORTED_METHOD internal_release(); + + friend class mutex; + }; + + // Mutex traits + static const bool is_rw_mutex = false; + static const bool is_recursive_mutex = false; + static const bool is_fair_mutex = false; + + // ISO C++0x compatibility methods + + //! Acquire lock + void lock() { +#if TBB_USE_ASSERT + aligned_space tmp; + new(tmp.begin()) scoped_lock(*this); +#else + #if _WIN32||_WIN64 + EnterCriticalSection(&impl); + #else + pthread_mutex_lock(&impl); + #endif /* _WIN32||_WIN64 */ +#endif /* TBB_USE_ASSERT */ + } + + //! Try acquiring lock (non-blocking) + /** Return true if lock acquired; false otherwise. */ + bool try_lock() { +#if TBB_USE_ASSERT + aligned_space tmp; + scoped_lock& s = *tmp.begin(); + s.my_mutex = NULL; + return s.internal_try_acquire(*this); +#else + #if _WIN32||_WIN64 + return TryEnterCriticalSection(&impl)!=0; + #else + return pthread_mutex_trylock(&impl)==0; + #endif /* _WIN32||_WIN64 */ +#endif /* TBB_USE_ASSERT */ + } + + //! 
Release lock + void unlock() { +#if TBB_USE_ASSERT + aligned_space tmp; + scoped_lock& s = *tmp.begin(); + s.my_mutex = this; + s.internal_release(); +#else + #if _WIN32||_WIN64 + LeaveCriticalSection(&impl); + #else + pthread_mutex_unlock(&impl); + #endif /* _WIN32||_WIN64 */ +#endif /* TBB_USE_ASSERT */ + } + +private: +#if _WIN32||_WIN64 + CRITICAL_SECTION impl; + enum state_t { + INITIALIZED=0x1234, + DESTROYED=0x789A, + HELD=0x56CD + } state; +#else + pthread_mutex_t impl; +#endif /* _WIN32||_WIN64 */ + + //! All checks from mutex constructor using mutex.state were moved here + void __TBB_EXPORTED_METHOD internal_construct(); + + //! All checks from mutex destructor using mutex.state were moved here + void __TBB_EXPORTED_METHOD internal_destroy(); +}; + +__TBB_DEFINE_PROFILING_SET_NAME(mutex) + +} // namespace tbb + +#endif /* __TBB_mutex_H */ diff --git a/dep/tbb/include/tbb/null_mutex.h b/dep/tbb/include/tbb/null_mutex.h new file mode 100644 index 000000000..6cf8dc8cf --- /dev/null +++ b/dep/tbb/include/tbb/null_mutex.h @@ -0,0 +1,63 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_null_mutex_H +#define __TBB_null_mutex_H + +namespace tbb { + +//! A mutex which does nothing +/** A null_mutex does no operation and simulates success. + @ingroup synchronization */ +class null_mutex { + //! Deny assignment and copy construction + null_mutex( const null_mutex& ); + void operator=( const null_mutex& ); +public: + //! Represents acquisition of a mutex. + class scoped_lock { + public: + scoped_lock() {} + scoped_lock( null_mutex& ) {} + ~scoped_lock() {} + void acquire( null_mutex& ) {} + bool try_acquire( null_mutex& ) { return true; } + void release() {} + }; + + null_mutex() {} + + // Mutex traits + static const bool is_rw_mutex = false; + static const bool is_recursive_mutex = true; + static const bool is_fair_mutex = true; +}; + +} + +#endif /* __TBB_null_mutex_H */ diff --git a/dep/tbb/include/tbb/null_rw_mutex.h b/dep/tbb/include/tbb/null_rw_mutex.h new file mode 100644 index 000000000..6be42e184 --- /dev/null +++ b/dep/tbb/include/tbb/null_rw_mutex.h @@ -0,0 +1,65 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. 
+ + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_null_rw_mutex_H +#define __TBB_null_rw_mutex_H + +namespace tbb { + +//! A rw mutex which does nothing +/** A null_rw_mutex is a rw mutex that does nothing and simulates successful operation. + @ingroup synchronization */ +class null_rw_mutex { + //! Deny assignment and copy construction + null_rw_mutex( const null_rw_mutex& ); + void operator=( const null_rw_mutex& ); +public: + //! Represents acquisition of a mutex. + class scoped_lock { + public: + scoped_lock() {} + scoped_lock( null_rw_mutex& , bool = true ) {} + ~scoped_lock() {} + void acquire( null_rw_mutex& , bool = true ) {} + bool upgrade_to_writer() { return true; } + bool downgrade_to_reader() { return true; } + bool try_acquire( null_rw_mutex& , bool = true ) { return true; } + void release() {} + }; + + null_rw_mutex() {} + + // Mutex traits + static const bool is_rw_mutex = true; + static const bool is_recursive_mutex = true; + static const bool is_fair_mutex = true; +}; + +} + +#endif /* __TBB_null_rw_mutex_H */ diff --git a/dep/tbb/include/tbb/parallel_do.h b/dep/tbb/include/tbb/parallel_do.h new file mode 100644 index 000000000..922c9684a --- /dev/null +++ b/dep/tbb/include/tbb/parallel_do.h @@ -0,0 +1,508 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
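The mutex wrappers added above share one usage pattern: a scoped_lock acquires the mutex in its constructor and releases it in its destructor, so the lock cannot be leaked on early return or exception, while null_mutex and null_rw_mutex expose the same interface with no effect. A minimal usage sketch; the Counter type is illustrative only, not part of the patch:

#include "tbb/mutex.h"
#include "tbb/null_mutex.h"

// Locking policy as a template parameter: tbb::mutex for thread-safe use,
// tbb::null_mutex to compile the locking away in single-threaded builds.
template<typename Mutex>
class Counter {
    Mutex my_mutex;
    long  my_value;
public:
    Counter() : my_value(0) {}
    void add( long x ) {
        typename Mutex::scoped_lock lock( my_mutex );   // released automatically at scope exit
        my_value += x;
    }
    long value() {
        typename Mutex::scoped_lock lock( my_mutex );
        return my_value;
    }
};

Counter<tbb::mutex>      shared_counter;    // protected by the platform's native mutex
Counter<tbb::null_mutex> private_counter;   // no locking overhead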
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_parallel_do_H +#define __TBB_parallel_do_H + +#include "task.h" +#include "aligned_space.h" +#include + +namespace tbb { + +//! @cond INTERNAL +namespace internal { + template class parallel_do_feeder_impl; + template class do_group_task; + + //! Strips its template type argument from 'cv' and '&' qualifiers + template + struct strip { typedef T type; }; + template + struct strip { typedef T type; }; + template + struct strip { typedef T type; }; + template + struct strip { typedef T type; }; + template + struct strip { typedef T type; }; + // Most of the compilers remove cv-qualifiers from non-reference function argument types. + // But unfortunately there are those that don't. + template + struct strip { typedef T type; }; + template + struct strip { typedef T type; }; + template + struct strip { typedef T type; }; +} // namespace internal +//! @endcond + +//! Class the user supplied algorithm body uses to add new tasks +/** \param Item Work item type **/ +template +class parallel_do_feeder: internal::no_copy +{ + parallel_do_feeder() {} + virtual ~parallel_do_feeder () {} + virtual void internal_add( const Item& item ) = 0; + template friend class internal::parallel_do_feeder_impl; +public: + //! Add a work item to a running parallel_do. + void add( const Item& item ) {internal_add(item);} +}; + +//! @cond INTERNAL +namespace internal { + //! For internal use only. + /** Selects one of the two possible forms of function call member operator. + @ingroup algorithms **/ + template + class parallel_do_operator_selector + { + typedef parallel_do_feeder Feeder; + template + static void internal_call( const Body& obj, A1& arg1, A2&, void (Body::*)(CvItem) const ) { + obj(arg1); + } + template + static void internal_call( const Body& obj, A1& arg1, A2& arg2, void (Body::*)(CvItem, parallel_do_feeder&) const ) { + obj(arg1, arg2); + } + + public: + template + static void call( const Body& obj, A1& arg1, A2& arg2 ) + { + internal_call( obj, arg1, arg2, &Body::operator() ); + } + }; + + //! For internal use only. + /** Executes one iteration of a do. 
+ @ingroup algorithms */ + template + class do_iteration_task: public task + { + typedef parallel_do_feeder_impl feeder_type; + + Item my_value; + feeder_type& my_feeder; + + do_iteration_task( const Item& value, feeder_type& feeder ) : + my_value(value), my_feeder(feeder) + {} + + /*override*/ + task* execute() + { + parallel_do_operator_selector::call(*my_feeder.my_body, my_value, my_feeder); + return NULL; + } + + template friend class parallel_do_feeder_impl; + }; // class do_iteration_task + + template + class do_iteration_task_iter: public task + { + typedef parallel_do_feeder_impl feeder_type; + + Iterator my_iter; + feeder_type& my_feeder; + + do_iteration_task_iter( const Iterator& iter, feeder_type& feeder ) : + my_iter(iter), my_feeder(feeder) + {} + + /*override*/ + task* execute() + { + parallel_do_operator_selector::call(*my_feeder.my_body, *my_iter, my_feeder); + return NULL; + } + + template friend class do_group_task_forward; + template friend class do_group_task_input; + template friend class do_task_iter; + }; // class do_iteration_task_iter + + //! For internal use only. + /** Implements new task adding procedure. + @ingroup algorithms **/ + template + class parallel_do_feeder_impl : public parallel_do_feeder + { + /*override*/ + void internal_add( const Item& item ) + { + typedef do_iteration_task iteration_type; + + iteration_type& t = *new (task::self().allocate_additional_child_of(*my_barrier)) iteration_type(item, *this); + + t.spawn( t ); + } + public: + const Body* my_body; + empty_task* my_barrier; + + parallel_do_feeder_impl() + { + my_barrier = new( task::allocate_root() ) empty_task(); + __TBB_ASSERT(my_barrier, "root task allocation failed"); + } + +#if __TBB_EXCEPTIONS + parallel_do_feeder_impl(tbb::task_group_context &context) + { + my_barrier = new( task::allocate_root(context) ) empty_task(); + __TBB_ASSERT(my_barrier, "root task allocation failed"); + } +#endif + + ~parallel_do_feeder_impl() + { + my_barrier->destroy(*my_barrier); + } + }; // class parallel_do_feeder_impl + + + //! For internal use only + /** Unpacks a block of iterations. 
+ @ingroup algorithms */ + + template + class do_group_task_forward: public task + { + static const size_t max_arg_size = 4; + + typedef parallel_do_feeder_impl feeder_type; + + feeder_type& my_feeder; + Iterator my_first; + size_t my_size; + + do_group_task_forward( Iterator first, size_t size, feeder_type& feeder ) + : my_feeder(feeder), my_first(first), my_size(size) + {} + + /*override*/ task* execute() + { + typedef do_iteration_task_iter iteration_type; + __TBB_ASSERT( my_size>0, NULL ); + task_list list; + task* t; + size_t k=0; + for(;;) { + t = new( allocate_child() ) iteration_type( my_first, my_feeder ); + ++my_first; + if( ++k==my_size ) break; + list.push_back(*t); + } + set_ref_count(int(k+1)); + spawn(list); + spawn_and_wait_for_all(*t); + return NULL; + } + + template friend class do_task_iter; + }; // class do_group_task_forward + + template + class do_group_task_input: public task + { + static const size_t max_arg_size = 4; + + typedef parallel_do_feeder_impl feeder_type; + + feeder_type& my_feeder; + size_t my_size; + aligned_space my_arg; + + do_group_task_input( feeder_type& feeder ) + : my_feeder(feeder), my_size(0) + {} + + /*override*/ task* execute() + { + typedef do_iteration_task_iter iteration_type; + __TBB_ASSERT( my_size>0, NULL ); + task_list list; + task* t; + size_t k=0; + for(;;) { + t = new( allocate_child() ) iteration_type( my_arg.begin() + k, my_feeder ); + if( ++k==my_size ) break; + list.push_back(*t); + } + set_ref_count(int(k+1)); + spawn(list); + spawn_and_wait_for_all(*t); + return NULL; + } + + ~do_group_task_input(){ + for( size_t k=0; k~Item(); + } + + template friend class do_task_iter; + }; // class do_group_task_input + + //! For internal use only. + /** Gets block of iterations and packages them into a do_group_task. + @ingroup algorithms */ + template + class do_task_iter: public task + { + typedef parallel_do_feeder_impl feeder_type; + + public: + do_task_iter( Iterator first, Iterator last , feeder_type& feeder ) : + my_first(first), my_last(last), my_feeder(feeder) + {} + + private: + Iterator my_first; + Iterator my_last; + feeder_type& my_feeder; + + /* Do not merge run(xxx) and run_xxx() methods. They are separated in order + to make sure that compilers will eliminate unused argument of type xxx + (that is will not put it on stack). The sole purpose of this argument + is overload resolution. + + An alternative could be using template functions, but explicit specialization + of member function templates is not supported for non specialized class + templates. Besides template functions would always fall back to the least + efficient variant (the one for input iterators) in case of iterators having + custom tags derived from basic ones. */ + /*override*/ task* execute() + { + typedef typename std::iterator_traits::iterator_category iterator_tag; + return run( (iterator_tag*)NULL ); + } + + /** This is the most restricted variant that operates on input iterators or + iterators with unknown tags (tags not derived from the standard ones). 
**/ + inline task* run( void* ) { return run_for_input_iterator(); } + + task* run_for_input_iterator() { + typedef do_group_task_input block_type; + + block_type& t = *new( allocate_additional_child_of(*my_feeder.my_barrier) ) block_type(my_feeder); + size_t k=0; + while( !(my_first == my_last) ) { + new (t.my_arg.begin() + k) Item(*my_first); + ++my_first; + if( ++k==block_type::max_arg_size ) { + if ( !(my_first == my_last) ) + recycle_to_reexecute(); + break; + } + } + if( k==0 ) { + destroy(t); + return NULL; + } else { + t.my_size = k; + return &t; + } + } + + inline task* run( std::forward_iterator_tag* ) { return run_for_forward_iterator(); } + + task* run_for_forward_iterator() { + typedef do_group_task_forward block_type; + + Iterator first = my_first; + size_t k=0; + while( !(my_first==my_last) ) { + ++my_first; + if( ++k==block_type::max_arg_size ) { + if ( !(my_first==my_last) ) + recycle_to_reexecute(); + break; + } + } + return k==0 ? NULL : new( allocate_additional_child_of(*my_feeder.my_barrier) ) block_type(first, k, my_feeder); + } + + inline task* run( std::random_access_iterator_tag* ) { return run_for_random_access_iterator(); } + + task* run_for_random_access_iterator() { + typedef do_group_task_forward block_type; + typedef do_iteration_task_iter iteration_type; + + size_t k = static_cast(my_last-my_first); + if( k > block_type::max_arg_size ) { + Iterator middle = my_first + k/2; + + empty_task& c = *new( allocate_continuation() ) empty_task; + do_task_iter& b = *new( c.allocate_child() ) do_task_iter(middle, my_last, my_feeder); + recycle_as_child_of(c); + + my_last = middle; + c.set_ref_count(2); + c.spawn(b); + return this; + }else if( k != 0 ) { + task_list list; + task* t; + size_t k1=0; + for(;;) { + t = new( allocate_child() ) iteration_type(my_first, my_feeder); + ++my_first; + if( ++k1==k ) break; + list.push_back(*t); + } + set_ref_count(int(k+1)); + spawn(list); + spawn_and_wait_for_all(*t); + } + return NULL; + } + }; // class do_task_iter + + //! For internal use only. + /** Implements parallel iteration over a range. + @ingroup algorithms */ + template + void run_parallel_do( Iterator first, Iterator last, const Body& body +#if __TBB_EXCEPTIONS + , task_group_context& context +#endif + ) + { + typedef do_task_iter root_iteration_task; +#if __TBB_EXCEPTIONS + parallel_do_feeder_impl feeder(context); +#else + parallel_do_feeder_impl feeder; +#endif + feeder.my_body = &body; + + root_iteration_task &t = *new( feeder.my_barrier->allocate_child() ) root_iteration_task(first, last, feeder); + + feeder.my_barrier->set_ref_count(2); + feeder.my_barrier->spawn_and_wait_for_all(t); + } + + //! For internal use only. + /** Detects types of Body's operator function arguments. + @ingroup algorithms **/ + template + void select_parallel_do( Iterator first, Iterator last, const Body& body, void (Body::*)(Item) const +#if __TBB_EXCEPTIONS + , task_group_context& context +#endif // __TBB_EXCEPTIONS + ) + { + run_parallel_do::type>( first, last, body +#if __TBB_EXCEPTIONS + , context +#endif // __TBB_EXCEPTIONS + ); + } + + //! For internal use only. + /** Detects types of Body's operator function arguments. 
+ @ingroup algorithms **/ + template + void select_parallel_do( Iterator first, Iterator last, const Body& body, void (Body::*)(Item, parallel_do_feeder<_Item>&) const +#if __TBB_EXCEPTIONS + , task_group_context& context +#endif // __TBB_EXCEPTIONS + ) + { + run_parallel_do::type>( first, last, body +#if __TBB_EXCEPTIONS + , context +#endif // __TBB_EXCEPTIONS + ); + } + +} // namespace internal +//! @endcond + + +/** \page parallel_do_body_req Requirements on parallel_do body + Class \c Body implementing the concept of parallel_do body must define: + - \code + B::operator()( + cv_item_type item, + parallel_do_feeder& feeder + ) const + + OR + + B::operator()( cv_item_type& item ) const + \endcode Process item. + May be invoked concurrently for the same \c this but different \c item. + + - \code item_type( const item_type& ) \endcode + Copy a work item. + - \code ~item_type() \endcode Destroy a work item +**/ + +/** \name parallel_do + See also requirements on \ref parallel_do_body_req "parallel_do Body". **/ +//@{ +//! Parallel iteration over a range, with optional addition of more work. +/** @ingroup algorithms */ +template +void parallel_do( Iterator first, Iterator last, const Body& body ) +{ + if ( first == last ) + return; +#if __TBB_EXCEPTIONS + task_group_context context; +#endif // __TBB_EXCEPTIONS + internal::select_parallel_do( first, last, body, &Body::operator() +#if __TBB_EXCEPTIONS + , context +#endif // __TBB_EXCEPTIONS + ); +} + +#if __TBB_EXCEPTIONS +//! Parallel iteration over a range, with optional addition of more work and user-supplied context +/** @ingroup algorithms */ +template +void parallel_do( Iterator first, Iterator last, const Body& body, task_group_context& context ) +{ + if ( first == last ) + return; + internal::select_parallel_do( first, last, body, &Body::operator(), context ); +} +#endif // __TBB_EXCEPTIONS + +//@} + +} // namespace + +#endif /* __TBB_parallel_do_H */ diff --git a/dep/tbb/include/tbb/parallel_for.h b/dep/tbb/include/tbb/parallel_for.h new file mode 100644 index 000000000..8d103e027 --- /dev/null +++ b/dep/tbb/include/tbb/parallel_for.h @@ -0,0 +1,242 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
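parallel_do, as documented above, accepts any body whose const operator() takes either the item alone or the item plus a parallel_do_feeder through which additional work can be queued while the loop is running. A minimal sketch of the feeder form; the item type and the splitting rule are illustrative assumptions:

#include "tbb/parallel_do.h"
#include <vector>

struct ProcessValue {
    // May be invoked concurrently for different items; items queued through the
    // feeder are processed by the same parallel_do call before it returns.
    void operator()( int value, tbb::parallel_do_feeder<int>& feeder ) const {
        if( value > 1 )
            feeder.add( value / 2 );    // illustrative: spawn a smaller follow-up item
        // ... process 'value' ...
    }
};

void run( const std::vector<int>& roots ) {
    tbb::parallel_do( roots.begin(), roots.end(), ProcessValue() );
}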
+*/ + +#ifndef __TBB_parallel_for_H +#define __TBB_parallel_for_H + +#include "task.h" +#include "partitioner.h" +#include "blocked_range.h" +#include +#include // std::invalid_argument +#include // std::invalid_argument text + +namespace tbb { + +//! @cond INTERNAL +namespace internal { + + //! Task type used in parallel_for + /** @ingroup algorithms */ + template + class start_for: public task { + Range my_range; + const Body my_body; + typename Partitioner::partition_type my_partition; + /*override*/ task* execute(); + + //! Constructor for root task. + start_for( const Range& range, const Body& body, Partitioner& partitioner ) : + my_range(range), + my_body(body), + my_partition(partitioner) + { + } + //! Splitting constructor used to generate children. + /** this becomes left child. Newly constructed object is right child. */ + start_for( start_for& parent, split ) : + my_range(parent.my_range,split()), + my_body(parent.my_body), + my_partition(parent.my_partition,split()) + { + my_partition.set_affinity(*this); + } + //! Update affinity info, if any. + /*override*/ void note_affinity( affinity_id id ) { + my_partition.note_affinity( id ); + } + public: + static void run( const Range& range, const Body& body, const Partitioner& partitioner ) { + if( !range.empty() ) { +#if !__TBB_EXCEPTIONS || TBB_JOIN_OUTER_TASK_GROUP + start_for& a = *new(task::allocate_root()) start_for(range,body,const_cast(partitioner)); +#else + // Bound context prevents exceptions from body to affect nesting or sibling algorithms, + // and allows users to handle exceptions safely by wrapping parallel_for in the try-block. + task_group_context context; + start_for& a = *new(task::allocate_root(context)) start_for(range,body,const_cast(partitioner)); +#endif /* __TBB_EXCEPTIONS && !TBB_JOIN_OUTER_TASK_GROUP */ + task::spawn_root_and_wait(a); + } + } +#if __TBB_EXCEPTIONS + static void run( const Range& range, const Body& body, const Partitioner& partitioner, task_group_context& context ) { + if( !range.empty() ) { + start_for& a = *new(task::allocate_root(context)) start_for(range,body,const_cast(partitioner)); + task::spawn_root_and_wait(a); + } + } +#endif /* __TBB_EXCEPTIONS */ + }; + + template + task* start_for::execute() { + if( !my_range.is_divisible() || my_partition.should_execute_range(*this) ) { + my_body( my_range ); + return my_partition.continue_after_execute_range(*this); + } else { + empty_task& c = *new( this->allocate_continuation() ) empty_task; + recycle_as_child_of(c); + c.set_ref_count(2); + bool delay = my_partition.decide_whether_to_delay(); + start_for& b = *new( c.allocate_child() ) start_for(*this,split()); + my_partition.spawn_or_delay(delay,*this,b); + return this; + } + } +} // namespace internal +//! @endcond + + +// Requirements on Range concept are documented in blocked_range.h + +/** \page parallel_for_body_req Requirements on parallel_for body + Class \c Body implementing the concept of parallel_for body must define: + - \code Body::Body( const Body& ); \endcode Copy constructor + - \code Body::~Body(); \endcode Destructor + - \code void Body::operator()( Range& r ) const; \endcode Function call operator applying the body to range \c r. +**/ + +/** \name parallel_for + See also requirements on \ref range_req "Range" and \ref parallel_for_body_req "parallel_for Body". **/ +//@{ + +//! Parallel iteration over range with default partitioner. 
+/** @ingroup algorithms **/ +template +void parallel_for( const Range& range, const Body& body ) { + internal::start_for::run(range,body,__TBB_DEFAULT_PARTITIONER()); +} + +//! Parallel iteration over range with simple partitioner. +/** @ingroup algorithms **/ +template +void parallel_for( const Range& range, const Body& body, const simple_partitioner& partitioner ) { + internal::start_for::run(range,body,partitioner); +} + +//! Parallel iteration over range with auto_partitioner. +/** @ingroup algorithms **/ +template +void parallel_for( const Range& range, const Body& body, const auto_partitioner& partitioner ) { + internal::start_for::run(range,body,partitioner); +} + +//! Parallel iteration over range with affinity_partitioner. +/** @ingroup algorithms **/ +template +void parallel_for( const Range& range, const Body& body, affinity_partitioner& partitioner ) { + internal::start_for::run(range,body,partitioner); +} + +#if __TBB_EXCEPTIONS +//! Parallel iteration over range with simple partitioner and user-supplied context. +/** @ingroup algorithms **/ +template +void parallel_for( const Range& range, const Body& body, const simple_partitioner& partitioner, task_group_context& context ) { + internal::start_for::run(range, body, partitioner, context); +} + +//! Parallel iteration over range with auto_partitioner and user-supplied context. +/** @ingroup algorithms **/ +template +void parallel_for( const Range& range, const Body& body, const auto_partitioner& partitioner, task_group_context& context ) { + internal::start_for::run(range, body, partitioner, context); +} + +//! Parallel iteration over range with affinity_partitioner and user-supplied context. +/** @ingroup algorithms **/ +template +void parallel_for( const Range& range, const Body& body, affinity_partitioner& partitioner, task_group_context& context ) { + internal::start_for::run(range,body,partitioner, context); +} +#endif /* __TBB_EXCEPTIONS */ +//@} + +//! @cond INTERNAL +namespace internal { + //! Calls the function with values from range [begin, end) with a step provided +template +class parallel_for_body : internal::no_assign { + const Function &my_func; + const Index my_begin; + const Index my_step; +public: + parallel_for_body( const Function& _func, Index& _begin, Index& _step) + : my_func(_func), my_begin(_begin), my_step(_step) {} + + void operator()( tbb::blocked_range& r ) const { + for( Index i = r.begin(), k = my_begin + i * my_step; i < r.end(); i++, k = k + my_step) + my_func( k ); + } +}; +} // namespace internal +//! @endcond + +namespace strict_ppl { + +//@{ +//! Parallel iteration over a range of integers with a step provided +template +void parallel_for(Index first, Index last, Index step, const Function& f) { + tbb::task_group_context context; + parallel_for(first, last, step, f, context); +} +template +void parallel_for(Index first, Index last, Index step, const Function& f, tbb::task_group_context &context) { + if (step <= 0 ) throw std::invalid_argument("step should be positive"); + + if (last > first) { + Index end = (last - first) / step; + if (first + end * step < last) end++; + tbb::blocked_range range(static_cast(0), end); + internal::parallel_for_body body(f, first, step); + tbb::parallel_for(range, body, tbb::auto_partitioner(), context); + } +} +//! 
Parallel iteration over a range of integers with a default step value +template +void parallel_for(Index first, Index last, const Function& f) { + tbb::task_group_context context; + parallel_for(first, last, static_cast(1), f, context); +} +template +void parallel_for(Index first, Index last, const Function& f, tbb::task_group_context &context) { + parallel_for(first, last, static_cast(1), f, context); +} + +//@} + +} // namespace strict_ppl + +using strict_ppl::parallel_for; + +} // namespace tbb + +#endif /* __TBB_parallel_for_H */ + diff --git a/dep/tbb/include/tbb/parallel_for_each.h b/dep/tbb/include/tbb/parallel_for_each.h new file mode 100644 index 000000000..fa67b6cbc --- /dev/null +++ b/dep/tbb/include/tbb/parallel_for_each.h @@ -0,0 +1,79 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_parallel_for_each_H +#define __TBB_parallel_for_each_H + +#include "parallel_do.h" + +namespace tbb { + +//! @cond INTERNAL +namespace internal { + // The class calls user function in operator() + template + class parallel_for_each_body : internal::no_assign { + Function &my_func; + public: + parallel_for_each_body(Function &_func) : my_func(_func) {} + parallel_for_each_body(const parallel_for_each_body &_caller) : my_func(_caller.my_func) {} + + void operator() ( typename std::iterator_traits::value_type value ) const { + my_func(value); + } + }; +} // namespace internal +//! @endcond + +/** \name parallel_for_each + **/ +//@{ +//! Calls function f for all items from [first, last) interval using user-supplied context +/** @ingroup algorithms */ +template +Function parallel_for_each(InputIterator first, InputIterator last, Function f, task_group_context &context) { + internal::parallel_for_each_body body(f); + + tbb::parallel_do (first, last, body, context); + return f; +} + +//! 
Uses default context +template +Function parallel_for_each(InputIterator first, InputIterator last, Function f) { + internal::parallel_for_each_body body(f); + + tbb::parallel_do (first, last, body); + return f; +} + +//@} + +} // namespace + +#endif /* __TBB_parallel_for_each_H */ diff --git a/dep/tbb/include/tbb/parallel_invoke.h b/dep/tbb/include/tbb/parallel_invoke.h new file mode 100644 index 000000000..fb425c676 --- /dev/null +++ b/dep/tbb/include/tbb/parallel_invoke.h @@ -0,0 +1,333 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_parallel_invoke_H +#define __TBB_parallel_invoke_H + +#include "task.h" + +namespace tbb { + +//! 
@cond INTERNAL +namespace internal { + // Simple task object, executing user method + template + class function_invoker : public task{ + public: + function_invoker(function& _function) : my_function(_function) {} + private: + function &my_function; + /*override*/ + task* execute() + { + my_function(); + return NULL; + } + }; + + // The class spawns two or three child tasks + template + class spawner : public task { + private: + function1& my_func1; + function2& my_func2; + function3& my_func3; + bool is_recycled; + + task* execute (){ + if(is_recycled){ + return NULL; + }else{ + __TBB_ASSERT(N==2 || N==3, "Number of arguments passed to spawner is wrong"); + set_ref_count(N); + recycle_as_safe_continuation(); + internal::function_invoker* invoker2 = new (allocate_child()) internal::function_invoker(my_func2); + __TBB_ASSERT(invoker2, "Child task allocation failed"); + spawn(*invoker2); + size_t n = N; // To prevent compiler warnings + if (n>2) { + internal::function_invoker* invoker3 = new (allocate_child()) internal::function_invoker(my_func3); + __TBB_ASSERT(invoker3, "Child task allocation failed"); + spawn(*invoker3); + } + my_func1(); + is_recycled = true; + return NULL; + } + } // execute + + public: + spawner(function1& _func1, function2& _func2, function3& _func3) : my_func1(_func1), my_func2(_func2), my_func3(_func3), is_recycled(false) {} + }; + + // Creates and spawns child tasks + class parallel_invoke_helper : public empty_task { + public: + // Dummy functor class + class parallel_invoke_noop { + public: + void operator() () const {} + }; + // Creates a helper object with user-defined number of children expected + parallel_invoke_helper(int number_of_children) + { + set_ref_count(number_of_children + 1); + } + // Adds child task and spawns it + template + void add_child (function &_func) + { + internal::function_invoker* invoker = new (allocate_child()) internal::function_invoker(_func); + __TBB_ASSERT(invoker, "Child task allocation failed"); + spawn(*invoker); + } + + // Adds a task with multiple child tasks and spawns it + // two arguments + template + void add_children (function1& _func1, function2& _func2) + { + // The third argument is dummy, it is ignored actually. + parallel_invoke_noop noop; + internal::spawner<2, function1, function2, parallel_invoke_noop>& sub_root = *new(allocate_child())internal::spawner<2, function1, function2, parallel_invoke_noop>(_func1, _func2, noop); + spawn(sub_root); + } + // three arguments + template + void add_children (function1& _func1, function2& _func2, function3& _func3) + { + internal::spawner<3, function1, function2, function3>& sub_root = *new(allocate_child())internal::spawner<3, function1, function2, function3>(_func1, _func2, _func3); + spawn(sub_root); + } + + // Waits for all child tasks + template + void run_and_finish(F0& f0) + { + internal::function_invoker* invoker = new (allocate_child()) internal::function_invoker(f0); + __TBB_ASSERT(invoker, "Child task allocation failed"); + spawn_and_wait_for_all(*invoker); + } + }; + // The class destroys root if exception occured as well as in normal case + class parallel_invoke_cleaner: internal::no_copy { + public: + parallel_invoke_cleaner(int number_of_children, tbb::task_group_context& context) : root(*new(task::allocate_root(context)) internal::parallel_invoke_helper(number_of_children)) + {} + ~parallel_invoke_cleaner(){ + root.destroy(root); + } + internal::parallel_invoke_helper& root; + }; +} // namespace internal +//! 
@endcond + +/** \name parallel_invoke + **/ +//@{ +//! Executes a list of tasks in parallel and waits for all tasks to complete. +/** @ingroup algorithms */ + +// parallel_invoke with user-defined context +// two arguments +template +void parallel_invoke(F0 f0, F1 f1, tbb::task_group_context& context) { + internal::parallel_invoke_cleaner cleaner(2, context); + internal::parallel_invoke_helper& root = cleaner.root; + + root.add_child(f1); + + root.run_and_finish(f0); +} + +// three arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, tbb::task_group_context& context) { + internal::parallel_invoke_cleaner cleaner(3, context); + internal::parallel_invoke_helper& root = cleaner.root; + + root.add_child(f2); + root.add_child(f1); + + root.run_and_finish(f0); +} + +// four arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, tbb::task_group_context& context) { + internal::parallel_invoke_cleaner cleaner(4, context); + internal::parallel_invoke_helper& root = cleaner.root; + + root.add_child(f3); + root.add_child(f2); + root.add_child(f1); + + root.run_and_finish(f0); +} + +// five arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, tbb::task_group_context& context) { + internal::parallel_invoke_cleaner cleaner(3, context); + internal::parallel_invoke_helper& root = cleaner.root; + + root.add_children(f4, f3); + root.add_children(f2, f1); + + root.run_and_finish(f0); +} + +// six arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5, tbb::task_group_context& context) { + internal::parallel_invoke_cleaner cleaner(3, context); + internal::parallel_invoke_helper& root = cleaner.root; + + root.add_children(f5, f4, f3); + root.add_children(f2, f1); + + root.run_and_finish(f0); +} + +// seven arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5, F6 f6, tbb::task_group_context& context) { + internal::parallel_invoke_cleaner cleaner(3, context); + internal::parallel_invoke_helper& root = cleaner.root; + + root.add_children(f6, f5, f4); + root.add_children(f3, f2, f1); + + root.run_and_finish(f0); +} + +// eight arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5, F6 f6, F7 f7, tbb::task_group_context& context) { + internal::parallel_invoke_cleaner cleaner(4, context); + internal::parallel_invoke_helper& root = cleaner.root; + + root.add_children(f7, f6, f5); + root.add_children(f4, f3); + root.add_children(f2, f1); + + root.run_and_finish(f0); +} + +// nine arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5, F6 f6, F7 f7, F8 f8, tbb::task_group_context& context) { + internal::parallel_invoke_cleaner cleaner(4, context); + internal::parallel_invoke_helper& root = cleaner.root; + + root.add_children(f8, f7, f6); + root.add_children(f5, f4, f3); + root.add_children(f2, f1); + + root.run_and_finish(f0); +} + +// ten arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5, F6 f6, F7 f7, F8 f8, F9 f9, tbb::task_group_context& context) { + internal::parallel_invoke_cleaner cleaner(4, context); + internal::parallel_invoke_helper& root = cleaner.root; + + root.add_children(f9, f8, f7); + root.add_children(f6, f5, f4); + root.add_children(f3, f2, f1); + + root.run_and_finish(f0); +} + +// two arguments +template +void parallel_invoke(F0 f0, F1 f1) { + task_group_context context; + parallel_invoke(f0, f1, context); +} +// three arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2) { + task_group_context 
context; + parallel_invoke(f0, f1, f2, context); +} +// four arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3) { + task_group_context context; + parallel_invoke(f0, f1, f2, f3, context); +} +// five arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4) { + task_group_context context; + parallel_invoke(f0, f1, f2, f3, f4, context); +} +// six arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5) { + task_group_context context; + parallel_invoke(f0, f1, f2, f3, f4, f5, context); +} +// seven arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5, F6 f6) { + task_group_context context; + parallel_invoke(f0, f1, f2, f3, f4, f5, f6, context); +} +// eigth arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5, F6 f6, F7 f7) { + task_group_context context; + parallel_invoke(f0, f1, f2, f3, f4, f5, f6, f7, context); +} +// nine arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5, F6 f6, F7 f7, F8 f8) { + task_group_context context; + parallel_invoke(f0, f1, f2, f3, f4, f5, f6, f7, f8, context); +} +// ten arguments +template +void parallel_invoke(F0 f0, F1 f1, F2 f2, F3 f3, F4 f4, F5 f5, F6 f6, F7 f7, F8 f8, F9 f9) { + task_group_context context; + parallel_invoke(f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, context); +} + +//@} + +} // namespace + +#endif /* __TBB_parallel_invoke_H */ diff --git a/dep/tbb/include/tbb/parallel_reduce.h b/dep/tbb/include/tbb/parallel_reduce.h new file mode 100644 index 000000000..030017394 --- /dev/null +++ b/dep/tbb/include/tbb/parallel_reduce.h @@ -0,0 +1,387 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_parallel_reduce_H +#define __TBB_parallel_reduce_H + +#include "task.h" +#include "aligned_space.h" +#include "partitioner.h" +#include + +namespace tbb { + +//! @cond INTERNAL +namespace internal { + + //! ITT instrumented routine that stores src into location pointed to by dst. + void __TBB_EXPORTED_FUNC itt_store_pointer_with_release_v3( void* dst, void* src ); + + //! ITT instrumented routine that loads pointer from location pointed to by src. 
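A brief sketch of the parallel_invoke overloads defined above; the three function names are placeholders for any nullary callables.

#include "tbb/parallel_invoke.h"

// Illustrative nullary functions; function objects work the same way.
void load_textures() { /* ... */ }
void load_sounds()   { /* ... */ }
void load_scripts()  { /* ... */ }

void load_assets() {
    // Runs the three calls potentially in parallel and returns when all complete.
    tbb::parallel_invoke( load_textures, load_sounds, load_scripts );
}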
+ void* __TBB_EXPORTED_FUNC itt_load_pointer_with_acquire_v3( const void* src ); + + template inline void parallel_reduce_store_body( T*& dst, T* src ) { +#if TBB_USE_THREADING_TOOLS + itt_store_pointer_with_release_v3(&dst,src); +#else + __TBB_store_with_release(dst,src); +#endif /* TBB_USE_THREADING_TOOLS */ + } + + template inline T* parallel_reduce_load_body( T*& src ) { +#if TBB_USE_THREADING_TOOLS + return static_cast(itt_load_pointer_with_acquire_v3(&src)); +#else + return __TBB_load_with_acquire(src); +#endif /* TBB_USE_THREADING_TOOLS */ + } + + //! 0 if root, 1 if a left child, 2 if a right child. + /** Represented as a char, not enum, for compactness. */ + typedef char reduction_context; + + //! Task type use to combine the partial results of parallel_reduce with affinity_partitioner. + /** @ingroup algorithms */ + template + class finish_reduce: public task { + //! Pointer to body, or NULL if the left child has not yet finished. + Body* my_body; + bool has_right_zombie; + const reduction_context my_context; + aligned_space zombie_space; + finish_reduce( char context ) : + my_body(NULL), + has_right_zombie(false), + my_context(context) + { + } + task* execute() { + if( has_right_zombie ) { + // Right child was stolen. + Body* s = zombie_space.begin(); + my_body->join( *s ); + s->~Body(); + } + if( my_context==1 ) + parallel_reduce_store_body( static_cast(parent())->my_body, my_body ); + return NULL; + } + template + friend class start_reduce; + }; + + //! Task type used to split the work of parallel_reduce with affinity_partitioner. + /** @ingroup algorithms */ + template + class start_reduce: public task { + typedef finish_reduce finish_type; + Body* my_body; + Range my_range; + typename Partitioner::partition_type my_partition; + reduction_context my_context; + /*override*/ task* execute(); + template + friend class finish_reduce; + + //! Constructor used for root task + start_reduce( const Range& range, Body* body, Partitioner& partitioner ) : + my_body(body), + my_range(range), + my_partition(partitioner), + my_context(0) + { + } + //! Splitting constructor used to generate children. + /** this becomes left child. Newly constructed object is right child. */ + start_reduce( start_reduce& parent, split ) : + my_body(parent.my_body), + my_range(parent.my_range,split()), + my_partition(parent.my_partition,split()), + my_context(2) + { + my_partition.set_affinity(*this); + parent.my_context = 1; + } + //! Update affinity info, if any + /*override*/ void note_affinity( affinity_id id ) { + my_partition.note_affinity( id ); + } + +public: + static void run( const Range& range, Body& body, Partitioner& partitioner ) { + if( !range.empty() ) { +#if !__TBB_EXCEPTIONS || TBB_JOIN_OUTER_TASK_GROUP + task::spawn_root_and_wait( *new(task::allocate_root()) start_reduce(range,&body,partitioner) ); +#else + // Bound context prevents exceptions from body to affect nesting or sibling algorithms, + // and allows users to handle exceptions safely by wrapping parallel_for in the try-block. 
+ task_group_context context; + task::spawn_root_and_wait( *new(task::allocate_root(context)) start_reduce(range,&body,partitioner) ); +#endif /* __TBB_EXCEPTIONS && !TBB_JOIN_OUTER_TASK_GROUP */ + } + } +#if __TBB_EXCEPTIONS + static void run( const Range& range, Body& body, Partitioner& partitioner, task_group_context& context ) { + if( !range.empty() ) + task::spawn_root_and_wait( *new(task::allocate_root(context)) start_reduce(range,&body,partitioner) ); + } +#endif /* __TBB_EXCEPTIONS */ + }; + + template + task* start_reduce::execute() { + if( my_context==2 ) { + finish_type* p = static_cast(parent() ); + if( !parallel_reduce_load_body(p->my_body) ) { + my_body = new( p->zombie_space.begin() ) Body(*my_body,split()); + p->has_right_zombie = true; + } + } + if( !my_range.is_divisible() || my_partition.should_execute_range(*this) ) { + (*my_body)( my_range ); + if( my_context==1 ) + parallel_reduce_store_body(static_cast(parent())->my_body, my_body ); + return my_partition.continue_after_execute_range(*this); + } else { + finish_type& c = *new( allocate_continuation()) finish_type(my_context); + recycle_as_child_of(c); + c.set_ref_count(2); + bool delay = my_partition.decide_whether_to_delay(); + start_reduce& b = *new( c.allocate_child() ) start_reduce(*this,split()); + my_partition.spawn_or_delay(delay,*this,b); + return this; + } + } + + //! Auxiliary class for parallel_reduce; for internal use only. + /** The adaptor class that implements \ref parallel_reduce_body_req "parallel_reduce Body" + using given \ref parallel_reduce_lambda_req "anonymous function objects". + **/ + /** @ingroup algorithms */ + template + class lambda_reduce_body { + +//FIXME: decide if my_real_body, my_reduction, and identity_element should be copied or referenced +// (might require some performance measurements) + + const Value& identity_element; + const RealBody& my_real_body; + const Reduction& my_reduction; + Value my_value; + lambda_reduce_body& operator= ( const lambda_reduce_body& other ); + public: + lambda_reduce_body( const Value& identity, const RealBody& body, const Reduction& reduction ) + : identity_element(identity) + , my_real_body(body) + , my_reduction(reduction) + , my_value(identity) + { } + lambda_reduce_body( const lambda_reduce_body& other ) + : identity_element(other.identity_element) + , my_real_body(other.my_real_body) + , my_reduction(other.my_reduction) + , my_value(other.my_value) + { } + lambda_reduce_body( lambda_reduce_body& other, tbb::split ) + : identity_element(other.identity_element) + , my_real_body(other.my_real_body) + , my_reduction(other.my_reduction) + , my_value(other.identity_element) + { } + void operator()(Range& range) { + my_value = my_real_body(range, const_cast(my_value)); + } + void join( lambda_reduce_body& rhs ) { + my_value = my_reduction(const_cast(my_value), const_cast(rhs.my_value)); + } + Value result() const { + return my_value; + } + }; + +} // namespace internal +//! @endcond + +// Requirements on Range concept are documented in blocked_range.h + +/** \page parallel_reduce_body_req Requirements on parallel_reduce body + Class \c Body implementing the concept of parallel_reduce body must define: + - \code Body::Body( Body&, split ); \endcode Splitting constructor. 
+ Must be able to run concurrently with operator() and method \c join + - \code Body::~Body(); \endcode Destructor + - \code void Body::operator()( Range& r ); \endcode Function call operator applying body to range \c r + and accumulating the result + - \code void Body::join( Body& b ); \endcode Join results. + The result in \c b should be merged into the result of \c this +**/ + +/** \page parallel_reduce_lambda_req Requirements on parallel_reduce anonymous function objects (lambda functions) + TO BE DOCUMENTED +**/ + +/** \name parallel_reduce + See also requirements on \ref range_req "Range" and \ref parallel_reduce_body_req "parallel_reduce Body". **/ +//@{ + +//! Parallel iteration with reduction and default partitioner. +/** @ingroup algorithms **/ +template +void parallel_reduce( const Range& range, Body& body ) { + internal::start_reduce::run( range, body, __TBB_DEFAULT_PARTITIONER() ); +} + +//! Parallel iteration with reduction and simple_partitioner +/** @ingroup algorithms **/ +template +void parallel_reduce( const Range& range, Body& body, const simple_partitioner& partitioner ) { + internal::start_reduce::run( range, body, partitioner ); +} + +//! Parallel iteration with reduction and auto_partitioner +/** @ingroup algorithms **/ +template +void parallel_reduce( const Range& range, Body& body, const auto_partitioner& partitioner ) { + internal::start_reduce::run( range, body, partitioner ); +} + +//! Parallel iteration with reduction and affinity_partitioner +/** @ingroup algorithms **/ +template +void parallel_reduce( const Range& range, Body& body, affinity_partitioner& partitioner ) { + internal::start_reduce::run( range, body, partitioner ); +} + +#if __TBB_EXCEPTIONS +//! Parallel iteration with reduction, simple partitioner and user-supplied context. +/** @ingroup algorithms **/ +template +void parallel_reduce( const Range& range, Body& body, const simple_partitioner& partitioner, task_group_context& context ) { + internal::start_reduce::run( range, body, partitioner, context ); +} + +//! Parallel iteration with reduction, auto_partitioner and user-supplied context +/** @ingroup algorithms **/ +template +void parallel_reduce( const Range& range, Body& body, const auto_partitioner& partitioner, task_group_context& context ) { + internal::start_reduce::run( range, body, partitioner, context ); +} + +//! Parallel iteration with reduction, affinity_partitioner and user-supplied context +/** @ingroup algorithms **/ +template +void parallel_reduce( const Range& range, Body& body, affinity_partitioner& partitioner, task_group_context& context ) { + internal::start_reduce::run( range, body, partitioner, context ); +} +#endif /* __TBB_EXCEPTIONS */ + +/** parallel_reduce overloads that work with anonymous function objects + (see also \ref parallel_reduce_lambda_req "requirements on parallel_reduce anonymous function objects"). **/ + +//! Parallel iteration with reduction and default partitioner. +/** @ingroup algorithms **/ +template +Value parallel_reduce( const Range& range, const Value& identity, const RealBody& real_body, const Reduction& reduction ) { + internal::lambda_reduce_body body(identity, real_body, reduction); + internal::start_reduce,const __TBB_DEFAULT_PARTITIONER> + ::run(range, body, __TBB_DEFAULT_PARTITIONER() ); + return body.result(); +} + +//! Parallel iteration with reduction and simple_partitioner. 
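To make the parallel_reduce Body requirements above concrete, a minimal summation sketch; the names and the float payload are invented for the example.

#include "tbb/parallel_reduce.h"
#include "tbb/blocked_range.h"

// Hypothetical body: splitting constructor, an operator() that accumulates, and join().
struct SumBody {
    const float* my_data;
    float my_sum;
    SumBody( const float* data ) : my_data(data), my_sum(0.0f) {}
    SumBody( SumBody& other, tbb::split ) : my_data(other.my_data), my_sum(0.0f) {}
    void operator()( const tbb::blocked_range<size_t>& r ) {
        for( size_t i = r.begin(); i != r.end(); ++i )
            my_sum += my_data[i];
    }
    void join( SumBody& rhs ) { my_sum += rhs.my_sum; }
};

float parallel_sum( const float* data, size_t n ) {
    SumBody body( data );
    tbb::parallel_reduce( tbb::blocked_range<size_t>(0, n), body );
    return body.my_sum;
}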
+/** @ingroup algorithms **/ +template +Value parallel_reduce( const Range& range, const Value& identity, const RealBody& real_body, const Reduction& reduction, + const simple_partitioner& partitioner ) { + internal::lambda_reduce_body body(identity, real_body, reduction); + internal::start_reduce,const simple_partitioner> + ::run(range, body, partitioner ); + return body.result(); +} + +//! Parallel iteration with reduction and auto_partitioner +/** @ingroup algorithms **/ +template +Value parallel_reduce( const Range& range, const Value& identity, const RealBody& real_body, const Reduction& reduction, + const auto_partitioner& partitioner ) { + internal::lambda_reduce_body body(identity, real_body, reduction); + internal::start_reduce,const auto_partitioner> + ::run( range, body, partitioner ); + return body.result(); +} + +//! Parallel iteration with reduction and affinity_partitioner +/** @ingroup algorithms **/ +template +Value parallel_reduce( const Range& range, const Value& identity, const RealBody& real_body, const Reduction& reduction, + affinity_partitioner& partitioner ) { + internal::lambda_reduce_body body(identity, real_body, reduction); + internal::start_reduce,affinity_partitioner> + ::run( range, body, partitioner ); + return body.result(); +} + +#if __TBB_EXCEPTIONS +//! Parallel iteration with reduction, simple partitioner and user-supplied context. +/** @ingroup algorithms **/ +template +Value parallel_reduce( const Range& range, const Value& identity, const RealBody& real_body, const Reduction& reduction, + const simple_partitioner& partitioner, task_group_context& context ) { + internal::lambda_reduce_body body(identity, real_body, reduction); + internal::start_reduce,const simple_partitioner> + ::run( range, body, partitioner, context ); + return body.result(); +} + +//! Parallel iteration with reduction, auto_partitioner and user-supplied context +/** @ingroup algorithms **/ +template +Value parallel_reduce( const Range& range, const Value& identity, const RealBody& real_body, const Reduction& reduction, + const auto_partitioner& partitioner, task_group_context& context ) { + internal::lambda_reduce_body body(identity, real_body, reduction); + internal::start_reduce,const auto_partitioner> + ::run( range, body, partitioner, context ); + return body.result(); +} + +//! Parallel iteration with reduction, affinity_partitioner and user-supplied context +/** @ingroup algorithms **/ +template +Value parallel_reduce( const Range& range, const Value& identity, const RealBody& real_body, const Reduction& reduction, + affinity_partitioner& partitioner, task_group_context& context ) { + internal::lambda_reduce_body body(identity, real_body, reduction); + internal::start_reduce,affinity_partitioner> + ::run( range, body, partitioner, context ); + return body.result(); +} +#endif /* __TBB_EXCEPTIONS */ +//@} + +} // namespace tbb + +#endif /* __TBB_parallel_reduce_H */ + diff --git a/dep/tbb/include/tbb/parallel_scan.h b/dep/tbb/include/tbb/parallel_scan.h new file mode 100644 index 000000000..1369bf733 --- /dev/null +++ b/dep/tbb/include/tbb/parallel_scan.h @@ -0,0 +1,351 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. 
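The overloads above also accept separate identity, range-body, and reduction arguments; a hedged sketch of the same summation expressed in that functional form (the functor names are invented).

#include "tbb/parallel_reduce.h"
#include "tbb/blocked_range.h"

// Range body: receives the running value for its subrange and returns the updated value.
struct SumRange {
    const float* data;
    SumRange( const float* d ) : data(d) {}
    float operator()( const tbb::blocked_range<size_t>& r, float running ) const {
        for( size_t i = r.begin(); i != r.end(); ++i )
            running += data[i];
        return running;
    }
};
// Reduction: combines two partial results.
struct SumJoin {
    float operator()( float x, float y ) const { return x + y; }
};

float parallel_sum_functional( const float* data, size_t n ) {
    return tbb::parallel_reduce( tbb::blocked_range<size_t>(0, n),
                                 0.0f, SumRange(data), SumJoin() );
}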
+ + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_parallel_scan_H +#define __TBB_parallel_scan_H + +#include "task.h" +#include "aligned_space.h" +#include +#include "partitioner.h" + +namespace tbb { + +//! Used to indicate that the initial scan is being performed. +/** @ingroup algorithms */ +struct pre_scan_tag { + static bool is_final_scan() {return false;} +}; + +//! Used to indicate that the final scan is being performed. +/** @ingroup algorithms */ +struct final_scan_tag { + static bool is_final_scan() {return true;} +}; + +//! @cond INTERNAL +namespace internal { + + //! Performs final scan for a leaf + /** @ingroup algorithms */ + template + class final_sum: public task { + public: + Body body; + private: + aligned_space range; + //! Where to put result of last subrange, or NULL if not last subrange. + Body* stuff_last; + public: + final_sum( Body& body_ ) : + body(body_,split()) + { + poison_pointer(stuff_last); + } + ~final_sum() { + range.begin()->~Range(); + } + void finish_construction( const Range& range_, Body* stuff_last_ ) { + new( range.begin() ) Range(range_); + stuff_last = stuff_last_; + } + private: + /*override*/ task* execute() { + body( *range.begin(), final_scan_tag() ); + if( stuff_last ) + stuff_last->assign(body); + return NULL; + } + }; + + //! Split work to be done in the scan. + /** @ingroup algorithms */ + template + class sum_node: public task { + typedef final_sum final_sum_type; + public: + final_sum_type *incoming; + final_sum_type *body; + Body *stuff_last; + private: + final_sum_type *left_sum; + sum_node *left; + sum_node *right; + bool left_is_final; + Range range; + sum_node( const Range range_, bool left_is_final_ ) : + left_sum(NULL), + left(NULL), + right(NULL), + left_is_final(left_is_final_), + range(range_) + { + // Poison fields that will be set by second pass. + poison_pointer(body); + poison_pointer(incoming); + } + task* create_child( const Range& range, final_sum_type& f, sum_node* n, final_sum_type* incoming, Body* stuff_last ) { + if( !n ) { + f.recycle_as_child_of( *this ); + f.finish_construction( range, stuff_last ); + return &f; + } else { + n->body = &f; + n->incoming = incoming; + n->stuff_last = stuff_last; + return n; + } + } + /*override*/ task* execute() { + if( body ) { + if( incoming ) + left_sum->body.reverse_join( incoming->body ); + recycle_as_continuation(); + sum_node& c = *this; + task* b = c.create_child(Range(range,split()),*left_sum,right,left_sum,stuff_last); + task* a = left_is_final ? 
NULL : c.create_child(range,*body,left,incoming,NULL); + set_ref_count( (a!=NULL)+(b!=NULL) ); + body = NULL; + if( a ) spawn(*b); + else a = b; + return a; + } else { + return NULL; + } + } + template + friend class start_scan; + + template + friend class finish_scan; + }; + + //! Combine partial results + /** @ingroup algorithms */ + template + class finish_scan: public task { + typedef sum_node sum_node_type; + typedef final_sum final_sum_type; + final_sum_type** const sum; + sum_node_type*& return_slot; + public: + final_sum_type* right_zombie; + sum_node_type& result; + + /*override*/ task* execute() { + __TBB_ASSERT( result.ref_count()==(result.left!=NULL)+(result.right!=NULL), NULL ); + if( result.left ) + result.left_is_final = false; + if( right_zombie && sum ) + ((*sum)->body).reverse_join(result.left_sum->body); + __TBB_ASSERT( !return_slot, NULL ); + if( right_zombie || result.right ) { + return_slot = &result; + } else { + destroy( result ); + } + if( right_zombie && !sum && !result.right ) destroy(*right_zombie); + return NULL; + } + + finish_scan( sum_node_type*& return_slot_, final_sum_type** sum_, sum_node_type& result_ ) : + sum(sum_), + return_slot(return_slot_), + right_zombie(NULL), + result(result_) + { + __TBB_ASSERT( !return_slot, NULL ); + } + }; + + //! Initial task to split the work + /** @ingroup algorithms */ + template + class start_scan: public task { + typedef sum_node sum_node_type; + typedef final_sum final_sum_type; + final_sum_type* body; + /** Non-null if caller is requesting total. */ + final_sum_type** sum; + sum_node_type** return_slot; + /** Null if computing root. */ + sum_node_type* parent_sum; + bool is_final; + bool is_right_child; + Range range; + typename Partitioner::partition_type partition; + /*override*/ task* execute(); + public: + start_scan( sum_node_type*& return_slot_, start_scan& parent, sum_node_type* parent_sum_ ) : + body(parent.body), + sum(parent.sum), + return_slot(&return_slot_), + parent_sum(parent_sum_), + is_final(parent.is_final), + is_right_child(false), + range(parent.range,split()), + partition(parent.partition,split()) + { + __TBB_ASSERT( !*return_slot, NULL ); + } + + start_scan( sum_node_type*& return_slot_, const Range& range_, final_sum_type& body_, const Partitioner& partitioner_) : + body(&body_), + sum(NULL), + return_slot(&return_slot_), + parent_sum(NULL), + is_final(true), + is_right_child(false), + range(range_), + partition(partitioner_) + { + __TBB_ASSERT( !*return_slot, NULL ); + } + + static void run( const Range& range, Body& body, const Partitioner& partitioner ) { + if( !range.empty() ) { + typedef internal::start_scan start_pass1_type; + internal::sum_node* root = NULL; + typedef internal::final_sum final_sum_type; + final_sum_type* temp_body = new(task::allocate_root()) final_sum_type( body ); + start_pass1_type& pass1 = *new(task::allocate_root()) start_pass1_type( + /*return_slot=*/root, + range, + *temp_body, + partitioner ); + task::spawn_root_and_wait( pass1 ); + if( root ) { + root->body = temp_body; + root->incoming = NULL; + root->stuff_last = &body; + task::spawn_root_and_wait( *root ); + } else { + body.assign(temp_body->body); + temp_body->finish_construction( range, NULL ); + temp_body->destroy(*temp_body); + } + } + } + }; + + template + task* start_scan::execute() { + typedef internal::finish_scan finish_pass1_type; + finish_pass1_type* p = parent_sum ? static_cast( parent() ) : NULL; + // Inspecting p->result.left_sum would ordinarily be a race condition. 
+ // But we inspect it only if we are not a stolen task, in which case we + // know that task assigning to p->result.left_sum has completed. + bool treat_as_stolen = is_right_child && (is_stolen_task() || body!=p->result.left_sum); + if( treat_as_stolen ) { + // Invocation is for right child that has been really stolen or needs to be virtually stolen + p->right_zombie = body = new( allocate_root() ) final_sum_type(body->body); + is_final = false; + } + task* next_task = NULL; + if( (is_right_child && !treat_as_stolen) || !range.is_divisible() || partition.should_execute_range(*this) ) { + if( is_final ) + (body->body)( range, final_scan_tag() ); + else if( sum ) + (body->body)( range, pre_scan_tag() ); + if( sum ) + *sum = body; + __TBB_ASSERT( !*return_slot, NULL ); + } else { + sum_node_type* result; + if( parent_sum ) + result = new(allocate_additional_child_of(*parent_sum)) sum_node_type(range,/*left_is_final=*/is_final); + else + result = new(task::allocate_root()) sum_node_type(range,/*left_is_final=*/is_final); + finish_pass1_type& c = *new( allocate_continuation()) finish_pass1_type(*return_slot,sum,*result); + // Split off right child + start_scan& b = *new( c.allocate_child() ) start_scan( /*return_slot=*/result->right, *this, result ); + b.is_right_child = true; + // Left child is recycling of *this. Must recycle this before spawning b, + // otherwise b might complete and decrement c.ref_count() to zero, which + // would cause c.execute() to run prematurely. + recycle_as_child_of(c); + c.set_ref_count(2); + c.spawn(b); + sum = &result->left_sum; + return_slot = &result->left; + is_right_child = false; + next_task = this; + parent_sum = result; + __TBB_ASSERT( !*return_slot, NULL ); + } + return next_task; + } +} // namespace internal +//! @endcond + +// Requirements on Range concept are documented in blocked_range.h + +/** \page parallel_scan_body_req Requirements on parallel_scan body + Class \c Body implementing the concept of parallel_reduce body must define: + - \code Body::Body( Body&, split ); \endcode Splitting constructor. + Split \c b so that \c this and \c b can accumulate separately + - \code Body::~Body(); \endcode Destructor + - \code void Body::operator()( const Range& r, pre_scan_tag ); \endcode + Preprocess iterations for range \c r + - \code void Body::operator()( const Range& r, final_scan_tag ); \endcode + Do final processing for iterations of range \c r + - \code void Body::reverse_join( Body& a ); \endcode + Merge preprocessing state of \c a into \c this, where \c a was + created earlier from \c b by b's splitting constructor +**/ + +/** \name parallel_scan + See also requirements on \ref range_req "Range" and \ref parallel_scan_body_req "parallel_scan Body". **/ +//@{ + +//! Parallel prefix with default partitioner +/** @ingroup algorithms **/ +template +void parallel_scan( const Range& range, Body& body ) { + internal::start_scan::run(range,body,__TBB_DEFAULT_PARTITIONER()); +} + +//! Parallel prefix with simple_partitioner +/** @ingroup algorithms **/ +template +void parallel_scan( const Range& range, Body& body, const simple_partitioner& partitioner ) { + internal::start_scan::run(range,body,partitioner); +} + +//! 
Parallel prefix with auto_partitioner +/** @ingroup algorithms **/ +template +void parallel_scan( const Range& range, Body& body, const auto_partitioner& partitioner ) { + internal::start_scan::run(range,body,partitioner); +} +//@} + +} // namespace tbb + +#endif /* __TBB_parallel_scan_H */ + diff --git a/dep/tbb/include/tbb/parallel_sort.h b/dep/tbb/include/tbb/parallel_sort.h new file mode 100644 index 000000000..38b380dea --- /dev/null +++ b/dep/tbb/include/tbb/parallel_sort.h @@ -0,0 +1,227 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_parallel_sort_H +#define __TBB_parallel_sort_H + +#include "parallel_for.h" +#include "blocked_range.h" +#include +#include +#include + +namespace tbb { + +//! @cond INTERNAL +namespace internal { + +//! Range used in quicksort to split elements into subranges based on a value. +/** The split operation selects a splitter and places all elements less than or equal + to the value in the first range and the remaining elements in the second range. + @ingroup algorithms */ +template +class quick_sort_range: private no_assign { + + inline size_t median_of_three(const RandomAccessIterator &array, size_t l, size_t m, size_t r) const { + return comp(array[l], array[m]) ? ( comp(array[m], array[r]) ? m : ( comp( array[l], array[r]) ? r : l ) ) + : ( comp(array[r], array[m]) ? m : ( comp( array[r], array[l] ) ? 
r : l ) ); + } + + inline size_t pseudo_median_of_nine( const RandomAccessIterator &array, const quick_sort_range &range ) const { + size_t offset = range.size/8u; + return median_of_three(array, + median_of_three(array, 0, offset, offset*2), + median_of_three(array, offset*3, offset*4, offset*5), + median_of_three(array, offset*6, offset*7, range.size - 1) ); + + } + +public: + + static const size_t grainsize = 500; + const Compare ∁ + RandomAccessIterator begin; + size_t size; + + quick_sort_range( RandomAccessIterator begin_, size_t size_, const Compare &comp_ ) : + comp(comp_), begin(begin_), size(size_) {} + + bool empty() const {return size==0;} + bool is_divisible() const {return size>=grainsize;} + + quick_sort_range( quick_sort_range& range, split ) : comp(range.comp) { + RandomAccessIterator array = range.begin; + RandomAccessIterator key0 = range.begin; + size_t m = pseudo_median_of_nine(array, range); + if (m) std::swap ( array[0], array[m] ); + + size_t i=0; + size_t j=range.size; + // Partition interval [i+1,j-1] with key *key0. + for(;;) { + __TBB_ASSERT( i +class quick_sort_pretest_body : internal::no_assign { + const Compare ∁ + +public: + quick_sort_pretest_body(const Compare &_comp) : comp(_comp) {} + + void operator()( const blocked_range& range ) const { + task &my_task = task::self(); + RandomAccessIterator my_end = range.end(); + + int i = 0; + for (RandomAccessIterator k = range.begin(); k != my_end; ++k, ++i) { + if ( i%64 == 0 && my_task.is_cancelled() ) break; + + // The k-1 is never out-of-range because the first chunk starts at begin+serial_cutoff+1 + if ( comp( *(k), *(k-1) ) ) { + my_task.cancel_group_execution(); + break; + } + } + } + +}; + +//! Body class used to sort elements in a range that is smaller than the grainsize. +/** @ingroup algorithms */ +template +struct quick_sort_body { + void operator()( const quick_sort_range& range ) const { + //SerialQuickSort( range.begin, range.size, range.comp ); + std::sort( range.begin, range.begin + range.size, range.comp ); + } +}; + +//! Wrapper method to initiate the sort by calling parallel_for. +/** @ingroup algorithms */ +template +void parallel_quick_sort( RandomAccessIterator begin, RandomAccessIterator end, const Compare& comp ) { + task_group_context my_context; + const int serial_cutoff = 9; + + __TBB_ASSERT( begin + serial_cutoff < end, "min_parallel_size is smaller than serial cutoff?" ); + RandomAccessIterator k; + for ( k = begin ; k != begin + serial_cutoff; ++k ) { + if ( comp( *(k+1), *k ) ) { + goto do_parallel_quick_sort; + } + } + + parallel_for( blocked_range(k+1, end), + quick_sort_pretest_body(comp), + auto_partitioner(), + my_context); + + if (my_context.is_group_execution_cancelled()) +do_parallel_quick_sort: + parallel_for( quick_sort_range(begin, end-begin, comp ), + quick_sort_body(), + auto_partitioner() ); +} + +} // namespace internal +//! @endcond + +/** \page parallel_sort_iter_req Requirements on iterators for parallel_sort + Requirements on value type \c T of \c RandomAccessIterator for \c parallel_sort: + - \code void swap( T& x, T& y ) \endcode Swaps \c x and \c y + - \code bool Compare::operator()( const T& x, const T& y ) \endcode + True if x comes before y; +**/ + +/** \name parallel_sort + See also requirements on \ref parallel_sort_iter_req "iterators for parallel_sort". **/ +//@{ + +//! Sorts the data in [begin,end) using the given comparator +/** The compare function object is used for all comparisons between elements during sorting. 
+ The compare object must define a bool operator() function. + @ingroup algorithms **/ +template +void parallel_sort( RandomAccessIterator begin, RandomAccessIterator end, const Compare& comp) { + const int min_parallel_size = 500; + if( end > begin ) { + if (end - begin < min_parallel_size) { + std::sort(begin, end, comp); + } else { + internal::parallel_quick_sort(begin, end, comp); + } + } +} + +//! Sorts the data in [begin,end) with a default comparator \c std::less +/** @ingroup algorithms **/ +template +inline void parallel_sort( RandomAccessIterator begin, RandomAccessIterator end ) { + parallel_sort( begin, end, std::less< typename std::iterator_traits::value_type >() ); +} + +//! Sorts the data in the range \c [begin,end) with a default comparator \c std::less +/** @ingroup algorithms **/ +template +inline void parallel_sort( T * begin, T * end ) { + parallel_sort( begin, end, std::less< T >() ); +} +//@} + + +} // namespace tbb + +#endif + diff --git a/dep/tbb/include/tbb/parallel_while.h b/dep/tbb/include/tbb/parallel_while.h new file mode 100644 index 000000000..a4ad9e6e2 --- /dev/null +++ b/dep/tbb/include/tbb/parallel_while.h @@ -0,0 +1,194 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_parallel_while +#define __TBB_parallel_while + +#include "task.h" +#include + +namespace tbb { + +template +class parallel_while; + +//! @cond INTERNAL +namespace internal { + + template class while_task; + + //! For internal use only. + /** Executes one iteration of a while. + @ingroup algorithms */ + template + class while_iteration_task: public task { + const Body& my_body; + typename Body::argument_type my_value; + /*override*/ task* execute() { + my_body(my_value); + return NULL; + } + while_iteration_task( const typename Body::argument_type& value, const Body& body ) : + my_body(body), my_value(value) + {} + template friend class while_group_task; + friend class tbb::parallel_while; + }; + + //! For internal use only + /** Unpacks a block of iterations. 
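A short usage sketch for the parallel_sort overloads above; the comparator and the score data are illustrative.

#include "tbb/parallel_sort.h"
#include <vector>

// Any functor returning true when the first argument should come first will do.
struct DescendingOrder {
    bool operator()( float x, float y ) const { return x > y; }
};

void sort_scores( std::vector<float>& scores ) {
    // Default comparator (std::less) ...
    tbb::parallel_sort( scores.begin(), scores.end() );
    // ... or a user-supplied one; ranges below the internal cutoff fall back to std::sort.
    tbb::parallel_sort( scores.begin(), scores.end(), DescendingOrder() );
}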
+ @ingroup algorithms */ + template + class while_group_task: public task { + static const size_t max_arg_size = 4; + const Body& my_body; + size_t size; + typename Body::argument_type my_arg[max_arg_size]; + while_group_task( const Body& body ) : my_body(body), size(0) {} + /*override*/ task* execute() { + typedef while_iteration_task iteration_type; + __TBB_ASSERT( size>0, NULL ); + task_list list; + task* t; + size_t k=0; + for(;;) { + t = new( allocate_child() ) iteration_type(my_arg[k],my_body); + if( ++k==size ) break; + list.push_back(*t); + } + set_ref_count(int(k+1)); + spawn(list); + spawn_and_wait_for_all(*t); + return NULL; + } + template friend class while_task; + }; + + //! For internal use only. + /** Gets block of iterations from a stream and packages them into a while_group_task. + @ingroup algorithms */ + template + class while_task: public task { + Stream& my_stream; + const Body& my_body; + empty_task& my_barrier; + /*override*/ task* execute() { + typedef while_group_task block_type; + block_type& t = *new( allocate_additional_child_of(my_barrier) ) block_type(my_body); + size_t k=0; + while( my_stream.pop_if_present(t.my_arg[k]) ) { + if( ++k==block_type::max_arg_size ) { + // There might be more iterations. + recycle_to_reexecute(); + break; + } + } + if( k==0 ) { + destroy(t); + return NULL; + } else { + t.size = k; + return &t; + } + } + while_task( Stream& stream, const Body& body, empty_task& barrier ) : + my_stream(stream), + my_body(body), + my_barrier(barrier) + {} + friend class tbb::parallel_while; + }; + +} // namespace internal +//! @endcond + +//! Parallel iteration over a stream, with optional addition of more work. +/** The Body b has the requirement: \n + "b(v)" \n + "b.argument_type" \n + where v is an argument_type + @ingroup algorithms */ +template +class parallel_while: internal::no_copy { +public: + //! Construct empty non-running parallel while. + parallel_while() : my_body(NULL), my_barrier(NULL) {} + + //! Destructor cleans up data members before returning. + ~parallel_while() { + if( my_barrier ) { + my_barrier->destroy(*my_barrier); + my_barrier = NULL; + } + } + + //! Type of items + typedef typename Body::argument_type value_type; + + //! Apply body.apply to each item in the stream. + /** A Stream s has the requirements \n + "S::value_type" \n + "s.pop_if_present(value) is convertible to bool */ + template + void run( Stream& stream, const Body& body ); + + //! Add a work item while running. + /** Should be executed only by body.apply or a thread spawned therefrom. 
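A minimal sketch of driving parallel_while with a custom stream and body; both types are invented for the example, and any stream exposing a suitable pop_if_present works.

#include "tbb/parallel_while.h"

// Illustrative stream: pop_if_present() returns false once no items remain.
class CounterStream {
    int my_next;
    int my_limit;
public:
    CounterStream( int limit ) : my_next(0), my_limit(limit) {}
    bool pop_if_present( int& item ) {
        if( my_next >= my_limit ) return false;
        item = my_next++;
        return true;
    }
};

// Illustrative body: argument_type and operator()(argument_type) are the requirements.
class ProcessItem {
public:
    typedef int argument_type;
    void operator()( int item ) const { /* ... work on the item ... */ }
};

void process_first_n( int n ) {
    CounterStream stream( n );
    tbb::parallel_while<ProcessItem> w;
    w.run( stream, ProcessItem() );   // add() may be called from the body while running
}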
*/ + void add( const value_type& item ); + +private: + const Body* my_body; + empty_task* my_barrier; +}; + +template +template +void parallel_while::run( Stream& stream, const Body& body ) { + using namespace internal; + empty_task& barrier = *new( task::allocate_root() ) empty_task(); + my_body = &body; + my_barrier = &barrier; + my_barrier->set_ref_count(2); + while_task& w = *new( my_barrier->allocate_child() ) while_task( stream, body, barrier ); + my_barrier->spawn_and_wait_for_all(w); + my_barrier->destroy(*my_barrier); + my_barrier = NULL; + my_body = NULL; +} + +template +void parallel_while::add( const value_type& item ) { + __TBB_ASSERT(my_barrier,"attempt to add to parallel_while that is not running"); + typedef internal::while_iteration_task iteration_type; + iteration_type& i = *new( task::self().allocate_additional_child_of(*my_barrier) ) iteration_type(item,*my_body); + task::self().spawn( i ); +} + +} // namespace + +#endif /* __TBB_parallel_while */ diff --git a/dep/tbb/include/tbb/partitioner.h b/dep/tbb/include/tbb/partitioner.h new file mode 100644 index 000000000..53e2953c0 --- /dev/null +++ b/dep/tbb/include/tbb/partitioner.h @@ -0,0 +1,228 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_partitioner_H +#define __TBB_partitioner_H + +#include "task.h" + +namespace tbb { +class affinity_partitioner; + +//! @cond INTERNAL +namespace internal { +size_t __TBB_EXPORTED_FUNC get_initial_auto_partitioner_divisor(); + +//! Defines entry points into tbb run-time library; +/** The entry points are the constructor and destructor. */ +class affinity_partitioner_base_v3: no_copy { + friend class tbb::affinity_partitioner; + //! Array that remembers affinities of tree positions to affinity_id. + /** NULL if my_size==0. */ + affinity_id* my_array; + //! Number of elements in my_array. + size_t my_size; + //! Zeros the fields. + affinity_partitioner_base_v3() : my_array(NULL), my_size(0) {} + //! Deallocates my_array. + ~affinity_partitioner_base_v3() {resize(0);} + //! Resize my_array. + /** Retains values if resulting size is the same. */ + void __TBB_EXPORTED_METHOD resize( unsigned factor ); + friend class affinity_partition_type; +}; + +//! 
Provides default methods for partition objects without affinity. +class partition_type_base { +public: + void set_affinity( task & ) {} + void note_affinity( task::affinity_id ) {} + task* continue_after_execute_range( task& ) {return NULL;} + bool decide_whether_to_delay() {return false;} + void spawn_or_delay( bool, task& a, task& b ) { + a.spawn(b); + } +}; + +class affinity_partition_type; + +template class start_for; +template class start_reduce; +template class start_reduce_with_affinity; +template class start_scan; + +} // namespace internal +//! @endcond + +//! A simple partitioner +/** Divides the range until the range is not divisible. + @ingroup algorithms */ +class simple_partitioner { +public: + simple_partitioner() {} +private: + template friend class internal::start_for; + template friend class internal::start_reduce; + template friend class internal::start_scan; + + class partition_type: public internal::partition_type_base { + public: + bool should_execute_range(const task& ) {return false;} + partition_type( const simple_partitioner& ) {} + partition_type( const partition_type&, split ) {} + }; +}; + +//! An auto partitioner +/** The range is initial divided into several large chunks. + Chunks are further subdivided into VICTIM_CHUNKS pieces if they are stolen and divisible. + @ingroup algorithms */ +class auto_partitioner { +public: + auto_partitioner() {} + +private: + template friend class internal::start_for; + template friend class internal::start_reduce; + template friend class internal::start_scan; + + class partition_type: public internal::partition_type_base { + size_t num_chunks; + static const size_t VICTIM_CHUNKS = 4; +public: + bool should_execute_range(const task &t) { + if( num_chunks friend class internal::start_for; + template friend class internal::start_reduce; + template friend class internal::start_reduce_with_affinity; + template friend class internal::start_scan; + + typedef internal::affinity_partition_type partition_type; + friend class internal::affinity_partition_type; +}; + +//! @cond INTERNAL +namespace internal { + +class affinity_partition_type: public no_copy { + //! Must be power of two + static const unsigned factor = 16; + static const size_t VICTIM_CHUNKS = 4; + + internal::affinity_id* my_array; + task_list delay_list; + unsigned map_begin, map_end; + size_t num_chunks; +public: + affinity_partition_type( affinity_partitioner& ap ) { + __TBB_ASSERT( (factor&(factor-1))==0, "factor must be power of two" ); + ap.resize(factor); + my_array = ap.my_array; + map_begin = 0; + map_end = unsigned(ap.my_size); + num_chunks = internal::get_initial_auto_partitioner_divisor(); + } + affinity_partition_type(affinity_partition_type& p, split) : my_array(p.my_array) { + __TBB_ASSERT( p.map_end-p.map_beginfactor ) + d &= 0u-factor; + map_end = e; + map_begin = p.map_end = e-d; + } + + bool should_execute_range(const task &t) { + if( num_chunks < VICTIM_CHUNKS && t.is_stolen_task() ) + num_chunks = VICTIM_CHUNKS; + return num_chunks == 1; + } + + void set_affinity( task &t ) { + if( map_begin + +namespace tbb { + +class pipeline; +class filter; + +//! @cond INTERNAL +namespace internal { + +// The argument for PIPELINE_VERSION should be an integer between 2 and 9 +#define __TBB_PIPELINE_VERSION(x) (unsigned char)(x-2)<<1 + +typedef unsigned long Token; +typedef long tokendiff_t; +class stage_task; +class input_buffer; +class pipeline_root_task; +class pipeline_cleaner; + +} // namespace internal +//! @endcond + +//! A stage in a pipeline. 
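A brief sketch, not part of the imported sources, of how these partitioners are passed to the loop templates. DoubleAll and run_loops are illustrative names.

#include <cstddef>
#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"
#include "tbb/partitioner.h"

struct DoubleAll {
    float* a;
    void operator()( const tbb::blocked_range<size_t>& r ) const {
        for( size_t i = r.begin(); i != r.end(); ++i )
            a[i] *= 2.0f;
    }
};

void run_loops( float* a, size_t n ) {
    DoubleAll body; body.a = a;
    // auto_partitioner (the default since TBB 2.2) picks chunk sizes on its own.
    tbb::parallel_for( tbb::blocked_range<size_t>(0,n), body, tbb::auto_partitioner() );
    // simple_partitioner recurses down to the range's grain size (1024 here).
    tbb::parallel_for( tbb::blocked_range<size_t>(0,n,1024), body, tbb::simple_partitioner() );
    // affinity_partitioner remembers the iteration-to-thread mapping between
    // calls, so it must outlive the call and be passed by reference.
    static tbb::affinity_partitioner ap;
    tbb::parallel_for( tbb::blocked_range<size_t>(0,n), body, ap );
}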
+/** @ingroup algorithms */ +class filter: internal::no_copy { +private: + //! Value used to mark "not in pipeline" + static filter* not_in_pipeline() {return reinterpret_cast(internal::intptr(-1));} + + //! The lowest bit 0 is for parallel vs. serial + static const unsigned char filter_is_serial = 0x1; + + //! 4th bit distinguishes ordered vs unordered filters. + /** The bit was not set for parallel filters in TBB 2.1 and earlier, + but is_ordered() function always treats parallel filters as out of order. */ + static const unsigned char filter_is_out_of_order = 0x1<<4; + + //! 5th bit distinguishes thread-bound and regular filters. + static const unsigned char filter_is_bound = 0x1<<5; + + static const unsigned char current_version = __TBB_PIPELINE_VERSION(5); + static const unsigned char version_mask = 0x7<<1; // bits 1-3 are for version +public: + enum mode { + //! processes multiple items in parallel and in no particular order + parallel = current_version | filter_is_out_of_order, + //! processes items one at a time; all such filters process items in the same order + serial_in_order = current_version | filter_is_serial, + //! processes items one at a time and in no particular order + serial_out_of_order = current_version | filter_is_serial | filter_is_out_of_order, + //! @deprecated use serial_in_order instead + serial = serial_in_order + }; +protected: + filter( bool is_serial_ ) : + next_filter_in_pipeline(not_in_pipeline()), + my_input_buffer(NULL), + my_filter_mode(static_cast(is_serial_ ? serial : parallel)), + prev_filter_in_pipeline(not_in_pipeline()), + my_pipeline(NULL), + next_segment(NULL) + {} + + filter( mode filter_mode ) : + next_filter_in_pipeline(not_in_pipeline()), + my_input_buffer(NULL), + my_filter_mode(static_cast(filter_mode)), + prev_filter_in_pipeline(not_in_pipeline()), + my_pipeline(NULL), + next_segment(NULL) + {} + +public: + //! True if filter is serial. + bool is_serial() const { + return bool( my_filter_mode & filter_is_serial ); + } + + //! True if filter must receive stream in order. + bool is_ordered() const { + return (my_filter_mode & (filter_is_out_of_order|filter_is_serial))==filter_is_serial; + } + + //! True if filter is thread-bound. + bool is_bound() const { + return ( my_filter_mode & filter_is_bound )==filter_is_bound; + } + + //! Operate on an item from the input stream, and return item for output stream. + /** Returns NULL if filter is a sink. */ + virtual void* operator()( void* item ) = 0; + + //! Destroy filter. + /** If the filter was added to a pipeline, the pipeline must be destroyed first. */ + virtual __TBB_EXPORTED_METHOD ~filter(); + +#if __TBB_EXCEPTIONS + //! Destroys item if pipeline was cancelled. + /** Required to prevent memory leaks. + Note it can be called concurrently even for serial filters.*/ + virtual void finalize( void* /*item*/ ) {}; +#endif + +private: + //! Pointer to next filter in the pipeline. + filter* next_filter_in_pipeline; + + //! Buffer for incoming tokens, or NULL if not required. + /** The buffer is required if the filter is serial or follows a thread-bound one. */ + internal::input_buffer* my_input_buffer; + + friend class internal::stage_task; + friend class internal::pipeline_root_task; + friend class pipeline; + friend class thread_bound_filter; + + //! Storage for filter mode and dynamically checked implementation version. + const unsigned char my_filter_mode; + + //! Pointer to previous filter in the pipeline. + filter* prev_filter_in_pipeline; + + //! Pointer to the pipeline. 
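As a quick illustration (not part of the imported sources) of the filter modes above together with the pipeline class declared in this header: a serial input stage feeding a parallel stage. InputFilter, DoubleFilter and double_all are illustrative names.

#include "tbb/pipeline.h"

class InputFilter: public tbb::filter {
    int* my_data;
    int my_count;
    int my_next;
public:
    InputFilter( int* data, int count )
        : tbb::filter( tbb::filter::serial_in_order ),
          my_data(data), my_count(count), my_next(0) {}
    /*override*/ void* operator()( void* ) {
        if( my_next == my_count ) return NULL;   // NULL from the first stage ends the run
        return &my_data[my_next++];
    }
};

class DoubleFilter: public tbb::filter {
public:
    DoubleFilter() : tbb::filter( tbb::filter::parallel ) {}
    /*override*/ void* operator()( void* item ) {
        int* p = static_cast<int*>(item);
        *p *= 2;
        return NULL;                             // last stage: nothing to pass on
    }
};

void double_all( int* data, int count ) {
    InputFilter in( data, count );
    DoubleFilter twice;
    tbb::pipeline p;
    p.add_filter( in );
    p.add_filter( twice );
    p.run( 8 );      // at most 8 tokens in flight at once
    p.clear();       // detach the filters from the pipeline
}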
+ pipeline* my_pipeline; + + //! Pointer to the next "segment" of filters, or NULL if not required. + /** In each segment, the first filter is not thread-bound but follows a thread-bound one. */ + filter* next_segment; +}; + +//! A stage in a pipeline served by a user thread. +/** @ingroup algorithms */ +class thread_bound_filter: public filter { +public: + enum result_type { + // item was processed + success, + // item is currently not available + item_not_available, + // there are no more items to process + end_of_stream + }; +protected: + thread_bound_filter(mode filter_mode): + filter(static_cast(filter_mode | filter::filter_is_bound)) + {} +public: + //! If a data item is available, invoke operator() on that item. + /** This interface is non-blocking. + Returns 'success' if an item was processed. + Returns 'item_not_available' if no item can be processed now + but more may arrive in the future, or if token limit is reached. + Returns 'end_of_stream' if there are no more items to process. */ + result_type __TBB_EXPORTED_METHOD try_process_item(); + + //! Wait until a data item becomes available, and invoke operator() on that item. + /** This interface is blocking. + Returns 'success' if an item was processed. + Returns 'end_of_stream' if there are no more items to process. + Never returns 'item_not_available', as it blocks until another return condition applies. */ + result_type __TBB_EXPORTED_METHOD process_item(); + +private: + //! Internal routine for item processing + result_type internal_process_item(bool is_blocking); +}; + +//! A processing pipeling that applies filters to items. +/** @ingroup algorithms */ +class pipeline { +public: + //! Construct empty pipeline. + __TBB_EXPORTED_METHOD pipeline(); + + /** Though the current implementation declares the destructor virtual, do not rely on this + detail. The virtualness is deprecated and may disappear in future versions of TBB. */ + virtual __TBB_EXPORTED_METHOD ~pipeline(); + + //! Add filter to end of pipeline. + void __TBB_EXPORTED_METHOD add_filter( filter& filter_ ); + + //! Run the pipeline to completion. + void __TBB_EXPORTED_METHOD run( size_t max_number_of_live_tokens ); + +#if __TBB_EXCEPTIONS + //! Run the pipeline to completion with user-supplied context. + void __TBB_EXPORTED_METHOD run( size_t max_number_of_live_tokens, tbb::task_group_context& context ); +#endif + + //! Remove all filters from the pipeline. + void __TBB_EXPORTED_METHOD clear(); + +private: + friend class internal::stage_task; + friend class internal::pipeline_root_task; + friend class filter; + friend class thread_bound_filter; + friend class internal::pipeline_cleaner; + + //! Pointer to first filter in the pipeline. + filter* filter_list; + + //! Pointer to location where address of next filter to be added should be stored. + filter* filter_end; + + //! task who's reference count is used to determine when all stages are done. + task* end_counter; + + //! Number of idle tokens waiting for input stage. + atomic input_tokens; + + //! Global counter of tokens + atomic token_counter; + + //! False until fetch_input returns NULL. + bool end_of_input; + + //! True if the pipeline contains a thread-bound filter; false otherwise. + bool has_thread_bound_filters; + + //! Remove filter from pipeline. + void remove_filter( filter& filter_ ); + + //! Not used, but retained to satisfy old export files. + void __TBB_EXPORTED_METHOD inject_token( task& self ); + +#if __TBB_EXCEPTIONS + //! 
Does clean up if pipeline is cancelled or exception occured + void clear_filters(); +#endif +}; + +} // tbb + +#endif /* __TBB_pipeline_H */ diff --git a/dep/tbb/include/tbb/queuing_mutex.h b/dep/tbb/include/tbb/queuing_mutex.h new file mode 100644 index 000000000..a7cb71c1b --- /dev/null +++ b/dep/tbb/include/tbb/queuing_mutex.h @@ -0,0 +1,119 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_queuing_mutex_H +#define __TBB_queuing_mutex_H + +#include +#include "atomic.h" +#include "tbb_profiling.h" + +namespace tbb { + +//! Queuing lock with local-only spinning. +/** @ingroup synchronization */ +class queuing_mutex { +public: + //! Construct unacquired mutex. + queuing_mutex() { + q_tail = NULL; +#if TBB_USE_THREADING_TOOLS + internal_construct(); +#endif + } + + //! The scoped locking pattern + /** It helps to avoid the common problem of forgetting to release lock. + It also nicely provides the "node" for queuing locks. */ + class scoped_lock: internal::no_copy { + //! Initialize fields to mean "no lock held". + void initialize() { + mutex = NULL; +#if TBB_USE_ASSERT + internal::poison_pointer(next); +#endif /* TBB_USE_ASSERT */ + } + public: + //! Construct lock that has not acquired a mutex. + /** Equivalent to zero-initialization of *this. */ + scoped_lock() {initialize();} + + //! Acquire lock on given mutex. + /** Upon entry, *this should not be in the "have acquired a mutex" state. */ + scoped_lock( queuing_mutex& m ) { + initialize(); + acquire(m); + } + + //! Release lock (if lock is held). + ~scoped_lock() { + if( mutex ) release(); + } + + //! Acquire lock on given mutex. + void __TBB_EXPORTED_METHOD acquire( queuing_mutex& m ); + + //! Acquire lock on given mutex if free (i.e. non-blocking) + bool __TBB_EXPORTED_METHOD try_acquire( queuing_mutex& m ); + + //! Release lock. + void __TBB_EXPORTED_METHOD release(); + + private: + //! The pointer to the mutex owned, or NULL if not holding a mutex. + queuing_mutex* mutex; + + //! The pointer to the next competitor for a mutex + scoped_lock *next; + + //! The local spin-wait variable + /** Inverted (0 - blocked, 1 - acquired the mutex) for the sake of + zero-initialization. 
Defining it as an entire word instead of + a byte seems to help performance slightly. */ + internal::uintptr going; + }; + + void __TBB_EXPORTED_METHOD internal_construct(); + + // Mutex traits + static const bool is_rw_mutex = false; + static const bool is_recursive_mutex = false; + static const bool is_fair_mutex = true; + + friend class scoped_lock; +private: + //! The last competitor requesting the lock + atomic q_tail; + +}; + +__TBB_DEFINE_PROFILING_SET_NAME(queuing_mutex) + +} // namespace tbb + +#endif /* __TBB_queuing_mutex_H */ diff --git a/dep/tbb/include/tbb/queuing_rw_mutex.h b/dep/tbb/include/tbb/queuing_rw_mutex.h new file mode 100644 index 000000000..27456f685 --- /dev/null +++ b/dep/tbb/include/tbb/queuing_rw_mutex.h @@ -0,0 +1,161 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_queuing_rw_mutex_H +#define __TBB_queuing_rw_mutex_H + +#include +#include "atomic.h" +#include "tbb_profiling.h" + +namespace tbb { + +//! Reader-writer lock with local-only spinning. +/** Adapted from Krieger, Stumm, et al. pseudocode at + http://www.eecg.toronto.edu/parallel/pubs_abs.html#Krieger_etal_ICPP93 + @ingroup synchronization */ +class queuing_rw_mutex { +public: + //! Construct unacquired mutex. + queuing_rw_mutex() { + q_tail = NULL; +#if TBB_USE_THREADING_TOOLS + internal_construct(); +#endif + } + + //! Destructor asserts if the mutex is acquired, i.e. q_tail is non-NULL + ~queuing_rw_mutex() { +#if TBB_USE_ASSERT + __TBB_ASSERT( !q_tail, "destruction of an acquired mutex"); +#endif + } + + class scoped_lock; + friend class scoped_lock; + + //! The scoped locking pattern + /** It helps to avoid the common problem of forgetting to release lock. + It also nicely provides the "node" for queuing locks. */ + class scoped_lock: internal::no_copy { + //! Initialize fields + void initialize() { + mutex = NULL; +#if TBB_USE_ASSERT + state = 0xFF; // Set to invalid state + internal::poison_pointer(next); + internal::poison_pointer(prev); +#endif /* TBB_USE_ASSERT */ + } + public: + //! Construct lock that has not acquired a mutex. + /** Equivalent to zero-initialization of *this. */ + scoped_lock() {initialize();} + + //! Acquire lock on given mutex. 
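A small usage sketch of the scoped-lock pattern with queuing_mutex, not part of the imported sources; counter, counter_mutex and bump_counter are illustrative names.

#include "tbb/queuing_mutex.h"

long counter = 0;
tbb::queuing_mutex counter_mutex;

void bump_counter() {
    tbb::queuing_mutex::scoped_lock lock( counter_mutex );  // acquired here, FIFO order
    ++counter;
}                                                            // released by ~scoped_lock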
+ /** Upon entry, *this should not be in the "have acquired a mutex" state. */ + scoped_lock( queuing_rw_mutex& m, bool write=true ) { + initialize(); + acquire(m,write); + } + + //! Release lock (if lock is held). + ~scoped_lock() { + if( mutex ) release(); + } + + //! Acquire lock on given mutex. + void acquire( queuing_rw_mutex& m, bool write=true ); + + //! Try acquire lock on given mutex. + bool try_acquire( queuing_rw_mutex& m, bool write=true ); + + //! Release lock. + void release(); + + //! Upgrade reader to become a writer. + /** Returns true if the upgrade happened without re-acquiring the lock and false if opposite */ + bool upgrade_to_writer(); + + //! Downgrade writer to become a reader. + bool downgrade_to_reader(); + + private: + //! The pointer to the current mutex to work + queuing_rw_mutex* mutex; + + //! The pointer to the previous and next competitors for a mutex + scoped_lock * prev, * next; + + typedef unsigned char state_t; + + //! State of the request: reader, writer, active reader, other service states + atomic state; + + //! The local spin-wait variable + /** Corresponds to "spin" in the pseudocode but inverted for the sake of zero-initialization */ + unsigned char going; + + //! A tiny internal lock + unsigned char internal_lock; + + //! Acquire the internal lock + void acquire_internal_lock(); + + //! Try to acquire the internal lock + /** Returns true if lock was successfully acquired. */ + bool try_acquire_internal_lock(); + + //! Release the internal lock + void release_internal_lock(); + + //! Wait for internal lock to be released + void wait_for_release_of_internal_lock(); + + //! A helper function + void unblock_or_wait_on_internal_lock( uintptr_t ); + }; + + void __TBB_EXPORTED_METHOD internal_construct(); + + // Mutex traits + static const bool is_rw_mutex = true; + static const bool is_recursive_mutex = false; + static const bool is_fair_mutex = true; + +private: + //! The last competitor requesting the lock + atomic q_tail; + +}; + +__TBB_DEFINE_PROFILING_SET_NAME(queuing_rw_mutex) + +} // namespace tbb + +#endif /* __TBB_queuing_rw_mutex_H */ diff --git a/dep/tbb/include/tbb/recursive_mutex.h b/dep/tbb/include/tbb/recursive_mutex.h new file mode 100644 index 000000000..1b7a82539 --- /dev/null +++ b/dep/tbb/include/tbb/recursive_mutex.h @@ -0,0 +1,245 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. 
This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_recursive_mutex_H +#define __TBB_recursive_mutex_H + +#if _WIN32||_WIN64 + +#include +#if !defined(_WIN32_WINNT) +// The following Windows API function is declared explicitly; +// otherwise any user would have to specify /D_WIN32_WINNT=0x0400 +extern "C" BOOL WINAPI TryEnterCriticalSection( LPCRITICAL_SECTION ); +#endif + +#else /* if not _WIN32||_WIN64 */ + +#include +namespace tbb { namespace internal { +// Use this internal TBB function to throw an exception + extern void handle_perror( int error_code, const char* what ); +} } //namespaces + +#endif /* _WIN32||_WIN64 */ + +#include +#include "aligned_space.h" +#include "tbb_stddef.h" +#include "tbb_profiling.h" + +namespace tbb { +//! Mutex that allows recursive mutex acquisition. +/** Mutex that allows recursive mutex acquisition. + @ingroup synchronization */ +class recursive_mutex { +public: + //! Construct unacquired recursive_mutex. + recursive_mutex() { +#if TBB_USE_ASSERT || TBB_USE_THREADING_TOOLS + internal_construct(); +#else + #if _WIN32||_WIN64 + InitializeCriticalSection(&impl); + #else + pthread_mutexattr_t mtx_attr; + int error_code = pthread_mutexattr_init( &mtx_attr ); + if( error_code ) + tbb::internal::handle_perror(error_code,"recursive_mutex: pthread_mutexattr_init failed"); + + pthread_mutexattr_settype( &mtx_attr, PTHREAD_MUTEX_RECURSIVE ); + error_code = pthread_mutex_init( &impl, &mtx_attr ); + if( error_code ) + tbb::internal::handle_perror(error_code,"recursive_mutex: pthread_mutex_init failed"); + + pthread_mutexattr_destroy( &mtx_attr ); + #endif /* _WIN32||_WIN64*/ +#endif /* TBB_USE_ASSERT */ + }; + + ~recursive_mutex() { +#if TBB_USE_ASSERT + internal_destroy(); +#else + #if _WIN32||_WIN64 + DeleteCriticalSection(&impl); + #else + pthread_mutex_destroy(&impl); + + #endif /* _WIN32||_WIN64 */ +#endif /* TBB_USE_ASSERT */ + }; + + class scoped_lock; + friend class scoped_lock; + + //! The scoped locking pattern + /** It helps to avoid the common problem of forgetting to release lock. + It also nicely provides the "node" for queuing locks. */ + class scoped_lock: internal::no_copy { + public: + //! Construct lock that has not acquired a recursive_mutex. + scoped_lock() : my_mutex(NULL) {}; + + //! Acquire lock on given mutex. + scoped_lock( recursive_mutex& mutex ) { +#if TBB_USE_ASSERT + my_mutex = &mutex; +#endif /* TBB_USE_ASSERT */ + acquire( mutex ); + } + + //! Release lock (if lock is held). + ~scoped_lock() { + if( my_mutex ) + release(); + } + + //! Acquire lock on given mutex. + void acquire( recursive_mutex& mutex ) { +#if TBB_USE_ASSERT + internal_acquire( mutex ); +#else + my_mutex = &mutex; + mutex.lock(); +#endif /* TBB_USE_ASSERT */ + } + + //! Try acquire lock on given recursive_mutex. + bool try_acquire( recursive_mutex& mutex ) { +#if TBB_USE_ASSERT + return internal_try_acquire( mutex ); +#else + bool result = mutex.try_lock(); + if( result ) + my_mutex = &mutex; + return result; +#endif /* TBB_USE_ASSERT */ + } + + //! Release lock + void release() { +#if TBB_USE_ASSERT + internal_release(); +#else + my_mutex->unlock(); + my_mutex = NULL; +#endif /* TBB_USE_ASSERT */ + } + + private: + //! The pointer to the current recursive_mutex to work + recursive_mutex* my_mutex; + + //! All checks from acquire using mutex.state were moved here + void __TBB_EXPORTED_METHOD internal_acquire( recursive_mutex& m ); + + //! 
All checks from try_acquire using mutex.state were moved here + bool __TBB_EXPORTED_METHOD internal_try_acquire( recursive_mutex& m ); + + //! All checks from release using mutex.state were moved here + void __TBB_EXPORTED_METHOD internal_release(); + + friend class recursive_mutex; + }; + + // Mutex traits + static const bool is_rw_mutex = false; + static const bool is_recursive_mutex = true; + static const bool is_fair_mutex = false; + + // C++0x compatibility interface + + //! Acquire lock + void lock() { +#if TBB_USE_ASSERT + aligned_space tmp; + new(tmp.begin()) scoped_lock(*this); +#else + #if _WIN32||_WIN64 + EnterCriticalSection(&impl); + #else + pthread_mutex_lock(&impl); + #endif /* _WIN32||_WIN64 */ +#endif /* TBB_USE_ASSERT */ + } + + //! Try acquiring lock (non-blocking) + /** Return true if lock acquired; false otherwise. */ + bool try_lock() { +#if TBB_USE_ASSERT + aligned_space tmp; + return (new(tmp.begin()) scoped_lock)->internal_try_acquire(*this); +#else + #if _WIN32||_WIN64 + return TryEnterCriticalSection(&impl)!=0; + #else + return pthread_mutex_trylock(&impl)==0; + #endif /* _WIN32||_WIN64 */ +#endif /* TBB_USE_ASSERT */ + } + + //! Release lock + void unlock() { +#if TBB_USE_ASSERT + aligned_space tmp; + scoped_lock& s = *tmp.begin(); + s.my_mutex = this; + s.internal_release(); +#else + #if _WIN32||_WIN64 + LeaveCriticalSection(&impl); + #else + pthread_mutex_unlock(&impl); + #endif /* _WIN32||_WIN64 */ +#endif /* TBB_USE_ASSERT */ + } + +private: +#if _WIN32||_WIN64 + CRITICAL_SECTION impl; + enum state_t { + INITIALIZED=0x1234, + DESTROYED=0x789A, + } state; +#else + pthread_mutex_t impl; +#endif /* _WIN32||_WIN64 */ + + //! All checks from mutex constructor using mutex.state were moved here + void __TBB_EXPORTED_METHOD internal_construct(); + + //! All checks from mutex destructor using mutex.state were moved here + void __TBB_EXPORTED_METHOD internal_destroy(); +}; + +__TBB_DEFINE_PROFILING_SET_NAME(recursive_mutex) + +} // namespace tbb + +#endif /* __TBB_recursive_mutex_H */ diff --git a/dep/tbb/include/tbb/scalable_allocator.h b/dep/tbb/include/tbb/scalable_allocator.h new file mode 100644 index 000000000..aca27a736 --- /dev/null +++ b/dep/tbb/include/tbb/scalable_allocator.h @@ -0,0 +1,205 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. 
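A short sketch (not part of the imported sources) showing the point of recursion: the holding thread may re-acquire the same recursive_mutex without deadlocking. rec_mutex, depth and descend are illustrative names.

#include "tbb/recursive_mutex.h"

tbb::recursive_mutex rec_mutex;
int depth = 0;

void descend( int levels ) {
    tbb::recursive_mutex::scoped_lock lock( rec_mutex );  // nested acquisition is legal
    ++depth;
    if( levels > 0 )
        descend( levels - 1 );                             // re-enters with the lock held
}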
This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_scalable_allocator_H +#define __TBB_scalable_allocator_H +/** @file */ + +#include /* Need ptrdiff_t and size_t from here. */ + +#if !defined(__cplusplus) && __ICC==1100 + #pragma warning (push) + #pragma warning (disable: 991) +#endif + +#ifdef __cplusplus +extern "C" { +#endif /* __cplusplus */ + +#if _MSC_VER >= 1400 +#define __TBB_EXPORTED_FUNC __cdecl +#else +#define __TBB_EXPORTED_FUNC +#endif + +/** The "malloc" analogue to allocate block of memory of size bytes. + * @ingroup memory_allocation */ +void * __TBB_EXPORTED_FUNC scalable_malloc (size_t size); + +/** The "free" analogue to discard a previously allocated piece of memory. + @ingroup memory_allocation */ +void __TBB_EXPORTED_FUNC scalable_free (void* ptr); + +/** The "realloc" analogue complementing scalable_malloc. + @ingroup memory_allocation */ +void * __TBB_EXPORTED_FUNC scalable_realloc (void* ptr, size_t size); + +/** The "calloc" analogue complementing scalable_malloc. + @ingroup memory_allocation */ +void * __TBB_EXPORTED_FUNC scalable_calloc (size_t nobj, size_t size); + +/** The "posix_memalign" analogue. + @ingroup memory_allocation */ +int __TBB_EXPORTED_FUNC scalable_posix_memalign (void** memptr, size_t alignment, size_t size); + +/** The "_aligned_malloc" analogue. + @ingroup memory_allocation */ +void * __TBB_EXPORTED_FUNC scalable_aligned_malloc (size_t size, size_t alignment); + +/** The "_aligned_realloc" analogue. + @ingroup memory_allocation */ +void * __TBB_EXPORTED_FUNC scalable_aligned_realloc (void* ptr, size_t size, size_t alignment); + +/** The "_aligned_free" analogue. + @ingroup memory_allocation */ +void __TBB_EXPORTED_FUNC scalable_aligned_free (void* ptr); + +/** The analogue of _msize/malloc_size/malloc_usable_size. + Returns the usable size of a memory block previously allocated by scalable_*, + or 0 (zero) if ptr does not point to such a block. + @ingroup memory_allocation */ +size_t __TBB_EXPORTED_FUNC scalable_msize (void* ptr); + +#ifdef __cplusplus +} /* extern "C" */ +#endif /* __cplusplus */ + +#ifdef __cplusplus + +#include /* To use new with the placement argument */ + +/* Ensure that including this header does not cause implicit linkage with TBB */ +#ifndef __TBB_NO_IMPLICIT_LINKAGE + #define __TBB_NO_IMPLICIT_LINKAGE 1 + #include "tbb_stddef.h" + #undef __TBB_NO_IMPLICIT_LINKAGE +#else + #include "tbb_stddef.h" +#endif + + +namespace tbb { + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // Workaround for erroneous "unreferenced parameter" warning in method destroy. + #pragma warning (push) + #pragma warning (disable: 4100) +#endif + +//! Meets "allocator" requirements of ISO C++ Standard, Section 20.1.5 +/** The members are ordered the same way they are in section 20.4.1 + of the ISO C++ standard. 
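A minimal sketch of the C-level entry points declared in this header, not part of the imported sources; grow_buffer and release_buffer are illustrative names and error handling is reduced to the usual NULL convention.

#include "tbb/scalable_allocator.h"

void* grow_buffer( void* old, size_t new_size ) {
    void* p = old ? scalable_realloc( old, new_size )   // resize an existing block
                  : scalable_malloc( new_size );        // or allocate a fresh one
    return p;   // NULL on failure, like the standard malloc/realloc
}

void release_buffer( void* p ) {
    scalable_free( p );   // counterpart for blocks obtained from scalable_* routines
}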
+ @ingroup memory_allocation */ +template +class scalable_allocator { +public: + typedef typename internal::allocator_type::value_type value_type; + typedef value_type* pointer; + typedef const value_type* const_pointer; + typedef value_type& reference; + typedef const value_type& const_reference; + typedef size_t size_type; + typedef ptrdiff_t difference_type; + template struct rebind { + typedef scalable_allocator other; + }; + + scalable_allocator() throw() {} + scalable_allocator( const scalable_allocator& ) throw() {} + template scalable_allocator(const scalable_allocator&) throw() {} + + pointer address(reference x) const {return &x;} + const_pointer address(const_reference x) const {return &x;} + + //! Allocate space for n objects. + pointer allocate( size_type n, const void* /*hint*/ =0 ) { + return static_cast( scalable_malloc( n * sizeof(value_type) ) ); + } + + //! Free previously allocated block of memory + void deallocate( pointer p, size_type ) { + scalable_free( p ); + } + + //! Largest value for which method allocate might succeed. + size_type max_size() const throw() { + size_type absolutemax = static_cast(-1) / sizeof (value_type); + return (absolutemax > 0 ? absolutemax : 1); + } + void construct( pointer p, const value_type& val ) { new(static_cast(p)) value_type(val); } + void destroy( pointer p ) {p->~value_type();} +}; + +#if _MSC_VER && !defined(__INTEL_COMPILER) + #pragma warning (pop) +#endif // warning 4100 is back + +//! Analogous to std::allocator, as defined in ISO C++ Standard, Section 20.4.1 +/** @ingroup memory_allocation */ +template<> +class scalable_allocator { +public: + typedef void* pointer; + typedef const void* const_pointer; + typedef void value_type; + template struct rebind { + typedef scalable_allocator other; + }; +}; + +template +inline bool operator==( const scalable_allocator&, const scalable_allocator& ) {return true;} + +template +inline bool operator!=( const scalable_allocator&, const scalable_allocator& ) {return false;} + +} // namespace tbb + +#if _MSC_VER + #if __TBB_BUILD && !defined(__TBBMALLOC_NO_IMPLICIT_LINKAGE) + #define __TBBMALLOC_NO_IMPLICIT_LINKAGE 1 + #endif + + #if !__TBBMALLOC_NO_IMPLICIT_LINKAGE + #ifdef _DEBUG + #pragma comment(lib, "tbbmalloc_debug.lib") + #else + #pragma comment(lib, "tbbmalloc.lib") + #endif + #endif + + +#endif + +#endif /* __cplusplus */ + +#if !defined(__cplusplus) && __ICC==1100 + #pragma warning (pop) +#endif // ICC 11.0 warning 991 is back + +#endif /* __TBB_scalable_allocator_H */ diff --git a/dep/tbb/include/tbb/spin_mutex.h b/dep/tbb/include/tbb/spin_mutex.h new file mode 100644 index 000000000..446821a70 --- /dev/null +++ b/dep/tbb/include/tbb/spin_mutex.h @@ -0,0 +1,192 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
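A small sketch, not part of the imported sources, of plugging this allocator into a standard container so element storage comes from the TBB scalable allocator; IntVector and make_squares are illustrative names.

#include <vector>
#include "tbb/scalable_allocator.h"

typedef std::vector<int, tbb::scalable_allocator<int> > IntVector;

IntVector make_squares( int n ) {
    IntVector v;                 // allocations go through scalable_malloc/scalable_free
    v.reserve( n );
    for( int i = 0; i < n; ++i )
        v.push_back( i * i );
    return v;
}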
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_spin_mutex_H +#define __TBB_spin_mutex_H + +#include +#include +#include "aligned_space.h" +#include "tbb_stddef.h" +#include "tbb_machine.h" +#include "tbb_profiling.h" + +namespace tbb { + +//! A lock that occupies a single byte. +/** A spin_mutex is a spin mutex that fits in a single byte. + It should be used only for locking short critical sections + (typically <20 instructions) when fairness is not an issue. + If zero-initialized, the mutex is considered unheld. + @ingroup synchronization */ +class spin_mutex { + //! 0 if lock is released, 1 if lock is acquired. + unsigned char flag; + +public: + //! Construct unacquired lock. + /** Equivalent to zero-initialization of *this. */ + spin_mutex() : flag(0) { +#if TBB_USE_THREADING_TOOLS + internal_construct(); +#endif + } + + //! Represents acquisition of a mutex. + class scoped_lock : internal::no_copy { + private: + //! Points to currently held mutex, or NULL if no lock is held. + spin_mutex* my_mutex; + + //! Value to store into spin_mutex::flag to unlock the mutex. + internal::uintptr my_unlock_value; + + //! Like acquire, but with ITT instrumentation. + void __TBB_EXPORTED_METHOD internal_acquire( spin_mutex& m ); + + //! Like try_acquire, but with ITT instrumentation. + bool __TBB_EXPORTED_METHOD internal_try_acquire( spin_mutex& m ); + + //! Like release, but with ITT instrumentation. + void __TBB_EXPORTED_METHOD internal_release(); + + friend class spin_mutex; + + public: + //! Construct without acquiring a mutex. + scoped_lock() : my_mutex(NULL), my_unlock_value(0) {} + + //! Construct and acquire lock on a mutex. + scoped_lock( spin_mutex& m ) { +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + my_mutex=NULL; + internal_acquire(m); +#else + my_unlock_value = __TBB_LockByte(m.flag); + my_mutex=&m; +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT*/ + } + + //! Acquire lock. + void acquire( spin_mutex& m ) { +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + internal_acquire(m); +#else + my_unlock_value = __TBB_LockByte(m.flag); + my_mutex = &m; +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT*/ + } + + //! Try acquiring lock (non-blocking) + /** Return true if lock acquired; false otherwise. */ + bool try_acquire( spin_mutex& m ) { +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + return internal_try_acquire(m); +#else + bool result = __TBB_TryLockByte(m.flag); + if( result ) { + my_unlock_value = 0; + my_mutex = &m; + } + return result; +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT*/ + } + + //! Release lock + void release() { +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + internal_release(); +#else + __TBB_store_with_release(my_mutex->flag, static_cast(my_unlock_value)); + my_mutex = NULL; +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT */ + } + + //! 
Destroy lock. If holding a lock, releases the lock first. + ~scoped_lock() { + if( my_mutex ) { +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + internal_release(); +#else + __TBB_store_with_release(my_mutex->flag, static_cast(my_unlock_value)); +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT */ + } + } + }; + + void __TBB_EXPORTED_METHOD internal_construct(); + + // Mutex traits + static const bool is_rw_mutex = false; + static const bool is_recursive_mutex = false; + static const bool is_fair_mutex = false; + + // ISO C++0x compatibility methods + + //! Acquire lock + void lock() { +#if TBB_USE_THREADING_TOOLS + aligned_space tmp; + new(tmp.begin()) scoped_lock(*this); +#else + __TBB_LockByte(flag); +#endif /* TBB_USE_THREADING_TOOLS*/ + } + + //! Try acquiring lock (non-blocking) + /** Return true if lock acquired; false otherwise. */ + bool try_lock() { +#if TBB_USE_THREADING_TOOLS + aligned_space tmp; + return (new(tmp.begin()) scoped_lock)->internal_try_acquire(*this); +#else + return __TBB_TryLockByte(flag); +#endif /* TBB_USE_THREADING_TOOLS*/ + } + + //! Release lock + void unlock() { +#if TBB_USE_THREADING_TOOLS + aligned_space tmp; + scoped_lock& s = *tmp.begin(); + s.my_mutex = this; + s.my_unlock_value = 0; + s.internal_release(); +#else + __TBB_store_with_release(flag, 0); +#endif /* TBB_USE_THREADING_TOOLS */ + } + + friend class scoped_lock; +}; + +__TBB_DEFINE_PROFILING_SET_NAME(spin_mutex) + +} // namespace tbb + +#endif /* __TBB_spin_mutex_H */ diff --git a/dep/tbb/include/tbb/spin_rw_mutex.h b/dep/tbb/include/tbb/spin_rw_mutex.h new file mode 100644 index 000000000..229745b52 --- /dev/null +++ b/dep/tbb/include/tbb/spin_rw_mutex.h @@ -0,0 +1,229 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_spin_rw_mutex_H +#define __TBB_spin_rw_mutex_H + +#include "tbb_stddef.h" +#include "tbb_machine.h" +#include "tbb_profiling.h" + +namespace tbb { + +class spin_rw_mutex_v3; +typedef spin_rw_mutex_v3 spin_rw_mutex; + +//! Fast, unfair, spinning reader-writer lock with backoff and writer-preference +/** @ingroup synchronization */ +class spin_rw_mutex_v3 { + //! @cond INTERNAL + + //! Internal acquire write lock. + bool __TBB_EXPORTED_METHOD internal_acquire_writer(); + + //! 
Out of line code for releasing a write lock. + /** This code is has debug checking and instrumentation for Intel(R) Thread Checker and Intel(R) Thread Profiler. */ + void __TBB_EXPORTED_METHOD internal_release_writer(); + + //! Internal acquire read lock. + void __TBB_EXPORTED_METHOD internal_acquire_reader(); + + //! Internal upgrade reader to become a writer. + bool __TBB_EXPORTED_METHOD internal_upgrade(); + + //! Out of line code for downgrading a writer to a reader. + /** This code is has debug checking and instrumentation for Intel(R) Thread Checker and Intel(R) Thread Profiler. */ + void __TBB_EXPORTED_METHOD internal_downgrade(); + + //! Internal release read lock. + void __TBB_EXPORTED_METHOD internal_release_reader(); + + //! Internal try_acquire write lock. + bool __TBB_EXPORTED_METHOD internal_try_acquire_writer(); + + //! Internal try_acquire read lock. + bool __TBB_EXPORTED_METHOD internal_try_acquire_reader(); + + //! @endcond +public: + //! Construct unacquired mutex. + spin_rw_mutex_v3() : state(0) { +#if TBB_USE_THREADING_TOOLS + internal_construct(); +#endif + } + +#if TBB_USE_ASSERT + //! Destructor asserts if the mutex is acquired, i.e. state is zero. + ~spin_rw_mutex_v3() { + __TBB_ASSERT( !state, "destruction of an acquired mutex"); + }; +#endif /* TBB_USE_ASSERT */ + + //! The scoped locking pattern + /** It helps to avoid the common problem of forgetting to release lock. + It also nicely provides the "node" for queuing locks. */ + class scoped_lock : internal::no_copy { + public: + //! Construct lock that has not acquired a mutex. + /** Equivalent to zero-initialization of *this. */ + scoped_lock() : mutex(NULL), is_writer(false) {} + + //! Acquire lock on given mutex. + /** Upon entry, *this should not be in the "have acquired a mutex" state. */ + scoped_lock( spin_rw_mutex& m, bool write = true ) : mutex(NULL) { + acquire(m, write); + } + + //! Release lock (if lock is held). + ~scoped_lock() { + if( mutex ) release(); + } + + //! Acquire lock on given mutex. + void acquire( spin_rw_mutex& m, bool write = true ) { + __TBB_ASSERT( !mutex, "holding mutex already" ); + is_writer = write; + mutex = &m; + if( write ) mutex->internal_acquire_writer(); + else mutex->internal_acquire_reader(); + } + + //! Upgrade reader to become a writer. + /** Returns true if the upgrade happened without re-acquiring the lock and false if opposite */ + bool upgrade_to_writer() { + __TBB_ASSERT( mutex, "lock is not acquired" ); + __TBB_ASSERT( !is_writer, "not a reader" ); + is_writer = true; + return mutex->internal_upgrade(); + } + + //! Release lock. + void release() { + __TBB_ASSERT( mutex, "lock is not acquired" ); + spin_rw_mutex *m = mutex; + mutex = NULL; +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + if( is_writer ) m->internal_release_writer(); + else m->internal_release_reader(); +#else + if( is_writer ) __TBB_AtomicAND( &m->state, READERS ); + else __TBB_FetchAndAddWrelease( &m->state, -(intptr_t)ONE_READER); +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT */ + } + + //! Downgrade writer to become a reader. + bool downgrade_to_reader() { +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + __TBB_ASSERT( mutex, "lock is not acquired" ); + __TBB_ASSERT( is_writer, "not a writer" ); + mutex->internal_downgrade(); +#else + __TBB_FetchAndAddW( &mutex->state, ((intptr_t)ONE_READER-WRITER)); +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT */ + is_writer = false; + + return true; + } + + //! Try acquire lock on given mutex. 
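A usage sketch of the reader-writer scoped lock, not part of the imported sources: a read-mostly lookup that upgrades to a writer only when it has to insert. cache, cache_mutex and lookup_or_insert are illustrative names.

#include <map>
#include <utility>
#include "tbb/spin_rw_mutex.h"

std::map<int,int> cache;
tbb::spin_rw_mutex cache_mutex;

int lookup_or_insert( int key, int value_if_missing ) {
    tbb::spin_rw_mutex::scoped_lock lock( cache_mutex, /*write=*/false );  // reader
    std::map<int,int>::iterator it = cache.find( key );
    if( it != cache.end() )
        return it->second;
    lock.upgrade_to_writer();   // may release and re-acquire, so re-check afterwards
    it = cache.find( key );
    if( it == cache.end() )
        it = cache.insert( std::make_pair( key, value_if_missing ) ).first;
    return it->second;
}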
+ bool try_acquire( spin_rw_mutex& m, bool write = true ) { + __TBB_ASSERT( !mutex, "holding mutex already" ); + bool result; + is_writer = write; + result = write? m.internal_try_acquire_writer() + : m.internal_try_acquire_reader(); + if( result ) + mutex = &m; + return result; + } + + private: + //! The pointer to the current mutex that is held, or NULL if no mutex is held. + spin_rw_mutex* mutex; + + //! If mutex!=NULL, then is_writer is true if holding a writer lock, false if holding a reader lock. + /** Not defined if not holding a lock. */ + bool is_writer; + }; + + // Mutex traits + static const bool is_rw_mutex = true; + static const bool is_recursive_mutex = false; + static const bool is_fair_mutex = false; + + // ISO C++0x compatibility methods + + //! Acquire writer lock + void lock() {internal_acquire_writer();} + + //! Try acquiring writer lock (non-blocking) + /** Return true if lock acquired; false otherwise. */ + bool try_lock() {return internal_try_acquire_writer();} + + //! Release lock + void unlock() { +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + if( state&WRITER ) internal_release_writer(); + else internal_release_reader(); +#else + if( state&WRITER ) __TBB_AtomicAND( &state, READERS ); + else __TBB_FetchAndAddWrelease( &state, -(intptr_t)ONE_READER); +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT */ + } + + // Methods for reader locks that resemble ISO C++0x compatibility methods. + + //! Acquire reader lock + void lock_read() {internal_acquire_reader();} + + //! Try acquiring reader lock (non-blocking) + /** Return true if reader lock acquired; false otherwise. */ + bool try_lock_read() {return internal_try_acquire_reader();} + +private: + typedef intptr_t state_t; + static const state_t WRITER = 1; + static const state_t WRITER_PENDING = 2; + static const state_t READERS = ~(WRITER | WRITER_PENDING); + static const state_t ONE_READER = 4; + static const state_t BUSY = WRITER | READERS; + //! State of lock + /** Bit 0 = writer is holding lock + Bit 1 = request by a writer to acquire lock (hint to readers to wait) + Bit 2..N = number of readers holding lock */ + state_t state; + + void __TBB_EXPORTED_METHOD internal_construct(); +}; + +__TBB_DEFINE_PROFILING_SET_NAME(spin_rw_mutex) + +} // namespace tbb + +#endif /* __TBB_spin_rw_mutex_H */ diff --git a/dep/tbb/include/tbb/task.h b/dep/tbb/include/tbb/task.h new file mode 100644 index 000000000..05a68985c --- /dev/null +++ b/dep/tbb/include/tbb/task.h @@ -0,0 +1,787 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_task_H +#define __TBB_task_H + +#include "tbb_stddef.h" +#include "tbb_machine.h" + +namespace tbb { + +class task; +class task_list; + +#if __TBB_EXCEPTIONS +class task_group_context; +#endif /* __TBB_EXCEPTIONS */ + +//! @cond INTERNAL +namespace internal { + + class scheduler: no_copy { + public: + //! For internal use only + virtual void spawn( task& first, task*& next ) = 0; + + //! For internal use only + virtual void wait_for_all( task& parent, task* child ) = 0; + + //! For internal use only + virtual void spawn_root_and_wait( task& first, task*& next ) = 0; + + //! Pure virtual destructor; + // Have to have it just to shut up overzealous compilation warnings + virtual ~scheduler() = 0; + }; + + //! A reference count + /** Should always be non-negative. A signed type is used so that underflow can be detected. */ + typedef intptr reference_count; + + //! An id as used for specifying affinity. + typedef unsigned short affinity_id; + +#if __TBB_EXCEPTIONS + struct context_list_node_t { + context_list_node_t *my_prev, + *my_next; + }; + + class allocate_root_with_context_proxy: no_assign { + task_group_context& my_context; + public: + allocate_root_with_context_proxy ( task_group_context& ctx ) : my_context(ctx) {} + task& __TBB_EXPORTED_METHOD allocate( size_t size ) const; + void __TBB_EXPORTED_METHOD free( task& ) const; + }; +#endif /* __TBB_EXCEPTIONS */ + + class allocate_root_proxy: no_assign { + public: + static task& __TBB_EXPORTED_FUNC allocate( size_t size ); + static void __TBB_EXPORTED_FUNC free( task& ); + }; + + class allocate_continuation_proxy: no_assign { + public: + task& __TBB_EXPORTED_METHOD allocate( size_t size ) const; + void __TBB_EXPORTED_METHOD free( task& ) const; + }; + + class allocate_child_proxy: no_assign { + public: + task& __TBB_EXPORTED_METHOD allocate( size_t size ) const; + void __TBB_EXPORTED_METHOD free( task& ) const; + }; + + class allocate_additional_child_of_proxy: no_assign { + task& self; + task& parent; + public: + allocate_additional_child_of_proxy( task& self_, task& parent_ ) : self(self_), parent(parent_) {} + task& __TBB_EXPORTED_METHOD allocate( size_t size ) const; + void __TBB_EXPORTED_METHOD free( task& ) const; + }; + + class task_group_base; + + //! Memory prefix to a task object. + /** This class is internal to the library. + Do not reference it directly, except within the library itself. + Fields are ordered in way that preserves backwards compatibility and yields + good packing on typical 32-bit and 64-bit platforms. + @ingroup task_scheduling */ + class task_prefix { + private: + friend class tbb::task; + friend class tbb::task_list; + friend class internal::scheduler; + friend class internal::allocate_root_proxy; + friend class internal::allocate_child_proxy; + friend class internal::allocate_continuation_proxy; + friend class internal::allocate_additional_child_of_proxy; + friend class internal::task_group_base; + +#if __TBB_EXCEPTIONS + //! 
Shared context that is used to communicate asynchronous state changes + /** Currently it is used to broadcast cancellation requests generated both + by users and as the result of unhandled exceptions in the task::execute() + methods. */ + task_group_context *context; +#endif /* __TBB_EXCEPTIONS */ + + //! The scheduler that allocated the task, or NULL if the task is big. + /** Small tasks are pooled by the scheduler that allocated the task. + If a scheduler needs to free a small task allocated by another scheduler, + it returns the task to that other scheduler. This policy avoids + memory space blowup issues for memory allocators that allocate from + thread-specific pools. */ + scheduler* origin; + + //! The scheduler that owns the task. + scheduler* owner; + + //! The task whose reference count includes me. + /** In the "blocking style" of programming, this field points to the parent task. + In the "continuation-passing style" of programming, this field points to the + continuation of the parent. */ + tbb::task* parent; + + //! Reference count used for synchronization. + /** In the "continuation-passing style" of programming, this field is + the difference of the number of allocated children minus the + number of children that have completed. + In the "blocking style" of programming, this field is one more than the difference. */ + reference_count ref_count; + + //! Obsolete. Used to be scheduling depth before TBB 2.2 + /** Retained only for the sake of backward binary compatibility. **/ + int depth; + + //! A task::state_type, stored as a byte for compactness. + /** This state is exposed to users via method task::state(). */ + unsigned char state; + + //! Miscellaneous state that is not directly visible to users, stored as a byte for compactness. + /** 0x0 -> version 1.0 task + 0x1 -> version 3.0 task + 0x2 -> task_proxy + 0x40 -> task has live ref_count */ + unsigned char extra_state; + + affinity_id affinity; + + //! "next" field for list of task + tbb::task* next; + + //! The task corresponding to this task_prefix. + tbb::task& task() {return *reinterpret_cast(this+1);} + }; + +} // namespace internal +//! @endcond + +#if __TBB_EXCEPTIONS + +#if TBB_USE_CAPTURED_EXCEPTION + class tbb_exception; +#else + namespace internal { + class tbb_exception_ptr; + } +#endif /* !TBB_USE_CAPTURED_EXCEPTION */ + +//! Used to form groups of tasks +/** @ingroup task_scheduling + The context services explicit cancellation requests from user code, and unhandled + exceptions intercepted during tasks execution. Intercepting an exception results + in generating internal cancellation requests (which is processed in exactly the + same way as external ones). + + The context is associated with one or more root tasks and defines the cancellation + group that includes all the descendants of the corresponding root task(s). Association + is established when a context object is passed as an argument to the task::allocate_root() + method. See task_group_context::task_group_context for more details. + + The context can be bound to another one, and other contexts can be bound to it, + forming a tree-like structure: parent -> this -> children. Arrows here designate + cancellation propagation direction. If a task in a cancellation group is canceled + all the other tasks in this group and groups bound to it (as children) get canceled too. 
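A brief sketch of the cancellation mechanism described above, not part of the imported sources. FindValue and search are illustrative names, and the sketch assumes the parallel_for overload that accepts an explicit task_group_context (available when __TBB_EXCEPTIONS is enabled).

#include <cstddef>
#include "tbb/task.h"
#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"

struct FindValue {
    const int* data;
    int target;
    tbb::task_group_context* ctx;
    void operator()( const tbb::blocked_range<size_t>& r ) const {
        for( size_t i = r.begin(); i != r.end(); ++i ) {
            if( ctx->is_group_execution_cancelled() )
                return;                            // some sibling range already found it
            if( data[i] == target )
                ctx->cancel_group_execution();     // cancel the whole group
        }
    }
};

void search( const int* data, size_t n, int target ) {
    tbb::task_group_context ctx;                   // cancellation group for this loop
    FindValue body; body.data = data; body.target = target; body.ctx = &ctx;
    tbb::parallel_for( tbb::blocked_range<size_t>(0,n), body,
                       tbb::auto_partitioner(), ctx );
}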
+ + IMPLEMENTATION NOTE: + When adding new members to task_group_context or changing types of existing ones, + update the size of both padding buffers (_leading_padding and _trailing_padding) + appropriately. See also VERSIONING NOTE at the constructor definition below. **/ +class task_group_context : internal::no_copy +{ +private: +#if TBB_USE_CAPTURED_EXCEPTION + typedef tbb_exception exception_container_type; +#else + typedef internal::tbb_exception_ptr exception_container_type; +#endif + + enum version_traits_word_layout { + traits_offset = 16, + version_mask = 0xFFFF, + traits_mask = 0xFFFFul << traits_offset + }; + +public: + enum kind_type { + isolated, + bound + }; + + enum traits_type { + exact_exception = 0x0001ul << traits_offset, + no_cancellation = 0x0002ul << traits_offset, + concurrent_wait = 0x0004ul << traits_offset, +#if TBB_USE_CAPTURED_EXCEPTION + default_traits = 0 +#else + default_traits = exact_exception +#endif /* !TBB_USE_CAPTURED_EXCEPTION */ + }; + +private: + union { + //! Flavor of this context: bound or isolated. + kind_type my_kind; + uintptr_t _my_kind_aligner; + }; + + //! Pointer to the context of the parent cancellation group. NULL for isolated contexts. + task_group_context *my_parent; + + //! Used to form the thread specific list of contexts without additional memory allocation. + /** A context is included into the list of the current thread when its binding to + its parent happens. Any context can be present in the list of one thread only. **/ + internal::context_list_node_t my_node; + + //! Leading padding protecting accesses to frequently used members from false sharing. + /** Read accesses to the field my_cancellation_requested are on the hot path inside + the scheduler. This padding ensures that this field never shares the same cache + line with a local variable that is frequently written to. **/ + char _leading_padding[internal::NFS_MaxLineSize - + 2 * sizeof(uintptr_t)- sizeof(void*) - sizeof(internal::context_list_node_t)]; + + //! Specifies whether cancellation was request for this task group. + uintptr_t my_cancellation_requested; + + //! Version for run-time checks and behavioral traits of the context. + /** Version occupies low 16 bits, and traits (zero or more ORed enumerators + from the traits_type enumerations) take the next 16 bits. + Original (zeroth) version of the context did not support any traits. **/ + uintptr_t my_version_and_traits; + + //! Pointer to the container storing exception being propagated across this task group. + exception_container_type *my_exception; + + //! Scheduler that registered this context in its thread specific list. + /** This field is not terribly necessary, but it allows to get a small performance + benefit by getting us rid of using thread local storage. We do not care + about extra memory it takes since this data structure is excessively padded anyway. **/ + void *my_owner; + + //! Trailing padding protecting accesses to frequently used members from false sharing + /** \sa _leading_padding **/ + char _trailing_padding[internal::NFS_MaxLineSize - sizeof(intptr_t) - 2 * sizeof(void*)]; + +public: + //! Default & binding constructor. + /** By default a bound context is created. That is this context will be bound + (as child) to the context of the task calling task::allocate_root(this_context) + method. Cancellation requests passed to the parent context are propagated + to all the contexts bound to it. 
+ + If task_group_context::isolated is used as the argument, then the tasks associated + with this context will never be affected by events in any other context. + + Creating isolated contexts involve much less overhead, but they have limited + utility. Normally when an exception occurs in an algorithm that has nested + ones running, it is desirably to have all the nested algorithms canceled + as well. Such a behavior requires nested algorithms to use bound contexts. + + There is one good place where using isolated algorithms is beneficial. It is + a master thread. That is if a particular algorithm is invoked directly from + the master thread (not from a TBB task), supplying it with explicitly + created isolated context will result in a faster algorithm startup. + + VERSIONING NOTE: + Implementation(s) of task_group_context constructor(s) cannot be made + entirely out-of-line because the run-time version must be set by the user + code. This will become critically important for binary compatibility, if + we ever have to change the size of the context object. + + Boosting the runtime version will also be necessary whenever new fields + are introduced in the currently unused padding areas or the meaning of + the existing fields is changed or extended. **/ + task_group_context ( kind_type relation_with_parent = bound, + uintptr_t traits = default_traits ) + : my_kind(relation_with_parent) + , my_version_and_traits(1 | traits) + { + init(); + } + + __TBB_EXPORTED_METHOD ~task_group_context (); + + //! Forcefully reinitializes the context after the task tree it was associated with is completed. + /** Because the method assumes that all the tasks that used to be associated with + this context have already finished, calling it while the context is still + in use somewhere in the task hierarchy leads to undefined behavior. + + IMPORTANT: This method is not thread safe! + + The method does not change the context's parent if it is set. **/ + void __TBB_EXPORTED_METHOD reset (); + + //! Initiates cancellation of all tasks in this cancellation group and its subordinate groups. + /** \return false if cancellation has already been requested, true otherwise. + + Note that canceling never fails. When false is returned, it just means that + another thread (or this one) has already sent cancellation request to this + context or to one of its ancestors (if this context is bound). It is guaranteed + that when this method is concurrently called on the same not yet cancelled + context, true will be returned by one and only one invocation. **/ + bool __TBB_EXPORTED_METHOD cancel_group_execution (); + + //! Returns true if the context received cancellation request. + bool __TBB_EXPORTED_METHOD is_group_execution_cancelled () const; + + //! Records the pending exception, and cancels the task group. + /** May be called only from inside a catch-block. If the context is already + canceled, does nothing. + The method brings the task group associated with this context exactly into + the state it would be in, if one of its tasks threw the currently pending + exception during its execution. In other words, it emulates the actions + of the scheduler's dispatch loop exception handler. **/ + void __TBB_EXPORTED_METHOD register_pending_exception (); + +protected: + //! Out-of-line part of the constructor. + /** Singled out to ensure backward binary compatibility of the future versions. 
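// A minimal sketch (not part of the patch) of the two construction modes described
// above: a default-constructed context is bound, an explicitly isolated one is never
// cancelled by outer algorithms, so cancellation is requested explicitly instead.
// `cancellable_root` is a hypothetical task type; an initialized scheduler is assumed.
class cancellable_root : public tbb::task {
    /*override*/ tbb::task* execute() { /* hypothetical long-running work */ return NULL; }
};

void run_cancellable() {
    tbb::task_group_context ctx( tbb::task_group_context::isolated );
    tbb::task& r = *new( tbb::task::allocate_root(ctx) ) cancellable_root;
    // Another thread (e.g. a GUI callback) may call ctx.cancel_group_execution();
    // exactly one of several concurrent callers is reported `true`.
    tbb::task::spawn_root_and_wait(r);
    if( ctx.is_group_execution_cancelled() )
        ctx.reset();    // legal only here, after the associated task tree has completed
}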
**/ + void __TBB_EXPORTED_METHOD init (); + +private: + friend class task; + friend class internal::allocate_root_with_context_proxy; + + static const kind_type binding_required = bound; + static const kind_type binding_completed = kind_type(bound+1); + + //! Checks if any of the ancestors has a cancellation request outstanding, + //! and propagates it back to descendants. + void propagate_cancellation_from_ancestors (); + + //! For debugging purposes only. + bool is_alive () { +#if TBB_USE_DEBUG + return my_version_and_traits != 0xDeadBeef; +#else + return true; +#endif /* TBB_USE_DEBUG */ + } +}; // class task_group_context + +#endif /* __TBB_EXCEPTIONS */ + +//! Base class for user-defined tasks. +/** @ingroup task_scheduling */ +class task: internal::no_copy { + //! Set reference count + void __TBB_EXPORTED_METHOD internal_set_ref_count( int count ); + + //! Decrement reference count and return true if non-zero. + internal::reference_count __TBB_EXPORTED_METHOD internal_decrement_ref_count(); + +protected: + //! Default constructor. + task() {prefix().extra_state=1;} + +public: + //! Destructor. + virtual ~task() {} + + //! Should be overridden by derived classes. + virtual task* execute() = 0; + + //! Enumeration of task states that the scheduler considers. + enum state_type { + //! task is running, and will be destroyed after method execute() completes. + executing, + //! task to be rescheduled. + reexecute, + //! task is in ready pool, or is going to be put there, or was just taken off. + ready, + //! task object is freshly allocated or recycled. + allocated, + //! task object is on free list, or is going to be put there, or was just taken off. + freed, + //! task to be recycled as continuation + recycle + }; + + //------------------------------------------------------------------------ + // Allocating tasks + //------------------------------------------------------------------------ + + //! Returns proxy for overloaded new that allocates a root task. + static internal::allocate_root_proxy allocate_root() { + return internal::allocate_root_proxy(); + } + +#if __TBB_EXCEPTIONS + //! Returns proxy for overloaded new that allocates a root task associated with user supplied context. + static internal::allocate_root_with_context_proxy allocate_root( task_group_context& ctx ) { + return internal::allocate_root_with_context_proxy(ctx); + } +#endif /* __TBB_EXCEPTIONS */ + + //! Returns proxy for overloaded new that allocates a continuation task of *this. + /** The continuation's parent becomes the parent of *this. */ + internal::allocate_continuation_proxy& allocate_continuation() { + return *reinterpret_cast(this); + } + + //! Returns proxy for overloaded new that allocates a child task of *this. + internal::allocate_child_proxy& allocate_child() { + return *reinterpret_cast(this); + } + + //! Like allocate_child, except that task's parent becomes "t", not this. + /** Typically used in conjunction with schedule_to_reexecute to implement while loops. + Atomically increments the reference count of t.parent() */ + internal::allocate_additional_child_of_proxy allocate_additional_child_of( task& t ) { + return internal::allocate_additional_child_of_proxy(*this,t); + } + + //! Destroy a task. + /** Usually, calling this method is unnecessary, because a task is + implicitly deleted after its execute() method runs. However, + sometimes a task needs to be explicitly deallocated, such as + when a root task is used as the parent in spawn_and_wait_for_all. 
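// A minimal sketch (not part of the patch): a root task used purely as a waitable
// parent, the pattern the destroy() comment above refers to. Children are attached via
// allocate_additional_child_of(), which adjusts the parent's reference count for us.
// `work_task` is a hypothetical tbb::task subclass; an active scheduler is assumed.
class work_task : public tbb::task {
    const int idx;
public:
    work_task( int i ) : idx(i) {}
    /*override*/ tbb::task* execute() { /* hypothetical per-item work using idx */ return NULL; }
};

void run_four_items() {
    tbb::empty_task& root = *new( tbb::task::allocate_root() ) tbb::empty_task;
    root.set_ref_count(1);                       // wait_for_all() waits for the count to drop back to 1
    for( int i = 0; i < 4; ++i )
        root.spawn( *new( root.allocate_additional_child_of(root) ) work_task(i) );
    root.wait_for_all();                         // all four children have completed here
    root.destroy(root);                          // root tasks must be reclaimed explicitly
}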
*/ + void __TBB_EXPORTED_METHOD destroy( task& victim ); + + //------------------------------------------------------------------------ + // Recycling of tasks + //------------------------------------------------------------------------ + + //! Change this to be a continuation of its former self. + /** The caller must guarantee that the task's refcount does not become zero until + after the method execute() returns. Typically, this is done by having + method execute() return a pointer to a child of the task. If the guarantee + cannot be made, use method recycle_as_safe_continuation instead. + + Because of the hazard, this method may be deprecated in the future. */ + void recycle_as_continuation() { + __TBB_ASSERT( prefix().state==executing, "execute not running?" ); + prefix().state = allocated; + } + + //! Recommended to use, safe variant of recycle_as_continuation + /** For safety, it requires additional increment of ref_count. */ + void recycle_as_safe_continuation() { + __TBB_ASSERT( prefix().state==executing, "execute not running?" ); + prefix().state = recycle; + } + + //! Change this to be a child of new_parent. + void recycle_as_child_of( task& new_parent ) { + internal::task_prefix& p = prefix(); + __TBB_ASSERT( prefix().state==executing||prefix().state==allocated, "execute not running, or already recycled" ); + __TBB_ASSERT( prefix().ref_count==0, "no child tasks allowed when recycled as a child" ); + __TBB_ASSERT( p.parent==NULL, "parent must be null" ); + __TBB_ASSERT( new_parent.prefix().state<=recycle, "corrupt parent's state" ); + __TBB_ASSERT( new_parent.prefix().state!=freed, "parent already freed" ); + p.state = allocated; + p.parent = &new_parent; +#if __TBB_EXCEPTIONS + p.context = new_parent.prefix().context; +#endif /* __TBB_EXCEPTIONS */ + } + + //! Schedule this for reexecution after current execute() returns. + /** Requires that this.execute() be running. */ + void recycle_to_reexecute() { + __TBB_ASSERT( prefix().state==executing, "execute not running, or already recycled" ); + __TBB_ASSERT( prefix().ref_count==0, "no child tasks allowed when recycled for reexecution" ); + prefix().state = reexecute; + } + + // All depth-related methods are obsolete, and are retained for the sake + // of backward source compatibility only + intptr_t depth() const {return 0;} + void set_depth( intptr_t ) {} + void add_to_depth( int ) {} + + + //------------------------------------------------------------------------ + // Spawning and blocking + //------------------------------------------------------------------------ + + //! Set reference count + void set_ref_count( int count ) { +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + internal_set_ref_count(count); +#else + prefix().ref_count = count; +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT */ + } + + //! Atomically increment reference count. + /** Has acquire semantics */ + void increment_ref_count() { + __TBB_FetchAndIncrementWacquire( &prefix().ref_count ); + } + + //! Atomically decrement reference count. + /** Has release semanics. */ + int decrement_ref_count() { +#if TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT + return int(internal_decrement_ref_count()); +#else + return int(__TBB_FetchAndDecrementWrelease( &prefix().ref_count ))-1; +#endif /* TBB_USE_THREADING_TOOLS||TBB_USE_ASSERT */ + } + + //! Schedule task for execution when a worker becomes available. + /** After all children spawned so far finish their method task::execute, + their parent's method task::execute may start running. 
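// A hedged sketch (not part of the patch) of the recycle_as_safe_continuation() idiom
// noted above: the task turns itself into the continuation of its own child, and the
// extra unit passed to set_ref_count() is the "additional increment" the comment asks
// for. `child_work` and the second-pass logic are hypothetical.
class child_work : public tbb::task {
    /*override*/ tbb::task* execute() { /* hypothetical payload */ return NULL; }
};

class two_phase_task : public tbb::task {
    bool second_pass;
public:
    two_phase_task() : second_pass(false) {}
    /*override*/ tbb::task* execute() {
        if( !second_pass ) {
            recycle_as_safe_continuation();       // re-run *this once the child completes
            set_ref_count(2);                     // 1 child + 1 extra required by the safe variant
            second_pass = true;
            spawn( *new( allocate_child() ) child_work );
        } else {
            // second pass: the child has finished; combine or publish its result here
        }
        return NULL;
    }
};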
Therefore, it + is important to ensure that at least one child has not completed until + the parent is ready to run. */ + void spawn( task& child ) { + prefix().owner->spawn( child, child.prefix().next ); + } + + //! Spawn multiple tasks and clear list. + void spawn( task_list& list ); + + //! Similar to spawn followed by wait_for_all, but more efficient. + void spawn_and_wait_for_all( task& child ) { + prefix().owner->wait_for_all( *this, &child ); + } + + //! Similar to spawn followed by wait_for_all, but more efficient. + void __TBB_EXPORTED_METHOD spawn_and_wait_for_all( task_list& list ); + + //! Spawn task allocated by allocate_root, wait for it to complete, and deallocate it. + /** The thread that calls spawn_root_and_wait must be the same thread + that allocated the task. */ + static void spawn_root_and_wait( task& root ) { + root.prefix().owner->spawn_root_and_wait( root, root.prefix().next ); + } + + //! Spawn root tasks on list and wait for all of them to finish. + /** If there are more tasks than worker threads, the tasks are spawned in + order of front to back. */ + static void spawn_root_and_wait( task_list& root_list ); + + //! Wait for reference count to become one, and set reference count to zero. + /** Works on tasks while waiting. */ + void wait_for_all() { + prefix().owner->wait_for_all( *this, NULL ); + } + + //! The innermost task being executed or destroyed by the current thread at the moment. + static task& __TBB_EXPORTED_FUNC self(); + + //! task on whose behalf this task is working, or NULL if this is a root. + task* parent() const {return prefix().parent;} + +#if __TBB_EXCEPTIONS + //! Shared context that is used to communicate asynchronous state changes + task_group_context* context() {return prefix().context;} +#endif /* __TBB_EXCEPTIONS */ + + //! True if task is owned by different thread than thread that owns its parent. + bool is_stolen_task() const { + internal::task_prefix& p = prefix(); + internal::task_prefix& q = parent()->prefix(); + return p.owner!=q.owner; + } + + //------------------------------------------------------------------------ + // Debugging + //------------------------------------------------------------------------ + + //! Current execution state + state_type state() const {return state_type(prefix().state);} + + //! The internal reference count. + int ref_count() const { +#if TBB_USE_ASSERT + internal::reference_count ref_count = prefix().ref_count; + __TBB_ASSERT( ref_count==int(ref_count), "integer overflow error"); +#endif + return int(prefix().ref_count); + } + + //! Obsolete, and only retained for the sake of backward compatibility. Always returns true. + bool __TBB_EXPORTED_METHOD is_owned_by_current_thread() const; + + //------------------------------------------------------------------------ + // Affinity + //------------------------------------------------------------------------ + + //! An id as used for specifying affinity. + /** Guaranteed to be integral type. Value of 0 means no affinity. */ + typedef internal::affinity_id affinity_id; + + //! Set affinity for this task. + void set_affinity( affinity_id id ) {prefix().affinity = id;} + + //! Current affinity of this task + affinity_id affinity() const {return prefix().affinity;} + + //! Invoked by scheduler to notify task that it ran on unexpected thread. + /** Invoked before method execute() runs, if task is stolen, or task has + affinity but will be executed on another thread. + + The default action does nothing. 
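// A minimal sketch (not part of the patch) of the classic blocking style built from the
// primitives declared above: allocate_root()/allocate_child(), set_ref_count(),
// spawn_and_wait_for_all() and spawn_root_and_wait(). The cutoff value and serial_fib()
// are assumptions for illustration only.
long serial_fib( long n ) { return n < 2 ? n : serial_fib(n-1) + serial_fib(n-2); }

class fib_task : public tbb::task {
    const long n;
    long* const sum;
public:
    fib_task( long n_, long* sum_ ) : n(n_), sum(sum_) {}
    /*override*/ tbb::task* execute() {
        if( n < 16 ) {                           // hypothetical serial cutoff
            *sum = serial_fib(n);
        } else {
            long x, y;
            fib_task& a = *new( allocate_child() ) fib_task(n-1, &x);
            fib_task& b = *new( allocate_child() ) fib_task(n-2, &y);
            set_ref_count(3);                    // two children + one for the wait
            spawn(b);
            spawn_and_wait_for_all(a);           // runs a, then waits for both children
            *sum = x + y;
        }
        return NULL;
    }
};

long parallel_fib( long n ) {
    long sum;
    fib_task& root = *new( tbb::task::allocate_root() ) fib_task(n, &sum);
    tbb::task::spawn_root_and_wait(root);
    return sum;
}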
*/ + virtual void __TBB_EXPORTED_METHOD note_affinity( affinity_id id ); + +#if __TBB_EXCEPTIONS + //! Initiates cancellation of all tasks in this cancellation group and its subordinate groups. + /** \return false if cancellation has already been requested, true otherwise. **/ + bool cancel_group_execution () { return prefix().context->cancel_group_execution(); } + + //! Returns true if the context received cancellation request. + bool is_cancelled () const { return prefix().context->is_group_execution_cancelled(); } +#endif /* __TBB_EXCEPTIONS */ + +private: + friend class task_list; + friend class internal::scheduler; + friend class internal::allocate_root_proxy; +#if __TBB_EXCEPTIONS + friend class internal::allocate_root_with_context_proxy; +#endif /* __TBB_EXCEPTIONS */ + friend class internal::allocate_continuation_proxy; + friend class internal::allocate_child_proxy; + friend class internal::allocate_additional_child_of_proxy; + + friend class internal::task_group_base; + + //! Get reference to corresponding task_prefix. + /** Version tag prevents loader on Linux from using the wrong symbol in debug builds. **/ + internal::task_prefix& prefix( internal::version_tag* = NULL ) const { + return reinterpret_cast(const_cast(this))[-1]; + } +}; // class task + +//! task that does nothing. Useful for synchronization. +/** @ingroup task_scheduling */ +class empty_task: public task { + /*override*/ task* execute() { + return NULL; + } +}; + +//! A list of children. +/** Used for method task::spawn_children + @ingroup task_scheduling */ +class task_list: internal::no_copy { +private: + task* first; + task** next_ptr; + friend class task; +public: + //! Construct empty list + task_list() : first(NULL), next_ptr(&first) {} + + //! Destroys the list, but does not destroy the task objects. + ~task_list() {} + + //! True if list if empty; false otherwise. + bool empty() const {return !first;} + + //! Push task onto back of list. + void push_back( task& task ) { + task.prefix().next = NULL; + *next_ptr = &task; + next_ptr = &task.prefix().next; + } + + //! Pop the front task from the list. + task& pop_front() { + __TBB_ASSERT( !empty(), "attempt to pop item from empty task_list" ); + task* result = first; + first = result->prefix().next; + if( !first ) next_ptr = &first; + return *result; + } + + //! 
Clear the list + void clear() { + first=NULL; + next_ptr=&first; + } +}; + +inline void task::spawn( task_list& list ) { + if( task* t = list.first ) { + prefix().owner->spawn( *t, *list.next_ptr ); + list.clear(); + } +} + +inline void task::spawn_root_and_wait( task_list& root_list ) { + if( task* t = root_list.first ) { + t->prefix().owner->spawn_root_and_wait( *t, *root_list.next_ptr ); + root_list.clear(); + } +} + +} // namespace tbb + +inline void *operator new( size_t bytes, const tbb::internal::allocate_root_proxy& ) { + return &tbb::internal::allocate_root_proxy::allocate(bytes); +} + +inline void operator delete( void* task, const tbb::internal::allocate_root_proxy& ) { + tbb::internal::allocate_root_proxy::free( *static_cast(task) ); +} + +#if __TBB_EXCEPTIONS +inline void *operator new( size_t bytes, const tbb::internal::allocate_root_with_context_proxy& p ) { + return &p.allocate(bytes); +} + +inline void operator delete( void* task, const tbb::internal::allocate_root_with_context_proxy& p ) { + p.free( *static_cast(task) ); +} +#endif /* __TBB_EXCEPTIONS */ + +inline void *operator new( size_t bytes, const tbb::internal::allocate_continuation_proxy& p ) { + return &p.allocate(bytes); +} + +inline void operator delete( void* task, const tbb::internal::allocate_continuation_proxy& p ) { + p.free( *static_cast(task) ); +} + +inline void *operator new( size_t bytes, const tbb::internal::allocate_child_proxy& p ) { + return &p.allocate(bytes); +} + +inline void operator delete( void* task, const tbb::internal::allocate_child_proxy& p ) { + p.free( *static_cast(task) ); +} + +inline void *operator new( size_t bytes, const tbb::internal::allocate_additional_child_of_proxy& p ) { + return &p.allocate(bytes); +} + +inline void operator delete( void* task, const tbb::internal::allocate_additional_child_of_proxy& p ) { + p.free( *static_cast(task) ); +} + +#endif /* __TBB_task_H */ diff --git a/dep/tbb/include/tbb/task_group.h b/dep/tbb/include/tbb/task_group.h new file mode 100644 index 000000000..b3e6cf224 --- /dev/null +++ b/dep/tbb/include/tbb/task_group.h @@ -0,0 +1,228 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#ifndef __TBB_task_group_H +#define __TBB_task_group_H + +#include "task.h" +#include + +namespace tbb { + +template +class task_handle { + F my_func; + +public: + task_handle( const F& f ) : my_func(f) {} + + void operator()() { my_func(); } +}; + +enum task_group_status { + not_complete, + complete, + canceled +}; + +namespace internal { + +// Suppress gratuitous warnings from icc 11.0 when lambda expressions are used in instances of function_task. +//#pragma warning(disable: 588) + +template +class function_task : public task { + F my_func; + /*override*/ task* execute() { + my_func(); + return NULL; + } +public: + function_task( const F& f ) : my_func(f) {} +}; + +template +class task_handle_task : public task { + task_handle& my_handle; + /*override*/ task* execute() { + my_handle(); + return NULL; + } +public: + task_handle_task( task_handle& h ) : my_handle(h) {} +}; + +class task_group_base : internal::no_copy { +protected: + empty_task* my_root; + task_group_context my_context; + + task& owner () { return *my_root; } + + template + task_group_status internal_run_and_wait( F& f ) { + try { + if ( !my_context.is_group_execution_cancelled() ) + f(); + } catch ( ... ) { + my_context.register_pending_exception(); + } + return wait(); + } + + template + void internal_run( F& f ) { + owner().spawn( *new( owner().allocate_additional_child_of(*my_root) ) Task(f) ); + } + +public: + task_group_base( uintptr_t traits = 0 ) + : my_context(task_group_context::bound, task_group_context::default_traits | traits) + { + my_root = new( task::allocate_root(my_context) ) empty_task; + my_root->set_ref_count(1); + } + + template + void run( task_handle& h ) { + internal_run< task_handle, internal::task_handle_task >( h ); + } + + task_group_status wait() { + try { + owner().prefix().owner->wait_for_all( *my_root, NULL ); + } catch ( ... ) { + my_context.reset(); + throw; + } + if ( my_context.is_group_execution_cancelled() ) { + my_context.reset(); + return canceled; + } + return complete; + } + + bool is_canceling() { + return my_context.is_group_execution_cancelled(); + } + + void cancel() { + my_context.cancel_group_execution(); + } +}; // class task_group_base + +} // namespace internal + +class task_group : public internal::task_group_base { +public: + task_group () : task_group_base( task_group_context::concurrent_wait ) {} + + ~task_group() try { + __TBB_ASSERT( my_root->ref_count() != 0, NULL ); + if( my_root->ref_count() > 1 ) + my_root->wait_for_all(); + owner().destroy(*my_root); + } + catch (...) 
{ + owner().destroy(*my_root); + throw; + } + +#if __SUNPRO_CC + template + void run( task_handle& h ) { + internal_run< task_handle, internal::task_handle_task >( h ); + } +#else + using task_group_base::run; +#endif + + template + void run( const F& f ) { + internal_run< const F, internal::function_task >( f ); + } + + template + task_group_status run_and_wait( const F& f ) { + return internal_run_and_wait( f ); + } + + template + task_group_status run_and_wait( task_handle& h ) { + return internal_run_and_wait< task_handle >( h ); + } +}; // class task_group + +class missing_wait : public std::exception { +public: + /*override*/ + const char* what() const throw() { return "wait() was not called on the structured_task_group"; } +}; + +class structured_task_group : public internal::task_group_base { +public: + ~structured_task_group() { + if( my_root->ref_count() > 1 ) { + bool stack_unwinding_in_progress = std::uncaught_exception(); + // Always attempt to do proper cleanup to avoid inevitable memory corruption + // in case of missing wait (for the sake of better testability & debuggability) + if ( !is_canceling() ) + cancel(); + my_root->wait_for_all(); + owner().destroy(*my_root); + if ( !stack_unwinding_in_progress ) + throw missing_wait(); + } + else + owner().destroy(*my_root); + } + + template + task_group_status run_and_wait ( task_handle& h ) { + return internal_run_and_wait< task_handle >( h ); + } + + task_group_status wait() { + __TBB_ASSERT ( my_root->ref_count() != 0, "wait() can be called only once during the structured_task_group lifetime" ); + return task_group_base::wait(); + } +}; // class structured_task_group + +inline +bool is_current_task_group_canceling() { + return task::self().is_cancelled(); +} + +template +task_handle make_task( const F& f ) { + return task_handle( f ); +} + +} // namespace tbb + +#endif /* __TBB_task_group_H */ diff --git a/dep/tbb/include/tbb/task_scheduler_init.h b/dep/tbb/include/tbb/task_scheduler_init.h new file mode 100644 index 000000000..f817ccc37 --- /dev/null +++ b/dep/tbb/include/tbb/task_scheduler_init.h @@ -0,0 +1,106 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
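// A minimal sketch (not part of the patch) of the task_group API defined in the header
// above, using plain function pointers since C++0x lambdas may not be available.
// step1()/step2() are hypothetical; an initialized scheduler is assumed.
#include "tbb/task_group.h"

void step1();
void step2();

void run_both_steps() {
    tbb::task_group g;
    g.run( &step1 );                    // spawned asynchronously as an internal::function_task
    g.run( &step2 );
    if( g.wait() == tbb::canceled ) {
        // one of the steps, or an outer bound context, requested cancellation
    }
}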
+*/ + +#ifndef __TBB_task_scheduler_init_H +#define __TBB_task_scheduler_init_H + +#include "tbb_stddef.h" + +namespace tbb { + +typedef std::size_t stack_size_type; + +//! @cond INTERNAL +namespace internal { + //! Internal to library. Should not be used by clients. + /** @ingroup task_scheduling */ + class scheduler; +} // namespace internal +//! @endcond + +//! Class representing reference to tbb scheduler. +/** A thread must construct a task_scheduler_init, and keep it alive, + during the time that it uses the services of class task. + @ingroup task_scheduling */ +class task_scheduler_init: internal::no_copy { + /** NULL if not currently initialized. */ + internal::scheduler* my_scheduler; +public: + + //! Typedef for number of threads that is automatic. + static const int automatic = -1; + + //! Argument to initialize() or constructor that causes initialization to be deferred. + static const int deferred = -2; + + //! Ensure that scheduler exists for this thread + /** A value of -1 lets tbb decide on the number of threads, which is typically + the number of hardware threads. For production code, the default value of -1 + should be used, particularly if the client code is mixed with third party clients + that might also use tbb. + + The number_of_threads is ignored if any other task_scheduler_inits + currently exist. A thread may construct multiple task_scheduler_inits. + Doing so does no harm because the underlying scheduler is reference counted. */ + void __TBB_EXPORTED_METHOD initialize( int number_of_threads=automatic ); + + //! The overloaded method with stack size parameter + /** Overloading is necessary to preserve ABI compatibility */ + void __TBB_EXPORTED_METHOD initialize( int number_of_threads, stack_size_type thread_stack_size ); + + //! Inverse of method initialize. + void __TBB_EXPORTED_METHOD terminate(); + + //! Shorthand for default constructor followed by call to intialize(number_of_threads). + task_scheduler_init( int number_of_threads=automatic, stack_size_type thread_stack_size=0 ) : my_scheduler(NULL) { + initialize( number_of_threads, thread_stack_size ); + } + + //! Destroy scheduler for this thread if thread has no other live task_scheduler_inits. + ~task_scheduler_init() { + if( my_scheduler ) + terminate(); + internal::poison_pointer( my_scheduler ); + } + //! Returns the number of threads tbb scheduler would create if initialized by default. + /** Result returned by this method does not depend on whether the scheduler + has already been initialized. + + Because tbb 2.0 does not support blocking tasks yet, you may use this method + to boost the number of threads in the tbb's internal pool, if your tasks are + doing I/O operations. The optimal number of additional threads depends on how + much time your tasks spend in the blocked state. */ + static int __TBB_EXPORTED_FUNC default_num_threads (); + + //! Returns true if scheduler is active (initialized); false otherwise + bool is_active() const { return my_scheduler != NULL; } +}; + +} // namespace tbb + +#endif /* __TBB_task_scheduler_init_H */ diff --git a/dep/tbb/include/tbb/task_scheduler_observer.h b/dep/tbb/include/tbb/task_scheduler_observer.h new file mode 100644 index 000000000..ee8bd5df2 --- /dev/null +++ b/dep/tbb/include/tbb/task_scheduler_observer.h @@ -0,0 +1,74 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. 
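// A minimal sketch (not part of the patch) of deferred initialization, as enabled by the
// `deferred` constant above. read_thread_count() is a hypothetical helper returning -1
// for "let TBB decide".
#include "tbb/task_scheduler_init.h"

int read_thread_count();

int run_app() {
    tbb::task_scheduler_init init( tbb::task_scheduler_init::deferred );
    int n = read_thread_count();
    init.initialize( n > 0 ? n : tbb::task_scheduler_init::automatic );
    // ... use tbb::task and the parallel algorithms while `init` stays alive ...
    return 0;                           // destructor terminates this scheduler reference
}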
+ + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_task_scheduler_observer_H +#define __TBB_task_scheduler_observer_H + +#include "atomic.h" + +#if __TBB_SCHEDULER_OBSERVER + +namespace tbb { + +namespace internal { + +class observer_proxy; + +class task_scheduler_observer_v3 { + friend class observer_proxy; + observer_proxy* my_proxy; + atomic my_busy_count; +public: + //! Enable or disable observation + void __TBB_EXPORTED_METHOD observe( bool state=true ); + + //! True if observation is enables; false otherwise. + bool is_observing() const {return my_proxy!=NULL;} + + //! Construct observer with observation disabled. + task_scheduler_observer_v3() : my_proxy(NULL) {my_busy_count=0;} + + //! Called by thread before first steal since observation became enabled + virtual void on_scheduler_entry( bool /*is_worker*/ ) {} + + //! Called by thread when it no longer takes part in task stealing. + virtual void on_scheduler_exit( bool /*is_worker*/ ) {} + + //! Destructor + virtual ~task_scheduler_observer_v3() {observe(false);} +}; + +} // namespace internal + +typedef internal::task_scheduler_observer_v3 task_scheduler_observer; + +} // namespace tbb + +#endif /* __TBB_SCHEDULER_OBSERVER */ + +#endif /* __TBB_task_scheduler_observer_H */ diff --git a/dep/tbb/include/tbb/tbb.h b/dep/tbb/include/tbb/tbb.h new file mode 100644 index 000000000..4bac7bf48 --- /dev/null +++ b/dep/tbb/include/tbb/tbb.h @@ -0,0 +1,76 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
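// A minimal sketch (not part of the patch) of the observer hooks declared above; the
// register/unregister calls stand in for whatever per-thread setup the application
// needs (thread naming, affinity, TLS initialization, ...). Both are hypothetical.
#include "tbb/task_scheduler_observer.h"

void register_worker_thread();
void unregister_worker_thread();

class worker_setup_observer : public tbb::task_scheduler_observer {
public:
    /*override*/ void on_scheduler_entry( bool is_worker ) {
        if( is_worker ) register_worker_thread();
    }
    /*override*/ void on_scheduler_exit( bool is_worker ) {
        if( is_worker ) unregister_worker_thread();
    }
};

// Typical use: construct the observer, then call obs.observe(true) to enable callbacks;
// the base destructor calls observe(false) automatically.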
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_tbb_H +#define __TBB_tbb_H + +/** + This header bulk-includes declarations or definitions of all the functionality + provided by TBB (save for malloc dependent headers). + + If you use only a few TBB constructs, consider including specific headers only. + Any header listed below can be included independently of others. +**/ + +#include "aligned_space.h" +#include "atomic.h" +#include "blocked_range.h" +#include "blocked_range2d.h" +#include "blocked_range3d.h" +#include "cache_aligned_allocator.h" +#include "concurrent_hash_map.h" +#include "concurrent_queue.h" +#include "concurrent_vector.h" +#include "enumerable_thread_specific.h" +#include "mutex.h" +#include "null_mutex.h" +#include "null_rw_mutex.h" +#include "parallel_do.h" +#include "parallel_for.h" +#include "parallel_for_each.h" +#include "parallel_invoke.h" +#include "parallel_reduce.h" +#include "parallel_scan.h" +#include "parallel_sort.h" +#include "partitioner.h" +#include "pipeline.h" +#include "queuing_mutex.h" +#include "queuing_rw_mutex.h" +#include "recursive_mutex.h" +#include "spin_mutex.h" +#include "spin_rw_mutex.h" +#include "task.h" +#include "task_group.h" +#include "task_scheduler_init.h" +#include "task_scheduler_observer.h" +#include "tbb_allocator.h" +#include "tbb_exception.h" +#include "tbb_thread.h" +#include "tick_count.h" + +#endif /* __TBB_tbb_H */ diff --git a/dep/tbb/include/tbb/tbb_allocator.h b/dep/tbb/include/tbb/tbb_allocator.h new file mode 100644 index 000000000..aa1544b96 --- /dev/null +++ b/dep/tbb/include/tbb/tbb_allocator.h @@ -0,0 +1,203 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_tbb_allocator_H +#define __TBB_tbb_allocator_H + +#include +#include +#include "tbb_stddef.h" + +namespace tbb { + +//! @cond INTERNAL +namespace internal { + + //! 
Deallocates memory using FreeHandler + /** The function uses scalable_free if scalable allocator is available and free if not*/ + void __TBB_EXPORTED_FUNC deallocate_via_handler_v3( void *p ); + + //! Allocates memory using MallocHandler + /** The function uses scalable_malloc if scalable allocator is available and malloc if not*/ + void* __TBB_EXPORTED_FUNC allocate_via_handler_v3( size_t n ); + + //! Returns true if standard malloc/free are used to work with memory. + bool __TBB_EXPORTED_FUNC is_malloc_used_v3(); +} +//! @endcond + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // Workaround for erroneous "unreferenced parameter" warning in method destroy. + #pragma warning (push) + #pragma warning (disable: 4100) +#endif + +//! Meets "allocator" requirements of ISO C++ Standard, Section 20.1.5 +/** The class selects the best memory allocation mechanism available + from scalable_malloc and standard malloc. + The members are ordered the same way they are in section 20.4.1 + of the ISO C++ standard. + @ingroup memory_allocation */ +template +class tbb_allocator { +public: + typedef typename internal::allocator_type::value_type value_type; + typedef value_type* pointer; + typedef const value_type* const_pointer; + typedef value_type& reference; + typedef const value_type& const_reference; + typedef size_t size_type; + typedef ptrdiff_t difference_type; + template struct rebind { + typedef tbb_allocator other; + }; + + //! Specifies current allocator + enum malloc_type { + scalable, + standard + }; + + tbb_allocator() throw() {} + tbb_allocator( const tbb_allocator& ) throw() {} + template tbb_allocator(const tbb_allocator&) throw() {} + + pointer address(reference x) const {return &x;} + const_pointer address(const_reference x) const {return &x;} + + //! Allocate space for n objects. + pointer allocate( size_type n, const void* /*hint*/ = 0) { + return pointer(internal::allocate_via_handler_v3( n * sizeof(value_type) )); + } + + //! Free previously allocated block of memory. + void deallocate( pointer p, size_type ) { + internal::deallocate_via_handler_v3(p); + } + + //! Largest value for which method allocate might succeed. + size_type max_size() const throw() { + size_type max = static_cast(-1) / sizeof (value_type); + return (max > 0 ? max : 1); + } + + //! Copy-construct value at location pointed to by p. + void construct( pointer p, const value_type& value ) {new(static_cast(p)) value_type(value);} + + //! Destroy value at location pointed to by p. + void destroy( pointer p ) {p->~value_type();} + + //! Returns current allocator + static malloc_type allocator_type() { + return internal::is_malloc_used_v3() ? standard : scalable; + } +}; + +#if _MSC_VER && !defined(__INTEL_COMPILER) + #pragma warning (pop) +#endif // warning 4100 is back + +//! Analogous to std::allocator, as defined in ISO C++ Standard, Section 20.4.1 +/** @ingroup memory_allocation */ +template<> +class tbb_allocator { +public: + typedef void* pointer; + typedef const void* const_pointer; + typedef void value_type; + template struct rebind { + typedef tbb_allocator other; + }; +}; + +template +inline bool operator==( const tbb_allocator&, const tbb_allocator& ) {return true;} + +template +inline bool operator!=( const tbb_allocator&, const tbb_allocator& ) {return false;} + +//! Meets "allocator" requirements of ISO C++ Standard, Section 20.1.5 +/** The class is an adapter over an actual allocator that fills the allocation + using memset function with template argument C as the value. 
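// A minimal sketch (not part of the patch): plugging tbb_allocator into a standard
// container and querying which underlying allocation mechanism was selected at run time.
#include <vector>
#include "tbb/tbb_allocator.h"

typedef std::vector<int, tbb::tbb_allocator<int> > int_vector;

bool using_scalable_malloc() {
    return tbb::tbb_allocator<int>::allocator_type() == tbb::tbb_allocator<int>::scalable;
}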
+ The members are ordered the same way they are in section 20.4.1 + of the ISO C++ standard. + @ingroup memory_allocation */ +template class Allocator = tbb_allocator> +class zero_allocator : public Allocator +{ +public: + typedef Allocator base_allocator_type; + typedef typename base_allocator_type::value_type value_type; + typedef typename base_allocator_type::pointer pointer; + typedef typename base_allocator_type::const_pointer const_pointer; + typedef typename base_allocator_type::reference reference; + typedef typename base_allocator_type::const_reference const_reference; + typedef typename base_allocator_type::size_type size_type; + typedef typename base_allocator_type::difference_type difference_type; + template struct rebind { + typedef zero_allocator other; + }; + + zero_allocator() throw() { } + zero_allocator(const zero_allocator &a) throw() : base_allocator_type( a ) { } + template + zero_allocator(const zero_allocator &a) throw() : base_allocator_type( Allocator( a ) ) { } + + pointer allocate(const size_type n, const void *hint = 0 ) { + pointer ptr = base_allocator_type::allocate( n, hint ); + std::memset( ptr, 0, n * sizeof(value_type) ); + return ptr; + } +}; + +//! Analogous to std::allocator, as defined in ISO C++ Standard, Section 20.4.1 +/** @ingroup memory_allocation */ +template class Allocator> +class zero_allocator : public Allocator { +public: + typedef Allocator base_allocator_type; + typedef typename base_allocator_type::value_type value_type; + typedef typename base_allocator_type::pointer pointer; + typedef typename base_allocator_type::const_pointer const_pointer; + template struct rebind { + typedef zero_allocator other; + }; +}; + +template class B1, typename T2, template class B2> +inline bool operator==( const zero_allocator &a, const zero_allocator &b) { + return static_cast< B1 >(a) == static_cast< B2 >(b); +} +template class B1, typename T2, template class B2> +inline bool operator!=( const zero_allocator &a, const zero_allocator &b) { + return static_cast< B1 >(a) != static_cast< B2 >(b); +} + +} // namespace tbb + +#endif /* __TBB_tbb_allocator_H */ diff --git a/dep/tbb/include/tbb/tbb_config.h b/dep/tbb/include/tbb/tbb_config.h new file mode 100644 index 000000000..fad5bf214 --- /dev/null +++ b/dep/tbb/include/tbb/tbb_config.h @@ -0,0 +1,161 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. 
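// A minimal sketch (not part of the patch): zero_allocator used as an adaptor, per the
// comment above. Pairing it with a concurrent_vector of atomics is the classic use case,
// so that freshly grown elements read as zero-filled memory.
#include "tbb/atomic.h"
#include "tbb/concurrent_vector.h"
#include "tbb/tbb_allocator.h"

typedef tbb::concurrent_vector<tbb::atomic<int>,
                               tbb::zero_allocator<tbb::atomic<int> > > counter_table;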
This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_tbb_config_H +#define __TBB_tbb_config_H + +/** This header is supposed to contain macro definitions and C style comments only. + The macros defined here are intended to control such aspects of TBB build as + - compilation modes + - feature sets + - workarounds presence +**/ + +/** Compilation modes **/ + +#ifndef TBB_USE_DEBUG +#ifdef TBB_DO_ASSERT +#define TBB_USE_DEBUG TBB_DO_ASSERT +#else +#define TBB_USE_DEBUG 0 +#endif /* TBB_DO_ASSERT */ +#else +#define TBB_DO_ASSERT TBB_USE_DEBUG +#endif /* TBB_USE_DEBUG */ + +#ifndef TBB_USE_ASSERT +#ifdef TBB_DO_ASSERT +#define TBB_USE_ASSERT TBB_DO_ASSERT +#else +#define TBB_USE_ASSERT TBB_USE_DEBUG +#endif /* TBB_DO_ASSERT */ +#endif /* TBB_USE_ASSERT */ + +#ifndef TBB_USE_THREADING_TOOLS +#ifdef TBB_DO_THREADING_TOOLS +#define TBB_USE_THREADING_TOOLS TBB_DO_THREADING_TOOLS +#else +#define TBB_USE_THREADING_TOOLS TBB_USE_DEBUG +#endif /* TBB_DO_THREADING_TOOLS */ +#endif /* TBB_USE_THREADING_TOOLS */ + +#ifndef TBB_USE_PERFORMANCE_WARNINGS +#ifdef TBB_PERFORMANCE_WARNINGS +#define TBB_USE_PERFORMANCE_WARNINGS TBB_PERFORMANCE_WARNINGS +#else +#define TBB_USE_PERFORMANCE_WARNINGS TBB_USE_DEBUG +#endif /* TBB_PEFORMANCE_WARNINGS */ +#endif /* TBB_USE_PERFORMANCE_WARNINGS */ + + +/** Feature sets **/ + +#ifndef __TBB_EXCEPTIONS +#define __TBB_EXCEPTIONS 1 +#endif /* __TBB_EXCEPTIONS */ + +#ifndef __TBB_SCHEDULER_OBSERVER +#define __TBB_SCHEDULER_OBSERVER 1 +#endif /* __TBB_SCHEDULER_OBSERVER */ + +#ifndef __TBB_NEW_ITT_NOTIFY +#define __TBB_NEW_ITT_NOTIFY 1 +#endif /* !__TBB_NEW_ITT_NOTIFY */ + + +/* TODO: The following condition should be extended as soon as new compilers/runtimes + with std::exception_ptr support appear. */ +#define __TBB_EXCEPTION_PTR_PRESENT (_MSC_VER >= 1600 || __GXX_EXPERIMENTAL_CXX0X__ && (__GNUC__==4 && __GNUC_MINOR__>=4)) + + +#ifndef TBB_USE_CAPTURED_EXCEPTION + #if __TBB_EXCEPTION_PTR_PRESENT + #define TBB_USE_CAPTURED_EXCEPTION 0 + #else + #define TBB_USE_CAPTURED_EXCEPTION 1 + #endif +#else /* defined TBB_USE_CAPTURED_EXCEPTION */ + #if !TBB_USE_CAPTURED_EXCEPTION && !__TBB_EXCEPTION_PTR_PRESENT + #error Current runtime does not support std::exception_ptr. Set TBB_USE_CAPTURED_EXCEPTION and make sure that your code is ready to catch tbb::captured_exception. + #endif +#endif /* defined TBB_USE_CAPTURED_EXCEPTION */ + + +#ifndef __TBB_DEFAULT_PARTITIONER +#if TBB_DEPRECATED +/** Default partitioner for parallel loop templates in TBB 1.0-2.1 */ +#define __TBB_DEFAULT_PARTITIONER tbb::simple_partitioner +#else +/** Default partitioner for parallel loop templates in TBB 2.2 */ +#define __TBB_DEFAULT_PARTITIONER tbb::auto_partitioner +#endif /* TBB_DEFAULT_PARTITIONER */ +#endif /* !defined(__TBB_DEFAULT_PARTITIONER */ + +/** Workarounds presence **/ + +#if __GNUC__==4 && __GNUC_MINOR__==4 && !defined(__INTEL_COMPILER) + #define __TBB_GCC_WARNING_SUPPRESSION_ENABLED 1 +#endif + +/** Macros of the form __TBB_XXX_BROKEN denote known issues that are caused by + the bugs in compilers, standard or OS specific libraries. They should be + removed as soon as the corresponding bugs are fixed or the buggy OS/compiler + versions go out of the support list. +**/ + +#if defined(_MSC_VER) && _MSC_VER < 0x1500 && !defined(__INTEL_COMPILER) + /** VS2005 and earlier does not allow to declare a template class as a friend + of classes defined in other namespaces. 
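// A minimal sketch (not part of the patch): selecting the captured-exception model at
// build time, per the TBB_USE_CAPTURED_EXCEPTION logic above. The macro must be visible
// before any TBB header is included (a -D compiler flag works equally well).
#define TBB_USE_CAPTURED_EXCEPTION 1    // propagate tbb::captured_exception instead of std::exception_ptr
#include "tbb/task.h"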
**/ + #define __TBB_TEMPLATE_FRIENDS_BROKEN 1 +#endif + +#if __GLIBC__==2 && __GLIBC_MINOR__==3 || __MINGW32__ + /** Some older versions of glibc crash when exception handling happens concurrently. **/ + #define __TBB_EXCEPTION_HANDLING_BROKEN 1 +#endif + +#if (_WIN32||_WIN64) && __INTEL_COMPILER == 1110 + /** That's a bug in Intel compiler 11.1.044/IA-32/Windows, that leads to a worker thread crash on the thread's startup. **/ + #define __TBB_ICL_11_1_CODE_GEN_BROKEN 1 +#endif + +#if __FreeBSD__ + /** The bug in FreeBSD 8.0 results in kernel panic when there is contention + on a mutex created with this attribute. **/ + #define __TBB_PRIO_INHERIT_BROKEN 1 + + /** A bug in FreeBSD 8.0 results in test hanging when an exception occurs + during (concurrent?) object construction by means of placement new operator. **/ + #define __TBB_PLACEMENT_NEW_EXCEPTION_SAFETY_BROKEN 1 +#endif /* __FreeBSD__ */ + +#if __LRB__ +#include "tbb_config_lrb.h" +#endif + +#endif /* __TBB_tbb_config_H */ diff --git a/dep/tbb/include/tbb/tbb_exception.h b/dep/tbb/include/tbb/tbb_exception.h new file mode 100644 index 000000000..621129eef --- /dev/null +++ b/dep/tbb/include/tbb/tbb_exception.h @@ -0,0 +1,297 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_exception_H +#define __TBB_exception_H + +#include "tbb_stddef.h" +#include + +#if __TBB_EXCEPTIONS && !defined(__EXCEPTIONS) && !defined(_CPPUNWIND) && !defined(__SUNPRO_CC) +#error The current compilation environment does not support exception handling. Please set __TBB_EXCEPTIONS to 0 in tbb_config.h +#endif + +namespace tbb { + +//! Exception for concurrent containers +class bad_last_alloc : public std::bad_alloc { +public: + virtual const char* what() const throw() { return "bad allocation in previous or concurrent attempt"; } + virtual ~bad_last_alloc() throw() {} +}; + +namespace internal { +void __TBB_EXPORTED_FUNC throw_bad_last_alloc_exception_v4() ; +} // namespace internal + +} // namespace tbb + +#if __TBB_EXCEPTIONS +#include "tbb_allocator.h" +#include +#include +#include + +namespace tbb { + +//! Interface to be implemented by all exceptions TBB recognizes and propagates across the threads. 
+/** If an unhandled exception of the type derived from tbb::tbb_exception is intercepted + by the TBB scheduler in one of the worker threads, it is delivered to and re-thrown in + the root thread. The root thread is the thread that has started the outermost algorithm + or root task sharing the same task_group_context with the guilty algorithm/task (the one + that threw the exception first). + + Note: when documentation mentions workers with respect to exception handling, + masters are implied as well, because they are completely equivalent in this context. + Consequently a root thread can be master or worker thread. + + NOTE: In case of nested algorithms or complex task hierarchies when the nested + levels share (explicitly or by means of implicit inheritance) the task group + context of the outermost level, the exception may be (re-)thrown multiple times + (ultimately - in each worker on each nesting level) before reaching the root + thread at the outermost level. IMPORTANT: if you intercept an exception derived + from this class on a nested level, you must re-throw it in the catch block by means + of the "throw;" operator. + + TBB provides two implementations of this interface: tbb::captured_exception and + template class tbb::movable_exception. See their declarations for more info. **/ +class tbb_exception : public std::exception +{ + /** No operator new is provided because the TBB usage model assumes dynamic + creation of the TBB exception objects only by means of applying move() + operation on an exception thrown out of TBB scheduler. **/ + void* operator new ( size_t ); + +public: + //! Creates and returns pointer to the deep copy of this exception object. + /** Move semantics is allowed. **/ + virtual tbb_exception* move () throw() = 0; + + //! Destroys objects created by the move() method. + /** Frees memory and calls destructor for this exception object. + Can and must be used only on objects created by the move method. **/ + virtual void destroy () throw() = 0; + + //! Throws this exception object. + /** Make sure that if you have several levels of derivation from this interface + you implement or override this method on the most derived level. The implementation + is as simple as "throw *this;". Failure to do this will result in exception + of a base class type being thrown. **/ + virtual void throw_self () = 0; + + //! Returns RTTI name of the originally intercepted exception + virtual const char* name() const throw() = 0; + + //! Returns the result of originally intercepted exception's what() method. + virtual const char* what() const throw() = 0; + + /** Operator delete is provided only to allow using existing smart pointers + with TBB exception objects obtained as the result of applying move() + operation on an exception thrown out of TBB scheduler. + + When overriding method move() make sure to override operator delete as well + if memory is allocated not by TBB's scalable allocator. **/ + void operator delete ( void* p ) { + internal::deallocate_via_handler_v3(p); + } +}; + +//! This class is used by TBB to propagate information about unhandled exceptions into the root thread. +/** Exception of this type is thrown by TBB in the root thread (thread that started a parallel + algorithm ) if an unhandled exception was intercepted during the algorithm execution in one + of the workers. 
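// A minimal sketch (not part of the patch) of the re-throw rule stated above: a
// tbb_exception caught on a nested level that shares the outer task_group_context must
// be re-thrown with a bare `throw;`. run_nested_algorithm() is a hypothetical function
// running a TBB algorithm inside an outer one.
void run_nested_algorithm();

void nested_level() {
    try {
        run_nested_algorithm();
    } catch( tbb::tbb_exception& e ) {
        // inspect e.name() / e.what() here if needed, then ...
        throw;    // ... let the exception continue towards the root thread
    }
}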
+ \sa tbb::tbb_exception **/ +class captured_exception : public tbb_exception +{ +public: + captured_exception ( const captured_exception& src ) + : tbb_exception(src), my_dynamic(false) + { + set(src.my_exception_name, src.my_exception_info); + } + + captured_exception ( const char* name, const char* info ) + : my_dynamic(false) + { + set(name, info); + } + + __TBB_EXPORTED_METHOD ~captured_exception () throw() { + clear(); + } + + captured_exception& operator= ( const captured_exception& src ) { + if ( this != &src ) { + clear(); + set(src.my_exception_name, src.my_exception_info); + } + return *this; + } + + /*override*/ + captured_exception* __TBB_EXPORTED_METHOD move () throw(); + + /*override*/ + void __TBB_EXPORTED_METHOD destroy () throw(); + + /*override*/ + void throw_self () { throw *this; } + + /*override*/ + const char* __TBB_EXPORTED_METHOD name() const throw(); + + /*override*/ + const char* __TBB_EXPORTED_METHOD what() const throw(); + + void __TBB_EXPORTED_METHOD set ( const char* name, const char* info ) throw(); + void __TBB_EXPORTED_METHOD clear () throw(); + +private: + //! Used only by method clone(). + captured_exception() {} + + //! Functionally equivalent to {captured_exception e(name,info); return e.clone();} + static captured_exception* allocate ( const char* name, const char* info ); + + bool my_dynamic; + const char* my_exception_name; + const char* my_exception_info; +}; + +//! Template that can be used to implement exception that transfers arbitrary ExceptionData to the root thread +/** Code using TBB can instantiate this template with an arbitrary ExceptionData type + and throw this exception object. Such exceptions are intercepted by the TBB scheduler + and delivered to the root thread (). + \sa tbb::tbb_exception **/ +template +class movable_exception : public tbb_exception +{ + typedef movable_exception self_type; + +public: + movable_exception ( const ExceptionData& data ) + : my_exception_data(data) + , my_dynamic(false) + , my_exception_name(typeid(self_type).name()) + {} + + movable_exception ( const movable_exception& src ) throw () + : tbb_exception(src) + , my_exception_data(src.my_exception_data) + , my_dynamic(false) + , my_exception_name(src.my_exception_name) + {} + + ~movable_exception () throw() {} + + const movable_exception& operator= ( const movable_exception& src ) { + if ( this != &src ) { + my_exception_data = src.my_exception_data; + my_exception_name = src.my_exception_name; + } + return *this; + } + + ExceptionData& data () throw() { return my_exception_data; } + + const ExceptionData& data () const throw() { return my_exception_data; } + + /*override*/ const char* name () const throw() { return my_exception_name; } + + /*override*/ const char* what () const throw() { return "tbb::movable_exception"; } + + /*override*/ + movable_exception* move () throw() { + void* e = internal::allocate_via_handler_v3(sizeof(movable_exception)); + if ( e ) { + ::new (e) movable_exception(*this); + ((movable_exception*)e)->my_dynamic = true; + } + return (movable_exception*)e; + } + /*override*/ + void destroy () throw() { + __TBB_ASSERT ( my_dynamic, "Method destroy can be called only on dynamically allocated movable_exceptions" ); + if ( my_dynamic ) { + this->~movable_exception(); + internal::deallocate_via_handler_v3(this); + } + } + /*override*/ + void throw_self () { + throw *this; + } + +protected: + //! User data + ExceptionData my_exception_data; + +private: + //! 
Flag specifying whether this object has been dynamically allocated (by the move method) + bool my_dynamic; + + //! RTTI name of this class + /** We rely on the fact that RTTI names are static string constants. **/ + const char* my_exception_name; +}; + +#if !TBB_USE_CAPTURED_EXCEPTION +namespace internal { + +//! Exception container that preserves the exact copy of the original exception +/** This class can be used only when the appropriate runtime support (mandated + by C++0x) is present **/ +class tbb_exception_ptr { + std::exception_ptr my_ptr; + +public: + static tbb_exception_ptr* allocate (); + static tbb_exception_ptr* allocate ( const tbb_exception& tag ); + //! This overload uses move semantics (i.e. it empties src) + static tbb_exception_ptr* allocate ( captured_exception& src ); + + //! Destroys this objects + /** Note that objects of this type can be created only by the allocate() method. **/ + void destroy () throw(); + + //! Throws the contained exception . + void throw_self () { std::rethrow_exception(my_ptr); } + +private: + tbb_exception_ptr ( const std::exception_ptr& src ) : my_ptr(src) {} + tbb_exception_ptr ( const captured_exception& src ) : my_ptr(std::copy_exception(src)) {} +}; // class tbb::internal::tbb_exception_ptr + +} // namespace internal +#endif /* !TBB_USE_CAPTURED_EXCEPTION */ + +} // namespace tbb + +#endif /* __TBB_EXCEPTIONS */ + +#endif /* __TBB_exception_H */ diff --git a/dep/tbb/include/tbb/tbb_machine.h b/dep/tbb/include/tbb/tbb_machine.h new file mode 100644 index 000000000..0673f2424 --- /dev/null +++ b/dep/tbb/include/tbb/tbb_machine.h @@ -0,0 +1,592 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#ifndef __TBB_machine_H +#define __TBB_machine_H + +#include "tbb_stddef.h" + +#if _WIN32||_WIN64 + +#ifdef _MANAGED +#pragma managed(push, off) +#endif + +#if __MINGW32__ +#include "machine/linux_ia32.h" +extern "C" __declspec(dllimport) int __stdcall SwitchToThread( void ); +#define __TBB_Yield() SwitchToThread() +#elif defined(_M_IX86) +#include "machine/windows_ia32.h" +#elif defined(_M_AMD64) +#include "machine/windows_intel64.h" +#else +#error Unsupported platform +#endif + +#ifdef _MANAGED +#pragma managed(pop) +#endif + +#elif __linux__ || __FreeBSD__ + +#if __i386__ +#include "machine/linux_ia32.h" +#elif __x86_64__ +#include "machine/linux_intel64.h" +#elif __ia64__ +#include "machine/linux_ia64.h" +#endif + +#elif __APPLE__ + +#if __i386__ +#include "machine/linux_ia32.h" +#elif __x86_64__ +#include "machine/linux_intel64.h" +#elif __POWERPC__ +#include "machine/mac_ppc.h" +#endif + +#elif _AIX + +#include "machine/ibm_aix51.h" + +#elif __sun || __SUNPRO_CC + +#define __asm__ asm +#define __volatile__ volatile +#if __i386 || __i386__ +#include "machine/linux_ia32.h" +#elif __x86_64__ +#include "machine/linux_intel64.h" +#endif + +#endif + +#if !defined(__TBB_CompareAndSwap4) \ + || !defined(__TBB_CompareAndSwap8) \ + || !defined(__TBB_Yield) \ + || !defined(__TBB_release_consistency_helper) +#error Minimal requirements for tbb_machine.h not satisfied +#endif + +#ifndef __TBB_load_with_acquire + //! Load with acquire semantics; i.e., no following memory operation can move above the load. + template + inline T __TBB_load_with_acquire(const volatile T& location) { + T temp = location; + __TBB_release_consistency_helper(); + return temp; + } +#endif + +#ifndef __TBB_store_with_release + //! Store with release semantics; i.e., no prior memory operation can move below the store. + template + inline void __TBB_store_with_release(volatile T& location, V value) { + __TBB_release_consistency_helper(); + location = T(value); + } +#endif + +#ifndef __TBB_Pause + inline void __TBB_Pause(int32_t) { + __TBB_Yield(); + } +#endif + +namespace tbb { +namespace internal { + +//! Class that implements exponential backoff. +/** See implementation of spin_wait_while_eq for an example. */ +class atomic_backoff { + //! Time delay, in units of "pause" instructions. + /** Should be equal to approximately the number of "pause" instructions + that take the same time as an context switch. */ + static const int32_t LOOPS_BEFORE_YIELD = 16; + int32_t count; +public: + atomic_backoff() : count(1) {} + + //! Pause for a while. + void pause() { + if( count<=LOOPS_BEFORE_YIELD ) { + __TBB_Pause(count); + // Pause twice as long the next time. + count*=2; + } else { + // Pause is so long that we might as well yield CPU to scheduler. + __TBB_Yield(); + } + } + + // pause for a few times and then return false immediately. + bool bounded_pause() { + if( count<=LOOPS_BEFORE_YIELD ) { + __TBB_Pause(count); + // Pause twice as long the next time. + count*=2; + return true; + } else { + return false; + } + } + + void reset() { + count = 1; + } +}; + +//! Spin WHILE the value of the variable is equal to a given value +/** T and U should be comparable types. */ +template +void spin_wait_while_eq( const volatile T& location, U value ) { + atomic_backoff backoff; + while( location==value ) backoff.pause(); +} + +//! Spin UNTIL the value of the variable is equal to a given value +/** T and U should be comparable types. 
*/
+template<typename T, typename U>
+void spin_wait_until_eq( const volatile T& location, const U value ) {
+    atomic_backoff backoff;
+    while( location!=value ) backoff.pause();
+}
+
+// T should be unsigned, otherwise sign propagation will break correctness of bit manipulations.
+// S should be either 1 or 2, for the mask calculation to work correctly.
+// Together, these rules limit applicability of Masked CAS to unsigned char and unsigned short.
+template<size_t S, typename T>
+inline T __TBB_MaskedCompareAndSwap (volatile T *ptr, T value, T comparand ) {
+    volatile uint32_t * base = (uint32_t*)( (uintptr_t)ptr & ~(uintptr_t)0x3 );
+#if __TBB_BIG_ENDIAN
+    const uint8_t bitoffset = uint8_t( 8*( 4-S - (uintptr_t(ptr) & 0x3) ) );
+#else
+    const uint8_t bitoffset = uint8_t( 8*((uintptr_t)ptr & 0x3) );
+#endif
+    const uint32_t mask = ( (1<<(S*8)) - 1 )<<bitoffset;
+    atomic_backoff b;
+    uint32_t result;
+    for(;;) {
+        result = *base; // reload the base value which might change during the pause
+        uint32_t old_value = ( result & ~mask ) | ( comparand << bitoffset );
+        uint32_t new_value = ( result & ~mask ) | ( value << bitoffset );
+        // __TBB_CompareAndSwap4 presumed to have full fence.
+        result = __TBB_CompareAndSwap4( base, new_value, old_value );
+        if(  result==old_value               // CAS succeeded
+          || ((result^old_value)&mask)!=0 )  // CAS failed and the bits of interest have changed
+            break;
+        else                                 // CAS failed but the bits of interest left unchanged
+            b.pause();
+    }
+    return T((result & mask) >> bitoffset);
+}
+
+template<size_t S, typename T>
+inline T __TBB_CompareAndSwapGeneric (volatile void *ptr, T value, T comparand ) {
+    return __TBB_CompareAndSwapW((T *)ptr,value,comparand);
+}
+
+template<>
+inline uint8_t __TBB_CompareAndSwapGeneric <1,uint8_t> (volatile void *ptr, uint8_t value, uint8_t comparand ) {
+#ifdef __TBB_CompareAndSwap1
+    return __TBB_CompareAndSwap1(ptr,value,comparand);
+#else
+    return __TBB_MaskedCompareAndSwap<1,uint8_t>((volatile uint8_t *)ptr,value,comparand);
+#endif
+}
+
+template<>
+inline uint16_t __TBB_CompareAndSwapGeneric <2,uint16_t> (volatile void *ptr, uint16_t value, uint16_t comparand ) {
+#ifdef __TBB_CompareAndSwap2
+    return __TBB_CompareAndSwap2(ptr,value,comparand);
+#else
+    return __TBB_MaskedCompareAndSwap<2,uint16_t>((volatile uint16_t *)ptr,value,comparand);
+#endif
+}
+
+template<>
+inline uint32_t __TBB_CompareAndSwapGeneric <4,uint32_t> (volatile void *ptr, uint32_t value, uint32_t comparand ) {
+    return __TBB_CompareAndSwap4(ptr,value,comparand);
+}
+
+template<>
+inline uint64_t __TBB_CompareAndSwapGeneric <8,uint64_t> (volatile void *ptr, uint64_t value, uint64_t comparand ) {
+    return __TBB_CompareAndSwap8(ptr,value,comparand);
+}
+
+template<size_t S, typename T>
+inline T __TBB_FetchAndAddGeneric (volatile void *ptr, T addend) {
+    atomic_backoff b;
+    T result;
+    for(;;) {
+        result = *reinterpret_cast<volatile T *>(ptr);
+        // __TBB_CompareAndSwapGeneric presumed to have full fence.
+        if( __TBB_CompareAndSwapGeneric<S,T> ( ptr, result+addend, result )==result )
+            break;
+        b.pause();
+    }
+    return result;
+}
+
+template<size_t S, typename T>
+inline T __TBB_FetchAndStoreGeneric (volatile void *ptr, T value) {
+    atomic_backoff b;
+    T result;
+    for(;;) {
+        result = *reinterpret_cast<volatile T *>(ptr);
+        // __TBB_CompareAndSwapGeneric presumed to have full fence.
+        if( __TBB_CompareAndSwapGeneric<S,T> ( ptr, value, result )==result )
+            break;
+        b.pause();
+    }
+    return result;
+}
+
+// Macro __TBB_TypeWithAlignmentAtLeastAsStrict(T) should be a type with alignment at least as
+// strict as type T. The type should have a trivial default constructor and destructor, so that
+// arrays of that type can be declared without initializers.
+// It is correct (but perhaps a waste of space) if __TBB_TypeWithAlignmentAtLeastAsStrict(T) expands
+// to a type bigger than T.
+// The default definition here works on machines where integers are naturally aligned and the
+// strictest alignment is 16.
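[Editorial illustration before the alignment helpers below, not part of tbb_machine.h: the same backoff-plus-CAS retry pattern that __TBB_FetchAndAddGeneric and __TBB_FetchAndStoreGeneric use above, applied to an operation the header does not provide. example_fetch_and_max is a made-up name; __TBB_CompareAndSwap4 and atomic_backoff are the primitives defined above.]

    // Hypothetical helper built on the primitives above; not part of TBB.
    inline uint32_t example_fetch_and_max( volatile uint32_t* location, uint32_t candidate ) {
        tbb::internal::atomic_backoff b;
        for(;;) {
            uint32_t snapshot = *location;
            if( snapshot >= candidate )
                return snapshot;                  // already large enough, nothing to publish
            // __TBB_CompareAndSwap4 returns the previous value; success iff it equals snapshot.
            if( uint32_t(__TBB_CompareAndSwap4( location, candidate, snapshot )) == snapshot )
                return snapshot;
            b.pause();                            // lost the race: back off and retry
        }
    }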
+#ifndef __TBB_TypeWithAlignmentAtLeastAsStrict
+
+#if __GNUC__ || __SUNPRO_CC
+struct __TBB_machine_type_with_strictest_alignment {
+    int member[4];
+} __attribute__((aligned(16)));
+#elif _MSC_VER
+__declspec(align(16)) struct __TBB_machine_type_with_strictest_alignment {
+    int member[4];
+};
+#else
+#error Must define __TBB_TypeWithAlignmentAtLeastAsStrict(T) or __TBB_machine_type_with_strictest_alignment
+#endif
+
+template<size_t N> struct type_with_alignment {__TBB_machine_type_with_strictest_alignment member;};
+template<> struct type_with_alignment<1> { char member; };
+template<> struct type_with_alignment<2> { uint16_t member; };
+template<> struct type_with_alignment<4> { uint32_t member; };
+template<> struct type_with_alignment<8> { uint64_t member; };
+
+#if _MSC_VER||defined(__GNUC__)&&__GNUC__==3 && __GNUC_MINOR__<=2
+//! Work around for bug in GNU 3.2 and MSVC compilers.
+/** Bug is that compiler sometimes returns 0 for __alignof(T) when T has not yet been instantiated.
+    The work-around forces instantiation by forcing computation of sizeof(T) before __alignof(T). */
+template<size_t Size, typename T>
+struct work_around_alignment_bug {
+#if _MSC_VER
+    static const size_t alignment = __alignof(T);
+#else
+    static const size_t alignment = __alignof__(T);
+#endif
+};
+#define __TBB_TypeWithAlignmentAtLeastAsStrict(T) tbb::internal::type_with_alignment<tbb::internal::work_around_alignment_bug<sizeof(T),T>::alignment>
+#elif __GNUC__ || __SUNPRO_CC
+#define __TBB_TypeWithAlignmentAtLeastAsStrict(T) tbb::internal::type_with_alignment<__alignof__(T)>
+#else
+#define __TBB_TypeWithAlignmentAtLeastAsStrict(T) __TBB_machine_type_with_strictest_alignment
+#endif
+#endif  /* ____TBB_TypeWithAlignmentAtLeastAsStrict */
+
+} // namespace internal
+} // namespace tbb
+
+#ifndef __TBB_CompareAndSwap1
+#define __TBB_CompareAndSwap1 tbb::internal::__TBB_CompareAndSwapGeneric<1,uint8_t>
+#endif
+
+#ifndef __TBB_CompareAndSwap2
+#define __TBB_CompareAndSwap2 tbb::internal::__TBB_CompareAndSwapGeneric<2,uint16_t>
+#endif
+
+#ifndef __TBB_CompareAndSwapW
+#define __TBB_CompareAndSwapW tbb::internal::__TBB_CompareAndSwapGeneric<sizeof(ptrdiff_t),ptrdiff_t>
+#endif
+
+#ifndef __TBB_FetchAndAdd1
+#define __TBB_FetchAndAdd1 tbb::internal::__TBB_FetchAndAddGeneric<1,uint8_t>
+#endif
+
+#ifndef __TBB_FetchAndAdd2
+#define __TBB_FetchAndAdd2 tbb::internal::__TBB_FetchAndAddGeneric<2,uint16_t>
+#endif
+
+#ifndef __TBB_FetchAndAdd4
+#define __TBB_FetchAndAdd4 tbb::internal::__TBB_FetchAndAddGeneric<4,uint32_t>
+#endif
+
+#ifndef __TBB_FetchAndAdd8
+#define __TBB_FetchAndAdd8 tbb::internal::__TBB_FetchAndAddGeneric<8,uint64_t>
+#endif
+
+#ifndef __TBB_FetchAndAddW
+#define __TBB_FetchAndAddW tbb::internal::__TBB_FetchAndAddGeneric<sizeof(ptrdiff_t),ptrdiff_t>
+#endif
+
+#ifndef __TBB_FetchAndStore1
+#define __TBB_FetchAndStore1 tbb::internal::__TBB_FetchAndStoreGeneric<1,uint8_t>
+#endif
+
+#ifndef __TBB_FetchAndStore2
+#define __TBB_FetchAndStore2 tbb::internal::__TBB_FetchAndStoreGeneric<2,uint16_t>
+#endif
+
+#ifndef __TBB_FetchAndStore4
+#define __TBB_FetchAndStore4 tbb::internal::__TBB_FetchAndStoreGeneric<4,uint32_t>
+#endif
+
+#ifndef __TBB_FetchAndStore8
+#define __TBB_FetchAndStore8 tbb::internal::__TBB_FetchAndStoreGeneric<8,uint64_t>
+#endif
+
+#ifndef __TBB_FetchAndStoreW
+#define __TBB_FetchAndStoreW tbb::internal::__TBB_FetchAndStoreGeneric<sizeof(ptrdiff_t),ptrdiff_t>
+#endif
+
+#if __TBB_DECL_FENCED_ATOMICS
+
+#ifndef __TBB_CompareAndSwap1__TBB_full_fence
+#define __TBB_CompareAndSwap1__TBB_full_fence __TBB_CompareAndSwap1
+#endif
+#ifndef __TBB_CompareAndSwap1acquire
+#define __TBB_CompareAndSwap1acquire __TBB_CompareAndSwap1__TBB_full_fence
+#endif
+#ifndef
__TBB_CompareAndSwap1release +#define __TBB_CompareAndSwap1release __TBB_CompareAndSwap1__TBB_full_fence +#endif + +#ifndef __TBB_CompareAndSwap2__TBB_full_fence +#define __TBB_CompareAndSwap2__TBB_full_fence __TBB_CompareAndSwap2 +#endif +#ifndef __TBB_CompareAndSwap2acquire +#define __TBB_CompareAndSwap2acquire __TBB_CompareAndSwap2__TBB_full_fence +#endif +#ifndef __TBB_CompareAndSwap2release +#define __TBB_CompareAndSwap2release __TBB_CompareAndSwap2__TBB_full_fence +#endif + +#ifndef __TBB_CompareAndSwap4__TBB_full_fence +#define __TBB_CompareAndSwap4__TBB_full_fence __TBB_CompareAndSwap4 +#endif +#ifndef __TBB_CompareAndSwap4acquire +#define __TBB_CompareAndSwap4acquire __TBB_CompareAndSwap4__TBB_full_fence +#endif +#ifndef __TBB_CompareAndSwap4release +#define __TBB_CompareAndSwap4release __TBB_CompareAndSwap4__TBB_full_fence +#endif + +#ifndef __TBB_CompareAndSwap8__TBB_full_fence +#define __TBB_CompareAndSwap8__TBB_full_fence __TBB_CompareAndSwap8 +#endif +#ifndef __TBB_CompareAndSwap8acquire +#define __TBB_CompareAndSwap8acquire __TBB_CompareAndSwap8__TBB_full_fence +#endif +#ifndef __TBB_CompareAndSwap8release +#define __TBB_CompareAndSwap8release __TBB_CompareAndSwap8__TBB_full_fence +#endif + +#ifndef __TBB_FetchAndAdd1__TBB_full_fence +#define __TBB_FetchAndAdd1__TBB_full_fence __TBB_FetchAndAdd1 +#endif +#ifndef __TBB_FetchAndAdd1acquire +#define __TBB_FetchAndAdd1acquire __TBB_FetchAndAdd1__TBB_full_fence +#endif +#ifndef __TBB_FetchAndAdd1release +#define __TBB_FetchAndAdd1release __TBB_FetchAndAdd1__TBB_full_fence +#endif + +#ifndef __TBB_FetchAndAdd2__TBB_full_fence +#define __TBB_FetchAndAdd2__TBB_full_fence __TBB_FetchAndAdd2 +#endif +#ifndef __TBB_FetchAndAdd2acquire +#define __TBB_FetchAndAdd2acquire __TBB_FetchAndAdd2__TBB_full_fence +#endif +#ifndef __TBB_FetchAndAdd2release +#define __TBB_FetchAndAdd2release __TBB_FetchAndAdd2__TBB_full_fence +#endif + +#ifndef __TBB_FetchAndAdd4__TBB_full_fence +#define __TBB_FetchAndAdd4__TBB_full_fence __TBB_FetchAndAdd4 +#endif +#ifndef __TBB_FetchAndAdd4acquire +#define __TBB_FetchAndAdd4acquire __TBB_FetchAndAdd4__TBB_full_fence +#endif +#ifndef __TBB_FetchAndAdd4release +#define __TBB_FetchAndAdd4release __TBB_FetchAndAdd4__TBB_full_fence +#endif + +#ifndef __TBB_FetchAndAdd8__TBB_full_fence +#define __TBB_FetchAndAdd8__TBB_full_fence __TBB_FetchAndAdd8 +#endif +#ifndef __TBB_FetchAndAdd8acquire +#define __TBB_FetchAndAdd8acquire __TBB_FetchAndAdd8__TBB_full_fence +#endif +#ifndef __TBB_FetchAndAdd8release +#define __TBB_FetchAndAdd8release __TBB_FetchAndAdd8__TBB_full_fence +#endif + +#ifndef __TBB_FetchAndStore1__TBB_full_fence +#define __TBB_FetchAndStore1__TBB_full_fence __TBB_FetchAndStore1 +#endif +#ifndef __TBB_FetchAndStore1acquire +#define __TBB_FetchAndStore1acquire __TBB_FetchAndStore1__TBB_full_fence +#endif +#ifndef __TBB_FetchAndStore1release +#define __TBB_FetchAndStore1release __TBB_FetchAndStore1__TBB_full_fence +#endif + +#ifndef __TBB_FetchAndStore2__TBB_full_fence +#define __TBB_FetchAndStore2__TBB_full_fence __TBB_FetchAndStore2 +#endif +#ifndef __TBB_FetchAndStore2acquire +#define __TBB_FetchAndStore2acquire __TBB_FetchAndStore2__TBB_full_fence +#endif +#ifndef __TBB_FetchAndStore2release +#define __TBB_FetchAndStore2release __TBB_FetchAndStore2__TBB_full_fence +#endif + +#ifndef __TBB_FetchAndStore4__TBB_full_fence +#define __TBB_FetchAndStore4__TBB_full_fence __TBB_FetchAndStore4 +#endif +#ifndef __TBB_FetchAndStore4acquire +#define __TBB_FetchAndStore4acquire __TBB_FetchAndStore4__TBB_full_fence 
+#endif +#ifndef __TBB_FetchAndStore4release +#define __TBB_FetchAndStore4release __TBB_FetchAndStore4__TBB_full_fence +#endif + +#ifndef __TBB_FetchAndStore8__TBB_full_fence +#define __TBB_FetchAndStore8__TBB_full_fence __TBB_FetchAndStore8 +#endif +#ifndef __TBB_FetchAndStore8acquire +#define __TBB_FetchAndStore8acquire __TBB_FetchAndStore8__TBB_full_fence +#endif +#ifndef __TBB_FetchAndStore8release +#define __TBB_FetchAndStore8release __TBB_FetchAndStore8__TBB_full_fence +#endif + +#endif // __TBB_DECL_FENCED_ATOMICS + +// Special atomic functions +#ifndef __TBB_FetchAndAddWrelease +#define __TBB_FetchAndAddWrelease __TBB_FetchAndAddW +#endif + +#ifndef __TBB_FetchAndIncrementWacquire +#define __TBB_FetchAndIncrementWacquire(P) __TBB_FetchAndAddW(P,1) +#endif + +#ifndef __TBB_FetchAndDecrementWrelease +#define __TBB_FetchAndDecrementWrelease(P) __TBB_FetchAndAddW(P,(-1)) +#endif + +#if __TBB_WORDSIZE==4 +// On 32-bit platforms, "atomic.h" requires definition of __TBB_Store8 and __TBB_Load8 +#ifndef __TBB_Store8 +inline void __TBB_Store8 (volatile void *ptr, int64_t value) { + tbb::internal::atomic_backoff b; + for(;;) { + int64_t result = *(int64_t *)ptr; + if( __TBB_CompareAndSwap8(ptr,value,result)==result ) break; + b.pause(); + } +} +#endif + +#ifndef __TBB_Load8 +inline int64_t __TBB_Load8 (const volatile void *ptr) { + int64_t result = *(int64_t *)ptr; + result = __TBB_CompareAndSwap8((volatile void *)ptr,result,result); + return result; +} +#endif +#endif /* __TBB_WORDSIZE==4 */ + +#ifndef __TBB_Log2 +inline intptr_t __TBB_Log2( uintptr_t x ) { + if( x==0 ) return -1; + intptr_t result = 0; + uintptr_t tmp; +#if __TBB_WORDSIZE>=8 + if( (tmp = x>>32) ) { x=tmp; result += 32; } +#endif + if( (tmp = x>>16) ) { x=tmp; result += 16; } + if( (tmp = x>>8) ) { x=tmp; result += 8; } + if( (tmp = x>>4) ) { x=tmp; result += 4; } + if( (tmp = x>>2) ) { x=tmp; result += 2; } + return (x&2)? result+1: result; +} +#endif + +#ifndef __TBB_AtomicOR +inline void __TBB_AtomicOR( volatile void *operand, uintptr_t addend ) { + tbb::internal::atomic_backoff b; + for(;;) { + uintptr_t tmp = *(volatile uintptr_t *)operand; + uintptr_t result = __TBB_CompareAndSwapW(operand, tmp|addend, tmp); + if( result==tmp ) break; + b.pause(); + } +} +#endif + +#ifndef __TBB_AtomicAND +inline void __TBB_AtomicAND( volatile void *operand, uintptr_t addend ) { + tbb::internal::atomic_backoff b; + for(;;) { + uintptr_t tmp = *(volatile uintptr_t *)operand; + uintptr_t result = __TBB_CompareAndSwapW(operand, tmp&addend, tmp); + if( result==tmp ) break; + b.pause(); + } +} +#endif + +#ifndef __TBB_TryLockByte +inline bool __TBB_TryLockByte( unsigned char &flag ) { + return __TBB_CompareAndSwap1(&flag,1,0)==0; +} +#endif + +#ifndef __TBB_LockByte +inline uintptr_t __TBB_LockByte( unsigned char& flag ) { + if ( !__TBB_TryLockByte(flag) ) { + tbb::internal::atomic_backoff b; + do { + b.pause(); + } while ( !__TBB_TryLockByte(flag) ); + } + return 0; +} +#endif + +#endif /* __TBB_machine_H */ diff --git a/dep/tbb/include/tbb/tbb_profiling.h b/dep/tbb/include/tbb/tbb_profiling.h new file mode 100644 index 000000000..f9c686d1d --- /dev/null +++ b/dep/tbb/include/tbb/tbb_profiling.h @@ -0,0 +1,105 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. 
+ + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_profiling_H +#define __TBB_profiling_H + +// Check if the tools support is enabled +#if (_WIN32||_WIN64||__linux__) && TBB_USE_THREADING_TOOLS + +#if _WIN32||_WIN64 +#include /* mbstowcs_s */ +#endif +#include "tbb_stddef.h" + +namespace tbb { + namespace internal { +#if _WIN32||_WIN64 + void __TBB_EXPORTED_FUNC itt_set_sync_name_v3( void *obj, const wchar_t* name ); + inline size_t multibyte_to_widechar( wchar_t* wcs, const char* mbs, size_t bufsize) { +#if _MSC_VER>=1400 + size_t len; + mbstowcs_s( &len, wcs, bufsize, mbs, _TRUNCATE ); + return len; // mbstowcs_s counts null terminator +#else + size_t len = mbstowcs( wcs, mbs, bufsize ); + if(wcs && len!=size_t(-1) ) + wcs[lenModules (groups of functionality) implemented by the library + * - Classes provided by the library + * - Files constituting the library. + * . + * Please note that significant part of TBB functionality is implemented in the form of + * template functions, descriptions of which are not accessible on the Classes + * tab. Use Modules or Namespace/Namespace Members + * tabs to find them. + * + * Additional pieces of information can be found here + * - \subpage concepts + * . + */ + +/** \page concepts TBB concepts + + A concept is a set of requirements to a type, which are necessary and sufficient + for the type to model a particular behavior or a set of behaviors. Some concepts + are specific to a particular algorithm (e.g. algorithm body), while other ones + are common to several algorithms (e.g. range concept). + + All TBB algorithms make use of different classes implementing various concepts. + Implementation classes are supplied by the user as type arguments of template + parameters and/or as objects passed as function call arguments. The library + provides predefined implementations of some concepts (e.g. several kinds of + \ref range_req "ranges"), while other ones must always be implemented by the user. + + TBB defines a set of minimal requirements each concept must conform to. 
Here is + the list of different concepts hyperlinked to the corresponding requirements specifications: + - \subpage range_req + - \subpage parallel_do_body_req + - \subpage parallel_for_body_req + - \subpage parallel_reduce_body_req + - \subpage parallel_scan_body_req + - \subpage parallel_sort_iter_req +**/ + +// Define preprocessor symbols used to determine architecture +#if _WIN32||_WIN64 +# if defined(_M_AMD64) +# define __TBB_x86_64 1 +# elif defined(_M_IA64) +# define __TBB_ipf 1 +# elif defined(_M_IX86)||defined(__i386__) // the latter for MinGW support +# define __TBB_x86_32 1 +# endif +#else /* Assume generic Unix */ +# if !__linux__ && !__APPLE__ +# define __TBB_generic_os 1 +# endif +# if __x86_64__ +# define __TBB_x86_64 1 +# elif __ia64__ +# define __TBB_ipf 1 +# elif __i386__||__i386 // __i386 is for Sun OS +# define __TBB_x86_32 1 +# else +# define __TBB_generic_arch 1 +# endif +#endif + +#if _MSC_VER +// define the parts of stdint.h that are needed, but put them inside tbb::internal +namespace tbb { +namespace internal { + typedef __int8 int8_t; + typedef __int16 int16_t; + typedef __int32 int32_t; + typedef __int64 int64_t; + typedef unsigned __int8 uint8_t; + typedef unsigned __int16 uint16_t; + typedef unsigned __int32 uint32_t; + typedef unsigned __int64 uint64_t; +} // namespace internal +} // namespace tbb +#else +#include +#endif /* _MSC_VER */ + +#if _MSC_VER >=1400 +#define __TBB_EXPORTED_FUNC __cdecl +#define __TBB_EXPORTED_METHOD __thiscall +#else +#define __TBB_EXPORTED_FUNC +#define __TBB_EXPORTED_METHOD +#endif + +#include /* Need size_t and ptrdiff_t (the latter on Windows only) from here. */ + +#if _MSC_VER +#define __TBB_tbb_windef_H +#include "_tbb_windef.h" +#undef __TBB_tbb_windef_H +#endif + +#include "tbb_config.h" + +namespace tbb { + //! Type for an assertion handler + typedef void(*assertion_handler_type)( const char* filename, int line, const char* expression, const char * comment ); +} + +#if TBB_USE_ASSERT + +//! Assert that x is true. +/** If x is false, print assertion failure message. + If the comment argument is not NULL, it is printed as part of the failure message. + The comment argument has no other effect. */ +#define __TBB_ASSERT(predicate,message) ((predicate)?((void)0):tbb::assertion_failure(__FILE__,__LINE__,#predicate,message)) +#define __TBB_ASSERT_EX __TBB_ASSERT + +namespace tbb { + //! Set assertion handler and return previous value of it. + assertion_handler_type __TBB_EXPORTED_FUNC set_assertion_handler( assertion_handler_type new_handler ); + + //! Process an assertion failure. + /** Normally called from __TBB_ASSERT macro. + If assertion handler is null, print message for assertion failure and abort. + Otherwise call the assertion handler. */ + void __TBB_EXPORTED_FUNC assertion_failure( const char* filename, int line, const char* expression, const char* comment ); +} // namespace tbb + +#else + +//! No-op version of __TBB_ASSERT. +#define __TBB_ASSERT(predicate,comment) ((void)0) +//! "Extended" version is useful to suppress warnings if a variable is only used with an assert +#define __TBB_ASSERT_EX(predicate,comment) ((void)(1 && (predicate))) + +#endif /* TBB_USE_ASSERT */ + +//! The namespace tbb contains all components of the library. +namespace tbb { + +//! The function returns the interface version of the TBB shared library being used. +/** + * The version it returns is determined at runtime, not at compile/link time. + * So it can be different than the value of TBB_INTERFACE_VERSION obtained at compile time. 
+ */ +extern "C" int __TBB_EXPORTED_FUNC TBB_runtime_interface_version(); + +//! Dummy type that distinguishes splitting constructor from copy constructor. +/** + * See description of parallel_for and parallel_reduce for example usages. + * @ingroup algorithms + */ +class split { +}; + +/** + * @cond INTERNAL + * @brief Identifiers declared inside namespace internal should never be used directly by client code. + */ +namespace internal { + +using std::size_t; + +//! An unsigned integral type big enough to hold a pointer. +/** There's no guarantee by the C++ standard that a size_t is really big enough, + but it happens to be for all platforms of interest. */ +typedef size_t uintptr; + +//! A signed integral type big enough to hold a pointer. +/** There's no guarantee by the C++ standard that a ptrdiff_t is really big enough, + but it happens to be for all platforms of interest. */ +typedef std::ptrdiff_t intptr; + +//! Compile-time constant that is upper bound on cache line/sector size. +/** It should be used only in situations where having a compile-time upper + bound is more useful than a run-time exact answer. + @ingroup memory_allocation */ +const size_t NFS_MaxLineSize = 128; + +//! Report a runtime warning. +void __TBB_EXPORTED_FUNC runtime_warning( const char* format, ... ); + +#if TBB_USE_ASSERT +//! Set p to invalid pointer value. +template +inline void poison_pointer( T* & p ) { + p = reinterpret_cast(-1); +} +#else +template +inline void poison_pointer( T* ) {/*do nothing*/} +#endif /* TBB_USE_ASSERT */ + +//! Base class for types that should not be assigned. +class no_assign { + // Deny assignment + void operator=( const no_assign& ); +public: +#if __GNUC__ + //! Explicitly define default construction, because otherwise gcc issues gratuitous warning. + no_assign() {} +#endif /* __GNUC__ */ +}; + +//! Base class for types that should not be copied or assigned. +class no_copy: no_assign { + //! Deny copy construction + no_copy( const no_copy& ); +public: + //! Allow default construction + no_copy() {} +}; + +//! Class for determining type of std::allocator::value_type. +template +struct allocator_type { + typedef T value_type; +}; + +#if _MSC_VER +//! Microsoft std::allocator has non-standard extension that strips const from a type. +template +struct allocator_type { + typedef T value_type; +}; +#endif + +// Struct to be used as a version tag for inline functions. +/** Version tag can be necessary to prevent loader on Linux from using the wrong + symbol in debug builds (when inline functions are compiled as out-of-line). **/ +struct version_tag_v3 {}; + +typedef version_tag_v3 version_tag; + +} // internal +//! @endcond + +} // tbb + +#endif /* RC_INVOKED */ +#endif /* __TBB_tbb_stddef_H */ diff --git a/dep/tbb/include/tbb/tbb_thread.h b/dep/tbb/include/tbb/tbb_thread.h new file mode 100644 index 000000000..6b40a9c04 --- /dev/null +++ b/dep/tbb/include/tbb/tbb_thread.h @@ -0,0 +1,294 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_tbb_thread_H +#define __TBB_tbb_thread_H + +#if _WIN32||_WIN64 +#include +#define __TBB_NATIVE_THREAD_ROUTINE unsigned WINAPI +#define __TBB_NATIVE_THREAD_ROUTINE_PTR(r) unsigned (WINAPI* r)( void* ) +#else +#define __TBB_NATIVE_THREAD_ROUTINE void* +#define __TBB_NATIVE_THREAD_ROUTINE_PTR(r) void* (*r)( void* ) +#include +#endif // _WIN32||_WIN64 + +#include +#include // Need std::terminate from here. +#include "tbb_stddef.h" +#include "tick_count.h" + +namespace tbb { + +//! @cond INTERNAL +namespace internal { + + class tbb_thread_v3; + +} // namespace internal + +void swap( internal::tbb_thread_v3& t1, internal::tbb_thread_v3& t2 ); + +namespace internal { + + //! Allocate a closure + void* __TBB_EXPORTED_FUNC allocate_closure_v3( size_t size ); + //! Free a closure allocated by allocate_closure_v3 + void __TBB_EXPORTED_FUNC free_closure_v3( void* ); + + struct thread_closure_base { + void* operator new( size_t size ) {return allocate_closure_v3(size);} + void operator delete( void* ptr ) {free_closure_v3(ptr);} + }; + + template struct thread_closure_0: thread_closure_base { + F function; + + static __TBB_NATIVE_THREAD_ROUTINE start_routine( void* c ) { + thread_closure_0 *self = static_cast(c); + try { + self->function(); + } catch ( ... ) { + std::terminate(); + } + delete self; + return 0; + } + thread_closure_0( const F& f ) : function(f) {} + }; + //! Structure used to pass user function with 1 argument to thread. + template struct thread_closure_1: thread_closure_base { + F function; + X arg1; + //! Routine passed to Windows's _beginthreadex by thread::internal_start() inside tbb.dll + static __TBB_NATIVE_THREAD_ROUTINE start_routine( void* c ) { + thread_closure_1 *self = static_cast(c); + try { + self->function(self->arg1); + } catch ( ... ) { + std::terminate(); + } + delete self; + return 0; + } + thread_closure_1( const F& f, const X& x ) : function(f), arg1(x) {} + }; + template struct thread_closure_2: thread_closure_base { + F function; + X arg1; + Y arg2; + //! Routine passed to Windows's _beginthreadex by thread::internal_start() inside tbb.dll + static __TBB_NATIVE_THREAD_ROUTINE start_routine( void* c ) { + thread_closure_2 *self = static_cast(c); + try { + self->function(self->arg1, self->arg2); + } catch ( ... ) { + std::terminate(); + } + delete self; + return 0; + } + thread_closure_2( const F& f, const X& x, const Y& y ) : function(f), arg1(x), arg2(y) {} + }; + + //! Versioned thread class. + class tbb_thread_v3 { + tbb_thread_v3(const tbb_thread_v3&); // = delete; // Deny access + public: +#if _WIN32||_WIN64 + typedef HANDLE native_handle_type; +#else + typedef pthread_t native_handle_type; +#endif // _WIN32||_WIN64 + + class id; + //! 
Constructs a thread object that does not represent a thread of execution. + tbb_thread_v3() : my_handle(0) +#if _WIN32||_WIN64 + , my_thread_id(0) +#endif // _WIN32||_WIN64 + {} + + //! Constructs an object and executes f() in a new thread + template explicit tbb_thread_v3(F f) { + typedef internal::thread_closure_0 closure_type; + internal_start(closure_type::start_routine, new closure_type(f)); + } + //! Constructs an object and executes f(x) in a new thread + template tbb_thread_v3(F f, X x) { + typedef internal::thread_closure_1 closure_type; + internal_start(closure_type::start_routine, new closure_type(f,x)); + } + //! Constructs an object and executes f(x,y) in a new thread + template tbb_thread_v3(F f, X x, Y y) { + typedef internal::thread_closure_2 closure_type; + internal_start(closure_type::start_routine, new closure_type(f,x,y)); + } + + tbb_thread_v3& operator=(tbb_thread_v3& x) { + if (joinable()) detach(); + my_handle = x.my_handle; + x.my_handle = 0; +#if _WIN32||_WIN64 + my_thread_id = x.my_thread_id; + x.my_thread_id = 0; +#endif // _WIN32||_WIN64 + return *this; + } + bool joinable() const {return my_handle!=0; } + //! The completion of the thread represented by *this happens before join() returns. + void __TBB_EXPORTED_METHOD join(); + //! When detach() returns, *this no longer represents the possibly continuing thread of execution. + void __TBB_EXPORTED_METHOD detach(); + ~tbb_thread_v3() {if( joinable() ) detach();} + inline id get_id() const; + native_handle_type native_handle() { return my_handle; } + + //! The number of hardware thread contexts. + static unsigned __TBB_EXPORTED_FUNC hardware_concurrency(); + private: + native_handle_type my_handle; +#if _WIN32||_WIN64 + DWORD my_thread_id; +#endif // _WIN32||_WIN64 + + /** Runs start_routine(closure) on another thread and sets my_handle to the handle of the created thread. 
*/ + void __TBB_EXPORTED_METHOD internal_start( __TBB_NATIVE_THREAD_ROUTINE_PTR(start_routine), + void* closure ); + friend void __TBB_EXPORTED_FUNC move_v3( tbb_thread_v3& t1, tbb_thread_v3& t2 ); + friend void tbb::swap( tbb_thread_v3& t1, tbb_thread_v3& t2 ); + }; + + class tbb_thread_v3::id { +#if _WIN32||_WIN64 + DWORD my_id; + id( DWORD my_id ) : my_id(my_id) {} +#else + pthread_t my_id; + id( pthread_t my_id ) : my_id(my_id) {} +#endif // _WIN32||_WIN64 + friend class tbb_thread_v3; + public: + id() : my_id(0) {} + + friend bool operator==( tbb_thread_v3::id x, tbb_thread_v3::id y ); + friend bool operator!=( tbb_thread_v3::id x, tbb_thread_v3::id y ); + friend bool operator<( tbb_thread_v3::id x, tbb_thread_v3::id y ); + friend bool operator<=( tbb_thread_v3::id x, tbb_thread_v3::id y ); + friend bool operator>( tbb_thread_v3::id x, tbb_thread_v3::id y ); + friend bool operator>=( tbb_thread_v3::id x, tbb_thread_v3::id y ); + + template + friend std::basic_ostream& + operator<< (std::basic_ostream &out, + tbb_thread_v3::id id) + { + out << id.my_id; + return out; + } + friend tbb_thread_v3::id __TBB_EXPORTED_FUNC thread_get_id_v3(); + }; // tbb_thread_v3::id + + tbb_thread_v3::id tbb_thread_v3::get_id() const { +#if _WIN32||_WIN64 + return id(my_thread_id); +#else + return id(my_handle); +#endif // _WIN32||_WIN64 + } + void __TBB_EXPORTED_FUNC move_v3( tbb_thread_v3& t1, tbb_thread_v3& t2 ); + tbb_thread_v3::id __TBB_EXPORTED_FUNC thread_get_id_v3(); + void __TBB_EXPORTED_FUNC thread_yield_v3(); + void __TBB_EXPORTED_FUNC thread_sleep_v3(const tick_count::interval_t &i); + + inline bool operator==(tbb_thread_v3::id x, tbb_thread_v3::id y) + { + return x.my_id == y.my_id; + } + inline bool operator!=(tbb_thread_v3::id x, tbb_thread_v3::id y) + { + return x.my_id != y.my_id; + } + inline bool operator<(tbb_thread_v3::id x, tbb_thread_v3::id y) + { + return x.my_id < y.my_id; + } + inline bool operator<=(tbb_thread_v3::id x, tbb_thread_v3::id y) + { + return x.my_id <= y.my_id; + } + inline bool operator>(tbb_thread_v3::id x, tbb_thread_v3::id y) + { + return x.my_id > y.my_id; + } + inline bool operator>=(tbb_thread_v3::id x, tbb_thread_v3::id y) + { + return x.my_id >= y.my_id; + } + +} // namespace internal; + +//! Users reference thread class by name tbb_thread +typedef internal::tbb_thread_v3 tbb_thread; + +using internal::operator==; +using internal::operator!=; +using internal::operator<; +using internal::operator>; +using internal::operator<=; +using internal::operator>=; + +inline void move( tbb_thread& t1, tbb_thread& t2 ) { + internal::move_v3(t1, t2); +} + +inline void swap( internal::tbb_thread_v3& t1, internal::tbb_thread_v3& t2 ) { + tbb::tbb_thread::native_handle_type h = t1.my_handle; + t1.my_handle = t2.my_handle; + t2.my_handle = h; +#if _WIN32||_WIN64 + DWORD i = t1.my_thread_id; + t1.my_thread_id = t2.my_thread_id; + t2.my_thread_id = i; +#endif /* _WIN32||_WIN64 */ +} + +namespace this_tbb_thread { + inline tbb_thread::id get_id() { return internal::thread_get_id_v3(); } + //! Offers the operating system the opportunity to schedule another thread. + inline void yield() { internal::thread_yield_v3(); } + //! The current thread blocks at least until the time specified. 
+ inline void sleep(const tick_count::interval_t &i) { + internal::thread_sleep_v3(i); + } +} // namespace this_tbb_thread + +} // namespace tbb + +#endif /* __TBB_tbb_thread_H */ diff --git a/dep/tbb/include/tbb/tbbmalloc_proxy.h b/dep/tbb/include/tbb/tbbmalloc_proxy.h new file mode 100644 index 000000000..ebde35886 --- /dev/null +++ b/dep/tbb/include/tbb/tbbmalloc_proxy.h @@ -0,0 +1,74 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +/* +Replacing the standard memory allocation routines in Microsoft* C/C++ RTL +(malloc/free, global new/delete, etc.) with the TBB memory allocator. + +Include the following header to a source of any binary which is loaded during +application startup + +#include "tbb/tbbmalloc_proxy.h" + +or add following parameters to the linker options for the binary which is +loaded during application startup. It can be either exe-file or dll. + +For win32 +tbbmalloc_proxy.lib /INCLUDE:"___TBB_malloc_proxy" +win64 +tbbmalloc_proxy.lib /INCLUDE:"__TBB_malloc_proxy" +*/ + +#ifndef __TBB_tbbmalloc_proxy_H +#define __TBB_tbbmalloc_proxy_H + +#if _MSC_VER + +#ifdef _DEBUG + #pragma comment(lib, "tbbmalloc_proxy_debug.lib") +#else + #pragma comment(lib, "tbbmalloc_proxy.lib") +#endif + +#if defined(_WIN64) + #pragma comment(linker, "/include:__TBB_malloc_proxy") +#else + #pragma comment(linker, "/include:___TBB_malloc_proxy") +#endif + +#else +/* Primarily to support MinGW */ + +extern "C" void __TBB_malloc_proxy(); +struct __TBB_malloc_proxy_caller { + __TBB_malloc_proxy_caller() { __TBB_malloc_proxy(); } +} volatile __TBB_malloc_proxy_helper_object; + +#endif // _MSC_VER + +#endif //__TBB_tbbmalloc_proxy_H diff --git a/dep/tbb/include/tbb/tick_count.h b/dep/tbb/include/tbb/tick_count.h new file mode 100644 index 000000000..495618278 --- /dev/null +++ b/dep/tbb/include/tbb/tick_count.h @@ -0,0 +1,155 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. 
+ + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_tick_count_H +#define __TBB_tick_count_H + +#include "tbb_stddef.h" + +#if _WIN32||_WIN64 +#include +#elif __linux__ +#include +#else /* generic Unix */ +#include +#endif /* (choice of OS) */ + +namespace tbb { + +//! Absolute timestamp +/** @ingroup timing */ +class tick_count { +public: + //! Relative time interval. + class interval_t { + long long value; + explicit interval_t( long long value_ ) : value(value_) {} + public: + //! Construct a time interval representing zero time duration + interval_t() : value(0) {}; + + //! Construct a time interval representing sec seconds time duration + explicit interval_t( double sec ); + + //! Return the length of a time interval in seconds + double seconds() const; + + friend class tbb::tick_count; + + //! Extract the intervals from the tick_counts and subtract them. + friend interval_t operator-( const tick_count& t1, const tick_count& t0 ); + + //! Add two intervals. + friend interval_t operator+( const interval_t& i, const interval_t& j ) { + return interval_t(i.value+j.value); + } + + //! Subtract two intervals. + friend interval_t operator-( const interval_t& i, const interval_t& j ) { + return interval_t(i.value-j.value); + } + + //! Accumulation operator + interval_t& operator+=( const interval_t& i ) {value += i.value; return *this;} + + //! Subtraction operator + interval_t& operator-=( const interval_t& i ) {value -= i.value; return *this;} + }; + + //! Construct an absolute timestamp initialized to zero. + tick_count() : my_count(0) {}; + + //! Return current time. + static tick_count now(); + + //! 
Subtract two timestamps to get the time interval between + friend interval_t operator-( const tick_count& t1, const tick_count& t0 ); + +private: + long long my_count; +}; + +inline tick_count tick_count::now() { + tick_count result; +#if _WIN32||_WIN64 + LARGE_INTEGER qpcnt; + QueryPerformanceCounter(&qpcnt); + result.my_count = qpcnt.QuadPart; +#elif __linux__ + struct timespec ts; +#if TBB_USE_ASSERT + int status = +#endif /* TBB_USE_ASSERT */ + clock_gettime( CLOCK_REALTIME, &ts ); + __TBB_ASSERT( status==0, "CLOCK_REALTIME not supported" ); + result.my_count = static_cast(1000000000UL)*static_cast(ts.tv_sec) + static_cast(ts.tv_nsec); +#else /* generic Unix */ + struct timeval tv; +#if TBB_USE_ASSERT + int status = +#endif /* TBB_USE_ASSERT */ + gettimeofday(&tv, NULL); + __TBB_ASSERT( status==0, "gettimeofday failed" ); + result.my_count = static_cast(1000000)*static_cast(tv.tv_sec) + static_cast(tv.tv_usec); +#endif /*(choice of OS) */ + return result; +} + +inline tick_count::interval_t::interval_t( double sec ) +{ +#if _WIN32||_WIN64 + LARGE_INTEGER qpfreq; + QueryPerformanceFrequency(&qpfreq); + value = static_cast(sec*qpfreq.QuadPart); +#elif __linux__ + value = static_cast(sec*1E9); +#else /* generic Unix */ + value = static_cast(sec*1E6); +#endif /* (choice of OS) */ +} + +inline tick_count::interval_t operator-( const tick_count& t1, const tick_count& t0 ) { + return tick_count::interval_t( t1.my_count-t0.my_count ); +} + +inline double tick_count::interval_t::seconds() const { +#if _WIN32||_WIN64 + LARGE_INTEGER qpfreq; + QueryPerformanceFrequency(&qpfreq); + return value/(double)qpfreq.QuadPart; +#elif __linux__ + return value*1E-9; +#else /* generic Unix */ + return value*1E-6; +#endif /* (choice of OS) */ +} + +} // namespace tbb + +#endif /* __TBB_tick_count_H */ + diff --git a/dep/tbb/index.html b/dep/tbb/index.html new file mode 100644 index 000000000..d35f39238 --- /dev/null +++ b/dep/tbb/index.html @@ -0,0 +1,44 @@ + + + +
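[Editorial illustration before the top-level index.html below, not part of the library sources: a usage sketch for the tick_count facility just added. timed_section is a made-up name; tick_count, now() and interval_t::seconds() are the TBB entities defined above.]

    #include "tbb/tick_count.h"
    #include <iostream>

    void timed_section() {
        tbb::tick_count t0 = tbb::tick_count::now();
        // ... code being measured ...
        tbb::tick_count t1 = tbb::tick_count::now();
        // operator- yields an interval_t; seconds() converts it to wall-clock seconds.
        std::cout << "elapsed: " << (t1 - t0).seconds() << " s" << std::endl;
    }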

Overview

+Top level directory for Threading Building Blocks (TBB). +

+To build TBB, use the top-level Makefile; see also the build directions. +To port TBB to a new platform, operating system or architecture, see the porting directions. +

+ +

Files

+
+
Makefile +
Top-level Makefile for TBB. See also the build directions. +
+ +

Directories

+
+
doc +
Documentation for the library. +
include +
Include files required for compiling code that uses the library. +
examples +
Examples of how to use the library. +
src +
Source code for the library. +
build +
Internal Makefile infrastructure for TBB. Do not use directly; see the build directions. +
ia32, intel64, ia64 +
Platform-specific binary files for the library. +
+ +
+

+Copyright © 2005-2009 Intel Corporation. All Rights Reserved. +

+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are +registered trademarks or trademarks of Intel Corporation or its +subsidiaries in the United States and other countries. +

+* Other names and brands may be claimed as the property of others. + + + diff --git a/dep/tbb/src/Makefile b/dep/tbb/src/Makefile new file mode 100644 index 000000000..c4ff8da30 --- /dev/null +++ b/dep/tbb/src/Makefile @@ -0,0 +1,219 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +tbb_root?=.. +examples_root:=$(tbb_root)/examples +include $(tbb_root)/build/common.inc +.PHONY: all tbb tbbmalloc test test_no_depends release debug examples clean + +all: release debug examples + +tbb: tbb_release tbb_debug + +tbbmalloc: tbbmalloc_release tbbmalloc_debug + +rml: rml_release rml_debug + +test: tbbmalloc_test_release test_release tbbmalloc_test_debug test_debug + +# Suffix _ni stands for "no ingnore", meaning that the first error during the test session will stop it +test_ni: tbbmalloc_test_release_ni test_release_ni tbbmalloc_test_debug_ni test_debug_ni + +test_no_depends: tbbmalloc_test_release_no_depends test_release_no_depends tbbmalloc_test_debug_no_depends test_debug_no_depends + @echo done + +release: tbb_release tbbmalloc_release +release: $(call cross_cfg,tbbmalloc_test_release) $(call cross_cfg,test_release) + +debug: tbb_debug tbbmalloc_debug +debug: $(call cross_cfg,tbbmalloc_test_debug) $(call cross_cfg, test_debug) + +examples: tbb tbbmalloc examples_debug clean_examples examples_release + +clean: clean_release clean_debug clean_examples + @echo clean done + +.PHONY: full +full: + $(MAKE) -s -i -r --no-print-directory -f Makefile tbb_root=. clean all +ifeq ($(tbb_os),windows) + $(MAKE) -s -i -r --no-print-directory -f Makefile tbb_root=. compiler=icl clean all native_examples +else + $(MAKE) -s -i -r --no-print-directory -f Makefile tbb_root=. compiler=icc clean all native_examples +endif +ifeq ($(arch),intel64) + $(MAKE) -s -i -r --no-print-directory -f Makefile tbb_root=. arch=ia32 clean all +endif +# it doesn't test compiler=icc arch=ia32 on intel64 systems due to enviroment settings of icc + +native_examples: tbb tbbmalloc + $(MAKE) -C $(examples_root) -r -f Makefile tbb_root=.. compiler=$(native_compiler) tbb_build_prefix=$(tbb_build_prefix) debug test + $(MAKE) -C $(examples_root) -r -f Makefile tbb_root=.. 
compiler=$(native_compiler) tbb_build_prefix=$(tbb_build_prefix) clean release test + +../examples/% examples/%:: + $(MAKE) -C $(examples_root) -r -f Makefile tbb_root=.. $(subst examples/,,$(subst ../,,$@)) + +debug_%:: cfg?=debug +debug_%:: run_cmd=$(debugger) +test_% stress_% time_%:: cfg?=release +debug_% test_% stress_% time_%:: + $(MAKE) -C "$(work_dir)_$(cfg)" -r -f $(tbb_root)/build/Makefile.test cfg=$(cfg) run_cmd="$(run_cmd)" tbb_root=$(tbb_root) $@ + +clean_%:: +ifeq ($(cfg),) + @$(MAKE) -C "$(work_dir)_release" -r -f $(tbb_root)/build/Makefile.test cfg=release tbb_root=$(tbb_root) $@ + @$(MAKE) -C "$(work_dir)_debug" -r -f $(tbb_root)/build/Makefile.test cfg=debug tbb_root=$(tbb_root) $@ +else + @$(MAKE) -C "$(work_dir)_$(cfg)" -r -f $(tbb_root)/build/Makefile.test cfg=$(cfg) tbb_root=$(tbb_root) $@ +endif + +.PHONY: tbb_release tbb_debug test_release test_debug test_release_no_depends test_debug_no_depends + +# do not delete double-space after -C option +tbb_release: mkdir_release + $(MAKE) -C "$(work_dir)_release" -r -f $(tbb_root)/build/Makefile.tbb cfg=release tbb_root=$(tbb_root) + +tbb_debug: mkdir_debug + $(MAKE) -C "$(work_dir)_debug" -r -f $(tbb_root)/build/Makefile.tbb cfg=debug tbb_root=$(tbb_root) + +test_release: $(call cross_cfg,mkdir_release) $(call cross_cfg,tbb_release) test_release_no_depends +test_release_no_depends: + -$(MAKE) -C "$(call cross_cfg,$(work_dir)_release)" -r -f $(tbb_root)/build/Makefile.test cfg=release tbb_root=$(tbb_root) + +test_debug: $(call cross_cfg,mkdir_debug) $(call cross_cfg,tbb_debug) test_debug_no_depends +test_debug_no_depends: + -$(MAKE) -C "$(call cross_cfg,$(work_dir)_debug)" -r -f $(tbb_root)/build/Makefile.test cfg=debug tbb_root=$(tbb_root) + +test_release_ni: + $(MAKE) -C "$(call cross_cfg,$(work_dir)_release)" -r -f $(tbb_root)/build/Makefile.test cfg=release tbb_root=$(tbb_root) + +test_debug_ni: + $(MAKE) -C "$(call cross_cfg,$(work_dir)_debug)" -r -f $(tbb_root)/build/Makefile.test cfg=debug tbb_root=$(tbb_root) + +.PHONY: tbbmalloc_release tbbmalloc_debug +.PHONY: tbbmalloc_dll_release tbbmalloc_dll_debug tbbmalloc_proxy_dll_release tbbmalloc_proxy_dll_debug +.PHONY: tbbmalloc_test_release tbbmalloc_test_debug tbbmalloc_test_release_no_depends tbbmalloc_test_debug_no_depends + +tbbmalloc_release: mkdir_release + $(MAKE) -C "$(work_dir)_release" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=release malloc tbb_root=$(tbb_root) + +tbbmalloc_debug: mkdir_debug + $(MAKE) -C "$(work_dir)_debug" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=debug malloc tbb_root=$(tbb_root) + +tbbmalloc_dll_release: mkdir_release + $(MAKE) -C "$(work_dir)_release" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=release malloc_dll tbb_root=$(tbb_root) + +tbbmalloc_proxy_dll_release: mkdir_release + $(MAKE) -C "$(work_dir)_release" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=release malloc_proxy_dll tbb_root=$(tbb_root) + +tbbmalloc_dll_debug: mkdir_debug + $(MAKE) -C "$(work_dir)_debug" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=debug malloc_dll tbb_root=$(tbb_root) + +tbbmalloc_proxy_dll_debug: mkdir_debug + $(MAKE) -C "$(work_dir)_debug" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=debug malloc_proxy_dll tbb_root=$(tbb_root) + +tbbmalloc_test_release: $(call cross_cfg,mkdir_release) $(call cross_cfg,tbbmalloc_release) tbbmalloc_test_release_no_depends +tbbmalloc_test_release_no_depends: + -$(MAKE) -C "$(call cross_cfg,$(work_dir)_release)" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=release malloc_test tbb_root=$(tbb_root) + 
+tbbmalloc_test_debug: $(call cross_cfg,mkdir_debug) $(call cross_cfg,tbbmalloc_debug) tbbmalloc_test_debug_no_depends +tbbmalloc_test_debug_no_depends: + -$(MAKE) -C "$(call cross_cfg,$(work_dir)_debug)" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=debug malloc_test tbb_root=$(tbb_root) + +tbbmalloc_test_release_ni: $(call cross_cfg,mkdir_release) $(call cross_cfg,tbbmalloc_release) tbbmalloc_test_release_no_depends + $(MAKE) -C "$(call cross_cfg,$(work_dir)_release)" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=release malloc_test tbb_root=$(tbb_root) + +tbbmalloc_test_debug_ni: $(call cross_cfg,mkdir_debug) $(call cross_cfg,tbbmalloc_debug) tbbmalloc_test_debug_no_depends + $(MAKE) -C "$(call cross_cfg,$(work_dir)_debug)" -r -f $(tbb_root)/build/Makefile.tbbmalloc cfg=debug malloc_test tbb_root=$(tbb_root) + +.PHONY: rml_release rml_debug rml_test_release rml_test_debug +.PHONY: rml_test_release_no_depends rml_test_debug_no_depends + +rml_release: mkdir_release + $(MAKE) -C "$(work_dir)_release" -r -f $(tbb_root)/build/Makefile.rml cfg=release tbb_root=$(tbb_root) rml + +rml_debug: mkdir_debug + $(MAKE) -C "$(work_dir)_debug" -r -f $(tbb_root)/build/Makefile.rml cfg=debug tbb_root=$(tbb_root) rml + +rml_test_release: $(call cross_cfg,mkdir_release) $(call cross_cfg,rml_release) rml_test_release_no_depends +rml_test_release_no_depends: + -$(MAKE) -C "$(call cross_cfg,$(work_dir)_release)" -r -f $(tbb_root)/build/Makefile.rml cfg=release rml_test tbb_root=$(tbb_root) + +rml_test_debug: $(call cross_cfg,mkdir_debug) $(call cross_cfg,rml_debug) rml_test_debug_no_depends +rml_test_debug_no_depends: + -$(MAKE) -C "$(call cross_cfg,$(work_dir)_debug)" -r -f $(tbb_root)/build/Makefile.rml cfg=debug rml_test tbb_root=$(tbb_root) + +.PHONY: examples_release examples_debug + +examples_release: tbb_release tbbmalloc_release + $(MAKE) -C $(examples_root) -r -f Makefile tbb_root=.. release test + +examples_debug: tbb_debug tbbmalloc_debug + $(MAKE) -C $(examples_root) -r -f Makefile tbb_root=.. debug test + +.PHONY: clean_release clean_debug clean_examples + +clean_release: + $(shell $(RM) $(work_dir)_release$(SLASH)*.* >$(NUL) 2>$(NUL)) + $(shell $(RD) $(work_dir)_release >$(NUL) 2>$(NUL)) + +clean_debug: + $(shell $(RM) $(work_dir)_debug$(SLASH)*.* >$(NUL) 2>$(NUL)) + $(shell $(RD) $(work_dir)_debug >$(NUL) 2>$(NUL)) + +clean_examples: + $(shell $(MAKE) -s -i -r -C $(examples_root) -f Makefile tbb_root=.. clean >$(NUL) 2>$(NUL)) + +.PHONY: mkdir_release mkdir_debug codecov do_codecov info + +mkdir_release: + $(shell $(MD) "$(work_dir)_release" >$(NUL) 2>$(NUL)) + $(if $(subst undefined,,$(origin_build_dir)),,cd "$(work_dir)_release" && $(MAKE_TBBVARS) $(tbb_build_prefix)_release) + +mkdir_debug: + $(shell $(MD) "$(work_dir)_debug" >$(NUL) 2>$(NUL)) + $(if $(subst undefined,,$(origin_build_dir)),,cd "$(work_dir)_debug" && $(MAKE_TBBVARS) $(tbb_build_prefix)_debug) + +codecov: compiler=$(if $(findstring windows,$(tbb_os)),icl,icc) +codecov: + $(MAKE) tbb_root=.. 
codecov=yes do_codecov + +do_codecov: + $(MAKE) RML=yes tbbmalloc_test_release test_release + $(MAKE) clean_test_* cfg=release + $(MAKE) RML=yes crosstest=yes tbbmalloc_test_debug test_debug + $(MAKE) clean_test_* cfg=release + $(MAKE) rml_test_release + $(MAKE) clean_test_* cfg=release + $(MAKE) crosstest=yes rml_test_debug + $(MAKE) -C "$(work_dir)_release" -r -f $(tbb_root)/build/Makefile.test tbb_root=$(tbb_root) cfg=release codecov=yes codecov_gen + +info: + @echo OS: $(tbb_os) + @echo arch=$(arch) + @echo compiler=$(compiler) + @echo runtime=$(runtime) + @echo tbb_build_prefix=$(tbb_build_prefix) diff --git a/dep/tbb/src/index.html b/dep/tbb/src/index.html new file mode 100644 index 000000000..5e53ce787 --- /dev/null +++ b/dep/tbb/src/index.html @@ -0,0 +1,32 @@ + + + +

+Overview
+
+This directory contains the source code and unit tests for Threading Building Blocks.
+
+Directories
+
+    tbb        - Source code of the TBB library core.
+    tbbmalloc  - Source code of the TBB scalable memory allocator.
+    test       - Source code of the TBB unit tests.
+    old        - Source code of deprecated TBB entities that are still shipped as part of
+                 the TBB library for the sake of backward compatibility.
+    rml        - Source code of the Resource Management Layer (RML).
+
+Up to parent directory
+
+Copyright © 2005-2009 Intel Corporation. All Rights Reserved.
+
+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are
+registered trademarks or trademarks of Intel Corporation or its
+subsidiaries in the United States and other countries.

+* Other names and brands may be claimed as the property of others. + + diff --git a/dep/tbb/src/old/concurrent_queue_v2.cpp b/dep/tbb/src/old/concurrent_queue_v2.cpp new file mode 100644 index 000000000..a6d0d6f4f --- /dev/null +++ b/dep/tbb/src/old/concurrent_queue_v2.cpp @@ -0,0 +1,382 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "concurrent_queue_v2.h" +#include "tbb/cache_aligned_allocator.h" +#include "tbb/spin_mutex.h" +#include "tbb/atomic.h" +#include +#include + +#if defined(_MSC_VER) && defined(_Wp64) + // Workaround for overzealous compiler warnings in /Wp64 mode + #pragma warning (disable: 4267) +#endif + +#define RECORD_EVENTS 0 + +using namespace std; + +namespace tbb { + +namespace internal { + +class concurrent_queue_rep; + +//! A queue using simple locking. +/** For efficient, this class has no constructor. + The caller is expected to zero-initialize it. */ +struct micro_queue { + typedef concurrent_queue_base::page page; + typedef size_t ticket; + + atomic head_page; + atomic head_counter; + + atomic tail_page; + atomic tail_counter; + + spin_mutex page_mutex; + + class push_finalizer: no_copy { + ticket my_ticket; + micro_queue& my_queue; + public: + push_finalizer( micro_queue& queue, ticket k ) : + my_ticket(k), my_queue(queue) + {} + ~push_finalizer() { + my_queue.tail_counter = my_ticket; + } + }; + + void push( const void* item, ticket k, concurrent_queue_base& base ); + + class pop_finalizer: no_copy { + ticket my_ticket; + micro_queue& my_queue; + page* my_page; + public: + pop_finalizer( micro_queue& queue, ticket k, page* p ) : + my_ticket(k), my_queue(queue), my_page(p) + {} + ~pop_finalizer() { + page* p = my_page; + if( p ) { + spin_mutex::scoped_lock lock( my_queue.page_mutex ); + page* q = p->next; + my_queue.head_page = q; + if( !q ) { + my_queue.tail_page = NULL; + } + } + my_queue.head_counter = my_ticket; + if( p ) + operator delete(p); + } + }; + + bool pop( void* dst, ticket k, concurrent_queue_base& base ); +}; + +//! Internal representation of a ConcurrentQueue. +/** For efficient, this class has no constructor. + The caller is expected to zero-initialize it. 
*/ +class concurrent_queue_rep { +public: + typedef size_t ticket; + +private: + friend struct micro_queue; + + //! Approximately n_queue/golden ratio + static const size_t phi = 3; + +public: + //! Must be power of 2 + static const size_t n_queue = 8; + + //! Map ticket to an array index + static size_t index( ticket k ) { + return k*phi%n_queue; + } + + atomic head_counter; + char pad1[NFS_MaxLineSize-sizeof(size_t)]; + + atomic tail_counter; + char pad2[NFS_MaxLineSize-sizeof(ticket)]; + micro_queue array[n_queue]; + + micro_queue& choose( ticket k ) { + // The formula here approximates LRU in a cache-oblivious way. + return array[index(k)]; + } + + //! Value for effective_capacity that denotes unbounded queue. + static const ptrdiff_t infinite_capacity = ptrdiff_t(~size_t(0)/2); +}; + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // unary minus operator applied to unsigned type, result still unsigned + #pragma warning( push ) + #pragma warning( disable: 4146 ) +#endif + +//------------------------------------------------------------------------ +// micro_queue +//------------------------------------------------------------------------ +void micro_queue::push( const void* item, ticket k, concurrent_queue_base& base ) { + k &= -concurrent_queue_rep::n_queue; + page* p = NULL; + size_t index = (k/concurrent_queue_rep::n_queue & base.items_per_page-1); + if( !index ) { + size_t n = sizeof(page) + base.items_per_page*base.item_size; + p = static_cast(operator new( n )); + p->mask = 0; + p->next = NULL; + } + { + push_finalizer finalizer( *this, k+concurrent_queue_rep::n_queue ); + spin_wait_until_eq( tail_counter, k ); + if( p ) { + spin_mutex::scoped_lock lock( page_mutex ); + if( page* q = tail_page ) + q->next = p; + else + head_page = p; + tail_page = p; + } else { + p = tail_page; + } + base.copy_item( *p, index, item ); + // If no exception was thrown, mark item as present. + p->mask |= uintptr(1)<1 ? item_size : 2); + my_rep = cache_aligned_allocator().allocate(1); + __TBB_ASSERT( (size_t)my_rep % NFS_GetLineSize()==0, "alignment error" ); + __TBB_ASSERT( (size_t)&my_rep->head_counter % NFS_GetLineSize()==0, "alignment error" ); + __TBB_ASSERT( (size_t)&my_rep->tail_counter % NFS_GetLineSize()==0, "alignment error" ); + __TBB_ASSERT( (size_t)&my_rep->array % NFS_GetLineSize()==0, "alignment error" ); + memset(my_rep,0,sizeof(concurrent_queue_rep)); + this->item_size = item_size; +} + +concurrent_queue_base::~concurrent_queue_base() { + size_t nq = my_rep->n_queue; + for( size_t i=0; iarray[i].tail_page; + __TBB_ASSERT( my_rep->array[i].head_page==tp, "at most one page should remain" ); + if( tp!=NULL ) + delete tp; + } + cache_aligned_allocator().deallocate(my_rep,1); +} + +void concurrent_queue_base::internal_push( const void* src ) { + concurrent_queue_rep& r = *my_rep; + concurrent_queue_rep::ticket k = r.tail_counter++; + ptrdiff_t e = my_capacity; + if( e(my_capacity); + } + } + r.choose(k).push(src,k,*this); +} + +void concurrent_queue_base::internal_pop( void* dst ) { + concurrent_queue_rep& r = *my_rep; + concurrent_queue_rep::ticket k; + do { + k = r.head_counter++; + } while( !r.choose(k).pop(dst,k,*this) ); +} + +bool concurrent_queue_base::internal_pop_if_present( void* dst ) { + concurrent_queue_rep& r = *my_rep; + concurrent_queue_rep::ticket k; + do { + atomic_backoff backoff; + for(;;) { + k = r.head_counter; + if( r.tail_counter<=k ) { + // Queue is empty + return false; + } + // Queue had item with ticket k when we looked. Attempt to get that item. 
+ if( r.head_counter.compare_and_swap(k+1,k)==k ) { + break; + } + // Another thread snatched the item, so pause and retry. + backoff.pause(); + } + } while( !r.choose(k).pop(dst,k,*this) ); + return true; +} + +bool concurrent_queue_base::internal_push_if_not_full( const void* src ) { + concurrent_queue_rep& r = *my_rep; + atomic_backoff backoff; + concurrent_queue_rep::ticket k; + for(;;) { + k = r.tail_counter; + if( (ptrdiff_t)(k-r.head_counter)>=my_capacity ) { + // Queue is full + return false; + } + // Queue had empty slot with ticket k when we looked. Attempt to claim that slot. + if( r.tail_counter.compare_and_swap(k+1,k)==k ) + break; + // Another thread claimed the slot, so pause and retry. + backoff.pause(); + } + r.choose(k).push(src,k,*this); + return true; +} + +ptrdiff_t concurrent_queue_base::internal_size() const { + __TBB_ASSERT( sizeof(ptrdiff_t)<=sizeof(size_t), NULL ); + return ptrdiff_t(my_rep->tail_counter-my_rep->head_counter); +} + +void concurrent_queue_base::internal_set_capacity( ptrdiff_t capacity, size_t /*item_size*/ ) { + my_capacity = capacity<0 ? concurrent_queue_rep::infinite_capacity : capacity; +} + +//------------------------------------------------------------------------ +// concurrent_queue_iterator_rep +//------------------------------------------------------------------------ +class concurrent_queue_iterator_rep: no_assign { +public: + typedef concurrent_queue_rep::ticket ticket; + ticket head_counter; + const concurrent_queue_base& my_queue; + concurrent_queue_base::page* array[concurrent_queue_rep::n_queue]; + concurrent_queue_iterator_rep( const concurrent_queue_base& queue ) : + head_counter(queue.my_rep->head_counter), + my_queue(queue) + { + const concurrent_queue_rep& rep = *queue.my_rep; + for( size_t k=0; ktail_counter ) + return NULL; + else { + concurrent_queue_base::page* p = array[concurrent_queue_rep::index(k)]; + __TBB_ASSERT(p,NULL); + size_t i = k/concurrent_queue_rep::n_queue & my_queue.items_per_page-1; + return static_cast(static_cast(p+1)) + my_queue.item_size*i; + } + } +}; + +//------------------------------------------------------------------------ +// concurrent_queue_iterator_base +//------------------------------------------------------------------------ +concurrent_queue_iterator_base::concurrent_queue_iterator_base( const concurrent_queue_base& queue ) { + my_rep = new concurrent_queue_iterator_rep(queue); + my_item = my_rep->choose(my_rep->head_counter); +} + +void concurrent_queue_iterator_base::assign( const concurrent_queue_iterator_base& other ) { + if( my_rep!=other.my_rep ) { + if( my_rep ) { + delete my_rep; + my_rep = NULL; + } + if( other.my_rep ) { + my_rep = new concurrent_queue_iterator_rep( *other.my_rep ); + } + } + my_item = other.my_item; +} + +void concurrent_queue_iterator_base::advance() { + __TBB_ASSERT( my_item, "attempt to increment iterator past end of queue" ); + size_t k = my_rep->head_counter; + const concurrent_queue_base& queue = my_rep->my_queue; + __TBB_ASSERT( my_item==my_rep->choose(k), NULL ); + size_t i = k/concurrent_queue_rep::n_queue & queue.items_per_page-1; + if( i==queue.items_per_page-1 ) { + concurrent_queue_base::page*& root = my_rep->array[concurrent_queue_rep::index(k)]; + root = root->next; + } + my_rep->head_counter = k+1; + my_item = my_rep->choose(k+1); +} + +concurrent_queue_iterator_base::~concurrent_queue_iterator_base() { + delete my_rep; + my_rep = NULL; +} + +} // namespace internal + +} // namespace tbb diff --git a/dep/tbb/src/old/concurrent_queue_v2.h 
b/dep/tbb/src/old/concurrent_queue_v2.h new file mode 100644 index 000000000..862384e32 --- /dev/null +++ b/dep/tbb/src/old/concurrent_queue_v2.h @@ -0,0 +1,328 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_concurrent_queue_H +#define __TBB_concurrent_queue_H + +#include "tbb/tbb_stddef.h" +#include + +namespace tbb { + +template class concurrent_queue; + +//! @cond INTERNAL +namespace internal { + +class concurrent_queue_rep; +class concurrent_queue_iterator_rep; +class concurrent_queue_iterator_base; +template class concurrent_queue_iterator; + +//! For internal use only. +/** Type-independent portion of concurrent_queue. + @ingroup containers */ +class concurrent_queue_base: no_copy { + //! Internal representation + concurrent_queue_rep* my_rep; + + friend class concurrent_queue_rep; + friend struct micro_queue; + friend class concurrent_queue_iterator_rep; + friend class concurrent_queue_iterator_base; +protected: + //! Prefix on a page + struct page { + page* next; + uintptr mask; + }; + + //! Capacity of the queue + ptrdiff_t my_capacity; + + //! Always a power of 2 + size_t items_per_page; + + //! Size of an item + size_t item_size; +private: + virtual void copy_item( page& dst, size_t index, const void* src ) = 0; + virtual void assign_and_destroy_item( void* dst, page& src, size_t index ) = 0; +protected: + __TBB_EXPORTED_METHOD concurrent_queue_base( size_t item_size ); + virtual __TBB_EXPORTED_METHOD ~concurrent_queue_base(); + + //! Enqueue item at tail of queue + void __TBB_EXPORTED_METHOD internal_push( const void* src ); + + //! Dequeue item from head of queue + void __TBB_EXPORTED_METHOD internal_pop( void* dst ); + + //! Attempt to enqueue item onto queue. + bool __TBB_EXPORTED_METHOD internal_push_if_not_full( const void* src ); + + //! Attempt to dequeue item from queue. + /** NULL if there was no item to dequeue. */ + bool __TBB_EXPORTED_METHOD internal_pop_if_present( void* dst ); + + //! Get size of queue + ptrdiff_t __TBB_EXPORTED_METHOD internal_size() const; + + void __TBB_EXPORTED_METHOD internal_set_capacity( ptrdiff_t capacity, size_t element_size ); +}; + +//! Type-independent portion of concurrent_queue_iterator. 
+/** @ingroup containers */ +class concurrent_queue_iterator_base { + //! Concurrentconcurrent_queue over which we are iterating. + /** NULL if one past last element in queue. */ + concurrent_queue_iterator_rep* my_rep; + + template + friend bool operator==( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ); + + template + friend bool operator!=( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ); +protected: + //! Pointer to current item + mutable void* my_item; + + //! Default constructor + __TBB_EXPORTED_METHOD concurrent_queue_iterator_base() : my_rep(NULL), my_item(NULL) {} + + //! Copy constructor + concurrent_queue_iterator_base( const concurrent_queue_iterator_base& i ) : my_rep(NULL), my_item(NULL) { + assign(i); + } + + //! Construct iterator pointing to head of queue. + concurrent_queue_iterator_base( const concurrent_queue_base& queue ); + + //! Assignment + void __TBB_EXPORTED_METHOD assign( const concurrent_queue_iterator_base& i ); + + //! Advance iterator one step towards tail of queue. + void __TBB_EXPORTED_METHOD advance(); + + //! Destructor + __TBB_EXPORTED_METHOD ~concurrent_queue_iterator_base(); +}; + +//! Meets requirements of a forward iterator for STL. +/** Value is either the T or const T type of the container. + @ingroup containers */ +template +class concurrent_queue_iterator: public concurrent_queue_iterator_base { +#if !defined(_MSC_VER) || defined(__INTEL_COMPILER) + template + friend class ::tbb::concurrent_queue; +#else +public: // workaround for MSVC +#endif + //! Construct iterator pointing to head of queue. + concurrent_queue_iterator( const concurrent_queue_base& queue ) : + concurrent_queue_iterator_base(queue) + { + } +public: + concurrent_queue_iterator() {} + + /** If Value==Container::value_type, then this routine is the copy constructor. + If Value==const Container::value_type, then this routine is a conversion constructor. */ + concurrent_queue_iterator( const concurrent_queue_iterator& other ) : + concurrent_queue_iterator_base(other) + {} + + //! Iterator assignment + concurrent_queue_iterator& operator=( const concurrent_queue_iterator& other ) { + assign(other); + return *this; + } + + //! Reference to current item + Value& operator*() const { + return *static_cast(my_item); + } + + Value* operator->() const {return &operator*();} + + //! Advance to next item in queue + concurrent_queue_iterator& operator++() { + advance(); + return *this; + } + + //! Post increment + Value* operator++(int) { + Value* result = &operator*(); + operator++(); + return result; + } +}; // concurrent_queue_iterator + +template +bool operator==( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ) { + return i.my_item==j.my_item; +} + +template +bool operator!=( const concurrent_queue_iterator& i, const concurrent_queue_iterator& j ) { + return i.my_item!=j.my_item; +} + +} // namespace internal; +//! @endcond + +//! A high-performance thread-safe queue. +/** Multiple threads may each push and pop concurrently. + Assignment and copy construction are not allowed. + @ingroup containers */ +template +class concurrent_queue: public internal::concurrent_queue_base { + template friend class internal::concurrent_queue_iterator; + + //! 
Class used to ensure exception-safety of method "pop" + class destroyer { + T& my_value; + public: + destroyer( T& value ) : my_value(value) {} + ~destroyer() {my_value.~T();} + }; + + T& get_ref( page& page, size_t index ) { + __TBB_ASSERT( index(static_cast(&page+1))[index]; + } + + /*override*/ virtual void copy_item( page& dst, size_t index, const void* src ) { + new( &get_ref(dst,index) ) T(*static_cast(src)); + } + + /*override*/ virtual void assign_and_destroy_item( void* dst, page& src, size_t index ) { + T& from = get_ref(src,index); + destroyer d(from); + *static_cast(dst) = from; + } + +public: + //! Element type in the queue. + typedef T value_type; + + //! Reference type + typedef T& reference; + + //! Const reference type + typedef const T& const_reference; + + //! Integral type for representing size of the queue. + /** Notice that the size_type is a signed integral type. + This is because the size can be negative if there are pending pops without corresponding pushes. */ + typedef std::ptrdiff_t size_type; + + //! Difference type for iterator + typedef std::ptrdiff_t difference_type; + + //! Construct empty queue + concurrent_queue() : + concurrent_queue_base( sizeof(T) ) + { + } + + //! Destroy queue + ~concurrent_queue(); + + //! Enqueue an item at tail of queue. + void push( const T& source ) { + internal_push( &source ); + } + + //! Dequeue item from head of queue. + /** Block until an item becomes available, and then dequeue it. */ + void pop( T& destination ) { + internal_pop( &destination ); + } + + //! Enqueue an item at tail of queue if queue is not already full. + /** Does not wait for queue to become not full. + Returns true if item is pushed; false if queue was already full. */ + bool push_if_not_full( const T& source ) { + return internal_push_if_not_full( &source ); + } + + //! Attempt to dequeue an item from head of queue. + /** Does not wait for item to become available. + Returns true if successful; false otherwise. */ + bool pop_if_present( T& destination ) { + return internal_pop_if_present( &destination ); + } + + //! Return number of pushes minus number of pops. + /** Note that the result can be negative if there are pops waiting for the + corresponding pushes. The result can also exceed capacity() if there + are push operations in flight. */ + size_type size() const {return internal_size();} + + //! Equivalent to size()<=0. + bool empty() const {return size()<=0;} + + //! Maximum number of allowed elements + size_type capacity() const { + return my_capacity; + } + + //! Set the capacity + /** Setting the capacity to 0 causes subsequent push_if_not_full operations to always fail, + and subsequent push operations to block forever. */ + void set_capacity( size_type capacity ) { + internal_set_capacity( capacity, sizeof(T) ); + } + + typedef internal::concurrent_queue_iterator iterator; + typedef internal::concurrent_queue_iterator const_iterator; + + //------------------------------------------------------------------------ + // The iterators are intended only for debugging. They are slow and not thread safe. 
+ //------------------------------------------------------------------------ + iterator begin() {return iterator(*this);} + iterator end() {return iterator();} + const_iterator begin() const {return const_iterator(*this);} + const_iterator end() const {return const_iterator();} + +}; + +template +concurrent_queue::~concurrent_queue() { + while( !empty() ) { + T value; + internal_pop(&value); + } +} + +} // namespace tbb + +#endif /* __TBB_concurrent_queue_H */ diff --git a/dep/tbb/src/old/concurrent_vector_v2.cpp b/dep/tbb/src/old/concurrent_vector_v2.cpp new file mode 100644 index 000000000..36186ea9a --- /dev/null +++ b/dep/tbb/src/old/concurrent_vector_v2.cpp @@ -0,0 +1,266 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "concurrent_vector_v2.h" +#include "tbb/tbb_machine.h" +#include +#include "../tbb/itt_notify.h" +#include "tbb/task.h" +#include + + +#if defined(_MSC_VER) && defined(_Wp64) + // Workaround for overzealous compiler warnings in /Wp64 mode + #pragma warning (disable: 4267) +#endif + +namespace tbb { + +namespace internal { + +void concurrent_vector_base::internal_grow_to_at_least( size_type new_size, size_type element_size, internal_array_op1 init ) { + size_type e = my_early_size; + while( e=pointers_per_short_segment && v.my_segment==v.my_storage ) { + extend_segment(v); + } + } +}; + +void concurrent_vector_base::helper::extend_segment( concurrent_vector_base& v ) { + const size_t pointers_per_long_segment = sizeof(void*)==4 ? 32 : 64; + segment_t* s = (segment_t*)NFS_Allocate( pointers_per_long_segment, sizeof(segment_t), NULL ); + std::memset( s, 0, pointers_per_long_segment*sizeof(segment_t) ); + // If other threads are trying to set pointers in the short segment, wait for them to finish their + // assigments before we copy the short segment to the long segment. 
+ atomic_backoff backoff; + while( !v.my_storage[0].array || !v.my_storage[1].array ) { + backoff.pause(); + } + s[0] = v.my_storage[0]; + s[1] = v.my_storage[1]; + if( v.my_segment.compare_and_swap( s, v.my_storage )!=v.my_storage ) + NFS_Free(s); +} + +concurrent_vector_base::size_type concurrent_vector_base::internal_capacity() const { + return segment_base( helper::find_segment_end(*this) ); +} + +void concurrent_vector_base::internal_reserve( size_type n, size_type element_size, size_type max_size ) { + if( n>max_size ) { + throw std::length_error("argument to ConcurrentVector::reserve exceeds ConcurrentVector::max_size()"); + } + for( segment_index_t k = helper::find_segment_end(*this); segment_base(k)n-b ) m = n-b; + copy( my_segment[k].array, src.my_segment[k].array, m ); + } + } +} + +void concurrent_vector_base::internal_assign( const concurrent_vector_base& src, size_type element_size, internal_array_op1 destroy, internal_array_op2 assign, internal_array_op2 copy ) { + size_type n = src.my_early_size; + while( my_early_size>n ) { + segment_index_t k = segment_index_of( my_early_size-1 ); + size_type b=segment_base(k); + size_type new_end = b>=n ? b : n; + __TBB_ASSERT( my_early_size>new_end, NULL ); + destroy( (char*)my_segment[k].array+element_size*(new_end-b), my_early_size-new_end ); + my_early_size = new_end; + } + size_type dst_initialized_size = my_early_size; + my_early_size = n; + size_type b; + for( segment_index_t k=0; (b=segment_base(k))n-b ) m = n-b; + size_type a = 0; + if( dst_initialized_size>b ) { + a = dst_initialized_size-b; + if( a>m ) a = m; + assign( my_segment[k].array, src.my_segment[k].array, a ); + m -= a; + a *= element_size; + } + if( m>0 ) + copy( (char*)my_segment[k].array+a, (char*)src.my_segment[k].array+a, m ); + } + __TBB_ASSERT( src.my_early_size==n, "detected use of ConcurrentVector::operator= with right side that was concurrently modified" ); +} + +void* concurrent_vector_base::internal_push_back( size_type element_size, size_type& index ) { + __TBB_ASSERT( sizeof(my_early_size)==sizeof(reference_count), NULL ); + //size_t tmp = __TBB_FetchAndIncrementWacquire(*(tbb::internal::reference_count*)&my_early_size); + size_t tmp = __TBB_FetchAndIncrementWacquire((tbb::internal::reference_count*)&my_early_size); + index = tmp; + segment_index_t k_old = segment_index_of( tmp ); + size_type base = segment_base(k_old); + helper::extend_segment_if_necessary(*this,k_old); + segment_t& s = my_segment[k_old]; + void* array = s.array; + if( !array ) { + // FIXME - consider factoring this out and share with internal_grow_by + if( base==tmp ) { + __TBB_ASSERT( !s.array, NULL ); + size_t n = segment_size(k_old); + array = NFS_Allocate( n, element_size, NULL ); + ITT_NOTIFY( sync_releasing, &s.array ); + s.array = array; + } else { + ITT_NOTIFY(sync_prepare, &s.array); + spin_wait_while_eq( s.array, (void*)0 ); + ITT_NOTIFY(sync_acquired, &s.array); + array = s.array; + } + } + size_type j_begin = tmp-base; + return (void*)((char*)array+element_size*j_begin); +} + +concurrent_vector_base::size_type concurrent_vector_base::internal_grow_by( size_type delta, size_type element_size, internal_array_op1 init ) { + size_type result = my_early_size.fetch_and_add(delta); + internal_grow( result, result+delta, element_size, init ); + return result; +} + +void concurrent_vector_base::internal_grow( const size_type start, size_type finish, size_type element_size, internal_array_op1 init ) { + __TBB_ASSERT( start finish-base ? 
finish-base : n; + (*init)( (void*)((char*)array+element_size*j_begin), j_end-j_begin ); + tmp = base+j_end; + } while( tmp0 ) { + segment_index_t k_old = segment_index_of(finish-1); + segment_t& s = my_segment[k_old]; + __TBB_ASSERT( s.array, NULL ); + size_type base = segment_base(k_old); + size_type j_end = finish-base; + __TBB_ASSERT( j_end, NULL ); + (*destroy)( s.array, j_end ); + finish = base; + } + + // Free the arrays + if( reclaim_storage ) { + size_t k = helper::find_segment_end(*this); + while( k>0 ) { + --k; + segment_t& s = my_segment[k]; + void* array = s.array; + s.array = NULL; + NFS_Free( array ); + } + // Clear short segment. + my_storage[0].array = NULL; + my_storage[1].array = NULL; + segment_t* s = my_segment; + if( s!=my_storage ) { + my_segment = my_storage; + NFS_Free( s ); + } + } +} + +} // namespace internal + +} // tbb diff --git a/dep/tbb/src/old/concurrent_vector_v2.h b/dep/tbb/src/old/concurrent_vector_v2.h new file mode 100644 index 000000000..a9c3a3be4 --- /dev/null +++ b/dep/tbb/src/old/concurrent_vector_v2.h @@ -0,0 +1,512 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_concurrent_vector_H +#define __TBB_concurrent_vector_H + +#include "tbb/tbb_stddef.h" +#include +#include +#include "tbb/atomic.h" +#include "tbb/cache_aligned_allocator.h" +#include "tbb/blocked_range.h" + +#include "tbb/tbb_machine.h" + +namespace tbb { + +template +class concurrent_vector; + +//! @cond INTERNAL +namespace internal { + + //! Base class of concurrent vector implementation. + /** @ingroup containers */ + class concurrent_vector_base { + protected: + typedef unsigned long segment_index_t; + + //! Log2 of "min_segment_size". + static const int lg_min_segment_size = 4; + + //! Minimum size (in physical items) of a segment. + static const int min_segment_size = segment_index_t(1)<>1< my_early_size; + + /** Can be zero-initialized. 
*/ + struct segment_t { + /** Declared volatile because in weak memory model, must have ld.acq/st.rel */ + void* volatile array; +#if TBB_DO_ASSERT + ~segment_t() { + __TBB_ASSERT( !array, "should have been set to NULL by clear" ); + } +#endif /* TBB_DO_ASSERT */ + }; + + atomic my_segment; + + segment_t my_storage[2]; + + concurrent_vector_base() { + my_early_size = 0; + my_storage[0].array = NULL; + my_storage[1].array = NULL; + my_segment = my_storage; + } + + //! An operation on an n-lement array starting at begin. + typedef void(__TBB_EXPORTED_FUNC *internal_array_op1)(void* begin, size_type n ); + + //! An operation on n-element destination array and n-element source array. + typedef void(__TBB_EXPORTED_FUNC *internal_array_op2)(void* dst, const void* src, size_type n ); + + void __TBB_EXPORTED_METHOD internal_grow_to_at_least( size_type new_size, size_type element_size, internal_array_op1 init ); + void internal_grow( size_type start, size_type finish, size_type element_size, internal_array_op1 init ); + size_type __TBB_EXPORTED_METHOD internal_grow_by( size_type delta, size_type element_size, internal_array_op1 init ); + void* __TBB_EXPORTED_METHOD internal_push_back( size_type element_size, size_type& index ); + void __TBB_EXPORTED_METHOD internal_clear( internal_array_op1 destroy, bool reclaim_storage ); + void __TBB_EXPORTED_METHOD internal_copy( const concurrent_vector_base& src, size_type element_size, internal_array_op2 copy ); + void __TBB_EXPORTED_METHOD internal_assign( const concurrent_vector_base& src, size_type element_size, + internal_array_op1 destroy, internal_array_op2 assign, internal_array_op2 copy ); +private: + //! Private functionality that does not cross DLL boundary. + class helper; + + friend class helper; + }; + + //! Meets requirements of a forward iterator for STL and a Value for a blocked_range.*/ + /** Value is either the T or const T type of the container. + @ingroup containers */ + template + class vector_iterator +#if defined(_WIN64) && defined(_MSC_VER) + // Ensure that Microsoft's internal template function _Val_type works correctly. + : public std::iterator +#endif /* defined(_WIN64) && defined(_MSC_VER) */ + { + //! concurrent_vector over which we are iterating. + Container* my_vector; + + //! Index into the vector + size_t my_index; + + //! Caches my_vector->internal_subscript(my_index) + /** NULL if cached value is not available */ + mutable Value* my_item; + + template + friend bool operator==( const vector_iterator& i, const vector_iterator& j ); + + template + friend bool operator<( const vector_iterator& i, const vector_iterator& j ); + + template + friend ptrdiff_t operator-( const vector_iterator& i, const vector_iterator& j ); + + template + friend class internal::vector_iterator; + +#if !defined(_MSC_VER) || defined(__INTEL_COMPILER) + template + friend class tbb::concurrent_vector; +#else +public: // workaround for MSVC +#endif + + vector_iterator( const Container& vector, size_t index ) : + my_vector(const_cast(&vector)), + my_index(index), + my_item(NULL) + {} + + public: + //! 
Default constructor + vector_iterator() : my_vector(NULL), my_index(~size_t(0)), my_item(NULL) {} + + vector_iterator( const vector_iterator& other ) : + my_vector(other.my_vector), + my_index(other.my_index), + my_item(other.my_item) + {} + + vector_iterator operator+( ptrdiff_t offset ) const { + return vector_iterator( *my_vector, my_index+offset ); + } + friend vector_iterator operator+( ptrdiff_t offset, const vector_iterator& v ) { + return vector_iterator( *v.my_vector, v.my_index+offset ); + } + vector_iterator operator+=( ptrdiff_t offset ) { + my_index+=offset; + my_item = NULL; + return *this; + } + vector_iterator operator-( ptrdiff_t offset ) const { + return vector_iterator( *my_vector, my_index-offset ); + } + vector_iterator operator-=( ptrdiff_t offset ) { + my_index-=offset; + my_item = NULL; + return *this; + } + Value& operator*() const { + Value* item = my_item; + if( !item ) { + item = my_item = &my_vector->internal_subscript(my_index); + } + __TBB_ASSERT( item==&my_vector->internal_subscript(my_index), "corrupt cache" ); + return *item; + } + Value& operator[]( ptrdiff_t k ) const { + return my_vector->internal_subscript(my_index+k); + } + Value* operator->() const {return &operator*();} + + //! Pre increment + vector_iterator& operator++() { + size_t k = ++my_index; + if( my_item ) { + // Following test uses 2's-complement wizardry and fact that + // min_segment_size is a power of 2. + if( (k& k-concurrent_vector::min_segment_size)==0 ) { + // k is a power of two that is at least k-min_segment_size + my_item= NULL; + } else { + ++my_item; + } + } + return *this; + } + + //! Pre decrement + vector_iterator& operator--() { + __TBB_ASSERT( my_index>0, "operator--() applied to iterator already at beginning of concurrent_vector" ); + size_t k = my_index--; + if( my_item ) { + // Following test uses 2's-complement wizardry and fact that + // min_segment_size is a power of 2. + if( (k& k-concurrent_vector::min_segment_size)==0 ) { + // k is a power of two that is at least k-min_segment_size + my_item= NULL; + } else { + --my_item; + } + } + return *this; + } + + //! Post increment + vector_iterator operator++(int) { + vector_iterator result = *this; + operator++(); + return result; + } + + //! Post decrement + vector_iterator operator--(int) { + vector_iterator result = *this; + operator--(); + return result; + } + + // STL support + + typedef ptrdiff_t difference_type; + typedef Value value_type; + typedef Value* pointer; + typedef Value& reference; + typedef std::random_access_iterator_tag iterator_category; + }; + + template + bool operator==( const vector_iterator& i, const vector_iterator& j ) { + return i.my_index==j.my_index; + } + + template + bool operator!=( const vector_iterator& i, const vector_iterator& j ) { + return !(i==j); + } + + template + bool operator<( const vector_iterator& i, const vector_iterator& j ) { + return i.my_index + bool operator>( const vector_iterator& i, const vector_iterator& j ) { + return j + bool operator>=( const vector_iterator& i, const vector_iterator& j ) { + return !(i + bool operator<=( const vector_iterator& i, const vector_iterator& j ) { + return !(j + ptrdiff_t operator-( const vector_iterator& i, const vector_iterator& j ) { + return ptrdiff_t(i.my_index)-ptrdiff_t(j.my_index); + } + +} // namespace internal +//! @endcond + +//! 
Concurrent vector +/** @ingroup containers */ +template +class concurrent_vector: private internal::concurrent_vector_base { +public: + using internal::concurrent_vector_base::size_type; +private: + template + class generic_range_type: public blocked_range { + public: + typedef T value_type; + typedef T& reference; + typedef const T& const_reference; + typedef I iterator; + typedef ptrdiff_t difference_type; + generic_range_type( I begin_, I end_, size_t grainsize ) : blocked_range(begin_,end_,grainsize) {} + generic_range_type( generic_range_type& r, split ) : blocked_range(r,split()) {} + }; + + template + friend class internal::vector_iterator; +public: + typedef T& reference; + typedef const T& const_reference; + + //! Construct empty vector. + concurrent_vector() {} + + //! Copy a vector. + concurrent_vector( const concurrent_vector& vector ) {internal_copy(vector,sizeof(T),©_array);} + + //! Assignment + concurrent_vector& operator=( const concurrent_vector& vector ) { + if( this!=&vector ) + internal_assign(vector,sizeof(T),&destroy_array,&assign_array,©_array); + return *this; + } + + //! Clear and destroy vector. + ~concurrent_vector() {internal_clear(&destroy_array,/*reclaim_storage=*/true);} + + //------------------------------------------------------------------------ + // Concurrent operations + //------------------------------------------------------------------------ + //! Grow by "delta" elements. + /** Returns old size. */ + size_type grow_by( size_type delta ) { + return delta ? internal_grow_by( delta, sizeof(T), &initialize_array ) : my_early_size; + } + + //! Grow array until it has at least n elements. + void grow_to_at_least( size_type n ) { + if( my_early_size iterator; + typedef internal::vector_iterator const_iterator; + +#if !defined(_MSC_VER) || _CPPLIB_VER>=300 + // Assume ISO standard definition of std::reverse_iterator + typedef std::reverse_iterator reverse_iterator; + typedef std::reverse_iterator const_reverse_iterator; +#else + // Use non-standard std::reverse_iterator + typedef std::reverse_iterator reverse_iterator; + typedef std::reverse_iterator const_reverse_iterator; +#endif /* defined(_MSC_VER) && (_MSC_VER<1300) */ + + typedef generic_range_type range_type; + typedef generic_range_type const_range_type; + + range_type range( size_t grainsize = 1 ) { + return range_type( begin(), end(), grainsize ); + } + + const_range_type range( size_t grainsize = 1 ) const { + return const_range_type( begin(), end(), grainsize ); + } + + //------------------------------------------------------------------------ + // Capacity + //------------------------------------------------------------------------ + //! Return size of vector. + size_type size() const {return my_early_size;} + + //! Return size of vector. + bool empty() const {return !my_early_size;} + + //! Maximum size to which array can grow without allocating more memory. + size_type capacity() const {return internal_capacity();} + + //! Allocate enough space to grow to size n without having to allocate more memory later. + /** Like most of the methods provided for STL compatibility, this method is *not* thread safe. + The capacity afterwards may be bigger than the requested reservation. */ + void reserve( size_type n ) { + if( n ) + internal_reserve(n, sizeof(T), max_size()); + } + + //! Upper bound on argument to reserve. 
+ size_type max_size() const {return (~size_t(0))/sizeof(T);} + + //------------------------------------------------------------------------ + // STL support + //------------------------------------------------------------------------ + + typedef T value_type; + typedef ptrdiff_t difference_type; + + iterator begin() {return iterator(*this,0);} + iterator end() {return iterator(*this,size());} + const_iterator begin() const {return const_iterator(*this,0);} + const_iterator end() const {return const_iterator(*this,size());} + + reverse_iterator rbegin() {return reverse_iterator(end());} + reverse_iterator rend() {return reverse_iterator(begin());} + const_reverse_iterator rbegin() const {return const_reverse_iterator(end());} + const_reverse_iterator rend() const {return const_reverse_iterator(begin());} + + //! Not thread safe + /** Does not change capacity. */ + void clear() {internal_clear(&destroy_array,/*reclaim_storage=*/false);} +private: + //! Get reference to element at given index. + T& internal_subscript( size_type index ) const; + + //! Construct n instances of T, starting at "begin". + static void __TBB_EXPORTED_FUNC initialize_array( void* begin, size_type n ); + + //! Construct n instances of T, starting at "begin". + static void __TBB_EXPORTED_FUNC copy_array( void* dst, const void* src, size_type n ); + + //! Assign n instances of T, starting at "begin". + static void __TBB_EXPORTED_FUNC assign_array( void* dst, const void* src, size_type n ); + + //! Destroy n instances of T, starting at "begin". + static void __TBB_EXPORTED_FUNC destroy_array( void* begin, size_type n ); +}; + +template +T& concurrent_vector::internal_subscript( size_type index ) const { + __TBB_ASSERT( index(my_segment[k].array)[j]; +} + +template +void concurrent_vector::initialize_array( void* begin, size_type n ) { + T* array = static_cast(begin); + for( size_type j=0; j +void concurrent_vector::copy_array( void* dst, const void* src, size_type n ) { + T* d = static_cast(dst); + const T* s = static_cast(src); + for( size_type j=0; j +void concurrent_vector::assign_array( void* dst, const void* src, size_type n ) { + T* d = static_cast(dst); + const T* s = static_cast(src); + for( size_type j=0; j +void concurrent_vector::destroy_array( void* begin, size_type n ) { + T* array = static_cast(begin); + for( size_type j=n; j>0; --j ) + array[j-1].~T(); +} + +} // namespace tbb + +#endif /* __TBB_concurrent_vector_H */ diff --git a/dep/tbb/src/old/spin_rw_mutex_v2.cpp b/dep/tbb/src/old/spin_rw_mutex_v2.cpp new file mode 100644 index 000000000..9067b0957 --- /dev/null +++ b/dep/tbb/src/old/spin_rw_mutex_v2.cpp @@ -0,0 +1,166 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "spin_rw_mutex_v2.h" +#include "tbb/tbb_machine.h" +#include "../tbb/itt_notify.h" + +namespace tbb { + +using namespace internal; + +static inline bool CAS(volatile uintptr &addr, uintptr newv, uintptr oldv) { + return __TBB_CompareAndSwapW((volatile void *)&addr, (intptr)newv, (intptr)oldv) == (intptr)oldv; +} + +//! Signal that write lock is released +void spin_rw_mutex::internal_itt_releasing(spin_rw_mutex *mutex) { + ITT_NOTIFY(sync_releasing, mutex); +#if !DO_ITT_NOTIFY + (void)mutex; +#endif +} + +bool spin_rw_mutex::internal_acquire_writer(spin_rw_mutex *mutex) +{ + ITT_NOTIFY(sync_prepare, mutex); + atomic_backoff backoff; + for(;;) { + state_t s = mutex->state; + if( !(s & BUSY) ) { // no readers, no writers + if( CAS(mutex->state, WRITER, s) ) + break; // successfully stored writer flag + backoff.reset(); // we could be very close to complete op. + } else if( !(s & WRITER_PENDING) ) { // no pending writers + __TBB_AtomicOR(&mutex->state, WRITER_PENDING); + } + backoff.pause(); + } + ITT_NOTIFY(sync_acquired, mutex); + __TBB_ASSERT( (mutex->state & BUSY)==WRITER, "invalid state of a write lock" ); + return false; +} + +//! Signal that write lock is released +void spin_rw_mutex::internal_release_writer(spin_rw_mutex *mutex) { + __TBB_ASSERT( (mutex->state & BUSY)==WRITER, "invalid state of a write lock" ); + ITT_NOTIFY(sync_releasing, mutex); + mutex->state = 0; +} + +//! Acquire lock on given mutex. +void spin_rw_mutex::internal_acquire_reader(spin_rw_mutex *mutex) { + ITT_NOTIFY(sync_prepare, mutex); + atomic_backoff backoff; + for(;;) { + state_t s = mutex->state; + if( !(s & (WRITER|WRITER_PENDING)) ) { // no writer or write requests + if( CAS(mutex->state, s+ONE_READER, s) ) + break; // successfully stored increased number of readers + backoff.reset(); // we could be very close to complete op. + } + backoff.pause(); + } + ITT_NOTIFY(sync_acquired, mutex); + __TBB_ASSERT( mutex->state & READERS, "invalid state of a read lock: no readers" ); + __TBB_ASSERT( !(mutex->state & WRITER), "invalid state of a read lock: active writer" ); +} + +//! Upgrade reader to become a writer. 
+/** Returns true if the upgrade happened without re-acquiring the lock and false if opposite */ +bool spin_rw_mutex::internal_upgrade(spin_rw_mutex *mutex) { + state_t s = mutex->state; + __TBB_ASSERT( s & READERS, "invalid state before upgrade: no readers " ); + __TBB_ASSERT( !(s & WRITER), "invalid state before upgrade: active writer " ); + // check and set writer-pending flag + // required conditions: either no pending writers, or we are the only reader + // (with multiple readers and pending writer, another upgrade could have been requested) + while( (s & READERS)==ONE_READER || !(s & WRITER_PENDING) ) { + if( CAS(mutex->state, s | WRITER_PENDING, s) ) + { + atomic_backoff backoff; + ITT_NOTIFY(sync_prepare, mutex); + while( (mutex->state & READERS) != ONE_READER ) // more than 1 reader + backoff.pause(); + // the state should be 0...0110, i.e. 1 reader and waiting writer; + // both new readers and writers are blocked + __TBB_ASSERT(mutex->state == (ONE_READER | WRITER_PENDING),"invalid state when upgrading to writer"); + mutex->state = WRITER; + ITT_NOTIFY(sync_acquired, mutex); + __TBB_ASSERT( (mutex->state & BUSY) == WRITER, "invalid state after upgrade" ); + return true; // successfully upgraded + } else { + s = mutex->state; // re-read + } + } + // slow reacquire + internal_release_reader(mutex); + return internal_acquire_writer(mutex); // always returns false +} + +void spin_rw_mutex::internal_downgrade(spin_rw_mutex *mutex) { + __TBB_ASSERT( (mutex->state & BUSY) == WRITER, "invalid state before downgrade" ); + ITT_NOTIFY(sync_releasing, mutex); + mutex->state = ONE_READER; + __TBB_ASSERT( mutex->state & READERS, "invalid state after downgrade: no readers" ); + __TBB_ASSERT( !(mutex->state & WRITER), "invalid state after downgrade: active writer" ); +} + +void spin_rw_mutex::internal_release_reader(spin_rw_mutex *mutex) +{ + __TBB_ASSERT( mutex->state & READERS, "invalid state of a read lock: no readers" ); + __TBB_ASSERT( !(mutex->state & WRITER), "invalid state of a read lock: active writer" ); + ITT_NOTIFY(sync_releasing, mutex); // release reader + __TBB_FetchAndAddWrelease((volatile void *)&(mutex->state),-(intptr)ONE_READER); +} + +bool spin_rw_mutex::internal_try_acquire_writer( spin_rw_mutex * mutex ) +{ +// for a writer: only possible to acquire if no active readers or writers + state_t s = mutex->state; // on Itanium, this volatile load has acquire semantic + if( !(s & BUSY) ) // no readers, no writers; mask is 1..1101 + if( CAS(mutex->state, WRITER, s) ) { + ITT_NOTIFY(sync_acquired, mutex); + return true; // successfully stored writer flag + } + return false; +} + +bool spin_rw_mutex::internal_try_acquire_reader( spin_rw_mutex * mutex ) +{ +// for a reader: acquire if no active or waiting writers + state_t s = mutex->state; // on Itanium, a load of volatile variable has acquire semantic + while( !(s & (WRITER|WRITER_PENDING)) ) // no writers + if( CAS(mutex->state, s+ONE_READER, s) ) { + ITT_NOTIFY(sync_acquired, mutex); + return true; // successfully stored increased number of readers + } + return false; +} + +} // namespace tbb diff --git a/dep/tbb/src/old/spin_rw_mutex_v2.h b/dep/tbb/src/old/spin_rw_mutex_v2.h new file mode 100644 index 000000000..3285e8ee5 --- /dev/null +++ b/dep/tbb/src/old/spin_rw_mutex_v2.h @@ -0,0 +1,185 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. 
+ + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_spin_rw_mutex_H +#define __TBB_spin_rw_mutex_H + +#include "tbb/tbb_stddef.h" + +namespace tbb { + +//! Fast, unfair, spinning reader-writer lock with backoff and writer-preference +/** @ingroup synchronization */ +class spin_rw_mutex { + //! @cond INTERNAL + + //! Present so that 1.0 headers work with 1.1 dynamic library. + static void __TBB_EXPORTED_FUNC internal_itt_releasing(spin_rw_mutex *); + + //! Internal acquire write lock. + static bool __TBB_EXPORTED_FUNC internal_acquire_writer(spin_rw_mutex *); + + //! Out of line code for releasing a write lock. + /** This code is has debug checking and instrumentation for Intel(R) Thread Checker and Intel(R) Thread Profiler. */ + static void __TBB_EXPORTED_FUNC internal_release_writer(spin_rw_mutex *); + + //! Internal acquire read lock. + static void __TBB_EXPORTED_FUNC internal_acquire_reader(spin_rw_mutex *); + + //! Internal upgrade reader to become a writer. + static bool __TBB_EXPORTED_FUNC internal_upgrade(spin_rw_mutex *); + + //! Out of line code for downgrading a writer to a reader. + /** This code is has debug checking and instrumentation for Intel(R) Thread Checker and Intel(R) Thread Profiler. */ + static void __TBB_EXPORTED_FUNC internal_downgrade(spin_rw_mutex *); + + //! Internal release read lock. + static void __TBB_EXPORTED_FUNC internal_release_reader(spin_rw_mutex *); + + //! Internal try_acquire write lock. + static bool __TBB_EXPORTED_FUNC internal_try_acquire_writer(spin_rw_mutex *); + + //! Internal try_acquire read lock. + static bool __TBB_EXPORTED_FUNC internal_try_acquire_reader(spin_rw_mutex *); + + //! @endcond +public: + //! Construct unacquired mutex. + spin_rw_mutex() : state(0) {} + +#if TBB_DO_ASSERT + //! Destructor asserts if the mutex is acquired, i.e. state is zero. + ~spin_rw_mutex() { + __TBB_ASSERT( !state, "destruction of an acquired mutex"); + }; +#endif /* TBB_DO_ASSERT */ + + //! The scoped locking pattern + /** It helps to avoid the common problem of forgetting to release lock. + It also nicely provides the "node" for queuing locks. */ + class scoped_lock : private internal::no_copy { + public: + //! Construct lock that has not acquired a mutex. + /** Equivalent to zero-initialization of *this. */ + scoped_lock() : mutex(NULL) {} + + //! Acquire lock on given mutex. 
+ /** Upon entry, *this should not be in the "have acquired a mutex" state. */ + scoped_lock( spin_rw_mutex& m, bool write = true ) : mutex(NULL) { + acquire(m, write); + } + + //! Release lock (if lock is held). + ~scoped_lock() { + if( mutex ) release(); + } + + //! Acquire lock on given mutex. + void acquire( spin_rw_mutex& m, bool write = true ) { + __TBB_ASSERT( !mutex, "holding mutex already" ); + is_writer = write; + mutex = &m; + if( write ) internal_acquire_writer(mutex); + else internal_acquire_reader(mutex); + } + + //! Upgrade reader to become a writer. + /** Returns true if the upgrade happened without re-acquiring the lock and false if opposite */ + bool upgrade_to_writer() { + __TBB_ASSERT( mutex, "lock is not acquired" ); + __TBB_ASSERT( !is_writer, "not a reader" ); + is_writer = true; + return internal_upgrade(mutex); + } + + //! Release lock. + void release() { + __TBB_ASSERT( mutex, "lock is not acquired" ); + spin_rw_mutex *m = mutex; + mutex = NULL; + if( is_writer ) { +#if TBB_DO_THREADING_TOOLS||TBB_DO_ASSERT + internal_release_writer(m); +#else + m->state = 0; +#endif /* TBB_DO_THREADING_TOOLS||TBB_DO_ASSERT */ + } else { + internal_release_reader(m); + } + }; + + //! Downgrade writer to become a reader. + bool downgrade_to_reader() { +#if TBB_DO_THREADING_TOOLS||TBB_DO_ASSERT + __TBB_ASSERT( mutex, "lock is not acquired" ); + __TBB_ASSERT( is_writer, "not a writer" ); + internal_downgrade(mutex); +#else + mutex->state = 4; // Bit 2 - reader, 00..00100 +#endif + is_writer = false; + + return true; + } + + //! Try acquire lock on given mutex. + bool try_acquire( spin_rw_mutex& m, bool write = true ) { + __TBB_ASSERT( !mutex, "holding mutex already" ); + bool result; + is_writer = write; + result = write? internal_try_acquire_writer(&m) + : internal_try_acquire_reader(&m); + if( result ) mutex = &m; + return result; + } + + private: + //! The pointer to the current mutex that is held, or NULL if no mutex is held. + spin_rw_mutex* mutex; + + //! True if holding a writer lock, false if holding a reader lock. + /** Not defined if not holding a lock. */ + bool is_writer; + }; + +private: + typedef internal::uintptr state_t; + static const state_t WRITER = 1; + static const state_t WRITER_PENDING = 2; + static const state_t READERS = ~(WRITER | WRITER_PENDING); + static const state_t ONE_READER = 4; + static const state_t BUSY = WRITER | READERS; + /** Bit 0 = writer is holding lock + Bit 1 = request by a writer to acquire lock (hint to readers to wait) + Bit 2..N = number of readers holding lock */ + volatile state_t state; +}; + +} // namespace ThreadingBuildingBlocks + +#endif /* __TBB_spin_rw_mutex_H */ diff --git a/dep/tbb/src/old/test_concurrent_queue_v2.cpp b/dep/tbb/src/old/test_concurrent_queue_v2.cpp new file mode 100644 index 000000000..4443b592c --- /dev/null +++ b/dep/tbb/src/old/test_concurrent_queue_v2.cpp @@ -0,0 +1,361 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/concurrent_queue.h" +#include "tbb/atomic.h" +#include "tbb/tick_count.h" + +#include "../test/harness_assert.h" +#include "../test/harness.h" + +static tbb::atomic FooConstructed; +static tbb::atomic FooDestroyed; + +class Foo { + enum state_t{ + LIVE=0x1234, + DEAD=0xDEAD + }; + state_t state; +public: + int thread_id; + int serial; + Foo() : state(LIVE) { + ++FooConstructed; + } + Foo( const Foo& item ) : state(LIVE) { + ASSERT( item.state==LIVE, NULL ); + ++FooConstructed; + thread_id = item.thread_id; + serial = item.serial; + } + ~Foo() { + ASSERT( state==LIVE, NULL ); + ++FooDestroyed; + state=DEAD; + thread_id=0xDEAD; + serial=0xDEAD; + } + void operator=( Foo& item ) { + ASSERT( item.state==LIVE, NULL ); + ASSERT( state==LIVE, NULL ); + thread_id = item.thread_id; + serial = item.serial; + } + bool is_const() {return false;} + bool is_const() const {return true;} +}; + +const size_t MAXTHREAD = 256; + +static int Sum[MAXTHREAD]; + +//! Count of various pop operations +/** [0] = pop_if_present that failed + [1] = pop_if_present that succeeded + [2] = pop */ +static tbb::atomic PopKind[3]; + +const int M = 10000; + +struct Body { + tbb::concurrent_queue* queue; + const int nthread; + Body( int nthread_ ) : nthread(nthread_) {} + void operator()( long thread_id ) const { + long pop_kind[3] = {0,0,0}; + int serial[MAXTHREAD+1]; + memset( serial, 0, nthread*sizeof(unsigned) ); + ASSERT( thread_idpop_if_present(f); + ++pop_kind[prepopped]; + } + Foo g; + g.thread_id = thread_id; + g.serial = j+1; + queue->push( g ); + if( !prepopped ) { + queue->pop(f); + ++pop_kind[2]; + } + ASSERT( f.thread_id<=nthread, NULL ); + ASSERT( f.thread_id==nthread || serial[f.thread_id]0, "nthread must be positive" ); + if( prefill+1>=capacity ) + return; + bool success = false; + for( int k=0; k<3; ++k ) + PopKind[k] = 0; + for( int trial=0; !success; ++trial ) { + FooConstructed = 0; + FooDestroyed = 0; + Body body(nthread); + tbb::concurrent_queue queue; + queue.set_capacity( capacity ); + body.queue = &queue; + for( int i=0; i=0; ) { + ASSERT( !queue.empty(), NULL ); + Foo f; + queue.pop(f); + ASSERT( queue.size()==i, NULL ); + sum += f.serial-1; + } + ASSERT( queue.empty(), NULL ); + ASSERT( queue.size()==0, NULL ); + if( sum!=expected ) + printf("sum=%d expected=%d\n",sum,expected); + ASSERT( FooConstructed==FooDestroyed, NULL ); + + success = true; + if( nthread>1 && prefill==0 ) { + // Check that pop_if_present got sufficient exercise + for( int k=0; k<2; ++k ) { +#if (_WIN32||_WIN64) + // The TBB library on Windows seems to have a tough time generating + // the desired interleavings for pop_if_present, so the code tries longer, and settles + // for fewer desired interleavings. 
+ const int max_trial = 100; + const int min_requirement = 20; +#else + const int min_requirement = 100; + const int max_trial = 20; +#endif /* _WIN32||_WIN64 */ + if( PopKind[k]=max_trial ) { + if( Verbose ) + printf("Warning: %d threads had only %ld pop_if_present operations %s after %d trials (expected at least %d). " + "This problem may merely be unlucky scheduling. " + "Investigate only if it happens repeatedly.\n", + nthread, long(PopKind[k]), k==0?"failed":"succeeded", max_trial, min_requirement); + else + printf("Warning: the number of %s pop_if_present operations is less than expected for %d threads. Investigate if it happens repeatedly.\n", + k==0?"failed":"succeeded", nthread ); + } else { + success = false; + } + } + } + } + } +} + +template +void TestIteratorAux( Iterator1 i, Iterator2 j, int size ) { + // Now test iteration + Iterator1 old_i; + for( int k=0; k" + ASSERT( k+2==i->serial, NULL ); + } + // Test assignment + old_i = i; + } + ASSERT( k+1==f.serial, NULL ); + } + ASSERT( !(i!=j), NULL ); + ASSERT( i==j, NULL ); +} + +template +void TestIteratorAssignment( Iterator2 j ) { + Iterator1 i(j); + ASSERT( i==j, NULL ); + ASSERT( !(i!=j), NULL ); + Iterator1 k; + k = j; + ASSERT( k==j, NULL ); + ASSERT( !(k!=j), NULL ); +} + +//! Test the iterators for concurrent_queue +void TestIterator() { + tbb::concurrent_queue queue; + tbb::concurrent_queue& const_queue = queue; + for( int j=0; j<500; ++j ) { + TestIteratorAux( queue.begin(), queue.end(), j ); + TestIteratorAux( const_queue.begin(), const_queue.end(), j ); + TestIteratorAux( const_queue.begin(), queue.end(), j ); + TestIteratorAux( queue.begin(), const_queue.end(), j ); + Foo f; + f.serial = j+1; + queue.push(f); + } + TestIteratorAssignment::const_iterator>( const_queue.begin() ); + TestIteratorAssignment::const_iterator>( queue.begin() ); + TestIteratorAssignment::iterator>( queue.begin() ); +} + +void TestConcurrenetQueueType() { + AssertSameType( tbb::concurrent_queue::value_type(), Foo() ); + Foo f; + const Foo g; + tbb::concurrent_queue::reference r = f; + ASSERT( &r==&f, NULL ); + ASSERT( !r.is_const(), NULL ); + tbb::concurrent_queue::const_reference cr = g; + ASSERT( &cr==&g, NULL ); + ASSERT( cr.is_const(), NULL ); +} + +template +void TestEmptyQueue() { + const tbb::concurrent_queue queue; + ASSERT( queue.size()==0, NULL ); + ASSERT( queue.capacity()>0, NULL ); + ASSERT( size_t(queue.capacity())>=size_t(-1)/(sizeof(void*)+sizeof(T)), NULL ); +} + +void TestFullQueue() { + for( int n=0; n<10; ++n ) { + FooConstructed = 0; + FooDestroyed = 0; + tbb::concurrent_queue queue; + queue.set_capacity(n); + for( int i=0; i<=n; ++i ) { + Foo f; + f.serial = i; + bool result = queue.push_if_not_full( f ); + ASSERT( result==(i +struct TestNegativeQueueBody { + tbb::concurrent_queue& queue; + const int nthread; + TestNegativeQueueBody( tbb::concurrent_queue& q, int n ) : queue(q), nthread(n) {} + void operator()( int k ) const { + if( k==0 ) { + int number_of_pops = nthread-1; + // Wait for all pops to pend. + while( queue.size()>-number_of_pops ) { + __TBB_Yield(); + } + for( int i=0; ; ++i ) { + ASSERT( queue.size()==i-number_of_pops, NULL ); + ASSERT( queue.empty()==(queue.size()<=0), NULL ); + if( i==number_of_pops ) break; + // Satisfy another pop + queue.push( T() ); + } + } else { + // Pop item from queue + T item; + queue.pop(item); + } + } +}; + +//! Test a queue with a negative size. 
+template +void TestNegativeQueue( int nthread ) { + tbb::concurrent_queue queue; + NativeParallelFor( nthread, TestNegativeQueueBody(queue,nthread) ); +} + +int main( int argc, char* argv[] ) { + // Set default for minimum number of threads. + MinThread = 1; + ParseCommandLine(argc,argv); + + TestEmptyQueue(); + TestEmptyQueue(); + TestFullQueue(); + TestConcurrenetQueueType(); + TestIterator(); + + // Test concurrent operations + for( int nthread=MinThread; nthread<=MaxThread; ++nthread ) { + TestNegativeQueue(nthread); + for( int prefill=0; prefill<64; prefill+=(1+prefill/3) ) { + TestPushPop(prefill,ptrdiff_t(-1),nthread); + TestPushPop(prefill,ptrdiff_t(1),nthread); + TestPushPop(prefill,ptrdiff_t(2),nthread); + TestPushPop(prefill,ptrdiff_t(10),nthread); + TestPushPop(prefill,ptrdiff_t(100),nthread); + } + } + printf("done\n"); + return 0; +} diff --git a/dep/tbb/src/old/test_concurrent_vector_v2.cpp b/dep/tbb/src/old/test_concurrent_vector_v2.cpp new file mode 100644 index 000000000..68a73115b --- /dev/null +++ b/dep/tbb/src/old/test_concurrent_vector_v2.cpp @@ -0,0 +1,570 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "concurrent_vector_v2.h" +#include +#include +#include "../test/harness_assert.h" + +tbb::atomic FooCount; + +//! 
Problem size +const size_t N = 500000; + +struct Foo { + int my_bar; +public: + enum State { + DefaultInitialized=0x1234, + CopyInitialized=0x89ab, + Destroyed=0x5678 + } state; + int& bar() { + ASSERT( state==DefaultInitialized||state==CopyInitialized, NULL ); + return my_bar; + } + int bar() const { + ASSERT( state==DefaultInitialized||state==CopyInitialized, NULL ); + return my_bar; + } + static const int initial_value_of_bar = 42; + Foo() { + state = DefaultInitialized; + ++FooCount; + my_bar = initial_value_of_bar; + } + Foo( const Foo& foo ) { + state = CopyInitialized; + ++FooCount; + my_bar = foo.my_bar; + } + ~Foo() { + ASSERT( state==DefaultInitialized||state==CopyInitialized, NULL ); + state = Destroyed; + my_bar = ~initial_value_of_bar; + --FooCount; + } + bool is_const() const {return true;} + bool is_const() {return false;} +}; + +class FooWithAssign: public Foo { +public: + void operator=( const FooWithAssign& x ) { + ASSERT( x.state==DefaultInitialized||x.state==CopyInitialized, NULL ); + ASSERT( state==DefaultInitialized||state==CopyInitialized, NULL ); + my_bar = x.my_bar; + } +}; + +inline void NextSize( int& s ) { + if( s<=32 ) ++s; + else s += s/10; +} + +static void CheckVector( const tbb::concurrent_vector& cv, size_t expected_size, size_t old_size ) { + ASSERT( cv.size()==expected_size, NULL ); + ASSERT( cv.empty()==(expected_size==0), NULL ); + for( int j=0; j vector_t; + for( int old_size=0; old_size<=128; NextSize( old_size ) ) { + for( int new_size=old_size; new_size<=128; NextSize( new_size ) ) { + long count = FooCount; + vector_t v; + ASSERT( count==FooCount, NULL ); + v.grow_by(old_size); + ASSERT( count+old_size==FooCount, NULL ); + for( int j=0; j vector_t; + vector_t v; + v.reserve( old_size ); + ASSERT( v.capacity()>=old_size, NULL ); + v.reserve( new_size ); + ASSERT( v.capacity()>=old_size, NULL ); + ASSERT( v.capacity()>=new_size, NULL ); + for( size_t i=0; i<2*new_size; ++i ) { + ASSERT( size_t(FooCount)==count+i, NULL ); + size_t j = v.grow_by(1); + ASSERT( j==i, NULL ); + } + } + ASSERT( FooCount==count, NULL ); + } + } +} + +struct AssignElement { + typedef tbb::concurrent_vector::range_type::iterator iterator; + iterator base; + void operator()( const tbb::concurrent_vector::range_type& range ) const { + for( iterator i=range.begin(); i!=range.end(); ++i ) { + if( *i!=0 ) + std::printf("ERROR for v[%ld]\n", long(i-base)); + *i = int(i-base); + } + } + AssignElement( iterator base_ ) : base(base_) {} +}; + +struct CheckElement { + typedef tbb::concurrent_vector::const_range_type::iterator iterator; + iterator base; + void operator()( const tbb::concurrent_vector::const_range_type& range ) const { + for( iterator i=range.begin(); i!=range.end(); ++i ) + if( *i != int(i-base) ) + std::printf("ERROR for v[%ld]\n", long(i-base)); + } + CheckElement( iterator base_ ) : base(base_) {} +}; + +#include "tbb/tick_count.h" +#include "tbb/parallel_for.h" +#include "../test/harness.h" + +void TestParallelFor( int nthread ) { + typedef tbb::concurrent_vector vector_t; + vector_t v; + v.grow_to_at_least(N); + tbb::tick_count t0 = tbb::tick_count::now(); + if( Verbose ) + std::printf("Calling parallel_for.h with %ld threads\n",long(nthread)); + tbb::parallel_for( v.range(10000), AssignElement(v.begin()) ); + tbb::tick_count t1 = tbb::tick_count::now(); + const vector_t& u = v; + tbb::parallel_for( u.range(10000), CheckElement(u.begin()) ); + tbb::tick_count t2 = tbb::tick_count::now(); + if( Verbose ) + std::printf("Time for parallel_for.h: assign time = %8.5f, 
check time = %8.5f\n", + (t1-t0).seconds(),(t2-t1).seconds()); + for( long i=0; size_t(i) +void TestIteratorAssignment( Iterator2 j ) { + Iterator1 i(j); + ASSERT( i==j, NULL ); + ASSERT( !(i!=j), NULL ); + Iterator1 k; + k = j; + ASSERT( k==j, NULL ); + ASSERT( !(k!=j), NULL ); +} + +template +void TestIteratorTraits() { + AssertSameType( static_cast(0), static_cast(0) ); + AssertSameType( static_cast(0), static_cast(0) ); + AssertSameType( static_cast(0), static_cast(0) ); + AssertSameType( static_cast(0), static_cast(0) ); + T x; + typename Iterator::reference xr = x; + typename Iterator::pointer xp = &x; + ASSERT( &xr==xp, NULL ); +} + +template +void CheckConstIterator( const Vector& u, int i, const Iterator& cp ) { + typename Vector::const_reference pref = *cp; + if( pref.bar()!=i ) + std::printf("ERROR for u[%ld] using const_iterator\n", long(i)); + typename Vector::difference_type delta = cp-u.begin(); + ASSERT( delta==i, NULL ); + if( u[i].bar()!=i ) + std::printf("ERROR for u[%ld] using subscripting\n", long(i)); + ASSERT( u.begin()[i].bar()==i, NULL ); +} + +template +void CheckIteratorComparison( V& u ) { + Iterator1 i = u.begin(); + for( int i_count=0; i_count<100; ++i_count ) { + Iterator2 j = u.begin(); + for( int j_count=0; j_count<100; ++j_count ) { + ASSERT( (i==j)==(i_count==j_count), NULL ); + ASSERT( (i!=j)==(i_count!=j_count), NULL ); + ASSERT( (i-j)==(i_count-j_count), NULL ); + ASSERT( (ij)==(i_count>j_count), NULL ); + ASSERT( (i<=j)==(i_count<=j_count), NULL ); + ASSERT( (i>=j)==(i_count>=j_count), NULL ); + ++j; + } + ++i; + } +} + +//! Test sequential iterators for vector type V. +/** Also does timing. */ +template +void TestSequentialFor() { + V v; + v.grow_by(N); + + // Check iterator + tbb::tick_count t0 = tbb::tick_count::now(); + typename V::iterator p = v.begin(); + ASSERT( !(*p).is_const(), NULL ); + ASSERT( !p->is_const(), NULL ); + for( int i=0; size_t(i)is_const(), NULL ); + for( int i=0; size_t(i)0; ) { + --i; + --cp; + if( i>0 ) { + typename V::const_iterator cp_old = cp--; + int here = (*cp_old).bar(); + ASSERT( here==u[i].bar(), NULL ); + typename V::const_iterator cp_new = cp++; + int prev = (*cp_new).bar(); + ASSERT( prev==u[i-1].bar(), NULL ); + } + CheckConstIterator(u,i,cp); + } + + // Now go forwards and backwards + cp = u.begin(); + ptrdiff_t j = 0; + for( size_t i=0; i(v); + CheckIteratorComparison(v); + CheckIteratorComparison(v); + CheckIteratorComparison(v); + + TestIteratorAssignment( u.begin() ); + TestIteratorAssignment( v.begin() ); + TestIteratorAssignment( v.begin() ); + + // Check reverse_iterator + typename V::reverse_iterator rp = v.rbegin(); + for( size_t i=v.size(); i>0; --i, ++rp ) { + typename V::reference pref = *rp; + ASSERT( size_t(pref.bar())==i-1, NULL ); + ASSERT( rp!=v.rend(), NULL ); + } + ASSERT( rp==v.rend(), NULL ); + + // Check const_reverse_iterator + typename V::const_reverse_iterator crp = u.rbegin(); + for( size_t i=v.size(); i>0; --i, ++crp ) { + typename V::const_reference cpref = *crp; + ASSERT( size_t(cpref.bar())==i-1, NULL ); + ASSERT( crp!=u.rend(), NULL ); + } + ASSERT( crp==u.rend(), NULL ); + + TestIteratorAssignment( u.rbegin() ); + TestIteratorAssignment( v.rbegin() ); +} + +static const size_t Modulus = 7; + +typedef tbb::concurrent_vector MyVector; + +class GrowToAtLeast { + MyVector& my_vector; +public: + void operator()( const tbb::blocked_range& range ) const { + for( size_t i=range.begin(); i!=range.end(); ++i ) { + size_t n = my_vector.size(); + size_t k = n==0 ? 
0 : i % (2*n+1); + my_vector.grow_to_at_least(k+1); + ASSERT( my_vector.size()>=k+1, NULL ); + } + } + GrowToAtLeast( MyVector& vector ) : my_vector(vector) {} +}; + +void TestConcurrentGrowToAtLeast() { + MyVector v; + for( size_t s=1; s<1000; s*=10 ) { + tbb::parallel_for( tbb::blocked_range(0,1000000,100), GrowToAtLeast(v) ); + } +} + +//! Test concurrent invocations of method concurrent_vector::grow_by +class GrowBy { + MyVector& my_vector; +public: + void operator()( const tbb::blocked_range& range ) const { + for( int i=range.begin(); i!=range.end(); ++i ) { + if( i%3 ) { + Foo& element = my_vector[my_vector.grow_by(1)]; + element.bar() = i; + } else { + Foo f; + f.bar() = i; + size_t k = my_vector.push_back( f ); + ASSERT( my_vector[k].bar()==i, NULL ); + } + } + } + GrowBy( MyVector& vector ) : my_vector(vector) {} +}; + +//! Test concurrent invocations of method concurrent_vector::grow_by +void TestConcurrentGrowBy( int nthread ) { + int m = 100000; + MyVector v; + tbb::parallel_for( tbb::blocked_range(0,m,1000), GrowBy(v) ); + ASSERT( v.size()==size_t(m), NULL ); + + // Verify that v is a permutation of 0..m + int inversions = 0; + bool* found = new bool[m]; + memset( found, 0, m ); + for( int i=0; i0 ) + inversions += v[i].bar()1 || v[i].bar()==i, "sequential execution is wrong" ); + } + delete[] found; + if( nthread>1 && inversions vector_t; + for( int dst_size=1; dst_size<=128; NextSize( dst_size ) ) { + for( int src_size=2; src_size<=128; NextSize( src_size ) ) { + vector_t u; + u.grow_to_at_least(src_size); + for( int i=0; i + +typedef unsigned long Number; + +static tbb::concurrent_vector Primes; + +class FindPrimes { + bool is_prime( Number val ) const { + int limit, factor = 3; + if( val<5u ) + return val==2; + else { + limit = long(sqrtf(float(val))+0.5f); + while( factor<=limit && val % factor ) + ++factor; + return factor>limit; + } + } +public: + void operator()( const tbb::blocked_range& r ) const { + for( Number i=r.begin(); i!=r.end(); ++i ) { + if( i%2 && is_prime(i) ) { + Primes[Primes.grow_by(1)] = i; + } + } + } +}; + +static double TimeFindPrimes( int nthread ) { + Primes.clear(); + tbb::task_scheduler_init init(nthread); + tbb::tick_count t0 = tbb::tick_count::now(); + tbb::parallel_for( tbb::blocked_range(0,1000000,500), FindPrimes() ); + tbb::tick_count t1 = tbb::tick_count::now(); + return (t1-t0).seconds(); +} + +static void TestFindPrimes() { + // Time fully subscribed run. + double t2 = TimeFindPrimes( tbb::task_scheduler_init::automatic ); + + // Time parallel run that is very likely oversubscribed. + double t128 = TimeFindPrimes(128); + + if( Verbose ) + std::printf("TestFindPrimes: t2==%g t128=%g\n", t2, t128 ); + + // We allow the 128-thread run a little extra time to allow for thread overhead. + // Theoretically, following test will fail on machine with >128 processors. + // But that situation is not going to come up in the near future, + // and the generalization to fix the issue is not worth the trouble. + if( t128>1.10*t2 ) { + std::printf("Warning: grow_by is pathetically slow: t2==%g t128=%g\n", t2, t128); + } +} + +//------------------------------------------------------------------------ +// Test compatibility with STL sort. 
+//------------------------------------------------------------------------ + +#include + +void TestSort() { + for( int n=1; n<100; n*=3 ) { + tbb::concurrent_vector array; + array.grow_by( n ); + for( int i=0; i::iterator,Foo>(); + TestIteratorTraits::const_iterator,const Foo>(); + TestSequentialFor > (); + TestResizeAndCopy(); + TestAssign(); + TestCapacity(); + for( int nthread=MinThread; nthread<=MaxThread; ++nthread ) { + tbb::task_scheduler_init init( nthread ); + TestParallelFor( nthread ); + TestConcurrentGrowToAtLeast(); + TestConcurrentGrowBy( nthread ); + } + TestFindPrimes(); + TestSort(); + std::printf("done\n"); + return 0; +} diff --git a/dep/tbb/src/old/test_mutex_v2.cpp b/dep/tbb/src/old/test_mutex_v2.cpp new file mode 100644 index 000000000..4e2a1ef72 --- /dev/null +++ b/dep/tbb/src/old/test_mutex_v2.cpp @@ -0,0 +1,270 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +//------------------------------------------------------------------------ +// Test TBB mutexes when used with parallel_for.h +// +// Usage: test_Mutex.exe [-v] nthread +// +// The -v option causes timing information to be printed. +// +// Compile with _OPENMP and -openmp +//------------------------------------------------------------------------ +#include "tbb/atomic.h" +#include "tbb/blocked_range.h" +#include "tbb/parallel_for.h" +#include "tbb/tick_count.h" +#include "../test/harness.h" +#include "spin_rw_mutex_v2.h" +#include +#include + +#if __linux__ +#define STD std +#else +#define STD /* Cater to broken Windows compilers that are missing "std". */ +#endif /* __linux__ */ + +// This test deliberately avoids a "using tbb" statement, +// so that the error of putting types in the wrong namespace will be caught. + +template +struct Counter { + typedef M mutex_type; + M mutex; + volatile long value; +}; + +//! Function object for use with parallel_for.h. +template +struct AddOne { + C& counter; + /** Increments counter once for each iteration in the iteration space. 
*/ + void operator()( tbb::blocked_range& range ) const { + for( size_t i=range.begin(); i!=range.end(); ++i ) { + if( i&1 ) { + // Try implicit acquire and explicit release + typename C::mutex_type::scoped_lock lock(counter.mutex); + counter.value = counter.value+1; + lock.release(); + } else { + // Try explicit acquire and implicit release + typename C::mutex_type::scoped_lock lock; + lock.acquire(counter.mutex); + counter.value = counter.value+1; + } + } + } + AddOne( C& counter_ ) : counter(counter_) {} +}; + +//! Generic test of a TBB mutex type M. +/** Does not test features specific to reader-writer locks. */ +template +void Test( const char * name ) { + if( Verbose ) { + printf("%s time = ",name); + fflush(stdout); + } + Counter counter; + counter.value = 0; + const int n = 100000; + tbb::tick_count t0 = tbb::tick_count::now(); + tbb::parallel_for(tbb::blocked_range(0,n,10000),AddOne >(counter)); + tbb::tick_count t1 = tbb::tick_count::now(); + if( Verbose ) + printf("%g usec\n",(t1-t0).seconds()); + if( counter.value!=n ) + STD::printf("ERROR for %s: counter.value=%ld\n",name,counter.value); +} + +template +struct Invariant { + typedef M mutex_type; + M mutex; + const char* mutex_name; + volatile long value[N]; + volatile long single_value; + Invariant( const char* mutex_name_ ) : + mutex_name(mutex_name_) + { + single_value = 0; + for( size_t k=0; k +struct TwiddleInvariant { + I& invariant; + /** Increments counter once for each iteration in the iteration space. */ + void operator()( tbb::blocked_range& range ) const { + for( size_t i=range.begin(); i!=range.end(); ++i ) { + //! Every 8th access is a write access + bool write = (i%8)==7; + bool okay = true; + bool lock_kept = true; + if( (i/8)&1 ) { + // Try implicit acquire and explicit release + typename I::mutex_type::scoped_lock lock(invariant.mutex,write); + if( write ) { + long my_value = invariant.value[0]; + invariant.update(); + if( i%16==7 ) { + lock_kept = lock.downgrade_to_reader(); + if( !lock_kept ) + my_value = invariant.value[0] - 1; + okay = invariant.value_is(my_value+1); + } + } else { + okay = invariant.is_okay(); + if( i%8==3 ) { + long my_value = invariant.value[0]; + lock_kept = lock.upgrade_to_writer(); + if( !lock_kept ) + my_value = invariant.value[0]; + invariant.update(); + okay = invariant.value_is(my_value+1); + } + } + lock.release(); + } else { + // Try explicit acquire and implicit release + typename I::mutex_type::scoped_lock lock; + lock.acquire(invariant.mutex,write); + if( write ) { + long my_value = invariant.value[0]; + invariant.update(); + if( i%16==7 ) { + lock_kept = lock.downgrade_to_reader(); + if( !lock_kept ) + my_value = invariant.value[0] - 1; + okay = invariant.value_is(my_value+1); + } + } else { + okay = invariant.is_okay(); + if( i%8==3 ) { + long my_value = invariant.value[0]; + lock_kept = lock.upgrade_to_writer(); + if( !lock_kept ) + my_value = invariant.value[0]; + invariant.update(); + okay = invariant.value_is(my_value+1); + } + } + } + if( !okay ) { + STD::printf( "ERROR for %s at %ld: %s %s %s %s\n",invariant.mutex_name, long(i), + write?"write,":"read,", write?(i%16==7?"downgrade,":""):(i%8==3?"upgrade,":""), + lock_kept?"lock kept,":"lock not kept,", (i/8)&1?"imp/exp":"exp/imp" ); + } + } + } + TwiddleInvariant( I& invariant_ ) : invariant(invariant_) {} +}; + +/** This test is generic so that we can test any other kinds of ReaderWriter locks we write later. 
*/ +template +void TestReaderWriterLock( const char * mutex_name ) { + if( Verbose ) { + printf("%s readers & writers time = ",mutex_name); + fflush(stdout); + } + Invariant invariant(mutex_name); + const size_t n = 500000; + tbb::tick_count t0 = tbb::tick_count::now(); + tbb::parallel_for(tbb::blocked_range(0,n,5000),TwiddleInvariant >(invariant)); + tbb::tick_count t1 = tbb::tick_count::now(); + // There is either a writer or a reader upgraded to a writer for each 4th iteration + long expected_value = n/4; + if( !invariant.value_is(expected_value) ) + STD::printf("ERROR for %s: final invariant value is wrong\n",mutex_name); + if( Verbose ) + printf("%g usec\n",(t1-t0).seconds()); +} + +/** Test try_acquire functionality of a non-reenterable mutex */ +template +void TestTryAcquire_OneThread( const char * mutex_name ) { + M tested_mutex; + typename M::scoped_lock lock1; + if( lock1.try_acquire(tested_mutex) ) + lock1.release(); + else + STD::printf("ERROR for %s: try_acquire failed though it should not\n", mutex_name); + { + typename M::scoped_lock lock2(tested_mutex); + if( lock1.try_acquire(tested_mutex) ) + STD::printf("ERROR for %s: try_acquire succeeded though it should not\n", mutex_name); + } + if( lock1.try_acquire(tested_mutex) ) + lock1.release(); + else + STD::printf("ERROR for %s: try_acquire failed though it should not\n", mutex_name); +} + +#include "tbb/task_scheduler_init.h" + +int main( int argc, char * argv[] ) { + ParseCommandLine( argc, argv ); + for( int p=MinThread; p<=MaxThread; ++p ) { + tbb::task_scheduler_init init( p ); + if( Verbose ) + printf( "testing with %d workers\n", static_cast(p) ); + // Run each test 3 times. + for( int i=0; i<3; ++i ) { + Test( "Spin RW Mutex" ); + + TestTryAcquire_OneThread("Spin RW Mutex"); // only tests try_acquire for writers + TestReaderWriterLock( "Spin RW Mutex" ); + if( Verbose ) + printf( "calling destructor for task_scheduler_init\n" ); + } + } + STD::printf("done\n"); + return 0; +} diff --git a/dep/tbb/src/rml/client/index.html b/dep/tbb/src/rml/client/index.html new file mode 100644 index 000000000..5c7bd50fc --- /dev/null +++ b/dep/tbb/src/rml/client/index.html @@ -0,0 +1,43 @@ + + +

+Overview
+
+This directory has source code that must be statically linked into an RML client.
+
+Files
+
+rml_factory.h
+    Text shared by rml_omp.cpp and rml_tbb.cpp. This is not an ordinary include
+    file, so it does not have an #ifndef guard; each client includes it after
+    defining a few macros (see the sketch after this list).
+
+Specific to client=OpenMP
+
+rml_omp.cpp
+    Source file for the OpenMP client.
+
+omp_dynamic_link.h
+omp_dynamic_link.cpp
+    Source files for dynamic linking support. The code comes from the TBB source
+    directory, adjusted so that it appears in namespace __kmp instead of
+    namespace tbb::internal.
+
+Specific to client=TBB
+
+rml_tbb.cpp
+    Source file for the TBB client. It uses the dynamic linking support from the
+    TBB source directory.
+
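For orientation: rml_omp.cpp and rml_tbb.cpp (further down in this patch) consume the guard-less rml_factory.h by defining the factory, server, and client names, plus the dynamic-link entry points, immediately before including it, so the same factory boilerplate is stamped out once per client kind. A minimal sketch of the pattern with a hypothetical "foo" client (all foo_*/__FOO_* names are invented for illustration; the real bindings are the __KMP_*/__TBB_* ones shown later in this patch) might look roughly like:

    // Hypothetical client translation unit -- mirrors rml_omp.cpp / rml_tbb.cpp.
    #include "rml_foo.h"            // would declare foo_factory, foo_server, foo_client
    #include "foo_dynamic_link.h"   // would provide DLD() and dynamic_link()

    namespace foo {
    namespace rml {

    #define MAKE_SERVER(x) DLD(__FOO_make_rml_server,x)
    #define GET_INFO(x)    DLD(__FOO_call_with_my_server_info,x)
    #define SERVER  foo_server
    #define CLIENT  foo_client
    #define FACTORY foo_factory
    #include "rml_factory.h"        // no #ifndef guard: the body expands against the macros above

    } // rml
    } // foo

The absence of an include guard is deliberate: rml_factory.h is meant to be expanded once per (FACTORY, SERVER, CLIENT) binding rather than included as an ordinary header.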
+Up to parent directory
+
+Copyright © 2005-2009 Intel Corporation. All Rights Reserved.
+
+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are registered
+trademarks or trademarks of Intel Corporation or its subsidiaries in the United
+States and other countries.
+
+* Other names and brands may be claimed as the property of others. + + + diff --git a/dep/tbb/src/rml/client/library_assert.h b/dep/tbb/src/rml/client/library_assert.h new file mode 100644 index 000000000..6d8300b94 --- /dev/null +++ b/dep/tbb/src/rml/client/library_assert.h @@ -0,0 +1,41 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef LIBRARY_ASSERT_H +#define LIBRARY_ASSERT_H + +#ifndef LIBRARY_ASSERT +#ifdef KMP_ASSERT2 +#define LIBRARY_ASSERT(x,y) KMP_ASSERT2((x),(y)) +#else +#include +#define LIBRARY_ASSERT(x,y) assert(x) +#endif +#endif /* LIBRARY_ASSERT */ + +#endif /* LIBRARY_ASSERT_H */ diff --git a/dep/tbb/src/rml/client/omp_dynamic_link.cpp b/dep/tbb/src/rml/client/omp_dynamic_link.cpp new file mode 100644 index 000000000..0f89a3ccb --- /dev/null +++ b/dep/tbb/src/rml/client/omp_dynamic_link.cpp @@ -0,0 +1,32 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#include "omp_dynamic_link.h" +#include "library_assert.h" +#include "tbb/dynamic_link.cpp" // Refers to src/tbb, not include/tbb + diff --git a/dep/tbb/src/rml/client/omp_dynamic_link.h b/dep/tbb/src/rml/client/omp_dynamic_link.h new file mode 100644 index 000000000..290b668fc --- /dev/null +++ b/dep/tbb/src/rml/client/omp_dynamic_link.h @@ -0,0 +1,37 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __KMP_omp_dynamic_link_H +#define __KMP_omp_dynamic_link_H + +#define OPEN_INTERNAL_NAMESPACE namespace __kmp { +#define CLOSE_INTERNAL_NAMESPACE } + +#include "tbb/dynamic_link.h" // Refers to src/tbb, not include/tbb + +#endif /* __KMP_omp_dynamic_link_H */ diff --git a/dep/tbb/src/rml/client/rml_factory.h b/dep/tbb/src/rml/client/rml_factory.h new file mode 100644 index 000000000..2f584b9cf --- /dev/null +++ b/dep/tbb/src/rml/client/rml_factory.h @@ -0,0 +1,100 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +// No ifndef guard because this file is not a normal include file. + +// FIXME - resolve whether _debug version of the RML should have different suffix. */ + +#if TBB_USE_DEBUG +#define DEBUG_SUFFIX "_debug" +#else +#define DEBUG_SUFFIX +#endif /* TBB_USE_DEBUG */ + +// RML_SERVER_NAME is the name of the RML server library. +#if _WIN32||_WIN64 +#define RML_SERVER_NAME "irml" DEBUG_SUFFIX ".dll" +#elif __APPLE__ +#define RML_SERVER_NAME "libirml" DEBUG_SUFFIX ".dylib" +#elif __linux__ +#define RML_SERVER_NAME "libirml" DEBUG_SUFFIX ".so.1" +#elif __FreeBSD__ || __sun +#define RML_SERVER_NAME "libirml" DEBUG_SUFFIX ".so" +#else +#error Unknown OS +#endif + +#include "library_assert.h" + +const ::rml::versioned_object::version_type CLIENT_VERSION = 1; + +::rml::factory::status_type FACTORY::open() { + // Failure of following assertion indicates that factory is already open, or not zero-inited. + LIBRARY_ASSERT( !library_handle, NULL ); + status_type (*open_factory_routine)( factory&, version_type&, version_type ); + dynamic_link_descriptor server_link_table[4] = { + DLD(__RML_open_factory,open_factory_routine), + MAKE_SERVER(my_make_server_routine), + DLD(__RML_close_factory,my_wait_to_close_routine), + GET_INFO(my_call_with_server_info_routine), + }; + status_type result; + dynamic_link_handle h; + if( dynamic_link( RML_SERVER_NAME, server_link_table, 4, 4, &h ) ) { + library_handle = h; + version_type server_version; + status_type result = (*open_factory_routine)( *this, server_version, CLIENT_VERSION ); + // server_version can be checked here for incompatibility here if necessary. + return result; + } else { + library_handle = NULL; + result = st_not_found; + } + return result; +} + +void FACTORY::close() { + if( library_handle ) { + (*my_wait_to_close_routine)(*this); + dynamic_link_handle h = library_handle; + dynamic_unlink(h); + library_handle = NULL; + } +} + +::rml::factory::status_type FACTORY::make_server( SERVER*& s, CLIENT& c) { + // Failure of following assertion means that factory was not successfully opened. + LIBRARY_ASSERT( my_make_server_routine, NULL ); + return (*my_make_server_routine)(*this,s,c); +} + +void FACTORY::call_with_server_info( ::rml::server_info_callback_t cb, void* arg ) const { + // Failure of following assertion means that factory was not successfully opened. + LIBRARY_ASSERT( my_call_with_server_info_routine, NULL ); + (*my_call_with_server_info_routine)( cb, arg ); +} diff --git a/dep/tbb/src/rml/client/rml_omp.cpp b/dep/tbb/src/rml/client/rml_omp.cpp new file mode 100644 index 000000000..38a5a5f63 --- /dev/null +++ b/dep/tbb/src/rml/client/rml_omp.cpp @@ -0,0 +1,44 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "rml_omp.h" +#include "omp_dynamic_link.h" +#include + +namespace __kmp { +namespace rml { + +#define MAKE_SERVER(x) DLD(__KMP_make_rml_server,x) +#define GET_INFO(x) DLD(__KMP_call_with_my_server_info,x) +#define SERVER omp_server +#define CLIENT omp_client +#define FACTORY omp_factory +#include "rml_factory.h" + +} // rml +} // __kmp diff --git a/dep/tbb/src/rml/client/rml_tbb.cpp b/dep/tbb/src/rml/client/rml_tbb.cpp new file mode 100644 index 000000000..7e1612e28 --- /dev/null +++ b/dep/tbb/src/rml/client/rml_tbb.cpp @@ -0,0 +1,46 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "../include/rml_tbb.h" +#include "tbb/dynamic_link.h" +#include + +namespace tbb { +namespace internal { +namespace rml { + +#define MAKE_SERVER(x) DLD(__TBB_make_rml_server,x) +#define GET_INFO(x) DLD(__TBB_call_with_my_server_info,x) +#define SERVER tbb_server +#define CLIENT tbb_client +#define FACTORY tbb_factory +#include "rml_factory.h" + +} // rml +} // internal +} // tbb diff --git a/dep/tbb/src/rml/include/index.html b/dep/tbb/src/rml/include/index.html new file mode 100644 index 000000000..aacad333b --- /dev/null +++ b/dep/tbb/src/rml/include/index.html @@ -0,0 +1,30 @@ + + +

+Overview
+
+This directory has the include files for the Resource Management Layer (RML).
+
+Files
+
+rml_base.h
+    Interfaces shared by TBB and OpenMP.
+
+rml_omp.h
+    Interface exclusive to OpenMP.
+
+rml_tbb.h
+    Interface exclusive to TBB (a usage sketch follows this list).
+
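Taken together, these headers define a small contract: a client derives from the TBB- or OpenMP-specific client class, implements the virtual callbacks inherited from ::rml::client, and obtains a server through the matching factory. As a hedged illustration, a minimal TBB-side client written only against the declarations in rml_base.h and rml_tbb.h (both appear later in this patch) might look like the sketch below; demo_client, the_factory, and the constants are hypothetical.

    #include <cstddef>
    #include "rml_tbb.h"   // declares tbb_client, tbb_server, tbb_factory

    // Hypothetical client: supplies the callbacks the RML server will invoke.
    class demo_client: public tbb::internal::rml::tbb_client {
    public:
        version_type version() const { return 1; }
        size_type max_job_count() const { return 4; }      // can use at most 4 workers profitably
        std::size_t min_stack_size() const { return 0; }   // 0 = default stack size
        policy_type policy() const { return turnaround; }
        job* create_one_job() { return new job; }
        void cleanup( job& j ) { delete &j; }
        void acknowledge_close_connection() { /* all jobs have been cleaned up */ }
        void process( job& ) { /* do TBB work until it is okay to yield */ }
    };

    // The factory must be zero-initialized, hence file scope (per rml_tbb.h).
    static tbb::internal::rml::tbb_factory the_factory;

    void demo_connect() {
        static demo_client client;
        tbb::internal::rml::tbb_server* server = 0;
        if( the_factory.open()==::rml::factory::st_success
            && the_factory.make_server( server, client )==::rml::factory::st_success ) {
            server->adjust_job_count_estimate( 2 );  // hint: two extra workers are useful now
            server->request_close_connection();      // cleanup()/acknowledge follow later
        }
        the_factory.close();
    }

Note that request_close_connection() is asynchronous: per rml_base.h, the server calls cleanup() for every job and only then acknowledge_close_connection(), so a real client must defer its own teardown until that final callback arrives.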
+Up to parent directory
+
+Copyright © 2005-2009 Intel Corporation. All Rights Reserved.
+
+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are registered
+trademarks or trademarks of Intel Corporation or its subsidiaries in the United
+States and other countries.
+
+* Other names and brands may be claimed as the property of others. + + + diff --git a/dep/tbb/src/rml/include/rml_base.h b/dep/tbb/src/rml/include/rml_base.h new file mode 100644 index 000000000..148edb28b --- /dev/null +++ b/dep/tbb/src/rml/include/rml_base.h @@ -0,0 +1,186 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +// Header guard and namespace names follow rml conventions. + +#ifndef __RML_rml_base_H +#define __RML_rml_base_H + +#include +#if _WIN32||_WIN64 +#include +#endif /* _WIN32||_WIN64 */ + +#ifdef RML_PURE_VIRTUAL_HANDLER +#define RML_PURE(T) {RML_PURE_VIRTUAL_HANDLER(); return (T)0;} +#else +#define RML_PURE(T) = 0; +#endif + +namespace rml { + +//! Base class for denying assignment and copy constructor. +class no_copy { + void operator=( no_copy& ); + no_copy( no_copy& ); +public: + no_copy() {} +}; + +class server; + +class versioned_object { +public: + //! A version number + typedef unsigned version_type; + + //! Get version of this object + /** The version number is incremented when a incompatible change is introduced. + The version number is invariant for the lifetime of the object. */ + virtual version_type version() const RML_PURE(version_type) +}; + +//! Represents a client's job for an execution context. +/** A job object is constructed by the client. + Not derived from versioned_object because version is same as for client. */ +class job { + friend class server; + + //! Word for use by server + /** Typically the server uses it to speed up internal lookup. + Clients must not modify the word. */ + void* scratch_ptr; +}; + +//! Information that client provides to server when asking for a server. +/** The instance must endure at least until acknowledge_close_connection is called. */ +class client: public versioned_object { +public: + //! Typedef for convenience of derived classes in other namespaces. + typedef ::rml::job job; + + //! Index of a job in a job pool + typedef unsigned size_type; + + //! Maximum number of threads that client can exploit profitably if nothing else is running on the machine. + /** The returned value should remain invariant for the lifetime of the connection. [idempotent] */ + virtual size_type max_job_count() const RML_PURE(size_type) + + //! 
Minimum stack size for each job. 0 means to use default stack size. [idempotent] + virtual std::size_t min_stack_size() const RML_PURE(std::size_t) + + //! Server calls this routine when it needs client to create a job object. + /** Value of index is guaranteed to be unique for each job and in the half-open + interval [0,max_job_count) */ + virtual job* create_one_job() RML_PURE(job*) + + //! Acknowledge that all jobs have been cleaned up. + /** Called by server in response to request_close_connection + after cleanup(job) has been called for each job. */ + virtual void acknowledge_close_connection() RML_PURE(void) + + enum policy_type {turnaround,throughput}; + + //! Inform server of desired policy. [idempotent] + virtual policy_type policy() const RML_PURE(policy_type) + + //! Inform client that server is done with *this. + /** Client should destroy the job. + Not necessarily called by execution context represented by *this. + Never called while any other thread is working on the job. */ + virtual void cleanup( job& ) RML_PURE(void) + + // In general, we should not add new virtual methods, because that would + // break derived classes. Think about reserving some vtable slots. +}; + +// Information that server provides to client. +// Virtual functions are routines provided by the server for the client to call. +class server: public versioned_object { +public: + //! Typedef for convenience of derived classes. + typedef ::rml::job job; + + //! Request that connection to server be closed. + /** Causes each job associated with the client to have its cleanup method called, + possibly by a thread different than the thread that created the job. + This method can return before all cleanup methods return. + Actions that have to wait after all cleanup methods return should be part of + client::acknowledge_close_connection. */ + virtual void request_close_connection() = 0; + + //! Called by client thread when it reaches a point where it cannot make progress until other threads do. + virtual void yield() = 0; + + //! Called by client to indicate a change in the number of non-RML threads that are running. + /** This is a performance hint to the RML to adjust how many many threads it should let run + concurrently. The delta is the change in the number of non-RML threads that are running. + For example, a value of 1 means the client has started running another thread, and a value + of -1 indicates that the client has blocked or terminated one of its threads. */ + virtual void independent_thread_number_changed( int delta ) = 0; + + //! Default level of concurrency for which RML strives when there are no non-RML threads running. + /** Normally, the value is the hardware concurrency minus one. + The "minus one" accounts for the thread created by main(). */ + virtual unsigned default_concurrency() const = 0; + +protected: + static void*& scratch_ptr( job& j ) {return j.scratch_ptr;} +}; + +class factory { +public: + //! status results + enum status_type { + st_success=0, + st_connection_exists, + st_not_found, + st_incompatible + }; + + //! Scratch pointer for use by RML. + void* scratch_ptr; + +protected: + //! Pointer to routine that waits for server to indicate when client can close itself. + status_type (*my_wait_to_close_routine)( factory& ); + +public: + //! Library handle for use by RML. 
+#if _WIN32||_WIN64 + HMODULE library_handle; +#else + void* library_handle; +#endif /* _WIN32||_WIN64 */ +}; + +typedef void (*server_info_callback_t)( void* arg, const char* server_info ); + +} // namespace rml + +#endif /* __RML_rml_base_H */ diff --git a/dep/tbb/src/rml/include/rml_omp.h b/dep/tbb/src/rml/include/rml_omp.h new file mode 100644 index 000000000..d664908d3 --- /dev/null +++ b/dep/tbb/src/rml/include/rml_omp.h @@ -0,0 +1,123 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +// Header guard and namespace names follow OpenMP runtime conventions. + +#ifndef KMP_RML_OMP_H +#define KMP_RML_OMP_H + +#include "rml_base.h" + +namespace __kmp { +namespace rml { + +class omp_client; + +//------------------------------------------------------------------------ +// Classes instantiated by the server +//------------------------------------------------------------------------ + +//! Represents a set of worker threads provided by the server. +class omp_server: public ::rml::server { +public: + //! A number of threads + typedef unsigned size_type; + + //! Return the number of coins in the bank. (negative if machine is oversubscribed). + virtual int current_balance() const RML_PURE(int); + + //! Request n coins. Returns number of coins granted. Oversubscription amount if negative. + /** Always granted if is_strict is true. + - Positive or zero result indicates that the number of coins was taken from the bank. + - Negative result indicates that no coins were taken, and that the bank has deficit + by that amount and the caller (if being a good citizen) should return that many coins. + */ + virtual int try_increase_load( size_type /*n*/, bool /*strict*/ ) RML_PURE(size_type) + + //! Return n coins into the bank. + virtual void decrease_load( size_type /*n*/ ) RML_PURE(void); + + //! Convert n coins into n threads. + /** When a thread returns, it is converted back into a coin and the coin is returned to the bank. */ + virtual void get_threads( size_type /*m*/, void* /*cookie*/, job* /*array*/[] ) RML_PURE(void); + + /** Putting a thread to sleep - convert a thread into a coin + Waking up a thread - convert a coin into a thread + + Note: conversion between a coin and a thread does not affect the accounting. 
+ */ +}; + + +//------------------------------------------------------------------------ +// Classes (or base classes thereof) instantiated by the client +//------------------------------------------------------------------------ + +class omp_client: public ::rml::client { +public: + //! Called by server thread when it runs its part of a parallel region. + /** The index argument is a 0-origin index of this thread within the array + returned by method get_threads. Server decreases the load by 1 after this method returns. */ + virtual void process( job&, void* /*cookie*/, size_type /*index*/ ) RML_PURE(void) +}; + +/** Client must ensure that instance is zero-inited, typically by being a file-scope object. */ +class omp_factory: public ::rml::factory { + + //! Pointer to routine that creates an RML server. + status_type (*my_make_server_routine)( omp_factory&, omp_server*&, omp_client& ); + + //! Pointer to routine that returns server version info. + void (*my_call_with_server_info_routine)( ::rml::server_info_callback_t cb, void* arg ); + +public: + typedef ::rml::versioned_object::version_type version_type; + typedef omp_client client_type; + typedef omp_server server_type; + + //! Open factory. + /** Dynamically links against RML library. + Returns st_success, st_incompatible, or st_not_found. */ + status_type open(); + + //! Factory method to be called by client to create a server object. + /** Factory must be open. + Returns st_success or st_incompatible . */ + status_type make_server( server_type*&, client_type& ); + + //! Close factory. + void close(); + + //! Call the callback with the server build info. + void call_with_server_info( ::rml::server_info_callback_t cb, void* arg ) const; +}; + +} // namespace rml +} // namespace __kmp + +#endif /* KMP_RML_OMP_H */ diff --git a/dep/tbb/src/rml/include/rml_tbb.h b/dep/tbb/src/rml/include/rml_tbb.h new file mode 100644 index 000000000..3c0d8a94c --- /dev/null +++ b/dep/tbb/src/rml/include/rml_tbb.h @@ -0,0 +1,98 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +// Header guard and namespace names follow TBB conventions. 
+ +#ifndef __TBB_rml_tbb_H +#define __TBB_rml_tbb_H + +#include "rml_base.h" + +namespace tbb { +namespace internal { +namespace rml { + +class tbb_client; + +//------------------------------------------------------------------------ +// Classes instantiated by the server +//------------------------------------------------------------------------ +class tbb_server: public ::rml::server { +public: + //! Inform server of adjustments in the number of workers that the client can profitably use. + virtual void adjust_job_count_estimate( int delta ) = 0; +}; + +//------------------------------------------------------------------------ +// Classes instantiated by the client +//------------------------------------------------------------------------ + +class tbb_client: public ::rml::client { +public: + //! Defined by TBB to steal a task and execute it. + /** Called by server when wants an execution context to do some TBB work. + The method should return when it is okay for the thread to yield indefinitely. */ + virtual void process( job& ) = 0; +}; + +/** Client must ensure that instance is zero-inited, typically by being a file-scope object. */ +class tbb_factory: public ::rml::factory { + + //! Pointer to routine that creates an RML server. + status_type (*my_make_server_routine)( tbb_factory&, tbb_server*&, tbb_client& ); + + //! Pointer to routine that returns server version info. + void (*my_call_with_server_info_routine)( ::rml::server_info_callback_t cb, void* arg ); + +public: + typedef ::rml::versioned_object::version_type version_type; + typedef tbb_client client_type; + typedef tbb_server server_type; + + //! Open factory. + /** Dynamically links against RML library. + Returns st_success, st_incompatible, or st_not_found. */ + status_type open(); + + //! Factory method to be called by client to create a server object. + /** Factory must be open. + Returns st_success, st_connection_exists, or st_incompatible . */ + status_type make_server( server_type*&, client_type& ); + + //! Close factory + void close(); + + //! Call the callback with the server build info + void call_with_server_info( ::rml::server_info_callback_t cb, void* arg ) const; +}; + +} // namespace rml +} // namespace internal +} // namespace tbb + +#endif /*__TBB_rml_tbb_H */ diff --git a/dep/tbb/src/rml/index.html b/dep/tbb/src/rml/index.html new file mode 100644 index 000000000..9c403afa5 --- /dev/null +++ b/dep/tbb/src/rml/index.html @@ -0,0 +1,32 @@ + + +
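The TBB-side binding declared in rml_tbb.h just above is driven the same way; a compressed, purely illustrative sketch follows. Here my_tbb_client is a placeholder for the scheduler's tbb_client implementation and the delta of 4 is arbitrary.

    // Sketch (not part of the library): the TBB scheduler side of the connection.
    #include "rml_tbb.h"

    using namespace tbb::internal::rml;

    static my_tbb_client the_client;   // hypothetical tbb_client implementation; zero-inited file-scope object
    static tbb_factory   the_factory;

    void example_announce_work( tbb_server*& server ) {
        if( the_factory.open()!=tbb_factory::st_success )
            return;
        // Only one TBB connection can exist at a time; otherwise st_connection_exists is returned.
        if( the_factory.make_server( server, the_client )!=tbb_factory::st_success )
            return;
        server->adjust_job_count_estimate( +4 );   // work arrived: up to 4 extra workers are useful
        // ... workers call the_client.process(job) while the estimate stays positive ...
        server->adjust_job_count_estimate( -4 );   // the estimate must net back to zero before disconnecting
    }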

+Overview
+
+The subdirectories pertain to the Resource Management Layer (RML).
+
+Directories
+
+include/ - Include files used by clients of RML.
+client/  - Source files for code that must be statically linked with a client.
+server/  - Source files for the RML server.
+test/    - Unit tests for RML server and its components.
+
+Up to parent directory
+
+Copyright © 2005-2009 Intel Corporation. All Rights Reserved.
+
+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are
+registered trademarks or trademarks of Intel Corporation or its
+subsidiaries in the United States and other countries.
+* Other names and brands may be claimed as the property of others. + + + diff --git a/dep/tbb/src/rml/server/index.html b/dep/tbb/src/rml/server/index.html new file mode 100644 index 000000000..e2750c643 --- /dev/null +++ b/dep/tbb/src/rml/server/index.html @@ -0,0 +1,19 @@ + + +

+Overview
+
+This directory has source code internal to the server.
+
+Up to parent directory
+
+Copyright © 2005-2009 Intel Corporation. All Rights Reserved.
+
+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are
+registered trademarks or trademarks of Intel Corporation or its
+subsidiaries in the United States and other countries.
+* Other names and brands may be claimed as the property of others. + + + diff --git a/dep/tbb/src/rml/server/irml.rc b/dep/tbb/src/rml/server/irml.rc new file mode 100644 index 000000000..35e5db81d --- /dev/null +++ b/dep/tbb/src/rml/server/irml.rc @@ -0,0 +1,126 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + +// Microsoft Visual C++ generated resource script. +// +#ifdef APSTUDIO_INVOKED +#ifndef APSTUDIO_READONLY_SYMBOLS +#define _APS_NO_MFC 1 +#define _APS_NEXT_RESOURCE_VALUE 102 +#define _APS_NEXT_COMMAND_VALUE 40001 +#define _APS_NEXT_CONTROL_VALUE 1001 +#define _APS_NEXT_SYMED_VALUE 101 +#endif +#endif + +#define APSTUDIO_READONLY_SYMBOLS +///////////////////////////////////////////////////////////////////////////// +// +// Generated from the TEXTINCLUDE 2 resource. +// +#include +#define ENDL "\r\n" +#include "tbb/tbb_version.h" + +///////////////////////////////////////////////////////////////////////////// +#undef APSTUDIO_READONLY_SYMBOLS + +///////////////////////////////////////////////////////////////////////////// +// Neutral resources + +#if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_NEU) +#ifdef _WIN32 +LANGUAGE LANG_NEUTRAL, SUBLANG_NEUTRAL +#pragma code_page(1252) +#endif //_WIN32 + +///////////////////////////////////////////////////////////////////////////// +// manifest integration +#ifdef TBB_MANIFEST +#include "winuser.h" +2 RT_MANIFEST tbbmanifest.exe.manifest +#endif + +///////////////////////////////////////////////////////////////////////////// +// +// Version +// + +VS_VERSION_INFO VERSIONINFO + FILEVERSION TBB_VERNUMBERS + PRODUCTVERSION TBB_VERNUMBERS + FILEFLAGSMASK 0x17L +#ifdef _DEBUG + FILEFLAGS 0x1L +#else + FILEFLAGS 0x0L +#endif + FILEOS 0x40004L + FILETYPE 0x2L + FILESUBTYPE 0x0L +BEGIN + BLOCK "StringFileInfo" + BEGIN + BLOCK "000004b0" + BEGIN + VALUE "CompanyName", "Intel Corporation\0" + VALUE "FileDescription", "Resource manager library\0" + VALUE "FileVersion", TBB_VERSION "\0" +//what is it? 
VALUE "InternalName", "irml\0" + VALUE "LegalCopyright", "Copyright (C) 2009\0" + VALUE "LegalTrademarks", "\0" +#ifndef TBB_USE_DEBUG + VALUE "OriginalFilename", "irml.dll\0" +#else + VALUE "OriginalFilename", "irml_debug.dll\0" +#endif + VALUE "ProductName", "Threading Building Blocks\0" + VALUE "ProductVersion", TBB_VERSION "\0" + VALUE "Comments", TBB_VERSION_STRINGS "\0" + VALUE "PrivateBuild", "\0" + VALUE "SpecialBuild", "\0" + END + END + BLOCK "VarFileInfo" + BEGIN + VALUE "Translation", 0x0, 1200 + END +END + +#endif // Neutral resources +///////////////////////////////////////////////////////////////////////////// + + +#ifndef APSTUDIO_INVOKED +///////////////////////////////////////////////////////////////////////////// +// +// Generated from the TEXTINCLUDE 3 resource. +// + + +///////////////////////////////////////////////////////////////////////////// +#endif // not APSTUDIO_INVOKED + diff --git a/dep/tbb/src/rml/server/job_automaton.h b/dep/tbb/src/rml/server/job_automaton.h new file mode 100644 index 000000000..7e3c4f354 --- /dev/null +++ b/dep/tbb/src/rml/server/job_automaton.h @@ -0,0 +1,157 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __RML_job_automaton_H +#define __RML_job_automaton_H + +#include "rml_base.h" +#include "tbb/atomic.h" + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + // Workaround for overzealous compiler warnings + #pragma warning (push) + #pragma warning (disable: 4244) +#endif + +namespace rml { + +namespace internal { + +//! Finite state machine. +/** /--------------\ + / V + 0 --> 1--> ptr --> -1 + ^ + | + | + V + ptr|1 + +"owner" = corresponding server_thread. +Odd states indicate that someone is executing code on the job. +Furthermore, odd states!=-1 indicate that owner will read its mailbox shortly. +Most transitions driven only by owner. +Transition 0-->-1 is driven by non-owner. +Transition ptr->-1 is driven by owner or non-owner. +*/ +class job_automaton: no_copy { +private: + tbb::atomic my_job; +public: + /** Created by non-owner */ + job_automaton() { + my_job = 0; + } + + ~job_automaton() { + __TBB_ASSERT( my_job==-1, "must plug before destroying" ); + } + + //! Try to transition 0-->1 or ptr-->ptr|1. + /** Should only be called by owner. 
*/ + bool try_acquire() { + intptr_t snapshot = my_job; + if( snapshot==-1 ) { + return false; + } else { + __TBB_ASSERT( (snapshot&1)==0, "already marked that way" ); + intptr_t old = my_job.compare_and_swap( snapshot|1, snapshot ); + __TBB_ASSERT( old==snapshot || old==-1, "unexpected interference" ); + return old==snapshot; + } + } + //! Transition ptr|1-->ptr + /** Should only be called by owner. */ + void release() { + intptr_t snapshot = my_job; + __TBB_ASSERT( snapshot&1, NULL ); + // Atomic store suffices here. + my_job = snapshot&~1; + } + + //! Transition 1-->ptr + /** Should only be called by owner. */ + void set_and_release( rml::job& job ) { + intptr_t value = reinterpret_cast(&job); + __TBB_ASSERT( (value&1)==0, "job misaligned" ); + __TBB_ASSERT( value!=0, "null job" ); + __TBB_ASSERT( my_job==1, "already set, or not marked busy?" ); + // Atomic store suffices here. + my_job = value; + } + + //! Transition 0-->-1 + /** If successful, return true. */ + bool try_plug_null() { + return my_job.compare_and_swap( -1, 0 )==0; + } + + //! Try to transition to -1. If successful, set j to contents and return true. + /** Called by owner or non-owner. */ + bool try_plug( rml::job*&j ) { + for(;;) { + intptr_t snapshot = my_job; + if( snapshot&1 ) { + // server_thread that owns job is executing a mailbox item for the job, + // and will thus read its mailbox afterwards, and see a terminate request + // for the job. + j = NULL; + return false; + } + // Not busy + if( my_job.compare_and_swap( -1, snapshot )==snapshot ) { + j = reinterpret_cast(snapshot); + return true; + } + // Need to retry, because current thread may be nonowner that read a 0, and owner might have + // caused transition 0->1->ptr after we took our snapshot. + } + } + + /** Called by non-owner to wait for transition to ptr. */ + rml::job& wait_for_job() const { + intptr_t snapshot; + for(;;) { + snapshot = my_job; + if( snapshot&~1 ) break; + __TBB_Yield(); + } + __TBB_ASSERT( snapshot!=-1, "wait on plugged job_automaton" ); + return *reinterpret_cast(snapshot&~1); + } +}; + +} // namespace internal +} // namespace rml + + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + #pragma warning (pop) +#endif // warning 4244 are back + +#endif /* __RML_job_automaton_H */ diff --git a/dep/tbb/src/rml/server/lin-rml-export.def b/dep/tbb/src/rml/server/lin-rml-export.def new file mode 100644 index 000000000..2c332aa0d --- /dev/null +++ b/dep/tbb/src/rml/server/lin-rml-export.def @@ -0,0 +1,38 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +{ +global: +__RML_open_factory; +__RML_close_factory; +__TBB_make_rml_server; +__KMP_make_rml_server; +__TBB_call_with_my_server_info; +__KMP_call_with_my_server_info; +local:*; +}; diff --git a/dep/tbb/src/rml/server/rml_server.cpp b/dep/tbb/src/rml/server/rml_server.cpp new file mode 100644 index 000000000..0ffdfe72c --- /dev/null +++ b/dep/tbb/src/rml/server/rml_server.cpp @@ -0,0 +1,1287 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "rml_tbb.h" +#define private public /* Sleazy trick to avoid publishing internal names in public header. */ +#include "rml_omp.h" +#undef private + +#include "tbb/tbb_allocator.h" +#include "tbb/cache_aligned_allocator.h" +#include "job_automaton.h" +#include "wait_counter.h" +#include "thread_monitor.h" +#include "tbb/aligned_space.h" +#include "tbb/atomic.h" +#include "tbb/tbb_misc.h" // Get DetectNumberOfWorkers() from here. +#if _MSC_VER==1500 && !defined(__INTEL_COMPILER) +// VS2008/VC9 seems to have an issue; +#pragma warning( push ) +#pragma warning( disable: 4985 ) +#endif +#include "tbb/concurrent_vector.h" +#if _MSC_VER==1500 && !defined(__INTEL_COMPILER) +#pragma warning( pop ) +#endif + +namespace rml { + +namespace internal { + +//! 
Number of hardware contexts +static inline unsigned hardware_concurrency() { + static unsigned DefaultNumberOfThreads = 0; + unsigned n = DefaultNumberOfThreads; + if( !n ) DefaultNumberOfThreads = n = tbb::internal::DetectNumberOfWorkers(); + return n; +} + +using tbb::internal::rml::tbb_client; +using tbb::internal::rml::tbb_server; + +using __kmp::rml::omp_client; +using __kmp::rml::omp_server; + +typedef versioned_object::version_type version_type; + +const version_type SERVER_VERSION = 1; + +static const size_t cache_line_size = tbb::internal::NFS_MaxLineSize; + +template class generic_connection; +class tbb_connection_v1; +class omp_connection_v1; + +enum request_kind { + rk_none, + rk_initialize_tbb_job, + rk_terminate_tbb_job, + rk_initialize_omp_job, + rk_terminate_omp_job +}; + +//! State of a server_thread +/** Below is a diagram of legal state transitions. + + OMP + ts_omp_busy + ^ ^ + / \ + / V + ts_asleep <-----------> ts_idle + + TBB + ts_tbb_busy + ^ ^ + / \ + / V + ts_asleep <-----------> ts_idle --> ts_done + + For TBB only. Extra state transition. + + ts_created -> ts_started -> ts_visited + */ +enum thread_state_t { + //! Thread not doing anything useful, but running and looking for work. + ts_idle, + //! Thread not doing anything useful and is asleep */ + ts_asleep, + //! Thread is enlisted into OpenMP team + ts_omp_busy, + //! Thread is busy doing TBB work. + ts_tbb_busy, + //! For tbb threads only + ts_done, + ts_created, + ts_started, + ts_visited +}; + +#if TBB_USE_ASSERT +#define PRODUCE_ARG(x) ,x +#else +#define PRODUCE_ARG(x) +#endif + +//! Synchronizes dispatch of OpenMP work. +class omp_dispatch_type { + typedef ::rml::job job_type; + omp_client* client; + void* cookie; + omp_client::size_type index; + tbb::atomic job; +#if TBB_USE_ASSERT + omp_connection_v1* server; +#endif /* TBB_USE_ASSERT */ +public: + omp_dispatch_type() {job=NULL;} + void consume(); + void produce( omp_client& c, job_type& j, void* cookie_, omp_client::size_type index_ PRODUCE_ARG( omp_connection_v1& s )) { + __TBB_ASSERT( &j, NULL ); + __TBB_ASSERT( !job, "job already set" ); + client = &c; +#if TBB_USE_ASSERT + server = &s; +#endif /* TBB_USE_ASSERT */ + cookie = cookie_; + index = index_; + // Must be last + job = &j; + } +}; + +//! A reference count. +/** No default constructor, because clients must be very careful about whether the + initial reference count is 0 or 1. */ +class ref_count: no_copy { + tbb::atomic my_ref_count; +public: + ref_count(int k ) {my_ref_count=k;} + ~ref_count() {__TBB_ASSERT( !my_ref_count, "premature destruction of refcounted object" );} + //! Add one and return new value. + int add_ref() { + int k = ++my_ref_count; + __TBB_ASSERT(k>=1,"reference count underflowed before add_ref"); + return k; + } + //! Subtract one and return new value. + int remove_ref() { + int k = --my_ref_count; + __TBB_ASSERT(k>=0,"reference count underflow"); + return k; + } +}; + +//! Forward declaration +class server_thread; +class thread_map; + +//! thread_map_base; we need to make the iterator type available to server_thread +struct thread_map_base { + //! 
A value in the map + class value_type { + public: + server_thread& thread() { + __TBB_ASSERT( my_thread, "thread_map::value_type::thread() called when !my_thread" ); + return *my_thread; + } + rml::job& job() { + __TBB_ASSERT( my_job, "thread_map::value_type::job() called when !my_job" ); + return *my_job; + } + value_type() : my_thread(NULL), my_job(NULL) {} + server_thread& wait_for_thread() const { + for(;;) { + server_thread* ptr=const_cast(my_thread); + if( ptr ) + return *ptr; + __TBB_Yield(); + } + } + /** Shortly after when a connection is established, it is possible for the server + to grab a server_thread that has not yet created a job object for that server. */ + rml::job& wait_for_job() const { + if( !my_job ) { + my_job = &my_automaton.wait_for_job(); + } + return *my_job; + } + private: + server_thread* my_thread; + /** Marked mutable because though it is physically modified, conceptually it is a duplicate of + the job held by job_automaton. */ + mutable rml::job* my_job; + job_automaton my_automaton; +// FIXME - pad out to cache line, because my_automaton is hit hard by thread() + friend class thread_map; + }; + typedef tbb::concurrent_vector > array_type; +}; + +template +class padded: public T { + char pad[cache_line_size - sizeof(T)%cache_line_size]; +}; + +// FIXME - should we pad out memory to avoid false sharing of our global variables? + +static tbb::atomic the_balance; +static tbb::atomic the_balance_inited; + +//! Per thread information +/** ref_count holds number of clients that are using this, + plus 1 if a host thread owns this instance. */ +class server_thread: public ref_count { + friend class thread_map; + template friend class generic_connection; + //! Integral type that can hold a thread_state_t + typedef int thread_state_rep_t; + tbb::atomic state; +public: + thread_monitor monitor; + // FIXME: make them private... + bool is_omp_thread; + tbb::atomic tbb_state; + server_thread* link; // FIXME: this is a temporary fix. Remove when all is done. + thread_map_base::array_type::iterator my_map_pos; +private: + rml::server *my_conn; + rml::job* my_job; + job_automaton* my_ja; + size_t my_index; + +#if TBB_USE_ASSERT + //! Flag used to check if thread is still using *this. + bool has_active_thread; +#endif /* TBB_USE_ASSERT */ + + //! Volunteer to sleep. + void sleep_perhaps( thread_state_t asleep ); + + //! Destroy job corresponding to given client + /** Return true if thread must quit. */ + template + bool destroy_job( Connection& c ); + + //! Process requests + /** Return true if thread must quit. */ + bool process_requests(); + + void loop(); + static __RML_DECL_THREAD_ROUTINE thread_routine( void* arg ); +public: + thread_state_t read_state() const { + thread_state_rep_t s = state; + __TBB_ASSERT( unsigned(s)<=unsigned(ts_done), "corrupted server thread?" ); + return thread_state_t(s); + } + + tbb::atomic request; + + omp_dispatch_type omp_dispatch; + + server_thread(); + ~server_thread(); + + //! Launch a thread that is bound to *this. + void launch( size_t stack_size ); + + //! Attempt to wakeup a thread + /** The value "to" is the new state for the thread, if it was woken up. + Returns true if thread was woken up, false otherwise. */ + bool wakeup( thread_state_t to, thread_state_t from ); + + //! Attempt to enslave a thread for OpenMP/TBB. + bool try_grab_for( thread_state_t s ); +}; + +//! Bag of threads that are private to a client. +class private_thread_bag { + struct list_thread: server_thread { + list_thread* next; + }; + //! 
Root of atomic linked list of list_thread + /** ABA problem is avoided because items are only atomically pushed, never popped. */ + tbb::atomic my_root; + tbb::cache_aligned_allocator > my_allocator; +public: + //! Construct empty bag + private_thread_bag() {my_root=NULL;} + + //! Create a fresh server_thread object. + server_thread& add_one_thread() { + list_thread* t = my_allocator.allocate(1); + new( t ) list_thread; + // Atomically add to list + list_thread* old_root; + do { + old_root = my_root; + t->next = old_root; + } while( my_root.compare_and_swap( t, old_root )!=old_root ); + return *t; + } + + //! Destroy the bag and threads in it. + ~private_thread_bag() { + while( my_root ) { + // Unlink thread from list. + list_thread* t = my_root; + my_root = t->next; + // Destroy and deallocate the thread. + t->~list_thread(); + my_allocator.deallocate(static_cast*>(t),1); + } + } +}; + +//! Forward declaration +void wakeup_some_tbb_threads(); + +//! Type-independent part of class generic_connection. * +/** One to one map from server threads to jobs, and associated reference counting. */ +class thread_map : public thread_map_base { +public: + typedef rml::client::size_type size_type; + //! ctor + thread_map( wait_counter& fc, ::rml::client& client ) : + all_visited_at_least_once(false), my_min_stack_size(0), my_server_ref_count(1), + my_client_ref_count(1), my_client(client), my_factory_counter(fc) + { my_unrealized_threads = 0; } + //! dtor + ~thread_map() {} + typedef array_type::iterator iterator; + iterator begin() {return my_array.begin();} + iterator end() {return my_array.end();} + void bind( /* rml::server& server, message_kind initialize */ ); + void unbind( request_kind terminate ); + void assist_cleanup( bool assist_null_only ); + + /** Returns number of unrealized threads to create. */ + size_type wakeup_tbb_threads( size_type n ); + bool wakeup_next_thread( iterator i, tbb_connection_v1& conn ); + void release_tbb_threads( server_thread* t ); + void adjust_balance( int delta ); + + //! Add a server_thread object to the map, but do not bind it. + /** Return NULL if out of unrealized threads. */ + value_type* add_one_thread( bool is_omp_thread_ ); + + void bind_one_thread( rml::server& server, request_kind initialize, value_type& x ); + + void remove_client_ref(); + int add_server_ref() {return my_server_ref_count.add_ref();} + int remove_server_ref() {return my_server_ref_count.remove_ref();} + + ::rml::client& client() const {return my_client;} + + size_type get_unrealized_threads() { return my_unrealized_threads; } + +private: + private_thread_bag my_private_threads; + bool all_visited_at_least_once; + array_type my_array; + size_t my_min_stack_size; + tbb::atomic my_unrealized_threads; + + //! Number of threads referencing *this, plus one extra. + /** When it becomes zero, the containing server object can be safely deleted. */ + ref_count my_server_ref_count; + + //! Number of jobs that need cleanup, plus one extra. + /** When it becomes zero, acknowledge_close_connection is called. */ + ref_count my_client_ref_count; + ::rml::client& my_client; + //! Counter owned by factory that produced this thread_map. + wait_counter& my_factory_counter; +}; + +void thread_map::bind_one_thread( rml::server& server, request_kind initialize, value_type& x ) { + // Add one to account for the thread referencing this map hereforth. 
+ server_thread& t = x.thread(); + my_server_ref_count.add_ref(); + my_client_ref_count.add_ref(); +#if TBB_USE_ASSERT + __TBB_ASSERT( t.add_ref()==1, NULL ); +#else + t.add_ref(); +#endif + // Have responsibility to start the thread. + t.my_conn = &server; + t.my_ja = &x.my_automaton; + t.request = initialize; + t.launch( my_min_stack_size ); + // Must wakeup thread so it can fill in its "my_job" field in *this. + // Otherwise deadlock can occur where wait_for_job spins on thread that is sleeping. + __TBB_ASSERT( t.state!=ts_tbb_busy, NULL ); + t.wakeup( ts_idle, ts_asleep ); +} + +thread_map::value_type* thread_map::add_one_thread( bool is_omp_thread_ ) { + size_type u; + do { + u = my_unrealized_threads; + if( !u ) return NULL; + } while( my_unrealized_threads.compare_and_swap(u-1,u)!=u ); + server_thread& t = my_private_threads.add_one_thread(); + t.is_omp_thread = is_omp_thread_; + __TBB_ASSERT( u>=1, NULL ); + t.my_index = u - 1; + __TBB_ASSERT( t.state!=ts_tbb_busy, NULL ); + if( !t.is_omp_thread ) + t.tbb_state = ts_created; + iterator i = t.my_map_pos = my_array.grow_by(1); + value_type& v = *i; + v.my_thread = &t; + return &v; +} + +void thread_map::bind( /* rml::server& server, request_kind initialize */ ) { + ++my_factory_counter; + my_min_stack_size = my_client.min_stack_size(); + __TBB_ASSERT( my_unrealized_threads==0, "already called bind?" ); + my_unrealized_threads = my_client.max_job_count(); +} + +void thread_map::unbind( request_kind terminate ) { + // Ask each server_thread to cleanup its job for this server. + for( iterator i=begin(); i!=end(); ++i ) { + server_thread& t = i->thread(); + // The last parameter of the message is not used by the recipient. + t.request = terminate; + t.wakeup( ts_idle, ts_asleep ); + } + // Remove extra ref to client. + remove_client_ref(); +} + +void thread_map::assist_cleanup( bool assist_null_only ) { + // To avoid deadlock, the current thread *must* help out with cleanups that have not started, + // becausd the thread that created the job may be busy for a long time. + for( iterator i = begin(); i!=end(); ++i ) { + rml::job* j=0; + job_automaton& ja = i->my_automaton; + if( assist_null_only ? ja.try_plug_null() : ja.try_plug(j) ) { + if( j ) { + my_client.cleanup(*j); + } else { + // server thread did not get a chance to create a job. + } + remove_client_ref(); + } + } +} + +thread_map::size_type thread_map::wakeup_tbb_threads( size_type n ) { + __TBB_ASSERT(n>0,"must specify positive number of threads to wake up"); + iterator e = end(); + for( iterator k=begin(); k!=e; ++k ) { + // If another thread added *k, there is a tiny timing window where thread() is invalid. + server_thread& t = k->wait_for_thread(); + if( t.tbb_state==ts_created || t.read_state()==ts_tbb_busy ) + continue; + if( --the_balance>=0 ) { // try to withdraw a coin from the deposit + while( !t.try_grab_for( ts_tbb_busy ) ) { + if( t.read_state()==ts_tbb_busy ) { + // we lost; move on to the next. + ++the_balance; + goto skip; + } + } + if( --n==0 ) + return 0; + } else { + // overdraft. 
+            ++the_balance;
+            break;
+        }
+skip:
+        ;
+    }
+    return n<my_unrealized_threads ? n : my_unrealized_threads;
+}
+
+template<typename Server, typename Client>
+struct connection_traits {};
+
+static tbb::atomic<tbb_connection_v1*> this_tbb_connection;
+
+template<typename Server, typename Client>
+class generic_connection: public Server, no_copy {
+    /*override*/ version_type version() const {return SERVER_VERSION;}
+    /*override*/ void yield() {thread_monitor::yield();}
+    /*override*/ void independent_thread_number_changed( int delta ) {my_thread_map.adjust_balance( -delta );}
+    /*override*/ unsigned default_concurrency() const {return hardware_concurrency()-1;}
+
+protected:
+    thread_map my_thread_map;
+    void do_open() {my_thread_map.bind();}
+    void request_close_connection();
+    //! Make destructor virtual
+    virtual ~generic_connection() {}
+    generic_connection( wait_counter& fc, Client& c ) : my_thread_map(fc,c) {}
+
+public:
+    Client& client() const {return static_cast<Client&>(my_thread_map.client());}
+    int add_server_ref () {return my_thread_map.add_server_ref();}
+    void remove_server_ref() {if( my_thread_map.remove_server_ref()==0 ) delete this;}
+    void remove_client_ref() {my_thread_map.remove_client_ref();}
+    void make_job( server_thread& t, job_automaton& ja );
+};
+
+//------------------------------------------------------------------------
+// TBB server
+//------------------------------------------------------------------------
+
+template<>
+struct connection_traits<tbb_server,tbb_client> {
+    static const request_kind initialize = rk_initialize_tbb_job;
+    static const request_kind terminate = rk_terminate_tbb_job;
+    static const bool assist_null_only = true;
+    static const bool is_tbb = true;
+};
+
+//! Represents a server and client binding.
+/** The internal representation uses inheritance for the server part and a pointer for the client part. */
+class tbb_connection_v1: public generic_connection<tbb_server,tbb_client> {
+    friend void wakeup_some_tbb_threads();
+    /*override*/ void adjust_job_count_estimate( int delta );
+    //! Estimate on number of jobs without threads working on them.
+    tbb::atomic<int> my_slack;
+    friend class dummy_class_to_shut_up_gratuitous_warning_from_gcc_3_2_3;
+#if TBB_USE_ASSERT
+    tbb::atomic<int> my_job_count_estimate;
+#endif /* TBB_USE_ASSERT */
+
+    // pad these? or use a single variable w/ atomic add/subtract?
+    tbb::atomic<int> n_adjust_job_count_requests;
+    ~tbb_connection_v1();
+
+public:
+    enum tbb_conn_t {
+        c_empty = 0,
+        c_init = -1,
+        c_locked = -2
+    };
+
+    //! True if there is slack that try_process can use.
+    bool has_slack() const {return my_slack>0;}
+
+    bool try_process( job& job ) {
+        bool visited = false;
+        // No check for my_slack>0 here because caller is expected to do that check.
+ int k = --my_slack; + if( k>=0 ) { + client().process(job); + visited = true; + } + ++my_slack; + return visited; + } + + tbb_connection_v1( wait_counter& fc, tbb_client& client ) : generic_connection(fc,client) { + my_slack = 0; +#if TBB_USE_ASSERT + my_job_count_estimate = 0; +#endif /* TBB_USE_ASSERT */ + __TBB_ASSERT( !my_slack, NULL ); + do_open(); + __TBB_ASSERT( this_tbb_connection==reinterpret_cast(tbb_connection_v1::c_init), NULL ); + n_adjust_job_count_requests = 0; + this_tbb_connection = this; + } + + void wakeup_tbb_threads( unsigned n ) {my_thread_map.wakeup_tbb_threads( n );} + bool wakeup_next_thread( thread_map::iterator i ) {return my_thread_map.wakeup_next_thread( i, *this );} + thread_map::size_type get_unrealized_threads () {return my_thread_map.get_unrealized_threads();} +}; + +/* to deal with cases where the machine is oversubscribed; we want each thread to trip to try_process() at least once */ +/* this should not involve computing the_balance */ +bool thread_map::wakeup_next_thread( thread_map::iterator this_thr, tbb_connection_v1& conn ) { + if( all_visited_at_least_once ) + return false; + + iterator e = end(); +retry: + bool exist = false; + iterator k=this_thr; + for( ++k; k!=e; ++k ) { + // If another thread added *k, there is a tiny timing window where thread() is invalid. + server_thread& t = k->wait_for_thread(); + if( t.tbb_state!=ts_visited ) + exist = true; + if( t.read_state()!=ts_tbb_busy && t.tbb_state==ts_started ) + if( t.try_grab_for( ts_tbb_busy ) ) + return true; + } + for( k=begin(); k!=this_thr; ++k ) { + server_thread& t = k->wait_for_thread(); + if( t.tbb_state!=ts_visited ) + exist = true; + if( t.read_state()!=ts_tbb_busy && t.tbb_state==ts_started ) + if( t.try_grab_for( ts_tbb_busy ) ) + return true; + } + + if( exist ) + if( conn.has_slack() ) + goto retry; + else + all_visited_at_least_once = true; + return false; +} + +void thread_map::release_tbb_threads( server_thread* t ) { + for( ; t; t = t->link ) { + while( t->read_state()!=ts_asleep ) + __TBB_Yield(); + t->tbb_state = ts_started; + } +} + +void thread_map::adjust_balance( int delta ) { + int new_balance = the_balance += delta; + if( new_balance>0 && 0>=new_balance-delta /*== old the_balance*/ ) + wakeup_some_tbb_threads(); +} + +//------------------------------------------------------------------------ +// OpenMP server +//------------------------------------------------------------------------ + +template<> +struct connection_traits { + static const request_kind initialize = rk_initialize_omp_job; + static const request_kind terminate = rk_terminate_omp_job; + static const bool assist_null_only = false; + static const bool is_tbb = false; +}; + +class omp_connection_v1: public generic_connection { + /*override*/ int current_balance() const {return the_balance;} + /*override*/ int try_increase_load( size_type n, bool strict ); + /*override*/ void decrease_load( size_type n ); + /*override*/ void get_threads( size_type request_size, void* cookie, job* array[] ); +public: +#if TBB_USE_ASSERT + //! Net change in delta caused by this connection. 
+ /** Should be zero when connection is broken */ + tbb::atomic net_delta; +#endif /* TBB_USE_ASSERT */ + + omp_connection_v1( wait_counter& fc, omp_client& client ) : generic_connection(fc,client) { +#if TBB_USE_ASSERT + net_delta = 0; +#endif /* TBB_USE_ASSERT */ + do_open(); + } + ~omp_connection_v1() {__TBB_ASSERT( net_delta==0, "net increase/decrease of load is nonzero" );} +}; + +template +void generic_connection::request_close_connection() { +#if _MSC_VER && !defined(__INTEL_COMPILER) +// Suppress "conditional expression is constant" warning. +#pragma warning( push ) +#pragma warning( disable: 4127 ) +#endif + if( connection_traits::is_tbb ) { + __TBB_ASSERT( this_tbb_connection==reinterpret_cast(this), NULL ); + tbb_connection_v1* conn; + do { + while( (conn=this_tbb_connection)==reinterpret_cast(tbb_connection_v1::c_locked) ) + __TBB_Yield(); + } while ( this_tbb_connection.compare_and_swap(0, conn)!=conn ); + } +#if _MSC_VER && !defined(__INTEL_COMPILER) +#pragma warning( pop ) +#endif + my_thread_map.unbind( connection_traits::terminate ); + my_thread_map.assist_cleanup( connection_traits::assist_null_only ); + // Remove extra reference + remove_server_ref(); +} + +template +void generic_connection::make_job( server_thread& t, job_automaton& ja ) { + if( ja.try_acquire() ) { + rml::job& j = *client().create_one_job(); + __TBB_ASSERT( &j!=NULL, "client:::create_one_job returned NULL" ); + __TBB_ASSERT( (intptr_t(&j)&1)==0, "client::create_one_job returned misaligned job" ); + ja.set_and_release( j ); + __TBB_ASSERT( t.my_conn && t.my_ja && t.my_job==NULL, NULL ); + t.my_job = &j; + } +} + +tbb_connection_v1::~tbb_connection_v1() { +#if TBB_USE_ASSERT + if( my_job_count_estimate!=0 ) { + fprintf(stderr, "TBB client tried to disconnect with non-zero net job count estimate of %d\n", int(my_job_count_estimate )); + abort(); + } + __TBB_ASSERT( !my_slack, "attempt to destroy tbb_server with nonzero slack" ); + __TBB_ASSERT( this!=this_tbb_connection, "request_close_connection() must be called" ); +#endif /* TBB_USE_ASSERT */ + // if the next connection has unstarted threads, start one of them. + wakeup_some_tbb_threads(); +} + +void tbb_connection_v1::adjust_job_count_estimate( int delta ) { +#if TBB_USE_ASSERT + my_job_count_estimate += delta; +#endif /* TBB_USE_ASSERT */ + // Atomically update slack. + int c = my_slack+=delta; + if( c>0 ) { + ++n_adjust_job_count_requests; + // The client has work to do and there are threads available + thread_map::size_type n = my_thread_map.wakeup_tbb_threads(c); + + server_thread* new_threads_anchor = NULL; + thread_map::size_type i; + for( i=0; ithread(); + __TBB_ASSERT( !t.link, NULL ); + t.link = new_threads_anchor; + new_threads_anchor = &t; + } + + thread_map::size_type j=0; + for( ; the_balance>0 && j=0 ) { + // withdraw a coin from the bank + __TBB_ASSERT( new_threads_anchor, NULL ); + + server_thread* t = new_threads_anchor; + new_threads_anchor = t->link; + while( !t->try_grab_for( ts_tbb_busy ) ) + __TBB_Yield(); + t->tbb_state = ts_started; + } else { + // overdraft. return it to the bank + ++the_balance; + break; + } + } + __TBB_ASSERT( i-j!=0||new_threads_anchor==NULL, NULL ); + // mark the ones that did not get started as eligible for being snatched. + if( new_threads_anchor ) + my_thread_map.release_tbb_threads( new_threads_anchor ); + + --n_adjust_job_count_requests; + } +} + +//! 
wake some available tbb threads
+/**
+ First, atomically grab the connection, then increase the server ref count to keep it from being released prematurely.
+ Second, check if the balance is available for TBB and the tbb connection has slack to exploit.
+ If the answer is true, go ahead and try to wake some up.
+ */
+void wakeup_some_tbb_threads()
+{
+    for( ;; ) {
+        tbb_connection_v1* conn = this_tbb_connection;
+        /*
+        if( conn==0 or conn==tbb_connection_v1::c_init )
+            the next connection will see my last change to the deposit; do nothing
+        if( conn==tbb_connection_v1::c_locked )
+            a thread is already in the region A-B below.
+            it will read the change made by threads of my connection to the_balance;
+            do nothing
+
+        0==c_empty, -1==c_init, -2==c_locked
+        */
+        if( ((-ptrdiff_t(conn))&~3 )==0 )
+            return;
+
+        // FIXME: place the_balance next to this_tbb_connection ? to save some cache moves ?
+        /* region A: this is the only place to set this_tbb_connection to c_locked */
+        tbb_connection_v1* old_ttc = this_tbb_connection.compare_and_swap( reinterpret_cast<tbb_connection_v1*>(tbb_connection_v1::c_locked), conn );
+        if( old_ttc==conn ) {
+#if TBB_USE_ASSERT
+            __TBB_ASSERT( conn->add_server_ref()>1, NULL );
+#else
+            conn->add_server_ref();
+#endif
+            /* region B: this is the only place to restore this_tbb_connection from c_locked */
+            this_tbb_connection = conn; // restoring it means releasing it
+
+            /* some threads are creating tbb server threads; they may not see my changes made to the_balance */
+            while( conn->n_adjust_job_count_requests>0 )
+                __TBB_Yield();
+
+            int bal = the_balance;
+            if( bal>0 && conn->has_slack() )
+                conn->wakeup_tbb_threads( bal );
+            conn->remove_server_ref();
+            break;
+        } else if( ((-ptrdiff_t(old_ttc))&~3)==0 ) {
+            return; /* see above */
+        } else {
+            __TBB_Yield();
+        }
+    }
+}
+
+int omp_connection_v1::try_increase_load( size_type n, bool strict ) {
+    __TBB_ASSERT(int(n)>=0,NULL);
+    if( strict ) {
+        the_balance-=int(n);
+    } else {
+        int avail, old;
+        do {
+            avail = the_balance;
+            if( avail<=0 ) {
+                // No atomic read-write-modify operation necessary.
+                return avail;
+            }
+            // don't read the_balance; if it changes, compare_and_swap will fail anyway.
+            old = the_balance.compare_and_swap( int(n)<avail ? avail-int(n) : 0, avail );
+        } while( old!=avail );
+        if( int(n)>avail )
+            n=avail;
+    }
+#if TBB_USE_ASSERT
+    net_delta += n;
+#endif /* TBB_USE_ASSERT */
+    return n;
+}
+
+void omp_connection_v1::decrease_load( size_type n ) {
+    __TBB_ASSERT(int(n)>=0,NULL);
+    my_thread_map.adjust_balance(int(n));
+#if TBB_USE_ASSERT
+    net_delta -= n;
+#endif /* TBB_USE_ASSERT */
+}
+
+void omp_connection_v1::get_threads( size_type request_size, void* cookie, job* array[] ) {
+
+    if( !request_size )
+        return;
+
+    unsigned index = 0;
+    for(;;) { // don't return until all request_size threads are grabbed.
+        // Need to grab some threads
+        thread_map::iterator k_end=my_thread_map.end();
+        // FIXME - this search is going to be *very* slow when there is a large number of threads and most are in use.
+        // Consider starting search at random point, or high water mark of sorts.
+        for( thread_map::iterator k=my_thread_map.begin(); k!=k_end; ++k ) {
+            // If another thread added *k, there is a tiny timing window where thread() is invalid.
+            server_thread& t = k->wait_for_thread();
+            if( t.try_grab_for( ts_omp_busy ) ) {
+                // The preincrement instead of post-increment of index is deliberate.
+ job& j = k->wait_for_job(); + array[index] = &j; + t.omp_dispatch.produce( client(), j, cookie, index PRODUCE_ARG(*this) ); + if( ++index==request_size ) + return; + } + } + // Need to allocate more threads + for( unsigned i=index; ithread(); + if( t.try_grab_for( ts_omp_busy ) ) { + job& j = k->wait_for_job(); + array[index] = &j; + // The preincrement instead of post-increment of index is deliberate. + t.omp_dispatch.produce( client(), j, cookie, index PRODUCE_ARG(*this) ); + if( ++index==request_size ) + return; + } // else someone else snatched it. + } + } +} + +//------------------------------------------------------------------------ +// Methods of omp_dispatch_type +//------------------------------------------------------------------------ +void omp_dispatch_type::consume() { + job_type* j; + // Wait for short window between when master sets state of this thread to ts_omp_busy + // and master thread calls produce. + // FIXME - this is a very short spin while the producer is setting fields of *this, + // but nonetheless the loop should probably use exponential backoff, or at least pause instructions. + do { + j = job; + } while( !j ); + job = static_cast(NULL); + client->process(*j,cookie,index); +#if TBB_USE_ASSERT + // Return of method process implies "decrease_load" from client's viewpoint, even though + // the actual adjustment of the_balance only happens when this thread really goes to sleep. + --server->net_delta; +#endif /* TBB_USE_ASSERT */ +} + +//------------------------------------------------------------------------ +// Methods of server_thread +//------------------------------------------------------------------------ + +server_thread::server_thread() : + ref_count(0), + link(NULL), // FIXME: remove when all fixes are done. + my_map_pos(), + my_conn(NULL), my_job(NULL), my_ja(NULL) +{ + state = ts_idle; +#if TBB_USE_ASSERT + has_active_thread = false; +#endif /* TBB_USE_ASSERT */ +} + +server_thread::~server_thread() { + __TBB_ASSERT( !has_active_thread, NULL ); +} + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // Suppress overzealous compiler warnings about an initialized variable 'sink_for_alloca' not referenced + #pragma warning(push) + #pragma warning(disable:4189) +#endif +__RML_DECL_THREAD_ROUTINE server_thread::thread_routine( void* arg ) { + server_thread* self = static_cast(arg); + AVOID_64K_ALIASING( self->my_index ); +#if TBB_USE_ASSERT + __TBB_ASSERT( !self->has_active_thread, NULL ); + self->has_active_thread = true; +#endif /* TBB_USE_ASSERT */ + self->loop(); + return NULL; +} +#if _MSC_VER && !defined(__INTEL_COMPILER) + #pragma warning(pop) +#endif + +void server_thread::launch( size_t stack_size ) { + thread_monitor::launch( thread_routine, this, stack_size ); +} + +void server_thread::sleep_perhaps( thread_state_t asleep ) { + __TBB_ASSERT( asleep==ts_asleep, NULL ); + thread_monitor::cookie c; + monitor.prepare_wait(c); + if( state.compare_and_swap( asleep, ts_idle )==ts_idle ) { + if( request==rk_none ) { + monitor.commit_wait(c); + // Someone else woke me up. The compare_and_swap further below deals with spurious wakeups. + } else { + monitor.cancel_wait(); + } + // Following compare-and-swap logic tries to transition from asleep to idle while both ignoring the + // preserving the reserved_flag bit in state, because some other thread may be asynchronously clearing + // the reserved_flag bit within state. 
+ thread_state_t s = read_state(); + if( s==ts_asleep ) { + state.compare_and_swap( ts_idle, ts_asleep ); + // I woke myself up, either because I cancelled the wait or suffered a spurious wakeup. + } else { + // Someone else woke me up; there the_balance is decremented by 1. -- tbb only + if( !is_omp_thread ) { + __TBB_ASSERT( state==ts_tbb_busy||state==ts_idle, NULL ); + } + } + } else { + // someone else made it busy ; see try_grab_for when state==ts_idle. + __TBB_ASSERT( state==ts_omp_busy||state==ts_tbb_busy, NULL ); + monitor.cancel_wait(); + } + __TBB_ASSERT( read_state()!=asleep, "a thread can only put itself to sleep" ); +} + +bool server_thread::wakeup( thread_state_t to, thread_state_t from ) { + bool success = false; + __TBB_ASSERT( from==ts_asleep && (to==ts_idle||to==ts_omp_busy||to==ts_tbb_busy), NULL ); + if( state.compare_and_swap( to, from )==from ) { + if( !is_omp_thread ) __TBB_ASSERT( to==ts_idle||to==ts_tbb_busy, NULL ); + // There is a small timing window that permits balance to become negative, + // but such occurrences are probably rare enough to not worry about, since + // at worst the result is slight temporary oversubscription. + monitor.notify(); + success = true; + } + return success; +} + +//! Attempt to change a thread's state to ts_omp_busy, and waking it up if necessary. +bool server_thread::try_grab_for( thread_state_t target_state ) { + bool success = false; + switch( read_state() ) { + case ts_asleep: + success = wakeup( target_state, ts_asleep ); + break; + case ts_idle: + success = state.compare_and_swap( target_state, ts_idle )==ts_idle; + break; + default: + // Thread is not available to be part of an OpenMP thread team. + break; + } + return success; +} + +template +bool server_thread::destroy_job( Connection& c ) { + __TBB_ASSERT( !is_omp_thread||state==ts_idle, NULL ); + __TBB_ASSERT( is_omp_thread||(state==ts_idle||state==ts_tbb_busy), NULL ); + if( !is_omp_thread ) { + __TBB_ASSERT( state==ts_idle||state==ts_tbb_busy, NULL ); + if( state==ts_idle ) + state.compare_and_swap( ts_done, ts_idle ); + // 'state' may be set to ts_tbb_busy by another thread.. + + if( state==ts_tbb_busy ) { // return the coin to the deposit + // need to deposit first to let the next connection see the change + ++the_balance; + state = ts_done; // no other thread changes the state when it is ts_*_busy + } + } + if( job_automaton* ja = my_ja ) { + rml::job* j; + if( ja->try_plug(j) ) { + __TBB_ASSERT( j, NULL ); + c.client().cleanup(*j); + c.remove_client_ref(); + } else { + // Some other thread took responsibility for cleaning up the job. + } + } + //! Must do remove client reference first, because execution of c.remove_ref() can cause *this to be destroyed. + int k = remove_ref(); + __TBB_ASSERT_EX( k==0, "more than one references?" 
); +#if TBB_USE_ASSERT + has_active_thread = false; +#endif /* TBB_USE_ASSERT */ + c.remove_server_ref(); + return true; +} + +bool server_thread::process_requests() { + __TBB_ASSERT( request!=rk_none, "should only be called when at least one request is present" ); + do { + request_kind my_req = request; + request.compare_and_swap( rk_none, my_req ); + switch( my_req ) { + case rk_initialize_tbb_job: + static_cast(my_conn)->make_job( *this, *my_ja ); + break; + + case rk_initialize_omp_job: + static_cast(my_conn)->make_job( *this, *my_ja ); + break; + + case rk_terminate_tbb_job: + if( destroy_job( *static_cast(my_conn) ) ) + return true; + break; + + case rk_terminate_omp_job: + if( destroy_job( *static_cast(my_conn) ) ) + return true; + break; + default: + break; + } + } while( request!=rk_none ); + return false; +} + +//! Loop that each thread executes +void server_thread::loop() { + for(;;) { + __TBB_Yield(); + if( state==ts_idle ) + sleep_perhaps( ts_asleep ); + + // Drain mailbox before reading the state. + if( request!=rk_none ) + if( process_requests() ) + return; + + // read the state after draining the mail box + thread_state_t s = read_state(); + __TBB_ASSERT( s==ts_idle||s==ts_omp_busy||s==ts_tbb_busy, NULL ); + + if( s==ts_omp_busy ) { + // Enslaved by OpenMP team. + omp_dispatch.consume(); + /* here wake a tbb thread up if feasible */ + int bal = ++the_balance; + if( bal>0 ) + wakeup_some_tbb_threads(); + state = ts_idle; + } else if( s==ts_tbb_busy ) { + // do some TBB work. + __TBB_ASSERT( my_conn && my_job, NULL ); + tbb_connection_v1& conn = *static_cast(my_conn); + // give openmp higher priority + bool has_coin = true; + while( has_coin && conn.has_slack() && the_balance>=0 ) { + if( conn.try_process(*my_job) ) { + tbb_state = ts_visited; + if( conn.has_slack() && the_balance>=0 ) + has_coin = !conn.wakeup_next_thread( my_map_pos ); + } + } + state = ts_idle; + if( has_coin ) { + ++the_balance; // return the coin back to the deposit + if( conn.has_slack() ) { // a new adjust_job_request_estimate() is in progress + // it may have missed my changes to state and/or the_balance + int bal = --the_balance; // try to grab the coin back + if( bal>=0 ) { // I got the coin + if( state.compare_and_swap( ts_tbb_busy, ts_idle )!=ts_idle ) + ++the_balance; // someone else enlisted me. + } else { + // overdraft. return the coin + ++the_balance; + } + } // else the new request will see my changes to state & the_balance. + } + } + } +} + +template +static factory::status_type connect( factory& f, Server*& server, Client& client ) { +#if _MSC_VER && !defined(__INTEL_COMPILER) +// Suppress "conditional expression is constant" warning. +#pragma warning( push ) +#pragma warning( disable: 4127 ) +#endif + if( connection_traits::is_tbb ) + if( this_tbb_connection.compare_and_swap(reinterpret_cast(-1), reinterpret_cast(0))!=0 ) + return factory::st_connection_exists; +#if _MSC_VER && !defined(__INTEL_COMPILER) +#pragma warning( pop ) +#endif + server = new Connection(*static_cast(f.scratch_ptr),client); + return factory::st_success; +} + +extern "C" factory::status_type __RML_open_factory( factory& f, version_type& server_version, version_type client_version ) { + // Hack to keep this library from being closed by causing the first client's dlopen to not have a corresponding dlclose. + // This code will be removed once we figure out how to do shutdown of the RML perfectly. 
+ static tbb::atomic one_time_flag; + if( one_time_flag.compare_and_swap(true,false)==false) { + f.library_handle = NULL; + } + // End of hack + + // initialize the_balance only once + if( the_balance_inited==0 ) { + if( the_balance_inited.compare_and_swap( 1, 0 )==0 ) { + the_balance = hardware_concurrency()-1; + the_balance_inited = 2; + } else { + tbb::internal::spin_wait_until_eq( the_balance_inited, 2 ); + } + } + + server_version = SERVER_VERSION; + f.scratch_ptr = 0; + if( client_version==0 ) { + return factory::st_incompatible; + } else { + f.scratch_ptr = new wait_counter; + return factory::st_success; + } +} + +extern "C" void __RML_close_factory( factory& f ) { + if( wait_counter* fc = static_cast(f.scratch_ptr) ) { + f.scratch_ptr = 0; + fc->wait(); + delete fc; + } +} + +void call_with_build_date_str( ::rml::server_info_callback_t cb, void* arg ); + +}} // rml::internal + +namespace tbb { +namespace internal { +namespace rml { + +extern "C" tbb_factory::status_type __TBB_make_rml_server( tbb_factory& f, tbb_server*& server, tbb_client& client ) { + return ::rml::internal::connect< ::rml::internal::tbb_connection_v1>(f,server,client); +} + +extern "C" void __TBB_call_with_my_server_info( ::rml::server_info_callback_t cb, void* arg ) { + return ::rml::internal::call_with_build_date_str( cb, arg ); +} + +}}} + +namespace __kmp { +namespace rml { + +extern "C" omp_factory::status_type __KMP_make_rml_server( omp_factory& f, omp_server*& server, omp_client& client ) { + return ::rml::internal::connect< ::rml::internal::omp_connection_v1>(f,server,client); +} + +extern "C" void __KMP_call_with_my_server_info( ::rml::server_info_callback_t cb, void* arg ) { + return ::rml::internal::call_with_build_date_str( cb, arg ); +} + +}} + +/* + * RML server info + */ +#include "version_string.tmp" + +#ifndef __TBB_VERSION_STRINGS +#pragma message("Warning: version_string.tmp isn't generated properly by version_info.sh script!") +#endif + +// We pass the build time as the RML server info. TBB is required to build RML, so we make it the same as the TBB build time. +#ifndef __TBB_DATETIME +#define __TBB_DATETIME __DATE__ " " __TIME__ +#endif +#define RML_SERVER_INFO "Intel(R) RML library built: " __TBB_DATETIME + +namespace rml { +namespace internal { +void call_with_build_date_str( ::rml::server_info_callback_t cb, void* arg ) +{ + (*cb)( arg, RML_SERVER_INFO ); +} +}} // rml::internal diff --git a/dep/tbb/src/rml/server/thread_monitor.h b/dep/tbb/src/rml/server/thread_monitor.h new file mode 100644 index 000000000..607188bfa --- /dev/null +++ b/dep/tbb/src/rml/server/thread_monitor.h @@ -0,0 +1,244 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +// All platform-specific threading support is encapsulated here. */ + +#ifndef __RML_thread_monitor_H +#define __RML_thread_monitor_H + +#if USE_WINTHREAD +#include +#include +#include //_alloca +#elif USE_PTHREAD +#include +#include +#include +#else +#error Unsupported platform +#endif +#include + +// All platform-specific threading support is in this header. + +#if (_WIN32||_WIN64)&&!__TBB_ipf +// Deal with 64K aliasing. The formula for "offset" is a Fibonacci hash function, +// which has the desirable feature of spreading out the offsets fairly evenly +// without knowing the total number of offsets, and furthermore unlikely to +// accidentally cancel out other 64K aliasing schemes that Microsoft might implement later. +// See Knuth Vol 3. "Theorem S" for details on Fibonacci hashing. +// The second statement is really does need "volatile", otherwise the compiler might remove the _alloca. +#define AVOID_64K_ALIASING(idx) \ + size_t offset = (idx+1) * 40503U % (1U<<16); \ + void* volatile sink_for_alloca = _alloca(offset); \ + __TBB_ASSERT_EX(sink_for_alloca, "_alloca failed"); +#else +// Linux thread allocators avoid 64K aliasing. +#define AVOID_64K_ALIASING(idx) +#endif /* _WIN32||_WIN64 */ + +namespace rml { + +namespace internal { + +//! Monitor with limited two-phase commit form of wait. +/** At most one thread should wait on an instance at a time. */ +class thread_monitor { +public: + class cookie { + friend class thread_monitor; + unsigned long long my_version; + }; + thread_monitor(); + ~thread_monitor(); + + //! If a thread is waiting or started a two-phase wait, notify it. + /** Can be called by any thread. */ + void notify(); + + //! Begin two-phase wait. + /** Should only be called by thread that owns the monitor. + The caller must either complete the wait or cancel it. */ + void prepare_wait( cookie& c ); + + //! Complete a two-phase wait and wait until notification occurs after the earlier prepare_wait. + void commit_wait( cookie& c ); + + //! Cancel a two-phase wait. + void cancel_wait(); + +#if USE_WINTHREAD +#define __RML_DECL_THREAD_ROUTINE unsigned WINAPI + typedef unsigned (WINAPI *thread_routine_type)(void*); +#endif /* USE_WINTHREAD */ + +#if USE_PTHREAD +#define __RML_DECL_THREAD_ROUTINE void* + typedef void*(*thread_routine_type)(void*); +#endif /* USE_PTHREAD */ + + //! 
Launch a thread + static void launch( thread_routine_type thread_routine, void* arg, size_t stack_size ); + static void yield(); + +private: + cookie my_cookie; +#if USE_WINTHREAD + CRITICAL_SECTION critical_section; + HANDLE event; +#endif /* USE_WINTHREAD */ +#if USE_PTHREAD + pthread_mutex_t my_mutex; + pthread_cond_t my_cond; + static void check( int error_code, const char* routine ); +#endif /* USE_PTHREAD */ +}; + + +#if USE_WINTHREAD +#ifndef STACK_SIZE_PARAM_IS_A_RESERVATION +#define STACK_SIZE_PARAM_IS_A_RESERVATION 0x00010000 +#endif +inline void thread_monitor::launch( thread_routine_type thread_routine, void* arg, size_t stack_size ) { + unsigned thread_id; + uintptr_t status = _beginthreadex( NULL, unsigned(stack_size), thread_routine, arg, STACK_SIZE_PARAM_IS_A_RESERVATION, &thread_id ); + if( status==0 ) { + fprintf(stderr,"thread_monitor::launch: _beginthreadex failed\n"); + exit(1); + } else { + CloseHandle((HANDLE)status); + } +} + +inline void thread_monitor::yield() { + SwitchToThread(); +} + +inline thread_monitor::thread_monitor() { + event = CreateEvent( NULL, /*manualReset=*/true, /*initialState=*/false, NULL ); + InitializeCriticalSection( &critical_section ); + my_cookie.my_version = 0; +} + +inline thread_monitor::~thread_monitor() { + CloseHandle( event ); + DeleteCriticalSection( &critical_section ); +} + +inline void thread_monitor::notify() { + EnterCriticalSection( &critical_section ); + ++my_cookie.my_version; + SetEvent( event ); + LeaveCriticalSection( &critical_section ); +} + +inline void thread_monitor::prepare_wait( cookie& c ) { + EnterCriticalSection( &critical_section ); + c = my_cookie; +} + +inline void thread_monitor::commit_wait( cookie& c ) { + ResetEvent( event ); + LeaveCriticalSection( &critical_section ); + while( my_cookie.my_version==c.my_version ) { + WaitForSingleObject( event, INFINITE ); + ResetEvent( event ); + } +} + +inline void thread_monitor::cancel_wait() { + LeaveCriticalSection( &critical_section ); +} +#endif /* USE_WINTHREAD */ + +#if USE_PTHREAD +inline void thread_monitor::check( int error_code, const char* routine ) { + if( error_code ) { + fprintf(stderr,"thread_monitor %s\n", strerror(error_code) ); + exit(1); + } +} + +inline void thread_monitor::launch( void* (*thread_routine)(void*), void* arg, size_t stack_size ) { + // FIXME - consider more graceful recovery than just exiting if a thread cannot be launched. + // Note that there are some tricky situations to deal with, such that the thread is already + // grabbed as part of an OpenMP team, or is being launched as a replacement for a thread with + // too small a stack. 
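// Worked example for the AVOID_64K_ALIASING macro defined near the top of this
// header: the Fibonacci-hash formula (idx+1)*40503 mod 2^16 spreads the stack
// offsets of successive worker threads across a 64K range so their hot stack
// tops do not collide under 64K aliasing. Standalone demo, not part of the
// TBB sources.
#include <cstdio>

int main() {
    for( unsigned idx = 0; idx < 8; ++idx ) {
        unsigned offset = (idx + 1) * 40503U % (1U << 16);
        std::printf("worker %u -> stack offset %u bytes\n", idx, offset);
    }
    return 0;
}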
+ pthread_attr_t s; + check(pthread_attr_init( &s ), "pthread_attr_init"); + if( stack_size>0 ) { + check(pthread_attr_setstacksize( &s, stack_size ),"pthread_attr_setstack_size"); + } + pthread_t handle; + check( pthread_create( &handle, &s, thread_routine, arg ), "pthread_create" ); + check( pthread_detach( handle ), "pthread_detach" ); +} + +inline void thread_monitor::yield() { + sched_yield(); +} + +inline thread_monitor::thread_monitor() { + check( pthread_mutex_init(&my_mutex,NULL), "pthread_mutex_init" ); + check( pthread_cond_init(&my_cond,NULL), "pthread_cond_init" ); + my_cookie.my_version = 0; +} + +inline thread_monitor::~thread_monitor() { + pthread_cond_destroy(&my_cond); + pthread_mutex_destroy(&my_mutex); +} + +inline void thread_monitor::notify() { + check( pthread_mutex_lock( &my_mutex ), "pthread_mutex_lock" ); + ++my_cookie.my_version; + check( pthread_mutex_unlock( &my_mutex ), "pthread_mutex_unlock" ); + check( pthread_cond_signal(&my_cond), "pthread_cond_signal" ); +} + +inline void thread_monitor::prepare_wait( cookie& c ) { + check( pthread_mutex_lock( &my_mutex ), "pthread_mutex_lock" ); + c = my_cookie; +} + +inline void thread_monitor::commit_wait( cookie& c ) { + while( my_cookie.my_version==c.my_version ) { + pthread_cond_wait( &my_cond, &my_mutex ); + } + check( pthread_mutex_unlock( &my_mutex ), "pthread_mutex_unlock" ); +} + +inline void thread_monitor::cancel_wait() { + check( pthread_mutex_unlock( &my_mutex ), "pthread_mutex_unlock" ); +} +#endif /* USE_PTHREAD */ + +} // namespace internal +} // namespace rml + +#endif /* __RML_thread_monitor_H */ diff --git a/dep/tbb/src/rml/server/wait_counter.h b/dep/tbb/src/rml/server/wait_counter.h new file mode 100644 index 000000000..0951f9797 --- /dev/null +++ b/dep/tbb/src/rml/server/wait_counter.h @@ -0,0 +1,81 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __RML_wait_counter_H +#define __RML_wait_counter_H + +#include "thread_monitor.h" +#include "tbb/atomic.h" + +namespace rml { +namespace internal { + +class wait_counter { + thread_monitor my_monitor; + tbb::atomic my_count; + tbb::atomic n_transients; +public: + wait_counter() { + // The "1" here is subtracted by the call to "wait". 
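// Sketch of the calling pattern for thread_monitor's two-phase wait (declared
// in thread_monitor.h above); wait_counter::wait() below follows the same
// protocol. After prepare_wait() the caller re-checks its condition and must
// either commit_wait() or cancel_wait(). work_available, worker_loop and
// producer are invented names for this example.
#include <atomic>
#include "thread_monitor.h"

static rml::internal::thread_monitor monitor;
static std::atomic<bool> work_available(false);

void worker_loop() {
    for(;;) {
        rml::internal::thread_monitor::cookie c;
        monitor.prepare_wait(c);            // phase 1: register interest
        if( work_available.load() )
            monitor.cancel_wait();          // condition already true: do not block
        else
            monitor.commit_wait(c);         // phase 2: sleep until a later notify()
        // ... consume the work here ...
    }
}

void producer() {
    work_available.store(true);
    monitor.notify();                       // wakes the single waiting thread, if any
}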
+ my_count=1; + n_transients=0; + } + + //! Wait for number of operator-- invocations to match number of operator++ invocations. + /** Exactly one thread should call this method. */ + void wait() { + int k = --my_count; + __TBB_ASSERT( k>=0, "counter underflow" ); + if( k>0 ) { + thread_monitor::cookie c; + my_monitor.prepare_wait(c); + if( my_count ) + my_monitor.commit_wait(c); + else + my_monitor.cancel_wait(); + } + while( n_transients>0 ) + __TBB_Yield(); + } + void operator++() { + ++my_count; + } + void operator--() { + ++n_transients; + int k = --my_count; + __TBB_ASSERT( k>=0, "counter underflow" ); + if( k==0 ) + my_monitor.notify(); + --n_transients; + } +}; + +} // namespace internal +} // namespace rml + +#endif /* __RML_wait_counter_H */ diff --git a/dep/tbb/src/rml/server/win32-rml-export.def b/dep/tbb/src/rml/server/win32-rml-export.def new file mode 100644 index 000000000..54be4b16e --- /dev/null +++ b/dep/tbb/src/rml/server/win32-rml-export.def @@ -0,0 +1,35 @@ +; Copyright 2005-2009 Intel Corporation. All Rights Reserved. +; +; This file is part of Threading Building Blocks. +; +; Threading Building Blocks is free software; you can redistribute it +; and/or modify it under the terms of the GNU General Public License +; version 2 as published by the Free Software Foundation. +; +; Threading Building Blocks is distributed in the hope that it will be +; useful, but WITHOUT ANY WARRANTY; without even the implied warranty +; of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +; GNU General Public License for more details. +; +; You should have received a copy of the GNU General Public License +; along with Threading Building Blocks; if not, write to the Free Software +; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +; +; As a special exception, you may use this file as part of a free software +; library without restriction. Specifically, if other files instantiate +; templates or use macros or inline functions from this file, or you compile +; this file and link it with other files to produce an executable, this +; file does not by itself cause the resulting executable to be covered by +; the GNU General Public License. This exception does not however +; invalidate any other reasons why the executable file might be covered by +; the GNU General Public License. + +EXPORTS + +__RML_open_factory +__RML_close_factory +__TBB_make_rml_server +__KMP_make_rml_server +__TBB_call_with_my_server_info +__KMP_call_with_my_server_info + diff --git a/dep/tbb/src/rml/server/win64-rml-export.def b/dep/tbb/src/rml/server/win64-rml-export.def new file mode 100644 index 000000000..54be4b16e --- /dev/null +++ b/dep/tbb/src/rml/server/win64-rml-export.def @@ -0,0 +1,35 @@ +; Copyright 2005-2009 Intel Corporation. All Rights Reserved. +; +; This file is part of Threading Building Blocks. +; +; Threading Building Blocks is free software; you can redistribute it +; and/or modify it under the terms of the GNU General Public License +; version 2 as published by the Free Software Foundation. +; +; Threading Building Blocks is distributed in the hope that it will be +; useful, but WITHOUT ANY WARRANTY; without even the implied warranty +; of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +; GNU General Public License for more details. 
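// Usage sketch for the wait_counter class above, mirroring how
// __RML_open_factory/__RML_close_factory use it through factory::scratch_ptr:
// each in-flight operation is bracketed by ++/--, and exactly one thread calls
// wait(), which returns only after every ++ has been matched by a --.
// live_ops, transient_operation and shutdown are invented names.
#include "wait_counter.h"

static rml::internal::wait_counter live_ops;

void transient_operation() {
    ++live_ops;                 // announce an operation that must finish before shutdown
    // ... do the work ...
    --live_ops;                 // the last matching decrement wakes the waiter
}

void shutdown() {
    live_ops.wait();            // blocks until all bracketed operations have completed
}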
+; +; You should have received a copy of the GNU General Public License +; along with Threading Building Blocks; if not, write to the Free Software +; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +; +; As a special exception, you may use this file as part of a free software +; library without restriction. Specifically, if other files instantiate +; templates or use macros or inline functions from this file, or you compile +; this file and link it with other files to produce an executable, this +; file does not by itself cause the resulting executable to be covered by +; the GNU General Public License. This exception does not however +; invalidate any other reasons why the executable file might be covered by +; the GNU General Public License. + +EXPORTS + +__RML_open_factory +__RML_close_factory +__TBB_make_rml_server +__KMP_make_rml_server +__TBB_call_with_my_server_info +__KMP_call_with_my_server_info + diff --git a/dep/tbb/src/tbb/cache_aligned_allocator.cpp b/dep/tbb/src/tbb/cache_aligned_allocator.cpp new file mode 100644 index 000000000..18e3d13cf --- /dev/null +++ b/dep/tbb/src/tbb/cache_aligned_allocator.cpp @@ -0,0 +1,329 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/cache_aligned_allocator.h" +#include "tbb/tbb_allocator.h" +#include "tbb_misc.h" +#include "dynamic_link.h" +#include + +#if _WIN32||_WIN64 +#include +#else +#include +#endif /* _WIN32||_WIN64 */ + +using namespace std; + +#if __TBB_WEAK_SYMBOLS + +#pragma weak scalable_malloc +#pragma weak scalable_free + +extern "C" { + void* scalable_malloc( size_t ); + void scalable_free( void* ); +} + +#endif /* __TBB_WEAK_SYMBOLS */ + +#define __TBB_IS_SCALABLE_MALLOC_FIX_READY 0 + +namespace tbb { + +namespace internal { + +//! Dummy routine used for first indirect call via MallocHandler. +static void* DummyMalloc( size_t size ); + +//! Dummy routine used for first indirect call via FreeHandler. +static void DummyFree( void * ptr ); + +//! Handler for memory allocation +static void* (*MallocHandler)( size_t size ) = &DummyMalloc; + +//! Handler for memory deallocation +static void (*FreeHandler)( void* pointer ) = &DummyFree; + +//! Table describing the how to link the handlers. 
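// Standalone sketch of the lazy-dispatch idiom used by MallocHandler and
// FreeHandler above: a function pointer initially targets a stub that performs
// one-time set-up (here simply choosing the real implementation) and then
// forwards the call, so later calls go straight to the real routine. Names are
// invented, and the thread-safe DoOneTimeInitializations step of the real code
// is omitted for brevity.
#include <cstdlib>

static void* first_call_alloc( std::size_t n );           // stub, defined below
static void* (*alloc_handler)( std::size_t ) = &first_call_alloc;

static void* real_alloc( std::size_t n ) { return std::malloc(n); }

static void* first_call_alloc( std::size_t n ) {
    alloc_handler = &real_alloc;    // one-time set-up: bind the real allocator
    return (*alloc_handler)(n);     // forward the first call
}

void* my_allocate( std::size_t n ) { return (*alloc_handler)(n); }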
+static const dynamic_link_descriptor MallocLinkTable[] = { + DLD(scalable_malloc, MallocHandler), + DLD(scalable_free, FreeHandler), +}; + +#if __TBB_IS_SCALABLE_MALLOC_FIX_READY +//! Dummy routine used for first indirect call via padded_allocate_handler. +static void* dummy_padded_allocate( size_t bytes, size_t alignment ); + +//! Dummy routine used for first indirect call via padded_free_handler. +static void dummy_padded_free( void * ptr ); + +// ! Allocates memory using standard malloc. It is used when scalable_allocator is not available +static void* padded_allocate( size_t bytes, size_t alignment ); + +// ! Allocates memory using scalable_malloc +static void* padded_allocate_via_scalable_malloc( size_t bytes, size_t alignment ); + +// ! Allocates memory using standard free. It is used when scalable_allocator is not available +static void padded_free( void* p ); + +//! Handler for padded memory allocation +static void* (*padded_allocate_handler)( size_t bytes, size_t alignment ) = &dummy_padded_allocate; + +//! Handler for padded memory deallocation +static void (*padded_free_handler)( void* p ) = &dummy_padded_free; + +#endif // #if __TBB_IS_SCALABLE_MALLOC_FIX_READY + + +#if TBB_USE_DEBUG +#define DEBUG_SUFFIX "_debug" +#else +#define DEBUG_SUFFIX +#endif /* TBB_USE_DEBUG */ + +// MALLOCLIB_NAME is the name of the TBB memory allocator library. +#if _WIN32||_WIN64 +#define MALLOCLIB_NAME "tbbmalloc" DEBUG_SUFFIX ".dll" +#elif __APPLE__ +#define MALLOCLIB_NAME "libtbbmalloc" DEBUG_SUFFIX ".dylib" +#elif __linux__ +#define MALLOCLIB_NAME "libtbbmalloc" DEBUG_SUFFIX __TBB_STRING(.so.TBB_COMPATIBLE_INTERFACE_VERSION) +#elif __FreeBSD__ || __sun +#define MALLOCLIB_NAME "libtbbmalloc" DEBUG_SUFFIX ".so" +#else +#error Unknown OS +#endif + +//! Initialize the allocation/free handler pointers. +/** Caller is responsible for ensuring this routine is called exactly once. + The routine attempts to dynamically link with the TBB memory allocator. + If that allocator is not found, it links to malloc and free. */ +void initialize_cache_aligned_allocator() { + __TBB_ASSERT( MallocHandler==&DummyMalloc, NULL ); + bool success = dynamic_link( MALLOCLIB_NAME, MallocLinkTable, 2 ); + if( !success ) { + // If unsuccessful, set the handlers to the default routines. + // This must be done now, and not before FillDynanmicLinks runs, because if other + // threads call the handlers, we want them to go through the DoOneTimeInitializations logic, + // which forces them to wait. + FreeHandler = &free; + MallocHandler = &malloc; +#if __TBB_IS_SCALABLE_MALLOC_FIX_READY + padded_allocate_handler = &padded_allocate; + padded_free_handler = &padded_free; + }else{ + padded_allocate_handler = &padded_allocate_via_scalable_malloc; + __TBB_ASSERT(FreeHandler != &free && FreeHandler != &DummyFree, NULL); + padded_free_handler = FreeHandler; +#endif // __TBB_IS_SCALABLE_MALLOC_FIX_READY + } +#if !__TBB_RML_STATIC + PrintExtraVersionInfo( "ALLOCATOR", success?"scalable_malloc":"malloc" ); +#endif +} + +//! Defined in task.cpp +extern void DoOneTimeInitializations(); + +//! Executed on very first call through MallocHandler +static void* DummyMalloc( size_t size ) { + DoOneTimeInitializations(); + __TBB_ASSERT( MallocHandler!=&DummyMalloc, NULL ); + return (*MallocHandler)( size ); +} + +//! 
Executed on very first call throught FreeHandler +static void DummyFree( void * ptr ) { + DoOneTimeInitializations(); + __TBB_ASSERT( FreeHandler!=&DummyFree, NULL ); + (*FreeHandler)( ptr ); +} + +#if __TBB_IS_SCALABLE_MALLOC_FIX_READY +//! Executed on very first call through padded_allocate_handler +static void* dummy_padded_allocate( size_t bytes, size_t alignment ) { + DoOneTimeInitializations(); + __TBB_ASSERT( padded_allocate_handler!=&dummy_padded_allocate, NULL ); + return (*padded_allocate_handler)(bytes, alignment); +} + +//! Executed on very first call throught padded_free_handler +static void dummy_padded_free( void * ptr ) { + DoOneTimeInitializations(); + __TBB_ASSERT( padded_free_handler!=&dummy_padded_free, NULL ); + (*padded_free_handler)( ptr ); +} +#endif // __TBB_IS_SCALABLE_MALLOC_FIX_READY + +static size_t NFS_LineSize = 128; + +size_t NFS_GetLineSize() { + return NFS_LineSize; +} + +//! Requests for blocks this size and higher are handled via malloc/free, +const size_t BigSize = 4096; + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // unary minus operator applied to unsigned type, result still unsigned + #pragma warning( disable: 4146 4706 ) +#endif + +void* NFS_Allocate( size_t n, size_t element_size, void* /*hint*/ ) { + size_t m = NFS_LineSize; + __TBB_ASSERT( m<=NFS_MaxLineSize, "illegal value for NFS_LineSize" ); + __TBB_ASSERT( (m & m-1)==0, "must be power of two" ); + size_t bytes = n*element_size; +#if __TBB_IS_SCALABLE_MALLOC_FIX_READY + + if (bytes=BigSize?malloc(m+bytes):(*MallocHandler)(m+bytes))) ) { + // Overflow + throw bad_alloc(); + } + // Round up to next line + unsigned char* result = (unsigned char*)((uintptr)(base+m)&-m); + // Record where block actually starts. Use low order bit to record whether we used malloc or MallocHandler. + ((uintptr*)result)[-1] = uintptr(base)|(bytes>=BigSize); +#endif // __TBB_IS_SCALABLE_MALLOC_FIX_READY + /** The test may fail with TBB_IS_SCALABLE_MALLOC_FIX_READY = 1 + because scalable_malloc returns addresses aligned to 64 when large block is allocated */ + __TBB_ASSERT( ((size_t)result&(m-1)) == 0, "The address returned isn't aligned to cache line size" ); + return result; +} + +void NFS_Free( void* p ) { +#if __TBB_IS_SCALABLE_MALLOC_FIX_READY + (*padded_free_handler)( p ); +#else + if( p ) { + __TBB_ASSERT( (uintptr)p>=0x4096, "attempt to free block not obtained from cache_aligned_allocator" ); + // Recover where block actually starts + unsigned char* base = ((unsigned char**)p)[-1]; + __TBB_ASSERT( (void*)((uintptr)(base+NFS_LineSize)&-NFS_LineSize)==p, "not allocated by NFS_Allocate?" ); + if( uintptr(base)&1 ) { + // Is a big block - use free + free(base-1); + } else { + // Is a small block - use scalable allocator + (*FreeHandler)( base ); + } + } +#endif // __TBB_IS_SCALABLE_MALLOC_FIX_READY +} + +#if __TBB_IS_SCALABLE_MALLOC_FIX_READY +static void* padded_allocate_via_scalable_malloc( size_t bytes, size_t alignment ) { + unsigned char* base; + if( !(base=(unsigned char*)(*MallocHandler)((bytes+alignment)&-alignment))) { + throw bad_alloc(); + } + return base; // scalable_malloc returns aligned pointer +} + +static void* padded_allocate( size_t bytes, size_t alignment ) { + unsigned char* base; + if( !(base=(unsigned char*)malloc(alignment+bytes)) ) { + throw bad_alloc(); + } + // Round up to the next line + unsigned char* result = (unsigned char*)((uintptr)(base+alignment)&-alignment); + // Record where block actually starts. 
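// Standalone sketch of the padding trick used by NFS_Allocate and
// padded_allocate: over-allocate by one alignment unit, round the pointer up
// to the requested power-of-two boundary, and stash the original base just
// below the returned address so it can be recovered on free. Because malloc
// already returns pointer-aligned storage, the gap below the rounded-up
// address is large enough for the stashed pointer whenever the requested
// alignment is a larger power of two (e.g. a cache line).
// aligned_alloc_pad/aligned_free_pad are invented names.
#include <cstdlib>
#include <cstdint>

void* aligned_alloc_pad( std::size_t bytes, std::size_t alignment ) {
    unsigned char* base = static_cast<unsigned char*>( std::malloc(bytes + alignment) );
    if( !base ) return NULL;
    std::uintptr_t up = (reinterpret_cast<std::uintptr_t>(base) + alignment)
                        & ~(std::uintptr_t(alignment) - 1);       // round up to boundary
    unsigned char* result = reinterpret_cast<unsigned char*>(up);
    reinterpret_cast<void**>(result)[-1] = base;                  // record the real start
    return result;
}

void aligned_free_pad( void* p ) {
    if( p ) std::free( reinterpret_cast<void**>(p)[-1] );
}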
+ ((uintptr*)result)[-1] = uintptr(base); + return result; +} + +static void padded_free( void* p ) { + if( p ) { + __TBB_ASSERT( (uintptr)p>=0x4096, "attempt to free block not obtained from cache_aligned_allocator" ); + // Recover where block actually starts + unsigned char* base = ((unsigned char**)p)[-1]; + __TBB_ASSERT( (void*)((uintptr)(base+NFS_LineSize)&-NFS_LineSize)==p, "not allocated by NFS_Allocate?" ); + free(base); + } +} +#endif // #if __TBB_IS_SCALABLE_MALLOC_FIX_READY + +void* __TBB_EXPORTED_FUNC allocate_via_handler_v3( size_t n ) { + void* result; + result = (*MallocHandler) (n); + if (!result) { + // Overflow + throw bad_alloc(); + } + return result; +} + +void __TBB_EXPORTED_FUNC deallocate_via_handler_v3( void *p ) { + if( p ) { + (*FreeHandler)( p ); + } +} + +bool __TBB_EXPORTED_FUNC is_malloc_used_v3() { + if (MallocHandler == &DummyMalloc) { + void* void_ptr = (*MallocHandler)(1); + (*FreeHandler)(void_ptr); + } + __TBB_ASSERT( MallocHandler!=&DummyMalloc && FreeHandler!=&DummyFree, NULL ); + __TBB_ASSERT(MallocHandler==&malloc && FreeHandler==&free || + MallocHandler!=&malloc && FreeHandler!=&free, NULL ); + return MallocHandler == &malloc; +} + +} // namespace internal + +} // namespace tbb + +#if __TBB_RML_STATIC +#include "tbb/atomic.h" +static tbb::atomic module_inited; +namespace tbb { +namespace internal { +void DoOneTimeInitializations() { + if( module_inited!=2 ) { + if( module_inited.compare_and_swap(1, 0)==0 ) { + initialize_cache_aligned_allocator(); + module_inited = 2; + } else { + do { + __TBB_Yield(); + } while( module_inited!=2 ); + } + } +} +}} //namespace tbb::internal +#endif diff --git a/dep/tbb/src/tbb/concurrent_hash_map.cpp b/dep/tbb/src/tbb/concurrent_hash_map.cpp new file mode 100644 index 000000000..d3937102c --- /dev/null +++ b/dep/tbb/src/tbb/concurrent_hash_map.cpp @@ -0,0 +1,66 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/concurrent_hash_map.h" + +namespace tbb { + +namespace internal { +#if !TBB_NO_LEGACY +struct hash_map_segment_base { + typedef spin_rw_mutex segment_mutex_t; + //! Type of a hash code. + typedef size_t hashcode_t; + //! Log2 of n_segment + static const size_t n_segment_bits = 6; + //! 
Maximum size of array of chains + static const size_t max_physical_size = size_t(1)<<(8*sizeof(hashcode_t)-n_segment_bits); + //! Mutex that protects this segment + segment_mutex_t my_mutex; + // Number of nodes + atomic my_logical_size; + // Size of chains + /** Always zero or a power of two */ + size_t my_physical_size; + //! True if my_logical_size>=my_physical_size. + /** Used to support Intel(R) Thread Checker. */ + bool __TBB_EXPORTED_METHOD internal_grow_predicate() const; +}; + +bool hash_map_segment_base::internal_grow_predicate() const { + // Intel(R) Thread Checker considers the following reads to be races, so we hide them in the + // library so that Intel(R) Thread Checker will ignore them. The reads are used in a double-check + // context, so the program is nonetheless correct despite the race. + return my_logical_size >= my_physical_size && my_physical_size < max_physical_size; +} +#endif//!TBB_NO_LEGACY + +} // namespace internal + +} // namespace tbb + diff --git a/dep/tbb/src/tbb/concurrent_queue.cpp b/dep/tbb/src/tbb/concurrent_queue.cpp new file mode 100644 index 000000000..33ce5910b --- /dev/null +++ b/dep/tbb/src/tbb/concurrent_queue.cpp @@ -0,0 +1,841 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include // for memset() +#include "tbb/tbb_stddef.h" +#include "tbb/tbb_machine.h" +#include "tbb/_concurrent_queue_internal.h" +#include "itt_notify.h" +#include +#if _WIN32||_WIN64 +#include +#endif +using namespace std; + +// enable sleep support +#define __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE 1 + +#if defined(_MSC_VER) && defined(_Wp64) + // Workaround for overzealous compiler warnings in /Wp64 mode + #pragma warning (disable: 4267) +#endif + +#define RECORD_EVENTS 0 + + +#if __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE +#if !_WIN32&&!_WIN64 +#include +#endif +#endif + +namespace tbb { + +namespace internal { + +typedef concurrent_queue_base_v3 concurrent_queue_base; + +typedef size_t ticket; + +//! A queue using simple locking. +/** For efficient, this class has no constructor. + The caller is expected to zero-initialize it. 
*/ +struct micro_queue { + typedef concurrent_queue_base::page page; + + friend class micro_queue_pop_finalizer; + + atomic head_page; + atomic head_counter; + + atomic tail_page; + atomic tail_counter; + + spin_mutex page_mutex; + + void push( const void* item, ticket k, concurrent_queue_base& base ); + + bool pop( void* dst, ticket k, concurrent_queue_base& base ); + + micro_queue& assign( const micro_queue& src, concurrent_queue_base& base ); + + page* make_copy ( concurrent_queue_base& base, const page* src_page, size_t begin_in_page, size_t end_in_page, ticket& g_index ) ; + + void make_invalid( ticket k ); +}; + +// we need to yank it out of micro_queue because of concurrent_queue_base::deallocate_page being virtual. +class micro_queue_pop_finalizer: no_copy { + typedef concurrent_queue_base::page page; + ticket my_ticket; + micro_queue& my_queue; + page* my_page; + concurrent_queue_base &base; +public: + micro_queue_pop_finalizer( micro_queue& queue, concurrent_queue_base& b, ticket k, page* p ) : + my_ticket(k), my_queue(queue), my_page(p), base(b) + {} + ~micro_queue_pop_finalizer() { + page* p = my_page; + if( p ) { + spin_mutex::scoped_lock lock( my_queue.page_mutex ); + page* q = p->next; + my_queue.head_page = q; + if( !q ) { + my_queue.tail_page = NULL; + } + } + my_queue.head_counter = my_ticket; + if( p ) + base.deallocate_page( p ); + } +}; + +//! Internal representation of a ConcurrentQueue. +/** For efficient, this class has no constructor. + The caller is expected to zero-initialize it. */ +class concurrent_queue_rep { +public: +#if __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE +# if _WIN32||_WIN64 + typedef HANDLE waitvar_t; + typedef CRITICAL_SECTION mutexvar_t; +# else + typedef pthread_cond_t waitvar_t; + typedef pthread_mutex_t mutexvar_t; +# endif +#endif /* __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE */ +private: + friend struct micro_queue; + + //! Approximately n_queue/golden ratio + static const size_t phi = 3; + +public: + //! Must be power of 2 + static const size_t n_queue = 8; + + //! 
Map ticket to an array index + static size_t index( ticket k ) { + return k*phi%n_queue; + } + +#if __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE + atomic head_counter; + waitvar_t var_wait_for_items; + mutexvar_t mtx_items_avail; + atomic n_invalid_entries; + atomic n_waiting_consumers; +#if _WIN32||_WIN64 + uint32_t consumer_wait_generation; + uint32_t n_consumers_to_wakeup; + char pad1[NFS_MaxLineSize-((sizeof(atomic)+sizeof(waitvar_t)+sizeof(mutexvar_t)+sizeof(atomic)+sizeof(atomic)+sizeof(uint32_t)+sizeof(uint32_t))&(NFS_MaxLineSize-1))]; +#else + char pad1[NFS_MaxLineSize-((sizeof(atomic)+sizeof(waitvar_t)+sizeof(mutexvar_t)+sizeof(atomic)+sizeof(atomic))&(NFS_MaxLineSize-1))]; +#endif + + atomic tail_counter; + waitvar_t var_wait_for_slots; + mutexvar_t mtx_slots_avail; + atomic n_waiting_producers; +#if _WIN32||_WIN64 + uint32_t producer_wait_generation; + uint32_t n_producers_to_wakeup; + char pad2[NFS_MaxLineSize-((sizeof(atomic)+sizeof(waitvar_t)+sizeof(mutexvar_t)+sizeof(atomic)+sizeof(uint32_t)+sizeof(uint32_t))&(NFS_MaxLineSize-1))]; +#else + char pad2[NFS_MaxLineSize-((sizeof(atomic)+sizeof(waitvar_t)+sizeof(mutexvar_t)+sizeof(atomic))&(NFS_MaxLineSize-1))]; +#endif +#else /* !__TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE */ + atomic head_counter; + atomic n_invalid_entries; + char pad1[NFS_MaxLineSize-sizeof(atomic)-sizeof(atomic)]; + atomic tail_counter; + char pad2[NFS_MaxLineSize-sizeof(atomic)]; +#endif /* __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE */ + micro_queue array[n_queue]; + + micro_queue& choose( ticket k ) { + // The formula here approximates LRU in a cache-oblivious way. + return array[index(k)]; + } + + //! Value for effective_capacity that denotes unbounded queue. + static const ptrdiff_t infinite_capacity = ptrdiff_t(~size_t(0)/2); +}; + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // unary minus operator applied to unsigned type, result still unsigned + #pragma warning( push ) + #pragma warning( disable: 4146 ) +#endif + +static void* invalid_page; + +//------------------------------------------------------------------------ +// micro_queue +//------------------------------------------------------------------------ +void micro_queue::push( const void* item, ticket k, concurrent_queue_base& base ) { + k &= -concurrent_queue_rep::n_queue; + page* p = NULL; + size_t index = k/concurrent_queue_rep::n_queue & (base.items_per_page-1); + if( !index ) { + try { + p = base.allocate_page(); + } catch (...) { + ++base.my_rep->n_invalid_entries; + make_invalid( k ); + } + p->mask = 0; + p->next = NULL; + } + + if( tail_counter!=k ) { + atomic_backoff backoff; + do { + backoff.pause(); + // no memory. throws an exception; assumes concurrent_queue_rep::n_queue>1 + if( tail_counter&0x1 ) { + ++base.my_rep->n_invalid_entries; + throw bad_last_alloc(); + } + } while( tail_counter!=k ) ; + } + + if( p ) { + spin_mutex::scoped_lock lock( page_mutex ); + if( page* q = tail_page ) + q->next = p; + else + head_page = p; + tail_page = p; + } else { + p = tail_page; + } + ITT_NOTIFY( sync_acquired, p ); + + try { + base.copy_item( *p, index, item ); + ITT_NOTIFY( sync_releasing, p ); + // If no exception was thrown, mark item as present. 
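// Worked example of the ticket-to-micro_queue mapping used by
// concurrent_queue_rep::index()/choose() above: with n_queue == 8 and phi == 3
// (coprime with 8), consecutive tickets visit all eight internal queues before
// repeating, spreading producers and consumers across queues. Standalone demo.
#include <cstddef>
#include <cstdio>

int main() {
    const std::size_t phi = 3, n_queue = 8;
    for( std::size_t k = 0; k < n_queue; ++k )
        std::printf("ticket %u -> micro_queue %u\n",
                    unsigned(k), unsigned(k * phi % n_queue));
    // Prints the permutation 0 3 6 1 4 7 2 5.
    return 0;
}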
+ p->mask |= uintptr(1)<n_invalid_entries; + tail_counter += concurrent_queue_rep::n_queue; + throw; + } +} + +bool micro_queue::pop( void* dst, ticket k, concurrent_queue_base& base ) { + k &= -concurrent_queue_rep::n_queue; + spin_wait_until_eq( head_counter, k ); + spin_wait_while_eq( tail_counter, k ); + page& p = *head_page; + __TBB_ASSERT( &p, NULL ); + size_t index = k/concurrent_queue_rep::n_queue & (base.items_per_page-1); + bool success = false; + { + micro_queue_pop_finalizer finalizer( *this, base, k+concurrent_queue_rep::n_queue, index==base.items_per_page-1 ? &p : NULL ); + if( p.mask & uintptr(1)<n_invalid_entries; + } + } + return success; +} + +micro_queue& micro_queue::assign( const micro_queue& src, concurrent_queue_base& base ) +{ + head_counter = src.head_counter; + tail_counter = src.tail_counter; + page_mutex = src.page_mutex; + + const page* srcp = src.head_page; + if( srcp ) { + ticket g_index = head_counter; + try { + size_t n_items = (tail_counter-head_counter)/concurrent_queue_rep::n_queue; + size_t index = head_counter/concurrent_queue_rep::n_queue & (base.items_per_page-1); + size_t end_in_first_page = (index+n_itemsnext; srcp!=src.tail_page; srcp=srcp->next ) { + cur_page->next = make_copy( base, srcp, 0, base.items_per_page, g_index ); + cur_page = cur_page->next; + } + + __TBB_ASSERT( srcp==src.tail_page, NULL ); + + size_t last_index = tail_counter/concurrent_queue_rep::n_queue & (base.items_per_page-1); + if( last_index==0 ) last_index = base.items_per_page; + + cur_page->next = make_copy( base, srcp, 0, last_index, g_index ); + cur_page = cur_page->next; + } + tail_page = cur_page; + } catch (...) { + make_invalid( g_index ); + } + } else { + head_page = tail_page = NULL; + } + return *this; +} + +concurrent_queue_base::page* micro_queue::make_copy( concurrent_queue_base& base, const concurrent_queue_base::page* src_page, size_t begin_in_page, size_t end_in_page, ticket& g_index ) +{ + page* new_page = base.allocate_page(); + new_page->next = NULL; + new_page->mask = src_page->mask; + for( ; begin_in_page!=end_in_page; ++begin_in_page, ++g_index ) + if( new_page->mask & uintptr(1)<((void*)1), 0}; + // mark it so that no more pushes are allowed. + invalid_page = &dummy; + { + spin_mutex::scoped_lock lock( page_mutex ); + tail_counter = k+concurrent_queue_rep::n_queue+1; + if( page* q = tail_page ) + q->next = static_cast(invalid_page); + else + head_page = static_cast(invalid_page); + tail_page = static_cast(invalid_page); + } + throw; +} + +#if _MSC_VER && !defined(__INTEL_COMPILER) + #pragma warning( pop ) +#endif // warning 4146 is back + +//------------------------------------------------------------------------ +// concurrent_queue_base +//------------------------------------------------------------------------ +concurrent_queue_base_v3::concurrent_queue_base_v3( size_t item_size ) { + items_per_page = item_size<=8 ? 32 : + item_size<=16 ? 16 : + item_size<=32 ? 8 : + item_size<=64 ? 4 : + item_size<=128 ? 2 : + 1; + my_capacity = size_t(-1)/(item_size>1 ? 
item_size : 2); + my_rep = cache_aligned_allocator().allocate(1); + __TBB_ASSERT( (size_t)my_rep % NFS_GetLineSize()==0, "alignment error" ); + __TBB_ASSERT( (size_t)&my_rep->head_counter % NFS_GetLineSize()==0, "alignment error" ); + __TBB_ASSERT( (size_t)&my_rep->tail_counter % NFS_GetLineSize()==0, "alignment error" ); + __TBB_ASSERT( (size_t)&my_rep->array % NFS_GetLineSize()==0, "alignment error" ); + memset(my_rep,0,sizeof(concurrent_queue_rep)); + this->item_size = item_size; +#if __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE +#if _WIN32||_WIN64 + my_rep->var_wait_for_items = CreateEvent( NULL, TRUE/*manual reset*/, FALSE/*not signalled initially*/, NULL); + my_rep->var_wait_for_slots = CreateEvent( NULL, TRUE/*manual reset*/, FALSE/*not signalled initially*/, NULL); + InitializeCriticalSection( &my_rep->mtx_items_avail ); + InitializeCriticalSection( &my_rep->mtx_slots_avail ); +#else + // initialize pthread_mutex_t, and pthread_cond_t + pthread_mutexattr_t m_attr; + pthread_mutexattr_init( &m_attr ); +#if defined(PTHREAD_PRIO_INHERIT) && !__TBB_PRIO_INHERIT_BROKEN + pthread_mutexattr_setprotocol( &m_attr, PTHREAD_PRIO_INHERIT ); +#endif + pthread_mutex_init( &my_rep->mtx_items_avail, &m_attr ); + pthread_mutex_init( &my_rep->mtx_slots_avail, &m_attr ); + pthread_mutexattr_destroy( &m_attr ); + + pthread_condattr_t c_attr; + pthread_condattr_init( &c_attr ); + pthread_cond_init( &my_rep->var_wait_for_items, &c_attr ); + pthread_cond_init( &my_rep->var_wait_for_slots, &c_attr ); + pthread_condattr_destroy( &c_attr ); +#endif +#endif /* __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE */ +} + +concurrent_queue_base_v3::~concurrent_queue_base_v3() { + size_t nq = my_rep->n_queue; + for( size_t i=0; iarray[i].tail_page==NULL, "pages were not freed properly" ); +#if __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE +# if _WIN32||_WIN64 + CloseHandle( my_rep->var_wait_for_items ); + CloseHandle( my_rep->var_wait_for_slots ); + DeleteCriticalSection( &my_rep->mtx_items_avail ); + DeleteCriticalSection( &my_rep->mtx_slots_avail ); +# else + pthread_mutex_destroy( &my_rep->mtx_items_avail ); + pthread_mutex_destroy( &my_rep->mtx_slots_avail ); + pthread_cond_destroy( &my_rep->var_wait_for_items ); + pthread_cond_destroy( &my_rep->var_wait_for_slots ); +# endif +#endif /* __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE */ + cache_aligned_allocator().deallocate(my_rep,1); +} + +void concurrent_queue_base_v3::internal_push( const void* src ) { + concurrent_queue_rep& r = *my_rep; +#if !__TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE + ticket k = r.tail_counter++; + ptrdiff_t e = my_capacity; + if( e(my_capacity); + } + } + r.choose(k).push(src,k,*this); +#elif _WIN32||_WIN64 + ticket k = r.tail_counter++; + ptrdiff_t e = my_capacity; + atomic_backoff backoff; +#if DO_ITT_NOTIFY + bool sync_prepare_done = false; +#endif + + while( (ptrdiff_t)(k-r.head_counter)>=e ) { +#if DO_ITT_NOTIFY + if( !sync_prepare_done ) { + ITT_NOTIFY( sync_prepare, &sync_prepare_done ); + sync_prepare_done = true; + } +#endif + if( !backoff.bounded_pause() ) { + EnterCriticalSection( &r.mtx_slots_avail ); + r.n_waiting_producers++; + while( (ptrdiff_t)(k-r.head_counter)>=const_cast(my_capacity) ) { + uint32_t my_generation = r.producer_wait_generation; + for( ;; ) { + LeaveCriticalSection( &r.mtx_slots_avail ); + WaitForSingleObject( r.var_wait_for_slots, INFINITE ); + EnterCriticalSection( &r.mtx_slots_avail ); + if( r.n_producers_to_wakeup > 0 && r.producer_wait_generation != my_generation ) + break; + } + if( --r.n_producers_to_wakeup == 0 ) + ResetEvent( 
r.var_wait_for_slots ); + } + --r.n_waiting_producers; + LeaveCriticalSection( &r.mtx_slots_avail ); + break; + } + e = const_cast(my_capacity); + } +#if DO_ITT_NOTIFY + if( sync_prepare_done ) + ITT_NOTIFY( sync_acquired, &sync_prepare_done ); +#endif + + r.choose( k ).push( src, k, *this ); + + if( r.n_waiting_consumers>0 ) { + EnterCriticalSection( &r.mtx_items_avail ); + if( r.n_waiting_consumers>0 ) { + r.consumer_wait_generation++; + r.n_consumers_to_wakeup = r.n_waiting_consumers; + SetEvent( r.var_wait_for_items ); + } + LeaveCriticalSection( &r.mtx_items_avail ); + } +#else + ticket k = r.tail_counter++; + ptrdiff_t e = my_capacity; + atomic_backoff backoff; +#if DO_ITT_NOTIFY + bool sync_prepare_done = false; +#endif + while( (ptrdiff_t)(k-r.head_counter)>=e ) { +#if DO_ITT_NOTIFY + if( !sync_prepare_done ) { + ITT_NOTIFY( sync_prepare, &sync_prepare_done ); + sync_prepare_done = true; + } +#endif + if( !backoff.bounded_pause() ) { + // queue is full. go to sleep. let them go to sleep in order. + pthread_mutex_lock( &r.mtx_slots_avail ); + r.n_waiting_producers++; + while( (ptrdiff_t)(k-r.head_counter)>=const_cast(my_capacity) ) { + pthread_cond_wait( &r.var_wait_for_slots, &r.mtx_slots_avail ); + } + --r.n_waiting_producers; + pthread_mutex_unlock( &r.mtx_slots_avail ); + break; + } + e = const_cast(my_capacity); + } +#if DO_ITT_NOTIFY + if( sync_prepare_done ) + ITT_NOTIFY( sync_acquired, &sync_prepare_done ); +#endif + r.choose( k ).push( src, k, *this ); + + if( r.n_waiting_consumers>0 ) { + pthread_mutex_lock( &r.mtx_items_avail ); + // pthread_cond_broadcast() wakes up all consumers. + if( r.n_waiting_consumers>0 ) + pthread_cond_broadcast( &r.var_wait_for_items ); + pthread_mutex_unlock( &r.mtx_items_avail ); + } +#endif /* !__TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE */ +} + +void concurrent_queue_base_v3::internal_pop( void* dst ) { + concurrent_queue_rep& r = *my_rep; +#if !__TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE + ticket k; + do { + k = r.head_counter++; + } while( !r.choose(k).pop(dst,k,*this) ); +#elif _WIN32||_WIN64 + ticket k; + atomic_backoff backoff; +#if DO_ITT_NOTIFY + bool sync_prepare_done = false; +#endif + do { + k=r.head_counter++; + while( r.tail_counter<=k ) { +#if DO_ITT_NOTIFY + if( !sync_prepare_done ) { + ITT_NOTIFY( sync_prepare, dst ); + dst = (void*) ((intptr_t)dst | 1); + sync_prepare_done = true; + } +#endif + // Queue is empty; pause and re-try a few times + if( !backoff.bounded_pause() ) { + // it is really empty.. go to sleep + EnterCriticalSection( &r.mtx_items_avail ); + r.n_waiting_consumers++; + while( r.tail_counter<=k ) { + uint32_t my_generation = r.consumer_wait_generation; + for( ;; ) { + LeaveCriticalSection( &r.mtx_items_avail ); + WaitForSingleObject( r.var_wait_for_items, INFINITE ); + EnterCriticalSection( &r.mtx_items_avail ); + if( r.n_consumers_to_wakeup > 0 && r.consumer_wait_generation != my_generation ) + break; + } + if( --r.n_consumers_to_wakeup == 0 ) + ResetEvent( r.var_wait_for_items ); + } + --r.n_waiting_consumers; + LeaveCriticalSection( &r.mtx_items_avail ); + backoff.reset(); + break; // break from inner while + } + } // break to here + } while( !r.choose(k).pop(dst,k,*this) ); + + // wake up a producer.. 
+ if( r.n_waiting_producers>0 ) { + EnterCriticalSection( &r.mtx_slots_avail ); + if( r.n_waiting_producers>0 ) { + r.producer_wait_generation++; + r.n_producers_to_wakeup = r.n_waiting_producers; + SetEvent( r.var_wait_for_slots ); + } + LeaveCriticalSection( &r.mtx_slots_avail ); + } +#else + ticket k; + atomic_backoff backoff; +#if DO_ITT_NOTIFY + bool sync_prepare_done = false; +#endif + do { + k = r.head_counter++; + while( r.tail_counter<=k ) { +#if DO_ITT_NOTIFY + if( !sync_prepare_done ) { + ITT_NOTIFY( sync_prepare, dst ); + dst = (void*) ((intptr_t)dst | 1); + sync_prepare_done = true; + } +#endif + // Queue is empty; pause and re-try a few times + if( !backoff.bounded_pause() ) { + // it is really empty.. go to sleep + pthread_mutex_lock( &r.mtx_items_avail ); + r.n_waiting_consumers++; + while( r.tail_counter<=k ) + pthread_cond_wait( &r.var_wait_for_items, &r.mtx_items_avail ); + --r.n_waiting_consumers; + pthread_mutex_unlock( &r.mtx_items_avail ); + backoff.reset(); + break; + } + } + } while( !r.choose(k).pop(dst,k,*this) ); + + if( r.n_waiting_producers>0 ) { + pthread_mutex_lock( &r.mtx_slots_avail ); + if( r.n_waiting_producers>0 ) + pthread_cond_broadcast( &r.var_wait_for_slots ); + pthread_mutex_unlock( &r.mtx_slots_avail ); + } +#endif /* !__TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE */ +} + +bool concurrent_queue_base_v3::internal_pop_if_present( void* dst ) { + concurrent_queue_rep& r = *my_rep; + ticket k; + do { + k = r.head_counter; + for(;;) { + if( r.tail_counter<=k ) { + // Queue is empty + return false; + } + // Queue had item with ticket k when we looked. Attempt to get that item. + ticket tk=k; + k = r.head_counter.compare_and_swap( tk+1, tk ); + if( k==tk ) + break; + // Another thread snatched the item, retry. + } + } while( !r.choose( k ).pop( dst, k, *this ) ); + +#if __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE +#if _WIN32||_WIN64 + // wake up a producer.. + if( r.n_waiting_producers>0 ) { + EnterCriticalSection( &r.mtx_slots_avail ); + if( r.n_waiting_producers>0 ) { + r.producer_wait_generation++; + r.n_producers_to_wakeup = r.n_waiting_producers; + SetEvent( r.var_wait_for_slots ); + } + LeaveCriticalSection( &r.mtx_slots_avail ); + } +#else /* including MacOS */ + if( r.n_waiting_producers>0 ) { + pthread_mutex_lock( &r.mtx_slots_avail ); + if( r.n_waiting_producers>0 ) + pthread_cond_broadcast( &r.var_wait_for_slots ); + pthread_mutex_unlock( &r.mtx_slots_avail ); + } +#endif +#endif /* __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE */ + + return true; +} + +bool concurrent_queue_base_v3::internal_push_if_not_full( const void* src ) { + concurrent_queue_rep& r = *my_rep; + ticket k = r.tail_counter; + for(;;) { + if( (ptrdiff_t)(k-r.head_counter)>=my_capacity ) { + // Queue is full + return false; + } + // Queue had empty slot with ticket k when we looked. Attempt to claim that slot. + ticket tk=k; + k = r.tail_counter.compare_and_swap( tk+1, tk ); + if( k==tk ) + break; + // Another thread claimed the slot, so retry. 
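// Minimal standalone sketch of the blocking behaviour implemented by
// internal_push/internal_pop above: producers sleep while the queue is at
// capacity, consumers sleep while it is empty, and each side wakes the other
// after changing the state. The real code first spins with atomic_backoff and
// distributes items over per-ticket micro_queues; both refinements are omitted
// here, and bounded_queue is an invented name.
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

template<typename T>
class bounded_queue {
    std::mutex mtx;
    std::condition_variable items_avail, slots_avail;
    std::deque<T> q;
    std::size_t capacity;
public:
    explicit bounded_queue( std::size_t cap ) : capacity(cap) {}
    void push( const T& x ) {
        std::unique_lock<std::mutex> lock(mtx);
        slots_avail.wait( lock, [&]{ return q.size() < capacity; } );
        q.push_back(x);
        items_avail.notify_one();       // a consumer may be sleeping on "empty"
    }
    void pop( T& x ) {
        std::unique_lock<std::mutex> lock(mtx);
        items_avail.wait( lock, [&]{ return !q.empty(); } );
        x = q.front(); q.pop_front();
        slots_avail.notify_one();       // a producer may be sleeping on "full"
    }
};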
+ } + r.choose(k).push(src,k,*this); + +#if __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE +#if _WIN32||_WIN64 + if( r.n_waiting_consumers>0 ) { + EnterCriticalSection( &r.mtx_items_avail ); + if( r.n_waiting_consumers>0 ) { + r.consumer_wait_generation++; + r.n_consumers_to_wakeup = r.n_waiting_consumers; + SetEvent( r.var_wait_for_items ); + } + LeaveCriticalSection( &r.mtx_items_avail ); + } +#else /* including MacOS */ + if( r.n_waiting_consumers>0 ) { + pthread_mutex_lock( &r.mtx_items_avail ); + if( r.n_waiting_consumers>0 ) + pthread_cond_broadcast( &r.var_wait_for_items ); + pthread_mutex_unlock( &r.mtx_items_avail ); + } +#endif +#endif /* __TBB_NO_BUSY_WAIT_IN_CONCURRENT_QUEUE */ + return true; +} + +ptrdiff_t concurrent_queue_base_v3::internal_size() const { + __TBB_ASSERT( sizeof(ptrdiff_t)<=sizeof(size_t), NULL ); + return ptrdiff_t(my_rep->tail_counter-my_rep->head_counter-my_rep->n_invalid_entries); +} + +bool concurrent_queue_base_v3::internal_empty() const { + ticket tc = my_rep->tail_counter; + ticket hc = my_rep->head_counter; + // if tc!=r.tail_counter, the queue was not empty at some point between the two reads. + return ( tc==my_rep->tail_counter && ptrdiff_t(tc-hc-my_rep->n_invalid_entries)<=0 ); +} + +void concurrent_queue_base_v3::internal_set_capacity( ptrdiff_t capacity, size_t /*item_size*/ ) { + my_capacity = capacity<0 ? concurrent_queue_rep::infinite_capacity : capacity; +} + +void concurrent_queue_base_v3::internal_finish_clear() { + size_t nq = my_rep->n_queue; + for( size_t i=0; iarray[i].tail_page; + __TBB_ASSERT( my_rep->array[i].head_page==tp, "at most one page should remain" ); + if( tp!=NULL) { + if( tp!=invalid_page ) deallocate_page( tp ); + my_rep->array[i].tail_page = NULL; + } + } +} + +void concurrent_queue_base_v3::internal_throw_exception() const { + throw bad_alloc(); +} + +void concurrent_queue_base_v3::assign( const concurrent_queue_base& src ) { + items_per_page = src.items_per_page; + my_capacity = src.my_capacity; + + // copy concurrent_queue_rep. + my_rep->head_counter = src.my_rep->head_counter; + my_rep->tail_counter = src.my_rep->tail_counter; + my_rep->n_invalid_entries = src.my_rep->n_invalid_entries; + + // copy micro_queues + for( size_t i = 0; in_queue; ++i ) + my_rep->array[i].assign( src.my_rep->array[i], *this); + + __TBB_ASSERT( my_rep->head_counter==src.my_rep->head_counter && my_rep->tail_counter==src.my_rep->tail_counter, + "the source concurrent queue should not be concurrently modified." 
); +} + +//------------------------------------------------------------------------ +// concurrent_queue_iterator_rep +//------------------------------------------------------------------------ +class concurrent_queue_iterator_rep: no_assign { +public: + ticket head_counter; + const concurrent_queue_base& my_queue; + concurrent_queue_base::page* array[concurrent_queue_rep::n_queue]; + concurrent_queue_iterator_rep( const concurrent_queue_base& queue ) : + head_counter(queue.my_rep->head_counter), + my_queue(queue) + { + const concurrent_queue_rep& rep = *queue.my_rep; + for( size_t k=0; ktail_counter ) { + item = NULL; + return true; + } else { + concurrent_queue_base::page* p = array[concurrent_queue_rep::index(k)]; + __TBB_ASSERT(p,NULL); + size_t i = k/concurrent_queue_rep::n_queue & (my_queue.items_per_page-1); + item = static_cast(static_cast(p+1)) + my_queue.item_size*i; + return (p->mask & uintptr(1)<().allocate(1); + new( my_rep ) concurrent_queue_iterator_rep(queue); + size_t k = my_rep->head_counter; + if( !my_rep->get_item(my_item, k) ) advance(); +} + +void concurrent_queue_iterator_base_v3::assign( const concurrent_queue_iterator_base& other ) { + if( my_rep!=other.my_rep ) { + if( my_rep ) { + cache_aligned_allocator().deallocate(my_rep, 1); + my_rep = NULL; + } + if( other.my_rep ) { + my_rep = cache_aligned_allocator().allocate(1); + new( my_rep ) concurrent_queue_iterator_rep( *other.my_rep ); + } + } + my_item = other.my_item; +} + +void concurrent_queue_iterator_base_v3::advance() { + __TBB_ASSERT( my_item, "attempt to increment iterator past end of queue" ); + size_t k = my_rep->head_counter; + const concurrent_queue_base& queue = my_rep->my_queue; +#if TBB_USE_ASSERT + void* tmp; + my_rep->get_item(tmp,k); + __TBB_ASSERT( my_item==tmp, NULL ); +#endif /* TBB_USE_ASSERT */ + size_t i = k/concurrent_queue_rep::n_queue & (queue.items_per_page-1); + if( i==queue.items_per_page-1 ) { + concurrent_queue_base::page*& root = my_rep->array[concurrent_queue_rep::index(k)]; + root = root->next; + } + // advance k + my_rep->head_counter = ++k; + if( !my_rep->get_item(my_item, k) ) advance(); +} + +concurrent_queue_iterator_base_v3::~concurrent_queue_iterator_base_v3() { + //delete my_rep; + cache_aligned_allocator().deallocate(my_rep, 1); + my_rep = NULL; +} + +} // namespace internal + +} // namespace tbb diff --git a/dep/tbb/src/tbb/concurrent_vector.cpp b/dep/tbb/src/tbb/concurrent_vector.cpp new file mode 100644 index 000000000..7dc51f490 --- /dev/null +++ b/dep/tbb/src/tbb/concurrent_vector.cpp @@ -0,0 +1,574 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
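// Sketch of the snapshot check used by concurrent_queue's internal_empty()
// above (in concurrent_queue.cpp): read the tail counter, then the head
// counter, then the tail counter again; if the tail did not move, the two
// values describe a consistent instant, otherwise the queue was non-empty at
// some point between the reads. appears_empty and its parameters are invented
// names, and std::atomic stands in for tbb::atomic.
#include <atomic>
#include <cstddef>

bool appears_empty( const std::atomic<std::size_t>& head,
                    const std::atomic<std::size_t>& tail ) {
    std::size_t t = tail.load(std::memory_order_acquire);
    std::size_t h = head.load(std::memory_order_acquire);
    return t == tail.load(std::memory_order_acquire)
           && static_cast<std::ptrdiff_t>(t - h) <= 0;
}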
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/concurrent_vector.h" +#include "tbb/cache_aligned_allocator.h" +#include "tbb/tbb_exception.h" +#include "tbb_misc.h" +#include "itt_notify.h" +#include + +#if defined(_MSC_VER) && defined(_Wp64) + // Workaround for overzealous compiler warnings in /Wp64 mode + #pragma warning (disable: 4267) +#endif + +using namespace std; + +namespace tbb { + +namespace internal { + class concurrent_vector_base_v3::helper :no_assign { +public: + //! memory page size + static const size_type page_size = 4096; + + inline static bool incompact_predicate(size_type size) { // assert size != 0, see source/test/test_vector_layout.cpp + return size < page_size || ((size-1)%page_size < page_size/2 && size < page_size * 128); // for more details + } + + inline static size_type find_segment_end(const concurrent_vector_base_v3 &v) { + segment_t *s = v.my_segment; + segment_index_t u = s==v.my_storage? pointers_per_short_table : pointers_per_long_table; + segment_index_t k = 0; + while( k < u && s[k].array > internal::vector_allocation_error_flag ) + ++k; + return k; + } + + //! assign first segment size. k - is index of last segment to be allocated, not a count of segments + inline static void assign_first_segment_if_neccessary(concurrent_vector_base_v3 &v, segment_index_t k) { + if( !v.my_first_block ) { + /* There was a suggestion to set first segment according to incompact_predicate: + while( k && !helper::incompact_predicate(segment_size( k ) * element_size) ) + --k; // while previous vector size is compact, decrement + // reasons to not do it: + // * constructor(n) is not ready to accept fragmented segments + // * backward compatibility due to that constructor + // * current version gives additional guarantee and faster init. + // * two calls to reserve() will give the same effect. + */ + v.my_first_block.compare_and_swap(k+1, 0); // store number of segments + } + } + + inline static void *allocate_segment(concurrent_vector_base_v3 &v, size_type n) { + void *ptr = v.vector_allocator_ptr(v, n); + if(!ptr) throw bad_alloc(); // check for bad allocation, throw exception + return ptr; + } + + //! Publish segment so other threads can see it. + inline static void publish_segment( segment_t& s, void* rhs ) { + // see also itt_store_pointer_with_release_v3() + ITT_NOTIFY( sync_releasing, &s.array ); + __TBB_store_with_release( s.array, rhs ); + } + + static size_type enable_segment(concurrent_vector_base_v3 &v, size_type k, size_type element_size) { + segment_t* s = v.my_segment; // TODO: optimize out as argument? Optimize accesses to my_first_block + __TBB_ASSERT( s[k].array <= internal::vector_allocation_error_flag, "concurrent operation during growth?" ); + if( !k ) { + assign_first_segment_if_neccessary(v, default_initial_segments-1); + try { + publish_segment(s[0], allocate_segment(v, segment_size(v.my_first_block) ) ); + } catch(...) 
{ // intercept exception here, assign internal::vector_allocation_error_flag value, re-throw exception + publish_segment(s[0], internal::vector_allocation_error_flag); throw; + } + return 2; + } + size_type m = segment_size(k); + if( !v.my_first_block ) // push_back only + spin_wait_while_eq( v.my_first_block, segment_index_t(0) ); + if( k < v.my_first_block ) { + // s[0].array is changed only once ( 0 -> !0 ) and points to uninitialized memory + void *array0 = __TBB_load_with_acquire(s[0].array); + if( !array0 ) { + // sync_prepare called only if there is a wait + ITT_NOTIFY(sync_prepare, &s[0].array ); + spin_wait_while_eq( s[0].array, (void*)0 ); + array0 = __TBB_load_with_acquire(s[0].array); + } + ITT_NOTIFY(sync_acquired, &s[0].array); + if( array0 <= internal::vector_allocation_error_flag ) { // check for internal::vector_allocation_error_flag of initial segment + publish_segment(s[k], internal::vector_allocation_error_flag); // and assign internal::vector_allocation_error_flag here + throw bad_last_alloc(); // throw custom exception + } + publish_segment( s[k], + static_cast( static_cast(array0) + segment_base(k)*element_size ) + ); + } else { + try { + publish_segment(s[k], allocate_segment(v, m)); + } catch(...) { // intercept exception here, assign internal::vector_allocation_error_flag value, re-throw exception + publish_segment(s[k], internal::vector_allocation_error_flag); throw; + } + } + return m; + } + + inline static void extend_table_if_necessary(concurrent_vector_base_v3 &v, size_type k, size_type start ) { + if(k >= pointers_per_short_table && v.my_segment == v.my_storage) + extend_segment_table(v, start ); + } + + static void extend_segment_table(concurrent_vector_base_v3 &v, size_type start) { + if( start > segment_size(pointers_per_short_table) ) start = segment_size(pointers_per_short_table); + // If other threads are trying to set pointers in the short segment, wait for them to finish their + // assigments before we copy the short segment to the long segment. 
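// Illustrative sketch of the segment geometry that helper::first_segment()/
// next_segment() below iterate over, assuming the doubling layout implied by
// "sz <<= 1": segment 0 covers element indices [0,2) and each segment k >= 1
// covers [2^k, 2^(k+1)), so capacity doubles with every enabled segment.
// seg_base/seg_size are invented names, not the actual TBB helpers.
#include <cstddef>
#include <cstdio>

std::size_t seg_base( std::size_t k ) { return k ? std::size_t(1) << k : 0; }
std::size_t seg_size( std::size_t k ) { return k ? std::size_t(1) << k : 2; }

int main() {
    for( std::size_t k = 0; k < 5; ++k )
        std::printf("segment %u: indices [%u, %u)\n",
                    unsigned(k), unsigned(seg_base(k)), unsigned(seg_base(k) + seg_size(k)));
    return 0;
}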
Note: grow_to_at_least depends on it + for( segment_index_t i = 0; segment_base(i) < start && v.my_segment == v.my_storage; i++ ) + if(!v.my_storage[i].array) { + ITT_NOTIFY(sync_prepare, &v.my_storage[i].array); + atomic_backoff backoff; + do backoff.pause(); while( v.my_segment == v.my_storage && !v.my_storage[i].array ); + ITT_NOTIFY(sync_acquired, &v.my_storage[i].array); + } + if( v.my_segment != v.my_storage ) return; + + segment_t* s = (segment_t*)NFS_Allocate( pointers_per_long_table, sizeof(segment_t), NULL ); + // if( !s ) throw bad_alloc() -- implemented in NFS_Allocate + memset( s, 0, pointers_per_long_table*sizeof(segment_t) ); + for( segment_index_t i = 0; i < pointers_per_short_table; i++) + s[i] = v.my_storage[i]; + if( v.my_segment.compare_and_swap( s, v.my_storage ) != v.my_storage ) + NFS_Free( s ); + } + + inline static segment_t &acquire_segment(concurrent_vector_base_v3 &v, size_type index, size_type element_size, bool owner) { + segment_t &s = v.my_segment[index]; // TODO: pass v.my_segment as arument + if( !__TBB_load_with_acquire(s.array) ) { // do not check for internal::vector_allocation_error_flag + if( owner ) { + enable_segment( v, index, element_size ); + } else { + ITT_NOTIFY(sync_prepare, &s.array); + spin_wait_while_eq( s.array, (void*)0 ); + ITT_NOTIFY(sync_acquired, &s.array); + } + } else { + ITT_NOTIFY(sync_acquired, &s.array); + } + if( s.array <= internal::vector_allocation_error_flag ) // check for internal::vector_allocation_error_flag + throw bad_last_alloc(); // throw custom exception, because it's hard to recover after internal::vector_allocation_error_flag correctly + return s; + } + + ///// non-static fields of helper for exception-safe iteration across segments + segment_t *table;// TODO: review all segment_index_t as just short type + size_type first_block, k, sz, start, finish, element_size; + helper(segment_t *segments, size_type fb, size_type esize, size_type index, size_type s, size_type f) throw() + : table(segments), first_block(fb), k(index), sz(0), start(s), finish(f), element_size(esize) {} + inline void first_segment() throw() { + __TBB_ASSERT( start <= finish, NULL ); + __TBB_ASSERT( first_block || !finish, NULL ); + if( k < first_block ) k = 0; // process solid segment at a time + size_type base = segment_base( k ); + __TBB_ASSERT( base <= start, NULL ); + finish -= base; start -= base; // rebase as offsets from segment k + sz = k ? 
base : segment_size( first_block ); // sz==base for k>0 + } + inline void next_segment() throw() { + finish -= sz; start = 0; // offsets from next segment + if( !k ) k = first_block; + else { ++k; sz <<= 1; } + } + template + inline size_type apply(const F &func) { + first_segment(); + while( sz < finish ) { // work for more than one segment + func( table[k], static_cast(table[k].array)+element_size*start, sz-start ); + next_segment(); + } + func( table[k], static_cast(table[k].array)+element_size*start, finish-start ); + return k; + } + inline void *get_segment_ptr(size_type index, bool wait) { + segment_t &s = table[index]; + if( !__TBB_load_with_acquire(s.array) && wait ) { + ITT_NOTIFY(sync_prepare, &s.array); + spin_wait_while_eq( s.array, (void*)0 ); + ITT_NOTIFY(sync_acquired, &s.array); + } + return s.array; + } + ~helper() { + if( sz >= finish ) return; // the work is done correctly + if( !sz ) { // allocation failed, restore the table + segment_index_t k_start = k, k_end = segment_index_of(finish-1); + if( segment_base( k_start ) < start ) + get_segment_ptr(k_start++, true); // wait + if( k_start < first_block ) { + void *array0 = get_segment_ptr(0, start>0); // wait if necessary + if( array0 && !k_start ) ++k_start; + if( array0 <= internal::vector_allocation_error_flag ) + for(; k_start < first_block && k_start <= k_end; ++k_start ) + publish_segment(table[k_start], internal::vector_allocation_error_flag); + else for(; k_start < first_block && k_start <= k_end; ++k_start ) + publish_segment(table[k_start], static_cast( + static_cast(array0) + segment_base(k_start)*element_size) ); + } + for(; k_start <= k_end; ++k_start ) // not in first block + if( !__TBB_load_with_acquire(table[k_start].array) ) + publish_segment(table[k_start], internal::vector_allocation_error_flag); + // fill alocated items + first_segment(); + goto recover; + } + while( sz <= finish ) { // there is still work for at least one segment + next_segment(); +recover: + void *array = table[k].array; + if( array > internal::vector_allocation_error_flag ) + std::memset( static_cast(array)+element_size*start, 0, ((sz internal::vector_allocation_error_flag ) + func( begin, n ); + } + }; +}; + +concurrent_vector_base_v3::~concurrent_vector_base_v3() { + segment_t* s = my_segment; + if( s != my_storage ) { + // Clear short segment. + for( segment_index_t i = 0; i < pointers_per_short_table; i++) + my_storage[i].array = NULL; +#if TBB_USE_DEBUG + for( segment_index_t i = 0; i < pointers_per_long_table; i++) + __TBB_ASSERT( my_segment[i].array <= internal::vector_allocation_error_flag, "Segment should have been freed. 
Please recompile with new TBB before using exceptions."); +#endif + my_segment = my_storage; + NFS_Free( s ); + } +} + +concurrent_vector_base_v3::size_type concurrent_vector_base_v3::internal_capacity() const { + return segment_base( helper::find_segment_end(*this) ); +} + +void concurrent_vector_base_v3::internal_throw_exception(size_type t) const { + switch(t) { + case 0: throw out_of_range("Index out of requested size range"); + case 1: throw range_error ("Index out of allocated segment slots"); + case 2: throw range_error ("Index is not allocated"); + } +} + +void concurrent_vector_base_v3::internal_reserve( size_type n, size_type element_size, size_type max_size ) { + if( n>max_size ) { + throw length_error("argument to concurrent_vector::reserve exceeds concurrent_vector::max_size()"); + } + __TBB_ASSERT( n, NULL ); + helper::assign_first_segment_if_neccessary(*this, segment_index_of(n-1)); + segment_index_t k = helper::find_segment_end(*this); + try { + for( ; segment_base(k)= pointers_per_short_table) + || src.my_segment[k].array <= internal::vector_allocation_error_flag ) { + my_early_size = b; break; + } + helper::extend_table_if_necessary(*this, k, 0); + size_type m = helper::enable_segment(*this, k, element_size); + if( m > n-b ) m = n-b; + my_early_size = b+m; + copy( my_segment[k].array, src.my_segment[k].array, m ); + } + } +} + +void concurrent_vector_base_v3::internal_assign( const concurrent_vector_base_v3& src, size_type element_size, internal_array_op1 destroy, internal_array_op2 assign, internal_array_op2 copy ) { + size_type n = src.my_early_size; + while( my_early_size>n ) { // TODO: improve + segment_index_t k = segment_index_of( my_early_size-1 ); + size_type b=segment_base(k); + size_type new_end = b>=n ? b : n; + __TBB_ASSERT( my_early_size>new_end, NULL ); + if( my_segment[k].array <= internal::vector_allocation_error_flag) // check vector was broken before + throw bad_last_alloc(); // throw custom exception + // destructors are supposed to not throw any exceptions + destroy( (char*)my_segment[k].array+element_size*(new_end-b), my_early_size-new_end ); + my_early_size = new_end; + } + size_type dst_initialized_size = my_early_size; + my_early_size = n; + helper::assign_first_segment_if_neccessary(*this, segment_index_of(n)); + size_type b; + for( segment_index_t k=0; (b=segment_base(k))= pointers_per_short_table) + || src.my_segment[k].array <= internal::vector_allocation_error_flag ) { // if source is damaged + my_early_size = b; break; // TODO: it may cause undestructed items + } + helper::extend_table_if_necessary(*this, k, 0); + if( !my_segment[k].array ) + helper::enable_segment(*this, k, element_size); + else if( my_segment[k].array <= internal::vector_allocation_error_flag ) + throw bad_last_alloc(); // throw custom exception + size_type m = k? 
segment_size(k) : 2; + if( m > n-b ) m = n-b; + size_type a = 0; + if( dst_initialized_size>b ) { + a = dst_initialized_size-b; + if( a>m ) a = m; + assign( my_segment[k].array, src.my_segment[k].array, a ); + m -= a; + a *= element_size; + } + if( m>0 ) + copy( (char*)my_segment[k].array+a, (char*)src.my_segment[k].array+a, m ); + } + __TBB_ASSERT( src.my_early_size==n, "detected use of concurrent_vector::operator= with right side that was concurrently modified" ); +} + +void* concurrent_vector_base_v3::internal_push_back( size_type element_size, size_type& index ) { + __TBB_ASSERT( sizeof(my_early_size)==sizeof(uintptr), NULL ); + size_type tmp = __TBB_FetchAndIncrementWacquire(&my_early_size); + index = tmp; + segment_index_t k_old = segment_index_of( tmp ); + size_type base = segment_base(k_old); + helper::extend_table_if_necessary(*this, k_old, tmp); + segment_t& s = helper::acquire_segment(*this, k_old, element_size, base==tmp); + size_type j_begin = tmp-base; + return (void*)((char*)s.array+element_size*j_begin); +} + +void concurrent_vector_base_v3::internal_grow_to_at_least( size_type new_size, size_type element_size, internal_array_op2 init, const void *src ) { + internal_grow_to_at_least_with_result( new_size, element_size, init, src ); +} + +concurrent_vector_base_v3::size_type concurrent_vector_base_v3::internal_grow_to_at_least_with_result( size_type new_size, size_type element_size, internal_array_op2 init, const void *src ) { + size_type e = my_early_size; + while( e= pointers_per_short_table && my_segment == my_storage ) { + spin_wait_while_eq( my_segment, my_storage ); + } + for( i = 0; i <= k_old; ++i ) { + segment_t &s = my_segment[i]; + if(!s.array) { + ITT_NOTIFY(sync_prepare, &s.array); + atomic_backoff backoff; + do backoff.pause(); + while( !__TBB_load_with_acquire(my_segment[i].array) ); // my_segment may change concurrently + ITT_NOTIFY(sync_acquired, &s.array); + } + if( my_segment[i].array <= internal::vector_allocation_error_flag ) + throw bad_last_alloc(); + } +#if TBB_USE_DEBUG + size_type capacity = internal_capacity(); + __TBB_ASSERT( capacity >= new_size, NULL); +#endif + return e; +} + +concurrent_vector_base_v3::size_type concurrent_vector_base_v3::internal_grow_by( size_type delta, size_type element_size, internal_array_op2 init, const void *src ) { + size_type result = my_early_size.fetch_and_add(delta); + internal_grow( result, result+delta, element_size, init, src ); + return result; +} + +void concurrent_vector_base_v3::internal_grow( const size_type start, size_type finish, size_type element_size, internal_array_op2 init, const void *src ) { + __TBB_ASSERT( start k_start && k_end >= range.first_block; --k_end ) // allocate segments in reverse order + helper::acquire_segment(*this, k_end, element_size, true/*for k_end>k_start*/); + for(; k_start <= k_end; ++k_start ) // but allocate first block in straight order + helper::acquire_segment(*this, k_start, element_size, segment_base( k_start ) >= start ); + range.apply( helper::init_body(init, src) ); +} + +void concurrent_vector_base_v3::internal_resize( size_type n, size_type element_size, size_type max_size, const void *src, + internal_array_op1 destroy, internal_array_op2 init ) { + size_type j = my_early_size; + if( n > j ) { // construct items + internal_reserve(n, element_size, max_size); + my_early_size = n; + helper for_each(my_segment, my_first_block, element_size, segment_index_of(j), j, n); + for_each.apply( helper::safe_init_body(init, src) ); + } else { + my_early_size = n; + helper 
for_each(my_segment, my_first_block, element_size, segment_index_of(n), n, j); + for_each.apply( helper::destroy_body(destroy) ); + } +} + +concurrent_vector_base_v3::segment_index_t concurrent_vector_base_v3::internal_clear( internal_array_op1 destroy ) { + __TBB_ASSERT( my_segment, NULL ); + size_type j = my_early_size; + my_early_size = 0; + helper for_each(my_segment, my_first_block, 0, 0, 0, j); // element_size is safe to be zero if 'start' is zero + j = for_each.apply( helper::destroy_body(destroy) ); + size_type i = helper::find_segment_end(*this); + return j < i? i : j+1; +} + +void *concurrent_vector_base_v3::internal_compact( size_type element_size, void *table, internal_array_op1 destroy, internal_array_op2 copy ) +{ + const size_type my_size = my_early_size; + const segment_index_t k_end = helper::find_segment_end(*this); // allocated segments + const segment_index_t k_stop = my_size? segment_index_of(my_size-1) + 1 : 0; // number of segments to store existing items: 0=>0; 1,2=>1; 3,4=>2; [5-8]=>3;.. + const segment_index_t first_block = my_first_block; // number of merged segments, getting values from atomics + + segment_index_t k = first_block; + if(k_stop < first_block) + k = k_stop; + else + while (k < k_stop && helper::incompact_predicate(segment_size( k ) * element_size) ) k++; + if(k_stop == k_end && k == first_block) + return NULL; + + segment_t *const segment_table = my_segment; + internal_segments_table &old = *static_cast( table ); + memset(&old, 0, sizeof(old)); + + if ( k != first_block && k ) // first segment optimization + { + // exception can occur here + void *seg = old.table[0] = helper::allocate_segment( *this, segment_size(k) ); + old.first_block = k; // fill info for freeing new segment if exception occurs + // copy items to the new segment + size_type my_segment_size = segment_size( first_block ); + for (segment_index_t i = 0, j = 0; i < k && j < my_size; j = my_segment_size) { + __TBB_ASSERT( segment_table[i].array > internal::vector_allocation_error_flag, NULL); + void *s = static_cast( + static_cast(seg) + segment_base(i)*element_size ); + if(j + my_segment_size >= my_size) my_segment_size = my_size - j; + try { // exception can occur here + copy( s, segment_table[i].array, my_segment_size ); + } catch(...) { // destroy all the already copied items + helper for_each(reinterpret_cast(&old.table[0]), old.first_block, element_size, + 0, 0, segment_base(i)+my_segment_size); + for_each.apply( helper::destroy_body(destroy) ); + throw; + } + my_segment_size = i? segment_size( ++i ) : segment_size( i = first_block ); + } + // commit the changes + memcpy(old.table, segment_table, k * sizeof(segment_t)); + for (segment_index_t i = 0; i < k; i++) { + segment_table[i].array = static_cast( + static_cast(seg) + segment_base(i)*element_size ); + } + old.first_block = first_block; my_first_block = k; // now, first_block != my_first_block + // destroy original copies + my_segment_size = segment_size( first_block ); // old.first_block actually + for (segment_index_t i = 0, j = 0; i < k && j < my_size; j = my_segment_size) { + if(j + my_segment_size >= my_size) my_segment_size = my_size - j; + // destructors are supposed to not throw any exceptions + destroy( old.table[i], my_segment_size ); + my_segment_size = i? 
segment_size( ++i ) : segment_size( i = first_block ); + } + } + // free unnecessary segments allocated by reserve() call + if ( k_stop < k_end ) { + old.first_block = first_block; + memcpy(old.table+k_stop, segment_table+k_stop, (k_end-k_stop) * sizeof(segment_t)); + memset(segment_table+k_stop, 0, (k_end-k_stop) * sizeof(segment_t)); + if( !k ) my_first_block = 0; + } + return table; +} + +void concurrent_vector_base_v3::internal_swap(concurrent_vector_base_v3& v) +{ + size_type my_sz = my_early_size, v_sz = v.my_early_size; + if(!my_sz && !v_sz) return; + size_type tmp = my_first_block; my_first_block = v.my_first_block; v.my_first_block = tmp; + bool my_short = (my_segment == my_storage), v_short = (v.my_segment == v.my_storage); + if ( my_short && v_short ) { // swap both tables + char tbl[pointers_per_short_table * sizeof(segment_t)]; + memcpy(tbl, my_storage, pointers_per_short_table * sizeof(segment_t)); + memcpy(my_storage, v.my_storage, pointers_per_short_table * sizeof(segment_t)); + memcpy(v.my_storage, tbl, pointers_per_short_table * sizeof(segment_t)); + } + else if ( my_short ) { // my -> v + memcpy(v.my_storage, my_storage, pointers_per_short_table * sizeof(segment_t)); + my_segment = v.my_segment; v.my_segment = v.my_storage; + } + else if ( v_short ) { // v -> my + memcpy(my_storage, v.my_storage, pointers_per_short_table * sizeof(segment_t)); + v.my_segment = my_segment; my_segment = my_storage; + } else { + segment_t *ptr = my_segment; my_segment = v.my_segment; v.my_segment = ptr; + } + my_early_size = v_sz; v.my_early_size = my_sz; +} + +} // namespace internal + +} // tbb diff --git a/dep/tbb/src/tbb/dynamic_link.cpp b/dep/tbb/src/tbb/dynamic_link.cpp new file mode 100644 index 000000000..f6de51099 --- /dev/null +++ b/dep/tbb/src/tbb/dynamic_link.cpp @@ -0,0 +1,133 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#include "dynamic_link.h" + +#ifndef LIBRARY_ASSERT +#include "tbb/tbb_stddef.h" +#define LIBRARY_ASSERT(x,y) __TBB_ASSERT(x,y) +#endif /* LIBRARY_ASSERT */ + +#if _WIN32||_WIN64 + #include /* alloca */ +#else + #include +#if __FreeBSD__ + #include /* alloca */ +#else + #include +#endif +#endif + +OPEN_INTERNAL_NAMESPACE + +#if __TBB_WEAK_SYMBOLS + +bool dynamic_link( void*, const dynamic_link_descriptor descriptors[], size_t n, size_t required ) +{ + if ( required == ~(size_t)0 ) + required = n; + LIBRARY_ASSERT( required<=n, "Number of required entry points exceeds their total number" ); + size_t k = 0; + // Check if the first required entries are present in what was loaded into our process + while ( k < required && descriptors[k].ptr ) + ++k; + if ( k < required ) + return false; + // Commit all the entry points. + for ( k = 0; k < n; ++k ) + *descriptors[k].handler = (pointer_to_handler) descriptors[k].ptr; + return true; +} + +#else /* !__TBB_WEAK_SYMBOLS */ + +bool dynamic_link( void* module, const dynamic_link_descriptor descriptors[], size_t n, size_t required ) +{ + pointer_to_handler *h = (pointer_to_handler*)alloca(n * sizeof(pointer_to_handler)); + if ( required == ~(size_t)0 ) + required = n; + LIBRARY_ASSERT( required<=n, "Number of required entry points exceeds their total number" ); + size_t k = 0; + for ( ; k < n; ++k ) { +#if _WIN32||_WIN64 + h[k] = pointer_to_handler(GetProcAddress( (HMODULE)module, descriptors[k].name )); +#else + // Lvalue casting is used; this way icc -strict-ansi does not warn about nonstandard pointer conversion + (void *&)h[k] = dlsym( module, descriptors[k].name ); +#endif /* _WIN32||_WIN64 */ + if ( !h[k] && k < required ) + return false; + } + LIBRARY_ASSERT( k == n, "if required entries are initialized, all entries are expected to be walked"); + // Commit the entry points. + // Cannot use memset here, because the writes must be atomic. + for( k = 0; k < n; ++k ) + *descriptors[k].handler = h[k]; + return true; +} + +#endif /* !__TBB_WEAK_SYMBOLS */ +bool dynamic_link( const char* library, const dynamic_link_descriptor descriptors[], size_t n, size_t required, dynamic_link_handle* handle ) +{ +#if _WIN32||_WIN64 + // Interpret non-NULL handle parameter as request to really link against another library. + if ( !handle && dynamic_link( GetModuleHandle(NULL), descriptors, n, required ) ) + // Target library was statically linked into this executable + return true; + // Prevent Windows from displaying silly message boxes if it fails to load library + // (e.g. because of MS runtime problems - one of those crazy manifest related ones) + UINT prev_mode = SetErrorMode (SEM_FAILCRITICALERRORS); + dynamic_link_handle module = LoadLibrary (library); + SetErrorMode (prev_mode); +#else + dynamic_link_handle module = dlopen( library, RTLD_LAZY ); +#endif /* _WIN32||_WIN64 */ + if( module ) { + if( !dynamic_link( module, descriptors, n, required ) ) { + // Return true if the library is there and it contains all the expected entry points. 
+ dynamic_unlink(module); + module = NULL; + } + } + if( handle ) + *handle = module; + return module!=NULL; +} + +void dynamic_unlink( dynamic_link_handle handle ) { + if( handle ) { +#if _WIN32||_WIN64 + FreeLibrary( handle ); +#else + dlclose( handle ); +#endif /* _WIN32||_WIN64 */ + } +} + +CLOSE_INTERNAL_NAMESPACE diff --git a/dep/tbb/src/tbb/dynamic_link.h b/dep/tbb/src/tbb/dynamic_link.h new file mode 100644 index 000000000..1439eca7e --- /dev/null +++ b/dep/tbb/src/tbb/dynamic_link.h @@ -0,0 +1,102 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __TBB_dynamic_link +#define __TBB_dynamic_link + +// Support for dynamically linking to a shared library. +// By default, the symbols defined here go in namespace tbb::internal. +// The symbols can be put in another namespace by defining the preprocessor +// symbols OPEN_INTERNAL_NAMESPACE and CLOSE_INTERNAL_NAMESPACE to open and +// close the other namespace. See default definition below for an example. + +#ifndef OPEN_INTERNAL_NAMESPACE +#define OPEN_INTERNAL_NAMESPACE namespace tbb { namespace internal { +#define CLOSE_INTERNAL_NAMESPACE }} +#endif /* OPEN_INTERNAL_NAMESPACE */ + +#include +#if _WIN32||_WIN64 +#include +#endif /* _WIN32||_WIN64 */ + +OPEN_INTERNAL_NAMESPACE + +//! Type definition for a pointer to a void somefunc(void) +typedef void (*pointer_to_handler)(); + +// Double cast through the void* from func_ptr in DLD macro is necessary to +// prevent warnings from some compilers (g++ 4.1) +#if __TBB_WEAK_SYMBOLS + +#define DLD(s,h) {(pointer_to_handler)&s, (pointer_to_handler*)(void*)(&h)} +//! Association between a handler name and location of pointer to it. +struct dynamic_link_descriptor { + //! pointer to the handler + pointer_to_handler ptr; + //! Pointer to the handler + pointer_to_handler* handler; +}; + +#else /* !__TBB_WEAK_SYMBOLS */ + +#define DLD(s,h) {#s, (pointer_to_handler*)(void*)(&h)} +//! Association between a handler name and location of pointer to it. +struct dynamic_link_descriptor { + //! Name of the handler + const char* name; + //! 
Pointer to the handler + pointer_to_handler* handler; +}; + +#endif /* !__TBB_WEAK_SYMBOLS */ + +#if _WIN32||_WIN64 +typedef HMODULE dynamic_link_handle; +#else +typedef void* dynamic_link_handle; +#endif /* _WIN32||_WIN64 */ + +//! Fill in dynamically linked handlers. +/** 'n' is the length of the array descriptors[]. + 'required' is the number of the initial entries in the array descriptors[] + that have to be found in order for the call to succeed. If the library and + all the required handlers are found, then the corresponding handler pointers + are set, and the return value is true. Otherwise the original array of + descriptors is left untouched and the return value is false. **/ +bool dynamic_link( const char* libraryname, + const dynamic_link_descriptor descriptors[], + size_t n, + size_t required = ~(size_t)0, + dynamic_link_handle* handle = 0 ); + +void dynamic_unlink( dynamic_link_handle handle ); + +CLOSE_INTERNAL_NAMESPACE + +#endif /* __TBB_dynamic_link */ diff --git a/dep/tbb/src/tbb/enumerable_thread_specific.cpp b/dep/tbb/src/tbb/enumerable_thread_specific.cpp new file mode 100644 index 000000000..f576fb3b6 --- /dev/null +++ b/dep/tbb/src/tbb/enumerable_thread_specific.cpp @@ -0,0 +1,172 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/enumerable_thread_specific.h" +#include "tbb/concurrent_queue.h" +#include "tbb/cache_aligned_allocator.h" +#include "tbb/atomic.h" +#include "tbb/spin_mutex.h" + +namespace tbb { + + namespace internal { + + // Manages fake TLS keys and fake TLS space + // Uses only a single native TLS key through use of an enumerable_thread_specific< ... 
, ets_key_per_instance > + class tls_single_key_manager { + + // Typedefs + typedef concurrent_vector local_vector_type; + typedef enumerable_thread_specific< local_vector_type, cache_aligned_allocator, ets_key_per_instance > my_ets_type; + typedef local_vector_type::size_type fake_key_t; + + // The fake TLS space + my_ets_type my_vectors; + + // The next never-yet-assigned fake TLS key + atomic< fake_key_t > next_key; + + // A Q of fake TLS keys that can be reused + typedef spin_mutex free_mutex_t; + free_mutex_t free_mutex; + + struct free_node_t { + fake_key_t key; + free_node_t *next; + }; + + cache_aligned_allocator< free_node_t > my_allocator; + free_node_t *free_stack; + + bool pop_if_present( fake_key_t &k ) { + free_node_t *n = NULL; + { + free_mutex_t::scoped_lock(free_mutex); + n = free_stack; + if (n) free_stack = n->next; + } + if ( n ) { + k = n->key; + my_allocator.deallocate(n,1); + return true; + } + return false; + } + + void push( fake_key_t &k ) { + free_node_t *n = my_allocator.allocate(1); + n->key = k; + { + free_mutex_t::scoped_lock(free_mutex); + n->next = free_stack; + free_stack = n; + } + } + + public: + + tls_single_key_manager() : free_stack(NULL) { + next_key = 0; + } + + ~tls_single_key_manager() { + free_node_t *n = free_stack; + while (n != NULL) { + free_node_t *next = n->next; + my_allocator.deallocate(n,1); + n = next; + } + } + + // creates or finds an available fake TLS key + inline void create_key( fake_key_t &k ) { + if ( !(free_stack && pop_if_present( k )) ) { + k = next_key.fetch_and_add(1); + } + } + + // resets the fake TLS space associated with the key and then recycles the key + inline void destroy_key( fake_key_t &k ) { + for ( my_ets_type::iterator i = my_vectors.begin(); i != my_vectors.end(); ++i ) { + local_vector_type &ivec = *i; + if (ivec.size() > k) + ivec[k] = NULL; + } + push(k); + } + + // sets the fake TLS space to point to the given value for this thread + inline void set_tls( fake_key_t &k, void *value ) { + local_vector_type &my_vector = my_vectors.local(); + local_vector_type::size_type size = my_vector.size(); + + if ( size <= k ) { + // We use grow_by so that we can initialize the pointers to NULL + my_vector.grow_by( k - size + 1, NULL ); + } + my_vector[k] = value; + } + + inline void *get_tls( fake_key_t &k ) { + local_vector_type &my_vector = my_vectors.local(); + if (my_vector.size() > k) + return my_vector[k]; + else + return NULL; + } + + }; + + // The single static instance of tls_single_key_manager + static tls_single_key_manager tls_key_manager; + + // The EXPORTED functions + void + tls_single_key_manager_v4::create_key( tls_key_t &k) { + tls_key_manager.create_key( k ); + } + + void + tls_single_key_manager_v4::destroy_key( tls_key_t &k) { + tls_key_manager.destroy_key( k ); + } + + void + tls_single_key_manager_v4::set_tls( tls_key_t &k, void *value) { + tls_key_manager.set_tls( k, value); + } + + void * + tls_single_key_manager_v4::get_tls( tls_key_t &k ) { + return tls_key_manager.get_tls( k ); + } + + } + +} + diff --git a/dep/tbb/src/tbb/gate.h b/dep/tbb/src/tbb/gate.h new file mode 100644 index 000000000..fb1283621 --- /dev/null +++ b/dep/tbb/src/tbb/gate.h @@ -0,0 +1,221 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. 
+ + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef _TBB_Gate_H +#define _TBB_Gate_H + +#include "itt_notify.h" + +namespace tbb { + +namespace internal { + +#if __TBB_RML +//! Fake version of Gate for use with RML. +/** Really just an atomic intptr_t with a compare-and-swap operation, + but wrapped in syntax that makes it look like a normal Gate object, + in order to minimize source changes for RML in task.cpp. */ +class Gate { +public: + typedef intptr_t state_t; + + //! Get current state of gate + state_t get_state() const { + return state; + } + +#if defined(_MSC_VER) && defined(_Wp64) + // Workaround for overzealous compiler warnings in /Wp64 mode + #pragma warning (disable: 4244) +#endif + + bool try_update( intptr_t value, intptr_t comparand ) { + return state.compare_and_swap(value,comparand)==comparand; + } +private: + atomic state; +}; + +#elif __TBB_USE_FUTEX + +//! Implementation of Gate based on futex. +/** Use this futex-based implementation where possible, because it is the simplest and usually fastest. */ +class Gate { +public: + typedef intptr_t state_t; + + Gate() { + ITT_SYNC_CREATE(&state, SyncType_Scheduler, SyncObj_Gate); + } + + //! Get current state of gate + state_t get_state() const { + return state; + } + //! Update state=value if state==comparand (flip==false) or state!=comparand (flip==true) + void try_update( intptr_t value, intptr_t comparand, bool flip=false ) { + __TBB_ASSERT( comparand!=0 || value!=0, "either value or comparand must be non-zero" ); +retry: + state_t old_state = state; + // First test for condition without using atomic operation + if( flip ? old_state!=comparand : old_state==comparand ) { + // Now atomically retest condition and set. + state_t s = state.compare_and_swap( value, old_state ); + if( s==old_state ) { + // compare_and_swap succeeded + if( value!=0 ) + futex_wakeup_all( &state ); // Update was successful and new state is not SNAPSHOT_EMPTY + } else { + // compare_and_swap failed. But for != case, failure may be spurious for our purposes if + // the value there is nonetheless not equal to value. This is a fairly rare event, so + // there is no need for backoff. In event of such a failure, we must retry. + if( flip && s!=value ) + goto retry; + } + } + } + //! Wait for state!=0. + void wait() { + if( state==0 ) + futex_wait( &state, 0 ); + } +private: + atomic state; +}; + +#elif USE_WINTHREAD + +class Gate { +public: + typedef intptr_t state_t; +private: + //! If state==0, then thread executing wait() suspend until state becomes non-zero. 
+ state_t state; + CRITICAL_SECTION critical_section; + HANDLE event; +public: + //! Initialize with count=0 + Gate() : state(0) { + event = CreateEvent( NULL, true, false, NULL ); + InitializeCriticalSection( &critical_section ); + ITT_SYNC_CREATE(&event, SyncType_Scheduler, SyncObj_Gate); + ITT_SYNC_CREATE(&critical_section, SyncType_Scheduler, SyncObj_GateLock); + } + ~Gate() { + // Fake prepare/acquired pair for Intel(R) Parallel Amplifier to correctly attribute the operations below + ITT_NOTIFY( sync_prepare, &event ); + CloseHandle( event ); + DeleteCriticalSection( &critical_section ); + ITT_NOTIFY( sync_acquired, &event ); + } + //! Get current state of gate + state_t get_state() const { + return state; + } + //! Update state=value if state==comparand (flip==false) or state!=comparand (flip==true) + void try_update( intptr_t value, intptr_t comparand, bool flip=false ) { + __TBB_ASSERT( comparand!=0 || value!=0, "either value or comparand must be non-zero" ); + EnterCriticalSection( &critical_section ); + state_t old = state; + if( flip ? old!=comparand : old==comparand ) { + state = value; + if( !old ) + SetEvent( event ); + else if( !value ) + ResetEvent( event ); + } + LeaveCriticalSection( &critical_section ); + } + //! Wait for state!=0. + void wait() { + if( state==0 ) { + WaitForSingleObject( event, INFINITE ); + } + } +}; + +#elif USE_PTHREAD + +class Gate { +public: + typedef intptr_t state_t; +private: + //! If state==0, then thread executing wait() suspend until state becomes non-zero. + state_t state; + pthread_mutex_t mutex; + pthread_cond_t cond; +public: + //! Initialize with count=0 + Gate() : state(0) + { + pthread_mutex_init( &mutex, NULL ); + pthread_cond_init( &cond, NULL); + ITT_SYNC_CREATE(&cond, SyncType_Scheduler, SyncObj_Gate); + ITT_SYNC_CREATE(&mutex, SyncType_Scheduler, SyncObj_GateLock); + } + ~Gate() { + pthread_cond_destroy( &cond ); + pthread_mutex_destroy( &mutex ); + } + //! Get current state of gate + state_t get_state() const { + return state; + } + //! Update state=value if state==comparand (flip==false) or state!=comparand (flip==true) + void try_update( intptr_t value, intptr_t comparand, bool flip=false ) { + __TBB_ASSERT( comparand!=0 || value!=0, "either value or comparand must be non-zero" ); + pthread_mutex_lock( &mutex ); + state_t old = state; + if( flip ? old!=comparand : old==comparand ) { + state = value; + if( !old ) + pthread_cond_broadcast( &cond ); + } + pthread_mutex_unlock( &mutex ); + } + //! Wait for state!=0. + void wait() { + if( state==0 ) { + pthread_mutex_lock( &mutex ); + while( state==0 ) { + pthread_cond_wait( &cond, &mutex ); + } + pthread_mutex_unlock( &mutex ); + } + } +}; + +#else +#error Must define USE_PTHREAD or USE_WINTHREAD +#endif /* threading kind */ + +} // namespace Internal + +} // namespace ThreadingBuildingBlocks + +#endif /* _TBB_Gate_H */ diff --git a/dep/tbb/src/tbb/ia32-masm/atomic_support.asm b/dep/tbb/src/tbb/ia32-masm/atomic_support.asm new file mode 100644 index 000000000..e22bc1caf --- /dev/null +++ b/dep/tbb/src/tbb/ia32-masm/atomic_support.asm @@ -0,0 +1,196 @@ +; Copyright 2005-2009 Intel Corporation. All Rights Reserved. +; +; This file is part of Threading Building Blocks. +; +; Threading Building Blocks is free software; you can redistribute it +; and/or modify it under the terms of the GNU General Public License +; version 2 as published by the Free Software Foundation. 
+; +; Threading Building Blocks is distributed in the hope that it will be +; useful, but WITHOUT ANY WARRANTY; without even the implied warranty +; of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +; GNU General Public License for more details. +; +; You should have received a copy of the GNU General Public License +; along with Threading Building Blocks; if not, write to the Free Software +; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +; +; As a special exception, you may use this file as part of a free software +; library without restriction. Specifically, if other files instantiate +; templates or use macros or inline functions from this file, or you compile +; this file and link it with other files to produce an executable, this +; file does not by itself cause the resulting executable to be covered by +; the GNU General Public License. This exception does not however +; invalidate any other reasons why the executable file might be covered by +; the GNU General Public License. + +.686 +.model flat,c +.code + ALIGN 4 + PUBLIC c __TBB_machine_fetchadd1 +__TBB_machine_fetchadd1: + mov edx,4[esp] + mov eax,8[esp] + lock xadd [edx],al + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_fetchstore1 +__TBB_machine_fetchstore1: + mov edx,4[esp] + mov eax,8[esp] + lock xchg [edx],al + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_cmpswp1 +__TBB_machine_cmpswp1: + mov edx,4[esp] + mov ecx,8[esp] + mov eax,12[esp] + lock cmpxchg [edx],cl + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_fetchadd2 +__TBB_machine_fetchadd2: + mov edx,4[esp] + mov eax,8[esp] + lock xadd [edx],ax + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_fetchstore2 +__TBB_machine_fetchstore2: + mov edx,4[esp] + mov eax,8[esp] + lock xchg [edx],ax + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_cmpswp2 +__TBB_machine_cmpswp2: + mov edx,4[esp] + mov ecx,8[esp] + mov eax,12[esp] + lock cmpxchg [edx],cx + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_fetchadd4 +__TBB_machine_fetchadd4: + mov edx,4[esp] + mov eax,8[esp] + lock xadd [edx],eax + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_fetchstore4 +__TBB_machine_fetchstore4: + mov edx,4[esp] + mov eax,8[esp] + lock xchg [edx],eax + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_cmpswp4 +__TBB_machine_cmpswp4: + mov edx,4[esp] + mov ecx,8[esp] + mov eax,12[esp] + lock cmpxchg [edx],ecx + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_fetchadd8 +__TBB_machine_fetchadd8: + push ebx + push edi + mov edi,12[esp] + mov eax,[edi] + mov edx,4[edi] +__TBB_machine_fetchadd8_loop: + mov ebx,16[esp] + mov ecx,20[esp] + add ebx,eax + adc ecx,edx + lock cmpxchg8b qword ptr [edi] + jnz __TBB_machine_fetchadd8_loop + pop edi + pop ebx + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_fetchstore8 +__TBB_machine_fetchstore8: + push ebx + push edi + mov edi,12[esp] + mov ebx,16[esp] + mov ecx,20[esp] + mov eax,[edi] + mov edx,4[edi] +__TBB_machine_fetchstore8_loop: + lock cmpxchg8b qword ptr [edi] + jnz __TBB_machine_fetchstore8_loop + pop edi + pop ebx + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_cmpswp8 +__TBB_machine_cmpswp8: + push ebx + push edi + mov edi,12[esp] + mov ebx,16[esp] + mov ecx,20[esp] + mov eax,24[esp] + mov edx,28[esp] + lock cmpxchg8b qword ptr [edi] + pop edi + pop ebx + ret +.code + ALIGN 4 + PUBLIC c __TBB_machine_load8 +__TBB_machine_Load8: + ; If location is on stack, compiler may have failed to align it correctly, so we do dynamic check. 
+ mov ecx,4[esp] + test ecx,7 + jne load_slow + ; Load within a cache line + sub esp,12 + fild qword ptr [ecx] + fistp qword ptr [esp] + mov eax,[esp] + mov edx,4[esp] + add esp,12 + ret +load_slow: + ; Load is misaligned. Use cmpxchg8b. + push ebx + push edi + mov edi,ecx + xor eax,eax + xor ebx,ebx + xor ecx,ecx + xor edx,edx + lock cmpxchg8b qword ptr [edi] + pop edi + pop ebx + ret +EXTRN __TBB_machine_store8_slow:PROC +.code + ALIGN 4 + PUBLIC c __TBB_machine_store8 +__TBB_machine_Store8: + ; If location is on stack, compiler may have failed to align it correctly, so we do dynamic check. + mov ecx,4[esp] + test ecx,7 + jne __TBB_machine_store8_slow ;; tail call to tbb_misc.cpp + fild qword ptr 8[esp] + fistp qword ptr [ecx] + ret +end diff --git a/dep/tbb/src/tbb/ia32-masm/lock_byte.asm b/dep/tbb/src/tbb/ia32-masm/lock_byte.asm new file mode 100644 index 000000000..4f560c487 --- /dev/null +++ b/dep/tbb/src/tbb/ia32-masm/lock_byte.asm @@ -0,0 +1,46 @@ +; Copyright 2005-2009 Intel Corporation. All Rights Reserved. +; +; This file is part of Threading Building Blocks. +; +; Threading Building Blocks is free software; you can redistribute it +; and/or modify it under the terms of the GNU General Public License +; version 2 as published by the Free Software Foundation. +; +; Threading Building Blocks is distributed in the hope that it will be +; useful, but WITHOUT ANY WARRANTY; without even the implied warranty +; of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +; GNU General Public License for more details. +; +; You should have received a copy of the GNU General Public License +; along with Threading Building Blocks; if not, write to the Free Software +; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +; +; As a special exception, you may use this file as part of a free software +; library without restriction. Specifically, if other files instantiate +; templates or use macros or inline functions from this file, or you compile +; this file and link it with other files to produce an executable, this +; file does not by itself cause the resulting executable to be covered by +; the GNU General Public License. This exception does not however +; invalidate any other reasons why the executable file might be covered by +; the GNU General Public License. + +; DO NOT EDIT - AUTOMATICALLY GENERATED FROM .s FILE +.686 +.model flat,c +.code + ALIGN 4 + PUBLIC c __TBB_machine_trylockbyte +__TBB_machine_trylockbyte: + mov edx,4[esp] + mov al,[edx] + mov cl,1 + test al,1 + jnz __TBB_machine_trylockbyte_contended + lock cmpxchg [edx],cl + jne __TBB_machine_trylockbyte_contended + mov eax,1 + ret +__TBB_machine_trylockbyte_contended: + xor eax,eax + ret +end diff --git a/dep/tbb/src/tbb/ia64-gas/atomic_support.s b/dep/tbb/src/tbb/ia64-gas/atomic_support.s new file mode 100644 index 000000000..17502894f --- /dev/null +++ b/dep/tbb/src/tbb/ia64-gas/atomic_support.s @@ -0,0 +1,678 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. 
+// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + +// DO NOT EDIT - AUTOMATICALLY GENERATED FROM tools/generate_atomic/ipf_generate.sh +# 1 "" +# 1 "" +# 1 "" +# 1 "" + + + + + + .section .text + .align 16 + + + .proc __TBB_machine_fetchadd1__TBB_full_fence# + .global __TBB_machine_fetchadd1__TBB_full_fence# +__TBB_machine_fetchadd1__TBB_full_fence: +{ + mf + br __TBB_machine_fetchadd1acquire +} + .endp __TBB_machine_fetchadd1__TBB_full_fence# + + .proc __TBB_machine_fetchadd1acquire# + .global __TBB_machine_fetchadd1acquire# +__TBB_machine_fetchadd1acquire: + + + + + + + + ld1 r9=[r32] +;; +Retry_1acquire: + mov ar.ccv=r9 + mov r8=r9; + add r10=r9,r33 +;; + cmpxchg1.acq r9=[r32],r10,ar.ccv +;; + cmp.ne p7,p0=r8,r9 + (p7) br.cond.dpnt Retry_1acquire + br.ret.sptk.many b0 +# 49 "" + .endp __TBB_machine_fetchadd1acquire# +# 62 "" + .section .text + .align 16 + .proc __TBB_machine_fetchstore1__TBB_full_fence# + .global __TBB_machine_fetchstore1__TBB_full_fence# +__TBB_machine_fetchstore1__TBB_full_fence: + mf +;; + xchg1 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore1__TBB_full_fence# + + + .proc __TBB_machine_fetchstore1acquire# + .global __TBB_machine_fetchstore1acquire# +__TBB_machine_fetchstore1acquire: + xchg1 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore1acquire# +# 88 "" + .section .text + .align 16 + + + .proc __TBB_machine_cmpswp1__TBB_full_fence# + .global __TBB_machine_cmpswp1__TBB_full_fence# +__TBB_machine_cmpswp1__TBB_full_fence: +{ + mf + br __TBB_machine_cmpswp1acquire +} + .endp __TBB_machine_cmpswp1__TBB_full_fence# + + .proc __TBB_machine_cmpswp1acquire# + .global __TBB_machine_cmpswp1acquire# +__TBB_machine_cmpswp1acquire: + + zxt1 r34=r34 +;; + + mov ar.ccv=r34 +;; + cmpxchg1.acq r8=[r32],r33,ar.ccv + br.ret.sptk.many b0 + .endp __TBB_machine_cmpswp1acquire# +// DO NOT EDIT - AUTOMATICALLY GENERATED FROM tools/generate_atomic/ipf_generate.sh +# 1 "" +# 1 "" +# 1 "" +# 1 "" + + + + + + .section .text + .align 16 + + + .proc __TBB_machine_fetchadd2__TBB_full_fence# + .global __TBB_machine_fetchadd2__TBB_full_fence# +__TBB_machine_fetchadd2__TBB_full_fence: +{ + mf + br __TBB_machine_fetchadd2acquire +} + .endp __TBB_machine_fetchadd2__TBB_full_fence# + + .proc __TBB_machine_fetchadd2acquire# + .global __TBB_machine_fetchadd2acquire# +__TBB_machine_fetchadd2acquire: + + + + + + + + ld2 r9=[r32] +;; +Retry_2acquire: + mov ar.ccv=r9 + mov r8=r9; + add r10=r9,r33 +;; + cmpxchg2.acq r9=[r32],r10,ar.ccv +;; + cmp.ne p7,p0=r8,r9 + (p7) br.cond.dpnt Retry_2acquire + br.ret.sptk.many b0 +# 49 "" + .endp __TBB_machine_fetchadd2acquire# +# 62 "" + .section .text + .align 16 + .proc __TBB_machine_fetchstore2__TBB_full_fence# + .global __TBB_machine_fetchstore2__TBB_full_fence# 
+__TBB_machine_fetchstore2__TBB_full_fence: + mf +;; + xchg2 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore2__TBB_full_fence# + + + .proc __TBB_machine_fetchstore2acquire# + .global __TBB_machine_fetchstore2acquire# +__TBB_machine_fetchstore2acquire: + xchg2 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore2acquire# +# 88 "" + .section .text + .align 16 + + + .proc __TBB_machine_cmpswp2__TBB_full_fence# + .global __TBB_machine_cmpswp2__TBB_full_fence# +__TBB_machine_cmpswp2__TBB_full_fence: +{ + mf + br __TBB_machine_cmpswp2acquire +} + .endp __TBB_machine_cmpswp2__TBB_full_fence# + + .proc __TBB_machine_cmpswp2acquire# + .global __TBB_machine_cmpswp2acquire# +__TBB_machine_cmpswp2acquire: + + zxt2 r34=r34 +;; + + mov ar.ccv=r34 +;; + cmpxchg2.acq r8=[r32],r33,ar.ccv + br.ret.sptk.many b0 + .endp __TBB_machine_cmpswp2acquire# +// DO NOT EDIT - AUTOMATICALLY GENERATED FROM tools/generate_atomic/ipf_generate.sh +# 1 "" +# 1 "" +# 1 "" +# 1 "" + + + + + + .section .text + .align 16 + + + .proc __TBB_machine_fetchadd4__TBB_full_fence# + .global __TBB_machine_fetchadd4__TBB_full_fence# +__TBB_machine_fetchadd4__TBB_full_fence: +{ + mf + br __TBB_machine_fetchadd4acquire +} + .endp __TBB_machine_fetchadd4__TBB_full_fence# + + .proc __TBB_machine_fetchadd4acquire# + .global __TBB_machine_fetchadd4acquire# +__TBB_machine_fetchadd4acquire: + + cmp.eq p6,p0=1,r33 + cmp.eq p8,p0=-1,r33 + (p6) br.cond.dptk Inc_4acquire + (p8) br.cond.dpnt Dec_4acquire +;; + + ld4 r9=[r32] +;; +Retry_4acquire: + mov ar.ccv=r9 + mov r8=r9; + add r10=r9,r33 +;; + cmpxchg4.acq r9=[r32],r10,ar.ccv +;; + cmp.ne p7,p0=r8,r9 + (p7) br.cond.dpnt Retry_4acquire + br.ret.sptk.many b0 + +Inc_4acquire: + fetchadd4.acq r8=[r32],1 + br.ret.sptk.many b0 +Dec_4acquire: + fetchadd4.acq r8=[r32],-1 + br.ret.sptk.many b0 + + .endp __TBB_machine_fetchadd4acquire# +# 62 "" + .section .text + .align 16 + .proc __TBB_machine_fetchstore4__TBB_full_fence# + .global __TBB_machine_fetchstore4__TBB_full_fence# +__TBB_machine_fetchstore4__TBB_full_fence: + mf +;; + xchg4 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore4__TBB_full_fence# + + + .proc __TBB_machine_fetchstore4acquire# + .global __TBB_machine_fetchstore4acquire# +__TBB_machine_fetchstore4acquire: + xchg4 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore4acquire# +# 88 "" + .section .text + .align 16 + + + .proc __TBB_machine_cmpswp4__TBB_full_fence# + .global __TBB_machine_cmpswp4__TBB_full_fence# +__TBB_machine_cmpswp4__TBB_full_fence: +{ + mf + br __TBB_machine_cmpswp4acquire +} + .endp __TBB_machine_cmpswp4__TBB_full_fence# + + .proc __TBB_machine_cmpswp4acquire# + .global __TBB_machine_cmpswp4acquire# +__TBB_machine_cmpswp4acquire: + + zxt4 r34=r34 +;; + + mov ar.ccv=r34 +;; + cmpxchg4.acq r8=[r32],r33,ar.ccv + br.ret.sptk.many b0 + .endp __TBB_machine_cmpswp4acquire# +// DO NOT EDIT - AUTOMATICALLY GENERATED FROM tools/generate_atomic/ipf_generate.sh +# 1 "" +# 1 "" +# 1 "" +# 1 "" + + + + + + .section .text + .align 16 + + + .proc __TBB_machine_fetchadd8__TBB_full_fence# + .global __TBB_machine_fetchadd8__TBB_full_fence# +__TBB_machine_fetchadd8__TBB_full_fence: +{ + mf + br __TBB_machine_fetchadd8acquire +} + .endp __TBB_machine_fetchadd8__TBB_full_fence# + + .proc __TBB_machine_fetchadd8acquire# + .global __TBB_machine_fetchadd8acquire# +__TBB_machine_fetchadd8acquire: + + cmp.eq p6,p0=1,r33 + cmp.eq p8,p0=-1,r33 + (p6) br.cond.dptk Inc_8acquire + (p8) br.cond.dpnt Dec_8acquire +;; + + ld8 r9=[r32] +;; 
+Retry_8acquire: + mov ar.ccv=r9 + mov r8=r9; + add r10=r9,r33 +;; + cmpxchg8.acq r9=[r32],r10,ar.ccv +;; + cmp.ne p7,p0=r8,r9 + (p7) br.cond.dpnt Retry_8acquire + br.ret.sptk.many b0 + +Inc_8acquire: + fetchadd8.acq r8=[r32],1 + br.ret.sptk.many b0 +Dec_8acquire: + fetchadd8.acq r8=[r32],-1 + br.ret.sptk.many b0 + + .endp __TBB_machine_fetchadd8acquire# +# 62 "" + .section .text + .align 16 + .proc __TBB_machine_fetchstore8__TBB_full_fence# + .global __TBB_machine_fetchstore8__TBB_full_fence# +__TBB_machine_fetchstore8__TBB_full_fence: + mf +;; + xchg8 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore8__TBB_full_fence# + + + .proc __TBB_machine_fetchstore8acquire# + .global __TBB_machine_fetchstore8acquire# +__TBB_machine_fetchstore8acquire: + xchg8 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore8acquire# +# 88 "" + .section .text + .align 16 + + + .proc __TBB_machine_cmpswp8__TBB_full_fence# + .global __TBB_machine_cmpswp8__TBB_full_fence# +__TBB_machine_cmpswp8__TBB_full_fence: +{ + mf + br __TBB_machine_cmpswp8acquire +} + .endp __TBB_machine_cmpswp8__TBB_full_fence# + + .proc __TBB_machine_cmpswp8acquire# + .global __TBB_machine_cmpswp8acquire# +__TBB_machine_cmpswp8acquire: + + + + + mov ar.ccv=r34 +;; + cmpxchg8.acq r8=[r32],r33,ar.ccv + br.ret.sptk.many b0 + .endp __TBB_machine_cmpswp8acquire# +// DO NOT EDIT - AUTOMATICALLY GENERATED FROM tools/generate_atomic/ipf_generate.sh +# 1 "" +# 1 "" +# 1 "" +# 1 "" + + + + + + .section .text + .align 16 +# 19 "" + .proc __TBB_machine_fetchadd1release# + .global __TBB_machine_fetchadd1release# +__TBB_machine_fetchadd1release: + + + + + + + + ld1 r9=[r32] +;; +Retry_1release: + mov ar.ccv=r9 + mov r8=r9; + add r10=r9,r33 +;; + cmpxchg1.rel r9=[r32],r10,ar.ccv +;; + cmp.ne p7,p0=r8,r9 + (p7) br.cond.dpnt Retry_1release + br.ret.sptk.many b0 +# 49 "" + .endp __TBB_machine_fetchadd1release# +# 62 "" + .section .text + .align 16 + .proc __TBB_machine_fetchstore1release# + .global __TBB_machine_fetchstore1release# +__TBB_machine_fetchstore1release: + mf +;; + xchg1 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore1release# +# 88 "" + .section .text + .align 16 +# 101 "" + .proc __TBB_machine_cmpswp1release# + .global __TBB_machine_cmpswp1release# +__TBB_machine_cmpswp1release: + + zxt1 r34=r34 +;; + + mov ar.ccv=r34 +;; + cmpxchg1.rel r8=[r32],r33,ar.ccv + br.ret.sptk.many b0 + .endp __TBB_machine_cmpswp1release# +// DO NOT EDIT - AUTOMATICALLY GENERATED FROM tools/generate_atomic/ipf_generate.sh +# 1 "" +# 1 "" +# 1 "" +# 1 "" + + + + + + .section .text + .align 16 +# 19 "" + .proc __TBB_machine_fetchadd2release# + .global __TBB_machine_fetchadd2release# +__TBB_machine_fetchadd2release: + + + + + + + + ld2 r9=[r32] +;; +Retry_2release: + mov ar.ccv=r9 + mov r8=r9; + add r10=r9,r33 +;; + cmpxchg2.rel r9=[r32],r10,ar.ccv +;; + cmp.ne p7,p0=r8,r9 + (p7) br.cond.dpnt Retry_2release + br.ret.sptk.many b0 +# 49 "" + .endp __TBB_machine_fetchadd2release# +# 62 "" + .section .text + .align 16 + .proc __TBB_machine_fetchstore2release# + .global __TBB_machine_fetchstore2release# +__TBB_machine_fetchstore2release: + mf +;; + xchg2 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore2release# +# 88 "" + .section .text + .align 16 +# 101 "" + .proc __TBB_machine_cmpswp2release# + .global __TBB_machine_cmpswp2release# +__TBB_machine_cmpswp2release: + + zxt2 r34=r34 +;; + + mov ar.ccv=r34 +;; + cmpxchg2.rel r8=[r32],r33,ar.ccv + br.ret.sptk.many b0 + .endp __TBB_machine_cmpswp2release# +// 
DO NOT EDIT - AUTOMATICALLY GENERATED FROM tools/generate_atomic/ipf_generate.sh +# 1 "" +# 1 "" +# 1 "" +# 1 "" + + + + + + .section .text + .align 16 +# 19 "" + .proc __TBB_machine_fetchadd4release# + .global __TBB_machine_fetchadd4release# +__TBB_machine_fetchadd4release: + + cmp.eq p6,p0=1,r33 + cmp.eq p8,p0=-1,r33 + (p6) br.cond.dptk Inc_4release + (p8) br.cond.dpnt Dec_4release +;; + + ld4 r9=[r32] +;; +Retry_4release: + mov ar.ccv=r9 + mov r8=r9; + add r10=r9,r33 +;; + cmpxchg4.rel r9=[r32],r10,ar.ccv +;; + cmp.ne p7,p0=r8,r9 + (p7) br.cond.dpnt Retry_4release + br.ret.sptk.many b0 + +Inc_4release: + fetchadd4.rel r8=[r32],1 + br.ret.sptk.many b0 +Dec_4release: + fetchadd4.rel r8=[r32],-1 + br.ret.sptk.many b0 + + .endp __TBB_machine_fetchadd4release# +# 62 "" + .section .text + .align 16 + .proc __TBB_machine_fetchstore4release# + .global __TBB_machine_fetchstore4release# +__TBB_machine_fetchstore4release: + mf +;; + xchg4 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore4release# +# 88 "" + .section .text + .align 16 +# 101 "" + .proc __TBB_machine_cmpswp4release# + .global __TBB_machine_cmpswp4release# +__TBB_machine_cmpswp4release: + + zxt4 r34=r34 +;; + + mov ar.ccv=r34 +;; + cmpxchg4.rel r8=[r32],r33,ar.ccv + br.ret.sptk.many b0 + .endp __TBB_machine_cmpswp4release# +// DO NOT EDIT - AUTOMATICALLY GENERATED FROM tools/generate_atomic/ipf_generate.sh +# 1 "" +# 1 "" +# 1 "" +# 1 "" + + + + + + .section .text + .align 16 +# 19 "" + .proc __TBB_machine_fetchadd8release# + .global __TBB_machine_fetchadd8release# +__TBB_machine_fetchadd8release: + + cmp.eq p6,p0=1,r33 + cmp.eq p8,p0=-1,r33 + (p6) br.cond.dptk Inc_8release + (p8) br.cond.dpnt Dec_8release +;; + + ld8 r9=[r32] +;; +Retry_8release: + mov ar.ccv=r9 + mov r8=r9; + add r10=r9,r33 +;; + cmpxchg8.rel r9=[r32],r10,ar.ccv +;; + cmp.ne p7,p0=r8,r9 + (p7) br.cond.dpnt Retry_8release + br.ret.sptk.many b0 + +Inc_8release: + fetchadd8.rel r8=[r32],1 + br.ret.sptk.many b0 +Dec_8release: + fetchadd8.rel r8=[r32],-1 + br.ret.sptk.many b0 + + .endp __TBB_machine_fetchadd8release# +# 62 "" + .section .text + .align 16 + .proc __TBB_machine_fetchstore8release# + .global __TBB_machine_fetchstore8release# +__TBB_machine_fetchstore8release: + mf +;; + xchg8 r8=[r32],r33 + br.ret.sptk.many b0 + .endp __TBB_machine_fetchstore8release# +# 88 "" + .section .text + .align 16 +# 101 "" + .proc __TBB_machine_cmpswp8release# + .global __TBB_machine_cmpswp8release# +__TBB_machine_cmpswp8release: + + + + + mov ar.ccv=r34 +;; + cmpxchg8.rel r8=[r32],r33,ar.ccv + br.ret.sptk.many b0 + .endp __TBB_machine_cmpswp8release# diff --git a/dep/tbb/src/tbb/ia64-gas/ia64_misc.s b/dep/tbb/src/tbb/ia64-gas/ia64_misc.s new file mode 100644 index 000000000..999bfb9ba --- /dev/null +++ b/dep/tbb/src/tbb/ia64-gas/ia64_misc.s @@ -0,0 +1,35 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. 
+// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + + // RSE backing store pointer retrieval + .section .text + .align 16 + .proc __TBB_get_bsp# + .global __TBB_get_bsp# +__TBB_get_bsp: + mov r8=ar.bsp + br.ret.sptk.many b0 + .endp __TBB_get_bsp# diff --git a/dep/tbb/src/tbb/ia64-gas/lock_byte.s b/dep/tbb/src/tbb/ia64-gas/lock_byte.s new file mode 100644 index 000000000..e7f199d89 --- /dev/null +++ b/dep/tbb/src/tbb/ia64-gas/lock_byte.s @@ -0,0 +1,54 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + + // Support for class TinyLock + .section .text + .align 16 + // unsigned int __TBB_machine_trylockbyte( byte& flag ); + // r32 = address of flag + .proc __TBB_machine_trylockbyte# + .global __TBB_machine_trylockbyte# +ADDRESS_OF_FLAG=r32 +RETCODE=r8 +FLAG=r9 +BUSY=r10 +SCRATCH=r11 +__TBB_machine_trylockbyte: + ld1.acq FLAG=[ADDRESS_OF_FLAG] + mov BUSY=1 + mov RETCODE=0 +;; + cmp.ne p6,p0=0,FLAG + mov ar.ccv=r0 +(p6) br.ret.sptk.many b0 +;; + cmpxchg1.acq SCRATCH=[ADDRESS_OF_FLAG],BUSY,ar.ccv // Try to acquire lock +;; + cmp.eq p6,p0=0,SCRATCH +;; +(p6) mov RETCODE=1 + br.ret.sptk.many b0 + .endp __TBB_machine_trylockbyte# diff --git a/dep/tbb/src/tbb/ia64-gas/log2.s b/dep/tbb/src/tbb/ia64-gas/log2.s new file mode 100644 index 000000000..2a4288898 --- /dev/null +++ b/dep/tbb/src/tbb/ia64-gas/log2.s @@ -0,0 +1,67 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. 
+// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + + // Support for class ConcurrentVector + .section .text + .align 16 + // unsigned long __TBB_machine_lg( unsigned long x ); + // r32 = x + .proc __TBB_machine_lg# + .global __TBB_machine_lg# +__TBB_machine_lg: + shr r16=r32,1 // .x +;; + shr r17=r32,2 // ..x + or r32=r32,r16 // xx +;; + shr r16=r32,3 // ...xx + or r32=r32,r17 // xxx +;; + shr r17=r32,5 // .....xxx + or r32=r32,r16 // xxxxx +;; + shr r16=r32,8 // ........xxxxx + or r32=r32,r17 // xxxxxxxx +;; + shr r17=r32,13 + or r32=r32,r16 // 13x +;; + shr r16=r32,21 + or r32=r32,r17 // 21x +;; + shr r17=r32,34 + or r32=r32,r16 // 34x +;; + shr r16=r32,55 + or r32=r32,r17 // 55x +;; + or r32=r32,r16 // 64x +;; + popcnt r8=r32 +;; + add r8=-1,r8 + br.ret.sptk.many b0 + .endp __TBB_machine_lg# diff --git a/dep/tbb/src/tbb/ia64-gas/pause.s b/dep/tbb/src/tbb/ia64-gas/pause.s new file mode 100644 index 000000000..bead89bcd --- /dev/null +++ b/dep/tbb/src/tbb/ia64-gas/pause.s @@ -0,0 +1,41 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. 
This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + + .section .text + .align 16 + // void __TBB_machine_pause( long count ); + // r32 = count + .proc __TBB_machine_pause# + .global __TBB_machine_pause# +count = r32 +__TBB_machine_pause: + hint.m 0 + add count=-1,count +;; + cmp.eq p6,p7=0,count +(p7) br.cond.dpnt __TBB_machine_pause +(p6) br.ret.sptk.many b0 + .endp __TBB_machine_pause# diff --git a/dep/tbb/src/tbb/ibm_aix51/atomic_support.c b/dep/tbb/src/tbb/ibm_aix51/atomic_support.c new file mode 100644 index 000000000..2e052d772 --- /dev/null +++ b/dep/tbb/src/tbb/ibm_aix51/atomic_support.c @@ -0,0 +1,55 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include +#include + +/* This file must be compiled with gcc. The IBM compiler doesn't seem to + support inline assembly statements (October 2007). */ + +#ifdef __GNUC__ + +int32_t __TBB_machine_cas_32 (volatile void* ptr, int32_t value, int32_t comparand) { + __asm__ __volatile__ ("sync\n"); /* memory release operation */ + compare_and_swap ((atomic_p) ptr, &comparand, value); + __asm__ __volatile__ ("sync\n"); /* memory acquire operation */ + return comparand; +} + +int64_t __TBB_machine_cas_64 (volatile void* ptr, int64_t value, int64_t comparand) { + __asm__ __volatile__ ("sync\n"); /* memory release operation */ + compare_and_swaplp ((atomic_l) ptr, &comparand, value); + __asm__ __volatile__ ("sync\n"); /* memory acquire operation */ + return comparand; +} + +void __TBB_machine_flush () { + __asm__ __volatile__ ("sync\n"); +} + +#endif /* __GNUC__ */ diff --git a/dep/tbb/src/tbb/index.html b/dep/tbb/src/tbb/index.html new file mode 100644 index 000000000..c927b94a4 --- /dev/null +++ b/dep/tbb/src/tbb/index.html @@ -0,0 +1,32 @@ + + + +

+<H2>Overview</H2>
+This directory contains the source code of the TBB core components.
+
+<H2>Directories</H2>
+<DL>
+<DT><A HREF="tools_api">tools_api</A>
+<DD>Source code of the interface components provided by the Intel® Parallel Studio tools.
+<DT><A HREF="intel64-masm">intel64-masm</A>
+<DD>Assembly code for the Intel® 64 architecture.
+<DT><A HREF="ia32-masm">ia32-masm</A>
+<DD>Assembly code for IA32 architecture.
+<DT><A HREF="ia64-gas">ia64-gas</A>
+<DD>Assembly code for IA64 architecture.
+<DT><A HREF="ibm_aix51">ibm_aix51</A>
+<DD>Assembly code for AIX 5.1 port.
+</DL>
+
+<HR>
+<A HREF="../index.html">Up to parent directory</A>
+<P></P>
+Copyright © 2005-2009 Intel Corporation. All Rights Reserved.
+<P></P>
+Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are
+registered trademarks or trademarks of Intel Corporation or its
+subsidiaries in the United States and other countries.
+<P></P>
+* Other names and brands may be claimed as the property of others. + + diff --git a/dep/tbb/src/tbb/intel64-masm/atomic_support.asm b/dep/tbb/src/tbb/intel64-masm/atomic_support.asm new file mode 100644 index 000000000..86a240864 --- /dev/null +++ b/dep/tbb/src/tbb/intel64-masm/atomic_support.asm @@ -0,0 +1,80 @@ +; Copyright 2005-2009 Intel Corporation. All Rights Reserved. +; +; This file is part of Threading Building Blocks. +; +; Threading Building Blocks is free software; you can redistribute it +; and/or modify it under the terms of the GNU General Public License +; version 2 as published by the Free Software Foundation. +; +; Threading Building Blocks is distributed in the hope that it will be +; useful, but WITHOUT ANY WARRANTY; without even the implied warranty +; of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +; GNU General Public License for more details. +; +; You should have received a copy of the GNU General Public License +; along with Threading Building Blocks; if not, write to the Free Software +; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +; +; As a special exception, you may use this file as part of a free software +; library without restriction. Specifically, if other files instantiate +; templates or use macros or inline functions from this file, or you compile +; this file and link it with other files to produce an executable, this +; file does not by itself cause the resulting executable to be covered by +; the GNU General Public License. This exception does not however +; invalidate any other reasons why the executable file might be covered by +; the GNU General Public License. + +; DO NOT EDIT - AUTOMATICALLY GENERATED FROM .s FILE +.code + ALIGN 8 + PUBLIC __TBB_machine_fetchadd1 +__TBB_machine_fetchadd1: + mov rax,rdx + lock xadd [rcx],al + ret +.code + ALIGN 8 + PUBLIC __TBB_machine_fetchstore1 +__TBB_machine_fetchstore1: + mov rax,rdx + lock xchg [rcx],al + ret +.code + ALIGN 8 + PUBLIC __TBB_machine_cmpswp1 +__TBB_machine_cmpswp1: + mov rax,r8 + lock cmpxchg [rcx],dl + ret +.code + ALIGN 8 + PUBLIC __TBB_machine_fetchadd2 +__TBB_machine_fetchadd2: + mov rax,rdx + lock xadd [rcx],ax + ret +.code + ALIGN 8 + PUBLIC __TBB_machine_fetchstore2 +__TBB_machine_fetchstore2: + mov rax,rdx + lock xchg [rcx],ax + ret +.code + ALIGN 8 + PUBLIC __TBB_machine_cmpswp2 +__TBB_machine_cmpswp2: + mov rax,r8 + lock cmpxchg [rcx],dx + ret +.code + ALIGN 8 + PUBLIC __TBB_machine_pause +__TBB_machine_pause: +L1: + dw 090f3H; pause + add ecx,-1 + jne L1 + ret +end + diff --git a/dep/tbb/src/tbb/itt_notify.cpp b/dep/tbb/src/tbb/itt_notify.cpp new file mode 100644 index 000000000..27ebbfff9 --- /dev/null +++ b/dep/tbb/src/tbb/itt_notify.cpp @@ -0,0 +1,273 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "itt_notify.h" +#include "tbb/tbb_machine.h" + +#include + +namespace tbb { + namespace internal { + +#if __TBB_NEW_ITT_NOTIFY +#if DO_ITT_NOTIFY + + extern "C" int __TBB_load_ittnotify(); + + bool InitializeITT () { + return __TBB_load_ittnotify() != 0; + } + + +#endif /* DO_ITT_NOTIFY */ +#endif /* __TBB_NEW_ITT_NOTIFY */ + + void itt_store_pointer_with_release_v3( void* dst, void* src ) { + ITT_NOTIFY(sync_releasing, dst); + __TBB_store_with_release(*static_cast(dst),src); + } + + void* itt_load_pointer_with_acquire_v3( const void* src ) { + void* result = __TBB_load_with_acquire(*static_cast(src)); + ITT_NOTIFY(sync_acquired, const_cast(src)); + return result; + } + + void* itt_load_pointer_v3( const void* src ) { + void* result = *static_cast(src); + return result; + } + + void itt_set_sync_name_v3( void* obj, const tchar* name) { + ITT_SYNC_RENAME(obj, name); + (void)obj, (void)name; // Prevents compiler warning when ITT support is switched off + } + + } // namespace internal +} // namespace tbb + + +#if !__TBB_NEW_ITT_NOTIFY + +#include "tbb_misc.h" +#include "dynamic_link.h" +#include "tbb/cache_aligned_allocator.h" /* NFS_MaxLineSize */ + +#if _WIN32||_WIN64 + #include +#else /* !WIN */ + #include +#if __TBB_WEAK_SYMBOLS + #pragma weak __itt_notify_sync_prepare + #pragma weak __itt_notify_sync_acquired + #pragma weak __itt_notify_sync_releasing + #pragma weak __itt_notify_sync_cancel + #pragma weak __itt_thr_name_set + #pragma weak __itt_thread_set_name + #pragma weak __itt_sync_create + #pragma weak __itt_sync_rename + extern "C" { + void __itt_notify_sync_prepare(void *p); + void __itt_notify_sync_cancel(void *p); + void __itt_notify_sync_acquired(void *p); + void __itt_notify_sync_releasing(void *p); + int __itt_thr_name_set (void* p, int len); + void __itt_thread_set_name (const char* name); + void __itt_sync_create( void* obj, const char* name, const char* type, int attribute ); + void __itt_sync_rename( void* obj, const char* new_name ); + } +#endif /* __TBB_WEAK_SYMBOLS */ +#endif /* !WIN */ + +namespace tbb { +namespace internal { + +#if DO_ITT_NOTIFY + + +//! Table describing the __itt_notify handlers. 
+static const dynamic_link_descriptor ITT_HandlerTable[] = { + DLD( __itt_notify_sync_prepare, ITT_Handler_sync_prepare), + DLD( __itt_notify_sync_acquired, ITT_Handler_sync_acquired), + DLD( __itt_notify_sync_releasing, ITT_Handler_sync_releasing), + DLD( __itt_notify_sync_cancel, ITT_Handler_sync_cancel), +# if _WIN32||_WIN64 + DLD( __itt_thr_name_setW, ITT_Handler_thr_name_set), + DLD( __itt_thread_set_nameW, ITT_Handler_thread_set_name), +# else + DLD( __itt_thr_name_set, ITT_Handler_thr_name_set), + DLD( __itt_thread_set_name, ITT_Handler_thread_set_name), +# endif /* _WIN32 || _WIN64 */ + + +# if _WIN32||_WIN64 + DLD( __itt_sync_createW, ITT_Handler_sync_create), + DLD( __itt_sync_renameW, ITT_Handler_sync_rename) +# else + DLD( __itt_sync_create, ITT_Handler_sync_create), + DLD( __itt_sync_rename, ITT_Handler_sync_rename) +# endif +}; + +static const int ITT_HandlerTable_size = + sizeof(ITT_HandlerTable)/sizeof(dynamic_link_descriptor); + +// LIBITTNOTIFY_NAME is the name of the ITT notification library +# if _WIN32||_WIN64 +# define LIBITTNOTIFY_NAME "libittnotify.dll" +# elif __linux__ +# define LIBITTNOTIFY_NAME "libittnotify.so" +# else +# error Intel(R) Threading Tools not provided for this OS +# endif + +//! Performs tools support initialization. +/** Is called by DoOneTimeInitializations and ITT_DoOneTimeInitialization in + a protected (one-time) manner. Not to be invoked directly. **/ +bool InitializeITT() { + bool result = false; + // Check if we are running under a performance or correctness tool + bool t_checker = GetBoolEnvironmentVariable("KMP_FOR_TCHECK"); + bool t_profiler = GetBoolEnvironmentVariable("KMP_FOR_TPROFILE"); + __TBB_ASSERT(!(t_checker&&t_profiler), NULL); + if ( t_checker || t_profiler ) { + // Yes, we are in the tool mode. Try to load libittnotify library. + result = dynamic_link( LIBITTNOTIFY_NAME, ITT_HandlerTable, ITT_HandlerTable_size, 4 ); + } + if (result){ + if ( t_checker ) { + current_tool = ITC; + } else if ( t_profiler ) { + current_tool = ITP; + } + } else { + // Clear away the proxy (dummy) handlers + for (int i = 0; i < ITT_HandlerTable_size; i++) + *ITT_HandlerTable[i].handler = NULL; + current_tool = NONE; + } + PrintExtraVersionInfo( "ITT", result?"yes":"no" ); + return result; +} + +//! Performs one-time initialization of tools interoperability mechanisms. +/** Defined in task.cpp. Makes a protected do-once call to InitializeITT(). **/ +void ITT_DoOneTimeInitialization(); + +/** The following dummy_xxx functions are proxies that correspond to tool notification + APIs and are used to initialize corresponding pointers to the tool notifications + (ITT_Handler_xxx). When the first call to ITT_Handler_xxx takes place before + the whole library initialization (done by DoOneTimeInitializations) happened, + the proxy handler performs initialization of the tools support. After this + ITT_Handler_xxx will be set to either tool notification pointer or NULL. 
**/ +void dummy_sync_prepare( volatile void* ptr ) { + ITT_DoOneTimeInitialization(); + __TBB_ASSERT( ITT_Handler_sync_prepare!=&dummy_sync_prepare, NULL ); + if (ITT_Handler_sync_prepare) + (*ITT_Handler_sync_prepare) (ptr); +} + +void dummy_sync_acquired( volatile void* ptr ) { + ITT_DoOneTimeInitialization(); + __TBB_ASSERT( ITT_Handler_sync_acquired!=&dummy_sync_acquired, NULL ); + if (ITT_Handler_sync_acquired) + (*ITT_Handler_sync_acquired) (ptr); +} + +void dummy_sync_releasing( volatile void* ptr ) { + ITT_DoOneTimeInitialization(); + __TBB_ASSERT( ITT_Handler_sync_releasing!=&dummy_sync_releasing, NULL ); + if (ITT_Handler_sync_releasing) + (*ITT_Handler_sync_releasing) (ptr); +} + +void dummy_sync_cancel( volatile void* ptr ) { + ITT_DoOneTimeInitialization(); + __TBB_ASSERT( ITT_Handler_sync_cancel!=&dummy_sync_cancel, NULL ); + if (ITT_Handler_sync_cancel) + (*ITT_Handler_sync_cancel) (ptr); +} + +int dummy_thr_name_set( const tchar* str, int number ) { + ITT_DoOneTimeInitialization(); + __TBB_ASSERT( ITT_Handler_thr_name_set!=&dummy_thr_name_set, NULL ); + if (ITT_Handler_thr_name_set) + return (*ITT_Handler_thr_name_set) (str, number); + return -1; +} + +void dummy_thread_set_name( const tchar* name ) { + ITT_DoOneTimeInitialization(); + __TBB_ASSERT( ITT_Handler_thread_set_name!=&dummy_thread_set_name, NULL ); + if (ITT_Handler_thread_set_name) + (*ITT_Handler_thread_set_name)( name ); +} + +void dummy_sync_create( void* obj, const tchar* objname, const tchar* objtype, int /*attribute*/ ) { + ITT_DoOneTimeInitialization(); + __TBB_ASSERT( ITT_Handler_sync_create!=&dummy_sync_create, NULL ); + ITT_SYNC_CREATE( obj, objtype, objname ); +} + +void dummy_sync_rename( void* obj, const tchar* new_name ) { + ITT_DoOneTimeInitialization(); + __TBB_ASSERT( ITT_Handler_sync_rename!=&dummy_sync_rename, NULL ); + ITT_SYNC_RENAME(obj, new_name); +} + + + +//! Leading padding before the area where tool notification handlers are placed. +/** Prevents cache lines where the handler pointers are stored from thrashing. + Defined as extern to prevent compiler from placing the padding arrays separately + from the handler pointers (which are declared as extern). + Declared separately from definition to get rid of compiler warnings. **/ +extern char __ITT_Handler_leading_padding[NFS_MaxLineSize]; + +//! Trailing padding after the area where tool notification handlers are placed. 
+extern char __ITT_Handler_trailing_padding[NFS_MaxLineSize]; + +char __ITT_Handler_leading_padding[NFS_MaxLineSize] = {0}; +PointerToITT_Handler ITT_Handler_sync_prepare = &dummy_sync_prepare; +PointerToITT_Handler ITT_Handler_sync_acquired = &dummy_sync_acquired; +PointerToITT_Handler ITT_Handler_sync_releasing = &dummy_sync_releasing; +PointerToITT_Handler ITT_Handler_sync_cancel = &dummy_sync_cancel; +PointerToITT_thr_name_set ITT_Handler_thr_name_set = &dummy_thr_name_set; +PointerToITT_thread_set_name ITT_Handler_thread_set_name = &dummy_thread_set_name; +PointerToITT_sync_create ITT_Handler_sync_create = &dummy_sync_create; +PointerToITT_sync_rename ITT_Handler_sync_rename = &dummy_sync_rename; +char __ITT_Handler_trailing_padding[NFS_MaxLineSize] = {0}; + +target_tool current_tool = TO_BE_INITIALIZED; + +#endif /* DO_ITT_NOTIFY */ +} // namespace internal + +} // namespace tbb + +#endif /* !__TBB_NEW_ITT_NOTIFY */ diff --git a/dep/tbb/src/tbb/itt_notify.h b/dep/tbb/src/tbb/itt_notify.h new file mode 100644 index 000000000..db8aefcb8 --- /dev/null +++ b/dep/tbb/src/tbb/itt_notify.h @@ -0,0 +1,206 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef _TBB_ITT_NOTIFY +#define _TBB_ITT_NOTIFY + +#include "tbb/tbb_stddef.h" + +#if DO_ITT_NOTIFY +#if __TBB_NEW_ITT_NOTIFY + +#if _WIN32||_WIN64 + #ifndef UNICODE + #define UNICODE + #endif +#endif /* WIN */ + +#include "tools_api/ittnotify.h" + +#if _WIN32||_WIN64 + #undef _T + #undef __itt_event_create + #define __itt_event_create __itt_event_createA +#endif /* WIN */ + +#endif /* __TBB_NEW_ITT_NOTIFY */ + +#endif /* DO_ITT_NOTIFY */ + +namespace tbb { +//! Unicode support +#if _WIN32||_WIN64 + //! Unicode character type. Always wchar_t on Windows. + /** We do not use typedefs from Windows TCHAR family to keep consistence of TBB coding style. **/ + typedef wchar_t tchar; + //! Standard Windows macro to markup the string literals. + #define _T(string_literal) L ## string_literal +#if !__TBB_NEW_ITT_NOTIFY + #define tstrlen wcslen +#endif /* !__TBB_NEW_ITT_NOTIFY */ +#else /* !WIN */ + typedef char tchar; + //! Standard Windows style macro to markup the string literals. 
+ #define _T(string_literal) string_literal +#if !__TBB_NEW_ITT_NOTIFY + #define tstrlen strlen +#endif /* !__TBB_NEW_ITT_NOTIFY */ +#endif /* !WIN */ +} // namespace tbb + +#if DO_ITT_NOTIFY +namespace tbb { + //! Display names of internal synchronization types + extern const tchar + *SyncType_GlobalLock, + *SyncType_Scheduler; + //! Display names of internal synchronization components/scenarios + extern const tchar + *SyncObj_SchedulerInitialization, + *SyncObj_SchedulersList, + *SyncObj_TaskStealingLoop, + *SyncObj_WorkerTaskPool, + *SyncObj_MasterTaskPool, + *SyncObj_GateLock, + *SyncObj_Gate, + *SyncObj_SchedulerTermination, + *SyncObj_ContextsList + ; + + namespace internal { + void __TBB_EXPORTED_FUNC itt_set_sync_name_v3( void* obj, const tchar* name); + + } // namespace internal + +} // namespace tbb + +#if __TBB_NEW_ITT_NOTIFY +// const_cast() is necessary to cast off volatility +#define ITT_NOTIFY(name,obj) __itt_notify_##name(const_cast(static_cast(obj))) +#define ITT_THREAD_SET_NAME(name) __itt_thread_set_name(name) +#define ITT_SYNC_CREATE(obj, type, name) __itt_sync_create(obj, type, name, 2) +#define ITT_SYNC_RENAME(obj, name) __itt_sync_rename(obj, name) +#endif /* __TBB_NEW_ITT_NOTIFY */ + +#else /* !DO_ITT_NOTIFY */ + +#define ITT_NOTIFY(name,obj) ((void)0) +#define ITT_THREAD_SET_NAME(name) ((void)0) +#define ITT_SYNC_CREATE(obj, type, name) ((void)0) +#define ITT_SYNC_RENAME(obj, name) ((void)0) + +#endif /* !DO_ITT_NOTIFY */ + + +#if !__TBB_NEW_ITT_NOTIFY + +#if DO_ITT_NOTIFY + +namespace tbb { + +//! Identifies performance and correctness tools, which TBB sends special notifications to. +/** Enumerators must be ORable bit values. + + Initializing global tool indicator with TO_BE_INITIALIZED is required + to avoid bypassing early notification calls made through targeted macros until + initialization is performed from somewhere else. + + Yet this entails another problem. The first targeted calls that happen to go + into the proxy (dummy) handlers become promiscuous. **/ +enum target_tool { + NONE = 0ul, + ITC = 1ul, + ITP = 2ul, + TO_BE_INITIALIZED = ~0ul +}; + +namespace internal { + +//! Types of the tool notification functions (and corresponding proxy handlers). +typedef void (*PointerToITT_Handler)(volatile void*); +typedef int (*PointerToITT_thr_name_set)(const tchar*, int); +typedef void (*PointerToITT_thread_set_name)(const tchar*); + + +typedef void (*PointerToITT_sync_create)(void* obj, const tchar* type, const tchar* name, int attribute); +typedef void (*PointerToITT_sync_rename)(void* obj, const tchar* new_name); + +extern PointerToITT_Handler ITT_Handler_sync_prepare; +extern PointerToITT_Handler ITT_Handler_sync_acquired; +extern PointerToITT_Handler ITT_Handler_sync_releasing; +extern PointerToITT_Handler ITT_Handler_sync_cancel; +extern PointerToITT_thr_name_set ITT_Handler_thr_name_set; +extern PointerToITT_thread_set_name ITT_Handler_thread_set_name; +extern PointerToITT_sync_create ITT_Handler_sync_create; +extern PointerToITT_sync_rename ITT_Handler_sync_rename; + +extern target_tool current_tool; + +} // namespace internal + +} // namespace tbb + +//! Glues two tokens together. +#define ITT_HANDLER(name) tbb::internal::ITT_Handler_##name +#define CALL_ITT_HANDLER(name, arglist) ( ITT_HANDLER(name) ? (void)ITT_HANDLER(name)arglist : (void)0 ) + +//! Call routine itt_notify_(name) if corresponding handler is available. +/** For example, use ITT_NOTIFY(sync_releasing,x) to invoke __itt_notify_sync_releasing(x). 
+ Ordinarily, preprocessor token gluing games should be avoided. + But here, it seemed to be the best way to handle the issue. */ +#define ITT_NOTIFY(name,obj) CALL_ITT_HANDLER(name,(obj)) +//! The same as ITT_NOTIFY but also checks if we are running under appropriate tool. +/** Parameter tools is an ORed set of target_tool enumerators. **/ +#define ITT_NOTIFY_TOOL(tools,name,obj) ( ITT_HANDLER(name) && ((tools) & tbb::internal::current_tool) ? ITT_HANDLER(name)(obj) : (void)0 ) + +#define ITT_THREAD_SET_NAME(name) ( \ + ITT_HANDLER(thread_set_name) ? ITT_HANDLER(thread_set_name)(name) \ + : CALL_ITT_HANDLER(thr_name_set,(name, tstrlen(name))) ) + + +/** 2 is the value of __itt_attr_mutex attribute. **/ +#define ITT_SYNC_CREATE(obj, type, name) CALL_ITT_HANDLER(sync_create,(obj, type, name, 2)) +#define ITT_SYNC_RENAME(obj, name) CALL_ITT_HANDLER(sync_rename,(obj, name)) + + + +#else /* !DO_ITT_NOTIFY */ + +#define ITT_NOTIFY_TOOL(tools,name,obj) ((void)0) + +#endif /* !DO_ITT_NOTIFY */ + +#if DO_ITT_QUIET +#define ITT_QUIET(x) (__itt_thr_mode_set(__itt_thr_prop_quiet,(x)?__itt_thr_state_set:__itt_thr_state_clr)) +#else +#define ITT_QUIET(x) ((void)0) +#endif /* DO_ITT_QUIET */ + +#endif /* !__TBB_NEW_ITT_NOTIFY */ + +#endif /* _TBB_ITT_NOTIFY */ diff --git a/dep/tbb/src/tbb/itt_notify_proxy.c b/dep/tbb/src/tbb/itt_notify_proxy.c new file mode 100644 index 000000000..9d4e67222 --- /dev/null +++ b/dep/tbb/src/tbb/itt_notify_proxy.c @@ -0,0 +1,55 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#include "tbb/tbb_config.h" + +/* This declaration in particular shuts up "empty translation unit" warning */ +extern int __TBB_load_ittnotify(); + +#if __TBB_NEW_ITT_NOTIFY +#if DO_ITT_NOTIFY + +#if _WIN32||_WIN64 + #ifndef UNICODE + #define UNICODE + #endif +#endif /* WIN */ + +extern void ITT_DoOneTimeInitialization(); + +#define ITT_SIMPLE_INIT 1 +#define __itt_init_lib_name ITT_DoOneTimeInitialization + +#include "tools_api/ittnotify_static.c" + +int __TBB_load_ittnotify() { + return __itt_init_lib(); +} + +#endif /* DO_ITT_NOTIFY */ +#endif /* __TBB_NEW_ITT_NOTIFY */ diff --git a/dep/tbb/src/tbb/lin32-tbb-export.def b/dep/tbb/src/tbb/lin32-tbb-export.def new file mode 100644 index 000000000..5fc2f53b4 --- /dev/null +++ b/dep/tbb/src/tbb/lin32-tbb-export.def @@ -0,0 +1,316 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#include "tbb/tbb_config.h" + +{ +global: + +/* cache_aligned_allocator.cpp */ +_ZN3tbb8internal12NFS_AllocateEjjPv; +_ZN3tbb8internal15NFS_GetLineSizeEv; +_ZN3tbb8internal8NFS_FreeEPv; +_ZN3tbb8internal23allocate_via_handler_v3Ej; +_ZN3tbb8internal25deallocate_via_handler_v3EPv; +_ZN3tbb8internal17is_malloc_used_v3Ev; + +/* task.cpp v3 */ +_ZN3tbb4task13note_affinityEt; +_ZN3tbb4task22internal_set_ref_countEi; +_ZN3tbb4task28internal_decrement_ref_countEv; +_ZN3tbb4task22spawn_and_wait_for_allERNS_9task_listE; +_ZN3tbb4task4selfEv; +_ZN3tbb4task7destroyERS0_; +_ZNK3tbb4task26is_owned_by_current_threadEv; +_ZN3tbb8internal19allocate_root_proxy4freeERNS_4taskE; +_ZN3tbb8internal19allocate_root_proxy8allocateEj; +_ZN3tbb8internal28affinity_partitioner_base_v36resizeEj; +_ZNK3tbb8internal20allocate_child_proxy4freeERNS_4taskE; +_ZNK3tbb8internal20allocate_child_proxy8allocateEj; +_ZNK3tbb8internal27allocate_continuation_proxy4freeERNS_4taskE; +_ZNK3tbb8internal27allocate_continuation_proxy8allocateEj; +_ZNK3tbb8internal34allocate_additional_child_of_proxy4freeERNS_4taskE; +_ZNK3tbb8internal34allocate_additional_child_of_proxy8allocateEj; +_ZTIN3tbb4taskE; +_ZTSN3tbb4taskE; +_ZTVN3tbb4taskE; +_ZN3tbb19task_scheduler_init19default_num_threadsEv; +_ZN3tbb19task_scheduler_init10initializeEij; +_ZN3tbb19task_scheduler_init10initializeEi; +_ZN3tbb19task_scheduler_init9terminateEv; +_ZN3tbb8internal26task_scheduler_observer_v37observeEb; +_ZN3tbb10empty_task7executeEv; +_ZN3tbb10empty_taskD0Ev; +_ZN3tbb10empty_taskD1Ev; +_ZTIN3tbb10empty_taskE; +_ZTSN3tbb10empty_taskE; +_ZTVN3tbb10empty_taskE; + +/* exception handling support */ +#if __TBB_EXCEPTIONS +_ZNK3tbb8internal32allocate_root_with_context_proxy8allocateEj; +_ZNK3tbb8internal32allocate_root_with_context_proxy4freeERNS_4taskE; +_ZNK3tbb18task_group_context28is_group_execution_cancelledEv; +_ZN3tbb18task_group_context22cancel_group_executionEv; +_ZN3tbb18task_group_context26register_pending_exceptionEv; +_ZN3tbb18task_group_context5resetEv; +_ZN3tbb18task_group_context4initEv; +_ZN3tbb18task_group_contextD1Ev; +_ZN3tbb18task_group_contextD2Ev; +_ZNK3tbb18captured_exception4nameEv; +_ZNK3tbb18captured_exception4whatEv; +_ZN3tbb18captured_exception10throw_selfEv; +_ZN3tbb18captured_exception3setEPKcS2_; +_ZN3tbb18captured_exception4moveEv; +_ZN3tbb18captured_exception5clearEv; +_ZN3tbb18captured_exception7destroyEv; +_ZN3tbb18captured_exception8allocateEPKcS2_; +_ZN3tbb18captured_exceptionD0Ev; +_ZN3tbb18captured_exceptionD1Ev; +_ZTIN3tbb18captured_exceptionE; +_ZTSN3tbb18captured_exceptionE; +_ZTVN3tbb18captured_exceptionE; +_ZN3tbb13tbb_exceptionD2Ev; +_ZTIN3tbb13tbb_exceptionE; +_ZTSN3tbb13tbb_exceptionE; +_ZTVN3tbb13tbb_exceptionE; +_ZN3tbb14bad_last_allocD0Ev; +_ZN3tbb14bad_last_allocD1Ev; +_ZNK3tbb14bad_last_alloc4whatEv; +_ZTIN3tbb14bad_last_allocE; +_ZTSN3tbb14bad_last_allocE; +_ZTVN3tbb14bad_last_allocE; +#endif /* __TBB_EXCEPTIONS */ + +/* tbb_misc.cpp */ +_ZN3tbb17assertion_failureEPKciS1_S1_; +_ZN3tbb21set_assertion_handlerEPFvPKciS1_S1_E; +_ZN3tbb8internal36get_initial_auto_partitioner_divisorEv; +_ZN3tbb8internal13handle_perrorEiPKc; +_ZN3tbb8internal15runtime_warningEPKcz; +__TBB_machine_store8_slow_perf_warning; +__TBB_machine_store8_slow; +TBB_runtime_interface_version; +_ZN3tbb8internal33throw_bad_last_alloc_exception_v4Ev; + +/* itt_notify.cpp */ +_ZN3tbb8internal32itt_load_pointer_with_acquire_v3EPKv; +_ZN3tbb8internal33itt_store_pointer_with_release_v3EPvS1_; +_ZN3tbb8internal20itt_set_sync_name_v3EPvPKc; 
+_ZN3tbb8internal19itt_load_pointer_v3EPKv; + +/* pipeline.cpp */ +_ZTIN3tbb6filterE; +_ZTSN3tbb6filterE; +_ZTVN3tbb6filterE; +_ZN3tbb6filterD2Ev; +_ZN3tbb8pipeline10add_filterERNS_6filterE; +_ZN3tbb8pipeline12inject_tokenERNS_4taskE; +_ZN3tbb8pipeline13remove_filterERNS_6filterE; +_ZN3tbb8pipeline3runEj; +#if __TBB_EXCEPTIONS +_ZN3tbb8pipeline3runEjRNS_18task_group_contextE; +#endif +_ZN3tbb8pipeline5clearEv; +_ZN3tbb19thread_bound_filter12process_itemEv; +_ZN3tbb19thread_bound_filter16try_process_itemEv; +_ZTIN3tbb8pipelineE; +_ZTSN3tbb8pipelineE; +_ZTVN3tbb8pipelineE; +_ZN3tbb8pipelineC1Ev; +_ZN3tbb8pipelineC2Ev; +_ZN3tbb8pipelineD0Ev; +_ZN3tbb8pipelineD1Ev; +_ZN3tbb8pipelineD2Ev; + +/* queuing_rw_mutex.cpp */ +_ZN3tbb16queuing_rw_mutex18internal_constructEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock17upgrade_to_writerEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock19downgrade_to_readerEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock7acquireERS0_b; +_ZN3tbb16queuing_rw_mutex11scoped_lock7releaseEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock11try_acquireERS0_b; + +#if !TBB_NO_LEGACY +/* spin_rw_mutex.cpp v2 */ +_ZN3tbb13spin_rw_mutex16internal_upgradeEPS0_; +_ZN3tbb13spin_rw_mutex22internal_itt_releasingEPS0_; +_ZN3tbb13spin_rw_mutex23internal_acquire_readerEPS0_; +_ZN3tbb13spin_rw_mutex23internal_acquire_writerEPS0_; +_ZN3tbb13spin_rw_mutex18internal_downgradeEPS0_; +_ZN3tbb13spin_rw_mutex23internal_release_readerEPS0_; +_ZN3tbb13spin_rw_mutex23internal_release_writerEPS0_; +_ZN3tbb13spin_rw_mutex27internal_try_acquire_readerEPS0_; +_ZN3tbb13spin_rw_mutex27internal_try_acquire_writerEPS0_; +#endif + +/* spin_rw_mutex v3 */ +_ZN3tbb16spin_rw_mutex_v318internal_constructEv; +_ZN3tbb16spin_rw_mutex_v316internal_upgradeEv; +_ZN3tbb16spin_rw_mutex_v318internal_downgradeEv; +_ZN3tbb16spin_rw_mutex_v323internal_acquire_readerEv; +_ZN3tbb16spin_rw_mutex_v323internal_acquire_writerEv; +_ZN3tbb16spin_rw_mutex_v323internal_release_readerEv; +_ZN3tbb16spin_rw_mutex_v323internal_release_writerEv; +_ZN3tbb16spin_rw_mutex_v327internal_try_acquire_readerEv; +_ZN3tbb16spin_rw_mutex_v327internal_try_acquire_writerEv; + +/* spin_mutex.cpp */ +_ZN3tbb10spin_mutex18internal_constructEv; +_ZN3tbb10spin_mutex11scoped_lock16internal_acquireERS0_; +_ZN3tbb10spin_mutex11scoped_lock16internal_releaseEv; +_ZN3tbb10spin_mutex11scoped_lock20internal_try_acquireERS0_; + +/* mutex.cpp */ +_ZN3tbb5mutex11scoped_lock16internal_acquireERS0_; +_ZN3tbb5mutex11scoped_lock16internal_releaseEv; +_ZN3tbb5mutex11scoped_lock20internal_try_acquireERS0_; +_ZN3tbb5mutex16internal_destroyEv; +_ZN3tbb5mutex18internal_constructEv; + +/* recursive_mutex.cpp */ +_ZN3tbb15recursive_mutex11scoped_lock16internal_acquireERS0_; +_ZN3tbb15recursive_mutex11scoped_lock16internal_releaseEv; +_ZN3tbb15recursive_mutex11scoped_lock20internal_try_acquireERS0_; +_ZN3tbb15recursive_mutex16internal_destroyEv; +_ZN3tbb15recursive_mutex18internal_constructEv; + +/* QueuingMutex.cpp */ +_ZN3tbb13queuing_mutex18internal_constructEv; +_ZN3tbb13queuing_mutex11scoped_lock7acquireERS0_; +_ZN3tbb13queuing_mutex11scoped_lock7releaseEv; +_ZN3tbb13queuing_mutex11scoped_lock11try_acquireERS0_; + +#if !TBB_NO_LEGACY +/* concurrent_hash_map */ +_ZNK3tbb8internal21hash_map_segment_base23internal_grow_predicateEv; + +/* concurrent_queue.cpp v2 */ +_ZN3tbb8internal21concurrent_queue_base12internal_popEPv; +_ZN3tbb8internal21concurrent_queue_base13internal_pushEPKv; +_ZN3tbb8internal21concurrent_queue_base21internal_set_capacityEij; 
+_ZN3tbb8internal21concurrent_queue_base23internal_pop_if_presentEPv; +_ZN3tbb8internal21concurrent_queue_base25internal_push_if_not_fullEPKv; +_ZN3tbb8internal21concurrent_queue_baseC2Ej; +_ZN3tbb8internal21concurrent_queue_baseD2Ev; +_ZTIN3tbb8internal21concurrent_queue_baseE; +_ZTSN3tbb8internal21concurrent_queue_baseE; +_ZTVN3tbb8internal21concurrent_queue_baseE; +_ZN3tbb8internal30concurrent_queue_iterator_base6assignERKS1_; +_ZN3tbb8internal30concurrent_queue_iterator_base7advanceEv; +_ZN3tbb8internal30concurrent_queue_iterator_baseC2ERKNS0_21concurrent_queue_baseE; +_ZN3tbb8internal30concurrent_queue_iterator_baseD2Ev; +_ZNK3tbb8internal21concurrent_queue_base13internal_sizeEv; +#endif + +/* concurrent_queue v3 */ +/* constructors */ +_ZN3tbb8internal24concurrent_queue_base_v3C2Ej; +_ZN3tbb8internal33concurrent_queue_iterator_base_v3C2ERKNS0_24concurrent_queue_base_v3E; +/* destructors */ +_ZN3tbb8internal24concurrent_queue_base_v3D2Ev; +_ZN3tbb8internal33concurrent_queue_iterator_base_v3D2Ev; +/* typeinfo */ +_ZTIN3tbb8internal24concurrent_queue_base_v3E; +_ZTSN3tbb8internal24concurrent_queue_base_v3E; +/* vtable */ +_ZTVN3tbb8internal24concurrent_queue_base_v3E; +/* methods */ +_ZN3tbb8internal33concurrent_queue_iterator_base_v37advanceEv; +_ZN3tbb8internal33concurrent_queue_iterator_base_v36assignERKS1_; +_ZN3tbb8internal24concurrent_queue_base_v313internal_pushEPKv; +_ZN3tbb8internal24concurrent_queue_base_v325internal_push_if_not_fullEPKv; +_ZN3tbb8internal24concurrent_queue_base_v312internal_popEPv; +_ZN3tbb8internal24concurrent_queue_base_v323internal_pop_if_presentEPv; +_ZN3tbb8internal24concurrent_queue_base_v321internal_set_capacityEij; +_ZNK3tbb8internal24concurrent_queue_base_v313internal_sizeEv; +_ZNK3tbb8internal24concurrent_queue_base_v314internal_emptyEv; +_ZN3tbb8internal24concurrent_queue_base_v321internal_finish_clearEv; +_ZNK3tbb8internal24concurrent_queue_base_v324internal_throw_exceptionEv; +_ZN3tbb8internal24concurrent_queue_base_v36assignERKS1_; + +#if !TBB_NO_LEGACY +/* concurrent_vector.cpp v2 */ +_ZN3tbb8internal22concurrent_vector_base13internal_copyERKS1_jPFvPvPKvjE; +_ZN3tbb8internal22concurrent_vector_base14internal_clearEPFvPvjEb; +_ZN3tbb8internal22concurrent_vector_base15internal_assignERKS1_jPFvPvjEPFvS4_PKvjESA_; +_ZN3tbb8internal22concurrent_vector_base16internal_grow_byEjjPFvPvjE; +_ZN3tbb8internal22concurrent_vector_base16internal_reserveEjjj; +_ZN3tbb8internal22concurrent_vector_base18internal_push_backEjRj; +_ZN3tbb8internal22concurrent_vector_base25internal_grow_to_at_leastEjjPFvPvjE; +_ZNK3tbb8internal22concurrent_vector_base17internal_capacityEv; +#endif + +/* concurrent_vector v3 */ +_ZN3tbb8internal25concurrent_vector_base_v313internal_copyERKS1_jPFvPvPKvjE; +_ZN3tbb8internal25concurrent_vector_base_v314internal_clearEPFvPvjE; +_ZN3tbb8internal25concurrent_vector_base_v315internal_assignERKS1_jPFvPvjEPFvS4_PKvjESA_; +_ZN3tbb8internal25concurrent_vector_base_v316internal_grow_byEjjPFvPvPKvjES4_; +_ZN3tbb8internal25concurrent_vector_base_v316internal_reserveEjjj; +_ZN3tbb8internal25concurrent_vector_base_v318internal_push_backEjRj; +_ZN3tbb8internal25concurrent_vector_base_v325internal_grow_to_at_leastEjjPFvPvPKvjES4_; +_ZNK3tbb8internal25concurrent_vector_base_v317internal_capacityEv; +_ZN3tbb8internal25concurrent_vector_base_v316internal_compactEjPvPFvS2_jEPFvS2_PKvjE; +_ZN3tbb8internal25concurrent_vector_base_v313internal_swapERS1_; +_ZNK3tbb8internal25concurrent_vector_base_v324internal_throw_exceptionEj; 
+_ZN3tbb8internal25concurrent_vector_base_v3D2Ev; +_ZN3tbb8internal25concurrent_vector_base_v315internal_resizeEjjjPKvPFvPvjEPFvS4_S3_jE; +_ZN3tbb8internal25concurrent_vector_base_v337internal_grow_to_at_least_with_resultEjjPFvPvPKvjES4_; + +/* tbb_thread */ +_ZN3tbb8internal13tbb_thread_v314internal_startEPFPvS2_ES2_; +_ZN3tbb8internal13tbb_thread_v320hardware_concurrencyEv; +_ZN3tbb8internal13tbb_thread_v34joinEv; +_ZN3tbb8internal13tbb_thread_v36detachEv; +_ZN3tbb8internal15free_closure_v3EPv; +_ZN3tbb8internal15thread_sleep_v3ERKNS_10tick_count10interval_tE; +_ZN3tbb8internal15thread_yield_v3Ev; +_ZN3tbb8internal16thread_get_id_v3Ev; +_ZN3tbb8internal19allocate_closure_v3Ej; +_ZN3tbb8internal7move_v3ERNS0_13tbb_thread_v3ES2_; + +local: + +/* TBB symbols */ +*3tbb*; +*__TBB*; + +/* Intel Compiler (libirc) symbols */ +__intel_*; +_intel_*; +get_memcpy_largest_cachelinesize; +get_memcpy_largest_cache_size; +get_mem_ops_method; +init_mem_ops_method; +irc__get_msg; +irc__print; +override_mem_ops_method; +set_memcpy_largest_cachelinesize; +set_memcpy_largest_cache_size; + +}; diff --git a/dep/tbb/src/tbb/lin64-tbb-export.def b/dep/tbb/src/tbb/lin64-tbb-export.def new file mode 100644 index 000000000..40b245b47 --- /dev/null +++ b/dep/tbb/src/tbb/lin64-tbb-export.def @@ -0,0 +1,311 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#include "tbb/tbb_config.h" + +{ +global: + +/* cache_aligned_allocator.cpp */ +_ZN3tbb8internal12NFS_AllocateEmmPv; +_ZN3tbb8internal15NFS_GetLineSizeEv; +_ZN3tbb8internal8NFS_FreeEPv; +_ZN3tbb8internal23allocate_via_handler_v3Em; +_ZN3tbb8internal25deallocate_via_handler_v3EPv; +_ZN3tbb8internal17is_malloc_used_v3Ev; + +/* task.cpp v3 */ +_ZN3tbb4task13note_affinityEt; +_ZN3tbb4task22internal_set_ref_countEi; +_ZN3tbb4task28internal_decrement_ref_countEv; +_ZN3tbb4task22spawn_and_wait_for_allERNS_9task_listE; +_ZN3tbb4task4selfEv; +_ZN3tbb4task7destroyERS0_; +_ZNK3tbb4task26is_owned_by_current_threadEv; +_ZN3tbb8internal19allocate_root_proxy4freeERNS_4taskE; +_ZN3tbb8internal19allocate_root_proxy8allocateEm; +_ZN3tbb8internal28affinity_partitioner_base_v36resizeEj; +_ZNK3tbb8internal20allocate_child_proxy4freeERNS_4taskE; +_ZNK3tbb8internal20allocate_child_proxy8allocateEm; +_ZNK3tbb8internal27allocate_continuation_proxy4freeERNS_4taskE; +_ZNK3tbb8internal27allocate_continuation_proxy8allocateEm; +_ZNK3tbb8internal34allocate_additional_child_of_proxy4freeERNS_4taskE; +_ZNK3tbb8internal34allocate_additional_child_of_proxy8allocateEm; +_ZTIN3tbb4taskE; +_ZTSN3tbb4taskE; +_ZTVN3tbb4taskE; +_ZN3tbb19task_scheduler_init19default_num_threadsEv; +_ZN3tbb19task_scheduler_init10initializeEim; +_ZN3tbb19task_scheduler_init10initializeEi; +_ZN3tbb19task_scheduler_init9terminateEv; +_ZN3tbb8internal26task_scheduler_observer_v37observeEb; +_ZN3tbb10empty_task7executeEv; +_ZN3tbb10empty_taskD0Ev; +_ZN3tbb10empty_taskD1Ev; +_ZTIN3tbb10empty_taskE; +_ZTSN3tbb10empty_taskE; +_ZTVN3tbb10empty_taskE; + +/* exception handling support */ +#if __TBB_EXCEPTIONS +_ZNK3tbb8internal32allocate_root_with_context_proxy8allocateEm; +_ZNK3tbb8internal32allocate_root_with_context_proxy4freeERNS_4taskE; +_ZNK3tbb18task_group_context28is_group_execution_cancelledEv; +_ZN3tbb18task_group_context22cancel_group_executionEv; +_ZN3tbb18task_group_context26register_pending_exceptionEv; +_ZN3tbb18task_group_context5resetEv; +_ZN3tbb18task_group_context4initEv; +_ZN3tbb18task_group_contextD1Ev; +_ZN3tbb18task_group_contextD2Ev; +_ZNK3tbb18captured_exception4nameEv; +_ZNK3tbb18captured_exception4whatEv; +_ZN3tbb18captured_exception10throw_selfEv; +_ZN3tbb18captured_exception3setEPKcS2_; +_ZN3tbb18captured_exception4moveEv; +_ZN3tbb18captured_exception5clearEv; +_ZN3tbb18captured_exception7destroyEv; +_ZN3tbb18captured_exception8allocateEPKcS2_; +_ZN3tbb18captured_exceptionD0Ev; +_ZN3tbb18captured_exceptionD1Ev; +_ZTIN3tbb18captured_exceptionE; +_ZTSN3tbb18captured_exceptionE; +_ZTVN3tbb18captured_exceptionE; +_ZN3tbb13tbb_exceptionD2Ev; +_ZTIN3tbb13tbb_exceptionE; +_ZTSN3tbb13tbb_exceptionE; +_ZTVN3tbb13tbb_exceptionE; +_ZN3tbb14bad_last_allocD0Ev; +_ZN3tbb14bad_last_allocD1Ev; +_ZNK3tbb14bad_last_alloc4whatEv; +_ZTIN3tbb14bad_last_allocE; +_ZTSN3tbb14bad_last_allocE; +_ZTVN3tbb14bad_last_allocE; +#endif /* __TBB_EXCEPTIONS */ + +/* tbb_misc.cpp */ +_ZN3tbb17assertion_failureEPKciS1_S1_; +_ZN3tbb21set_assertion_handlerEPFvPKciS1_S1_E; +_ZN3tbb8internal36get_initial_auto_partitioner_divisorEv; +_ZN3tbb8internal13handle_perrorEiPKc; +_ZN3tbb8internal15runtime_warningEPKcz; +TBB_runtime_interface_version; +_ZN3tbb8internal33throw_bad_last_alloc_exception_v4Ev; + +/* itt_notify.cpp */ +_ZN3tbb8internal32itt_load_pointer_with_acquire_v3EPKv; +_ZN3tbb8internal33itt_store_pointer_with_release_v3EPvS1_; +_ZN3tbb8internal20itt_set_sync_name_v3EPvPKc; +_ZN3tbb8internal19itt_load_pointer_v3EPKv; + +/* pipeline.cpp */ 
+_ZTIN3tbb6filterE; +_ZTSN3tbb6filterE; +_ZTVN3tbb6filterE; +_ZN3tbb6filterD2Ev; +_ZN3tbb8pipeline10add_filterERNS_6filterE; +_ZN3tbb8pipeline12inject_tokenERNS_4taskE; +_ZN3tbb8pipeline13remove_filterERNS_6filterE; +_ZN3tbb8pipeline3runEm; +#if __TBB_EXCEPTIONS +_ZN3tbb8pipeline3runEmRNS_18task_group_contextE; +#endif +_ZN3tbb8pipeline5clearEv; +_ZN3tbb19thread_bound_filter12process_itemEv; +_ZN3tbb19thread_bound_filter16try_process_itemEv; +_ZTIN3tbb8pipelineE; +_ZTSN3tbb8pipelineE; +_ZTVN3tbb8pipelineE; +_ZN3tbb8pipelineC1Ev; +_ZN3tbb8pipelineC2Ev; +_ZN3tbb8pipelineD0Ev; +_ZN3tbb8pipelineD1Ev; +_ZN3tbb8pipelineD2Ev; + +/* queuing_rw_mutex.cpp */ +_ZN3tbb16queuing_rw_mutex18internal_constructEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock17upgrade_to_writerEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock19downgrade_to_readerEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock7acquireERS0_b; +_ZN3tbb16queuing_rw_mutex11scoped_lock7releaseEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock11try_acquireERS0_b; + +#if !TBB_NO_LEGACY +/* spin_rw_mutex.cpp v2 */ +_ZN3tbb13spin_rw_mutex16internal_upgradeEPS0_; +_ZN3tbb13spin_rw_mutex22internal_itt_releasingEPS0_; +_ZN3tbb13spin_rw_mutex23internal_acquire_readerEPS0_; +_ZN3tbb13spin_rw_mutex23internal_acquire_writerEPS0_; +_ZN3tbb13spin_rw_mutex18internal_downgradeEPS0_; +_ZN3tbb13spin_rw_mutex23internal_release_readerEPS0_; +_ZN3tbb13spin_rw_mutex23internal_release_writerEPS0_; +_ZN3tbb13spin_rw_mutex27internal_try_acquire_readerEPS0_; +_ZN3tbb13spin_rw_mutex27internal_try_acquire_writerEPS0_; +#endif + +/* spin_rw_mutex v3 */ +_ZN3tbb16spin_rw_mutex_v318internal_constructEv; +_ZN3tbb16spin_rw_mutex_v316internal_upgradeEv; +_ZN3tbb16spin_rw_mutex_v318internal_downgradeEv; +_ZN3tbb16spin_rw_mutex_v323internal_acquire_readerEv; +_ZN3tbb16spin_rw_mutex_v323internal_acquire_writerEv; +_ZN3tbb16spin_rw_mutex_v323internal_release_readerEv; +_ZN3tbb16spin_rw_mutex_v323internal_release_writerEv; +_ZN3tbb16spin_rw_mutex_v327internal_try_acquire_readerEv; +_ZN3tbb16spin_rw_mutex_v327internal_try_acquire_writerEv; + +/* spin_mutex.cpp */ +_ZN3tbb10spin_mutex11scoped_lock16internal_acquireERS0_; +_ZN3tbb10spin_mutex11scoped_lock16internal_releaseEv; +_ZN3tbb10spin_mutex11scoped_lock20internal_try_acquireERS0_; +_ZN3tbb10spin_mutex18internal_constructEv; + +/* mutex.cpp */ +_ZN3tbb5mutex11scoped_lock16internal_acquireERS0_; +_ZN3tbb5mutex11scoped_lock16internal_releaseEv; +_ZN3tbb5mutex11scoped_lock20internal_try_acquireERS0_; +_ZN3tbb5mutex16internal_destroyEv; +_ZN3tbb5mutex18internal_constructEv; + +/* recursive_mutex.cpp */ +_ZN3tbb15recursive_mutex11scoped_lock16internal_acquireERS0_; +_ZN3tbb15recursive_mutex11scoped_lock16internal_releaseEv; +_ZN3tbb15recursive_mutex11scoped_lock20internal_try_acquireERS0_; +_ZN3tbb15recursive_mutex16internal_destroyEv; +_ZN3tbb15recursive_mutex18internal_constructEv; + +/* QueuingMutex.cpp */ +_ZN3tbb13queuing_mutex18internal_constructEv; +_ZN3tbb13queuing_mutex11scoped_lock7acquireERS0_; +_ZN3tbb13queuing_mutex11scoped_lock7releaseEv; +_ZN3tbb13queuing_mutex11scoped_lock11try_acquireERS0_; + +#if !TBB_NO_LEGACY +/* concurrent_hash_map */ +_ZNK3tbb8internal21hash_map_segment_base23internal_grow_predicateEv; + +/* concurrent_queue.cpp v2 */ +_ZN3tbb8internal21concurrent_queue_base12internal_popEPv; +_ZN3tbb8internal21concurrent_queue_base13internal_pushEPKv; +_ZN3tbb8internal21concurrent_queue_base21internal_set_capacityElm; +_ZN3tbb8internal21concurrent_queue_base23internal_pop_if_presentEPv; 
+_ZN3tbb8internal21concurrent_queue_base25internal_push_if_not_fullEPKv; +_ZN3tbb8internal21concurrent_queue_baseC2Em; +_ZN3tbb8internal21concurrent_queue_baseD2Ev; +_ZTIN3tbb8internal21concurrent_queue_baseE; +_ZTSN3tbb8internal21concurrent_queue_baseE; +_ZTVN3tbb8internal21concurrent_queue_baseE; +_ZN3tbb8internal30concurrent_queue_iterator_base6assignERKS1_; +_ZN3tbb8internal30concurrent_queue_iterator_base7advanceEv; +_ZN3tbb8internal30concurrent_queue_iterator_baseC2ERKNS0_21concurrent_queue_baseE; +_ZN3tbb8internal30concurrent_queue_iterator_baseD2Ev; +_ZNK3tbb8internal21concurrent_queue_base13internal_sizeEv; +#endif + +/* concurrent_queue v3 */ +/* constructors */ +_ZN3tbb8internal24concurrent_queue_base_v3C2Em; +_ZN3tbb8internal33concurrent_queue_iterator_base_v3C2ERKNS0_24concurrent_queue_base_v3E; +/* destructors */ +_ZN3tbb8internal24concurrent_queue_base_v3D2Ev; +_ZN3tbb8internal33concurrent_queue_iterator_base_v3D2Ev; +/* typeinfo */ +_ZTIN3tbb8internal24concurrent_queue_base_v3E; +_ZTSN3tbb8internal24concurrent_queue_base_v3E; +/* vtable */ +_ZTVN3tbb8internal24concurrent_queue_base_v3E; +/* methods */ +_ZN3tbb8internal33concurrent_queue_iterator_base_v36assignERKS1_; +_ZN3tbb8internal33concurrent_queue_iterator_base_v37advanceEv; +_ZN3tbb8internal24concurrent_queue_base_v313internal_pushEPKv; +_ZN3tbb8internal24concurrent_queue_base_v325internal_push_if_not_fullEPKv; +_ZN3tbb8internal24concurrent_queue_base_v312internal_popEPv; +_ZN3tbb8internal24concurrent_queue_base_v323internal_pop_if_presentEPv; +_ZN3tbb8internal24concurrent_queue_base_v321internal_finish_clearEv; +_ZN3tbb8internal24concurrent_queue_base_v321internal_set_capacityElm; +_ZNK3tbb8internal24concurrent_queue_base_v313internal_sizeEv; +_ZNK3tbb8internal24concurrent_queue_base_v314internal_emptyEv; +_ZNK3tbb8internal24concurrent_queue_base_v324internal_throw_exceptionEv; +_ZN3tbb8internal24concurrent_queue_base_v36assignERKS1_; + +#if !TBB_NO_LEGACY +/* concurrent_vector.cpp v2 */ +_ZN3tbb8internal22concurrent_vector_base13internal_copyERKS1_mPFvPvPKvmE; +_ZN3tbb8internal22concurrent_vector_base14internal_clearEPFvPvmEb; +_ZN3tbb8internal22concurrent_vector_base15internal_assignERKS1_mPFvPvmEPFvS4_PKvmESA_; +_ZN3tbb8internal22concurrent_vector_base16internal_grow_byEmmPFvPvmE; +_ZN3tbb8internal22concurrent_vector_base16internal_reserveEmmm; +_ZN3tbb8internal22concurrent_vector_base18internal_push_backEmRm; +_ZN3tbb8internal22concurrent_vector_base25internal_grow_to_at_leastEmmPFvPvmE; +_ZNK3tbb8internal22concurrent_vector_base17internal_capacityEv; +#endif + +/* concurrent_vector v3 */ +_ZN3tbb8internal25concurrent_vector_base_v313internal_copyERKS1_mPFvPvPKvmE; +_ZN3tbb8internal25concurrent_vector_base_v314internal_clearEPFvPvmE; +_ZN3tbb8internal25concurrent_vector_base_v315internal_assignERKS1_mPFvPvmEPFvS4_PKvmESA_; +_ZN3tbb8internal25concurrent_vector_base_v316internal_grow_byEmmPFvPvPKvmES4_; +_ZN3tbb8internal25concurrent_vector_base_v316internal_reserveEmmm; +_ZN3tbb8internal25concurrent_vector_base_v318internal_push_backEmRm; +_ZN3tbb8internal25concurrent_vector_base_v325internal_grow_to_at_leastEmmPFvPvPKvmES4_; +_ZNK3tbb8internal25concurrent_vector_base_v317internal_capacityEv; +_ZN3tbb8internal25concurrent_vector_base_v316internal_compactEmPvPFvS2_mEPFvS2_PKvmE; +_ZN3tbb8internal25concurrent_vector_base_v313internal_swapERS1_; +_ZNK3tbb8internal25concurrent_vector_base_v324internal_throw_exceptionEm; +_ZN3tbb8internal25concurrent_vector_base_v3D2Ev; 
+_ZN3tbb8internal25concurrent_vector_base_v315internal_resizeEmmmPKvPFvPvmEPFvS4_S3_mE; +_ZN3tbb8internal25concurrent_vector_base_v337internal_grow_to_at_least_with_resultEmmPFvPvPKvmES4_; + +/* tbb_thread */ +_ZN3tbb8internal13tbb_thread_v320hardware_concurrencyEv; +_ZN3tbb8internal13tbb_thread_v36detachEv; +_ZN3tbb8internal16thread_get_id_v3Ev; +_ZN3tbb8internal15free_closure_v3EPv; +_ZN3tbb8internal13tbb_thread_v34joinEv; +_ZN3tbb8internal13tbb_thread_v314internal_startEPFPvS2_ES2_; +_ZN3tbb8internal19allocate_closure_v3Em; +_ZN3tbb8internal7move_v3ERNS0_13tbb_thread_v3ES2_; +_ZN3tbb8internal15thread_yield_v3Ev; +_ZN3tbb8internal15thread_sleep_v3ERKNS_10tick_count10interval_tE; + +local: + +/* TBB symbols */ +*3tbb*; +*__TBB*; + +/* Intel Compiler (libirc) symbols */ +__intel_*; +_intel_*; +get_msg_buf; +get_text_buf; +message_catalog; +print_buf; +irc__get_msg; +irc__print; + +}; diff --git a/dep/tbb/src/tbb/lin64ipf-tbb-export.def b/dep/tbb/src/tbb/lin64ipf-tbb-export.def new file mode 100644 index 000000000..22514d8f2 --- /dev/null +++ b/dep/tbb/src/tbb/lin64ipf-tbb-export.def @@ -0,0 +1,355 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#include "tbb/tbb_config.h" + +{ +global: + +/* cache_aligned_allocator.cpp */ +_ZN3tbb8internal12NFS_AllocateEmmPv; +_ZN3tbb8internal15NFS_GetLineSizeEv; +_ZN3tbb8internal8NFS_FreeEPv; +_ZN3tbb8internal23allocate_via_handler_v3Em; +_ZN3tbb8internal25deallocate_via_handler_v3EPv; +_ZN3tbb8internal17is_malloc_used_v3Ev; + +/* task.cpp v3 */ +_ZN3tbb4task13note_affinityEt; +_ZN3tbb4task22internal_set_ref_countEi; +_ZN3tbb4task28internal_decrement_ref_countEv; +_ZN3tbb4task22spawn_and_wait_for_allERNS_9task_listE; +_ZN3tbb4task4selfEv; +_ZN3tbb4task7destroyERS0_; +_ZNK3tbb4task26is_owned_by_current_threadEv; +_ZN3tbb8internal19allocate_root_proxy4freeERNS_4taskE; +_ZN3tbb8internal19allocate_root_proxy8allocateEm; +_ZN3tbb8internal28affinity_partitioner_base_v36resizeEj; +_ZNK3tbb8internal20allocate_child_proxy4freeERNS_4taskE; +_ZNK3tbb8internal20allocate_child_proxy8allocateEm; +_ZNK3tbb8internal27allocate_continuation_proxy4freeERNS_4taskE; +_ZNK3tbb8internal27allocate_continuation_proxy8allocateEm; +_ZNK3tbb8internal34allocate_additional_child_of_proxy4freeERNS_4taskE; +_ZNK3tbb8internal34allocate_additional_child_of_proxy8allocateEm; +_ZTIN3tbb4taskE; +_ZTSN3tbb4taskE; +_ZTVN3tbb4taskE; +_ZN3tbb19task_scheduler_init19default_num_threadsEv; +_ZN3tbb19task_scheduler_init10initializeEim; +_ZN3tbb19task_scheduler_init10initializeEi; +_ZN3tbb19task_scheduler_init9terminateEv; +_ZN3tbb8internal26task_scheduler_observer_v37observeEb; +_ZN3tbb10empty_task7executeEv; +_ZN3tbb10empty_taskD0Ev; +_ZN3tbb10empty_taskD1Ev; +_ZTIN3tbb10empty_taskE; +_ZTSN3tbb10empty_taskE; +_ZTVN3tbb10empty_taskE; + +/* exception handling support */ +#if __TBB_EXCEPTIONS +_ZNK3tbb8internal32allocate_root_with_context_proxy8allocateEm; +_ZNK3tbb8internal32allocate_root_with_context_proxy4freeERNS_4taskE; +_ZNK3tbb18task_group_context28is_group_execution_cancelledEv; +_ZN3tbb18task_group_context22cancel_group_executionEv; +_ZN3tbb18task_group_context26register_pending_exceptionEv; +_ZN3tbb18task_group_context5resetEv; +_ZN3tbb18task_group_context4initEv; +_ZN3tbb18task_group_contextD1Ev; +_ZN3tbb18task_group_contextD2Ev; +_ZNK3tbb18captured_exception4nameEv; +_ZNK3tbb18captured_exception4whatEv; +_ZN3tbb18captured_exception10throw_selfEv; +_ZN3tbb18captured_exception3setEPKcS2_; +_ZN3tbb18captured_exception4moveEv; +_ZN3tbb18captured_exception5clearEv; +_ZN3tbb18captured_exception7destroyEv; +_ZN3tbb18captured_exception8allocateEPKcS2_; +_ZN3tbb18captured_exceptionD0Ev; +_ZN3tbb18captured_exceptionD1Ev; +_ZTIN3tbb18captured_exceptionE; +_ZTSN3tbb18captured_exceptionE; +_ZTVN3tbb18captured_exceptionE; +_ZN3tbb13tbb_exceptionD2Ev; +_ZTIN3tbb13tbb_exceptionE; +_ZTSN3tbb13tbb_exceptionE; +_ZTVN3tbb13tbb_exceptionE; +_ZN3tbb14bad_last_allocD0Ev; +_ZN3tbb14bad_last_allocD1Ev; +_ZNK3tbb14bad_last_alloc4whatEv; +_ZTIN3tbb14bad_last_allocE; +_ZTSN3tbb14bad_last_allocE; +_ZTVN3tbb14bad_last_allocE; +#endif /* __TBB_EXCEPTIONS */ + +/* tbb_misc.cpp */ +_ZN3tbb17assertion_failureEPKciS1_S1_; +_ZN3tbb21set_assertion_handlerEPFvPKciS1_S1_E; +_ZN3tbb8internal36get_initial_auto_partitioner_divisorEv; +_ZN3tbb8internal13handle_perrorEiPKc; +_ZN3tbb8internal15runtime_warningEPKcz; +TBB_runtime_interface_version; +_ZN3tbb8internal33throw_bad_last_alloc_exception_v4Ev; + +/* itt_notify.cpp */ +_ZN3tbb8internal32itt_load_pointer_with_acquire_v3EPKv; +_ZN3tbb8internal33itt_store_pointer_with_release_v3EPvS1_; +_ZN3tbb8internal20itt_set_sync_name_v3EPvPKc; +_ZN3tbb8internal19itt_load_pointer_v3EPKv; + +/* pipeline.cpp */ 
+_ZTIN3tbb6filterE; +_ZTSN3tbb6filterE; +_ZTVN3tbb6filterE; +_ZN3tbb6filterD2Ev; +_ZN3tbb8pipeline10add_filterERNS_6filterE; +_ZN3tbb8pipeline12inject_tokenERNS_4taskE; +_ZN3tbb8pipeline13remove_filterERNS_6filterE; +_ZN3tbb8pipeline3runEm; +#if __TBB_EXCEPTIONS +_ZN3tbb8pipeline3runEmRNS_18task_group_contextE; +#endif +_ZN3tbb8pipeline5clearEv; +_ZN3tbb19thread_bound_filter12process_itemEv; +_ZN3tbb19thread_bound_filter16try_process_itemEv; +_ZTIN3tbb8pipelineE; +_ZTSN3tbb8pipelineE; +_ZTVN3tbb8pipelineE; +_ZN3tbb8pipelineC1Ev; +_ZN3tbb8pipelineC2Ev; +_ZN3tbb8pipelineD0Ev; +_ZN3tbb8pipelineD1Ev; +_ZN3tbb8pipelineD2Ev; + +/* queuing_rw_mutex.cpp */ +_ZN3tbb16queuing_rw_mutex18internal_constructEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock17upgrade_to_writerEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock19downgrade_to_readerEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock7acquireERS0_b; +_ZN3tbb16queuing_rw_mutex11scoped_lock7releaseEv; +_ZN3tbb16queuing_rw_mutex11scoped_lock11try_acquireERS0_b; + +#if !TBB_NO_LEGACY +/* spin_rw_mutex.cpp v2 */ +_ZN3tbb13spin_rw_mutex16internal_upgradeEPS0_; +_ZN3tbb13spin_rw_mutex22internal_itt_releasingEPS0_; +_ZN3tbb13spin_rw_mutex23internal_acquire_readerEPS0_; +_ZN3tbb13spin_rw_mutex23internal_acquire_writerEPS0_; +_ZN3tbb13spin_rw_mutex18internal_downgradeEPS0_; +_ZN3tbb13spin_rw_mutex23internal_release_readerEPS0_; +_ZN3tbb13spin_rw_mutex23internal_release_writerEPS0_; +_ZN3tbb13spin_rw_mutex27internal_try_acquire_readerEPS0_; +_ZN3tbb13spin_rw_mutex27internal_try_acquire_writerEPS0_; +#endif + +/* spin_rw_mutex v3 */ +_ZN3tbb16spin_rw_mutex_v318internal_constructEv; +_ZN3tbb16spin_rw_mutex_v316internal_upgradeEv; +_ZN3tbb16spin_rw_mutex_v318internal_downgradeEv; +_ZN3tbb16spin_rw_mutex_v323internal_acquire_readerEv; +_ZN3tbb16spin_rw_mutex_v323internal_acquire_writerEv; +_ZN3tbb16spin_rw_mutex_v323internal_release_readerEv; +_ZN3tbb16spin_rw_mutex_v323internal_release_writerEv; +_ZN3tbb16spin_rw_mutex_v327internal_try_acquire_readerEv; +_ZN3tbb16spin_rw_mutex_v327internal_try_acquire_writerEv; + +/* spin_mutex.cpp */ +_ZN3tbb10spin_mutex18internal_constructEv; +_ZN3tbb10spin_mutex11scoped_lock16internal_acquireERS0_; +_ZN3tbb10spin_mutex11scoped_lock16internal_releaseEv; +_ZN3tbb10spin_mutex11scoped_lock20internal_try_acquireERS0_; + +/* mutex.cpp */ +_ZN3tbb5mutex11scoped_lock16internal_acquireERS0_; +_ZN3tbb5mutex11scoped_lock16internal_releaseEv; +_ZN3tbb5mutex11scoped_lock20internal_try_acquireERS0_; +_ZN3tbb5mutex16internal_destroyEv; +_ZN3tbb5mutex18internal_constructEv; + +/* recursive_mutex.cpp */ +_ZN3tbb15recursive_mutex11scoped_lock16internal_acquireERS0_; +_ZN3tbb15recursive_mutex11scoped_lock16internal_releaseEv; +_ZN3tbb15recursive_mutex11scoped_lock20internal_try_acquireERS0_; +_ZN3tbb15recursive_mutex16internal_destroyEv; +_ZN3tbb15recursive_mutex18internal_constructEv; + +/* QueuingMutex.cpp */ +_ZN3tbb13queuing_mutex18internal_constructEv; +_ZN3tbb13queuing_mutex11scoped_lock7acquireERS0_; +_ZN3tbb13queuing_mutex11scoped_lock7releaseEv; +_ZN3tbb13queuing_mutex11scoped_lock11try_acquireERS0_; + +#if !TBB_NO_LEGACY +/* concurrent_hash_map */ +_ZNK3tbb8internal21hash_map_segment_base23internal_grow_predicateEv; + +/* concurrent_queue.cpp v2 */ +_ZN3tbb8internal21concurrent_queue_base12internal_popEPv; +_ZN3tbb8internal21concurrent_queue_base13internal_pushEPKv; +_ZN3tbb8internal21concurrent_queue_base21internal_set_capacityElm; +_ZN3tbb8internal21concurrent_queue_base23internal_pop_if_presentEPv; 
+_ZN3tbb8internal21concurrent_queue_base25internal_push_if_not_fullEPKv; +_ZN3tbb8internal21concurrent_queue_baseC2Em; +_ZN3tbb8internal21concurrent_queue_baseD2Ev; +_ZTIN3tbb8internal21concurrent_queue_baseE; +_ZTSN3tbb8internal21concurrent_queue_baseE; +_ZTVN3tbb8internal21concurrent_queue_baseE; +_ZN3tbb8internal30concurrent_queue_iterator_base6assignERKS1_; +_ZN3tbb8internal30concurrent_queue_iterator_base7advanceEv; +_ZN3tbb8internal30concurrent_queue_iterator_baseC2ERKNS0_21concurrent_queue_baseE; +_ZN3tbb8internal30concurrent_queue_iterator_baseD2Ev; +_ZNK3tbb8internal21concurrent_queue_base13internal_sizeEv; +#endif + +/* concurrent_queue v3 */ +/* constructors */ +_ZN3tbb8internal24concurrent_queue_base_v3C2Em; +_ZN3tbb8internal33concurrent_queue_iterator_base_v3C2ERKNS0_24concurrent_queue_base_v3E; +/* destructors */ +_ZN3tbb8internal24concurrent_queue_base_v3D2Ev; +_ZN3tbb8internal33concurrent_queue_iterator_base_v3D2Ev; +/* typeinfo */ +_ZTIN3tbb8internal24concurrent_queue_base_v3E; +_ZTSN3tbb8internal24concurrent_queue_base_v3E; +/* vtable */ +_ZTVN3tbb8internal24concurrent_queue_base_v3E; +/* methods */ +_ZN3tbb8internal33concurrent_queue_iterator_base_v36assignERKS1_; +_ZN3tbb8internal33concurrent_queue_iterator_base_v37advanceEv; +_ZN3tbb8internal24concurrent_queue_base_v313internal_pushEPKv; +_ZN3tbb8internal24concurrent_queue_base_v325internal_push_if_not_fullEPKv; +_ZN3tbb8internal24concurrent_queue_base_v312internal_popEPv; +_ZN3tbb8internal24concurrent_queue_base_v323internal_pop_if_presentEPv; +_ZN3tbb8internal24concurrent_queue_base_v321internal_finish_clearEv; +_ZN3tbb8internal24concurrent_queue_base_v321internal_set_capacityElm; +_ZNK3tbb8internal24concurrent_queue_base_v313internal_sizeEv; +_ZNK3tbb8internal24concurrent_queue_base_v314internal_emptyEv; +_ZNK3tbb8internal24concurrent_queue_base_v324internal_throw_exceptionEv; +_ZN3tbb8internal24concurrent_queue_base_v36assignERKS1_; + +#if !TBB_NO_LEGACY +/* concurrent_vector.cpp v2 */ +_ZN3tbb8internal22concurrent_vector_base13internal_copyERKS1_mPFvPvPKvmE; +_ZN3tbb8internal22concurrent_vector_base14internal_clearEPFvPvmEb; +_ZN3tbb8internal22concurrent_vector_base15internal_assignERKS1_mPFvPvmEPFvS4_PKvmESA_; +_ZN3tbb8internal22concurrent_vector_base16internal_grow_byEmmPFvPvmE; +_ZN3tbb8internal22concurrent_vector_base16internal_reserveEmmm; +_ZN3tbb8internal22concurrent_vector_base18internal_push_backEmRm; +_ZN3tbb8internal22concurrent_vector_base25internal_grow_to_at_leastEmmPFvPvmE; +_ZNK3tbb8internal22concurrent_vector_base17internal_capacityEv; +#endif + +/* concurrent_vector v3 */ +_ZN3tbb8internal25concurrent_vector_base_v313internal_copyERKS1_mPFvPvPKvmE; +_ZN3tbb8internal25concurrent_vector_base_v314internal_clearEPFvPvmE; +_ZN3tbb8internal25concurrent_vector_base_v315internal_assignERKS1_mPFvPvmEPFvS4_PKvmESA_; +_ZN3tbb8internal25concurrent_vector_base_v316internal_grow_byEmmPFvPvPKvmES4_; +_ZN3tbb8internal25concurrent_vector_base_v316internal_reserveEmmm; +_ZN3tbb8internal25concurrent_vector_base_v318internal_push_backEmRm; +_ZN3tbb8internal25concurrent_vector_base_v325internal_grow_to_at_leastEmmPFvPvPKvmES4_; +_ZNK3tbb8internal25concurrent_vector_base_v317internal_capacityEv; +_ZN3tbb8internal25concurrent_vector_base_v316internal_compactEmPvPFvS2_mEPFvS2_PKvmE; +_ZN3tbb8internal25concurrent_vector_base_v313internal_swapERS1_; +_ZNK3tbb8internal25concurrent_vector_base_v324internal_throw_exceptionEm; +_ZN3tbb8internal25concurrent_vector_base_v3D2Ev; 
+_ZN3tbb8internal25concurrent_vector_base_v315internal_resizeEmmmPKvPFvPvmEPFvS4_S3_mE; +_ZN3tbb8internal25concurrent_vector_base_v337internal_grow_to_at_least_with_resultEmmPFvPvPKvmES4_; + +/* tbb_thread */ +_ZN3tbb8internal13tbb_thread_v320hardware_concurrencyEv; +_ZN3tbb8internal13tbb_thread_v36detachEv; +_ZN3tbb8internal16thread_get_id_v3Ev; +_ZN3tbb8internal15free_closure_v3EPv; +_ZN3tbb8internal13tbb_thread_v34joinEv; +_ZN3tbb8internal13tbb_thread_v314internal_startEPFPvS2_ES2_; +_ZN3tbb8internal19allocate_closure_v3Em; +_ZN3tbb8internal7move_v3ERNS0_13tbb_thread_v3ES2_; +_ZN3tbb8internal15thread_yield_v3Ev; +_ZN3tbb8internal15thread_sleep_v3ERKNS_10tick_count10interval_tE; + +/* asm functions */ +__TBB_machine_fetchadd1__TBB_full_fence; +__TBB_machine_fetchadd2__TBB_full_fence; +__TBB_machine_fetchadd4__TBB_full_fence; +__TBB_machine_fetchadd8__TBB_full_fence; +__TBB_machine_fetchstore1__TBB_full_fence; +__TBB_machine_fetchstore2__TBB_full_fence; +__TBB_machine_fetchstore4__TBB_full_fence; +__TBB_machine_fetchstore8__TBB_full_fence; +__TBB_machine_fetchadd1acquire; +__TBB_machine_fetchadd1release; +__TBB_machine_fetchadd2acquire; +__TBB_machine_fetchadd2release; +__TBB_machine_fetchadd4acquire; +__TBB_machine_fetchadd4release; +__TBB_machine_fetchadd8acquire; +__TBB_machine_fetchadd8release; +__TBB_machine_fetchstore1acquire; +__TBB_machine_fetchstore1release; +__TBB_machine_fetchstore2acquire; +__TBB_machine_fetchstore2release; +__TBB_machine_fetchstore4acquire; +__TBB_machine_fetchstore4release; +__TBB_machine_fetchstore8acquire; +__TBB_machine_fetchstore8release; +__TBB_machine_cmpswp1acquire; +__TBB_machine_cmpswp1release; +__TBB_machine_cmpswp1__TBB_full_fence; +__TBB_machine_cmpswp2acquire; +__TBB_machine_cmpswp2release; +__TBB_machine_cmpswp2__TBB_full_fence; +__TBB_machine_cmpswp4acquire; +__TBB_machine_cmpswp4release; +__TBB_machine_cmpswp4__TBB_full_fence; +__TBB_machine_cmpswp8acquire; +__TBB_machine_cmpswp8release; +__TBB_machine_cmpswp8__TBB_full_fence; +__TBB_machine_lg; +__TBB_machine_lockbyte; +__TBB_machine_pause; +__TBB_machine_trylockbyte; + +local: + +/* TBB symbols */ +*3tbb*; +*__TBB*; + +/* Intel Compiler (libirc) symbols */ +__intel_*; +_intel_*; +?0_memcopyA; +?0_memcopyDu; +?0_memcpyD; +?1__memcpy; +?1__memmove; +?1__serial_memmove; +memcpy; +memset; + +}; diff --git a/dep/tbb/src/tbb/mac32-tbb-export.def b/dep/tbb/src/tbb/mac32-tbb-export.def new file mode 100644 index 000000000..9366805e0 --- /dev/null +++ b/dep/tbb/src/tbb/mac32-tbb-export.def @@ -0,0 +1,294 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. 
Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +# cache_aligned_allocator.cpp +__ZN3tbb8internal12NFS_AllocateEmmPv +__ZN3tbb8internal15NFS_GetLineSizeEv +__ZN3tbb8internal8NFS_FreeEPv +__ZN3tbb8internal23allocate_via_handler_v3Em +__ZN3tbb8internal25deallocate_via_handler_v3EPv +__ZN3tbb8internal17is_malloc_used_v3Ev + +# task.cpp v3 +__ZN3tbb4task13note_affinityEt +__ZN3tbb4task22internal_set_ref_countEi +__ZN3tbb4task28internal_decrement_ref_countEv +__ZN3tbb4task22spawn_and_wait_for_allERNS_9task_listE +__ZN3tbb4task4selfEv +__ZN3tbb4task7destroyERS0_ +__ZNK3tbb4task26is_owned_by_current_threadEv +__ZN3tbb8internal19allocate_root_proxy4freeERNS_4taskE +__ZN3tbb8internal19allocate_root_proxy8allocateEm +__ZN3tbb8internal28affinity_partitioner_base_v36resizeEj +__ZN3tbb8internal36get_initial_auto_partitioner_divisorEv +__ZNK3tbb8internal20allocate_child_proxy4freeERNS_4taskE +__ZNK3tbb8internal20allocate_child_proxy8allocateEm +__ZNK3tbb8internal27allocate_continuation_proxy4freeERNS_4taskE +__ZNK3tbb8internal27allocate_continuation_proxy8allocateEm +__ZNK3tbb8internal34allocate_additional_child_of_proxy4freeERNS_4taskE +__ZNK3tbb8internal34allocate_additional_child_of_proxy8allocateEm +__ZTIN3tbb4taskE +__ZTSN3tbb4taskE +__ZTVN3tbb4taskE +__ZN3tbb19task_scheduler_init19default_num_threadsEv +__ZN3tbb19task_scheduler_init10initializeEim +__ZN3tbb19task_scheduler_init10initializeEi +__ZN3tbb19task_scheduler_init9terminateEv +__ZN3tbb8internal26task_scheduler_observer_v37observeEb +__ZN3tbb10empty_task7executeEv +__ZN3tbb10empty_taskD0Ev +__ZN3tbb10empty_taskD1Ev +__ZTIN3tbb10empty_taskE +__ZTSN3tbb10empty_taskE +__ZTVN3tbb10empty_taskE + +# exception handling support +__ZNK3tbb8internal32allocate_root_with_context_proxy8allocateEm +__ZNK3tbb8internal32allocate_root_with_context_proxy4freeERNS_4taskE +__ZNK3tbb18task_group_context28is_group_execution_cancelledEv +__ZN3tbb18task_group_context22cancel_group_executionEv +__ZN3tbb18task_group_context26register_pending_exceptionEv +__ZN3tbb18task_group_context5resetEv +__ZN3tbb18task_group_context4initEv +__ZN3tbb18task_group_contextD1Ev +__ZN3tbb18task_group_contextD2Ev +__ZNK3tbb18captured_exception4nameEv +__ZNK3tbb18captured_exception4whatEv +__ZN3tbb18captured_exception10throw_selfEv +__ZN3tbb18captured_exception3setEPKcS2_ +__ZN3tbb18captured_exception4moveEv +__ZN3tbb18captured_exception5clearEv +__ZN3tbb18captured_exception7destroyEv +__ZN3tbb18captured_exception8allocateEPKcS2_ +__ZN3tbb18captured_exceptionD0Ev +__ZN3tbb18captured_exceptionD1Ev +__ZTIN3tbb18captured_exceptionE +__ZTSN3tbb18captured_exceptionE +__ZTVN3tbb18captured_exceptionE +__ZTIN3tbb13tbb_exceptionE +__ZTSN3tbb13tbb_exceptionE +__ZTVN3tbb13tbb_exceptionE +__ZN3tbb14bad_last_allocD0Ev +__ZN3tbb14bad_last_allocD1Ev +__ZNK3tbb14bad_last_alloc4whatEv +__ZTIN3tbb14bad_last_allocE +__ZTSN3tbb14bad_last_allocE +__ZTVN3tbb14bad_last_allocE + +# Symbols for std exception classes thrown from TBB +__ZNSt11range_errorD1Ev +__ZNSt12length_errorD1Ev +__ZNSt12out_of_rangeD1Ev +__ZTISt11range_error +__ZTISt12length_error +__ZTISt12out_of_range +__ZTSSt11range_error +__ZTSSt12length_error 
+__ZTSSt12out_of_range + +# tbb_misc.cpp +__ZN3tbb17assertion_failureEPKciS1_S1_ +__ZN3tbb21set_assertion_handlerEPFvPKciS1_S1_E +__ZN3tbb8internal13handle_perrorEiPKc +__ZN3tbb8internal15runtime_warningEPKcz +___TBB_machine_store8_slow_perf_warning +___TBB_machine_store8_slow +_TBB_runtime_interface_version +__ZN3tbb8internal33throw_bad_last_alloc_exception_v4Ev + +# itt_notify.cpp +__ZN3tbb8internal32itt_load_pointer_with_acquire_v3EPKv +__ZN3tbb8internal33itt_store_pointer_with_release_v3EPvS1_ +__ZN3tbb8internal19itt_load_pointer_v3EPKv +__ZN3tbb8internal20itt_set_sync_name_v3EPvPKc + +# pipeline.cpp +__ZTIN3tbb6filterE +__ZTSN3tbb6filterE +__ZTVN3tbb6filterE +__ZN3tbb6filterD2Ev +__ZN3tbb8pipeline10add_filterERNS_6filterE +__ZN3tbb8pipeline12inject_tokenERNS_4taskE +__ZN3tbb8pipeline13remove_filterERNS_6filterE +__ZN3tbb8pipeline3runEm +__ZN3tbb8pipeline3runEmRNS_18task_group_contextE +__ZN3tbb8pipeline5clearEv +__ZN3tbb19thread_bound_filter12process_itemEv +__ZN3tbb19thread_bound_filter16try_process_itemEv +__ZN3tbb8pipelineC1Ev +__ZN3tbb8pipelineC2Ev +__ZN3tbb8pipelineD0Ev +__ZN3tbb8pipelineD1Ev +__ZN3tbb8pipelineD2Ev +__ZTIN3tbb8pipelineE +__ZTSN3tbb8pipelineE +__ZTVN3tbb8pipelineE + +# queuing_rw_mutex.cpp +__ZN3tbb16queuing_rw_mutex11scoped_lock17upgrade_to_writerEv +__ZN3tbb16queuing_rw_mutex11scoped_lock19downgrade_to_readerEv +__ZN3tbb16queuing_rw_mutex11scoped_lock7acquireERS0_b +__ZN3tbb16queuing_rw_mutex11scoped_lock7releaseEv +__ZN3tbb16queuing_rw_mutex11scoped_lock11try_acquireERS0_b +__ZN3tbb16queuing_rw_mutex18internal_constructEv + +#if !TBB_NO_LEGACY +# spin_rw_mutex.cpp v2 +__ZN3tbb13spin_rw_mutex16internal_upgradeEPS0_ +__ZN3tbb13spin_rw_mutex22internal_itt_releasingEPS0_ +__ZN3tbb13spin_rw_mutex23internal_acquire_readerEPS0_ +__ZN3tbb13spin_rw_mutex23internal_acquire_writerEPS0_ +__ZN3tbb13spin_rw_mutex18internal_downgradeEPS0_ +__ZN3tbb13spin_rw_mutex23internal_release_readerEPS0_ +__ZN3tbb13spin_rw_mutex23internal_release_writerEPS0_ +__ZN3tbb13spin_rw_mutex27internal_try_acquire_readerEPS0_ +__ZN3tbb13spin_rw_mutex27internal_try_acquire_writerEPS0_ +#endif + +# spin_rw_mutex v3 +__ZN3tbb16spin_rw_mutex_v316internal_upgradeEv +__ZN3tbb16spin_rw_mutex_v318internal_downgradeEv +__ZN3tbb16spin_rw_mutex_v323internal_acquire_readerEv +__ZN3tbb16spin_rw_mutex_v323internal_acquire_writerEv +__ZN3tbb16spin_rw_mutex_v323internal_release_readerEv +__ZN3tbb16spin_rw_mutex_v323internal_release_writerEv +__ZN3tbb16spin_rw_mutex_v327internal_try_acquire_readerEv +__ZN3tbb16spin_rw_mutex_v327internal_try_acquire_writerEv +__ZN3tbb16spin_rw_mutex_v318internal_constructEv + +# spin_mutex.cpp +__ZN3tbb10spin_mutex11scoped_lock16internal_acquireERS0_ +__ZN3tbb10spin_mutex11scoped_lock16internal_releaseEv +__ZN3tbb10spin_mutex11scoped_lock20internal_try_acquireERS0_ +__ZN3tbb10spin_mutex18internal_constructEv + +# mutex.cpp +__ZN3tbb5mutex11scoped_lock16internal_acquireERS0_ +__ZN3tbb5mutex11scoped_lock16internal_releaseEv +__ZN3tbb5mutex11scoped_lock20internal_try_acquireERS0_ +__ZN3tbb5mutex16internal_destroyEv +__ZN3tbb5mutex18internal_constructEv + +# recursive_mutex.cpp +__ZN3tbb15recursive_mutex11scoped_lock16internal_acquireERS0_ +__ZN3tbb15recursive_mutex11scoped_lock16internal_releaseEv +__ZN3tbb15recursive_mutex11scoped_lock20internal_try_acquireERS0_ +__ZN3tbb15recursive_mutex16internal_destroyEv +__ZN3tbb15recursive_mutex18internal_constructEv + +# queuing_mutex.cpp +__ZN3tbb13queuing_mutex11scoped_lock7acquireERS0_ +__ZN3tbb13queuing_mutex11scoped_lock7releaseEv 
+__ZN3tbb13queuing_mutex11scoped_lock11try_acquireERS0_ +__ZN3tbb13queuing_mutex18internal_constructEv + +#if !TBB_NO_LEGACY +# concurrent_hash_map +__ZNK3tbb8internal21hash_map_segment_base23internal_grow_predicateEv + +# concurrent_queue.cpp v2 +__ZN3tbb8internal21concurrent_queue_base12internal_popEPv +__ZN3tbb8internal21concurrent_queue_base13internal_pushEPKv +__ZN3tbb8internal21concurrent_queue_base21internal_set_capacityEim +__ZN3tbb8internal21concurrent_queue_base23internal_pop_if_presentEPv +__ZN3tbb8internal21concurrent_queue_base25internal_push_if_not_fullEPKv +__ZN3tbb8internal21concurrent_queue_baseC2Em +__ZN3tbb8internal21concurrent_queue_baseD2Ev +__ZTIN3tbb8internal21concurrent_queue_baseE +__ZTSN3tbb8internal21concurrent_queue_baseE +__ZTVN3tbb8internal21concurrent_queue_baseE +__ZN3tbb8internal30concurrent_queue_iterator_base6assignERKS1_ +__ZN3tbb8internal30concurrent_queue_iterator_base7advanceEv +__ZN3tbb8internal30concurrent_queue_iterator_baseC2ERKNS0_21concurrent_queue_baseE +__ZN3tbb8internal30concurrent_queue_iterator_baseD2Ev +__ZNK3tbb8internal21concurrent_queue_base13internal_sizeEv +#endif + +# concurrent_queue v3 +# constructors +__ZN3tbb8internal33concurrent_queue_iterator_base_v3C2ERKNS0_24concurrent_queue_base_v3E +__ZN3tbb8internal24concurrent_queue_base_v3C2Em +# destructors +__ZN3tbb8internal33concurrent_queue_iterator_base_v3D2Ev +__ZN3tbb8internal24concurrent_queue_base_v3D2Ev +# typeinfo +__ZTIN3tbb8internal24concurrent_queue_base_v3E +__ZTSN3tbb8internal24concurrent_queue_base_v3E +#vtable +__ZTVN3tbb8internal24concurrent_queue_base_v3E +# methods +__ZN3tbb8internal33concurrent_queue_iterator_base_v37advanceEv +__ZN3tbb8internal33concurrent_queue_iterator_base_v36assignERKS1_ +__ZN3tbb8internal24concurrent_queue_base_v313internal_pushEPKv +__ZN3tbb8internal24concurrent_queue_base_v325internal_push_if_not_fullEPKv +__ZN3tbb8internal24concurrent_queue_base_v312internal_popEPv +__ZN3tbb8internal24concurrent_queue_base_v323internal_pop_if_presentEPv +__ZN3tbb8internal24concurrent_queue_base_v321internal_set_capacityEim +__ZNK3tbb8internal24concurrent_queue_base_v313internal_sizeEv +__ZNK3tbb8internal24concurrent_queue_base_v314internal_emptyEv +__ZN3tbb8internal24concurrent_queue_base_v321internal_finish_clearEv +__ZNK3tbb8internal24concurrent_queue_base_v324internal_throw_exceptionEv +__ZN3tbb8internal24concurrent_queue_base_v36assignERKS1_ + +#if !TBB_NO_LEGACY +# concurrent_vector.cpp v2 +__ZN3tbb8internal22concurrent_vector_base13internal_copyERKS1_mPFvPvPKvmE +__ZN3tbb8internal22concurrent_vector_base14internal_clearEPFvPvmEb +__ZN3tbb8internal22concurrent_vector_base15internal_assignERKS1_mPFvPvmEPFvS4_PKvmESA_ +__ZN3tbb8internal22concurrent_vector_base16internal_grow_byEmmPFvPvmE +__ZN3tbb8internal22concurrent_vector_base16internal_reserveEmmm +__ZN3tbb8internal22concurrent_vector_base18internal_push_backEmRm +__ZN3tbb8internal22concurrent_vector_base25internal_grow_to_at_leastEmmPFvPvmE +__ZNK3tbb8internal22concurrent_vector_base17internal_capacityEv +#endif + +# concurrent_vector v3 +__ZN3tbb8internal25concurrent_vector_base_v313internal_copyERKS1_mPFvPvPKvmE +__ZN3tbb8internal25concurrent_vector_base_v314internal_clearEPFvPvmE +__ZN3tbb8internal25concurrent_vector_base_v315internal_assignERKS1_mPFvPvmEPFvS4_PKvmESA_ +__ZN3tbb8internal25concurrent_vector_base_v316internal_grow_byEmmPFvPvPKvmES4_ +__ZN3tbb8internal25concurrent_vector_base_v316internal_reserveEmmm +__ZN3tbb8internal25concurrent_vector_base_v318internal_push_backEmRm 
+__ZN3tbb8internal25concurrent_vector_base_v325internal_grow_to_at_leastEmmPFvPvPKvmES4_ +__ZNK3tbb8internal25concurrent_vector_base_v317internal_capacityEv +__ZN3tbb8internal25concurrent_vector_base_v316internal_compactEmPvPFvS2_mEPFvS2_PKvmE +__ZN3tbb8internal25concurrent_vector_base_v313internal_swapERS1_ +__ZNK3tbb8internal25concurrent_vector_base_v324internal_throw_exceptionEm +__ZN3tbb8internal25concurrent_vector_base_v3D2Ev +__ZN3tbb8internal25concurrent_vector_base_v315internal_resizeEmmmPKvPFvPvmEPFvS4_S3_mE +__ZN3tbb8internal25concurrent_vector_base_v337internal_grow_to_at_least_with_resultEmmPFvPvPKvmES4_ + +# tbb_thread +__ZN3tbb8internal13tbb_thread_v314internal_startEPFPvS2_ES2_ +__ZN3tbb8internal13tbb_thread_v320hardware_concurrencyEv +__ZN3tbb8internal13tbb_thread_v34joinEv +__ZN3tbb8internal13tbb_thread_v36detachEv +__ZN3tbb8internal15free_closure_v3EPv +__ZN3tbb8internal15thread_sleep_v3ERKNS_10tick_count10interval_tE +__ZN3tbb8internal15thread_yield_v3Ev +__ZN3tbb8internal16thread_get_id_v3Ev +__ZN3tbb8internal19allocate_closure_v3Em +__ZN3tbb8internal7move_v3ERNS0_13tbb_thread_v3ES2_ diff --git a/dep/tbb/src/tbb/mac64-tbb-export.def b/dep/tbb/src/tbb/mac64-tbb-export.def new file mode 100644 index 000000000..c91c8ceb7 --- /dev/null +++ b/dep/tbb/src/tbb/mac64-tbb-export.def @@ -0,0 +1,292 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. 
+ +# cache_aligned_allocator.cpp +__ZN3tbb8internal12NFS_AllocateEmmPv +__ZN3tbb8internal15NFS_GetLineSizeEv +__ZN3tbb8internal8NFS_FreeEPv +__ZN3tbb8internal23allocate_via_handler_v3Em +__ZN3tbb8internal25deallocate_via_handler_v3EPv +__ZN3tbb8internal17is_malloc_used_v3Ev + +# task.cpp v3 +__ZN3tbb4task13note_affinityEt +__ZN3tbb4task22internal_set_ref_countEi +__ZN3tbb4task28internal_decrement_ref_countEv +__ZN3tbb4task22spawn_and_wait_for_allERNS_9task_listE +__ZN3tbb4task4selfEv +__ZN3tbb4task7destroyERS0_ +__ZNK3tbb4task26is_owned_by_current_threadEv +__ZN3tbb8internal19allocate_root_proxy4freeERNS_4taskE +__ZN3tbb8internal19allocate_root_proxy8allocateEm +__ZN3tbb8internal28affinity_partitioner_base_v36resizeEj +__ZN3tbb8internal36get_initial_auto_partitioner_divisorEv +__ZNK3tbb8internal20allocate_child_proxy4freeERNS_4taskE +__ZNK3tbb8internal20allocate_child_proxy8allocateEm +__ZNK3tbb8internal27allocate_continuation_proxy4freeERNS_4taskE +__ZNK3tbb8internal27allocate_continuation_proxy8allocateEm +__ZNK3tbb8internal34allocate_additional_child_of_proxy4freeERNS_4taskE +__ZNK3tbb8internal34allocate_additional_child_of_proxy8allocateEm +__ZTIN3tbb4taskE +__ZTSN3tbb4taskE +__ZTVN3tbb4taskE +__ZN3tbb19task_scheduler_init19default_num_threadsEv +__ZN3tbb19task_scheduler_init10initializeEim +__ZN3tbb19task_scheduler_init10initializeEi +__ZN3tbb19task_scheduler_init9terminateEv +__ZN3tbb8internal26task_scheduler_observer_v37observeEb +__ZN3tbb10empty_task7executeEv +__ZN3tbb10empty_taskD0Ev +__ZN3tbb10empty_taskD1Ev +__ZTIN3tbb10empty_taskE +__ZTSN3tbb10empty_taskE +__ZTVN3tbb10empty_taskE + +# exception handling support +__ZNK3tbb8internal32allocate_root_with_context_proxy8allocateEm +__ZNK3tbb8internal32allocate_root_with_context_proxy4freeERNS_4taskE +__ZNK3tbb18task_group_context28is_group_execution_cancelledEv +__ZN3tbb18task_group_context22cancel_group_executionEv +__ZN3tbb18task_group_context26register_pending_exceptionEv +__ZN3tbb18task_group_context5resetEv +__ZN3tbb18task_group_context4initEv +__ZN3tbb18task_group_contextD1Ev +__ZN3tbb18task_group_contextD2Ev +__ZNK3tbb18captured_exception4nameEv +__ZNK3tbb18captured_exception4whatEv +__ZN3tbb18captured_exception10throw_selfEv +__ZN3tbb18captured_exception3setEPKcS2_ +__ZN3tbb18captured_exception4moveEv +__ZN3tbb18captured_exception5clearEv +__ZN3tbb18captured_exception7destroyEv +__ZN3tbb18captured_exception8allocateEPKcS2_ +__ZN3tbb18captured_exceptionD0Ev +__ZN3tbb18captured_exceptionD1Ev +__ZTIN3tbb18captured_exceptionE +__ZTSN3tbb18captured_exceptionE +__ZTVN3tbb18captured_exceptionE +__ZTIN3tbb13tbb_exceptionE +__ZTSN3tbb13tbb_exceptionE +__ZTVN3tbb13tbb_exceptionE +__ZN3tbb14bad_last_allocD0Ev +__ZN3tbb14bad_last_allocD1Ev +__ZNK3tbb14bad_last_alloc4whatEv +__ZTIN3tbb14bad_last_allocE +__ZTSN3tbb14bad_last_allocE +__ZTVN3tbb14bad_last_allocE + +# Symbols for std exception classes thrown from TBB +__ZNSt11range_errorD1Ev +__ZNSt12length_errorD1Ev +__ZNSt12out_of_rangeD1Ev +__ZTISt11range_error +__ZTISt12length_error +__ZTISt12out_of_range +__ZTSSt11range_error +__ZTSSt12length_error +__ZTSSt12out_of_range + +# tbb_misc.cpp +__ZN3tbb17assertion_failureEPKciS1_S1_ +__ZN3tbb21set_assertion_handlerEPFvPKciS1_S1_E +__ZN3tbb8internal13handle_perrorEiPKc +__ZN3tbb8internal15runtime_warningEPKcz +__ZN3tbb8internal33throw_bad_last_alloc_exception_v4Ev +_TBB_runtime_interface_version + +# itt_notify.cpp +__ZN3tbb8internal32itt_load_pointer_with_acquire_v3EPKv +__ZN3tbb8internal33itt_store_pointer_with_release_v3EPvS1_ 
+__ZN3tbb8internal19itt_load_pointer_v3EPKv +__ZN3tbb8internal20itt_set_sync_name_v3EPvPKc + +# pipeline.cpp +__ZTIN3tbb6filterE +__ZTSN3tbb6filterE +__ZTVN3tbb6filterE +__ZN3tbb6filterD2Ev +__ZN3tbb8pipeline10add_filterERNS_6filterE +__ZN3tbb8pipeline12inject_tokenERNS_4taskE +__ZN3tbb8pipeline13remove_filterERNS_6filterE +__ZN3tbb8pipeline3runEm +__ZN3tbb8pipeline3runEmRNS_18task_group_contextE +__ZN3tbb8pipeline5clearEv +__ZN3tbb19thread_bound_filter12process_itemEv +__ZN3tbb19thread_bound_filter16try_process_itemEv +__ZN3tbb8pipelineC1Ev +__ZN3tbb8pipelineC2Ev +__ZN3tbb8pipelineD0Ev +__ZN3tbb8pipelineD1Ev +__ZN3tbb8pipelineD2Ev +__ZTIN3tbb8pipelineE +__ZTSN3tbb8pipelineE +__ZTVN3tbb8pipelineE + +# queuing_rw_mutex.cpp +__ZN3tbb16queuing_rw_mutex11scoped_lock17upgrade_to_writerEv +__ZN3tbb16queuing_rw_mutex11scoped_lock19downgrade_to_readerEv +__ZN3tbb16queuing_rw_mutex11scoped_lock7acquireERS0_b +__ZN3tbb16queuing_rw_mutex11scoped_lock7releaseEv +__ZN3tbb16queuing_rw_mutex11scoped_lock11try_acquireERS0_b +__ZN3tbb16queuing_rw_mutex18internal_constructEv + +#if !TBB_NO_LEGACY +# spin_rw_mutex.cpp v2 +__ZN3tbb13spin_rw_mutex16internal_upgradeEPS0_ +__ZN3tbb13spin_rw_mutex22internal_itt_releasingEPS0_ +__ZN3tbb13spin_rw_mutex23internal_acquire_readerEPS0_ +__ZN3tbb13spin_rw_mutex23internal_acquire_writerEPS0_ +__ZN3tbb13spin_rw_mutex18internal_downgradeEPS0_ +__ZN3tbb13spin_rw_mutex23internal_release_readerEPS0_ +__ZN3tbb13spin_rw_mutex23internal_release_writerEPS0_ +__ZN3tbb13spin_rw_mutex27internal_try_acquire_readerEPS0_ +__ZN3tbb13spin_rw_mutex27internal_try_acquire_writerEPS0_ +#endif + +# spin_rw_mutex v3 +__ZN3tbb16spin_rw_mutex_v316internal_upgradeEv +__ZN3tbb16spin_rw_mutex_v318internal_downgradeEv +__ZN3tbb16spin_rw_mutex_v323internal_acquire_readerEv +__ZN3tbb16spin_rw_mutex_v323internal_acquire_writerEv +__ZN3tbb16spin_rw_mutex_v323internal_release_readerEv +__ZN3tbb16spin_rw_mutex_v323internal_release_writerEv +__ZN3tbb16spin_rw_mutex_v327internal_try_acquire_readerEv +__ZN3tbb16spin_rw_mutex_v327internal_try_acquire_writerEv +__ZN3tbb16spin_rw_mutex_v318internal_constructEv + +# spin_mutex.cpp +__ZN3tbb10spin_mutex11scoped_lock16internal_acquireERS0_ +__ZN3tbb10spin_mutex11scoped_lock16internal_releaseEv +__ZN3tbb10spin_mutex11scoped_lock20internal_try_acquireERS0_ +__ZN3tbb10spin_mutex18internal_constructEv + +# mutex.cpp +__ZN3tbb5mutex11scoped_lock16internal_acquireERS0_ +__ZN3tbb5mutex11scoped_lock16internal_releaseEv +__ZN3tbb5mutex11scoped_lock20internal_try_acquireERS0_ +__ZN3tbb5mutex16internal_destroyEv +__ZN3tbb5mutex18internal_constructEv + +# recursive_mutex.cpp +__ZN3tbb15recursive_mutex11scoped_lock16internal_acquireERS0_ +__ZN3tbb15recursive_mutex11scoped_lock16internal_releaseEv +__ZN3tbb15recursive_mutex11scoped_lock20internal_try_acquireERS0_ +__ZN3tbb15recursive_mutex16internal_destroyEv +__ZN3tbb15recursive_mutex18internal_constructEv + +# queuing_mutex.cpp +__ZN3tbb13queuing_mutex11scoped_lock7acquireERS0_ +__ZN3tbb13queuing_mutex11scoped_lock7releaseEv +__ZN3tbb13queuing_mutex11scoped_lock11try_acquireERS0_ +__ZN3tbb13queuing_mutex18internal_constructEv + +#if !TBB_NO_LEGACY +# concurrent_hash_map +__ZNK3tbb8internal21hash_map_segment_base23internal_grow_predicateEv + +# concurrent_queue.cpp v2 +__ZN3tbb8internal21concurrent_queue_base12internal_popEPv +__ZN3tbb8internal21concurrent_queue_base13internal_pushEPKv +__ZN3tbb8internal21concurrent_queue_base21internal_set_capacityElm +__ZN3tbb8internal21concurrent_queue_base23internal_pop_if_presentEPv 
+__ZN3tbb8internal21concurrent_queue_base25internal_push_if_not_fullEPKv +__ZN3tbb8internal21concurrent_queue_baseC2Em +__ZN3tbb8internal21concurrent_queue_baseD2Ev +__ZTIN3tbb8internal21concurrent_queue_baseE +__ZTSN3tbb8internal21concurrent_queue_baseE +__ZTVN3tbb8internal21concurrent_queue_baseE +__ZN3tbb8internal30concurrent_queue_iterator_base6assignERKS1_ +__ZN3tbb8internal30concurrent_queue_iterator_base7advanceEv +__ZN3tbb8internal30concurrent_queue_iterator_baseC2ERKNS0_21concurrent_queue_baseE +__ZN3tbb8internal30concurrent_queue_iterator_baseD2Ev +__ZNK3tbb8internal21concurrent_queue_base13internal_sizeEv +#endif + +# concurrent_queue v3 +# constructors +__ZN3tbb8internal33concurrent_queue_iterator_base_v3C2ERKNS0_24concurrent_queue_base_v3E +__ZN3tbb8internal24concurrent_queue_base_v3C2Em +# destructors +__ZN3tbb8internal33concurrent_queue_iterator_base_v3D2Ev +__ZN3tbb8internal24concurrent_queue_base_v3D2Ev +# typeinfo +__ZTIN3tbb8internal24concurrent_queue_base_v3E +__ZTSN3tbb8internal24concurrent_queue_base_v3E +#vtable +__ZTVN3tbb8internal24concurrent_queue_base_v3E +# methods +__ZN3tbb8internal33concurrent_queue_iterator_base_v36assignERKS1_ +__ZN3tbb8internal33concurrent_queue_iterator_base_v37advanceEv +__ZN3tbb8internal24concurrent_queue_base_v313internal_pushEPKv +__ZN3tbb8internal24concurrent_queue_base_v325internal_push_if_not_fullEPKv +__ZN3tbb8internal24concurrent_queue_base_v312internal_popEPv +__ZN3tbb8internal24concurrent_queue_base_v323internal_pop_if_presentEPv +__ZN3tbb8internal24concurrent_queue_base_v321internal_finish_clearEv +__ZN3tbb8internal24concurrent_queue_base_v321internal_set_capacityElm +__ZNK3tbb8internal24concurrent_queue_base_v313internal_sizeEv +__ZNK3tbb8internal24concurrent_queue_base_v314internal_emptyEv +__ZNK3tbb8internal24concurrent_queue_base_v324internal_throw_exceptionEv +__ZN3tbb8internal24concurrent_queue_base_v36assignERKS1_ + +#if !TBB_NO_LEGACY +# concurrent_vector.cpp v2 +__ZN3tbb8internal22concurrent_vector_base13internal_copyERKS1_mPFvPvPKvmE +__ZN3tbb8internal22concurrent_vector_base14internal_clearEPFvPvmEb +__ZN3tbb8internal22concurrent_vector_base15internal_assignERKS1_mPFvPvmEPFvS4_PKvmESA_ +__ZN3tbb8internal22concurrent_vector_base16internal_grow_byEmmPFvPvmE +__ZN3tbb8internal22concurrent_vector_base16internal_reserveEmmm +__ZN3tbb8internal22concurrent_vector_base18internal_push_backEmRm +__ZN3tbb8internal22concurrent_vector_base25internal_grow_to_at_leastEmmPFvPvmE +__ZNK3tbb8internal22concurrent_vector_base17internal_capacityEv +#endif + +# concurrent_vector v3 +__ZN3tbb8internal25concurrent_vector_base_v313internal_copyERKS1_mPFvPvPKvmE +__ZN3tbb8internal25concurrent_vector_base_v314internal_clearEPFvPvmE +__ZN3tbb8internal25concurrent_vector_base_v315internal_assignERKS1_mPFvPvmEPFvS4_PKvmESA_ +__ZN3tbb8internal25concurrent_vector_base_v316internal_grow_byEmmPFvPvPKvmES4_ +__ZN3tbb8internal25concurrent_vector_base_v316internal_reserveEmmm +__ZN3tbb8internal25concurrent_vector_base_v318internal_push_backEmRm +__ZN3tbb8internal25concurrent_vector_base_v325internal_grow_to_at_leastEmmPFvPvPKvmES4_ +__ZNK3tbb8internal25concurrent_vector_base_v317internal_capacityEv +__ZN3tbb8internal25concurrent_vector_base_v316internal_compactEmPvPFvS2_mEPFvS2_PKvmE +__ZN3tbb8internal25concurrent_vector_base_v313internal_swapERS1_ +__ZNK3tbb8internal25concurrent_vector_base_v324internal_throw_exceptionEm +__ZN3tbb8internal25concurrent_vector_base_v3D2Ev 
+__ZN3tbb8internal25concurrent_vector_base_v315internal_resizeEmmmPKvPFvPvmEPFvS4_S3_mE +__ZN3tbb8internal25concurrent_vector_base_v337internal_grow_to_at_least_with_resultEmmPFvPvPKvmES4_ + +# tbb_thread +__ZN3tbb8internal13tbb_thread_v320hardware_concurrencyEv +__ZN3tbb8internal13tbb_thread_v36detachEv +__ZN3tbb8internal16thread_get_id_v3Ev +__ZN3tbb8internal15free_closure_v3EPv +__ZN3tbb8internal13tbb_thread_v34joinEv +__ZN3tbb8internal13tbb_thread_v314internal_startEPFPvS2_ES2_ +__ZN3tbb8internal19allocate_closure_v3Em +__ZN3tbb8internal7move_v3ERNS0_13tbb_thread_v3ES2_ +__ZN3tbb8internal15thread_yield_v3Ev +__ZN3tbb8internal15thread_sleep_v3ERKNS_10tick_count10interval_tE diff --git a/dep/tbb/src/tbb/mutex.cpp b/dep/tbb/src/tbb/mutex.cpp new file mode 100644 index 000000000..3c619b69c --- /dev/null +++ b/dep/tbb/src/tbb/mutex.cpp @@ -0,0 +1,148 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/mutex.h" +#include "itt_notify.h" + +namespace tbb { + void mutex::scoped_lock::internal_acquire( mutex& m ) { + +#if _WIN32||_WIN64 + switch( m.state ) { + case INITIALIZED: + case HELD: + EnterCriticalSection( &m.impl ); + // If a thread comes here, and another thread holds the lock, it will block + // in EnterCriticalSection. When it returns from EnterCriticalSection, + // m.state must be set to INITIALIZED. If the same thread tries to acquire a lock it + // aleady holds, the the lock is in HELD state, thus will cause the assertion to fail. 
+ __TBB_ASSERT(m.state!=HELD, "mutex::scoped_lock: deadlock caused by attempt to reacquire held mutex"); + m.state = HELD; + break; + case DESTROYED: + __TBB_ASSERT(false,"mutex::scoped_lock: mutex already destroyed"); + break; + default: + __TBB_ASSERT(false,"mutex::scoped_lock: illegal mutex state"); + break; + } +#else + int error_code = pthread_mutex_lock(&m.impl); + __TBB_ASSERT_EX(!error_code,"mutex::scoped_lock: pthread_mutex_lock failed"); +#endif /* _WIN32||_WIN64 */ + my_mutex = &m; + } + +void mutex::scoped_lock::internal_release() { + __TBB_ASSERT( my_mutex, "mutex::scoped_lock: not holding a mutex" ); +#if _WIN32||_WIN64 + switch( my_mutex->state ) { + case INITIALIZED: + __TBB_ASSERT(false,"mutex::scoped_lock: try to release the lock without acquisition"); + break; + case HELD: + my_mutex->state = INITIALIZED; + LeaveCriticalSection(&my_mutex->impl); + break; + case DESTROYED: + __TBB_ASSERT(false,"mutex::scoped_lock: mutex already destroyed"); + break; + default: + __TBB_ASSERT(false,"mutex::scoped_lock: illegal mutex state"); + break; + } +#else + int error_code = pthread_mutex_unlock(&my_mutex->impl); + __TBB_ASSERT_EX(!error_code, "mutex::scoped_lock: pthread_mutex_unlock failed"); +#endif /* _WIN32||_WIN64 */ + my_mutex = NULL; +} + +bool mutex::scoped_lock::internal_try_acquire( mutex& m ) { +#if _WIN32||_WIN64 + switch( m.state ) { + case INITIALIZED: + case HELD: + break; + case DESTROYED: + __TBB_ASSERT(false,"mutex::scoped_lock: mutex already destroyed"); + break; + default: + __TBB_ASSERT(false,"mutex::scoped_lock: illegal mutex state"); + break; + } +#endif /* _WIN32||_WIN64 */ + + bool result; +#if _WIN32||_WIN64 + result = TryEnterCriticalSection(&m.impl)!=0; + if( result ) { + __TBB_ASSERT(m.state!=HELD, "mutex::scoped_lock: deadlock caused by attempt to reacquire held mutex"); + m.state = HELD; + } +#else + result = pthread_mutex_trylock(&m.impl)==0; +#endif /* _WIN32||_WIN64 */ + if( result ) + my_mutex = &m; + return result; +} + +void mutex::internal_construct() { +#if _WIN32||_WIN64 + InitializeCriticalSection(&impl); + state = INITIALIZED; +#else + int error_code = pthread_mutex_init(&impl,NULL); + if( error_code ) + tbb::internal::handle_perror(error_code,"mutex: pthread_mutex_init failed"); +#endif /* _WIN32||_WIN64*/ + ITT_SYNC_CREATE(&impl, _T("tbb::mutex"), _T("")); +} + +void mutex::internal_destroy() { +#if _WIN32||_WIN64 + switch( state ) { + case INITIALIZED: + DeleteCriticalSection(&impl); + break; + case DESTROYED: + __TBB_ASSERT(false,"mutex: already destroyed"); + break; + default: + __TBB_ASSERT(false,"mutex: illegal state for destruction"); + break; + } + state = DESTROYED; +#else + int error_code = pthread_mutex_destroy(&impl); + __TBB_ASSERT_EX(!error_code,"mutex: pthread_mutex_destroy failed"); +#endif /* _WIN32||_WIN64 */ +} + +} // namespace tbb diff --git a/dep/tbb/src/tbb/pipeline.cpp b/dep/tbb/src/tbb/pipeline.cpp new file mode 100644 index 000000000..822609be1 --- /dev/null +++ b/dep/tbb/src/tbb/pipeline.cpp @@ -0,0 +1,687 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. 
+
+    Threading Building Blocks is distributed in the hope that it will be
+    useful, but WITHOUT ANY WARRANTY; without even the implied warranty
+    of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with Threading Building Blocks; if not, write to the Free Software
+    Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+
+    As a special exception, you may use this file as part of a free software
+    library without restriction. Specifically, if other files instantiate
+    templates or use macros or inline functions from this file, or you compile
+    this file and link it with other files to produce an executable, this
+    file does not by itself cause the resulting executable to be covered by
+    the GNU General Public License. This exception does not however
+    invalidate any other reasons why the executable file might be covered by
+    the GNU General Public License.
+*/
+
+#include "tbb/pipeline.h"
+#include "tbb/spin_mutex.h"
+#include "tbb/cache_aligned_allocator.h"
+#include "itt_notify.h"
+
+
+namespace tbb {
+
+namespace internal {
+
+//! This structure is used to store task information in an input buffer
+struct task_info {
+    void* my_object;
+    //! Invalid unless a task went through an ordered stage.
+    Token my_token;
+    //! False until my_token is set.
+    bool my_token_ready;
+    //! True if my_object is valid.
+    bool is_valid;
+    //! Set to initial state (no object, no token)
+    void reset() {
+        my_object = NULL;
+        my_token = 0;
+        my_token_ready = false;
+        is_valid = false;
+    }
+};
+//! A buffer of input items for a filter.
+/** Each item is a task_info, inserted into a position in the buffer corresponding to a Token. */
+class input_buffer {
+    friend class tbb::internal::pipeline_root_task;
+    friend class tbb::thread_bound_filter;
+
+    typedef Token size_type;
+
+    //! Array of deferred tasks that cannot yet start executing.
+    task_info* array;
+
+    //! Size of array
+    /** Always 0 or a power of 2 */
+    size_type array_size;
+
+    //! Lowest token that can start executing.
+    /** All prior Token have already been seen. */
+    Token low_token;
+
+    //! Serializes updates.
+    spin_mutex array_mutex;
+
+    //! Resize "array".
+    /** Caller is responsible for acquiring a lock on "array_mutex". */
+    void grow( size_type minimum_size );
+
+    //! Initial size for "array"
+    /** Must be a power of 2 */
+    static const size_type initial_buffer_size = 4;
+
+    //! Used only for out of order buffer.
+    Token high_token;
+
+    //! True for ordered filter, false otherwise.
+    bool is_ordered;
+
+    //! True for thread-bound filter, false otherwise.
+    bool is_bound;
+public:
+    //! Construct empty buffer.
+    input_buffer( bool is_ordered_, bool is_bound_ ) :
+        array(NULL), array_size(0),
+        low_token(0), high_token(0),
+        is_ordered(is_ordered_), is_bound(is_bound_) {
+        grow(initial_buffer_size);
+        __TBB_ASSERT( array, NULL );
+    }
+
+    //! Destroy the buffer.
+    ~input_buffer() {
+        __TBB_ASSERT( array, NULL );
+        cache_aligned_allocator<task_info>().deallocate(array,array_size);
+        poison_pointer( array );
+    }
+
+    //! Put a token into the buffer.
+    /** If task information was placed into buffer, returns true;
+        otherwise returns false, informing the caller to create and spawn a task.
+    */
+    // Using template to avoid explicit dependency on stage_task
+    template<typename StageTask>
+    bool put_token( StageTask& putter ) {
+        {
+            spin_mutex::scoped_lock lock( array_mutex );
+            Token token;
+            if( is_ordered ) {
+                if( !putter.my_token_ready ) {
+                    putter.my_token = high_token++;
+                    putter.my_token_ready = true;
+                }
+                token = putter.my_token;
+            } else
+                token = high_token++;
+            __TBB_ASSERT( (tokendiff_t)(token-low_token)>=0, NULL );
+            if( token!=low_token || is_bound ) {
+                // Trying to put token that is beyond low_token.
+                // Need to wait until low_token catches up before dispatching.
+                if( token-low_token>=array_size )
+                    grow( token-low_token+1 );
+                ITT_NOTIFY( sync_releasing, this );
+                putter.put_task_info(array[token&array_size-1]);
+                return true;
+            }
+        }
+        return false;
+    }
+
+    //! Note that processing of a token is finished.
+    /** Fires up processing of the next token, if processing was deferred. */
+    // Using template to avoid explicit dependency on stage_task
+    template<typename StageTask>
+    void note_done( Token token, StageTask& spawner ) {
+        task_info wakee;
+        wakee.reset();
+        {
+            spin_mutex::scoped_lock lock( array_mutex );
+            if( !is_ordered || token==low_token ) {
+                // Wake the next task
+                task_info& item = array[++low_token & array_size-1];
+                ITT_NOTIFY( sync_acquired, this );
+                wakee = item;
+                item.is_valid = false;
+            }
+        }
+        if( wakee.is_valid )
+            spawner.spawn_stage_task(wakee);
+    }
+
+#if __TBB_EXCEPTIONS
+    //! The method destroys all data in filters to prevent memory leaks
+    void clear( filter* my_filter ) {
+        long t=low_token;
+        for( size_type i=0; i<array_size; ++i, ++t ){
+            task_info& temp = array[t&array_size-1];
+            if (temp.is_valid ) {
+                my_filter->finalize(temp.my_object);
+                temp.is_valid = false;
+            }
+        }
+    }
+#endif
+
+    bool return_item(task_info& info, bool advance) {
+        spin_mutex::scoped_lock lock( array_mutex );
+        task_info& item = array[low_token&array_size-1];
+        ITT_NOTIFY( sync_acquired, this );
+        if( item.is_valid ) {
+            info = item;
+            item.is_valid = false;
+            if (advance) low_token++;
+            return true;
+        }
+        return false;
+    }
+
+    void put_item( task_info& info ) {
+        info.is_valid = true;
+        spin_mutex::scoped_lock lock( array_mutex );
+        Token token;
+        if( is_ordered ) {
+            if( !info.my_token_ready ) {
+                info.my_token = high_token++;
+                info.my_token_ready = true;
+            }
+            token = info.my_token;
+        } else
+            token = high_token++;
+        __TBB_ASSERT( (tokendiff_t)(token-low_token)>=0, NULL );
+        if( token-low_token>=array_size )
+            grow( token-low_token+1 );
+        ITT_NOTIFY( sync_releasing, this );
+        array[token&array_size-1] = info;
+    }
+};
+
+void input_buffer::grow( size_type minimum_size ) {
+    size_type old_size = array_size;
+    size_type new_size = old_size ? 2*old_size : initial_buffer_size;
+    while( new_size<minimum_size )
+        new_size*=2;
+    task_info* new_array = cache_aligned_allocator<task_info>().allocate(new_size);
+    task_info* old_array = array;
+    for( size_type i=0; i<new_size; ++i )
+        new_array[i].is_valid = false;
+    long t=low_token;
+    for( size_type i=0; i<old_size; ++i, ++t )
+        new_array[t&new_size-1] = old_array[t&old_size-1];
+    array = new_array;
+    array_size = new_size;
+    if( old_array )
+        cache_aligned_allocator<task_info>().deallocate(old_array,old_size);
+}
+
+class stage_task: public task, public task_info {
+private:
+    friend class tbb::pipeline;
+    pipeline& my_pipeline;
+    filter* my_filter;
+    //! True if this task has not yet read the input.
+    bool my_at_start;
+public:
+    //! Construct stage_task for first stage in a pipeline.
+    /** Such a stage has not read any input yet. */
+    stage_task( pipeline& pipeline ) :
+        my_pipeline(pipeline),
+        my_filter(pipeline.filter_list),
+        my_at_start(true)
+    {
+        task_info::reset();
+    }
+    //! Construct stage_task for a subsequent stage in a pipeline.
+    stage_task( pipeline& pipeline, filter* filter_, const task_info& info ) :
+        task_info(info),
+        my_pipeline(pipeline),
+        my_filter(filter_),
+        my_at_start(false)
+    {}
+    //!
Roughly equivalent to the constructor of input stage task + void reset() { + task_info::reset(); + my_filter = my_pipeline.filter_list; + my_at_start = true; + } + //! The virtual task execution method + /*override*/ task* execute(); +#if __TBB_EXCEPTIONS + ~stage_task() + { + if (my_filter && my_object && (my_filter->my_filter_mode & filter::version_mask) >= __TBB_PIPELINE_VERSION(4)) { + __TBB_ASSERT(is_cancelled(), "Trying to finalize the task that wasn't cancelled"); + my_filter->finalize(my_object); + my_object = NULL; + } + } +#endif // __TBB_EXCEPTIONS + //! Creates and spawns stage_task from task_info + void spawn_stage_task(const task_info& info) + { + stage_task* clone = new (allocate_additional_child_of(*parent())) + stage_task( my_pipeline, my_filter, info ); + spawn(*clone); + } + //! Puts current task information + void put_task_info(task_info &where_to_put ) { + where_to_put.my_object = my_object; + where_to_put.my_token = my_token; + where_to_put.my_token_ready = my_token_ready; + where_to_put.is_valid = true; + } +}; + +task* stage_task::execute() { + __TBB_ASSERT( !my_at_start || !my_object, NULL ); + __TBB_ASSERT( !my_filter->is_bound(), NULL ); + if( my_at_start ) { + if( my_filter->is_serial() ) { + my_object = (*my_filter)(my_object); + if( my_object ) { + if( my_filter->is_ordered() ) { + my_token = my_pipeline.token_counter++; // ideally, with relaxed semantics + my_token_ready = true; + } else if( (my_filter->my_filter_mode & my_filter->version_mask) >= __TBB_PIPELINE_VERSION(5) ) { + if( my_pipeline.has_thread_bound_filters ) + my_pipeline.token_counter++; // ideally, with relaxed semantics + } + if( !my_filter->next_filter_in_pipeline ) { + reset(); + goto process_another_stage; + } else { + ITT_NOTIFY( sync_releasing, &my_pipeline.input_tokens ); + if( --my_pipeline.input_tokens>0 ) + spawn( *new( allocate_additional_child_of(*parent()) ) stage_task( my_pipeline ) ); + } + } else { + my_pipeline.end_of_input = true; + return NULL; + } + } else /*not is_serial*/ { + if( my_pipeline.end_of_input ) + return NULL; + if( (my_filter->my_filter_mode & my_filter->version_mask) >= __TBB_PIPELINE_VERSION(5) ) { + if( my_pipeline.has_thread_bound_filters ) + my_pipeline.token_counter++; + } + ITT_NOTIFY( sync_releasing, &my_pipeline.input_tokens ); + if( --my_pipeline.input_tokens>0 ) + spawn( *new( allocate_additional_child_of(*parent()) ) stage_task( my_pipeline ) ); + my_object = (*my_filter)(my_object); + if( !my_object ) { + my_pipeline.end_of_input = true; + if( (my_filter->my_filter_mode & my_filter->version_mask) >= __TBB_PIPELINE_VERSION(5) ) { + if( my_pipeline.has_thread_bound_filters ) + my_pipeline.token_counter--; + } + return NULL; + } + } + my_at_start = false; + } else { + my_object = (*my_filter)(my_object); + if( my_filter->is_serial() ) + my_filter->my_input_buffer->note_done(my_token, *this); + } + my_filter = my_filter->next_filter_in_pipeline; + if( my_filter ) { + // There is another filter to execute. + // Crank up priority a notch. 
+ add_to_depth(1); + if( my_filter->is_serial() ) { + // The next filter must execute tokens in order + if( my_filter->my_input_buffer->put_token(*this) ){ + // Can't proceed with the same item + if( my_filter->is_bound() ) { + // Find the next non-thread-bound filter + do { + my_filter = my_filter->next_filter_in_pipeline; + } while( my_filter && my_filter->is_bound() ); + // Check if there is an item ready to process + if( my_filter && my_filter->my_input_buffer->return_item(*this, !my_filter->is_serial()) ) + goto process_another_stage; + } + my_filter = NULL; // To prevent deleting my_object twice if exception occurs + return NULL; + } + } + } else { + // Reached end of the pipe. + if( ++my_pipeline.input_tokens>1 || my_pipeline.end_of_input || my_pipeline.filter_list->is_bound() ) + return NULL; // No need to recycle for new input + ITT_NOTIFY( sync_acquired, &my_pipeline.input_tokens ); + // Recycle as an input stage task. + reset(); + } +process_another_stage: + /* A semi-hackish way to reexecute the same task object immediately without spawning. + recycle_as_continuation marks the task for future execution, + and then 'this' pointer is returned to bypass spawning. */ + recycle_as_continuation(); + return this; +} + +class pipeline_root_task: public task { + pipeline& my_pipeline; + bool do_segment_scanning; + + /*override*/ task* execute() { + if( !my_pipeline.end_of_input ) + if( !my_pipeline.filter_list->is_bound() ) + if( my_pipeline.input_tokens > 0 ) { + recycle_as_continuation(); + set_ref_count(1); + return new( allocate_child() ) stage_task( my_pipeline ); + } + if( do_segment_scanning ) { + filter* current_filter = my_pipeline.filter_list->next_segment; + /* first non-thread-bound filter that follows thread-bound one + and may have valid items to process */ + filter* first_suitable_filter = current_filter; + while( current_filter ) { + __TBB_ASSERT( !current_filter->is_bound(), "filter is thread-bound?" ); + __TBB_ASSERT( current_filter->prev_filter_in_pipeline->is_bound(), "previous filter is not thread-bound?" ); + if( !my_pipeline.end_of_input + || (tokendiff_t)(my_pipeline.token_counter - current_filter->my_input_buffer->low_token) > 0 ) + { + task_info info; + info.reset(); + if( current_filter->my_input_buffer->return_item(info, !current_filter->is_serial()) ) { + set_ref_count(1); + recycle_as_continuation(); + return new( allocate_child() ) stage_task( my_pipeline, current_filter, info); + } + current_filter = current_filter->next_segment; + if( !current_filter ) { + if( !my_pipeline.end_of_input ) { + recycle_as_continuation(); + return this; + } + current_filter = first_suitable_filter; + __TBB_Yield(); + } + } else { + /* The preceding pipeline segment is empty. + Fast-forward to the next post-TBF segment. 
*/ + first_suitable_filter = first_suitable_filter->next_segment; + current_filter = first_suitable_filter; + } + } /* end of while */ + return NULL; + } else { + if( !my_pipeline.end_of_input ) { + recycle_as_continuation(); + return this; + } + return NULL; + } + } +public: + pipeline_root_task( pipeline& pipeline ): my_pipeline(pipeline), do_segment_scanning(false) + { + __TBB_ASSERT( my_pipeline.filter_list, NULL ); + filter* first = my_pipeline.filter_list; + if( (first->my_filter_mode & first->version_mask) >= __TBB_PIPELINE_VERSION(5) ) { + // Scanning the pipeline for segments + filter* head_of_previous_segment = first; + for( filter* subfilter=first->next_filter_in_pipeline; + subfilter!=NULL; + subfilter=subfilter->next_filter_in_pipeline ) + { + if( subfilter->prev_filter_in_pipeline->is_bound() && !subfilter->is_bound() ) { + do_segment_scanning = true; + head_of_previous_segment->next_segment = subfilter; + head_of_previous_segment = subfilter; + } + } + } + } +}; + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // Workaround for overzealous compiler warnings + // Suppress compiler warning about constant conditional expression + #pragma warning (disable: 4127) +#endif + +// The class destroys end_counter and clears all input buffers if pipeline was cancelled. +class pipeline_cleaner: internal::no_copy { + pipeline& my_pipeline; +public: + pipeline_cleaner(pipeline& _pipeline) : + my_pipeline(_pipeline) + {} + ~pipeline_cleaner(){ +#if __TBB_EXCEPTIONS + if (my_pipeline.end_counter->is_cancelled()) // Pipeline was cancelled + my_pipeline.clear_filters(); +#endif + my_pipeline.end_counter = NULL; + } +}; + +} // namespace internal + +void pipeline::inject_token( task& ) { + __TBB_ASSERT(0,"illegal call to inject_token"); +} + +#if __TBB_EXCEPTIONS +void pipeline::clear_filters() { + for( filter* f = filter_list; f; f = f->next_filter_in_pipeline ) { + if ((f->my_filter_mode & filter::version_mask) >= __TBB_PIPELINE_VERSION(4)) + if( internal::input_buffer* b = f->my_input_buffer ) + b->clear(f); + } +} +#endif + +pipeline::pipeline() : + filter_list(NULL), + filter_end(NULL), + end_counter(NULL), + end_of_input(false), + has_thread_bound_filters(false) +{ + token_counter = 0; + input_tokens = 0; +} + +pipeline::~pipeline() { + clear(); +} + +void pipeline::clear() { + filter* next; + for( filter* f = filter_list; f; f=next ) { + if( internal::input_buffer* b = f->my_input_buffer ) { + delete b; + f->my_input_buffer = NULL; + } + next=f->next_filter_in_pipeline; + f->next_filter_in_pipeline = filter::not_in_pipeline(); + if ( (f->my_filter_mode & filter::version_mask) >= __TBB_PIPELINE_VERSION(3) ) { + f->prev_filter_in_pipeline = filter::not_in_pipeline(); + f->my_pipeline = NULL; + } + if ( (f->my_filter_mode & filter::version_mask) >= __TBB_PIPELINE_VERSION(5) ) + f->next_segment = NULL; + } + filter_list = filter_end = NULL; +} + +void pipeline::add_filter( filter& filter_ ) { +#if TBB_USE_ASSERT + if ( (filter_.my_filter_mode & filter::version_mask) >= __TBB_PIPELINE_VERSION(3) ) + __TBB_ASSERT( filter_.prev_filter_in_pipeline==filter::not_in_pipeline(), "filter already part of pipeline?" ); + __TBB_ASSERT( filter_.next_filter_in_pipeline==filter::not_in_pipeline(), "filter already part of pipeline?" 
); + __TBB_ASSERT( !end_counter, "invocation of add_filter on running pipeline" ); +#endif + if ( (filter_.my_filter_mode & filter::version_mask) >= __TBB_PIPELINE_VERSION(3) ) { + filter_.my_pipeline = this; + filter_.prev_filter_in_pipeline = filter_end; + if ( filter_list == NULL) + filter_list = &filter_; + else + filter_end->next_filter_in_pipeline = &filter_; + filter_.next_filter_in_pipeline = NULL; + filter_end = &filter_; + } + else + { + if( !filter_end ) + filter_end = reinterpret_cast(&filter_list); + + *reinterpret_cast(filter_end) = &filter_; + filter_end = reinterpret_cast(&filter_.next_filter_in_pipeline); + *reinterpret_cast(filter_end) = NULL; + } + if( (filter_.my_filter_mode & filter_.version_mask) >= __TBB_PIPELINE_VERSION(5) ) { + if( filter_.is_serial() ) { + if( filter_.is_bound() ) + has_thread_bound_filters = true; + filter_.my_input_buffer = new internal::input_buffer( filter_.is_ordered(), filter_.is_bound() ); + } + else { + if( filter_.prev_filter_in_pipeline && filter_.prev_filter_in_pipeline->is_bound() ) + filter_.my_input_buffer = new internal::input_buffer( false, false ); + } + } else { + if( filter_.is_serial() ) { + filter_.my_input_buffer = new internal::input_buffer( filter_.is_ordered(), false ); + } + } + +} + +void pipeline::remove_filter( filter& filter_ ) { + if (&filter_ == filter_list) + filter_list = filter_.next_filter_in_pipeline; + else { + __TBB_ASSERT( filter_.prev_filter_in_pipeline, "filter list broken?" ); + filter_.prev_filter_in_pipeline->next_filter_in_pipeline = filter_.next_filter_in_pipeline; + } + if (&filter_ == filter_end) + filter_end = filter_.prev_filter_in_pipeline; + else { + __TBB_ASSERT( filter_.next_filter_in_pipeline, "filter list broken?" ); + filter_.next_filter_in_pipeline->prev_filter_in_pipeline = filter_.prev_filter_in_pipeline; + } + if( internal::input_buffer* b = filter_.my_input_buffer ) { + delete b; + filter_.my_input_buffer = NULL; + } + filter_.next_filter_in_pipeline = filter_.prev_filter_in_pipeline = filter::not_in_pipeline(); + if ( (filter_.my_filter_mode & filter::version_mask) >= __TBB_PIPELINE_VERSION(5) ) + filter_.next_segment = NULL; + filter_.my_pipeline = NULL; +} + +void pipeline::run( size_t max_number_of_live_tokens +#if __TBB_EXCEPTIONS + , tbb::task_group_context& context +#endif + ) { + __TBB_ASSERT( max_number_of_live_tokens>0, "pipeline::run must have at least one token" ); + __TBB_ASSERT( !end_counter, "pipeline already running?" 
); + if( filter_list ) { + internal::pipeline_cleaner my_pipeline_cleaner(*this); + end_of_input = false; +#if __TBB_EXCEPTIONS + end_counter = new( task::allocate_root(context) ) internal::pipeline_root_task( *this ); +#else + end_counter = new( task::allocate_root() ) internal::pipeline_root_task( *this ); +#endif + input_tokens = internal::Token(max_number_of_live_tokens); + // Start execution of tasks + task::spawn_root_and_wait( *end_counter ); + } +} + +#if __TBB_EXCEPTIONS +void pipeline::run( size_t max_number_of_live_tokens ) { + tbb::task_group_context context; + run(max_number_of_live_tokens, context); +} +#endif // __TBB_EXCEPTIONS + +filter::~filter() { + if ( (my_filter_mode & version_mask) >= __TBB_PIPELINE_VERSION(3) ) { + if ( next_filter_in_pipeline != filter::not_in_pipeline() ) { + __TBB_ASSERT( prev_filter_in_pipeline != filter::not_in_pipeline(), "probably filter list is broken" ); + my_pipeline->remove_filter(*this); + } else + __TBB_ASSERT( prev_filter_in_pipeline == filter::not_in_pipeline(), "probably filter list is broken" ); + } else { + __TBB_ASSERT( next_filter_in_pipeline==filter::not_in_pipeline(), "cannot destroy filter that is part of pipeline" ); + } +} + +thread_bound_filter::result_type thread_bound_filter::process_item() { + return internal_process_item(true); +} + +thread_bound_filter::result_type thread_bound_filter::try_process_item() { + return internal_process_item(false); +} + +thread_bound_filter::result_type thread_bound_filter::internal_process_item(bool is_blocking) { + internal::task_info info; + info.reset(); + + if( !prev_filter_in_pipeline ) { + if( my_pipeline->end_of_input ) + return end_of_stream; + while( my_pipeline->input_tokens == 0 ) { + if( is_blocking ) + __TBB_Yield(); + else + return item_not_available; + } + info.my_object = (*this)(info.my_object); + if( info.my_object ) { + my_pipeline->input_tokens--; + if( is_ordered() ) { + info.my_token = my_pipeline->token_counter; + info.my_token_ready = true; + } + my_pipeline->token_counter++; // ideally, with relaxed semantics + } else { + my_pipeline->end_of_input = true; + return end_of_stream; + } + } else { /* this is not an input filter */ + while( !my_input_buffer->return_item(info, /*advance=*/true) ) { + if( my_pipeline->end_of_input && my_input_buffer->low_token == my_pipeline->token_counter ) + return end_of_stream; + if( is_blocking ) + __TBB_Yield(); + else + return item_not_available; + } + info.my_object = (*this)(info.my_object); + } + if( next_filter_in_pipeline ) { + next_filter_in_pipeline->my_input_buffer->put_item(info); + } else { + my_pipeline->input_tokens++; + } + + return success; +} + +} // tbb + diff --git a/dep/tbb/src/tbb/private_server.cpp b/dep/tbb/src/tbb/private_server.cpp new file mode 100644 index 000000000..cda558e81 --- /dev/null +++ b/dep/tbb/src/tbb/private_server.cpp @@ -0,0 +1,346 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "../rml/include/rml_tbb.h" +#include "../rml/server/thread_monitor.h" +#include "tbb/atomic.h" +#include "tbb/cache_aligned_allocator.h" +#include "tbb/spin_mutex.h" +#include "tbb/tbb_thread.h" + +using rml::internal::thread_monitor; + +namespace tbb { +namespace internal { +namespace rml { + +class private_server; + +class private_worker: no_copy { + //! State in finite-state machine that controls the worker. + /** State diagram: + open --> normal --> quit + | + V + plugged + */ + enum state_t { + //! *this is initialized + st_init, + //! Associated thread is doing normal life sequence. + st_normal, + //! Associated thread is end normal life sequence. + st_quit, + //! Associated thread should skip normal life sequence, because private_server is shutting down. + st_plugged + }; + atomic my_state; + + //! Associated server + private_server& my_server; + + //! Associated client + tbb_client& my_client; + + //! index used for avoiding the 64K aliasing problem + const size_t my_index; + + //! Monitor for sleeping when there is no work to do. + /** The invariant that holds for sleeping workers is: + "my_slack<=0 && my_state==st_normal && I am on server's list of asleep threads" */ + thread_monitor my_thread_monitor; + + //! Link for list of sleeping workers + private_worker* my_next; + + friend class private_server; + + //! Actions executed by the associated thread + void run(); + + //! Called by a thread (usually not the associated thread) to commence termination. + void start_shutdown(); + + static __RML_DECL_THREAD_ROUTINE thread_routine( void* arg ); + +protected: + private_worker( private_server& server, tbb_client& client, const size_t i ) : + my_server(server), + my_client(client), + my_index(i) + { + my_state = st_init; + } + +}; + +static const size_t cache_line_size = tbb::internal::NFS_MaxLineSize; + + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // Suppress overzealous compiler warnings about uninstantiatble class + #pragma warning(push) + #pragma warning(disable:4510 4610) +#endif +class padded_private_worker: public private_worker { + char pad[cache_line_size - sizeof(private_worker)%cache_line_size]; +public: + padded_private_worker( private_server& server, tbb_client& client, const size_t i ) : private_worker(server,client,i) {} +}; +#if _MSC_VER && !defined(__INTEL_COMPILER) + #pragma warning(pop) +#endif + +class private_server: public tbb_server, no_copy { + tbb_client& my_client; + const tbb_client::size_type my_n_thread; + + //! Number of jobs that could use their associated thread minus number of active threads. + /** If negative, indicates oversubscription. + If positive, indicates that more threads should run. 
+ Can be lowered asynchronously, but must be raised only while holding my_asleep_list_mutex, + because raising it impacts the invariant for sleeping threads. */ + atomic my_slack; + + //! Counter used to determine when to delete this. + atomic my_ref_count; + + padded_private_worker* my_thread_array; + + //! List of workers that are asleep or committed to sleeping until notified by another thread. + tbb::atomic my_asleep_list_root; + + //! Protects my_asleep_list_root + tbb::spin_mutex my_asleep_list_mutex; + +#if TBB_USE_ASSERT + atomic my_net_slack_requests; +#endif /* TBB_USE_ASSERT */ + + //! Used for double-check idiom + bool has_sleepers() const { + return my_asleep_list_root!=NULL; + } + + //! Try to add t to list of sleeping workers + bool try_insert_in_asleep_list( private_worker& t ); + + //! Equivalent of adding additional_slack to my_slack and waking up to 2 threads if my_slack permits. + void wake_some( int additional_slack ); + + virtual ~private_server(); + + void remove_server_ref() { + if( --my_ref_count==0 ) { + my_client.acknowledge_close_connection(); + this->~private_server(); + tbb::cache_aligned_allocator().deallocate( this, 1 ); + } + } + + friend class private_worker; +public: + private_server( tbb_client& client ); + + /*override*/ version_type version() const { + return 0; + } + + /*override*/ void request_close_connection() { + for( size_t i=0; i(arg); + AVOID_64K_ALIASING( self->my_index ); + self->run(); + return NULL; +} +#if _MSC_VER && !defined(__INTEL_COMPILER) + #pragma warning(pop) +#endif + +void private_worker::start_shutdown() { + state_t s; + // Transition from st_init or st_normal to st_plugged or st_quit + do { + s = my_state; + __TBB_ASSERT( s==st_init||s==st_normal, NULL ); + } while( my_state.compare_and_swap( s==st_init? st_plugged : st_quit, s )!=s ); + if( s==st_normal ) { + // May have invalidated invariant for sleeping, so wake up the thread. + // Note that the notify() here occurs without maintaining invariants for my_slack. + // It does not matter, because my_state==st_quit overrides checking of my_slack. 
+ my_thread_monitor.notify(); + } +} + +void private_worker::run() { + if( my_state.compare_and_swap( st_normal, st_init )==st_init ) { + ::rml::job& j = *my_client.create_one_job(); + --my_server.my_slack; + while( my_state==st_normal ) { + if( my_server.my_slack>=0 ) { + my_client.process(j); + } else { + thread_monitor::cookie c; + // Prepare to wait + my_thread_monitor.prepare_wait(c); + // Check/set the invariant for sleeping + if( my_state==st_normal && my_server.try_insert_in_asleep_list(*this) ) { + my_thread_monitor.commit_wait(c); + // Propagate chain reaction + if( my_server.has_sleepers() ) + my_server.wake_some(0); + } else { + // Invariant broken + my_thread_monitor.cancel_wait(); + } + } + } + my_client.cleanup(j); + ++my_server.my_slack; + } + my_server.remove_server_ref(); +} + +//------------------------------------------------------------------------ +// Methods of private_server +//------------------------------------------------------------------------ +private_server::private_server( tbb_client& client ) : + my_client(client), + my_n_thread(client.max_job_count()), + my_thread_array(NULL) +{ + my_ref_count = my_n_thread+1; + my_slack = 0; +#if TBB_USE_ASSERT + my_net_slack_requests = 0; +#endif /* TBB_USE_ASSERT */ + my_asleep_list_root = NULL; + size_t stack_size = client.min_stack_size(); + my_thread_array = tbb::cache_aligned_allocator().allocate( my_n_thread ); + memset( my_thread_array, 0, sizeof(private_worker)*my_n_thread ); + // FIXME - use recursive chain reaction to launch the threads. + for( size_t i=0; i().deallocate( my_thread_array, my_n_thread ); + tbb::internal::poison_pointer( my_thread_array ); +} + +inline bool private_server::try_insert_in_asleep_list( private_worker& t ) { + tbb::spin_mutex::scoped_lock lock(my_asleep_list_mutex); + // Contribute to slack under lock so that if another takes that unit of slack, + // it sees us sleeping on the list and wakes us up. + int k = ++my_slack; + if( k<=0 ) { + t.my_next = my_asleep_list_root; + my_asleep_list_root = &t; + return true; + } else { + --my_slack; + return false; + } +} + +void private_server::wake_some( int additional_slack ) { + __TBB_ASSERT( additional_slack>=0, NULL ); + private_worker* wakee[2]; + private_worker**w = wakee; + { + tbb::spin_mutex::scoped_lock lock(my_asleep_list_mutex); + while( my_asleep_list_root && w0 ) { + --additional_slack; + } else { + // Try to claim unit of slack + int old; + do { + old = my_slack; + if( old<=0 ) goto done; + } while( my_slack.compare_and_swap(old-1,old)!=old ); + } + // Pop sleeping worker to combine with claimed unit of slack + my_asleep_list_root = (*w++ = my_asleep_list_root)->my_next; + } + if( additional_slack ) { + // Contribute our unused slack to my_slack. + my_slack += additional_slack; + } + } +done: + while( w>wakee ) + (*--w)->my_thread_monitor.notify(); +} + +void private_server::adjust_job_count_estimate( int delta ) { +#if TBB_USE_ASSERT + my_net_slack_requests+=delta; +#endif /* TBB_USE_ASSERT */ + if( delta<0 ) { + my_slack+=delta; + } else if( delta>0 ) { + wake_some( delta ); + } +} + +//! Factory method called from task.cpp to create a private_server. 
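+// Illustrative call sequence (a sketch, not part of the library sources): the
+// scheduler in task.cpp drives this class only through the rml::tbb_server
+// interface implemented above.  The object named the_client below is
+// hypothetical; the real client is TBB's task scheduler.
+//
+//     tbb_server* server = make_private_server( the_client );  // the_client: hypothetical tbb_client
+//
+//     server->adjust_job_count_estimate( 4 );    // work arrived: raise my_slack and
+//                                                // wake sleeping workers as slack permits
+//     server->adjust_job_count_estimate( -4 );   // work drained: lower my_slack so idle
+//                                                // workers go back to sleep
+//
+//     server->request_close_connection();        // begin worker shutdown; the server
+//                                                // destroys itself once the last
+//                                                // remove_server_ref() drops the count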
+tbb_server* make_private_server( tbb_client& client ) { + return new( tbb::cache_aligned_allocator().allocate(1) ) private_server(client); +} + +} // namespace rml +} // namespace internal +} // namespace tbb diff --git a/dep/tbb/src/tbb/queuing_mutex.cpp b/dep/tbb/src/tbb/queuing_mutex.cpp new file mode 100644 index 000000000..db2b986f0 --- /dev/null +++ b/dep/tbb/src/tbb/queuing_mutex.cpp @@ -0,0 +1,117 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/tbb_machine.h" +#include "tbb/tbb_stddef.h" +#include "tbb_misc.h" +#include "tbb/queuing_mutex.h" +#include "itt_notify.h" + + +namespace tbb { + +using namespace internal; + +//! A method to acquire queuing_mutex lock +void queuing_mutex::scoped_lock::acquire( queuing_mutex& m ) +{ + __TBB_ASSERT( !this->mutex, "scoped_lock is already holding a mutex"); + + // Must set all fields before the fetch_and_store, because once the + // fetch_and_store executes, *this becomes accessible to other threads. + mutex = &m; + next = NULL; + going = 0; + + // The fetch_and_store must have release semantics, because we are + // "sending" the fields initialized above to other processors. + scoped_lock* pred = m.q_tail.fetch_and_store(this); + if( pred ) { + ITT_NOTIFY(sync_prepare, mutex); + __TBB_ASSERT( !pred->next, "the predecessor has another successor!"); + pred->next = this; + spin_wait_while_eq( going, 0ul ); + } + ITT_NOTIFY(sync_acquired, mutex); + + // Force acquire so that user's critical section receives correct values + // from processor that was previously in the user's critical section. + __TBB_load_with_acquire(going); +} + +//! A method to acquire queuing_mutex if it is free +bool queuing_mutex::scoped_lock::try_acquire( queuing_mutex& m ) +{ + __TBB_ASSERT( !this->mutex, "scoped_lock is already holding a mutex"); + + // Must set all fields before the fetch_and_store, because once the + // fetch_and_store executes, *this becomes accessible to other threads. + next = NULL; + going = 0; + + if( m.q_tail ) return false; + // The CAS must have release semantics, because we are + // "sending" the fields initialized above to other processors. 
+ scoped_lock* pred = m.q_tail.compare_and_swap(this, NULL); + + // Force acquire so that user's critical section receives correct values + // from processor that was previously in the user's critical section. + // try_acquire should always have acquire semantic, even if failed. + __TBB_load_with_acquire(going); + if( !pred ) { + mutex = &m; + ITT_NOTIFY(sync_acquired, mutex); + return true; + } else return false; +} + +//! A method to release queuing_mutex lock +void queuing_mutex::scoped_lock::release( ) +{ + __TBB_ASSERT(this->mutex!=NULL, "no lock acquired"); + + ITT_NOTIFY(sync_releasing, mutex); + if( !next ) { + if( this == mutex->q_tail.compare_and_swap(NULL, this) ) { + // this was the only item in the queue, and the queue is now empty. + goto done; + } + // Someone in the queue + spin_wait_while_eq( next, (scoped_lock*)0 ); + } + __TBB_ASSERT(next,NULL); + __TBB_store_with_release(next->going, 1); +done: + initialize(); +} + +void queuing_mutex::internal_construct() { + ITT_SYNC_CREATE(this, _T("tbb::queuing_mutex"), _T("")); +} + +} // namespace tbb diff --git a/dep/tbb/src/tbb/queuing_rw_mutex.cpp b/dep/tbb/src/tbb/queuing_rw_mutex.cpp new file mode 100644 index 000000000..4c7034737 --- /dev/null +++ b/dep/tbb/src/tbb/queuing_rw_mutex.cpp @@ -0,0 +1,505 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +/** Before making any changes in the implementation, please emulate algorithmic changes + with SPIN tool using /tools/spin_models/ReaderWriterMutex.pml. + There could be some code looking as "can be restructured" but its structure does matter! */ + +#include "tbb/tbb_machine.h" +#include "tbb/tbb_stddef.h" +#include "tbb/tbb_machine.h" +#include "tbb/queuing_rw_mutex.h" +#include "itt_notify.h" + + +namespace tbb { + +using namespace internal; + +//! Flag bits in a state_t that specify information about a locking request. 
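+// Usage sketch (not part of the library sources): the implementation below is
+// driven through the public tbb::queuing_rw_mutex::scoped_lock API declared in
+// include/tbb/queuing_rw_mutex.h.  The variable shared_value and the function
+// name update_if_unset are hypothetical, used only to show the canonical
+// read-then-upgrade pattern.
+//
+//     #include "tbb/queuing_rw_mutex.h"
+//
+//     tbb::queuing_rw_mutex rw_mutex;
+//     int shared_value = 0;                       // hypothetical shared state
+//
+//     void update_if_unset() {
+//         // Take the lock as a reader first; the queuing mutex is fair (FIFO).
+//         tbb::queuing_rw_mutex::scoped_lock lock( rw_mutex, /*write=*/false );
+//         if( shared_value == 0 ) {
+//             // upgrade_to_writer() returns false if the lock had to be released
+//             // and re-acquired, so the condition must be checked again.
+//             if( !lock.upgrade_to_writer() && shared_value != 0 )
+//                 return;                         // another writer got there first
+//             shared_value = 42;
+//         }
+//     }                                           // released by ~scoped_lock()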
+enum state_t_flags { + STATE_NONE = 0, + STATE_WRITER = 1, + STATE_READER = 1<<1, + STATE_READER_UNBLOCKNEXT = 1<<2, + STATE_COMBINED_WAITINGREADER = STATE_READER | STATE_READER_UNBLOCKNEXT, + STATE_ACTIVEREADER = 1<<3, + STATE_COMBINED_READER = STATE_COMBINED_WAITINGREADER | STATE_ACTIVEREADER, + STATE_UPGRADE_REQUESTED = 1<<4, + STATE_UPGRADE_WAITING = 1<<5, + STATE_UPGRADE_LOSER = 1<<6, + STATE_COMBINED_UPGRADING = STATE_UPGRADE_WAITING | STATE_UPGRADE_LOSER +}; + +const unsigned char RELEASED = 0; +const unsigned char ACQUIRED = 1; + +template +inline atomic& as_atomic( T& t ) { + return *(atomic*)&t; +} + +inline bool queuing_rw_mutex::scoped_lock::try_acquire_internal_lock() +{ + return as_atomic(internal_lock).compare_and_swap(ACQUIRED,RELEASED) == RELEASED; +} + +inline void queuing_rw_mutex::scoped_lock::acquire_internal_lock() +{ + // Usually, we would use the test-test-and-set idiom here, with exponential backoff. + // But so far, experiments indicate there is no value in doing so here. + while( !try_acquire_internal_lock() ) { + __TBB_Pause(1); + } +} + +inline void queuing_rw_mutex::scoped_lock::release_internal_lock() +{ + __TBB_store_with_release(internal_lock,RELEASED); +} + +inline void queuing_rw_mutex::scoped_lock::wait_for_release_of_internal_lock() +{ + spin_wait_until_eq(internal_lock, RELEASED); +} + +inline void queuing_rw_mutex::scoped_lock::unblock_or_wait_on_internal_lock( uintptr_t flag ) { + if( flag ) + wait_for_release_of_internal_lock(); + else + release_internal_lock(); +} + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + // Workaround for overzealous compiler warnings + #pragma warning (push) + #pragma warning (disable: 4311 4312) +#endif + +//! A view of a T* with additional functionality for twiddling low-order bits. +template +class tricky_atomic_pointer: no_copy { +public: + typedef typename atomic_rep::word word; + + template + static T* fetch_and_add( T* volatile * location, word addend ) { + return reinterpret_cast( atomic_traits::fetch_and_add(location, addend) ); + } + template + static T* fetch_and_store( T* volatile * location, T* value ) { + return reinterpret_cast( atomic_traits::fetch_and_store(location, reinterpret_cast(value)) ); + } + template + static T* compare_and_swap( T* volatile * location, T* value, T* comparand ) { + return reinterpret_cast( + atomic_traits::compare_and_swap(location, reinterpret_cast(value), + reinterpret_cast(comparand)) + ); + } + + T* & ref; + tricky_atomic_pointer( T*& original ) : ref(original) {}; + tricky_atomic_pointer( T* volatile & original ) : ref(original) {}; + T* operator&( word operand2 ) const { + return reinterpret_cast( reinterpret_cast(ref) & operand2 ); + } + T* operator|( word operand2 ) const { + return reinterpret_cast( reinterpret_cast(ref) | operand2 ); + } +}; + +typedef tricky_atomic_pointer tricky_pointer; + +#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) + // Workaround for overzealous compiler warnings + #pragma warning (pop) +#endif + +//! Mask for low order bit of a pointer. 
+static const tricky_pointer::word FLAG = 0x1; + +inline +uintptr get_flag( queuing_rw_mutex::scoped_lock* ptr ) { + return uintptr(tricky_pointer(ptr)&FLAG); +} + +//------------------------------------------------------------------------ +// Methods of queuing_rw_mutex::scoped_lock +//------------------------------------------------------------------------ + +void queuing_rw_mutex::scoped_lock::acquire( queuing_rw_mutex& m, bool write ) +{ + __TBB_ASSERT( !this->mutex, "scoped_lock is already holding a mutex"); + + // Must set all fields before the fetch_and_store, because once the + // fetch_and_store executes, *this becomes accessible to other threads. + mutex = &m; + prev = NULL; + next = NULL; + going = 0; + state = state_t(write ? STATE_WRITER : STATE_READER); + internal_lock = RELEASED; + + queuing_rw_mutex::scoped_lock* pred = m.q_tail.fetch_and_store(this); + + if( write ) { // Acquiring for write + + if( pred ) { + ITT_NOTIFY(sync_prepare, mutex); + pred = tricky_pointer(pred) & ~FLAG; + __TBB_ASSERT( !( tricky_pointer(pred) & FLAG ), "use of corrupted pointer!" ); + __TBB_ASSERT( !pred->next, "the predecessor has another successor!"); + // ensure release semantics on IPF + __TBB_store_with_release(pred->next,this); + spin_wait_until_eq(going, 1); + } + + } else { // Acquiring for read +#if DO_ITT_NOTIFY + bool sync_prepare_done = false; +#endif + if( pred ) { + unsigned short pred_state; + __TBB_ASSERT( !this->prev, "the predecessor is already set" ); + if( tricky_pointer(pred)&FLAG ) { + /* this is only possible if pred is an upgrading reader and it signals us to wait */ + pred_state = STATE_UPGRADE_WAITING; + pred = tricky_pointer(pred) & ~FLAG; + } else { + // Load pred->state now, because once pred->next becomes + // non-NULL, we must assume that *pred might be destroyed. + pred_state = pred->state.compare_and_swap(STATE_READER_UNBLOCKNEXT, STATE_READER); + } + this->prev = pred; + __TBB_ASSERT( !( tricky_pointer(pred) & FLAG ), "use of corrupted pointer!" ); + __TBB_ASSERT( !pred->next, "the predecessor has another successor!"); + // ensure release semantics on IPF + __TBB_store_with_release(pred->next,this); + if( pred_state != STATE_ACTIVEREADER ) { +#if DO_ITT_NOTIFY + sync_prepare_done = true; + ITT_NOTIFY(sync_prepare, mutex); +#endif + spin_wait_until_eq(going, 1); + } + } + unsigned short old_state = state.compare_and_swap(STATE_ACTIVEREADER, STATE_READER); + if( old_state!=STATE_READER ) { +#if DO_ITT_NOTIFY + if( !sync_prepare_done ) + ITT_NOTIFY(sync_prepare, mutex); +#endif + // Failed to become active reader -> need to unblock the next waiting reader first + __TBB_ASSERT( state==STATE_READER_UNBLOCKNEXT, "unexpected state" ); + spin_wait_while_eq(next, (scoped_lock*)NULL); + /* state should be changed before unblocking the next otherwise it might finish + and another thread can get our old state and left blocked */ + state = STATE_ACTIVEREADER; + // ensure release semantics on IPF + __TBB_store_with_release(next->going,1); + } + } + + ITT_NOTIFY(sync_acquired, mutex); + + // Force acquire so that user's critical section receives correct values + // from processor that was previously in the user's critical section. + __TBB_load_with_acquire(going); +} + +bool queuing_rw_mutex::scoped_lock::try_acquire( queuing_rw_mutex& m, bool write ) +{ + __TBB_ASSERT( !this->mutex, "scoped_lock is already holding a mutex"); + + // Must set all fields before the fetch_and_store, because once the + // fetch_and_store executes, *this becomes accessible to other threads. 
+ prev = NULL; + next = NULL; + going = 0; + state = state_t(write ? STATE_WRITER : STATE_ACTIVEREADER); + internal_lock = RELEASED; + + if( m.q_tail ) return false; + // The CAS must have release semantics, because we are + // "sending" the fields initialized above to other processors. + queuing_rw_mutex::scoped_lock* pred = m.q_tail.compare_and_swap(this, NULL); + + // Force acquire so that user's critical section receives correct values + // from processor that was previously in the user's critical section. + // try_acquire should always have acquire semantic, even if failed. + __TBB_load_with_acquire(going); + + if( !pred ) { + mutex = &m; + ITT_NOTIFY(sync_acquired, mutex); + return true; + } else return false; + +} + +void queuing_rw_mutex::scoped_lock::release( ) +{ + __TBB_ASSERT(this->mutex!=NULL, "no lock acquired"); + + ITT_NOTIFY(sync_releasing, mutex); + + if( state == STATE_WRITER ) { // Acquired for write + + // The logic below is the same as "writerUnlock", but restructured to remove "return" in the middle of routine. + // In the statement below, acquire semantics of reading 'next' is required + // so that following operations with fields of 'next' are safe. + scoped_lock* n = __TBB_load_with_acquire(next); + if( !n ) { + if( this == mutex->q_tail.compare_and_swap(NULL, this) ) { + // this was the only item in the queue, and the queue is now empty. + goto done; + } + spin_wait_while_eq( next, (scoped_lock*)NULL ); + n = next; + } + n->going = 2; // protect next queue node from being destroyed too early + if( n->state==STATE_UPGRADE_WAITING ) { + // the next waiting for upgrade means this writer was upgraded before. + acquire_internal_lock(); + queuing_rw_mutex::scoped_lock* tmp = tricky_pointer::fetch_and_store(&(n->prev), NULL); + n->state = STATE_UPGRADE_LOSER; + __TBB_store_with_release(n->going,1); + unblock_or_wait_on_internal_lock(get_flag(tmp)); + } else { + __TBB_ASSERT( state & (STATE_COMBINED_WAITINGREADER | STATE_WRITER), "unexpected state" ); + __TBB_ASSERT( !( tricky_pointer(n->prev) & FLAG ), "use of corrupted pointer!" ); + n->prev = NULL; + // ensure release semantics on IPF + __TBB_store_with_release(n->going,1); + } + + } else { // Acquired for read + + queuing_rw_mutex::scoped_lock *tmp = NULL; +retry: + // Addition to the original paper: Mark this->prev as in use + queuing_rw_mutex::scoped_lock *pred = tricky_pointer::fetch_and_add(&(this->prev), FLAG); + + if( pred ) { + if( !(pred->try_acquire_internal_lock()) ) + { + // Failed to acquire the lock on pred. The predecessor either unlinks or upgrades. + // In the second case, it could or could not know my "in use" flag - need to check + tmp = tricky_pointer::compare_and_swap(&(this->prev), pred, tricky_pointer(pred)|FLAG ); + if( !(tricky_pointer(tmp)&FLAG) ) { + // Wait for the predecessor to change this->prev (e.g. 
during unlink) + spin_wait_while_eq( this->prev, tricky_pointer(pred)|FLAG ); + // Now owner of pred is waiting for _us_ to release its lock + pred->release_internal_lock(); + } + else ; // The "in use" flag is back -> the predecessor didn't get it and will release itself; nothing to do + + tmp = NULL; + goto retry; + } + __TBB_ASSERT(pred && pred->internal_lock==ACQUIRED, "predecessor's lock is not acquired"); + this->prev = pred; + acquire_internal_lock(); + + __TBB_store_with_release(pred->next,reinterpret_cast(NULL)); + + if( !next && this != mutex->q_tail.compare_and_swap(pred, this) ) { + spin_wait_while_eq( next, (void*)NULL ); + } + __TBB_ASSERT( !get_flag(next), "use of corrupted pointer" ); + + // ensure acquire semantics of reading 'next' + if( __TBB_load_with_acquire(next) ) { // I->next != nil + // Equivalent to I->next->prev = I->prev but protected against (prev[n]&FLAG)!=0 + tmp = tricky_pointer::fetch_and_store(&(next->prev), pred); + // I->prev->next = I->next; + __TBB_ASSERT(this->prev==pred, NULL); + __TBB_store_with_release(pred->next,next); + } + // Safe to release in the order opposite to acquiring which makes the code simplier + pred->release_internal_lock(); + + } else { // No predecessor when we looked + acquire_internal_lock(); // "exclusiveLock(&I->EL)" + // ensure acquire semantics of reading 'next' + scoped_lock* n = __TBB_load_with_acquire(next); + if( !n ) { + if( this != mutex->q_tail.compare_and_swap(NULL, this) ) { + spin_wait_while_eq( next, (scoped_lock*)NULL ); + n = next; + } else { + goto unlock_self; + } + } + n->going = 2; // protect next queue node from being destroyed too early + tmp = tricky_pointer::fetch_and_store(&(n->prev), NULL); + // ensure release semantics on IPF + __TBB_store_with_release(n->going,1); + } +unlock_self: + unblock_or_wait_on_internal_lock(get_flag(tmp)); + } +done: + spin_wait_while_eq( going, 2 ); + + initialize(); +} + +bool queuing_rw_mutex::scoped_lock::downgrade_to_reader() +{ + __TBB_ASSERT( state==STATE_WRITER, "no sense to downgrade a reader" ); + + ITT_NOTIFY(sync_releasing, mutex); + + // ensure acquire semantics of reading 'next' + if( ! __TBB_load_with_acquire(next) ) { + state = STATE_READER; + if( this==mutex->q_tail ) { + unsigned short old_state = state.compare_and_swap(STATE_ACTIVEREADER, STATE_READER); + if( old_state==STATE_READER ) { + goto downgrade_done; + } + } + /* wait for the next to register */ + spin_wait_while_eq( next, (void*)NULL ); + } + __TBB_ASSERT( next, "still no successor at this point!" ); + if( next->state & STATE_COMBINED_WAITINGREADER ) + __TBB_store_with_release(next->going,1); + else if( next->state==STATE_UPGRADE_WAITING ) + // the next waiting for upgrade means this writer was upgraded before. + next->state = STATE_UPGRADE_LOSER; + state = STATE_ACTIVEREADER; + +downgrade_done: + return true; +} + +bool queuing_rw_mutex::scoped_lock::upgrade_to_writer() +{ + __TBB_ASSERT( state==STATE_ACTIVEREADER, "only active reader can be upgraded" ); + + queuing_rw_mutex::scoped_lock * tmp; + queuing_rw_mutex::scoped_lock * me = this; + + ITT_NOTIFY(sync_releasing, mutex); + state = STATE_UPGRADE_REQUESTED; +requested: + __TBB_ASSERT( !( tricky_pointer(next) & FLAG ), "use of corrupted pointer!" 
); + acquire_internal_lock(); + if( this != mutex->q_tail.compare_and_swap(tricky_pointer(me)|FLAG, this) ) { + spin_wait_while_eq( next, (void*)NULL ); + queuing_rw_mutex::scoped_lock * n; + n = tricky_pointer::fetch_and_add(&(this->next), FLAG); + unsigned short n_state = n->state; + /* the next reader can be blocked by our state. the best thing to do is to unblock it */ + if( n_state & STATE_COMBINED_WAITINGREADER ) + __TBB_store_with_release(n->going,1); + tmp = tricky_pointer::fetch_and_store(&(n->prev), this); + unblock_or_wait_on_internal_lock(get_flag(tmp)); + if( n_state & (STATE_COMBINED_READER | STATE_UPGRADE_REQUESTED) ) { + // save n|FLAG for simplicity of following comparisons + tmp = tricky_pointer(n)|FLAG; + atomic_backoff backoff; + while(next==tmp) { + if( state & STATE_COMBINED_UPGRADING ) { + if( __TBB_load_with_acquire(next)==tmp ) + next = n; + goto waiting; + } + backoff.pause(); + } + __TBB_ASSERT(next!=(tricky_pointer(n)|FLAG), NULL); + goto requested; + } else { + __TBB_ASSERT( n_state & (STATE_WRITER | STATE_UPGRADE_WAITING), "unexpected state"); + __TBB_ASSERT( (tricky_pointer(n)|FLAG)==next, NULL); + next = n; + } + } else { + /* We are in the tail; whoever comes next is blocked by q_tail&FLAG */ + release_internal_lock(); + } // if( this != mutex->q_tail... ) + state.compare_and_swap(STATE_UPGRADE_WAITING, STATE_UPGRADE_REQUESTED); + +waiting: + __TBB_ASSERT( !( tricky_pointer(next) & FLAG ), "use of corrupted pointer!" ); + __TBB_ASSERT( state & STATE_COMBINED_UPGRADING, "wrong state at upgrade waiting_retry" ); + __TBB_ASSERT( me==this, NULL ); + ITT_NOTIFY(sync_prepare, mutex); + /* if noone was blocked by the "corrupted" q_tail, turn it back */ + mutex->q_tail.compare_and_swap( this, tricky_pointer(me)|FLAG ); + queuing_rw_mutex::scoped_lock * pred; + pred = tricky_pointer::fetch_and_add(&(this->prev), FLAG); + if( pred ) { + bool success = pred->try_acquire_internal_lock(); + pred->state.compare_and_swap(STATE_UPGRADE_WAITING, STATE_UPGRADE_REQUESTED); + if( !success ) { + tmp = tricky_pointer::compare_and_swap(&(this->prev), pred, tricky_pointer(pred)|FLAG ); + if( tricky_pointer(tmp)&FLAG ) { + spin_wait_while_eq(this->prev, pred); + pred = this->prev; + } else { + spin_wait_while_eq( this->prev, tricky_pointer(pred)|FLAG ); + pred->release_internal_lock(); + } + } else { + this->prev = pred; + pred->release_internal_lock(); + spin_wait_while_eq(this->prev, pred); + pred = this->prev; + } + if( pred ) + goto waiting; + } else { + // restore the corrupted prev field for possible further use (e.g. if downgrade back to reader) + this->prev = pred; + } + __TBB_ASSERT( !pred && !this->prev, NULL ); + + // additional lifetime issue prevention checks + // wait for the successor to finish working with my fields + wait_for_release_of_internal_lock(); + // now wait for the predecessor to finish working with my fields + spin_wait_while_eq( going, 2 ); + // there is an acquire semantics statement in the end of spin_wait_while_eq. + + bool result = ( state != STATE_UPGRADE_LOSER ); + state = STATE_WRITER; + going = 1; + + ITT_NOTIFY(sync_acquired, mutex); + return result; +} + +void queuing_rw_mutex::internal_construct() { + ITT_SYNC_CREATE(this, _T("tbb::queuing_rw_mutex"), _T("")); +} + +} // namespace tbb diff --git a/dep/tbb/src/tbb/recursive_mutex.cpp b/dep/tbb/src/tbb/recursive_mutex.cpp new file mode 100644 index 000000000..95e62906c --- /dev/null +++ b/dep/tbb/src/tbb/recursive_mutex.cpp @@ -0,0 +1,143 @@ +/* + Copyright 2005-2009 Intel Corporation. 
All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/recursive_mutex.h" +#include "itt_notify.h" + +namespace tbb { + +void recursive_mutex::scoped_lock::internal_acquire( recursive_mutex& m ) { +#if _WIN32||_WIN64 + switch( m.state ) { + case INITIALIZED: + // since we cannot look into the internal of the CriticalSection object + // we won't know how many times the lock has been acquired, and thus + // we won't know when we may safely set the state back to INITIALIZED + // if we change the state to HELD as in mutex.cpp. 
thus, we won't change + // the state for recursive_mutex + EnterCriticalSection( &m.impl ); + break; + case DESTROYED: + __TBB_ASSERT(false,"recursive_mutex::scoped_lock: mutex already destroyed"); + break; + default: + __TBB_ASSERT(false,"recursive_mutex::scoped_lock: illegal mutex state"); + break; + } +#else + int error_code = pthread_mutex_lock(&m.impl); + __TBB_ASSERT_EX(!error_code,"recursive_mutex::scoped_lock: pthread_mutex_lock failed"); +#endif /* _WIN32||_WIN64 */ + my_mutex = &m; +} + +void recursive_mutex::scoped_lock::internal_release() { + __TBB_ASSERT( my_mutex, "recursive_mutex::scoped_lock: not holding a mutex" ); +#if _WIN32||_WIN64 + switch( my_mutex->state ) { + case INITIALIZED: + LeaveCriticalSection( &my_mutex->impl ); + break; + case DESTROYED: + __TBB_ASSERT(false,"recursive_mutex::scoped_lock: mutex already destroyed"); + break; + default: + __TBB_ASSERT(false,"recursive_mutex::scoped_lock: illegal mutex state"); + break; + } +#else + int error_code = pthread_mutex_unlock(&my_mutex->impl); + __TBB_ASSERT_EX(!error_code, "recursive_mutex::scoped_lock: pthread_mutex_unlock failed"); +#endif /* _WIN32||_WIN64 */ + my_mutex = NULL; +} + +bool recursive_mutex::scoped_lock::internal_try_acquire( recursive_mutex& m ) { +#if _WIN32||_WIN64 + switch( m.state ) { + case INITIALIZED: + break; + case DESTROYED: + __TBB_ASSERT(false,"recursive_mutex::scoped_lock: mutex already destroyed"); + break; + default: + __TBB_ASSERT(false,"recursive_mutex::scoped_lock: illegal mutex state"); + break; + } +#endif /* _WIN32||_WIN64 */ + bool result; +#if _WIN32||_WIN64 + result = TryEnterCriticalSection(&m.impl)!=0; +#else + result = pthread_mutex_trylock(&m.impl)==0; +#endif /* _WIN32||_WIN64 */ + if( result ) + my_mutex = &m; + return result; +} + +void recursive_mutex::internal_construct() { +#if _WIN32||_WIN64 + InitializeCriticalSection(&impl); + state = INITIALIZED; +#else + pthread_mutexattr_t mtx_attr; + int error_code = pthread_mutexattr_init( &mtx_attr ); + if( error_code ) + tbb::internal::handle_perror(error_code,"recursive_mutex: pthread_mutexattr_init failed"); + + pthread_mutexattr_settype( &mtx_attr, PTHREAD_MUTEX_RECURSIVE ); + error_code = pthread_mutex_init( &impl, &mtx_attr ); + if( error_code ) + tbb::internal::handle_perror(error_code,"recursive_mutex: pthread_mutex_init failed"); + pthread_mutexattr_destroy( &mtx_attr ); +#endif /* _WIN32||_WIN64*/ + ITT_SYNC_CREATE(&impl, _T("tbb::recursive_mutex"), _T("")); +} + +void recursive_mutex::internal_destroy() { +#if _WIN32||_WIN64 + switch( state ) { + case INITIALIZED: + DeleteCriticalSection(&impl); + break; + case DESTROYED: + __TBB_ASSERT(false,"recursive_mutex: already destroyed"); + break; + default: + __TBB_ASSERT(false,"recursive_mutex: illegal state for destruction"); + break; + } + state = DESTROYED; +#else + int error_code = pthread_mutex_destroy(&impl); + __TBB_ASSERT_EX(!error_code,"recursive_mutex: pthread_mutex_destroy failed"); +#endif /* _WIN32||_WIN64 */ +} + +} // namespace tbb diff --git a/dep/tbb/src/tbb/spin_mutex.cpp b/dep/tbb/src/tbb/spin_mutex.cpp new file mode 100644 index 000000000..e233ffb33 --- /dev/null +++ b/dep/tbb/src/tbb/spin_mutex.cpp @@ -0,0 +1,68 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. 
+ + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/tbb_machine.h" +#include "tbb/spin_mutex.h" +#include "itt_notify.h" +#include "tbb_misc.h" + +namespace tbb { + +void spin_mutex::scoped_lock::internal_acquire( spin_mutex& m ) { + __TBB_ASSERT( !my_mutex, "already holding a lock on a spin_mutex" ); + ITT_NOTIFY(sync_prepare, &m); + my_unlock_value = __TBB_LockByte(m.flag); + my_mutex = &m; + ITT_NOTIFY(sync_acquired, &m); +} + +void spin_mutex::scoped_lock::internal_release() { + __TBB_ASSERT( my_mutex, "release on spin_mutex::scoped_lock that is not holding a lock" ); + __TBB_ASSERT( !(my_unlock_value&1), "corrupted scoped_lock?" ); + + ITT_NOTIFY(sync_releasing, my_mutex); + __TBB_store_with_release(my_mutex->flag, static_cast(my_unlock_value)); + my_mutex = NULL; +} + +bool spin_mutex::scoped_lock::internal_try_acquire( spin_mutex& m ) { + __TBB_ASSERT( !my_mutex, "already holding a lock on a spin_mutex" ); + bool result = bool( __TBB_TryLockByte(m.flag) ); + if( result ) { + my_unlock_value = 0; + my_mutex = &m; + ITT_NOTIFY(sync_acquired, &m); + } + return result; +} + +void spin_mutex::internal_construct() { + ITT_SYNC_CREATE(this, _T("tbb::spin_mutex"), _T("")); +} + +} // namespace tbb diff --git a/dep/tbb/src/tbb/spin_rw_mutex.cpp b/dep/tbb/src/tbb/spin_rw_mutex.cpp new file mode 100644 index 000000000..b3ce9d851 --- /dev/null +++ b/dep/tbb/src/tbb/spin_rw_mutex.cpp @@ -0,0 +1,174 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "tbb/spin_rw_mutex.h" +#include "tbb/tbb_machine.h" +#include "itt_notify.h" + +#if defined(_MSC_VER) && defined(_Wp64) + // Workaround for overzealous compiler warnings in /Wp64 mode + #pragma warning (disable: 4244) +#endif + +namespace tbb { + +template // a template can work with private spin_rw_mutex::state_t +static inline T CAS(volatile T &addr, T newv, T oldv) { + // ICC (9.1 and 10.1 tried) unable to do implicit conversion + // from "volatile T*" to "volatile void*", so explicit cast added. + return T(__TBB_CompareAndSwapW((volatile void *)&addr, (intptr_t)newv, (intptr_t)oldv)); +} + +//! Acquire write lock on the given mutex. +bool spin_rw_mutex_v3::internal_acquire_writer() +{ + ITT_NOTIFY(sync_prepare, this); + internal::atomic_backoff backoff; + for(;;) { + state_t s = const_cast(state); // ensure reloading + if( !(s & BUSY) ) { // no readers, no writers + if( CAS(state, WRITER, s)==s ) + break; // successfully stored writer flag + backoff.reset(); // we could be very close to complete op. + } else if( !(s & WRITER_PENDING) ) { // no pending writers + __TBB_AtomicOR(&state, WRITER_PENDING); + } + backoff.pause(); + } + ITT_NOTIFY(sync_acquired, this); + return false; +} + +//! Release writer lock on the given mutex +void spin_rw_mutex_v3::internal_release_writer() +{ + ITT_NOTIFY(sync_releasing, this); + __TBB_AtomicAND( &state, READERS ); +} + +//! Acquire read lock on given mutex. +void spin_rw_mutex_v3::internal_acquire_reader() +{ + ITT_NOTIFY(sync_prepare, this); + internal::atomic_backoff backoff; + for(;;) { + state_t s = const_cast(state); // ensure reloading + if( !(s & (WRITER|WRITER_PENDING)) ) { // no writer or write requests + state_t t = (state_t)__TBB_FetchAndAddW( &state, (intptr_t) ONE_READER ); + if( !( t&WRITER )) + break; // successfully stored increased number of readers + // writer got there first, undo the increment + __TBB_FetchAndAddW( &state, -(intptr_t)ONE_READER ); + } + backoff.pause(); + } + + ITT_NOTIFY(sync_acquired, this); + __TBB_ASSERT( state & READERS, "invalid state of a read lock: no readers" ); +} + +//! Upgrade reader to become a writer. +/** Returns true if the upgrade happened without re-acquiring the lock and false if opposite */ +bool spin_rw_mutex_v3::internal_upgrade() +{ + state_t s = state; + __TBB_ASSERT( s & READERS, "invalid state before upgrade: no readers " ); + // check and set writer-pending flag + // required conditions: either no pending writers, or we are the only reader + // (with multiple readers and pending writer, another upgrade could have been requested) + while( (s & READERS)==ONE_READER || !(s & WRITER_PENDING) ) { + state_t old_s = s; + if( (s=CAS(state, s | WRITER | WRITER_PENDING, s))==old_s ) { + internal::atomic_backoff backoff; + ITT_NOTIFY(sync_prepare, this); + // the state should be 0...0111, i.e. 
1 reader and waiting writer; + // both new readers and writers are blocked + while( (state & READERS) != ONE_READER ) // more than 1 reader + backoff.pause(); + __TBB_ASSERT((state&(WRITER_PENDING|WRITER))==(WRITER_PENDING|WRITER),"invalid state when upgrading to writer"); + + __TBB_FetchAndAddW( &state, - (intptr_t)(ONE_READER+WRITER_PENDING)); + ITT_NOTIFY(sync_acquired, this); + return true; // successfully upgraded + } + } + // slow reacquire + internal_release_reader(); + return internal_acquire_writer(); // always returns false +} + +//! Downgrade writer to a reader +void spin_rw_mutex_v3::internal_downgrade() { + ITT_NOTIFY(sync_releasing, this); + __TBB_FetchAndAddW( &state, (intptr_t)(ONE_READER-WRITER)); + __TBB_ASSERT( state & READERS, "invalid state after downgrade: no readers" ); +} + +//! Release read lock on the given mutex +void spin_rw_mutex_v3::internal_release_reader() +{ + __TBB_ASSERT( state & READERS, "invalid state of a read lock: no readers" ); + ITT_NOTIFY(sync_releasing, this); // release reader + __TBB_FetchAndAddWrelease( &state,-(intptr_t)ONE_READER); +} + +//! Try to acquire write lock on the given mutex +bool spin_rw_mutex_v3::internal_try_acquire_writer() +{ + // for a writer: only possible to acquire if no active readers or writers + state_t s = state; + if( !(s & BUSY) ) // no readers, no writers; mask is 1..1101 + if( CAS(state, WRITER, s)==s ) { + ITT_NOTIFY(sync_acquired, this); + return true; // successfully stored writer flag + } + return false; +} + +//! Try to acquire read lock on the given mutex +bool spin_rw_mutex_v3::internal_try_acquire_reader() +{ + // for a reader: acquire if no active or waiting writers + state_t s = state; + if( !(s & (WRITER|WRITER_PENDING)) ) { // no writers + state_t t = (state_t)__TBB_FetchAndAddW( &state, (intptr_t) ONE_READER ); + if( !( t&WRITER )) { // got the lock + ITT_NOTIFY(sync_acquired, this); + return true; // successfully stored increased number of readers + } + // writer got there first, undo the increment + __TBB_FetchAndAddW( &state, -(intptr_t)ONE_READER ); + } + return false; +} + + +void spin_rw_mutex_v3::internal_construct() { + ITT_SYNC_CREATE(this, _T("tbb::spin_rw_mutex"), _T("")); +} +} // namespace tbb diff --git a/dep/tbb/src/tbb/task.cpp b/dep/tbb/src/tbb/task.cpp new file mode 100644 index 000000000..857052270 --- /dev/null +++ b/dep/tbb/src/tbb/task.cpp @@ -0,0 +1,3912 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +/* This file contains the TBB task scheduler. There are many classes + lumped together here because very few are exposed to the outside + world, and by putting them in a single translation unit, the + compiler's optimizer might be able to do a better job. */ + +#if USE_PTHREAD + + // Some pthreads documentation says that must be first header. + #include + #define __TBB_THREAD_ROUTINE + +#elif USE_WINTHREAD + + #include + #include /* Need _beginthreadex from there */ + #include /* Need _alloca from there */ + #define __TBB_THREAD_ROUTINE WINAPI + +#else + + #error Must define USE_PTHREAD or USE_WINTHREAD + +#endif + +#include +#include +#include +#include +#include +#include +#include "tbb/tbb_stddef.h" + +/* Temporarily change "private" to "public" while including "tbb/task.h". + This hack allows us to avoid publishing internal types and methods + in the public header files just for sake of friend declarations. */ +#define private public +#include "tbb/task.h" +#if __TBB_EXCEPTIONS +#include "tbb/tbb_exception.h" +#endif /* __TBB_EXCEPTIONS */ +#undef private + +#include "tbb/task_scheduler_init.h" +#include "tbb/cache_aligned_allocator.h" +#include "tbb/tbb_machine.h" +#include "tbb/mutex.h" +#include "tbb/atomic.h" +#if __TBB_SCHEDULER_OBSERVER +#include "tbb/task_scheduler_observer.h" +#include "tbb/spin_rw_mutex.h" +#include "tbb/aligned_space.h" +#endif /* __TBB_SCHEDULER_OBSERVER */ +#if __TBB_EXCEPTIONS +#include "tbb/spin_mutex.h" +#endif /* __TBB_EXCEPTIONS */ + +#include "tbb/partitioner.h" + +#include "../rml/include/rml_tbb.h" + +namespace tbb { + namespace internal { + namespace rml { + tbb_server* make_private_server( tbb_client& client ); + } + } +} + +#if DO_TBB_TRACE +#include +#define TBB_TRACE(x) ((void)std::printf x) +#else +#define TBB_TRACE(x) ((void)(0)) +#endif /* DO_TBB_TRACE */ + +#if TBB_USE_ASSERT +#define COUNT_TASK_NODES 1 +#endif /* TBB_USE_ASSERT */ + +/* If nonzero, then gather statistics */ +#ifndef STATISTICS +#define STATISTICS 0 +#endif /* STATISTICS */ + +#if STATISTICS +#define GATHER_STATISTIC(x) (x) +#else +#define GATHER_STATISTIC(x) ((void)0) +#endif /* STATISTICS */ + +#if __TBB_EXCEPTIONS +// The standard offsetof macro does not work for us since its usage is restricted +// by POD-types only. Using 0x1000 (not NULL) is necessary to appease GCC. +#define __TBB_offsetof(class_name, member_name) \ + ((ptrdiff_t)&(reinterpret_cast(0x1000)->member_name) - 0x1000) +// Returns address of the object containing a member with the given name and address +#define __TBB_get_object_addr(class_name, member_name, member_addr) \ + reinterpret_cast((char*)member_addr - __TBB_offsetof(class_name, member_name)) +#endif /* __TBB_EXCEPTIONS */ + +// This macro is an attempt to get rid of ugly ifdefs in the shared parts of the code. +// It drops the second argument depending on whether the controlling macro is defined. +// The first argument is just a convenience allowing to keep comma before the macro usage. 
+#if __TBB_EXCEPTIONS + #define __TBB_CONTEXT_ARG(arg1, context) arg1, context +#else /* !__TBB_EXCEPTIONS */ + #define __TBB_CONTEXT_ARG(arg1, context) arg1 +#endif /* !__TBB_EXCEPTIONS */ + +#if _MSC_VER && !defined(__INTEL_COMPILER) + // Workaround for overzealous compiler warnings + // These particular warnings are so ubquitous that no attempt is made to narrow + // the scope of the warnings. + #pragma warning (disable: 4100 4127 4312 4244 4267 4706) +#endif + +// internal headers +#include "tbb_misc.h" +#include "itt_notify.h" +#include "tls.h" + +namespace tbb { + +using namespace std; + +#if DO_ITT_NOTIFY + const tchar + *SyncType_GlobalLock = _T("TbbGlobalLock"), + *SyncType_Scheduler = _T("%Constant") + ; + const tchar + *SyncObj_SchedulerInitialization = _T("TbbSchedulerInitialization"), + *SyncObj_SchedulersList = _T("TbbSchedulersList"), + *SyncObj_WorkerLifeCycleMgmt = _T("TBB Scheduler"), + *SyncObj_TaskStealingLoop = _T("TBB Scheduler"), + *SyncObj_WorkerTaskPool = _T("TBB Scheduler"), + *SyncObj_MasterTaskPool = _T("TBB Scheduler"), + *SyncObj_TaskPoolSpinning = _T("TBB Scheduler"), + *SyncObj_Mailbox = _T("TBB Scheduler"), + *SyncObj_TaskReturnList = _T("TBB Scheduler"), + *SyncObj_GateLock = _T("TBB Scheduler"), + *SyncObj_Gate = _T("TBB Scheduler"), + *SyncObj_ContextsList = _T("TBB Scheduler") + ; +#endif /* DO_ITT_NOTIFY */ + +namespace internal { + +const stack_size_type MByte = 1<<20; +#if !defined(__TBB_WORDSIZE) +const stack_size_type ThreadStackSize = 1*MByte; +#elif __TBB_WORDSIZE<=4 +const stack_size_type ThreadStackSize = 2*MByte; +#else +const stack_size_type ThreadStackSize = 4*MByte; +#endif + +#if USE_PTHREAD +typedef void* thread_routine_return_type; +#else +typedef unsigned thread_routine_return_type; +#endif + +//------------------------------------------------------------------------ +// General utility section +//------------------------------------------------------------------------ + +#if TBB_USE_ASSERT + #define __TBB_POISON_DEQUE 1 +#endif /* TBB_USE_ASSERT */ + +#if __TBB_POISON_DEQUE + #if __ia64__ + task* const poisoned_taskptr = (task*)0xDDEEAADDDEADBEEF; + #elif _WIN64 + task* const poisoned_taskptr = (task*)0xDDEEAADDDEADBEEF; + #else + task* const poisoned_taskptr = (task*)0xDEADBEEF; + #endif + + #define __TBB_POISON_TASK_PTR(ptr) ptr = poisoned_taskptr + #define __TBB_ASSERT_VALID_TASK_PTR(ptr) __TBB_ASSERT( ptr != poisoned_taskptr, "task pointer in the deque is poisoned" ) +#else /* !__TBB_POISON_DEQUE */ + #define __TBB_POISON_TASK_PTR(ptr) ((void)0) + #define __TBB_ASSERT_VALID_TASK_PTR(ptr) ((void)0) +#endif /* !__TBB_POISON_DEQUE */ + + +//! Vector that grows without reallocations, and stores items in the reverse order. +/** Requires to initialize its first segment with a preallocated memory chunk + (usually it is static array or an array allocated on the stack). + The second template parameter specifies maximal number of segments. Each next + segment is twice as large as the previous one. 
**/ +template +class fast_reverse_vector +{ +public: + fast_reverse_vector ( T* initial_segment, size_t segment_size ) + : m_cur_segment(initial_segment) + , m_cur_segment_size(segment_size) + , m_pos(segment_size) + , m_num_segments(0) + , m_size(0) + { + __TBB_ASSERT ( initial_segment && segment_size, "Nonempty initial segment must be supplied"); + } + + ~fast_reverse_vector () + { + for ( size_t i = 1; i < m_num_segments; ++i ) + NFS_Free( m_segments[i] ); + } + + size_t size () const { return m_size + m_cur_segment_size - m_pos; } + + void push_back ( const T& val ) + { + if ( !m_pos ) { + m_segments[m_num_segments++] = m_cur_segment; + __TBB_ASSERT ( m_num_segments < max_segments, "Maximal capacity exceeded" ); + m_size += m_cur_segment_size; + m_cur_segment_size *= 2; + m_pos = m_cur_segment_size; + m_cur_segment = (T*)NFS_Allocate( m_cur_segment_size * sizeof(T), 1, NULL ); + } + m_cur_segment[--m_pos] = val; + } + + //! Copies the contents of the vector into the dst array. + /** Can only be used when T is a POD type, as copying does not invoke copy constructors. **/ + void copy_memory ( T* dst ) const + { + size_t size = m_cur_segment_size - m_pos; + memcpy( dst, m_cur_segment + m_pos, size * sizeof(T) ); + dst += size; + size = m_cur_segment_size / 2; + for ( long i = (long)m_num_segments - 1; i >= 0; --i ) { + memcpy( dst, m_segments[i], size * sizeof(T) ); + dst += size; + size /= 2; + } + } + +protected: + //! The current (not completely filled) segment + T *m_cur_segment; + + //! Capacity of m_cur_segment + size_t m_cur_segment_size; + + //! Insertion position in m_cur_segment + size_t m_pos; + + //! Array of filled segments (has fixed size specified by the second template parameter) + T *m_segments[max_segments]; + + //! Number of filled segments (the size of m_segments) + size_t m_num_segments; + + //! Number of items in the segments in m_segments + size_t m_size; + +}; // class fast_reverse_vector + +//------------------------------------------------------------------------ +// End of general utility section +//------------------------------------------------------------------------ + +//! Alignment for a task object +const size_t task_alignment = 16; + +//! Number of bytes reserved for a task prefix +/** If not exactly sizeof(task_prefix), the extra bytes *precede* the task_prefix. */ +const size_t task_prefix_reservation_size = ((sizeof(internal::task_prefix)-1)/task_alignment+1)*task_alignment; + +template class CustomScheduler; + + +class mail_outbox; + +struct task_proxy: public task { + static const intptr pool_bit = 1; + static const intptr mailbox_bit = 2; + /* All but two low-order bits represent a (task*). + Two low-order bits mean: + 1 = proxy is/was/will be in task pool + 2 = proxy is/was/will be in mailbox */ + intptr task_and_tag; + + //! Pointer to next task_proxy in a mailbox + task_proxy* next_in_mailbox; + + //! Mailbox to which this was mailed. + mail_outbox* outbox; +}; + +//! Internal representation of mail_outbox, without padding. +class unpadded_mail_outbox { +protected: + //! Pointer to first task_proxy in mailbox, or NULL if box is empty. + task_proxy* my_first; + + //! Pointer to last task_proxy in mailbox, or NULL if box is empty. + /** Low-order bit set to 1 to represent lock on the box. */ + task_proxy* my_last; + + //! Owner of mailbox is not executing a task, and has drained its own task pool. + bool my_is_idle; +}; + +//! Class representing where mail is put. +/** Padded to occupy a cache line. 
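    Giving each outbox a cache line of its own presumably avoids false sharing between
    neighboring mailboxes when their owners and senders touch them concurrently (editorial
    inference; the original comment does not state the rationale).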
*/ +class mail_outbox: unpadded_mail_outbox { + char pad[NFS_MaxLineSize-sizeof(unpadded_mail_outbox)]; + + //! Acquire lock on the box. + task_proxy* acquire() { + atomic_backoff backoff; + for(;;) { + // No fence on load, because subsequent compare-and-swap has the necessary fence. + intptr last = (intptr)my_last; + if( (last&1)==0 && __TBB_CompareAndSwapW(&my_last,last|1,last)==last) { + __TBB_ASSERT( (my_first==NULL)==((intptr(my_last)&~1)==0), NULL ); + return (task_proxy*)last; + } + backoff.pause(); + } + } + task_proxy* internal_pop() { + //! No fence on load of my_first, because if it is NULL, there's nothing further to read from another thread. + task_proxy* result = my_first; + if( result ) { + if( task_proxy* f = __TBB_load_with_acquire(result->next_in_mailbox) ) { + // No lock required + __TBB_store_with_release( my_first, f ); + } else { + // acquire() has the necessary fence. + task_proxy* l = acquire(); + __TBB_ASSERT(result==my_first,NULL); + if( !(my_first = result->next_in_mailbox) ) + l=0; + __TBB_store_with_release( my_last, l ); + } + } + return result; + } +public: + friend class mail_inbox; + + //! Push task_proxy onto the mailbox queue of another thread. + void push( task_proxy& t ) { + __TBB_ASSERT(&t!=NULL, NULL); + t.next_in_mailbox = NULL; + if( task_proxy* l = acquire() ) { + l->next_in_mailbox = &t; + } else { + my_first=&t; + } + // Fence required because caller is sending the task_proxy to another thread. + __TBB_store_with_release( my_last, &t ); + } +#if TBB_USE_ASSERT + //! Verify that *this is initialized empty mailbox. + /** Raise assertion if *this is not in initialized state, or sizeof(this) is wrong. + Instead of providing a constructor, we provide this assertion, because for + brevity and speed, we depend upon a memset to initialize instances of this class */ + void assert_is_initialized() const { + __TBB_ASSERT( sizeof(*this)==NFS_MaxLineSize, NULL ); + __TBB_ASSERT( !my_first, NULL ); + __TBB_ASSERT( !my_last, NULL ); + __TBB_ASSERT( !my_is_idle, NULL ); + } +#endif /* TBB_USE_ASSERT */ + + //! Drain the mailbox + intptr drain() { + intptr k = 0; + // No fences here because other threads have already quit. + for( ; task_proxy* t = my_first; ++k ) { + my_first = t->next_in_mailbox; + NFS_Free((char*)t-task_prefix_reservation_size); + } + return k; + } + + //! True if thread that owns this mailbox is looking for work. + bool recipient_is_idle() { + return my_is_idle; + } +}; + +//! Class representing source of mail. +class mail_inbox { + //! Corresponding sink where mail that we receive will be put. + mail_outbox* my_putter; +public: + //! Construct unattached inbox + mail_inbox() : my_putter(NULL) {} + + //! Attach inbox to a corresponding outbox. + void attach( mail_outbox& putter ) { + __TBB_ASSERT(!my_putter,"already attached"); + my_putter = &putter; + } + //! Detach inbox from its outbox + void detach() { + __TBB_ASSERT(my_putter,"not attached"); + my_putter = NULL; + } + //! Get next piece of mail, or NULL if mailbox is empty. + task_proxy* pop() { + return my_putter->internal_pop(); + } + //! Indicate whether thread that reads this mailbox is idle. + /** Raises assertion failure if mailbox is redundantly marked as not idle. */ + void set_is_idle( bool value ) { + if( my_putter ) { + __TBB_ASSERT( my_putter->my_is_idle || value, "attempt to redundantly mark mailbox as not idle" ); + my_putter->my_is_idle = value; + } + } +#if TBB_USE_ASSERT + //! Indicate whether thread that reads this mailbox is idle. 
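    // (Debug-only counterpart of set_is_idle() above: it merely checks, via __TBB_ASSERT,
    //  that the recorded idleness matches the caller's expectation, and always returns true
    //  so it can be used inside assertion expressions.)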
+ bool assert_is_idle( bool value ) const { + __TBB_ASSERT( !my_putter || my_putter->my_is_idle==value, NULL ); + return true; + } +#endif /* TBB_USE_ASSERT */ +#if DO_ITT_NOTIFY + //! Get pointer to corresponding outbox used for ITT_NOTIFY calls. + void* outbox() const {return my_putter;} +#endif /* DO_ITT_NOTIFY */ +}; + +#if __TBB_SCHEDULER_OBSERVER +//------------------------------------------------------------------------ +// observer_proxy +//------------------------------------------------------------------------ +class observer_proxy { + friend class task_scheduler_observer_v3; + //! Reference count used for garbage collection. + /** 1 for reference from my task_scheduler_observer. + 1 for each local_last_observer_proxy that points to me. + No accounting for predecessor in the global list. + No accounting for global_last_observer_proxy that points to me. */ + atomic gc_ref_count; + //! Pointer to next task_scheduler_observer + /** Valid even when *this has been removed from the global list. */ + observer_proxy* next; + //! Pointer to previous task_scheduler_observer in global list. + observer_proxy* prev; + //! Associated observer + task_scheduler_observer* observer; + //! Account for removing reference from p. No effect if p is NULL. + void remove_ref_slow(); + void remove_from_list(); + observer_proxy( task_scheduler_observer_v3& wo ); +public: + static observer_proxy* process_list( observer_proxy* local_last, bool is_worker, bool is_entry ); +}; +#endif /* __TBB_SCHEDULER_OBSERVER */ + + +//------------------------------------------------------------------------ +// Arena +//------------------------------------------------------------------------ + +class Arena; +class GenericScheduler; + +struct WorkerDescriptor { + //! NULL until worker is published. -1 if worker should not be published. + GenericScheduler* scheduler; + +}; + +//! The useful contents of an ArenaPrefix +class UnpaddedArenaPrefix: no_copy + ,rml::tbb_client +{ + friend class GenericScheduler; + template friend class internal::CustomScheduler; + friend class Arena; + friend class Governor; + friend struct WorkerDescriptor; + + //! Arena slot to try to acquire first for the next new master. + unsigned limit; + + //! Number of masters that own this arena. + /** This may be smaller than the number of masters who have entered the arena. */ + unsigned number_of_masters; + + //! Total number of slots in the arena + const unsigned number_of_slots; + + //! Number of workers that belong to this arena + const unsigned number_of_workers; + + //! Pointer to the RML server object that services requests for this arena. + rml::tbb_server* server; + //! Counter used to allocate job indices + tbb::atomic next_job_index; + + //! Stack size of worker threads + stack_size_type stack_size; + + //! Array of workers. + WorkerDescriptor* worker_list; + +#if COUNT_TASK_NODES + //! Net number of nodes that have been allocated from heap. + /** Updated each time a scheduler is destroyed. */ + atomic task_node_count; +#endif /* COUNT_TASK_NODES */ + + //! Estimate of number of available tasks. + /** The estimate is either 0 (SNAPSHOT_EMPTY), infinity (SNAPSHOT_FULL), or a special value. + The implementation of Arena::is_busy_or_empty requires that pool_state_t be unsigned. */ + typedef uintptr_t pool_state_t; + + //! Current estimate of number of available tasks. 
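    // (Inferred from check_if_pool_is_empty() further below: the "special value" mentioned
    //  above is in practice the address of the Arena itself, used as a unique "busy" marker
    //  while one thread takes a snapshot of the arena slots.)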
+ tbb::atomic pool_state; + +protected: + UnpaddedArenaPrefix( unsigned number_of_slots_, unsigned number_of_workers_ ) : + number_of_masters(1), + number_of_slots(number_of_slots_), + number_of_workers(number_of_workers_) + { +#if COUNT_TASK_NODES + task_node_count = 0; +#endif /* COUNT_TASK_NODES */ + limit = number_of_workers_; + server = NULL; + stack_size = 0; + next_job_index = 0; + } + void open_connection_to_rml(); + +private: + //! Return reference to corresponding arena. + Arena& arena(); + + /*override*/ version_type version() const { + return 0; + } + + /*override*/ unsigned max_job_count() const { + return number_of_workers; + } + + /*override*/ size_t min_stack_size() const { + return stack_size; + } + + /*override*/ policy_type policy() const { + return throughput; + } + + /*override*/ job* create_one_job(); + + /*override*/ void cleanup( job& j ); + + /*override*/ void acknowledge_close_connection(); + + /*override*/ void process( job& j ); +}; + +//! The prefix to Arena with padding. +class ArenaPrefix: public UnpaddedArenaPrefix { + //! Padding to fill out to multiple of cache line size. + char pad[(sizeof(UnpaddedArenaPrefix)/NFS_MaxLineSize+1)*NFS_MaxLineSize-sizeof(UnpaddedArenaPrefix)]; + +public: + ArenaPrefix( unsigned number_of_slots_, unsigned number_of_workers_ ) : + UnpaddedArenaPrefix(number_of_slots_,number_of_workers_) + { + } +}; + + +struct ArenaSlot { + // Task pool (the deque of task pointers) of the scheduler that owns this slot + /** Also is used to specify if the slot is empty or locked: + 0 - empty + -1 - locked **/ + task** task_pool; + + //! Index of the first ready task in the deque. + /** Modified by thieves, and by the owner during compaction/reallocation **/ + size_t head; + + //! Padding to avoid false sharing caused by the thieves accessing this slot + char pad1[NFS_MaxLineSize - sizeof(size_t) - sizeof(task**)]; + + //! Index of the element following the last ready task in the deque. + /** Modified by the owner thread. **/ + size_t tail; + + //! Padding to avoid false sharing caused by the thieves accessing the next slot + char pad2[NFS_MaxLineSize - sizeof(size_t)]; +}; + + +class Arena { + friend class UnpaddedArenaPrefix; + friend class GenericScheduler; + template friend class internal::CustomScheduler; + friend class Governor; + friend struct WorkerDescriptor; + + //! Get reference to prefix portion + ArenaPrefix& prefix() const {return ((ArenaPrefix*)(void*)this)[-1];} + + //! Get reference to mailbox corresponding to given affinity_id. + mail_outbox& mailbox( affinity_id id ) { + __TBB_ASSERT( 0 count; + + //! Platform specific code to acquire resources. + static void acquire_resources(); + + //! Platform specific code to release resources. + static void release_resources(); + + static bool InitializationDone; + + // Scenarios are possible when tools interop has to be initialized before the + // TBB itself. This imposes a requirement that the global initialization lock + // has to support valid static initialization, and does not issue any tool + // notifications in any build mode. + typedef unsigned char mutex_type; + + // Global initialization lock + static mutex_type InitializationLock; + +public: + static void lock() { __TBB_LockByte( InitializationLock ); } + + static void unlock() { __TBB_store_with_release( InitializationLock, 0 ); } + + static bool initialization_done() { return __TBB_load_with_acquire(InitializationDone); } + + //! Add initial reference to resources. 
+ /** We assume that dynamic loading of the library prevents any other threads from entering the library + until this constructor has finished running. */ + __TBB_InitOnce() { add_ref(); } + + //! Remove the initial reference to resources. + /** This is not necessarily the last reference if other threads are still running. + If the extra reference from DoOneTimeInitializations is present, remove it as well.*/ + ~__TBB_InitOnce(); + + //! Add reference to resources. If first reference added, acquire the resources. + static void add_ref() { + if( ++count==1 ) + acquire_resources(); + } + //! Remove reference to resources. If last reference removed, release the resources. + static void remove_ref() { + int k = --count; + __TBB_ASSERT(k>=0,"removed __TBB_InitOnce ref that was not added?"); + if( k==0 ) + release_resources(); + } +}; // class __TBB_InitOnce + +//------------------------------------------------------------------------ +// Class Governor +//------------------------------------------------------------------------ + +//! The class handles access to the single instance of Arena, and to TLS to keep scheduler instances. +/** It also supports automatic on-demand intialization of the TBB scheduler. + The class contains only static data members and methods.*/ +class Governor { + friend class __TBB_InitOnce; + friend void ITT_DoUnsafeOneTimeInitialization (); + + static basic_tls theTLS; + static Arena* theArena; + static mutex theArenaMutex; + + //! Create key for thread-local storage. + static void create_tls() { +#if USE_PTHREAD + int status = theTLS.create(auto_terminate); +#else + int status = theTLS.create(); +#endif + if( status ) + handle_perror(status, "TBB failed to initialize TLS storage\n"); + } + + //! Destroy the thread-local storage key. + static void destroy_tls() { +#if TBB_USE_ASSERT + if( __TBB_InitOnce::initialization_done() && theTLS.get() ) + fprintf(stderr, "TBB is unloaded while tbb::task_scheduler_init object is alive?"); +#endif + int status = theTLS.destroy(); + if( status ) + handle_perror(status, "TBB failed to destroy TLS storage"); + } + + //! Obtain the instance of arena to register a new master thread + /** If there is no active arena, create one. */ + static Arena* obtain_arena( int number_of_threads, stack_size_type thread_stack_size ) + { + mutex::scoped_lock lock( theArenaMutex ); + Arena* a = theArena; + if( a ) { + a->prefix().number_of_masters += 1; + } else { + if( number_of_threads==task_scheduler_init::automatic ) + number_of_threads = task_scheduler_init::default_num_threads(); + a = Arena::allocate_arena( 2*number_of_threads, number_of_threads-1, + thread_stack_size?thread_stack_size:ThreadStackSize ); + __TBB_ASSERT( a->prefix().number_of_masters==1, NULL ); + // Publish the Arena. + // A memory release fence is not required here, because workers have not started yet, + // and concurrent masters inspect theArena while holding theArenaMutex. + __TBB_ASSERT( !theArena, NULL ); + theArena = a; + // Must create server under lock, otherwise second master might see arena without a server. + a->prefix().open_connection_to_rml(); + } + return a; + } + + //! The internal routine to undo automatic initialization. + /** The signature is written with void* so that the routine + can be the destructor argument to pthread_key_create. */ + static void auto_terminate(void* scheduler); + +public: + //! Processes scheduler initialization request (possibly nested) in a master thread + /** If necessary creates new instance of arena and/or local scheduler. 
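        For example, a nested tbb::task_scheduler_init in a thread that already owns a
        scheduler simply increments that scheduler's ref_count instead of creating a second
        instance (see the implementation further below).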
+ The auto_init argument specifies if the call is due to automatic initialization. **/ + static GenericScheduler* init_scheduler( int num_threads, stack_size_type stack_size, bool auto_init = false ); + + //! Processes scheduler termination request (possibly nested) in a master thread + static void terminate_scheduler( GenericScheduler* s ); + + //! Dereference arena when a master thread stops using TBB. + /** If no more masters in the arena, terminate workers and destroy it. */ + static void finish_with_arena() { + mutex::scoped_lock lock( theArenaMutex ); + Arena* a = theArena; + __TBB_ASSERT( a, "theArena is missing" ); + if( --(a->prefix().number_of_masters) ) + a = NULL; + else { + theArena = NULL; + // Must do this while holding lock, otherwise terminate message might reach + // RML thread *after* initialize message reaches it for the next arena, which + // which causes TLS to be set to new value before old one is erased! + a->terminate_workers(); + } + } + + static size_t number_of_workers_in_arena() { + __TBB_ASSERT( theArena, "thread did not activate a task_scheduler_init object?" ); + // No fence required to read theArena, because it does not change after the thread starts. + return theArena->prefix().number_of_workers; + } + + //! Register TBB scheduler instance in thread local storage. + inline static void sign_on(GenericScheduler* s); + + //! Unregister TBB scheduler instance from thread local storage. + inline static void sign_off(GenericScheduler* s); + + //! Used to check validity of the local scheduler TLS contents. + static bool is_set ( GenericScheduler* s ) { return theTLS.get() == s; } + + //! Obtain the thread local instance of the TBB scheduler. + /** If the scheduler has not been initialized yet, initialization is done automatically. + Note that auto-initialized scheduler instance is destroyed only when its thread terminates. **/ + static GenericScheduler* local_scheduler () { + GenericScheduler* s = theTLS.get(); + return s ? s : init_scheduler( task_scheduler_init::automatic, 0, true ); + } + + //! Undo automatic initialization if necessary; call when a thread exits. + static void terminate_auto_initialized_scheduler() { + auto_terminate( theTLS.get() ); + } +}; // class Governor + +//------------------------------------------------------------------------ +// Begin shared data layout. +// +// The following global data items are read-only after initialization. +// The first item is aligned on a 128 byte boundary so that it starts a new cache line. +//------------------------------------------------------------------------ + +basic_tls Governor::theTLS; +Arena * Governor::theArena; +mutex Governor::theArenaMutex; + +//! Number of hardware threads +/** One more than the default number of workers. */ +static int DefaultNumberOfThreads; + +//! T::id for the scheduler traits type T to use for the scheduler +/** For example, the default value is DefaultSchedulerTraits::id. */ +static int SchedulerTraitsId; + +//! Counter of references to global shared resources such as TLS. +atomic __TBB_InitOnce::count; + +__TBB_InitOnce::mutex_type __TBB_InitOnce::InitializationLock; + +//! Flag that is set to true after one-time initializations are done. +bool __TBB_InitOnce::InitializationDone; + +#if DO_ITT_NOTIFY + static bool ITT_Present; + static bool ITT_InitializationDone; +#endif + +static rml::tbb_factory rml_server_factory; +//! Set to true if private statically linked RML server should be used instead of shared server. 
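// (Set in DoOneTimeInitializations() below when rml_server_factory.open() fails, i.e. when
//  no shared RML server library is available.)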
+static bool use_private_rml; + +#if !(_WIN32||_WIN64) || __TBB_TASK_CPP_DIRECTLY_INCLUDED + static __TBB_InitOnce __TBB_InitOnceHiddenInstance; +#endif + +#if __TBB_SCHEDULER_OBSERVER +typedef spin_rw_mutex::scoped_lock task_scheduler_observer_mutex_scoped_lock; +/** aligned_space used here to shut up warnings when mutex destructor is called while threads are still using it. */ +static aligned_space the_task_scheduler_observer_mutex; +static observer_proxy* global_first_observer_proxy; +static observer_proxy* global_last_observer_proxy; +#endif /* __TBB_SCHEDULER_OBSERVER */ + +//! Table of primes used by fast random-number generator. +/** Also serves to keep anything else from being placed in the same + cache line as the global data items preceding it. */ +static const unsigned Primes[] = { + 0x9e3779b1, 0xffe6cc59, 0x2109f6dd, 0x43977ab5, + 0xba5703f5, 0xb495a877, 0xe1626741, 0x79695e6b, + 0xbc98c09f, 0xd5bee2b3, 0x287488f9, 0x3af18231, + 0x9677cd4d, 0xbe3a6929, 0xadc6a877, 0xdcf0674b, + 0xbe4d6fe9, 0x5f15e201, 0x99afc3fd, 0xf3f16801, + 0xe222cfff, 0x24ba5fdb, 0x0620452d, 0x79f149e3, + 0xc8b93f49, 0x972702cd, 0xb07dd827, 0x6c97d5ed, + 0x085a3d61, 0x46eb5ea7, 0x3d9910ed, 0x2e687b5b, + 0x29609227, 0x6eb081f1, 0x0954c4e1, 0x9d114db9, + 0x542acfa9, 0xb3e6bd7b, 0x0742d917, 0xe9f3ffa7, + 0x54581edb, 0xf2480f45, 0x0bb9288f, 0xef1affc7, + 0x85fa0ca7, 0x3ccc14db, 0xe6baf34b, 0x343377f7, + 0x5ca19031, 0xe6d9293b, 0xf0a9f391, 0x5d2e980b, + 0xfc411073, 0xc3749363, 0xb892d829, 0x3549366b, + 0x629750ad, 0xb98294e5, 0x892d9483, 0xc235baf3, + 0x3d2402a3, 0x6bdef3c9, 0xbec333cd, 0x40c9520f +}; + +#if STATISTICS +//! Class for collecting statistics +/** There should be only one instance of this class. + Results are written to a file "statistics.txt" in tab-separated format. */ +static class statistics { +public: + statistics() { + my_file = fopen("statistics.txt","w"); + if( !my_file ) { + perror("fopen(\"statistics.txt\"\")"); + exit(1); + } + fprintf(my_file,"%13s\t%13s\t%13s\t%13s\t%13s\t%13s\n", "execute", "steal", "mail", "proxy_execute", "proxy_steal", "proxy_bypass" ); + } + ~statistics() { + fclose(my_file); + } + void record( long execute_count, long steal_count, long mail_received_count, + long proxy_execute_count, long proxy_steal_count, long proxy_bypass_count ) { + mutex::scoped_lock lock(my_mutex); + fprintf (my_file,"%13ld\t%13ld\t%13ld\t%13ld\t%13ld\t%13ld\n", execute_count, steal_count, mail_received_count, + proxy_execute_count, proxy_steal_count, proxy_bypass_count ); + } +private: + //! File into which statistics are written. + FILE* my_file; + //! Mutex that serializes accesses to my_file + mutex my_mutex; +} the_statistics; +#endif /* STATISTICS */ + +#if __TBB_EXCEPTIONS + struct scheduler_list_node_t { + scheduler_list_node_t *my_prev, + *my_next; + }; + + //! Head of the list of master thread schedulers. + static scheduler_list_node_t the_scheduler_list_head; + + //! Mutex protecting access to the list of schedulers. + static mutex the_scheduler_list_mutex; + +//! Counter that is incremented whenever new cancellation signal is sent to a task group. +/** Together with GenericScheduler::local_cancel_count forms cross-thread signaling + mechanism that allows to avoid locking at the hot path of normal execution flow. + + When a descendant task group context is being registered or unregistered, + the global and local counters are compared. 
If they differ, it means that + a cancellation signal is being propagated, and registration/deregistration + routines take slower branch that may block (at most one thread of the pool + can be blocked at any moment). Otherwise the control path is lock-free and fast. **/ + static uintptr_t global_cancel_count = 0; + + //! Context to be associated with dummy tasks of worker threads schedulers. + /** It is never used for its direct purpose, and is introduced solely for the sake + of avoiding one extra conditional branch in the end of wait_for_all method. **/ + static task_group_context dummy_context(task_group_context::isolated); +#endif /* __TBB_EXCEPTIONS */ + +//------------------------------------------------------------------------ +// End of shared data layout +//------------------------------------------------------------------------ + +//! Amount of time to pause between steals. +/** The default values below were found to be best empirically for K-Means + on the 32-way Altix and 4-way (*2 for HT) fxqlin04. */ +#if __TBB_ipf +static const long PauseTime = 1500; +#else +static const long PauseTime = 80; +#endif + +//------------------------------------------------------------------------ +// One-time Initializations +//------------------------------------------------------------------------ + +//! Defined in cache_aligned_allocator.cpp +extern void initialize_cache_aligned_allocator(); + +#if DO_ITT_NOTIFY +//! Performs initialization of tools support. +/** Defined in itt_notify.cpp. Must be called in a protected do-once manner. + \return true if notification hooks were installed, false otherwise. **/ +bool InitializeITT(); + +/** Thread-unsafe lazy one-time initialization of tools interop. + Used by both dummy handlers and general TBB one-time initialization routine. **/ +void ITT_DoUnsafeOneTimeInitialization () { + if ( !ITT_InitializationDone ) { + ITT_Present = InitializeITT(); + ITT_InitializationDone = true; + ITT_SYNC_CREATE(&Governor::theArenaMutex, SyncType_GlobalLock, SyncObj_SchedulerInitialization); + } +} + +/** Thread-safe lazy one-time initialization of tools interop. + Used by dummy handlers only. **/ +extern "C" +void ITT_DoOneTimeInitialization() { + __TBB_InitOnce::lock(); + ITT_DoUnsafeOneTimeInitialization(); + __TBB_InitOnce::unlock(); +} +#endif /* DO_ITT_NOTIFY */ + +//! Performs thread-safe lazy one-time general TBB initialization. +void DoOneTimeInitializations() { + __TBB_InitOnce::lock(); + // No fence required for load of InitializationDone, because we are inside a critical section. + if( !__TBB_InitOnce::InitializationDone ) { + __TBB_InitOnce::add_ref(); + if( GetBoolEnvironmentVariable("TBB_VERSION") ) + PrintVersion(); + bool have_itt = false; +#if DO_ITT_NOTIFY + ITT_DoUnsafeOneTimeInitialization(); + have_itt = ITT_Present; +#endif /* DO_ITT_NOTIFY */ + initialize_cache_aligned_allocator(); + ::rml::factory::status_type status = rml_server_factory.open(); + if( status!=::rml::factory::st_success ) { + use_private_rml = true; + PrintExtraVersionInfo( "RML", "private" ); + } else { + PrintExtraVersionInfo( "RML", "shared" ); + rml_server_factory.call_with_server_info( PrintRMLVersionInfo, (void*)"" ); + } + if( !have_itt ) + SchedulerTraitsId = IntelSchedulerTraits::id; +#if __TBB_EXCEPTIONS + else { + ITT_SYNC_CREATE(&the_scheduler_list_mutex, SyncType_GlobalLock, SyncObj_SchedulersList); + } +#endif /* __TBB_EXCEPTIONS */ + PrintExtraVersionInfo( "SCHEDULER", + SchedulerTraitsId==IntelSchedulerTraits::id ? 
"Intel" : "default" ); +#if __TBB_EXCEPTIONS + the_scheduler_list_head.my_next = &the_scheduler_list_head; + the_scheduler_list_head.my_prev = &the_scheduler_list_head; +#endif /* __TBB_EXCEPTIONS */ + __TBB_InitOnce::InitializationDone = true; + } + __TBB_InitOnce::unlock(); +} + +//------------------------------------------------------------------------ +// Methods of class __TBB_InitOnce +//------------------------------------------------------------------------ + +__TBB_InitOnce::~__TBB_InitOnce() { + remove_ref(); + // It is assumed that InitializationDone is not set after file-scope destructors start running, + // and thus no race on InitializationDone is possible. + if( initialization_done() ) { + // Remove reference that we added in DoOneTimeInitializations. + remove_ref(); + } +} + +void __TBB_InitOnce::acquire_resources() { + Governor::create_tls(); +} + +void __TBB_InitOnce::release_resources() { + rml_server_factory.close(); + Governor::destroy_tls(); +} + +#if (_WIN32||_WIN64) && !__TBB_TASK_CPP_DIRECTLY_INCLUDED +//! Windows "DllMain" that handles startup and shutdown of dynamic library. +extern "C" bool WINAPI DllMain( HANDLE /*hinstDLL*/, DWORD reason, LPVOID /*lpvReserved*/ ) { + switch( reason ) { + case DLL_PROCESS_ATTACH: + __TBB_InitOnce::add_ref(); + break; + case DLL_PROCESS_DETACH: + __TBB_InitOnce::remove_ref(); + // It is assumed that InitializationDone is not set after DLL_PROCESS_DETACH, + // and thus no race on InitializationDone is possible. + if( __TBB_InitOnce::initialization_done() ) { + // Remove reference that we added in DoOneTimeInitializations. + __TBB_InitOnce::remove_ref(); + } + break; + case DLL_THREAD_DETACH: + Governor::terminate_auto_initialized_scheduler(); + break; + } + return true; +} +#endif /* (_WIN32||_WIN64) && !__TBB_TASK_CPP_DIRECTLY_INCLUDED */ + +//------------------------------------------------------------------------ +// FastRandom +//------------------------------------------------------------------------ + +//! A fast random number generator. +/** Uses linear congruential method. */ +class FastRandom { + unsigned x, a; +public: + //! Get a random number. + unsigned short get() { + unsigned short r = x>>16; + x = x*a+1; + return r; + } + //! Construct a random number generator. + FastRandom( unsigned seed ) { + x = seed; + a = Primes[seed%(sizeof(Primes)/sizeof(Primes[0]))]; + } +}; + +//------------------------------------------------------------------------ +// GenericScheduler +//------------------------------------------------------------------------ + +// A pure virtual destructor should still have a body +// so the one for tbb::internal::scheduler::~scheduler() is provided here +scheduler::~scheduler( ) {} + + #define EmptyTaskPool ((task**)0u) + #define LockedTaskPool ((task**)~0u) + + #define LocalSpawn local_spawn + +//! Cilk-style task scheduler. +/** None of the fields here are every read or written by threads other than + the thread that creates the instance. + + Class GenericScheduler is an abstract base class that contains most of the scheduler, + except for tweaks specific to processors and tools (e.g. VTune). + The derived template class CustomScheduler fills in the tweaks. 
*/ +class GenericScheduler: public scheduler + ,public ::rml::job +{ + friend class tbb::task; + friend class UnpaddedArenaPrefix; + friend class Arena; + friend class allocate_root_proxy; + friend class Governor; +#if __TBB_EXCEPTIONS + friend class allocate_root_with_context_proxy; + friend class tbb::task_group_context; +#endif /* __TBB_EXCEPTIONS */ +#if __TBB_SCHEDULER_OBSERVER + friend class task_scheduler_observer_v3; +#endif /* __TBB_SCHEDULER_OBSERVER */ + friend class scheduler; + template friend class internal::CustomScheduler; + + //! If sizeof(task) is <=quick_task_size, it is handled on a free list instead of malloc'd. + static const size_t quick_task_size = 256-task_prefix_reservation_size; + + //! Definitions for bits in task_prefix::extra_state + enum internal_state_t { + //! Tag for TBB <3.0 tasks. + es_version_2_task = 0, + //! Tag for TBB 3.0 tasks. + es_version_3_task = 1, + //! Tag for TBB 3.0 task_proxy. + es_task_proxy = 2, + //! Set if ref_count might be changed by another thread. Used for debugging. + es_ref_count_active = 0x40 + }; + + static bool is_version_3_task( task& t ) { + return (t.prefix().extra_state & 0x3F)==0x1; + } + + //! Position in the call stack specifying its maximal filling when stealing is still allowed + uintptr_t my_stealing_threshold; +#if __TBB_ipf + //! Position in the RSE backup area specifying its maximal filling when stealing is still allowed + uintptr_t my_rsb_stealing_threshold; +#endif + + static const size_t null_arena_index = ~0u; + + //! Index of the arena slot the scheduler occupies now, or occupied last time. + size_t arena_index; + + //! Capacity of ready tasks deque (number of elements - pointers to task). + size_t task_pool_size; + + //! Dummy slot used when scheduler is not in arena + /** Only its "head" and "tail" members are ever used. The scheduler uses + the "task_pool" shortcut to access the task deque. **/ + ArenaSlot dummy_slot; + + //! Pointer to the slot in the arena we own at the moment. + /** When out of arena it points to this scheduler's dummy_slot. **/ + mutable ArenaSlot* arena_slot; + + bool in_arena () const { return arena_slot != &dummy_slot; } + + bool is_local_task_pool_empty () { + return arena_slot->task_pool == EmptyTaskPool || arena_slot->head >= arena_slot->tail; + } + + //! The arena that I own (if master) or belong to (if worker) + Arena* const arena; + + //! Random number generator used for picking a random victim from which to steal. + FastRandom random; + + //! Free list of small tasks that can be reused. + task* free_list; + + //! Innermost task whose task::execute() is running. + task* innermost_running_task; + + //! Fake root task created by slave threads. + /** The task is used as the "parent" argument to method wait_for_all. */ + task* dummy_task; + + //! Reference count for scheduler + /** Number of task_scheduler_init objects that point to this scheduler */ + long ref_count; + + mail_inbox inbox; + + void attach_mailbox( affinity_id id ) { + __TBB_ASSERT(id>0,NULL); + inbox.attach( arena->mailbox(id) ); + my_affinity_id = id; + } + + //! The mailbox id assigned to this scheduler. + /** The id is assigned upon first entry into the arena. + TODO: how are id's being garbage collected? + TODO: master thread may enter arena and leave and then reenter. + We want to give it the same affinity_id upon reentry, if practical. + */ + affinity_id my_affinity_id; + + /* A couple of bools can be located here because space is otherwise just padding after my_affinity_id. */ + + //! 
True if this is assigned to thread local storage by registering with Governor. + bool is_registered; + + //! True if *this was created by automatic TBB initialization + bool is_auto_initialized; + +#if __TBB_SCHEDULER_OBSERVER + //! Last observer_proxy processed by this scheduler + observer_proxy* local_last_observer_proxy; + + //! Notify any entry observers that have been created since the last call by this thread. + void notify_entry_observers() { + local_last_observer_proxy = observer_proxy::process_list(local_last_observer_proxy,is_worker(),/*is_entry=*/true); + } + + //! Notify all exit observers that this thread is no longer participating in task scheduling. + void notify_exit_observers( bool is_worker ) { + observer_proxy::process_list(local_last_observer_proxy,is_worker,/*is_entry=*/false); + } +#endif /* __TBB_SCHEDULER_OBSERVER */ + +#if COUNT_TASK_NODES + //! Net number of big task objects that have been allocated but not yet freed. + intptr task_node_count; +#endif /* COUNT_TASK_NODES */ + +#if STATISTICS + long current_active; + long current_length; + //! Number of big tasks that have been malloc'd. + /** To find total number of tasks malloc'd, compute (current_big_malloc+small_task_count) */ + long current_big_malloc; + long execute_count; + //! Number of tasks stolen + long steal_count; + //! Number of tasks received from mailbox + long mail_received_count; + long proxy_execute_count; + long proxy_steal_count; + long proxy_bypass_count; +#endif /* STATISTICS */ + + //! Sets up the data necessary for the stealing limiting heuristics + void init_stack_info (); + + //! Returns true if stealing is allowed + bool can_steal () { + int anchor; +#if __TBB_ipf + return my_stealing_threshold < (uintptr_t)&anchor && (uintptr_t)__TBB_get_bsp() < my_rsb_stealing_threshold; +#else + return my_stealing_threshold < (uintptr_t)&anchor; +#endif + } + + //! Actions common to enter_arena and try_enter_arena + void do_enter_arena(); + + //! Used by workers to enter the arena + /** Does not lock the task pool in case if arena slot has been successfully grabbed. **/ + void enter_arena(); + + //! Used by masters to try to enter the arena + /** Does not lock the task pool in case if arena slot has been successfully grabbed. **/ + void try_enter_arena(); + + //! Leave the arena + void leave_arena(); + + //! Locks victim's task pool, and returns pointer to it. The pointer can be NULL. + task** lock_task_pool( ArenaSlot* victim_arena_slot ) const; + + //! Unlocks victim's task pool + void unlock_task_pool( ArenaSlot* victim_arena_slot, task** victim_task_pool ) const; + + + //! Locks the local task pool + void acquire_task_pool() const; + + //! Unlocks the local task pool + void release_task_pool() const; + + //! Get a task from the local pool. + //! Checks if t is affinitized to another thread, and if so, bundles it as proxy. + /** Returns either t or proxy containing t. **/ + task* prepare_for_spawning( task* t ); + + /** Called only by the pool owner. + Returns the pointer to the task or NULL if the pool is empty. + In the latter case compacts the pool. **/ + task* get_task(); + + //! Attempt to get a task from the mailbox. + /** Called only by the thread that owns *this. + Gets a task only if there is one not yet executed by another thread. + If successful, unlinks the task and returns a pointer to it. + Otherwise returns NULL. */ + task* get_mailbox_task(); + + //! True if t is a task_proxy + static bool is_proxy( const task& t ) { + return t.prefix().extra_state==es_task_proxy; + } + + //! 
Extracts task pointer from task_proxy, and frees the proxy. + /** Return NULL if underlying task was claimed by mailbox. */ + task* strip_proxy( task_proxy* result ); + + //! Steal task from another scheduler's ready pool. + task* steal_task( ArenaSlot& victim_arena_slot ); + + /** Initial size of the task deque sufficient to serve without reallocation + 4 nested paralle_for calls with iteration space of 65535 grains each. **/ + static const size_t min_task_pool_size = 64; + + //! Allocate task pool containing at least n elements. + task** allocate_task_pool( size_t n ); + + //! Deallocate task pool that was allocated by means of allocate_task_pool. + static void free_task_pool( task** pool ) { + __TBB_ASSERT( pool, "attempt to free NULL TaskPool" ); + NFS_Free( pool ); + } + + //! Grow ready task deque to at least n elements. + void grow( size_t n ); + + //! Initialize a scheduler for a master thread. + static GenericScheduler* create_master( Arena* a ); + + //! Perform necessary cleanup when a master thread stops using TBB. + void cleanup_master(); + + //! Initialize a scheduler for a worker thread. + static GenericScheduler* create_worker( Arena& a, size_t index ); + + + //! Top-level routine for worker threads + /** Argument arg is a WorkerDescriptor*, cast to a (void*). */ + static thread_routine_return_type __TBB_THREAD_ROUTINE worker_routine( void* arg ); + + //! Perform necessary cleanup when a worker thread finishes. + static void cleanup_worker( void* arg ); + +protected: + GenericScheduler( Arena* arena ); + +#if TBB_USE_ASSERT + //! Check that internal data structures are in consistent state. + /** Raises __TBB_ASSERT failure if inconsistency is found. */ + bool assert_okay() const; +#endif /* TBB_USE_ASSERT */ + +public: + void local_spawn( task& first, task*& next ); + void local_spawn_root_and_wait( task& first, task*& next ); + + /*override*/ + void spawn( task& first, task*& next ) { + Governor::local_scheduler()->local_spawn( first, next ); + } + /*override*/ + void spawn_root_and_wait( task& first, task*& next ) { + Governor::local_scheduler()->local_spawn_root_and_wait( first, next ); + } + + //! Allocate and construct a scheduler object. + static GenericScheduler* allocate_scheduler( Arena* arena ); + + //! Destroy and deallocate scheduler that was created with method allocate. + void free_scheduler(); + + //! Allocate task object, either from the heap or a free list. + /** Returns uninitialized task object with initialized prefix. */ + task& allocate_task( size_t number_of_bytes, + __TBB_CONTEXT_ARG(task* parent, task_group_context* context) ); + + //! Optimization hint to free_task that enables it omit unnecessary tests and code. + enum hint { + //! No hint + no_hint=0, + //! Task is known to have been allocated by this scheduler + is_local=1, + //! Task is known to be a small task. + /** Task should be returned to the free list of *some* scheduler, possibly not this scheduler. */ + is_small=2, + //! Bitwise-OR of is_local and is_small. + /** Task should be returned to free list of this scheduler. */ + is_small_local=3 + }; + + //! Put task on free list. + /** Does not call destructor. */ + template + void free_task( task& t ); + + void free_task_proxy( task_proxy& tp ) { +#if TBB_USE_ASSERT + poison_pointer( tp.outbox ); + poison_pointer( tp.next_in_mailbox ); + tp.task_and_tag = 0xDEADBEEF; +#endif /* TBB_USE_ASSERT */ + free_task(tp); + } + + //! Return task object to the memory allocator. 
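    // (Layout reminder, added for clarity: the task_prefix occupies the bytes immediately
    //  preceding the task object in the same heap block, which is why deallocate_task()
    //  subtracts task_prefix_reservation_size from the object address before calling
    //  NFS_Free, mirroring the addition done in allocate_task().)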
+ void deallocate_task( task& t ) { +#if TBB_USE_ASSERT + task_prefix& p = t.prefix(); + p.state = 0xFF; + p.extra_state = 0xFF; + poison_pointer(p.next); +#endif /* TBB_USE_ASSERT */ + NFS_Free((char*)&t-task_prefix_reservation_size); +#if COUNT_TASK_NODES + task_node_count -= 1; +#endif /* COUNT_TASK_NODES */ + } + + //! True if running on a worker thread, false otherwise. + inline bool is_worker() { + return arena_index < arena->prefix().number_of_workers; + } + +#if TEST_ASSEMBLY_ROUTINES + /** Defined in test_assembly.cpp */ + void test_assembly_routines(); +#endif /* TEST_ASSEMBLY_ROUTINES */ + +#if COUNT_TASK_NODES + intptr get_task_node_count( bool count_arena_workers = false ) { + return task_node_count + (count_arena_workers? arena->workers_task_node_count(): 0); + } +#endif /* COUNT_TASK_NODES */ + + //! Special value used to mark return_list as not taking any more entries. + static task* plugged_return_list() {return (task*)(intptr)(-1);} + + //! Number of small tasks that have been allocated by this scheduler. + intptr small_task_count; + + //! List of small tasks that have been returned to this scheduler by other schedulers. + task* return_list; + + //! Free a small task t that that was allocated by a different scheduler + void free_nonlocal_small_task( task& t ); + +#if __TBB_EXCEPTIONS + //! Padding isolating thread local members from members that can be written to by other threads. + char _padding1[NFS_MaxLineSize - sizeof(context_list_node_t)]; + + //! Head of the thread specific list of task group contexts. + context_list_node_t context_list_head; + + //! Mutex protecting access to the list of task group contexts. + spin_mutex context_list_mutex; + + //! Used to form the list of master thread schedulers. + scheduler_list_node_t my_node; + + //! Thread local counter of cancellation requests. + /** When this counter equals global_cancel_count, the cancellation state known + to this thread is synchronized with the global cancellation state. + \sa #global_cancel_count **/ + uintptr_t local_cancel_count; + + //! Propagates cancellation request to all descendants of the argument context. + void propagate_cancellation ( task_group_context* ctx ); + + //! Propagates cancellation request to contexts registered by this scheduler. + void propagate_cancellation (); +#endif /* __TBB_EXCEPTIONS */ +}; // class GenericScheduler + +//------------------------------------------------------------------------ +// auto_empty_task +//------------------------------------------------------------------------ + +//! Smart holder for the empty task class with automatic destruction +class auto_empty_task { + task* my_task; + GenericScheduler* my_scheduler; +public: + auto_empty_task ( __TBB_CONTEXT_ARG(GenericScheduler *s, task_group_context* context) ) + : my_task( new(&s->allocate_task(sizeof(empty_task), __TBB_CONTEXT_ARG(NULL, context))) empty_task ) + , my_scheduler(s) + {} + // empty_task has trivial destructor, so there's no need to call it. 
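    // (Illustrative use, a sketch based on the interface above rather than code from the
    //  original source:
    //      auto_empty_task dummy( __TBB_CONTEXT_ARG(s, ctx) );
    //      dummy.prefix().ref_count = 2;   // e.g. one child plus the wait itself
    //      ... spawn work that uses &dummy as its parent ...
    //  the empty_task is returned to the scheduler's free list automatically when dummy
    //  goes out of scope.)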
+ ~auto_empty_task () { my_scheduler->free_task(*my_task); } + + operator task& () { return *my_task; } + task* operator & () { return my_task; } + task_prefix& prefix () { return my_task->prefix(); } +}; // class auto_empty_task + +//------------------------------------------------------------------------ +// Methods of class Governor that need full definition of GenericScheduler +//------------------------------------------------------------------------ + +void Governor::sign_on(GenericScheduler* s) { + __TBB_ASSERT( !s->is_registered, NULL ); + s->is_registered = true; + __TBB_InitOnce::add_ref(); + theTLS.set(s); +} + +void Governor::sign_off(GenericScheduler* s) { + if( s->is_registered ) { +#if USE_PTHREAD + __TBB_ASSERT( theTLS.get()==s || (!s->is_worker() && !theTLS.get()), "attempt to unregister a wrong scheduler instance" ); +#else + __TBB_ASSERT( theTLS.get()==s, "attempt to unregister a wrong scheduler instance" ); +#endif /* USE_PTHREAD */ + theTLS.set(NULL); + s->is_registered = false; + __TBB_InitOnce::remove_ref(); + } +} + +GenericScheduler* Governor::init_scheduler( int num_threads, stack_size_type stack_size, bool auto_init ) { + if( !__TBB_InitOnce::initialization_done() ) + DoOneTimeInitializations(); + GenericScheduler* s = theTLS.get(); + if( s ) { + s->ref_count += 1; + return s; + } + s = GenericScheduler::create_master( obtain_arena(num_threads, stack_size) ); + __TBB_ASSERT(s, "Somehow a local scheduler creation for a master thread failed"); + s->is_auto_initialized = auto_init; + return s; +} + +void Governor::terminate_scheduler( GenericScheduler* s ) { + __TBB_ASSERT( s == theTLS.get(), "Attempt to terminate non-local scheduler instance" ); + if( !--(s->ref_count) ) + s->cleanup_master(); +} + +void Governor::auto_terminate(void* arg){ + GenericScheduler* s = static_cast(arg); + if( s && s->is_auto_initialized ) { + if( !--(s->ref_count) ) { + if ( !theTLS.get() && !s->is_local_task_pool_empty() ) { + // This thread's TLS slot is already cleared. But in order to execute + // remaining tasks cleanup_master() will need TLS correctly set. + // So we temporarily restore its value. + theTLS.set(s); + s->cleanup_master(); + theTLS.set(NULL); + } + else + s->cleanup_master(); + } + } +} + +//------------------------------------------------------------------------ +// GenericScheduler implementation +//------------------------------------------------------------------------ + +inline task& GenericScheduler::allocate_task( size_t number_of_bytes, + __TBB_CONTEXT_ARG(task* parent, task_group_context* context) ) { + GATHER_STATISTIC(current_active+=1); + task* t = free_list; + if( number_of_bytes<=quick_task_size ) { + if( t ) { + GATHER_STATISTIC(current_length-=1); + __TBB_ASSERT( t->state()==task::freed, "free list of tasks is corrupted" ); + free_list = t->prefix().next; + } else if( return_list ) { + // No fence required for read of return_list above, because __TBB_FetchAndStoreW has a fence. 
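            // (return_list is a lock-free LIFO holding small tasks that other schedulers
            //  freed on this scheduler's behalf, see free_nonlocal_small_task() below; the
            //  fetch-and-store grabs the whole chain in one shot and leaves the list empty.)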
+ t = (task*)__TBB_FetchAndStoreW( &return_list, 0 ); + __TBB_ASSERT( t, "another thread emptied the return_list" ); + __TBB_ASSERT( t->prefix().origin==this, "task returned to wrong return_list" ); + ITT_NOTIFY( sync_acquired, &return_list ); + free_list = t->prefix().next; + } else { + t = (task*)((char*)NFS_Allocate( task_prefix_reservation_size+quick_task_size, 1, NULL ) + task_prefix_reservation_size ); +#if COUNT_TASK_NODES + ++task_node_count; +#endif /* COUNT_TASK_NODES */ + t->prefix().origin = this; + ++small_task_count; + } + } else { + GATHER_STATISTIC(current_big_malloc+=1); + t = (task*)((char*)NFS_Allocate( task_prefix_reservation_size+number_of_bytes, 1, NULL ) + task_prefix_reservation_size ); +#if COUNT_TASK_NODES + ++task_node_count; +#endif /* COUNT_TASK_NODES */ + t->prefix().origin = NULL; + } + task_prefix& p = t->prefix(); +#if __TBB_EXCEPTIONS + p.context = context; +#endif /* __TBB_EXCEPTIONS */ + p.owner = this; + p.ref_count = 0; + // Assign some not outrageously out-of-place value for a while + p.depth = 0; + p.parent = parent; + // In TBB 3.0 and later, the constructor for task sets extra_state to indicate the version of the tbb/task.h header. + // In TBB 2.0 and earlier, the constructor leaves extra_state as zero. + p.extra_state = 0; + p.affinity = 0; + p.state = task::allocated; + return *t; +} + +template +inline void GenericScheduler::free_task( task& t ) { + GATHER_STATISTIC(current_active-=1); + task_prefix& p = t.prefix(); + // Verify that optimization hints are correct. + __TBB_ASSERT( h!=is_small_local || p.origin==this, NULL ); + __TBB_ASSERT( !(h&is_small) || p.origin, NULL ); +#if TBB_USE_ASSERT + p.depth = 0xDEADBEEF; + p.ref_count = 0xDEADBEEF; + poison_pointer(p.owner); +#endif /* TBB_USE_ASSERT */ + __TBB_ASSERT( 1L<(t.prefix().origin); + __TBB_ASSERT( &s!=this, NULL ); + for(;;) { + task* old = s.return_list; + if( old==plugged_return_list() ) + break; + // Atomically insert t at head of s.return_list + t.prefix().next = old; + ITT_NOTIFY( sync_releasing, &s.return_list ); + if( __TBB_CompareAndSwapW( &s.return_list, (intptr)&t, (intptr)old )==(intptr)old ) + return; + } + deallocate_task(t); + if( __TBB_FetchAndDecrementWrelease( &s.small_task_count )==1 ) { + // We freed the last task allocated by scheduler s, so it's our responsibility + // to free the scheduler. + NFS_Free( &s ); + } +} + +//------------------------------------------------------------------------ +// CustomScheduler +//------------------------------------------------------------------------ + +//! A scheduler with a customized evaluation loop. +/** The customization can use SchedulerTraits to make decisions without needing a run-time check. */ +template +class CustomScheduler: private GenericScheduler { + //! Scheduler loop that dispatches tasks. + /** If child is non-NULL, it is dispatched first. + Then, until "parent" has a reference count of 1, other task are dispatched or stolen. */ + void local_wait_for_all( task& parent, task* child ); + + /*override*/ + void wait_for_all( task& parent, task* child ) { + static_cast(Governor::local_scheduler())->local_wait_for_all( parent, child ); + } + + typedef CustomScheduler scheduler_type; + + //! 
Construct a CustomScheduler + CustomScheduler( Arena* arena ) : GenericScheduler(arena) {} + + static bool tally_completion_of_one_predecessor( task& s ) { + task_prefix& p = s.prefix(); + if( SchedulerTraits::itt_possible ) + ITT_NOTIFY(sync_releasing, &p.ref_count); + if( SchedulerTraits::has_slow_atomic && p.ref_count==1 ) { + p.ref_count=0; + } else { + reference_count k = __TBB_FetchAndDecrementWrelease(&p.ref_count); + __TBB_ASSERT( k>0, "completion of task caused parent's reference count to underflow" ); + if( k!=1 ) + return false; + } + if( SchedulerTraits::itt_possible ) + ITT_NOTIFY(sync_acquired, &p.ref_count); + return true; + } + +public: + static GenericScheduler* allocate_scheduler( Arena* arena ) { + __TBB_ASSERT( arena, "missing arena" ); + scheduler_type* s = (scheduler_type*)NFS_Allocate(sizeof(scheduler_type),1,NULL); + new( s ) scheduler_type( arena ); + __TBB_ASSERT( s->assert_okay(), NULL ); + ITT_SYNC_CREATE(s, SyncType_Scheduler, SyncObj_TaskPoolSpinning); + return s; + } +}; + +//------------------------------------------------------------------------ +// AssertOkay +//------------------------------------------------------------------------ +#if TBB_USE_ASSERT +/** Logically, this method should be a member of class task. + But we do not want to publish it, so it is here instead. */ +static bool AssertOkay( const task& task ) { + __TBB_ASSERT( &task!=NULL, NULL ); + __TBB_ASSERT( (uintptr)&task % task_alignment == 0, "misaligned task" ); + __TBB_ASSERT( (unsigned)task.state()<=(unsigned)task::recycle, "corrupt task (invalid state)" ); + return true; +} +#endif /* TBB_USE_ASSERT */ + +//------------------------------------------------------------------------ +// Methods of Arena +//------------------------------------------------------------------------ +Arena* Arena::allocate_arena( unsigned number_of_slots, unsigned number_of_workers, stack_size_type stack_size) { + __TBB_ASSERT( sizeof(ArenaPrefix) % NFS_GetLineSize()==0, "ArenaPrefix not multiple of cache line size" ); + __TBB_ASSERT( sizeof(mail_outbox)==NFS_MaxLineSize, NULL ); + size_t n = sizeof(ArenaPrefix) + number_of_slots*(sizeof(mail_outbox)+sizeof(ArenaSlot)); + + unsigned char* storage = (unsigned char*)NFS_Allocate( n, 1, NULL ); + memset( storage, 0, n ); + Arena* a = (Arena*)(storage + sizeof(ArenaPrefix)+ number_of_slots*(sizeof(mail_outbox))); + __TBB_ASSERT( sizeof(a->slot[0]) % NFS_GetLineSize()==0, "Arena::slot size not multiple of cache line size" ); + __TBB_ASSERT( (uintptr)a % NFS_GetLineSize()==0, NULL ); + new( &a->prefix() ) ArenaPrefix( number_of_slots, number_of_workers ); + + // Allocate the worker_list + WorkerDescriptor * w = new WorkerDescriptor[number_of_workers]; + memset( w, 0, sizeof(WorkerDescriptor)*(number_of_workers)); + a->prefix().worker_list = w; + +#if TBB_USE_ASSERT + // Verify that earlier memset initialized the mailboxes. 
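    // (Layout of the single NFS_Allocate block, inferred from the pointer arithmetic above
    //  rather than stated in the original source:
    //      [ number_of_slots mail_outboxes | ArenaPrefix | ArenaSlot array ]
    //  'a' points at the slot array, prefix() sits immediately before it, and mailbox(id)
    //  is reached at a negative offset from the prefix.)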
+ for( unsigned j=1; j<=number_of_slots; ++j ) { + a->mailbox(j).assert_is_initialized(); + } +#endif /* TBB_USE_ASSERT */ + + a->prefix().stack_size = stack_size; + size_t k; + // Mark each worker slot as locked and unused + for( k=0; kslot + k, SyncType_Scheduler, SyncObj_WorkerTaskPool); + ITT_SYNC_CREATE(&w[k].scheduler, SyncType_Scheduler, SyncObj_WorkerLifeCycleMgmt); + ITT_SYNC_CREATE(&a->mailbox(k+1), SyncType_Scheduler, SyncObj_Mailbox); + } + // Mark rest of slots as unused + for( ; kslot + k, SyncType_Scheduler, SyncObj_MasterTaskPool); + ITT_SYNC_CREATE(&a->mailbox(k+1), SyncType_Scheduler, SyncObj_Mailbox); + } + + return a; +} + +inline void Arena::mark_pool_full() { + // Double-check idiom that is deliberately sloppy about memory fences. + // Technically, to avoid missed wakeups, there should be a full memory fence between the point we + // released the task pool (i.e. spawned task) and read the gate's state. However, adding such a + // fence might hurt overall performance more than it helps, because the fence would be executed + // on every task pool release, even when stealing does not occur. Since TBB allows parallelism, + // but never promises parallelism, the missed wakeup is not a correctness problem. + pool_state_t snapshot = prefix().pool_state; + if( is_busy_or_empty(snapshot) ) { + // Attempt to mark as full. The compare_and_swap below is a little unusual because the + // result is compared to a value that can be different than the comparand argument. + if( prefix().pool_state.compare_and_swap( SNAPSHOT_FULL, snapshot )==SNAPSHOT_EMPTY ) { + if( snapshot!=SNAPSHOT_EMPTY ) { + // This thread initialized s1 to "busy" and then another thread transitioned + // pool_state to "empty" in the meantime, which caused the compare_and_swap above + // to fail. Attempt to transition pool_state from "empty" to "full". + if( prefix().pool_state.compare_and_swap( SNAPSHOT_FULL, SNAPSHOT_EMPTY )!=SNAPSHOT_EMPTY ) { + // Some other thread transitioned pool_state from "empty", and hence became + // responsible for waking up workers. + return; + } + } + // This thread transitioned pool from empty to full state, and thus is responsible for + // telling RML that there is work to do. + prefix().server->adjust_job_count_estimate( int(prefix().number_of_workers) ); + } + } +} + +bool Arena::check_if_pool_is_empty() +{ + for(;;) { + pool_state_t snapshot = prefix().pool_state; + switch( snapshot ) { + case SNAPSHOT_EMPTY: + case SNAPSHOT_SERVER_GOING_AWAY: + return true; + case SNAPSHOT_FULL: { + // Use unique id for "busy" in order to avoid ABA problems. + const pool_state_t busy = pool_state_t(this); + // Request permission to take snapshot + if( prefix().pool_state.compare_and_swap( busy, SNAPSHOT_FULL )==SNAPSHOT_FULL ) { + // Got permission. Take the snapshot. + size_t n = prefix().limit; + size_t k; + for( k=0; k=n ) { + if( prefix().pool_state.compare_and_swap( SNAPSHOT_EMPTY, busy )==busy ) { + // This thread transitioned pool to empty state, and thus is responsible for + // telling RML that there is no other work to do. + prefix().server->adjust_job_count_estimate( -int(prefix().number_of_workers) ); + return true; + } + } else { + // Undo previous transition SNAPSHOT_FULL-->busy, unless another thread undid it. + prefix().pool_state.compare_and_swap( SNAPSHOT_FULL, busy ); + } + } + } + return false; + } + default: + // Another thread is taking a snapshot. 
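                // (Summary of the pool_state protocol used here and in mark_pool_full():
                //  the state moves between SNAPSHOT_EMPTY and SNAPSHOT_FULL, passing through
                //  a transient per-arena "busy" value while one thread inspects the slots;
                //  the transitions into and out of SNAPSHOT_FULL are what drive
                //  server->adjust_job_count_estimate() to wake or retire workers.)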
+ return false; + } + } +} + +void Arena::terminate_workers() { + for(;;) { + pool_state_t snapshot = prefix().pool_state; + if( snapshot==SNAPSHOT_SERVER_GOING_AWAY ) + break; + if( prefix().pool_state.compare_and_swap( SNAPSHOT_SERVER_GOING_AWAY, snapshot )==snapshot ) { + if( snapshot!=SNAPSHOT_EMPTY ) + prefix().server->adjust_job_count_estimate( -int(prefix().number_of_workers) ); + break; + } + } + prefix().server->request_close_connection(); +} + + +#if COUNT_TASK_NODES +intptr Arena::workers_task_node_count() { + intptr result = 0; + for( unsigned i=0; itask_node_count; + } + return result; +} +#endif + +//------------------------------------------------------------------------ +// Methods of GenericScheduler +//------------------------------------------------------------------------ +#if _MSC_VER && !defined(__INTEL_COMPILER) + // Suppress overzealous compiler warning about using 'this' in base initializer list. + #pragma warning(push) + #pragma warning(disable:4355) +#endif + +GenericScheduler::GenericScheduler( Arena* arena_ ) : + arena_index(null_arena_index), + task_pool_size(0), + arena_slot(&dummy_slot), + arena(arena_), + random( unsigned(this-(GenericScheduler*)NULL) ), + free_list(NULL), + innermost_running_task(NULL), + dummy_task(NULL), + ref_count(1), + my_affinity_id(0), + is_registered(false), + is_auto_initialized(false), +#if __TBB_SCHEDULER_OBSERVER + local_last_observer_proxy(NULL), +#endif /* __TBB_SCHEDULER_OBSERVER */ +#if COUNT_TASK_NODES + task_node_count(0), +#endif /* COUNT_TASK_NODES */ +#if STATISTICS + current_active(0), + current_length(0), + current_big_malloc(0), + execute_count(0), + steal_count(0), + mail_received_count(0), + proxy_execute_count(0), + proxy_steal_count(0), + proxy_bypass_count(0), +#endif /* STATISTICS */ + small_task_count(1), // Extra 1 is a guard reference + return_list(NULL) +{ + dummy_slot.task_pool = allocate_task_pool( min_task_pool_size ); + dummy_slot.head = dummy_slot.tail = 0; + dummy_task = &allocate_task( sizeof(task), __TBB_CONTEXT_ARG(NULL, NULL) ); +#if __TBB_EXCEPTIONS + context_list_head.my_prev = &context_list_head; + context_list_head.my_next = &context_list_head; + ITT_SYNC_CREATE(&context_list_mutex, SyncType_Scheduler, SyncObj_ContextsList); +#endif /* __TBB_EXCEPTIONS */ + dummy_task->prefix().ref_count = 2; + ITT_SYNC_CREATE(&dummy_task->prefix().ref_count, SyncType_Scheduler, SyncObj_WorkerLifeCycleMgmt); + ITT_SYNC_CREATE(&return_list, SyncType_Scheduler, SyncObj_TaskReturnList); + __TBB_ASSERT( assert_okay(), "constructor error" ); +} + +#if _MSC_VER && !defined(__INTEL_COMPILER) + #pragma warning(pop) +#endif // warning 4355 is back + +#if TBB_USE_ASSERT +bool GenericScheduler::assert_okay() const { +#if TBB_USE_ASSERT>=2||TEST_ASSEMBLY_ROUTINES + acquire_task_pool(); + task** tp = dummy_slot.task_pool; + __TBB_ASSERT( task_pool_size >= min_task_pool_size, NULL ); + __TBB_ASSERT( arena_slot->head <= arena_slot->tail, NULL ); + for ( size_t i = arena_slot->head; i < arena_slot->tail; ++i ) { + __TBB_ASSERT( (uintptr_t)tp[i] + 1 > 1u, "nil or invalid task pointer in the deque" ); + __TBB_ASSERT( tp[i]->prefix().state == task::ready || + tp[i]->prefix().extra_state == es_task_proxy, "task in the deque has invalid state" ); + } + release_task_pool(); +#endif /* TBB_USE_ASSERT>=2||TEST_ASSEMBLY_ROUTINES */ + return true; +} +#endif /* TBB_USE_ASSERT */ + +#if __TBB_EXCEPTIONS + +void GenericScheduler::propagate_cancellation () { + spin_mutex::scoped_lock lock(context_list_mutex); + // Acquire fence is 
necessary to ensure that the subsequent node->my_next load + // returned the correct value in case it was just inserted in another thread. + // The fence also ensures visibility of the correct my_parent value. + context_list_node_t *node = __TBB_load_with_acquire(context_list_head.my_next); + while ( node != &context_list_head ) { + task_group_context *ctx = __TBB_get_object_addr(task_group_context, my_node, node); + // The absence of acquire fence while reading my_cancellation_requested may result + // in repeated traversals of the same parents chain if another group (precedent or + // descendant) belonging to the tree being canceled sends cancellation request of + // its own around the same time. + if ( !ctx->my_cancellation_requested ) + ctx->propagate_cancellation_from_ancestors(); + node = node->my_next; + __TBB_ASSERT( ctx->is_alive(), "Walked into a destroyed context while propagating cancellation" ); + } +} + +/** Propagates cancellation down the tree of dependent contexts by walking each + thread's local list of contexts **/ +void GenericScheduler::propagate_cancellation ( task_group_context* ctx ) { + __TBB_ASSERT ( ctx->my_cancellation_requested, "No cancellation request in the context" ); + // The whole propagation algorithm is under the lock in order to ensure correctness + // in case of parallel cancellations at the different levels of the context tree. + // See the note 2 at the bottom of the file. + mutex::scoped_lock lock(the_scheduler_list_mutex); + // Advance global cancellation state + __TBB_FetchAndAddWrelease(&global_cancel_count, 1); + // First propagate to workers using arena to access their context lists + size_t num_workers = arena->prefix().number_of_workers; + for ( size_t i = 0; i < num_workers; ++i ) { + // No fence is necessary here since the context list of worker's scheduler + // can contain anything of interest only after the first stealing was done + // by that worker. And doing it applies the necessary fence + GenericScheduler *s = arena->prefix().worker_list[i].scheduler; + // If the worker is in the middle of its startup sequence, skip it. + if ( s ) + s->propagate_cancellation(); + } + // Then propagate to masters using the global list of master's schedulers + scheduler_list_node_t *node = the_scheduler_list_head.my_next; + while ( node != &the_scheduler_list_head ) { + __TBB_get_object_addr(GenericScheduler, my_node, node)->propagate_cancellation(); + node = node->my_next; + } + // Now sync up the local counters + for ( size_t i = 0; i < num_workers; ++i ) { + GenericScheduler *s = arena->prefix().worker_list[i].scheduler; + // If the worker is in the middle of its startup sequence, skip it. + if ( s ) + s->local_cancel_count = global_cancel_count; + } + node = the_scheduler_list_head.my_next; + while ( node != &the_scheduler_list_head ) { + __TBB_get_object_addr(GenericScheduler, my_node, node)->local_cancel_count = global_cancel_count; + node = node->my_next; + } +} +#endif /* __TBB_EXCEPTIONS */ + + + +void GenericScheduler::init_stack_info () { + // Stacks are growing top-down. Highest address is called "stack base", + // and the lowest is "stack limit". +#if USE_WINTHREAD +#if defined(_MSC_VER)&&_MSC_VER<1400 && !_WIN64 + NT_TIB *pteb = (NT_TIB*)__TBB_machine_get_current_teb(); +#else + NT_TIB *pteb = (NT_TIB*)NtCurrentTeb(); +#endif + __TBB_ASSERT( &pteb < pteb->StackBase && &pteb > pteb->StackLimit, "invalid stack info in TEB" ); + __TBB_ASSERT( arena->prefix().stack_size>0, "stack_size not initialized?" 
); + // When a thread is created with the attribute STACK_SIZE_PARAM_IS_A_RESERVATION, stack limit + // in the TIB points to the committed part of the stack only. This renders the expression + // "(uintptr_t)pteb->StackBase / 2 + (uintptr_t)pteb->StackLimit / 2" virtually useless. + // Thus for worker threads we use the explicit stack size we used while creating them. + // And for master threads we rely on the following fact and assumption: + // - the default stack size of a master thread on Windows is 1M; + // - if it was explicitly set by the application it is at least as large as the size of a worker stack. + if ( is_worker() || arena->prefix().stack_size < MByte ) + my_stealing_threshold = (uintptr_t)pteb->StackBase - arena->prefix().stack_size / 2; + else + my_stealing_threshold = (uintptr_t)pteb->StackBase - MByte / 2; +#else /* USE_PTHREAD */ + // There is no portable way to get stack base address in Posix, so we use + // non-portable method (on all modern Linux) or the simplified approach + // based on the common sense assumptions. The most important assumption + // is that the main thread's stack size is not less than that of other threads. + size_t stack_size = arena->prefix().stack_size; + void *stack_base = &stack_size; +#if __TBB_ipf + void *rsb_base = __TBB_get_bsp(); +#endif +#if __linux__ + size_t np_stack_size = 0; + void *stack_limit = NULL; + pthread_attr_t attr_stack, np_attr_stack; + if( 0 == pthread_getattr_np(pthread_self(), &np_attr_stack) ) { + if ( 0 == pthread_attr_getstack(&np_attr_stack, &stack_limit, &np_stack_size) ) { + if ( 0 == pthread_attr_init(&attr_stack) ) { + if ( 0 == pthread_attr_getstacksize(&attr_stack, &stack_size) ) + { + stack_base = (char*)stack_limit + np_stack_size; + if ( np_stack_size < stack_size ) { + // We are in a secondary thread. Use reliable data. +#if __TBB_ipf + // IA64 stack is split into RSE backup and memory parts + rsb_base = stack_limit; + stack_size = np_stack_size/2; +#else + stack_size = np_stack_size; +#endif /* !__TBB_ipf */ + } + // We are either in the main thread or this thread stack + // is bigger that that of the main one. As we cannot discern + // these cases we fall back to the default (heuristic) values. + } + pthread_attr_destroy(&attr_stack); + } + } + pthread_attr_destroy(&np_attr_stack); + } +#endif /* __linux__ */ + __TBB_ASSERT( stack_size>0, "stack size must be positive" ); + my_stealing_threshold = (uintptr_t)((char*)stack_base - stack_size/2); +#if __TBB_ipf + my_rsb_stealing_threshold = (uintptr_t)((char*)rsb_base + stack_size/2); +#endif +#endif /* USE_PTHREAD */ +} + +task** GenericScheduler::allocate_task_pool( size_t n ) { + __TBB_ASSERT( n > task_pool_size, "Cannot shrink the task pool" ); + size_t byte_size = ((n * sizeof(task*) + NFS_MaxLineSize - 1) / NFS_MaxLineSize) * NFS_MaxLineSize; + task_pool_size = byte_size / sizeof(task*); + task** new_pool = (task**)NFS_Allocate( byte_size, 1, NULL ); + // No need to clear the fresh deque since valid items are designated by the head and tail members. 
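//------------------------------------------------------------------------
// allocate_task_pool() above rounds the requested deque size up to a whole
// number of cache lines before asking the cache-aligned allocator for memory,
// so the pool never shares a line with unrelated data. A standalone sketch of
// that size computation under the assumption of a fixed line size; the names
// are illustrative, not TBB APIs.
#include <cstddef>

inline std::size_t round_up_to_line( std::size_t bytes, std::size_t line_size ) {
    // line_size is assumed to be a positive cache-line size, e.g. 64 or 128.
    return ( bytes + line_size - 1 ) / line_size * line_size;
}

inline std::size_t pool_bytes_for( std::size_t n_slots, std::size_t line_size ) {
    // Same shape as the byte_size computation above, with void* standing in
    // for task* purely to keep the example self-contained.
    return round_up_to_line( n_slots * sizeof(void*), line_size );
}
//------------------------------------------------------------------------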
+#if TBB_USE_ASSERT>=2
+    // But clear it in the high vigilance debug mode
+    memset( new_pool, -1, n );
+#endif /* TBB_USE_ASSERT>=2 */
+    return new_pool;
+}
+
+void GenericScheduler::grow( size_t new_size ) {
+    __TBB_ASSERT( assert_okay(), NULL );
+    if ( new_size < 2 * task_pool_size )
+        new_size = 2 * task_pool_size;
+    task** new_pool = allocate_task_pool( new_size ); // updates task_pool_size
+    task** old_pool = dummy_slot.task_pool;
+    acquire_task_pool();    // requires the old dummy_slot.task_pool value
+    // arena_slot->tail should not be updated before arena_slot->head because their
+    // values are used by other threads to check if this task pool is empty.
+    size_t new_tail = arena_slot->tail - arena_slot->head;
+    __TBB_ASSERT( new_tail <= task_pool_size, "new task pool is too short" );
+    memcpy( new_pool, old_pool + arena_slot->head, new_tail * sizeof(task*) );
+    arena_slot->head = 0;
+    arena_slot->tail = new_tail;
+    dummy_slot.task_pool = new_pool;
+    release_task_pool();    // updates the task pool pointer in our arena slot
+    free_task_pool( old_pool );
+    __TBB_ASSERT( assert_okay(), NULL );
+}
+
+
+GenericScheduler* GenericScheduler::allocate_scheduler( Arena* arena ) {
+    switch( SchedulerTraitsId ) {
+        /* DefaultSchedulerTraits::id is listed explicitly as a case so that the host compiler
+           will issue an error message if it is the same as another id in the list. */
+        default:
+        case DefaultSchedulerTraits::id:
+            return CustomScheduler<DefaultSchedulerTraits>::allocate_scheduler(arena);
+        case IntelSchedulerTraits::id:
+            return CustomScheduler<IntelSchedulerTraits>::allocate_scheduler(arena);
+    }
+}
+
+void GenericScheduler::free_scheduler() {
+    if( in_arena() ) {
+        acquire_task_pool();
+        leave_arena();
+    }
+#if __TBB_EXCEPTIONS
+    task_group_context* &context = dummy_task->prefix().context;
+    // Only master thread's dummy task has a context
+    if ( context != &dummy_context) {
+        //! \todo Add assertion that master's dummy task context does not have children
+        context->task_group_context::~task_group_context();
+        NFS_Free(context);
+        {
+            mutex::scoped_lock lock(the_scheduler_list_mutex);
+            my_node.my_next->my_prev = my_node.my_prev;
+            my_node.my_prev->my_next = my_node.my_next;
+        }
+    }
+#endif /* __TBB_EXCEPTIONS */
+    free_task( *dummy_task );
+
+    // k accounts for a guard reference and each task that we deallocate.
+    intptr k = 1;
+    for(;;) {
+        while( task* t = free_list ) {
+            free_list = t->prefix().next;
+            deallocate_task(*t);
+            ++k;
+        }
+        if( return_list==plugged_return_list() )
+            break;
+        free_list = (task*)__TBB_FetchAndStoreW( &return_list, (intptr)plugged_return_list() );
+    }
+
+#if COUNT_TASK_NODES
+    arena->prefix().task_node_count += task_node_count;
+#endif /* COUNT_TASK_NODES */
+#if STATISTICS
+    the_statistics.record( execute_count, steal_count, mail_received_count,
+                           proxy_execute_count, proxy_steal_count, proxy_bypass_count );
+#endif /* STATISTICS */
+    free_task_pool( dummy_slot.task_pool );
+    dummy_slot.task_pool = NULL;
+    // Update small_task_count last. Doing so sooner might cause another thread to free *this.
+    __TBB_ASSERT( small_task_count>=k, "small_task_count corrupted" );
+    Governor::sign_off(this);
+    if( __TBB_FetchAndAddW( &small_task_count, -k )==k )
+        NFS_Free( this );
+}
+
+/** ATTENTION:
+    This method is mostly the same as GenericScheduler::lock_task_pool(), with
+    a little different logic of slot state checks (slot is either locked or points
+    to our task pool).
+    Thus if either of them is changed, consider changing the counterpart as well.
**/ +inline void GenericScheduler::acquire_task_pool() const { + if ( !in_arena() ) + return; // we are not in arena - nothing to lock + atomic_backoff backoff; + bool sync_prepare_done = false; + for(;;) { +#if TBB_USE_ASSERT + __TBB_ASSERT( arena_slot == arena->slot + arena_index, "invalid arena slot index" ); + // Local copy of the arena slot task pool pointer is necessary for the next + // assertion to work correctly to exclude asynchronous state transition effect. + task** tp = arena_slot->task_pool; + __TBB_ASSERT( tp == LockedTaskPool || tp == dummy_slot.task_pool, "slot ownership corrupt?" ); +#endif + if( arena_slot->task_pool != LockedTaskPool && + __TBB_CompareAndSwapW( &arena_slot->task_pool, (intptr_t)LockedTaskPool, + (intptr_t)dummy_slot.task_pool ) == (intptr_t)dummy_slot.task_pool ) + { + // We acquired our own slot + ITT_NOTIFY(sync_acquired, arena_slot); + break; + } + else if( !sync_prepare_done ) { + // Start waiting + ITT_NOTIFY(sync_prepare, arena_slot); + sync_prepare_done = true; + } + // Someone else acquired a lock, so pause and do exponential backoff. + backoff.pause(); +#if TEST_ASSEMBLY_ROUTINES + __TBB_ASSERT( arena_slot->task_pool == LockedTaskPool || + arena_slot->task_pool == dummy_slot.task_pool, NULL ); +#endif /* TEST_ASSEMBLY_ROUTINES */ + } + __TBB_ASSERT( arena_slot->task_pool == LockedTaskPool, "not really acquired task pool" ); +} // GenericScheduler::acquire_task_pool + +inline void GenericScheduler::release_task_pool() const { + if ( !in_arena() ) + return; // we are not in arena - nothing to unlock + __TBB_ASSERT( arena_slot, "we are not in arena" ); + __TBB_ASSERT( arena_slot->task_pool == LockedTaskPool, "arena slot is not locked" ); + ITT_NOTIFY(sync_releasing, arena_slot); + __TBB_store_with_release( arena_slot->task_pool, dummy_slot.task_pool ); +} + +/** ATTENTION: + This method is mostly the same as GenericScheduler::acquire_task_pool(), + with a little different logic of slot state checks (slot can be empty, locked + or point to any task pool other than ours, and asynchronous transitions between + all these states are possible). + Thus if any of them is changed, consider changing the counterpart as well **/ +inline task** GenericScheduler::lock_task_pool( ArenaSlot* victim_arena_slot ) const { + task** victim_task_pool; + atomic_backoff backoff; + bool sync_prepare_done = false; + for(;;) { + victim_task_pool = victim_arena_slot->task_pool; + // TODO: Investigate the effect of bailing out on the locked pool without trying to lock it. + // When doing this update assertion in the end of the method. + if ( victim_task_pool == EmptyTaskPool ) { + // The victim thread emptied its task pool - nothing to lock + if( sync_prepare_done ) + ITT_NOTIFY(sync_cancel, victim_arena_slot); + break; + } + if( victim_task_pool != LockedTaskPool && + __TBB_CompareAndSwapW( &victim_arena_slot->task_pool, + (intptr_t)LockedTaskPool, (intptr_t)victim_task_pool ) == (intptr_t)victim_task_pool ) + { + // We've locked victim's task pool + ITT_NOTIFY(sync_acquired, victim_arena_slot); + break; + } + else if( !sync_prepare_done ) { + // Start waiting + ITT_NOTIFY(sync_prepare, victim_arena_slot); + sync_prepare_done = true; + } + // Someone else acquired a lock, so pause and do exponential backoff. + backoff.pause(); + } + __TBB_ASSERT( victim_task_pool == EmptyTaskPool || + (victim_arena_slot->task_pool == LockedTaskPool && victim_task_pool != LockedTaskPool), + "not really locked victim's task pool?" 
);
+    return victim_task_pool;
+} // GenericScheduler::lock_task_pool
+
+inline void GenericScheduler::unlock_task_pool( ArenaSlot* victim_arena_slot,
+                                                task** victim_task_pool ) const {
+    __TBB_ASSERT( victim_arena_slot, "empty victim arena slot pointer" );
+    __TBB_ASSERT( victim_arena_slot->task_pool == LockedTaskPool, "victim arena slot is not locked" );
+    ITT_NOTIFY(sync_releasing, victim_arena_slot);
+    __TBB_store_with_release( victim_arena_slot->task_pool, victim_task_pool );
+}
+
+
+inline task* GenericScheduler::prepare_for_spawning( task* t ) {
+    __TBB_ASSERT( t->state()==task::allocated, "attempt to spawn task that is not in 'allocated' state" );
+    t->prefix().owner = this;
+    t->prefix().state = task::ready;
+#if TBB_USE_ASSERT
+    if( task* parent = t->parent() ) {
+        internal::reference_count ref_count = parent->prefix().ref_count;
+        __TBB_ASSERT( ref_count>=0, "attempt to spawn task whose parent has a ref_count<0" );
+        __TBB_ASSERT( ref_count!=0, "attempt to spawn task whose parent has a ref_count==0 (forgot to set_ref_count?)" );
+        parent->prefix().extra_state |= es_ref_count_active;
+    }
+#endif /* TBB_USE_ASSERT */
+    affinity_id dst_thread = t->prefix().affinity;
+    __TBB_ASSERT( dst_thread == 0 || is_version_3_task(*t), "backwards compatibility to TBB 2.0 tasks is broken" );
+    if( dst_thread != 0 && dst_thread != my_affinity_id ) {
+        task_proxy& proxy = (task_proxy&)allocate_task( sizeof(task_proxy),
+                                                        __TBB_CONTEXT_ARG(NULL, NULL) );
+        // Mark as a proxy
+        proxy.prefix().extra_state = es_task_proxy;
+        proxy.outbox = &arena->mailbox(dst_thread);
+        proxy.task_and_tag = intptr(t)|3;
+        proxy.next_in_mailbox = NULL;
+        ITT_NOTIFY( sync_releasing, proxy.outbox );
+        // Mail the proxy - after this point t may be destroyed by another thread at any moment.
+        proxy.outbox->push(proxy);
+        return &proxy;
+    }
+    return t;
+}
+
+/** Conceptually, this method should be a member of class scheduler.
+    But doing so would force us to publish class scheduler in the headers. */
+void GenericScheduler::local_spawn( task& first, task*& next ) {
+    __TBB_ASSERT( Governor::is_set(this), NULL );
+    __TBB_ASSERT( assert_okay(), NULL );
+    if ( &first.prefix().next == &next ) {
+        // Single task is being spawned
+        if ( arena_slot->tail == task_pool_size ) {
+            // 1 compensates for head possibly temporarily incremented by a thief
+            if ( arena_slot->head > 1 ) {
+                // Move the busy part of the deque to the beginning of the allocated space
+                acquire_task_pool();
+                arena_slot->tail -= arena_slot->head;
+                memmove( dummy_slot.task_pool, dummy_slot.task_pool + arena_slot->head, arena_slot->tail * sizeof(task*) );
+                arena_slot->head = 0;
+                release_task_pool();
+            }
+            else {
+                grow( task_pool_size + 1 );
+            }
+        }
+        dummy_slot.task_pool[arena_slot->tail] = prepare_for_spawning( &first );
+        ITT_NOTIFY(sync_releasing, arena_slot);
+        // The following store with release is required on ia64 only
+        size_t new_tail = arena_slot->tail + 1;
+        __TBB_store_with_release( arena_slot->tail, new_tail );
+        __TBB_ASSERT ( arena_slot->tail <= task_pool_size, "task deque end was overwritten" );
+    }
+    else {
+        // Task list is being spawned
+        const size_t initial_capacity = 64;
+        task *arr[initial_capacity];
+        fast_reverse_vector<task*> tasks(arr, initial_capacity);
+        task *t_next = NULL;
+        for( task* t = &first; ; t = t_next ) {
+            // After prepare_for_spawning returns t may already have been destroyed.
+            // So milk it while it is alive.
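//------------------------------------------------------------------------
// prepare_for_spawning() above stores intptr(t)|3 into the proxy: task objects
// are aligned, so the two low bits of the pointer are free and serve as
// "still referenced by the task pool" and "still referenced by the mailbox"
// flags. Whichever side claims the task clears the pointer part with a
// compare-and-swap and leaves only the other side's flag behind. A minimal
// standalone sketch of that packing, using std::atomic instead of TBB's
// primitives; the flag values and names are illustrative, not TBB's.
#include <atomic>
#include <cstdint>

const std::uintptr_t pool_flag    = 1;  // proxy still reachable from the task pool
const std::uintptr_t mailbox_flag = 2;  // proxy still reachable from the mailbox

inline std::uintptr_t pack( void* task_ptr ) {
    // Requires the pointed-to object to be at least 4-byte aligned.
    return reinterpret_cast<std::uintptr_t>(task_ptr) | pool_flag | mailbox_flag;
}

// Either side may claim the task; the winner of the compare-and-swap takes it
// and leaves the loser's flag as the only remaining content.
inline void* try_claim( std::atomic<std::uintptr_t>& task_and_tag, std::uintptr_t other_side_flag ) {
    std::uintptr_t tat = task_and_tag.load( std::memory_order_acquire );
    if( (tat & 3) == 3 && task_and_tag.compare_exchange_strong( tat, other_side_flag ) )
        return reinterpret_cast<void*>( tat & ~std::uintptr_t(3) );
    return nullptr; // already claimed by the other side
}
//------------------------------------------------------------------------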
+ bool end = &t->prefix().next == &next; + t_next = t->prefix().next; + tasks.push_back( prepare_for_spawning(t) ); + if( end ) + break; + } + size_t num_tasks = tasks.size(); + __TBB_ASSERT ( arena_index != null_arena_index, "invalid arena slot index" ); + if ( arena_slot->tail + num_tasks > task_pool_size ) { + // 1 compensates for head possibly temporarily incremented by a thief + size_t new_size = arena_slot->tail - arena_slot->head + num_tasks + 1; + if ( new_size <= task_pool_size ) { + // Move the busy part of the deque to the beginning of the allocated space + acquire_task_pool(); + arena_slot->tail -= arena_slot->head; + memmove( dummy_slot.task_pool, dummy_slot.task_pool + arena_slot->head, arena_slot->tail * sizeof(task*) ); + arena_slot->head = 0; + release_task_pool(); + } + else { + grow( new_size ); + } + } +#if DO_ITT_NOTIFY + else { + // The preceding if-branch issues the same ittnotify inside release_task_pool() or grow() methods + ITT_NOTIFY(sync_releasing, arena_slot); + } +#endif /* DO_ITT_NOTIFY */ + tasks.copy_memory( dummy_slot.task_pool + arena_slot->tail ); + // The following store with release is required on ia64 only + size_t new_tail = arena_slot->tail + num_tasks; + __TBB_store_with_release( arena_slot->tail, new_tail ); + __TBB_ASSERT ( arena_slot->tail <= task_pool_size, "task deque end was overwritten" ); + } + if ( !in_arena() ) { + if ( is_worker() ) + enter_arena(); + else + try_enter_arena(); + } + + arena->mark_pool_full(); + __TBB_ASSERT( assert_okay(), NULL ); + + TBB_TRACE(("%p.internal_spawn exit\n", this )); +} + +void GenericScheduler::local_spawn_root_and_wait( task& first, task*& next ) { + __TBB_ASSERT( Governor::is_set(this), NULL ); + __TBB_ASSERT( &first, NULL ); + auto_empty_task dummy( __TBB_CONTEXT_ARG(this, first.prefix().context) ); + internal::reference_count n = 0; + for( task* t=&first; ; t=t->prefix().next ) { + ++n; + __TBB_ASSERT( !t->prefix().parent, "not a root task, or already running" ); + t->prefix().parent = &dummy; + if( &t->prefix().next==&next ) break; +#if __TBB_EXCEPTIONS + __TBB_ASSERT( t->prefix().context == t->prefix().next->prefix().context, + "all the root tasks in list must share the same context"); +#endif /* __TBB_EXCEPTIONS */ + } + dummy.prefix().ref_count = n+1; + if( n>1 ) + LocalSpawn( *first.prefix().next, next ); + TBB_TRACE(("spawn_root_and_wait((task_list*)%p): calling %p.loop\n",&first,this)); + wait_for_all( dummy, &first ); + TBB_TRACE(("spawn_root_and_wait((task_list*)%p): return\n",&first)); +} + +inline task* GenericScheduler::get_mailbox_task() { + __TBB_ASSERT( my_affinity_id>0, "not in arena" ); + task* result = NULL; + while( task_proxy* t = inbox.pop() ) { + intptr tat = __TBB_load_with_acquire(t->task_and_tag); + __TBB_ASSERT( tat==task_proxy::mailbox_bit || (tat==(tat|3)&&tat!=3), NULL ); + if( tat!=task_proxy::mailbox_bit && __TBB_CompareAndSwapW( &t->task_and_tag, task_proxy::pool_bit, tat )==tat ) { + // Successfully grabbed the task, and left pool seeker with job of freeing the proxy. + ITT_NOTIFY( sync_acquired, inbox.outbox() ); + result = (task*)(tat & ~3); + result->prefix().owner = this; + break; + } + free_task_proxy( *t ); + } + return result; +} + +inline task* GenericScheduler::strip_proxy( task_proxy* tp ) { + __TBB_ASSERT( tp->prefix().extra_state==es_task_proxy, NULL ); + intptr tat = __TBB_load_with_acquire(tp->task_and_tag); + if( (tat&3)==3 ) { + // proxy is shared by a pool and a mailbox. + // Attempt to transition it to "empty proxy in mailbox" state. 
+        if( __TBB_CompareAndSwapW( &tp->task_and_tag, task_proxy::mailbox_bit, tat )==tat ) {
+            // Successfully grabbed the task, and left the mailbox with the job of freeing the proxy.
+            return (task*)(tat&~3);
+        }
+        __TBB_ASSERT( tp->task_and_tag==task_proxy::pool_bit, NULL );
+    } else {
+        // We have exclusive access to the proxy
+        __TBB_ASSERT( (tat&3)==task_proxy::pool_bit, "task did not come from pool?" );
+        __TBB_ASSERT ( !(tat&~3), "Empty proxy in the pool contains non-zero task pointer" );
+    }
+#if TBB_USE_ASSERT
+    tp->prefix().state = task::allocated;
+#endif
+    free_task_proxy( *tp );
+    // Another thread grabbed the underlying task via their mailbox
+    return NULL;
+}
+
+inline task* GenericScheduler::get_task() {
+    task* result = NULL;
+retry:
+    --arena_slot->tail;
+    __TBB_rel_acq_fence();
+    if ( (intptr_t)arena_slot->head > (intptr_t)arena_slot->tail ) {
+        acquire_task_pool();
+        if ( (intptr_t)arena_slot->head <= (intptr_t)arena_slot->tail ) {
+            // The thief backed off - grab the task
+            __TBB_ASSERT_VALID_TASK_PTR( dummy_slot.task_pool[arena_slot->tail] );
+            result = dummy_slot.task_pool[arena_slot->tail];
+            __TBB_POISON_TASK_PTR( dummy_slot.task_pool[arena_slot->tail] );
+        }
+        else {
+            __TBB_ASSERT ( arena_slot->head == arena_slot->tail + 1, "victim/thief arbitration algorithm failure" );
+        }
+        if ( (intptr_t)arena_slot->head < (intptr_t)arena_slot->tail ) {
+            release_task_pool();
+        }
+        else {
+            // In any case the deque is empty now, so compact it
+            arena_slot->head = arena_slot->tail = 0;
+            if ( in_arena() )
+                leave_arena();
+        }
+    }
+    else {
+        __TBB_ASSERT_VALID_TASK_PTR( dummy_slot.task_pool[arena_slot->tail] );
+        result = dummy_slot.task_pool[arena_slot->tail];
+        __TBB_POISON_TASK_PTR( dummy_slot.task_pool[arena_slot->tail] );
+    }
+    if( result && is_proxy(*result) ) {
+        result = strip_proxy((task_proxy*)result);
+        if( !result ) {
+            goto retry;
+        }
+        GATHER_STATISTIC( ++proxy_execute_count );
+        // Following assertion should be true because TBB 2.0 tasks never specify affinity, and hence are not proxied.
+        __TBB_ASSERT( is_version_3_task(*result), "backwards compatibility with TBB 2.0 broken" );
+        // Task affinity has changed.
+        innermost_running_task = result;
+        result->note_affinity(my_affinity_id);
+    }
+    return result;
+} // GenericScheduler::get_task
+
+task* GenericScheduler::steal_task( ArenaSlot& victim_slot ) {
+    task** victim_pool = lock_task_pool( &victim_slot );
+    if ( !victim_pool )
+        return NULL;
+    const size_t none = ~0u;
+    size_t first_skipped_proxy = none;
+    task* result = NULL;
+retry:
+    ++victim_slot.head;
+    __TBB_rel_acq_fence();
+    if ( (intptr_t)victim_slot.head > (intptr_t)victim_slot.tail ) {
+        --victim_slot.head;
+    }
+    else {
+        __TBB_ASSERT_VALID_TASK_PTR( victim_pool[victim_slot.head - 1]);
+        result = victim_pool[victim_slot.head - 1];
+        if( is_proxy(*result) ) {
+            task_proxy& tp = *static_cast<task_proxy*>(result);
+            // If task will likely be grabbed by whom it was mailed to, skip it.
+            if( (tp.task_and_tag & 3) == 3 && tp.outbox->recipient_is_idle() ) {
+                if ( first_skipped_proxy == none )
+                    first_skipped_proxy = victim_slot.head - 1;
+                result = NULL;
+                goto retry;
+            }
+        }
+        __TBB_POISON_TASK_PTR(victim_pool[victim_slot.head - 1]);
+    }
+    if ( first_skipped_proxy != none ) {
+        if ( result ) {
+            victim_pool[victim_slot.head - 1] = victim_pool[first_skipped_proxy];
+            __TBB_POISON_TASK_PTR( victim_pool[first_skipped_proxy] );
+            __TBB_store_with_release( victim_slot.head, first_skipped_proxy + 1 );
+        }
+        else
+            __TBB_store_with_release( victim_slot.head, first_skipped_proxy );
+    }
+    unlock_task_pool( &victim_slot, victim_pool );
+    return result;
+}
+
+
+#define ConcurrentWaitsEnabled(t) (t.prefix().context->my_version_and_traits & task_group_context::concurrent_wait)
+#define CancellationInfoPresent(t) (t->prefix().context->my_cancellation_requested)
+
+#if TBB_USE_CAPTURED_EXCEPTION
+    inline tbb_exception* TbbCurrentException( task_group_context*, tbb_exception* src) { return src->move(); }
+    inline tbb_exception* TbbCurrentException( task_group_context*, captured_exception* src) { return src; }
+#else
+    // Using a macro instead of an inline function here allows us to avoid evaluation of the
+    // TbbCapturedException expression when exact propagation is enabled for the context.
+    #define TbbCurrentException(context, TbbCapturedException) \
+        context->my_version_and_traits & task_group_context::exact_exception \
+            ? tbb_exception_ptr::allocate() \
+            : tbb_exception_ptr::allocate( *(TbbCapturedException) );
+#endif /* !TBB_USE_CAPTURED_EXCEPTION */
+
+#define TbbRegisterCurrentException(context, TbbCapturedException) \
+    if ( context->cancel_group_execution() ) { \
+        /* We are the first to signal cancellation, so store the exception that caused it. */ \
+        context->my_exception = TbbCurrentException( context, TbbCapturedException ); \
+    }
+
+#define TbbCatchAll(context) \
+    catch ( tbb_exception& exc ) { \
+        TbbRegisterCurrentException( context, &exc ); \
+    } catch ( std::exception& exc ) { \
+        TbbRegisterCurrentException( context, captured_exception::allocate(typeid(exc).name(), exc.what()) ); \
+    } catch ( ... ) { \
+        TbbRegisterCurrentException( context, captured_exception::allocate("...", "Unidentified exception") );\
+    }
+
+template<typename SchedulerTraits>
+void CustomScheduler<SchedulerTraits>::local_wait_for_all( task& parent, task* child ) {
+    __TBB_ASSERT( Governor::is_set(this), NULL );
+    if( child ) {
+        child->prefix().owner = this;
+    }
+    __TBB_ASSERT( parent.ref_count() >= (child && child->parent() == &parent ? 2 : 1), "ref_count is too small" );
+    __TBB_ASSERT( assert_okay(), NULL );
+    // Using parent's refcount in sync_prepare (in the stealing loop below) is
+    // a workaround for TP. We need to name it here to display correctly in Ampl.
+    if( SchedulerTraits::itt_possible )
+        ITT_SYNC_CREATE(&parent.prefix().ref_count, SyncType_Scheduler, SyncObj_TaskStealingLoop);
+#if __TBB_EXCEPTIONS
+    __TBB_ASSERT( parent.prefix().context || (is_worker() && &parent == dummy_task), "parent task does not have context" );
+#endif /* __TBB_EXCEPTIONS */
+    task* t = child;
+    // Constants all_work_done and all_local_work_done are actually unreachable
+    // refcount values that prevent early quitting the dispatch loop. They are
+    // defined to be in the middle of the range of negative values representable
+    // by the reference_count type.
+ static const reference_count + // For nested dispatch loops in masters and any dispatch loops in workers + parents_work_done = 1, + // For outermost dispatch loops in masters + all_work_done = (reference_count)3 << (sizeof(reference_count) * 8 - 2), + // For termination dispatch loops in masters + all_local_work_done = all_work_done + 1; + reference_count quit_point; + if( innermost_running_task == dummy_task ) { + // We are in the outermost task dispatch loop of a master thread, + __TBB_ASSERT( !is_worker(), NULL ); + quit_point = &parent == dummy_task ? all_local_work_done : all_work_done; + } else { + quit_point = parents_work_done; + } + task* old_innermost_running_task = innermost_running_task; +#if __TBB_EXCEPTIONS +exception_was_caught: + try { +#endif /* __TBB_EXCEPTIONS */ + // Outer loop steals tasks when necessary. + for(;;) { + // Middle loop evaluates tasks that are pulled off "array". + do { + // Inner loop evaluates tasks that are handed directly to us by other tasks. + while(t) { + __TBB_ASSERT( inbox.assert_is_idle(false), NULL ); +#if TBB_USE_ASSERT + __TBB_ASSERT(!is_proxy(*t),"unexpected proxy"); + __TBB_ASSERT( t->prefix().owner==this, NULL ); +#if __TBB_EXCEPTIONS + if ( !t->prefix().context->my_cancellation_requested ) +#endif + __TBB_ASSERT( 1L<state() & (1L<prefix().state = task::executing; +#if __TBB_EXCEPTIONS + if ( !t->prefix().context->my_cancellation_requested ) +#endif + { + TBB_TRACE(("%p.wait_for_all: %p.execute\n",this,t)); + GATHER_STATISTIC( ++execute_count ); + t_next = t->execute(); +#if STATISTICS + if (t_next) { + affinity_id next_affinity=t_next->prefix().affinity; + if (next_affinity != 0 && next_affinity != my_affinity_id) + GATHER_STATISTIC( ++proxy_bypass_count ); + } +#endif + } + if( t_next ) { + __TBB_ASSERT( t_next->state()==task::allocated, + "if task::execute() returns task, it must be marked as allocated" ); + // The store here has a subtle secondary effect - it fetches *t_next into cache. + t_next->prefix().owner = this; + } + __TBB_ASSERT(assert_okay(),NULL); + switch( task::state_type(t->prefix().state) ) { + case task::executing: { + // this block was copied below to case task::recycle + // when making changes, check it too + task* s = t->parent(); + __TBB_ASSERT( innermost_running_task==t, NULL ); + __TBB_ASSERT( t->prefix().ref_count==0, "Task still has children after it has been executed" ); + t->~task(); + if( s ) { + if( tally_completion_of_one_predecessor(*s) ) { +#if TBB_USE_ASSERT + s->prefix().extra_state &= ~es_ref_count_active; +#endif /* TBB_USE_ASSERT */ + s->prefix().owner = this; + + if( !t_next ) { + t_next = s; + } else { + LocalSpawn( *s, s->prefix().next ); + __TBB_ASSERT(assert_okay(),NULL); + } + } + } + free_task( *t ); + break; + } + + case task::recycle: { // state set by recycle_as_safe_continuation() + t->prefix().state = task::allocated; + // for safe continuation, need atomically decrement ref_count; + // the block was copied from above case task::executing, and changed. + // Use "s" here as name for t, so that code resembles case task::executing more closely. + task* const& s = t; + if( tally_completion_of_one_predecessor(*s) ) { + // Unused load is put here for sake of inserting an "acquire" fence. +#if TBB_USE_ASSERT + s->prefix().extra_state &= ~es_ref_count_active; + __TBB_ASSERT( s->prefix().owner==this, "ownership corrupt?" 
); +#endif /* TBB_USE_ASSERT */ + if( !t_next ) { + t_next = s; + } else { + LocalSpawn( *s, s->prefix().next ); + __TBB_ASSERT(assert_okay(),NULL); + } + } + break; + } + + case task::reexecute: // set by recycle_to_reexecute() + __TBB_ASSERT( t_next && t_next != t, "reexecution requires that method 'execute' return another task" ); + TBB_TRACE(("%p.wait_for_all: put task %p back into array",this,t)); + t->prefix().state = task::allocated; + LocalSpawn( *t, t->prefix().next ); + __TBB_ASSERT(assert_okay(),NULL); + break; +#if TBB_USE_ASSERT + case task::allocated: + break; + case task::ready: + __TBB_ASSERT( false, "task is in READY state upon return from method execute()" ); + break; + default: + __TBB_ASSERT( false, "illegal state" ); +#else + default: // just to shut up some compilation warnings + break; +#endif /* TBB_USE_ASSERT */ + } + + t = t_next; + } // end of scheduler bypass loop + __TBB_ASSERT(assert_okay(),NULL); + + // If the parent's descendants are finished with and we are not in + // the outermost dispatch loop of a master thread, then we are done. + // This is necessary to prevent unbounded stack growth in case of deep + // wait_for_all nesting. + // Note that we cannot return from master's outermost dispatch loop + // until we process all the tasks in the local pool, since in case + // of multiple masters this could have left some of them forever + // waiting for their stolen children to be processed. + if ( parent.prefix().ref_count == quit_point ) + break; + t = get_task(); + __TBB_ASSERT(!t || !is_proxy(*t),"unexpected proxy"); +#if TBB_USE_ASSERT + __TBB_ASSERT(assert_okay(),NULL); + if(t) { + AssertOkay(*t); + __TBB_ASSERT( t->prefix().owner==this, "thread got task that it does not own" ); + } +#endif /* TBB_USE_ASSERT */ + } while( t ); // end of local task array processing loop + + if ( quit_point == all_local_work_done ) { + __TBB_ASSERT( arena_slot == &dummy_slot && arena_slot->head == 0 && arena_slot->tail == 0, NULL ); + innermost_running_task = old_innermost_running_task; + return; + } + inbox.set_is_idle( true ); + __TBB_ASSERT( arena->prefix().number_of_workers>0||parent.prefix().ref_count==1, "deadlock detected" ); + // The state "failure_count==-1" is used only when itt_possible is true, + // and denotes that a sync_prepare has not yet been issued. + for( int failure_count = -static_cast(SchedulerTraits::itt_possible);; ++failure_count) { + if( parent.prefix().ref_count==1 ) { + if( SchedulerTraits::itt_possible ) { + if( failure_count!=-1 ) { + ITT_NOTIFY(sync_prepare, &parent.prefix().ref_count); + // Notify Intel(R) Thread Profiler that thread has stopped spinning. + ITT_NOTIFY(sync_acquired, this); + } + ITT_NOTIFY(sync_acquired, &parent.prefix().ref_count); + } + inbox.set_is_idle( false ); + goto done; + } + // Try to steal a task from a random victim. + size_t n = arena->prefix().limit; + if( n>1 ) { + if( !my_affinity_id || !(t=get_mailbox_task()) ) { + if ( !can_steal() ) + goto fail; + size_t k = random.get() % (n-1); + ArenaSlot* victim = &arena->slot[k]; + // The following condition excludes the master that might have + // already taken our previous place in the arena from the list . + // of potential victims. But since such a situation can take + // place only in case of significant oversubscription, keeping + // the checks simple seems to be preferable to complicating the code. 
+ if( k >= arena_index ) + ++victim; // Adjusts random distribution to exclude self + t = steal_task( *victim ); + if( !t ) goto fail; + if( is_proxy(*t) ) { + t = strip_proxy((task_proxy*)t); + if( !t ) goto fail; + GATHER_STATISTIC( ++proxy_steal_count ); + } + GATHER_STATISTIC( ++steal_count ); + if( is_version_3_task(*t) ) { + innermost_running_task = t; + t->note_affinity( my_affinity_id ); + } + } else { + GATHER_STATISTIC( ++mail_received_count ); + } + __TBB_ASSERT(t,NULL); +#if __TBB_SCHEDULER_OBSERVER + // No memory fence required for read of global_last_observer_proxy, because prior fence on steal/mailbox suffices. + if( local_last_observer_proxy!=global_last_observer_proxy ) { + notify_entry_observers(); + } +#endif /* __TBB_SCHEDULER_OBSERVER */ + { + if( SchedulerTraits::itt_possible ) { + if( failure_count!=-1 ) { + // FIXME - might be victim, or might be selected from a mailbox + // Notify Intel(R) Thread Profiler that thread has stopped spinning. + ITT_NOTIFY(sync_acquired, this); + // FIXME - might be victim, or might be selected from a mailbox + } + } + __TBB_ASSERT(t,NULL); + inbox.set_is_idle( false ); + break; + } + } +fail: + if( SchedulerTraits::itt_possible && failure_count==-1 ) { + // The first attempt to steal work failed, so notify Intel(R) Thread Profiler that + // the thread has started spinning. Ideally, we would do this notification + // *before* the first failed attempt to steal, but at that point we do not + // know that the steal will fail. + ITT_NOTIFY(sync_prepare, this); + failure_count = 0; + } + // Pause, even if we are going to yield, because the yield might return immediately. + __TBB_Pause(PauseTime); + int yield_threshold = 2*int(n); + if( failure_count>=yield_threshold ) { + __TBB_Yield(); + if( failure_count>=yield_threshold+100 ) { + if( !old_innermost_running_task && arena->check_if_pool_is_empty() ) { + // Current thread was created by RML and has nothing to do, so return it to the RML. + // For purposes of affinity support, the thread is considered idle while it is in RML. + // Restore innermost_running_task to its original value. 
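//------------------------------------------------------------------------
// The stealing loop above picks its victim with "k = random % (n-1); if k is
// at or beyond our own slot index then ++k", which draws a uniformly
// distributed slot while excluding the thief's own slot. The same trick in
// isolation; names are illustrative, not TBB APIs.
#include <cstddef>

inline std::size_t pick_victim( std::size_t n_slots, std::size_t my_index,
                                std::size_t random_value ) {
    // Preconditions: n_slots >= 2 and my_index < n_slots.
    std::size_t k = random_value % ( n_slots - 1 );
    if( k >= my_index )
        ++k;    // skip over our own slot while keeping the distribution uniform
    return k;
}
//------------------------------------------------------------------------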
+ innermost_running_task = NULL; + return; + } + failure_count = yield_threshold; + } + } + } + __TBB_ASSERT(t,NULL); + __TBB_ASSERT(!is_proxy(*t),"unexpected proxy"); + t->prefix().owner = this; + } // end of stealing loop +#if __TBB_EXCEPTIONS + } TbbCatchAll( t->prefix().context ); + + if( task::state_type(t->prefix().state) == task::recycle ) { // state set by recycle_as_safe_continuation() + t->prefix().state = task::allocated; + // for safe continuation, need to atomically decrement ref_count; + if( SchedulerTraits::itt_possible ) + ITT_NOTIFY(sync_releasing, &t->prefix().ref_count); + if( __TBB_FetchAndDecrementWrelease(&t->prefix().ref_count)==1 ) { + if( SchedulerTraits::itt_possible ) + ITT_NOTIFY(sync_acquired, &t->prefix().ref_count); + }else{ + t = NULL; + } + } + goto exception_was_caught; +#endif /* __TBB_EXCEPTIONS */ +done: + if ( !ConcurrentWaitsEnabled(parent) ) + parent.prefix().ref_count = 0; +#if TBB_USE_ASSERT + parent.prefix().extra_state &= ~es_ref_count_active; +#endif /* TBB_USE_ASSERT */ + innermost_running_task = old_innermost_running_task; +#if __TBB_EXCEPTIONS + __TBB_ASSERT(parent.prefix().context && dummy_task->prefix().context, NULL); + task_group_context* parent_ctx = parent.prefix().context; + if ( parent_ctx->my_cancellation_requested ) { + task_group_context::exception_container_type *pe = parent_ctx->my_exception; + if ( innermost_running_task == dummy_task && parent_ctx == dummy_task->prefix().context ) { + // We are in the outermost task dispatch loop of a master thread, and + // the whole task tree has been collapsed. So we may clear cancellation data. + parent_ctx->my_cancellation_requested = 0; + __TBB_ASSERT(dummy_task->prefix().context == parent_ctx || !CancellationInfoPresent(dummy_task), + "Unexpected exception or cancellation data in the dummy task"); + // If possible, add assertion that master's dummy task context does not have children + } + if ( pe ) + pe->throw_self(); + } + __TBB_ASSERT(!is_worker() || !CancellationInfoPresent(dummy_task), + "Worker's dummy task context modified"); + __TBB_ASSERT(innermost_running_task != dummy_task || !CancellationInfoPresent(dummy_task), + "Unexpected exception or cancellation data in the master's dummy task"); +#endif /* __TBB_EXCEPTIONS */ + __TBB_ASSERT( assert_okay(), NULL ); +} + +#undef CancellationInfoPresent + +inline void GenericScheduler::do_enter_arena() { + arena_slot = &arena->slot[arena_index]; + __TBB_ASSERT ( arena_slot->head == arena_slot->tail, "task deque of a free slot must be empty" ); + arena_slot->head = dummy_slot.head; + arena_slot->tail = dummy_slot.tail; + // Release signal on behalf of previously spawned tasks (when this thread was not in arena yet) + ITT_NOTIFY(sync_releasing, arena_slot); + __TBB_store_with_release( arena_slot->task_pool, dummy_slot.task_pool ); + // We'll leave arena only when it's empty, so clean up local instances of indices. + dummy_slot.head = dummy_slot.tail = 0; +} + +void GenericScheduler::enter_arena() { + __TBB_ASSERT ( is_worker(), "only workers should use enter_arena()" ); + __TBB_ASSERT ( arena, "no arena: initialization not completed?" ); + __TBB_ASSERT ( !in_arena(), "worker already in arena?" ); + __TBB_ASSERT ( arena_index < arena->prefix().number_of_workers, "invalid worker arena slot index" ); + __TBB_ASSERT ( arena->slot[arena_index].task_pool == EmptyTaskPool, "someone else grabbed my arena slot?" 
); + do_enter_arena(); +} + +void GenericScheduler::try_enter_arena() { + __TBB_ASSERT ( !is_worker(), "only masters should use try_enter_arena()" ); + __TBB_ASSERT ( arena, "no arena: initialization not completed?" ); + __TBB_ASSERT ( !in_arena(), "master already in arena?" ); + __TBB_ASSERT ( arena_index >= arena->prefix().number_of_workers && + arena_index < arena->prefix().number_of_slots, "invalid arena slot hint value" ); + + + size_t h = arena_index; + // We do not lock task pool upon successful entering arena + if( arena->slot[h].task_pool != EmptyTaskPool || + __TBB_CompareAndSwapW( &arena->slot[h].task_pool, (intptr_t)LockedTaskPool, + (intptr_t)EmptyTaskPool ) != (intptr_t)EmptyTaskPool ) + { + // Hinted arena slot is already busy, try some of the others at random + unsigned first = arena->prefix().number_of_workers, + last = arena->prefix().number_of_slots; + unsigned n = last - first - 1; + /// \todo Is this limit reasonable? + size_t max_attempts = last - first; + for (;;) { + size_t k = first + random.get() % n; + if( k >= h ) + ++k; // Adjusts random distribution to exclude previously tried slot + h = k; + if( arena->slot[h].task_pool == EmptyTaskPool && + __TBB_CompareAndSwapW( &arena->slot[h].task_pool, (intptr_t)LockedTaskPool, + (intptr_t)EmptyTaskPool ) == (intptr_t)EmptyTaskPool ) + { + break; + } + if ( --max_attempts == 0 ) { + // After so many attempts we are still unable to find a vacant arena slot. + // Cease the vain effort and work outside of arena for a while. + return; + } + } + } + // Successfully claimed a slot in the arena. + ITT_NOTIFY(sync_acquired, &arena->slot[h]); + __TBB_ASSERT ( arena->slot[h].task_pool == LockedTaskPool, "Arena slot is not actually acquired" ); + arena_index = h; + do_enter_arena(); + attach_mailbox( affinity_id(h+1) ); +} + +void GenericScheduler::leave_arena() { + __TBB_ASSERT( in_arena(), "Not in arena" ); + // Do not reset arena_index. It will be used to (attempt to) re-acquire the slot next time + __TBB_ASSERT( &arena->slot[arena_index] == arena_slot, "Arena slot and slot index mismatch" ); + __TBB_ASSERT ( arena_slot->task_pool == LockedTaskPool, "Task pool must be locked when leaving arena" ); + __TBB_ASSERT ( arena_slot->head == arena_slot->tail, "Cannot leave arena when the task pool is not empty" ); + if ( !is_worker() ) { + my_affinity_id = 0; + inbox.detach(); + } + ITT_NOTIFY(sync_releasing, &arena->slot[arena_index]); + __TBB_store_with_release( arena_slot->task_pool, EmptyTaskPool ); + arena_slot = &dummy_slot; +} + + +GenericScheduler* GenericScheduler::create_worker( Arena& a, size_t index ) { + GenericScheduler* s = GenericScheduler::allocate_scheduler(&a); + + // Put myself into the arena +#if __TBB_EXCEPTIONS + s->dummy_task->prefix().context = &dummy_context; + // Sync up the local cancellation state with the global one. No need for fence here. + s->local_cancel_count = global_cancel_count; +#endif /* __TBB_EXCEPTIONS */ + s->attach_mailbox( index+1 ); + s->arena_index = index; + s->init_stack_info(); + + __TBB_store_with_release( a.prefix().worker_list[index].scheduler, s ); + return s; +} + + +GenericScheduler* GenericScheduler::create_master( Arena* arena ) { + GenericScheduler* s = GenericScheduler::allocate_scheduler( arena ); + task& t = *s->dummy_task; + s->innermost_running_task = &t; + t.prefix().ref_count = 1; + Governor::sign_on(s); +#if __TBB_EXCEPTIONS + // Context to be used by root tasks by default (if the user has not specified one). 
+ // Allocation is done by NFS allocator because we cannot reuse memory allocated + // for task objects since the free list is empty at the moment. + t.prefix().context = new ( NFS_Allocate(sizeof(task_group_context), 1, NULL) ) task_group_context(task_group_context::isolated); + scheduler_list_node_t &node = s->my_node; + { + mutex::scoped_lock lock(the_scheduler_list_mutex); + node.my_next = the_scheduler_list_head.my_next; + node.my_prev = &the_scheduler_list_head; + the_scheduler_list_head.my_next->my_prev = &node; + the_scheduler_list_head.my_next = &node; +#endif /* __TBB_EXCEPTIONS */ + unsigned last = arena->prefix().number_of_slots, + cur_limit = arena->prefix().limit; + // This slot index assignment is just a hint to ... + if ( cur_limit < last ) { + // ... to prevent competition between the first few masters. + s->arena_index = cur_limit++; + // In the absence of exception handling this code is a subject to data + // race in case of multiple masters concurrently entering empty arena. + // But it does not affect correctness, and can only result in a few + // masters competing for the same arena slot during the first acquisition. + // The cost of competition is low in comparison to that of oversubscription. + arena->prefix().limit = cur_limit; + } + else { + // ... to minimize the probability of competition between multiple masters. + unsigned first = arena->prefix().number_of_workers; + s->arena_index = first + s->random.get() % (last - first); + } +#if __TBB_EXCEPTIONS + } +#endif + s->init_stack_info(); +#if __TBB_EXCEPTIONS + // Sync up the local cancellation state with the global one. No need for fence here. + s->local_cancel_count = global_cancel_count; +#endif + __TBB_ASSERT( &task::self()==&t, NULL ); +#if __TBB_SCHEDULER_OBSERVER + // Process any existing observers. 
+    s->notify_entry_observers();
+#endif /* __TBB_SCHEDULER_OBSERVER */
+    return s;
+}
+
+
+void GenericScheduler::cleanup_worker( void* arg ) {
+    TBB_TRACE(("%p.cleanup_worker entered\n",arg));
+    GenericScheduler& s = *(GenericScheduler*)arg;
+    __TBB_ASSERT( s.dummy_slot.task_pool, "cleaning up worker with missing task pool" );
+#if __TBB_SCHEDULER_OBSERVER
+    s.notify_exit_observers(/*is_worker=*/true);
+#endif /* __TBB_SCHEDULER_OBSERVER */
+    __TBB_ASSERT( s.arena_slot->task_pool == EmptyTaskPool || s.arena_slot->head == s.arena_slot->tail,
+                  "worker has unfinished work at run down" );
+    s.free_scheduler();
+}
+
+void GenericScheduler::cleanup_master() {
+    TBB_TRACE(("%p.cleanup_master entered\n",this));
+    GenericScheduler& s = *this; // for similarity with cleanup_worker
+    __TBB_ASSERT( s.dummy_slot.task_pool, "cleaning up master with missing task pool" );
+#if __TBB_SCHEDULER_OBSERVER
+    s.notify_exit_observers(/*is_worker=*/false);
+#endif /* __TBB_SCHEDULER_OBSERVER */
+    if ( !is_local_task_pool_empty() ) {
+        __TBB_ASSERT ( Governor::is_set(this), "TLS slot is cleared before the task pool cleanup" );
+        s.wait_for_all( *dummy_task, NULL );
+        __TBB_ASSERT ( Governor::is_set(this), "Other thread reused our TLS key during the task pool cleanup" );
+    }
+    s.free_scheduler();
+    Governor::finish_with_arena();
+}
+
+//------------------------------------------------------------------------
+// UnpaddedArenaPrefix
+//------------------------------------------------------------------------
+inline Arena& UnpaddedArenaPrefix::arena() {
+    return *static_cast<Arena*>(static_cast<void*>( static_cast<ArenaPrefix*>(this)+1 ));
+}
+
+void UnpaddedArenaPrefix::process( job& j ) {
+    GenericScheduler& s = static_cast<GenericScheduler&>(j);
+    __TBB_ASSERT( Governor::is_set(&s), NULL );
+    __TBB_ASSERT( !s.innermost_running_task, NULL );
+    s.wait_for_all(*s.dummy_task,NULL);
+    __TBB_ASSERT( !s.innermost_running_task, NULL );
+}
+
+void UnpaddedArenaPrefix::cleanup( job& j ) {
+    GenericScheduler& s = static_cast<GenericScheduler&>(j);
+    GenericScheduler::cleanup_worker( &s );
+}
+
+void UnpaddedArenaPrefix::open_connection_to_rml() {
+    __TBB_ASSERT( !server, NULL );
+    __TBB_ASSERT( stack_size>0, NULL );
+    if( !use_private_rml ) {
+        ::rml::factory::status_type status = rml_server_factory.make_server( server, *this );
+        if( status==::rml::factory::st_success ) {
+            __TBB_ASSERT( server, NULL );
+            return;
+        }
+        use_private_rml = true;
+        fprintf(stderr,"warning from TBB: make_server failed with status %x, falling back on private rml",status);
+    }
+    server = rml::make_private_server( *this );
+}
+
+void UnpaddedArenaPrefix::acknowledge_close_connection() {
+    arena().free_arena();
+}
+
+::rml::job* UnpaddedArenaPrefix::create_one_job() {
+    GenericScheduler* s = GenericScheduler::create_worker( arena(), next_job_index++ );
+    Governor::sign_on(s);
+    return s;
+}
+
+//------------------------------------------------------------------------
+// Methods of allocate_root_proxy
+//------------------------------------------------------------------------
+task& allocate_root_proxy::allocate( size_t size ) {
+    internal::GenericScheduler* v = Governor::local_scheduler();
+    __TBB_ASSERT( v, "thread did not activate a task_scheduler_init object?"
); +#if __TBB_EXCEPTIONS + task_prefix& p = v->innermost_running_task->prefix(); +#endif + // New root task becomes part of the currently running task's cancellation context + return v->allocate_task( size, __TBB_CONTEXT_ARG(NULL, p.context) ); +} + +void allocate_root_proxy::free( task& task ) { + internal::GenericScheduler* v = Governor::local_scheduler(); + __TBB_ASSERT( v, "thread does not have initialized task_scheduler_init object?" ); +#if __TBB_EXCEPTIONS + // No need to do anything here as long as there is no context -> task connection +#endif /* __TBB_EXCEPTIONS */ + v->free_task( task ); +} + +#if __TBB_EXCEPTIONS +//------------------------------------------------------------------------ +// Methods of allocate_root_with_context_proxy +//------------------------------------------------------------------------ +task& allocate_root_with_context_proxy::allocate( size_t size ) const { + internal::GenericScheduler* v = Governor::local_scheduler(); + __TBB_ASSERT( v, "thread did not activate a task_scheduler_init object?" ); + task_prefix& p = v->innermost_running_task->prefix(); + task& t = v->allocate_task( size, __TBB_CONTEXT_ARG(NULL, &my_context) ); + // The supported usage model prohibits concurrent initial binding. Thus we + // do not need interlocked operations or fences here. + if ( my_context.my_kind == task_group_context::binding_required ) { + __TBB_ASSERT ( my_context.my_owner, "Context without owner" ); + __TBB_ASSERT ( !my_context.my_parent, "Parent context set before initial binding" ); + // If we are in the outermost task dispatch loop of a master thread, then + // there is nothing to bind this context to, and we skip the binding part. + if ( v->innermost_running_task != v->dummy_task ) { + // By not using the fence here we get faster code in case of normal execution + // flow in exchange of a bit higher probability that in cases when cancellation + // is in flight we will take deeper traversal branch. Normally cache coherency + // mechanisms are efficient enough to deliver updated value most of the time. + uintptr_t local_count_snapshot = ((GenericScheduler*)my_context.my_owner)->local_cancel_count; + __TBB_store_with_release(my_context.my_parent, p.context); + uintptr_t global_count_snapshot = __TBB_load_with_acquire(global_cancel_count); + if ( !my_context.my_cancellation_requested ) { + if ( local_count_snapshot == global_count_snapshot ) { + // It is possible that there is active cancellation request in our + // parents chain. Fortunately the equality of the local and global + // counters means that if this is the case it's already been propagated + // to our parent. + my_context.my_cancellation_requested = p.context->my_cancellation_requested; + } else { + // Another thread was propagating cancellation request at the moment + // when we set our parent, but since we do not use locks we could've + // been skipped. So we have to make sure that we get the cancellation + // request if one of our ancestors has been canceled. + my_context.propagate_cancellation_from_ancestors(); + } + } + } + my_context.my_kind = task_group_context::binding_completed; + } + // else the context either has already been associated with its parent or is isolated + return t; +} + +void allocate_root_with_context_proxy::free( task& task ) const { + internal::GenericScheduler* v = Governor::local_scheduler(); + __TBB_ASSERT( v, "thread does not have initialized task_scheduler_init object?" ); + // No need to do anything here as long as unbinding is performed by context destructor only. 
+ v->free_task( task ); +} +#endif /* __TBB_EXCEPTIONS */ + +//------------------------------------------------------------------------ +// Methods of allocate_continuation_proxy +//------------------------------------------------------------------------ +task& allocate_continuation_proxy::allocate( size_t size ) const { + task& t = *((task*)this); + __TBB_ASSERT( AssertOkay(t), NULL ); + GenericScheduler* s = Governor::local_scheduler(); + task* parent = t.parent(); + t.prefix().parent = NULL; + return s->allocate_task( size, __TBB_CONTEXT_ARG(parent, t.prefix().context) ); +} + +void allocate_continuation_proxy::free( task& mytask ) const { + // Restore the parent as it was before the corresponding allocate was called. + ((task*)this)->prefix().parent = mytask.parent(); + Governor::local_scheduler()->free_task(mytask); +} + +//------------------------------------------------------------------------ +// Methods of allocate_child_proxy +//------------------------------------------------------------------------ +task& allocate_child_proxy::allocate( size_t size ) const { + task& t = *((task*)this); + __TBB_ASSERT( AssertOkay(t), NULL ); + GenericScheduler* s = Governor::local_scheduler(); + return s->allocate_task( size, __TBB_CONTEXT_ARG(&t, t.prefix().context) ); +} + +void allocate_child_proxy::free( task& mytask ) const { + Governor::local_scheduler()->free_task(mytask); +} + +//------------------------------------------------------------------------ +// Methods of allocate_additional_child_of_proxy +//------------------------------------------------------------------------ +task& allocate_additional_child_of_proxy::allocate( size_t size ) const { + __TBB_ASSERT( AssertOkay(self), NULL ); + parent.increment_ref_count(); + GenericScheduler* s = Governor::local_scheduler(); + return s->allocate_task( size, __TBB_CONTEXT_ARG(&parent, parent.prefix().context) ); +} + +void allocate_additional_child_of_proxy::free( task& task ) const { + // Undo the increment. We do not check the result of the fetch-and-decrement. + // We could consider be spawning the task if the fetch-and-decrement returns 1. + // But we do not know that was the programmer's intention. + // Furthermore, if it was the programmer's intention, the program has a fundamental + // race condition (that we warn about in Reference manual), because the + // reference count might have become zero before the corresponding call to + // allocate_additional_child_of_proxy::allocate. + parent.internal_decrement_ref_count(); + Governor::local_scheduler()->free_task(task); +} + +//------------------------------------------------------------------------ +// Support for auto_partitioner +//------------------------------------------------------------------------ +size_t get_initial_auto_partitioner_divisor() { + const size_t X_FACTOR = 4; + return X_FACTOR * (Governor::number_of_workers_in_arena()+1); +} + +//------------------------------------------------------------------------ +// Methods of affinity_partitioner_base_v3 +//------------------------------------------------------------------------ +void affinity_partitioner_base_v3::resize( unsigned factor ) { + // Check factor to avoid asking for number of workers while there might be no arena. + size_t new_size = factor ? factor*(Governor::number_of_workers_in_arena()+1) : 0; + if( new_size!=my_size ) { + if( my_array ) { + NFS_Free( my_array ); + // Following two assignments must be done here for sake of exception safety. 
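//------------------------------------------------------------------------
// allocate_additional_child_of_proxy above bumps the parent's reference count
// *before* the new child exists, so the parent cannot be observed as complete
// while the child is still being constructed; free() merely undoes that
// increment. The shape of the protocol, reduced to std::atomic; all names are
// illustrative, not TBB APIs.
#include <atomic>

struct parent_counter {
    std::atomic<long> ref_count;
    explicit parent_counter( long n ) : ref_count(n) {}
};

// Reserve a slot for a child that is about to be created.
inline void begin_add_child( parent_counter& parent ) {
    parent.ref_count.fetch_add( 1, std::memory_order_relaxed );
}

// Undo the reservation if the child is destroyed without ever being spawned,
// mirroring allocate_additional_child_of_proxy::free() above.
inline void cancel_add_child( parent_counter& parent ) {
    parent.ref_count.fetch_sub( 1, std::memory_order_release );
}
//------------------------------------------------------------------------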
+ my_array = NULL; + my_size = 0; + } + if( new_size ) { + my_array = static_cast<affinity_id*>(NFS_Allocate(new_size,sizeof(affinity_id), NULL )); + memset( my_array, 0, sizeof(affinity_id)*new_size ); + my_size = new_size; + } + } +} + +} // namespace internal + +using namespace tbb::internal; + +#if __TBB_EXCEPTIONS + +//------------------------------------------------------------------------ +// captured_exception +//------------------------------------------------------------------------ + +inline +void copy_string ( char*& dst, const char* src ) { + if ( src ) { + size_t len = strlen(src) + 1; + dst = (char*)allocate_via_handler_v3(len); + strncpy (dst, src, len); + } + else + dst = NULL; +} + +void captured_exception::set ( const char* name, const char* info ) throw() +{ + copy_string(const_cast<char*&>(my_exception_name), name); + copy_string(const_cast<char*&>(my_exception_info), info); +} + +void captured_exception::clear () throw() { + deallocate_via_handler_v3 (const_cast<char*>(my_exception_name)); + deallocate_via_handler_v3 (const_cast<char*>(my_exception_info)); +} + +captured_exception* captured_exception::move () throw() { + captured_exception *e = (captured_exception*)allocate_via_handler_v3(sizeof(captured_exception)); + if ( e ) { + ::new (e) captured_exception(); + e->my_exception_name = my_exception_name; + e->my_exception_info = my_exception_info; + e->my_dynamic = true; + my_exception_name = my_exception_info = NULL; + } + return e; +} + +void captured_exception::destroy () throw() { + __TBB_ASSERT ( my_dynamic, "Method destroy can be used only on objects created by clone or allocate" ); + if ( my_dynamic ) { + this->captured_exception::~captured_exception(); + deallocate_via_handler_v3 (this); + } +} + +captured_exception* captured_exception::allocate ( const char* name, const char* info ) { + captured_exception *e = (captured_exception*)allocate_via_handler_v3( sizeof(captured_exception) ); + if ( e ) { + ::new (e) captured_exception(name, info); + e->my_dynamic = true; + } + return e; +} + +const char* captured_exception::name() const throw() { + return my_exception_name; +} + +const char* captured_exception::what() const throw() { + return my_exception_info; +} + + +//------------------------------------------------------------------------ +// tbb_exception_ptr +//------------------------------------------------------------------------ + +#if !TBB_USE_CAPTURED_EXCEPTION + +namespace internal { + +template<typename T> +tbb_exception_ptr* AllocateExceptionContainer( const T& src ) { + tbb_exception_ptr *eptr = (tbb_exception_ptr*)allocate_via_handler_v3( sizeof(tbb_exception_ptr) ); + if ( eptr ) + new (eptr) tbb_exception_ptr(src); + return eptr; +} + +tbb_exception_ptr* tbb_exception_ptr::allocate () { + return AllocateExceptionContainer( std::current_exception() ); +} + +tbb_exception_ptr* tbb_exception_ptr::allocate ( const tbb_exception& ) { + return AllocateExceptionContainer( std::current_exception() ); +} + +tbb_exception_ptr* tbb_exception_ptr::allocate ( captured_exception& src ) { + tbb_exception_ptr *res = AllocateExceptionContainer( src ); + src.destroy(); + return res; +} + +void tbb_exception_ptr::destroy () throw() { + this->tbb_exception_ptr::~tbb_exception_ptr(); + deallocate_via_handler_v3 (this); +} + +} // namespace internal +#endif /* !TBB_USE_CAPTURED_EXCEPTION */ + + +//------------------------------------------------------------------------ +// task_group_context +//------------------------------------------------------------------------ + +task_group_context::~task_group_context () { + if (
my_kind != isolated ) { + GenericScheduler *s = (GenericScheduler*)my_owner; + __TBB_ASSERT ( Governor::is_set(s), "Task group context is destructed by wrong thread" ); + my_node.my_next->my_prev = my_node.my_prev; + uintptr_t local_count_snapshot = s->local_cancel_count; + my_node.my_prev->my_next = my_node.my_next; + __TBB_rel_acq_fence(); + if ( local_count_snapshot != global_cancel_count ) { + // Another thread was propagating cancellation request when we removed + // ourselves from the list. We must ensure that it does not access us + // when this destructor finishes. We'll be able to acquire the lock + // below only after the other thread finishes with us. + spin_mutex::scoped_lock lock(s->context_list_mutex); + } + } +#if TBB_USE_DEBUG + my_version_and_traits = 0xDeadBeef; +#endif /* TBB_USE_DEBUG */ + if ( my_exception ) + my_exception->destroy(); +} + +void task_group_context::init () { + __TBB_ASSERT ( sizeof(uintptr_t) < 32, "Layout of my_version_and_traits must be reconsidered on this platform" ); + __TBB_ASSERT ( sizeof(task_group_context) == 2 * NFS_MaxLineSize, "Context class has wrong size - check padding and members alignment" ); + __TBB_ASSERT ( (uintptr_t(this) & (sizeof(my_cancellation_requested) - 1)) == 0, "Context is improperly aligned" ); + __TBB_ASSERT ( my_kind == isolated || my_kind == bound, "Context can be created only as isolated or bound" ); + my_parent = NULL; + my_cancellation_requested = 0; + my_exception = NULL; + if ( my_kind == bound ) { + GenericScheduler *s = Governor::local_scheduler(); + my_owner = s; + __TBB_ASSERT ( my_owner, "Thread has not activated a task_scheduler_init object?" ); + // Backward links are used by this thread only, thus no fences are necessary + my_node.my_prev = &s->context_list_head; + s->context_list_head.my_next->my_prev = &my_node; + // The only operation on the thread local list of contexts that may be performed + // concurrently is its traversal by another thread while propagating cancellation + // request. Therefore the release fence below is necessary to ensure that the new + // value of my_node.my_next is visible to the traversing thread + // after it reads new value of v->context_list_head.my_next. + my_node.my_next = s->context_list_head.my_next; + __TBB_store_with_release(s->context_list_head.my_next, &my_node); + } +} + +bool task_group_context::cancel_group_execution () { + __TBB_ASSERT ( my_cancellation_requested == 0 || my_cancellation_requested == 1, "Invalid cancellation state"); + if ( my_cancellation_requested || __TBB_CompareAndSwapW(&my_cancellation_requested, 1, 0) ) { + // This task group has already been canceled + return false; + } + Governor::local_scheduler()->propagate_cancellation(this); + return true; +} + +bool task_group_context::is_group_execution_cancelled () const { + return my_cancellation_requested != 0; +} + +// IMPORTANT: It is assumed that this method is not used concurrently! +void task_group_context::reset () { + //! \todo Add assertion that this context does not have children + // No fences are necessary since this context can be accessed from another thread + // only after stealing happened (which means necessary fences were used). 
+ if ( my_exception ) { + my_exception->destroy(); + my_exception = NULL; + } + my_cancellation_requested = 0; +} + +void task_group_context::propagate_cancellation_from_ancestors () { + task_group_context *parent = my_parent; + while ( parent && !parent->my_cancellation_requested ) + parent = parent->my_parent; + if ( parent ) { + // One of our ancestor groups was canceled. Cancel all its descendants. + task_group_context *ctx = this; + do { + __TBB_store_with_release(ctx->my_cancellation_requested, 1); + ctx = ctx->my_parent; + } while ( ctx != parent ); + } +} + +void task_group_context::register_pending_exception () { + if ( my_cancellation_requested ) + return; + try { + throw; + } TbbCatchAll( this ); +} + +#endif /* __TBB_EXCEPTIONS */ + +//------------------------------------------------------------------------ +// task +//------------------------------------------------------------------------ + +void task::internal_set_ref_count( int count ) { + __TBB_ASSERT( count>=0, "count must not be negative" ); + __TBB_ASSERT( !(prefix().extra_state&GenericScheduler::es_ref_count_active), "ref_count race detected" ); + ITT_NOTIFY(sync_releasing, &prefix().ref_count); + prefix().ref_count = count; +} + +internal::reference_count task::internal_decrement_ref_count() { + ITT_NOTIFY( sync_releasing, &prefix().ref_count ); + internal::reference_count k = __TBB_FetchAndDecrementWrelease( &prefix().ref_count ); + __TBB_ASSERT( k>=1, "task's reference count underflowed" ); + if( k==1 ) + ITT_NOTIFY( sync_acquired, &prefix().ref_count ); + return k-1; +} + +task& task::self() { + GenericScheduler *v = Governor::local_scheduler(); + __TBB_ASSERT( v->assert_okay(), NULL ); + __TBB_ASSERT( v->innermost_running_task, NULL ); + return *v->innermost_running_task; +} + +bool task::is_owned_by_current_thread() const { + return true; +} + +void task::destroy( task& victim ) { + __TBB_ASSERT( victim.prefix().ref_count== (ConcurrentWaitsEnabled(victim) ? 1 : 0), "Task being destroyed must not have children" ); + __TBB_ASSERT( victim.state()==task::allocated, "illegal state for victim task" ); + task* parent = victim.parent(); + victim.~task(); + if( parent ) { + __TBB_ASSERT( parent->state()==task::allocated, "attempt to destroy child of running or corrupted parent?" ); + parent->internal_decrement_ref_count(); + } + Governor::local_scheduler()->free_task( victim ); +} + +void task::spawn_and_wait_for_all( task_list& list ) { + scheduler* s = Governor::local_scheduler(); + task* t = list.first; + if( t ) { + if( &t->prefix().next!=list.next_ptr ) + s->spawn( *t->prefix().next, *list.next_ptr ); + list.clear(); + } + s->wait_for_all( *this, t ); +} + +/** Defined out of line so that compiler does not replicate task's vtable. + It's pointless to define it inline anyway, because all call sites to it are virtual calls + that the compiler is unlikely to optimize. 
*/ +void task::note_affinity( affinity_id ) { +} + +//------------------------------------------------------------------------ +// task_scheduler_init +//------------------------------------------------------------------------ + +/** Left out-of-line for the sake of the backward binary compatibility **/ +void task_scheduler_init::initialize( int number_of_threads ) { + initialize( number_of_threads, 0 ); +} + +void task_scheduler_init::initialize( int number_of_threads, stack_size_type thread_stack_size ) { + if( number_of_threads!=deferred ) { + __TBB_ASSERT( !my_scheduler, "task_scheduler_init already initialized" ); + __TBB_ASSERT( number_of_threads==-1 || number_of_threads>=1, + "number_of_threads for task_scheduler_init must be -1 or positive" ); + my_scheduler = Governor::init_scheduler( number_of_threads, thread_stack_size ); + } else { + __TBB_ASSERT( !thread_stack_size, "deferred initialization ignores stack size setting" ); + } +} + +void task_scheduler_init::terminate() { + GenericScheduler* s = static_cast(my_scheduler); + my_scheduler = NULL; + __TBB_ASSERT( s, "task_scheduler_init::terminate without corresponding task_scheduler_init::initialize()"); + Governor::terminate_scheduler(s); +} + +int task_scheduler_init::default_num_threads() { + // No memory fence required, because at worst each invoking thread calls NumberOfHardwareThreads. + int n = DefaultNumberOfThreads; + if( !n ) { + DefaultNumberOfThreads = n = DetectNumberOfWorkers(); + } + return n; +} + +#if __TBB_SCHEDULER_OBSERVER +//------------------------------------------------------------------------ +// Methods of observer_proxy +//------------------------------------------------------------------------ +namespace internal { + +#if TBB_USE_ASSERT +static atomic observer_proxy_count; + +struct check_observer_proxy_count { + ~check_observer_proxy_count() { + if( observer_proxy_count!=0 ) { + fprintf(stderr,"warning: leaked %ld observer_proxy objects\n", long(observer_proxy_count)); + } + } +}; + +static check_observer_proxy_count the_check_observer_proxy_count; +#endif /* TBB_USE_ASSERT */ + +observer_proxy::observer_proxy( task_scheduler_observer_v3& tso ) : next(NULL), observer(&tso) { +#if TBB_USE_ASSERT + ++observer_proxy_count; +#endif /* TBB_USE_ASSERT */ + // 1 for observer + gc_ref_count = 1; + { + // Append to the global list + task_scheduler_observer_mutex_scoped_lock lock(the_task_scheduler_observer_mutex.begin()[0],/*is_writer=*/true); + observer_proxy* p = global_last_observer_proxy; + prev = p; + if( p ) + p->next=this; + else + global_first_observer_proxy = this; + global_last_observer_proxy = this; + } +} + +void observer_proxy::remove_from_list() { + // Take myself off the global list. + if( next ) + next->prev = prev; + else + global_last_observer_proxy = prev; + if( prev ) + prev->next = next; + else + global_first_observer_proxy = next; +#if TBB_USE_ASSERT + poison_pointer(prev); + poison_pointer(next); + gc_ref_count = -666; +#endif /* TBB_USE_ASSERT */ +} + +void observer_proxy::remove_ref_slow() { + int r = gc_ref_count; + while(r>1) { + __TBB_ASSERT( r!=0, NULL ); + int r_old = gc_ref_count.compare_and_swap(r-1,r); + if( r_old==r ) { + // Successfully decremented count. 
+ return; + } + r = r_old; + } + __TBB_ASSERT( r==1, NULL ); + // Reference count might go to zero + { + task_scheduler_observer_mutex_scoped_lock lock(the_task_scheduler_observer_mutex.begin()[0],/*is_writer=*/true); + r = --gc_ref_count; + if( !r ) { + remove_from_list(); + } + } + if( !r ) { + __TBB_ASSERT( gc_ref_count == -666, NULL ); +#if TBB_USE_ASSERT + --observer_proxy_count; +#endif /* TBB_USE_ASSERT */ + delete this; + } +} + +observer_proxy* observer_proxy::process_list( observer_proxy* local_last, bool is_worker, bool is_entry ) { + // Pointer p marches though the list. + // If is_entry, start with our previous list position, otherwise start at beginning of list. + observer_proxy* p = is_entry ? local_last : NULL; + for(;;) { + task_scheduler_observer* tso=NULL; + // Hold lock on list only long enough to advance to next proxy in list. + { + task_scheduler_observer_mutex_scoped_lock lock(the_task_scheduler_observer_mutex.begin()[0],/*is_writer=*/false); + do { + if( local_last && local_last->observer ) { + // 2 = 1 for observer and 1 for local_last + __TBB_ASSERT( local_last->gc_ref_count>=2, NULL ); + // Can decrement count quickly, because it cannot become zero here. + --local_last->gc_ref_count; + local_last = NULL; + } else { + // Use slow form of decrementing the reference count, after lock is released. + } + if( p ) { + // We were already processing the list. + if( observer_proxy* q = p->next ) { + // Step to next item in list. + p=q; + } else { + // At end of list. + if( is_entry ) { + // Remember current position in the list, so we can start at on the next call. + ++p->gc_ref_count; + } else { + // Finishin running off the end of the list + p=NULL; + } + goto done; + } + } else { + // Starting pass through the list + p = global_first_observer_proxy; + if( !p ) + goto done; + } + tso = p->observer; + } while( !tso ); + ++p->gc_ref_count; + ++tso->my_busy_count; + } + __TBB_ASSERT( !local_last || p!=local_last, NULL ); + if( local_last ) + local_last->remove_ref_slow(); + // Do not hold any locks on the list while calling user's code. + try { + if( is_entry ) + tso->on_scheduler_entry( is_worker ); + else + tso->on_scheduler_exit( is_worker ); + } catch(...) { + // Suppress exception, because user routines are supposed to be observing, not changing + // behavior of a master or worker thread. +#if TBB_USE_ASSERT + fprintf(stderr,"warning: %s threw exception\n",is_entry?"on_scheduler_entry":"on_scheduler_exit"); +#endif /* __TBB_USE_ASSERT */ + } + intptr bc = --tso->my_busy_count; + __TBB_ASSERT_EX( bc>=0, "my_busy_count underflowed" ); + local_last = p; + } +done: + // Return new value to be used as local_last next time. + if( local_last ) + local_last->remove_ref_slow(); + __TBB_ASSERT( !p || is_entry, NULL ); + return p; +} + +void task_scheduler_observer_v3::observe( bool state ) { + if( state ) { + if( !my_proxy ) { + if( !__TBB_InitOnce::initialization_done() ) + DoOneTimeInitializations(); + my_busy_count = 0; + my_proxy = new observer_proxy(*this); + if( GenericScheduler* s = Governor::local_scheduler() ) { + // Notify newly created observer of its own thread. + // Any other pending observers are notified too. 
+ s->notify_entry_observers(); + } + } + } else { + if( observer_proxy* proxy = my_proxy ) { + my_proxy = NULL; + __TBB_ASSERT( proxy->gc_ref_count>=1, "reference for observer missing" ); + { + task_scheduler_observer_mutex_scoped_lock lock(the_task_scheduler_observer_mutex.begin()[0],/*is_writer=*/true); + proxy->observer = NULL; + } + proxy->remove_ref_slow(); + while( my_busy_count ) { + __TBB_Yield(); + } + } + } +} + +} // namespace internal +#endif /* __TBB_SCHEDULER_OBSERVER */ + +} // namespace tbb + + diff --git a/dep/tbb/src/tbb/tbb_assert_impl.h b/dep/tbb/src/tbb/tbb_assert_impl.h new file mode 100644 index 000000000..2a381f9d0 --- /dev/null +++ b/dep/tbb/src/tbb/tbb_assert_impl.h @@ -0,0 +1,101 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +// IMPORTANT: To use assertion handling in TBB, exactly one of the TBB source files +// should #include tbb_assert_impl.h thus instantiating assertion handling routines. +// The intent of putting it to a separate file is to allow some tests to use it +// as well in order to avoid dependency on the library. + +// include headers for required function declarations +#include <cstdlib> +#include <stdio.h> +#include <string.h> +#include <stdarg.h> +#if _MSC_VER +#include <crtdbg.h> +#define __TBB_USE_DBGBREAK_DLG TBB_USE_DEBUG +#endif + +#if _MSC_VER >= 1400 +#define __TBB_EXPORTED_FUNC __cdecl +#else +#define __TBB_EXPORTED_FUNC +#endif + +using namespace std; + +namespace tbb { + //!
Type for an assertion handler + typedef void(*assertion_handler_type)( const char* filename, int line, const char* expression, const char * comment ); + + static assertion_handler_type assertion_handler; + + assertion_handler_type __TBB_EXPORTED_FUNC set_assertion_handler( assertion_handler_type new_handler ) { + assertion_handler_type old_handler = assertion_handler; + assertion_handler = new_handler; + return old_handler; + } + + void __TBB_EXPORTED_FUNC assertion_failure( const char* filename, int line, const char* expression, const char* comment ) { + if( assertion_handler_type a = assertion_handler ) { + (*a)(filename,line,expression,comment); + } else { + static bool already_failed; + if( !already_failed ) { + already_failed = true; + fprintf( stderr, "Assertion %s failed on line %d of file %s\n", + expression, line, filename ); + if( comment ) + fprintf( stderr, "Detailed description: %s\n", comment ); +#if __TBB_USE_DBGBREAK_DLG + if(1 == _CrtDbgReport(_CRT_ASSERT, filename, line, "tbb_debug.dll", "%s\r\n%s", expression, comment?comment:"")) + _CrtDbgBreak(); +#else + fflush(stderr); + abort(); +#endif + } + } + } + +#if defined(_MSC_VER)&&_MSC_VER<1400 +# define vsnprintf _vsnprintf +#endif + + namespace internal { + //! Report a runtime warning. + void __TBB_EXPORTED_FUNC runtime_warning( const char* format, ... ) + { + char str[1024]; memset(str, 0, 1024); + va_list args; va_start(args, format); + vsnprintf( str, 1024-1, format, args); + fprintf( stderr, "TBB Warning: %s\n", str); + } + } // namespace internal + +} /* namespace tbb */ diff --git a/dep/tbb/src/tbb/tbb_misc.cpp b/dep/tbb/src/tbb/tbb_misc.cpp new file mode 100644 index 000000000..75ba5d582 --- /dev/null +++ b/dep/tbb/src/tbb/tbb_misc.cpp @@ -0,0 +1,157 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +// Source file for miscellaneous entities that are infrequently referenced by +// an executing program. + +#include "tbb/tbb_stddef.h" +// Out-of-line TBB assertion handling routines are instantiated here. 
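The assertion machinery shown above (tbb_assert_impl.h) is instantiated by tbb_misc.cpp through the #include that follows. As a minimal usage sketch only, an application can replace the default handler through the set_assertion_handler() hook defined above; the sketch assumes tbb/tbb_stddef.h declares tbb::assertion_handler_type and tbb::set_assertion_handler (as in stock TBB builds with TBB_USE_ASSERT enabled), and my_assertion_handler / install_logging_handler are hypothetical application-side names:

    #include <cstdio>
    #include "tbb/tbb_stddef.h" // assumed to declare assertion_handler_type / set_assertion_handler

    // Hypothetical handler: log the failed assertion instead of breaking into a debugger or calling abort().
    static void my_assertion_handler( const char* filename, int line,
                                      const char* expression, const char* comment ) {
        std::fprintf( stderr, "TBB assertion '%s' failed at %s:%d%s%s\n",
                      expression, filename, line,
                      comment ? " - " : "", comment ? comment : "" );
    }

    void install_logging_handler() {
        // set_assertion_handler() returns the previously installed handler; NULL means the default was active.
        tbb::set_assertion_handler( my_assertion_handler );
    }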
+#include "tbb_assert_impl.h" + +#include "tbb_misc.h" +#include +#include +#include +#if defined(__EXCEPTIONS) || defined(_CPPUNWIND) || defined(__SUNPRO_CC) + #include "tbb/tbb_exception.h" + #include // std::string is used to construct runtime_error + #include +#endif + +using namespace std; + +#include "tbb/tbb_machine.h" + +namespace tbb { + +namespace internal { + +#if defined(__EXCEPTIONS) || defined(_CPPUNWIND) || defined(__SUNPRO_CC) +// The above preprocessor symbols are defined by compilers when exception handling is enabled. +// However, in some cases it could be disabled for this file. + +void handle_perror( int error_code, const char* what ) { + char buf[128]; + sprintf(buf,"%s: ",what); + char* end = strchr(buf,0); + size_t n = buf+sizeof(buf)-end; + strncpy( end, strerror( error_code ), n ); + // Ensure that buffer ends in terminator. + buf[sizeof(buf)-1] = 0; + throw runtime_error(buf); +} + +void throw_bad_last_alloc_exception_v4() +{ + throw bad_last_alloc(); +} +#endif //__EXCEPTIONS || _CPPUNWIND + +bool GetBoolEnvironmentVariable( const char * name ) { + if( const char* s = getenv(name) ) + return strcmp(s,"0") != 0; + return false; +} + +#include "tbb_version.h" + +/** The leading "\0" is here so that applying "strings" to the binary delivers a clean result. */ +static const char VersionString[] = "\0" TBB_VERSION_STRINGS; + +static bool PrintVersionFlag = false; + +void PrintVersion() { + PrintVersionFlag = true; + fputs(VersionString+1,stderr); +} + +void PrintExtraVersionInfo( const char* category, const char* description ) { + if( PrintVersionFlag ) + fprintf(stderr, "%s: %s\t%s\n", "TBB", category, description ); +} + +void PrintRMLVersionInfo( void* arg, const char* server_info ) +{ + PrintExtraVersionInfo( server_info, (const char *)arg ); +} + +} // namespace internal + +extern "C" int TBB_runtime_interface_version() { + return TBB_INTERFACE_VERSION; +} + +} // namespace tbb + +#if !__TBB_RML_STATIC +#if __TBB_x86_32 + +#include "tbb/atomic.h" + +// in MSVC environment, int64_t defined in tbb::internal namespace only (see tbb_stddef.h) +#if _MSC_VER +using tbb::internal::int64_t; +#endif + +//! Warn about 8-byte store that crosses a cache line. +extern "C" void __TBB_machine_store8_slow_perf_warning( volatile void *ptr ) { + // Report run-time warning unless we have already recently reported warning for that address. + const unsigned n = 4; + static tbb::atomic cache[n]; + static tbb::atomic k; + for( unsigned i=0; i(ptr); + tbb::internal::runtime_warning( "atomic store on misaligned 8-byte location %p is slow", ptr ); +done:; +} + +//! Handle 8-byte store that crosses a cache line. +extern "C" void __TBB_machine_store8_slow( volatile void *ptr, int64_t value ) { + for( tbb::internal::atomic_backoff b;; b.pause() ) { + int64_t tmp = *(int64_t*)ptr; + if( __TBB_machine_cmpswp8(ptr,value,tmp)==tmp ) + break; + } +} + +#endif /* __TBB_x86_32 */ +#endif /* !__TBB_RML_STATIC */ + +#if __TBB_ipf +extern "C" intptr_t __TBB_machine_lockbyte( volatile unsigned char& flag ) { + if ( !__TBB_TryLockByte(flag) ) { + tbb::internal::atomic_backoff b; + do { + b.pause(); + } while ( !__TBB_TryLockByte(flag) ); + } + return 0; +} +#endif diff --git a/dep/tbb/src/tbb/tbb_misc.h b/dep/tbb/src/tbb/tbb_misc.h new file mode 100644 index 000000000..7481899c4 --- /dev/null +++ b/dep/tbb/src/tbb/tbb_misc.h @@ -0,0 +1,132 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. 
+ + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef _TBB_tbb_misc_H +#define _TBB_tbb_misc_H + +#include "tbb/tbb_stddef.h" +#include "tbb/tbb_machine.h" + +#if _WIN32||_WIN64 +#include <windows.h> +#elif defined(__linux__) +#include <sys/sysinfo.h> +#elif defined(__sun) +#include <sys/sysinfo.h> +#include <unistd.h> +#elif defined(__APPLE__) +#include <sys/types.h> +#include <sys/sysctl.h> +#elif defined(__FreeBSD__) +#include <unistd.h> +#endif + +namespace tbb { + +namespace internal { + +#if defined(__TBB_DetectNumberOfWorkers) +static inline int DetectNumberOfWorkers() { + return __TBB_DetectNumberOfWorkers(); +} + +#else + +#if _WIN32||_WIN64 +static inline int DetectNumberOfWorkers() { + SYSTEM_INFO si; + GetSystemInfo(&si); + return static_cast<int>(si.dwNumberOfProcessors); +} + +#elif defined(__linux__) || defined(__APPLE__) || defined(__FreeBSD__) || defined(__sun) +static inline int DetectNumberOfWorkers() { + long number_of_workers; + +#if (defined(__FreeBSD__) || defined(__sun)) && defined(_SC_NPROCESSORS_ONLN) + number_of_workers = sysconf(_SC_NPROCESSORS_ONLN); + +// In theory, sysconf should work everywhere. +// But in practice, system-specific methods are more reliable +#elif defined(__linux__) + number_of_workers = get_nprocs(); +#elif defined(__APPLE__) + int name[2] = {CTL_HW, HW_AVAILCPU}; + int ncpu; + size_t size = sizeof(ncpu); + sysctl( name, 2, &ncpu, &size, NULL, 0 ); + number_of_workers = ncpu; +#else +#error DetectNumberOfWorkers: Method to detect the number of online CPUs is unknown +#endif + +// Fail-safety strap + if ( number_of_workers < 1 ) { + number_of_workers = 1; + } + + return number_of_workers; +} + +#else +#error DetectNumberOfWorkers: OS detection method is unknown + +#endif /* os kind */ + +#endif + +// assertion_failure is declared in tbb/tbb_stddef.h because user code +// needs to see its declaration. + +//! Throw std::runtime_error of form "(what): (strerror of error_code)" +/* The "what" should be fairly short, not more than about 64 characters. + Because we control all the call sites to handle_perror, it is pointless + to bullet-proof it for very long strings. + + Design note: ADR put this routine off to the side in tbb_misc.cpp instead of + Task.cpp because the throw generates a pathetic lot of code, and ADR wanted + this large chunk of code to be placed on a cold page. */ +void __TBB_EXPORTED_FUNC handle_perror( int error_code, const char* what ); + +//!
True if environment variable with given name is set and not 0; otherwise false. +bool GetBoolEnvironmentVariable( const char * name ); + +//! Print TBB version information on stderr +void PrintVersion(); + +//! Print extra TBB version information on stderr +void PrintExtraVersionInfo( const char* category, const char* description ); + +//! A callback routine to print RML version information on stderr +void PrintRMLVersionInfo( void* arg, const char* server_info ); + +} // namespace internal + +} // namespace tbb + +#endif /* _TBB_tbb_misc_H */ diff --git a/dep/tbb/src/tbb/tbb_resource.rc b/dep/tbb/src/tbb/tbb_resource.rc new file mode 100644 index 000000000..d61cac42b --- /dev/null +++ b/dep/tbb/src/tbb/tbb_resource.rc @@ -0,0 +1,126 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + +// Microsoft Visual C++ generated resource script. +// +#ifdef APSTUDIO_INVOKED +#ifndef APSTUDIO_READONLY_SYMBOLS +#define _APS_NO_MFC 1 +#define _APS_NEXT_RESOURCE_VALUE 102 +#define _APS_NEXT_COMMAND_VALUE 40001 +#define _APS_NEXT_CONTROL_VALUE 1001 +#define _APS_NEXT_SYMED_VALUE 101 +#endif +#endif + +#define APSTUDIO_READONLY_SYMBOLS +///////////////////////////////////////////////////////////////////////////// +// +// Generated from the TEXTINCLUDE 2 resource. 
+// +#include +#define ENDL "\r\n" +#include "tbb_version.h" + +///////////////////////////////////////////////////////////////////////////// +#undef APSTUDIO_READONLY_SYMBOLS + +///////////////////////////////////////////////////////////////////////////// +// Neutral resources + +//#if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_NEU) +#ifdef _WIN32 +LANGUAGE LANG_NEUTRAL, SUBLANG_NEUTRAL +#pragma code_page(1252) +#endif //_WIN32 + +///////////////////////////////////////////////////////////////////////////// +// manifest integration +#ifdef TBB_MANIFEST +#include "winuser.h" +2 RT_MANIFEST tbbmanifest.exe.manifest +#endif + +///////////////////////////////////////////////////////////////////////////// +// +// Version +// + +VS_VERSION_INFO VERSIONINFO + FILEVERSION TBB_VERNUMBERS + PRODUCTVERSION TBB_VERNUMBERS + FILEFLAGSMASK 0x17L +#ifdef _DEBUG + FILEFLAGS 0x1L +#else + FILEFLAGS 0x0L +#endif + FILEOS 0x40004L + FILETYPE 0x2L + FILESUBTYPE 0x0L +BEGIN + BLOCK "StringFileInfo" + BEGIN + BLOCK "000004b0" + BEGIN + VALUE "CompanyName", "Intel Corporation\0" + VALUE "FileDescription", "Threading Building Blocks library\0" + VALUE "FileVersion", TBB_VERSION "\0" +//what is it? VALUE "InternalName", "tbb\0" + VALUE "LegalCopyright", "Copyright 2005-2009 Intel Corporation. All Rights Reserved.\0" + VALUE "LegalTrademarks", "\0" +#ifndef TBB_USE_DEBUG + VALUE "OriginalFilename", "tbb.dll\0" +#else + VALUE "OriginalFilename", "tbb_debug.dll\0" +#endif + VALUE "ProductName", "Intel(R) Threading Building Blocks for Windows\0" + VALUE "ProductVersion", TBB_VERSION "\0" + VALUE "Comments", TBB_VERSION_STRINGS "\0" + VALUE "PrivateBuild", "\0" + VALUE "SpecialBuild", "\0" + END + END + BLOCK "VarFileInfo" + BEGIN + VALUE "Translation", 0x0, 1200 + END +END + +//#endif // Neutral resources +///////////////////////////////////////////////////////////////////////////// + + +#ifndef APSTUDIO_INVOKED +///////////////////////////////////////////////////////////////////////////// +// +// Generated from the TEXTINCLUDE 3 resource. +// + + +///////////////////////////////////////////////////////////////////////////// +#endif // not APSTUDIO_INVOKED + diff --git a/dep/tbb/src/tbb/tbb_thread.cpp b/dep/tbb/src/tbb/tbb_thread.cpp new file mode 100644 index 000000000..bb328e242 --- /dev/null +++ b/dep/tbb/src/tbb/tbb_thread.cpp @@ -0,0 +1,209 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#if _WIN32||_WIN64 +#include <process.h> /* Need _beginthreadex from there */ +#include <stdexcept> /* Need std::runtime_error from there */ +#include <string> /* Need std::string from there */ +#endif // _WIN32||_WIN64 +#include "tbb_misc.h" // for handle_perror +#include "tbb/tbb_stddef.h" +#include "tbb/tbb_thread.h" +#include "tbb/tbb_allocator.h" +#include "tbb/task_scheduler_init.h" /* Need task_scheduler_init::default_num_threads() */ + +namespace tbb { + +namespace internal { + +//! Allocate a closure +void* allocate_closure_v3( size_t size ) +{ + return allocate_via_handler_v3( size ); +} + +//! Free a closure allocated by allocate_closure_v3 +void free_closure_v3( void *ptr ) +{ + deallocate_via_handler_v3( ptr ); +} + +#if _WIN32||_WIN64 +#if defined(__EXCEPTIONS) || defined(_CPPUNWIND) +// The above preprocessor symbols are defined by compilers when exception handling is enabled. + +void handle_win_error( int error_code ) +{ + LPTSTR msg_buf; + + FormatMessage( + FORMAT_MESSAGE_ALLOCATE_BUFFER | + FORMAT_MESSAGE_FROM_SYSTEM | + FORMAT_MESSAGE_IGNORE_INSERTS, + NULL, + error_code, + 0, + (LPTSTR) &msg_buf, + 0, NULL ); + const std::string msg_str(msg_buf); + LocalFree(msg_buf); + throw std::runtime_error(msg_str); +} +#endif //__EXCEPTIONS || _CPPUNWIND +#endif // _WIN32||_WIN64 + +void tbb_thread_v3::join() +{ + __TBB_ASSERT( joinable(), "thread should be joinable when join called" ); +#if _WIN32||_WIN64 + DWORD status = WaitForSingleObject( my_handle, INFINITE ); + if ( status == WAIT_FAILED ) + handle_win_error( GetLastError() ); + BOOL close_stat = CloseHandle( my_handle ); + if ( close_stat == 0 ) + handle_win_error( GetLastError() ); + my_thread_id = 0; +#else + int status = pthread_join( my_handle, NULL ); + if( status ) + handle_perror( status, "pthread_join" ); +#endif // _WIN32||_WIN64 + my_handle = 0; +} + +void tbb_thread_v3::detach() { + __TBB_ASSERT( joinable(), "only joinable thread can be detached" ); +#if _WIN32||_WIN64 + BOOL status = CloseHandle( my_handle ); + if ( status == 0 ) + handle_win_error( GetLastError() ); + my_thread_id = 0; +#else + int status = pthread_detach( my_handle ); + if( status ) + handle_perror( status, "pthread_detach" ); +#endif // _WIN32||_WIN64 + my_handle = 0; +} + +const size_t MB = 1<<20; +#if !defined(__TBB_WORDSIZE) +const size_t ThreadStackSize = 1*MB; +#elif __TBB_WORDSIZE<=4 +const size_t ThreadStackSize = 2*MB; +#else +const size_t ThreadStackSize = 4*MB; +#endif + +void tbb_thread_v3::internal_start( __TBB_NATIVE_THREAD_ROUTINE_PTR(start_routine), + void* closure ) { +#if _WIN32||_WIN64 + unsigned thread_id; + // The return type of _beginthreadex is "uintptr_t" on new MS compilers, + // and 'unsigned long' on old MS compilers. Our uintptr works for both.
+ uintptr status = _beginthreadex( NULL, ThreadStackSize, start_routine, + closure, 0, &thread_id ); + if( status==0 ) + handle_perror(errno,"__beginthreadex"); + else { + my_handle = (HANDLE)status; + my_thread_id = thread_id; + } +#else + pthread_t thread_handle; + int status; + pthread_attr_t stack_size; + status = pthread_attr_init( &stack_size ); + if( status ) + handle_perror( status, "pthread_attr_init" ); + status = pthread_attr_setstacksize( &stack_size, ThreadStackSize ); + if( status ) + handle_perror( status, "pthread_attr_setstacksize" ); + + status = pthread_create( &thread_handle, &stack_size, start_routine, closure ); + if( status ) + handle_perror( status, "pthread_create" ); + + my_handle = thread_handle; +#endif // _WIN32||_WIN64 +} + +unsigned tbb_thread_v3::hardware_concurrency() { + return task_scheduler_init::default_num_threads(); +} + +tbb_thread_v3::id thread_get_id_v3() { +#if _WIN32||_WIN64 + return tbb_thread_v3::id( GetCurrentThreadId() ); +#else + return tbb_thread_v3::id( pthread_self() ); +#endif // _WIN32||_WIN64 +} + +void move_v3( tbb_thread_v3& t1, tbb_thread_v3& t2 ) +{ + if (t1.joinable()) + t1.detach(); + t1.my_handle = t2.my_handle; + t2.my_handle = 0; +#if _WIN32||_WIN64 + t1.my_thread_id = t2.my_thread_id; + t2.my_thread_id = 0; +#endif // _WIN32||_WIN64 +} + +void thread_yield_v3() +{ + __TBB_Yield(); +} + +void thread_sleep_v3(const tick_count::interval_t &i) +{ +#if _WIN32||_WIN64 + tick_count t0 = tick_count::now(); + tick_count t1 = t0; + for(;;) { + double remainder = (i-(t1-t0)).seconds()*1e3; // milliseconds remaining to sleep + if( remainder<=0 ) break; + DWORD t = remainder>=INFINITE ? INFINITE-1 : DWORD(remainder); + Sleep( t ); + t1 = tick_count::now(); + } +#else + struct timespec req; + double sec = i.seconds(); + + req.tv_sec = static_cast<long>(sec); + req.tv_nsec = static_cast<long>( (sec - req.tv_sec)*1e9 ); + nanosleep(&req, NULL); +#endif // _WIN32||_WIN64 +} + +} // internal + +} // tbb diff --git a/dep/tbb/src/tbb/tbb_version.h b/dep/tbb/src/tbb/tbb_version.h new file mode 100644 index 000000000..07a91d6f5 --- /dev/null +++ b/dep/tbb/src/tbb/tbb_version.h @@ -0,0 +1,101 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License.
+*/ + +// Please define version number in the file: +#include "../../include/tbb/tbb_stddef.h" + +// And don't touch anything below +#ifndef ENDL +#define ENDL "\n" +#endif +#include "../../build/vsproject/version_string.tmp" + +#ifndef __TBB_VERSION_STRINGS +#pragma message("Warning: version_string.tmp isn't generated properly by version_info.sh script!") +// here is an example of macros value: +#define __TBB_VERSION_STRINGS \ +"TBB: BUILD_HOST\tUnknown\n" \ +"TBB: BUILD_ARCH\tUnknown\n" \ +"TBB: BUILD_OS\t\tUnknown\n" \ +"TBB: BUILD_CL\t\tUnknown\n" \ +"TBB: BUILD_COMPILER\tUnknown\n" \ +"TBB: BUILD_COMMAND\tUnknown\n" +#endif +#ifndef __TBB_DATETIME +#ifdef RC_INVOKED +#define __TBB_DATETIME "Unknown" +#else +#define __TBB_DATETIME __DATE__ __TIME__ +#endif +#endif + +#define __TBB_VERSION_NUMBER "TBB: VERSION\t\t" __TBB_STRING(TBB_VERSION_MAJOR.TBB_VERSION_MINOR) ENDL +#define __TBB_INTERFACE_VERSION_NUMBER "TBB: INTERFACE VERSION\t" __TBB_STRING(TBB_INTERFACE_VERSION) ENDL +#define __TBB_VERSION_DATETIME "TBB: BUILD_DATE\t\t" __TBB_DATETIME ENDL +#ifndef TBB_USE_DEBUG + #define __TBB_VERSION_USE_DEBUG "TBB: TBB_USE_DEBUG\tundefined" ENDL +#elif TBB_USE_DEBUG==0 + #define __TBB_VERSION_USE_DEBUG "TBB: TBB_USE_DEBUG\t0" ENDL +#elif TBB_USE_DEBUG==1 + #define __TBB_VERSION_USE_DEBUG "TBB: TBB_USE_DEBUG\t1" ENDL +#elif TBB_USE_DEBUG==2 + #define __TBB_VERSION_USE_DEBUG "TBB: TBB_USE_DEBUG\t2" ENDL +#else + #error Unexpected value for TBB_USE_DEBUG +#endif +#ifndef TBB_USE_ASSERT + #define __TBB_VERSION_USE_ASSERT "TBB: TBB_USE_ASSERT\tundefined" ENDL +#elif TBB_USE_ASSERT==0 + #define __TBB_VERSION_USE_ASSERT "TBB: TBB_USE_ASSERT\t0" ENDL +#elif TBB_USE_ASSERT==1 + #define __TBB_VERSION_USE_ASSERT "TBB: TBB_USE_ASSERT\t1" ENDL +#elif TBB_USE_ASSERT==2 + #define __TBB_VERSION_USE_ASSERT "TBB: TBB_USE_ASSERT\t2" ENDL +#else + #error Unexpected value for TBB_USE_ASSERT +#endif +#ifndef DO_ITT_NOTIFY + #define __TBB_VERSION_DO_NOTIFY "TBB: DO_ITT_NOTIFY\tundefined" ENDL +#elif DO_ITT_NOTIFY==1 + #define __TBB_VERSION_DO_NOTIFY "TBB: DO_ITT_NOTIFY\t1" ENDL +#elif DO_ITT_NOTIFY==0 + #define __TBB_VERSION_DO_NOTIFY +#else + #error Unexpected value for DO_ITT_NOTIFY +#endif + +#define TBB_VERSION_STRINGS __TBB_VERSION_NUMBER __TBB_INTERFACE_VERSION_NUMBER __TBB_VERSION_DATETIME __TBB_VERSION_STRINGS __TBB_VERSION_USE_DEBUG __TBB_VERSION_USE_ASSERT __TBB_VERSION_DO_NOTIFY + +// numbers +#ifndef __TBB_VERSION_YMD +#define __TBB_VERSION_YMD 0, 0 +#endif + +#define TBB_VERNUMBERS TBB_VERSION_MAJOR, TBB_VERSION_MINOR, __TBB_VERSION_YMD + +#define TBB_VERSION __TBB_STRING(TBB_VERNUMBERS) diff --git a/dep/tbb/src/tbb/tls.h b/dep/tbb/src/tbb/tls.h new file mode 100644 index 000000000..2e4768c15 --- /dev/null +++ b/dep/tbb/src/tbb/tls.h @@ -0,0 +1,119 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef _TBB_tls_H +#define _TBB_tls_H + +#if USE_PTHREAD +#include <pthread.h> +#else /* assume USE_WINTHREAD */ +#include <windows.h> +#endif + +namespace tbb { + +namespace internal { + +typedef void (*tls_dtor_t)(void*); + +//! Basic cross-platform wrapper class for TLS operations. +template <typename T> +class basic_tls { +#if USE_PTHREAD + typedef pthread_key_t tls_key_t; +public: + int create( tls_dtor_t dtor = NULL ) { + return pthread_key_create(&my_key, dtor); + } + int destroy() { return pthread_key_delete(my_key); } + void set( T value ) { pthread_setspecific(my_key, (void*)value); } + T get() { return (T)pthread_getspecific(my_key); } +#else /* USE_WINTHREAD */ + typedef DWORD tls_key_t; +public: + int create() { + tls_key_t tmp = TlsAlloc(); + if( tmp==TLS_OUT_OF_INDEXES ) + return TLS_OUT_OF_INDEXES; + my_key = tmp; + return 0; + } + int destroy() { TlsFree(my_key); my_key=0; return 0; } + void set( T value ) { TlsSetValue(my_key, (LPVOID)value); } + T get() { return (T)TlsGetValue(my_key); } +#endif +private: + tls_key_t my_key; +}; + +//! More advanced TLS support template class. +/** It supports RAII and to some extent mimic __declspec(thread) variables. */ +template <typename T> +class tls : public basic_tls<T> { + typedef basic_tls<T> base; +public: + tls() { base::create(); } + ~tls() { base::destroy(); } + T operator=(T value) { base::set(value); return value; } + operator T() { return base::get(); } +}; + +template <typename T> +class tls<T*> : basic_tls<T*> { + typedef basic_tls<T*> base; + static void internal_dtor(void* ptr) { + if (ptr) delete (T*)ptr; + } + T* internal_get() { + T* result = base::get(); + if (!result) { + result = new T; + base::set(result); + } + return result; + } +public: + tls() { +#if USE_PTHREAD + base::create( internal_dtor ); +#else + base::create(); +#endif + } + ~tls() { base::destroy(); } + T* operator=(T* value) { base::set(value); return value; } + operator T*() { return internal_get(); } + T* operator->() { return internal_get(); } + T& operator*() { return *internal_get(); } +}; + +} // namespace internal + +} // namespace tbb + +#endif /* _TBB_tls_H */ diff --git a/dep/tbb/src/tbb/tools_api/_config.h b/dep/tbb/src/tbb/tools_api/_config.h new file mode 100644 index 000000000..17c97e53e --- /dev/null +++ b/dep/tbb/src/tbb/tools_api/_config.h @@ -0,0 +1,94 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation.
+ + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef __CONFIG_H_ +#define __CONFIG_H_ + +#ifndef ITT_OS_WIN +# define ITT_OS_WIN 1 +#endif /* ITT_OS_WIN */ + +#ifndef ITT_OS_LINUX +# define ITT_OS_LINUX 2 +#endif /* ITT_OS_LINUX */ + +#ifndef ITT_OS_MAC +# define ITT_OS_MAC 3 +#endif /* ITT_OS_MAC */ + +#ifndef ITT_OS +# if defined WIN32 || defined _WIN32 +# define ITT_OS ITT_OS_WIN +# elif defined( __APPLE__ ) && defined( __MACH__ ) +# define ITT_OS ITT_OS_MAC +# else +# define ITT_OS ITT_OS_LINUX +# endif +#endif /* ITT_OS */ + +#ifndef ITT_ARCH_IA32 +# define ITT_ARCH_IA32 1 +#endif /* ITT_ARCH_IA32 */ + +#ifndef ITT_ARCH_IA32E +# define ITT_ARCH_IA32E 2 +#endif /* ITT_ARCH_IA32E */ + +#ifndef ITT_ARCH_IA64 +# define ITT_ARCH_IA64 3 +#endif /* ITT_ARCH_IA64 */ + + +#ifndef ITT_ARCH +# if defined _M_X64 || defined _M_AMD64 || defined __x86_64__ +# define ITT_ARCH ITT_ARCH_IA32E +# elif defined _M_IA64 || defined __ia64 +# define ITT_ARCH ITT_ARCH_IA64 +# else +# define ITT_ARCH ITT_ARCH_IA32 +# endif +#endif + +#ifndef ITT_PLATFORM_WIN +# define ITT_PLATFORM_WIN 1 +#endif /* ITT_PLATFORM_WIN */ + +#ifndef ITT_PLATFORM_POSIX +# define ITT_PLATFORM_POSIX 2 +#endif /* ITT_PLATFORM_POSIX */ + +#ifndef ITT_PLATFORM +# if ITT_OS==ITT_OS_WIN +# define ITT_PLATFORM ITT_PLATFORM_WIN +# else +# define ITT_PLATFORM ITT_PLATFORM_POSIX +# endif /* _WIN32 */ +#endif /* ITT_PLATFORM */ + +#endif /* __CONFIG_H_ */ + diff --git a/dep/tbb/src/tbb/tools_api/_disable_warnings.h b/dep/tbb/src/tbb/tools_api/_disable_warnings.h new file mode 100644 index 000000000..e32f24ff5 --- /dev/null +++ b/dep/tbb/src/tbb/tools_api/_disable_warnings.h @@ -0,0 +1,42 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "_config.h" + +#if ITT_PLATFORM==ITT_PLATFORM_WIN + +#pragma warning (disable: 593) /* parameter "XXXX" was set but never used */ +#pragma warning (disable: 344) /* typedef name has already been declared (with same type) */ +#pragma warning (disable: 174) /* expression has no effect */ + +#elif defined __INTEL_COMPILER + +#pragma warning (disable: 869) /* parameter "XXXXX" was never referenced */ +#pragma warning (disable: 1418) /* external function definition with no prior declaration */ + +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ diff --git a/dep/tbb/src/tbb/tools_api/_ittnotify_static.h b/dep/tbb/src/tbb/tools_api/_ittnotify_static.h new file mode 100644 index 000000000..9604b4c4f --- /dev/null +++ b/dep/tbb/src/tbb/tools_api/_ittnotify_static.h @@ -0,0 +1,166 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#include "_config.h" + +#ifndef ITT_STUB +#define ITT_STUB ITT_STUBV +#endif /* ITT_STUB */ + +#ifndef ITTAPI_CALL +#define ITTAPI_CALL CDECL +#endif /* ITTAPI_CALL */ + +/* parameters for macro: + type, func_name, arguments, params, func_name_in_dll, group + */ + +ITT_STUBV(void, pause,(void),(), pause, __itt_control_group) + +ITT_STUBV(void, resume,(void),(), resume, __itt_control_group) + +#if ITT_PLATFORM==ITT_PLATFORM_WIN + +ITT_STUB(int, markA,(__itt_mark_type mt, const char *parameter),(mt,parameter), markA, __itt_mark_group) + +ITT_STUB(int, markW,(__itt_mark_type mt, const wchar_t *parameter),(mt,parameter), markW, __itt_mark_group) + +ITT_STUB(int, mark_globalA,(__itt_mark_type mt, const char *parameter),(mt,parameter), mark_globalA, __itt_mark_group) + +ITT_STUB(int, mark_globalW,(__itt_mark_type mt, const wchar_t *parameter),(mt,parameter), mark_globalW, __itt_mark_group) + +ITT_STUBV(void, thread_set_nameA,( const char *name),(name), thread_set_nameA, __itt_thread_group) + +ITT_STUBV(void, thread_set_nameW,( const wchar_t *name),(name), thread_set_nameW, __itt_thread_group) + +ITT_STUBV(void, sync_createA,(void *addr, const char *objtype, const char *objname, int attribute), (addr, objtype, objname, attribute), sync_createA, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, sync_createW,(void *addr, const wchar_t *objtype, const wchar_t *objname, int attribute), (addr, objtype, objname, attribute), sync_createW, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, sync_renameA, (void *addr, const char *name), (addr, name), sync_renameA, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, sync_renameW, (void *addr, const wchar_t *name), (addr, name), sync_renameW, __itt_sync_group | __itt_fsync_group) +#else /* WIN32 */ + +ITT_STUB(int, mark,(__itt_mark_type mt, const char *parameter),(mt,parameter), mark, __itt_mark_group) +ITT_STUB(int, mark_global,(__itt_mark_type mt, const char *parameter),(mt,parameter), mark_global, __itt_mark_group) + +ITT_STUBV(void, sync_set_name,(void *addr, const char *objtype, const char *objname, int attribute),(addr,objtype,objname,attribute), sync_set_name, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, thread_set_name,( const char *name),(name), thread_set_name, __itt_thread_group) + +ITT_STUBV(void, sync_create,(void *addr, const char *objtype, const char *objname, int attribute), (addr, objtype, objname, attribute), sync_create, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, sync_rename, (void *addr, const char *name), (addr, name), sync_rename, __itt_sync_group | __itt_fsync_group) +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +ITT_STUBV(void, sync_destroy,(void *addr), (addr), sync_destroy, __itt_sync_group | __itt_fsync_group) + +ITT_STUB(int, mark_off,(__itt_mark_type mt),(mt), mark_off, __itt_mark_group) +ITT_STUB(int, mark_global_off,(__itt_mark_type mt),(mt), mark_global_off, __itt_mark_group) + +ITT_STUBV(void, thread_ignore,(void),(), thread_ignore, __itt_thread_group) + +ITT_STUBV(void, sync_prepare,(void* addr),(addr), sync_prepare, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, sync_cancel,(void *addr),(addr), sync_cancel, __itt_sync_group) + +ITT_STUBV(void, sync_acquired,(void *addr),(addr), sync_acquired, __itt_sync_group) + +ITT_STUBV(void, sync_releasing,(void* addr),(addr), sync_releasing, __itt_sync_group) + +ITT_STUBV(void, sync_released,(void* addr),(addr), sync_released, __itt_sync_group) + +ITT_STUBV(void, memory_read,( void *address, size_t size ), (address, size), 
memory_read, __itt_all_group) +ITT_STUBV(void, memory_write,( void *address, size_t size ), (address, size), memory_write, __itt_all_group) +ITT_STUBV(void, memory_update,( void *address, size_t size ), (address, size), memory_update, __itt_all_group) + +ITT_STUB(int, jit_notify_event,(__itt_jit_jvm_event event_type, void* event_data),(event_type, event_data), jit_notify_event, __itt_jit_group) + +#ifndef NO_ITT_LEGACY + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +ITT_STUB(__itt_mark_type, mark_createA,(const char *name),(name), mark_createA, __itt_mark_group) +ITT_STUB(__itt_mark_type, mark_createW,(const wchar_t *name),(name), mark_createW, __itt_mark_group) +#else /* WIN32 */ +ITT_STUB(__itt_mark_type, mark_create,(const char *name),(name), mark_create, __itt_mark_group) +#endif +ITT_STUBV(void, fsync_prepare,(void* addr),(addr), sync_prepare, __itt_fsync_group) + +ITT_STUBV(void, fsync_cancel,(void *addr),(addr), sync_cancel, __itt_fsync_group) + +ITT_STUBV(void, fsync_acquired,(void *addr),(addr), sync_acquired, __itt_fsync_group) + +ITT_STUBV(void, fsync_releasing,(void* addr),(addr), sync_releasing, __itt_fsync_group) + +ITT_STUBV(void, fsync_released,(void* addr),(addr), sync_released, __itt_fsync_group) + +ITT_STUBV(void, notify_sync_prepare,(void *p),(p), notify_sync_prepare, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, notify_sync_cancel,(void *p),(p), notify_sync_cancel, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, notify_sync_acquired,(void *p),(p), notify_sync_acquired, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, notify_sync_releasing,(void *p),(p), notify_sync_releasing, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, notify_cpath_target,(void),(), notify_cpath_target, __itt_all_group) + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +ITT_STUBV(void, sync_set_nameA,(void *addr, const char *objtype, const char *objname, int attribute),(addr,objtype,objname,attribute), sync_set_nameA, __itt_sync_group | __itt_fsync_group) + +ITT_STUBV(void, sync_set_nameW,(void *addr, const wchar_t *objtype, const wchar_t *objname, int attribute),(addr,objtype,objname,attribute), sync_set_nameW, __itt_sync_group | __itt_fsync_group) + +ITT_STUB (int, thr_name_setA,( char *name, int namelen ),(name,namelen), thr_name_setA, __itt_thread_group) + +ITT_STUB (int, thr_name_setW,( wchar_t *name, int namelen ),(name,namelen), thr_name_setW, __itt_thread_group) + +ITT_STUB (__itt_event, event_createA,( char *name, int namelen ),(name,namelen), event_createA, __itt_mark_group) + +ITT_STUB (__itt_event, event_createW,( wchar_t *name, int namelen ),(name,namelen), event_createW, __itt_mark_group) +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +ITT_STUB (int, thr_name_set,( char *name, int namelen ),(name,namelen), thr_name_set, __itt_thread_group) + +ITT_STUB (__itt_event, event_create,( char *name, int namelen ),(name,namelen), event_create, __itt_mark_group) +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +ITT_STUBV(void, thr_ignore,(void),(), thr_ignore, __itt_thread_group) + +ITT_STUB (int, event_start,( __itt_event event ),(event), event_start, __itt_mark_group) + +ITT_STUB (int, event_end,( __itt_event event ),(event), event_end, __itt_mark_group) + +ITT_STUB (__itt_state_t, state_get, (void), (), state_get, __itt_all_group) +ITT_STUB (__itt_state_t, state_set,( __itt_state_t state), (state), state_set, __itt_all_group) +ITT_STUB (__itt_obj_state_t, obj_mode_set, ( __itt_obj_prop_t prop, __itt_obj_state_t state), (prop, state), obj_mode_set, __itt_all_group) +ITT_STUB 
(__itt_thr_state_t, thr_mode_set, (__itt_thr_prop_t prop, __itt_thr_state_t state), (prop, state), thr_mode_set, __itt_all_group) + +ITT_STUB (const char*, api_version,(void),(), api_version, __itt_all_group) +ITT_STUB (unsigned int, jit_get_new_method_id, (void), (), jit_get_new_method_id, __itt_jit_group) + +#endif /* NO_ITT_LEGACY */ + diff --git a/dep/tbb/src/tbb/tools_api/ittnotify.h b/dep/tbb/src/tbb/tools_api/ittnotify.h new file mode 100644 index 000000000..e9ebb0f20 --- /dev/null +++ b/dep/tbb/src/tbb/tools_api/ittnotify.h @@ -0,0 +1,1234 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +/** @mainpage + * Ability to control the collection during runtime. User API can be inserted into the user application. + * Commands include: + - Pause/resume analysis + - Stop analysis and application, view results + - Cancel analysis and application without generating results + - Mark current time in results + * The User API provides ability to control the collection, set marks at the execution of specific user code and + * specify custom synchronization primitives implemented without standard system APIs. + * + * Use case: User inserts API calls to the desired places in her code. The code is then compiled and + * linked with static part of User API library. User can recompile the code with specific macro defined + * to enable API calls. If this macro is not defined there is no run-time overhead and no need to link + * with static part of User API library. During runtime the static library loads and initializes the dynamic part. + * In case of instrumentation-based collection, only a stub library is loaded; otherwise a proxy library is loaded, + * which calls the collector. + * + * User API set is native (C/C++) only (no MRTE support). As amitigation can use JNI or C/C++ function + * call from managed code where needed. If the collector causes significant overhead or data storage, then + * pausing analysis should reduce the overhead to minimal levels. 
+*/ +/** @example example.cpp + * @brief The following example program shows the usage of User API + */ + +#ifndef _ITTNOTIFY_H_ +#define _ITTNOTIFY_H_ +/** @file ittnotify.h + * @brief Header file which contains declaration of user API functions and types + */ + +/** @cond exclude_from_documentation */ +#ifndef ITT_OS_WIN +# define ITT_OS_WIN 1 +#endif /* ITT_OS_WIN */ + +#ifndef ITT_OS_LINUX +# define ITT_OS_LINUX 2 +#endif /* ITT_OS_LINUX */ + +#ifndef ITT_OS_MAC +# define ITT_OS_MAC 3 +#endif /* ITT_OS_MAC */ + +#ifndef ITT_OS +# if defined WIN32 || defined _WIN32 +# define ITT_OS ITT_OS_WIN +# elif defined( __APPLE__ ) && defined( __MACH__ ) +# define ITT_OS ITT_OS_MAC +# else +# define ITT_OS ITT_OS_LINUX +# endif +#endif /* ITT_OS */ + +#ifndef ITT_PLATFORM_WIN +# define ITT_PLATFORM_WIN 1 +#endif /* ITT_PLATFORM_WIN */ + +#ifndef ITT_PLATFORM_POSIX +# define ITT_PLATFORM_POSIX 2 +#endif /* ITT_PLATFORM_POSIX */ + +#ifndef ITT_PLATFORM +# if ITT_OS==ITT_OS_WIN +# define ITT_PLATFORM ITT_PLATFORM_WIN +# else +# define ITT_PLATFORM ITT_PLATFORM_POSIX +# endif /* _WIN32 */ +#endif /* ITT_PLATFORM */ + + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +#include +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +#ifdef __cplusplus +extern "C" { +#endif /* __cplusplus */ + +#define ITTAPI_CALL CDECL + +#ifndef CDECL +#if ITT_PLATFORM==ITT_PLATFORM_WIN +# define CDECL __cdecl +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +# define CDECL +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +#endif /* CDECL */ + +/** @endcond */ + +/** @brief user event type */ +typedef int __itt_mark_type; +typedef int __itt_event; +typedef int __itt_state_t; + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +# ifdef UNICODE + typedef wchar_t __itt_char; +# else /* UNICODE */ + typedef char __itt_char; +# endif /* UNICODE */ +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @brief Typedef for char or wchar_t (if Unicode symbol is allowed) on Windows. + * And typedef for char on Linux. + */ + typedef char __itt_char; +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @cond exclude_from_documentation */ +typedef enum __itt_obj_state { + __itt_obj_state_err = 0, + __itt_obj_state_clr = 1, + __itt_obj_state_set = 2, + __itt_obj_state_use = 3 +} __itt_obj_state_t; + +typedef enum __itt_thr_state { + __itt_thr_state_err = 0, + __itt_thr_state_clr = 1, + __itt_thr_state_set = 2 +} __itt_thr_state_t; + +typedef enum __itt_obj_prop { + __itt_obj_prop_watch = 1, + __itt_obj_prop_ignore = 2, + __itt_obj_prop_sharable = 3 +} __itt_obj_prop_t; + +typedef enum __itt_thr_prop { + __itt_thr_prop_quiet = 1 +} __itt_thr_prop_t; +/** @endcond */ +typedef enum __itt_error_code { + __itt_error_success = 0, /*!< no error */ + __itt_error_no_module = 1, /*!< module can't be loaded */ + __itt_error_no_symbol = 2, /*!< symbol not found */ + __itt_error_unknown_group = 3, /*!< unknown group specified */ + __itt_error_cant_read_env = 4 /*!< variable value too long */ +} __itt_error_code; + +typedef void (__itt_error_notification_t)(__itt_error_code code, const char* msg); + +/******************************************* + * Various constants used by JIT functions * + *******************************************/ + + /*! @enum ___itt_jit_jvm_event + * event notification + */ + typedef enum ___itt_jit_jvm_event + { + + __itt_JVM_EVENT_TYPE_SHUTDOWN = 2, /*!< Shutdown. Program exiting. EventSpecificData NA*/ + __itt_JVM_EVENT_TYPE_METHOD_LOAD_FINISHED=13,/*!< JIT profiling. 
Issued after method code jitted into memory but before code is executed + * event_data is an __itt_JIT_Method_Load */ + __itt_JVM_EVENT_TYPE_METHOD_UNLOAD_START /*!< JIT profiling. Issued before unload. Method code will no longer be executed, but code and info are still in memory. + * The VTune profiler may capture method code only at this point. event_data is __itt_JIT_Method_Id */ + + } __itt_jit_jvm_event; + +/*! @enum ___itt_jit_environment_type + * @brief Enumerator for the environment of methods + */ +typedef enum ___itt_jit_environment_type +{ + __itt_JIT_JITTINGAPI = 2 +} __itt_jit_environment_type; + +/********************************** + * Data structures for the events * + **********************************/ + + /*! @struct ___itt_jit_method_id + * @brief structure for the events: __itt_iJVM_EVENT_TYPE_METHOD_UNLOAD_START + */ +typedef struct ___itt_jit_method_id +{ + /** @brief Id of the method (same as the one passed in the __itt_JIT_Method_Load struct */ + unsigned int method_id; + +} *__itt_pjit_method_id, __itt_jit_method_id; + +/*! @struct ___itt_jit_line_number_info + * @brief structure for the events: __itt_iJVM_EVENT_TYPE_METHOD_LOAD_FINISHED + */ +typedef struct ___itt_jit_line_number_info +{ + /** @brief x86 Offset from the begining of the method */ + unsigned int offset; + /** @brief source line number from the begining of the source file. */ + unsigned int line_number; + +} *__itt_pjit_line_number_info, __itt_jit_line_number_info; +/*! @struct ___itt_jit_method_load + * @brief structure for the events: __itt_iJVM_EVENT_TYPE_METHOD_LOAD_FINISHED + */ +typedef struct ___itt_jit_method_load +{ + /** @brief unique method ID - can be any unique value, (except 0 - 999) */ + unsigned int method_id; + /** @brief method name (can be with or without the class and signature, in any case the class name will be added to it) */ + char* method_name; + /** @brief virtual address of that method - This determines the method range for the iJVM_EVENT_TYPE_ENTER/LEAVE_METHOD_ADDR events */ + void* method_load_address; + /** @brief Size in memory - Must be exact */ + unsigned int method_size; + /** @brief Line Table size in number of entries - Zero if none */ + unsigned int line_number_size; + /** @brief Pointer to the begining of the line numbers info array */ + __itt_pjit_line_number_info line_number_table; + /** @brief unique class ID */ + unsigned int class_id; + /** @brief class file name */ + char* class_file_name; + /** @brief source file name */ + char* source_file_name; + /** @brief bits supplied by the user for saving in the JIT file... */ + void* user_data; + /** @brief the size of the user data buffer */ + unsigned int user_data_size; + /** @note no need to fill this field, it's filled by VTune */ + __itt_jit_environment_type env; +} *__itt_pjit_method_load, __itt_jit_method_load; + +/** + * @brief General behavior: application continues to run, but no profiling information is being collected + + * - Pausing occurs not only for the current thread but for all process as well as spawned processes + * - Intel(R) Parallel Inspector: does not analyze or report errors that involve memory access. + * - Intel(R) Parallel Inspector: Other errors are reported as usual. Pausing data collection in + Intel(R) Parallel Inspector only pauses tracing and analyzing memory access. It does not pause + tracing or analyzing threading APIs. 
+ * - Intel(R) Parallel Amplifier: does continue to record when new threads are started + * - Other effects: possible reduction of runtime overhead + */ +void ITTAPI_CALL __itt_pause(void); + +/** + * @brief General behavior: application continues to run, collector resumes profiling information + * collection for all threads and processes of profiled application + */ +void ITTAPI_CALL __itt_resume(void); + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +__itt_mark_type ITTAPI_CALL __itt_mark_createA(const char *name); +__itt_mark_type ITTAPI_CALL __itt_mark_createW(const wchar_t *name); +#ifdef UNICODE +# define __itt_mark_create __itt_mark_createW +# define __itt_mark_create_ptr __itt_mark_createW_ptr +#else /* UNICODE */ +# define __itt_mark_create __itt_mark_createA +# define __itt_mark_create_ptr __itt_mark_createA_ptr +#endif /* UNICODE */ +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @brief Creates a user event type (mark) with the specified name using char or Unicode string. + * @param[in] name - name of mark to create + * @return Returns a handle to the mark type + */ +__itt_mark_type ITTAPI_CALL __itt_mark_create(const __itt_char* name); +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +int ITTAPI_CALL __itt_markA(__itt_mark_type mt, const char *parameter); +int ITTAPI_CALL __itt_markW(__itt_mark_type mt, const wchar_t *parameter); + +int ITTAPI_CALL __itt_mark_globalA(__itt_mark_type mt, const char *parameter); +int ITTAPI_CALL __itt_mark_globalW(__itt_mark_type mt, const wchar_t *parameter); + +#ifdef UNICODE +# define __itt_mark __itt_markW +# define __itt_mark_ptr __itt_markW_ptr + +# define __itt_mark_global __itt_mark_globalW +# define __itt_mark_global_ptr __itt_mark_globalW_ptr +#else /* UNICODE */ +# define __itt_mark __itt_markA +# define __itt_mark_ptr __itt_markA_ptr + +# define __itt_mark_global __itt_mark_globalA +# define __itt_mark_global_ptr __itt_mark_globalA_ptr +#endif /* UNICODE */ +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @brief Creates a "discrete" user event type (mark) of the specified type and an optional parameter using char or Unicode string. + + * - The mark of "discrete" type is placed to collection results in case of success. It appears in overtime view(s) as a special tick sign. + * - The call is "synchronous" - function returns after mark is actually added to results. + * - This function is useful, for example, to mark different phases of application (beginning of the next mark automatically meand end of current region). + * - Can be used together with "continuous" marks (see below) at the same collection session + * @param[in] mt - mark, created by __itt_mark_create(const __itt_char* name) function + * @param[in] parameter - string parameter of mark + * @return Returns zero value in case of success, non-zero value otherwise. + */ +int ITTAPI_CALL __itt_mark(__itt_mark_type mt, const __itt_char* parameter); +/** @brief Use this if necessary to create a "discrete" user event type (mark) for process + * rather then for one thread + * @see int ITTAPI_CALL __itt_mark(__itt_mark_type mt, const __itt_char* parameter); + */ +int ITTAPI_CALL __itt_mark_global(__itt_mark_type mt, const __itt_char* parameter); +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** + * @brief Creates an "end" point for "continuous" mark with specified name. + + * - Returns zero value in case of success, non-zero value otherwise. 
Also returns non-zero value when preceding "begin" point for the mark with the same name failed to be created or not created. (*) + * - The mark of "continuous" type is placed to collection results in case of success. It appears in overtime view(s) as a special tick sign (different from "discrete" mark) together with line from corresponding "begin" mark to "end" mark. (*) * - Continuous marks can overlap (*) and be nested inside each other. Discrete mark can be nested inside marked region + * + * @param[in] mt - mark, created by __itt_mark_create(const __itt_char* name) function + * + * @return Returns zero value in case of success, non-zero value otherwise. + */ +int ITTAPI_CALL __itt_mark_off(__itt_mark_type mt); +/** @brief Use this if necessary to create an "end" point for mark of process + * @see int ITTAPI_CALL __itt_mark_off(__itt_mark_type mt); + */ +int ITTAPI_CALL __itt_mark_global_off(__itt_mark_type mt); + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +void ITTAPI_CALL __itt_thread_set_nameA(const char *name); +void ITTAPI_CALL __itt_thread_set_nameW(const wchar_t *name); +#ifdef UNICODE +# define __itt_thread_set_name __itt_thread_set_nameW +# define __itt_thread_set_name_ptr __itt_thread_set_nameW_ptr +#else /* UNICODE */ +# define __itt_thread_set_name __itt_thread_set_nameA +# define __itt_thread_set_name_ptr __itt_thread_set_nameA_ptr +#endif /* UNICODE */ +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @brief Sets thread name using char or Unicode string + * @param[in] name - name of thread + */ +void ITTAPI_CALL __itt_thread_set_name(const __itt_char* name); +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +/** @brief Mark current thread as ignored from this point on, for the duration of its existence. */ +void ITTAPI_CALL __itt_thread_ignore(void); +/** @brief Is called when sync object is destroyed (needed to track lifetime of objects) */ +void ITTAPI_CALL __itt_sync_destroy(void *addr); +/** @brief Enter spin loop on user-defined sync object */ +void ITTAPI_CALL __itt_sync_prepare(void* addr); +/** @brief Quit spin loop without acquiring spin object */ +void ITTAPI_CALL __itt_sync_cancel(void *addr); +/** @brief Successful spin loop completion (sync object acquired) */ +void ITTAPI_CALL __itt_sync_acquired(void *addr); +/** @brief Start sync object releasing code. Is called before the lock release call. */ +void ITTAPI_CALL __itt_sync_releasing(void* addr); +/** @brief Sync object released. Is called after the release call */ +void ITTAPI_CALL __itt_sync_released(void* addr); + +/** @brief Fast synchronization which does no require spinning. + + * - This special function is to be used by TBB and OpenMP libraries only when they know + * there is no spin but they need to suppress TC warnings about shared variable modifications. + * - It only has corresponding pointers in static library and does not have corresponding function + * in dynamic library. + * @see void ITTAPI_CALL __itt_sync_prepare(void* addr); +*/ +void ITTAPI_CALL __itt_fsync_prepare(void* addr); +/** @brief Fast synchronization which does no require spinning. + + * - This special function is to be used by TBB and OpenMP libraries only when they know + * there is no spin but they need to suppress TC warnings about shared variable modifications. + * - It only has corresponding pointers in static library and does not have corresponding function + * in dynamic library. 
+ * @see void ITTAPI_CALL __itt_sync_cancel(void *addr); +*/ +void ITTAPI_CALL __itt_fsync_cancel(void *addr); +/** @brief Fast synchronization which does no require spinning. + + * - This special function is to be used by TBB and OpenMP libraries only when they know + * there is no spin but they need to suppress TC warnings about shared variable modifications. + * - It only has corresponding pointers in static library and does not have corresponding function + * in dynamic library. + * @see void ITTAPI_CALL __itt_sync_acquired(void *addr); +*/ +void ITTAPI_CALL __itt_fsync_acquired(void *addr); +/** @brief Fast synchronization which does no require spinning. + + * - This special function is to be used by TBB and OpenMP libraries only when they know + * there is no spin but they need to suppress TC warnings about shared variable modifications. + * - It only has corresponding pointers in static library and does not have corresponding function + * in dynamic library. + * @see void ITTAPI_CALL __itt_sync_releasing(void* addr); +*/ +void ITTAPI_CALL __itt_fsync_releasing(void* addr); +/** @brief Fast synchronization which does no require spinning. + + * - This special function is to be used by TBB and OpenMP libraries only when they know + * there is no spin but they need to suppress TC warnings about shared variable modifications. + * - It only has corresponding pointers in static library and does not have corresponding function + * in dynamic library. + * @see void ITTAPI_CALL __itt_sync_released(void* addr); +*/ +void ITTAPI_CALL __itt_fsync_released(void* addr); + +/** @hideinitializer + * @brief possible value of attribute argument for sync object type + */ +#define __itt_attr_barrier 1 +/** @hideinitializer + * @brief possible value of attribute argument for sync object type + */ +#define __itt_attr_mutex 2 + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +void ITTAPI_CALL __itt_sync_set_nameA(void *addr, const char *objtype, const char *objname, int attribute); +void ITTAPI_CALL __itt_sync_set_nameW(void *addr, const wchar_t *objtype, const wchar_t *objname, int attribute); +#ifdef UNICODE +# define __itt_sync_set_name __itt_sync_set_nameW +# define __itt_sync_set_name_ptr __itt_sync_set_nameW_ptr +#else /* UNICODE */ +# define __itt_sync_set_name __itt_sync_set_nameA +# define __itt_sync_set_name_ptr __itt_sync_set_nameA_ptr +#endif /* UNICODE */ +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @deprecated Legacy API + * @brief Assign a name to a sync object using char or Unicode string + * @param[in] addr - pointer to the sync object. You should use a real pointer to your object + * to make sure that the values don't clash with other object addresses + * @param[in] objtype - null-terminated object type string. If NULL is passed, the object will + * be assumed to be of generic "User Synchronization" type + * @param[in] objname - null-terminated object name string. If NULL, no name will be assigned + * to the object -- you can use the __itt_sync_rename call later to assign + * the name + * @param[in] attribute - one of [ #__itt_attr_barrier , #__itt_attr_mutex] values which defines the + * exact semantics of how prepare/acquired/releasing calls work. 
+ */ +void ITTAPI_CALL __itt_sync_set_name(void *addr, const __itt_char* objtype, const __itt_char* objname, int attribute); +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +void ITTAPI_CALL __itt_sync_createA(void *addr, const char *objtype, const char *objname, int attribute); +void ITTAPI_CALL __itt_sync_createW(void *addr, const wchar_t *objtype, const wchar_t *objname, int attribute); +#ifdef UNICODE +#define __itt_sync_create __itt_sync_createW +# define __itt_sync_create_ptr __itt_sync_createW_ptr +#else /* UNICODE */ +#define __itt_sync_create __itt_sync_createA +# define __itt_sync_create_ptr __itt_sync_createA_ptr +#endif /* UNICODE */ +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @brief Register the creation of a sync object using char or Unicode string + * @param[in] addr - pointer to the sync object. You should use a real pointer to your object + * to make sure that the values don't clash with other object addresses + * @param[in] objtype - null-terminated object type string. If NULL is passed, the object will + * be assumed to be of generic "User Synchronization" type + * @param[in] objname - null-terminated object name string. If NULL, no name will be assigned + * to the object -- you can use the __itt_sync_rename call later to assign + * the name + * @param[in] attribute - one of [ #__itt_attr_barrier, #__itt_attr_mutex] values which defines the + * exact semantics of how prepare/acquired/releasing calls work. +**/ +void ITTAPI_CALL __itt_sync_create(void *addr, const __itt_char* objtype, const __itt_char* objname, int attribute); +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @brief Assign a name to a sync object using char or Unicode string. + + * Sometimes you cannot assign the name to a sync object in the __itt_sync_set_name call because it + * is not yet known there. In this case you should use the rename call which allows to assign the + * name after the creation has been registered. The renaming can be done multiple times. All waits + * after a new name has been assigned will be attributed to the sync object with this name. 
+ * @param[in] addr - pointer to the sync object + * @param[in] name - null-terminated object name string +**/ +#if ITT_PLATFORM==ITT_PLATFORM_WIN +void ITTAPI_CALL __itt_sync_renameA(void *addr, const char *name); +void ITTAPI_CALL __itt_sync_renameW(void *addr, const wchar_t *name); +#ifdef UNICODE +#define __itt_sync_rename __itt_sync_renameW +# define __itt_sync_rename_ptr __itt_sync_renameW_ptr +#else /* UNICODE */ +#define __itt_sync_rename __itt_sync_renameA +# define __itt_sync_rename_ptr __itt_sync_renameA_ptr +#endif /* UNICODE */ +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +void ITTAPI_CALL __itt_sync_rename(void *addr, const __itt_char* name); +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +/** @cond exclude_from_documentaion */ +int __itt_jit_notify_event(__itt_jit_jvm_event event_type, void* event_data); +unsigned int __itt_jit_get_new_method_id(void); +const char* ITTAPI_CALL __itt_api_version(void); +__itt_error_notification_t* __itt_set_error_handler(__itt_error_notification_t*); + +#if ITT_OS == ITT_OS_WIN +#define LIBITTNOTIFY_CC __cdecl +#define LIBITTNOTIFY_EXPORT __declspec(dllexport) +#define LIBITTNOTIFY_IMPORT __declspec(dllimport) +#elif ITT_OS == ITT_OS_MAC || ITT_OS == ITT_OS_LINUX +#define LIBITTNOTIFY_CC /* nothing */ +#define LIBITTNOTIFY_EXPORT /* nothing */ +#define LIBITTNOTIFY_IMPORT /* nothing */ +#else /* ITT_OS == ITT_OS_WIN */ +#error "Unsupported OS" +#endif /* ITT_OS == ITT_OS_WIN */ + +#define LIBITTNOTIFY_API +/** @endcond */ + +/** @deprecated Legacy API + * @brief Hand instrumentation of user synchronization + */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_notify_sync_prepare(void *p); +/** @deprecated Legacy API + * @brief Hand instrumentation of user synchronization + */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_notify_sync_cancel(void *p); +/** @deprecated Legacy API + * @brief Hand instrumentation of user synchronization + */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_notify_sync_acquired(void *p); +/** @deprecated Legacy API + * @brief Hand instrumentation of user synchronization + */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_notify_sync_releasing(void *p); +/** @deprecated Legacy API + * @brief itt_notify_cpath_target is handled by Thread Profiler only. + * Inform Thread Profiler that the current thread has recahed a critical path target. + */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_notify_cpath_target(void); + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +LIBITTNOTIFY_API int LIBITTNOTIFY_CC __itt_thr_name_setA( char *name, int namelen ); +LIBITTNOTIFY_API int LIBITTNOTIFY_CC __itt_thr_name_setW( wchar_t *name, int namelen ); +# ifdef UNICODE +# define __itt_thr_name_set __itt_thr_name_setW +# define __itt_thr_name_set_ptr __itt_thr_name_setW_ptr +# else +# define __itt_thr_name_set __itt_thr_name_setA +# define __itt_thr_name_set_ptr __itt_thr_name_setA_ptr +# endif /* UNICODE */ +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @deprecated Legacy API + * @brief Set name to be associated with thread in analysis GUI. + * Return __itt_err upon failure (name or namelen being null,name and namelen mismatched) + */ +LIBITTNOTIFY_API int LIBITTNOTIFY_CC __itt_thr_name_set( __itt_char *name, int namelen ); +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +/** @brief Mark current thread as ignored from this point on, for the duration of its existence. 
*/ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_thr_ignore(void); + +/* User event notification */ +#if ITT_PLATFORM==ITT_PLATFORM_WIN +/** @deprecated Legacy API + * @brief User event notification. + * Event create APIs return non-zero event identifier upon success and __itt_err otherwise + * (name or namelen being null/name and namelen not matching, user event feature not enabled) + */ +LIBITTNOTIFY_API __itt_event LIBITTNOTIFY_CC __itt_event_createA( char *name, int namelen ); +LIBITTNOTIFY_API __itt_event LIBITTNOTIFY_CC __itt_event_createW( wchar_t *name, int namelen ); +# ifdef UNICODE +# define __itt_event_create __itt_event_createW +# define __itt_event_create_ptr __itt_event_createW_ptr +# else +# define __itt_event_create __itt_event_createA +# define __itt_event_create_ptr __itt_event_createA_ptr +# endif /* UNICODE */ +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +/** @deprecated Legacy API + * @brief User event notification. + * Event create APIs return non-zero event identifier upon success and __itt_err otherwise + * (name or namelen being null/name and namelen not matching, user event feature not enabled) + */ +LIBITTNOTIFY_API __itt_event LIBITTNOTIFY_CC __itt_event_create( __itt_char *name, int namelen ); +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +/** @deprecated Legacy API + * @brief Record an event occurance. + * These APIs return __itt_err upon failure (invalid event id/user event feature not enabled) + */ +LIBITTNOTIFY_API int LIBITTNOTIFY_CC __itt_event_start( __itt_event event ); +/** @deprecated Legacy API + * @brief Record an event occurance. event_end is optional if events do not have durations. + * These APIs return __itt_err upon failure (invalid event id/user event feature not enabled) + */ +LIBITTNOTIFY_API int LIBITTNOTIFY_CC __itt_event_end( __itt_event event ); /** optional */ + + +/** @deprecated Legacy API + * @brief managing thread and object states + */ +LIBITTNOTIFY_API __itt_state_t LIBITTNOTIFY_CC __itt_state_get(void); +/** @deprecated Legacy API + * @brief managing thread and object states + */ +LIBITTNOTIFY_API __itt_state_t LIBITTNOTIFY_CC __itt_state_set( __itt_state_t ); + +/** @deprecated Legacy API + * @brief managing thread and object modes + */ +LIBITTNOTIFY_API __itt_thr_state_t LIBITTNOTIFY_CC __itt_thr_mode_set( __itt_thr_prop_t, __itt_thr_state_t ); +/** @deprecated Legacy API + * @brief managing thread and object modes + */ +LIBITTNOTIFY_API __itt_obj_state_t LIBITTNOTIFY_CC __itt_obj_mode_set( __itt_obj_prop_t, __itt_obj_state_t ); + +/** @deprecated Non-supported Legacy API + * @brief Inform the tool of memory accesses on reading + */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_memory_read( void *address, size_t size ); +/** @deprecated Non-supported Legacy API + * @brief Inform the tool of memory accesses on writing + */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_memory_write( void *address, size_t size ); +/** @deprecated Non-supported Legacy API + * @brief Inform the tool of memory accesses on updating + */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_memory_update( void *address, size_t size ); + +/** @cond exclude_from_documentation */ +/* The following 3 are currently for INTERNAL use only */ +/** @internal */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_test_delay( int ); +/** @internal */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_test_seq_init( void *, int ); +/** @internal */ +LIBITTNOTIFY_API void LIBITTNOTIFY_CC __itt_test_seq_wait( void *, int ); +/** @endcond */ + +#ifdef __cplusplus +} +#endif /* __cplusplus 
*/ + + +/* ********************************************************************************* + ********************************************************************************* + ********************************************************************************* */ +/** @cond exclude_from_documentation */ +#define ITT_JOIN_AUX(p,n) p##n +#define ITT_JOIN(p,n) ITT_JOIN_AUX(p,n) + +#ifndef INTEL_ITTNOTIFY_PREFIX +#define INTEL_ITTNOTIFY_PREFIX __itt_ +#endif /* INTEL_ITTNOTIFY_PREFIX */ +#ifndef INTEL_ITTNOTIFY_POSTFIX +# define INTEL_ITTNOTIFY_POSTFIX _ptr_ +#endif /* INTEL_ITTNOTIFY_POSTFIX */ + +#ifndef _ITTNOTIFY_H_MACRO_BODY_ + +#define ____ITTNOTIFY_NAME_(p,n) p##n +#define ___ITTNOTIFY_NAME_(p,n) ____ITTNOTIFY_NAME_(p,n) +#define __ITTNOTIFY_NAME_(n) ___ITTNOTIFY_NAME_(INTEL_ITTNOTIFY_PREFIX,n) +#define _ITTNOTIFY_NAME_(n) __ITTNOTIFY_NAME_(ITT_JOIN(n,INTEL_ITTNOTIFY_POSTFIX)) + +#ifdef ITT_STUBV +#undef ITT_STUBV +#endif +#define ITT_STUBV(type,name,args,params) \ + typedef type (ITTAPI_CALL* ITT_JOIN(_ITTNOTIFY_NAME_(name),_t)) args; \ + extern ITT_JOIN(_ITTNOTIFY_NAME_(name),_t) _ITTNOTIFY_NAME_(name); +#undef ITT_STUB +#define ITT_STUB ITT_STUBV + +#ifdef __cplusplus +extern "C" { +#endif /* __cplusplus */ + +#define __itt_error_handler ITT_JOIN(INTEL_ITTNOTIFY_PREFIX, error_handler) +void __itt_error_handler(__itt_jit_jvm_event event_type, void* event_data); + +extern const __itt_state_t _ITTNOTIFY_NAME_(state_err); +extern const __itt_event _ITTNOTIFY_NAME_(event_err); +extern const int _ITTNOTIFY_NAME_(err); + +#define __itt_state_err _ITTNOTIFY_NAME_(state_err) +#define __itt_event_err _ITTNOTIFY_NAME_(event_err) +#define __itt_err _ITTNOTIFY_NAME_(err) + +ITT_STUBV(void, pause,(void),()) +ITT_STUBV(void, resume,(void),()) + +#if ITT_PLATFORM==ITT_PLATFORM_WIN + +ITT_STUB(__itt_mark_type, mark_createA,(const char *name),(name)) + +ITT_STUB(__itt_mark_type, mark_createW,(const wchar_t *name),(name)) + +ITT_STUB(int, markA,(__itt_mark_type mt, const char *parameter),(mt,parameter)) + +ITT_STUB(int, markW,(__itt_mark_type mt, const wchar_t *parameter),(mt,parameter)) + +ITT_STUB(int, mark_globalA,(__itt_mark_type mt, const char *parameter),(mt,parameter)) + +ITT_STUB(int, mark_globalW,(__itt_mark_type mt, const wchar_t *parameter),(mt,parameter)) + +ITT_STUBV(void, thread_set_nameA,( const char *name),(name)) + +ITT_STUBV(void, thread_set_nameW,( const wchar_t *name),(name)) + +ITT_STUBV(void, sync_createA,(void *addr, const char *objtype, const char *objname, int attribute), (addr, objtype, objname, attribute)) + +ITT_STUBV(void, sync_createW,(void *addr, const wchar_t *objtype, const wchar_t *objname, int attribute), (addr, objtype, objname, attribute)) + +ITT_STUBV(void, sync_renameA, (void *addr, const char *name), (addr, name)) + +ITT_STUBV(void, sync_renameW, (void *addr, const wchar_t *name), (addr, name)) +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +ITT_STUB(__itt_mark_type, mark_create,(const char *name),(name)) + +ITT_STUB(int, mark,(__itt_mark_type mt, const char *parameter),(mt,parameter)) + +ITT_STUB(int, mark_global,(__itt_mark_type mt, const char *parameter),(mt,parameter)) + +ITT_STUBV(void, sync_set_name,(void *addr, const char *objtype, const char *objname, int attribute),(addr,objtype,objname,attribute)) + +ITT_STUBV(void, thread_set_name,( const char *name),(name)) + +ITT_STUBV(void, sync_create,(void *addr, const char *objtype, const char *objname, int attribute), (addr, objtype, objname, attribute)) + +ITT_STUBV(void, sync_rename, (void *addr, const char 
*name), (addr, name)) +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +ITT_STUB(int, mark_off,(__itt_mark_type mt),(mt)) + +ITT_STUB(int, mark_global_off,(__itt_mark_type mt),(mt)) + +ITT_STUBV(void, thread_ignore,(void),()) + +ITT_STUBV(void, sync_prepare,(void* addr),(addr)) + +ITT_STUBV(void, sync_cancel,(void *addr),(addr)) + +ITT_STUBV(void, sync_acquired,(void *addr),(addr)) + +ITT_STUBV(void, sync_releasing,(void* addr),(addr)) + +ITT_STUBV(void, sync_released,(void* addr),(addr)) + +ITT_STUBV(void, fsync_prepare,(void* addr),(addr)) + +ITT_STUBV(void, fsync_cancel,(void *addr),(addr)) + +ITT_STUBV(void, fsync_acquired,(void *addr),(addr)) + +ITT_STUBV(void, fsync_releasing,(void* addr),(addr)) + +ITT_STUBV(void, fsync_released,(void* addr),(addr)) + +ITT_STUBV(void, sync_destroy,(void *addr), (addr)) + +ITT_STUBV(void, notify_sync_prepare,(void *p),(p)) + +ITT_STUBV(void, notify_sync_cancel,(void *p),(p)) + +ITT_STUBV(void, notify_sync_acquired,(void *p),(p)) + +ITT_STUBV(void, notify_sync_releasing,(void *p),(p)) + +ITT_STUBV(void, notify_cpath_target,(),()) + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +ITT_STUBV(void, sync_set_nameA,(void *addr, const char *objtype, const char *objname, int attribute),(addr,objtype,objname,attribute)) + +ITT_STUBV(void, sync_set_nameW,(void *addr, const wchar_t *objtype, const wchar_t *objname, int attribute),(addr,objtype,objname,attribute)) + +ITT_STUB (int, thr_name_setA,( char *name, int namelen ),(name,namelen)) + +ITT_STUB (int, thr_name_setW,( wchar_t *name, int namelen ),(name,namelen)) + +ITT_STUB (__itt_event, event_createA,( char *name, int namelen ),(name,namelen)) + +ITT_STUB (__itt_event, event_createW,( wchar_t *name, int namelen ),(name,namelen)) +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +ITT_STUB (int, thr_name_set,( char *name, int namelen ),(name,namelen)) + +ITT_STUB (__itt_event, event_create,( char *name, int namelen ),(name,namelen)) +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +ITT_STUBV(void, thr_ignore,(void),()) + +ITT_STUB (int, event_start,( __itt_event event ),(event)) + +ITT_STUB (int, event_end,( __itt_event event ),(event)) + +ITT_STUB (__itt_state_t, state_get, (), ()) +ITT_STUB (__itt_state_t, state_set,( __itt_state_t state), (state)) +ITT_STUB (__itt_obj_state_t, obj_mode_set, ( __itt_obj_prop_t prop, __itt_obj_state_t state), (prop, state)) +ITT_STUB (__itt_thr_state_t, thr_mode_set, (__itt_thr_prop_t prop, __itt_thr_state_t state), (prop, state)) + +ITT_STUB(const char*, api_version,(void),()) + +ITT_STUB(int, jit_notify_event, (__itt_jit_jvm_event event_type, void* event_data), (event_type, event_data)) +ITT_STUB(unsigned int, jit_get_new_method_id, (void), ()) + +ITT_STUBV(void, memory_read,( void *address, size_t size ), (address, size)) +ITT_STUBV(void, memory_write,( void *address, size_t size ), (address, size)) +ITT_STUBV(void, memory_update,( void *address, size_t size ), (address, size)) + +ITT_STUBV(void, test_delay, (int p1), (p1)) +ITT_STUBV(void, test_seq_init, ( void* p1, int p2), (p1, p2)) +ITT_STUBV(void, test_seq_wait, ( void* p1, int p2), (p1, p2)) +#ifdef __cplusplus +} /* extern "C" */ +#endif /* __cplusplus */ + + +#ifndef INTEL_NO_ITTNOTIFY_API + +#define __ITTNOTIFY_VOID_CALL__(n) (!_ITTNOTIFY_NAME_(n)) ? (void)0 : _ITTNOTIFY_NAME_(n) +#define __ITTNOTIFY_DATA_CALL__(n) (!_ITTNOTIFY_NAME_(n)) ? 
0 : _ITTNOTIFY_NAME_(n) + +#define __itt_pause __ITTNOTIFY_VOID_CALL__(pause) +#define __itt_pause_ptr _ITTNOTIFY_NAME_(pause) + +#define __itt_resume __ITTNOTIFY_VOID_CALL__(resume) +#define __itt_resume_ptr _ITTNOTIFY_NAME_(resume) + +#if ITT_PLATFORM==ITT_PLATFORM_WIN + +#define __itt_mark_createA __ITTNOTIFY_DATA_CALL__(mark_createA) +#define __itt_mark_createA_ptr _ITTNOTIFY_NAME_(mark_createA) + +#define __itt_mark_createW __ITTNOTIFY_DATA_CALL__(mark_createW) +#define __itt_mark_createW_ptr _ITTNOTIFY_NAME_(mark_createW) + +#define __itt_markA __ITTNOTIFY_DATA_CALL__(markA) +#define __itt_markA_ptr _ITTNOTIFY_NAME_(markA) + +#define __itt_markW __ITTNOTIFY_DATA_CALL__(markW) +#define __itt_markW_ptr _ITTNOTIFY_NAME_(markW) + +#define __itt_mark_globalA __ITTNOTIFY_DATA_CALL__(mark_globalA) +#define __itt_mark_globalA_ptr _ITTNOTIFY_NAME_(mark_globalA) + +#define __itt_mark_globalW __ITTNOTIFY_DATA_CALL__(mark_globalW) +#define __itt_mark_globalW_ptr _ITTNOTIFY_NAME_(mark_globalW) + +#define __itt_thread_set_nameA __ITTNOTIFY_VOID_CALL__(thread_set_nameA) +#define __itt_thread_set_nameA_ptr _ITTNOTIFY_NAME_(thread_set_nameA) + +#define __itt_thread_set_nameW __ITTNOTIFY_VOID_CALL__(thread_set_nameW) +#define __itt_thread_set_nameW_ptr _ITTNOTIFY_NAME_(thread_set_nameW) + +#define __itt_sync_createA __ITTNOTIFY_VOID_CALL__(sync_createA) +#define __itt_sync_createA_ptr _ITTNOTIFY_NAME_(sync_createA) + +#define __itt_sync_createW __ITTNOTIFY_VOID_CALL__(sync_createW) +#define __itt_sync_createW_ptr _ITTNOTIFY_NAME_(sync_createW) + +#define __itt_sync_renameA __ITTNOTIFY_VOID_CALL__(sync_renameA) +#define __itt_sync_renameA_ptr _ITTNOTIFY_NAME_(sync_renameA) + +#define __itt_sync_renameW __ITTNOTIFY_VOID_CALL__(sync_renameW) +#define __itt_sync_renameW_ptr _ITTNOTIFY_NAME_(sync_renameW) +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +#define __itt_mark_create __ITTNOTIFY_DATA_CALL__(mark_create) +#define __itt_mark_create_ptr _ITTNOTIFY_NAME_(mark_create) + +#define __itt_mark __ITTNOTIFY_DATA_CALL__(mark) +#define __itt_mark_ptr _ITTNOTIFY_NAME_(mark) + +#define __itt_mark_global __ITTNOTIFY_DATA_CALL__(mark_global) +#define __itt_mark_global_ptr _ITTNOTIFY_NAME_(mark_global) + +#define __itt_sync_set_name __ITTNOTIFY_VOID_CALL__(sync_set_name) +#define __itt_sync_set_name_ptr _ITTNOTIFY_NAME_(sync_set_name) + +#define __itt_thread_set_name __ITTNOTIFY_VOID_CALL__(thread_set_name) +#define __itt_thread_set_name_ptr _ITTNOTIFY_NAME_(thread_set_name) + +#define __itt_sync_create __ITTNOTIFY_VOID_CALL__(sync_create) +#define __itt_sync_create_ptr _ITTNOTIFY_NAME_(sync_create) + +#define __itt_sync_rename __ITTNOTIFY_VOID_CALL__(sync_rename) +#define __itt_sync_rename_ptr _ITTNOTIFY_NAME_(sync_rename) +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +#define __itt_mark_off __ITTNOTIFY_DATA_CALL__(mark_off) +#define __itt_mark_off_ptr _ITTNOTIFY_NAME_(mark_off) + +#define __itt_thread_ignore __ITTNOTIFY_VOID_CALL__(thread_ignore) +#define __itt_thread_ignore_ptr _ITTNOTIFY_NAME_(thread_ignore) + +#define __itt_sync_prepare __ITTNOTIFY_VOID_CALL__(sync_prepare) +#define __itt_sync_prepare_ptr _ITTNOTIFY_NAME_(sync_prepare) + +#define __itt_sync_cancel __ITTNOTIFY_VOID_CALL__(sync_cancel) +#define __itt_sync_cancel_ptr _ITTNOTIFY_NAME_(sync_cancel) + +#define __itt_sync_acquired __ITTNOTIFY_VOID_CALL__(sync_acquired) +#define __itt_sync_acquired_ptr _ITTNOTIFY_NAME_(sync_acquired) + +#define __itt_sync_releasing __ITTNOTIFY_VOID_CALL__(sync_releasing) +#define __itt_sync_releasing_ptr 
_ITTNOTIFY_NAME_(sync_releasing) + +#define __itt_sync_released __ITTNOTIFY_VOID_CALL__(sync_released) +#define __itt_sync_released_ptr _ITTNOTIFY_NAME_(sync_released) + +#define __itt_fsync_prepare __ITTNOTIFY_VOID_CALL__(fsync_prepare) +#define __itt_fsync_prepare_ptr _ITTNOTIFY_NAME_(fsync_prepare) + +#define __itt_fsync_cancel __ITTNOTIFY_VOID_CALL__(fsync_cancel) +#define __itt_fsync_cancel_ptr _ITTNOTIFY_NAME_(fsync_cancel) + +#define __itt_fsync_acquired __ITTNOTIFY_VOID_CALL__(fsync_acquired) +#define __itt_fsync_acquired_ptr _ITTNOTIFY_NAME_(fsync_acquired) + +#define __itt_fsync_releasing __ITTNOTIFY_VOID_CALL__(fsync_releasing) +#define __itt_fsync_releasing_ptr _ITTNOTIFY_NAME_(fsync_releasing) + +#define __itt_fsync_released __ITTNOTIFY_VOID_CALL__(fsync_released) +#define __itt_fsync_released_ptr _ITTNOTIFY_NAME_(fsync_released) + +#define __itt_sync_destroy __ITTNOTIFY_VOID_CALL__(sync_destroy) +#define __itt_sync_destroy_ptr _ITTNOTIFY_NAME_(sync_destroy) + +#define __itt_notify_sync_prepare __ITTNOTIFY_VOID_CALL__(notify_sync_prepare) +#define __itt_notify_sync_prepare_ptr _ITTNOTIFY_NAME_(notify_sync_prepare) + +#define __itt_notify_sync_cancel __ITTNOTIFY_VOID_CALL__(notify_sync_cancel) +#define __itt_notify_sync_cancel_ptr _ITTNOTIFY_NAME_(notify_sync_cancel) + +#define __itt_notify_sync_acquired __ITTNOTIFY_VOID_CALL__(notify_sync_acquired) +#define __itt_notify_sync_acquired_ptr _ITTNOTIFY_NAME_(notify_sync_acquired) + +#define __itt_notify_sync_releasing __ITTNOTIFY_VOID_CALL__(notify_sync_releasing) +#define __itt_notify_sync_releasing_ptr _ITTNOTIFY_NAME_(notify_sync_releasing) + +#define __itt_notify_cpath_target __ITTNOTIFY_VOID_CALL__(notify_cpath_target) +#define __itt_notify_cpath_target_ptr _ITTNOTIFY_NAME_(notify_cpath_target) + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +#define __itt_sync_set_nameA __ITTNOTIFY_VOID_CALL__(sync_set_nameA) +#define __itt_sync_set_nameA_ptr _ITTNOTIFY_NAME_(sync_set_nameA) + +#define __itt_sync_set_nameW __ITTNOTIFY_VOID_CALL__(sync_set_nameW) +#define __itt_sync_set_nameW_ptr _ITTNOTIFY_NAME_(sync_set_nameW) + +#define __itt_thr_name_setA __ITTNOTIFY_DATA_CALL__(thr_name_setA) +#define __itt_thr_name_setA_ptr _ITTNOTIFY_NAME_(thr_name_setA) + +#define __itt_thr_name_setW __ITTNOTIFY_DATA_CALL__(thr_name_setW) +#define __itt_thr_name_setW_ptr _ITTNOTIFY_NAME_(thr_name_setW) + +#define __itt_event_createA __ITTNOTIFY_DATA_CALL__(event_createA) +#define __itt_event_createA_ptr _ITTNOTIFY_NAME_(event_createA) + +#define __itt_event_createW __ITTNOTIFY_DATA_CALL__(event_createW) +#define __itt_event_createW_ptr _ITTNOTIFY_NAME_(event_createW) +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +#define __itt_thr_name_set __ITTNOTIFY_DATA_CALL__(thr_name_set) +#define __itt_thr_name_set_ptr _ITTNOTIFY_NAME_(thr_name_set) + +#define __itt_event_create __ITTNOTIFY_DATA_CALL__(event_create) +#define __itt_event_create_ptr _ITTNOTIFY_NAME_(event_create) +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +#define __itt_thr_ignore __ITTNOTIFY_VOID_CALL__(thr_ignore) +#define __itt_thr_ignore_ptr _ITTNOTIFY_NAME_(thr_ignore) + +#define __itt_event_start __ITTNOTIFY_DATA_CALL__(event_start) +#define __itt_event_start_ptr _ITTNOTIFY_NAME_(event_start) + +#define __itt_event_end __ITTNOTIFY_DATA_CALL__(event_end) +#define __itt_event_end_ptr _ITTNOTIFY_NAME_(event_end) + +#define __itt_state_get __ITTNOTIFY_DATA_CALL__(state_get) +#define __itt_state_get_ptr _ITTNOTIFY_NAME_(state_get) + +#define __itt_state_set __ITTNOTIFY_DATA_CALL__(state_set) +#define 
__itt_state_set_ptr _ITTNOTIFY_NAME_(state_set) + +#define __itt_obj_mode_set __ITTNOTIFY_DATA_CALL__(obj_mode_set) +#define __itt_obj_mode_set_ptr _ITTNOTIFY_NAME_(obj_mode_set) + +#define __itt_thr_mode_set __ITTNOTIFY_DATA_CALL__(thr_mode_set) +#define __itt_thr_mode_set_ptr _ITTNOTIFY_NAME_(thr_mode_set) + +#define __itt_api_version __ITTNOTIFY_DATA_CALL__(api_version) +#define __itt_api_version_ptr _ITTNOTIFY_NAME_(api_version) + +#define __itt_jit_notify_event __ITTNOTIFY_DATA_CALL__(jit_notify_event) +#define __itt_jit_notify_event_ptr _ITTNOTIFY_NAME_(jit_notify_event) + +#define __itt_jit_get_new_method_id __ITTNOTIFY_DATA_CALL__(jit_get_new_method_id) +#define __itt_jit_get_new_method_id_ptr _ITTNOTIFY_NAME_(jit_get_new_method_id) + +#define __itt_memory_read __ITTNOTIFY_VOID_CALL__(memory_read) +#define __itt_memory_read_ptr _ITTNOTIFY_NAME_(memory_read) + +#define __itt_memory_write __ITTNOTIFY_VOID_CALL__(memory_write) +#define __itt_memory_write_ptr _ITTNOTIFY_NAME_(memory_write) + +#define __itt_memory_update __ITTNOTIFY_VOID_CALL__(memory_update) +#define __itt_memory_update_ptr _ITTNOTIFY_NAME_(memory_update) + + +#define __itt_test_delay __ITTNOTIFY_VOID_CALL__(test_delay) +#define __itt_test_delay_ptr _ITTNOTIFY_NAME_(test_delay) + +#define __itt_test_seq_init __ITTNOTIFY_VOID_CALL__(test_seq_init) +#define __itt_test_seq_init_ptr _ITTNOTIFY_NAME_(test_seq_init) + +#define __itt_test_seq_wait __ITTNOTIFY_VOID_CALL__(test_seq_wait) +#define __itt_test_seq_wait_ptr _ITTNOTIFY_NAME_(test_seq_wait) + +#define __itt_set_error_handler ITT_JOIN(INTEL_ITTNOTIFY_PREFIX, set_error_handler) + +#else /* INTEL_NO_ITTNOTIFY_API */ + +#define __itt_pause() +#define __itt_pause_ptr 0 + +#define __itt_resume() +#define __itt_resume_ptr 0 + +#if ITT_PLATFORM==ITT_PLATFORM_WIN + +#define __itt_mark_createA(name) (__itt_mark_type)0 +#define __itt_mark_createA_ptr 0 + +#define __itt_mark_createW(name) (__itt_mark_type)0 +#define __itt_mark_createW_ptr 0 + +#define __itt_markA(mt,parameter) (int)0 +#define __itt_markA_ptr 0 + +#define __itt_markW(mt,parameter) (int)0 +#define __itt_markW_ptr 0 + +#define __itt_mark_globalA(mt,parameter) (int)0 +#define __itt_mark_globalA_ptr 0 + +#define __itt_mark_globalW(mt,parameter) (int)0 +#define __itt_mark_globalW_ptr 0 + +#define __itt_thread_set_nameA(name) +#define __itt_thread_set_nameA_ptr 0 + +#define __itt_thread_set_nameW(name) +#define __itt_thread_set_nameW_ptr 0 + +#define __itt_sync_createA(addr, objtype, objname, attribute) +#define __itt_sync_createA_ptr 0 + +#define __itt_sync_createW(addr, objtype, objname, attribute) +#define __itt_sync_createW_ptr 0 + +#define __itt_sync_renameA(addr, name) +#define __itt_sync_renameA_ptr 0 + +#define __itt_sync_renameW(addr, name) +#define __itt_sync_renameW_ptr 0 +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +#define __itt_mark_create(name) (__itt_mark_type)0 +#define __itt_mark_create_ptr 0 + +#define __itt_mark(mt,parameter) (int)0 +#define __itt_mark_ptr 0 + +#define __itt_mark_global(mt,parameter) (int)0 +#define __itt_mark_global_ptr 0 + +#define __itt_sync_set_name(addr,objtype,objname,attribute) +#define __itt_sync_set_name_ptr 0 + +#define __itt_thread_set_name(name) +#define __itt_thread_set_name_ptr 0 + +#define __itt_sync_create(addr, objtype, objname, attribute) +#define __itt_sync_create_ptr 0 + +#define __itt_sync_rename(addr, name) +#define __itt_sync_rename_ptr 0 +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +#define __itt_mark_off(mt) (int)0 +#define __itt_mark_off_ptr 0 + +#define 
__itt_thread_ignore() +#define __itt_thread_ignore_ptr 0 + +#define __itt_sync_prepare(addr) +#define __itt_sync_prepare_ptr 0 + +#define __itt_sync_cancel(addr) +#define __itt_sync_cancel_ptr 0 + +#define __itt_sync_acquired(addr) +#define __itt_sync_acquired_ptr 0 + +#define __itt_sync_releasing(addr) +#define __itt_sync_releasing_ptr 0 + +#define __itt_sync_released(addr) +#define __itt_sync_released_ptr 0 + +#define __itt_fsync_prepare(addr) +#define __itt_fsync_prepare_ptr 0 + +#define __itt_fsync_cancel(addr) +#define __itt_fsync_cancel_ptr 0 + +#define __itt_fsync_acquired(addr) +#define __itt_fsync_acquired_ptr 0 + +#define __itt_fsync_releasing(addr) +#define __itt_fsync_releasing_ptr 0 + +#define __itt_fsync_released(addr) +#define __itt_fsync_released_ptr 0 + +#define __itt_sync_destroy(addr) +#define __itt_sync_destroy_ptr 0 + +#define __itt_notify_sync_prepare(p) +#define __itt_notify_sync_prepare_ptr 0 + +#define __itt_notify_sync_cancel(p) +#define __itt_notify_sync_cancel_ptr 0 + +#define __itt_notify_sync_acquired(p) +#define __itt_notify_sync_acquired_ptr 0 + +#define __itt_notify_sync_releasing(p) +#define __itt_notify_sync_releasing_ptr 0 + +#define __itt_notify_cpath_target() +#define __itt_notify_cpath_target_ptr 0 + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +#define __itt_sync_set_nameA(addr,objtype,objname,attribute) +#define __itt_sync_set_nameA_ptr 0 + +#define __itt_sync_set_nameW(addr,objtype,objname,attribute) +#define __itt_sync_set_nameW_ptr 0 + +#define __itt_thr_name_setA(name,namelen) (int)0 +#define __itt_thr_name_setA_ptr 0 + +#define __itt_thr_name_setW(name,namelen) (int)0 +#define __itt_thr_name_setW_ptr 0 + +#define __itt_event_createA(name,namelen) (__itt_event)0 +#define __itt_event_createA_ptr 0 + +#define __itt_event_createW(name,namelen) (__itt_event)0 +#define __itt_event_createW_ptr 0 +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +#define __itt_thr_name_set(name,namelen) (int)0 +#define __itt_thr_name_set_ptr 0 + +#define __itt_event_create(name,namelen) (__itt_event)0 +#define __itt_event_create_ptr 0 +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +#define __itt_thr_ignore() +#define __itt_thr_ignore_ptr 0 + +#define __itt_event_start(event) (int)0 +#define __itt_event_start_ptr 0 + +#define __itt_event_end(event) (int)0 +#define __itt_event_end_ptr 0 + +#define __itt_state_get() (__itt_state_t)0 +#define __itt_state_get_ptr 0 + +#define __itt_state_set(state) (__itt_state_t)0 +#define __itt_state_set_ptr 0 + +#define __itt_obj_mode_set(prop, state) (__itt_obj_state_t)0 +#define __itt_obj_mode_set_ptr 0 + +#define __itt_thr_mode_set(prop, state) (__itt_thr_state_t)0 +#define __itt_thr_mode_set_ptr 0 + +#define __itt_api_version() (const char*)0 +#define __itt_api_version_ptr 0 + +#define __itt_jit_notify_event(event_type,event_data) (int)0 +#define __itt_jit_notify_event_ptr 0 + +#define __itt_jit_get_new_method_id() (unsigned int)0 +#define __itt_jit_get_new_method_id_ptr 0 + +#define __itt_memory_read(address, size) +#define __itt_memory_read_ptr 0 + +#define __itt_memory_write(address, size) +#define __itt_memory_write_ptr 0 + +#define __itt_memory_update(address, size) +#define __itt_memory_update_ptr 0 + +#define __itt_test_delay(p1) +#define __itt_test_delay_ptr 0 + +#define __itt_test_seq_init(p1,p2) +#define __itt_test_seq_init_ptr 0 + +#define __itt_test_seq_wait(p1,p2) +#define __itt_test_seq_wait_ptr 0 + +#define __itt_set_error_handler(x) + +#endif /* INTEL_NO_ITTNOTIFY_API */ + +#endif /* _ITTNOTIFY_H_MACRO_BODY_ */ + +#endif /* 
_ITTNOTIFY_H_ */ +/** @endcond */ + diff --git a/dep/tbb/src/tbb/tools_api/ittnotify_static.c b/dep/tbb/src/tbb/tools_api/ittnotify_static.c new file mode 100644 index 000000000..d03758bc6 --- /dev/null +++ b/dep/tbb/src/tbb/tools_api/ittnotify_static.c @@ -0,0 +1,577 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "_config.h" + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +#include +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +#include +#include +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +#include +#include +#include + +#define __ITT_INTERNAL_INCLUDE + +#define _ITTNOTIFY_H_MACRO_BODY_ + +#include "_disable_warnings.h" + +#include "ittnotify.h" + +#ifdef __cplusplus +# define ITT_EXTERN_C extern "C" +#else +# define ITT_EXTERN_C /* nothing */ +#endif /* __cplusplus */ + +#ifndef __itt_init_lib_name +# define __itt_init_lib_name __itt_init_lib +#endif /* __itt_init_lib_name */ + +static int __itt_init_lib(void); + +#ifndef INTEL_ITTNOTIFY_PREFIX +#define INTEL_ITTNOTIFY_PREFIX __itt_ +#endif /* INTEL_ITTNOTIFY_PREFIX */ +#ifndef INTEL_ITTNOTIFY_POSTFIX +# define INTEL_ITTNOTIFY_POSTFIX _ptr_ +#endif /* INTEL_ITTNOTIFY_POSTFIX */ + +#define ___N_(p,n) p##n +#define __N_(p,n) ___N_(p,n) +#define _N_(n) __N_(INTEL_ITTNOTIFY_PREFIX,n) + +/* building pointers to imported funcs */ +#undef ITT_STUBV +#undef ITT_STUB +#define ITT_STUB(type,name,args,params,ptr,group) \ + static type ITTAPI_CALL ITT_JOIN(_N_(name),_init) args; \ + typedef type ITTAPI_CALL name##_t args; \ + ITT_EXTERN_C name##_t* ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX) = ITT_JOIN(_N_(name),_init); \ + static type ITTAPI_CALL ITT_JOIN(_N_(name),_init) args \ + { \ + __itt_init_lib_name(); \ + if(ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX)) \ + return ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX) params; \ + else \ + return (type)0; \ + } + +#define ITT_STUBV(type,name,args,params,ptr,group) \ + static type ITTAPI_CALL ITT_JOIN(_N_(name),_init) args; \ + typedef type ITTAPI_CALL name##_t args; \ + ITT_EXTERN_C name##_t* ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX) = ITT_JOIN(_N_(name),_init); \ + static type ITTAPI_CALL ITT_JOIN(_N_(name),_init) args \ + { \ + __itt_init_lib_name(); \ + if(ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX)) \ + 
ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX) params; \ + else \ + return; \ + } + +const __itt_state_t _N_(state_err) = 0; +const __itt_event _N_(event_err) = 0; +const int _N_(err) = 0; + +#include "_ittnotify_static.h" + +typedef enum ___itt_group_id +{ + __itt_none_group = 0, + __itt_control_group = 1, + __itt_thread_group = 2, + __itt_mark_group = 4, + __itt_sync_group = 8, + __itt_fsync_group = 16, + __itt_jit_group = 32, + __itt_all_group = -1 +} __itt_group_id; + + +#ifndef CDECL +#if ITT_PLATFORM==ITT_PLATFORM_WIN +# define CDECL __cdecl +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +# define CDECL +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +#endif /* CDECL */ + +#ifndef STDCALL +#if ITT_PLATFORM==ITT_PLATFORM_WIN +# define STDCALL __stdcall +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +# define STDCALL +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +#endif /* STDCALL */ + +#if ITT_PLATFORM==ITT_PLATFORM_WIN + typedef FARPROC FPTR; +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + typedef void* FPTR; +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + + +/* OS communication functions */ +#if ITT_PLATFORM==ITT_PLATFORM_WIN +typedef HMODULE lib_t; +typedef CRITICAL_SECTION mutex_t; +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +typedef void* lib_t; +typedef pthread_mutex_t mutex_t; +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +static lib_t ittnotify_lib; + +static __itt_error_notification_t* error_handler = 0; + +#if ITT_OS==ITT_OS_WIN +static const char* ittnotify_lib_name = "libittnotify.dll"; +#elif ITT_OS==ITT_OS_LINUX +static const char* ittnotify_lib_name = "libittnotify.so"; +#elif ITT_OS==ITT_OS_MAC +static const char* ittnotify_lib_name = "libittnotify.dylib"; +#else +#error Unsupported or unknown OS. +#endif + +#ifndef LIB_VAR_NAME +#if ITT_ARCH==ITT_ARCH_IA32 +#define LIB_VAR_NAME INTEL_LIBITTNOTIFY32 +#else +#define LIB_VAR_NAME INTEL_LIBITTNOTIFY64 +#endif +#endif /* LIB_VAR_NAME */ + +#define __TO_STR(x) #x +#define _TO_STR(x) __TO_STR(x) + +static int __itt_fstrcmp(const char* s1, const char* s2) +{ + int i; + + if(!s1 && !s2) + return 0; + else if(!s1 && s2) + return -1; + else if(s1 && !s2) + return 1; + + for(i = 0; s1[i] || s2[i]; i++) + if(s1[i] > s2[i]) + return 1; + else if(s1[i] < s2[i]) + return -1; + return 0; +} + +static const char* __itt_fsplit(const char* s, const char* sep, const char** out, int* len) +{ + int i; + int j; + + if(!s || !sep || !out || !len) + return 0; + + for(i = 0; s[i]; i++) + { + int b = 0; + for(j = 0; sep[j]; j++) + if(s[i] == sep[j]) + { + b = 1; + break; + } + if(!b) + break; + } + + if(!s[i]) + return 0; + + *len = 0; + *out = s + i; + + for(; s[i]; i++, (*len)++) + { + int b = 0; + for(j = 0; sep[j]; j++) + if(s[i] == sep[j]) + { + b = 1; + break; + } + if(b) + break; + } + + for(; s[i]; i++) + { + int b = 0; + for(j = 0; sep[j]; j++) + if(s[i] == sep[j]) + { + b = 1; + break; + } + if(!b) + break; + } + + return s + i; +} + +static char* __itt_fstrcpyn(char* dst, const char* src, int len) +{ + int i; + + if(!src || !dst) + return 0; + + for(i = 0; i < len; i++) + dst[i] = src[i]; + dst[len] = 0; + return dst; +} + +#ifdef ITT_NOTIFY_EXT_REPORT +# define ERROR_HANDLER ITT_JOIN(INTEL_ITTNOTIFY_PREFIX, error_handler) +ITT_EXTERN_C void ERROR_HANDLER(__itt_error_code, const char* msg); +#endif /* ITT_NOTIFY_EXT_REPORT */ + +static void __itt_report_error(__itt_error_code code, const char* msg) +{ + if(error_handler) + error_handler(code, msg); +#ifdef ITT_NOTIFY_EXT_REPORT + ERROR_HANDLER(code, msg); +#endif /* ITT_NOTIFY_EXT_REPORT */ +} + 
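The ITT_STUB/ITT_STUBV machinery above boils down to lazy function-pointer dispatch: each notification entry point starts out pointing at an _init stub that loads the collector library once, then either forwards the call through the resolved pointer or degrades to a no-op. The standalone sketch below (illustrative only, not part of the TBB sources; the demo_* names are hypothetical stand-ins for the entries generated from _ittnotify_static.h and resolved by __itt_init_lib) shows that shape.

// Illustrative sketch of the pattern the ITT_STUBV macro expands to.
// Signature of one ITT-style notification entry point.
typedef void demo_sync_prepare_t(void* addr);

// Filled in by demo_init_lib() when a collector library is found; stays null otherwise.
static demo_sync_prepare_t* demo_sync_prepare_ptr = 0;
static bool demo_initialized = false;

static void demo_init_lib() {
    if (demo_initialized)
        return;
    demo_initialized = true;
    // The real __itt_init_lib() dlopen()s / LoadLibrary()s libittnotify and fills
    // the func_map[] pointer table via dlsym/GetProcAddress; this sketch resolves nothing.
}

// Shape of the wrapper generated for a void entry point:
// lazily initialize, then either forward through the pointer or do nothing.
void demo_sync_prepare(void* addr) {
    demo_init_lib();
    if (demo_sync_prepare_ptr)
        demo_sync_prepare_ptr(addr);
}

int main() {
    int object = 0;
    demo_sync_prepare(&object); // harmless no-op when no collector library is present
    return 0;
}

Once the pointers are resolved, the fast path is a single indirect call, and when no collector is attached the instrumentation costs essentially nothing.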
+static const char* __itt_get_env_var(const char* name) +{ + static char env_value[4096]; +#if ITT_PLATFORM==ITT_PLATFORM_WIN + int i; + DWORD rc; + for(i = 0; i < sizeof(env_value); i++) + env_value[i] = 0; + rc = GetEnvironmentVariableA(name, env_value, sizeof(env_value) - 1); + if(rc >= sizeof(env_value)) + __itt_report_error(__itt_error_cant_read_env, name); + else if(!rc) + return 0; + else + return env_value; +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + char* env = getenv(name); + int i; + for(i = 0; i < sizeof(env_value); i++) + env_value[i] = 0; + if(env) + { + if(strlen(env) >= sizeof(env_value)) + { + __itt_report_error(__itt_error_cant_read_env, name); + return 0; + } + strncpy(env_value, env, sizeof(env_value) - 1); + return env_value; + } +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + return 0; +} + +static const char* __itt_get_lib_name() +{ + const char* lib_name = __itt_get_env_var(_TO_STR(LIB_VAR_NAME)); + if(!lib_name) + lib_name = ittnotify_lib_name; + + return lib_name; +} + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +# define __itt_get_proc(lib, name) GetProcAddress(lib, name) +# define __itt_init_mutex(mutex) InitializeCriticalSection(mutex) +# define __itt_mutex_lock(mutex) EnterCriticalSection(mutex) +# define __itt_mutex_unlock(mutex) LeaveCriticalSection(mutex) +# define __itt_load_lib(name) LoadLibraryA(name) +#else /* ITT_PLATFORM==ITT_PLATFORM_WIN */ +# define __itt_get_proc(lib, name) dlsym(lib, name) +# define __itt_init_mutex(mutex) pthread_mutex_init(mutex, 0) +# define __itt_mutex_lock(mutex) pthread_mutex_lock(mutex) +# define __itt_mutex_unlock(mutex) pthread_mutex_unlock(mutex) +# define __itt_load_lib(name) dlopen(name, RTLD_LAZY) +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +#ifndef ITT_SIMPLE_INIT +/* function stubs */ + +#undef ITT_STUBV +#undef ITT_STUB + +#define ITT_STUBV(type,name,args,params,ptr,group) \ +ITT_EXTERN_C type ITTAPI_CALL _N_(name) args \ +{ \ + __itt_init_lib_name(); \ + if(ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX)) \ + ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX) params; \ + else \ + return; \ +} + +#define ITT_STUB(type,name,args,params,ptr,group) \ +ITT_EXTERN_C type ITTAPI_CALL _N_(name) args \ +{ \ + __itt_init_lib_name(); \ + if(ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX)) \ + return ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX) params; \ + else \ + return (type)0; \ +} + +#include "_ittnotify_static.h" + +#endif /* ITT_SIMPLE_INIT */ + +typedef struct ___itt_group_list +{ + __itt_group_id id; + const char* name; +} __itt_group_list; + +static __itt_group_list group_list[] = { + {__itt_control_group, "control"}, + {__itt_thread_group, "thread"}, + {__itt_mark_group, "mark"}, + {__itt_sync_group, "sync"}, + {__itt_fsync_group, "fsync"}, + {__itt_jit_group, "jit"}, + {__itt_all_group, "all"}, + {__itt_none_group, 0} +}; + +typedef struct ___itt_group_alias +{ + const char* env_var; + __itt_group_id groups; +} __itt_group_alias; + +static __itt_group_alias group_alias[] = { + {"KMP_FOR_TPROFILE", (__itt_group_id)(__itt_control_group | __itt_thread_group | __itt_sync_group | __itt_mark_group)}, + {"KMP_FOR_TCHECK", (__itt_group_id)(__itt_control_group | __itt_thread_group | __itt_fsync_group | __itt_mark_group)}, + {0, __itt_none_group} +}; + +typedef struct ___itt_func_map +{ + const char* name; + void** func_ptr; + __itt_group_id group; +} __itt_func_map; + + +#define _P_(name) ITT_JOIN(_N_(name),INTEL_ITTNOTIFY_POSTFIX) + +#define ITT_STRINGIZE_AUX(p) #p +#define ITT_STRINGIZE(p) ITT_STRINGIZE_AUX(p) + +#define 
__ptr_(pname,name,group) {ITT_STRINGIZE(ITT_JOIN(__itt_,pname)), (void**)(void*)&_P_(name), (__itt_group_id)(group)}, + +#undef ITT_STUB +#undef ITT_STUBV + +#define ITT_STUB(type,name,args,params,ptr,group) __ptr_(ptr,name,group) +#define ITT_STUBV ITT_STUB + +static __itt_func_map func_map[] = { +#include "_ittnotify_static.h" + {0, 0, __itt_none_group} +}; + +static __itt_group_id __itt_get_groups() +{ + __itt_group_id res = __itt_none_group; + + const char* group_str = __itt_get_env_var("INTEL_ITTNOTIFY_GROUPS"); + if(group_str) + { + char gr[255]; + const char* chunk; + int len; + while((group_str = __itt_fsplit(group_str, ",; ", &chunk, &len)) != 0) + { + int j; + int group_detected = 0; + __itt_fstrcpyn(gr, chunk, len); + for(j = 0; group_list[j].name; j++) + { + if(!__itt_fstrcmp(gr, group_list[j].name)) + { + res = (__itt_group_id)(res | group_list[j].id); + group_detected = 1; + break; + } + } + + if(!group_detected) + __itt_report_error(__itt_error_unknown_group, gr); + } + return res; + } + else + { + int i; + for(i = 0; group_alias[i].env_var; i++) + if(__itt_get_env_var(group_alias[i].env_var)) + return group_alias[i].groups; + } + + return res; +} + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +#pragma warning(push) +#pragma warning(disable: 4054) +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ + +static int __itt_init_lib() +{ + static volatile int init = 0; + static int result = 0; + +#ifndef ITT_SIMPLE_INIT + +#if ITT_PLATFORM==ITT_PLATFORM_POSIX + static mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; +#else + static volatile int mutex_initialized = 0; + static mutex_t mutex; + static LONG inter_counter = 0; +#endif + + if(!init) + { +#if ITT_PLATFORM==ITT_PLATFORM_WIN + if(!mutex_initialized) + { + if(InterlockedIncrement(&inter_counter) == 1) + { + __itt_init_mutex(&mutex); + mutex_initialized = 1; + } + else + while(!mutex_initialized) + SwitchToThread(); + } +#endif + + __itt_mutex_lock(&mutex); +#endif /* ITT_SIMPLE_INIT */ + if(!init) + { + int i; + + __itt_group_id groups = __itt_get_groups(); + + for(i = 0; func_map[i].name; i++) + *func_map[i].func_ptr = 0; + + if(groups != __itt_none_group) + { +#ifdef ITT_COMPLETE_GROUP + __itt_group_id zero_group = __itt_none_group; +#endif /* ITT_COMPLETE_GROUP */ + + ittnotify_lib = __itt_load_lib(__itt_get_lib_name()); + if(ittnotify_lib) + { + for(i = 0; func_map[i].name; i++) + { + if(func_map[i].name && func_map[i].func_ptr && (func_map[i].group & groups)) + { + *func_map[i].func_ptr = (void*)__itt_get_proc(ittnotify_lib, func_map[i].name); + if(!(*func_map[i].func_ptr) && func_map[i].name) + { + __itt_report_error(__itt_error_no_symbol, func_map[i].name); +#ifdef ITT_COMPLETE_GROUP + zero_group = (__itt_group_id)(zero_group | func_map[i].group); +#endif /* ITT_COMPLETE_GROUP */ + } + else + result = 1; + } + } + } + else + { + __itt_report_error(__itt_error_no_module, __itt_get_lib_name()); + } + +#ifdef ITT_COMPLETE_GROUP + for(i = 0; func_map[i].name; i++) + if(func_map[i].group & zero_group) + *func_map[i].func_ptr = 0; + + result = 0; + + for(i = 0; func_map[i].name; i++) /* evaluating if any function ptr is non empty */ + if(*func_map[i].func_ptr) + { + result = 1; + break; + } +#endif /* ITT_COMPLETE_GROUP */ + } + + init = 1; /* first checking of 'init' flag happened out of mutex, that is why setting flag to 1 */ + /* must be after call table is filled (to avoid condition races) */ + } +#ifndef ITT_SIMPLE_INIT + __itt_mutex_unlock(&mutex); + } +#endif /* ITT_SIMPLE_INIT */ + return result; +} + +#define SET_ERROR_HANDLER 
ITT_JOIN(INTEL_ITTNOTIFY_PREFIX, set_error_handler) + +ITT_EXTERN_C __itt_error_notification_t* SET_ERROR_HANDLER(__itt_error_notification_t* handler) +{ + __itt_error_notification_t* prev = error_handler; + error_handler = handler; + return prev; +} + +#if ITT_PLATFORM==ITT_PLATFORM_WIN +#pragma warning(pop) +#endif /* ITT_PLATFORM==ITT_PLATFORM_WIN */ diff --git a/dep/tbb/src/tbb/win32-tbb-export.def b/dep/tbb/src/tbb/win32-tbb-export.def new file mode 100644 index 000000000..d78bf6d6a --- /dev/null +++ b/dep/tbb/src/tbb/win32-tbb-export.def @@ -0,0 +1,261 @@ +; Copyright 2005-2009 Intel Corporation. All Rights Reserved. +; +; This file is part of Threading Building Blocks. +; +; Threading Building Blocks is free software; you can redistribute it +; and/or modify it under the terms of the GNU General Public License +; version 2 as published by the Free Software Foundation. +; +; Threading Building Blocks is distributed in the hope that it will be +; useful, but WITHOUT ANY WARRANTY; without even the implied warranty +; of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +; GNU General Public License for more details. +; +; You should have received a copy of the GNU General Public License +; along with Threading Building Blocks; if not, write to the Free Software +; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +; +; As a special exception, you may use this file as part of a free software +; library without restriction. Specifically, if other files instantiate +; templates or use macros or inline functions from this file, or you compile +; this file and link it with other files to produce an executable, this +; file does not by itself cause the resulting executable to be covered by +; the GNU General Public License. This exception does not however +; invalidate any other reasons why the executable file might be covered by +; the GNU General Public License. 
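The EXPORTS list that follows names the MSVC-decorated entry points the 32-bit tbb.dll publishes; since a plain module-definition file cannot contain #include or #if directives, the build evidently runs this file through the C preprocessor before handing it to the linker. As a point of orientation, here is a small hypothetical client (illustrative only, not part of the patch) that exercises the cache_aligned_allocator entry points listed below through the public header, which forwards to the exported tbb::internal::NFS_Allocate / NFS_Free symbols.

#include "tbb/cache_aligned_allocator.h"
#include <cstdio>

int main() {
    tbb::cache_aligned_allocator<double> alloc;
    // allocate() forwards to tbb::internal::NFS_Allocate, one of the decorated
    // symbols exported by this .def file; deallocate() forwards to NFS_Free.
    double* p = alloc.allocate(16);
    p[0] = 3.14;
    std::printf("cache-aligned block at %p, first element %g\n", (void*)p, p[0]);
    alloc.deallocate(p, 16);
    return 0;
}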
+ +#include "tbb/tbb_config.h" + +EXPORTS + +; Assembly-language support that is called directly by clients +;__TBB_machine_cmpswp1 +;__TBB_machine_cmpswp2 +;__TBB_machine_cmpswp4 +__TBB_machine_cmpswp8 +;__TBB_machine_fetchadd1 +;__TBB_machine_fetchadd2 +;__TBB_machine_fetchadd4 +__TBB_machine_fetchadd8 +;__TBB_machine_fetchstore1 +;__TBB_machine_fetchstore2 +;__TBB_machine_fetchstore4 +__TBB_machine_fetchstore8 +__TBB_machine_store8 +__TBB_machine_load8 +__TBB_machine_trylockbyte + +; cache_aligned_allocator.cpp +?NFS_Allocate@internal@tbb@@YAPAXIIPAX@Z +?NFS_GetLineSize@internal@tbb@@YAIXZ +?NFS_Free@internal@tbb@@YAXPAX@Z +?allocate_via_handler_v3@internal@tbb@@YAPAXI@Z +?deallocate_via_handler_v3@internal@tbb@@YAXPAX@Z +?is_malloc_used_v3@internal@tbb@@YA_NXZ + +; task.cpp v3 +?allocate@allocate_additional_child_of_proxy@internal@tbb@@QBEAAVtask@3@I@Z +?allocate@allocate_child_proxy@internal@tbb@@QBEAAVtask@3@I@Z +?allocate@allocate_continuation_proxy@internal@tbb@@QBEAAVtask@3@I@Z +?allocate@allocate_root_proxy@internal@tbb@@SAAAVtask@3@I@Z +?destroy@task@tbb@@QAEXAAV12@@Z +?free@allocate_additional_child_of_proxy@internal@tbb@@QBEXAAVtask@3@@Z +?free@allocate_child_proxy@internal@tbb@@QBEXAAVtask@3@@Z +?free@allocate_continuation_proxy@internal@tbb@@QBEXAAVtask@3@@Z +?free@allocate_root_proxy@internal@tbb@@SAXAAVtask@3@@Z +?internal_set_ref_count@task@tbb@@AAEXH@Z +?internal_decrement_ref_count@task@tbb@@AAEHXZ +?is_owned_by_current_thread@task@tbb@@QBE_NXZ +?note_affinity@task@tbb@@UAEXG@Z +?resize@affinity_partitioner_base_v3@internal@tbb@@AAEXI@Z +?self@task@tbb@@SAAAV12@XZ +?spawn_and_wait_for_all@task@tbb@@QAEXAAVtask_list@2@@Z +?default_num_threads@task_scheduler_init@tbb@@SAHXZ +?initialize@task_scheduler_init@tbb@@QAEXHI@Z +?initialize@task_scheduler_init@tbb@@QAEXH@Z +?terminate@task_scheduler_init@tbb@@QAEXXZ +?observe@task_scheduler_observer_v3@internal@tbb@@QAEX_N@Z + +; exception handling support +#if __TBB_EXCEPTIONS +?allocate@allocate_root_with_context_proxy@internal@tbb@@QBEAAVtask@3@I@Z +?free@allocate_root_with_context_proxy@internal@tbb@@QBEXAAVtask@3@@Z +?is_group_execution_cancelled@task_group_context@tbb@@QBE_NXZ +?cancel_group_execution@task_group_context@tbb@@QAE_NXZ +?reset@task_group_context@tbb@@QAEXXZ +?init@task_group_context@tbb@@IAEXXZ +?register_pending_exception@task_group_context@tbb@@QAEXXZ +??1task_group_context@tbb@@QAE@XZ +?name@captured_exception@tbb@@UBEPBDXZ +?what@captured_exception@tbb@@UBEPBDXZ +??1captured_exception@tbb@@UAE@XZ +?move@captured_exception@tbb@@UAEPAV12@XZ +?destroy@captured_exception@tbb@@UAEXXZ +?set@captured_exception@tbb@@QAEXPBD0@Z +?clear@captured_exception@tbb@@QAEXXZ +#endif /* __TBB_EXCEPTIONS */ + +; tbb_misc.cpp +?assertion_failure@tbb@@YAXPBDH00@Z +?get_initial_auto_partitioner_divisor@internal@tbb@@YAIXZ +?handle_perror@internal@tbb@@YAXHPBD@Z +?set_assertion_handler@tbb@@YAP6AXPBDH00@ZP6AX0H00@Z@Z +?runtime_warning@internal@tbb@@YAXPBDZZ +TBB_runtime_interface_version +?throw_bad_last_alloc_exception_v4@internal@tbb@@YAXXZ + +; itt_notify.cpp +?itt_load_pointer_with_acquire_v3@internal@tbb@@YAPAXPBX@Z +?itt_store_pointer_with_release_v3@internal@tbb@@YAXPAX0@Z +?itt_set_sync_name_v3@internal@tbb@@YAXPAXPB_W@Z +?itt_load_pointer_v3@internal@tbb@@YAPAXPBX@Z + +; pipeline.cpp +??0pipeline@tbb@@QAE@XZ +??1filter@tbb@@UAE@XZ +??1pipeline@tbb@@UAE@XZ +??_7pipeline@tbb@@6B@ +?add_filter@pipeline@tbb@@QAEXAAVfilter@2@@Z +?clear@pipeline@tbb@@QAEXXZ +?inject_token@pipeline@tbb@@AAEXAAVtask@2@@Z 
+?run@pipeline@tbb@@QAEXI@Z +#if __TBB_EXCEPTIONS +?run@pipeline@tbb@@QAEXIAAVtask_group_context@2@@Z +#endif +?process_item@thread_bound_filter@tbb@@QAE?AW4result_type@12@XZ +?try_process_item@thread_bound_filter@tbb@@QAE?AW4result_type@12@XZ + +; queuing_rw_mutex.cpp +?internal_construct@queuing_rw_mutex@tbb@@QAEXXZ +?acquire@scoped_lock@queuing_rw_mutex@tbb@@QAEXAAV23@_N@Z +?downgrade_to_reader@scoped_lock@queuing_rw_mutex@tbb@@QAE_NXZ +?release@scoped_lock@queuing_rw_mutex@tbb@@QAEXXZ +?upgrade_to_writer@scoped_lock@queuing_rw_mutex@tbb@@QAE_NXZ +?try_acquire@scoped_lock@queuing_rw_mutex@tbb@@QAE_NAAV23@_N@Z + +#if !TBB_NO_LEGACY +; spin_rw_mutex.cpp v2 +?internal_acquire_reader@spin_rw_mutex@tbb@@CAXPAV12@@Z +?internal_acquire_writer@spin_rw_mutex@tbb@@CA_NPAV12@@Z +?internal_downgrade@spin_rw_mutex@tbb@@CAXPAV12@@Z +?internal_itt_releasing@spin_rw_mutex@tbb@@CAXPAV12@@Z +?internal_release_reader@spin_rw_mutex@tbb@@CAXPAV12@@Z +?internal_release_writer@spin_rw_mutex@tbb@@CAXPAV12@@Z +?internal_upgrade@spin_rw_mutex@tbb@@CA_NPAV12@@Z +?internal_try_acquire_writer@spin_rw_mutex@tbb@@CA_NPAV12@@Z +?internal_try_acquire_reader@spin_rw_mutex@tbb@@CA_NPAV12@@Z +#endif + +; spin_rw_mutex v3 +?internal_construct@spin_rw_mutex_v3@tbb@@AAEXXZ +?internal_upgrade@spin_rw_mutex_v3@tbb@@AAE_NXZ +?internal_downgrade@spin_rw_mutex_v3@tbb@@AAEXXZ +?internal_acquire_reader@spin_rw_mutex_v3@tbb@@AAEXXZ +?internal_acquire_writer@spin_rw_mutex_v3@tbb@@AAE_NXZ +?internal_release_reader@spin_rw_mutex_v3@tbb@@AAEXXZ +?internal_release_writer@spin_rw_mutex_v3@tbb@@AAEXXZ +?internal_try_acquire_reader@spin_rw_mutex_v3@tbb@@AAE_NXZ +?internal_try_acquire_writer@spin_rw_mutex_v3@tbb@@AAE_NXZ + +; spin_mutex.cpp +?internal_construct@spin_mutex@tbb@@QAEXXZ +?internal_acquire@scoped_lock@spin_mutex@tbb@@AAEXAAV23@@Z +?internal_release@scoped_lock@spin_mutex@tbb@@AAEXXZ +?internal_try_acquire@scoped_lock@spin_mutex@tbb@@AAE_NAAV23@@Z + +; mutex.cpp +?internal_acquire@scoped_lock@mutex@tbb@@AAEXAAV23@@Z +?internal_release@scoped_lock@mutex@tbb@@AAEXXZ +?internal_try_acquire@scoped_lock@mutex@tbb@@AAE_NAAV23@@Z +?internal_construct@mutex@tbb@@AAEXXZ +?internal_destroy@mutex@tbb@@AAEXXZ + +; recursive_mutex.cpp +?internal_acquire@scoped_lock@recursive_mutex@tbb@@AAEXAAV23@@Z +?internal_release@scoped_lock@recursive_mutex@tbb@@AAEXXZ +?internal_try_acquire@scoped_lock@recursive_mutex@tbb@@AAE_NAAV23@@Z +?internal_construct@recursive_mutex@tbb@@AAEXXZ +?internal_destroy@recursive_mutex@tbb@@AAEXXZ + +; queuing_mutex.cpp +?internal_construct@queuing_mutex@tbb@@QAEXXZ +?acquire@scoped_lock@queuing_mutex@tbb@@QAEXAAV23@@Z +?release@scoped_lock@queuing_mutex@tbb@@QAEXXZ +?try_acquire@scoped_lock@queuing_mutex@tbb@@QAE_NAAV23@@Z + +#if !TBB_NO_LEGACY +; concurrent_hash_map.cpp +?internal_grow_predicate@hash_map_segment_base@internal@tbb@@QBE_NXZ + +; concurrent_queue.cpp v2 +?advance@concurrent_queue_iterator_base@internal@tbb@@IAEXXZ +?assign@concurrent_queue_iterator_base@internal@tbb@@IAEXABV123@@Z +?internal_size@concurrent_queue_base@internal@tbb@@IBEHXZ +??0concurrent_queue_base@internal@tbb@@IAE@I@Z +??0concurrent_queue_iterator_base@internal@tbb@@IAE@ABVconcurrent_queue_base@12@@Z +??1concurrent_queue_base@internal@tbb@@MAE@XZ +??1concurrent_queue_iterator_base@internal@tbb@@IAE@XZ +?internal_pop@concurrent_queue_base@internal@tbb@@IAEXPAX@Z +?internal_pop_if_present@concurrent_queue_base@internal@tbb@@IAE_NPAX@Z +?internal_push@concurrent_queue_base@internal@tbb@@IAEXPBX@Z 
+?internal_push_if_not_full@concurrent_queue_base@internal@tbb@@IAE_NPBX@Z +?internal_set_capacity@concurrent_queue_base@internal@tbb@@IAEXHI@Z +#endif + +; concurrent_queue v3 +??1concurrent_queue_iterator_base_v3@internal@tbb@@IAE@XZ +??0concurrent_queue_iterator_base_v3@internal@tbb@@IAE@ABVconcurrent_queue_base_v3@12@@Z +?advance@concurrent_queue_iterator_base_v3@internal@tbb@@IAEXXZ +?assign@concurrent_queue_iterator_base_v3@internal@tbb@@IAEXABV123@@Z +??0concurrent_queue_base_v3@internal@tbb@@IAE@I@Z +??1concurrent_queue_base_v3@internal@tbb@@MAE@XZ +?internal_pop@concurrent_queue_base_v3@internal@tbb@@IAEXPAX@Z +?internal_pop_if_present@concurrent_queue_base_v3@internal@tbb@@IAE_NPAX@Z +?internal_push@concurrent_queue_base_v3@internal@tbb@@IAEXPBX@Z +?internal_push_if_not_full@concurrent_queue_base_v3@internal@tbb@@IAE_NPBX@Z +?internal_size@concurrent_queue_base_v3@internal@tbb@@IBEHXZ +?internal_empty@concurrent_queue_base_v3@internal@tbb@@IBE_NXZ +?internal_set_capacity@concurrent_queue_base_v3@internal@tbb@@IAEXHI@Z +?internal_finish_clear@concurrent_queue_base_v3@internal@tbb@@IAEXXZ +?internal_throw_exception@concurrent_queue_base_v3@internal@tbb@@IBEXXZ +?assign@concurrent_queue_base_v3@internal@tbb@@IAEXABV123@@Z + +#if !TBB_NO_LEGACY +; concurrent_vector.cpp v2 +?internal_assign@concurrent_vector_base@internal@tbb@@IAEXABV123@IP6AXPAXI@ZP6AX1PBXI@Z4@Z +?internal_capacity@concurrent_vector_base@internal@tbb@@IBEIXZ +?internal_clear@concurrent_vector_base@internal@tbb@@IAEXP6AXPAXI@Z_N@Z +?internal_copy@concurrent_vector_base@internal@tbb@@IAEXABV123@IP6AXPAXPBXI@Z@Z +?internal_grow_by@concurrent_vector_base@internal@tbb@@IAEIIIP6AXPAXI@Z@Z +?internal_grow_to_at_least@concurrent_vector_base@internal@tbb@@IAEXIIP6AXPAXI@Z@Z +?internal_push_back@concurrent_vector_base@internal@tbb@@IAEPAXIAAI@Z +?internal_reserve@concurrent_vector_base@internal@tbb@@IAEXIII@Z +#endif + +; concurrent_vector v3 +??1concurrent_vector_base_v3@internal@tbb@@IAE@XZ +?internal_assign@concurrent_vector_base_v3@internal@tbb@@IAEXABV123@IP6AXPAXI@ZP6AX1PBXI@Z4@Z +?internal_capacity@concurrent_vector_base_v3@internal@tbb@@IBEIXZ +?internal_clear@concurrent_vector_base_v3@internal@tbb@@IAEIP6AXPAXI@Z@Z +?internal_copy@concurrent_vector_base_v3@internal@tbb@@IAEXABV123@IP6AXPAXPBXI@Z@Z +?internal_grow_by@concurrent_vector_base_v3@internal@tbb@@IAEIIIP6AXPAXPBXI@Z1@Z +?internal_grow_to_at_least@concurrent_vector_base_v3@internal@tbb@@IAEXIIP6AXPAXPBXI@Z1@Z +?internal_push_back@concurrent_vector_base_v3@internal@tbb@@IAEPAXIAAI@Z +?internal_reserve@concurrent_vector_base_v3@internal@tbb@@IAEXIII@Z +?internal_compact@concurrent_vector_base_v3@internal@tbb@@IAEPAXIPAXP6AX0I@ZP6AX0PBXI@Z@Z +?internal_swap@concurrent_vector_base_v3@internal@tbb@@IAEXAAV123@@Z +?internal_throw_exception@concurrent_vector_base_v3@internal@tbb@@IBEXI@Z +?internal_resize@concurrent_vector_base_v3@internal@tbb@@IAEXIIIPBXP6AXPAXI@ZP6AX10I@Z@Z +?internal_grow_to_at_least_with_result@concurrent_vector_base_v3@internal@tbb@@IAEIIIP6AXPAXPBXI@Z1@Z + +; tbb_thread +?join@tbb_thread_v3@internal@tbb@@QAEXXZ +?detach@tbb_thread_v3@internal@tbb@@QAEXXZ +?internal_start@tbb_thread_v3@internal@tbb@@AAEXP6GIPAX@Z0@Z +?allocate_closure_v3@internal@tbb@@YAPAXI@Z +?free_closure_v3@internal@tbb@@YAXPAX@Z +?hardware_concurrency@tbb_thread_v3@internal@tbb@@SAIXZ +?thread_yield_v3@internal@tbb@@YAXXZ +?thread_sleep_v3@internal@tbb@@YAXABVinterval_t@tick_count@2@@Z +?move_v3@internal@tbb@@YAXAAVtbb_thread_v3@12@0@Z 
+?thread_get_id_v3@internal@tbb@@YA?AVid@tbb_thread_v3@12@XZ diff --git a/dep/tbb/src/tbb/win64-tbb-export.def b/dep/tbb/src/tbb/win64-tbb-export.def new file mode 100644 index 000000000..4a3debff9 --- /dev/null +++ b/dep/tbb/src/tbb/win64-tbb-export.def @@ -0,0 +1,257 @@ +; Copyright 2005-2009 Intel Corporation. All Rights Reserved. +; +; This file is part of Threading Building Blocks. +; +; Threading Building Blocks is free software; you can redistribute it +; and/or modify it under the terms of the GNU General Public License +; version 2 as published by the Free Software Foundation. +; +; Threading Building Blocks is distributed in the hope that it will be +; useful, but WITHOUT ANY WARRANTY; without even the implied warranty +; of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +; GNU General Public License for more details. +; +; You should have received a copy of the GNU General Public License +; along with Threading Building Blocks; if not, write to the Free Software +; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +; +; As a special exception, you may use this file as part of a free software +; library without restriction. Specifically, if other files instantiate +; templates or use macros or inline functions from this file, or you compile +; this file and link it with other files to produce an executable, this +; file does not by itself cause the resulting executable to be covered by +; the GNU General Public License. This exception does not however +; invalidate any other reasons why the executable file might be covered by +; the GNU General Public License. + +; This file is organized with a section for each .cpp file. +; Each of these sections is in alphabetical order. + +#include "tbb/tbb_config.h" + +EXPORTS + +; Assembly-language support that is called directly by clients +__TBB_machine_cmpswp1 +__TBB_machine_fetchadd1 +__TBB_machine_fetchstore1 +__TBB_machine_cmpswp2 +__TBB_machine_fetchadd2 +__TBB_machine_fetchstore2 +__TBB_machine_pause + +; cache_aligned_allocator.cpp +?NFS_Allocate@internal@tbb@@YAPEAX_K0PEAX@Z +?NFS_GetLineSize@internal@tbb@@YA_KXZ +?NFS_Free@internal@tbb@@YAXPEAX@Z +?allocate_via_handler_v3@internal@tbb@@YAPEAX_K@Z +?deallocate_via_handler_v3@internal@tbb@@YAXPEAX@Z +?is_malloc_used_v3@internal@tbb@@YA_NXZ + + +; task.cpp v3 +?resize@affinity_partitioner_base_v3@internal@tbb@@AEAAXI@Z +?allocate@allocate_additional_child_of_proxy@internal@tbb@@QEBAAEAVtask@3@_K@Z +?allocate@allocate_child_proxy@internal@tbb@@QEBAAEAVtask@3@_K@Z +?allocate@allocate_continuation_proxy@internal@tbb@@QEBAAEAVtask@3@_K@Z +?allocate@allocate_root_proxy@internal@tbb@@SAAEAVtask@3@_K@Z +?destroy@task@tbb@@QEAAXAEAV12@@Z +?free@allocate_additional_child_of_proxy@internal@tbb@@QEBAXAEAVtask@3@@Z +?free@allocate_child_proxy@internal@tbb@@QEBAXAEAVtask@3@@Z +?free@allocate_continuation_proxy@internal@tbb@@QEBAXAEAVtask@3@@Z +?free@allocate_root_proxy@internal@tbb@@SAXAEAVtask@3@@Z +?internal_set_ref_count@task@tbb@@AEAAXH@Z +?internal_decrement_ref_count@task@tbb@@AEAA_JXZ +?is_owned_by_current_thread@task@tbb@@QEBA_NXZ +?note_affinity@task@tbb@@UEAAXG@Z +?self@task@tbb@@SAAEAV12@XZ +?spawn_and_wait_for_all@task@tbb@@QEAAXAEAVtask_list@2@@Z +?default_num_threads@task_scheduler_init@tbb@@SAHXZ +?initialize@task_scheduler_init@tbb@@QEAAXH_K@Z +?initialize@task_scheduler_init@tbb@@QEAAXH@Z +?terminate@task_scheduler_init@tbb@@QEAAXXZ +?observe@task_scheduler_observer_v3@internal@tbb@@QEAAX_N@Z + +; exception handling support +#if 
__TBB_EXCEPTIONS +?allocate@allocate_root_with_context_proxy@internal@tbb@@QEBAAEAVtask@3@_K@Z +?free@allocate_root_with_context_proxy@internal@tbb@@QEBAXAEAVtask@3@@Z +?is_group_execution_cancelled@task_group_context@tbb@@QEBA_NXZ +?cancel_group_execution@task_group_context@tbb@@QEAA_NXZ +?reset@task_group_context@tbb@@QEAAXXZ +?init@task_group_context@tbb@@IEAAXXZ +?register_pending_exception@task_group_context@tbb@@QEAAXXZ +??1task_group_context@tbb@@QEAA@XZ +?name@captured_exception@tbb@@UEBAPEBDXZ +?what@captured_exception@tbb@@UEBAPEBDXZ +??1captured_exception@tbb@@UEAA@XZ +?move@captured_exception@tbb@@UEAAPEAV12@XZ +?destroy@captured_exception@tbb@@UEAAXXZ +?set@captured_exception@tbb@@QEAAXPEBD0@Z +?clear@captured_exception@tbb@@QEAAXXZ +#endif /* __TBB_EXCEPTIONS */ + +; tbb_misc.cpp +?assertion_failure@tbb@@YAXPEBDH00@Z +?get_initial_auto_partitioner_divisor@internal@tbb@@YA_KXZ +?handle_perror@internal@tbb@@YAXHPEBD@Z +?set_assertion_handler@tbb@@YAP6AXPEBDH00@ZP6AX0H00@Z@Z +?runtime_warning@internal@tbb@@YAXPEBDZZ +TBB_runtime_interface_version +?throw_bad_last_alloc_exception_v4@internal@tbb@@YAXXZ + +; itt_notify.cpp +?itt_load_pointer_with_acquire_v3@internal@tbb@@YAPEAXPEBX@Z +?itt_store_pointer_with_release_v3@internal@tbb@@YAXPEAX0@Z +?itt_load_pointer_v3@internal@tbb@@YAPEAXPEBX@Z +?itt_set_sync_name_v3@internal@tbb@@YAXPEAXPEB_W@Z + +; pipeline.cpp +??_7pipeline@tbb@@6B@ +??0pipeline@tbb@@QEAA@XZ +??1filter@tbb@@UEAA@XZ +??1pipeline@tbb@@UEAA@XZ +?add_filter@pipeline@tbb@@QEAAXAEAVfilter@2@@Z +?clear@pipeline@tbb@@QEAAXXZ +?inject_token@pipeline@tbb@@AEAAXAEAVtask@2@@Z +?run@pipeline@tbb@@QEAAX_K@Z +#if __TBB_EXCEPTIONS +?run@pipeline@tbb@@QEAAX_KAEAVtask_group_context@2@@Z +#endif +?process_item@thread_bound_filter@tbb@@QEAA?AW4result_type@12@XZ +?try_process_item@thread_bound_filter@tbb@@QEAA?AW4result_type@12@XZ + +; queuing_rw_mutex.cpp +?internal_construct@queuing_rw_mutex@tbb@@QEAAXXZ +?acquire@scoped_lock@queuing_rw_mutex@tbb@@QEAAXAEAV23@_N@Z +?downgrade_to_reader@scoped_lock@queuing_rw_mutex@tbb@@QEAA_NXZ +?release@scoped_lock@queuing_rw_mutex@tbb@@QEAAXXZ +?upgrade_to_writer@scoped_lock@queuing_rw_mutex@tbb@@QEAA_NXZ +?try_acquire@scoped_lock@queuing_rw_mutex@tbb@@QEAA_NAEAV23@_N@Z + +#if !TBB_NO_LEGACY +; spin_rw_mutex.cpp v2 +?internal_itt_releasing@spin_rw_mutex@tbb@@CAXPEAV12@@Z +?internal_acquire_writer@spin_rw_mutex@tbb@@CA_NPEAV12@@Z +?internal_acquire_reader@spin_rw_mutex@tbb@@CAXPEAV12@@Z +?internal_downgrade@spin_rw_mutex@tbb@@CAXPEAV12@@Z +?internal_upgrade@spin_rw_mutex@tbb@@CA_NPEAV12@@Z +?internal_release_reader@spin_rw_mutex@tbb@@CAXPEAV12@@Z +?internal_release_writer@spin_rw_mutex@tbb@@CAXPEAV12@@Z +?internal_try_acquire_writer@spin_rw_mutex@tbb@@CA_NPEAV12@@Z +?internal_try_acquire_reader@spin_rw_mutex@tbb@@CA_NPEAV12@@Z +#endif + +; spin_rw_mutex v3 +?internal_construct@spin_rw_mutex_v3@tbb@@AEAAXXZ +?internal_upgrade@spin_rw_mutex_v3@tbb@@AEAA_NXZ +?internal_downgrade@spin_rw_mutex_v3@tbb@@AEAAXXZ +?internal_acquire_reader@spin_rw_mutex_v3@tbb@@AEAAXXZ +?internal_acquire_writer@spin_rw_mutex_v3@tbb@@AEAA_NXZ +?internal_release_reader@spin_rw_mutex_v3@tbb@@AEAAXXZ +?internal_release_writer@spin_rw_mutex_v3@tbb@@AEAAXXZ +?internal_try_acquire_reader@spin_rw_mutex_v3@tbb@@AEAA_NXZ +?internal_try_acquire_writer@spin_rw_mutex_v3@tbb@@AEAA_NXZ + +; spin_mutex.cpp +?internal_construct@spin_mutex@tbb@@QEAAXXZ +?internal_acquire@scoped_lock@spin_mutex@tbb@@AEAAXAEAV23@@Z +?internal_release@scoped_lock@spin_mutex@tbb@@AEAAXXZ 
+?internal_try_acquire@scoped_lock@spin_mutex@tbb@@AEAA_NAEAV23@@Z + +; mutex.cpp +?internal_acquire@scoped_lock@mutex@tbb@@AEAAXAEAV23@@Z +?internal_release@scoped_lock@mutex@tbb@@AEAAXXZ +?internal_try_acquire@scoped_lock@mutex@tbb@@AEAA_NAEAV23@@Z +?internal_construct@mutex@tbb@@AEAAXXZ +?internal_destroy@mutex@tbb@@AEAAXXZ + +; recursive_mutex.cpp +?internal_construct@recursive_mutex@tbb@@AEAAXXZ +?internal_destroy@recursive_mutex@tbb@@AEAAXXZ +?internal_acquire@scoped_lock@recursive_mutex@tbb@@AEAAXAEAV23@@Z +?internal_try_acquire@scoped_lock@recursive_mutex@tbb@@AEAA_NAEAV23@@Z +?internal_release@scoped_lock@recursive_mutex@tbb@@AEAAXXZ + +; queuing_mutex.cpp +?internal_construct@queuing_mutex@tbb@@QEAAXXZ +?acquire@scoped_lock@queuing_mutex@tbb@@QEAAXAEAV23@@Z +?release@scoped_lock@queuing_mutex@tbb@@QEAAXXZ +?try_acquire@scoped_lock@queuing_mutex@tbb@@QEAA_NAEAV23@@Z + +#if !TBB_NO_LEGACY +; concurrent_hash_map.cpp +?internal_grow_predicate@hash_map_segment_base@internal@tbb@@QEBA_NXZ + +; concurrent_queue.cpp v2 +??0concurrent_queue_base@internal@tbb@@IEAA@_K@Z +??0concurrent_queue_iterator_base@internal@tbb@@IEAA@AEBVconcurrent_queue_base@12@@Z +??1concurrent_queue_base@internal@tbb@@MEAA@XZ +??1concurrent_queue_iterator_base@internal@tbb@@IEAA@XZ +?advance@concurrent_queue_iterator_base@internal@tbb@@IEAAXXZ +?assign@concurrent_queue_iterator_base@internal@tbb@@IEAAXAEBV123@@Z +?internal_pop@concurrent_queue_base@internal@tbb@@IEAAXPEAX@Z +?internal_pop_if_present@concurrent_queue_base@internal@tbb@@IEAA_NPEAX@Z +?internal_push@concurrent_queue_base@internal@tbb@@IEAAXPEBX@Z +?internal_push_if_not_full@concurrent_queue_base@internal@tbb@@IEAA_NPEBX@Z +?internal_set_capacity@concurrent_queue_base@internal@tbb@@IEAAX_J_K@Z +?internal_size@concurrent_queue_base@internal@tbb@@IEBA_JXZ +#endif + +; concurrent_queue v3 +??0concurrent_queue_iterator_base_v3@internal@tbb@@IEAA@AEBVconcurrent_queue_base_v3@12@@Z +??1concurrent_queue_iterator_base_v3@internal@tbb@@IEAA@XZ +?assign@concurrent_queue_iterator_base_v3@internal@tbb@@IEAAXAEBV123@@Z +?advance@concurrent_queue_iterator_base_v3@internal@tbb@@IEAAXXZ +??0concurrent_queue_base_v3@internal@tbb@@IEAA@_K@Z +??1concurrent_queue_base_v3@internal@tbb@@MEAA@XZ +?internal_push@concurrent_queue_base_v3@internal@tbb@@IEAAXPEBX@Z +?internal_push_if_not_full@concurrent_queue_base_v3@internal@tbb@@IEAA_NPEBX@Z +?internal_pop@concurrent_queue_base_v3@internal@tbb@@IEAAXPEAX@Z +?internal_pop_if_present@concurrent_queue_base_v3@internal@tbb@@IEAA_NPEAX@Z +?internal_size@concurrent_queue_base_v3@internal@tbb@@IEBA_JXZ +?internal_empty@concurrent_queue_base_v3@internal@tbb@@IEBA_NXZ +?internal_finish_clear@concurrent_queue_base_v3@internal@tbb@@IEAAXXZ +?internal_set_capacity@concurrent_queue_base_v3@internal@tbb@@IEAAX_J_K@Z +?internal_throw_exception@concurrent_queue_base_v3@internal@tbb@@IEBAXXZ +?assign@concurrent_queue_base_v3@internal@tbb@@IEAAXAEBV123@@Z + +#if !TBB_NO_LEGACY +; concurrent_vector.cpp v2 +?internal_assign@concurrent_vector_base@internal@tbb@@IEAAXAEBV123@_KP6AXPEAX1@ZP6AX2PEBX1@Z5@Z +?internal_capacity@concurrent_vector_base@internal@tbb@@IEBA_KXZ +?internal_clear@concurrent_vector_base@internal@tbb@@IEAAXP6AXPEAX_K@Z_N@Z +?internal_copy@concurrent_vector_base@internal@tbb@@IEAAXAEBV123@_KP6AXPEAXPEBX1@Z@Z +?internal_grow_by@concurrent_vector_base@internal@tbb@@IEAA_K_K0P6AXPEAX0@Z@Z +?internal_grow_to_at_least@concurrent_vector_base@internal@tbb@@IEAAX_K0P6AXPEAX0@Z@Z 
+?internal_push_back@concurrent_vector_base@internal@tbb@@IEAAPEAX_KAEA_K@Z +?internal_reserve@concurrent_vector_base@internal@tbb@@IEAAX_K00@Z +#endif + +; concurrent_vector v3 +??1concurrent_vector_base_v3@internal@tbb@@IEAA@XZ +?internal_assign@concurrent_vector_base_v3@internal@tbb@@IEAAXAEBV123@_KP6AXPEAX1@ZP6AX2PEBX1@Z5@Z +?internal_capacity@concurrent_vector_base_v3@internal@tbb@@IEBA_KXZ +?internal_clear@concurrent_vector_base_v3@internal@tbb@@IEAA_KP6AXPEAX_K@Z@Z +?internal_copy@concurrent_vector_base_v3@internal@tbb@@IEAAXAEBV123@_KP6AXPEAXPEBX1@Z@Z +?internal_grow_by@concurrent_vector_base_v3@internal@tbb@@IEAA_K_K0P6AXPEAXPEBX0@Z2@Z +?internal_grow_to_at_least@concurrent_vector_base_v3@internal@tbb@@IEAAX_K0P6AXPEAXPEBX0@Z2@Z +?internal_push_back@concurrent_vector_base_v3@internal@tbb@@IEAAPEAX_KAEA_K@Z +?internal_reserve@concurrent_vector_base_v3@internal@tbb@@IEAAX_K00@Z +?internal_compact@concurrent_vector_base_v3@internal@tbb@@IEAAPEAX_KPEAXP6AX10@ZP6AX1PEBX0@Z@Z +?internal_swap@concurrent_vector_base_v3@internal@tbb@@IEAAXAEAV123@@Z +?internal_throw_exception@concurrent_vector_base_v3@internal@tbb@@IEBAX_K@Z +?internal_resize@concurrent_vector_base_v3@internal@tbb@@IEAAX_K00PEBXP6AXPEAX0@ZP6AX210@Z@Z +?internal_grow_to_at_least_with_result@concurrent_vector_base_v3@internal@tbb@@IEAA_K_K0P6AXPEAXPEBX0@Z2@Z + +; tbb_thread +?allocate_closure_v3@internal@tbb@@YAPEAX_K@Z +?detach@tbb_thread_v3@internal@tbb@@QEAAXXZ +?free_closure_v3@internal@tbb@@YAXPEAX@Z +?hardware_concurrency@tbb_thread_v3@internal@tbb@@SAIXZ +?internal_start@tbb_thread_v3@internal@tbb@@AEAAXP6AIPEAX@Z0@Z +?join@tbb_thread_v3@internal@tbb@@QEAAXXZ +?move_v3@internal@tbb@@YAXAEAVtbb_thread_v3@12@0@Z +?thread_get_id_v3@internal@tbb@@YA?AVid@tbb_thread_v3@12@XZ +?thread_sleep_v3@internal@tbb@@YAXAEBVinterval_t@tick_count@2@@Z +?thread_yield_v3@internal@tbb@@YAXXZ diff --git a/dep/tbb/src/tbbmalloc/Customize.h b/dep/tbb/src/tbbmalloc/Customize.h new file mode 100644 index 000000000..adc6d4c76 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/Customize.h @@ -0,0 +1,120 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +#ifndef _TBB_malloc_Customize_H_ +#define _TBB_malloc_Customize_H_ + +/* Thread shutdown notification callback */ +/* redefine the name of the callback to meet TBB requirements + for externally visible names of service functions */ +#define mallocThreadShutdownNotification __TBB_mallocThreadShutdownNotification +#define mallocProcessShutdownNotification __TBB_mallocProcessShutdownNotification + +extern "C" void mallocThreadShutdownNotification(void *); +extern "C" void mallocProcessShutdownNotification(void); + +// customizing MALLOC_ASSERT macro +#include "tbb/tbb_stddef.h" +#define MALLOC_ASSERT(assertion, message) __TBB_ASSERT(assertion, message) + +#ifndef MALLOC_DEBUG +#define MALLOC_DEBUG TBB_USE_DEBUG +#endif + +#include "tbb/tbb_machine.h" + +#if DO_ITT_NOTIFY +#include "tbb/itt_notify.h" +#define MALLOC_ITT_SYNC_PREPARE(pointer) ITT_NOTIFY(sync_prepare, (pointer)) +#define MALLOC_ITT_SYNC_ACQUIRED(pointer) ITT_NOTIFY(sync_acquired, (pointer)) +#define MALLOC_ITT_SYNC_RELEASING(pointer) ITT_NOTIFY(sync_releasing, (pointer)) +#define MALLOC_ITT_SYNC_CANCEL(pointer) ITT_NOTIFY(sync_cancel, (pointer)) +#else +#define MALLOC_ITT_SYNC_PREPARE(pointer) ((void)0) +#define MALLOC_ITT_SYNC_ACQUIRED(pointer) ((void)0) +#define MALLOC_ITT_SYNC_RELEASING(pointer) ((void)0) +#define MALLOC_ITT_SYNC_CANCEL(pointer) ((void)0) +#endif + +//! Stripped down version of spin_mutex. +/** Instances of MallocMutex must be declared in memory that is zero-initialized. + There are no constructors. This is a feature that lets it be + used in situations where the mutex might be used while file-scope constructors + are running. + + There are no methods "acquire" or "release". The scoped_lock must be used + in a strict block-scoped locking pattern. Omitting these methods permitted + further simplication. */ +class MallocMutex { + unsigned char value; + + //! Deny assignment + void operator=( MallocMutex& MallocMutex ); +public: + class scoped_lock { + const unsigned char value; + MallocMutex& mutex; + public: + scoped_lock( MallocMutex& m ) : value( __TBB_LockByte(m.value)), mutex(m) {} + ~scoped_lock() { __TBB_store_with_release(mutex.value, value); } + }; + friend class scoped_lock; +}; + +inline intptr_t AtomicIncrement( volatile intptr_t& counter ) { + return __TBB_FetchAndAddW( &counter, 1 )+1; +} + +inline uintptr_t AtomicAdd( volatile uintptr_t& counter, uintptr_t value ) { + return __TBB_FetchAndAddW( &counter, value ); +} + +inline intptr_t AtomicCompareExchange( volatile intptr_t& location, intptr_t new_value, intptr_t comparand) { + return __TBB_CompareAndSwapW( &location, new_value, comparand ); +} + +#define USE_DEFAULT_MEMORY_MAPPING 1 + +// To support malloc replacement with LD_PRELOAD +#include "proxy.h" + +#if MALLOC_LD_PRELOAD +#define malloc_proxy __TBB_malloc_proxy +extern "C" void * __TBB_malloc_proxy(size_t) __attribute__ ((weak)); +#else +const bool malloc_proxy = false; +#endif + +namespace rml { +namespace internal { + void init_tbbmalloc(); +} } // namespaces + +#define MALLOC_EXTRA_INITIALIZATION rml::internal::init_tbbmalloc() + +#endif /* _TBB_malloc_Customize_H_ */ diff --git a/dep/tbb/src/tbbmalloc/LifoQueue.h b/dep/tbb/src/tbbmalloc/LifoQueue.h new file mode 100644 index 000000000..9a81dd9b7 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/LifoQueue.h @@ -0,0 +1,97 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. 
+ + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef _itt_common_malloc_LifoQueue_H_ +#define _itt_common_malloc_LifoQueue_H_ + +#include "TypeDefinitions.h" +#include // for memset() + +//! Checking the synchronization method +/** FINE_GRAIN_LOCKS is the only variant for now; should be defined for LifoQueue */ +#ifndef FINE_GRAIN_LOCKS +#define FINE_GRAIN_LOCKS +#endif + +namespace rml { + +namespace internal { + +class LifoQueue { +public: + inline LifoQueue(); + inline void push(void** ptr); + inline void* pop(void); + +private: + void * top; +#ifdef FINE_GRAIN_LOCKS + MallocMutex lock; +#endif /* FINE_GRAIN_LOCKS */ +}; + +#ifdef FINE_GRAIN_LOCKS +/* LifoQueue assumes zero initialization so a vector of it can be created + * by just allocating some space with no call to constructor. + * On Linux, it seems to be necessary to avoid linking with C++ libraries. + * + * By usage convention there is no race on the initialization. */ +LifoQueue::LifoQueue( ) : top(NULL) +{ + // MallocMutex assumes zero initialization + memset(&lock, 0, sizeof(MallocMutex)); +} + +void LifoQueue::push( void **ptr ) +{ + MallocMutex::scoped_lock scoped_cs(lock); + *ptr = top; + top = ptr; +} + +void * LifoQueue::pop( ) +{ + void **result=NULL; + { + MallocMutex::scoped_lock scoped_cs(lock); + if (!top) goto done; + result = (void **) top; + top = *result; + } + *result = NULL; +done: + return result; +} + +#endif /* FINE_GRAIN_LOCKS */ + +} // namespace internal +} // namespace rml + +#endif /* _itt_common_malloc_LifoQueue_H_ */ + diff --git a/dep/tbb/src/tbbmalloc/MapMemory.h b/dep/tbb/src/tbbmalloc/MapMemory.h new file mode 100644 index 000000000..64bf66b0c --- /dev/null +++ b/dep/tbb/src/tbbmalloc/MapMemory.h @@ -0,0 +1,101 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#ifndef _itt_shared_malloc_MapMemory_H +#define _itt_shared_malloc_MapMemory_H + +#if __linux__ || __APPLE__ || __sun || __FreeBSD__ + +#if __sun && !defined(_XPG4_2) + // To have void* as mmap's 1st argument + #define _XPG4_2 1 + #define XPG4_WAS_DEFINED 1 +#endif + +#include + +#if XPG4_WAS_DEFINED + #undef _XPG4_2 + #undef XPG4_WAS_DEFINED +#endif + +#define MEMORY_MAPPING_USES_MALLOC 0 +void* MapMemory (size_t bytes) +{ + void* result = 0; +#ifndef MAP_ANONYMOUS +// Mac OS* X defines MAP_ANON, which is deprecated in Linux. +#define MAP_ANONYMOUS MAP_ANON +#endif /* MAP_ANONYMOUS */ + result = mmap(result, bytes, (PROT_READ | PROT_WRITE), MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); + return result==MAP_FAILED? 0: result; +} + +int UnmapMemory(void *area, size_t bytes) +{ + return munmap(area, bytes); +} + +#elif _WIN32 || _WIN64 +#include + +#define MEMORY_MAPPING_USES_MALLOC 0 +void* MapMemory (size_t bytes) +{ + /* Is VirtualAlloc thread safe? */ + return VirtualAlloc(NULL, bytes, (MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN), PAGE_READWRITE); +} + +int UnmapMemory(void *area, size_t bytes) +{ + BOOL result = VirtualFree(area, 0, MEM_RELEASE); + return !result; +} + +#else +#include + +#define MEMORY_MAPPING_USES_MALLOC 1 +void* MapMemory (size_t bytes) +{ + return malloc( bytes ); +} + +int UnmapMemory(void *area, size_t bytes) +{ + free( area ); + return 0; +} + +#endif /* OS dependent */ + +#if MALLOC_CHECK_RECURSION && MEMORY_MAPPING_USES_MALLOC +#error Impossible to protect against malloc recursion when memory mapping uses malloc. +#endif + +#endif /* _itt_shared_malloc_MapMemory_H */ diff --git a/dep/tbb/src/tbbmalloc/MemoryAllocator.cpp b/dep/tbb/src/tbbmalloc/MemoryAllocator.cpp new file mode 100644 index 000000000..749efda36 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/MemoryAllocator.cpp @@ -0,0 +1,2391 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + + +#include "TypeDefinitions.h" /* Also includes customization layer Customize.h */ + +#if USE_PTHREAD + // Some pthreads documentation says that must be first header. + #include + #define TlsSetValue_func pthread_setspecific + #define TlsGetValue_func pthread_getspecific + typedef pthread_key_t tls_key_t; + #include + inline void do_yield() {sched_yield();} + +#elif USE_WINTHREAD + #define _WIN32_WINNT 0x0400 + #include + #define TlsSetValue_func TlsSetValue + #define TlsGetValue_func TlsGetValue + typedef DWORD tls_key_t; + inline void do_yield() {SwitchToThread();} + +#else + #error Must define USE_PTHREAD or USE_WINTHREAD + +#endif + +#include +#include +#include +#include +#if MALLOC_CHECK_RECURSION +#include /* for placement new */ +#endif /* MALLOC_CHECK_RECURSION */ + +extern "C" { + void * scalable_malloc(size_t size); + void scalable_free(void *object); + void mallocThreadShutdownNotification(void*); +} + +/********* Various compile-time options **************/ + +#define MALLOC_TRACE 0 + +#if MALLOC_TRACE +#define TRACEF(x) printf x +#else +#define TRACEF(x) ((void)0) +#endif /* MALLOC_TRACE */ + +#define ASSERT_TEXT NULL + +//! Define the main synchronization method +/** It should be specified before including LifoQueue.h */ +#define FINE_GRAIN_LOCKS +#include "LifoQueue.h" + +#define COLLECT_STATISTICS MALLOC_DEBUG && defined(MALLOCENV_COLLECT_STATISTICS) +#include "Statistics.h" + +#define FREELIST_NONBLOCKING 1 + +// If USE_MALLOC_FOR_LARGE_OBJECT is nonzero, then large allocations are done via malloc. +// Otherwise large allocations are done using the scalable allocator's block allocator. +// As of 06.Jun.17, using malloc is about 10x faster on Linux. +#if !_WIN32 +#define USE_MALLOC_FOR_LARGE_OBJECT 1 +#endif + +/********* End compile-time options **************/ + +namespace rml { + +namespace internal { + +/******* A helper class to support overriding malloc with scalable_malloc *******/ +#if MALLOC_CHECK_RECURSION + +inline bool isMallocInitialized(); + +class RecursiveMallocCallProtector { + // pointer to an automatic data of holding thread + static void *autoObjPtr; + static MallocMutex rmc_mutex; + static pthread_t owner_thread; +/* Under FreeBSD 8.0 1st call to any pthread function including pthread_self + leads to pthread initialization, that causes malloc calls. As 1st usage of + RecursiveMallocCallProtector can be before pthread initialized, pthread calls + can't be used in 1st instance of RecursiveMallocCallProtector. + RecursiveMallocCallProtector is used 1st time in checkInitialization(), + so there is a guarantee that on 2nd usage pthread is initialized. + No such situation observed with other supported OSes. + */ +#if __FreeBSD__ + static bool canUsePthread; +#else + static const bool canUsePthread = true; +#endif +/* + The variable modified in checkInitialization, + so can be read without memory barriers. 
+ */ + static bool mallocRecursionDetected; + + MallocMutex::scoped_lock* lock_acquired; + char scoped_lock_space[sizeof(MallocMutex::scoped_lock)+1]; + + static uintptr_t absDiffPtr(void *x, void *y) { + uintptr_t xi = (uintptr_t)x, yi = (uintptr_t)y; + return xi > yi ? xi - yi : yi - xi; + } +public: + + RecursiveMallocCallProtector() : lock_acquired(NULL) { + lock_acquired = new (scoped_lock_space) MallocMutex::scoped_lock( rmc_mutex ); + if (canUsePthread) + owner_thread = pthread_self(); + autoObjPtr = &scoped_lock_space; + } + ~RecursiveMallocCallProtector() { + if (lock_acquired) { + autoObjPtr = NULL; + lock_acquired->~scoped_lock(); + } + } + static bool sameThreadActive() { + if (!autoObjPtr) // fast path + return false; + // Some thread has an active recursive call protector; check if the current one. + // Exact pthread_self based test + if (canUsePthread) + if (pthread_equal( owner_thread, pthread_self() )) { + mallocRecursionDetected = true; + return true; + } else + return false; + // inexact stack size based test + const uintptr_t threadStackSz = 2*1024*1024; + int dummy; + return absDiffPtr(autoObjPtr, &dummy)(TlsGetValue_func(Tid_key)); + if( !result ) { + RecursiveMallocCallProtector scoped; + // Thread-local value is zero -> first call from this thread, + // need to initialize with next ID value (IDs start from 1) + result = AtomicIncrement(ThreadIdCount); // returned new value! + TlsSetValue_func( Tid_key, reinterpret_cast(result) ); + } + return result; +} + +static inline void* getThreadMallocTLS() { + void *result; + result = TlsGetValue_func( TLS_pointer_key ); +// The assert below is incorrect: with lazy initialization, it fails on the first call of the function. +// MALLOC_ASSERT( result, "Memory allocator not initialized" ); + return result; +} + +static inline void setThreadMallocTLS( void * newvalue ) { + RecursiveMallocCallProtector scoped; + TlsSetValue_func( TLS_pointer_key, newvalue ); +} + +/*********** End code to provide thread ID and a TLS pointer **********/ + +/* + * The identifier to make sure that memory is allocated by scalable_malloc. + */ +const uint64_t theMallocUniqueID=0xE3C7AF89A1E2D8C1ULL; + +/* + * This number of bins in the TLS that leads to blocks that we can allocate in. + */ +const uint32_t numBlockBinLimit = 32; + + /* + * The number of bins to cache large objects. + */ +const uint32_t numLargeObjectBins = 1024; // for 1024 max cached size is near 8MB + +/********* The data structures and global objects **************/ + +struct FreeObject { + FreeObject *next; +}; + +/* + * The following constant is used to define the size of struct Block, the block header. + * The intent is to have the size of a Block multiple of the cache line size, this allows us to + * get good alignment at the cost of some overhead equal to the amount of padding included in the Block. + */ + +const int blockHeaderAlignment = 64; // a common size of a cache line + +struct Block; + +/* The 'next' field in the block header has to maintain some invariants: + * it needs to be on a 16K boundary and the first field in the block. + * Any value stored there needs to have the lower 14 bits set to 0 + * so that various assert work. This means that if you want to smash this memory + * for debugging purposes you will need to obey this invariant. + * The total size of the header needs to be a power of 2 to simplify + * the alignment requirements. For now it is a 128 byte structure. 
+ * To avoid false sharing, the fields changed only locally are separated + * from the fields changed by foreign threads. + * Changing the size of the block header would require to change + * some bin allocation sizes, in particular "fitting" sizes (see above). + */ + +struct LocalBlockFields { + Block *next; /* This field needs to be on a 16K boundary and the first field in the block + for LIFO lists to work. */ + uint64_t mallocUniqueID; /* The field to identify memory allocated by scalable_malloc */ + Block *previous; /* Use double linked list to speed up removal */ + unsigned int objectSize; + unsigned int owner; + FreeObject *bumpPtr; /* Bump pointer moves from the end to the beginning of a block */ + FreeObject *freeList; + unsigned int allocatedCount; /* Number of objects allocated (obviously by the owning thread) */ + unsigned int isFull; +}; + +struct Block : public LocalBlockFields { + size_t __pad_local_fields[(blockHeaderAlignment-sizeof(LocalBlockFields))/sizeof(size_t)]; + FreeObject *publicFreeList; + Block *nextPrivatizable; + size_t __pad_public_fields[(blockHeaderAlignment-2*sizeof(void*))/sizeof(size_t)]; +}; + +struct Bin { + Block *activeBlk; + Block *mailbox; + MallocMutex mailLock; +}; + +/* + * This is a LIFO linked list that one can init, push or pop from + */ +static LifoQueue freeBlockList; + +/* + * When a block that is not completely free is returned for reuse by other threads + * this is where the block goes. + * + * LifoQueue assumes zero initialization; so below its constructors are omitted, + * to avoid linking with C++ libraries on Linux. + */ +static char globalBinSpace[sizeof(LifoQueue)*numBlockBinLimit]; +static LifoQueue* globalSizeBins = (LifoQueue*)globalBinSpace; + +static struct LargeObjectCacheStat { + uintptr_t age; + size_t cacheSize; +} loCacheStat; + +struct CachedObject { + CachedObject *next, + *prev; + uintptr_t age; + bool fromMapMemory; +}; + +class CachedObjectsList { + CachedObject *first, + *last; + /* age of an oldest object in the list; equal to last->age, if last defined, + used for quick cheching it without acquiring the lock. */ + uintptr_t oldest; + /* currAge when something was excluded out of list because of the age, + not because of cache hit */ + uintptr_t lastCleanedAge; + /* Current threshold value for the objects of a particular size. + Set on cache miss. */ + uintptr_t ageThreshold; + + MallocMutex lock; + /* CachedObjectsList should be placed in zero-initialized memory, + ctor not needed. */ + CachedObjectsList(); +public: + inline void push(void *buf, bool fromMapMemory, uintptr_t currAge); + inline CachedObject* pop(uintptr_t currAge); + void releaseLastIfOld(uintptr_t currAge, size_t size); +}; + +/* + * Array of bins with lists of recently freed objects cached for re-use. + */ +static char globalCachedObjectBinsSpace[sizeof(CachedObjectsList)*numLargeObjectBins]; +static CachedObjectsList* globalCachedObjectBins = (CachedObjectsList*)globalCachedObjectBinsSpace; + +/********* End of the data structures **************/ + +/********** Various numeric parameters controlling allocations ********/ + +/* + * The size of the TLS should be enough to hold numBlockBinLimit bins. + */ +const uint32_t tlsSize = numBlockBinLimit * sizeof(Bin); + +/* + * blockSize - the size of a block, it must be larger than maxSegregatedObjectSize. 
+ * + */ +const uintptr_t blockSize = 16*1024; + +/* + * There are bins for all 8 byte aligned objects less than this segregated size; 8 bins in total + */ +const uint32_t minSmallObjectIndex = 0; +const uint32_t numSmallObjectBins = 8; +const uint32_t maxSmallObjectSize = 64; + +/* + * There are 4 bins between each couple of powers of 2 [64-128-256-...] + * from maxSmallObjectSize till this size; 16 bins in total + */ +const uint32_t minSegregatedObjectIndex = minSmallObjectIndex+numSmallObjectBins; +const uint32_t numSegregatedObjectBins = 16; +const uint32_t maxSegregatedObjectSize = 1024; + +/* + * And there are 5 bins with the following allocation sizes: 1792, 2688, 3968, 5376, 8064. + * They selected to fit 9, 6, 4, 3, and 2 sizes per a block, and also are multiples of 128. + * If sizeof(Block) changes from 128, these sizes require close attention! + */ +const uint32_t minFittingIndex = minSegregatedObjectIndex+numSegregatedObjectBins; +const uint32_t numFittingBins = 5; + +const uint32_t fittingAlignment = 128; + +#define SET_FITTING_SIZE(N) ( (blockSize-sizeof(Block))/N ) & ~(fittingAlignment-1) +const uint32_t fittingSize1 = SET_FITTING_SIZE(9); +const uint32_t fittingSize2 = SET_FITTING_SIZE(6); +const uint32_t fittingSize3 = SET_FITTING_SIZE(4); +const uint32_t fittingSize4 = SET_FITTING_SIZE(3); +const uint32_t fittingSize5 = SET_FITTING_SIZE(2); +#undef SET_FITTING_SIZE + +/* + * The total number of thread-specific Block-based bins + */ +const uint32_t numBlockBins = minFittingIndex+numFittingBins; + +/* + * Objects of this size and larger are considered large objects. + */ +const uint32_t minLargeObjectSize = fittingSize5 + 1; + +/* + * Block::objectSize value used to mark blocks allocated by startupAlloc + */ +const unsigned int startupAllocObjSizeMark = ~(unsigned int)0; + +/* + * Difference between object sizes in large object bins + */ +const uint32_t largeObjectCacheStep = 8*1024; + +/* + * Object cache cleanup frequency. + * It should be power of 2 for the fast checking. 
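+ * (With a power of two, the check "currAge % cacheCleanupFreq == 0" can be
+ * reduced to a simple mask, i.e. "(currAge & (cacheCleanupFreq-1)) == 0".)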
+ */ +const unsigned cacheCleanupFreq = 256; + +/* + * Get virtual memory in pieces of this size: 0x0100000 is 1 megabyte decimal + */ +static size_t mmapRequestSize = 0x0100000; + +/********** End of numeric parameters controlling allocations *********/ + +#if !MALLOC_DEBUG +#if __INTEL_COMPILER || _MSC_VER +#define NOINLINE(decl) __declspec(noinline) decl +#define ALWAYSINLINE(decl) __forceinline decl +#elif __GNUC__ +#define NOINLINE(decl) decl __attribute__ ((noinline)) +#define ALWAYSINLINE(decl) decl __attribute__ ((always_inline)) +#else +#define NOINLINE(decl) decl +#define ALWAYSINLINE(decl) decl +#endif + +static NOINLINE( Block* getPublicFreeListBlock(Bin* bin) ); +static NOINLINE( void moveBlockToBinFront(Block *block) ); +static NOINLINE( void processLessUsedBlock(Block *block) ); + +static ALWAYSINLINE( Bin* getAllocationBin(size_t size) ); +static ALWAYSINLINE( void checkInitialization() ); + +#undef ALWAYSINLINE +#undef NOINLINE +#endif /* !MALLOC_DEBUG */ + +/*********** Code to acquire memory from the OS or other executive ****************/ + +#if USE_DEFAULT_MEMORY_MAPPING +#include "MapMemory.h" +#else +/* assume MapMemory and UnmapMemory are customized */ +#endif + +#if USE_MALLOC_FOR_LARGE_OBJECT + +// (get|free)RawMemory only necessary for the USE_MALLOC_FOR_LARGE_OBJECT case +static inline void* getRawMemory (size_t size, bool alwaysUseMap = false) +{ + void *object; + + if (alwaysUseMap) + object = MapMemory(size); + else +#if MALLOC_CHECK_RECURSION + if (RecursiveMallocCallProtector::noRecursion()) + object = malloc(size); + else if ( rml::internal::original_malloc_found ) + object = (*rml::internal::original_malloc_ptr)(size); + else + object = MapMemory(size); +#else + object = malloc(size); +#endif /* MALLOC_CHECK_RECURSION */ + return object; +} + +static inline void freeRawMemory (void *object, size_t size, bool alwaysUseMap) +{ + if (alwaysUseMap) + UnmapMemory(object, size); + else +#if MALLOC_CHECK_RECURSION + if (RecursiveMallocCallProtector::noRecursion()) + free(object); + else if ( rml::internal::original_malloc_found ) + (*rml::internal::original_free_ptr)(object); + else + UnmapMemory(object, size); +#else + free(object); +#endif /* MALLOC_CHECK_RECURSION */ +} + +#else /* USE_MALLOC_FOR_LARGE_OBJECT */ + +static inline void* getRawMemory (size_t size, bool = false) { return MapMemory(size); } + +static inline void freeRawMemory (void *object, size_t size, bool) { + UnmapMemory(object, size); +} + +#endif /* USE_MALLOC_FOR_LARGE_OBJECT */ + +/********* End memory acquisition code ********************************/ + +/********* Now some rough utility code to deal with indexing the size bins. **************/ + +/* + * Given a number return the highest non-zero bit in it. It is intended to work with 32-bit values only. + * Moreover, on IPF, for sake of simplicity and performance, it is narrowed to only serve for 64 to 1023. + * This is enough for current algorithm of distribution of sizes among bins. 
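+ * For example, highestBitPos(64) is 6 and highestBitPos(1023) is 9, which is
+ * exactly the range of "order" values used for the segregated bins below.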
+ */ +#if _WIN64 && _MSC_VER>=1400 && !__INTEL_COMPILER +extern "C" unsigned char _BitScanReverse( unsigned long* i, unsigned long w ); +#pragma intrinsic(_BitScanReverse) +#endif +static inline unsigned int highestBitPos(unsigned int n) +{ + unsigned int pos; +#if __ARCH_x86_32||__ARCH_x86_64 + +# if __linux__||__APPLE__||__FreeBSD__||__sun||__MINGW32__ + __asm__ ("bsr %1,%0" : "=r"(pos) : "r"(n)); +# elif (_WIN32 && (!_WIN64 || __INTEL_COMPILER)) + __asm + { + bsr eax, n + mov pos, eax + } +# elif _WIN64 && _MSC_VER>=1400 + _BitScanReverse((unsigned long*)&pos, (unsigned long)n); +# else +# error highestBitPos() not implemented for this platform +# endif + +#elif __ARCH_ipf || __ARCH_other + static unsigned int bsr[16] = {0,6,7,7,8,8,8,8,9,9,9,9,9,9,9,9}; + MALLOC_ASSERT( n>=64 && n<1024, ASSERT_TEXT ); + pos = bsr[ n>>6 ]; +#else +# error highestBitPos() not implemented for this platform +#endif /* __ARCH_* */ + return pos; +} + +/* + * Depending on indexRequest, for a given size return either the index into the bin + * for objects of this size, or the actual size of objects in this bin. + */ +template +static unsigned int getIndexOrObjectSize (unsigned int size) +{ + if (size <= maxSmallObjectSize) { // selection from 4/8/16/24/32/40/48/56/64 + /* Index 0 holds up to 8 bytes, Index 1 16 and so forth */ + return indexRequest ? (size - 1) >> 3 : alignUp(size,8); + } + else if (size <= maxSegregatedObjectSize ) { // 80/96/112/128 / 160/192/224/256 / 320/384/448/512 / 640/768/896/1024 + unsigned int order = highestBitPos(size-1); // which group of bin sizes? + MALLOC_ASSERT( 6<=order && order<=9, ASSERT_TEXT ); + if (indexRequest) + return minSegregatedObjectIndex - (4*6) - 4 + (4*order) + ((size-1)>>(order-2)); + else { + unsigned int alignment = 128 >> (9-order); // alignment in the group + MALLOC_ASSERT( alignment==16 || alignment==32 || alignment==64 || alignment==128, ASSERT_TEXT ); + return alignUp(size,alignment); + } + } + else { + if( size <= fittingSize3 ) { + if( size <= fittingSize2 ) { + if( size <= fittingSize1 ) + return indexRequest ? minFittingIndex : fittingSize1; + else + return indexRequest ? minFittingIndex+1 : fittingSize2; + } else + return indexRequest ? minFittingIndex+2 : fittingSize3; + } else { + if( size <= fittingSize5 ) { + if( size <= fittingSize4 ) + return indexRequest ? minFittingIndex+3 : fittingSize4; + else + return indexRequest ? minFittingIndex+4 : fittingSize5; + } else { + MALLOC_ASSERT( 0,ASSERT_TEXT ); // this should not happen + return ~0U; + } + } + } +} + +static unsigned int getIndex (unsigned int size) +{ + return getIndexOrObjectSize(size); +} + +static unsigned int getObjectSize (unsigned int size) +{ + return getIndexOrObjectSize(size); +} + +/* + * Initialization code. + * + */ + +/* + * Big Blocks are the blocks we get from the OS or some similar place using getMemory above. + * They are placed on the freeBlockList once they are acquired. + */ + +static inline void *alignBigBlock(void *unalignedBigBlock) +{ + void *alignedBigBlock; + /* align the entireHeap so all blocks are aligned. */ + alignedBigBlock = alignUp(unalignedBigBlock, blockSize); + return alignedBigBlock; +} + +/* Divide the big block into smaller bigBlocks that hold this many blocks. + * This is done since we really need a lot of blocks on the freeBlockList or there will be + * contention problems. + */ +const unsigned int blocksPerBigBlock = 16; + +/* Returns 0 if unsuccessful, otherwise 1. 
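+ * With the default 1 MB mmapRequestSize and 16K blockSize that is roughly 63
+ * usable blocks, pushed onto freeBlockList in chunks of up to blocksPerBigBlock blocks.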
*/ +static int mallocBigBlock() +{ + void *unalignedBigBlock; + void *alignedBigBlock; + void *bigBlockCeiling; + Block *splitBlock; + void *splitEdge; + size_t bigBlockSplitSize; + + unalignedBigBlock = getRawMemory(mmapRequestSize, /*alwaysUseMap=*/true); + + if (!unalignedBigBlock) { + TRACEF(( "[ScalableMalloc trace] in mallocBigBlock, getMemory returns 0\n" )); + /* We can't get any more memory from the OS or executive so return 0 */ + return 0; + } + + alignedBigBlock = alignBigBlock(unalignedBigBlock); + bigBlockCeiling = (void*)((uintptr_t)unalignedBigBlock + mmapRequestSize); + + bigBlockSplitSize = blocksPerBigBlock * blockSize; + + splitBlock = (Block*)alignedBigBlock; + + while ( ((uintptr_t)splitBlock + blockSize) <= (uintptr_t)bigBlockCeiling ) { + splitEdge = (void*)((uintptr_t)splitBlock + bigBlockSplitSize); + if( splitEdge > bigBlockCeiling) { + splitEdge = alignDown(bigBlockCeiling, blockSize); + } + splitBlock->bumpPtr = (FreeObject*)splitEdge; + freeBlockList.push((void**) splitBlock); + splitBlock = (Block*)splitEdge; + } + + TRACEF(( "[ScalableMalloc trace] in mallocBigBlock returning 1\n" )); + return 1; +} + +/* + * The malloc routines themselves need to be able to occasionally malloc some space, + * in order to set up the structures used by the thread local structures. This + * routine preforms that fuctions. + */ + +/* + * Forward Refs + */ +static void initEmptyBlock(Block *block, size_t size); +static Block *getEmptyBlock(size_t size); + +static MallocMutex bootStrapLock; + +static Block *bootStrapBlock = NULL; +static Block *bootStrapBlockUsed = NULL; +static FreeObject *bootStrapObjectList = NULL; + +static void *bootStrapMalloc(size_t size) +{ + FreeObject *result; + + MALLOC_ASSERT( size == tlsSize, ASSERT_TEXT ); + + { // Lock with acquire + MallocMutex::scoped_lock scoped_cs(bootStrapLock); + + if( bootStrapObjectList) { + result = bootStrapObjectList; + bootStrapObjectList = bootStrapObjectList->next; + } else { + if (!bootStrapBlock) { + bootStrapBlock = getEmptyBlock(size); + if (!bootStrapBlock) return NULL; + } + result = bootStrapBlock->bumpPtr; + bootStrapBlock->bumpPtr = (FreeObject *)((uintptr_t)bootStrapBlock->bumpPtr - bootStrapBlock->objectSize); + if ((uintptr_t)bootStrapBlock->bumpPtr < (uintptr_t)bootStrapBlock+sizeof(Block)) { + bootStrapBlock->bumpPtr = NULL; + bootStrapBlock->next = bootStrapBlockUsed; + bootStrapBlockUsed = bootStrapBlock; + bootStrapBlock = NULL; + } + } + } // Unlock with release + + memset (result, 0, size); + return (void*)result; +} + +static void bootStrapFree(void* ptr) +{ + MALLOC_ASSERT( ptr, ASSERT_TEXT ); + { // Lock with acquire + MallocMutex::scoped_lock scoped_cs(bootStrapLock); + ((FreeObject*)ptr)->next = bootStrapObjectList; + bootStrapObjectList = (FreeObject*)ptr; + } // Unlock with release +} + +/********* End rough utility code **************/ + +/********* Thread and block related code *************/ + +#if MALLOC_DEBUG>1 +/* The debug version verifies the TLSBin as needed */ +static void verifyTLSBin (Bin* bin, size_t size) +{ + Block* temp; + Bin* tls; + uint32_t index = getIndex(size); + uint32_t objSize = getObjectSize(size); + + tls = (Bin*)getThreadMallocTLS(); + MALLOC_ASSERT( bin == tls+index, ASSERT_TEXT ); + + if (tls[index].activeBlk) { + MALLOC_ASSERT( tls[index].activeBlk->mallocUniqueID==theMallocUniqueID, ASSERT_TEXT ); + MALLOC_ASSERT( tls[index].activeBlk->owner == getThreadId(), ASSERT_TEXT ); + MALLOC_ASSERT( tls[index].activeBlk->objectSize == objSize, ASSERT_TEXT ); + + for 
(temp = tls[index].activeBlk->next; temp; temp=temp->next) { + MALLOC_ASSERT( temp!=tls[index].activeBlk, ASSERT_TEXT ); + MALLOC_ASSERT( temp->mallocUniqueID==theMallocUniqueID, ASSERT_TEXT ); + MALLOC_ASSERT( temp->owner == getThreadId(), ASSERT_TEXT ); + MALLOC_ASSERT( temp->objectSize == objSize, ASSERT_TEXT ); + MALLOC_ASSERT( temp->previous->next == temp, ASSERT_TEXT ); + if (temp->next) { + MALLOC_ASSERT( temp->next->previous == temp, ASSERT_TEXT ); + } + } + for (temp = tls[index].activeBlk->previous; temp; temp=temp->previous) { + MALLOC_ASSERT( temp!=tls[index].activeBlk, ASSERT_TEXT ); + MALLOC_ASSERT( temp->mallocUniqueID==theMallocUniqueID, ASSERT_TEXT ); + MALLOC_ASSERT( temp->owner == getThreadId(), ASSERT_TEXT ); + MALLOC_ASSERT( temp->objectSize == objSize, ASSERT_TEXT ); + MALLOC_ASSERT( temp->next->previous == temp, ASSERT_TEXT ); + if (temp->previous) { + MALLOC_ASSERT( temp->previous->next == temp, ASSERT_TEXT ); + } + } + } +} +#else +inline static void verifyTLSBin (Bin*, size_t) {} +#endif /* MALLOC_DEBUG>1 */ + +/* + * Add a block to the start of this tls bin list. + */ +static void pushTLSBin (Bin* bin, Block* block) +{ + /* The objectSize should be defined and not a parameter + because the function is applied to partially filled blocks as well */ + unsigned int size = block->objectSize; + Block* activeBlk; + + MALLOC_ASSERT( block->owner == getThreadId(), ASSERT_TEXT ); + MALLOC_ASSERT( block->objectSize != 0, ASSERT_TEXT ); + MALLOC_ASSERT( block->next == NULL, ASSERT_TEXT ); + MALLOC_ASSERT( block->previous == NULL, ASSERT_TEXT ); + + MALLOC_ASSERT( bin, ASSERT_TEXT ); + verifyTLSBin(bin, size); + activeBlk = bin->activeBlk; + + block->next = activeBlk; + if( activeBlk ) { + block->previous = activeBlk->previous; + activeBlk->previous = block; + if( block->previous ) + block->previous->next = block; + } else { + bin->activeBlk = block; + } + + verifyTLSBin(bin, size); +} + +/* + * Take a block out of its tls bin (e.g. before removal). + */ +static void outofTLSBin (Bin* bin, Block* block) +{ + unsigned int size = block->objectSize; + + MALLOC_ASSERT( block->owner == getThreadId(), ASSERT_TEXT ); + MALLOC_ASSERT( block->objectSize != 0, ASSERT_TEXT ); + + MALLOC_ASSERT( bin, ASSERT_TEXT ); + verifyTLSBin(bin, size); + + if (block == bin->activeBlk) { + bin->activeBlk = block->previous? block->previous : block->next; + } + /* Delink the block */ + if (block->previous) { + MALLOC_ASSERT( block->previous->next == block, ASSERT_TEXT ); + block->previous->next = block->next; + } + if (block->next) { + MALLOC_ASSERT( block->next->previous == block, ASSERT_TEXT ); + block->next->previous = block->previous; + } + block->next = NULL; + block->previous = NULL; + + verifyTLSBin(bin, size); +} + +/* + * Return the bin for the given size. If the TLS bin structure is absent, create it. 
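+ * The TLS array comes from bootStrapMalloc and is zero filled, so every bin starts
+ * with empty activeBlk and mailbox; NULL is returned only if that allocation fails.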
+ */ +static Bin* getAllocationBin(size_t size) +{ + Bin* tls = (Bin*)getThreadMallocTLS(); + if( !tls ) { + MALLOC_ASSERT( tlsSize >= sizeof(Bin) * numBlockBins, ASSERT_TEXT ); + tls = (Bin*) bootStrapMalloc(tlsSize); + if ( !tls ) return NULL; + /* the block contains zeroes after bootStrapMalloc, so bins are initialized */ +#if MALLOC_DEBUG + for (int i = 0; i < numBlockBinLimit; i++) { + MALLOC_ASSERT( tls[i].activeBlk == 0, ASSERT_TEXT ); + MALLOC_ASSERT( tls[i].mailbox == 0, ASSERT_TEXT ); + } +#endif + setThreadMallocTLS(tls); + } + MALLOC_ASSERT( tls, ASSERT_TEXT ); + return tls+getIndex(size); +} + +const float emptyEnoughRatio = 1.0 / 4.0; /* "Reactivate" a block if this share of its objects is free. */ + +static unsigned int emptyEnoughToUse (Block *mallocBlock) +{ + const float threshold = (blockSize - sizeof(Block)) * (1-emptyEnoughRatio); + + if (mallocBlock->bumpPtr) { + /* If we are still using a bump ptr for this block it is empty enough to use. */ + STAT_increment(mallocBlock->owner, getIndex(mallocBlock->objectSize), examineEmptyEnough); + mallocBlock->isFull = 0; + return 1; + } + + /* allocatedCount shows how many objects in the block are in use; however it still counts + blocks freed by other threads; so prior call to privatizePublicFreeList() is recommended */ + mallocBlock->isFull = (mallocBlock->allocatedCount*mallocBlock->objectSize > threshold)? 1: 0; +#if COLLECT_STATISTICS + if (mallocBlock->isFull) + STAT_increment(mallocBlock->owner, getIndex(mallocBlock->objectSize), examineNotEmpty); + else + STAT_increment(mallocBlock->owner, getIndex(mallocBlock->objectSize), examineEmptyEnough); +#endif + return 1-mallocBlock->isFull; +} + +/* Restore the bump pointer for an empty block that is planned to use */ +static void restoreBumpPtr (Block *block) +{ + MALLOC_ASSERT( block->allocatedCount == 0, ASSERT_TEXT ); + MALLOC_ASSERT( block->publicFreeList == NULL, ASSERT_TEXT ); + STAT_increment(block->owner, getIndex(block->objectSize), freeRestoreBumpPtr); + block->bumpPtr = (FreeObject *)((uintptr_t)block + blockSize - block->objectSize); + block->freeList = NULL; + block->isFull = 0; +} + +#if !(FREELIST_NONBLOCKING) +static MallocMutex publicFreeListLock; // lock for changes of publicFreeList +#endif + +const uintptr_t UNUSABLE = 0x1; +inline bool isSolidPtr( void* ptr ) +{ + return (UNUSABLE|(uintptr_t)ptr)!=UNUSABLE; +} +inline bool isNotForUse( void* ptr ) +{ + return (uintptr_t)ptr==UNUSABLE; +} + +static void freePublicObject (Block *block, FreeObject *objectToFree) +{ + Bin* theBin; + FreeObject *publicFreeList; + +#if FREELIST_NONBLOCKING + FreeObject *temp = block->publicFreeList; + MALLOC_ITT_SYNC_RELEASING(&block->publicFreeList); + do { + publicFreeList = objectToFree->next = temp; + temp = (FreeObject*)AtomicCompareExchange( + (intptr_t&)block->publicFreeList, + (intptr_t)objectToFree, (intptr_t)publicFreeList ); + // no backoff necessary because trying to make change, not waiting for a change + } while( temp != publicFreeList ); +#else + STAT_increment(getThreadId(), ThreadCommonCounters, lockPublicFreeList); + { + MallocMutex::scoped_lock scoped_cs(publicFreeListLock); + publicFreeList = objectToFree->next = block->publicFreeList; + block->publicFreeList = objectToFree; + } +#endif + + if( publicFreeList==NULL ) { + // if the block is abandoned, its nextPrivatizable pointer should be UNUSABLE + // otherwise, it should point to the bin the block belongs to. 
+ // reading nextPrivatizable is thread-safe below, because: + // 1) the executing thread atomically got publicFreeList==NULL and changed it to non-NULL; + // 2) only owning thread can change it back to NULL, + // 3) but it can not be done until the block is put to the mailbox + // So the executing thread is now the only one that can change nextPrivatizable + if( !isNotForUse(block->nextPrivatizable) ) { + MALLOC_ASSERT( block->nextPrivatizable!=NULL, ASSERT_TEXT ); + MALLOC_ASSERT( block->owner!=0, ASSERT_TEXT ); + theBin = (Bin*) block->nextPrivatizable; + MallocMutex::scoped_lock scoped_cs(theBin->mailLock); + block->nextPrivatizable = theBin->mailbox; + theBin->mailbox = block; + } else { + MALLOC_ASSERT( block->owner==0, ASSERT_TEXT ); + } + } + STAT_increment(getThreadId(), ThreadCommonCounters, freeToOtherThread); + STAT_increment(block->owner, getIndex(block->objectSize), freeByOtherThread); +} + +static void privatizePublicFreeList (Block *mallocBlock) +{ + FreeObject *temp, *publicFreeList; + + MALLOC_ASSERT( mallocBlock->owner == getThreadId(), ASSERT_TEXT ); +#if FREELIST_NONBLOCKING + temp = mallocBlock->publicFreeList; + do { + publicFreeList = temp; + temp = (FreeObject*)AtomicCompareExchange( + (intptr_t&)mallocBlock->publicFreeList, + 0, (intptr_t)publicFreeList); + // no backoff necessary because trying to make change, not waiting for a change + } while( temp != publicFreeList ); + MALLOC_ITT_SYNC_ACQUIRED(&mallocBlock->publicFreeList); +#else + STAT_increment(mallocBlock->owner, ThreadCommonCounters, lockPublicFreeList); + { + MallocMutex::scoped_lock scoped_cs(publicFreeListLock); + publicFreeList = mallocBlock->publicFreeList; + mallocBlock->publicFreeList = NULL; + } + temp = publicFreeList; +#endif + + MALLOC_ASSERT( publicFreeList && publicFreeList==temp, ASSERT_TEXT ); // there should be something in publicFreeList! 
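+ // temp now heads the chain just detached from publicFreeList; unless it is the
+ // UNUSABLE marker, the branch below decrements allocatedCount once per privatized
+ // object and splices the whole chain onto the local freeList.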
+ if( !isNotForUse(temp) ) { // return/getPartialBlock could set it to UNUSABLE + MALLOC_ASSERT( mallocBlock->allocatedCount <= (blockSize-sizeof(Block))/mallocBlock->objectSize, ASSERT_TEXT ); + /* other threads did not change the counter freeing our blocks */ + mallocBlock->allocatedCount--; + while( isSolidPtr(temp->next) ){ // the list will end with either NULL or UNUSABLE + temp = temp->next; + mallocBlock->allocatedCount--; + } + MALLOC_ASSERT( mallocBlock->allocatedCount < (blockSize-sizeof(Block))/mallocBlock->objectSize, ASSERT_TEXT ); + /* merge with local freeList */ + temp->next = mallocBlock->freeList; + mallocBlock->freeList = publicFreeList; + STAT_increment(mallocBlock->owner, getIndex(mallocBlock->objectSize), allocPrivatized); + } +} + +static Block* getPublicFreeListBlock (Bin* bin) +{ + Block* block; + MALLOC_ASSERT( bin, ASSERT_TEXT ); +// the counter should be changed STAT_increment(getThreadId(), ThreadCommonCounters, lockPublicFreeList); + { + MallocMutex::scoped_lock scoped_cs(bin->mailLock); + block = bin->mailbox; + if( block ) { + MALLOC_ASSERT( block->owner == getThreadId(), ASSERT_TEXT ); + MALLOC_ASSERT( !isNotForUse(block->nextPrivatizable), ASSERT_TEXT ); + bin->mailbox = block->nextPrivatizable; + block->nextPrivatizable = (Block*) bin; + } + } + if( block ) { + MALLOC_ASSERT( isSolidPtr(block->publicFreeList), ASSERT_TEXT ); + privatizePublicFreeList(block); + } + return block; +} + +static Block *getPartialBlock(Bin* bin, unsigned int size) +{ + Block *result; + MALLOC_ASSERT( bin, ASSERT_TEXT ); + unsigned int index = getIndex(size); + result = (Block *) globalSizeBins[index].pop(); + if (result) { + MALLOC_ASSERT( result->mallocUniqueID==theMallocUniqueID, ASSERT_TEXT ); + result->next = NULL; + result->previous = NULL; + MALLOC_ASSERT( result->publicFreeList!=NULL, ASSERT_TEXT ); + /* There is not a race here since no other thread owns this block */ + MALLOC_ASSERT( result->owner == 0, ASSERT_TEXT ); + result->owner = getThreadId(); + // It is safe to change nextPrivatizable, as publicFreeList is not null + MALLOC_ASSERT( isNotForUse(result->nextPrivatizable), ASSERT_TEXT ); + result->nextPrivatizable = (Block*)bin; + // the next call is required to change publicFreeList to 0 + privatizePublicFreeList(result); + if( result->allocatedCount ) { + // check its fullness and set result->isFull + emptyEnoughToUse(result); + } else { + restoreBumpPtr(result); + } + MALLOC_ASSERT( !isNotForUse(result->publicFreeList), ASSERT_TEXT ); + STAT_increment(result->owner, index, allocBlockPublic); + } + return result; +} + +static void returnPartialBlock(Bin* bin, Block *block) +{ + unsigned int index = getIndex(block->objectSize); + MALLOC_ASSERT( bin, ASSERT_TEXT ); + MALLOC_ASSERT( block->owner==getThreadId(), ASSERT_TEXT ); + STAT_increment(block->owner, index, freeBlockPublic); + // need to set publicFreeList to non-zero, so other threads + // will not change nextPrivatizable and it can be zeroed. + if ((intptr_t)block->nextPrivatizable==(intptr_t)bin) { + void* oldval; +#if FREELIST_NONBLOCKING + oldval = (void*)AtomicCompareExchange((intptr_t&)block->publicFreeList, (intptr_t)UNUSABLE, 0); +#else + STAT_increment(block->owner, ThreadCommonCounters, lockPublicFreeList); + { + MallocMutex::scoped_lock scoped_cs(publicFreeListLock); + if ( (oldval=block->publicFreeList)==NULL ) + (uintptr_t&)(block->publicFreeList) = UNUSABLE; + } +#endif + if ( oldval!=NULL ) { + // another thread freed an object; we need to wait until it finishes. 
+ // I believe there is no need for exponential backoff, as the wait here is not for a lock; + // but need to yield, so the thread we wait has a chance to run. + int count = 256; + while( (intptr_t)const_cast(block->nextPrivatizable)==(intptr_t)bin ) { + if (--count==0) { + do_yield(); + count = 256; + } + } + } + } else { + MALLOC_ASSERT( isSolidPtr(block->publicFreeList), ASSERT_TEXT ); + } + MALLOC_ASSERT( block->publicFreeList!=NULL, ASSERT_TEXT ); + // now it is safe to change our data + block->previous = NULL; + block->owner = 0; + // it is caller responsibility to ensure that the list of blocks + // formed by nextPrivatizable pointers is kept consistent if required. + // if only called from thread shutdown code, it does not matter. + (uintptr_t&)(block->nextPrivatizable) = UNUSABLE; + globalSizeBins[index].push((void **)block); +} + +static void cleanBlockHeader(Block *block) +{ +#if MALLOC_DEBUG + memset (block, 0x0e5, blockSize); +#endif + block->next = NULL; + block->previous = NULL; + block->freeList = NULL; + block->allocatedCount = 0; + block->isFull = 0; + + block->publicFreeList = NULL; +} + +static void initEmptyBlock(Block *block, size_t size) +{ + // Having getIndex and getObjectSize called next to each other + // allows better compiler optimization as they basically share the code. + unsigned int index = getIndex(size); + unsigned int objectSize = getObjectSize(size); + Bin* tls = (Bin*)getThreadMallocTLS(); + + cleanBlockHeader(block); + block->mallocUniqueID = theMallocUniqueID; + block->objectSize = objectSize; + block->owner = getThreadId(); + // bump pointer should be prepared for first allocation - thus mode it down to objectSize + block->bumpPtr = (FreeObject *)((uintptr_t)block + blockSize - objectSize); + + // each block should have the address where the head of the list of "privatizable" blocks is kept + // the only exception is a block for boot strap which is initialized when TLS is yet NULL + block->nextPrivatizable = tls? (Block*)(tls + index) : NULL; + TRACEF(( "[ScalableMalloc trace] Empty block %p is initialized, owner is %d, objectSize is %d, bumpPtr is %p\n", + block, block->owner, block->objectSize, block->bumpPtr )); + } + +/* Return an empty uninitialized block in a non-blocking fashion. */ +static Block *getRawBlock() +{ + Block *result; + Block *bigBlock; + + result = NULL; + + bigBlock = (Block *) freeBlockList.pop(); + + while (!bigBlock) { + /* We are out of blocks so go to the OS and get another one */ + if (!mallocBigBlock()) { + return NULL; + } + bigBlock = (Block *) freeBlockList.pop(); + } + + // check alignment + MALLOC_ASSERT( isAligned( bigBlock, blockSize ), ASSERT_TEXT ); + MALLOC_ASSERT( isAligned( bigBlock->bumpPtr, blockSize ), ASSERT_TEXT ); + // block should be at least as big as blockSize; otherwise the previous block can be damaged. + MALLOC_ASSERT( (uintptr_t)bigBlock->bumpPtr >= (uintptr_t)bigBlock + blockSize, ASSERT_TEXT ); + bigBlock->bumpPtr = (FreeObject *)((uintptr_t)bigBlock->bumpPtr - blockSize); + result = (Block *)bigBlock->bumpPtr; + if ( result!=bigBlock ) { + TRACEF(( "[ScalableMalloc trace] Pushing partial rest of block back on.\n" )); + freeBlockList.push((void **)bigBlock); + } + return result; +} + +/* Return an empty uninitialized block in a non-blocking fashion. 
*/ +static Block *getEmptyBlock(size_t size) +{ + Block *result = getRawBlock(); + + if (result) { + initEmptyBlock(result, size); + STAT_increment(result->owner, getIndex(result->objectSize), allocBlockNew); + } + + return result; +} + +/* We have a block give it back to the malloc block manager */ +static void returnEmptyBlock (Block *block, bool keepTheBin = true) +{ + // it is caller's responsibility to ensure no data is lost before calling this + MALLOC_ASSERT( block->allocatedCount==0, ASSERT_TEXT ); + MALLOC_ASSERT( block->publicFreeList==NULL, ASSERT_TEXT ); + if (keepTheBin) { + /* We should keep the TLS bin structure */ + MALLOC_ASSERT( block->next == NULL, ASSERT_TEXT ); + MALLOC_ASSERT( block->previous == NULL, ASSERT_TEXT ); + } + STAT_increment(block->owner, getIndex(block->objectSize), freeBlockBack); + + cleanBlockHeader(block); + + block->nextPrivatizable = NULL; + + block->mallocUniqueID=0; + block->objectSize = 0; + block->owner = (unsigned)-1; + // for an empty block, bump pointer should point right after the end of the block + block->bumpPtr = (FreeObject *)((uintptr_t)block + blockSize); + freeBlockList.push((void **)block); +} + +inline static Block* getActiveBlock( Bin* bin ) +{ + MALLOC_ASSERT( bin, ASSERT_TEXT ); + return bin->activeBlk; +} + +inline static void setActiveBlock (Bin* bin, Block *block) +{ + MALLOC_ASSERT( bin, ASSERT_TEXT ); + MALLOC_ASSERT( block->owner == getThreadId(), ASSERT_TEXT ); + // it is the caller responsibility to keep bin consistence (i.e. ensure this block is in the bin list) + bin->activeBlk = block; +} + +inline static Block* setPreviousBlockActive( Bin* bin ) +{ + MALLOC_ASSERT( bin && bin->activeBlk, ASSERT_TEXT ); + Block* temp = bin->activeBlk->previous; + if( temp ) { + MALLOC_ASSERT( temp->isFull == 0, ASSERT_TEXT ); + bin->activeBlk = temp; + } + return temp; +} + +#if MALLOC_CHECK_RECURSION + +/* + * It's a special kind of allocation that can be used when malloc is + * not available (either during startup or when malloc was already called and + * we are, say, inside pthread_setspecific's call). + * Block can contain objects of different sizes, + * allocations are performed by moving bump pointer and increasing of object counter, + * releasing is done via counter of objects allocated in the block + * or moving bump pointer if releasing object is on a bound. + */ + +struct StartupBlock : public Block { + size_t availableSize() { + return blockSize - ((uintptr_t)bumpPtr - (uintptr_t)this); + } +}; + +static MallocMutex startupMallocLock; +static StartupBlock *firstStartupBlock; + +static StartupBlock *getNewStartupBlock() +{ + StartupBlock *block = (StartupBlock *)getRawBlock(); + + if (!block) return NULL; + + cleanBlockHeader(block); + block->mallocUniqueID = theMallocUniqueID; + // use startupAllocObjSizeMark to mark objects from startup block marker + block->objectSize = startupAllocObjSizeMark; + block->bumpPtr = (FreeObject *)((uintptr_t)block + sizeof(StartupBlock)); + return block; +} + +/* TODO: Function is called when malloc nested call is detected, so simultaneous + usage from different threads are unprobable, so block pre-allocation + can be not useful, and the code might be simplified. */ +static FreeObject *startupAlloc(size_t size) +{ + FreeObject *result; + StartupBlock *newBlock = NULL; + bool newBlockUnused = false; + + /* Objects must be aligned on their natural bounds, + and objects bigger than word on word's bound. */ + size = alignUp(size, sizeof(size_t)); + // We need size of an object to implement msize. 
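+ // The size is stored in a size_t slot directly before the pointer returned to
+ // the caller; startupMsize() and startupFree() read it back via ((size_t*)ptr - 1).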
+ size_t reqSize = size + sizeof(size_t); + // speculatively allocates newBlock to later use or return it as unused + if (!firstStartupBlock || firstStartupBlock->availableSize() < reqSize) + if (!(newBlock = getNewStartupBlock())) + return NULL; + + { + MallocMutex::scoped_lock scoped_cs(startupMallocLock); + + if (!firstStartupBlock || firstStartupBlock->availableSize() < reqSize) { + if (!newBlock && !(newBlock = getNewStartupBlock())) + return NULL; + newBlock->next = (Block*)firstStartupBlock; + if (firstStartupBlock) + firstStartupBlock->previous = (Block*)newBlock; + firstStartupBlock = newBlock; + } else + newBlockUnused = true; + result = firstStartupBlock->bumpPtr; + firstStartupBlock->allocatedCount++; + firstStartupBlock->bumpPtr = + (FreeObject *)((uintptr_t)firstStartupBlock->bumpPtr + reqSize); + } + if (newBlock && newBlockUnused) + returnEmptyBlock(newBlock); + + // keep object size at the negative offset + *((size_t*)result) = size; + return (FreeObject*)((size_t*)result+1); +} + +static size_t startupMsize(void *ptr) { return *((size_t*)ptr - 1); } + +static void startupFree(StartupBlock *block, void *ptr) +{ + Block* blockToRelease = NULL; + { + MallocMutex::scoped_lock scoped_cs(startupMallocLock); + + MALLOC_ASSERT(firstStartupBlock, ASSERT_TEXT); + MALLOC_ASSERT(startupAllocObjSizeMark==block->objectSize + && block->allocatedCount>0, ASSERT_TEXT); + MALLOC_ASSERT((uintptr_t)ptr>=(uintptr_t)block+sizeof(StartupBlock) + && (uintptr_t)ptr+startupMsize(ptr)<=(uintptr_t)block+blockSize, + ASSERT_TEXT); + if (0 == --block->allocatedCount) { + if (block == firstStartupBlock) + firstStartupBlock = (StartupBlock*)firstStartupBlock->next; + if (block->previous) + block->previous->next = block->next; + if (block->next) + block->next->previous = block->previous; + blockToRelease = block; + } else if ((uintptr_t)ptr + startupMsize(ptr) == (uintptr_t)block->bumpPtr) { + // last object in the block released + FreeObject *newBump = (FreeObject*)((size_t*)ptr - 1); + MALLOC_ASSERT((uintptr_t)newBump>(uintptr_t)block+sizeof(StartupBlock), + ASSERT_TEXT); + block->bumpPtr = newBump; + } + } + if (blockToRelease) { + blockToRelease->previous = blockToRelease->next = NULL; + returnEmptyBlock(blockToRelease); + } +} + +#endif /* MALLOC_CHECK_RECURSION */ + +/********* End thread related code *************/ + +/********* Library initialization *************/ + +//! Value indicating the state of initialization. +/* 0 = initialization not started. + * 1 = initialization started but not finished. + * 2 = initialization finished. + * In theory, we only need values 0 and 2. But value 1 is nonetheless + * useful for detecting errors in the double-check pattern. + */ +static int mallocInitialized; // implicitly initialized to 0 +static MallocMutex initAndShutMutex; + +inline bool isMallocInitialized() { return 2 == mallocInitialized; } + +/* + * Allocator initialization routine; + * it is called lazily on the very first scalable_malloc call. 
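+ * (checkInitialization() below wraps it in a double-checked locking pattern
+ * guarded by initAndShutMutex.)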
+ */ +static void initMemoryManager() +{ + TRACEF(( "[ScalableMalloc trace] sizeof(Block) is %d (expected 128); sizeof(uintptr_t) is %d\n", + sizeof(Block), sizeof(uintptr_t) )); + MALLOC_ASSERT( 2*blockHeaderAlignment == sizeof(Block), ASSERT_TEXT ); + +// Create keys for thread-local storage and for thread id +// TODO: add error handling, and on error do something better than exit(1) +#if USE_WINTHREAD + TLS_pointer_key = TlsAlloc(); + Tid_key = TlsAlloc(); +#else + int status1 = pthread_key_create( &TLS_pointer_key, mallocThreadShutdownNotification ); + int status2 = pthread_key_create( &Tid_key, NULL ); + if ( status1 || status2 ) { + fprintf (stderr, "The memory manager cannot create tls key during initialization; exiting \n"); + exit(1); + } +#endif /* USE_WINTHREAD */ +#if COLLECT_STATISTICS + initStatisticsCollection(); +#endif + + TRACEF(( "[ScalableMalloc trace] Asking for a mallocBigBlock\n" )); + if (!mallocBigBlock()) { + fprintf (stderr, "The memory manager cannot access sufficient memory to initialize; exiting \n"); + exit(1); + } +} + +//! Ensures that initMemoryManager() is called once and only once. +/** Does not return until initMemoryManager() has been completed by a thread. + There is no need to call this routine if mallocInitialized==2 . */ +static void checkInitialization() +{ + if (mallocInitialized==2) return; + MallocMutex::scoped_lock lock( initAndShutMutex ); + if (mallocInitialized!=2) { + MALLOC_ASSERT( mallocInitialized==0, ASSERT_TEXT ); + mallocInitialized = 1; + RecursiveMallocCallProtector scoped; + initMemoryManager(); +#ifdef MALLOC_EXTRA_INITIALIZATION + MALLOC_EXTRA_INITIALIZATION; +#endif +#if MALLOC_CHECK_RECURSION + RecursiveMallocCallProtector::detectNaiveOverload(); +#endif + MALLOC_ASSERT( mallocInitialized==1, ASSERT_TEXT ); + mallocInitialized = 2; + } + MALLOC_ASSERT( mallocInitialized==2, ASSERT_TEXT ); /* It can't be 0 or I would have initialized it */ +} + +/********* End library initialization *************/ + +/********* The malloc show begins *************/ + + +/********* Allocation of large objects ************/ + +/* + * The program wants a large object that we are not prepared to deal with. + * so we pass the problem on to the OS. Large Objects are the only objects in + * the system that begin on a 16K byte boundary since the blocks used for smaller + * objects have the Block structure at each 16K boundary. 
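+ * A LargeObjectHeader sits in the bytes immediately preceding the aligned address
+ * returned to the caller, and isLargeObject() recognizes such objects purely by
+ * their blockSize alignment.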
+ * + */ + +struct LargeObjectHeader { + void *unalignedResult; /* The base of the memory returned from getMemory, this is what is used to return this to the OS */ + size_t unalignedSize; /* The size that was requested from getMemory */ + uint64_t mallocUniqueID; /* The field to check whether the memory was allocated by scalable_malloc */ + size_t objectSize; /* The size originally requested by a client */ + bool fromMapMemory; /* Memory allocated when MapMemory usage is forced */ +}; + +void CachedObjectsList::push(void *buf, bool fromMapMemory, uintptr_t currAge) +{ + CachedObject *ptr = (CachedObject*)buf; + ptr->prev = NULL; + ptr->age = currAge; + ptr->fromMapMemory = fromMapMemory; + + MallocMutex::scoped_lock scoped_cs(lock); + ptr->next = first; + first = ptr; + if (ptr->next) ptr->next->prev = ptr; + if (!last) { + MALLOC_ASSERT(0 == oldest, ASSERT_TEXT); + oldest = currAge; + last = ptr; + } +} + +CachedObject *CachedObjectsList::pop(uintptr_t currAge) +{ + CachedObject *result=NULL; + { + MallocMutex::scoped_lock scoped_cs(lock); + if (first) { + result = first; + first = result->next; + if (first) + first->prev = NULL; + else { + last = NULL; + oldest = 0; + } + } else { + /* If cache miss occured, set ageThreshold to twice the difference + between current time and last time cache was cleaned. */ + ageThreshold = 2*(currAge - lastCleanedAge); + } + } + return result; +} + +void CachedObjectsList::releaseLastIfOld(uintptr_t currAge, size_t size) +{ + CachedObject *toRelease = NULL; + + /* oldest may be more recent then age, that's why cast to signed type + was used. age overflow is also processed correctly. */ + if (last && (intptr_t)(currAge - oldest) > ageThreshold) { + MallocMutex::scoped_lock scoped_cs(lock); + // double check + if (last && (intptr_t)(currAge - last->age) > ageThreshold) { + do { + last = last->prev; + } while (last && (intptr_t)(currAge - last->age) > ageThreshold); + if (last) { + toRelease = last->next; + oldest = last->age; + last->next = NULL; + } else { + toRelease = first; + first = NULL; + oldest = 0; + } + MALLOC_ASSERT( toRelease, ASSERT_TEXT ); + lastCleanedAge = toRelease->age; + } + else + return; + } + while ( toRelease ) { + CachedObject *helper = toRelease->next; + freeRawMemory(toRelease, size, toRelease->fromMapMemory); + toRelease = helper; + } +} + +/* A predicate checks whether an object starts on blockSize boundary */ +static inline unsigned int isLargeObject(void *object) +{ + return isAligned(object, blockSize); +} + +static uintptr_t cleanupCacheIfNeed () +{ + /* loCacheStat.age overflow is OK, as we only want difference between + * its current value and some recent. + * + * Both malloc and free should increment loCacheStat.age, as in + * a different case mulitiple cache object would have same age, + * and accuracy of predictors suffers. + */ + uintptr_t currAge = (uintptr_t)AtomicIncrement((intptr_t&)loCacheStat.age); + + if ( 0 == currAge % cacheCleanupFreq ) { + size_t objSize; + int i; + + for (i = numLargeObjectBins-1, + objSize = (numLargeObjectBins-1)*largeObjectCacheStep+blockSize; + i >= 0; + i--, objSize-=largeObjectCacheStep) { + /* cached object size on iteration is + * i*largeObjectCacheStep+blockSize, it seems iterative + * computation of it improves performance. 
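+ * For the largest bin (i == numLargeObjectBins-1) this gives 1023*8K + 16K,
+ * i.e. about 8 MB, matching the "near 8MB" note where numLargeObjectBins is defined.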
+ */ + // release from cache objects that are older then ageThreshold + globalCachedObjectBins[i].releaseLastIfOld(currAge, objSize); + } + } + return currAge; +} + +static CachedObject* allocateCachedLargeObject (size_t size) +{ + MALLOC_ASSERT( size%largeObjectCacheStep==0, ASSERT_TEXT ); + CachedObject *block = NULL; + // blockSize is the minimal alignment and thus the minimal size of a large object. + size_t idx = (size-blockSize)/largeObjectCacheStep; + if (idxfromMapMemory; + unalignedArea = cachedObj; + } else { + unalignedArea = getRawMemory(allocationSize); + if (!unalignedArea) + return NULL; + STAT_increment(getThreadId(), ThreadCommonCounters, allocNewLargeObj); + } + } + void *alignedArea = (void*)alignUp((uintptr_t)unalignedArea+sizeof(LargeObjectHeader), alignment); + LargeObjectHeader *header = (LargeObjectHeader*)((uintptr_t)alignedArea-sizeof(LargeObjectHeader)); + header->unalignedResult = unalignedArea; + header->mallocUniqueID=theMallocUniqueID; + header->unalignedSize = allocationSize; + header->objectSize = size; + header->fromMapMemory = startupAlloc || blockFromMapMemory; + MALLOC_ASSERT( isLargeObject(alignedArea), ASSERT_TEXT ); + return alignedArea; +} + +static bool freeLargeObjectToCache (LargeObjectHeader* header) +{ + size_t size = header->unalignedSize; + size_t idx = (size-blockSize)/largeObjectCacheStep; + if (idxunalignedResult, + header->fromMapMemory, currAge); + + STAT_increment(getThreadId(), ThreadCommonCounters, cacheLargeObj); + return true; + } + return false; +} + +static inline void freeLargeObject (void *object) +{ + LargeObjectHeader *header; + header = (LargeObjectHeader *)((uintptr_t)object - sizeof(LargeObjectHeader)); + header->mallocUniqueID = 0; + if (!freeLargeObjectToCache(header)) { + freeRawMemory(header->unalignedResult, header->unalignedSize, + /*alwaysUseMap=*/ header->fromMapMemory); + STAT_increment(getThreadId(), ThreadCommonCounters, freeLargeObj); + } +} + +/*********** End allocation of large objects **********/ + + +static FreeObject *allocateFromFreeList(Block *mallocBlock) +{ + FreeObject *result; + + if (!mallocBlock->freeList) { + return NULL; + } + + result = mallocBlock->freeList; + MALLOC_ASSERT( result, ASSERT_TEXT ); + + mallocBlock->freeList = result->next; + MALLOC_ASSERT( mallocBlock->allocatedCount < (blockSize-sizeof(Block))/mallocBlock->objectSize, ASSERT_TEXT ); + mallocBlock->allocatedCount++; + STAT_increment(mallocBlock->owner, getIndex(mallocBlock->objectSize), allocFreeListUsed); + + return result; +} + +static FreeObject *allocateFromBumpPtr(Block *mallocBlock) +{ + FreeObject *result = mallocBlock->bumpPtr; + if (result) { + mallocBlock->bumpPtr = + (FreeObject *) ((uintptr_t) mallocBlock->bumpPtr - mallocBlock->objectSize); + if ( (uintptr_t)mallocBlock->bumpPtr < (uintptr_t)mallocBlock+sizeof(Block) ) { + mallocBlock->bumpPtr = NULL; + } + MALLOC_ASSERT( mallocBlock->allocatedCount < (blockSize-sizeof(Block))/mallocBlock->objectSize, ASSERT_TEXT ); + mallocBlock->allocatedCount++; + STAT_increment(mallocBlock->owner, getIndex(mallocBlock->objectSize), allocBumpPtrUsed); + } + return result; +} + +inline static FreeObject* allocateFromBlock( Block *mallocBlock ) +{ + FreeObject *result; + + MALLOC_ASSERT( mallocBlock->owner == getThreadId(), ASSERT_TEXT ); + + /* for better cache locality, first looking in the free list. 
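+ * Then the bump pointer area is tried; when both are exhausted the block is marked full.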
*/ + if ( (result = allocateFromFreeList(mallocBlock)) ) { + return result; + } + MALLOC_ASSERT( !mallocBlock->freeList, ASSERT_TEXT ); + + /* if free list is empty, try thread local bump pointer allocation. */ + if ( (result = allocateFromBumpPtr(mallocBlock)) ) { + return result; + } + MALLOC_ASSERT( !mallocBlock->bumpPtr, ASSERT_TEXT ); + + /* the block is considered full. */ + mallocBlock->isFull = 1; + return NULL; +} + +static void moveBlockToBinFront(Block *block) +{ + Bin* bin = getAllocationBin(block->objectSize); + /* move the block to the front of the bin */ + outofTLSBin(bin, block); + pushTLSBin(bin, block); +} + +static void processLessUsedBlock(Block *block) +{ + Bin* bin = getAllocationBin(block->objectSize); + if (block != getActiveBlock(bin) && block != getActiveBlock(bin)->previous ) { + /* We are not actively using this block; return it to the general block pool */ + outofTLSBin(bin, block); + returnEmptyBlock(block); + } else { + /* all objects are free - let's restore the bump pointer */ + restoreBumpPtr(block); + } +} + +/* + * All aligned allocations fall into one of the following categories: + * 1. if both request size and alignment are <= maxSegregatedObjectSize, + * we just align the size up, and request this amount, because for every size + * aligned to some power of 2, the allocated object is at least that aligned. + * 2. for bigger size, check if already guaranteed fittingAlignment is enough. + * 3. if size+alignmentalignment? blockSize: alignment); + } + + MALLOC_ASSERT( isAligned(result, alignment), ASSERT_TEXT ); + return result; +} + +static void *reallocAligned(void *ptr, size_t size, size_t alignment = 0) +{ + void *result; + size_t copySize; + + if (isLargeObject(ptr)) { + LargeObjectHeader* loh = (LargeObjectHeader *)((uintptr_t)ptr - sizeof(LargeObjectHeader)); + MALLOC_ASSERT( loh->mallocUniqueID==theMallocUniqueID, ASSERT_TEXT ); + copySize = loh->unalignedSize-((uintptr_t)ptr-(uintptr_t)loh->unalignedResult); + if (size <= copySize && (0==alignment || isAligned(ptr, alignment))) { + loh->objectSize = size; + return ptr; + } else { + copySize = loh->objectSize; + result = alignment ? allocateAligned(size, alignment) : scalable_malloc(size); + } + } else { + Block* block = (Block *)alignDown(ptr, blockSize); + MALLOC_ASSERT( block->mallocUniqueID==theMallocUniqueID, ASSERT_TEXT ); + copySize = block->objectSize; + if (size <= copySize && (0==alignment || isAligned(ptr, alignment))) { + return ptr; + } else { + result = alignment ? allocateAligned(size, alignment) : scalable_malloc(size); + } + } + if (result) { + memcpy(result, ptr, copySizeobjectSize; +} + +/* Finds the real object inside the block */ +static inline FreeObject *findAllocatedObject(const void *address, const Block *block) +{ + // calculate offset from the end of the block space + uintptr_t offset = (uintptr_t)block + blockSize - (uintptr_t)address; + MALLOC_ASSERT( offsetobjectSize; + // and move the address down to where the real object starts. + return (FreeObject*)((uintptr_t)address - (offset? block->objectSize-offset: 0)); +} + +} // namespace internal +} // namespace rml + +using namespace rml::internal; + +/* + * When a thread is shutting down this routine should be called to remove all the thread ids + * from the malloc blocks and replace them with a NULL thread id. + * + */ +#if MALLOC_TRACE +static unsigned int threadGoingDownCount = 0; +#endif + +/* + * for pthreads, the function is set as a callback in pthread_key_create for TLS bin. 
+ * it will be automatically called at thread exit with the key value as the argument. + * + * for Windows, it should be called directly e.g. from DllMain; the argument can be NULL + * one should include "TypeDefinitions.h" for the declaration of this function. +*/ +extern "C" void mallocThreadShutdownNotification(void* arg) +{ + Bin *tls; + Block *threadBlock; + Block *threadlessBlock; + unsigned int index; + + { + MallocMutex::scoped_lock lock( initAndShutMutex ); + if ( mallocInitialized == 0 ) return; + } + + TRACEF(( "[ScalableMalloc trace] Thread id %d blocks return start %d\n", + getThreadId(), threadGoingDownCount++ )); +#ifdef USE_WINTHREAD + tls = (Bin*)getThreadMallocTLS(); +#else + tls = (Bin*)arg; +#endif + if (tls) { + for (index = 0; index < numBlockBins; index++) { + if (tls[index].activeBlk==NULL) + continue; + threadlessBlock = tls[index].activeBlk->previous; + while (threadlessBlock) { + threadBlock = threadlessBlock->previous; + if (threadlessBlock->allocatedCount==0 && threadlessBlock->publicFreeList==NULL) { + /* we destroy the thread, no need to keep its TLS bin -> the second param is false */ + returnEmptyBlock(threadlessBlock, false); + } else { + returnPartialBlock(tls+index, threadlessBlock); + } + threadlessBlock = threadBlock; + } + threadlessBlock = tls[index].activeBlk; + while (threadlessBlock) { + threadBlock = threadlessBlock->next; + if (threadlessBlock->allocatedCount==0 && threadlessBlock->publicFreeList==NULL) { + /* we destroy the thread, no need to keep its TLS bin -> the second param is false */ + returnEmptyBlock(threadlessBlock, false); + } else { + returnPartialBlock(tls+index, threadlessBlock); + } + threadlessBlock = threadBlock; + } + tls[index].activeBlk = 0; + } + bootStrapFree((void*)tls); + setThreadMallocTLS(NULL); + } + + TRACEF(( "[ScalableMalloc trace] Thread id %d blocks return end\n", getThreadId() )); +} + +extern "C" void mallocProcessShutdownNotification(void) +{ + // for now, this function is only necessary for dumping statistics +#if COLLECT_STATISTICS + ThreadId nThreads = ThreadIdCount; + for( int i=1; i<=nThreads && imallocUniqueID + : &((Block *)alignDown(ptr, blockSize))->mallocUniqueID + ); + return id == theMallocUniqueID; +} + +/********* The malloc code *************/ + +extern "C" void * scalable_malloc(size_t size) +{ + Bin* bin; + Block * mallocBlock; + FreeObject *result; + + if (!size) size = sizeof(size_t); + +#if MALLOC_CHECK_RECURSION + if (RecursiveMallocCallProtector::sameThreadActive()) { + result = size= minLargeObjectSize) { + result = (FreeObject*)mallocLargeObject(size, blockSize); + if (!result) errno = ENOMEM; + return result; + } + + /* + * Get an element in thread-local array corresponding to the given size; + * It keeps ptr to the active block for allocations of this size + */ + bin = getAllocationBin(size); + if ( !bin ) { + errno = ENOMEM; + return NULL; + } + + /* Get the block of you want to try to allocate in. 
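+ * The fallback order below is: the active block and its predecessors, a block with
+ * publicly freed objects from the bin mailbox, a partial block abandoned by another
+ * thread, and finally a brand new empty block; if all of these fail, NULL is
+ * returned with errno set to ENOMEM.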
*/ + mallocBlock = getActiveBlock(bin); + + if (mallocBlock) { + do { + if( (result = allocateFromBlock(mallocBlock)) ) { + return result; + } + // the previous block, if any, should be empty enough + } while( (mallocBlock = setPreviousBlockActive(bin)) ); + } + MALLOC_ASSERT( !(bin->activeBlk) || bin->activeBlk->isFull==1, ASSERT_TEXT ); + + /* + * else privatize publicly freed objects in some block and allocate from it + */ + mallocBlock = getPublicFreeListBlock( bin ); + if (mallocBlock) { + if (emptyEnoughToUse(mallocBlock)) { + /* move the block to the front of the bin */ + outofTLSBin(bin, mallocBlock); + pushTLSBin(bin, mallocBlock); + } + MALLOC_ASSERT( mallocBlock->freeList, ASSERT_TEXT ); + if ( (result = allocateFromFreeList(mallocBlock)) ) { + return result; + } + /* Else something strange happened, need to retry from the beginning; */ + TRACEF(( "[ScalableMalloc trace] Something is wrong: no objects in public free list; reentering.\n" )); + return scalable_malloc(size); + } + + /* + * no suitable own blocks, try to get a partial block that some other thread has discarded. + */ + mallocBlock = getPartialBlock(bin, size); + while (mallocBlock) { + pushTLSBin(bin, mallocBlock); +// guaranteed by pushTLSBin: MALLOC_ASSERT( *bin==mallocBlock || (*bin)->previous==mallocBlock, ASSERT_TEXT ); + setActiveBlock(bin, mallocBlock); + if( (result = allocateFromBlock(mallocBlock)) ) { + return result; + } + mallocBlock = getPartialBlock(bin, size); + } + + /* + * else try to get a new empty block + */ + mallocBlock = getEmptyBlock(size); + if (mallocBlock) { + pushTLSBin(bin, mallocBlock); +// guaranteed by pushTLSBin: MALLOC_ASSERT( *bin==mallocBlock || (*bin)->previous==mallocBlock, ASSERT_TEXT ); + setActiveBlock(bin, mallocBlock); + if( (result = allocateFromBlock(mallocBlock)) ) { + return result; + } + /* Else something strange happened, need to retry from the beginning; */ + TRACEF(( "[ScalableMalloc trace] Something is wrong: no objects in empty block; reentering.\n" )); + return scalable_malloc(size); + } + /* + * else nothing works so return NULL + */ + TRACEF(( "[ScalableMalloc trace] No memory found, returning NULL.\n" )); + errno = ENOMEM; + return NULL; +} + +/********* End the malloc code *************/ + +/********* The free code *************/ + +extern "C" void scalable_free (void *object) { + Block *block; + ThreadId myTid; + FreeObject *objectToFree; + + if (!object) { + return; + } + + if (isLargeObject(object)) { + freeLargeObject(object); + return; + } + + block = (Block *)alignDown(object, blockSize);/* mask low bits to get the block */ + MALLOC_ASSERT( block->mallocUniqueID == theMallocUniqueID, ASSERT_TEXT ); + MALLOC_ASSERT( block->allocatedCount, ASSERT_TEXT ); + +#if MALLOC_CHECK_RECURSION + if (block->objectSize == startupAllocObjSizeMark) { + startupFree((StartupBlock *)block, object); + return; + } +#endif + + myTid = getThreadId(); + + // Due to aligned allocations, a pointer passed to scalable_free + // might differ from the address of internally allocated object. + // Small objects however should always be fine. + if (block->objectSize <= maxSegregatedObjectSize) + objectToFree = (FreeObject*)object; + // "Fitting size" allocations are suspicious if aligned higher than naturally + else { + if ( ! isAligned(object,2*fittingAlignment) ) + // TODO: the above check is questionable - it gives false negatives in ~50% cases, + // so might even be slower in average than unconditional use of findAllocatedObject. 
+ // here it should be a "real" object + objectToFree = (FreeObject*)object; + else + // here object can be an aligned address, so applying additional checks + objectToFree = findAllocatedObject(object, block); + MALLOC_ASSERT( isAligned(objectToFree,fittingAlignment), ASSERT_TEXT ); + } + MALLOC_ASSERT( isProperlyPlaced(objectToFree, block), ASSERT_TEXT ); + + if (myTid == block->owner) { + objectToFree->next = block->freeList; + block->freeList = objectToFree; + block->allocatedCount--; + MALLOC_ASSERT( block->allocatedCount < (blockSize-sizeof(Block))/block->objectSize, ASSERT_TEXT ); +#if COLLECT_STATISTICS + if (getActiveBlock(getAllocationBin(block->objectSize)) != block) + STAT_increment(myTid, getIndex(block->objectSize), freeToInactiveBlock); + else + STAT_increment(myTid, getIndex(block->objectSize), freeToActiveBlock); +#endif + if (block->isFull) { + if (emptyEnoughToUse(block)) + moveBlockToBinFront(block); + } else { + if (block->allocatedCount==0 && block->publicFreeList==NULL) + processLessUsedBlock(block); + } + } else { /* Slower path to add to the shared list, the allocatedCount is updated by the owner thread in malloc. */ + freePublicObject (block, objectToFree); + } +} + +/* + * A variant that provides additional memory safety, by checking whether the given address + * was obtained with this allocator, and if not redirecting to the provided alternative call. + */ +extern "C" void safer_scalable_free (void *object, void (*original_free)(void*)) +{ + if (!object) + return; + + if (isRecognized(object)) + scalable_free(object); + else if (original_free) + original_free(object); +} + +/********* End the free code *************/ + +/********* Code for scalable_realloc ***********/ + +/* + * From K&R + * "realloc changes the size of the object pointed to by p to size. The contents will + * be unchanged up to the minimum of the old and the new sizes. If the new size is larger, + * the new space is uninitialized. realloc returns a pointer to the new space, or + * NULL if the request cannot be satisfied, in which case *p is unchanged." + * + */ +extern "C" void* scalable_realloc(void* ptr, size_t size) +{ + /* corner cases left out of reallocAligned to not deal with errno there */ + if (!ptr) { + return scalable_malloc(size); + } + if (!size) { + scalable_free(ptr); + return NULL; + } + void* tmp = reallocAligned(ptr, size, 0); + if (!tmp) errno = ENOMEM; + return tmp; +} + +/* + * A variant that provides additional memory safety, by checking whether the given address + * was obtained with this allocator, and if not redirecting to the provided alternative call. 
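+ * On Windows the alternative is passed as a struct of function pointers; a foreign
+ * pointer is then resized via orig_msize, scalable_malloc and memcpy. Elsewhere the
+ * original realloc is simply called for foreign pointers.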
+ */ +extern "C" void* safer_scalable_realloc (void* ptr, size_t sz, void* original_realloc) +{ + if (!ptr) { + return scalable_malloc(sz); + } + if (isRecognized(ptr)) { + if (!sz) { + scalable_free(ptr); + return NULL; + } + void* tmp = reallocAligned(ptr, sz, 0); + if (!tmp) errno = ENOMEM; + return tmp; + } +#if USE_WINTHREAD + else if (original_realloc && sz) { + orig_ptrs *original_ptrs = static_cast(original_realloc); + if ( original_ptrs->orig_msize ){ + size_t oldSize = original_ptrs->orig_msize(ptr); + void *newBuf = scalable_malloc(sz); + if (newBuf) { + memcpy(newBuf, ptr, szorig_free ){ + original_ptrs->orig_free( ptr ); + } + } + return newBuf; + } + } +#else + else if (original_realloc) { + typedef void* (*realloc_ptr_t)(void*,size_t); + realloc_ptr_t original_realloc_ptr; + (void *&)original_realloc_ptr = original_realloc; + return original_realloc_ptr(ptr,sz); + } +#endif + return NULL; +} + +/********* End code for scalable_realloc ***********/ + +/********* Code for scalable_calloc ***********/ + +/* + * From K&R + * calloc returns a pointer to space for an array of nobj objects, + * each of size size, or NULL if the request cannot be satisfied. + * The space is initialized to zero bytes. + * + */ + +extern "C" void * scalable_calloc(size_t nobj, size_t size) +{ + size_t arraySize = nobj * size; + void* result = scalable_malloc(arraySize); + if (result) + memset(result, 0, arraySize); + return result; +} + +/********* End code for scalable_calloc ***********/ + +/********* Code for aligned allocation API **********/ + +extern "C" int scalable_posix_memalign(void **memptr, size_t alignment, size_t size) +{ + if ( !isPowerOfTwoMultiple(alignment, sizeof(void*)) ) + return EINVAL; + void *result = allocateAligned(size, alignment); + if (!result) + return ENOMEM; + *memptr = result; + return 0; +} + +extern "C" void * scalable_aligned_malloc(size_t size, size_t alignment) +{ + if (!isPowerOfTwo(alignment) || 0==size) { + errno = EINVAL; + return NULL; + } + void* tmp = allocateAligned(size, alignment); + if (!tmp) + errno = ENOMEM; + return tmp; +} + +extern "C" void * scalable_aligned_realloc(void *ptr, size_t size, size_t alignment) +{ + /* corner cases left out of reallocAligned to not deal with errno there */ + if (!isPowerOfTwo(alignment)) { + errno = EINVAL; + return NULL; + } + if (!ptr) { + return allocateAligned(size, alignment); + } + if (!size) { + scalable_free(ptr); + return NULL; + } + + void* tmp = reallocAligned(ptr, size, alignment); + if (!tmp) errno = ENOMEM; + return tmp; +} + +extern "C" void * safer_scalable_aligned_realloc(void *ptr, size_t size, size_t alignment, void* orig_function) +{ + /* corner cases left out of reallocAligned to not deal with errno there */ + if (!isPowerOfTwo(alignment)) { + errno = EINVAL; + return NULL; + } + if (!ptr) { + return allocateAligned(size, alignment); + } + if (isRecognized(ptr)) { + if (!size) { + scalable_free(ptr); + return NULL; + } + void* tmp = reallocAligned(ptr, size, alignment); + if (!tmp) errno = ENOMEM; + return tmp; + } +#if USE_WINTHREAD + else { + orig_ptrs *original_ptrs = static_cast(orig_function); + if (size) { + if ( original_ptrs->orig_msize ){ + size_t oldSize = original_ptrs->orig_msize(ptr); + void *newBuf = allocateAligned(size, alignment); + if (newBuf) { + memcpy(newBuf, ptr, sizeorig_free ){ + original_ptrs->orig_free( ptr ); + } + } + return newBuf; + }else{ + //We can't do anything with this. 
Just keeping old pointer + return NULL; + } + } else { + if ( original_ptrs->orig_free ){ + original_ptrs->orig_free( ptr ); + } + return NULL; + } + } +#endif + return NULL; +} + +extern "C" void scalable_aligned_free(void *ptr) +{ + scalable_free(ptr); +} + +/********* end code for aligned allocation API **********/ + +/********* Code for scalable_msize ***********/ + +/* + * Returns the size of a memory block allocated in the heap. + */ +extern "C" size_t scalable_msize(void* ptr) +{ + if (ptr) { + if (isLargeObject(ptr)) { + LargeObjectHeader* loh = (LargeObjectHeader*)((uintptr_t)ptr - sizeof(LargeObjectHeader)); + if (loh->mallocUniqueID==theMallocUniqueID) + return loh->unalignedSize-((uintptr_t)ptr-(uintptr_t)loh->unalignedResult); + } else { + Block* block = (Block *)alignDown(ptr, blockSize); + if (block->mallocUniqueID==theMallocUniqueID) { +#if MALLOC_CHECK_RECURSION + size_t size = block->objectSize? block->objectSize : startupMsize(ptr); +#else + size_t size = block->objectSize; +#endif + MALLOC_ASSERT(size>0 && size // size_t +#if _MSC_VER +typedef unsigned __int32 uint32_t; +typedef unsigned __int64 uint64_t; +#else +#include +#endif + +namespace rml { +namespace internal { + +extern bool original_malloc_found; +extern void* (*original_malloc_ptr)(size_t); +extern void (*original_free_ptr)(void*); + +} } // namespaces + +//! PROVIDE YOUR OWN Customize.h IF YOU FEEL NECESSARY +#include "Customize.h" + +/* + * Functions to align an integer down or up to the given power of two, + * and test for such an alignment, and for power of two. + */ +template +static inline T alignDown(T arg, uintptr_t alignment) { + return T( (uintptr_t)arg & ~(alignment-1)); +} +template +static inline T alignUp (T arg, uintptr_t alignment) { + return T(((uintptr_t)arg+(alignment-1)) & ~(alignment-1)); + // /*is this better?*/ return (((uintptr_t)arg-1) | (alignment-1)) + 1; +} +template +static inline bool isAligned(T arg, uintptr_t alignment) { + return 0==((uintptr_t)arg & (alignment-1)); +} +static inline bool isPowerOfTwo(uintptr_t arg) { + return arg && (0==(arg & (arg-1))); +} +static inline bool isPowerOfTwoMultiple(uintptr_t arg, uintptr_t divisor) { + // Divisor is assumed to be a power of two (which is valid for current uses). + MALLOC_ASSERT( isPowerOfTwo(divisor), "Divisor should be a power of two" ); + return arg && (0==(arg & (arg-divisor))); +} + +#endif /* _itt_shared_malloc_TypeDefinitions_H_ */ diff --git a/dep/tbb/src/tbbmalloc/lin-tbbmalloc-export.def b/dep/tbb/src/tbbmalloc/lin-tbbmalloc-export.def new file mode 100644 index 000000000..49a4590a9 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/lin-tbbmalloc-export.def @@ -0,0 +1,70 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +{ +global: + +scalable_calloc; +scalable_free; +scalable_malloc; +scalable_realloc; +scalable_posix_memalign; +scalable_aligned_malloc; +scalable_aligned_realloc; +scalable_aligned_free; +__TBB_internal_calloc; +__TBB_internal_free; +__TBB_internal_malloc; +__TBB_internal_realloc; +__TBB_internal_posix_memalign; +scalable_msize; + +local: + +/* TBB symbols */ +*3rml8internal*; +*3tbb*; +*__TBB*; +__itt_*; +ITT_DoOneTimeInitialization; +TBB_runtime_interface_version; + +/* Intel Compiler (libirc) symbols */ +__intel_*; +_intel_*; +get_memcpy_largest_cachelinesize; +get_memcpy_largest_cache_size; +get_mem_ops_method; +init_mem_ops_method; +irc__get_msg; +irc__print; +override_mem_ops_method; +set_memcpy_largest_cachelinesize; +set_memcpy_largest_cache_size; + +}; diff --git a/dep/tbb/src/tbbmalloc/lin32-proxy-export.def b/dep/tbb/src/tbbmalloc/lin32-proxy-export.def new file mode 100644 index 000000000..16411ce47 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/lin32-proxy-export.def @@ -0,0 +1,59 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. 
+*/ + +{ +global: +calloc; +free; +malloc; +realloc; +posix_memalign; +memalign; +valloc; +pvalloc; +mallinfo; +mallopt; +__TBB_malloc_proxy; +__TBB_internal_find_original_malloc; +_ZdaPv; /* next ones are new/delete */ +_ZdaPvRKSt9nothrow_t; +_ZdlPv; +_ZdlPvRKSt9nothrow_t; +_Znaj; +_ZnajRKSt9nothrow_t; +_Znwj; +_ZnwjRKSt9nothrow_t; + +local: + +/* TBB symbols */ +*3rml8internal*; +*3tbb*; +*__TBB*; + +}; diff --git a/dep/tbb/src/tbbmalloc/lin64-proxy-export.def b/dep/tbb/src/tbbmalloc/lin64-proxy-export.def new file mode 100644 index 000000000..21a0f0832 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/lin64-proxy-export.def @@ -0,0 +1,59 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +{ +global: +calloc; +free; +malloc; +realloc; +posix_memalign; +memalign; +valloc; +pvalloc; +mallinfo; +mallopt; +__TBB_malloc_proxy; +__TBB_internal_find_original_malloc; +_ZdaPv; /* next ones are new/delete */ +_ZdaPvRKSt9nothrow_t; +_ZdlPv; +_ZdlPvRKSt9nothrow_t; +_Znam; +_ZnamRKSt9nothrow_t; +_Znwm; +_ZnwmRKSt9nothrow_t; + +local: + +/* TBB symbols */ +*3rml8internal*; +*3tbb*; +*__TBB*; + +}; diff --git a/dep/tbb/src/tbbmalloc/lin64ipf-proxy-export.def b/dep/tbb/src/tbbmalloc/lin64ipf-proxy-export.def new file mode 100644 index 000000000..21a0f0832 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/lin64ipf-proxy-export.def @@ -0,0 +1,59 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. 
Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +{ +global: +calloc; +free; +malloc; +realloc; +posix_memalign; +memalign; +valloc; +pvalloc; +mallinfo; +mallopt; +__TBB_malloc_proxy; +__TBB_internal_find_original_malloc; +_ZdaPv; /* next ones are new/delete */ +_ZdaPvRKSt9nothrow_t; +_ZdlPv; +_ZdlPvRKSt9nothrow_t; +_Znam; +_ZnamRKSt9nothrow_t; +_Znwm; +_ZnwmRKSt9nothrow_t; + +local: + +/* TBB symbols */ +*3rml8internal*; +*3tbb*; +*__TBB*; + +}; diff --git a/dep/tbb/src/tbbmalloc/mac32-tbbmalloc-export.def b/dep/tbb/src/tbbmalloc/mac32-tbbmalloc-export.def new file mode 100644 index 000000000..c211ce52c --- /dev/null +++ b/dep/tbb/src/tbbmalloc/mac32-tbbmalloc-export.def @@ -0,0 +1,36 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +# MemoryAllocator.cpp +_scalable_calloc +_scalable_free +_scalable_malloc +_scalable_realloc +_scalable_posix_memalign +_scalable_aligned_malloc +_scalable_aligned_realloc +_scalable_aligned_free +_scalable_msize diff --git a/dep/tbb/src/tbbmalloc/mac64-tbbmalloc-export.def b/dep/tbb/src/tbbmalloc/mac64-tbbmalloc-export.def new file mode 100644 index 000000000..c211ce52c --- /dev/null +++ b/dep/tbb/src/tbbmalloc/mac64-tbbmalloc-export.def @@ -0,0 +1,36 @@ +# Copyright 2005-2009 Intel Corporation. All Rights Reserved. +# +# This file is part of Threading Building Blocks. +# +# Threading Building Blocks is free software; you can redistribute it +# and/or modify it under the terms of the GNU General Public License +# version 2 as published by the Free Software Foundation. +# +# Threading Building Blocks is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied warranty +# of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. 
+# +# You should have received a copy of the GNU General Public License +# along with Threading Building Blocks; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# +# As a special exception, you may use this file as part of a free software +# library without restriction. Specifically, if other files instantiate +# templates or use macros or inline functions from this file, or you compile +# this file and link it with other files to produce an executable, this +# file does not by itself cause the resulting executable to be covered by +# the GNU General Public License. This exception does not however +# invalidate any other reasons why the executable file might be covered by +# the GNU General Public License. + +# MemoryAllocator.cpp +_scalable_calloc +_scalable_free +_scalable_malloc +_scalable_realloc +_scalable_posix_memalign +_scalable_aligned_malloc +_scalable_aligned_realloc +_scalable_aligned_free +_scalable_msize diff --git a/dep/tbb/src/tbbmalloc/proxy.cpp b/dep/tbb/src/tbbmalloc/proxy.cpp new file mode 100644 index 000000000..5f27d3cf3 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/proxy.cpp @@ -0,0 +1,434 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +#include "proxy.h" + +#if MALLOC_LD_PRELOAD + +/*** service functions and variables ***/ + +#include // for sysconf +#include + +static long memoryPageSize; + +static inline void initPageSize() +{ + memoryPageSize = sysconf(_SC_PAGESIZE); +} + +/* For the expected behaviour (i.e., finding malloc/free/etc from libc.so, + not from ld-linux.so) dlsym(RTLD_NEXT) should be called from + a LD_PRELOADed library, not another dynamic library. + So we have to put find_original_malloc here. 
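+   It resolves each requested name with dlsym(RTLD_NEXT, ...) and reports failure
+   if any of the symbols cannot be found.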
+ */ +extern "C" bool __TBB_internal_find_original_malloc(int num, const char *names[], + void *ptrs[]) +{ + for (int i=0; i +#include // for memset + +extern "C" struct mallinfo mallinfo() __THROW +{ + struct mallinfo m; + memset(&m, 0, sizeof(struct mallinfo)); + + return m; +} +#endif /* __linux__ */ + +/*** replacements for global operators new and delete ***/ + +#include + +void * operator new(size_t sz) throw (std::bad_alloc) { + void *res = scalable_malloc(sz); + if (NULL == res) throw std::bad_alloc(); + return res; +} +void* operator new[](size_t sz) throw (std::bad_alloc) { + void *res = scalable_malloc(sz); + if (NULL == res) throw std::bad_alloc(); + return res; +} +void operator delete(void* ptr) throw() { + scalable_free(ptr); +} +void operator delete[](void* ptr) throw() { + scalable_free(ptr); +} +void* operator new(size_t sz, const std::nothrow_t&) throw() { + return scalable_malloc(sz); +} +void* operator new[](std::size_t sz, const std::nothrow_t&) throw() { + return scalable_malloc(sz); +} +void operator delete(void* ptr, const std::nothrow_t&) throw() { + scalable_free(ptr); +} +void operator delete[](void* ptr, const std::nothrow_t&) throw() { + scalable_free(ptr); +} + +#endif /* MALLOC_LD_PRELOAD */ + + +#ifdef _WIN32 +#include + +#include +#include "tbb_function_replacement.h" + +void safer_scalable_free2( void *ptr) +{ + safer_scalable_free( ptr, NULL ); +} + +// we do not support _expand(); +void* safer_expand( void *, size_t ) +{ + return NULL; +} + +#define __TBB_QV(EXP) #EXP +#define __TBB_ORIG_ALLOCATOR_REPLACEMENT_WRAPPER(CRTLIB)\ +void (*orig_free_##CRTLIB)(void*); \ +void safer_scalable_free_##CRTLIB( void *ptr) \ +{ \ + safer_scalable_free( ptr, orig_free_##CRTLIB ); \ +} \ + \ +size_t (*orig_msize_##CRTLIB)(void*); \ +size_t safer_scalable_msize_##CRTLIB( void *ptr) \ +{ \ + return safer_scalable_msize( ptr, orig_msize_##CRTLIB ); \ +} \ + \ +void* safer_scalable_realloc_##CRTLIB( void *ptr, size_t size ) \ +{ \ + orig_ptrs func_ptrs = {orig_free_##CRTLIB, orig_msize_##CRTLIB}; \ + return safer_scalable_realloc( ptr, size, &func_ptrs ); \ +} \ + \ +void* safer_scalable_aligned_realloc_##CRTLIB( void *ptr, size_t size, size_t aligment ) \ +{ \ + orig_ptrs func_ptrs = {orig_free_##CRTLIB, orig_msize_##CRTLIB}; \ + return safer_scalable_aligned_realloc( ptr, size, aligment, &func_ptrs ); \ +} + +#if _WIN64 +#define __TBB_ORIG_ALLOCATOR_REPLACEMENT_CALL(CRT_VER)\ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "free", (FUNCPTR)safer_scalable_free_ ## CRT_VER ## d, 9, (FUNCPTR*)&orig_free_ ## CRT_VER ## d ); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "free", (FUNCPTR)safer_scalable_free_ ## CRT_VER, 0, NULL ); \ + orig_free_ ## CRT_VER = NULL; \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "_msize",(FUNCPTR)safer_scalable_msize_ ## CRT_VER ## d, 9, (FUNCPTR*)&orig_msize_ ## CRT_VER ## d ); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "_msize",(FUNCPTR)safer_scalable_msize_ ## CRT_VER, 7, (FUNCPTR*)&orig_msize_ ## CRT_VER ); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "realloc", (FUNCPTR)safer_scalable_realloc_ ## CRT_VER ## d, 0, NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "realloc", (FUNCPTR)safer_scalable_realloc_ ## CRT_VER, 0, NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "_aligned_free", (FUNCPTR)safer_scalable_free_ ## CRT_VER ## d, 0, NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "_aligned_free", (FUNCPTR)safer_scalable_free_ ## CRT_VER, 0, 
NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "_aligned_realloc",(FUNCPTR)safer_scalable_aligned_realloc_ ## CRT_VER ## d, 0, NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "_aligned_realloc",(FUNCPTR)safer_scalable_aligned_realloc_ ## CRT_VER, 0, NULL); +#else +#define __TBB_ORIG_ALLOCATOR_REPLACEMENT_CALL(CRT_VER)\ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "free", (FUNCPTR)safer_scalable_free_ ## CRT_VER ## d, 5, (FUNCPTR*)&orig_free_ ## CRT_VER ## d ); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "free", (FUNCPTR)safer_scalable_free_ ## CRT_VER, 7, (FUNCPTR*)&orig_free_ ## CRT_VER ); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "_msize",(FUNCPTR)safer_scalable_msize_ ## CRT_VER ## d, 5, (FUNCPTR*)&orig_msize_ ## CRT_VER ## d ); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "_msize",(FUNCPTR)safer_scalable_msize_ ## CRT_VER, 7, (FUNCPTR*)&orig_msize_ ## CRT_VER ); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "realloc", (FUNCPTR)safer_scalable_realloc_ ## CRT_VER ## d, 0, NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "realloc", (FUNCPTR)safer_scalable_realloc_ ## CRT_VER, 0, NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "_aligned_free", (FUNCPTR)safer_scalable_free_ ## CRT_VER ## d, 0, NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "_aligned_free", (FUNCPTR)safer_scalable_free_ ## CRT_VER, 0, NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ## d.dll), "_aligned_realloc",(FUNCPTR)safer_scalable_aligned_realloc_ ## CRT_VER ## d, 0, NULL); \ + ReplaceFunctionWithStore( __TBB_QV(CRT_VER ##.dll), "_aligned_realloc",(FUNCPTR)safer_scalable_aligned_realloc_ ## CRT_VER, 0, NULL); +#endif + +__TBB_ORIG_ALLOCATOR_REPLACEMENT_WRAPPER(msvcr70d); +__TBB_ORIG_ALLOCATOR_REPLACEMENT_WRAPPER(msvcr70); +__TBB_ORIG_ALLOCATOR_REPLACEMENT_WRAPPER(msvcr71d); +__TBB_ORIG_ALLOCATOR_REPLACEMENT_WRAPPER(msvcr71); +__TBB_ORIG_ALLOCATOR_REPLACEMENT_WRAPPER(msvcr80d); +__TBB_ORIG_ALLOCATOR_REPLACEMENT_WRAPPER(msvcr80); +__TBB_ORIG_ALLOCATOR_REPLACEMENT_WRAPPER(msvcr90d); +__TBB_ORIG_ALLOCATOR_REPLACEMENT_WRAPPER(msvcr90); + + +/*** replacements for global operators new and delete ***/ + +#include + +#if _MSC_VER && !defined(__INTEL_COMPILER) +#pragma warning( push ) +#pragma warning( disable : 4290 ) +#endif + +void * operator_new(size_t sz) throw (std::bad_alloc) { + void *res = scalable_malloc(sz); + if (NULL == res) throw std::bad_alloc(); + return res; +} +void* operator_new_arr(size_t sz) throw (std::bad_alloc) { + void *res = scalable_malloc(sz); + if (NULL == res) throw std::bad_alloc(); + return res; +} +void operator_delete(void* ptr) throw() { + safer_scalable_free2(ptr); +} +#if _MSC_VER && !defined(__INTEL_COMPILER) +#pragma warning( pop ) +#endif + +void operator_delete_arr(void* ptr) throw() { + safer_scalable_free2(ptr); +} +void* operator_new_t(size_t sz, const std::nothrow_t&) throw() { + return scalable_malloc(sz); +} +void* operator_new_arr_t(std::size_t sz, const std::nothrow_t&) throw() { + return scalable_malloc(sz); +} +void operator_delete_t(void* ptr, const std::nothrow_t&) throw() { + safer_scalable_free2(ptr); +} +void operator_delete_arr_t(void* ptr, const std::nothrow_t&) throw() { + safer_scalable_free2(ptr); +} + +const char* modules_to_replace[] = { + "msvcr80d.dll", + "msvcr80.dll", + "msvcr90d.dll", + "msvcr90.dll", + "msvcr70d.dll", + "msvcr70.dll", + "msvcr71d.dll", + "msvcr71.dll", + }; + +/* +We need to replace following 
functions: +malloc +calloc +_aligned_malloc +_expand (by dummy implementation) +??2@YAPAXI@Z operator new (ia32) +??_U@YAPAXI@Z void * operator new[] (size_t size) (ia32) +??3@YAXPAX@Z operator delete (ia32) +??_V@YAXPAX@Z operator delete[] (ia32) +??2@YAPEAX_K@Z void * operator new(unsigned __int64) (intel64) +??_V@YAXPEAX@Z void * operator new[](unsigned __int64) (intel64) +??3@YAXPEAX@Z operator delete (intel64) +??_V@YAXPEAX@Z operator delete[] (intel64) +??2@YAPAXIABUnothrow_t@std@@@Z void * operator new (size_t sz, const std::nothrow_t&) throw() (optional) +??_U@YAPAXIABUnothrow_t@std@@@Z void * operator new[] (size_t sz, const std::nothrow_t&) throw() (optional) + +and these functions have runtime-specific replacement: +realloc +free +_msize +_aligned_realloc +_aligned_free +*/ + +typedef struct FRData_t { + //char *_module; + const char *_func; + FUNCPTR _fptr; + FRR_ON_ERROR _on_error; +} FRDATA; + +FRDATA routines_to_replace[] = { + { "malloc", (FUNCPTR)scalable_malloc, FRR_FAIL }, + { "calloc", (FUNCPTR)scalable_calloc, FRR_FAIL }, + { "_aligned_malloc", (FUNCPTR)scalable_aligned_malloc, FRR_FAIL }, + { "_expand", (FUNCPTR)safer_expand, FRR_IGNORE }, +#if _WIN64 + { "??2@YAPEAX_K@Z", (FUNCPTR)operator_new, FRR_FAIL }, + { "??_U@YAPEAX_K@Z", (FUNCPTR)operator_new_arr, FRR_FAIL }, + { "??3@YAXPEAX@Z", (FUNCPTR)operator_delete, FRR_FAIL }, + { "??_V@YAXPEAX@Z", (FUNCPTR)operator_delete_arr, FRR_FAIL }, +#else + { "??2@YAPAXI@Z", (FUNCPTR)operator_new, FRR_FAIL }, + { "??_U@YAPAXI@Z", (FUNCPTR)operator_new_arr, FRR_FAIL }, + { "??3@YAXPAX@Z", (FUNCPTR)operator_delete, FRR_FAIL }, + { "??_V@YAXPAX@Z", (FUNCPTR)operator_delete_arr, FRR_FAIL }, +#endif + { "??2@YAPAXIABUnothrow_t@std@@@Z", (FUNCPTR)operator_new_t, FRR_IGNORE }, + { "??_U@YAPAXIABUnothrow_t@std@@@Z", (FUNCPTR)operator_new_arr_t, FRR_IGNORE } +}; + +#ifndef UNICODE +void ReplaceFunctionWithStore( const char*dllName, const char *funcName, FUNCPTR newFunc, UINT opcodesNumber, FUNCPTR* origFunc ) +#else +void ReplaceFunctionWithStore( const wchar_t *dllName, const char *funcName, FUNCPTR newFunc, UINT opcodesNumber, FUNCPTR* origFunc ) +#endif +{ + FRR_TYPE type = ReplaceFunction( dllName, funcName, newFunc, opcodesNumber, origFunc ); + if (type == FRR_NODLL) return; + if ( type != FRR_OK ) + { + fprintf(stderr, "Failed to replace function %s in module %s\n", + funcName, dllName); + exit(1); + } +} + +void doMallocReplacement() +{ + int i,j; + + // Replace functions without storing original code + int modules_to_replace_count = sizeof(modules_to_replace) / sizeof(modules_to_replace[0]); + int routines_to_replace_count = sizeof(routines_to_replace) / sizeof(routines_to_replace[0]); + for ( j=0; j + +extern "C" { + void * scalable_malloc(size_t size); + void * scalable_calloc(size_t nobj, size_t size); + void scalable_free(void *ptr); + void * scalable_realloc(void* ptr, size_t size); + void * scalable_aligned_malloc(size_t size, size_t alignment); + void * scalable_aligned_realloc(void* ptr, size_t size, size_t alignment); + int scalable_posix_memalign(void **memptr, size_t alignment, size_t size); + size_t scalable_msize(void *ptr); + void safer_scalable_free( void *ptr, void (*original_free)(void*)); + void * safer_scalable_realloc( void *ptr, size_t, void* ); + void * safer_scalable_aligned_realloc( void *ptr, size_t, size_t, void* ); + size_t safer_scalable_msize( void *ptr, size_t (*orig_msize_crt80d)(void*)); + + void * __TBB_internal_malloc(size_t size); + void * __TBB_internal_calloc(size_t num, size_t size); + void 
__TBB_internal_free(void *ptr); + void * __TBB_internal_realloc(void* ptr, size_t sz); + int __TBB_internal_posix_memalign(void **memptr, size_t alignment, size_t size); + + bool __TBB_internal_find_original_malloc(int num, const char *names[], void *table[]); +} // extern "C" + +// Struct with original free() and _msize() pointers +struct orig_ptrs { + void (*orig_free) (void*); + size_t (*orig_msize)(void*); +}; + +#endif /* _TBB_malloc_proxy_H_ */ diff --git a/dep/tbb/src/tbbmalloc/tbb_function_replacement.cpp b/dep/tbb/src/tbbmalloc/tbb_function_replacement.cpp new file mode 100644 index 000000000..f4b0d92a9 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/tbb_function_replacement.cpp @@ -0,0 +1,396 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + This file is part of Threading Building Blocks. + + Threading Building Blocks is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public License + version 2 as published by the Free Software Foundation. + + Threading Building Blocks is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied warranty + of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with Threading Building Blocks; if not, write to the Free Software + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + As a special exception, you may use this file as part of a free software + library without restriction. Specifically, if other files instantiate + templates or use macros or inline functions from this file, or you compile + this file and link it with other files to produce an executable, this + file does not by itself cause the resulting executable to be covered by + the GNU General Public License. This exception does not however + invalidate any other reasons why the executable file might be covered by + the GNU General Public License. +*/ + +//We works on windows only +#ifdef _WIN32 +#define _CRT_SECURE_NO_DEPRECATE 1 + +#include +#include +#include "tbb_function_replacement.h" + +inline UINT_PTR Ptr2Addrint(LPVOID ptr) +{ + Int2Ptr i2p; + i2p.lpv = ptr; + return i2p.uip; +} + +inline LPVOID Addrint2Ptr(UINT_PTR ptr) +{ + Int2Ptr i2p; + i2p.uip = ptr; + return i2p.lpv; +} + +// Is the distance between addr1 and addr2 smaller than dist +inline bool IsInDistance(UINT_PTR addr1, UINT_PTR addr2, __int64 dist) +{ + __int64 diff = addr1>addr2 ? addr1-addr2 : addr2-addr1; + return diff= m_allocSize) + { + // Found a free region, try to allocate a page in this region + void *newPage = VirtualAlloc(newAddr, m_allocSize, MEM_COMMIT|MEM_RESERVE, PAGE_READWRITE); + if (!newPage) + break; + + // Add the new page to the pages database + MemoryBuffer *pBuff = new (m_lastBuffer) MemoryBuffer(newPage, m_allocSize); + ++m_lastBuffer; + return pBuff; + } + } + + // Failed to find a buffer in the distance + return 0; + } + +public: + MemoryProvider() + { + SYSTEM_INFO sysInfo; + GetSystemInfo(&sysInfo); + m_allocSize = sysInfo.dwAllocationGranularity; + m_lastBuffer = &m_pages[0]; + } + + // We can't free the pages in the destructor because the trampolines + // are using these memory locations and a replaced function might be called + // after the destructor was called. 
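+    // As a consequence the trampoline pages are intentionally leaked; the OS
+    // reclaims them only when the process exits.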
+ ~MemoryProvider() + { + } + + // Return a memory location in distance less than 2^31 from input address + UINT_PTR GetLocation(UINT_PTR addr) + { + MemoryBuffer *pBuff = m_pages; + for (; pBuffm_next, addr, MAX_DISTANCE); ++pBuff) + { + if (pBuff->m_next < pBuff->m_base + pBuff->m_size) + { + UINT_PTR loc = pBuff->m_next; + pBuff->m_next += MAX_PROBE_SIZE; + return loc; + } + } + + pBuff = CreateBuffer(addr); + if(!pBuff) + return 0; + + UINT_PTR loc = pBuff->m_next; + pBuff->m_next += MAX_PROBE_SIZE; + return loc; + } + +private: + MemoryBuffer m_pages[MAX_NUM_BUFFERS]; + MemoryBuffer *m_lastBuffer; + DWORD m_allocSize; +}; + +static MemoryProvider memProvider; + +// Insert jump relative instruction to the input address +// RETURN: the size of the trampoline or 0 on failure +static DWORD InsertTrampoline32(void *inpAddr, void *targetAddr, UINT opcodesNumber, FUNCPTR* storedAddr) +{ + UINT_PTR srcAddr = Ptr2Addrint(inpAddr); + UINT_PTR tgtAddr = Ptr2Addrint(targetAddr); + // Check that the target fits in 32 bits + if (!IsInDistance(srcAddr, tgtAddr, MAX_DISTANCE)) + return 0; + + UINT_PTR offset; + UINT offset32; + UCHAR *codePtr = (UCHAR *)inpAddr; + + // If requested, store original function code + if ( storedAddr ){ + UINT_PTR strdAddr = memProvider.GetLocation(srcAddr); + if (!strdAddr) + return 0; + *storedAddr = (FUNCPTR)Addrint2Ptr(strdAddr); + // Set 'executable' flag for original instructions in the new place + DWORD pageFlags = PAGE_EXECUTE_READWRITE; + if (!VirtualProtect(*storedAddr, MAX_PROBE_SIZE, pageFlags, &pageFlags)) return 0; + // Copy original instructions to the new place + memcpy(*storedAddr, codePtr, opcodesNumber); + // Set jump to the code after replacement + offset = srcAddr - strdAddr - SIZE_OF_RELJUMP; + offset32 = (UINT)((offset & 0xFFFFFFFF)); + *((UCHAR*)*storedAddr+opcodesNumber) = 0xE9; + memcpy(((UCHAR*)*storedAddr+opcodesNumber+1), &offset32, sizeof(offset32)); + } + + // The following will work correctly even if srcAddr>tgtAddr, as long as + // address difference is less than 2^31, which is guaranteed by IsInDistance. + offset = tgtAddr - srcAddr - SIZE_OF_RELJUMP; + offset32 = (UINT)(offset & 0xFFFFFFFF); + // Insert the jump to the new code + *codePtr = 0xE9; + memcpy(codePtr+1, &offset32, sizeof(offset32)); + + // Fill the rest with NOPs to correctly see disassembler of old code in debugger. + for( unsigned i=SIZE_OF_RELJUMP; i +#include +#include +#if __sun +#include /* for memset */ +#include +#endif + +#if MALLOC_LD_PRELOAD + +extern "C" { + +void safer_scalable_free( void*, void (*)(void*) ); +void * safer_scalable_realloc( void*, size_t, void* ); + +bool __TBB_internal_find_original_malloc(int num, const char *names[], void *table[]) __attribute__ ((weak)); + +} + +#endif /* MALLOC_LD_PRELOAD */ +#endif /* MALLOC_CHECK_RECURSION */ + +namespace rml { +namespace internal { + +#if MALLOC_CHECK_RECURSION + +void* (*original_malloc_ptr)(size_t) = 0; +void (*original_free_ptr)(void*) = 0; +static void* (*original_calloc_ptr)(size_t,size_t) = 0; +static void* (*original_realloc_ptr)(void*,size_t) = 0; + +#endif /* MALLOC_CHECK_RECURSION */ + +#if __TBB_NEW_ITT_NOTIFY +extern "C" +#endif +void ITT_DoOneTimeInitialization() {} // required for itt_notify.cpp to work + +#if DO_ITT_NOTIFY +/** Caller is responsible for ensuring this routine is called exactly once. */ +void MallocInitializeITT() { +#if __TBB_NEW_ITT_NOTIFY + tbb::internal::__TBB_load_ittnotify(); +#else + bool success = false; + // Check if we are running under control of VTune. 
+ if( GetBoolEnvironmentVariable("KMP_FOR_TCHECK") || GetBoolEnvironmentVariable("KMP_FOR_TPROFILE") ) { + // Yes, we are under control of VTune. Check for libittnotify library. + success = dynamic_link( LIBITTNOTIFY_NAME, ITT_HandlerTable, 5 ); + } + if (!success){ + for (int i = 0; i < 5; i++) + *ITT_HandlerTable[i].handler = NULL; + } +#endif /* !__TBB_NEW_ITT_NOTIFY */ +} +#endif /* DO_ITT_NOTIFY */ + +void init_tbbmalloc() { +#if MALLOC_LD_PRELOAD + if (malloc_proxy && __TBB_internal_find_original_malloc) { + const char *alloc_names[] = { "malloc", "free", "realloc", "calloc"}; + void *orig_alloc_ptrs[4]; + + if (__TBB_internal_find_original_malloc(4, alloc_names, orig_alloc_ptrs)) { + (void *&)original_malloc_ptr = orig_alloc_ptrs[0]; + (void *&)original_free_ptr = orig_alloc_ptrs[1]; + (void *&)original_realloc_ptr = orig_alloc_ptrs[2]; + (void *&)original_calloc_ptr = orig_alloc_ptrs[3]; + MALLOC_ASSERT( original_malloc_ptr!=malloc_proxy, + "standard malloc not found" ); +/* It's workaround for a bug in GNU Libc 2.9 (as it shipped with Fedora 10). + 1st call to libc's malloc should be not from threaded code. + */ + original_free_ptr(original_malloc_ptr(1024)); + original_malloc_found = 1; + } + } +#endif /* MALLOC_LD_PRELOAD */ + +#if DO_ITT_NOTIFY + MallocInitializeITT(); +#endif +} + +#if !(_WIN32||_WIN64) +struct RegisterProcessShutdownNotification { + ~RegisterProcessShutdownNotification() { + mallocProcessShutdownNotification(); + } +}; + +static RegisterProcessShutdownNotification reg; +#endif + +#if MALLOC_CHECK_RECURSION + +bool original_malloc_found; + +#if MALLOC_LD_PRELOAD + +extern "C" { + +void * __TBB_internal_malloc(size_t size) +{ + return scalable_malloc(size); +} + +void * __TBB_internal_calloc(size_t num, size_t size) +{ + return scalable_calloc(num, size); +} + +int __TBB_internal_posix_memalign(void **memptr, size_t alignment, size_t size) +{ + return scalable_posix_memalign(memptr, alignment, size); +} + +void* __TBB_internal_realloc(void* ptr, size_t sz) +{ + return safer_scalable_realloc(ptr, sz, (void*&)original_realloc_ptr); +} + +void __TBB_internal_free(void *object) +{ + safer_scalable_free(object, original_free_ptr); +} + +} /* extern "C" */ + +#endif /* MALLOC_LD_PRELOAD */ +#endif /* MALLOC_CHECK_RECURSION */ + +} } // namespaces + +#ifdef _WIN32 +#include + +extern "C" BOOL WINAPI DllMain( HINSTANCE hInst, DWORD callReason, LPVOID ) +{ + + if (callReason==DLL_THREAD_DETACH) + { + mallocThreadShutdownNotification(NULL); + } + else if (callReason==DLL_PROCESS_DETACH) + { + mallocProcessShutdownNotification(); + } + return TRUE; +} + +#endif //_WIN32 + diff --git a/dep/tbb/src/tbbmalloc/tbbmalloc.rc b/dep/tbb/src/tbbmalloc/tbbmalloc.rc new file mode 100644 index 000000000..4e8a2ed0b --- /dev/null +++ b/dep/tbb/src/tbbmalloc/tbbmalloc.rc @@ -0,0 +1,129 @@ +// Copyright 2005-2009 Intel Corporation. All Rights Reserved. +// +// This file is part of Threading Building Blocks. +// +// Threading Building Blocks is free software; you can redistribute it +// and/or modify it under the terms of the GNU General Public License +// version 2 as published by the Free Software Foundation. +// +// Threading Building Blocks is distributed in the hope that it will be +// useful, but WITHOUT ANY WARRANTY; without even the implied warranty +// of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. 
+// +// You should have received a copy of the GNU General Public License +// along with Threading Building Blocks; if not, write to the Free Software +// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +// +// As a special exception, you may use this file as part of a free software +// library without restriction. Specifically, if other files instantiate +// templates or use macros or inline functions from this file, or you compile +// this file and link it with other files to produce an executable, this +// file does not by itself cause the resulting executable to be covered by +// the GNU General Public License. This exception does not however +// invalidate any other reasons why the executable file might be covered by +// the GNU General Public License. + +// Microsoft Visual C++ generated resource script. +// +#ifdef APSTUDIO_INVOKED +#ifndef APSTUDIO_READONLY_SYMBOLS +#define _APS_NO_MFC 1 +#define _APS_NEXT_RESOURCE_VALUE 102 +#define _APS_NEXT_COMMAND_VALUE 40001 +#define _APS_NEXT_CONTROL_VALUE 1001 +#define _APS_NEXT_SYMED_VALUE 101 +#endif +#endif + +#define APSTUDIO_READONLY_SYMBOLS +///////////////////////////////////////////////////////////////////////////// +// +// Generated from the TEXTINCLUDE 2 resource. +// +#include +#define ENDL "\r\n" +#include "../tbb/tbb_version.h" + +#define TBBMALLOC_VERNUMBERS TBB_VERSION_MAJOR, TBB_VERSION_MINOR, __TBB_VERSION_YMD +#define TBBMALLOC_VERSION __TBB_STRING(TBBMALLOC_VERNUMBERS) + +///////////////////////////////////////////////////////////////////////////// +#undef APSTUDIO_READONLY_SYMBOLS + +///////////////////////////////////////////////////////////////////////////// +// Neutral resources + +#if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_NEU) +#ifdef _WIN32 +LANGUAGE LANG_NEUTRAL, SUBLANG_NEUTRAL +#pragma code_page(1252) +#endif //_WIN32 + +///////////////////////////////////////////////////////////////////////////// +// manifest integration +#ifdef TBB_MANIFEST +#include "winuser.h" +2 RT_MANIFEST tbbmanifest.exe.manifest +#endif + +///////////////////////////////////////////////////////////////////////////// +// +// Version +// + +VS_VERSION_INFO VERSIONINFO + FILEVERSION TBBMALLOC_VERNUMBERS + PRODUCTVERSION TBB_VERNUMBERS + FILEFLAGSMASK 0x17L +#ifdef _DEBUG + FILEFLAGS 0x1L +#else + FILEFLAGS 0x0L +#endif + FILEOS 0x40004L + FILETYPE 0x2L + FILESUBTYPE 0x0L +BEGIN + BLOCK "StringFileInfo" + BEGIN + BLOCK "000004b0" + BEGIN + VALUE "CompanyName", "Intel Corporation\0" + VALUE "FileDescription", "Scalable Allocator library\0" + VALUE "FileVersion", TBBMALLOC_VERSION "\0" +//what is it? VALUE "InternalName", "tbbmalloc\0" + VALUE "LegalCopyright", "Copyright 2005-2009 Intel Corporation. All Rights Reserved.\0" + VALUE "LegalTrademarks", "\0" +#ifndef TBB_USE_DEBUG + VALUE "OriginalFilename", "tbbmalloc.dll\0" +#else + VALUE "OriginalFilename", "tbbmalloc_debug.dll\0" +#endif + VALUE "ProductName", "Intel(R) Threading Building Blocks for Windows\0" + VALUE "ProductVersion", TBB_VERSION "\0" + VALUE "Comments", TBB_VERSION_STRINGS "\0" + VALUE "PrivateBuild", "\0" + VALUE "SpecialBuild", "\0" + END + END + BLOCK "VarFileInfo" + BEGIN + VALUE "Translation", 0x0, 1200 + END +END + +#endif // Neutral resources +///////////////////////////////////////////////////////////////////////////// + + +#ifndef APSTUDIO_INVOKED +///////////////////////////////////////////////////////////////////////////// +// +// Generated from the TEXTINCLUDE 3 resource. 
+// + + +///////////////////////////////////////////////////////////////////////////// +#endif // not APSTUDIO_INVOKED + diff --git a/dep/tbb/src/tbbmalloc/win-gcc-tbbmalloc-export.def b/dep/tbb/src/tbbmalloc/win-gcc-tbbmalloc-export.def new file mode 100644 index 000000000..0e55b4dfc --- /dev/null +++ b/dep/tbb/src/tbbmalloc/win-gcc-tbbmalloc-export.def @@ -0,0 +1,37 @@ +/* + Copyright 2005-2009 Intel Corporation. All Rights Reserved. + + The source code contained or described herein and all documents related + to the source code ("Material") are owned by Intel Corporation or its + suppliers or licensors. Title to the Material remains with Intel + Corporation or its suppliers and licensors. The Material is protected + by worldwide copyright laws and treaty provisions. No part of the + Material may be used, copied, reproduced, modified, published, uploaded, + posted, transmitted, distributed, or disclosed in any way without + Intel's prior express written permission. + + No license under any patent, copyright, trade secret or other + intellectual property right is granted to or conferred upon you by + disclosure or delivery of the Materials, either expressly, by + implication, inducement, estoppel or otherwise. Any license under such + intellectual property rights must be express and approved by Intel in + writing. +*/ + +{ +global: +scalable_calloc; +scalable_free; +scalable_malloc; +scalable_realloc; +scalable_posix_memalign; +scalable_aligned_malloc; +scalable_aligned_realloc; +scalable_aligned_free; +safer_scalable_free; +safer_scalable_realloc; +scalable_msize; +safer_scalable_msize; +safer_scalable_aligned_realloc; +local:*; +}; diff --git a/dep/tbb/src/tbbmalloc/win32-tbbmalloc-export.def b/dep/tbb/src/tbbmalloc/win32-tbbmalloc-export.def new file mode 100644 index 000000000..e04026398 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/win32-tbbmalloc-export.def @@ -0,0 +1,42 @@ +; Copyright 2005-2009 Intel Corporation. All Rights Reserved. +; +; This file is part of Threading Building Blocks. +; +; Threading Building Blocks is free software; you can redistribute it +; and/or modify it under the terms of the GNU General Public License +; version 2 as published by the Free Software Foundation. +; +; Threading Building Blocks is distributed in the hope that it will be +; useful, but WITHOUT ANY WARRANTY; without even the implied warranty +; of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +; GNU General Public License for more details. +; +; You should have received a copy of the GNU General Public License +; along with Threading Building Blocks; if not, write to the Free Software +; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +; +; As a special exception, you may use this file as part of a free software +; library without restriction. Specifically, if other files instantiate +; templates or use macros or inline functions from this file, or you compile +; this file and link it with other files to produce an executable, this +; file does not by itself cause the resulting executable to be covered by +; the GNU General Public License. This exception does not however +; invalidate any other reasons why the executable file might be covered by +; the GNU General Public License. 
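For reference only, and not part of the patch: a minimal stand-alone consumer of the symbols exported below. It is a sketch that reuses the prototypes shown earlier in src/tbbmalloc/proxy.h; the file name check_tbbmalloc.cpp is an illustrative choice, and it assumes the program is linked against the tbbmalloc library built by this commit.

// check_tbbmalloc.cpp -- illustrative sketch, not shipped with this patch.
// Exercises the exported C entry points of the scalable allocator.
#include <cassert>
#include <cstddef>
#include <stdint.h>

extern "C" void*  scalable_malloc(size_t size);
extern "C" void   scalable_free(void* ptr);
extern "C" void*  scalable_aligned_malloc(size_t size, size_t alignment);
extern "C" void   scalable_aligned_free(void* ptr);
extern "C" size_t scalable_msize(void* ptr);

int main()
{
    void* p = scalable_malloc(100);
    assert(p != NULL);
    assert(scalable_msize(p) >= 100);          // usable size covers the request
    scalable_free(p);

    void* q = scalable_aligned_malloc(100, 64);
    assert(q != NULL);
    assert(((uintptr_t)q & 63) == 0);          // 64-byte alignment honoured
    scalable_aligned_free(q);
    return 0;
}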
+ +EXPORTS + +; MemoryAllocator.cpp +scalable_calloc +scalable_free +scalable_malloc +scalable_realloc +scalable_posix_memalign +scalable_aligned_malloc +scalable_aligned_realloc +scalable_aligned_free +safer_scalable_free +safer_scalable_realloc +scalable_msize +safer_scalable_msize +safer_scalable_aligned_realloc diff --git a/dep/tbb/src/tbbmalloc/win64-tbbmalloc-export.def b/dep/tbb/src/tbbmalloc/win64-tbbmalloc-export.def new file mode 100644 index 000000000..e04026398 --- /dev/null +++ b/dep/tbb/src/tbbmalloc/win64-tbbmalloc-export.def @@ -0,0 +1,42 @@ +; Copyright 2005-2009 Intel Corporation. All Rights Reserved. +; +; This file is part of Threading Building Blocks. +; +; Threading Building Blocks is free software; you can redistribute it +; and/or modify it under the terms of the GNU General Public License +; version 2 as published by the Free Software Foundation. +; +; Threading Building Blocks is distributed in the hope that it will be +; useful, but WITHOUT ANY WARRANTY; without even the implied warranty +; of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +; GNU General Public License for more details. +; +; You should have received a copy of the GNU General Public License +; along with Threading Building Blocks; if not, write to the Free Software +; Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +; +; As a special exception, you may use this file as part of a free software +; library without restriction. Specifically, if other files instantiate +; templates or use macros or inline functions from this file, or you compile +; this file and link it with other files to produce an executable, this +; file does not by itself cause the resulting executable to be covered by +; the GNU General Public License. This exception does not however +; invalidate any other reasons why the executable file might be covered by +; the GNU General Public License. + +EXPORTS + +; MemoryAllocator.cpp +scalable_calloc +scalable_free +scalable_malloc +scalable_realloc +scalable_posix_memalign +scalable_aligned_malloc +scalable_aligned_realloc +scalable_aligned_free +safer_scalable_free +safer_scalable_realloc +scalable_msize +safer_scalable_msize +safer_scalable_aligned_realloc diff --git a/src/framework/Makefile.am b/src/framework/Makefile.am index 748d5325e..9a1e8bb06 100644 --- a/src/framework/Makefile.am +++ b/src/framework/Makefile.am @@ -47,6 +47,7 @@ EXTRA_DIST = \ Platform/Define.h \ Policies/CreationPolicy.h \ Policies/ObjectLifeTime.h \ + Policies/MemoryManagement.cpp \ Policies/Singleton.h \ Policies/SingletonImp.h \ Policies/ThreadingModel.h \ diff --git a/src/framework/Policies/MemoryManagement.cpp b/src/framework/Policies/MemoryManagement.cpp new file mode 100644 index 000000000..e9555e6ef --- /dev/null +++ b/src/framework/Policies/MemoryManagement.cpp @@ -0,0 +1,69 @@ +/* +* Copyright (C) 2009 MaNGOS +* +* This program is free software; you can redistribute it and/or modify +* it under the terms of the GNU General Public License as published by +* the Free Software Foundation; either version 2 of the License, or +* (at your option) any later version. +* +* This program is distributed in the hope that it will be useful, +* but WITHOUT ANY WARRANTY; without even the implied warranty of +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +* GNU General Public License for more details. 
+* +* You should have received a copy of the GNU General Public License +* along with this program; if not, write to the Free Software +* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ + +//lets use Intel scalable_allocator by default and +//switch to OS specific allocator only when _STANDARD_MALLOC is defined +#ifndef USE_STANDARD_MALLOC + +#include "../../dep/tbb/include/tbb/scalable_allocator.h" + +void * operator new(size_t sz) throw (std::bad_alloc) +{ + void *res = scalable_malloc(sz); + if (NULL == res) throw std::bad_alloc(); + return res; +} + +void* operator new[](size_t sz) throw (std::bad_alloc) +{ + void *res = scalable_malloc(sz); + if (NULL == res) throw std::bad_alloc(); + return res; +} + +void operator delete(void* ptr) throw() +{ + scalable_free(ptr); +} + +void operator delete[](void* ptr) throw() +{ + scalable_free(ptr); +} + +void* operator new(size_t sz, const std::nothrow_t&) throw() +{ + return scalable_malloc(sz); +} + +void* operator new[](size_t sz, const std::nothrow_t&) throw() +{ + return scalable_malloc(sz); +} + +void operator delete(void* ptr, const std::nothrow_t&) throw() +{ + scalable_free(ptr); +} + +void operator delete[](void* ptr, const std::nothrow_t&) throw() +{ + scalable_free(ptr); +} + +#endif diff --git a/src/mangosd/Makefile.am b/src/mangosd/Makefile.am index 3fd406888..608d0a1ba 100644 --- a/src/mangosd/Makefile.am +++ b/src/mangosd/Makefile.am @@ -43,9 +43,10 @@ mangos_worldd_LDADD = \ ../shared/vmap/libmangosvmaps.a \ ../framework/libmangosframework.a \ ../../dep/src/sockets/libmangossockets.a \ - ../../dep/src/g3dlite/libg3dlite.a + ../../dep/src/g3dlite/libg3dlite.a \ + ../../dep/tbb/libtbbmalloc.so -mangos_worldd_LDFLAGS = -L../../dep/src/sockets -L../../dep/src/g3dlite -L../bindings/universal/ -L$(libdir) $(MANGOS_LIBS) -export-dynamic +mangos_worldd_LDFLAGS = -L../../dep/src/sockets -L../../dep/src/g3dlite -L../bindings/universal/ -L../../dep/tbb -L$(libdir) $(MANGOS_LIBS) -export-dynamic ## Additional files to include when running 'make dist' # Include world daemon configuration diff --git a/src/realmd/Makefile.am b/src/realmd/Makefile.am index 6aa09c392..4969c4082 100644 --- a/src/realmd/Makefile.am +++ b/src/realmd/Makefile.am @@ -36,9 +36,10 @@ mangos_realmd_LDADD = \ ../shared/Auth/libmangosauth.a \ ../shared/libmangosshared.a \ ../framework/libmangosframework.a \ - ../../dep/src/sockets/libmangossockets.a + ../../dep/src/sockets/libmangossockets.a \ + ../../dep/tbb/libtbbmalloc.so -mangos_realmd_LDFLAGS = -L../../dep/src/sockets -L$(libdir) $(MANGOS_LIBS) +mangos_realmd_LDFLAGS = -L../../dep/src/sockets -L../../dep/tbb -L$(libdir) $(MANGOS_LIBS) ## Additional files to include when running 'make dist' # Include realm list daemon configuration diff --git a/src/shared/revision_nr.h b/src/shared/revision_nr.h index 4b1ffe600..1dc76f1c5 100644 --- a/src/shared/revision_nr.h +++ b/src/shared/revision_nr.h @@ -1,4 +1,4 @@ #ifndef __REVISION_NR_H__ #define __REVISION_NR_H__ - #define REVISION_NR "8734" + #define REVISION_NR "8735" #endif // __REVISION_NR_H__ diff --git a/win/VC100/framework.vcxproj b/win/VC100/framework.vcxproj index 89142fa08..1eafcd3bd 100644 --- a/win/VC100/framework.vcxproj +++ b/win/VC100/framework.vcxproj @@ -1,4 +1,5 @@ - + + Debug_NoPCH @@ -142,7 +143,7 @@ /Zl /MP %(AdditionalOptions) Disabled ..\..\src\framework;..\..\dep\ACE_wrappers;%(AdditionalIncludeDirectories) - WIN32;_DEBUG;MANGOS_DEBUG;_LIB;%(PreprocessorDefinitions) + 
WIN32;USE_STANDARD_MALLOC;;_DEBUG;MANGOS_DEBUG;_LIB;%(PreprocessorDefinitions) false EnableFastChecks MultiThreadedDebugDLL @@ -169,7 +170,7 @@ /Zl /MP %(AdditionalOptions) Disabled ..\..\src\framework;..\..\dep\ACE_wrappers;%(AdditionalIncludeDirectories) - WIN32;_DEBUG;MANGOS_DEBUG;_LIB;%(PreprocessorDefinitions) + WIN32;USE_STANDARD_MALLOC;;_DEBUG;MANGOS_DEBUG;_LIB;%(PreprocessorDefinitions) false EnableFastChecks MultiThreadedDebugDLL @@ -193,7 +194,7 @@ /Zl /MP %(AdditionalOptions) OnlyExplicitInline ..\..\src\framework;..\..\dep\ACE_wrappers;%(AdditionalIncludeDirectories) - WIN32;NDEBUG;_LIB;%(PreprocessorDefinitions) + WIN32;USE_STANDARD_MALLOC;;NDEBUG;_LIB;%(PreprocessorDefinitions) true false MultiThreadedDLL @@ -220,7 +221,7 @@ /Zl /MP %(AdditionalOptions) OnlyExplicitInline ..\..\src\framework;..\..\dep\ACE_wrappers;%(AdditionalIncludeDirectories) - WIN32;NDEBUG;_LIB;%(PreprocessorDefinitions) + WIN32;USE_STANDARD_MALLOC;;NDEBUG;_LIB;%(PreprocessorDefinitions) true false MultiThreadedDLL @@ -244,7 +245,7 @@ /Zl /MP %(AdditionalOptions) Disabled ..\..\src\framework;..\..\dep\ACE_wrappers;%(AdditionalIncludeDirectories) - WIN32;_DEBUG;MANGOS_DEBUG;_LIB;%(PreprocessorDefinitions) + WIN32;USE_STANDARD_MALLOC;;_DEBUG;MANGOS_DEBUG;_LIB;%(PreprocessorDefinitions) false EnableFastChecks MultiThreadedDebugDLL @@ -271,7 +272,7 @@ /Zl /MP %(AdditionalOptions) Disabled ..\..\src\framework;..\..\dep\ACE_wrappers;%(AdditionalIncludeDirectories) - WIN32;_DEBUG;MANGOS_DEBUG;_LIB;%(PreprocessorDefinitions) + WIN32;USE_STANDARD_MALLOC;;_DEBUG;MANGOS_DEBUG;_LIB;%(PreprocessorDefinitions) false EnableFastChecks MultiThreadedDebugDLL @@ -322,6 +323,7 @@ + diff --git a/win/VC100/mangosd.vcxproj b/win/VC100/mangosd.vcxproj index 77325676c..3176231fa 100644 --- a/win/VC100/mangosd.vcxproj +++ b/win/VC100/mangosd.vcxproj @@ -174,9 +174,9 @@ /MACHINE:I386 %(AdditionalOptions) - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRT.LIB;msvcrt.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;framework.lib;msvcrt.lib;%(AdditionalDependencies) true - ..\..\dep\lib\$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) + ..\..\dep\lib\$(Platform)_$(Configuration);.\framework__$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true ..\..\bin\$(Platform)_$(Configuration)\mangosd.pdb true @@ -224,9 +224,9 @@ 0x0409 - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRT.LIB;msvcrt.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;framework.lib;msvcrt.lib;%(AdditionalDependencies) true - ..\..\dep\lib\$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) + ..\..\dep\lib\$(Platform)_$(Configuration);.\framework__$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true ..\..\bin\$(Platform)_$(Configuration)\mangosd.pdb true @@ -274,11 +274,11 @@ /MACHINE:I386 %(AdditionalOptions) - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRTD.LIB;msvcrtd.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;framework.lib;msvcrtd.lib;%(AdditionalDependencies) true - ..\..\dep\lib\$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) + 
..\..\dep\lib\$(Platform)_$(Configuration);.\framework__$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true ..\..\bin\$(Platform)_$(Configuration)\mangosd.pdb true @@ -325,11 +325,11 @@ 0x0409 - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRTD.LIB;msvcrtd.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;framework.lib;msvcrtd.lib;%(AdditionalDependencies) true - ..\..\dep\lib\$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) + ..\..\dep\lib\$(Platform)_$(Configuration);.\framework__$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true ..\..\bin\$(Platform)_$(Configuration)\mangosd.pdb true @@ -376,11 +376,11 @@ /MACHINE:I386 %(AdditionalOptions) - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRTD.LIB;msvcrtd.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;framework.lib;msvcrtd.lib;%(AdditionalDependencies) true - ..\..\dep\lib\$(Platform)_debug;%(AdditionalLibraryDirectories) + ..\..\dep\lib\$(Platform)_$(Configuration);.\framework__$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true ..\..\bin\$(Platform)_$(Configuration)\mangosd.pdb true @@ -427,11 +427,11 @@ 0x0409 - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRTD.LIB;msvcrtd.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;framework.lib;msvcrtd.lib;%(AdditionalDependencies) true - ..\..\dep\lib\$(Platform)_debug;%(AdditionalLibraryDirectories) + ..\..\dep\lib\$(Platform)_$(Configuration);.\framework__$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true ..\..\bin\$(Platform)_$(Configuration)\mangosd.pdb true diff --git a/win/VC100/realmd.vcxproj b/win/VC100/realmd.vcxproj index 3a62163a4..c69da23e0 100644 --- a/win/VC100/realmd.vcxproj +++ b/win/VC100/realmd.vcxproj @@ -173,7 +173,7 @@ /MACHINE:I386 %(AdditionalOptions) - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRT.LIB;msvcrt.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;%(AdditionalDependencies) true ..\..\dep\lib\$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true @@ -217,7 +217,7 @@ 0x0409 - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRT.LIB;msvcrt.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;%(AdditionalDependencies) true ..\..\dep\lib\$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true @@ -261,7 +261,7 @@ /MACHINE:I386 %(AdditionalOptions) - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRTD.LIB;msvcrtd.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;%(AdditionalDependencies) true ..\..\dep\lib\$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true @@ -306,7 +306,7 @@ 0x0409 - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRTD.LIB;msvcrtd.lib;%(AdditionalDependencies) + 
libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;%(AdditionalDependencies) true ..\..\dep\lib\$(Platform)_$(Configuration);%(AdditionalLibraryDirectories) true @@ -351,7 +351,7 @@ /MACHINE:I386 %(AdditionalOptions) - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRTD.LIB;msvcrtd.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;%(AdditionalDependencies) true ..\..\dep\lib\$(Platform)_debug;%(AdditionalLibraryDirectories) true @@ -396,7 +396,7 @@ 0x0409 - libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;MSVCPRTD.LIB;msvcrtd.lib;%(AdditionalDependencies) + libmySQL.lib;libeay32.lib;ws2_32.lib;winmm.lib;odbc32.lib;odbccp32.lib;advapi32.lib;dbghelp.lib;%(AdditionalDependencies) true ..\..\dep\lib\$(Platform)_debug;%(AdditionalLibraryDirectories) true diff --git a/win/VC100/tbb.vcxproj b/win/VC100/tbb.vcxproj new file mode 100644 index 000000000..44a51aa7a --- /dev/null +++ b/win/VC100/tbb.vcxproj @@ -0,0 +1,473 @@ + + + + Debug_NoPCH + Win32 + + + Debug_NoPCH + Win32 + + + Debug_NoPCH + x64 + + + Debug_NoPCH + x64 + + + Debug + Win32 + + + Debug + Win32 + + + Debug + x64 + + + Debug + x64 + + + Release + Win32 + + + Release + Win32 + + + Release + x64 + + + Release + x64 + + + + {F62787DD-1327-448B-9818-030062BCFAA5} + tbb + Win32Proj + + + + DynamicLibrary + NotSet + + + DynamicLibrary + NotSet + true + + + DynamicLibrary + NotSet + + + DynamicLibrary + NotSet + true + + + DynamicLibrary + NotSet + + + DynamicLibrary + NotSet + + + + + + + + + + + <_ProjectFileVersion>10.0.20506.1 + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbb__$(Platform)_$(Configuration)\ + tbb_debug + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbb__$(Platform)_$(Configuration)\ + tbb_debug + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbb__$(Platform)_$(Configuration)\ + tbb + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbb__$(Platform)_$(Configuration)\ + tbb + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + $(Configuration)\ + tbb_debug + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + $(Platform)\$(Configuration)\ + tbb_debug + .dll + false + + + + /c /MDd /Od /Ob0 /Zi /EHsc /GR /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG /DDO_ITT_ANNOTATE /D_USE_RTM_VERSION /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /W4 /Wp64 /I../../src /I../../include %(AdditionalOptions) + Disabled + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + true + EnableFastChecks + MultiThreadedDebugDLL + + + Level3 + ProgramDatabase + + + /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbb.def %(AdditionalOptions) + true + Windows + false + + + MachineX86 + + + + + X64 + + + /c /MDd /Od /Ob0 /Zi /EHsc /GR /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG /DDO_ITT_ANNOTATE /D_USE_RTM_VERSION /GS- /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /W4 /Wp64 /I../../src /I../../include %(AdditionalOptions) + Disabled + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + true + EnableFastChecks + MultiThreadedDebugDLL + + + Level3 + ProgramDatabase + 
false + + + /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbb.def %(AdditionalOptions) + true + Windows + false + + + MachineX64 + + + + + /c /MD /O2 /Zi /EHsc /GR /Zc:forScope /Zc:wchar_t /Oy /D_USE_RTM_VERSION /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /W4 /Wp64 /I../../src /I../../include %(AdditionalOptions) + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + MultiThreadedDLL + + + Level3 + ProgramDatabase + + + /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbb.def %(AdditionalOptions) + true + Windows + true + true + false + + + MachineX86 + + + + + X64 + + + /c /MD /O2 /Zi /EHsc /GR /Zc:forScope /Zc:wchar_t /D_USE_RTM_VERSION /GS- /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /W4 /Wp64 /I../../src /I../../include %(AdditionalOptions) + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + MultiThreadedDLL + + + Level3 + ProgramDatabase + + + /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbb.def %(AdditionalOptions) + true + Windows + true + true + false + + + MachineX64 + + + + + /c /MDd /Od /Ob0 /Zi /EHsc /GR /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG /DDO_ITT_ANNOTATE /D_USE_RTM_VERSION /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /W4 /Wp64 /I../../src /I../../include %(AdditionalOptions) + Disabled + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + true + EnableFastChecks + MultiThreadedDebugDLL + + + Level3 + ProgramDatabase + + + /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbb.def %(AdditionalOptions) + true + Windows + false + + + MachineX86 + + + + + X64 + + + /c /MDd /Od /Ob0 /Zi /EHsc /GR /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG /DDO_ITT_ANNOTATE /D_USE_RTM_VERSION /GS- /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /W4 /Wp64 /I../../src /I../../include %(AdditionalOptions) + Disabled + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + true + EnableFastChecks + MultiThreadedDebugDLL + + + Level3 + ProgramDatabase + false + + + /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbb.def %(AdditionalOptions) + true + Windows + false + + + MachineX64 + + + + + /coff /Zi + true + /coff /Zi + true + /coff /Zi + true + /coff /Zi + + + true + building atomic_support.obj + ml64 /Fo"..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj" /DUSE_FRAME_POINTER /DEM64T=1 /c /Zi ../../dep/tbb/src/tbb/intel64-masm/atomic_support.asm + + ..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj;%(Outputs) + true + building atomic_support.obj + ml64 /Fo"..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj" /DEM64T=1 /c /Zi ../../dep/tbb/src/tbb/intel64-masm/atomic_support.asm + + ..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj;%(Outputs) + true + building atomic_support.obj + ml64 /Fo..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj" /DUSE_FRAME_POINTER /DEM64T=1 /c /Zi ../../dep/tbb/src/tbb/intel64-masm/atomic_support.asm + + 
..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj;%(Outputs) + + + /coff /Zi + true + /coff /Zi + true + /coff /Zi + true + /coff /Zi + + + + + generating tbb.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbb/win32-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../dep/tbb/src /I../../dep/tbb/include >$(IntDir)tbb.def + + .\tbb__$(Platform)_$(Configuration)\tbb.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win32-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + generating tbb.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbb/win32-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../dep/tbb/src /I../../dep/tbb/include >$(IntDir)tbb.def + + .\tbb__$(Platform)_$(Configuration)\tbb.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win32-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + generating tbb.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbb/win32-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../dep/tbb/src /I../../dep/tbb/include >$(IntDir)tbb.def + + .\tbb__$(Platform)_$(Configuration)\tbb.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win32-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + + + + + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win64-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + generating tbb.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbb/win64-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../dep/tbb/src /I../../dep/tbb/include >$(IntDir)tbb.def + + ..\..\bin\$(Platform)_$(Configuration)\tbb.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win64-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + generating tbb.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbb/win64-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../dep/tbb/src /I../../dep/tbb/include >$(IntDir)tbb.def + + ..\..\bin\$(Platform)_$(Configuration)\tbb.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win64-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + generating tbb.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbb/win64-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../dep/tbb/src /I../../dep/tbb/include 
>$(IntDir)tbb.def + + ..\..\bin\$(Platform)_$(Configuration)\tbb.def;%(Outputs) + + + + + /I../../src /I../../include /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 %(AdditionalOptions) + /I../../src /I../../include /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 %(AdditionalOptions) + /I../../src /I../../include /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 %(AdditionalOptions) + /I../../src /I../../include /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 %(AdditionalOptions) + /I../../src /I../../include /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 %(AdditionalOptions) + /I../../src /I../../include /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 %(AdditionalOptions) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/win/VC100/tbbmalloc.vcxproj b/win/VC100/tbbmalloc.vcxproj new file mode 100644 index 000000000..6adbf0bea --- /dev/null +++ b/win/VC100/tbbmalloc.vcxproj @@ -0,0 +1,449 @@ + + + + Debug_NoPCH + Win32 + + + Debug_NoPCH + Win32 + + + Debug_NoPCH + x64 + + + Debug_NoPCH + x64 + + + Debug + Win32 + + + Debug + Win32 + + + Debug + x64 + + + Debug + x64 + + + Release + Win32 + + + Release + Win32 + + + Release + x64 + + + Release + x64 + + + + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8} + tbbmalloc + Win32Proj + + + + DynamicLibrary + NotSet + + + DynamicLibrary + NotSet + true + + + DynamicLibrary + NotSet + + + DynamicLibrary + NotSet + true + + + DynamicLibrary + NotSet + + + DynamicLibrary + NotSet + + + + + + + + + + + <_ProjectFileVersion>10.0.20506.1 + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbbmalloc__$(Platform)_$(Configuration)\ + tbbmalloc_debug + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbbmalloc__$(Platform)_$(Configuration)\ + tbbmalloc_debug + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbbmalloc__$(Platform)_$(Configuration)\ + tbbmalloc + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbbmalloc__$(Platform)_$(Configuration)\ + tbbmalloc + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbbmalloc__$(Platform)_$(Configuration)\ + tbbmalloc_debug + .dll + false + ..\..\dep\lib\$(Platform)_$(Configuration)\ + .\tbbmalloc__$(Platform)_$(Configuration)\ + tbbmalloc_debug + .dll + false + + + + /c /MDd /Od /Ob0 /Zi /EHs- /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../src /I../../include /I../../src/tbbmalloc /I../../src/tbbmalloc %(AdditionalOptions) + Disabled + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + true + + + + + MultiThreadedDebugDLL + + + Level3 + false + ProgramDatabase + 4244;4267;%(DisableSpecificWarnings) + + + /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbbmalloc.def %(AdditionalOptions) + true + Windows + MachineX86 + + + + + X64 + + + /c /MDd /Od /Ob0 /Zi /EHs- /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG /GS- /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../src /I../../include /I../../src/tbbmalloc /I../../src/tbbmalloc %(AdditionalOptions) + 
Disabled + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + false + + + + + MultiThreadedDebugDLL + true + + + Level3 + false + ProgramDatabase + 4244;4267;%(DisableSpecificWarnings) + false + + + /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbbmalloc.def %(AdditionalOptions) + true + Windows + MachineX64 + + + + + /c /MD /O2 /Zi /EHs- /Zc:forScope /Zc:wchar_t /Oy /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../src /I../../include /I../../src/tbbmalloc /I../../src/tbbmalloc %(AdditionalOptions) + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + + + MultiThreadedDLL + + + Level3 + false + ProgramDatabase + 4244;4267;%(DisableSpecificWarnings) + + + /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbbmalloc.def %(AdditionalOptions) + true + Windows + true + true + MachineX86 + + + + + X64 + + + /c /MD /O2 /Zi /EHs- /Zc:forScope /Zc:wchar_t /GS- /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../src /I../../include /I../../src/tbbmalloc /I../../src/tbbmalloc %(AdditionalOptions) + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + + + MultiThreadedDLL + + + Level3 + false + ProgramDatabase + 4244;4267;%(DisableSpecificWarnings) + + + /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbbmalloc.def %(AdditionalOptions) + true + Windows + true + true + MachineX64 + + + + + /c /MDd /Od /Ob0 /Zi /EHs- /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../src /I../../include /I../../src/tbbmalloc /I../../src/tbbmalloc %(AdditionalOptions) + Disabled + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + %(PreprocessorDefinitions) + true + + + + + MultiThreadedDebugDLL + + + Level3 + false + ProgramDatabase + 4244;4267;%(DisableSpecificWarnings) + + + /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbbmalloc.def %(AdditionalOptions) + true + Windows + MachineX86 + + + + + X64 + + + /c /MDd /Od /Ob0 /Zi /EHs- /Zc:forScope /Zc:wchar_t /DTBB_USE_DEBUG /GS- /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 /I../../src /I../../include /I../../src/tbbmalloc /I../../src/tbbmalloc %(AdditionalOptions) + Disabled + ..\..\dep\tbb\include;..\..\dep\tbb\src;..\..\dep\tbb\build;..\..\dep\tbb\build\vsproject;%(AdditionalIncludeDirectories) + false + + + + + MultiThreadedDebugDLL + true + + + Level3 + false + ProgramDatabase + 4244;4267;%(DisableSpecificWarnings) + false + + + /nologo /DLL /MAP /DEBUG /fixed:no /INCREMENTAL:NO /DEF:$(IntDir)tbbmalloc.def %(AdditionalOptions) + true + Windows + MachineX64 + + + + + true + building atomic_support.obj + ml64 /Fo"..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj" /DUSE_FRAME_POINTER /DEM64T=1 /c /Zi ../../dep/tbb/src/tbb/intel64-masm/atomic_support.asm + + ..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj;%(Outputs) + true + building atomic_support.obj + ml64 /Fo"..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj" /DEM64T=1 /c /Zi ../../dep/tbb/src/tbb/intel64-masm/atomic_support.asm + + 
..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj;%(Outputs) + true + building atomic_support.obj + ml64 /Fo"..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj" /DUSE_FRAME_POINTER /DEM64T=1 /c /Zi ../../dep/tbb/src/tbb/intel64-masm/atomic_support.asm + + ..\..\bin\$(Platform)_$(Configuration)\atomic_support.obj;%(Outputs) + + + + + /coff /Zi + true + /coff /Zi + true + /coff /Zi + true + /coff /Zi + + + /coff /Zi + true + /coff /Zi + true + /coff /Zi + true + /coff /Zi + + + + + generating tbbmalloc.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbbmalloc/win32-tbbmalloc-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbbmalloc.def + + .\tbbmalloc__$(Platform)_$(Configuration)\tbbmalloc.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbbmalloc/win32-tbbmalloc-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbbmalloc.def + + $(IntDir)tbbmalloc.def;%(Outputs) + generating tbbmalloc.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbbmalloc/win32-tbbmalloc-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbbmalloc.def + + .\tbbmalloc__$(Platform)_$(Configuration)\tbbmalloc.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win32-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + generating tbbmalloc.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbbmalloc/win32-tbbmalloc-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbbmalloc.def + + .\tbbmalloc__$(Platform)_$(Configuration)\tbbmalloc.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbbmalloc/win32-tbbmalloc-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbbmalloc.def + + $(IntDir)tbbmalloc.def;%(Outputs) + + + + + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win64-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + generating tbbmalloc.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbbmalloc/win64-tbbmalloc-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbbmalloc.def + + ..\..\bin\$(Platform)_$(Configuration)\tbbmalloc.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win64-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + generating tbbmalloc.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbbmalloc/win64-tbbmalloc-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbbmalloc.def + + ..\..\bin\$(Platform)_$(Configuration)\tbbmalloc.def;%(Outputs) + true + generating tbb.def file + cl /nologo /TC /EP ../../src/tbb/win64-tbb-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE 
/D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbb.def + + $(IntDir)tbb.def;%(Outputs) + generating tbbmalloc.def file + cl /nologo /TC /EP ../../dep/tbb/src/tbbmalloc/win64-tbbmalloc-export.def /DTBB_USE_DEBUG /DDO_ITT_NOTIFY /DUSE_WINTHREAD /D_CRT_SECURE_NO_DEPRECATE /D_WIN32_WINNT=0x0400 /D__TBB_BUILD=1 >$(IntDir)tbbmalloc.def + + ..\..\bin\$(Platform)_$(Configuration)\tbbmalloc.def;%(Outputs) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/win/VC80/framework.vcproj b/win/VC80/framework.vcproj index 833262ea8..60c60b7c5 100644 --- a/win/VC80/framework.vcproj +++ b/win/VC80/framework.vcproj @@ -526,6 +526,10 @@ RelativePath="..\..\src\framework\Policies\CreationPolicy.h" > + + @@ -566,10 +570,6 @@ RelativePath="..\..\src\framework\Utilities\EventProcessor.h" > - - @@ -578,6 +578,10 @@ RelativePath="..\..\src\framework\Utilities\TypeList.h" > + + diff --git a/win/VC80/genrevision.vcproj b/win/VC80/genrevision.vcproj index 2faca2dd6..43683025d 100644 --- a/win/VC80/genrevision.vcproj +++ b/win/VC80/genrevision.vcproj @@ -44,7 +44,6 @@ Name="VCCLCompilerTool" Optimization="0" PreprocessorDefinitions="WIN32;_DEBUG;_CONSOLE" - MinimalRebuild="true" BasicRuntimeChecks="3" RuntimeLibrary="3" UsePrecompiledHeader="0" @@ -85,6 +84,83 @@ + + + + + + + + + + + + + + + + + + + + + @@ -115,7 +191,6 @@ Name="VCCLCompilerTool" Optimization="0" PreprocessorDefinitions="WIN32;_DEBUG;_CONSOLE" - MinimalRebuild="true" BasicRuntimeChecks="3" RuntimeLibrary="3" UsePrecompiledHeader="0" @@ -156,6 +231,83 @@ + + + + + + + + + + + + + + + + + + + + + @@ -231,148 +383,7 @@ Name="VCAppVerifierTool" /> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + diff --git a/win/VC80/mangosd.vcproj b/win/VC80/mangosd.vcproj index 02c78396f..0e1a61c4a 100644 --- a/win/VC80/mangosd.vcproj +++ b/win/VC80/mangosd.vcproj @@ -78,11 +78,11 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/win/VC80/tbbmalloc.vcproj b/win/VC80/tbbmalloc.vcproj new file mode 100644 index 000000000..1a890c291 --- /dev/null +++ b/win/VC80/tbbmalloc.vcproj @@ -0,0 +1,1056 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/win/VC90/ACE_vc9.vcproj b/win/VC90/ACE_vc9.vcproj index 5fbd7008c..70a9f1429 100644 --- a/win/VC90/ACE_vc9.vcproj +++ b/win/VC90/ACE_vc9.vcproj @@ -6,6 +6,7 @@ ProjectGUID="{BD537C9A-FECA-1BAD-6757-8A6348EA12C8}" RootNamespace="ACE" Keyword="Win32Proj" + TargetFrameworkVersion="0" > + + diff --git a/win/VC90/mangosd.vcproj b/win/VC90/mangosd.vcproj index 8098e9919..2f2c6f5ea 100644 --- a/win/VC90/mangosd.vcproj +++ b/win/VC90/mangosd.vcproj @@ -80,11 +80,11 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/win/VC90/tbbmalloc.vcproj b/win/VC90/tbbmalloc.vcproj new file mode 100644 index 000000000..54f0968c5 --- /dev/null +++ b/win/VC90/tbbmalloc.vcproj @@ -0,0 +1,1051 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/win/mangosdVC100.sln b/win/mangosdVC100.sln index e2052abf3..8c5c46a57 100644 --- a/win/mangosdVC100.sln +++ b/win/mangosdVC100.sln @@ -12,6 +12,7 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "shared", "VC100\shared.vcxp {803F488E-4C5A-4866-8D5C-1E6C03C007C2} = {803F488E-4C5A-4866-8D5C-1E6C03C007C2} {BD537C9A-FECA-1BAD-6757-8A6348EA12C8} = {BD537C9A-FECA-1BAD-6757-8A6348EA12C8} {8072769E-CF10-48BF-B9E1-12752A5DAC6E} = {8072769E-CF10-48BF-B9E1-12752A5DAC6E} + {F62787DD-1327-448B-9818-030062BCFAA5} = {F62787DD-1327-448B-9818-030062BCFAA5} EndProjectSection EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "mangosd", "VC100\mangosd.vcxproj", "{A3A04E47-43A2-4C08-90B3-029CEF558594}" @@ -25,6 +26,9 @@ EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "zlib", "VC100\zlib.vcxproj", "{8F1DEA42-6A5B-4B62-839D-C141A7BFACF2}" EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "framework", "VC100\framework.vcxproj", "{BF6F5D0E-33A5-4E23-9E7D-DD481B7B5B9E}" + ProjectSection(ProjectDependencies) = postProject + 
{B15F131E-328A-4D42-ADC2-9FF4CA6306D8} = {B15F131E-328A-4D42-ADC2-9FF4CA6306D8} + EndProjectSection EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "realmd", "VC100\realmd.vcxproj", "{563E9905-3657-460C-AE63-0AC39D162E23}" ProjectSection(ProjectDependencies) = postProject @@ -45,6 +49,13 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "genrevision", "VC100\genrev EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "ACE_Wrappers", "VC100\ACE_vc10.vcxproj", "{BD537C9A-FECA-1BAD-6757-8A6348EA12C8}" EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tbbmalloc", "VC100\tbbmalloc.vcxproj", "{B15F131E-328A-4D42-ADC2-9FF4CA6306D8}" + ProjectSection(ProjectDependencies) = postProject + {F62787DD-1327-448B-9818-030062BCFAA5} = {F62787DD-1327-448B-9818-030062BCFAA5} + EndProjectSection +EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tbb", "VC100\tbb.vcxproj", "{F62787DD-1327-448B-9818-030062BCFAA5}" +EndProject Global GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug_NoPCH|Win32 = Debug_NoPCH|Win32 @@ -187,6 +198,30 @@ Global {BD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Release|Win32.Build.0 = Release|Win32 {BD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Release|x64.ActiveCfg = Release|X64 {BD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Release|x64.Build.0 = Release|X64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|Win32.ActiveCfg = Debug_NoPCH|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|Win32.Build.0 = Debug_NoPCH|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|x64.ActiveCfg = Debug_NoPCH|X64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|x64.Build.0 = Debug_NoPCH|X64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|Win32.ActiveCfg = Debug|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|Win32.Build.0 = Debug|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|x64.ActiveCfg = Debug|X64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|x64.Build.0 = Debug|X64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|Win32.ActiveCfg = Release|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|Win32.Build.0 = Release|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|x64.ActiveCfg = Release|X64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|x64.Build.0 = Release|X64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|Win32.ActiveCfg = Debug_NoPCH|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|Win32.Build.0 = Debug_NoPCH|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|x64.ActiveCfg = Debug_NoPCH|X64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|x64.Build.0 = Debug_NoPCH|X64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|Win32.ActiveCfg = Debug|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|Win32.Build.0 = Debug|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|x64.ActiveCfg = Debug|X64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|x64.Build.0 = Debug|X64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|Win32.ActiveCfg = Release|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|Win32.Build.0 = Release|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|x64.ActiveCfg = Release|X64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|x64.Build.0 = Release|X64 EndGlobalSection GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE diff --git a/win/mangosdVC80.sln b/win/mangosdVC80.sln index 67c2d9734..b177a1e00 100644 --- a/win/mangosdVC80.sln +++ b/win/mangosdVC80.sln @@ -7,19 +7,20 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "game", 
"VC80\game.vcproj", EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "shared", "VC80\shared.vcproj", "{90297C34-F231-4DF4-848E-A74BCC0E40ED}" ProjectSection(ProjectDependencies) = postProject - {BF6F5D0E-33A5-4E23-9E7D-DD481B7B5B9E} = {BF6F5D0E-33A5-4E23-9E7D-DD481B7B5B9E} - {AD537C9A-FECA-1BAD-6757-8A6348EA12C8} = {AD537C9A-FECA-1BAD-6757-8A6348EA12C8} - {8072769E-CF10-48BF-B9E1-12752A5DAC6E} = {8072769E-CF10-48BF-B9E1-12752A5DAC6E} + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8} = {B15F131E-328A-4D42-ADC2-9FF4CA6306D8} {803F488E-4C5A-4866-8D5C-1E6C03C007C2} = {803F488E-4C5A-4866-8D5C-1E6C03C007C2} + {8072769E-CF10-48BF-B9E1-12752A5DAC6E} = {8072769E-CF10-48BF-B9E1-12752A5DAC6E} + {AD537C9A-FECA-1BAD-6757-8A6348EA12C8} = {AD537C9A-FECA-1BAD-6757-8A6348EA12C8} + {BF6F5D0E-33A5-4E23-9E7D-DD481B7B5B9E} = {BF6F5D0E-33A5-4E23-9E7D-DD481B7B5B9E} {8F1DEA42-6A5B-4B62-839D-C141A7BFACF2} = {8F1DEA42-6A5B-4B62-839D-C141A7BFACF2} EndProjectSection EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "mangosd", "VC80\mangosd.vcproj", "{A3A04E47-43A2-4C08-90B3-029CEF558594}" ProjectSection(ProjectDependencies) = postProject - {90297C34-F231-4DF4-848E-A74BCC0E40ED} = {90297C34-F231-4DF4-848E-A74BCC0E40ED} - {1DC6C4DA-A028-41F3-877D-D5400C594F88} = {1DC6C4DA-A028-41F3-877D-D5400C594F88} - {04BAF755-0D67-46F8-B1C6-77AE5368F3CB} = {04BAF755-0D67-46F8-B1C6-77AE5368F3CB} {563E9905-3657-460C-AE63-0AC39D162E23} = {563E9905-3657-460C-AE63-0AC39D162E23} + {04BAF755-0D67-46F8-B1C6-77AE5368F3CB} = {04BAF755-0D67-46F8-B1C6-77AE5368F3CB} + {1DC6C4DA-A028-41F3-877D-D5400C594F88} = {1DC6C4DA-A028-41F3-877D-D5400C594F88} + {90297C34-F231-4DF4-848E-A74BCC0E40ED} = {90297C34-F231-4DF4-848E-A74BCC0E40ED} EndProjectSection EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "zlib", "VC80\zlib.vcproj", "{8F1DEA42-6A5B-4B62-839D-C141A7BFACF2}" @@ -28,8 +29,8 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "framework", "VC80\framework EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "realmd", "VC80\realmd.vcproj", "{563E9905-3657-460C-AE63-0AC39D162E23}" ProjectSection(ProjectDependencies) = postProject - {04BAF755-0D67-46F8-B1C6-77AE5368F3CB} = {04BAF755-0D67-46F8-B1C6-77AE5368F3CB} {90297C34-F231-4DF4-848E-A74BCC0E40ED} = {90297C34-F231-4DF4-848E-A74BCC0E40ED} + {04BAF755-0D67-46F8-B1C6-77AE5368F3CB} = {04BAF755-0D67-46F8-B1C6-77AE5368F3CB} EndProjectSection EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "script", "VC80\script.vcproj", "{4205C8A9-79B7-4354-8064-F05FB9CA0C96}" @@ -45,6 +46,13 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "genrevision", "VC80\genrevi EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "ACE_Wrappers", "VC80\ACE_vc8.vcproj", "{AD537C9A-FECA-1BAD-6757-8A6348EA12C8}" EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tbb", "VC80\tbb.vcproj", "{F62787DD-1327-448B-9818-030062BCFAA5}" +EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tbbmalloc", "VC80\tbbmalloc.vcproj", "{B15F131E-328A-4D42-ADC2-9FF4CA6306D8}" + ProjectSection(ProjectDependencies) = postProject + {F62787DD-1327-448B-9818-030062BCFAA5} = {F62787DD-1327-448B-9818-030062BCFAA5} + EndProjectSection +EndProject Global GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug_NoPCH|Win32 = Debug_NoPCH|Win32 @@ -173,8 +181,8 @@ Global {803F488E-4C5A-4866-8D5C-1E6C03C007C2}.Debug|x64.Build.0 = Debug|Win32 {803F488E-4C5A-4866-8D5C-1E6C03C007C2}.Release|Win32.ActiveCfg = Release|Win32 
{803F488E-4C5A-4866-8D5C-1E6C03C007C2}.Release|Win32.Build.0 = Release|Win32 - {803F488E-4C5A-4866-8D5C-1E6C03C007C2}.Release|x64.ActiveCfg = Release|Win32 - {803F488E-4C5A-4866-8D5C-1E6C03C007C2}.Release|x64.Build.0 = Release|Win32 + {803F488E-4C5A-4866-8D5C-1E6C03C007C2}.Release|x64.ActiveCfg = Release|x64 + {803F488E-4C5A-4866-8D5C-1E6C03C007C2}.Release|x64.Build.0 = Release|x64 {AD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Debug_NoPCH|Win32.ActiveCfg = Debug_NoPCH|Win32 {AD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Debug_NoPCH|Win32.Build.0 = Debug_NoPCH|Win32 {AD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Debug_NoPCH|x64.ActiveCfg = Debug_NoPCH|x64 @@ -187,6 +195,30 @@ Global {AD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Release|Win32.Build.0 = Release|Win32 {AD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Release|x64.ActiveCfg = Release|x64 {AD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Release|x64.Build.0 = Release|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|Win32.ActiveCfg = Debug_NoPCH|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|Win32.Build.0 = Debug_NoPCH|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|x64.ActiveCfg = Debug_NoPCH|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|x64.Build.0 = Debug_NoPCH|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|Win32.ActiveCfg = Debug|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|Win32.Build.0 = Debug|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|x64.ActiveCfg = Debug|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|x64.Build.0 = Debug|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|Win32.ActiveCfg = Release|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|Win32.Build.0 = Release|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|x64.ActiveCfg = Release|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|x64.Build.0 = Release|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|Win32.ActiveCfg = Debug_NoPCH|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|Win32.Build.0 = Debug_NoPCH|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|x64.ActiveCfg = Debug_NoPCH|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|x64.Build.0 = Debug_NoPCH|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|Win32.ActiveCfg = Debug|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|Win32.Build.0 = Debug|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|x64.ActiveCfg = Debug|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|x64.Build.0 = Debug|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|Win32.ActiveCfg = Release|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|Win32.Build.0 = Release|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|x64.ActiveCfg = Release|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|x64.Build.0 = Release|x64 EndGlobalSection GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE diff --git a/win/mangosdVC90.sln b/win/mangosdVC90.sln index 266b2100d..b271296fd 100644 --- a/win/mangosdVC90.sln +++ b/win/mangosdVC90.sln @@ -25,6 +25,9 @@ EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "zlib", "VC90\zlib.vcproj", "{8F1DEA42-6A5B-4B62-839D-C141A7BFACF2}" EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "framework", "VC90\framework.vcproj", "{BF6F5D0E-33A5-4E23-9E7D-DD481B7B5B9E}" + ProjectSection(ProjectDependencies) = postProject + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8} = {B15F131E-328A-4D42-ADC2-9FF4CA6306D8} + EndProjectSection EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "realmd", 
"VC90\realmd.vcproj", "{563E9905-3657-460C-AE63-0AC39D162E23}" ProjectSection(ProjectDependencies) = postProject @@ -45,6 +48,13 @@ Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "genrevision", "VC90\genrevi EndProject Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "ACE_Wrappers", "VC90\ACE_vc9.vcproj", "{BD537C9A-FECA-1BAD-6757-8A6348EA12C8}" EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tbb", "VC90\tbb.vcproj", "{F62787DD-1327-448B-9818-030062BCFAA5}" +EndProject +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "tbbmalloc", "VC90\tbbmalloc.vcproj", "{B15F131E-328A-4D42-ADC2-9FF4CA6306D8}" + ProjectSection(ProjectDependencies) = postProject + {F62787DD-1327-448B-9818-030062BCFAA5} = {F62787DD-1327-448B-9818-030062BCFAA5} + EndProjectSection +EndProject Global GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug_NoPCH|Win32 = Debug_NoPCH|Win32 @@ -187,6 +197,30 @@ Global {BD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Release|Win32.Build.0 = Release|Win32 {BD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Release|x64.ActiveCfg = Release|x64 {BD537C9A-FECA-1BAD-6757-8A6348EA12C8}.Release|x64.Build.0 = Release|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|Win32.ActiveCfg = Debug_NoPCH|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|Win32.Build.0 = Debug_NoPCH|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|x64.ActiveCfg = Debug_NoPCH|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug_NoPCH|x64.Build.0 = Debug_NoPCH|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|Win32.ActiveCfg = Debug|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|Win32.Build.0 = Debug|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|x64.ActiveCfg = Debug|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Debug|x64.Build.0 = Debug|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|Win32.ActiveCfg = Release|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|Win32.Build.0 = Release|Win32 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|x64.ActiveCfg = Release|x64 + {F62787DD-1327-448B-9818-030062BCFAA5}.Release|x64.Build.0 = Release|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|Win32.ActiveCfg = Debug_NoPCH|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|Win32.Build.0 = Debug_NoPCH|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|x64.ActiveCfg = Debug_NoPCH|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug_NoPCH|x64.Build.0 = Debug_NoPCH|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|Win32.ActiveCfg = Debug|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|Win32.Build.0 = Debug|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|x64.ActiveCfg = Debug|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Debug|x64.Build.0 = Debug|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|Win32.ActiveCfg = Release|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|Win32.Build.0 = Release|Win32 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|x64.ActiveCfg = Release|x64 + {B15F131E-328A-4D42-ADC2-9FF4CA6306D8}.Release|x64.Build.0 = Release|x64 EndGlobalSection GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE