├── numbered_slots_upshot.jpg
├── .gitignore
├── README.md
├── LICENSE
├── ConcurrentMap.cpp
└── simdb.hpp

/numbered_slots_upshot.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LiveAsynchronousVisualizedArchitecture/simdb/HEAD/numbered_slots_upshot.jpg
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# svn
*.svn

# json node graphs
*.lava

# Compiled Object files
*.slo
*.lo
*.o
*.obj

# Precompiled Headers
*.gch
*.pch

# Compiled Dynamic libraries
*.so
*.dylib

# Fortran module files
*.mod
*.smod

# Compiled Static libraries
*.lai
*.la
*.a

# Executables
*.exe
*.out
*.app

# visual studio nonsense
*.sdf
*.pdb
*.idb
*.tlog
*.log
*.ilk
*.user
#*.filters
*.opensdf
*.suo
*.psess
*.vsp

# intel compiler nonsense
*.amplxeproj

# User-specific files
*.suo
*.user
*.userosscache
*.sln.docstates

# Build results
[Dd]ebug/
[Dd]ebugPublic/
[Rr]elease/
[Rr]eleases/
x64/
x86/
bld/
[Bb]in/
[Oo]bj/
[Ll]og/

# Visual C++ cache files
ipch/
*.aps
*.ncb
*.opendb
*.opensdf
*.sdf
*.cachefile
*.VC.db
*.VC.VC.opendb



bak/
nuklear/nuklear_test/nuklear_test/

*.swp

*.swp

*.swp

nuklear/.VizDataStructures.hpp.swp

nuklear/simdb_15_test

nuklear/simdb_15_test

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

![alt text](https://github.com/LiveAsynchronousVisualizedArchitecture/simdb/blob/master/numbered_slots_upshot.jpg "A key value store is kind of like this")

# SimDB
#### A high performance, shared memory, lock free, cross platform, single file, no dependencies, C++11 key-value store.

SimDB is part of LAVA (Live Asynchronous Visualized Architecture), a series of single-file, minimal-dependency C++11 files for creating highly concurrent software while the program being written runs live with its internal data visualized.

- Hash based key-value store created to be a fundamental piece of a larger software architecture.

- High Performance - Real benchmarking needs to be done, but superficial loops seem to run *conservatively* at 500,000 small get() and put() calls per logical core per second. Because it is lock free, performance scales well across at least a dozen threads.

- Shared Memory - Uses shared memory maps on Windows, Linux, and OS X without relying on any external dependencies. This makes it __exceptionally good at interprocess communication__.

- Lock Free - The user facing functions are thread-safe and lock free, with the exception of the constructor (to avoid race conditions between multiple processes creating the memory mapped file at the same time).

- Cross Platform - Compiles with Visual Studio 2013 and ICC 15.0 on Windows, gcc 5.4 on Linux, gcc on OS X, and clang on OS X.

- Single File - simdb.hpp and the C++11 standard library are all you need. No Windows SDK or any other dependencies, not even from the parent project.

- Apache 2.0 License - No need to GPL your whole program to include one file.

This has already been used for both debugging and visualization, but *should be treated as alpha software*.
Though there are no known outstanding bugs, there are almost certainly bugs (and small design issues) waiting to be discovered, and they will need to be fixed as they arise.

#### Getting Started

```cpp
simdb db("test", 1024, 4096);
```

This creates a shared memory file named "simdb_test". It will be a file in a temp directory on Linux and OS X and a 'section object' in the current Windows session namespace (basically a temp file complicated by Windows nonsense).

It will have 4096 blocks of 1024 bytes each. Its blocks will hold about 4 megabytes, and the actual file will be about 4MB plus some overhead for the organization (though the OS won't write pages of the memory map to disk unless it is necessary).

```cpp
auto dbs = simdb_listDBs();
```

This returns a list of the simdb files in the temp directory as a std::vector of std::string. Simdb files are automatically prefixed with "simdb_" and thus can be searched for easily here. This can make interprocess communication easier, so that you can do things like list the available db files in a GUI. It is here for convenience, largely because of how difficult it is to list the temporary memory mapped files on Windows.

```cpp
db.put("lock free", "is the way to be");
```

SimDB works with arbitrary byte buffers for both keys and values. This example uses a convenience function to make a common case easier.

```cpp
string s = db.get("lock free");       // returns "is the way to be"
```

This is another convenience function for the same reason. The direct functions that these wrap are shown in the `db.len()` / `db.get()` example further on.
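
To make the relationship between the string helpers and the byte-buffer interface more concrete, here is a minimal sketch of how a two-argument `put()` and a one-argument `get()` could forward to raw pointer-and-length calls. `ToyStore` is a hypothetical, single-threaded stand-in built on `std::map`; it is not simdb, and nothing in it is lock free or shared between processes. Only the call shapes are borrowed from this README, the standard `uint32_t`/`int64_t` types stand in for `u32`/`i64`, and the `-1` return for a missing key is an assumption of the sketch.

```cpp
#include <cstdint>
#include <cstring>
#include <map>
#include <string>

// ToyStore is a hypothetical stand-in backed by std::map.
// It is NOT simdb; it only mirrors the call shapes used in this README.
struct ToyStore
{
  std::map<std::string, std::string> m_data;

  // Raw interface: arbitrary byte buffers for key and value.
  bool put(const void* key, uint32_t klen, const void* val, uint32_t vlen)
  {
    m_data[std::string((const char*)key, klen)] = std::string((const char*)val, vlen);
    return true;
  }

  // Number of bytes stored under the key, or -1 if the key is absent
  // (the -1 convention is an assumption of this sketch).
  int64_t len(const void* key, uint32_t klen) const
  {
    auto it = m_data.find(std::string((const char*)key, klen));
    return it == m_data.end() ? -1 : (int64_t)it->second.size();
  }

  // Copies the value into out if it fits; returns false otherwise.
  bool get(const void* key, uint32_t klen, void* out, uint32_t outlen) const
  {
    auto it = m_data.find(std::string((const char*)key, klen));
    if(it == m_data.end() || it->second.size() > outlen) return false;
    std::memcpy(out, it->second.data(), it->second.size());
    return true;
  }

  // Convenience wrapper: forwards std::string arguments to the raw put().
  bool put(std::string const& key, std::string const& val)
  {
    return put(key.data(), (uint32_t)key.length(), val.data(), (uint32_t)val.length());
  }

  // Convenience wrapper: len() to size a buffer, then the raw get() to fill it.
  std::string get(std::string const& key) const
  {
    int64_t n = len(key.data(), (uint32_t)key.length());
    if(n < 0) return std::string();
    std::string out((size_t)n, '\0');
    bool ok = get(key.data(), (uint32_t)key.length(), &out[0], (uint32_t)out.length());
    return ok ? out : std::string();
  }
};
```

With this stand-in, `ToyStore db; db.put("lock free", "is the way to be");` followed by `db.get("lock free")` gives back the stored string, and a missing key gives an empty string. The real simdb functions may differ in return values and failure behavior, so simdb.hpp remains the reference for those details.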

```cpp
string lf  = "lock free";
string way = "is the way to be";

// store the pair through the raw byte-buffer interface
db.put( lf.data(), (u32)lf.length(), way.data(), (u32)way.length() );

// ask how many bytes the value holds, then size a buffer to receive it
i64 len = db.len( lf.data(), (u32)lf.length() );
string way2(len,'\0');

// read into way2, the buffer sized by len(); ok is false if the read fails
bool ok = db.get( lf.data(), (u32)lf.length(), (void*)way2.data(), (u32)way2.length() );
```

Here we can see the fundamental functions used to interface with the db. An arbitrary byte buffer is given for the key and another for the value. Keep in mind that get() can fail, since another thread can delete or change the key being read between the call to len() (which gets the number of bytes held in the value of the given key) and the call to get().
Not shown is del(), which takes a key and deletes it.


*Inside simdb.hpp there is a more extensive explanation of the inner workings and how it achieves lock free concurrency*


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity.
For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "{}" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright {yyyy} {name of copyright owner}

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

--------------------------------------------------------------------------------
/ConcurrentMap.cpp:
--------------------------------------------------------------------------------

// todo: test 128 bit atomic with native C++
// todo: look into 128 bit atomic with windows

#ifdef _MSC_VER
  #pragma warning(push, 0)
#endif

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include "simdb.hpp"

using   u8   =   uint8_t;
using  u32   =   uint32_t;
using  u64   =   uint64_t;
using   i8   =   int8_t;
using  i32   =   int32_t;
using  i64   =   int64_t;
using au64   =   std::atomic<u64>;
using au32   =   std::atomic<u32>;

#ifdef _WIN32
  #include
  #include
#endif

//#include

#ifndef COMBINE
  #define COMBINE2(a,b) a ## b
  #define COMBINE(a,b) COMBINE2(a,b)
#endif

#ifndef PAUSE
  #define PAUSE std::cout << "Paused at line " << __LINE__ << std::endl; int COMBINE(VAR,__LINE__); std::cin >> COMBINE(VAR,__LINE__);
#endif

#ifndef TO
  #define TO(to, var) for(std::remove_const<decltype(to)>::type var = 0; var < to; ++var)
  //#define TO(to, var) for(auto var = 0ull; var < (unsigned long long)to; ++var)
#endif

u32 intHash(u32 h)
{
  //h += 1;
  h ^= h >> 16;
  h *= 0x85ebca6b;
  h ^= h >> 13;
  h *= 0xc2b2ae35;
  h ^= h >> 16;
  return h;
}
u32 nextPowerOf2(u32 v)
{
  v--;
  v |= v >> 1;
  v |= v >> 2;
  v |= v >> 4;
  v |= v >> 8;
  v |= v >> 16;
  v++;

  return v;
}

template<class T> struct RngInt
{
  std::mt19937 m_gen;
  std::uniform_int_distribution<T> m_dis;

  RngInt(T lo = 0, T hi = 1, int seed = 16807)
    : m_gen(seed), m_dis(lo, hi)
  { }

  inline T operator()()
  { return m_dis(m_gen); }

  inline T operator()(T lo, T hi)
  {
    std::uniform_int_distribution<T> dis(lo, hi);
    return dis(m_gen);
  }
};

template<class STR>
STR keepAlphaNumeric(STR const& s)
{
  using namespace std;

  regex alphaNumeric("[a-zA-Z\\d]+");
  sregex_iterator iter( ALL(s), alphaNumeric );
  sregex_iterator iter_end;

  STR out;
  while( iter != iter_end )
    out += iter++->str(); // ...

  return out;
}

template<class STR1, class STR2>
STR1 subNonFilename(STR1 const& s, STR2 const& substr)
{
  using namespace std;

  //string patStr("[#%&\\{\\}\\\\<>\\*\\?/\\w\\$!'\":@\\+`\\|=\\.]+");
  //string patStr("#|%|&|\\{|\\}|\\\\|<|>|\\*|\\?|/|\\w|\\$|!|'|\"|:|@|\\+|`|\\||=|\\.");

  STR1 patStr(":|\\*|\\.|\\?|\\\\|/|\\||>|<");
  regex pattern(patStr);
  return regex_replace(s, pattern, substr);
}

template<class T> inline auto
Concat(const T& a) -> T
{ return a; }
template<class T1, class... T> inline auto
Concat(const T1& a, const T&...
args) -> T1
{
  //T1 ret;
  //ret.append( ALL(a) );
  //ret.append( ALL(Concat(args...)) );
  return a + Concat(args...);
}

inline std::string
toString(std::vector const& v)
{
  using namespace std;

  ostringstream convert;
  TO(v.size(),i) convert << v[i] << " ";
  convert << endl;
  return convert.str();
}

template<class T> inline std::string
toString(T const& x)
{
  std::ostringstream convert;
  convert << x;
  return convert.str();
}

template<class T1, class... T> inline std::string
toString(const T1& a, const T&... args)
{
  return toString(a) + toString(args...) ;
}

//template< template class L, class... T, int IDX = 0> std::string
//toString(const std::tuple& tpl)
//{
//  using namespace std;
//
//  const auto len = mp_len::value;
//
//  string ret;
//  ret += toString(get(tpl), " ");
//  if(IDX < len-1) ret += toString(get(tpl));
//  return ret;
//}

inline std::ostream& Print(std::ostream& o) { return o; }
template<class... T> inline std::ostream&
Print(std::ostream& o, const T&... args)
{
  o << toString(args ...);
  o.flush();
  return o;
}
template<class... T> inline std::ostream&
Println(std::ostream& o, const T&... args)
{
  //o << toString(args...) << std::endl;
  Print(o, args..., "\n");
  return o;
}
template<class... T> inline void
Print(const T&... args)
{
  Print(std::cout, args...);
  //std::cout << toString(args...);
}
template<class... T> inline void
Println(const T&... args)
{
  Println(std::cout, args...);
  //std::cout << toString(args...) << std::endl;
}
template<class T> inline void
PrintSpaceln(const T& a)
{
  Print(std::cout, a);
}
template<class T1, class... T> inline void
PrintSpaceln(const T1& a, const T&...
args)
{
  Print(std::cout, a, " ");
  PrintSpaceln(args...);
  Println();
}

using std::thread;
using str = std::string;

//inline void prefetch2(char const* const p)
//{
//  _mm_prefetch(p, _MM_HINT_T2);
//  //_m_prefetch((void*)p);
//}
//inline void prefetch1(char const* const p)
//{
//  _mm_prefetch(p, _MM_HINT_T1);
//  //_m_prefetch((void*)p);
//}
//inline void prefetch0(char const* const p)
//{
//  _mm_prefetch(p, _MM_HINT_T0);
//  //_m_prefetch((void*)p);
//}

template<class T, class A = std::allocator<T> > using vec = std::vector<T, A>;  // will need C++ ifdefs eventually

void printkey(simdb const& db, str const& key)
{
  //u32 vlen;
  //auto len = db.len(key, &vlen);
  //
  //auto val = str(vlen, '\0');
  //auto ok = db.get(key);

  auto val = db.get(key);
  Println(key,": ", val);
}

void printdb(simdb const& db)
{
  Println("size: ", db.size());

  //str memstr;
  //memstr.resize(db.size()+1);

  vec memv(db.memsize(), 0);
  memcpy( (void*)memv.data(), db.mem(), db.memsize() );

  //str memstr( (const char*)db.data(), (const char*)db.data() + db.size());
  //Println("\nmem: ", memstr, "\n" );

  Println("\n");

  u64 blksz = db.blockSize();
  TO(memv.size(),i){
    if(i % blksz == 0){
      putc('|', stdout);
    }
    putc(memv[i] ,stdout);
  }
}

void printkeys(simdb const& db)
{
  Println("\n---Keys---");
  auto keys = db.getKeyStrs();
  TO(keys.size(), i){
    Println(keys[i].str, ": ", db.get(keys[i].str) );
    //printkey(db, db.get(keys[i].s) );
  }

  //printkey(db, keys[i].s );
}

void printhsh(simdb const& db)
{
  u32* d = (u32*)db.hashData();
  for(u32 i=0; i<(db.blocks()*2); ++i){
    if(i%4==0) printf("|");
    else
if(i%2==0) printf(" "); 287 | 288 | printf(" 0x%08x ", d[i]); 289 | 290 | //if(i%8) printf("|"); 291 | //else if(i%4) printf(" "); 292 | } 293 | printf("\n\n"); 294 | 295 | //auto vi = (simdb::VerIdx*)db.hashData(); 296 | //for(u32 i=0; i<(db.blocks()); ++i){ 297 | // if(i%2==0) printf("|"); 298 | // 299 | // printf(" %u %u ", vi[i].idx, vi[i].version); 300 | // 301 | // //if(i%8) printf("|"); 302 | // //else if(i%4) printf(" "); 303 | //} 304 | //printf("\n\n"); 305 | } 306 | 307 | int main() 308 | { 309 | using namespace std; 310 | 311 | //Println("size of simdb on the stack: ", sizeof(simdb)); 312 | 313 | simdb db("test", 2<<10, 2<<12); 314 | 315 | //simdb db("H:\\projects\\lava\\test.simdb", 32, 64, true); 316 | //simdb db("test.simdb", 32, 64, true); 317 | 318 | //printhsh(db); 319 | 320 | str numkey[] = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"}; 321 | str label[] = {"zero","one","two","three","four","five","six","seven","eight","nine","ten","eleven"}; 322 | 323 | //int sz = (int)thrds.size(); 324 | 325 | //int sz = 12; 326 | //vec thrds; 327 | //vec> rngSwitches; 328 | //TO(sz,i){ rngSwitches.emplace_back(0,1,i); } 329 | // 330 | //TO(sz,i) 331 | //{ 332 | // int idx = i % sz; 333 | // thrds.emplace_back([i, idx, &rngSwitches, &numkey, &label, &db] 334 | // { 335 | // auto& numk = numkey[idx]; 336 | // auto& lbl = label[idx]; 337 | // TO(1,j){ 338 | // db.put(numk, lbl); 339 | // if(rngSwitches[idx]()){ db.del(numk); } 340 | // //bool ok = db.del(numk); 341 | // //if(!ok){ Println(numk," not deleted"); } 342 | // //while(!db.del(numk)){} 343 | // } 344 | // 345 | // Println(i, " done"); 346 | // }); 347 | //} 348 | //TO(thrds.size(),i){ thrds[i].join(); } 349 | 350 | str wat = "wat"; 351 | str wut = "wut"; 352 | str skidoosh = "skidoosh"; 353 | str kablam = "kablam"; 354 | str longkey = "this is a super long key as a test"; 355 | str longval = "value that is really long as a really long value test"; 356 | 357 | string lf = "lock free"; 
358 | string way = "is the way to be"; 359 | //db.put( lf.data(), (u32)lf.length(), way.data(), (u32)way.length() ); 360 | 361 | i64 len = db.len( lf.data(), (u32)lf.length() ); 362 | string way2(len,'\0'); 363 | /*bool ok =*/ db.get( lf.data(), (u32)lf.length(), (void*)way.data(), (u32)way.length() ); 364 | 365 | Println("\n",way,"\n"); 366 | 367 | //Println("put: ", db.put( wat.data(), (u32)wat.length(), skidoosh.data(), (u32)skidoosh.length()) ); 368 | //Println("put: ", db.put( (void*)wat.data(), (u32)wat.length(), (void*)skidoosh.data(), (u32)skidoosh.length()) ); 369 | 370 | if( db.isOwner() ){ 371 | Println("put: ", db.put("lock free", "is the way to be") ); 372 | 373 | Println("put: ", db.put(wat, skidoosh) ); 374 | //db.del("wat"); 375 | Println("put: ", db.put( wut.data(), (u32)wut.length(), kablam.data(), (u32)kablam.length()) ); 376 | //db.del("wut"); 377 | Println("put: ", db.put(kablam, skidoosh) ); //Println("put: ", db.put( kablam.data(),(u32)kablam.length(), skidoosh.data(), (u32)skidoosh.length()) ); 378 | //db.del("kablam"); 379 | 380 | Println("put: ", db.put(wat, skidoosh) ); 381 | //Println("del wat: ", db.del("wat") ); 382 | 383 | //Println("put: ", db.put(longkey, longval) ); 384 | //Println("del wat: ", db.del(longkey) ); 385 | 386 | Println(); 387 | } 388 | //db.flush(); 389 | 390 | Println(); 391 | //printdb(db); 392 | 393 | //else{ 394 | // Println("put: ", db.put( (void*)wat.data(), (u32)wat.length(), (void*)skidoosh.data(), (u32)skidoosh.length()) ); 395 | //} 396 | 397 | Println(); 398 | //printhsh(db); // Println(); 399 | 400 | //Println("BlkLst size: ", sizeof(CncrHsh::BlkLst) ); 401 | // 402 | //u32 vlen=0,ver=0; 403 | //i64 len = db.len( wat.data(), (u32)wat.length(), &vlen, &ver); 404 | //str val(vlen, '\0'); 405 | //bool ok = db.get( wat.data(), (u32)wat.length(), (void*)val.data(), (u32)val.length() ); 406 | //Println("ok: ", ok, " value: ", val, " wat total len: ", len, " wat val len: ", vlen, "\n"); 407 | // 408 | //len = 
db.len( longkey.data(), (u32)longkey.length(), &vlen, &ver); 409 | //val = str(vlen, '\0'); 410 | //ok = db.get( longkey.data(), (u32)longkey.length(), (void*)val.data(), (u32)val.length() ); 411 | //Println("ok: ", ok, " longkey value: ", val, " longkey total len: ", len, " longkey val len: ", vlen, "\n"); 412 | // 413 | //str v; 414 | //db.get(wat, &v); Println("value: ", v); 415 | //db.get(wut, &v); Println("value: ", v); 416 | //db.get(kablam,&v); Println("value: ", v); 417 | 418 | printkeys(db); 419 | 420 | //Println("\nKEYS"); 421 | //auto keys = db.getKeyStrs(); 422 | //for(auto k : keys) Println(k,": ", db.get(k) ); 423 | //Println("\n"); 424 | 425 | //TO(6,i) 426 | //{ 427 | // u32 klen, vlen; 428 | // auto nxt = db.nxt(); 429 | // bool oklen = db.len(nxt.idx, nxt.version, &klen, &vlen); 430 | // str key(klen,'\0'); 431 | // bool okkey = db.getKey(nxt.idx, nxt.version, (void*)key.data(), klen); 432 | // 433 | // //str val; 434 | // //bool okval = db.get(key, &val); 435 | // str val = db.get(key); 436 | // Println("VerIdx: ",nxt.idx,", ",nxt.version, 437 | // " str len: ", key.length(), " nxt key: [", key, 438 | // "] val: [", val,"] val len: ", val.length() ); 439 | //} 440 | 441 | //Println("wat data len: ", db.len(wat) ); 442 | //Println("wut data len: ", db.len(wut) ); 443 | //Println("kablam data len: ", db.len(kablam) ); 444 | // 445 | //str clear = " "; 446 | //auto watlen = db.get("wat", (void*)clear.data() ); 447 | ////auto watslen = db.get(str("w"), (void*)clear.data() ); 448 | //Println("watlen: ", watlen); 449 | //Println("get \"wat\": ", clear); 450 | //Println(); 451 | // 452 | //clear = " "; 453 | //auto wutlen = db.get("wut", (void*)clear.data() ); 454 | //Println("wutlen: ", wutlen); 455 | //Println("get \"wut\": ", clear); 456 | //Println(); 457 | // 458 | //clear = " "; 459 | //auto kablamlen = db.get("kablam", (void*)clear.data() ); 460 | //Println("kablamlen: ", kablamlen); 461 | //Println("get \"kablam\": ", clear); 462 | //Println(); 
463 | // 464 | 465 | //Println("size: ", db.size()); 466 | // 467 | ////str memstr; 468 | ////memstr.resize(db.size()+1); 469 | //vec memv(db.memsize(), 0); 470 | //memcpy( (void*)memv.data(), db.mem(), db.memsize() ); 471 | // 472 | //str memstr( (const char*)db.data(), (const char*)db.data() + db.size()); 473 | //Println("\nmem: ", memstr, "\n" ); 474 | 475 | // 476 | //Println("owner: ", db.isOwner(), "\n\n"); 477 | // 478 | ////std::vector::value_type v; 479 | ////Println("v size: ", sizeof(v)); 480 | // 481 | ////ui64 cnt = (ui64)((1<<17)*1.5); 482 | //ui64 cnt = (1<<16); 483 | ////ui64 bytes = lava_vec::sizeBytes(cnt); 484 | ////void* mem = malloc( bytes ); 485 | ////lava_vec lv(mem, cnt); 486 | // 487 | //auto lv = STACK_VEC(i64, cnt); 488 | ////memset(lv.data(), 0, 16*sizeof(u32) ); 489 | //Println("capacity: ", lv.capacity() ); 490 | //Println("size: ", lv.size() ); 491 | //Println("sizeBytes: ", lv.sizeBytes() ); 492 | //TO(lv.size(), i) lv[i] = i; 493 | //cout << lv[lv.size()-1] << " "; 494 | ////TO((i32)lv.size(), i) Print(" ",i,":",lv[i]); 495 | ////TO(lv.size(), i) cout << lv[i] << " "; 496 | ////lv.~lava_vec(); // running the destructor explicitly tests double destrucion since it will be destructed at the end of the function also 497 | 498 | auto dbs = simdb_listDBs(); 499 | Println("\n\n db list"); 500 | //TO(dbs.size(),i) wcout << dbs[i] << "\n"; 501 | TO(dbs.size(),i){ cout << dbs[i] << "\n"; } 502 | Println("\n\n"); 503 | 504 | 505 | Println("\n\n DONE \n\n"); 506 | PAUSE 507 | db.close(); 508 | Println("\n\n CLOSED \n\n"); 509 | PAUSE 510 | 511 | return 0; 512 | } 513 | 514 | #ifdef _MSC_VER 515 | #pragma warning(pop) 516 | #endif 517 | 518 | 519 | 520 | 521 | 522 | 523 | 524 | // template > 525 | //using vec = std::vector; 526 | 527 | //struct _u128 { u64 hi, lo; }; 528 | //using u128 = __declspec(align(128)) volatile _u128; 529 | 530 | //SECTION(128 bit atomic compare and exchange) 531 | //{ 532 | // u128 dest = { 101, 102 }; 533 | // u128 
comp = { 101, 102 }; 534 | // u128 desired = { 85, 86 }; 535 | // 536 | // _InterlockedCompareExchange128( 537 | // (i64*)(&dest), 538 | // desired.hi, 539 | // desired.lo, 540 | // (i64*)(&comp) ); 541 | // 542 | // Println("dest: ", dest.hi, " ", dest.lo); 543 | // Println("comp: ", comp.hi, " ", comp.lo); 544 | // Println("\n\n"); 545 | // 546 | // //u128 dest = { 101, 102 }; 547 | // //u128 comp = { 101, 102 }; 548 | // //u128 desired = { 85, 86 }; 549 | // 550 | // _InterlockedCompareExchange128( 551 | // (i64*)(&dest), 552 | // desired.hi, 553 | // desired.lo, 554 | // (i64*)(&comp) ); 555 | // 556 | // Println("dest: ", dest.hi, " ", dest.lo); 557 | // Println("comp: ", comp.hi, " ", comp.lo); 558 | // Println("\n\n"); 559 | //} 560 | 561 | //u32 sz = 18921703; 562 | //u32 sz = 400; 563 | //ConcurrentMap cm(sz); 564 | 565 | //ScopeTime t; 566 | //t.start(); 567 | //cm.init(sz); 568 | //t.stop("Init"); 569 | 570 | //Println( (i64)t.stop() ); 571 | //t.start(); 572 | // 573 | //Println("sz: ", cm.size()); 574 | // 575 | //TO(100,i) { 576 | // Println("i: ",i," ", intHash(i) ); 577 | //} 578 | // 579 | //TO(100,i) { 580 | // Println("i: ",i," ", nextPowerOf2(i)); 581 | //} 582 | 583 | // 584 | //u32 loopSz = (u32)( double(cm.size()) / 1.5); 585 | 586 | // 587 | //RngInt rng(1, 2); 588 | 589 | //vec thrds; 590 | //TO(5,tid) 591 | //{ 592 | // thrds.push_back( thread([&cm, &rng, loopSz, tid]() 593 | // { 594 | // ScopeTime t; 595 | // 596 | // t.start(); 597 | // TO(loopSz, i) { 598 | // auto val = i*10 + tid; 599 | // u32 pidx = cm.put(i, val); 600 | // SleepMs( rng() ); 601 | // 602 | // //SleepMs( (int)pow(4-tid,2) ); 603 | // //cout << pidx << " "; 604 | // //Println("Put Idx: ", (i64)pidx); 605 | // } 606 | // t.stop("Put"); 607 | // //Println( t.getSeconds() ); 608 | // 609 | // //t.start(); 610 | // //TO(loopSz, i) { 611 | // // u32 gidx = cm.get(i); 612 | // // cout << gidx << " "; 613 | // // //Println("Get Idx: ", (i64)gidx); 614 | // //} 615 | // 
//t.stop("Get"); 616 | // //Println( t.getSeconds() ); 617 | // })); // .detach(); 618 | // //thrds.back().detach(); 619 | //} 620 | ////for(auto& th : thrds) th.detach(); 621 | //for(auto& th : thrds) th.join(); 622 | 623 | // test getting back from the map 624 | //t.start(); 625 | //TO(loopSz, i) { 626 | // u32 gidx = cm.get(i); 627 | // cout << gidx << " "; 628 | // //Println("Get Idx: ", (i64)gidx); 629 | //} 630 | //Println(); 631 | //t.stop("Get"); 632 | 633 | //RngInt rngb(0,1); 634 | //RngInt rngl(0,loopSz-1); 635 | //ConcurrentList cl(loopSz); 636 | 637 | //// serial test of ConcurrentList 638 | ////t.start(); 639 | ////TO(loopSz, i){ 640 | //// Print(cl.idx(),":", cl.alloc(), " "); 641 | ////} 642 | ////TO(loopSz, i){ 643 | //// Print(cl.idx(),":", cl.free(i), " "); 644 | ////} 645 | ////Println(); 646 | ////auto lv = cl.list(); 647 | ////TO(lv->size(), i){ 648 | //// Print( i,":",(*lv)[i], " "); 649 | ////} 650 | ////Println(); 651 | ////t.stop("List"); 652 | 653 | //Println("\nLinks: ", cl.lnkCnt(), " "); 654 | //vec thrds; 655 | //TO(12,tid) 656 | //{ 657 | // thrds.push_back( thread([&cl, &rngb, &rngl, loopSz, tid]() 658 | // { 659 | // ScopeTime t; 660 | 661 | // t.start(); 662 | // TO(loopSz/5, i){ 663 | // //if(rngb()) 664 | // Print(tid,":",cl.nxt()," "); 665 | // //else Print(tid,":",cl.free(rngl()), " "); 666 | // SleepMs( rngb() ); 667 | // } 668 | // t.stop("alloc/free"); 669 | // })); 670 | //} 671 | //for(auto& th : thrds) th.join(); 672 | 673 | //Println(); 674 | //auto lv = cl.list(); 675 | //TO(lv->size(), i){ 676 | // Print( i,":",(*lv)[i], " "); 677 | //} 678 | //Println(); 679 | //Println("\nLinks: ", cl.lnkCnt(), " "); 680 | 681 | //i32 blkSz = 5; 682 | //i32 blocks = 2; 683 | //vec mem(blocks*blkSz, 0); 684 | //ConcurrentStore cs(mem.data(), blkSz, (u32)(blocks) ); 685 | // 686 | //Println("\n"); 687 | // 688 | //TO(cs.m_cl.list()->size(), i){ 689 | // Println( (*cs.m_cl.list())[i] ); 690 | //} 691 | //Println("\n\n"); 692 | // 693 | 
//TO(2,i){ 694 | // i32 blks = 0; 695 | // auto s = "w"; 696 | // i32 slen = (i32)strlen(s)+1; 697 | // //i32 slen = 1; 698 | // auto idx = cs.alloc(slen, &blks); // must allocate the exact number of bytes and no more, since that number will be used to read and write 699 | // cs.put(idx, (void*)s, slen); 700 | // 701 | // vec gs(slen,0); 702 | // cs.get(idx, gs.data(), slen); 703 | // Println(gs.data()); 704 | // cs.free(idx); 705 | // 706 | // TO(cs.m_blockCount, b){ 707 | // Println(cs.nxtBlock(b)); 708 | // } 709 | // Println("\n\n"); 710 | // TO(cs.m_cl.list()->size(), i){ 711 | // Println( (*cs.m_cl.list())[i] ); 712 | // } 713 | // Println("\n\n"); 714 | // 715 | //} 716 | 717 | //ConcurrentHash ch(64); 718 | //vec thrds; 719 | //TO(24,tid) 720 | //{ 721 | // thrds.push_back( thread([&ch, &rng, tid]() 722 | // { 723 | // TO(64,h) 724 | // { 725 | // ch.put(h, h*h); 726 | // //Print(h,": ", ch.put(h, h*h) ); 727 | // Print(h,":", ch.get(h)==h*h, " "); 728 | // } 729 | // } )); 730 | // thrds.back().detach(); 731 | //} 732 | 733 | //auto fileHndl = CreateFileMapping( 734 | // INVALID_HANDLE_VALUE, 735 | // NULL, 736 | // PAGE_READWRITE, 737 | // 0, 738 | // 0x0000FFFF, 739 | // "Global\\simdb_15"); 740 | // 741 | //if(fileHndl==NULL){/*error*/} 742 | // 743 | //i32 memSz = 256; 744 | //auto mapmem = MapViewOfFile(fileHndl, // handle to map object 745 | // FILE_MAP_ALL_ACCESS, // read/write permission 746 | // 0, 747 | // 0, 748 | // memSz); 749 | 750 | // OpenFileMapping if the file exists 751 | // 752 | //Println(fileHndl); 753 | //Println("\n\n"); 754 | //Println(mapmem); 755 | 756 | // 757 | //Println("kv sz: ", sizeof(simdb::KV) ); 758 | 759 | //u32 isKey : 1; 760 | //i32 readers : 31; 761 | 762 | //union KeyAndReaders 763 | //{ 764 | // struct{ u32 isKey : 1; i32 readers : 31; }; 765 | // u32 asInt; 766 | //}; 767 | //union BlkLst 768 | //{ 769 | // struct { KeyAndReaders kr; u32 idx; }; 770 | // ui64 asInt; 771 | //}; 772 | //Println("KeyAndReaders sz: ", 
sizeof(KeyAndReaders) ); 773 | //Println("BlkLst sz: ", sizeof(BlkLst) ); 774 | 775 | //union KV // 256 million keys (28 bits), 256 million values (28 bit), 255 readers (8 bits) 776 | //{ 777 | // struct 778 | // { 779 | // ui64 key : 28; 780 | // ui64 val : 28; 781 | // ui64 readers : 8; 782 | // }; 783 | // ui64 asInt; 784 | //}; 785 | 786 | //Println("KV sz: ", sizeof(ConcurrentHash::KV) ); 787 | //Println("empty kv: ", ConcurrentHash::empty_kv().key == ConcurrentHash::EMPTY_KEY ); 788 | //Println("empty kv: ", ConcurrentHash::EMPTY_KEY ); 789 | 790 | //Println("\n"); 791 | //struct ui128_t { uint64_t lo, hi; }; 792 | ////struct ui128_t { uint64_t low; }; 793 | 794 | //bool lkFree = atomic{}.is_lock_free(); 795 | //Println("is lock free 128: ", lkFree ); 796 | 797 | //ui128_t a = {0, 101}; 798 | //i8 alignmem[256]; 799 | //void* mem = (void*)(alignmem+(128-((ui64)alignmem % 128))); 800 | ////Println("mem: ", mem, " rem: ", ((ui64)mem)%128 ); 801 | //memcpy(mem, &a, sizeof(a)); 802 | //int ok = _InterlockedCompareExchange128((volatile long long*)mem, 202, 1, (long long*)&a); 803 | //memcpy(&a, mem, sizeof(a)); 804 | //ui128_t* b = (ui128_t*)mem; 805 | ////Println("ok: [", ok, "] lo: [", b->lo, "] hi: [", b->hi, "]"); 806 | 807 | //auto sz = sizeof(ConcurrentStore::BlkLst); 808 | //Println("Blklst sz: ", sz); 809 | 810 | // 811 | //Println("simdb stack sz: ", sizeof(simdb) ); 812 | 813 | //thrds[i] = move(thread( [i,&label,&db] 814 | //RngInt rnd(i*10, ((i+1)*10)-1); 815 | //RngInt rnd(0, 10, i); 816 | //db.put( toString(rnd()), label[i] ); 817 | //thrds[i].detach(); 818 | 819 | //TO(10,i) Print(rnd(), " "); 820 | //break; 821 | -------------------------------------------------------------------------------- /simdb.hpp: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | Copyright 2017 Simeon Bassett 4 | 5 | Licensed under the Apache License, Version 2.0 (the "License"); 6 | you may not use this file except in compliance 
  with the License.
  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
*/

/*
 SimDB

 What it does:
 |  SimDB is a key value store that uses arbitrary byte data (of arbitrary length) as both the key and the value.
 |  It additionally uses shared memory, which allows processes to communicate with each other quickly.
 |  It is lock free and scales well with multiple threads writing, reading, and deleting concurrently.

 How it works:
 |-simdb:
 |  This contains the user facing interface. It contains the ConcurrentHash, ConcurrentStore, and SharedMem classes as members.
 |  These data structures are made to be an interface over the contiguous memory given to them using a single address.
 |  They do not allocate any heap memory themselves, but do have a few class members that will be on the stack. At the time of this writing it is 176 bytes on the stack.
 |-SharedMem:
 |  |  Interface to OS specific shared memory functions. Also handles an initial alignment.
 |-ConcurrentHash:
 |  |  Hash map that uses atomic operations on an array of VerIdx structs.
 |  |  It uses 64 bit atomic operations to compare-exchange one VerIdx at a time (VerIdx is two unsigned 32 bit integers, a version and an index).
 |  |  This makes sure that reading, writing and deleting are lock free.
 |  |  Writing is lock free since a VerIdx is already fully created and written to before putting it in the VerIdx array (m_vis) and the put operation here is a single 64 bit compare and swap.
 |  |  Deletion is lock free since the index in VerIdx is only freed from the CncrLst after setting the VerIdx here to DELETED. Actual deletion means: 1. setting the VerIdx to DELETED, 2. decrementing the readers of the block list that idx points to, 3. if the readers variable of that block list is decremented below its initial value, the thread that took it below its initial value is the one to free it.
 |  |  Get is lock free since it can read an index from a VerIdx, increment readers, compare its key to the key in the list of blocks, read the value in the blocks to the output buffer and finally decrement the readers variable. Just like deletion, if a thread decrements readers below its initial value, it needs to free the block list. This means the last one out cleans up.
 |-ConcurrentStore:
 |  |  Keeps track of block lists.
 |  |  This primarily uses an array of BlkLst structs (which are 24 bytes each).
 |  |  The BlkLst lava_vec is used to make linked lists of block indices.
 |  |  The idea of a block list ends up being a starting index (from the VerIdx struct in the concurrent hash). The BlkLst struct at the starting index contains an index of the next BlkLst struct and so on until reaching a BlkLst that has an index of LIST_END. This means that one array contains multiple linked lists (using indices and not pointers of course).
 |  |  This exposes an alloc() function and a free() function.
 |  |  alloc() gets the index of the next block from CncrLst (concurrent list).
 |  |  The BlkLst struct keeps the total length and the key length / value offset since it does not have to be atomic and is only initialized and used when one thread allocates and only destroyed when one thread frees, just like the actual data blocks.
 |-ConcurrentList:
 |  |  The concurrent list is an array of integers.
 |  |  The number of elements (like all the arrays) is the number of blocks.
 |  |  There is one integer per block with the integer at a given index representing the next slot in the list.
 |  |  The end of the list will have value LIST_END. On initialization the array's values would be |1|2|3|4| ... LIST_END, which makes a list from the start to the end. This means s_lv[0] would return 1.

 Terms:
 |-Block List:
 |  A sequence of block indices. The entry in ConcurrentHash gives the position in the block list array where the list starts.
 |  The value at each index in the array contains the index of the next block.
 |  The list end is known when a special value of LIST_END is found as the value in the array.
 |-Block List Version:
 |  This is a version number given to each block list on allocation (not each block).
 |  It is used to link a ConcurrentHash value to the block list.
 |  If the versions are the same, it is known that the block list at the index read from ConcurrentHash has not changed.
 |  This change could happen if:
 |  |  1. Thread ONE reads the entry in ConcurrentHash but has not accessed the block list index in the entry yet. Pretend that thread one stalls and nothing more happens until further down.
 |  |  2. Thread TWO has already allocated a block list and swaps its new entry for the old entry which is still carried by thread one.
 |  |  3. Thread TWO now must free the block list given by the old entry, which it does, because no thread is reading it since thread one is still stalled.
 |  |  4. Thread TWO allocates another block list, which ends up using the blocks it just deallocated.
 |  |  5. Thread ONE wakes up and reads from the block index it found in the ConcurrentHash entry, which is no longer the same and may not even be the head of the list.
 |  |  If the index is used purely for matching the binary key, this wouldn't be a problem.
 |  |  When the index is used to find a binary value however, this is a problem, since the length of a different value could be the same, and there would be no data to be able to tell that they are different.

 How it achieves lock free concurrency:
 |  ConcurrentHash is treated as the authority of what is stored in the database.
 |  It has an array of VerIdx structs that can also be treated as 64 bit integers. Each is dealt with atomically.
 |  Its individual bits are used as a bitfield struct containing an index into ConcurrentStore's block list as well as the version number of that list.
 |  The core is m_vis, which is an array of VerIdx structs. The memory ordering is swapped on every other index in preparation for robin hood hashing techniques, so the actual memory layout (separated into 128 bit chunks) is |Index Version Version Index|Index Version Version Index|
 |-Finding a matching index:
 |  |  1. Use the hash of the key bytes to jump to an index.
 |  |  2. Load the integer atomically from that index and treat it as a VerIdx struct.
 |  |  3. Use the index from that struct to read the bytes from the list of blocks in BlkLst.
 |  |  4. Increment the readers variable atomically, so that it won't be deleted before this thread is done with it.
 |  |  5. If there is a match, keep reading the list of blocks to fill the output buffer with the value section of the block list.
 |  |  6. Afterwards, decrement the readers variable atomically. If readers goes below its initial value, this thread will be the one to free the block list.
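
 The versioned compare and swap described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used by simdb: VerIdxSketch, pack, unpack, swapSlot and staleSnapshotIsRejected are hypothetical names, and the index and version are packed with shifts here instead of the union that the real VerIdx uses. It replays the stalled reader scenario from the Block List Version section and shows that a snapshot carrying an old version can no longer be swapped in or out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the VerIdx described above: a 32 bit block list
// index and a 32 bit version that travel together in one 64 bit word.
struct VerIdxSketch { uint32_t idx; uint32_t version; };

inline uint64_t pack(VerIdxSketch vi)
{
  return (uint64_t(vi.version) << 32) | uint64_t(vi.idx);
}

inline VerIdxSketch unpack(uint64_t bits)
{
  return VerIdxSketch{ uint32_t(bits & 0xFFFFFFFFu), uint32_t(bits >> 32) };
}

// Replace the slot only if it still holds `expected`. One 64 bit
// compare-exchange covers index and version together, so a slot whose index
// was recycled under a newer version does not match a stale snapshot.
inline bool swapSlot(std::atomic<uint64_t>& slot, VerIdxSketch expected, VerIdxSketch desired)
{
  uint64_t exp = pack(expected);
  return slot.compare_exchange_strong(exp, pack(desired));
}

// Single threaded replay of the stalled reader scenario.
// Returns true when the fresh swap succeeds and the stale swap is rejected.
inline bool staleSnapshotIsRejected()
{
  std::atomic<uint64_t> slot( pack(VerIdxSketch{7, 1}) );  // block list 7, version 1
  VerIdxSketch stale = unpack(slot.load());                // thread ONE snapshots, then stalls

  // thread TWO swaps in a new entry; block 7 happens to be reused with version 2
  bool firstSwap = swapSlot(slot, stale, VerIdxSketch{7, 2});

  // thread ONE wakes up and tries to act on its stale snapshot
  bool staleSwap = swapSlot(slot, stale, VerIdxSketch{9, 3});

  VerIdxSketch now = unpack(slot.load());
  return firstSwap && !staleSwap && now.idx == 7 && now.version == 2;
}
```

 The index alone is identical in both entries (7), so only the version lets the second swap detect that the block list changed underneath the stalled thread.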

 Other notables:
 |  All of the main classes have a static sizeBytes() function that takes in the same arguments as a constructor and returns the number of bytes that it will need in the shared memory.
 |  Classes have member variables that are used as interfaces to the shared memory denoted with s_ (s for shared).
 |  Normal member variables that are just data on the stack are denoted with m_ (m for member).

 _________________
 | Memory Layout |
 -----------------
 |Flags|BlockSize|BlockCount|ConcurrentHash|ConcurrentStore|ConcurrentList|...BlockCount*BlockSize bytes for blocks...|

 ConcurrentHash:   |size(bytes)|...array of VerIdx structs...|
 ConcurrentStore:  |Block List Version|size(bytes)|...array of BlkLst structs...|
 ConcurrentList:   |size(bytes)|...array of unsigned 32 bit ints (u32)...|

 First 24 bytes (in 8 byte / unsigned 64 bit chunks):
 ____________________________
 |Flags|BlockSize|BlockCount|

 Flags:      Right now holds a count of the number of processes that have the db open. When the count goes to 0, the last process will delete the shared memory file.
 BlockSize:  The size in bytes of a block. A good default would be to set this to the common page size of 4096 bytes.
 BlockCount: The number of blocks. The hash table array, block list array and concurrent list array will all be the same length. This multiplied by the BlockSize will give the total amount of bytes available for key and value data. More blocks will also mean the hash table will have fewer collisions as well as less contention between threads.
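
 As a rough worked example of the layout above, the sketch below adds up the regions whose sizes can be read directly from this file. LayoutSketch and layoutSketch are hypothetical helpers and are not part of simdb. The list size follows CncrLst::sizeBytes() further down (an 8 byte size field, one u32 per block, plus 128 bytes of alignment room); the ConcurrentHash and ConcurrentStore regions are left out because their exact sizes are not shown in this section.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of simdb: region sizes implied by the layout above.
struct LayoutSketch
{
  uint64_t headerBytes;  // Flags, BlockSize and BlockCount: three 8 byte values
  uint64_t listBytes;    // concurrent list: size field + one u32 per block + 128 bytes for alignment
  uint64_t dataBytes;    // BlockCount * BlockSize bytes for key and value data
};

inline LayoutSketch layoutSketch(uint64_t blockSize, uint64_t blockCount)
{
  LayoutSketch l;
  l.headerBytes = 3 * sizeof(uint64_t);
  l.listBytes   = sizeof(uint64_t) + blockCount * sizeof(uint32_t) + 128;
  l.dataBytes   = blockCount * blockSize;
  return l;
}
```

 With 4096 byte blocks and 1024 blocks this gives 24 header bytes, 4232 bytes for the concurrent list and 4,194,304 bytes (4 MiB) for key and value data.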
*/

// -todo: make a list cut itself off at the end by inserting LIST_END as the last value
// -todo: look into readers and matching - should two threads with the same key ever be able to double insert into the db? - MATCH_REMOVED was not re-looping on the current index
// -todo: make MATCH_REMOVED restart the current index
// -todo: make runIfMatch return a pair that includes the return value of the function it runs
// -todo: make sure version setting on free sets the version to 0 on the whole list
// -todo: make sure incReaders and decReaders are using explicit sequential consistency - already done
// -todo: make sure that if there is a version mismatch when comparing a block list, the block list version is still used when trying to swap the version+idx - would only the index actually be needed since a block list with incremented readers won't give up its index, thus it should be unique?
// -todo: take version argument out of incReaders and decReaders
// -todo: make a temporary thread_local variable for each thread to count how many allocations it has made and how many allocations it has freed - worked very well to narrow down the problem
// -todo: make sure that the VerIdx being returned from putHashed is actually what was atomically swapped out
// -todo: try putting LIST_END at the end of the concurrent lists - not needed for now
// -todo: debug why 2 threads inserting the same key seems to need all blocks instead of just 3 * 2 * 2 (three blocks per key * two threads * two block lists per thread) - delete flag in block lists was not always set
// -todo: assert that the block list is never already deleted when being deleted from putHashed - that wasn't the problem
// -todo: check what happens when the same key but different versions are inserted - do two different versions end up in the DB? does one version end up undeletable?
//        - this was fixed by only comparing the key without the version
// -todo: check path of thread that deletes a key, make sure it replaces the index in the hash map - how do two conflicting indices in the hash map resolve? the thread that replaces needs to delete the old allocation using the version - is the version / deleted flag being changed atomically in the block list
// -todo: change the Match enum to be a bitfield with flags - not needed for now
// -todo: make simdb len() and get() ignore version numbers for match and only match keys

// todo: make sure get() only increments and decrements the first/key block in the block list
// todo: make simdb give a proper error if running out of space
// todo: make simdb expand when either out of space or initialized with a larger amount of space
// todo: make a get function that takes a key version struct
// todo: make a get function that returns a tbl if tbl.hpp is included

#ifdef _MSC_VER
  #pragma once
  #pragma warning(push, 0)
#endif

#ifndef __SIMDB_HEADER_GUARD__
#define __SIMDB_HEADER_GUARD__

// turn asserts on and off - not sure of the best way to handle this with gcc and clang yet
#ifdef _MSC_VER
  #if !defined(_DEBUG)
    #define NDEBUG
  #endif
#endif

#if !defined(SECTION)
  #define SECTION(_msvc_only_collapses_macros_with_arguments, ...)
151 | #endif 152 | 153 | // platform specific includes - mostly for shared memory mapping and auxillary functions like open, close and the windows equivilents 154 | #if defined(_WIN32) // windows 155 | #include 156 | #include 157 | 158 | #include 159 | 160 | //#ifdef UNICODE 161 | // #undef UNICODE 162 | //#endif 163 | #define NOMINMAX 164 | #define WIN32_LEAN_AND_MEAN 165 | #include 166 | #include 167 | 168 | #ifdef MIN 169 | #undef MIN 170 | #endif 171 | #ifdef MAX 172 | #undef MAX 173 | #endif 174 | 175 | #ifdef _MSC_VER 176 | #if !defined(_CRT_SECURE_NO_WARNINGS) 177 | #define _CRT_SECURE_NO_WARNINGS 178 | #endif 179 | 180 | #if !defined(_SCL_SECURE_NO_WARNINGS) 181 | #define _SCL_SECURE_NO_WARNINGS 182 | #endif 183 | #endif 184 | #elif defined(__APPLE__) || defined(__MACH__) || defined(__unix__) || defined(__FreeBSD__) || defined(__linux__) // osx, linux and freebsd 185 | // for mmap and munmap 186 | // PROT_READ and PROT_WRITE to allow reading and writing but not executing of the mapped memory pages 187 | // MAP_ANONYMOUS | MAP_SHARED for the anonymous shared memory we want 188 | // mmap is system call 2 on osx, freebsd, and linux 189 | // the apple docs for mmap say "BSD System Calls" so I guess they haven't changed them around 190 | #include 191 | #include 192 | #include 193 | #include 194 | #include // for flock (file lock) 195 | #include 196 | #include 197 | #include 198 | #include 199 | #include 200 | #endif 201 | 202 | #include 203 | #include 204 | #include 205 | #include 206 | #include 207 | #include 208 | #include 209 | #include 210 | #include 211 | #include 212 | #include 213 | 214 | // platform specific type definitions 215 | #ifdef _WIN32 // these have to be outside the anonymous namespace 216 | typedef void *HANDLE; 217 | typedef HANDLE *PHANDLE; 218 | typedef wchar_t WCHAR; // wc, 16-bit UNICODE character 219 | typedef UCHAR BOOLEAN; // winnt 220 | typedef unsigned long ULONG; 221 | #endif 222 | 223 | //#ifndef NDEBUG 224 | thread_local int 
__simdb_allocs = 0; 225 | thread_local int __simdb_deallocs = 0; 226 | //#endif 227 | 228 | namespace { 229 | enum Match { MATCH_FALSE=0, MATCH_TRUE=1, MATCH_REMOVED = -1, MATCH_TRUE_WRONG_VERSION = -2 }; 230 | 231 | template 232 | class lava_noop 233 | { 234 | void operator()(){} 235 | }; 236 | 237 | inline uint64_t fnv_64a_buf(void const *const buf, uint64_t len) // sbassett - I know basically nothing about hash functions and there is likely a better one out there 238 | { 239 | uint64_t hval = 0xCBF29CE484222325; 240 | uint8_t* bp = (uint8_t*)buf; // start of buffer 241 | uint8_t* be = bp + len; // beyond end of buffer 242 | while(bp < be){ // FNV-1a hash each octet of the buffer 243 | hval ^= (uint64_t)*bp++; // xor the bottom with the current octet */ 244 | hval += (hval << 1) + (hval << 4) + (hval << 5) + 245 | (hval << 7) + (hval << 8) + (hval << 40); 246 | } 247 | return hval; 248 | } 249 | 250 | inline void prefetch1(char const* const p) 251 | { 252 | #ifdef _MSC_VER // if msvc or intel compilers 253 | _mm_prefetch(p, _MM_HINT_T1); 254 | #elif defined(__GNUC__) || defined(__clang__) 255 | __builtin_prefetch(p); 256 | #else 257 | 258 | #endif 259 | } 260 | 261 | #ifdef _WIN32 262 | typedef struct _UNICODE_STRING { 263 | USHORT Length; 264 | USHORT MaximumLength; 265 | #ifdef MIDL_PASS 266 | [size_is(MaximumLength / 2), length_is((Length) / 2) ] USHORT * Buffer; 267 | #else // MIDL_PASS 268 | _Field_size_bytes_part_(MaximumLength, Length) PWCH Buffer; 269 | #endif // MIDL_PASS 270 | } UNICODE_STRING; 271 | typedef UNICODE_STRING *PUNICODE_STRING; 272 | 273 | typedef struct _OBJECT_ATTRIBUTES { 274 | ULONG Length; 275 | HANDLE RootDirectory; 276 | PUNICODE_STRING ObjectName; 277 | ULONG Attributes; 278 | PVOID SecurityDescriptor; // Points to type SECURITY_DESCRIPTOR 279 | PVOID SecurityQualityOfService; // Points to type SECURITY_QUALITY_OF_SERVICE 280 | } OBJECT_ATTRIBUTES; 281 | typedef OBJECT_ATTRIBUTES *POBJECT_ATTRIBUTES; 282 | 283 | typedef long LONG; 
284 | typedef LONG NTSTATUS; 285 | 286 | // the following is api poison, but seems to be the only way to list the global anonymous memory maps in windows 287 | #define DIRECTORY_QUERY 0x0001 288 | #define STATUS_SUCCESS ((NTSTATUS)0x00000000L) // ntsubauth 289 | #define OBJ_CASE_INSENSITIVE 0x00000040L 290 | #define STATUS_NO_MORE_FILES ((NTSTATUS)0x80000006L) 291 | #define STATUS_NO_MORE_ENTRIES ((NTSTATUS)0x8000001AL) 292 | 293 | typedef struct _IO_STATUS_BLOCK { 294 | union { 295 | NTSTATUS Status; 296 | PVOID Pointer; 297 | }; 298 | ULONG_PTR Information; 299 | } IO_STATUS_BLOCK, *PIO_STATUS_BLOCK; 300 | 301 | using NTOPENDIRECTORYOBJECT = NTSTATUS (WINAPI*)( 302 | _Out_ PHANDLE DirectoryHandle, 303 | _In_ ACCESS_MASK DesiredAccess, 304 | _In_ POBJECT_ATTRIBUTES ObjectAttributes 305 | ); 306 | using NTOPENFILE = NTSTATUS (WINAPI*)( 307 | _Out_ PHANDLE FileHandle, 308 | _In_ ACCESS_MASK DesiredAccess, 309 | _In_ POBJECT_ATTRIBUTES ObjectAttributes, 310 | _Out_ PIO_STATUS_BLOCK IoStatusBlock, 311 | _In_ ULONG ShareAccess, 312 | _In_ ULONG OpenOptions 313 | ); 314 | using NTQUERYDIRECTORYOBJECT = NTSTATUS(WINAPI*)( 315 | _In_ HANDLE DirectoryHandle, 316 | _Out_opt_ PVOID Buffer, 317 | _In_ ULONG Length, 318 | _In_ BOOLEAN ReturnSingleEntry, 319 | _In_ BOOLEAN RestartScan, 320 | _Inout_ PULONG Context, 321 | _Out_opt_ PULONG ReturnLength 322 | ); 323 | using RTLINITUNICODESTRING = VOID(*)( 324 | _Out_ PUNICODE_STRING DestinationString, 325 | _In_opt_ PCWSTR SourceString 326 | ); 327 | 328 | struct OBJECT_DIRECTORY_INFORMATION { UNICODE_STRING name; UNICODE_STRING type; }; 329 | 330 | //auto GetLastErrorStdStr() -> std::string 331 | //{ 332 | // DWORD error = GetLastError(); 333 | // if (error) 334 | // { 335 | // LPVOID lpMsgBuf; 336 | // DWORD bufLen = FormatMessage( 337 | // FORMAT_MESSAGE_ALLOCATE_BUFFER | 338 | // FORMAT_MESSAGE_FROM_SYSTEM | 339 | // FORMAT_MESSAGE_IGNORE_INSERTS, 340 | // NULL, 341 | // error, 342 | // MAKELANGID(LANG_NEUTRAL, 
  //      SUBLANG_DEFAULT),
  //      (LPTSTR) &lpMsgBuf,
  //      0, NULL );
  //    if (bufLen)
  //    {
  //      LPCSTR lpMsgStr = (LPCSTR)lpMsgBuf;
  //      std::string result(lpMsgStr, lpMsgStr+bufLen);
  //
  //      LocalFree(lpMsgBuf);
  //
  //      return result;
  //    }
  //  }
  //  return std::string();
  //}
  PVOID  GetLibraryProcAddress(PSTR LibraryName, PSTR ProcName)
  {
    return GetProcAddress(GetModuleHandleA(LibraryName), ProcName);
  }
  int    win_printf(const char * format, ...)
  {
    char    szBuff[1024];
    int     retValue;
    DWORD   cbWritten;
    va_list argptr;

    va_start( argptr, format );
    retValue = wvsprintfA( szBuff, format, argptr );
    va_end( argptr );

    WriteFile( GetStdHandle(STD_OUTPUT_HANDLE), szBuff, retValue,
               &cbWritten, 0 );

    return retValue;
  }
#endif // end #ifdef _WIN32
}

#ifdef _WIN32
  #pragma warning(pop)
#endif

enum class simdb_error {
  NO_ERRORS=2,
  DIR_NOT_FOUND,
  DIR_ENTRY_ERROR,
  COULD_NOT_OPEN_MAP_FILE,
  COULD_NOT_MEMORY_MAP_FILE,
  SHARED_MEMORY_ERROR,
  FTRUNCATE_FAILURE,
  FLOCK_FAILURE,
  PATH_TOO_LONG
};

template<class T>
class lava_vec
{
public:
  using u64 = uint64_t;

private:
  void* p;

  void set_sizeBytes(u64 sb){ ((u64*)p)[-1] = sb; }   // an offset of -1 is the 8 bytes just before the data, which store the size in bytes of the whole memory span of this lava_vec

public:
  static u64 sizeBytes(u64 count)                     // sizeBytes is meant to take the same arguments as a constructor and return the total number of bytes to hold the entire structure given those arguments
  {
    return sizeof(u64) + count*sizeof(T);
  }

  lava_vec(){}
  lava_vec(void* addr, u64 count, bool owner=true) :
    p( ((u64*)addr) + 1 )
  {
    if(owner){
      set_sizeBytes(
        lava_vec::sizeBytes(count) );
    }
  }
  lava_vec(void* addr) : p( ((u64*)addr) + 1 ) {}     // skips the single u64 size header, matching the constructor above
  lava_vec(lava_vec const&)       = delete;
  void operator=(lava_vec const&) = delete;

  lava_vec(lava_vec&& rval){ p=rval.p; rval.p=nullptr; }
  ~lava_vec(){}

  T& operator[](u64 i){ return data()[i]; }

  T*   data(){ return (T*)p; }
  u64  sizeBytes() const { return ((u64*)p)[-1]; }    // the 8 bytes just before the data hold the total size of the buffer in bytes (written by set_sizeBytes)
  auto addr() const -> void*
  {
    return p;
  }
};
class CncrLst
{
// Internally this is an array of indices that makes a linked list
// Externally indices can be gotten atomically and given back atomically
// | This is used to get free indices one at a time, and give back in-use indices one at a time
// Uses the first 8 bytes that would normally store sizeBytes as the 64 bits of memory for the Head structure
// Aligns the head on a 64 byte boundary with the rest of the memory on a separate 64 byte boundary. This puts them on separate cache lines which should eliminate false sharing between cores when atomically accessing the Head union (which will happen quite a bit)
public:
  using u32     = uint32_t;
  using u64     = uint64_t;
  using au32    = volatile std::atomic<u32>;
  using au64    = volatile std::atomic<u64>;
  using ListVec = lava_vec<u32>;

  union Head
  {
    struct { u32 ver; u32 idx; };                      // ver is version, idx is index
    u64 asInt;
  };

  static const u32 LIST_END        = 0xFFFFFFFF;
  static const u32 NXT_VER_SPECIAL = 0xFFFFFFFF;

//private:
  ListVec        s_lv;
  volatile au64* s_h;

public:
  static u64 sizeBytes(u32 size) { return ListVec::sizeBytes(size) + 128; }   // an extra 128 bytes so that Head can be placed (why 128 bytes?
                                                                              // so that the head can be aligned on its own cache line to avoid false sharing, since it is a potential bottleneck)
  static u32 incVersion(u32 v) { return v==NXT_VER_SPECIAL? 1 : v+1; }

  CncrLst(){}
  CncrLst(void* addr, u32 size, bool owner=true)       // this constructor is for when the memory is owned and needs to be initialized
  {                                                    // separate out initialization and let it be done explicitly in the simdb constructor?
    u64 addrRem   = (u64)addr % 64;
    u64 alignAddr = (u64)addr + (64-addrRem);
    assert( alignAddr % 64 == 0 );
    s_h = (au64*)alignAddr;

    u32* listAddr = (u32*)((u64)alignAddr+64);
    new (&s_lv) ListVec(listAddr, size, owner);

    if(owner){
      for(u32 i=0; i<(size-1); ++i) s_lv[i] = i+1;
      s_lv[size-1] = LIST_END;

      ((Head*)s_h)->idx = 0;
      ((Head*)s_h)->ver = 0;
    }
  }

  bool headCmpEx(u64* expected, u64 desired)           // desired is a plain u64, since std::atomic is not copyable and should not be passed by value
  {
    using namespace std;

    //return atomic_compare_exchange_strong_explicit(
    //  s_h, (volatile au64*)&expected, desired,
    //  memory_order_seq_cst, memory_order_seq_cst
    //  );

    //return atomic_compare_exchange_strong(
    //  s_h, (volatile au64*)&expected, desired
    //);

    return atomic_compare_exchange_strong_explicit(
      s_h, expected, desired,
      memory_order_seq_cst, memory_order_seq_cst
    );
  }
  u32 nxt()                                            // moves forward in the list and returns the previous head index
  {
    Head curHead, nxtHead;
    curHead.asInt = s_h->load();
    do{
      if(curHead.idx==LIST_END){
        return LIST_END;
      }

      nxtHead.idx = s_lv[curHead.idx];
      nxtHead.ver = curHead.ver==NXT_VER_SPECIAL?
1 : curHead.ver+1; 517 | }while( !headCmpEx( &curHead.asInt, nxtHead.asInt) ); 518 | //}while( !headCmpEx(curHead.asInt, nxtHead.asInt) ); 519 | //}while( !s_h->compare_exchange_strong(curHead.asInt, nxtHead.asInt) ); 520 | 521 | return curHead.idx; 522 | } 523 | u32 nxt(u32 prev) // moves forward in the list and return the previous index 524 | { 525 | using namespace std; 526 | 527 | Head curHead, nxtHead, prevHead; 528 | curHead.asInt = s_h->load(); 529 | do{ 530 | if(curHead.idx==LIST_END){ 531 | return LIST_END; 532 | } 533 | 534 | prevHead = curHead; 535 | nxtHead.idx = s_lv[curHead.idx]; 536 | nxtHead.ver = curHead.ver==NXT_VER_SPECIAL? 1 : curHead.ver+1; 537 | }while( !headCmpEx( &curHead.asInt, nxtHead.asInt) ); 538 | //}while( !headCmpEx(curHead.asInt, nxtHead.asInt) ); 539 | //}while( !s_h->compare_exchange_strong(curHead.asInt, nxtHead.asInt) ); 540 | 541 | //s_lv[prev] = curHead.idx; 542 | atomic_store( (au32*)&s_lv[prev], curHead.idx); 543 | 544 | return curHead.idx; 545 | } 546 | u32 free(u32 idx) // not thread safe when reading from the list, but it doesn't matter because you shouldn't be reading while freeing anyway, since the CncrHsh will already have the index taken out and the free will only be triggered after the last reader has read from it 547 | { 548 | Head curHead, nxtHead; u32 retIdx; 549 | curHead.asInt = s_h->load(); 550 | do{ 551 | retIdx = s_lv[idx] = curHead.idx; 552 | nxtHead.idx = idx; 553 | nxtHead.ver = curHead.ver + 1; 554 | }while( !headCmpEx( &curHead.asInt, nxtHead.asInt) ); 555 | //}while( !headCmpEx(curHead.asInt, nxtHead.asInt) ); 556 | //}while( !s_h->compare_exchange_strong(curHead.asInt, nxtHead.asInt) ); 557 | 558 | return retIdx; 559 | } 560 | u32 free(u32 st, u32 en) // not thread safe when reading from the list, but it doesn't matter because you shouldn't be reading while freeing anyway, since the CncrHsh will already have the index taken out and the free will only be triggered after the last reader has read from it 
561 | { 562 | using namespace std; 563 | 564 | Head curHead, nxtHead; u32 retIdx; 565 | curHead.asInt = s_h->load(); 566 | do{ 567 | //retIdx = s_lv[en] = curHead.idx; 568 | retIdx = curHead.idx; 569 | atomic_store_explicit( (au32*)&(s_lv[en]), curHead.idx, memory_order_seq_cst); 570 | //atomic_store( (au32*)&(s_lv[en]), curHead.idx); 571 | nxtHead.idx = st; 572 | nxtHead.ver = curHead.ver + 1; 573 | }while( !headCmpEx( &curHead.asInt, nxtHead.asInt) ); 574 | //}while( !headCmpEx(curHead.asInt, nxtHead.asInt) ); 575 | //}while( !s_h->compare_exchange_strong(curHead.asInt, nxtHead.asInt) ); 576 | 577 | return retIdx; 578 | } 579 | u32 alloc(u32 count) 580 | { 581 | u32 st = nxt(); 582 | u32 cur = st; 583 | if(st == LIST_END) return LIST_END; 584 | else --count; 585 | 586 | while( count > 0 ){ 587 | u32 nxtIdx = nxt(cur); 588 | if(nxtIdx == LIST_END){ 589 | free(st,cur); 590 | return LIST_END; 591 | } 592 | cur = nxtIdx; 593 | --count; 594 | } 595 | 596 | //s_lv[cur] = LIST_END; 597 | return st; 598 | } 599 | auto count() const -> u32 { return ((Head*)s_h)->ver; } 600 | auto idx() const -> u32 601 | { 602 | Head h; 603 | h.asInt = s_h->load(); 604 | return h.idx; 605 | } 606 | auto list() -> ListVec const* { return &s_lv; } // not thread safe 607 | u32 lnkCnt() // not thread safe 608 | { 609 | u32 cnt = 0; 610 | u32 curIdx = idx(); 611 | while( curIdx != LIST_END ){ 612 | curIdx = s_lv[curIdx]; 613 | ++cnt; 614 | } 615 | return cnt; 616 | } 617 | auto head() -> Head* { return (Head*)s_h; } 618 | }; 619 | class CncrStr // CncrStr is Concurrent Store 620 | { 621 | public: 622 | using u8 = uint8_t; 623 | using u32 = uint32_t; 624 | using i32 = int32_t; 625 | using u64 = uint64_t; 626 | using i64 = int64_t; 627 | using au32 = std::atomic<u32>; 628 | using au64 = std::atomic<u64>; 629 | 630 | union VerIdx 631 | { 632 | struct { u32 idx; u32 version; }; 633 | u64 asInt; 634 | 635 | VerIdx(){} 636 | VerIdx(u32 _idx, u32 _version) : idx(_idx), version(_version) {} 637 | }; 638 | union
KeyReaders 639 | { 640 | struct{ u32 isKey : 1; u32 isDeleted : 1; i32 readers : 30; }; 641 | u32 asInt; 642 | }; 643 | struct BlkLst // 24 bytes total 644 | { 645 | union{ 646 | KeyReaders kr; 647 | struct{ u32 isKey : 1; u32 isDeleted : 1; i32 readers : 30; }; 648 | }; // 4 bytes - kr is key readers 649 | u32 idx, version, len, klen, hash; // 20 bytes 650 | 651 | BlkLst() : isKey(0), isDeleted(0), readers(0), idx(0), version(0), len(0), klen(0), hash(0) {} 652 | BlkLst(bool _isKey, i32 _readers, u32 _idx, u32 _version, u32 _len=0, u32 _klen=0, u32 _hash=0) : 653 | isKey(_isKey), 654 | isDeleted(0), 655 | readers(_readers), 656 | idx(_idx), 657 | version(_version), 658 | hash(_hash) 659 | { 660 | len = _len; 661 | klen = _klen; 662 | } 663 | }; 664 | struct BlkCnt { u32 end : 1; u32 cnt : 31; }; // this is returned from alloc() and may not be necessary - it is the number of blocks allocated and if the end was reached 665 | 666 | using ai32 = std::atomic<i32>; 667 | using BlockLists = lava_vec<BlkLst>; // only the indices returned from the concurrent list are altered, and only one thread will deal with any single index at a time 668 | 669 | static const u32 LIST_END = CncrLst::LIST_END; 670 | 671 | static VerIdx List_End() 672 | { 673 | VerIdx vi; 674 | vi.idx = CncrLst::LIST_END; 675 | vi.version = 0; 676 | return vi; 677 | } 678 | static bool IsListEnd(VerIdx vi) 679 | { 680 | static const VerIdx empty = List_End(); 681 | return empty.asInt == vi.asInt; 682 | } 683 | 684 | bool cmpEx(au32* val, u32* expected, u32 desired) const 685 | { 686 | using namespace std; 687 | return atomic_compare_exchange_strong_explicit( 688 | val, expected, desired, 689 | memory_order_seq_cst, memory_order_seq_cst 690 | ); 691 | } 692 | BlkLst incReaders(u32 blkIdx) const //u32 version) const // blkIdx is the block index - increment the readers by one and return the block list entry after the successful swap 693 | { 694 | using namespace std; 695 | 696 | KeyReaders cur, nxt; 697 | BlkLst* bl = &s_bls[blkIdx]; 698
| au32* areaders = (au32*)&(bl->kr); 699 | cur.asInt = atomic_load_explicit(areaders, memory_order_seq_cst); 700 | do{ 701 | if(cur.readers<0 || cur.isDeleted){ return BlkLst(); } 702 | nxt = cur; 703 | nxt.readers += 1; 704 | }while( !cmpEx(areaders, &cur.asInt, nxt.asInt) ); 705 | 706 | return *bl; // after readers has been incremented this block list entry is not going away. The only thing that would change would be the readers and that doesn't matter to the calling function. 707 | 708 | //cur.asInt = areaders->load(); 709 | // 710 | //if(bl->version!=version || cur.readers<0 || cur.isDeleted){ return BlkLst(); } 711 | // 712 | //}while( !areaders->compare_exchange_strong(cur.asInt, nxt.asInt) ); 713 | } 714 | //bool decReadersOrDel(u32 blkIdx, u32 version, bool del=false) const 715 | bool decReadersOrDel(u32 blkIdx, bool del=false) const // blkIdx is the block index - decrements the readers by one, or marks the block list as deleted when del is true; returns false only when this call freed the blocks 716 | { 717 | using namespace std; 718 | 719 | KeyReaders cur, nxt; bool doDelete=false; 720 | 721 | BlkLst* bl = &s_bls[blkIdx]; 722 | au32* areaders = (au32*)&(bl->kr); 723 | cur.asInt = atomic_load_explicit(areaders, memory_order_seq_cst); 724 | do{ 725 | doDelete = false; 726 | nxt = cur; 727 | if(del){ 728 | if(cur.isDeleted){ return true; } 729 | if(cur.readers==0){ 730 | doDelete = true; 731 | } 732 | nxt.isDeleted = true; 733 | }else{ 734 | if(cur.readers==1 && cur.isDeleted){ doDelete=true; } 735 | nxt.readers -= 1; 736 | } 737 | }while( !cmpEx(areaders, &cur.asInt, nxt.asInt) ); 738 | 739 | if(doDelete){ doFree(blkIdx); return false; } 740 | 741 | return true; 742 | 743 | //cur.asInt = areaders->load(); 744 | //if(bl->version!=version){ return false; } 745 | // 746 | //if(cur.readers==0 && !cur.isDeleted){ doDelete=true; } 747 | // 748 | //}while( !areaders->compare_exchange_strong(cur.asInt, nxt.asInt) ); 749 | //
750 | //return cur.isDeleted; 751 | } 752 | 753 | //private: 754 | // s_ variables are used to indicate data structures and memory that is in the shared memory, usually just a pointer on the stack and of course, nothing on the heap 755 | // The order of the shared memory as it is in the memory mapped file: Version, CncrLst, BlockLists, Blocks 756 | mutable CncrLst s_cl; // flat data structure - pointer to memory 757 | mutable BlockLists s_bls; // flat data structure - pointer to memory - bl is Block Lists 758 | void* s_blksAddr; // points to the block space in the shared memory 759 | au64* s_version; // pointer to the shared version number 760 | 761 | u32 m_blockSize; 762 | u64 m_szBytes; 763 | 764 | VerIdx nxtBlock(u32 blkIdx) const 765 | { 766 | BlkLst bl = s_bls[blkIdx]; 767 | prefetch1( (char const* const)blockFreePtr(bl.idx) ); 768 | return VerIdx(bl.idx, bl.version); 769 | } 770 | u32 blockFreeSize() const { return m_blockSize; } 771 | u8* blockFreePtr(u32 blkIdx) const { return ((u8*)s_blksAddr) + blkIdx*m_blockSize; } 772 | u8* blkPtr(u32 blkIdx) const { return ((u8*)s_blksAddr) + blkIdx*m_blockSize; } 773 | u32 blocksNeeded(u32 len, u32* out_rem=nullptr) 774 | { 775 | u32 freeSz = blockFreeSize(); 776 | u32 byteRem = len % freeSz; 777 | u32 blocks = len / freeSz + (byteRem? 
1 : 0); // should never be 0 if blocksize is greater than the size of the index type 778 | 779 | if(out_rem) *out_rem = byteRem; 780 | 781 | return blocks; 782 | } 783 | u32 findEndSetVersion(u32 blkIdx, u32 version) const // find the last BlkLst slot in the linked list of blocks to free 784 | { 785 | u32 cur=blkIdx, prev=blkIdx; // the first index will have its version set twice 786 | while(cur != LIST_END){ 787 | s_bls[cur].version = version; 788 | prev = cur; 789 | cur = s_bls[cur].idx; 790 | } 791 | return prev; 792 | 793 | //assert(s_cl.s_lv[cur] == s_bls[cur].idx); 794 | // 795 | //sim_assert(s_cl.s_lv[cur]==s_bls[cur].idx, s_cl.s_lv[cur], s_bls[cur].idx ); 796 | // 797 | //auto lvIdx = s_cl.s_lv[cur]; 798 | //auto blsIdx = s_bls[cur].idx; 799 | //sim_assert(lvIdx == blsIdx, lvIdx, blsIdx ); 800 | // 801 | //sim_assert(s_cl.s_lv[prev]==s_bls[prev].idx, s_cl.s_lv[prev], s_bls[prev].idx ); 802 | // 803 | //return cur; 804 | } 805 | void doFree(u32 blkIdx) const // frees a list/chain of blocks - don't need to zero out the memory of the blocks or reset any of the BlkLsts' variables since they will be re-initialized anyway 806 | { 807 | using namespace std; 808 | 809 | u32 listEnd = findEndSetVersion(blkIdx, 0); 810 | 811 | 812 | //sim_assert(s_lv[en], s_lv[en] == LIST_END, en); 813 | //assert(s_cl.s_lv[listEnd] == LIST_END); 814 | 815 | s_cl.free(blkIdx, listEnd); 816 | 817 | __simdb_deallocs += 1; 818 | 819 | // doesn't work - LIST_END only works for allocation 820 | //u32 cur = blkIdx; 821 | //while(cur != LIST_END) 822 | // cur = s_cl.free(cur); 823 | } 824 | u32 writeBlock(u32 blkIdx, void const* const bytes, u32 len=0, u32 ofst=0) // don't need to increment readers since write should be done before the block is exposed to any other threads 825 | { 826 | u32 blkFree = blockFreeSize(); 827 | u8* p = blockFreePtr(blkIdx); 828 | u32 cpyLen = len==0? 
blkFree : len; // if next is negative, then it will be the length of the bytes in that block 829 | p += ofst; 830 | memcpy(p, bytes, cpyLen); 831 | 832 | return cpyLen; 833 | } 834 | u32 readBlock(u32 blkIdx, u32 version, void *const bytes, u32 ofst=0, u32 len=0) const 835 | { 836 | //BlkLst bl = incReaders(blkIdx, version); 837 | BlkLst bl = incReaders(blkIdx); 838 | if(bl.version==0){ return 0; } 839 | u32 blkFree = blockFreeSize(); 840 | u8* p = blockFreePtr(blkIdx); 841 | u32 cpyLen = len==0? blkFree-ofst : len; 842 | memcpy(bytes, p+ofst, cpyLen); 843 | decReadersOrDel(blkIdx); 844 | //decReadersOrDel(blkIdx, version); 845 | 846 | return cpyLen; 847 | } 848 | 849 | public: 850 | static u64 BlockListsOfst(){ return sizeof(u64); } 851 | static u64 CListOfst(u32 blockCount){ return BlockListsOfst() + BlockLists::sizeBytes(blockCount); } // BlockLists::sizeBytes ends up being sizeof(BlkLst)*blockCount + 2 u64 variables 852 | static u64 BlksOfst(u32 blockCount){ return CListOfst(blockCount) + CncrLst::sizeBytes(blockCount); } 853 | static u64 sizeBytes(u32 blockSize, u32 blockCount){ return BlksOfst(blockCount) + blockSize*blockCount; } 854 | 855 | CncrStr(){} 856 | CncrStr(void* addr, u32 blockSize, u32 blockCount, bool owner=true) : 857 | s_cl( (u8*)addr + CListOfst(blockCount), blockCount, owner), 858 | s_bls( (u8*)addr + BlockListsOfst(), blockCount, owner), 859 | s_blksAddr( (u8*)addr + BlksOfst(blockCount) ), 860 | s_version( (au64*)addr ), 861 | m_blockSize(blockSize), 862 | m_szBytes( *((u64*)addr) ) 863 | { 864 | if(owner){ 865 | for(u32 i=0; istore(1); // todo: what is this version for if CncrLst already has a version? 
867 | } 868 | assert(blockSize > sizeof(i32)); 869 | } 870 | 871 | auto alloc(u32 size, u32 klen, u32 hash, BlkCnt* out_blocks=nullptr) -> VerIdx 872 | { 873 | u32 byteRem = 0; 874 | u32 blocks = blocksNeeded(size, &byteRem); 875 | u32 st = s_cl.alloc(blocks); 876 | SECTION(handle allocation errors from the concurrent list){ 877 | if(st==LIST_END){ 878 | if(out_blocks){ *out_blocks = {true, 0} ; } 879 | return List_End(); 880 | } 881 | } 882 | 883 | u32 ver = (u32)s_version->fetch_add(1); 884 | u32 cur=st, cnt=0; 885 | SECTION(loop for the number of blocks needed and get new block and link it to the list) 886 | { 887 | for(u32 i=0; iend = s_cl.s_lv[cur] == LIST_END; 898 | out_blocks->cnt = cnt; 899 | } 900 | 901 | s_bls[cur] = BlkLst(false,0,LIST_END,ver,size,0,0); // if there is only one block needed, cur and st could be the same 902 | 903 | auto b = s_bls[st]; // debugging 904 | 905 | s_bls[st].isKey = true; 906 | s_bls[st].hash = hash; 907 | s_bls[st].len = size; 908 | s_bls[st].klen = klen; 909 | s_bls[st].isDeleted = false; 910 | 911 | __simdb_allocs += 1; 912 | 913 | VerIdx vi(st, ver); 914 | return vi; 915 | } 916 | } 917 | bool free(u32 blkIdx, u32 version) // doesn't always free a list/chain of blocks - it decrements the readers and when the readers gets below the value that it started at, only then it is deleted (by the first thread to take it below the starting number) 918 | { 919 | //return decReadersOrDel(blkIdx, version, true); 920 | return decReadersOrDel(blkIdx, true); 921 | } 922 | void put(u32 blkIdx, void const *const kbytes, u32 klen, void const *const vbytes, u32 vlen) // don't need version because this will only be used after allocating and therefore will only be seen by one thread until it is inserted into the ConcurrentHash 923 | { 924 | using namespace std; 925 | 926 | u8* b = (u8*)kbytes; 927 | bool kjagged = (klen % blockFreeSize()) != 0; 928 | u32 kblocks = kjagged? 
blocksNeeded(klen)-1 : blocksNeeded(klen); 929 | u32 remklen = klen - (kblocks*blockFreeSize()); 930 | 931 | u32 fillvlen = min(vlen, blockFreeSize()-remklen); 932 | u32 tailvlen = vlen-fillvlen; 933 | bool vjagged = (tailvlen % blockFreeSize()) != 0; 934 | u32 vblocks = vjagged? blocksNeeded(tailvlen)-1 : blocksNeeded(tailvlen); 935 | u32 remvlen = max(0, tailvlen - (vblocks*blockFreeSize()) ); 936 | 937 | u32 cur = blkIdx; 938 | for(u32 i=0; i0){ 953 | b += writeBlock(cur, b, remvlen); 954 | } 955 | } 956 | u32 get(u32 blkIdx, u32 version, void *const bytes, u32 maxlen, u32* out_readlen=nullptr) const 957 | { 958 | using namespace std; 959 | 960 | if(blkIdx == LIST_END){ return 0; } 961 | 962 | //BlkLst bl = incReaders(blkIdx, version); 963 | BlkLst bl = incReaders(blkIdx); 964 | 965 | u32 vlen = bl.len-bl.klen; 966 | if(bl.len==0 || vlen>maxlen ) return 0; 967 | 968 | auto kdiv = div((i64)bl.klen, (i64)blockFreeSize()); 969 | auto kblks = kdiv.quot; 970 | u32 krem = (u32)kdiv.rem; 971 | auto vrdLen = 0; 972 | u32 len = 0; 973 | u32 rdLen = 0; 974 | u8* b = (u8*)bytes; 975 | i32 cur = blkIdx; 976 | VerIdx nxt; 977 | for(int i=0; i(blockFreeSize()-krem, vlen); 983 | rdLen = (u32)readBlock(cur, version, b, krem, vrdLen); 984 | b += rdLen; 985 | len += rdLen; 986 | nxt = nxtBlock(cur); if(nxt.version!=version){ goto read_failure; } 987 | 988 | while(len(blockFreeSize(), maxlen-len); 991 | cur = nxt.idx; 992 | rdLen = readBlock(cur, version, b, 0, vrdLen); if(rdLen==0) break; // rdLen is read length 993 | b += rdLen; 994 | len += rdLen; 995 | nxt = nxtBlock(cur); 996 | } 997 | 998 | if(out_readlen){ *out_readlen = len; } 999 | 1000 | read_failure: 1001 | decReadersOrDel(blkIdx, false); 1002 | //decReadersOrDel(blkIdx, version); 1003 | 1004 | return len; // only one return after the top to make sure readers can be decremented - maybe it should be wrapped in a struct with a destructor 1005 | } 1006 | u32 getKey(u32 blkIdx, u32 version, void *const bytes, u32 maxlen) 
const 1007 | { 1008 | if(blkIdx == LIST_END){ return 0; } 1009 | 1010 | //BlkLst bl = incReaders(blkIdx, version); 1011 | BlkLst bl = incReaders(blkIdx); 1012 | 1013 | if(bl.len==0 || (bl.klen)>maxlen ) return 0; 1014 | 1015 | auto kdiv = div((i64)bl.klen, (i64)blockFreeSize()); 1016 | auto kblks = kdiv.quot; 1017 | u32 krem = (u32)kdiv.rem; 1018 | u32 len = 0; 1019 | u32 rdLen = 0; 1020 | u8* b = (u8*)bytes; 1021 | VerIdx vi = { blkIdx, version }; 1022 | 1023 | int i=0; 1024 | while( i curlen){ 1080 | Match cmpBlk = memcmpBlk(curidx, version, curbuf, p, curlen); // the end 1081 | if(cmpBlk != MATCH_TRUE) return cmpBlk; //MATCH_FALSE; 1082 | 1083 | return verOk? MATCH_TRUE : MATCH_TRUE_WRONG_VERSION; 1084 | }else{ 1085 | Match cmp = memcmpBlk(curidx, version, curbuf, p, blksz); 1086 | if(cmp!=MATCH_TRUE){ return cmp; } 1087 | } 1088 | 1089 | curbuf += blksz; 1090 | curlen -= blksz; 1091 | curidx = nxt.idx; 1092 | nxt = nxtBlock(curidx); 1093 | 1094 | verOk &= nxt.version != version; 1095 | //if(nxt.version!=version){ return MATCH_FALSE; } 1096 | } 1097 | } 1098 | u32 len(u32 blkIdx, u32 version, u32* out_vlen=nullptr) const 1099 | { 1100 | BlkLst bl = s_bls[blkIdx]; 1101 | if(version==bl.version && bl.len>0){ 1102 | if(out_vlen) *out_vlen = bl.len - bl.klen; 1103 | return bl.len; 1104 | }else 1105 | return 0; 1106 | } 1107 | auto list() const -> CncrLst const& { return s_cl; } 1108 | auto data() const -> const void* { return (void*)s_blksAddr; } 1109 | auto blkLst(u32 i) const -> BlkLst { return s_bls[i]; } 1110 | 1111 | friend class CncrHsh; 1112 | }; 1113 | class CncrHsh 1114 | { 1115 | public: 1116 | using u8 = uint8_t; 1117 | using u32 = uint32_t; 1118 | using u64 = uint64_t; 1119 | using i64 = int64_t; 1120 | using au64 = std::atomic; 1121 | using VerIdx = CncrStr::VerIdx; 1122 | using BlkLst = CncrStr::BlkLst; 1123 | 1124 | struct VerIpd { u32 version, ipd; }; // ipd is Ideal Position Distance 1125 | 1126 | static const u32 KEY_MAX = 0xFFFFFFFF; 1127 | static 
const u32 EMPTY = KEY_MAX; // all 32 bits set 1128 | static const u32 DELETED = KEY_MAX - 1; // 0xFFFFFFFE; // 1 less than the EMPTY 1129 | static const u32 LIST_END = CncrStr::LIST_END; 1130 | static const u32 SLOT_END = CncrStr::LIST_END; 1131 | 1132 | static u64 sizeBytes(u32 size) // the size in bytes that this structure will take up in the shared memory 1133 | { 1134 | return lava_vec<VerIdx>::sizeBytes(size) + 16; // extra 16 bytes for 128 bit alignment padding 1135 | } 1136 | static u32 nextPowerOf2(u32 v) 1137 | { 1138 | v--; 1139 | v |= v >> 1; 1140 | v |= v >> 2; 1141 | v |= v >> 4; 1142 | v |= v >> 8; 1143 | v |= v >> 16; 1144 | v++; 1145 | 1146 | return v; 1147 | } 1148 | static u32 HashBytes(const void *const buf, u32 len) 1149 | { 1150 | u64 hsh = fnv_64a_buf(buf, len); 1151 | return (u32)( (hsh>>32) ^ ((u32)hsh)); 1152 | } 1153 | static VerIdx empty_vi(){ return VerIdx(EMPTY,0); } 1154 | static VerIdx deleted_vi(){ return VerIdx(DELETED,0); } 1155 | static i64 vi_i64(VerIdx vi){ u64 iVi=vi.asInt; return *((i64*)(&iVi)); } // interpret the u64 bits directly as a signed 64 bit integer instead 1156 | static i64 vi_i64(u64 i){ return *((i64*)&i); } // interpret the u64 bits directly as a signed 64 bit integer instead 1157 | static bool IsEmpty(VerIdx vi) 1158 | { 1159 | static VerIdx emptyvi = empty_vi(); 1160 | return emptyvi.asInt == vi.asInt; 1161 | } 1162 | static u32 lo32(u64 n){ return (n>>32); } 1163 | static u32 hi32(u64 n){ return (n<<32)>>32; } 1164 | static u64 swp32(u64 n){ return (((u64)hi32(n))<<32) | ((u64)lo32(n)); } 1165 | static u64 inclo32(u64 n, u32 i){ return ((u64)hi32(n)+i)<<32 | lo32(n); } 1166 | static u64 incHi32(u64 n, u32 i){ return ((u64)hi32(n))<<32 | (lo32(n)+i); } 1167 | static u64 shftToHi64(u32 n){ return ((u64)n)<<32; } 1168 | static u64 make64(u32 lo, u32 hi){ return (((u64)lo)<<32) | ((u64)hi); } 1169 | 1170 | private: 1171 | using VerIdxs = lava_vec<VerIdx>; 1172 | 1173 | u32 m_sz; 1174 | mutable VerIdxs s_vis; // s_vis is key
value(s) - needs to be changed to versioned indices, m_vis 1175 | CncrStr* m_csp; // csp is concurrent store pointer 1176 | 1177 | VerIdx store_vi(u32 i, u64 vi) const 1178 | { 1179 | using namespace std; 1180 | 1181 | bool odd = i%2 == 1; 1182 | VerIdx strVi; 1183 | if(odd) strVi = VerIdx(lo32(vi), hi32(vi)); // the odd numbers need to be swapped so that their indices are on the outer border of 128 bit alignment - the indices need to be on the border of the 128 bit boundary so they can be swapped with an unaligned 64 bit atomic operation 1184 | else strVi = VerIdx(hi32(vi), lo32(vi)); 1185 | 1186 | u64 prev = atomic_exchange_explicit( (au64*)(s_vis.data()+i), *((u64*)(&strVi)), memory_order_seq_cst); 1187 | //u64 prev = atomic_exchange( (au64*)(s_vis.data()+i), *((u64*)(&strVi)) ); 1188 | 1189 | if(odd) return VerIdx(lo32(prev), hi32(prev)); 1190 | else return VerIdx(hi32(prev), lo32(prev)); 1191 | } 1192 | bool cmpex_vi(u32 i, VerIdx expected, VerIdx desired) const 1193 | { 1194 | using namespace std; 1195 | 1196 | u64 exp = i%2? swp32(expected.asInt) : expected.asInt; // if the index (i) is odd, swap the upper and lower 32 bits around 1197 | u64 desi = i%2? swp32(desired.asInt) : desired.asInt; // desi is desired int 1198 | au64* addr = (au64*)(s_vis.data()+i); 1199 | //bool ok = addr->compare_exchange_strong( exp, desi ); 1200 | bool ok = atomic_compare_exchange_strong_explicit(addr, &exp, desi, memory_order_seq_cst, memory_order_seq_cst); 1201 | 1202 | return ok; 1203 | } 1204 | //void doFree(u32 i) const 1205 | //{ 1206 | // store_vi(i, empty_vi().asInt); 1207 | //} 1208 | VerIpd ipd(u32 i, u32 blkIdx) const // ipd is Ideal Position Distance - it is the distance a CncrHsh index value is from the position that it gets hashed to 1209 | { 1210 | BlkLst bl = m_csp->blkLst(blkIdx); 1211 | u32 ip = bl.hash % m_sz; // ip is Ideal Position 1212 | u32 ipd = i>ip?
i-ip : m_sz - ip + i; 1213 | return {bl.version, ipd}; 1214 | } 1215 | VerIdx prev(u32 i, u32* out_idx) const 1216 | { 1217 | *out_idx=prevIdx(i); 1218 | return load(*out_idx); 1219 | } 1220 | VerIdx nxt(u32 i, u32* out_idx) const 1221 | { 1222 | *out_idx=nxtIdx(i); 1223 | return load(*out_idx); 1224 | } 1225 | 1226 | //bool runIfMatch(VerIdx vi, const void* const buf, u32 len, u32 hash, FUNC f) const 1227 | //Match runIfMatch(VerIdx vi, const void* const buf, u32 len, u32 hash, FUNC f) const 1228 | template 1229 | auto runIfMatch(VerIdx vi, const void* const buf, u32 len, u32 hash, FUNC f, T defaultRet = decltype(f(vi))() ) const -> std::pair // std::pair 1230 | { 1231 | Match m; 1232 | T funcRet = defaultRet; 1233 | 1234 | //auto b = m_csp->incReaders(vi.idx, vi.version); 1235 | auto b = m_csp->incReaders(vi.idx); 1236 | SECTION(work on the now protected block list without returning until after the readers are decremented) 1237 | { 1238 | if(b.isDeleted){ 1239 | m = MATCH_REMOVED; 1240 | }else{ 1241 | m = m_csp->compare(vi.idx, vi.version, buf, len, hash); 1242 | if(m==MATCH_TRUE || m==MATCH_TRUE_WRONG_VERSION){ 1243 | //funcRet = f(vi); 1244 | funcRet = f( VerIdx(vi.idx, b.version) ); 1245 | } 1246 | } 1247 | } 1248 | //if( !m_csp->decReadersOrDel(vi.idx, vi.version, false) ){ 1249 | if( !m_csp->decReadersOrDel(vi.idx,false) ){ 1250 | m = MATCH_REMOVED; 1251 | } 1252 | 1253 | return {m, funcRet}; 1254 | 1255 | // todo: should this increment and decrement the readers, as well as doing something different if it was the thread that freed the blocks 1256 | // 1257 | //if(b.isDeleted){ m = MATCH_REMOVED; } 1258 | //b. 
1259 | // 1260 | //bool matched = false; 1261 | //decltype(f(vi)) funcRet; // not inside a scope 1262 | // 1263 | //matched=true; 1264 | // 1265 | //m_csp->decReaders(vi.idx, vi.version); 1266 | //decReaders(i); 1267 | // 1268 | //return matched; 1269 | } 1270 | 1271 | public: 1272 | CncrHsh(){} 1273 | CncrHsh(void* addr, u32 size, CncrStr* cs, bool owner=true) : 1274 | m_sz(nextPowerOf2(size)), 1275 | m_csp(cs) 1276 | { 1277 | u64 paddr = (u64)addr; // paddr is padded address 1278 | u8 rem = 16 - paddr%16; 1279 | u8 ofst = 16 - rem; 1280 | void* algnMem = (void*)(paddr+ofst); assert( ((u64)algnMem) % 16 == 0 ); 1281 | 1282 | new (&s_vis) VerIdxs(algnMem, m_sz); // initialize the lava_vec of VerIdx structs with the 128 bit aligned address 1283 | 1284 | if(owner){ 1285 | init(size, cs); 1286 | } 1287 | } 1288 | CncrHsh(CncrHsh const& lval) = delete; 1289 | CncrHsh(CncrHsh&& rval) = delete; 1290 | CncrHsh& operator=(CncrHsh const& lval) = delete; 1291 | CncrHsh& operator=(CncrHsh&& rval) = delete; 1292 | 1293 | VerIdx operator[](u32 idx) const { return s_vis[idx]; } 1294 | 1295 | VerIdx putHashed(u32 hash, VerIdx lstVi, const void *const key, u32 klen) const 1296 | { 1297 | // This function needs to return the VerIdx it was given if there was not a place for the allocation, since it would neither be stored in the hash map nor swapped for another VerIdx that will be freed 1298 | using namespace std; 1299 | static const VerIdx empty = empty_vi(); 1300 | 1301 | //VerIdx desired = lstVi; 1302 | u32 i=hash%m_sz, en=prevIdx(i); 1303 | for(;; i=nxtIdx(i) ) 1304 | { 1305 | VerIdx vi = load(i); 1306 | if(vi.idx>=DELETED){ // it is either deleted or empty 1307 | bool success = cmpex_vi(i, vi, lstVi); 1308 | if(success){ 1309 | return vi; 1310 | }else{ 1311 | i=prevIdx(i); 1312 | continue; 1313 | } // retry the same loop again if a good slot was found but it was changed by another thread between the load and the compare-exchange 1314 | } // Either we just added the key, or
another thread did. 1315 | 1316 | VerIdx foundVi = empty_vi(); 1317 | const auto ths = this; 1318 | auto f = [ths,i,lstVi,&foundVi](VerIdx vi){ 1319 | foundVi = vi; 1320 | bool success = ths->cmpex_vi(i, vi, lstVi); // this should be hit even when the versions don't match, since m_csp->compare() will return MATCH_TRUE_WRONG_VERSION 1321 | return success; 1322 | }; 1323 | auto cmpAndSuccess = runIfMatch(vi, key, klen, hash, f, false); 1324 | Match cmp = cmpAndSuccess.first; 1325 | bool success = cmpAndSuccess.second; 1326 | 1327 | if(cmp==MATCH_FALSE){ 1328 | if(i==en){ 1329 | return lstVi; // By returning the given VerIdx, we say that there was no place for it found and it needs to be deallocated 1330 | }else{ continue; } 1331 | }else if(cmp==MATCH_REMOVED){ // if the block list is marked as deleted, try this index again, since the index must have changed first 1332 | i=prevIdx(i); 1333 | continue; 1334 | } 1335 | 1336 | if(success){ 1337 | return foundVi; 1338 | //return vi; 1339 | }else{ 1340 | i=prevIdx(i); 1341 | continue; 1342 | } 1343 | } 1344 | } 1345 | 1346 | template<class FUNC, class T> 1347 | bool runMatch(const void *const key, u32 klen, u32 hash, FUNC f, T defaultRet = decltype(f(vi))() ) const 1348 | { 1349 | using namespace std; 1350 | 1351 | u32 i = hash % m_sz; 1352 | u32 en = prevIdx(i); 1353 | for(;; i=nxtIdx(i) ) 1354 | { 1355 | VerIdx vi = load(i); 1356 | if(vi.idx!=EMPTY && vi.idx!=DELETED){ 1357 | Match match = runIfMatch(vi,key,klen,hash,f, defaultRet).first; 1358 | 1359 | if(match==MATCH_TRUE || match==MATCH_TRUE_WRONG_VERSION){ return true; } 1360 | } 1361 | 1362 | if(i==en){ return false; } 1363 | } 1364 | } 1365 | 1366 | VerIdx delHashed(const void *const key, u32 klen, u32 hash) const 1367 | { 1368 | using namespace std; 1369 | static const VerIdx empty = empty_vi(); 1370 | static const VerIdx deleted = deleted_vi(); 1371 | 1372 | u32 i = hash % m_sz; 1373 | u32 en = prevIdx(i); 1374 | for(; i!=en ; i=nxtIdx(i) ) 1375 | { 1376 | VerIdx vi = load(i);
1377 | if(vi.idx>=DELETED){continue;} 1378 | 1379 | Match m = m_csp->compare(vi.idx, vi.version, key, klen, hash); 1380 | if(m==MATCH_TRUE){ 1381 | bool success = cmpex_vi(i, vi, deleted); 1382 | if(success){ 1383 | //cleanDeletion(i); 1384 | return vi; 1385 | }else{ 1386 | i=prevIdx(i); continue; 1387 | } 1388 | 1389 | //return vi; // unreachable 1390 | } 1391 | 1392 | if(m==MATCH_REMOVED || i==en){ return empty; } 1393 | } 1394 | 1395 | return empty; // not unreachable 1396 | } 1397 | 1398 | bool init(u32 sz, CncrStr* cs) 1399 | { 1400 | using namespace std; 1401 | 1402 | m_csp = cs; 1403 | m_sz = sz; 1404 | 1405 | for(u32 i=0; i void* { return s_vis.data(); } 1428 | u64 sizeBytes() const { return s_vis.sizeBytes(); } 1429 | i64 len(const void *const key, u32 klen, u32* out_vlen=nullptr, u32* out_version=nullptr) const 1430 | { 1431 | if(klen<1){return 0;} 1432 | 1433 | u32 hash=HashBytes(key,klen), i=hash%m_sz, en=prevIdx(i); 1434 | for(;; i=nxtIdx(i) ) 1435 | { 1436 | VerIdx vi = load(i); 1437 | if(vi.idx!=EMPTY && vi.idx!=DELETED){ 1438 | if(out_version){ *out_version = vi.version; } 1439 | Match m = m_csp->compare(vi.idx, vi.version, key, klen, hash); 1440 | if(m==MATCH_TRUE){ 1441 | return m_csp->len(vi.idx, vi.version, out_vlen); 1442 | } 1443 | } 1444 | 1445 | if(i==en){ return 0ull; } 1446 | } 1447 | } 1448 | bool get(const void *const key, u32 klen, void *const out_val, u32 vlen, u32* out_readlen=nullptr) const 1449 | { 1450 | if(klen<1){ return 0; } 1451 | 1452 | u32 hash=HashBytes(key,klen); 1453 | CncrStr* csp = m_csp; 1454 | auto runFunc = [csp, out_val, vlen, out_readlen](VerIdx vi){ 1455 | return csp->get(vi.idx, vi.version, out_val, vlen, out_readlen); 1456 | }; 1457 | 1458 | //Match m = runMatch(key, klen, hash, runFunc, 0); 1459 | return runMatch(key, klen, hash, runFunc, 0); 1460 | } 1461 | bool put(const void *const key, u32 klen, const void *const val, u32 vlen, u32* out_startBlock=nullptr) 1462 | { 1463 | assert(klen>0); 1464 | auto dif = 
__simdb_allocs - __simdb_deallocs; 1465 | 1466 | u32 hash = CncrHsh::HashBytes(key, klen); 1467 | VerIdx lstVi = m_csp->alloc(klen+vlen, klen, hash); // lstVi is block list versioned index 1468 | if(out_startBlock){ *out_startBlock = lstVi.idx; } 1469 | if(lstVi.idx==LIST_END){ 1470 | return false; 1471 | } 1472 | 1473 | m_csp->put(lstVi.idx, key, klen, val, vlen); // this writes the data into the blocks before exposing them to other threads through the hash map 1474 | 1475 | VerIdx vi = putHashed(hash, lstVi, key, klen); // put the versioned index in the hash map by swapping it for whatever is there - if there was another index already there, clean it up by freeing its concurrent list indices and blocks 1476 | if(vi.idx<DELETED){ 1477 | m_csp->free(vi.idx, vi.version); 1478 | } // putHashed returns the entry that was there before, which is the entry that was replaced. If it wasn't empty, we free it here. 1479 | else{ 1480 | auto nxtDif = __simdb_allocs - __simdb_deallocs; 1481 | goto dummy; 1482 | dummy: ; 1483 | } 1484 | 1485 | //assert(dif == __simdb_allocs - __simdb_deallocs); 1486 | //Println("\nallocs: ", __simdb_allocs, " deallocs: ", __simdb_deallocs); 1487 | //std::cout << std::this_thread::get_id(); 1488 | //printf(" allocs: %d deallocs: %d DIFF: %d\n", __simdb_allocs, __simdb_deallocs, __simdb_allocs - __simdb_deallocs); 1489 | 1490 | return true; 1491 | } 1492 | bool del(const void *const key, u32 klen) 1493 | { 1494 | auto hash = CncrHsh::HashBytes(key, klen); 1495 | VerIdx vi = delHashed(key, klen, hash); 1496 | bool doFree = vi.idx<DELETED; 1497 | if(doFree){ m_csp->free(vi.idx, vi.version); } 1498 | 1499 | return doFree; 1500 | } 1501 | VerIdx load(u32 i) const 1502 | { 1503 | assert(i < m_sz); 1504 | 1505 | au64* avi = (au64*)(s_vis.data()+i); // avi is atomic versioned index 1506 | u64 cur = swp32(avi->load()); // need because of endianness? // atomic_load( (au64*)(m_vis.data()+i) ); // Load the key that was there.
    if(i%2==1) return VerIdx(hi32(cur), lo32(cur));
    else       return VerIdx(lo32(cur), hi32(cur));
  }
  u32  nxtIdx(u32 i) const { return (i+1)%m_sz; }
  u32 prevIdx(u32 i) const { using namespace std; return min(i-1, m_sz-1); }    // clamp to m_sz-1 for the case that hash==0, which will result in an unsigned integer wrap - syntax errors and possible windows min/max macros make this less problematic than std::min()

};
struct   SharedMem
{
  using    u32  =  uint32_t;
  using    u64  =  uint64_t;
  using   au32  =  std::atomic<u32>;

  static const int alignment = 0;

  #ifdef _WIN32
    void*      fileHndl;
  #elif defined(__APPLE__) || defined(__MACH__) || defined(__unix__) || defined(__FreeBSD__) // || defined(__linux__) ?    // osx, linux and freebsd
    int        fileHndl;
  #endif

  void*       hndlPtr;
  void*           ptr;
  u64            size;
  bool          owner;
  char      path[256];

  void mv(SharedMem&& rval)
  {
    fileHndl  =  rval.fileHndl;
    hndlPtr   =  rval.hndlPtr;
    ptr       =  rval.ptr;
    size      =  rval.size;
    owner     =  rval.owner;

    strncpy(path, rval.path, sizeof(path));

    rval.clear();
  }

public:
  static void        FreeAnon(SharedMem& sm)
  {
    #ifdef _WIN32
      if(sm.hndlPtr){
        UnmapViewOfFile(sm.hndlPtr);
      }
      if(sm.fileHndl){
        CloseHandle(sm.fileHndl);
      }
    #elif defined(__APPLE__) || defined(__MACH__) || defined(__unix__) || defined(__FreeBSD__) || defined(__linux__)  // osx, linux and freebsd
      if(sm.hndlPtr){
        munmap(sm.hndlPtr, sm.size);  // todo: size here needs to be the total size, and errors need to be checked
      }
      remove(sm.path);
      // todo: deal with errors here as well
    #endif

    sm.clear();
  }
  static SharedMem  AllocAnon(const char* name, u64 sizeBytes, bool raw_path=false, simdb_error* 
error_code=nullptr)
  {
    using namespace std;

    SharedMem sm;
    sm.hndlPtr  =  nullptr;
    sm.owner    =  false;
    sm.path[0]  =  '\0';            // start from an empty path so the strlen() and strcat() below are defined when raw_path skips the prefix on windows
    //sm.size   =  alignment==0?  sizeBytes  :  alignment-(sizeBytes%alignment);
    sm.size     =  sizeBytes;
    if(error_code){ *error_code = simdb_error::NO_ERRORS; }

    #ifdef _WIN32      // windows
      sm.fileHndl = nullptr;
      if(!raw_path){ strcpy(sm.path, "simdb_"); }
    #elif defined(__APPLE__) || defined(__MACH__) || defined(__unix__) || defined(__FreeBSD__) || defined(__linux__)  // osx, linux and freebsd
      sm.fileHndl = 0;
      strcpy(sm.path, P_tmpdir "/simdb_");
    #endif

    u64 len = strlen(sm.path) + strlen(name);
    if(len > sizeof(sm.path)-1){
      if(error_code){ *error_code = simdb_error::PATH_TOO_LONG; }    // error_code defaults to nullptr, so check it before writing
      return move(sm);
    }else{ strcat(sm.path, name); }

    #ifdef _WIN32      // windows
      if(raw_path)
      {
        sm.fileHndl = CreateFileA(
          sm.path,
          GENERIC_READ|GENERIC_WRITE,   //FILE_MAP_READ|FILE_MAP_WRITE,  // apparently FILE_MAP constants have no effects here
          FILE_SHARE_READ|FILE_SHARE_WRITE,
          NULL,
          CREATE_NEW,
          FILE_ATTRIBUTE_NORMAL,        //_In_ DWORD dwFlagsAndAttributes
          NULL                          //_In_opt_ HANDLE hTemplateFile
        );
      }
      sm.fileHndl = OpenFileMappingA(FILE_MAP_READ | FILE_MAP_WRITE, FALSE, sm.path);

      if(sm.fileHndl==NULL)
      {
        sm.fileHndl = CreateFileMappingA(  // todo: simplify and call this right away, it will open the section if it already exists
          INVALID_HANDLE_VALUE,
          NULL,
          PAGE_READWRITE,
          0,
          (DWORD)sizeBytes,
          sm.path);
        if(sm.fileHndl!=NULL){ sm.owner=true; }
      }

      if(sm.fileHndl != nullptr){
        sm.hndlPtr = MapViewOfFile(sm.fileHndl,   // handle to map object
          FILE_MAP_READ | FILE_MAP_WRITE,         // FILE_MAP_ALL_ACCESS,   // read/write permission
          0,
          0,
          0);
      }

      if(sm.hndlPtr==nullptr){
        int      err = (int)GetLastError();
        LPSTR msgBuf = nullptr;
        /*size_t msgSz =*/ FormatMessageA(FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
                                          NULL, err, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), (LPSTR)&msgBuf, 0, NULL);
        win_printf("simdb initialization error: %d - %s", err, msgBuf);
        LocalFree(msgBuf);

        CloseHandle(sm.fileHndl);
        sm.clear();
        return move(sm);
      }
    #elif defined(__APPLE__) || defined(__MACH__) || defined(__unix__) || defined(__FreeBSD__) || defined(__linux__)  // osx, linux and freebsd
      sm.owner  = true; // todo: have to figure out how to detect which process is the owner

      sm.fileHndl = open(sm.path, O_RDWR);
      if(sm.fileHndl == -1)
      {
        sm.fileHndl = open(sm.path, O_CREAT|O_RDWR, S_IRUSR|S_IWUSR |S_IRGRP|S_IWGRP | S_IROTH|S_IWOTH ); // O_CREAT | O_SHLOCK ); // | O_NONBLOCK );
        if(sm.fileHndl == -1){
          if(error_code){ *error_code = simdb_error::COULD_NOT_OPEN_MAP_FILE; }
          sm.clear();
          return move(sm);               // without a file descriptor the locking, ftruncate() and mmap() below can only fail and overwrite this error code
        }
        else{
          //flock(sm.fileHndl, LOCK_EX);   // exclusive lock  // LOCK_NB
        }
      }else{ sm.owner = false; }

      if(sm.owner){  // todo: still need more concrete race protection?
        flock(sm.fileHndl, LOCK_EX);   // exclusive lock  // LOCK_NB
        //fcntl(sm.fileHndl, F_PREALLOCATE);
        #if defined(__linux__)
        #else
          fcntl(sm.fileHndl, F_ALLOCATECONTIG);
        #endif

        if( ftruncate(sm.fileHndl, sizeBytes)!=0 ){
          if(error_code){ *error_code = simdb_error::FTRUNCATE_FAILURE; }
        }
        if( flock(sm.fileHndl, LOCK_UN)!=0 ){
          if(error_code){ *error_code = simdb_error::FLOCK_FAILURE; }
        }
      }

      sm.hndlPtr = mmap(NULL, sizeBytes, PROT_READ|PROT_WRITE, MAP_SHARED , sm.fileHndl, 0); // MAP_PREFAULT_READ  | MAP_NOSYNC
      close(sm.fileHndl);
      sm.fileHndl = 0;

      if(sm.hndlPtr==MAP_FAILED){
        if(error_code){ *error_code = simdb_error::COULD_NOT_MEMORY_MAP_FILE; }
        sm.hndlPtr = nullptr;          // MAP_FAILED is not a null pointer, so reset it to keep callers and ~SharedMem() from treating it as a valid mapping
      }
    #endif

    u64      addr = (u64)(sm.hndlPtr);
    u64 alignAddr = addr;
    //if(alignment!=0){ alignAddr = addr + ((alignment-addr%alignment)%alignment); }       // why was the second modulo needed?
    sm.ptr        = (void*)(alignAddr);

    return move(sm);
  }

  SharedMem() :
    hndlPtr(nullptr),
    ptr(nullptr),
    size(0),
    owner(false)
  {}
  SharedMem(SharedMem&) = delete;
  SharedMem(SharedMem&& rval){ mv(std::move(rval)); }
  SharedMem& operator=(SharedMem&& rval){ mv(std::move(rval)); return *this; }
  ~SharedMem()
  {
    if(ptr){
      au32*  cnt = ((au32*)ptr)+1;
      u64   prev = 0;
      if(cnt->load()>0){ prev = cnt->fetch_sub(1); }
      if(prev==1){ SharedMem::FreeAnon(*this); }
    }
  }
  void clear()
  {
    fileHndl  =  (decltype(fileHndl))0;
    hndlPtr   =  nullptr;
    ptr       =  nullptr;
    size      =  0;
    owner     =  false;
  }
  auto  data() -> void*
  {
    return ptr;
  }
};
class       simdb
{
public:
  using      u8  =  uint8_t;
  using     u32  =  uint32_t;
  using     i32  =  int32_t;
  using     u64  =  uint64_t;
  using     i64  =  int64_t;
  using    au32  =  std::atomic<u32>;
  using    au64  =  std::atomic<u64>;
  using     str  =  std::string;
  using  BlkCnt  =  CncrStr::BlkCnt;
  using  VerIdx  =  CncrHsh::VerIdx;
  using  string  =  std::string;

//private:
  au32*      s_flags;
  au32*        s_cnt;
  au64*  s_blockSize;
  au64* s_blockCount;
  CncrStr       s_cs;               // store data in blocks and get back indices
  CncrHsh       s_ch;               // store the indices of keys and values - contains a ConcurrentList

  // these variables are local to the stack where simdb lives, unlike the others, they are not simply a pointer into the shared memory
  SharedMem           m_mem;
  mutable simdb_error m_error;
  mutable u32         m_nxtChIdx;
  mutable u32         m_curChIdx;
  u64                 m_blkCnt;
  u64                 m_blkSz;
  bool                m_isOpen;

public:
  static const u32        EMPTY = CncrHsh::EMPTY;              // 28 bits set
  static const u32      DELETED = CncrHsh::DELETED;            // 28 bits set
  static const u32   FAILED_PUT = CncrHsh::EMPTY;              // 28 bits set
  static const u32     SLOT_END = CncrHsh::SLOT_END;
  static const u32     LIST_END = CncrStr::LIST_END;

private:
  static u64  OffsetBytes(){ return sizeof(au64)*3; }
  static u64      MemSize(u64 blockSize, u64 blockCount)
  {
    auto  hashbytes = CncrHsh::sizeBytes((u32)blockCount);
    auto storebytes = CncrStr::sizeBytes((u32)blockSize, (u32)blockCount);
    return  hashbytes + storebytes + OffsetBytes();
  }
  static Match CompareBlock(simdb const *const ths, i32 blkIdx, u32 version, void const *const buf, u32 len, u32 hash)
  {
    return ths->s_cs.compare(blkIdx, version, buf, len, hash);
  }
  static bool      IsEmpty(VerIdx vi){return CncrHsh::IsEmpty(vi);}         // special value for CncrHsh
  static bool    IsListEnd(VerIdx vi){return CncrStr::IsListEnd(vi);}       // special value for CncrStr

  void mv(simdb&& rval)
  {
    using namespace std;

    s_flags       =  rval.s_flags;
    s_cnt         =  rval.s_cnt;
    s_blockSize   =  rval.s_blockSize;
    s_blockCount  =  rval.s_blockCount;
    memcpy(&s_cs, &rval.s_cs, sizeof(s_cs));
    memcpy(&s_ch, &rval.s_ch, sizeof(s_ch));

    m_mem         =  move(rval.m_mem);
    m_error       =  rval.m_error;
    m_nxtChIdx    =  rval.m_nxtChIdx;
    m_curChIdx    =  rval.m_curChIdx;
    m_blkCnt      =  rval.m_blkCnt;
    m_blkSz       =  rval.m_blkSz;
    m_isOpen      =  rval.m_isOpen;

    rval.m_isOpen =  false;          // the moved-from instance must not decrement the shared instance count again when its destructor calls close()
  }

public:
  simdb() :
    s_flags(nullptr),
    s_cnt(nullptr),
    s_blockSize(nullptr),
    s_blockCount(nullptr),
    m_error(simdb_error::NO_ERRORS), // error() must return a defined value even if nothing has failed
    m_nxtChIdx(0),
    m_curChIdx(0),
    m_blkCnt(0),
    m_blkSz(0),
    m_isOpen(false)
  {}
  simdb(const char* name, u32 blockSize, u32 blockCount, bool raw_path=false) :
    m_error(simdb_error::NO_ERRORS), // error() must return a defined value even if nothing has failed
    m_nxtChIdx(0),
    m_curChIdx(0),
    m_isOpen(false)
  {
    simdb_error error_code = simdb_error::NO_ERRORS;
    new (&m_mem) SharedMem( SharedMem::AllocAnon(name, 
MemSize(blockSize,blockCount), raw_path, &error_code) );

    if(error_code!=simdb_error::NO_ERRORS){ m_error = error_code; return; }
    if(!m_mem.hndlPtr){ m_error = simdb_error::SHARED_MEMORY_ERROR; return; }

    //  flags      blockSize
    // |----|----|--------|--------|   each dash ('-') represents one byte - flags is the first four, cnt is the next 4, blockSize is the next 8, blockCount is the 8 bytes after that
    //        cnt          blockCount
    s_blockCount  =  ((au64*)m_mem.data())+2;
    s_blockSize   =  ((au64*)m_mem.data())+1;   // 8 byte offset to be after flags and cnt
    s_flags       =   (au32*)m_mem.data();
    s_cnt         =  ((au32*)m_mem.data())+1;

    if(isOwner()){
      s_blockCount->store(blockCount);
      s_blockSize->store(blockSize);
      s_cnt->store(1);
    }else{
      #if defined(_WIN32)   // do we need to spin until ready on windows? unix has file locks built in to the system calls
        //while(s_flags->load()<1){continue;}
      #endif
      s_cnt->fetch_add(1);
      m_mem.size = MemSize(s_blockSize->load(), s_blockCount->load());
    }

    //auto cncrHashSize = CncrHsh::sizeBytes(blockCount);
    uint64_t cncrHashSize = CncrHsh::sizeBytes((u32)s_blockCount->load());
    new (&s_cs) CncrStr( ((u8*)m_mem.data())+cncrHashSize+OffsetBytes(),
                         (u32)s_blockSize->load(),
                         (u32)s_blockCount->load(),
                         m_mem.owner);

    new (&s_ch) CncrHsh( ((u8*)m_mem.data())+OffsetBytes(),
                         (u32)s_blockCount->load(),
                         &s_cs,                   // the address of the CncrStr
                         m_mem.owner);

    m_blkCnt = s_blockCount->load();
    m_blkSz  = s_blockSize->load();
    m_isOpen = true;

    if(isOwner()){ s_flags->store(1); }
  }
  ~simdb(){ close(); }

  simdb(simdb&& rval){ mv(std::move(rval)); }
  simdb& operator=(simdb&& rval){ mv(std::move(rval)); return *this; }

  i64          len(const void *const key, u32 klen, u32* 
out_vlen=nullptr, u32* out_version=nullptr) const
  {
    return s_ch.len(key, klen, out_vlen, out_version);
  }
  bool         get(const void *const key, u32 klen, void *const out_val, u32 vlen, u32* out_readlen=nullptr) const
  {
    return s_ch.get(key, klen, out_val, vlen, out_readlen);
  }
  bool         put(const void *const key, u32 klen, const void *const val, u32 vlen, u32* out_startBlock=nullptr)
  {
    return s_ch.put(key, klen, val, vlen, out_startBlock);
  }
  bool         del(const void *const key, u32 klen){ return s_ch.del(key, klen); }

  i64          len(u32 idx, u32 version, u32* out_klen=nullptr, u32* out_vlen=nullptr) const
  {
    VerIdx vi = s_ch.load(idx);
    if(vi.idx>=DELETED || vi.version!=version){return 0;}
    u32      vlen = 0;                                       // local value length, so the key length can be computed even when out_vlen is nullptr
    u32 total_len = s_cs.len(vi.idx, vi.version, &vlen);
    if(out_vlen){ *out_vlen = vlen; }
    if(total_len>0){
      if(out_klen){ *out_klen = total_len - vlen; }          // both out parameters default to nullptr, so neither is dereferenced unchecked
      return total_len;
    }
    return 0;
  }
  bool         get(char const* const key, void* val, u32 vlen) const
  {
    return get(key, (u32)strlen(key), val, vlen);
  }
  bool         put(char const* const key, const void *const val, u32 vlen, u32* out_startBlock=nullptr)
  {
    assert(m_isOpen);               // make sure if the db is being used it has been initialized
    assert(strlen(key)>0);
    return put(key, (u32)strlen(key), val, vlen, out_startBlock);
  }

  void       flush() const
  {
    #ifdef _WIN32
      FlushViewOfFile(m_mem.hndlPtr, m_mem.size);
    #endif
  }
  VerIdx       nxt() const                                   // this version index represents a hash index, not a block storage index
  {
    VerIdx ret = s_ch.empty_vi();
    u32  chNxt = s_ch.nxt(m_nxtChIdx);
    if(chNxt!=SLOT_END){
      m_nxtChIdx = (chNxt + 1) % m_blkCnt;
      ret        = s_ch.at(chNxt);
    }else{
      m_nxtChIdx = (m_nxtChIdx + 1) % m_blkCnt;
    }

    return ret;
  }
  bool      getKey(u32 idx, u32 version, void 
*const out_buf, u32 klen) const 1914 | { 1915 | if(klen<1) return false; 1916 | 1917 | VerIdx vi = s_ch.load(idx); 1918 | if(vi.idx >= CncrHsh::DELETED || vi.version!=version){return false;} 1919 | u32 l = s_cs.getKey(vi.idx, vi.version, out_buf, klen); // l is length 1920 | if(l<1){return false;} 1921 | 1922 | return true; 1923 | } 1924 | u32 cur() const { return m_curChIdx; } 1925 | auto data() const -> const void* const { return s_cs.data(); } // return a pointer to the start of the block data 1926 | u64 size() const { return CncrStr::sizeBytes( (u32)s_blockSize->load(), (u32)s_blockCount->load()); } 1927 | bool isOwner() const { return m_mem.owner; } 1928 | u64 blocks() const { return s_blockCount->load(); } // return the total number of blocks the shared memory 1929 | u64 blockSize() const { return s_blockSize->load(); } 1930 | auto mem() const -> void* { return m_mem.hndlPtr; } // returns a pointer to the start of the shared memory, which will contain the data structures first 1931 | u64 memsize() const { return m_mem.size; } 1932 | auto hashData() const -> void const* const { return s_ch.data(); } 1933 | bool close() 1934 | { 1935 | if(m_isOpen){ 1936 | m_isOpen = false; 1937 | //u64 prev = s_flags->fetch_sub(1); // should this be s_cnt? - prev is previous flags value - the number of simdb instances across process that had the shared memory file open 1938 | u64 prev = s_cnt->fetch_sub(1); // should this be s_cnt? 
- prev is previous flags value - the number of simdb instances across process that had the shared memory file open 1939 | if(prev==1){ // if the previous value was 1, that means the value is now 0, and we are the last one to stop using the file, which also means we need to be the one to clean it up 1940 | SharedMem::FreeAnon(m_mem); // close and delete the shared memory - this is done automatically on windows when all processes are no longer accessing a shared memory file 1941 | return true; 1942 | } 1943 | } 1944 | return false; 1945 | } 1946 | auto error() const -> simdb_error 1947 | { 1948 | return m_error; 1949 | } 1950 | 1951 | // separated C++ functions - these won't need to exist if compiled for a C interface 1952 | struct VerStr { 1953 | u32 ver; string str; 1954 | bool operator<(VerStr const& vs) const { return strdata(), vlen); 1974 | 1975 | return ok; 1976 | } 1977 | auto get(str const& key) const -> std::string 1978 | { 1979 | str ret; 1980 | if(this->get(key, &ret)) return ret; 1981 | else return str(""); 1982 | } 1983 | VerStr nxtKey(u64* searched=nullptr) const 1984 | { 1985 | u32 klen, vlen; 1986 | bool ok = false; 1987 | i64 prev = (i64)m_nxtChIdx; 1988 | VerIdx viNxt = this->nxt(); 1989 | i64 inxt = (i64)m_nxtChIdx; 1990 | u32 cur = s_ch.prevIdx((u32)(inxt)); 1991 | 1992 | if(searched){ 1993 | *searched = (inxt-prev-1)>0? 
inxt-prev-1  :  (m_blkCnt-prev)+inxt;   //(m_blkCnt-prev-1) + inxt+1;
    }
    if(viNxt.idx>=DELETED){ return {viNxt.version, ""}; }

    i64 total_len = this->len(cur, viNxt.version, &klen, &vlen);
    if(total_len==0){ return {viNxt.version, ""}; }

    str key(klen,'\0');
    ok = this->getKey(cur, viNxt.version,
                      (void*)key.data(), klen);

    if(!ok || strlen(key.c_str())!=key.length() )
      return {viNxt.version, ""};

    return { viNxt.version, key };                    // copy elision
  }
  auto  getKeyStrs() const -> std::vector<VerStr>
  {
    using namespace std;

    set<VerStr> keys; VerStr nxt; u64 searched=0, srchCnt=0;
    while(srchCnt < m_blkCnt)
    {
      nxt = nxtKey(&searched);
      if(nxt.str.length() > 0){ keys.insert(nxt); }

      srchCnt += searched;
    }

    return vector<VerStr>(keys.begin(), keys.end());
  }
  bool         del(str const& key)
  {
    return this->del( (void const* const)key.data(), (u32)key.length() );
  }

  template<class T>
  auto         get(str const& key) -> std::vector<T>
  {
    using namespace std;

    u32     vlen = 0;
    //u64     len = len(key.data(), (u32)key.length(), &vlen);
    i64        l = len(key, &vlen);
    vector<T> ret(vlen);

    u32  readLen = 0;
    bool      ok = get(key.data(), (u32)key.length(), (void*)ret.data(), vlen); // &readLen);

    if(ok) return ret;
    else   return vector<T>();
  }
  template<class T>
  i64          put(str const& key, std::vector<T> const& val)
  {
    return put(key.data(), (u32)key.length(), val.data(), (u32)(val.size()*sizeof(T)) );
  }
  // end separated C++ functions

};

// simdb_listDBs()
#ifdef _WIN32
  auto simdb_listDBs(simdb_error* error_code=nullptr) -> std::vector<std::string>
  {
    using namespace std;

    static HMODULE                _hModule               = nullptr;
    static NTOPENDIRECTORYOBJECT  
NtOpenDirectoryObject  = nullptr;
    static NTOPENFILE             NtOpenFile             = nullptr;
    static NTQUERYDIRECTORYOBJECT NtQueryDirectoryObject = nullptr;
    static RTLINITUNICODESTRING   RtlInitUnicodeString   = nullptr;

    vector<string> ret;

    if(!NtOpenDirectoryObject){
      //NtOpenDirectoryObject  = (NTOPENDIRECTORYOBJECT)GetLibraryProcAddress( _T("ntdll.dll"), "NtOpenDirectoryObject");
      //NtOpenDirectoryObject  = (NTOPENDIRECTORYOBJECT)GetLibraryProcAddress( (PSTR)_T("ntdll.dll"), (PSTR)_T("NtOpenDirectoryObject") );
      NtOpenDirectoryObject  = (NTOPENDIRECTORYOBJECT)GetLibraryProcAddress( (PSTR)"ntdll.dll", (PSTR)"NtOpenDirectoryObject" );
    }
    if(!NtQueryDirectoryObject){
      //NtQueryDirectoryObject = (NTQUERYDIRECTORYOBJECT)GetLibraryProcAddress(_T("ntdll.dll"), "NtQueryDirectoryObject");
      //NtQueryDirectoryObject = (NTQUERYDIRECTORYOBJECT)GetLibraryProcAddress( (PSTR)_T("ntdll.dll"), (PSTR)_T("NtQueryDirectoryObject") );
      NtQueryDirectoryObject = (NTQUERYDIRECTORYOBJECT)GetLibraryProcAddress( (PSTR)"ntdll.dll", (PSTR)"NtQueryDirectoryObject");
    }
    if(!NtOpenFile){
      //NtOpenFile = (NTOPENFILE)GetLibraryProcAddress( (PSTR)_T("ntdll.dll"), (PSTR)_T("NtOpenFile") );
      NtOpenFile = (NTOPENFILE)GetLibraryProcAddress( (PSTR)"ntdll.dll", (PSTR)"NtOpenFile" );
    }

    HANDLE           hDir = NULL;
    IO_STATUS_BLOCK   isb = { 0 };
    DWORD       sessionId;
    BOOL ok = ProcessIdToSessionId(GetCurrentProcessId(), &sessionId);
    if(!ok){ return { "Could not get current session" }; }

    wstring     sesspth = L"\\Sessions\\" + to_wstring(sessionId) + L"\\BaseNamedObjects";
    const WCHAR* mempth = sesspth.data();

    WCHAR buf[4096];
    UNICODE_STRING pth = { 0 };
    pth.Buffer         = (WCHAR*)mempth;
    pth.Length         = (USHORT)lstrlenW(mempth) * sizeof(WCHAR);
    pth.MaximumLength  = pth.Length;

    OBJECT_ATTRIBUTES oa = { 0 };
    oa.Length = 
sizeof( OBJECT_ATTRIBUTES );
    oa.RootDirectory            = NULL;
    oa.Attributes               = OBJ_CASE_INSENSITIVE;
    oa.ObjectName               = &pth;
    oa.SecurityDescriptor       = NULL;
    oa.SecurityQualityOfService = NULL;

    NTSTATUS status;
    status = NtOpenDirectoryObject(
      &hDir,
      /*STANDARD_RIGHTS_READ |*/ DIRECTORY_QUERY,
      &oa);

    if(hDir==NULL || status!=STATUS_SUCCESS){ return { "Could not open file" }; }

    BOOLEAN rescan = TRUE;
    ULONG      ctx = 0;
    ULONG   retLen = 0;
    do
    {
      status = NtQueryDirectoryObject(hDir, buf, sizeof(buf), TRUE, rescan, &ctx, &retLen);
      rescan = FALSE;
      auto info = (OBJECT_DIRECTORY_INFORMATION*)buf;

      if( lstrcmpW(info->type.Buffer, L"Section")!=0 ){ continue; }
      WCHAR wPrefix[] = L"simdb_";
      size_t   pfxLen = sizeof(wPrefix)/sizeof(WCHAR) - 1;     // prefix length in characters, without the terminator
      if( wcsncmp( (WCHAR*)info->name.Buffer, wPrefix, pfxLen)!=0 ){ continue; }  // wide-character compare - a byte-wise strncmp() would stop at the zero high byte of the first UTF-16 character

      wstring wname = wstring( ((WCHAR*)info->name.Buffer)+6 );
      wstring_convert<codecvt_utf8<wchar_t>> cnvrtr;
      string   name = cnvrtr.to_bytes(wname);

      ret.push_back(name);
    }while(status!=STATUS_NO_MORE_ENTRIES);

    return ret;
  }
#else
  auto simdb_listDBs(simdb_error* error_code=nullptr) -> std::vector<std::string>
  {
    using namespace std;

    char  prefix[] = "simdb_";
    size_t   pfxSz = sizeof(prefix)-1;

    vector<string> ret;

    DIR* d;                    // d is directory handle
    errno = ENOENT;
    if( (d=opendir(P_tmpdir))==NULL || errno!=ENOENT){
      if(d){ closedir(d); }    // closedir() must not be called with a null handle
      if(error_code){ *error_code = simdb_error::DIR_NOT_FOUND; }
      return ret;
    }

    struct dirent* dent;       // dent is directory entry
    while( (dent=readdir(d)) != NULL )
    {
      if(errno != ENOENT){
        closedir(d);
        if(error_code){ *error_code = simdb_error::DIR_ENTRY_ERROR; }
        return ret;
      }

if(strncmp(dent->d_name, prefix, pfxSz)==0){ 2165 | ret.push_back(dent->d_name + 6); 2166 | } 2167 | } 2168 | 2169 | closedir(d); 2170 | if(error_code){ *error_code = simdb_error::NO_ERRORS; } 2171 | return ret; 2172 | } 2173 | #endif 2174 | 2175 | 2176 | #endif 2177 | 2178 | 2179 | 2180 | 2181 | 2182 | 2183 | 2184 | 2185 | // return empty; // should never be reached 2186 | // 2187 | //Match cmp = runIfMatch(vi, key, klen, hash, f); 2188 | //Match cmp = m_csp->compare(vi.idx,vi.version,key,klen,hash); 2189 | //bool success = cmpex_vi(i, vi, desired); // this should be hit even when the the versions don't match, since m_csp->compare() will return MATCH_TRUE_WRONG_VERSION 2190 | 2191 | //u32 cur=blkIdx, prev=blkIdx; // the first index will have its version set twice 2192 | //while(cur != LIST_END){ 2193 | // s_bls[prev].version = version; 2194 | // prev = cur; 2195 | // cur = s_bls[cur].idx; 2196 | //} 2197 | //return prev; 2198 | 2199 | //auto alloc(u32 size, u32 klen, u32 hash, BlkCnt* out_blocks=nullptr) -> VerIdx 2200 | //{ 2201 | // u32 byteRem = 0; 2202 | // u32 blocks = blocksNeeded(size, &byteRem); 2203 | // u32 st = s_cl.nxt(); 2204 | // SECTION(get the starting block index and handle errors) 2205 | // { 2206 | // if(st==LIST_END){ 2207 | // if(out_blocks){ *out_blocks = {1, 0} ; } 2208 | // return List_End(); 2209 | // } 2210 | // } 2211 | // 2212 | // u32 ver = (u32)s_version->fetch_add(1); 2213 | // u32 cur = st; 2214 | // u32 nxt = 0; 2215 | // u32 cnt = 0; 2216 | // SECTION(loop for the number of blocks needed and get new block and link it to the list) 2217 | // { 2218 | // for(u32 i=0; iend = nxt==LIST_END; 2250 | // out_blocks->cnt = cnt; 2251 | // } 2252 | // VerIdx vi(st, ver); 2253 | // return vi; 2254 | // } 2255 | //} 2256 | 2257 | //s_cl.s_lv[cur] = LIST_END; 2258 | // 2259 | //u32 st = s_cl.nxt(); 2260 | //u32 nxt = 0; 2261 | // 2262 | //nxt = s_cl.nxt(cur); 2263 | //if(nxt==LIST_END){ 2264 | // free(st, ver); 2265 | // return List_End(); 2266 
| //} // todo: will this free the start if the start was never set? - will it just reset the blocks but free the index? 2267 | // 2268 | //s_bls[cur] = BlkLst(false, 0, nxt, ver, size); 2269 | //cur = nxt; 2270 | 2271 | //VerIdx empty={LIST_END,0}; // todo: use empty() for this? 2272 | //return empty; 2273 | // 2274 | //s_cl[cur] = nxt; 2275 | 2276 | //u32 findEndSetVersion(u32 blkIdx, u32 version) const // find the last BlkLst slot in the linked list of blocks to free 2277 | //{ 2278 | // u32 cur=blkIdx, prev=blkIdx; 2279 | // while(cur != LIST_END){ 2280 | // s_bls[prev].version = version; 2281 | // prev = cur; 2282 | // cur = s_bls[cur].idx; 2283 | // } 2284 | // 2285 | // return prev; 2286 | //} 2287 | 2288 | // 2289 | //u32 prevIdx(u32 i) const { return std::min(i-1, m_sz-1); } // clamp to m_sz-1 for the case that hash==0, which will result in an unsigned integer wrap 2290 | --------------------------------------------------------------------------------
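The hash map in `simdb.hpp` probes its slots circularly: `nxtIdx()` steps forward modulo the table size, and `prevIdx()` steps backward, clamping the unsigned wrap-around at index 0 to the last slot. A minimal standalone sketch of that index arithmetic is below. The free-function names `next_idx` and `prev_idx` and the explicit `sz` parameter are assumptions made for illustration and are not part of simdb's API; `sz` stands in for the member `m_sz` and is assumed to be greater than zero.

```cpp
#include <cstdint>

// Hypothetical free-function versions of CncrHsh::nxtIdx() and prevIdx().
// sz stands in for m_sz (the number of hash slots) and must be > 0.
constexpr std::uint32_t next_idx(std::uint32_t i, std::uint32_t sz)
{
  return (i + 1) % sz;                        // the last slot wraps forward to slot 0
}

constexpr std::uint32_t prev_idx(std::uint32_t i, std::uint32_t sz)
{
  // i-1 wraps to the largest u32 value when i==0, so taking the smaller of
  // (i-1) and (sz-1) clamps that case to the last slot, in the same way as
  // min(i-1, m_sz-1) in the source.
  return (i - 1 < sz - 1) ? i - 1 : sz - 1;
}

// Compile-time checks of the wrap-around behaviour for an 8 slot table.
static_assert(next_idx(7, 8) == 0, "forward step wraps to the first slot");
static_assert(prev_idx(0, 8) == 7, "backward step wraps to the last slot");
static_assert(prev_idx(next_idx(3, 8), 8) == 3, "forward then backward returns to the start");
```

Writing the clamp as a comparison instead of calling `std::min()` keeps the functions `constexpr` under C++11 and, like the unqualified `min()` in the source, is not affected by the Windows `min`/`max` macros.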