├── .gitignore ├── MakeBoot.dyapp ├── vecdbboot.dws ├── dev.dyapp ├── doc ├── Usage.md └── Implementation.md ├── MakeBoot.dyalog ├── TODO.md ├── LICENSE ├── BootServers.dyalog ├── README.md ├── vecdbclt.dyalog ├── vecdbslave.dyalog ├── TestVecdbSrv.dyalog ├── vecdbsrv.dyalog ├── TestVecdb.dyalog ├── APLProcess.dyalog └── vecdb.dyalog /.gitignore: -------------------------------------------------------------------------------- 1 | *.sublime-* 2 | 3 | testdb1/ 4 | -------------------------------------------------------------------------------- /MakeBoot.dyapp: -------------------------------------------------------------------------------- 1 | Load MakeBoot 2 | Run MakeBoot -------------------------------------------------------------------------------- /vecdbboot.dws: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dyalog/vecdb/master/vecdbboot.dws -------------------------------------------------------------------------------- /dev.dyapp: -------------------------------------------------------------------------------- 1 | Load vecdb 2 | Load vecdbclt 3 | Load vecdbsrv 4 | Load vecdbslave 5 | Load MakeBoot 6 | Load BootServers 7 | Load APLProcess 8 | Load TestVecdb 9 | Load TestVecdbSrv -------------------------------------------------------------------------------- /doc/Usage.md: -------------------------------------------------------------------------------- 1 | # User Guide # 2 | 3 | At the moment, the only "documentation" is the test suite. Load the code and trace through or inspect the function `TestVecdb.RunAll` and its subfunctions. 
4 | 5 | ]load \vecdbfolder\*.dyalog 6 | TestVecdb.RunAll -------------------------------------------------------------------------------- /MakeBoot.dyalog: -------------------------------------------------------------------------------- 1 | MakeBoot;Path 2 | ⍝ Build the "vecdbboot" workspace 3 | 4 | Path←{(1-⌊/'/\'⍳⍨⌽⍵)↓⍵}4↓,¯1↑⎕CR⊃⎕SI 5 | ⎕SE.SALT.Load Path,'BootServers.dyalog' 6 | ⎕LX←'BootServers ''''' 7 | ⎕←'Now please:' 8 | ⎕←' ⎕EX ''MakeBoot''' 9 | ⎕←' )WSID ',Path,'vecdbboot.dws' 10 | ⎕←' )SAVE' 11 | -------------------------------------------------------------------------------- /TODO.md: -------------------------------------------------------------------------------- 1 | # TODO # 2 | 3 | ## Started (Jan 2nd 2017) ## 4 | 5 | 1. Correct summaries for cross-shard calculations 6 | 1. Add "average" and "count distinct" calculations 7 | 8 | ## To be done soon ## 9 | 10 | 1. Document server mode / parallel queries 11 | 1. Generalization of Symbol Tables + Add One, Four & Eight Byte Symbol Tables 12 | 1. Enhance queries to support conditional functions... E.g. ('price' '>' 100)('Name' 'like' 'A%') 13 | 1. Beef up error checking on file creation 14 | 1. Database status reporting function (# shards, records in each, statistics, etc) 15 | 1. Add a "Char" type which does not use a symbol table 16 | 1. User Guide 17 | 18 | ## More speculative ideas ## 19 | 1. RESTful / ODATA? API 20 | 1. Timestamped non-overwriting updates 21 | 1. Delete records (AFTER non-overwriting updates) 22 | 1. Database cleanup (throw away history) 23 | 1. TimeStamp columns 24 | 1. Aggregations in queries 25 | 1. Add support for noFiles switch: run entirely in memory with no backing storage -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Dyalog Ltd. 
4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /BootServers.dyalog: -------------------------------------------------------------------------------- 1 | BootServers dummy;port;getenv;getnum;path 2 | ⍝ Start a vecdb server process if VECDBSRV="config.json" PORT=nnnn 3 | ⍝ vecdb slave process if VECDBSLAVE="file" SHARDS="n" PORT=nnnn 4 | 5 | ⎕←'Command Line:' 6 | ⎕←2 ⎕NQ'.' 'GetCommandLine' 7 | getenv←{0=≢r←2 ⎕NQ'.' 
'GetEnvironment'⍵:⍺ ⋄ r} 8 | getnum←{⊃2⊃⎕VFI ⍵} 9 | path←'file://',⊃⎕NPARTS ⎕WSID 10 | 11 | VECDBSRV←0≠≢SRVDB←''getenv'VECDBSRV' 12 | VECDBSLAVE←0≠≢VECDB←''getenv'VECDBSLAVE' 13 | SHARDS←2⊃⎕VFI''getenv'SHARDS' 14 | TOKEN←2⊃⎕VFI''getenv'TOKEN' 15 | 16 | port←getnum''getenv'PORT' 17 | 18 | 2 ⎕FIX path,'APLProcess.dyalog' 19 | 2 ⎕FIX path,'vecdb.dyalog' 20 | 2 ⎕FIX path,'vecdbclt.dyalog' 21 | 2 ⎕FIX path,'vecdbsrv.dyalog' 22 | 2 ⎕FIX path,'vecdbslave.dyalog' 23 | 24 | :If 0=⎕NC'DRC' ⍝ Get conga if necessary 25 | 'DRC'⎕CY'conga'getenv'CONGAWS' 26 | :EndIf 27 | 28 | :If 0=port 29 | ⎕←'See:' 30 | ' ',2 ⎕FIX path,'TestVecdb.dyalog' 31 | ' ',2 ⎕FIX path,'TestVecdbSrv.dyalog' 32 | :Else 33 | 34 | AUTOSHUT←1 35 | {}1 ##.DRC.Init'' 36 | 37 | :If VECDBSRV ⋄ vecdbsrv.Start SRVDB port 38 | :ElseIf VECDBSLAVE ⋄ vecdbslave.Start VECDB SHARDS port 39 | :Else 40 | ⎕←'Invalid configuration...' 41 | :EndIf 42 | :EndIf 43 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # README # 2 | 3 | `vecdb` 4 | Current version: 0.2.3 5 | 6 | ### What is this repository for? ### 7 | `vecdb` is a simple "columnar database": each column in the database is stored in a single memory-mapped file. It is written in and for Dyalog APL as a tool on which to base new applications which need to generate and query very large amounts of data and do a large number of high performance reads, but do not need a full set of RDBMS features. In particular, there is no "transactional" storage mechanism, and no built-in ability to join tables. 
8 | 9 | ### Features 10 | 11 | #### Supported data types: #### 12 | 13 | * 1, 2 and 4 byte integers 14 | * 8-byte IEEE double-precision floats 15 | * Boolean 16 | * Char (via a "symbol table" of up to 32,767 unique strings indexed by 2-byte integers) 17 | 18 | #### Sharding #### 19 | 20 | `vecdb` databases can be *sharded*, or *horizontally partitioned*. Each shard is a separate folder, named when the database is created (by default, there is a single shard). Each folder contains a file for each database column - which is memory mapped to an APL vector when the database is opened. A list of *sharding columns* is defined when the db is created; the values of these columns are passed as the argument to a user-defined *sharding function*, which has to return an origin-1 index into the list of shards, for each record. 21 | 22 | #### Supported Operations #### 23 | 24 | **Query**: At the moment, the `Query` function takes a constraint in the form of a list of (column_name values) pairs. Each one represents the relation which can be expressed in APL as (column_data∊values). If more than one constraint is provided, they are AND-ed together. 25 | Query also takes a list of column names to be retrieved for records which match the constraint. 26 | 27 | Query results are returned as a vector with one element per database column, each item containing a vector of values for that column. 28 | 29 | **Search**: If the `Query` function is called with an empty list of columns, record identifiers are returned as a 2-column matrix of (shard) (record index) pairs. 30 | 31 | **Read**: The `Read` function accepts a matrix in the format returned by a search query and a list of column names, and returns a vector per column. 32 | 33 | **Update**: The `Update` function also takes as input a search query result, a list of columns, and a vector of vectors containing new data values. 34 | 35 | **Append**: Takes a list of column names and a vector of data vectors, one per named column. 
The columns involved in the Shard selection must always be included. 36 | 37 | **Delete**: Deletion is not currently supported. 38 | 39 | ### Short-Term Goals ### 40 | 41 | 1. Enhance the query function to accept enhanced queries consisting of column names, comparison functions and values - and support AND/OR. If possible, optimise queries to be sensitive to sharding. 42 | 1. Parallel database queries: For a sharded database: Spin a number of isolate processes up and distribute the shards between them, so that each shard is handled by a single process. Enhance the database API functions to use these processes to perform searches, reads and writes in parallel. 43 | 1. Add a front-end server with a RESTful database API. As it stands, `vecdb` is effectively an embedded database engine which does not support data sharing between processes on the same or on separate machines. 44 | 45 | ### Longer Term (Dreams) ### 46 | 47 | There are ideas to add support for timeseries and versioning. This would include: 48 | 49 | 1. Support for deleting records 50 | 1. Performing all updates without overwriting data, and tagging old data with the timestamps defining its lifetime, allowing efficient queries on the database as it appeared at any given time in the past. 51 | 1. Built-in support for the computation of aggregate values as part of the parallel query mechanism, based on timeseries or other key values. 52 | 53 | ### How do I get set up? ### 54 | 55 | Clone/Fork the repo, and 56 | 57 | ```apl 58 | ]load vecdb.dyalog 59 | ``` 60 | 61 | ### Tests ### 62 | 63 | The full system test creates a database containing all supported data types, inserts and updates records, performs queries, and finally deletes the database. 64 | 65 | ```apl 66 | ]load TestVecdb.dyalog 67 | #.TestVecdb.RunAll 68 | ``` 69 | 70 | See doc\Usage.md for more information on usage. 
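The operations listed under "Supported Operations" can be sketched as follows. This is a minimal illustration modelled on the calls made in `TestVecdb`, not a tested recipe: the database name `'demo'`, the folder `'./demo/'`, the column names and the data are all made up, and it assumes `vecdb` has been loaded into `#` and that the constructor is able to create the folder.

```apl
⍝ Sketch only - names, folder and data below are hypothetical
columns←'Name' 'Price'
types←,¨'C' 'F'
data←('IBM' 'AAPL' 'MSFT')(160.97 112.6 47.21)
options←⎕NS''
options.BlockSize←10000
db←⎕NEW #.vecdb('demo' './demo/' columns types options data)

ix←db.Query('Name'('IBM' 'MSFT'))⍬   ⍝ empty column list: search, returns (shard)(record index) pairs
db.Read ix columns                   ⍝ read the matching records, one vector per column
z←db.Close
```

The argument shapes for `Query` and `Read` mirror those used in `test_sharding`; anything beyond that (for example error handling when the folder already exists) is left out.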
71 | 72 | ### Contribution guidelines ### 73 | 74 | At this early stage, until the project acquires a bit more direction, we ask you to contact one of the key collaborators to discuss your ideas. 75 | 76 | Please read doc\Implementation.md before continuing. 77 | 78 | ### Key Collaborators ### 79 | 80 | * mkrom@dyalog.com 81 | * nicolas@dyalog.com 82 | * stf@apl.it 83 | -------------------------------------------------------------------------------- /vecdbclt.dyalog: -------------------------------------------------------------------------------- 1 | :Namespace vecdbclt 2 | 3 | (⎕IO ⎕ML)←1 1 4 | SERVER←'' 5 | 6 | ∇ r←Clt(connection address port) 7 | :If 1111=⊃r←##.DRC.Clt connection address port 8 | {}⎕DL 0.5 9 | :AndIf 1111=⊃r←##.DRC.Clt connection address port 10 | {}⎕DL 1 11 | :AndIf 1111=⊃r←##.DRC.Clt connection address port 12 | {}⎕DL 3 13 | :AndIf 1111=⊃r←##.DRC.Clt connection address port 14 | {}⎕DL 5 15 | :AndIf 1111=⊃r←##.DRC.Clt connection address port 16 | (⍕r)⎕SIGNAL 11 17 | :EndIf 18 | ∇ 19 | 20 | ∇ {r}←{connection}Connect(address port user) 21 | ⍝ Connect to vecdb server process 22 | 23 | :If 0=⎕NC'connection' ⋄ connection←'VECDB' ⋄ :EndIf 24 | 25 | :If 0=⊃r←##.DRC.Init'' 26 | :If 0≠⍴connection ⋄ {}##.DRC.Close connection ⋄ :EndIf 27 | :AndIf 0=⊃r←Clt connection address port 28 | CONNECTION←2⊃r 29 | :Else 30 | ('Error: ',,⍕r)⎕SIGNAL 11 31 | :EndIf 32 | ∇ 33 | 34 | ∇ r←SrvDo(client cmd) 35 | ⍝ Send a command to vecdb and await the result 36 | 37 | r←SrvRcv SrvSend client cmd 38 | ∇ 39 | 40 | ∇ cmd←SrvSend(client cmd);r 41 | ⍝ Return command name to wait on 42 | :If 0=⊃r←##.DRC.Send client cmd 43 | cmd←2⊃r 44 | :Else 45 | (⍕r)⎕SIGNAL 11 46 | :EndIf 47 | ∇ 48 | 49 | ∇ r←SrvRcv c;done;wr;z 50 | ⍝ Wait for result from vecdb, signal DOMAIN ERROR if it fails 51 | 52 | :Repeat 53 | :If ~done←∧/100 0≠1⊃r←##.DRC.Wait c 10000 ⍝ Only wait 10 seconds 54 | 55 | :Select 3⊃r 56 | :Case 'Error' 57 | done←1 58 | :Case 'Progress' 59 | ⎕←'Progress: ',4⊃r 60 | :Case 
'Receive' 61 | :If 0=⊃r 62 | r←4⊃r 63 | :AndIf 0=⊃r 64 | r←2⊃r 65 | done←1 66 | :Else 67 | ('Error: ',,⍕r)⎕SIGNAL 11 68 | :EndIf 69 | :EndSelect 70 | :EndIf 71 | :Until done 72 | ∇ 73 | 74 | ∇ r←Open folder 75 | ⍝ Cover-function for call to Lock from a Client 76 | 77 | r←⎕NEW vecdbproxy(folder CONNECTION) 78 | ∇ 79 | 80 | :Class vecdbproxy 81 | ⍝ Produce a vecdb proxy object for a served vecdb 82 | 83 | ∇ Open(folder connection) 84 | :Access Public 85 | :Implements Constructor 86 | (FOLDER CONNECTION)←folder connection 87 | :If 0=⊃r←##.SrvDo CONNECTION('Open'folder) 88 | ⎕DF'[vecdbclt: ',folder,']' 89 | :Else 90 | (⍕r)⎕SIGNAL 11 91 | :EndIf 92 | ∇ 93 | 94 | ∇ {r}←Shutdown msg 95 | :Access Public 96 | :If 0=⊃r←##.SrvDo CONNECTION('Shutdown'msg) 97 | {}#.DRC.Close CONNECTION 98 | CONNECTION←'' 99 | :EndIf 100 | ∇ 101 | 102 | ∇ Close 103 | :Access Public 104 | :If 0=⊃r←##.SrvDo CONNECTION('Close'⍬) 105 | {}#.DRC.Close CONNECTION 106 | CONNECTION←'' 107 | :EndIf 108 | ∇ 109 | 110 | ∇ r←Count 111 | :Access Public 112 | :If 0≠⍴CONNECTION 113 | r←##.SrvDo CONNECTION('Count'(FOLDER ⍬)) 114 | r←+/r 115 | :Else 116 | 'CONNECTION CLOSED'⎕SIGNAL 11 117 | :EndIf 118 | ∇ 119 | 120 | ∇ r←Append args 121 | :Access Public 122 | :If 0≠⍴CONNECTION 123 | r←##.SrvDo CONNECTION('Append'(FOLDER args)) 124 | :Else 125 | 'CONNECTION CLOSED'⎕SIGNAL 11 126 | :EndIf 127 | ∇ 128 | 129 | ∇ r←Query args 130 | :Access Public 131 | :If 0≠⍴CONNECTION 132 | r←##.SrvDo CONNECTION('Query'(FOLDER args)) 133 | :If 2=⍴⍴⊃r 134 | r←⊃⍪/r 135 | :Else 136 | r←⊃,¨/r 137 | :EndIf 138 | :Else 139 | 'CONNECTION CLOSED'⎕SIGNAL 11 140 | :EndIf 141 | ∇ 142 | 143 | ∇ r←Read args 144 | :Access Public 145 | :If 0≠⍴CONNECTION 146 | r←##.SrvDo CONNECTION('Read'(FOLDER args)) 147 | r←⊃,¨/r 148 | :Else 149 | 'CONNECTION CLOSED'⎕SIGNAL 11 150 | :EndIf 151 | ∇ 152 | 153 | ∇ r←Update args 154 | :Access Public 155 | :If 0≠⍴CONNECTION 156 | r←##.SrvDo CONNECTION('Update'(FOLDER args)) 157 | :Else 158 | 'CONNECTION 
CLOSED'⎕SIGNAL 11 159 | :EndIf 160 | 161 | ∇ 162 | 163 | :EndClass 164 | 165 | :EndNamespace 166 | -------------------------------------------------------------------------------- /doc/Implementation.md: -------------------------------------------------------------------------------- 1 | # IMPLEMENTATION # 2 | 3 | `vecdb` is an "inverted" database written in and for Dyalog APL, based on the ⎕MAP facility for mapping APL arrays to "flat" files. 4 | 5 | Data Types supported are 6 | 7 | | Name | Description | 8 | |------|------------------------------------------------------------| 9 | | B | Boolean; 1 bit per item (0 or 1) | 10 | | I1 | 1-Byte Integer (-128 to +127) | 11 | | I2 | 2-Byte Integer (-32,768 to +32,767) | 12 | | I4 | 4-Byte Integer (-2,147,483,648 to +2,147,483,647)          | 13 | | F | IEEE Double-Precision Floating Point (+/- ~1.797E308) | 14 | | C | VarChar - up to 32,767 different strings indexed by an I2 | 15 | 16 | A single-byte character type is planned - as C but indexed by an I1, allowing only 127 different strings. The current proposal is to denote this type "c". 17 | 18 | ### File Formats ### 19 | Data is stored as APL vectors, each one mapping to a single file representing a numeric vector of uniform type, with the file extension *.vector*. In the case of the "C" type which contains variable-length character arrays, the serialised form (created using 220⌶) of a vector of character vectors is stored in a file with extension *.symbol*, and a 2-byte integer vector of indices into this is stored in the corresponding *.vector* file. 20 | 21 | #### Blocking #### 22 | Since mapped arrays cannot grow dynamically, the files are over-allocated using a configurable *BlockSize* (*NumBlocks* tracks the number of blocks in use). All vectors have the same length; the number of records actually in use is tracked separately. When a new block is required, all maps are expunged, ⎕NAPPEND is used to add a block to each file, and the maps are re-created. 
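The arithmetic implied by blocking can be sketched as below. This is an illustration only, not the code `vecdb` uses to grow its files: `records`, `blocks`, `allocated` and `spare` are hypothetical names, and the record count is made up.

```apl
⍝ Sketch only - BlockSize as in the example options, records is made up
BlockSize←10000
records←25000
blocks←⌈records÷BlockSize      ⍝ blocks needed to hold the records
allocated←blocks×BlockSize     ⍝ items allocated in every .vector file
spare←allocated-records        ⍝ unused items before another block is needed
blocks allocated spare
2 8×BlockSize                  ⍝ bytes in one block of an I2 and of an F column
```

With these numbers, 3 blocks give 30000 allocated items, of which 5000 are spare. One block of an I2 column occupies 20000 bytes and one block of an F column 80000 bytes, which agrees with the file sizes in the directory listing shown later in this document.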
23 | 24 | #### Meta Data #### 25 | Meta-data is stored in a Dyalog Component File "meta.vecdb", which contains data which does not change during normal operation of the database: 26 | 27 | | Cn# | Contents | Example | 28 | |-----|----------------------|------------------------------------------------| 29 | | 1 | Version number | 'vecdb 0.2.0 | 30 | | 2 | Description | a char vec | 31 | | 3 | Unused | Used to contain number of records | 32 | | 4 | Properties | ('Name' 'BlockSize')('TestDB1' 10000) | 33 | | 5 | Col names & types | ('Stock' 'Price') ((,'C') 'F') | 34 | | 6 | Shard folders | 'c:\mydb\shard1\' 'c:\mydb\shard2\ | 35 | | 7 | Shard fn and cols | '{1+2|⎕UCS ⊃¨⍵}' (,1) | 36 | 37 | #### Sharding #### 38 | 39 | `vecdb` allows the database to be *horizontally partitioned* into *shards*, based on the values of any selection of fields. If a database is created without sharding, data files are created in the same folder that the meta.vecdb file is in. Sharding is specified by passing suitable options to the vecdb constructor. The above example was set up using the following code. For a more advanced example, see the function `TestVecdb.Sharding`: 40 | 41 | columns←'Name' 'BlockSize' ⋄ types←,¨'C' 'F' 42 | data←('IBM' 'AAPL' 'MSFT' 'DYALOG')(160.97 112.6 47.21 999.99) 43 | options←⎕NS'' 44 | options.BlockSize←10000 45 | options.ShardFolders←'c:\mydb\shard'∘,¨'12' 46 | options.(ShardFn ShardCols)←'{1+2|⎕UCS ⊃¨⍵}' 1 47 | params←'TestDB1' 'c:\mydb' columns types options data 48 | mydb←⎕NEW vecdb params 49 | 50 | In the above example, the database has two shards, based on whether the first character of the Stock name has an odd or even Unicode number. 51 | 52 | Each shard is stored in a separate folder which contains the *.vector* files described above, plus a file "counters.vecdb" which currently contains a single 8-byte floating-point value which is the number of active records in the shard (the maximum number of records in a shard is limited to 2*48). 
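To see how a sharding function assigns records to shards, it can be evaluated on its own. The sketch below uses the variant from `TestVecdb.test_sharding`, `{2-2|⎕UCS ⊃¨⊃⍵}`, and assumes (as the `⊃⍵` in that function suggests) that the argument is a vector with one item per sharding column; `shardfn` and `names` are names introduced here for illustration.

```apl
⍝ Sketch only - evaluates a sharding function outside vecdb
shardfn←{2-2|⎕UCS ⊃¨⊃⍵}       ⍝ shard 1 if first character has an odd code point, else shard 2
names←'IBM' 'AAPL' 'MSFT' 'DYALOG'
shardfn ,⊂names                ⍝ one origin-1 shard index per record
```

The first characters `I`, `A`, `M` and `D` have Unicode code points 73, 65, 77 and 68, so the result is `1 1 1 2`: three records go to the first shard and one to the second.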
53 | 54 | Note that the *.symbol* files are not sharded: The complete list of unique strings for a column is shared between the shards, and resides in the main database folder. 55 | 56 | The complete set of files which would be created by the above example would be along the lines of: 57 | 58 | Directory of c:\mydb\shardtest 59 | 60 | 04/01/2015 21:35 122 1.symbol // Symbols for Name column 61 | 04/01/2015 21:35 2,576 meta.vecdb // Meta data 62 | 63 | Directory of c:\mydb\db\shardtest\Shard1 64 | 65 | 04/01/2015 21:35 20,000 1.vector // 1 block of I2 symbol pointers 66 | 04/01/2015 21:35 80,000 2.vector // 1 block of Floating-point prices 67 | 04/01/2015 21:35 8 counters.vecdb // Used record counter (contains 3) 68 | 69 | Directory of c:\mydb\vecdb\shardtest\Shard2 70 | 71 | 04/01/2015 21:35 20,000 1.vector // As Shard1 72 | 04/01/2015 21:35 80,000 2.vector // As Shard1 73 | 04/01/2015 21:35 8 counters.vecdb // record counter (1) 74 | -------------------------------------------------------------------------------- /vecdbslave.dyalog: -------------------------------------------------------------------------------- 1 | :Namespace vecdbslave 2 | 3 | (⎕IO ⎕ML)←1 1 4 | LOGLEVEL←0 5 | 6 | fmtts←{,'ZI4,<->,ZI2,<->,ZI2,< >,ZI2,<:>,ZI2,<:>,ZI2' ⎕FMT 1 6⍴⍵} 7 | 8 | ∇ {r}←Shutdown dummy 9 | ⍝ Shut down slave 10 | 11 | DB.Close ⍝ Close the vecdb 12 | ⎕EX 'DB' 13 | done←1 ⍝ Global flag to shut down 14 | r←⍬ ⍝ Need a result 15 | ∇ 16 | 17 | ∇ Init(folder shards) 18 | STATE←1 ⍝ Starting, 0=Running, 2=Startup Failed, 3=Shut Down 19 | 1 Log STATUS←'Startup initiated at ',fmtts ⎕TS 20 | CONNS←TASKS←USERS←TOKENS←⍬ 21 | NEXTTASK←1000 22 | 23 | :Trap 0 24 | DB←⎕NEW ##.vecdb(folder shards) 25 | STATE←0 26 | 1 Log'Slave startup completed, ',STATUS←'Folder= ',folder,', shards= ',⍕shards 27 | :Else 28 | STATE←2 ⍝ Startup Failed 29 | 3 Log STATUS←'Startup failed: ',∊⎕DM 30 | ∘∘∘ 31 | :EndTrap 32 | ∇ 33 | 34 | ∇ {r}←Start(folder shards port);sink;data;event;obj;rc;wait;z;cmd;name 35 | ⍝ Run a 
vecdb Slave - based on CONGA RPCServer sample 36 | 37 | {}##.DRC.Init'' 38 | {}##.DRC.Close name←'VECSRV' 39 | 40 | Init folder shards 41 | 42 | :If 0=1⊃r←##.DRC.Srv name''port'Command' 43 | 1 Log'Server ''',name,''', listening on port ',⍕port 44 | 2 Log'Handler thread started: ',⍕Run&name port 45 | :Else 46 | 3 Log'Server failed to start: ',,⍕r 47 | :EndIf 48 | ∇ 49 | 50 | ∇ Connect cmd;task;conn 51 | ⍝ Connection Created 52 | 53 | conn←1↓⊃(cmd='.')⊂cmd 54 | CONNS,←⊂conn 55 | TASKS,←task←NEXTTASK 56 | NEXTTASK←10000|NEXTTASK+1 57 | USERS←USERS,0 58 | TOKENS←TOKENS,⊂'' 59 | 60 | 0 Log'New connection ',conn,' assigned task id ',⍕task 61 | ∇ 62 | 63 | ∇ Disconnect obj;m;i;held;task;conn 64 | ⍝ Connection Lost 65 | 66 | conn←1↓⊃(obj='.')⊂obj 67 | 0 Log'Connection ',conn,' disconnected' 68 | 69 | :If (⍴m)≥i←(m←~CONNS∊⊂conn)⍳0 70 | CONNS←m/CONNS 71 | TASKS←m/TASKS 72 | USERS←m/USERS 73 | TOKENS←m/TOKENS 74 | :EndIf 75 | ∇ 76 | 77 | ∇ level Log message 78 | →(level,ZI2,<:>,ZI2,<.>,ZI3'⎕FMT 1 4⍴3↓⎕TS),' ',message 80 | ∇ 81 | 82 | ∇ Process(obj data);r;CONNECTION;cmd;arg;close;txt 83 | ⍝ Process a call. 
data[1] contains function name, data[2] an argument 84 | 85 | ⍝ {}##.DRC.Progress obj(' Thread ',(⍕⎕TID),' started to run: ',,⍕data) ⍝ Send progress report 86 | CONNECTION←obj 87 | Conn←1↓⊃(obj='.')⊂obj 88 | (cmd arg)←2↑data 89 | close←0 90 | 91 | :If (⊂cmd)∊'SetToken' 'SetUser' 'Shutdown' 92 | r←0 (⍎cmd,' obj arg') 93 | 94 | :ElseIf (⊂cmd)∊'Append' 'Count' 'Query' 'Update' 'Read' 95 | :If 0≠≢(CONNS⍳⊂Conn)⊃TOKENS,⊂'' 96 | :Trap 9999 97 | ∘∘∘ 98 | :If cmd≡'Count' ⋄ r←0 DB.Count 99 | :Else ⋄ r←0 ((DB⍎cmd) arg) 100 | :EndIf 101 | :Else ⋄ r←⎕EN ⎕DM 102 | :EndTrap 103 | :Else 104 | close←1 105 | r←999 ('No valid token provided for command ',⍕cmd arg) 106 | :EndIf 107 | 108 | :Else 109 | r←999 ('Unsupported command: ',cmd) 110 | :EndIf 111 | 112 | {}##.DRC.Respond obj r 113 | 114 | :If close 115 | ⍝ /// {{}##.DRC.Close ⍵⊣⎕DL 1}&Conn ⍝ Start thread which waits 1s then closes 116 | :EndIf 117 | ∇ 118 | 119 | 120 | ∇ r←Run(name port);sink;data;event;obj;rc;wait;z;cmd 121 | ⍝ Run the Lock Server - based on CONGA RPCServer sample 122 | 123 | :If 0=⎕NC'start' ⋄ start←1 ⋄ :EndIf 124 | {}##.DRC.Init'' 125 | 126 | 0 Log'Thread ',(⍕⎕TID),' is now handing server ''',name,'''.' 127 | done←0 ⍝ done←1 in function "End" 128 | :While ~done 129 | rc obj event data←4↑wait←##.DRC.Wait name 3000 ⍝ Time out now and again 130 | 131 | :Select rc 132 | :Case 0 133 | :Select event 134 | :Case 'Error' 135 | :If 1119≢data ⋄ 3 Log'Error ',(⍕data),' on ',obj ⋄ :EndIf 136 | :If ~done∨←name≡obj ⍝ Error on the listener itself? 137 | {}##.DRC.Close obj ⍝ Close connection in error 138 | Disconnect obj ⍝ Let logic know 139 | :EndIf 140 | 141 | :Case 'Receive' 142 | :If 2≠⍴data ⍝ Command is expected to be (function name)(argument) 143 | {}##.DRC.Respond obj(99999 'Bad command format') ⋄ :Leave 144 | :EndIf 145 | 146 | Process obj data ⍝ NB Single-threaded 147 | 148 | :Case 'Connect' 149 | Connect obj 150 | 151 | :Else ⍝ Unexpected result? 
152 | ∘ 153 | :EndSelect 154 | 155 | :Case 100 ⍝ Time out - Insert code for housekeeping tasks here (deadlocks?) 156 | 157 | :Case 1010 ⍝ Object Not Found 158 | 3 Log'Object ''',name,''' has been closed - RPC Server shutting down' ⋄ done←1 159 | 160 | :Else 161 | 3 Log'Error in RPC.Wait: ',⍕wait 162 | :EndSelect 163 | :EndWhile 164 | ⎕DL 1 ⍝ Give responses time to complete 165 | {}##.DRC.Close name 166 | 0 Log'Server ',name,' terminated.' 167 | 168 | :If 2=⎕NC '#.AUTOSHUT' 169 | :AndIf 0≠#.AUTOSHUT 170 | ⎕OFF 171 | :EndIf 172 | ∇ 173 | 174 | ∇ task←SetUser(cmd User);i;Conn 175 | ⍝ Return task ID 176 | 177 | Conn←1↓⊃(cmd='.')⊂cmd 178 | 179 | :If (⍴CONNS),ZI2,<:>,ZI2,<.>,ZI3'⎕FMT 1 4⍴3↓⎕TS),' ',message 240 | ∇ 241 | 242 | ∇ MockTest;assert;START;resources;nprocesses;nresources;nevents;i;conns;conn;z;s 243 | 244 | assert←{'Assertion failed'⎕SIGNAL(⍵=0)/11} 245 | 246 | InitLocks 0 247 | LOGLEVEL←3 ⍝ Log everything 248 | MOCK←1 249 | 250 | Connect'C1' 251 | assert(1 0)≡TASKS,USERS 252 | SetUser'C1' 1234 253 | assert(1 1234)≡TASKS,USERS 254 | 255 | Connect'C2' 256 | SetUser'C2' 4321 257 | 258 | assert 0=Lock'C1' '/ALLOC10' ⍝ Granted 259 | assert HELDBY≡,1 ⍝ Held by Task 1 260 | Release'C1' '/ALLOC10' ⍝ Release 261 | assert HELDBY≡,0 ⍝ Should now be free 262 | 263 | assert 0=Lock'C1' '/ALLOC10' ⍝ Granted 264 | assert HELDBY≡,1 ⍝ Held by Task 1 265 | assert 1=Lock'C2' '/ALLOC10' ⍝ Queued 266 | assert(2 'C2')≡2⍴⊃QUEUES ⍝ Task 2 is in the queue 267 | 268 | Release'C1' '/ALLOC10' 269 | assert HELDBY≡,2 ⍝ Should now be held by Task 2 270 | assert 0=⊃⍴⊃QUEUES 271 | 272 | Disconnect'C2' 273 | assert 1=⍴TASKS 274 | assert HELDBY≡,0 ⍝ Should now be free 275 | Disconnect'C1' 276 | assert 0=⍴TASKS 277 | 278 | ⍝ --- performance test --- 279 | 280 | LOGLEVEL←3 ⍝ Erors only 281 | 282 | nprocesses←10 283 | nevents←1000×2×nprocesses 284 | ⎕←'Testing performance...' 
285 | Connect¨conns←'C'∘,¨⍕¨⍳nprocesses 286 | SetUser¨↓conns,[1.5]⍳nprocesses 287 | resources←nevents⍴('/BLAH/BLAH/ALLOC'∘,¨⍕¨⍳nprocesses),nprocesses⍴⊂'/BLAH/BLAH/ALLOC0' 288 | 289 | START←3⊃⎕AI 290 | :For i :In ⍳nprocesses+nevents 291 | conn←(1+nprocesses|i-1)⊃conns 292 | :If i≤nevents ⋄ z←Lock conn(i⊃resources) ⋄ :EndIf 293 | :If i>nprocesses ⋄ z←Release conn((i-nprocesses)⊃resources) ⋄ :EndIf 294 | :EndFor 295 | 296 | s←0.001×(3⊃⎕AI)-START 297 | ⎕←(⍕nevents),' released & locked in',(1⍕s),'s (',(,' '~⍨,'CI12'⎕FMT nevents÷s),' locks/s)' 298 | ∇ 299 | 300 | ∇ Notify(cmd Resource info);Conn;task 301 | ⍝ Notify connection that resource has been granted 302 | 303 | LOCKSGRANTED+←1 304 | :If LOGLEVEL=0 305 | Conn←1↓⊃(cmd='.')⊂cmd 306 | task←(CONNS⍳⊂Conn)⊃TASKS 307 | 0 Log'Lock for ',Resource,' granted to task ',task 308 | :EndIf 309 | 310 | :If ~MOCK 311 | :If 0≠⊃r←#.DRC.Respond cmd(0(Resource info)) 312 | 1 Log'Respond to ',cmd,' failed' 313 | :EndIf 314 | :EndIf 315 | ∇ 316 | 317 | ∇ r←Run(name port);sink;data;event;obj;rc;wait;z;cmd 318 | ⍝ Run the Lock Server - based on CONGA RPCServer sample 319 | 320 | :If 0=⎕NC'start' ⋄ start←1 ⋄ :EndIf 321 | {}##.DRC.Init'' 322 | 323 | 0 Log'Thread ',(⍕⎕TID),' is now handing server ''',name,'''.' 324 | done←0 ⍝ done←1 in function "End" 325 | :While ~done 326 | rc obj event data←4↑wait←##.DRC.Wait name 3000 ⍝ Time out now and again 327 | 328 | :Select rc 329 | :Case 0 330 | :Select event 331 | :Case 'Error' 332 | :If 1119≢data ⋄ 3 Log'Error ',(⍕data),' on ',obj ⋄ :EndIf 333 | :If ~done∨←name≡obj ⍝ Error on the listener itself? 
334 | {}##.DRC.Close obj ⍝ Close connection in error 335 | Disconnect obj ⍝ Let logic know 336 | :EndIf 337 | 338 | :Case 'Receive' 339 | :If 2≠⍴data ⍝ Command is expected to be (function name)(argument) 340 | {}##.DRC.Respond obj(99999 'Bad command format') ⋄ :Leave 341 | :EndIf 342 | 343 | Process obj data ⍝ NB Single-threaded 344 | 345 | :Case 'Connect' ⍝ Ignored 346 | Connect obj 347 | 348 | :Else ⍝ Unexpected result? 349 | ∘ 350 | :EndSelect 351 | 352 | :Case 100 ⍝ Time out - Insert code for housekeeping tasks here (deadlocks?) 353 | 354 | :Case 1010 ⍝ Object Not Found 355 | 3 Log'Object ''',name,''' has been closed - RPC Server shutting down' ⋄ done←1 356 | 357 | :Else 358 | 3 Log'Error in RPC.Wait: ',⍕wait 359 | :EndSelect 360 | :EndWhile 361 | ⎕DL 1 ⍝ Give responses time to complete 362 | {}##.DRC.Close name 363 | 0 Log'Server ',name,' terminated.' 364 | 365 | :If 2=⎕NC'#.AUTOSHUT' 366 | :AndIf 0≠#.AUTOSHUT 367 | ⎕OFF 368 | :EndIf 369 | ∇ 370 | 371 | ∇ r←Open(cmd folder);i;Conn 372 | ⍝ Check whether a folder is serve-able 373 | 374 | Conn←1↓⊃(cmd='.')⊂cmd 375 | 376 | :If (⊂folder)∊DBFolders 377 | r←0 'OK' 378 | :Else 379 | r←999('Database folder not found: ',folder) 380 | :EndIf 381 | ∇ 382 | 383 | ∇ task←SetUser(cmd User);i;Conn 384 | ⍝ Return task ID 385 | 386 | Conn←1↓⊃(cmd='.')⊂cmd 387 | 388 | :If (⍴CONNS)nprocesses ⋄ z←Release conn((i-nprocesses)⊃resources) ⋄ :EndIf 454 | :EndFor 455 | 456 | s←0.001×(3⊃⎕AI)-START 457 | ⎕←(⍕nevents),' released & locked in',(1⍕s),'s (',(,' '~⍨,'CI12'⎕FMT nevents÷s),' locks/s)' 458 | ∇ 459 | 460 | assert←{'Assertion failed'⎕SIGNAL(⍵=0)/11} 461 | 462 | :EndNamespace 463 | -------------------------------------------------------------------------------- /TestVecdb.dyalog: -------------------------------------------------------------------------------- 1 | :Namespace TestVecdb 2 | 3 | ⍝ Updated to version 0.2.6 with mapped columns 4 | ⍝ Call TestVecdb.Run '' to run all tests 5 | ⍝ assumes vecdb is loaded in #.vecdb 6 | ⍝ 
returns memory usage statistics (result of "memstats 0") 7 | 8 | (⎕IO ⎕ML)←1 1 9 | 10 | ∇ z←Run selection;path;source;tests;i;TIMELOG;LOG;m 11 | LOG←1 12 | ⎕FUNTIE ⎕FNUMS ⋄ ⎕NUNTIE ⎕NNUMS 13 | :Trap 6 ⋄ source←SALT_Data.SourceFile 14 | :Else ⋄ source←⎕WSID 15 | :EndTrap 16 | path←{(-⌊/(⌽⍵)⍳'\/')↓⍵}source 17 | 18 | ⎕←'Testing vecdb version ',#.vecdb.Version 19 | :If selection≡'required' ⋄ selection←'' ⋄ :EndIf ⍝ Nothing like that yet 20 | 21 | tests←{⍵/⍨(⊂'test_')∊⍨5↑¨⍵}⎕NL-3 22 | :If 0≠≢selection 23 | :If 1=≡selection ⋄ selection←,⊂selection ⋄ :EndIf 24 | :If ∧/m←selection∊5↓¨tests ⋄ tests←'test_'∘,¨selection 25 | :Else ⋄ ('tests not found: ',(~m)/selection) ⎕SIGNAL 11 26 | :EndIf 27 | :EndIf 28 | 29 | :For i :In ⍳≢tests 30 | TIMELOG←0 2⍴0 31 | ⍎i⊃tests 32 | :If LOG∧0≠≢TIMELOG 33 | ⎕←(i⊃tests) TIMELOG 34 | :EndIf 35 | :EndFor 36 | ∇ 37 | 38 | ∇ (name folder)←preTest dummy 39 | name←⊃1↓⎕SI 40 | folder←'./',name,'/' 41 | ⎕←'Clearing: ',folder 42 | :Trap 22 ⋄ #.vecdb.Delete folder ⋄ :EndTrap 43 | ∇ 44 | 45 | ∇ (db data columns types)←makeBasicDB numrecs;folder;name;range;types;tnms;recs;options;params;charvalues 46 | 47 | memstats 1 ⍝ Clear memory statistics 48 | (numrecs recs)←2↑numrecs,numrecs 49 | :If (100×numrecs)>2000⌶16 50 | ⎕←'*** Warning: workspace size should be at least: ',(⍕⌈(100×numrecs)÷1000000)',Mb ***' 51 | :EndIf 52 | 53 | folder←path,'/',(name←⊃1↓⎕SI),'/' 54 | ⍝⎕←'Clearing: ',folder 55 | :Trap 22 ⋄ #.vecdb.Delete folder ⋄ :EndTrap 56 | 57 | ⍝⎕←'Creating: ',folder←path,'/',name,'/' 58 | columns←'col_'∘,¨types←#.vecdb.TypeNames 59 | assert #.vecdb.TypeNames≡tnms←'I1' 'I2' 'I4',,¨'FBC' ⍝ Types have been added? 
60 | range←2*¯1+8×1 2 4 6 0.25 61 | data←numrecs⍴¨¯1+⍳¨numrecs⌊range 62 | data←data×0.1*'F'=⊃¨(≢data)↑types ⍝ Make float values where necessary 63 | data←data,⊂numrecs⍴charvalues←{1↓¨(⍵=⊃⍵)⊂⍵}'/zero/one/two/three/four/five/six/seven/eight/nine/ten/eleven/one dozen/thirteen/fourteen/fifteen' 64 | 65 | :If LOG ⋄ ⎕←'Size of input data: ',fmtnum ⎕SIZE'data' ⋄ :EndIf 66 | 67 | (options←⎕NS'').BlockSize←numrecs(⌊×)0.6 ⍝ Provoke block overflow 68 | params←name folder columns types options(recs↑¨data) 69 | TEST←'Creating db & inserting ',(fmtnum recs),' records' 70 | db←⎕NEW time #.vecdb params 71 | assert db.isOpen 72 | assert db.Count=recs 73 | assert 0=db.Close 74 | assert 0=db.isOpen 75 | 76 | TEST←'Reopen database' 77 | db←(⎕NEW time)#.vecdb(,⊂folder) ⍝ Open it again 78 | assert db.isOpen 79 | assert db.Count=recs 80 | ∇ 81 | 82 | :Section Tests 83 | 84 | ∇ test_sharding;columns;data;options;params;folder;types;name;db;ix;rotate;newcols;colsnow;m;db1;db2;ix2;ix1;t;i;z 85 | ⍝ Test database with 2 shards 86 | ⍝ Also acts as test for add/remove columns 87 | 88 | folder←path,'/',(name←'shardtest'),'/' 89 | 90 | :For rotate :In 0 1 2 ⍝ Test with shard key in all positions 91 | 92 | ⎕←'Clearing: ',folder 93 | :Trap 22 ⋄ #.vecdb.Delete folder ⋄ :EndTrap 94 | 95 | columns←rotate⌽'Name' 'BlockSize' 'Flag' 96 | types←rotate⌽,¨'C' 'F' 'C' 97 | data←rotate⌽('IBM' 'AAPL' 'MSFT' 'GOOG' 'DYALOG')(160.97 112.6 47.21 531.23 999.99)(5⍴'Buy' 'Sell') 98 | 99 | options←⎕NS'' 100 | options.BlockSize←10000 101 | options.ShardFolders←(folder,'Shard')∘,¨'12' 102 | options.(ShardFn ShardCols)←'{2-2|⎕UCS ⊃¨⊃⍵}'(⊃rotate⌽1 3 2) 103 | 104 | params←name folder columns types options(3↑¨data) 105 | TEST←'Create sharded database (rotate=',(⍕rotate),')' 106 | db←⎕NEW time #.vecdb params 107 | assert 3=db.Count 108 | assert(3↑¨data)≡db.Read(1 2⍴1(1 2 3))columns ⍝ All went into shard #1 109 | 110 | TEST←'Append last 2 records' 111 | z←db.Append time columns(3↓¨data) 112 | 113 | assert 5=db.Count 114 | 
ix←db.Query('Name'((columns⍳⊂'Name')⊃data))⍬ ⍝ Should find everything 115 | assert(1 2,⍪⍳¨4 1)≡ix 116 | TEST←'Read it all back' 117 | assert data≡db.Read time ix columns 118 | 119 | newcols←columns,¨'2' 120 | TEST←'Add columns' 121 | z←db.AddColumns time newcols types 122 | z←db.Update ix newcols data ⍝ Populate new columns 123 | assert(db.Read ix columns)≡(db.Read ix newcols) 124 | 125 | TEST←'Remove columns' 126 | m←(⍳≢columns)≠db.ShardCols ⍝ not the shard col 127 | z←db.RemoveColumns time(m/columns),(~m)/newcols 128 | colsnow←((~m)/columns),m/newcols 129 | types←((~m)/types),m/types 130 | data←((~m)/data),m/data 131 | assert(db.(Columns Types))≡(colsnow types) ⍝ should now only have the new columns 132 | assert data≡db.Read ix colsnow ⍝ Check database is "undamaged" 133 | 134 | z←db.Close 135 | 136 | ⍝ Now open shards individually 137 | db1←⎕NEW #.vecdb(folder 1) 138 | db2←⎕NEW #.vecdb(folder 2) 139 | ix1←db1.Query('Name'((colsnow⍳⊂'Name')⊃data))⍬ ⍝ Find all records 140 | ix2←db2.Query('Name'((colsnow⍳⊂'Name')⊃data))⍬ ⍝ ditto 141 | assert(1 2,⍪⍳¨4 1)≡ix1⍪ix2 142 | assert data≡⊃,¨/(db1 db2).Read(ix1 colsnow)(ix2 colsnow) 143 | 144 | t←4↓¨data 145 | 'data may only be appended to opened shards'db1.Append expecterror colsnow t 146 | t[i]←⌽¨¨t[i←colsnow⍳⊂'Flag2'] 147 | 'new strings not allowed unless all shards are open'db2.Append expecterror colsnow t 148 | z←(db1 db2).Close 149 | 150 | TEST←'Erase database' 151 | db←⎕NEW #.vecdb(,⊂folder) 152 | assert 0={db.Erase}time ⍬ 153 | 154 | :EndFor ⍝ rotate 155 | 156 | z←'Sharding Tests Completed' 157 | ∇ 158 | 159 | ∇ z←test_basic;db;data;columns;numrecs;recs;TEST;select;where;expect;vals;indices;rcols;rcoli;types;ix;newvals;i;t 160 | ⍝ Create and delete some tables 161 | 162 | (db data columns types)←makeBasicDB (numrecs recs)←1 0.5×10000000 ⍝ 10 million records 163 | TEST←'Reading them back:' 164 | assert(recs↑¨data)≡db.Read time(⍳recs)columns 165 | 166 | ⍝ test vecdb.Append and vecdb.Read 167 | TEST←'Appending ',(fmtnum 
numrecs-recs),' more' 168 | assert 0=db.Append time columns(recs↓¨data) ⍝ Append the rest of the data 169 | assert db.Count=numrecs 170 | assert data≡db.Read(⍳numrecs)columns ⍝ Read and verify ALL the data 171 | 172 | ⍝ Test vecdb.Query 173 | select←⌽columns ⍝ columns to select (all, but in reverse order) 174 | where←((1⊃columns)(1 2 3)) 175 | expect←⌽((1⊃data)∊1 2 3)∘/¨data ⍝ The expected result 176 | TEST←'Single expression query' 177 | assert expect≡db.Query time where select 178 | 179 | where←where((6⊃columns)(vals←'one' 'two' 'three' 'seventy')) ⍝ Add filter on char type 180 | expect←⌽(⊃∧/data[1 6]∊¨(1 2 3)vals)∘/¨data ⍝ Reduced expectations 181 | TEST←'Two expression query' 182 | assert expect≡db.Query time where select 183 | 184 | TEST←'Single key, single data group by' 185 | expect←(1⊃data){⍺,+/⍵}⌸2⊃data 186 | assert expect≡db.Query time ⍬'sum col_I2' 'col_I1' ⍝ select sum(col_I2) group by col_I1' 187 | 188 | TEST←'Single CHAR key, single data group by' 189 | expect←(6⊃data){⍺,+/⍵}⌸2⊃data 190 | assert expect≡db.Query time ⍬'sum col_I2' 'col_C' ⍝ select sum(col_I2) group by col_C' 191 | 192 | TEST←'Single key, multiple data group by' 193 | expect←(1⊃data){⍺,(+/⍵[;1]),⌈/⍵[;2]}⌸↑[0.5]data[2 3] 194 | assert expect≡db.Query time ⍬('sum col_I2' 'max col_I4')'col_I1' ⍝ select sum(col_I2),max(col_I4) group by col_I1' 195 | 196 | TEST←'Two key, single data group by' 197 | expect←(↑[0.5]data[1 5]){⍺,+/⍵}⌸2⊃data 198 | assert expect≡db.Query time ⍬'sum col_I2'('col_I1' 'col_B') ⍝ select sum(col_I2) group by col_I1, col_B' 199 | 200 | TEST←'Two key, multiple data group by' 201 | expect←(↑[0.5]data[1 5]){⍺,(+/⍵[;1]),⌈/⍵[;2]}⌸↑[0.5]data[2 3] 202 | assert expect≡db.Query time ⍬('sum col_I2' 'max col_I4')('col_I1' 'col_B') ⍝ select sum(col_I2),max(col_I4) group by col_I1,col_B' 203 | 204 | ⍝ Test vecdb.Replace 205 | indices←db.Query where ⍬ 206 | rcols←columns[rcoli←types⍳,¨'I2' 'B' 'C'] 207 | TEST←'Updating ',(fmtnum≢ix←2⊃,indices),' records' 208 | newvals←0 
1-(⊂ix)∘⌷¨data[2↑rcoli] ⍝ Update with 0-data or ~data 209 | newvals,←⊂(≢ix)⍴⊂'changed' ⍝ And new char values 210 | assert 0=db.Update time indices rcols newvals 211 | expect←data[rcoli] 212 | :For i :In ⍳⍴rcoli 213 | t←i⊃expect ⋄ t[ix]←i⊃newvals ⋄ (i⊃expect)←t 214 | :EndFor 215 | TEST←'Reading two columns for all ',(⍕numrecs),' records' 216 | assert expect≡db.Read time(1,⍪⊂⍳numrecs)rcols 217 | 218 | :If LOG 219 | ⎕←'Basic tests: memstats before db.Erase:' 220 | ⎕←memstats 0 ⍝ Report 221 | :EndIf 222 | 223 | TEST←'Deleting the db' ⋄ assert 0={db.Erase}time ⍬ 224 | ∇ 225 | 226 | ∇test_calcmap;db;data;columns;numrecs;I1;Odd;OddC;charvalues;charsmapped;expect;allodd;square;sel;types;chardata 227 | ⍝ Test calculated / mapped columns 228 | 229 | numrecs←10000 230 | (db data columns types)←makeBasicDB numrecs 231 | charvalues←∪chardata← (types⍳⊂,'C')⊃data 232 | 233 | (I1 Odd)←{⍵(2|⍵)}∪1⊃data ⍝ Mappings of I1 column (with values in range 0…127) 234 | OddC←('Even' 'Odd')[1+Odd] ⍝ Odd in Char form 235 | db.AddCalc'OddI1' 'col_I1' 'B' 'map'(I1 Odd) ⍝ name source type calculation data 236 | db.AddCalc'OddI1C' 'col_I1' 'C' 'map'(I1 OddC) ⍝ Map I1 => string 'Odd' or 'Even' 237 | db.AddCalc'SquareI1' 'col_I1' 'I2' '{⍵*2}'⍬'{⍵*0.5}' ⍝ Function with inverse for faster searches 238 | db.AddCalc'ThreeResC' 'col_C' 'C' 'map'(charvalues(charsmapped←16⍴'zero' 'one' 'two')) ⍝ Map on char=>char 239 | 240 | assert Odd≡db.Calc'OddI1'I1 ⍝ Check that we perform a calculation 241 | assert OddC≡db.Calc'OddI1C'I1 242 | assert charsmapped≡db.Calc'ThreeResC'charvalues 243 | 244 | TEST←'Select calculated column' 245 | expect←({↓⍉(⍵∘.*1 2),2|⍵}1⊃data),(('Even' 'Odd')[1+2|1⊃data])(('zero' 'one' 'two')[1+3|¯1+charvalues⍳chardata]) 246 | assert expect≡db.Query time ⍬('col_I1' 'SquareI1' 'OddI1' 'OddI1C' 'ThreeResC') ⍝ select col_I1, SquareI1, OddI1 ThreeResC 247 | 248 | TEST←'Test query on calculated column with inverse' 249 | expect←1 2 3 250 | assert expect≡∪⊃db.Query('SquareI1'(1 4 9))'col_I1' ⍝ 
select col_I1 where SquareI1 in 1 4 9 251 | 252 | expect←((≢charvalues)⍴0 1 0)/charvalues 253 | assert expect≡∪⊃db.Query('ThreeResC'(⊂'one'))'col_C' ⍝ Where clause on char=>char mapping 254 | 255 | TEST←'Group by calculation' 256 | expect←(allodd←2|1⊃data){⍺,+/⍵}⌸2⊃data 257 | assert expect≡db.Query time ⍬'sum col_I2' 'OddI1' ⍝ select sum(col_i2) group by OddI1 258 | TEST←'Group by 1 calc, filter on another' 259 | sel←(square←×⍨1⊃data)∊1 4 9 ⍝ where (I2*2)∊1 4 9 260 | expect←(sel/square){⍺,+/⍵}⌸sel/2⊃data 261 | assert expect≡db.Query time('SquareI1'(1 4 9))'sum col_I2' 'SquareI1' ⍝ select sum(col_i2) group by SquareI1 where SquareI1∊1 4 9 262 | 263 | db.RemoveCalc'OddI1' 264 | ⍝ /// More calc column QA required 265 | ⍝ /// Do not allow calcs on character columns 266 | 267 | TEST←'Deleting the db' ⋄ assert 0={db.Erase}time ⍬ 268 | ∇ 269 | 270 | ∇ test_add_columns_in_sequence;name;folder;options;numrecs;columns;types;params;db;data 271 | name folder←preTest ⍬ 272 | 273 | numrecs←10 274 | columns←'first' 'second' 'third' 275 | types←'I1' 'F' 'C' 276 | data←(numrecs(?⍴)127)(0.1×numrecs(?⍴)1000)(numrecs⍴'abc' 'def' 'xyz') 277 | (options←⎕NS'').BlockSize←8 ⍝ Provoke block overflow 278 | params←name folder(,1⌷columns)(,1⌷types)options(,1⌷data) 279 | TEST←'Creating db & inserting columns in sequence' 280 | db←⎕NEW time #.vecdb params 281 | assert(,1⌷data)≡db.Read(1,⍪⎕NULL)(1⊃columns) 282 | db.AddColumns,¨2⌷¨columns types 283 | db.Update(1,⍪⍳10)(2⊃columns)(2⊃data) 284 | assert(,2⌷data)≡db.Read(1,⍪⎕NULL)(2⊃columns) 285 | db.AddColumns 3⌷¨columns types 286 | db.Update(⊂1,⍪⍳10),3⊃¨columns data 287 | assert(3⊃data)≡⊃db.Read(1,⍪⎕NULL)(3⊃columns) 288 | assert 0=db.Erase 289 | ∇ 290 | 291 | ∇ test_define_block_size;name;folder;options;params;db;em 292 | name folder←preTest ⍬ 293 | (options←⎕NS'').BlockSize←10 294 | params←name folder(,⊂'field')(,⊂'I1')options 295 | em←'Block size must be a multiple of 8' 296 | em ⎕NEW expecterror #.vecdb params 297 | options.BlockSize←64 298 | 
db←⎕NEW #.vecdb params 299 | assert db.isOpen 300 | assert 0=db.Erase 301 | ∇ 302 | 303 | ∇ no_test_summary_fns;name;folder;options;columns;types;params;db;data;sort;comp 304 | name folder←preTest ⍬ 305 | 306 | columns←'id' 'name' 'price' 'quantity' 307 | types←'I1' 'C' 'F' 'I1' 308 | data←,⊂9⍴1 2 309 | data,←⊂3/'ett' 'due' 'three' 310 | data,←⊂0.25×⍳9 311 | data,←⊂⌽⍳9 312 | 313 | options←⎕NS'' 314 | options.ShardFolders←(folder,'Shard')∘,¨'12' 315 | options.(ShardFn ShardCols)←'{2-2|⊃⍵}' 1 316 | 317 | params←name folder columns types options data 318 | db←⎕NEW #.vecdb params 319 | sort←{(⊂⍋↑⊃↓⍉⍵)⌷⍵} 320 | comp←sort⍨≡sort 321 | assert(+⌿⍉↑data[3 4])comp db.Query ⍬('sum price' 'sum quantity')⍬ 322 | assert((1⊃data){⍺,+⌿⍵}⌸⍉↑data[3 4])comp db.Query ⍬('sum price' 'sum quantity')'id' 323 | assert((2⊃data){⍺,+⌿⍵}⌸⍉↑data[3 4])comp db.Query ⍬('sum price' 'sum quantity')'name' 324 | assert((⍉↑2↑data){⍺,+⌿⍵}⌸⍉↑data[3 4])comp db.Query ⍬('sum price' 'sum quantity')('id' 'name') 325 | 326 | assert(⌈⌿⍉↑data[3 4])comp db.Query ⍬('max price' 'max quantity')⍬ 327 | assert((1⊃data){⍺,⌈⌿⍵}⌸⍉↑data[3 4])comp db.Query ⍬('max price' 'max quantity')'id' 328 | assert((2⊃data){⍺,⌈⌿⍵}⌸⍉↑data[3 4])comp db.Query ⍬('max price' 'max quantity')'name' 329 | assert((⍉↑2↑data){⍺,⌈⌿⍵}⌸⍉↑data[3 4])comp db.Query ⍬('max price' 'max quantity')('id' 'name') 330 | 331 | assert(⌊⌿⍉↑data[3 4])comp db.Query ⍬('min price' 'min quantity')⍬ 332 | assert((1⊃data){⍺,⌊⌿⍵}⌸⍉↑data[3 4])comp db.Query ⍬('min price' 'min quantity')'id' 333 | assert((2⊃data){⍺,⌊⌿⍵}⌸⍉↑data[3 4])comp db.Query ⍬('min price' 'min quantity')'name' 334 | assert((⍉↑2↑data){⍺,⌊⌿⍵}⌸⍉↑data[3 4])comp db.Query ⍬('min price' 'min quantity')('id' 'name') 335 | 336 | assert(2/≢⊃data)comp db.Query ⍬('count price' 'count quantity')⍬ 337 | assert((1⊃data){⍺,2/≢⍵}⌸⍉↑data[3 4])comp db.Query ⍬('count price' 'count quantity')'id' 338 | assert((2⊃data){⍺,2/≢⍵}⌸⍉↑data[3 4])comp db.Query ⍬('count price' 'count quantity')'name' 339 | 
assert((⍉↑2↑data){⍺,2/≢⍵}⌸⍉↑data[3 4])comp db.Query ⍬('count price' 'count quantity')('id' 'name') 340 | 341 | assert 0=db.Erase 342 | ∇ 343 | 344 | :EndSection 345 | 346 | ∇ x←output x 347 | :If LOG ⋄ ⍞←x ⋄ :EndIf 348 | ∇ 349 | 350 | ∇ r←fmtnum x 351 | ⍝ Nice formatting of large integers 352 | r←(↓((⍴x),20)⍴'CI20'⎕FMT⍪,x)~¨' ' 353 | ∇ 354 | 355 | ∇ r←memstats reset;maxws;z 356 | :If reset=1 357 | z←0(2000⌶)14 ⍝ Reset high water mark 358 | :Else 359 | maxws←⊂⍕2 ⎕NQ'.' 'GetEnvironment' 'MAXWS' 360 | r←⎕WA 361 | r←'MAXWS' '⎕WA' 'WS Used' 'Allocated' 'High Water Mark',⍪¯20↑¨maxws,fmtnum r,(2000⌶)1 13 14 362 | :EndIf 363 | ∇ 364 | 365 | assert←{'Assertion failed'⎕SIGNAL(⍵=0)/11} 366 | 367 | time←{⍺←⊣ ⋄ t←⎕AI[3] 368 | z←⍺ ⍺⍺ ⍵ 369 | z⊣timelog TEST (⎕AI[3]-t) 370 | } 371 | 372 | ∇{info}←timelog info 373 | :If 2=⎕NC 'TIMELOG' 374 | TIMELOG⍪←info 375 | :EndIf 376 | ∇ 377 | 378 | expecterror←{ 379 | 0::⎕SIGNAL(⍺≡⊃⎕DMX.DM)↓11 380 | z←⍺⍺ ⍵ 381 | ⎕SIGNAL 11 382 | } 383 | 384 | :EndNamespace 385 | -------------------------------------------------------------------------------- /APLProcess.dyalog: -------------------------------------------------------------------------------- 1 | :Class APLProcess 2 | ⍝ Start (and eventually dispose of) a Process 3 | 4 | (⎕IO ⎕ML)←1 1 5 | 6 | :Field Public Args←'' 7 | :Field Public Ws←'' 8 | :Field Public Exe←'' 9 | :Field Public Proc←⎕NS '' 10 | :Field Public onExit←'' 11 | :Field Public RunTime←0 ⍝ Boolean or name of runtime executable 12 | :Field Public IsWin←0 13 | :Field Public IsSsh←0 14 | 15 | :Field Public RIDE_INIT←'' ⍝ RIDE parameters if remote debugging is to be allowed 16 | 17 | endswith←{w←,⍵ ⋄ a←,⍺ ⋄ w≡(-(⍴a)⌊⍴w)↑a} 18 | tonum←{⊃⊃(//)⎕VFI ⍵} 19 | eis←{2>|≡⍵:,⊂⍵ ⋄ ⍵} ⍝ enclose if simple 20 | 21 | ∇ path←SourcePath;source 22 | ⍝ Determine the source path of the class 23 | 24 | :Trap 6 25 | source←⍎'(⊃⊃⎕CLASS ⎕THIS).SALT_Data.SourceFile' ⍝ ⍎ works around a bug 26 | :Else 27 | :If 0=⍴source←{((⊃¨⍵)⍳⊃⊃⎕CLASS ⎕THIS)⊃⍵,⊂''}5177⌶⍬ 28 | 
source←⎕WSID 29 | :Else ⋄ source←4⊃source 30 | :EndIf 31 | :EndTrap 32 | path←{(-⌊/(⌽⍵)⍳'\/')↓⍵}source 33 | ∇ 34 | 35 | ∇ make1 args;rt;cmd;ws 36 | :Access Public Instance 37 | :Implements Constructor 38 | ⍝ args is: 39 | ⍝ [1] the workspace to load 40 | ⍝ [2] any command line arguments 41 | ⍝ {[3]} if present, a Boolean indicating whether to use the runtime version, OR a character vector of the executable name to run 42 | args←{2>|≡⍵:,⊂⍵ ⋄ ⍵}args 43 | args←3↑args,(⍴args)↓'' '' 0 44 | (ws cmd rt)←args 45 | PATH←SourcePath 46 | Start(ws cmd rt) 47 | ∇ 48 | 49 | ∇ Run 50 | :Access Public Instance 51 | Start(Ws Args RunTime) 52 | ∇ 53 | 54 | ∇ Start(ws args rt);psi;pid;cmd;host;port;keyfile;exe 55 | (Ws Args)←ws args 56 | :If 0≠⍴RIDE_INIT 57 | args←args,' RIDE_SPAWNED=1 RIDE_INIT=',RIDE_INIT 58 | :EndIf 59 | 60 | :If ~0 2 6∊⍨10|⎕DR rt ⍝ if rt is character or nested, it defines what to start 61 | Exe←(RunTimeName⍣rt) GetCurrentExecutable ⍝ else, deduce it 62 | :Else 63 | Exe←rt 64 | rt←0 65 | :EndIf 66 | 67 | :If IsWin←IsWindows∧~IsSsh←326=⎕DR Exe 68 | ⎕USING←'System,System.dll' 69 | psi←⎕NEW Diagnostics.ProcessStartInfo,⊂Exe(ws,' ',args) 70 | psi.WindowStyle←Diagnostics.ProcessWindowStyle.Minimized 71 | Proc←Diagnostics.Process.Start psi 72 | :Else ⍝ Unix 73 | :If IsSsh 74 | (host port keyfile exe)←Exe 75 | cmd←args,' ',exe,' -q +s ',ws 76 | Proc←SshProc host port keyfile cmd 77 | :Else 78 | pid←_SH'{ ',args,' ',Exe,' +s ',ws,' -c APLppid=',(⍕GetCurrentProcessId),' </dev/null >/dev/null 2>&1 & } ; echo $!' 
79 | Proc.Id←pid 80 | Proc.HasExited←HasExited 81 | :EndIf 82 | Proc.StartTime←⎕NEW Time ⎕TS 83 | :EndIf 84 | ∇ 85 | 86 | ∇ Close;count;limit 87 | :Implements Destructor 88 | WaitForKill&200 0.1 ⍝ Start a new thread to do the dirty work 89 | ∇ 90 | 91 | ∇ WaitForKill(limit interval);count 92 | :If (0≠⍴onExit)∧~HasExited ⍝ If the process is still alive 93 | :Trap 0 ⋄ ⍎onExit ⋄ :EndTrap ⍝ Try this 94 | 95 | count←0 96 | :While ~HasExited 97 | {}⎕DL interval 98 | count←count+1 99 | :Until count>limit 100 | :EndIf ⍝ OK, have it your own way 101 | 102 | {}Kill Proc 103 | ∇ 104 | 105 | ∇ r←IsWindows 106 | :Access Public Shared 107 | r←'Win'≡3↑⎕IO⊃#.⎕WG'APLVersion' 108 | ∇ 109 | 110 | ∇ r←GetCurrentProcessId;t 111 | :Access Public Shared 112 | :If IsWin 113 | r←⍎'t'⎕NA'U4 kernel32|GetCurrentProcessId' 114 | :ElseIf IsSsh 115 | r←Proc.Pid 116 | :Else 117 | r←tonum⊃_SH'echo $PPID' 118 | :EndIf 119 | ∇ 120 | 121 | ∇ r←GetCurrentExecutable;⎕USING;t;gmfn 122 | :Access Public Shared 123 | :If IsWindows 124 | r←'' 125 | :Trap 0 126 | 'gmfn'⎕NA'U4 kernel32|GetModuleFileName* P =T[] U4' 127 | r←⊃⍴/gmfn 0(1024⍴' ')1024 128 | :EndTrap 129 | :If 0∊⍴r 130 | ⎕USING←'System,system.dll' 131 | r←2 ⎕NQ'.' 
'GetEnvironment' 'DYALOG' 132 | r←r,(~(¯1↑r)∊'\/')/'/' ⍝ Add separator if necessary 133 | r←r,(Diagnostics.Process.GetCurrentProcess.ProcessName),'.exe' 134 | :EndIf 135 | :ElseIf IsSsh 136 | ∘∘∘ ⍝ Not supported 137 | :Else 138 | t←⊃_PS'-o args -p ',⍕GetCurrentProcessId ⍝ AWS 139 | :If '"'''∊⍨⊃t ⍝ if command begins with ' or " 140 | r←{⍵/⍨{∧\⍵∨≠\⍵}⍵=⊃⍵}t 141 | :Else 142 | r←{⍵↑⍨¯1+1⍳⍨(¯1↓0,⍵='\')<⍵=' '}t ⍝ otherwise find first non-escaped space (this will fail on files that end with '\\') 143 | :EndIf 144 | :EndIf 145 | ∇ 146 | 147 | ∇ r←RunTimeName exe 148 | ⍝ Assumes that: 149 | ⍝ Windows runtime ends in "rt.exe" 150 | ⍝ *NIX runtime ends in ".rt" 151 | r←exe 152 | :If IsWin 153 | :If 'rt.exe'≢¯6↑{('rt.ex',⍵)[⍵⍳⍨'RT.EX',⍵]}exe ⍝ deal with case insensitivity 154 | r←'rt.exe',⍨{(~∨\⌽<\⌽'.'=⍵)/⍵}exe 155 | :EndIf 156 | :Else 157 | r←exe,('.rt'≢¯3↑exe)/'.rt' 158 | :EndIf 159 | ∇ 160 | 161 | 162 | ∇ r←KillChildren Exe;kids;⎕USING;p;m;i;mask 163 | :Access Public Shared 164 | ⍝ returns [;1] pid [;2] process name of any processes that were not killed 165 | r←0 2⍴0 '' 166 | :If ~0∊⍴kids←ListProcesses Exe ⍝ All child processes using the exe 167 | :If IsWin 168 | ⎕USING←'System,system.dll' 169 | p←Diagnostics.Process.GetProcessById¨kids[;1] 170 | p.Kill 171 | ⎕DL 1 172 | :If 0≠⍴p←(~p.HasExited)/p 173 | ⎕DL 1 174 | p.Kill 175 | ⎕DL 1 176 | :If ∨/m←~p.HasExited 177 | r←(kids[;1]∊m/p.Id)⌿kids 178 | :EndIf 179 | :EndIf 180 | :ElseIf IsSsh 181 | ∘∘∘ 182 | :Else 183 | mask←(⍬⍴⍴kids)⍴0 184 | :For i :In ⍳⍴mask 185 | mask[i]←Shoot kids[i;1] 186 | :EndFor 187 | r←(~mask)⌿kids 188 | :EndIf 189 | :EndIf 190 | ∇ 191 | 192 | ∇ r←{all}ListProcesses procName;me;⎕USING;procs;unames;names;name;i;pn;kid;parent;mask;n 193 | :Access public shared 194 | ⍝ returns either my child processes or all processes 195 | ⍝ procName is either '' for all children, or the name of a process 196 | ⍝ r[;1] - child process number (Id) 197 | ⍝ r[;2] - child process name 198 | me←GetCurrentProcessId 199 | r←0 2⍴0 
'' 200 | procName←,procName 201 | all←{6::⍵ ⋄ all}0 ⍝ default to just my children 202 | 203 | :If IsWin 204 | ⎕USING←'System,system.dll' 205 | 206 | :If 0∊⍴procName ⋄ procs←Diagnostics.Process.GetProcesses'' 207 | :Else ⋄ procs←Diagnostics.Process.GetProcessesByName⊂procName ⋄ :EndIf 208 | :If all 209 | r←↑procs.(Id ProcessName) 210 | r⌿⍨←r[;1]≠me 211 | :Else 212 | :If 0<⍴procs 213 | unames←∪names←procs.ProcessName 214 | :For name :In unames 215 | :For i :In ⍳n←1+.=(,⊂name)⍳names 216 | pn←name,(n≠1)/'#',⍕i 217 | :Trap 0 ⍝ trap here just in case a process disappeared before we get to it 218 | parent←⎕NEW Diagnostics.PerformanceCounter('Process' 'Creating Process Id'pn) 219 | :If me=parent.NextValue 220 | kid←⎕NEW Diagnostics.PerformanceCounter('Process' 'Id Process'pn) 221 | r⍪←(kid.NextValue)name 222 | :EndIf 223 | :EndTrap 224 | :EndFor 225 | :EndFor 226 | :EndIf 227 | :EndIf 228 | :ElseIf IsSsh 229 | ∘∘∘ 230 | :Else ⍝ Linux 231 | ⍝ unfortunately, Ubuntu (and perhaps others) report the PPID of tasks started via ⎕SH as 1 232 | ⍝ so, the best we can do at this point is identify processes that we tagged with ppid= 233 | mask←' '∧.=procs←' ',↑_PS'-eo pid,cmd',((~all)/' | grep APLppid=',(⍕GetCurrentProcessId)),(0<⍴procName)/' | grep ',procName,' | grep -v grep' ⍝ AWS 234 | mask∧←2≥+\mask 235 | procs←↓¨mask⊂procs 236 | mask←me≠tonum¨1⊃procs ⍝ remove my task 237 | procs←mask∘/¨procs[1 2] 238 | mask←1 239 | :If 0<⍴procName 240 | mask←∨/¨(procName,' ')∘⍷¨(2⊃procs),¨' ' 241 | :EndIf 242 | mask>←∨/¨'grep '∘⍷¨2⊃procs ⍝ remove procs that are for the searches 243 | procs←mask∘/¨procs 244 | r←↑[0.1]procs 245 | :EndIf 246 | ∇ 247 | 248 | ∇ r←Kill;delay 249 | :Access Public Instance 250 | r←0 ⋄ delay←0.1 251 | :Trap 0 252 | :If IsWin 253 | Proc.Kill 254 | :Repeat 255 | ⎕DL delay 256 | delay+←delay 257 | :Until (delay>10)∨Proc.HasExited 258 | :ElseIf IsSsh 259 | ∘∘∘ 260 | :Else ⍝ Local UNIX 261 | {}UNIXIssueKill 3 Proc.Id ⍝ issue strong interrupt 262 | {}⎕DL 2 ⍝ wait a couple 
seconds for it to react 263 | :If ~Proc.HasExited←~UNIXIsRunning Proc.Id 264 | {}UNIXIssueKill 9 Proc.Id ⍝ still running: escalate to SIGKILL 265 | {}⎕DL 2 ⍝ wait a couple seconds for it to react 266 | :AndIf ~Proc.HasExited←~UNIXIsRunning Proc.Id 267 | :Repeat 268 | ⎕DL delay 269 | delay+←delay 270 | :Until (delay>10)∨Proc.HasExited←~UNIXIsRunning Proc.Id 271 | :EndIf 272 | :EndIf 273 | r←Proc.HasExited 274 | :EndTrap 275 | ∇ 276 | 277 | ∇ r←Shoot Proc;MAX;res 278 | MAX←100 279 | r←0 280 | :If 0≠⎕NC⊂'Proc.HasExited' 281 | :Repeat 282 | :If ~Proc.HasExited 283 | :If IsWin 284 | Proc.Kill 285 | ⎕DL 0.2 286 | :ElseIf IsSsh 287 | ∘∘∘ 288 | :Else 289 | {}UNIXIssueKill 3 Proc.Id ⍝ issue strong interrupt AWS 290 | {}⎕DL 2 ⍝ wait a couple seconds for it to react 291 | :If ~Proc.HasExited←0∊⍴res←UNIXGetShortCmd Proc.Id ⍝ AWS 292 | Proc.HasExited∨←∨/'<defunct>'⍷⊃,/res 293 | :EndIf 294 | :EndIf 295 | :EndIf 296 | MAX-←1 297 | :Until Proc.HasExited∨MAX≤0 298 | r←Proc.HasExited 299 | :ElseIf 2=⎕NC'Proc' ⍝ just a process id? 
300 | {}UNIXIssueKill 9 Proc.Id 301 | {}⎕DL 2 302 | r←~UNIXIsRunning Proc.Id ⍝ AWS 303 | :EndIf 304 | ∇ 305 | 306 | ∇ r←HasExited 307 | :Access public instance 308 | :If IsWin∨IsSsh 309 | r←{0::⍵ ⋄ Proc.HasExited}1 310 | :Else 311 | r←~UNIXIsRunning Proc.Id ⍝ AWS 312 | :EndIf 313 | ∇ 314 | 315 | ∇ r←IsRunning args;⎕USING;start;exe;pid;proc;diff;res 316 | :Access public shared 317 | ⍝ args - pid {exe} {startTS} 318 | r←0 319 | args←eis args 320 | (pid exe start)←3↑args,(⍴args)↓0 ''⍬ 321 | :If IsWin 322 | ⎕USING←'System,system.dll' 323 | :Trap 0 324 | proc←Diagnostics.Process.GetProcessById pid 325 | r←1 326 | :Else 327 | :Return 328 | :EndTrap 329 | :If ''≢exe 330 | r∧←exe≡proc.ProcessName 331 | :EndIf 332 | :If ⍬≢start 333 | :Trap 90 334 | diff←|-/#.DFSUtils.DateToIDN¨start(proc.StartTime.(Year Month Day Hour Minute Second Millisecond)) 335 | r∧←diff≤24 60 60 1000⊥0 1 0 0÷×/24 60 60 1000 ⍝ consider it a match within a 1 minute window 336 | :Else 337 | r←0 338 | :EndTrap 339 | :EndIf 340 | :ElseIf IsSsh 341 | ∘∘∘ 342 | :Else 343 | r←UNIXIsRunning pid 344 | :EndIf 345 | ∇ 346 | 347 | ∇ r←Stop pid;proc 348 | :Access public shared 349 | ⍝ attempts to stop the process with processID pid 350 | :If IsWin 351 | ⎕USING←'System,system.dll' 352 | :Trap 0 353 | proc←Diagnostics.Process.GetProcessById pid 354 | :Else 355 | r←1 356 | :Return 357 | :EndTrap 358 | proc.Kill 359 | {}⎕DL 0.5 360 | r←~##.APLProcess.IsRunning pid 361 | :ElseIf IsSsh 362 | ∘∘∘ 363 | :Else 364 | {}UNIXIssueKill 3 pid ⍝ issue strong interrupt 365 | :EndIf 366 | ∇ 367 | 368 | ∇ r←UNIXIsRunning pid;txt 369 | ⍝ Return 1 if the process is in the process table and is not defunct 370 | r←0 371 | →(r←' '∨.≠txt←UNIXGetShortCmd pid)↓0 372 | r←~∨/'<defunct>'⍷txt 373 | ∇ 374 | 375 | ∇ {r}←UNIXIssueKill(signal pid) 376 | signal pid←⍕¨signal pid 377 | cmd←'kill -',signal,' ',pid,' >/dev/null 2>&1 ; echo $?' 
378 | :If IsSsh 379 | ∘∘∘ 380 | :Else 381 | r←⎕SH cmd 382 | :EndIf 383 | ∇ 384 | 385 | ∇ r←UNIXGetShortCmd pid;cmd 386 | ⍝ Retrieve short form of cmd used to start process 387 | cmd←'ps -o cmd -p ',(⍕pid),' 2>/dev/null ; exit 0' 388 | :If IsSsh 389 | ∘∘∘ 390 | :Else 391 | r←⊃1↓⎕SH cmd 392 | :EndIf 393 | ∇ 394 | 395 | ∇ r←_PS cmd;ps 396 | ps←'ps ',⍨('AIX'≡3↑⊃'.'⎕WG'APLVersion')/'/usr/sysv/bin/' ⍝ Must use this ps on AIX 397 | r←1↓⎕SH ps,cmd,' 2>/dev/null; exit 0' ⍝ Remove header line 398 | ∇ 399 | 400 | ∇ r←{quietly}_SH cmd 401 | :Access public shared 402 | quietly←{6::⍵ ⋄ quietly}0 403 | :If quietly 404 | cmd←cmd,' </dev/null 2>&1' 405 | :EndIf 406 | r←{0::'' ⋄ ⎕SH ⍵}cmd 407 | ∇ 408 | 409 | :Class Time 410 | :Field Public Year 411 | :Field Public Month 412 | :Field Public Day 413 | :Field Public Hour 414 | :Field Public Minute 415 | :Field Public Second 416 | :Field Public Millisecond 417 | 418 | ∇ make ts 419 | :Implements Constructor 420 | :Access Public 421 | (Year Month Day Hour Minute Second Millisecond)←7↑ts 422 | ⎕DF(⍕¯2↑'00',⍕Day),'-',((12 3⍴'JanFebMarAprMayJunJulAugSepOctNovDec')[⍬⍴Month;]),'-',(⍕100|Year),' ',1↓⊃,/{':',¯2↑'00',⍕⍵}¨Hour Minute Second 423 | ∇ 424 | 425 | :EndClass 426 | 427 | ∇ r←ProcessUsingPort port;t 428 | ⍝ return the process ID of the process (if any) using a port 429 | :Access public shared 430 | r←⍬ 431 | :If IsWin 432 | :If ~0∊⍴t←_SH'netstat -a -n -o' 433 | :AndIf ~0∊⍴t/⍨←∨/¨'LISTENING'∘⍷¨t 434 | :AndIf ~0∊⍴t/⍨←∨/¨((':',⍕port),' ')∘⍷¨t 435 | r←∪∊¯1↑¨(//)∘⎕VFI¨t 436 | :EndIf 437 | :Else 438 | :If ~0∊⍴t←_SH'netstat -l -n -p 2>/dev/null | grep '':',(⍕port),' ''' 439 | r←∪∊{⊃(//)⎕VFI{(∧\⍵∊⎕D)/⍵}⊃¯1↑{⎕ML←3 ⋄ (' '≠⍵)⊂⍵}⍵}¨t 440 | :EndIf 441 | :EndIf 442 | ∇ 443 | 444 | ∇ r←MyDNSName;GCN 445 | :Access Public Shared 446 | 447 | :If IsWin 448 | 'GCN'⎕NA'I4 Kernel32|GetComputerNameEx* U4 >0T =U4' 449 | r←2⊃GCN 7 255 255 450 | :Return 451 | ⍝ ComputerNameNetBIOS = 0 452 | ⍝ ComputerNameDnsHostname = 1 453 | ⍝ ComputerNameDnsDomain = 2 454 | ⍝ 
ComputerNameDnsFullyQualified = 3 455 | ⍝ ComputerNamePhysicalNetBIOS = 4 456 | ⍝ ComputerNamePhysicalDnsHostname = 5 457 | ⍝ ComputerNamePhysicalDnsDomain = 6 458 | ⍝ ComputerNamePhysicalDnsFullyQualified = 7 <<< 459 | ⍝ ComputerNameMax = 8 460 | :ElseIf IsSsh 461 | ∘∘∘ ⍝ Not supported 462 | :Else 463 | r←⊃_SH'hostname' 464 | :EndIf 465 | ∇ 466 | 467 | ∇ Proc←SshProc(host user keyfile cmd);conn;z;kf;allpids;guid;listpids;pids;⎕USING;pid;tid 468 | ⎕USING←'Renci.SshNet,',PATH,'/Renci.SshNet.dll' 469 | kf←⎕NEW PrivateKeyFile (,⊂keyfile) 470 | conn←⎕NEW SshClient (host 22 user (,kf)) 471 | 472 | :Trap 0 473 | conn.Connect ⍝ This is defined to be a void() 474 | :Case 90 ⋄ ('Error creating ssh client instance: ',⎕EXCEPTION.Message) ⎕SIGNAL 11 475 | :Else ⋄ 'Unexpected error creating ssh client instance' ⎕SIGNAL 11 476 | :EndTrap 477 | 478 | listpids←{0~⍨2⊃(⎕UCS 10)⎕VFI (conn.RunCommand ⊂'ps -u ',user,' | grep dyalog | grep -v grep | awk ''{print $2}''').Result} 479 | guid←'dyalog-ssh-',(⍕⎕TS)~' ' 480 | pids←listpids ⍬ 481 | Proc←⎕NS '' 482 | Proc.SshConn←conn 483 | Proc.HasExited←0 484 | tid←{SshRun conn ⍵ Proc}&⊂cmd 485 | Proc.tid←tid 486 | ⎕DL 1 487 | :If 1=⍴pid←(listpids ⍬)~pids ⋄ pid←⊃pid 488 | :Else ⋄ ∘∘∘ ⋄ :EndIf ⍝ failed to start 489 | Proc.Pid←pid 490 | ∇ 491 | 492 | ∇SshRun (conn cmd proc) 493 | ⍝ Wait until APL exits, then set HasExited←1 494 | conn.RunCommand cmd 495 | proc.HasExited←1 496 | ∇ 497 | 498 | :EndClass 499 | -------------------------------------------------------------------------------- /vecdb.dyalog: -------------------------------------------------------------------------------- 1 | :Class vecdb 2 | ⍝ Dyalog APL vector database - see https://github.com/Dyalog/vecdb 3 | 4 | (⎕IO ⎕ML)←1 1 5 | 6 | :Section Constants 7 | :Field Public Shared Version←'0.2.6' ⍝ Parallel DB 8 | :Field Public Shared TypeNames←,¨'I1' 'I2' 'I4' 'F' 'B' 'C' 9 | ⍝ To come: C4=323 indexed chars 10 | ⍝ Tn=Fixed width text (no index table) 11 | :Field Public Shared 
TypeNums←83 163 323 645 11 163 12 | :Field Public Shared SummaryFns←'sum' 'max' 'min' 'count' 13 | :Field Public Shared CalcFns←,⊂'map' 14 | :Field Public Shared SummaryAPLFns←'+/' '⌈/' '⌊/' '≢' 15 | :Field Public Shared ReSummaryAPLFns←'+/' '⌈/' '⌊/' '+/' 16 | :EndSection ⍝ Constants 17 | 18 | :Section Instance Fields ⍝ The fact that these are public does not mean it is safe to change them 19 | :Field Public Name←'' 20 | :Field Public Folder←'' ⍝ Where is it 21 | :Field Public BlockSize←100000 ⍝ Small while we test (must be multiple of 8) 22 | :Field Public NumBlocks←1 ⍝ We start with one block 23 | :Field Public noFiles←0 ⍝ in-memory database (not supported) 24 | :Field Public isOpen←0 ⍝ Not yet open 25 | :Field Public ShardFolders←⍬ ⍝ List of Shard Folders 26 | :Field Public LocalFolders←⍬ ⍝ Shard Folders as seen by slave task 27 | :Field Public ShardFn←⍬ ⍝ Shard Calculation Function 28 | :Field Public ShardCols←⍬ ⍝ ShardFn input column indices 29 | :Field Public ShardSelected←⍬ ⍝ Shards selected 30 | :Field Private AllShards←0 ⍝ Are all Shards in use? 
31 | 32 | :Field _Columns←⍬ 33 | :Field _Types←⍬ 34 | :Field _Count←⍬ 35 | 36 | :EndSection ⍝ Instance Fields 37 | 38 | fileprops←'Name' 'BlockSize' ⍝ To go in comp 4 of meta.vecdb 39 | eis←{(≡⍵)∊0 1:⊂,⍵ ⋄ ⍵} ⍝ enclose if simple 40 | sizeOf←{(size dr)←⍵ ⋄⌈size×8÷⍨⌊dr÷10} ⍝ size in bytes of ⍵[1] elements of type ⍵[2] 41 | 42 | :Section Properties 43 | :Property Columns 44 | :Access Public 45 | ∇ r←get 46 | r←_Columns,_CalcCols 47 | ∇ 48 | :EndProperty 49 | 50 | :Property Types 51 | :Access public 52 | ∇ r←get 53 | r←_Types,_CalcTypes 54 | ∇ 55 | :EndProperty 56 | 57 | :Property Count 58 | :Access public 59 | ∇ r←get 60 | r←⊃⊃+/_Counts[ShardSelected].counter 61 | ∇ 62 | :EndProperty 63 | :EndSection ⍝ Properties 64 | 65 | ∇ Open(folder) 66 | :Implements Constructor 67 | :Access Public 68 | 69 | OpenFull(folder ⍬) ⍝ Open all shards 70 | ∇ 71 | 72 | ∇ InitCalcs tn;i;calc;spec;space;inv 73 | ⍝ Extract calculation data from meta file 74 | 75 | :If 8=2⊃⎕FSIZE tn ⍝ If file format pre-dates calculated columns 76 | 'unused'∆FAPPEND tn ⍝ 8 77 | 'unused'∆FAPPEND tn ⍝ 9 78 | (⍬ ⍬ ⍬)∆FAPPEND tn ⍝ 10 Calc Col Names, Source Columns, Data Type 79 | :EndIf 80 | 81 | (_CalcCols _CalcSources _CalcTypes)←⎕FREAD tn,10 ⍝ Calculated column definitions 82 | mappings,←⎕NS¨(≢_CalcCols)⍴⊂'' ⍝ Add mappings 83 | 84 | :For i :In ⍳≢_CalcCols ⍝ Run setup for each calculated column 85 | space←(i+≢_Columns)⊃mappings 86 | (calc spec inv)←⎕FREAD tn,10+i 87 | :If '{'=⊃calc ⍝ User-defined 88 | space.Type←2 ⍝ Calculation 89 | space.Spec←spec ⍝ Store data 90 | space⍎'Calc←',calc ⍝ Define function 91 | :If 0≠⍴inv ⋄ space⍎'CalcInv←',inv ⋄ :EndIf ⍝ Define inverse 92 | :Else 93 | ⍎'(i⊃_CalcCols) ',calc,'_Setup spec' 94 | :EndIf 95 | :EndFor 96 | ∇ 97 | 98 | ∇ name map_Setup(from to);cix;six;i;n;src;m;col;symbol;char;calcix 99 | ⍝ Setup for a "mapped" column 100 | 101 | (cix six)←Columns⍳(⊂name),_CalcSources[calcix←_CalcCols⍳⊂name] 102 | (col src)←mappings[cix six] 103 | 104 | :If (,'C')≡cix⊃Types ⍝ Special 
case char 105 | :AndIf (from≡⍳≢from)∨char←(,'C')≡six⊃Types ⍝ char-char or "direct" map 106 | 107 | :If char 108 | symbol←src.symbol ⍝ source symbols 109 | m←(≢from)≥i←from⍳symbol ⍝ mappable symbols 110 | (m/symbol)←to[m/i] ⍝ remap 111 | :Else 112 | symbol←to 113 | :EndIf 114 | 115 | col.file←0 ⍝ there is no symbol file 116 | col.Type←1 ⍝ Symbol 117 | col.symbol←symbol ⍝ Store symbol list 118 | col.(SymbolIndex←symbol∘⍳) ⍝ Create lookup function 119 | col.Source←six ⍝ Store the source column 120 | 121 | :Else 122 | col.Type←2 ⍝ Calc/CalcInv 123 | col.(to from)←to from 124 | col.(SymbolIndex←from∘⍳) 125 | col.(TargetIndex←to∘⍳) 126 | col.(Calc←to∘{⍺⌷⍨⊂SymbolIndex ⍵}) 127 | col.(CalcInv←from∘{⍺⌷⍨⊂TargetIndex ⍵}) 128 | :EndIf 129 | ∇ 130 | 131 | ∇ r←AddCalc spec;name;source;type;calc;file;tn;i;inv 132 | :Access Public 133 | 134 | 'not allowed unless all shards are open'⎕SIGNAL AllShards↓11 135 | (name source type calc spec inv)←6↑,¨spec,⍬ ⍬ ⍬ 136 | 137 | 'unknown source column'⎕SIGNAL((⊂source)∊_Columns)↓11 138 | 'unknown data type'⎕SIGNAL((⊂type)∊TypeNames)↓11 139 | 140 | :If '{'≠1⊃calc 141 | :Select calc 142 | :Case 'map' 143 | :If 2≠⍴spec 144 | :OrIf ≢/≢¨spec 145 | 'map source and target must have the same length'⎕SIGNAL 11 146 | :EndIf 147 | :Else 148 | ('Unknown standard calculation: ',calc)⎕SIGNAL 11 149 | :EndSelect 150 | :EndIf 151 | 152 | file←Folder,'meta.vecdb' 153 | ⎕FHOLD tn←file ⎕FSTIE 0 154 | (_CalcCols _CalcSources _CalcTypes)←⎕FREAD tn,10 ⍝ Calculated column definitions 155 | 156 | :If (≢_CalcCols)≥i←_CalcCols⍳⊂name ⍝ Existing source? 157 | (i⊃_CalcSources)←source 158 | (i⊃_CalcTypes)←type 159 | (_CalcCols _CalcSources _CalcTypes)⎕FREPLACE tn,10 160 | (calc spec inv)⎕FREPLACE tn,10+i 161 | 162 | :Else ⍝ New source 163 | ((_CalcCols _CalcSources _CalcTypes),∘⊂¨name source type)⎕FREPLACE tn,10 164 | :If (10+i)=2⊃⎕FSIZE tn ⍝ Append or replace? 
165 | (calc spec inv)⎕FAPPEND tn 166 | :Else ⋄ (calc spec inv)⎕FREPLACE tn,10+i 167 | :EndIf 168 | :EndIf 169 | 170 | InitCalcs tn ⍝ re-read all from file (optimise later if necessary) 171 | ⎕FUNTIE tn ⍝ Also unholds it 172 | ∇ 173 | 174 | ∇ r←RemoveCalc name;file;tn;m;i;cn 175 | :Access Public 176 | 177 | 'not allowed unless all shards are open'⎕SIGNAL AllShards↓11 178 | 'calc not found'⎕SIGNAL((⊂name)∊_CalcCols)↓11 179 | 180 | file←Folder,'meta.vecdb' 181 | ⎕FHOLD tn←file ⎕FSTIE 0 182 | (_CalcCols _CalcSources _CalcTypes)←⎕FREAD tn,10 ⍝ Calculated column definitions 183 | i←(m←~_CalcCols∊⊂name)⍳0 184 | (m∘/¨_CalcCols _CalcSources _CalcTypes)⎕FREPLACE tn,10 185 | 186 | :For cn :In ⌽i↓10+⍳≢_CalcCols 187 | (⎕FREAD tn,cn)⎕FREPLACE tn,cn-1 ⍝ Copy following specs down 188 | :EndFor 189 | 190 | :If (11+≢_CalcCols)=2⊃⎕FSIZE tn ⍝ Did we stop using the last component? 191 | ⎕FDROP tn,¯1 192 | :EndIf 193 | 194 | InitCalcs tn ⍝ re-read all from file (optimise later if necessary) 195 | ⎕FUNTIE tn ⍝ Also unholds it 196 | ∇ 197 | 198 | ∇ r←Calc(name data);i;ns;cix;src;col 199 | :Access Public 200 | 201 | 'calculation not found'⎕SIGNAL((≢_CalcCols)≢columns)⍴11 322 | 'Column types and names do not have same length'⎕SIGNAL((≢columns)≠≢types)⍴11 323 | 'Invalid column types - see vecdb.TypeNames'⎕SIGNAL(∧/types∊TypeNames)↓11 324 | 'Column(s) already exist'⎕SIGNAL(∨/columns∊_Columns)⍴11 325 | :If 0=≢data ⋄ data←(≢columns)⍴⊂⍬ ⋄ :EndIf ⍝ Default data is all zeros 326 | 'Data lengths not all the same'⎕SIGNAL(1≠≢length←∪≢¨data)/11 327 | 328 | folder,←((¯1↑folder)∊'/\')↓'/' ⍝ make sure we have trailing separator 329 | metafile←folder,'meta.vecdb' 330 | 331 | :If create ⍝ We are CREATEing a database 332 | :If Exists ¯1↓folder ⍝ Folder already exists 333 | ('"',metafile,'" already exists')⎕SIGNAL(Exists metafile)/11 334 | :Else ⍝ Folder does not exist 335 | :Trap 0 ⋄ MkDir ¯1↓folder 336 | :Else ⋄ ⎕DMX.EM ⎕SIGNAL ⎕DMX.EN 337 | :EndTrap 338 | :EndIf 339 | ProcessOptions options ⍝ Sets global 
fields 340 | 'Block size must be a multiple of 8'⎕SIGNAL(0≠8|BlockSize)/11 341 | 342 | ⍝ Set defaults for sharding (1 shard) 343 | ShardFolders,←(0=⍴ShardFolders)/⊂folder 344 | ShardFolders←AddSlash¨ShardFolders 345 | :If 0=⎕NC 'LocalFolders' ⋄ LocalFolders←ShardFolders ⋄ :EndIf 346 | LocalFolders←AddSlash¨LocalFolders 347 | shardfolders←ShardFolders ⍝ When creating we have the full view 348 | ShardCols←,ShardCols 349 | :If 0≠⍴ShardFn ⋄ findshard←⍎ShardFn ⋄ :EndIf ⍝ Define shard calculation function 350 | (Name _Columns _Types)←name columns types ⍝ Set Class fields 351 | mappings←⎕NS¨(≢_Columns)⍴⊂'' 352 | 353 | :Else ⍝ We are adding columns to an open database 354 | (_Columns _Types)←(_Columns _Types),¨columns types ⍝ Extend Class fields 355 | mappings,←⎕NS¨(≢columns)⍴⊂'' 356 | 357 | :EndIf 358 | newcols←(-≢columns)↑⍳≢_Columns ⍝ Indices of new columns 359 | newchars←'C'=⊃¨_Types[newcols] ⍝ /// Should really be driven off mappings.Type=1 in the future 360 | 361 | :For i :In newchars/newcols ⍝ Create symbol files for CHAR fields 362 | col←i⊃mappings 363 | dix←newcols⍳i ⍝ data index 364 | col.symbol←{⍵[∪⍳⍨↑⍵]}dix⊃data ⍝ Unique symbols in input data 365 | col.file←folder,(⍕i),'.symbol' ⍝ Symbol file name in main folder 366 | col.symbol PutSymbols col.file ⍝ Write symbols 367 | col.(SymbolIndex←symbol∘⍳) ⍝ Create lookup function 368 | (dix⊃data)←col.SymbolIndex dix⊃data ⍝ Convert values to symbol indices 369 | :EndFor 370 | 371 | :If create 372 | (shards data)←newcols ShardData data ⍝ NB data has one COLUMN per shard 373 | data←data,⊂⍬ 374 | :Else ⍝ adding columns 375 | shards←⍳≢Shards 376 | data←((≢newcols),≢shards)⍴⊂⍬ ⍝ No data provided when adding cols 377 | :EndIf 378 | 379 | :For f :In ⍳≢ShardFolders 380 | 3 ⎕MKDIR sf←f⊃ShardFolders 381 | d←data[;shards⍳f] ⍝ extract records for one shard 382 | :If create 383 | n←≢⊃d 384 | :Else 385 | n←f⊃_Counts.counter 386 | :EndIf 387 | size←BlockSize×1⌈⌈n÷BlockSize ⍝ At least one block 388 | 389 | :If create ⍝ # of records in the shard 390 |
tn←(sf,'counters.vecdb')⎕NCREATE 0 391 | n ⎕NAPPEND tn 645 ⍝ Record the number of records as a FLOAT 392 | ⎕NUNTIE tn 393 | :EndIf 394 | 395 | :For i :In newcols ⍝ For each column being added 396 | ai3←⎕AI[3] 397 | dr←(TypeNames⍳_Types[i])⊃TypeNums 398 | tn←(filename←sf,(⍕i),'.vector')⎕NCREATE 0 399 | (sizeOf size dr)⎕NRESIZE tn 400 | ⎕NUNTIE tn 401 | :If 0≠≢⊃d ⍝ if there is some data to write 402 | temp←dr ¯1 ⎕MAP filename'W' 403 | temp[]←size↑(newcols⍳i)⊃d 404 | ⎕EX'temp' 405 | :EndIf 406 | ⍝ 'col ',(⍕i),': ',⍕⎕ai[3]-ai3 407 | :EndFor 408 | :EndFor 409 | 410 | :If create 411 | tn←metafile (⎕FCREATE⍠3) 0 412 | ('vecdb ',Version)∆FAPPEND tn ⍝ 1 413 | 'See github.com/Dyalog/vecdb/doc/Implementation.md'∆FAPPEND tn ⍝ 2 414 | 'unused'∆FAPPEND tn ⍝ 3 415 | (fileprops(⍎¨fileprops))∆FAPPEND tn ⍝ 4 (Name BlockSize) 416 | (_Columns _Types)∆FAPPEND tn ⍝ 5 417 | ((2,≢ShardFolders)⍴ShardFolders,LocalFolders) ∆FAPPEND tn ⍝ 6 418 | (ShardFn ShardCols)∆FAPPEND tn ⍝ 7 419 | 'unused'∆FAPPEND tn ⍝ 8 420 | 'unused'∆FAPPEND tn ⍝ 9 421 | (⍬ ⍬ ⍬)∆FAPPEND tn ⍝ 10 Calc Col Names, Source Columns, Data Type 422 | 423 | :Else ⍝ Extending 424 | tn←metafile ⎕FTIE 0 425 | (_Columns _Types)⎕FREPLACE tn 5 426 | :EndIf 427 | ⎕FUNTIE tn 428 | ∇ 429 | 430 | ∇X ∆FAPPEND Y 431 | ⍝ Work-around for Samba on Mac 432 | X ⎕FAPPEND Y 433 | ⎕FUNTIE ⍬ 434 | ∇ 435 | 436 | ∇ (shards data)←cix ShardData data;six;s;char;rawdata;sym;c;counts;m 437 | ⍝ Shards is a vector of shards to be updated 438 | ⍝ data has one column per shard, and one row per column 439 | 440 | :If 0=≢⊃rawdata←data 441 | shards←⍬ ⋄ data←0/⍪data 442 | →0 443 | :EndIf 444 | 445 | :If 1=≢shardfolders ⍝ Data will necessarily all be in the 1st shard then! 
446 | shards←,1 ⋄ data←⍪data 447 | 448 | :Else ⍝ Database *is* sharded 449 | :If (≢cix)∨.≡cols ⋄ cols←,⊂,cols ⋄ :EndIf ⍝ Enclose if simple 528 | p←p×(≢¨cols)≥p←cols⍳¨' ' ⍝ position of separator 529 | summary←(0⌈p-1)↑¨cols 530 | colnames←p↓¨cols 531 | :EndIf 532 | ∇ 533 | 534 | ∇ {r}←AddColumns(columns types);z 535 | :Access Public 536 | 537 | 'not allowed unless all shards are open'⎕SIGNAL AllShards↓11 538 | 1 CreateOrExtend Name Folder columns types''⍬ 539 | z←Close ⋄ Open,⊂Folder ⍝ Reopen - might want to optimise this later? 540 | r←⍬ 541 | ∇ 542 | 543 | ∇ {r}←RemoveColumns columns;tn;keep;metafile;f;c;colix;file;sf;m;sym 544 | :Access Public 545 | 546 | 'not allowed unless all shards are open'⎕SIGNAL AllShards↓11 547 | :If ∨/m←~columns∊_Columns 548 | ('Columns not found:',⍕m/columns)⎕SIGNAL 11 549 | :EndIf 550 | 551 | 'Cannot remove sharding columns'⎕SIGNAL(∨/columns∊_Columns[ShardCols])⍴11 552 | 'Cannot remove all columns'⎕SIGNAL(∧/_Columns∊columns)⍴11 553 | 554 | keep←~_Columns∊columns 555 | 556 | ⎕EX'Shards' ⍝ We will reopen the file at the end, need to remove maps 557 | 558 | :For f :In ⍳≢shardfolders 559 | colix←1 560 | sf←f⊃shardfolders 561 | :For c :In ⍳≢_Columns 562 | tn←(file←sf,(⍕c),'.vector')⎕NTIE 0 563 | sym←{22::0 ⋄ (Folder,(⍕c),'.symbol')⎕NTIE ⍵}0 564 | :If c⊃keep ⍝ keeping this column 565 | :If c≠colix ⍝ needs renaming 566 | (sf,(⍕colix),'.vector')⎕NRENAME tn 567 | :If (f=1)∧sym≠0 568 | (Folder,(⍕colix),'.symbol')⎕NRENAME sym 569 | :EndIf 570 | :EndIf 571 | colix+←1 572 | :Else ⍝ erasing this column 573 | file ⎕NERASE tn 574 | :If (f=1)∧sym≠0 575 | (Folder,(⍕c),'.symbol')⎕NERASE sym 576 | :EndIf 577 | :EndIf 578 | ⎕NUNTIE ⎕NNUMS∩tn,sym 579 | :EndFor 580 | :EndFor 581 | 582 | (_Columns _Types)←keep∘/¨_Columns _Types 583 | 584 | metafile←Folder,'meta.vecdb' 585 | tn←metafile ⎕FTIE 0 586 | (_Columns _Types)⎕FREPLACE tn 5 587 | ⎕FUNTIE tn 588 | {}Close ⋄ Open,⊂Folder ⍝ Reopen 589 | r←⍬ 590 | ∇ 591 | 592 | ∇ r←Query 
args;where;cols;groupby;col;value;ix;j;s;count;Data;Cols;summary;m;i;f;cix;calc;mapped;c;columns;map 593 | :Access Public 594 | 595 | (where cols groupby)←3↑args,(≢args)↓⍬ ⍬ ⍬ 596 | cols←(0≠≢cols)/,,¨eis cols 597 | columns←Columns 598 | :If 2=≢where ⋄ :AndIf where[1]∊columns ⍝ just a single constraint? 599 | where←,⊂where 600 | :EndIf 601 | 602 | (summary cols)←ParseSummary cols 603 | 'UNKNOWN SUMMARY FUNCTION'⎕SIGNAL(∧/summary∊SummaryFns,⊂'')↓11 604 | 605 | :If 0≠≢groupby ⍝ We are grouping 606 | :If 1=≡groupby ⋄ groupby←,⊂groupby ⋄ :EndIf ⍝ Enclose if simple 607 | m←(0≠≢¨summary)∨cols∊groupby ⍝ summary or one of the grouping cols? 608 | 'ONLY SUMMARIZED COLUMNS MAY BE SELECTED WHEN GROUPING'⎕SIGNAL(∧/m)↓11 609 | :EndIf 610 | 611 | r←0 2⍴0 ⍝ (shard indices) 612 | 613 | :For s :In ShardSelected 614 | Cols←s⊃Shards 615 | count←⊃(s⊃_Counts).counter 616 | ix←⎕NULL 617 | 618 | :For (col value) :In where ⍝ AND them all together 619 | 620 | :If (≢columns) records selected 713 | blksize←numrecs←≢indices 714 | :EndIf 715 | 716 | split←0 ⍝ We did it all at once 717 | :Repeat 718 | :Trap 1 ⍝ WS FULL 719 | recs←blksize⌊numrecs-offset 720 | :If indices≡⎕NULL ⍝ All records still selected 721 | data←offset((s⊃Shards)[allix].{⍵↑⍺↓vector})recs 722 | :Else 723 | data←(s⊃Shards)[allix].{vector[⍵]}⊂recs↑offset↓indices 724 | :EndIf 725 | 726 | :For c :In calccols ⍝ /// equivalent code exists in Read: refactor someday? 
727 | (c⊃data)←mappings[(≢_Columns)+calcix[c]].Calc c⊃data 728 | :EndFor 729 | 730 | r⍪←data[groupix]groupfn data[colix] 731 | offset+←blksize 732 | ⎕EX'data' 733 | :Else ⍝ Got a WS FULL 734 | split←1 ⍝ We had to go around again 735 | blksize←blksize(⌈÷)2 736 | ⎕←(⍕⎕AI[3]),': block size reduced: ',⍕blksize 737 | :If blksize<100000 738 | ∘∘∘ 739 | :EndIf 740 | :EndTrap 741 | :Until offset≥numrecs 742 | 743 | :If split ⍝ re-summarize partial results 744 | r←(↓⍉r[;groupix])regroupfn↓⍉r[;colix] 745 | :EndIf 746 | :EndFor 747 | 748 | :If 1<≢ix ⍝ re-summarize partial results 749 | r←(↓⍉r[;groupix])regroupfn↓⍉r[;colix] 750 | :EndIf 751 | 752 | :For char :In {⍵/⍳⍴⍵}'C'=⊃¨Types[(≢groupby)↑allix] ⍝ Symbol Group By cols 753 | r[;char]←mappings[allix[char]].{symbol[⍵]}r[;char] 754 | :EndFor 755 | ∇ 756 | 757 | ∇ r←Read(ix cols);char;m;num;cix;s;indices;t;calcix;calccols;c;nss;six;tix 758 | ⍝ Read specified indices of named columns 759 | :Access Public 760 | 761 | :If 1=⍴⍴ix ⋄ ix←1,⍪⊂ix ⋄ :EndIf ⍝ Single Shard? 
762 | :If 1=≡cols ⋄ cols←,⊂cols ⋄ :EndIf ⍝ Single simple column name 763 | ⎕SIGNAL/ValidateColumns cols 764 | 765 | tix←six←cix←Columns⍳cols 766 | :If 0≠⍴calccols←((≢_CalcCols)≥calcix←_CalcCols⍳cols)/⍳⍴cols 767 | six[calccols]←_Columns⍳_CalcSources[calcix[calccols]] ⍝ source columns for calculated cols 768 | :EndIf 769 | r←(⍴cix)⍴⊂⍬ 770 | 771 | 'Data found in unopened shard!'⎕SIGNAL(∧/ix[;1]∊ShardSelected)↓11 772 | :For (s indices) :In ↓ix 773 | :If indices≡⎕NULL ⋄ r←r,¨(s⊃_Counts).counter↑¨(s⊃Shards)[six].vector 774 | :Else ⋄ r←r,¨(s⊃Shards)[six].{vector[⍵]}⊂indices ⋄ :EndIf 775 | :EndFor 776 | 777 | :If 0≠⍴char←{⍵/⍳≢⍵}'C'=⊃¨Types[cix] ⍝ Symbol translation 778 | :AndIf 0≠⍴char←(m←2=⊃¨(nss←mappings[cix[char]]).⎕NC⊂'symbol')/char 779 | r[char]←(m/nss).{symbol[⍵]}r[char] 780 | :EndIf 781 | 782 | :For c :In calccols~char ⍝ Exclude char-char maps handled above 783 | (c⊃r)←mappings[cix[c]].Calc c⊃r 784 | :EndFor 785 | ∇ 786 | 787 | ∇ r←ValidateColumns cols;bad 788 | ⍝ Return result suitable for ⎕SIGNAL/ 789 | 790 | r←''⍬ 791 | :If ~0∊⍴bad←cols~Columns 792 | r←('Unknown Column Names:',,⍕bad)11 793 | :EndIf 794 | ∇ 795 | 796 | ∇ r←Append(cols data);length;canupdate;shards;s;growth;tn;cix;count;i;append;Cols;size;d;n 797 | :Access Public 798 | 799 | 'Data lengths not all the same'⎕SIGNAL(1≠≢length←∪≢¨data)/11 800 | 'Col and Data counts not the same'⎕SIGNAL((≢cols)≠≢data)/11 801 | ⎕SIGNAL/ValidateColumns cols 802 | 803 | cix←_Columns⍳cols 804 | data←cix IndexSymbols data ⍝ Char to Symbol indices 805 | 806 | (shards data)←(⍳≢_Columns)ShardData data 807 | 'data may only be appended to opened shards'⎕SIGNAL(∧/shards∊ShardSelected)↓11 808 | 809 | :For s :In shards 810 | d←data[;shards⍳s] 811 | length←≢⊃d ⍝ # records to be written to *this* Shard 812 | Cols←s⊃Shards ⍝ Mapped columns in this Shard 813 | count←⊃(s⊃_Counts).counter ⍝ Active records in this Shard 814 | size←≢Cols[⊃cix].vector ⍝ Current Shard allocation 815 | 816 | :If 0≠canupdate←length⌊size-count ⍝ Updates to
existing maps 817 | i←⊂count+⍳canupdate 818 | i(Cols[cix]).{vector[⍺]←⍵}canupdate↑¨d 819 | :EndIf 820 | 821 | :If length>canupdate ⍝ We need to extend the file 822 | append←(≢_Columns)⍴⊂⍬ 823 | append[cix]←canupdate↓¨d ⍝ Data which was not updated 824 | growth←BlockSize×(length-canupdate)(⌈÷)BlockSize ⍝ How many records to add to the Shard 825 | ExtendShard(s⊃shardfolders)Cols growth append 826 | :EndIf 827 | 828 | _Counts[s].counter[1]←count+length ⍝ Update (mapped) counter 829 | :EndFor 830 | 831 | r←0 832 | ∇ 833 | 834 | ∇ {r}←Update(ix cols data);cix;indices;s;p;i 835 | :Access Public 836 | 837 | :If 1=≡cols ⋄ (cols data)←,∘⊂¨cols data ⋄ :EndIf ⍝ Simple col name 838 | ⎕SIGNAL/ValidateColumns cols 839 | cix←Columns⍳cols 840 | 'Cannot update Sharding Cols'⎕SIGNAL(cix∊ShardCols)/11 841 | 842 | data←cix IndexSymbols data 843 | 844 | :If 1=≢ix ⋄ data←⍪data ⍝ One shard 845 | :Else ⍝ Partition data by Shard 846 | p←(≢⊃data)⍴0 ⋄ p[+\1,≢¨¯1↓ix[;2]]←1 847 | data←↑p∘⊂¨data 848 | :EndIf 849 | 850 | 'data must be in opened shards!'⎕SIGNAL(∧/ix[;1]∊ShardSelected)↓11 851 | :For i :In ⍳≢ix ⍝ Each shard 852 | (s indices)←ix[i;] 853 | (⊂indices)((s⊃Shards)[cix]).{vector[⍺]←⍵}data[;i] 854 | :EndFor 855 | r←0 856 | ∇ 857 | 858 | ∇ r←Delete folder;file;tn;folders;files;f;shards 859 | :Access Public Shared 860 | ⍝ Erase a vecdb file without opening it first (it might be too damaged to open) 861 | ⍝ Does check whether there is a meta file in the folder 862 | ⍝ Also deletes the shard folders listed in the meta file 863 | 864 | folder←AddSlash folder 865 | 'Folder not found'⎕SIGNAL(DirExists folder)↓22 ⍝ Not there 866 | 'Not a vecdb'⎕SIGNAL(Exists file←folder,'meta.vecdb')↓22 ⍝ Paranoia 867 | tn←file ⎕FTIE 0 868 | folders←(⎕FREAD tn 6),⊂folder ⍝ shards first 869 | file ⎕FERASE tn 870 | 871 | :For folder :In folders 872 | :If isWindows 873 | ⎕CMD'rmdir "',folder,'" /s /q' 874 | :Else 875 | 1 _SH'rm -r ',folder 876 | :EndIf 877 | :EndFor 878 | 879 | r←~DirExists folder 880 | ∇ 881 | 882 | ∇ r←Erase 883 | :Access Public 884 | ⍝
/// needs error trapping 885 | 886 | 'all shards must be open'⎕SIGNAL AllShards↓11 887 | 'vecdb is not open'⎕SIGNAL isOpen↓11 888 | 889 | {}Close 890 | {}Delete Folder 891 | r←0 892 | ∇ 893 | 894 | ∇ ix←ns SymbolUpdate values;m 895 | ⍝ Convert values to symbol indices, and update the file if necessary 896 | 897 | :If ∨/m←(≢ns.symbol)/dev/null' 1009 | :Else 1010 | ('shell command failed: ',cmd)⎕SIGNAL 11/⍨~suppress 1011 | :EndTrap 1012 | ∇ 1013 | 1014 | ∇ r←APLVersion 1015 | :Select 3↑⊃'.'⎕WG'APLVersion' 1016 | :CaseList 'Lin' 'AIX' 'Sol' 1017 | r←'*nix' 1018 | :Case 'Win' 1019 | r←'Win' 1020 | :Case 'Mac' 1021 | r←'Mac' 1022 | :Else 1023 | ... ⍝ unknown version 1024 | :EndSelect 1025 | ∇ 1026 | :EndSection ⍝ Files 1027 | 1028 | :EndClass 1029 | ⍝)(!Delete!!0 0 0 0 0 0 0!0 1030 | --------------------------------------------------------------------------------