├── .gitignore
├── ARTIFACT.md
├── CMakeLists.txt
├── DBx1000_README
├── LICENSE
├── Makefile
├── README.md
├── artifact.sh
├── benchmarks
│   ├── TEST_schema.txt
│   ├── TPCC_full_schema.txt
│   ├── TPCC_short_schema.txt
│   ├── YCSB_schema.txt
│   ├── test.h
│   ├── test_txn.cpp
│   ├── test_wl.cpp
│   ├── tpcc.h
│   ├── tpcc_const.h
│   ├── tpcc_helper.cpp
│   ├── tpcc_helper.h
│   ├── tpcc_query.cpp
│   ├── tpcc_query.h
│   ├── tpcc_txn.cpp
│   ├── tpcc_wl.cpp
│   ├── ycsb.h
│   ├── ycsb_query.cpp
│   ├── ycsb_query.h
│   ├── ycsb_txn.cpp
│   └── ycsb_wl.cpp
├── concurrency_control
│   ├── aria.cpp
│   ├── aria.h
│   ├── bamboo.cpp
│   ├── dl_detect.cpp
│   ├── dl_detect.h
│   ├── hekaton.cpp
│   ├── ic3.cpp
│   ├── occ.cpp
│   ├── occ.h
│   ├── plock.cpp
│   ├── plock.h
│   ├── row_aria.cpp
│   ├── row_aria.h
│   ├── row_bamboo.cpp
│   ├── row_bamboo.h
│   ├── row_hekaton.cpp
│   ├── row_hekaton.h
│   ├── row_ic3.cpp
│   ├── row_ic3.h
│   ├── row_lock.cpp
│   ├── row_lock.h
│   ├── row_mvcc.cpp
│   ├── row_mvcc.h
│   ├── row_occ.cpp
│   ├── row_occ.h
│   ├── row_silo.cpp
│   ├── row_silo.h
│   ├── row_silo_prio.cpp
│   ├── row_silo_prio.h
│   ├── row_tictoc.cpp
│   ├── row_tictoc.h
│   ├── row_ts.cpp
│   ├── row_ts.h
│   ├── row_vll.cpp
│   ├── row_vll.h
│   ├── row_ww.cpp
│   ├── row_ww.h
│   ├── silo.cpp
│   ├── silo_prio.cpp
│   ├── tictoc.cpp
│   ├── vll.cpp
│   └── vll.h
├── config-std.h
├── config.cpp
├── experiments
│   ├── debug.json
│   ├── default.json
│   ├── large_dataset.json
│   ├── long_txn.json
│   ├── run_all.sh
│   ├── run_tpcc_thread.sh
│   ├── run_ycsb_aria.sh
│   ├── run_ycsb_latency.sh
│   ├── run_ycsb_prio_sen.sh
│   ├── run_ycsb_readonly.sh
│   ├── run_ycsb_thread.sh
│   ├── run_ycsb_zipf.sh
│   ├── synthetic_ycsb.json
│   └── tpcc.json
├── libs
│   └── libjemalloc.a
├── outputs
│   ├── .ipynb_checkpoints
│   │   └── analyze_factors-checkpoint.ipynb
│   └── collect_stats.py
├── parse.py
├── plot.py
├── requirements.txt
├── storage
│   ├── catalog.cpp
│   ├── catalog.h
│   ├── index_base.h
│   ├── index_btree.cpp
│   ├── index_btree.h
│   ├── index_hash.cpp
│   ├── index_hash.h
│   ├── row.cpp
│   ├── row.h
│   ├── table.cpp
│   └── table.h
├── system
│   ├── amd64.h
│   ├── batch.cpp
│   ├── batch.h
│   ├── global.cpp
│   ├── global.h
│   ├── helper.cpp
│   ├── helper.h
│   ├── main.cpp
│   ├── manager.cpp
│   ├── manager.h
│   ├── mcs_spinlock.h
│   ├── mem_alloc.cpp
│   ├── mem_alloc.h
│   ├── parser.cpp
│   ├── query.cpp
│   ├── query.h
│   ├── stats.cpp
│   ├── stats.h
│   ├── thread.cpp
│   ├── thread.h
│   ├── txn.cpp
│   ├── txn.h
│   ├── wl.cpp
│   └── wl.h
└── test.py
/.gitignore:
--------------------------------------------------------------------------------
1 | **.o
2 | **.d
3 | **.out
4 | **.swp
5 | **.swo
6 | run_exp.sh
7 | outputs/*.json
8 | config.h
9 | rundb
10 | cmake-build-debug
11 | .DS_STORE
12 | .idea/*
13 | perf*
14 | row_stats.csv
15 | linux/
16 | results/
17 | *.pdf
18 | 
--------------------------------------------------------------------------------
/ARTIFACT.md:
--------------------------------------------------------------------------------
1 | 
2 | # Artifact Reproduction
3 | 
4 | This document provides detailed step-by-step instructions to reproduce all experiments on a CloudLab c6420 instance with Ubuntu 20. We recommend using this environment to reproduce the experiments. To run the experiments on other hardware, make sure the machine has at least 64 logical cores, since some experiments use 64 threads. You may download a copy of the paper [here](https://dl.acm.org/doi/10.1145/3588724?cid=99660889005).
5 | 
6 | Steps 0-B to 2 are summarized in a single script, [`artifact.sh`](artifact.sh). The expected running time of this script on a CloudLab c6420 machine is 9+ hours.
7 | 
8 | ## Step 0-A: Set up c6420 Machine
9 | 
10 | CloudLab machines, by default, only have 16 GB of storage space mounted at the root, which can be insufficient. The script below mounts `/dev/sda4` to `~/workspace`. You may skip this step if not using CloudLab c6420 instances.
11 | 
12 | ```bash
13 | DISK=/dev/sda4
14 | WORKSPACE=$HOME/workspace
15 | sudo mkfs.ext4 $DISK
16 | sudo mkdir $WORKSPACE
17 | sudo mount $DISK $WORKSPACE
18 | sudo chown -R $USER $WORKSPACE
19 | echo "$DISK $WORKSPACE ext4 defaults 0 0" | sudo tee -a /etc/fstab
20 | # this directory will be mounted automatically on every reboot
21 | ```
22 | 
23 | ## Step 0-B: Install Dependencies
24 | 
25 | Download the codebase under `~/workspace`, and `cd` into the Polaris top-level directory. Then install the software dependencies:
26 | 
27 | ```bash
28 | # assume the current working directory is `polaris/`
29 | sudo apt update
30 | sudo apt install -y numactl python3-pip
31 | pip3 install -r requirements.txt
32 | ```
33 | 
34 | ## Step 1: Run All Experiments
35 | 
36 | The experiment scripts are under `experiments/`. To run all experiments:
37 | 
38 | ```bash
39 | # run all experiments; this may take 9+ hours
40 | bash experiments/run_ycsb_latency.sh   # fig 1, 7
41 | bash experiments/run_ycsb_prio_sen.sh  # fig 2
42 | bash experiments/run_ycsb_thread.sh    # fig 3
43 | bash experiments/run_ycsb_readonly.sh  # fig 4
44 | bash experiments/run_ycsb_zipf.sh      # fig 5, 6
45 | bash experiments/run_tpcc_thread.sh    # fig 8, 9
46 | bash experiments/run_ycsb_aria.sh      # fig 10, 11
47 | ```
48 | 
49 | Alternatively, simply run `bash experiments/run_all.sh`, which is a shortcut for the lines above.
50 | 
51 | The experiment results are saved under `results/`. You should find 7 subdirectories: `ycsb_latency`, `ycsb_prio_sen`, `ycsb_thread`, `ycsb_readonly`, `ycsb_zipf`, `tpcc_thread`, and `ycsb_aria_batch`.
52 | 
53 | ## Step 2: Process Experiment Data
54 | 
55 | Then parse and plot the experiment results:
56 | 
57 | ```bash
58 | # parse all experiments
59 | python3 parse.py ycsb_latency ycsb_prio_sen ycsb_thread ycsb_readonly ycsb_zipf tpcc_thread ycsb_aria_batch
60 | 
61 | # plot figures; this may take minutes
62 | python3 plot.py
63 | ```
64 | 
65 | The plots are saved in the current working directory. You should find 11 figures (with the corresponding figure numbers in the [paper](https://doi.org/10.1145/3588724)):
66 | 
67 | - fig 1: `ycsb_latency_allcc.pdf`
68 | - fig 2: `ycsb_prio_ratio_vs_throughput.pdf`
69 | - fig 3: `ycsb_thread_vs_throughput_tail.pdf`
70 | - fig 4: `ycsb_thread_vs_throughput_tail_readonly.pdf`
71 | - fig 5: `ycsb_zipf_vs_throughput_tail.pdf`
72 | - fig 6: `ycsb_latency_udprio.pdf`
73 | - fig 7: `tpcc_thread_vs_throughput_tail_wh1.pdf`
74 | - fig 8: `tpcc_thread_vs_throughput_tail_wh64.pdf`
75 | - fig 9: `ycsb_aria_thread_vs_throughput_tail_zipf0.99.pdf`
76 | - fig 10: `ycsb_aria_thread_vs_throughput_tail_zipf0.5.pdf`
77 | 
78 | If running on a CloudLab c6420 instance, all figures should be similar; the exception is fig 10: as reported in the paper, Aria p999 tail latency tends to have (relatively) high variation due to batching.
--------------------------------------------------------------------------------
/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | cmake_minimum_required(VERSION 2.8)
2 | project(Dbx1000)
3 | 
4 | SET (CMAKE_C_COMPILER "gcc")
5 | SET (CMAKE_CXX_COMPILER "g++")
6 | SET (CMAKE_CXX_FLAGS "-std=c++17 -Wno-deprecated-declarations" CACHE INTERNAL "compiler options" FORCE)
7 | SET (CMAKE_CXX_FLAGS_DEBUG "-O0 -g" CACHE INTERNAL "compiler options" FORCE)
8 | SET (CMAKE_CXX_FLAGS_RELEASE "-O3" CACHE INTERNAL "compiler options" FORCE)
9 | 
10 | add_definitions(-DNOGRAPHITE=1)
11 | 
12 | # include header files
13 | INCLUDE_DIRECTORIES(${PROJECT_SOURCE_DIR} ${PROJECT_SOURCE_DIR}/benchmarks/ ${PROJECT_SOURCE_DIR}/concurrency_control/ ${PROJECT_SOURCE_DIR}/storage/ ${PROJECT_SOURCE_DIR}/system/)
14 | # lib files
15 | #LINK_DIRECTORIES(${PROJECT_SOURCE_DIR}/libs)
16 | file(GLOB_RECURSE SRC_FILES benchmarks/*.cpp concurrency_control/*.cpp storage/*.cpp system/*.cpp config.cpp)
17 | add_executable(rundb ${SRC_FILES})
18 | target_link_libraries(rundb libpthread.so libjemalloc.so)
19
| 
--------------------------------------------------------------------------------
/DBx1000_README:
--------------------------------------------------------------------------------
1 | 
2 | DBMS BENCHMARK
3 | ------------------------------
4 | 
5 | == General Features ==
6 | 
7 | dbms is an OLTP database benchmark with the following features.
8 | 
9 | 1. Seven different concurrency control algorithms are supported.
10 | DL_DETECT[1] : deadlock detection
11 | NO_WAIT[1] : no-wait two phase locking
12 | WAIT_DIE[1] : wait-and-die two phase locking
13 | TIMESTAMP[1] : basic T/O
14 | MVCC[1] : multi-version T/O
15 | HSTORE[3] : H-STORE
16 | OCC[2] : optimistic concurrency control
17 | 
18 | [1] Philip Bernstein, Nathan Goodman, "Concurrency Control in Distributed Database Systems", Computing Surveys, June 1981
19 | [2] H.T. Kung, John Robinson, "On Optimistic Methods for Concurrency Control", Transactions on Database Systems, June 1981
20 | [3] R. Kallman et al, "H-Store: a High-Performance, Distributed Main Memory Transaction Processing System", VLDB 2008
21 | 
22 | 2. Two benchmarks are supported.
23 | 2.1 YCSB[4]
24 | 2.2 TPCC[5]
25 | Only Payment and New Order transactions are modeled.
26 | 
27 | [4] B. Cooper et al, "Benchmarking Cloud Serving Systems with YCSB", SoCC 2010
28 | [5] http://www.tpc.org/tpcc/
29 | 
30 | == Config File ==
31 | 
32 | The dbms benchmark has the following parameters in the config file. Parameters with a * sign should not be changed.
33 | 
34 | CORE_CNT : number of cores modeled in the system.
35 | PART_CNT : number of logical partitions in the system
36 | THREAD_CNT : number of threads running at the same time
37 | PAGE_SIZE : memory page size
38 | CL_SIZE : cache line size
39 | WARMUP : number of transactions to run for warmup
40 | 
41 | WORKLOAD : workload supported (TPCC or YCSB)
42 | 
43 | THREAD_ALLOC : per thread allocator.
44 | * MEM_PAD : enable memory padding to avoid false sharing.
45 | MEM_ALLIGN : allocated blocks are aligned to MEM_ALLIGN bytes
46 | 
47 | PRT_LAT_DISTR : print out latency distribution of transactions
48 | 
49 | CC_ALG : concurrency control algorithm
50 | * ROLL_BACK : roll back the modifications if a transaction aborts.
51 | 
52 | ENABLE_LATCH : enable latching in btree index
53 | * CENTRAL_INDEX : centralized index structure
54 | * CENTRAL_MANAGER : centralized lock/timestamp manager
55 | INDEX_STRCT : data structure for index.
56 | BTREE_ORDER : fanout of each B-tree node
57 | 
58 | DL_TIMEOUT_LOOP : the max waiting time in DL_DETECT. after timeout, deadlock will be detected.
59 | TS_TWR : enable Thomas Write Rule (TWR) in TIMESTAMP
60 | HIS_RECYCLE_LEN : in MVCC, history will be recycled if it is too long.
61 | MAX_WRITE_SET : the max size of a write set in OCC.
62 | 
63 | MAX_ROW_PER_TXN : max number of rows touched per transaction.
64 | QUERY_INTVL : the rate at which database queries arrive
65 | MAX_TXN_PER_PART : maximum transactions to run per partition.
66 | 
67 | // for YCSB Benchmark
68 | SYNTH_TABLE_SIZE : table size
69 | ZIPF_THETA : theta in zipfian distribution (rows accessed follow a zipfian distribution)
70 | READ_PERC :
71 | WRITE_PERC :
72 | SCAN_PERC : percentage of read/write/scan queries. they should add up to 1.
73 | SCAN_LEN : number of rows touched per scan query.
74 | PART_PER_TXN : number of logical partitions to touch per transaction
75 | PERC_MULTI_PART : percentage of multi-partition transactions
76 | REQ_PER_QUERY : number of queries per transaction
77 | FIRST_PART_LOCAL : with this being true, the first touched partition is always the local partition.
78 | 
79 | // for TPCC Benchmark
80 | NUM_WH : number of warehouses being modeled.
81 | PERC_PAYMENT : percentage of payment transactions.
82 | DIST_PER_WARE : number of districts in one warehouse
83 | MAXITEMS : number of items modeled.
84 | CUST_PER_DIST : number of customers per district
85 | ORD_PER_DIST : number of orders per district
86 | FIRSTNAME_LEN : length of first name
87 | MIDDLE_LEN : length of middle name
88 | LASTNAME_LEN : length of last name
89 | 
90 | // !! centralized CC management should be ignored.
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | ISC License
2 | 
3 | Copyright (c) 2014, Xiangyao Yu
4 | 
5 | Permission to use, copy, modify, and/or distribute this software for any
6 | purpose with or without fee is hereby granted, provided that the above
7 | copyright notice and this permission notice appear in all copies.
8 | 
9 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
10 | REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
11 | AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
12 | INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
13 | LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
14 | OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
15 | PERFORMANCE OF THIS SOFTWARE.
16 | 
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | CC=g++
2 | CFLAGS=-Wall -g -std=c++17 -fno-omit-frame-pointer
3 | 
4 | .SUFFIXES: .o .cpp .h
5 | 
6 | SRC_DIRS = ./ ./benchmarks/ ./concurrency_control/ ./storage/ ./system/
7 | INCLUDE = -I. -I./benchmarks -I./concurrency_control -I./storage -I./system
8 | 
9 | CFLAGS += $(INCLUDE) -D NOGRAPHITE=1 -O3 -Wno-unused-variable #-Werror
10 | LDFLAGS = -Wall -L. -L./libs -pthread -g -lrt -std=c++17 -O3 -ljemalloc
11 | LDFLAGS += $(CFLAGS)
12 | 
13 | CPPS = $(foreach dir, $(SRC_DIRS), $(wildcard $(dir)*.cpp))
14 | OBJS = $(CPPS:.cpp=.o)
15 | DEPS = $(CPPS:.cpp=.d)
16 | 
17 | all:rundb
18 | 
19 | rundb : $(OBJS)
20 | 	$(CC) -no-pie -o $@ $^ $(LDFLAGS)
21 | 	#$(CC) -o $@ $^ $(LDFLAGS)
22 | 
23 | -include $(OBJS:%.o=%.d)
24 | 
25 | %.d: %.cpp
26 | 	$(CC) -MM -MT $*.o -MF $@ $(CFLAGS) $<
27 | 
28 | %.o: %.cpp
29 | 	$(CC) -c $(CFLAGS) -o $@ $<
30 | 
31 | .PHONY: clean
32 | clean:
33 | 	rm -f rundb $(OBJS) $(DEPS)
34 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DBx1000-Polaris
2 | 
3 | Polaris is an optimistic concurrency control algorithm with priority support.
4 | 
5 | - [Polaris: Enabling Transaction Priority in Optimistic Concurrency Control](https://dl.acm.org/doi/10.1145/3588724?cid=99660889005). Chenhao Ye, Wuh-Chwen Hwang, Keren Chen, Xiangyao Yu.
6 | 
7 | This repository implements Polaris on top of [DBx1000](https://github.com/yxymit/DBx1000) and [DBx1000-Bamboo](https://github.com/ScarletGuo/Bamboo-Public).
8 | 
9 | - DBx1000: [Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores](http://www.vldb.org/pvldb/vol8/p209-yu.pdf). Xiangyao Yu, George Bezerra, Andrew Pavlo, Srinivas Devadas, Michael Stonebraker.
10 | - Bamboo: [Releasing Locks As Early As You Can: Reducing Contention of Hotspots by Violating Two-Phase Locking](https://doi.org/10.1145/3448016.3457294). Zhihan Guo, Kan Wu, Cong Yan, Xiangyao Yu.
11 | 
12 | These repositories implement other concurrency control algorithms (e.g., No-Wait, Wait-Die, Wound-Wait, Silo) that serve as baselines for the Polaris evaluation.
13 | 
14 | **NOTE: This README describes the general usage of this repository; to reproduce all experiments in the [paper](https://dl.acm.org/doi/10.1145/3588724?cid=99660889005), please refer to [`ARTIFACT.md`](ARTIFACT.md).**
15 | 
16 | ## Quick Start: Build & Test
17 | 
18 | To test the database:
19 | 
20 | ```shell
21 | python3 test.py experiments/default.json
22 | ```
23 | 
24 | The command above will compile the code with the configuration specified in `experiments/default.json` and run the experiments. `test.py` will read the configuration and the existing `config-std.h` to generate a new `config.h`.
25 | 
26 | You can find other configuration files (`*.json`) under `experiments/`.
27 | 
28 | ## Advanced: Configure & Run
29 | 
30 | The parameters are set by `config-std.h` and the configuration file. You can override parameters by specifying them on the command line.
31 | 
32 | ```shell
33 | python3 test.py experiments/default.json COMPILE_ONLY=true
34 | ```
35 | 
36 | This command only compiles the code and does not run the experiment.
37 | 
38 | Below are parameters that affect `test.py` behavior:
39 | 
40 | - `UNSET_NUMA`: If set to `false`, memory is allocated interleaved across NUMA nodes. Default is `false`.
41 | - `COMPILE_ONLY`: Only compile the code; do not run the experiments. Default is `false`.
42 | - `NDEBUG`: Disable all `assert`s. Default is `true`.
43 | 
44 | Below is a list of basic build parameters. They typically turn certain features on or off for evaluation purposes. The list is not exhaustive; you can find more in `config-std.h`.
45 | 
46 | - `CC_ALG`: Which concurrency control algorithm to use. Default is `SILO_PRIO`, which is an alias of Polaris\*.
47 | - `THREAD_CNT`: How many threads to use.
48 | - `WORKLOAD`: Which workload to run. Either `YCSB` or `TPCC`.
49 | - `ZIPF_THETA`: The Zipfian theta value in the YCSB workload. Only useful when `WORKLOAD=YCSB`.
50 | - `NUM_WH`: How many warehouses in the TPC-C workload. Only useful when `WORKLOAD=TPCC`.
51 | - `DUMP_LATENCY`: Whether to dump the latency of all transactions to a file. Useful for plotting latency distributions.
52 | - `DUMP_LATENCY_FILENAME`: If `DUMP_LATENCY=true`, the filename of the dump.
53 | 
54 | 
55 | Below is another list of build parameters introduced for Polaris:
56 | 
57 | - `SILO_PRIO_NO_RESERVE_LOWEST_PRIO`: Whether to turn on the lowest-priority optimization for Polaris. Default is `true`, and it should be left on (unless you want to benchmark how much this optimization gains).
58 | - `SILO_PRIO_FIXED_PRIO`: Whether to fix the priority of each transaction. If `false`, Polaris assigns priority based on its own policy.
59 | - `SILO_PRIO_ABORT_THRESHOLD_BEFORE_INC_PRIO`: Do not increment the priority until the transaction's abort counter reaches this threshold.
60 | - `SILO_PRIO_INC_PRIO_AFTER_NUM_ABORT`: After reaching the threshold, increment the priority by one for every specified number of aborts.
61 | - `HIGH_PRIO_RATIO`: The ratio of transactions that start with high (i.e., nonzero) priority. Useful to simulate the case of user-specified priority.
62 | 
63 | There are other handy tools included in this repository. `experiments/*.sh` are scripts to reproduce the experiments described in our paper. `parse.py` processes the experiment results into CSV files, and `plot.py` can visualize them.
64 | 
65 | \* **Fun fact**: Polaris is implemented based on Silo but with priority support, so it was previously termed `SILO_PRIO`. The name `POLARIS` came from a letter rearrangement of `SILO_PRIO` with an additional `A`.
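
As the README notes, `test.py` reads the JSON configuration (plus any command-line overrides) together with `config-std.h` to generate a new `config.h`. The merge step can be sketched as below. This is an illustrative re-implementation only, not the actual `test.py` logic; the function name `gen_config` and the inline sample header are made up for the example:

```python
import re

def gen_config(std_header: str, overrides: dict) -> str:
    """Return the header text with each `#define KEY VALUE` line rewritten
    for every KEY present in `overrides` (JSON config or command line)."""
    out = []
    for line in std_header.splitlines():
        m = re.match(r"(\s*#define\s+)(\w+)(\s+)\S.*", line)
        if m and m.group(2) in overrides:
            # keep the original spacing, swap in the overriding value
            line = f"{m.group(1)}{m.group(2)}{m.group(3)}{overrides[m.group(2)]}"
        out.append(line)
    return "\n".join(out)

# tiny stand-in for config-std.h
std = "#define THREAD_CNT 4\n#define CC_ALG SILO_PRIO\n#define ZIPF_THETA 0.9"
print(gen_config(std, {"THREAD_CNT": 16, "ZIPF_THETA": 0.99}))
```

Defines not mentioned in the overrides (here `CC_ALG`) pass through unchanged, which mirrors the described behavior of falling back to the defaults in `config-std.h`.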
66 | 
--------------------------------------------------------------------------------
/artifact.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # assume the current working directory is `polaris/`
3 | sudo apt update
4 | sudo apt install -y numactl python3-pip
5 | pip3 install -r requirements.txt
6 | 
7 | bash experiments/run_all.sh
8 | python3 parse.py ycsb_latency ycsb_prio_sen ycsb_thread ycsb_readonly ycsb_zipf tpcc_thread ycsb_aria_batch
9 | python3 plot.py
10 | 
--------------------------------------------------------------------------------
/benchmarks/TEST_schema.txt:
--------------------------------------------------------------------------------
1 | //size, type, name
2 | TABLE=MAIN_TABLE
3 | 4,int,F0
4 | 8,double,F1
5 | 8,uint64,F2
6 | 100,string,F3
7 | 
8 | INDEX=MAIN_INDEX
9 | MAIN_TABLE,0
10 | 
--------------------------------------------------------------------------------
/benchmarks/TPCC_full_schema.txt:
--------------------------------------------------------------------------------
1 | //size,type,name
2 | TABLE=WAREHOUSE
3 | 8,int64_t,W_ID
4 | 10,string,W_NAME
5 | 20,string,W_STREET_1
6 | 20,string,W_STREET_2
7 | 20,string,W_CITY
8 | 2,string,W_STATE
9 | 9,string,W_ZIP
10 | 8,double,W_TAX
11 | 8,double,W_YTD
12 | 
13 | TABLE=DISTRICT
14 | 8,int64_t,D_ID
15 | 8,int64_t,D_W_ID
16 | 10,string,D_NAME
17 | 20,string,D_STREET_1
18 | 20,string,D_STREET_2
19 | 20,string,D_CITY
20 | 2,string,D_STATE
21 | 9,string,D_ZIP
22 | 8,double,D_TAX
23 | 8,double,D_YTD
24 | 8,int64_t,D_NEXT_O_ID
25 | 
26 | TABLE=CUSTOMER
27 | 8,int64_t,C_ID
28 | 8,int64_t,C_D_ID
29 | 8,int64_t,C_W_ID
30 | 16,string,C_FIRST
31 | 2,string,C_MIDDLE
32 | 16,string,C_LAST
33 | 20,string,C_STREET_1
34 | 20,string,C_STREET_2
35 | 20,string,C_CITY
36 | 2,string,C_STATE
37 | 9,string,C_ZIP
38 | 16,string,C_PHONE
39 | 8,int64_t,C_SINCE
40 | 2,string,C_CREDIT
41 | 8,int64_t,C_CREDIT_LIM
42 | 8,int64_t,C_DISCOUNT
43 | 8,double,C_BALANCE
44 | 8,double,C_YTD_PAYMENT
45 | 8,uint64_t,C_PAYMENT_CNT
46 | 8,uint64_t,C_DELIVERY_CNT
47 | 500,string,C_DATA
48 | 
49 | TABLE=HISTORY
50 | 8,int64_t,H_C_ID
51 | 8,int64_t,H_C_D_ID
52 | 8,int64_t,H_C_W_ID
53 | 8,int64_t,H_D_ID
54 | 8,int64_t,H_W_ID
55 | 8,int64_t,H_DATE
56 | 8,double,H_AMOUNT
57 | 24,string,H_DATA
58 | 
59 | TABLE=NEW-ORDER
60 | 8,int64_t,NO_O_ID
61 | 8,int64_t,NO_D_ID
62 | 8,int64_t,NO_W_ID
63 | 
64 | TABLE=ORDER
65 | 8,int64_t,O_ID
66 | 8,int64_t,O_C_ID
67 | 8,int64_t,O_D_ID
68 | 8,int64_t,O_W_ID
69 | 8,int64_t,O_ENTRY_D
70 | 8,int64_t,O_CARRIER_ID
71 | 8,int64_t,O_OL_CNT
72 | 8,int64_t,O_ALL_LOCAL
73 | 
74 | TABLE=ORDER-LINE
75 | 8,int64_t,OL_O_ID
76 | 8,int64_t,OL_D_ID
77 | 8,int64_t,OL_W_ID
78 | 8,int64_t,OL_NUMBER
79 | 8,int64_t,OL_I_ID
80 | 8,int64_t,OL_SUPPLY_W_ID
81 | 8,int64_t,OL_DELIVERY_D
82 | 8,int64_t,OL_QUANTITY
83 | 8,double,OL_AMOUNT
84 | 8,int64_t,OL_DIST_INFO
85 | 
86 | TABLE=ITEM
87 | 8,int64_t,I_ID
88 | 8,int64_t,I_IM_ID
89 | 24,string,I_NAME
90 | 8,int64_t,I_PRICE
91 | 50,string,I_DATA
92 | 
93 | TABLE=STOCK
94 | 8,int64_t,S_I_ID
95 | 8,int64_t,S_W_ID
96 | 8,int64_t,S_QUANTITY
97 | 24,string,S_DIST_01
98 | 24,string,S_DIST_02
99 | 24,string,S_DIST_03
100 | 24,string,S_DIST_04
101 | 24,string,S_DIST_05
102 | 24,string,S_DIST_06
103 | 24,string,S_DIST_07
104 | 24,string,S_DIST_08
105 | 24,string,S_DIST_09
106 | 24,string,S_DIST_10
107 | 8,int64_t,S_YTD
108 | 8,int64_t,S_ORDER_CNT
109 | 8,int64_t,S_REMOTE_CNT
110 | 50,string,S_DATA
111 | 
112 | INDEX=ITEM_IDX
113 | ITEM,400000
114 | 
115 | INDEX=WAREHOUSE_IDX
116 | WAREHOUSE,100
117 | 
118 | INDEX=DISTRICT_IDX
119 | DISTRICT,1000
120 | 
121 | INDEX=CUSTOMER_ID_IDX
122 | CUSTOMER,120000
123 | 
124 | INDEX=CUSTOMER_LAST_IDX
125 | CUSTOMER,120000
126 | 
127 | INDEX=STOCK_IDX
128 | STOCK,400000
--------------------------------------------------------------------------------
/benchmarks/TPCC_short_schema.txt:
--------------------------------------------------------------------------------
1 | //size,type,name
2 | TABLE=WAREHOUSE
3 | 8,int64_t,W_ID
4 | 10,string,W_NAME
5 | 20,string,W_STREET_1
6 | 20,string,W_STREET_2
7 | 20,string,W_CITY
8 | 2,string,W_STATE
9 | 9,string,W_ZIP
10 | 8,double,W_TAX
11 | 8,double,W_YTD
12 | 
13 | TABLE=DISTRICT
14 | 8,int64_t,D_ID
15 | 8,int64_t,D_W_ID
16 | 10,string,D_NAME
17 | 20,string,D_STREET_1
18 | 20,string,D_STREET_2
19 | 20,string,D_CITY
20 | 2,string,D_STATE
21 | 9,string,D_ZIP
22 | 8,double,D_TAX
23 | 8,double,D_YTD
24 | 8,int64_t,D_NEXT_O_ID
25 | 
26 | TABLE=CUSTOMER
27 | 8,int64_t,C_ID
28 | 8,int64_t,C_D_ID
29 | 8,int64_t,C_W_ID
30 | 2,string,C_MIDDLE
31 | 16,string,C_LAST
32 | 2,string,C_STATE
33 | 2,string,C_CREDIT
34 | 8,int64_t,C_DISCOUNT
35 | 8,double,C_BALANCE
36 | 8,double,C_YTD_PAYMENT
37 | 8,uint64_t,C_PAYMENT_CNT
38 | 
39 | TABLE=HISTORY
40 | 8,int64_t,H_C_ID
41 | 8,int64_t,H_C_D_ID
42 | 8,int64_t,H_C_W_ID
43 | 8,int64_t,H_D_ID
44 | 8,int64_t,H_W_ID
45 | 8,int64_t,H_DATE
46 | 8,double,H_AMOUNT
47 | 
48 | TABLE=NEW-ORDER
49 | 8,int64_t,NO_O_ID
50 | 8,int64_t,NO_D_ID
51 | 8,int64_t,NO_W_ID
52 | 
53 | TABLE=ORDER
54 | 8,int64_t,O_ID
55 | 8,int64_t,O_C_ID
56 | 8,int64_t,O_D_ID
57 | 8,int64_t,O_W_ID
58 | 8,int64_t,O_ENTRY_D
59 | 8,int64_t,O_CARRIER_ID
60 | 8,int64_t,O_OL_CNT
61 | 8,int64_t,O_ALL_LOCAL
62 | 
63 | TABLE=ORDER-LINE
64 | 8,int64_t,OL_O_ID
65 | 8,int64_t,OL_D_ID
66 | 8,int64_t,OL_W_ID
67 | 8,int64_t,OL_NUMBER
68 | 8,int64_t,OL_I_ID
69 | 
70 | TABLE=ITEM
71 | 8,int64_t,I_ID
72 | 8,int64_t,I_IM_ID
73 | 24,string,I_NAME
74 | 8,int64_t,I_PRICE
75 | 50,string,I_DATA
76 | 
77 | TABLE=STOCK
78 | 8,int64_t,S_I_ID
79 | 8,int64_t,S_W_ID
80 | 8,int64_t,S_QUANTITY
81 | 8,int64_t,S_REMOTE_CNT
82 | 
83 | INDEX=ITEM_IDX
84 | ITEM,10000
85 | 
86 | INDEX=WAREHOUSE_IDX
87 | WAREHOUSE,1
88 | 
89 | INDEX=DISTRICT_IDX
90 | DISTRICT,10
91 | 
92 | INDEX=CUSTOMER_ID_IDX
93 | CUSTOMER,40000
94 | 
95 | INDEX=CUSTOMER_LAST_IDX
96 | CUSTOMER,40000
97 | 
98 | INDEX=STOCK_IDX
99 | STOCK,10000
100 | 
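
The schema files above share one plain-text format: `size,type,name` rows grouped under a `TABLE=<name>` header, plus `INDEX=<name>` sections whose rows list the indexed table and an expected size, with `//` starting a comment. A rough sketch of how such a file could be parsed is below; this is for illustration only and is not the repository's actual `workload::init_schema` loader:

```python
def parse_schema(text: str):
    """Parse the DBx1000 schema format: `size,type,name` rows grouped
    under TABLE=<name> headers, and `table,size` rows under INDEX=<name>."""
    tables, indexes, current = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and comments
        if line.startswith("TABLE="):
            current = tables.setdefault(line[len("TABLE="):], [])
        elif line.startswith("INDEX="):
            current = indexes.setdefault(line[len("INDEX="):], [])
        else:
            current.append(tuple(line.split(",")))
    return tables, indexes

# excerpt of TPCC_short_schema.txt
text = """//size,type,name
TABLE=WAREHOUSE
8,int64_t,W_ID
10,string,W_NAME

INDEX=WAREHOUSE_IDX
WAREHOUSE,1
"""
tables, indexes = parse_schema(text)
print(tables["WAREHOUSE"])  # [('8', 'int64_t', 'W_ID'), ('10', 'string', 'W_NAME')]
```

The short and full TPC-C schemas differ only in which columns each table declares, so one parser handles both.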
--------------------------------------------------------------------------------
/benchmarks/YCSB_schema.txt:
--------------------------------------------------------------------------------
1 | //size, type, name
2 | TABLE=MAIN_TABLE
3 | 10,string,F0
4 | 10,string,F1
5 | 10,string,F2
6 | 10,string,F3
7 | 10,string,F4
8 | 10,string,F5
9 | 10,string,F6
10 | 10,string,F7
11 | 10,string,F8
12 | 10,string,F9
13 | 
14 | INDEX=MAIN_INDEX
15 | MAIN_TABLE,0
16 | 
--------------------------------------------------------------------------------
/benchmarks/test.h:
--------------------------------------------------------------------------------
1 | #ifndef _TEST_H_
2 | #define _TEST_H_
3 | 
4 | #include "global.h"
5 | #include "txn.h"
6 | #include "wl.h"
7 | 
8 | class TestWorkload : public workload
9 | {
10 | public:
11 | 	RC init();
12 | 	RC init_table();
13 | 	RC init_schema(const char * schema_file);
14 | 	RC get_txn_man(txn_man *& txn_manager, thread_t * h_thd);
15 | 	void summarize();
16 | 	void tick() { time = get_sys_clock(); };
17 | 	INDEX * the_index;
18 | 	table_t * the_table;
19 | #if CC_ALG == IC3
20 | 	SC_PIECE * get_cedges(TPCCTxnType type, int idx) {return NULL;};
21 | #endif
22 | private:
23 | 	uint64_t time;
24 | };
25 | 
26 | class TestTxnMan : public txn_man
27 | {
28 | public:
29 | 	void init(thread_t * h_thd, workload * h_wl, uint64_t part_id);
30 | 	RC run_txn(int type, int access_num);
31 | 	RC exec_txn(base_query * m_query) { assert(false); return Abort;};
32 | private:
33 | 	RC testReadwrite(int access_num);
34 | 	RC testConflict(int access_num);
35 | 
36 | 	TestWorkload * _wl;
37 | };
38 | 
39 | #endif
40 | 
--------------------------------------------------------------------------------
/benchmarks/test_txn.cpp:
--------------------------------------------------------------------------------
1 | #include "test.h"
2 | #include "row.h"
3 | 
4 | void TestTxnMan::init(thread_t * h_thd, workload * h_wl, uint64_t thd_id) {
5 | 	txn_man::init(h_thd, h_wl, thd_id);
6 | 	_wl = (TestWorkload *) h_wl;
7 | }
8 | 
9 | RC TestTxnMan::run_txn(int type, int access_num) {
10 | 	switch(type) {
11 | 	case READ_WRITE :
12 | 		return testReadwrite(access_num);
13 | 	case CONFLICT:
14 | 		return testConflict(access_num);
15 | 	default:
16 | 		assert(false);
17 | 		return Abort;
18 | 	}
19 | }
20 | 
21 | RC TestTxnMan::testReadwrite(int access_num) {
22 | 	RC rc = RCOK;
23 | 	itemid_t * m_item;
24 | 
25 | 	m_item = index_read(_wl->the_index, 0, 0);
26 | 	row_t * row = ((row_t *)m_item->location);
27 | 	row_t * row_local = get_row(row, WR);
28 | 	if (access_num == 0) {
29 | 		char str[] = "hello";
30 | 		row_local->set_value(0, 1234);
31 | 		row_local->set_value(1, 1234.5);
32 | 		row_local->set_value(2, 8589934592UL);
33 | 		row_local->set_value(3, str);
34 | 	} else {
35 | 		int v1;
36 | 		double v2;
37 | 		uint64_t v3;
38 | 
39 | 		row_local->get_value(0, v1);
40 | 		row_local->get_value(1, v2);
41 | 		row_local->get_value(2, v3);
42 | #ifdef NDEBUG
43 | 		row_local->get_value(3);
44 | #else
45 | 		char * v4;
46 | 		v4 = row_local->get_value(3);
47 | #endif
48 | 		assert(v1 == 1234);
49 | 		assert(v2 == 1234.5);
50 | 		assert(v3 == 8589934592UL);
51 | 		assert(strcmp(v4, "hello") == 0);
52 | 	}
53 | 	rc = finish(rc);
54 | 	if (access_num == 0)
55 | 		return RCOK;
56 | 	else
57 | 		return FINISH;
58 | }
59 | 
60 | RC
61 | TestTxnMan::testConflict(int access_num)
62 | {
63 | 	RC rc = RCOK;
64 | 	itemid_t * m_item;
65 | 
66 | 	idx_key_t key;
67 | 	for (key = 0; key < 1; key ++) {
68 | 		m_item = index_read(_wl->the_index, key, 0);
69 | 		row_t * row = ((row_t *)m_item->location);
70 | 		row_t * row_local;
71 | 		row_local = get_row(row, WR);
72 | 		if (row_local) {
73 | 			char str[] = "hello";
74 | 			row_local->set_value(0, 1234);
75 | 			row_local->set_value(1, 1234.5);
76 | 			row_local->set_value(2, 8589934592UL);
77 | 			row_local->set_value(3, str);
78 | 			sleep(1);
79 | 		} else {
80 | 			rc = Abort;
81 | 			break;
82 | 		}
83 | 	}
84 | 	rc = finish(rc);
85 | 	return rc;
86 | }
87 | 
--------------------------------------------------------------------------------
/benchmarks/test_wl.cpp:
--------------------------------------------------------------------------------
1 | #include "test.h"
2 | #include "table.h"
3 | #include "row.h"
4 | #include "mem_alloc.h"
5 | #include "index_hash.h"
6 | #include "index_btree.h"
7 | #include "thread.h"
8 | 
9 | RC TestWorkload::init() {
10 | 	workload::init();
11 | 	string path;
12 | 	path = "./benchmarks/TEST_schema.txt";
13 | 	init_schema( path.c_str() );
14 | 
15 | 	init_table();
16 | 	return RCOK;
17 | }
18 | 
19 | RC TestWorkload::init_schema(const char * schema_file) {
20 | 	workload::init_schema(schema_file);
21 | 	the_table = tables["MAIN_TABLE"];
22 | 	the_index = indexes["MAIN_INDEX"];
23 | 	return RCOK;
24 | }
25 | 
26 | RC TestWorkload::init_table() {
27 | 	RC rc = RCOK;
28 | 	for (int rid = 0; rid < 10; rid ++) {
29 | 		row_t * new_row = NULL;
30 | 		uint64_t row_id;
31 | 		int part_id = 0;
32 | 		rc = the_table->get_new_row(new_row, part_id, row_id);
33 | 		assert(rc == RCOK);
34 | 		uint64_t primary_key = rid;
35 | 		new_row->set_primary_key(primary_key);
36 | 		new_row->set_value(0, rid);
37 | 		new_row->set_value(1, 0);
38 | 		new_row->set_value(2, 0);
39 | 		itemid_t * m_item = (itemid_t *) mem_allocator.alloc( sizeof(itemid_t), part_id );
40 | 		assert(m_item != NULL);
41 | 		m_item->type = DT_row;
42 | 		m_item->location = new_row;
43 | 		m_item->valid = true;
44 | 		uint64_t idx_key = primary_key;
45 | 		rc = the_index->index_insert(idx_key, m_item, 0);
46 | 		assert(rc == RCOK);
47 | 	}
48 | 	return rc;
49 | }
50 | 
51 | RC TestWorkload::get_txn_man(txn_man *& txn_manager, thread_t * h_thd) {
52 | 	txn_manager = (TestTxnMan *)
53 | 		mem_allocator.alloc( sizeof(TestTxnMan), h_thd->get_thd_id() );
54 | 	new(txn_manager) TestTxnMan();
55 | 	txn_manager->init(h_thd, this, h_thd->get_thd_id());
56 | 	return RCOK;
57 | }
58 | 
59 | void TestWorkload::summarize() {
60 | 	uint64_t curr_time = get_sys_clock();
61 | 	if (g_test_case == CONFLICT) {
62 | 		assert(curr_time - time > g_thread_cnt * 1e9);
63 | 		int total_wait_cnt = 0;
64 | 		for (UInt32 tid = 0; tid < g_thread_cnt; tid ++) {
65 | 			total_wait_cnt += stats._stats[tid]->wait_cnt;
66 | 		}
67 | 		printf("CONFLICT TEST. PASSED.\n");
68 | 	}
69 | }
70 | 
--------------------------------------------------------------------------------
/benchmarks/tpcc.h:
--------------------------------------------------------------------------------
1 | #ifndef _TPCC_H_
2 | #define _TPCC_H_
3 | 
4 | #include "wl.h"
5 | #include "txn.h"
6 | 
7 | class table_t;
8 | class INDEX;
9 | class tpcc_query;
10 | 
11 | #define IC3_TPCC_NEW_ORDER_PIECES 8
12 | #define IC3_TPCC_PAYMENT_PIECES 4
13 | #define IC3_TPCC_DELIVERY_PIECES 4
14 | 
15 | class tpcc_wl : public workload {
16 | public:
17 | 	RC init();
18 | 	RC init_table();
19 | 	RC init_schema(const char * schema_file);
20 | 	RC get_txn_man(txn_man *& txn_manager, thread_t * h_thd);
21 | 	table_t * t_warehouse;
22 | 	table_t * t_district;
23 | 	table_t * t_customer;
24 | 	table_t * t_history;
25 | 	table_t * t_neworder;
26 | 	table_t * t_order;
27 | 	table_t * t_orderline;
28 | 	table_t * t_item;
29 | 	table_t * t_stock;
30 | 
31 | 	INDEX * i_item;
32 | 	INDEX * i_warehouse;
33 | 	INDEX * i_district;
34 | 	INDEX * i_customer_id;
35 | 	INDEX * i_customer_last;
36 | 	INDEX * i_stock;
37 | 	INDEX * i_order; // key = (w_id, d_id, o_id)
38 | 	INDEX * i_orderline; // key = (w_id, d_id, o_id)
39 | 	INDEX * i_orderline_wd; // key = (w_id, d_id).
40 | 
41 | 	bool ** delivering;
42 | 	uint32_t next_tid;
43 | #if CC_ALG == IC3
44 | 	void init_scgraph();
45 | 	SC_PIECE * get_cedges(TPCCTxnType txn_type, int piece_id);
46 | 	SC_PIECE *** sc_graph;
47 | #endif
48 | private:
49 | 	uint64_t num_wh;
50 | 	void init_tab_item();
51 | 	void init_tab_wh(uint32_t wid);
52 | 	void init_tab_dist(uint64_t w_id);
53 | 	void init_tab_stock(uint64_t w_id);
54 | 	void init_tab_cust(uint64_t d_id, uint64_t w_id);
55 | 	void init_tab_hist(uint64_t c_id, uint64_t d_id, uint64_t w_id);
56 | 	void init_tab_order(uint64_t d_id, uint64_t w_id);
57 | 
58 | 	void init_permutation(uint64_t * perm_c_id, uint64_t wid);
59 | 
60 | 	static void * threadInitItem(void * This);
61 | 	static void * threadInitWh(void * This);
62 | 	static void * threadInitDist(void * This);
63 | 	static void * threadInitStock(void * This);
64 | 	static void * threadInitCust(void * This);
65 | 	static void * threadInitHist(void * This);
66 | 	static void * threadInitOrder(void * This);
67 | 
68 | 	static void * threadInitWarehouse(void * This);
69 | };
70 | 
71 | class tpcc_txn_man : public txn_man
72 | {
73 | public:
74 | 	void init(thread_t * h_thd, workload * h_wl, uint64_t part_id);
75 | 	RC exec_txn(base_query * query);
76 | private:
77 | 	tpcc_wl * _wl;
78 | 	RC exec_payment(tpcc_query * m_query);
79 | 	RC exec_new_order(tpcc_query * m_query);
80 | 	RC exec_order_status(tpcc_query * query);
81 | 	RC exec_delivery(tpcc_query * query);
82 | 	RC exec_stock_level(tpcc_query * query);
83 | 	bool has_local_row(row_t * location, access_t type, row_t * local, access_t local_type) {
84 | 		if (location == local) {
85 | 			if ((type == local_type) || (local_type == WR)) {
86 | 				return true;
87 | 			} else if (type == WR) {
88 | 				return false;
89 | 			}
90 | 		}
91 | 		return false;
92 | 	};
93 | };
94 | 
95 | #endif
96 | 
--------------------------------------------------------------------------------
/benchmarks/tpcc_const.h:
--------------------------------------------------------------------------------
1 | #if TPCC_SMALL
2 | enum {
3 | 	W_ID,
4 | 	W_NAME,
5 | 	W_STREET_1,
6 | 	W_STREET_2,
7 | 	W_CITY,
8 | 	W_STATE,
9 | 	W_ZIP,
10 | 	W_TAX,
11 | 	W_YTD
12 | };
13 | enum {
14 | 	D_ID,
15 | 	D_W_ID,
16 | 	D_NAME,
17 | 	D_STREET_1,
18 | 	D_STREET_2,
19 | 	D_CITY,
20 | 	D_STATE,
21 | 	D_ZIP,
22 | 	D_TAX,
23 | 	D_YTD,
24 | 	D_NEXT_O_ID
25 | };
26 | enum {
27 | 	C_ID,
28 | 	C_D_ID,
29 | 	C_W_ID,
30 | 	C_MIDDLE,
31 | 	C_LAST,
32 | 	C_STATE,
33 | 	C_CREDIT,
34 | 	C_DISCOUNT,
35 | 	C_BALANCE,
36 | 	C_YTD_PAYMENT,
37 | 	C_PAYMENT_CNT
38 | };
39 | enum {
40 | 	H_C_ID,
41 | 	H_C_D_ID,
42 | 	H_C_W_ID,
43 | 	H_D_ID,
44 | 	H_W_ID,
45 | 	H_DATE,
46 | 	H_AMOUNT
47 | };
48 | enum {
49 | 	NO_O_ID,
50 | 	NO_D_ID,
51 | 	NO_W_ID
52 | };
53 | enum {
54 | 	O_ID,
55 | 	O_C_ID,
56 | 	O_D_ID,
57 | 	O_W_ID,
58 | 	O_ENTRY_D,
59 | 	O_CARRIER_ID,
60 | 	O_OL_CNT,
61 | 	O_ALL_LOCAL
62 | };
63 | enum {
64 | 	OL_O_ID,
65 | 	OL_D_ID,
66 | 	OL_W_ID,
67 | 	OL_NUMBER,
68 | 	OL_I_ID
69 | };
70 | enum {
71 | 	I_ID,
72 | 	I_IM_ID,
73 | 	I_NAME,
74 | 	I_PRICE,
75 | 	I_DATA
76 | };
77 | enum {
78 | 	S_I_ID,
79 | 	S_W_ID,
80 | 	S_QUANTITY,
81 | 	S_REMOTE_CNT
82 | };
83 | #else
84 | enum {
85 | 	W_ID,
86 | 	W_NAME,
87 | 	W_STREET_1,
88 | 	W_STREET_2,
89 | 	W_CITY,
90 | 	W_STATE,
91 | 	W_ZIP,
92 | 	W_TAX,
93 | 	W_YTD
94 | };
95 | enum {
96 | 	D_ID,
97 | 	D_W_ID,
98 | 	D_NAME,
99 | 	D_STREET_1,
100 | 	D_STREET_2,
101 | 	D_CITY,
102 | 	D_STATE,
103 | 	D_ZIP,
104 | 	D_TAX,
105 | 	D_YTD,
106 | 	D_NEXT_O_ID
107 | };
108 | enum {
109 | 	C_ID,
110 | 	C_D_ID,
111 | 	C_W_ID,
112 | 	C_FIRST,
113 | 	C_MIDDLE,
114 | 	C_LAST,
115 | 	C_STREET_1,
116 | 	C_STREET_2,
117 | 	C_CITY,
118 | 	C_STATE,
119 | 	C_ZIP,
120 | 	C_PHONE,
121 | 	C_SINCE,
122 | 	C_CREDIT,
123 | 	C_CREDIT_LIM,
124 | 	C_DISCOUNT,
125 | 	C_BALANCE,
126 | 	C_YTD_PAYMENT,
127 | 	C_PAYMENT_CNT,
128 | 	C_DELIVERY_CNT,
129 | 	C_DATA
130 | };
131 | enum {
132 | 	H_C_ID,
133 | 	H_C_D_ID,
134 | 	H_C_W_ID,
135 | 	H_D_ID,
136 | 	H_W_ID,
137 | 	H_DATE,
138 | 	H_AMOUNT,
139 | 	H_DATA
140 | };
141 | enum {
142 | 	NO_O_ID,
143 | 	NO_D_ID,
144 | 	NO_W_ID
145 | };
146 | enum {
147 | 	O_ID,
148 | 
O_C_ID, 149 | O_D_ID, 150 | O_W_ID, 151 | O_ENTRY_D, 152 | O_CARRIER_ID, 153 | O_OL_CNT, 154 | O_ALL_LOCAL 155 | }; 156 | enum { 157 | OL_O_ID, 158 | OL_D_ID, 159 | OL_W_ID, 160 | OL_NUMBER, 161 | OL_I_ID, 162 | OL_SUPPLY_W_ID, 163 | OL_DELIVERY_D, 164 | OL_QUANTITY, 165 | OL_AMOUNT, 166 | OL_DIST_INFO 167 | }; 168 | enum { 169 | I_ID, 170 | I_IM_ID, 171 | I_NAME, 172 | I_PRICE, 173 | I_DATA 174 | }; 175 | enum { 176 | S_I_ID, 177 | S_W_ID, 178 | S_QUANTITY, 179 | S_DIST_01, 180 | S_DIST_02, 181 | S_DIST_03, 182 | S_DIST_04, 183 | S_DIST_05, 184 | S_DIST_06, 185 | S_DIST_07, 186 | S_DIST_08, 187 | S_DIST_09, 188 | S_DIST_10, 189 | S_YTD, 190 | S_ORDER_CNT, 191 | S_REMOTE_CNT, 192 | S_DATA 193 | }; 194 | #endif 195 | 196 | -------------------------------------------------------------------------------- /benchmarks/tpcc_helper.cpp: -------------------------------------------------------------------------------- 1 | #include "tpcc_helper.h" 2 | 3 | drand48_data ** tpcc_buffer; 4 | 5 | uint64_t distKey(uint64_t d_id, uint64_t d_w_id) { 6 | return d_w_id * DIST_PER_WARE + d_id; 7 | } 8 | 9 | uint64_t custKey(uint64_t c_id, uint64_t c_d_id, uint64_t c_w_id) { 10 | return (distKey(c_d_id, c_w_id) * g_cust_per_dist + c_id); 11 | } 12 | 13 | uint64_t orderlineKey(uint64_t w_id, uint64_t d_id, uint64_t o_id) { 14 | return distKey(d_id, w_id) * g_cust_per_dist + o_id; 15 | } 16 | 17 | uint64_t orderPrimaryKey(uint64_t w_id, uint64_t d_id, uint64_t o_id) { 18 | return orderlineKey(w_id, d_id, o_id); 19 | } 20 | 21 | uint64_t custNPKey(char * c_last, uint64_t c_d_id, uint64_t c_w_id) { 22 | uint64_t key = 0; 23 | char offset = 'A'; 24 | for (uint32_t i = 0; i < strlen(c_last); i++) 25 | key = (key << 2) + (c_last[i] - offset); 26 | key = key << 3; 27 | key += c_w_id * DIST_PER_WARE + c_d_id; 28 | return key; 29 | } 30 | 31 | uint64_t stockKey(uint64_t s_i_id, uint64_t s_w_id) { 32 | return s_w_id * g_max_items + s_i_id; 33 | } 34 | 35 | uint64_t Lastname(uint64_t num, char* 
name) { 36 | static const char *n[] = 37 | {"BAR", "OUGHT", "ABLE", "PRI", "PRES", 38 | "ESE", "ANTI", "CALLY", "ATION", "EING"}; 39 | strcpy(name, n[num/100]); 40 | strcat(name, n[(num/10)%10]); 41 | strcat(name, n[num%10]); 42 | return strlen(name); 43 | } 44 | 45 | uint64_t RAND(uint64_t max, uint64_t thd_id) { 46 | int64_t rint64 = 0; 47 | lrand48_r(tpcc_buffer[thd_id], &rint64); 48 | return rint64 % max; 49 | } 50 | 51 | uint64_t URand(uint64_t x, uint64_t y, uint64_t thd_id) { 52 | return x + RAND(y - x + 1, thd_id); 53 | } 54 | 55 | uint64_t NURand(uint64_t A, uint64_t x, uint64_t y, uint64_t thd_id) { 56 | static bool C_255_init = false; 57 | static bool C_1023_init = false; 58 | static bool C_8191_init = false; 59 | static uint64_t C_255, C_1023, C_8191; 60 | int C = 0; 61 | switch(A) { 62 | case 255: 63 | if(!C_255_init) { 64 | C_255 = (uint64_t) URand(0,255, thd_id); 65 | C_255_init = true; 66 | } 67 | C = C_255; 68 | break; 69 | case 1023: 70 | if(!C_1023_init) { 71 | C_1023 = (uint64_t) URand(0,1023, thd_id); 72 | C_1023_init = true; 73 | } 74 | C = C_1023; 75 | break; 76 | case 8191: 77 | if(!C_8191_init) { 78 | C_8191 = (uint64_t) URand(0,8191, thd_id); 79 | C_8191_init = true; 80 | } 81 | C = C_8191; 82 | break; 83 | default: 84 | M_ASSERT(false, "Error! 
NURand\n"); 85 | exit(-1); 86 | } 87 | return(((URand(0,A, thd_id) | URand(x,y, thd_id))+C)%(y-x+1))+x; 88 | } 89 | 90 | uint64_t MakeAlphaString(int min, int max, char* str, uint64_t thd_id) { 91 | char char_list[] = {'1','2','3','4','5','6','7','8','9','a','b','c', 92 | 'd','e','f','g','h','i','j','k','l','m','n','o', 93 | 'p','q','r','s','t','u','v','w','x','y','z','A', 94 | 'B','C','D','E','F','G','H','I','J','K','L','M', 95 | 'N','O','P','Q','R','S','T','U','V','W','X','Y','Z'}; 96 | uint64_t cnt = URand(min, max, thd_id); 97 | for (uint32_t i = 0; i < cnt; i++) 98 | str[i] = char_list[URand(0L, 60L, thd_id)]; 99 | for (int i = cnt; i < max; i++) 100 | str[i] = '\0'; 101 | 102 | return cnt; 103 | } 104 | 105 | uint64_t MakeNumberString(int min, int max, char* str, uint64_t thd_id) { 106 | 107 | uint64_t cnt = URand(min, max, thd_id); 108 | for (UInt32 i = 0; i < cnt; i++) { 109 | uint64_t r = URand(0L,9L, thd_id); 110 | str[i] = '0' + r; 111 | } 112 | return cnt; 113 | } 114 | 115 | uint64_t wh_to_part(uint64_t wid) { 116 | assert(g_part_cnt <= g_num_wh); 117 | return wid % g_part_cnt; 118 | } 119 | -------------------------------------------------------------------------------- /benchmarks/tpcc_helper.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | #include "global.h" 3 | #include "helper.h" 4 | 5 | uint64_t distKey(uint64_t d_id, uint64_t d_w_id); 6 | uint64_t custKey(uint64_t c_id, uint64_t c_d_id, uint64_t c_w_id); 7 | uint64_t orderlineKey(uint64_t w_id, uint64_t d_id, uint64_t o_id); 8 | uint64_t orderPrimaryKey(uint64_t w_id, uint64_t d_id, uint64_t o_id); 9 | // non-primary key 10 | uint64_t custNPKey(char * c_last, uint64_t c_d_id, uint64_t c_w_id); 11 | uint64_t stockKey(uint64_t s_i_id, uint64_t s_w_id); 12 | 13 | uint64_t Lastname(uint64_t num, char* name); 14 | 15 | extern drand48_data ** tpcc_buffer; 16 | // return random data from [0, max-1] 17 | uint64_t RAND(uint64_t max, uint64_t 
thd_id); 18 | // random number from [x, y] 19 | uint64_t URand(uint64_t x, uint64_t y, uint64_t thd_id); 20 | // non-uniform random number 21 | uint64_t NURand(uint64_t A, uint64_t x, uint64_t y, uint64_t thd_id); 22 | // random string with random length between min and max. 23 | uint64_t MakeAlphaString(int min, int max, char * str, uint64_t thd_id); 24 | uint64_t MakeNumberString(int min, int max, char* str, uint64_t thd_id); 25 | 26 | uint64_t wh_to_part(uint64_t wid); 27 | -------------------------------------------------------------------------------- /benchmarks/tpcc_query.cpp: -------------------------------------------------------------------------------- 1 | #include "query.h" 2 | #include "tpcc_query.h" 3 | #include "tpcc.h" 4 | #include "tpcc_helper.h" 5 | #include "mem_alloc.h" 6 | #include "wl.h" 7 | #include "table.h" 8 | 9 | void tpcc_query::init(uint64_t thd_id, workload * h_wl) { 10 | // base_query init 11 | num_abort = 0; 12 | double y; 13 | drand48_r(&per_thread_rand_buf, &y); 14 | prio = y < HIGH_PRIO_RATIO ? ((SILO_PRIO_MAX_PRIO + 1) / 2) : 0; 15 | max_prio = y < HIGH_PRIO_RATIO ?
SILO_PRIO_MAX_PRIO : LOW_PRIO_BOUND; 16 | // tpcc query 17 | double x = (double)(rand() % 100) / 100.0; 18 | part_to_access = (uint64_t *) 19 | mem_allocator.alloc(sizeof(uint64_t) * g_part_cnt, thd_id); 20 | if (x < g_perc_payment) { 21 | gen_payment(thd_id); 22 | } else if (x < (g_perc_payment + g_perc_delivery)) 23 | gen_delivery(thd_id); 24 | else if (x < (g_perc_payment + g_perc_delivery + g_perc_orderstatus)) 25 | gen_order_status(thd_id); 26 | else if (x < (g_perc_payment + g_perc_delivery + g_perc_orderstatus + 27 | g_perc_stocklevel)) 28 | gen_stock_level(thd_id); 29 | else 30 | gen_new_order(thd_id); 31 | } 32 | 33 | void tpcc_query::gen_payment(uint64_t thd_id) { 34 | type = TPCC_PAYMENT; 35 | if (FIRST_PART_LOCAL) 36 | w_id = thd_id % g_num_wh + 1; 37 | else 38 | w_id = URand(1, g_num_wh, thd_id % g_num_wh); 39 | d_w_id = w_id; 40 | uint64_t part_id = wh_to_part(w_id); 41 | part_to_access[0] = part_id; 42 | part_num = 1; 43 | 44 | d_id = URand(1, DIST_PER_WARE, w_id-1); 45 | h_amount = URand(1, 5000, w_id-1); 46 | int x = URand(1, 100, w_id-1); 47 | int y = URand(1, 100, w_id-1); 48 | 49 | 50 | if(x <= 85) { 51 | // home warehouse 52 | c_d_id = d_id; 53 | c_w_id = w_id; 54 | } else { 55 | // remote warehouse 56 | c_d_id = URand(1, DIST_PER_WARE, w_id-1); 57 | if(g_num_wh > 1) { 58 | while((c_w_id = URand(1, g_num_wh, w_id-1)) == w_id) {} 59 | if (wh_to_part(w_id) != wh_to_part(c_w_id)) { 60 | part_to_access[1] = wh_to_part(c_w_id); 61 | part_num = 2; 62 | } 63 | } else 64 | c_w_id = w_id; 65 | } 66 | if(y <= 60) { 67 | // by last name 68 | by_last_name = true; 69 | Lastname(NURand(255,0,999,w_id-1),c_last); 70 | } else { 71 | // by cust id 72 | by_last_name = false; 73 | c_id = NURand(1023, 1, g_cust_per_dist,w_id-1); 74 | } 75 | } 76 | 77 | void tpcc_query::gen_new_order(uint64_t thd_id) { 78 | type = TPCC_NEW_ORDER; 79 | if (FIRST_PART_LOCAL) 80 | w_id = thd_id % g_num_wh + 1; 81 | else 82 | w_id = URand(1, g_num_wh, thd_id % g_num_wh); 83 | d_id = 
URand(1, DIST_PER_WARE, w_id-1); 84 | c_id = NURand(1023, 1, g_cust_per_dist, w_id-1); 85 | rbk = URand(1, 100, w_id-1); 86 | ol_cnt = URand(5, 15, w_id-1); 87 | o_entry_d = 2013; 88 | items = (Item_no *) _mm_malloc(sizeof(Item_no) * ol_cnt, 64); 89 | remote = false; 90 | part_to_access[0] = wh_to_part(w_id); 91 | part_num = 1; 92 | 93 | for (UInt32 oid = 0; oid < ol_cnt; oid ++) { 94 | items[oid].ol_i_id = NURand(8191, 1, g_max_items, w_id-1); 95 | #if TPCC_USER_ABORT 96 | // XXX(zhihan): 1% of the New-Order transactions are chosen at random to 97 | // simulate user data entry errors and exercise the performance of 98 | // rolling back update transactions. 99 | // If this is the last item on the order and rbk = 1 (chosen from [1, 100 | // 100]), then the item number is set to an unused value. 101 | if ((oid == ol_cnt - 1) && (rbk == 1)) { 102 | items[oid].ol_i_id = 0; 103 | } 104 | #endif 105 | UInt32 x = URand(1, 100, w_id-1); 106 | if (x > 1 || g_num_wh == 1) 107 | items[oid].ol_supply_w_id = w_id; 108 | else { 109 | while((items[oid].ol_supply_w_id = URand(1, g_num_wh, w_id-1)) == w_id) {} 110 | remote = true; 111 | } 112 | items[oid].ol_quantity = URand(1, 10, w_id-1); 113 | } 114 | // Remove duplicate items 115 | for (UInt32 i = 0; i < ol_cnt; i ++) { 116 | for (UInt32 j = 0; j < i; j++) { 117 | if (items[i].ol_i_id == items[j].ol_i_id) { 118 | for (UInt32 k = i; k < ol_cnt - 1; k++) 119 | items[k] = items[k + 1]; 120 | ol_cnt --; 121 | i--; 122 | } 123 | } 124 | } 125 | for (UInt32 i = 0; i < ol_cnt; i ++) 126 | for (UInt32 j = 0; j < i; j++) 127 | assert(items[i].ol_i_id != items[j].ol_i_id); 128 | // update part_to_access 129 | for (UInt32 i = 0; i < ol_cnt; i ++) { 130 | UInt32 j; 131 | for (j = 0; j < part_num; j++ ) 132 | if (part_to_access[j] == wh_to_part(items[i].ol_supply_w_id)) 133 | break; 134 | if (j == part_num) // not found! add to it. 
135 | part_to_access[part_num ++] = wh_to_part( items[i].ol_supply_w_id ); 136 | } 137 | } 138 | 139 | void 140 | tpcc_query::gen_order_status(uint64_t thd_id) { 141 | type = TPCC_ORDER_STATUS; 142 | if (FIRST_PART_LOCAL) 143 | w_id = thd_id % g_num_wh + 1; 144 | else 145 | w_id = URand(1, g_num_wh, thd_id % g_num_wh); 146 | d_id = URand(1, DIST_PER_WARE, w_id-1); 147 | c_w_id = w_id; 148 | c_d_id = d_id; 149 | int y = URand(1, 100, w_id-1); 150 | if(y <= 60) { 151 | // by last name 152 | by_last_name = true; 153 | Lastname(NURand(255,0,999,w_id-1),c_last); 154 | } else { 155 | // by cust id 156 | by_last_name = false; 157 | c_id = NURand(1023, 1, g_cust_per_dist, w_id-1); 158 | } 159 | } 160 | 161 | void 162 | tpcc_query::gen_delivery(uint64_t thd_id) { 163 | type = TPCC_DELIVERY; 164 | } 165 | 166 | void 167 | tpcc_query::gen_stock_level(uint64_t thd_id) { 168 | type = TPCC_STOCK_LEVEL; 169 | } 170 | -------------------------------------------------------------------------------- /benchmarks/tpcc_query.h: -------------------------------------------------------------------------------- 1 | #ifndef _TPCC_QUERY_H_ 2 | #define _TPCC_QUERY_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | #include "query.h" 7 | 8 | class workload; 9 | 10 | // items of new order transaction 11 | struct Item_no { 12 | uint64_t ol_i_id; 13 | uint64_t ol_supply_w_id; 14 | uint64_t ol_quantity; 15 | }; 16 | 17 | class tpcc_query : public base_query { 18 | public: 19 | void init(uint64_t thd_id, workload * h_wl); 20 | TPCCTxnType type; 21 | /**********************************************/ 22 | // common txn input for both payment & new-order 23 | /**********************************************/ 24 | uint64_t w_id; 25 | uint64_t d_id; 26 | uint64_t c_id; 27 | /**********************************************/ 28 | // txn input for payment 29 | /**********************************************/ 30 | uint64_t d_w_id; 31 | uint64_t c_w_id; 32 | uint64_t c_d_id; 33 | char 
c_last[LASTNAME_LEN]; 34 | double h_amount; 35 | bool by_last_name; 36 | /**********************************************/ 37 | // txn input for new-order 38 | /**********************************************/ 39 | Item_no * items; 40 | uint64_t rbk; 41 | bool remote; 42 | uint64_t ol_cnt; 43 | uint64_t o_entry_d; 44 | // Input for delivery 45 | uint64_t o_carrier_id; 46 | uint64_t ol_delivery_d; 47 | // for order-status 48 | 49 | 50 | private: 51 | // warehouse id to partition id mapping 52 | // uint64_t wh_to_part(uint64_t wid); 53 | void gen_payment(uint64_t thd_id); 54 | void gen_new_order(uint64_t thd_id); 55 | void gen_order_status(uint64_t thd_id); 56 | void gen_delivery(uint64_t thd_id); 57 | void gen_stock_level(uint64_t thd_id); 58 | }; 59 | 60 | #endif 61 | -------------------------------------------------------------------------------- /benchmarks/ycsb.h: -------------------------------------------------------------------------------- 1 | #ifndef _SYNTH_BM_H_ 2 | #define _SYNTH_BM_H_ 3 | 4 | #include "wl.h" 5 | #include "txn.h" 6 | #include "global.h" 7 | #include "helper.h" 8 | 9 | class ycsb_query; 10 | 11 | class ycsb_wl : public workload { 12 | public : 13 | RC init(); 14 | RC init_table(); 15 | RC init_schema(string schema_file); 16 | RC get_txn_man(txn_man *& txn_manager, thread_t * h_thd); 17 | int key_to_part(uint64_t key); 18 | INDEX * the_index; 19 | table_t * the_table; 20 | #if CC_ALG == IC3 21 | SC_PIECE * get_cedges(TPCCTxnType type, int idx) {return NULL;}; 22 | #endif 23 | private: 24 | void init_table_parallel(); 25 | void * init_table_slice(); 26 | static void * threadInitTable(void * This) { 27 | ((ycsb_wl *)This)->init_table_slice(); 28 | return NULL; 29 | } 30 | pthread_mutex_t insert_lock; 31 | // For parallel initialization 32 | static int next_tid; 33 | }; 34 | 35 | class ycsb_txn_man : public txn_man 36 | { 37 | public: 38 | void init(thread_t * h_thd, workload * h_wl, uint64_t part_id); 39 | RC exec_txn(base_query * query); 40 | 
private: 41 | #if CC_ALG != BAMBOO 42 | uint64_t row_cnt; 43 | #endif 44 | ycsb_wl * _wl; 45 | }; 46 | 47 | #endif 48 | -------------------------------------------------------------------------------- /benchmarks/ycsb_query.h: -------------------------------------------------------------------------------- 1 | #ifndef _YCSB_QUERY_H_ 2 | #define _YCSB_QUERY_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | #include "query.h" 7 | 8 | class workload; 9 | class Query_thd; 10 | // Each ycsb_query contains several ycsb_requests, 11 | // each of which is a RD, WR to a single table 12 | 13 | class ycsb_request { 14 | public: 15 | access_t rtype; 16 | uint64_t key; 17 | char value; 18 | // only for (qtype == SCAN) 19 | UInt32 scan_len; 20 | }; 21 | 22 | class ycsb_query : public base_query { 23 | public: 24 | void init(uint64_t thd_id, workload * h_wl) { assert(false); }; 25 | void init(uint64_t thd_id, workload * h_wl, Query_thd * query_thd); 26 | static void calculateDenom(); 27 | uint64_t get_new_row(); 28 | void gen_requests(uint64_t thd_id, workload * h_wl); 29 | 30 | uint64_t request_cnt; 31 | uint64_t local_req_per_query; 32 | bool is_long; 33 | double local_read_perc; 34 | ycsb_request * requests; 35 | 36 | private: 37 | // for Zipfian distribution 38 | static double zeta(uint64_t n, double theta); 39 | uint64_t zipf(uint64_t n, double theta); 40 | 41 | static uint64_t the_n; 42 | static double denom; 43 | double zeta_2_theta; 44 | Query_thd * _query_thd; 45 | }; 46 | 47 | #endif 48 | -------------------------------------------------------------------------------- /benchmarks/ycsb_txn.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "ycsb.h" 4 | #include "ycsb_query.h" 5 | #include "wl.h" 6 | #include "thread.h" 7 | #include "table.h" 8 | #include "row.h" 9 | #include "index_hash.h" 10 | #include "index_btree.h" 11 | #include "catalog.h" 12 | #include 
"manager.h" 13 | #include "row_lock.h" 14 | #include "row_ts.h" 15 | #include "row_mvcc.h" 16 | #include "mem_alloc.h" 17 | #include "query.h" 18 | void ycsb_txn_man::init(thread_t * h_thd, workload * h_wl, uint64_t thd_id) { 19 | txn_man::init(h_thd, h_wl, thd_id); 20 | _wl = (ycsb_wl *) h_wl; 21 | } 22 | 23 | RC ycsb_txn_man::exec_txn(base_query * query) { 24 | RC rc; 25 | ycsb_query * m_query = (ycsb_query *) query; 26 | ycsb_wl * wl = (ycsb_wl *) h_wl; 27 | itemid_t * m_item = NULL; 28 | #if CC_ALG == BAMBOO && (THREAD_CNT != 1) 29 | int access_id; 30 | retire_threshold = (uint32_t) floor(m_query->request_cnt * (1 - g_last_retire)); 31 | #else 32 | row_cnt = 0; 33 | #endif 34 | for (uint32_t rid = 0; rid < m_query->request_cnt; rid ++) { 35 | ycsb_request * req = &m_query->requests[rid]; 36 | int part_id = wl->key_to_part( req->key ); 37 | bool finish_req = false; 38 | UInt32 iteration = 0; 39 | while ( !finish_req ) { 40 | if (iteration == 0) { 41 | m_item = index_read(_wl->the_index, req->key, part_id); 42 | } 43 | #if INDEX_STRUCT == IDX_BTREE 44 | else { 45 | _wl->the_index->index_next(get_thd_id(), m_item); 46 | if (m_item == NULL) 47 | break; 48 | } 49 | #endif 50 | row_t * row = ((row_t *)m_item->location); 51 | row_t * row_local; 52 | access_t type = req->rtype; 53 | //printf("[txn-%lu] start %d requests at key %lu\n", get_txn_id(), rid, req->key); 54 | row_local = get_row(row, type); 55 | if (row_local == NULL) { 56 | rc = Abort; 57 | goto final; 58 | } 59 | #if CC_ALG == BAMBOO && (THREAD_CNT != 1) 60 | access_id = row_cnt - 1; 61 | #endif 62 | 63 | // Computation // 64 | // Only do computation when there are more than 1 requests. 
65 | if (m_query->request_cnt > 1) { 66 | if (req->rtype == RD || req->rtype == SCAN) { 67 | // for (int fid = 0; fid < schema->get_field_cnt(); fid++) { 68 | int fid = 0; 69 | char * data = row_local->get_data(); 70 | __attribute__((unused)) uint64_t fval = *(uint64_t *)(&data[fid * 10]); 71 | // } 72 | } else { 73 | assert(req->rtype == WR); 74 | // for (int fid = 0; fid < schema->get_field_cnt(); fid++) { 75 | int fid = 0; 76 | #if (CC_ALG == BAMBOO) || (CC_ALG == WOUND_WAIT) 77 | char * data = row_local->get_data(); 78 | #else 79 | char * data = row->get_data(); 80 | #endif 81 | *(uint64_t *)(&data[fid * 10]) = 0; 82 | // } 83 | } 84 | } 85 | 86 | 87 | iteration ++; 88 | if (req->rtype == RD || req->rtype == WR || iteration == req->scan_len) 89 | finish_req = true; 90 | #if (CC_ALG == BAMBOO) && (THREAD_CNT != 1) 91 | // retire write txn 92 | if (finish_req && (req->rtype == WR) && (rid <= retire_threshold)) { 93 | //printf("[txn-%lu] retire %d requests\n", get_txn_id(), rid); 94 | if (retire_row(access_id) == Abort) 95 | return Abort; 96 | } 97 | #endif 98 | } 99 | } 100 | rc = RCOK; 101 | final: 102 | return rc; 103 | } 104 | 105 | -------------------------------------------------------------------------------- /benchmarks/ycsb_wl.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include "global.h" 3 | #include "helper.h" 4 | #include "ycsb.h" 5 | #include "wl.h" 6 | #include "thread.h" 7 | #include "table.h" 8 | #include "row.h" 9 | #include "index_hash.h" 10 | #include "index_btree.h" 11 | #include "catalog.h" 12 | #include "manager.h" 13 | #include "row_lock.h" 14 | #include "row_ts.h" 15 | #include "row_mvcc.h" 16 | #include "mem_alloc.h" 17 | #include "query.h" 18 | 19 | int ycsb_wl::next_tid; 20 | 21 | RC ycsb_wl::init() { 22 | workload::init(); 23 | next_tid = 0; 24 | string path = "./benchmarks/YCSB_schema.txt"; 25 | init_schema( path ); 26 | 27 | init_table_parallel(); 28 | // init_table(); 29 | 
return RCOK; 30 | } 31 | 32 | RC ycsb_wl::init_schema(string schema_file) { 33 | workload::init_schema(schema_file); 34 | the_table = tables["MAIN_TABLE"]; 35 | the_index = indexes["MAIN_INDEX"]; 36 | return RCOK; 37 | } 38 | 39 | int 40 | ycsb_wl::key_to_part(uint64_t key) { 41 | uint64_t rows_per_part = g_synth_table_size / g_part_cnt; 42 | return key / rows_per_part; 43 | } 44 | 45 | RC ycsb_wl::init_table() { 46 | RC rc = RCOK; 47 | uint64_t total_row = 0; 48 | while (true) { 49 | for (UInt32 part_id = 0; part_id < g_part_cnt; part_id ++) { 50 | if (total_row > g_synth_table_size) 51 | goto ins_done; 52 | row_t * new_row = NULL; 53 | //zhihan 54 | uint64_t row_id = get_sys_clock(); 55 | rc = the_table->get_new_row(new_row, part_id, row_id); 56 | // TODO insertion of last row may fail after the table_size 57 | // is updated. So never access the last record in a table 58 | assert(rc == RCOK); 59 | uint64_t primary_key = total_row; 60 | new_row->set_primary_key(primary_key); 61 | new_row->set_value(0, &primary_key); 62 | Catalog * schema = the_table->get_schema(); 63 | for (UInt32 fid = 0; fid < schema->get_field_cnt(); fid ++) { 64 | int field_size = schema->get_field_size(fid); 65 | char value[field_size]; 66 | for (int i = 0; i < field_size; i++) 67 | value[i] = (char)rand() % (1<<8) ; 68 | new_row->set_value(fid, value); 69 | } 70 | itemid_t * m_item = 71 | (itemid_t *) mem_allocator.alloc( sizeof(itemid_t), part_id ); 72 | assert(m_item != NULL); 73 | m_item->type = DT_row; 74 | m_item->location = new_row; 75 | m_item->valid = true; 76 | uint64_t idx_key = primary_key; 77 | rc = the_index->index_insert(idx_key, m_item, part_id); 78 | assert(rc == RCOK); 79 | total_row ++; 80 | } 81 | } 82 | ins_done: 83 | printf("[YCSB] Table \"MAIN_TABLE\" initialized.\n"); 84 | return rc; 85 | 86 | } 87 | 88 | // init table in parallel 89 | void ycsb_wl::init_table_parallel() { 90 | enable_thread_mem_pool = true; 91 | pthread_t p_thds[g_init_parallelism - 1]; 92 | for 
(UInt32 i = 0; i < g_init_parallelism - 1; i++) 93 | pthread_create(&p_thds[i], NULL, threadInitTable, this); 94 | threadInitTable(this); 95 | 96 | for (uint32_t i = 0; i < g_init_parallelism - 1; i++) { 97 | int rc = pthread_join(p_thds[i], NULL); 98 | if (rc) { 99 | printf("ERROR; return code from pthread_join() is %d\n", rc); 100 | exit(-1); 101 | } 102 | } 103 | enable_thread_mem_pool = false; 104 | mem_allocator.unregister(); 105 | } 106 | 107 | void * ycsb_wl::init_table_slice() { 108 | UInt32 tid = ATOM_FETCH_ADD(next_tid, 1); 109 | // set cpu affinity 110 | set_affinity(tid); 111 | 112 | mem_allocator.register_thread(tid); 113 | assert(g_synth_table_size % g_init_parallelism == 0); 114 | assert(tid < g_init_parallelism); 115 | while ((UInt32)ATOM_FETCH_ADD(next_tid, 0) < g_init_parallelism) {} 116 | assert((UInt32)ATOM_FETCH_ADD(next_tid, 0) == g_init_parallelism); 117 | uint64_t slice_size = g_synth_table_size / g_init_parallelism; 118 | for (uint64_t key = slice_size * tid; 119 | key < slice_size * (tid + 1); 120 | key ++ 121 | ) { 122 | row_t * new_row = NULL; 123 | //zhihan uint64_t row_id; 124 | uint64_t row_id = get_sys_clock(); 125 | int part_id = key_to_part(key); 126 | #ifdef NDEBUG 127 | the_table->get_new_row(new_row, part_id, row_id); 128 | #else 129 | RC rc = the_table->get_new_row(new_row, part_id, row_id); 130 | #endif 131 | assert(rc == RCOK); 132 | uint64_t primary_key = key; 133 | new_row->set_primary_key(primary_key); 134 | new_row->set_value(0, &primary_key); 135 | Catalog * schema = the_table->get_schema(); 136 | 137 | for (UInt32 fid = 0; fid < schema->get_field_cnt(); fid ++) { 138 | char value[6] = "hello"; 139 | new_row->set_value(fid, value); 140 | } 141 | 142 | itemid_t * m_item = 143 | (itemid_t *) mem_allocator.alloc( sizeof(itemid_t), part_id ); 144 | assert(m_item != NULL); 145 | m_item->type = DT_row; 146 | m_item->location = new_row; 147 | m_item->valid = true; 148 | uint64_t idx_key = primary_key; 149 | #ifdef NDEBUG 150 | 
the_index->index_insert(idx_key, m_item, part_id); 151 | #else 152 | rc = the_index->index_insert(idx_key, m_item, part_id); 153 | #endif 154 | assert(rc == RCOK); 155 | } 156 | return NULL; 157 | } 158 | 159 | RC ycsb_wl::get_txn_man(txn_man *& txn_manager, thread_t * h_thd){ 160 | txn_manager = (ycsb_txn_man *) 161 | _mm_malloc( sizeof(ycsb_txn_man), 64 ); 162 | new(txn_manager) ycsb_txn_man(); 163 | txn_manager->init(h_thd, this, h_thd->get_thd_id()); 164 | return RCOK; 165 | } 166 | 167 | 168 | -------------------------------------------------------------------------------- /concurrency_control/aria.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #if CC_ALG == ARIA 4 | 5 | class base_query; 6 | 7 | // coordinate threads to agree on the batch and phase. 8 | namespace AriaCoord { 9 | 10 | void init(); 11 | void register_thread(uint64_t thd_id); 12 | bool start_exec_phase(uint64_t thd_id, uint64_t batch_id, bool sim_done); 13 | void start_commit_phase(uint64_t thd_id, uint64_t batch_id); 14 | 15 | } // namespace AriaCoord 16 | 17 | #endif 18 | -------------------------------------------------------------------------------- /concurrency_control/bamboo.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by Zhihan Guo on 8/27/20.
3 | // 4 | #include "txn.h" 5 | #include "row.h" 6 | #include "row_bamboo.h" 7 | //#include "row_bamboo_pt.h" 8 | 9 | #if CC_ALG == BAMBOO 10 | RC 11 | txn_man::retire_row(int access_cnt){ 12 | return accesses[access_cnt]->orig_row->retire_row(accesses[access_cnt]->lock_entry); 13 | } 14 | #endif 15 | 16 | void 17 | txn_man::decrement_commit_barriers() { 18 | //ATOM_SUB(*addr_barriers, 1UL << 2); 19 | ATOM_SUB(commit_barriers, 1UL << 2); 20 | } 21 | 22 | void 23 | txn_man::increment_commit_barriers() { 24 | //ATOM_ADD(*addr_barriers, 1UL << 2); 25 | ATOM_ADD(commit_barriers, 1UL << 2); 26 | } 27 | -------------------------------------------------------------------------------- /concurrency_control/dl_detect.cpp: -------------------------------------------------------------------------------- 1 | #include "dl_detect.h" 2 | #include "global.h" 3 | #include "helper.h" 4 | #include "txn.h" 5 | #include "row.h" 6 | #include "manager.h" 7 | #include "mem_alloc.h" 8 | 9 | /********************************************************/ 10 | // The current txn aborts itself only if it holds less 11 | // locks than all the other txns on the loop. 
12 | // In other words, the victim should be the txn that 13 | // performs the least amount of work 14 | /********************************************************/ 15 | void DL_detect::init() { 16 | dependency = new DepThd[g_thread_cnt]; 17 | V = g_thread_cnt; 18 | } 19 | 20 | int 21 | DL_detect::add_dep(uint64_t txnid1, uint64_t * txnids, int cnt, int num_locks) { 22 | if (g_no_dl) 23 | return 0; 24 | int thd1 = get_thdid_from_txnid(txnid1); 25 | pthread_mutex_lock( &dependency[thd1].lock ); 26 | dependency[thd1].txnid = txnid1; 27 | dependency[thd1].num_locks = num_locks; 28 | 29 | for (int i = 0; i < cnt; i++) 30 | dependency[thd1].adj.push_back(txnids[i]); 31 | 32 | pthread_mutex_unlock( &dependency[thd1].lock ); 33 | return 0; 34 | } 35 | 36 | bool 37 | DL_detect::nextNode(uint64_t txnid, DetectData * detect_data) { 38 | int thd = get_thdid_from_txnid(txnid); 39 | assert( !detect_data->visited[thd] ); 40 | detect_data->visited[thd] = true; 41 | detect_data->recStack[thd] = true; 42 | 43 | pthread_mutex_lock( &dependency[thd].lock ); 44 | 45 | int lock_num = dependency[thd].num_locks; 46 | int txnid_num = dependency[thd].adj.size(); 47 | uint64_t txnids[ txnid_num ]; 48 | int n = 0; 49 | 50 | if (dependency[thd].txnid != (SInt64)txnid) { 51 | detect_data->recStack[thd] = false; 52 | pthread_mutex_unlock( &dependency[thd].lock ); 53 | return false; 54 | } 55 | 56 | for(list<uint64_t>::iterator i = dependency[thd].adj.begin(); i != dependency[thd].adj.end(); ++i) { 57 | txnids[n++] = *i; 58 | } 59 | 60 | pthread_mutex_unlock( &dependency[thd].lock ); 61 | 62 | for (n = 0; n < txnid_num; n++) { 63 | int nextthd = get_thdid_from_txnid( txnids[n] ); 64 | 65 | // next node not visited and txnid is not stale 66 | if ( detect_data->recStack[nextthd] ) { 67 | if ((SInt32)txnids[n] == dependency[nextthd].txnid) { 68 | detect_data->loop = true; 69 | detect_data->onloop = true; 70 | detect_data->loopstart = nextthd; 71 | break; 72 | } 73 | } 74 | if ( !detect_data->visited[nextthd]
&& 75 | dependency[nextthd].txnid == (SInt64) txnids[n] && 76 | nextNode(txnids[n], detect_data)) 77 | { 78 | break; 79 | } 80 | } 81 | detect_data->recStack[thd] = false; 82 | if (detect_data->loop 83 | && detect_data->onloop 84 | && lock_num < detect_data->min_lock_num) { 85 | detect_data->min_lock_num = lock_num; 86 | detect_data->min_txnid = txnid; 87 | } 88 | if (thd == detect_data->loopstart) { 89 | detect_data->onloop = false; 90 | } 91 | return detect_data->loop; 92 | } 93 | 94 | // isCyclic returns true if there is a loop AND the current txn holds the least 95 | // number of locks on that loop. 96 | bool DL_detect::isCyclic(uint64_t txnid, DetectData * detect_data) { 97 | return nextNode(txnid, detect_data); 98 | } 99 | 100 | int 101 | DL_detect::detect_cycle(uint64_t txnid) { 102 | if (g_no_dl) 103 | return 0; 104 | uint64_t starttime = get_sys_clock(); 105 | INC_GLOB_STATS(cycle_detect, 1); 106 | bool deadlock = false; 107 | 108 | int thd = get_thdid_from_txnid(txnid); 109 | DetectData * detect_data = (DetectData *) 110 | mem_allocator.alloc(sizeof(DetectData), thd); 111 | detect_data->visited = (bool * ) 112 | mem_allocator.alloc(sizeof(bool) * V, thd); 113 | detect_data->recStack = (bool * ) 114 | mem_allocator.alloc(sizeof(bool) * V, thd); 115 | for(int i = 0; i < V; i++) { 116 | detect_data->visited[i] = false; 117 | detect_data->recStack[i] = false; 118 | } 119 | 120 | detect_data->min_lock_num = 1000; 121 | detect_data->min_txnid = -1; 122 | detect_data->loop = false; 123 | 124 | if ( isCyclic(txnid, detect_data) ){ 125 | deadlock = true; 126 | INC_GLOB_STATS(deadlock, 1); 127 | int thd_to_abort = get_thdid_from_txnid(detect_data->min_txnid); 128 | if (dependency[thd_to_abort].txnid == (SInt64) detect_data->min_txnid) { 129 | txn_man * txn = glob_manager->get_txn_man(thd_to_abort); 130 | txn->lock_abort = true; 131 | } 132 | } 133 | 134 | mem_allocator.free(detect_data->visited, sizeof(bool)*V); 135 | mem_allocator.free(detect_data->recStack,
sizeof(bool)*V); 136 | mem_allocator.free(detect_data, sizeof(DetectData)); 137 | uint64_t timespan = get_sys_clock() - starttime; 138 | INC_GLOB_STATS(dl_detect_time, timespan); 139 | if (deadlock) return 1; 140 | else return 0; 141 | } 142 | 143 | void DL_detect::clear_dep(uint64_t txnid) { 144 | if (g_no_dl) 145 | return; 146 | int thd = get_thdid_from_txnid(txnid); 147 | pthread_mutex_lock( &dependency[thd].lock ); 148 | 149 | dependency[thd].adj.clear(); 150 | dependency[thd].txnid = -1; 151 | dependency[thd].num_locks = 0; 152 | 153 | pthread_mutex_unlock( &dependency[thd].lock ); 154 | } 155 | 156 | -------------------------------------------------------------------------------- /concurrency_control/dl_detect.h: -------------------------------------------------------------------------------- 1 | #ifndef _DL_DETECT_ 2 | #define _DL_DETECT_ 3 | 4 | #include <list> 5 | #include <map> 6 | #include <set> 7 | #include "pthread.h" 8 | #include "config.h" 9 | //#include "global.h" 10 | //#include "helper.h" 11 | 12 | // The dependency information per thread 13 | struct DepThd { 14 | std::list<uint64_t> adj; // adjacency list: the txn ids this txn depends on 15 | pthread_mutex_t lock; 16 | volatile int64_t txnid; // -1 means invalid 17 | int num_locks; // the # of locks that txn is currently holding 18 | char pad[2 * CL_SIZE - sizeof(int64_t) - sizeof(pthread_mutex_t) - sizeof(std::list<uint64_t>) - sizeof(int)]; 19 | }; 20 | 21 | // shared data for a particular deadlock detection 22 | struct DetectData { 23 | bool * visited; 24 | bool * recStack; 25 | bool loop; 26 | bool onloop; // the current node is on the loop 27 | int loopstart; // the starting point of the loop 28 | int min_lock_num; // the min lock num for txn in the loop 29 | uint64_t min_txnid; // the txnid that holds the min lock num 30 | }; 31 | 32 | class DL_detect { 33 | public: 34 | void init(); 35 | // return values: 36 | // 0: no deadlocks 37 | // 1: deadlock exists 38 | int detect_cycle(uint64_t txnid); 39 | // txn1 (txn_id) depends on
txns (containing cnt txns) 40 | // return values: 41 | // 0: succeed. 42 | // 16: cannot get lock 43 | int add_dep(uint64_t txnid, uint64_t * txnids, int cnt, int num_locks); 44 | // remove all outbound dependencies for txnid. 45 | // will wait for the lock until acquired. 46 | void clear_dep(uint64_t txnid); 47 | private: 48 | int V; // No. of vertices 49 | DepThd * dependency; 50 | 51 | /////////////////////////////////////////// 52 | // For deadlock detection 53 | /////////////////////////////////////////// 54 | // dl_lock is the global lock. Only used when deadlock detection happens 55 | pthread_mutex_t _lock; 56 | // return value: whether a loop is detected. 57 | bool nextNode(uint64_t txnid, DetectData * detect_data); 58 | bool isCyclic(uint64_t txnid, DetectData * detect_data); // return if "thd" is causing a cycle 59 | }; 60 | 61 | #endif 62 | -------------------------------------------------------------------------------- /concurrency_control/hekaton.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_hekaton.h" 4 | #include "manager.h" 5 | 6 | #if CC_ALG==HEKATON 7 | 8 | RC 9 | txn_man::validate_hekaton(RC rc) 10 | { 11 | uint64_t starttime = get_sys_clock(); 12 | INC_STATS(get_thd_id(), debug1, get_sys_clock() - starttime); 13 | ts_t commit_ts = glob_manager->get_ts(get_thd_id()); 14 | // validate the read set. 
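The read-set validation that follows reduces to one visibility check per read: the version a txn read must still be visible at its commit timestamp. A minimal sketch of that check under stated assumptions — `VersionLife` and `read_still_valid` are illustrative names, not this repo's API:

```cpp
#include <cstdint>

// Hekaton-style read validation, reduced to its core check: a read is still
// valid at commit time iff commit_ts falls inside the lifetime [begin, end)
// of the version the txn originally read.
struct VersionLife {
    uint64_t begin;
    uint64_t end;   // UINT64_MAX means "still the latest version"
};

bool read_still_valid(const VersionLife& v, uint64_t commit_ts) {
    return v.begin <= commit_ts && commit_ts < v.end;
}
```

If the version has since been superseded (its `end` was set below `commit_ts`), the check fails and the transaction aborts, which is what `prepare_read` reports via `Abort`.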
15 | #if ISOLATION_LEVEL == SERIALIZABLE 16 | if (rc == RCOK) { 17 | for (int rid = 0; rid < row_cnt; rid ++) { 18 | if (accesses[rid]->type == WR) 19 | continue; 20 | rc = accesses[rid]->orig_row->manager->prepare_read(this, accesses[rid]->data, commit_ts); 21 | if (rc == Abort) 22 | break; 23 | } 24 | } 25 | #endif 26 | // postprocess 27 | for (int rid = 0; rid < row_cnt; rid ++) { 28 | if (accesses[rid]->type == RD) 29 | continue; 30 | accesses[rid]->orig_row->manager->post_process(this, commit_ts, rc); 31 | } 32 | return rc; 33 | } 34 | 35 | #endif 36 | -------------------------------------------------------------------------------- /concurrency_control/occ.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "txn.h" 4 | #include "occ.h" 5 | #include "manager.h" 6 | #include "mem_alloc.h" 7 | #include "row_occ.h" 8 | 9 | 10 | set_ent::set_ent() { 11 | set_size = 0; 12 | txn = NULL; 13 | rows = NULL; 14 | next = NULL; 15 | } 16 | 17 | void OptCC::init() { 18 | tnc = 0; 19 | his_len = 0; 20 | active_len = 0; 21 | active = NULL; 22 | lock_all = false; 23 | } 24 | 25 | RC OptCC::validate(txn_man * txn) { 26 | RC rc; 27 | #if PER_ROW_VALID 28 | rc = per_row_validate(txn); 29 | #else 30 | rc = central_validate(txn); 31 | #endif 32 | return rc; 33 | } 34 | 35 | RC 36 | OptCC::per_row_validate(txn_man * txn) { 37 | RC rc = RCOK; 38 | #if CC_ALG == OCC 39 | // sort all rows accessed in primary key order. 
40 | // TODO for migration, should first sort by partition id 41 | for (int i = txn->row_cnt - 1; i > 0; i--) { 42 | for (int j = 0; j < i; j ++) { 43 | int tabcmp = strcmp(txn->accesses[j]->orig_row->get_table_name(), 44 | txn->accesses[j+1]->orig_row->get_table_name()); 45 | if (tabcmp > 0 || (tabcmp == 0 && txn->accesses[j]->orig_row->get_primary_key() > txn->accesses[j+1]->orig_row->get_primary_key())) { 46 | Access * tmp = txn->accesses[j]; 47 | txn->accesses[j] = txn->accesses[j+1]; 48 | txn->accesses[j+1] = tmp; 49 | } 50 | } 51 | } 52 | #if DEBUG_ASSERT 53 | for (int i = txn->row_cnt - 1; i > 0; i--) { 54 | int tabcmp = strcmp(txn->accesses[i-1]->orig_row->get_table_name(), 55 | txn->accesses[i]->orig_row->get_table_name()); 56 | assert(tabcmp < 0 || tabcmp == 0 && txn->accesses[i]->orig_row->get_primary_key() > 57 | txn->accesses[i-1]->orig_row->get_primary_key()); 58 | } 59 | #endif 60 | // lock all rows in the readset and writeset. 61 | // Validate each access 62 | bool ok = true; 63 | int lock_cnt = 0; 64 | for (int i = 0; i < txn->row_cnt && ok; i++) { 65 | lock_cnt ++; 66 | txn->accesses[i]->orig_row->manager->latch(); 67 | ok = txn->accesses[i]->orig_row->manager->validate( txn->start_ts ); 68 | } 69 | if (ok) { 70 | // Validation passed. 71 | // advance the global timestamp and get the end_ts 72 | txn->end_ts = glob_manager->get_ts( txn->get_thd_id() ); 73 | // write to each row and update wts 74 | txn->cleanup(RCOK); 75 | rc = RCOK; 76 | } else { 77 | txn->cleanup(Abort); 78 | rc = Abort; 79 | } 80 | 81 | for (int i = 0; i < lock_cnt; i++) 82 | txn->accesses[i]->orig_row->manager->release(); 83 | #endif 84 | return rc; 85 | } 86 | 87 | RC OptCC::central_validate(txn_man * txn) { 88 | RC rc; 89 | uint64_t start_tn = txn->start_ts; 90 | uint64_t finish_tn; 91 | set_ent ** finish_active; 92 | uint64_t f_active_len; 93 | bool valid = true; 94 | // OptCC is centralized. No need to do per partition malloc. 
95 | set_ent * wset; 96 | set_ent * rset; 97 | get_rw_set(txn, rset, wset); 98 | bool readonly = (wset->set_size == 0); 99 | set_ent * his; 100 | set_ent * ent; 101 | int n = 0; 102 | 103 | pthread_mutex_lock( &latch ); 104 | finish_tn = tnc; 105 | ent = active; 106 | f_active_len = active_len; 107 | finish_active = (set_ent**) mem_allocator.alloc(sizeof(set_ent *) * f_active_len, 0); 108 | while (ent != NULL) { 109 | finish_active[n++] = ent; 110 | ent = ent->next; 111 | } 112 | if ( !readonly ) { 113 | active_len ++; 114 | STACK_PUSH(active, wset); 115 | } 116 | his = history; 117 | pthread_mutex_unlock( &latch ); 118 | if (finish_tn > start_tn) { 119 | while (his && his->tn > finish_tn) 120 | his = his->next; 121 | while (his && his->tn > start_tn) { 122 | valid = test_valid(his, rset); 123 | if (!valid) 124 | goto final; 125 | his = his->next; 126 | } 127 | } 128 | 129 | for (UInt32 i = 0; i < f_active_len; i++) { 130 | set_ent * wact = finish_active[i]; 131 | valid = test_valid(wact, rset); 132 | if (valid) { 133 | valid = test_valid(wact, wset); 134 | } if (!valid) 135 | goto final; 136 | } 137 | final: 138 | if (valid) 139 | txn->cleanup(RCOK); 140 | mem_allocator.free(rset, sizeof(set_ent)); 141 | 142 | if (!readonly) { 143 | // only update active & tnc for non-readonly transactions 144 | pthread_mutex_lock( &latch ); 145 | set_ent * act = active; 146 | set_ent * prev = NULL; 147 | while (act->txn != txn) { 148 | prev = act; 149 | act = act->next; 150 | } 151 | assert(act->txn == txn); 152 | if (prev != NULL) 153 | prev->next = act->next; 154 | else 155 | active = act->next; 156 | active_len --; 157 | if (valid) { 158 | if (history) 159 | assert(history->tn == tnc); 160 | tnc ++; 161 | wset->tn = tnc; 162 | STACK_PUSH(history, wset); 163 | his_len ++; 164 | } 165 | pthread_mutex_unlock( &latch ); 166 | } 167 | if (valid) { 168 | rc = RCOK; 169 | } else { 170 | txn->cleanup(Abort); 171 | rc = Abort; 172 | } 173 | return rc; 174 | } 175 | 176 | RC 
OptCC::get_rw_set(txn_man * txn, set_ent * &rset, set_ent *& wset) { 177 | wset = (set_ent*) mem_allocator.alloc(sizeof(set_ent), 0); 178 | rset = (set_ent*) mem_allocator.alloc(sizeof(set_ent), 0); 179 | wset->set_size = txn->wr_cnt; 180 | rset->set_size = txn->row_cnt - txn->wr_cnt; 181 | wset->rows = (row_t **) mem_allocator.alloc(sizeof(row_t *) * wset->set_size, 0); 182 | rset->rows = (row_t **) mem_allocator.alloc(sizeof(row_t *) * rset->set_size, 0); 183 | wset->txn = txn; 184 | rset->txn = txn; 185 | 186 | UInt32 n = 0, m = 0; 187 | for (int i = 0; i < txn->row_cnt; i++) { 188 | if (txn->accesses[i]->type == WR) 189 | wset->rows[n ++] = txn->accesses[i]->orig_row; 190 | else 191 | rset->rows[m ++] = txn->accesses[i]->orig_row; 192 | } 193 | 194 | assert(n == wset->set_size); 195 | assert(m == rset->set_size); 196 | return RCOK; 197 | } 198 | 199 | bool OptCC::test_valid(set_ent * set1, set_ent * set2) { 200 | for (UInt32 i = 0; i < set1->set_size; i++) 201 | for (UInt32 j = 0; j < set2->set_size; j++) { 202 | if (set1->rows[i] == set2->rows[j]) { 203 | return false; 204 | } 205 | } 206 | return true; 207 | } 208 | -------------------------------------------------------------------------------- /concurrency_control/occ.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "row.h" 4 | 5 | // TODO For simplicity, the txn history for OCC is organized as follows: 6 | // 1. history is never deleted. 7 | // 2. history forms a singly linked list. 8 | // history head -> hist_1 -> hist_2 -> hist_3 -> ... -> hist_n 9 | // The head is always the latest and the tail the oldest. 10 | // When history is traversed, always go from head -> tail order.
11 | 12 | class txn_man; 13 | 14 | class set_ent{ 15 | public: 16 | set_ent(); 17 | UInt64 tn; 18 | txn_man * txn; 19 | UInt32 set_size; 20 | row_t ** rows; 21 | set_ent * next; 22 | }; 23 | 24 | class OptCC { 25 | public: 26 | void init(); 27 | RC validate(txn_man * txn); 28 | volatile bool lock_all; 29 | uint64_t lock_txn_id; 30 | private: 31 | 32 | // per row validation similar to Hekaton. 33 | RC per_row_validate(txn_man * txn); 34 | 35 | // parallel validation in the original OCC paper. 36 | RC central_validate(txn_man * txn); 37 | bool test_valid(set_ent * set1, set_ent * set2); 38 | RC get_rw_set(txn_man * txni, set_ent * &rset, set_ent *& wset); 39 | 40 | // "history" stores write set of transactions with tn >= smallest running tn 41 | set_ent * history; 42 | set_ent * active; 43 | uint64_t his_len; 44 | uint64_t active_len; 45 | volatile uint64_t tnc; // transaction number counter 46 | pthread_mutex_t latch; 47 | }; 48 | -------------------------------------------------------------------------------- /concurrency_control/plock.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "plock.h" 4 | #include "mem_alloc.h" 5 | #include "txn.h" 6 | 7 | /************************************************/ 8 | // per-partition Manager 9 | /************************************************/ 10 | void PartMan::init() { 11 | uint64_t part_id = get_part_id(this); 12 | waiter_cnt = 0; 13 | owner = NULL; 14 | waiters = (txn_man **) 15 | mem_allocator.alloc(sizeof(txn_man *) * g_thread_cnt, part_id); 16 | pthread_mutex_init( &latch, NULL ); 17 | } 18 | 19 | RC PartMan::lock(txn_man * txn) { 20 | RC rc; 21 | 22 | pthread_mutex_lock( &latch ); 23 | if (owner == NULL) { 24 | owner = txn; 25 | rc = RCOK; 26 | } else if (owner->get_ts() < txn->get_ts()) { 27 | int i; 28 | assert(waiter_cnt < g_thread_cnt); 29 | for (i = waiter_cnt; i > 0; i--) { 30 | if (txn->get_ts() > waiters[i - 
1]->get_ts()) { 31 | waiters[i] = txn; 32 | break; 33 | } else 34 | waiters[i] = waiters[i - 1]; 35 | } 36 | if (i == 0) 37 | waiters[i] = txn; 38 | waiter_cnt ++; 39 | ATOM_ADD(txn->ready_part, 1); 40 | rc = WAIT; 41 | } else 42 | rc = Abort; 43 | pthread_mutex_unlock( &latch ); 44 | return rc; 45 | } 46 | 47 | void PartMan::unlock(txn_man * txn) { 48 | pthread_mutex_lock( &latch ); 49 | if (txn == owner) { 50 | if (waiter_cnt == 0) 51 | owner = NULL; 52 | else { 53 | owner = waiters[0]; 54 | for (UInt32 i = 0; i < waiter_cnt - 1; i++) { 55 | assert( waiters[i]->get_ts() < waiters[i + 1]->get_ts() ); 56 | waiters[i] = waiters[i + 1]; 57 | } 58 | waiter_cnt --; 59 | ATOM_SUB(owner->ready_part, 1); 60 | } 61 | } else { 62 | bool find = false; 63 | for (UInt32 i = 0; i < waiter_cnt; i++) { 64 | if (waiters[i] == txn) 65 | find = true; 66 | if (find && i < waiter_cnt - 1) 67 | waiters[i] = waiters[i + 1]; 68 | } 69 | ATOM_SUB(txn->ready_part, 1); 70 | assert(find); 71 | waiter_cnt --; 72 | } 73 | pthread_mutex_unlock( &latch ); 74 | } 75 | 76 | /************************************************/ 77 | // Partition Lock 78 | /************************************************/ 79 | 80 | void Plock::init() { 81 | ARR_PTR(PartMan, part_mans, g_part_cnt); 82 | for (UInt32 i = 0; i < g_part_cnt; i++) 83 | part_mans[i]->init(); 84 | } 85 | 86 | RC Plock::lock(txn_man * txn, uint64_t * parts, uint64_t part_cnt) { 87 | RC rc = RCOK; 88 | ts_t starttime = get_sys_clock(); 89 | UInt32 i; 90 | for (i = 0; i < part_cnt; i ++) { 91 | uint64_t part_id = parts[i]; 92 | rc = part_mans[part_id]->lock(txn); 93 | if (rc == Abort) 94 | break; 95 | } 96 | if (rc == Abort) { 97 | for (UInt32 j = 0; j < i; j++) { 98 | uint64_t part_id = parts[j]; 99 | part_mans[part_id]->unlock(txn); 100 | } 101 | assert(txn->ready_part == 0); 102 | INC_TMP_STATS(txn->get_thd_id(), time_man, get_sys_clock() - starttime); 103 | return Abort; 104 | } 105 | if (txn->ready_part > 0) { 106 | ts_t t = 
get_sys_clock(); 107 | while (txn->ready_part > 0) {} 108 | INC_TMP_STATS(txn->get_thd_id(), time_wait, get_sys_clock() - t); 109 | #if DEBUG_WW 110 | printf("[plock] increment time wait %lu\n", get_sys_clock() - t); 111 | #endif 112 | } 113 | assert(txn->ready_part == 0); 114 | INC_TMP_STATS(txn->get_thd_id(), time_man, get_sys_clock() - starttime); 115 | return RCOK; 116 | } 117 | 118 | void Plock::unlock(txn_man * txn, uint64_t * parts, uint64_t part_cnt) { 119 | ts_t starttime = get_sys_clock(); 120 | for (UInt32 i = 0; i < part_cnt; i ++) { 121 | uint64_t part_id = parts[i]; 122 | part_mans[part_id]->unlock(txn); 123 | } 124 | INC_TMP_STATS(txn->get_thd_id(), time_man, get_sys_clock() - starttime); 125 | } 126 | -------------------------------------------------------------------------------- /concurrency_control/plock.h: -------------------------------------------------------------------------------- 1 | #ifndef _PLOCK_H_ 2 | #define _PLOCK_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | 7 | class txn_man; 8 | 9 | // Partition manager for HSTORE 10 | class PartMan { 11 | public: 12 | void init(); 13 | RC lock(txn_man * txn); 14 | void unlock(txn_man * txn); 15 | private: 16 | pthread_mutex_t latch; 17 | txn_man * owner; 18 | txn_man ** waiters; 19 | UInt32 waiter_cnt; 20 | }; 21 | 22 | // Partition Level Locking 23 | class Plock { 24 | public: 25 | void init(); 26 | // lock all partitions in parts 27 | RC lock(txn_man * txn, uint64_t * parts, uint64_t part_cnt); 28 | void unlock(txn_man * txn, uint64_t * parts, uint64_t part_cnt); 29 | private: 30 | PartMan ** part_mans; 31 | }; 32 | 33 | #endif 34 | -------------------------------------------------------------------------------- /concurrency_control/row_aria.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "txn.h" 3 | #include "row.h" 4 | #include "row_aria.h" 5 | 6 | #if CC_ALG == ARIA 7 | 8 | void 9 | Row_aria::init(row_t *
row) { 10 | _row = row; 11 | _write_resv.store(0, std::memory_order_relaxed); 12 | #if ARIA_REORDER 13 | _read_resv.store(0, std::memory_order_relaxed); 14 | #endif 15 | } 16 | 17 | RC 18 | Row_aria::access(txn_man * txn, TsType type, row_t * local_row) { 19 | if (type != R_REQ) { 20 | if (!reserve_write(txn->batch_id, txn->prio, txn->get_txn_id())) 21 | return Abort; 22 | } 23 | #if ARIA_REORDER 24 | else { 25 | reserve_read(txn->batch_id, txn->prio, txn->get_txn_id()); 26 | } 27 | #endif 28 | // when in execution phase, everything is read-only except TID, so it is safe 29 | // to copy record data without any lock 30 | #if ARIA_NOCOPY_READ 31 | // no need to make a copy because the whole database is read-only 32 | if (type == R_REQ) return RCOK; 33 | #endif 34 | local_row->copy(_row); 35 | return RCOK; 36 | } 37 | 38 | void 39 | Row_aria::write(row_t * data) { 40 | _row->copy(data); 41 | } 42 | 43 | #endif 44 | -------------------------------------------------------------------------------- /concurrency_control/row_aria.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | class table_t; 4 | class Catalog; 5 | class txn_man; 6 | struct TsReqEntry; 7 | 8 | #include <atomic> 9 | 10 | #if CC_ALG == ARIA 11 | 12 | // we implement Aria's reservation as per-record TID 13 | union TID_aria_t { 14 | uint64_t raw_bits; 15 | struct { 16 | uint64_t batch_id : ARIA_NUM_BITS_BATCH_ID; 17 | uint32_t prio : ARIA_NUM_BITS_PRIO; 18 | uint64_t txn_id : ARIA_NUM_BITS_TXN_ID; // reserved by 19 | } tid_aria; 20 | TID_aria_t() = default; 21 | TID_aria_t(uint64_t tid_bits): raw_bits(tid_bits) {} 22 | TID_aria_t(uint64_t batch_id, uint32_t prio, uint64_t txn_id): \ 23 | tid_aria({batch_id, prio, txn_id}) {} 24 | }; 25 | 26 | class Row_aria { 27 | // txns are serialized in the order given by these comparison rules: 28 | // - if txn A has higher prio than txn B, A is serialized before B 29 | // - else if txn A has lower txn_id than txn B, A is
serialized before B 30 | static bool is_order_before(uint32_t lhs_prio, uint64_t lhs_txn_id, 31 | uint32_t rhs_prio, uint64_t rhs_txn_id) 32 | { 33 | if (lhs_prio != rhs_prio) 34 | return lhs_prio > rhs_prio; 35 | return lhs_txn_id < rhs_txn_id; 36 | } 37 | 38 | public: 39 | void init(row_t * row); 40 | RC access(txn_man * txn, TsType type, row_t * local_row); 41 | void write(row_t * data); 42 | 43 | bool reserve_write(uint64_t batch_id, uint32_t prio, uint64_t txn_id) { 44 | return reserve(_write_resv, batch_id, prio, txn_id); 45 | } 46 | bool validate_write(uint64_t batch_id, uint32_t prio, uint64_t txn_id) const { 47 | return validate(_write_resv, batch_id, prio, txn_id); 48 | } 49 | 50 | #if ARIA_REORDER 51 | void reserve_read(uint64_t batch_id, uint32_t prio, uint64_t txn_id) { 52 | // we don't care about the return value for a read reservation 53 | reserve(_read_resv, batch_id, prio, txn_id); 54 | } 55 | bool validate_read(uint64_t batch_id, uint32_t prio, uint64_t txn_id) const { 56 | return validate(_read_resv, batch_id, prio, txn_id); 57 | } 58 | #endif 59 | 60 | private: 61 | bool reserve(std::atomic<TID_aria_t>& resv, uint64_t batch_id, 62 | uint32_t prio, uint64_t txn_id) 63 | { 64 | TID_aria_t new_tid(batch_id, prio, txn_id); 65 | TID_aria_t v = resv.load(std::memory_order_relaxed); 66 | retry: 67 | // if no one has reserved this record yet in this batch, 68 | // OR the txn that previously reserved the record is not ordered 69 | // before the current one (in which case we preempt it) 70 | if (v.tid_aria.batch_id != batch_id \ 71 | || !is_order_before(v.tid_aria.prio, v.tid_aria.txn_id, prio, txn_id)) 72 | { 73 | if (!resv.compare_exchange_strong(v, new_tid, 74 | std::memory_order_relaxed, std::memory_order_relaxed)) 75 | goto retry; 76 | return true; 77 | } 78 | return false; 79 | } 80 | 81 | bool validate(const std::atomic<TID_aria_t>& resv, uint64_t batch_id, 82 | uint32_t prio, uint64_t txn_id) const 83 | { 84 | TID_aria_t v = resv.load(std::memory_order_relaxed); 85 | // compared
record's TID with txn's tid: 86 | // - if reserved by a txn from another batch, no one has reserved it in the 87 | // current batch; pass 88 | if (v.tid_aria.batch_id != batch_id) return true; 89 | // - else for a validation to pass, the txn that reserves the record must 90 | // not be serialized before the current txn 91 | return !is_order_before(v.tid_aria.prio, v.tid_aria.txn_id, prio, txn_id); 92 | } 93 | 94 | private: 95 | std::atomic<TID_aria_t> _write_resv; 96 | #if ARIA_REORDER 97 | std::atomic<TID_aria_t> _read_resv; 98 | #endif 99 | row_t * _row; 100 | }; 101 | 102 | #endif 103 | -------------------------------------------------------------------------------- /concurrency_control/row_hekaton.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | #include "row_mvcc.h" 3 | 4 | class table_t; 5 | class Catalog; 6 | class txn_man; 7 | 8 | // Only a constant number of versions can be maintained. 9 | // If a request accesses an old version that has been recycled, 10 | // simply abort the request.
11 | 12 | #if CC_ALG == HEKATON 13 | 14 | struct WriteHisEntry { 15 | bool begin_txn; 16 | bool end_txn; 17 | ts_t begin; 18 | ts_t end; 19 | row_t * row; 20 | }; 21 | 22 | #define INF UINT64_MAX 23 | 24 | class Row_hekaton { 25 | public: 26 | void init(row_t * row); 27 | RC access(txn_man * txn, TsType type, row_t * row); 28 | RC prepare_read(txn_man * txn, row_t * row, ts_t commit_ts); 29 | void post_process(txn_man * txn, ts_t commit_ts, RC rc); 30 | 31 | private: 32 | volatile bool blatch; 33 | uint32_t reserveRow(txn_man * txn); 34 | void doubleHistory(); 35 | 36 | uint32_t _his_latest; 37 | uint32_t _his_oldest; 38 | WriteHisEntry * _write_history; // circular buffer 39 | bool _exists_prewrite; 40 | 41 | uint32_t _his_len; 42 | }; 43 | 44 | #endif 45 | -------------------------------------------------------------------------------- /concurrency_control/row_ic3.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | class table_t; 4 | class Catalog; 5 | class txn_man; 6 | struct TsReqEntry; 7 | class Row_ic3; 8 | 9 | #define LOCK_BIT (1UL << 63) 10 | #if CC_ALG == IC3 11 | 12 | struct IC3LockEntry { 13 | access_t type; 14 | txn_man * txn; 15 | uint64_t txn_id; 16 | IC3LockEntry * prev; 17 | IC3LockEntry * next; 18 | }; 19 | 20 | class Cell_ic3 { 21 | public: 22 | void init(row_t * orig_row, uint64_t id); 23 | /* copy to corresponding col of local row */ 24 | void access(row_t * local_row, Access *txn_access); 25 | uint64_t get_tid() {return _tid;}; 26 | void add_to_acclist(txn_man * txn, access_t type); 27 | void rm_from_acclist(txn_man * txn, bool aborted); 28 | IC3LockEntry * get_last_writer(); 29 | IC3LockEntry * get_last_accessor(); 30 | bool try_lock(); 31 | void release(); 32 | void update_version(uint64_t txn_id) {_tid = txn_id;}; 33 | private: 34 | row_t * _row; 35 | Row_ic3 * row_manager; 36 | volatile uint64_t _tid; 37 | uint64_t idx; 38 | int acclist_cnt; 39 | IC3LockEntry * acclist; 40 | 
IC3LockEntry * acclist_tail; 41 | volatile int lock; 42 | /* 43 | #if LATCH == LH_SPINLOCK 44 | pthread_spinlock_t * latch; 45 | #elif LATCH == LH_MUTEX 46 | pthread_mutex_t * latch; 47 | #else 48 | mcslock * latch; 49 | #endif 50 | */ 51 | 52 | }; 53 | 54 | class Row_ic3 { 55 | public: 56 | void init(row_t * row); 57 | #if IC3_FIELD_LOCKING 58 | bool try_lock(uint64_t idx) {return cell_managers[idx].try_lock();}; 59 | void release(uint64_t idx) {cell_managers[idx].release();}; 60 | uint64_t get_tid(uint64_t idx) {return cell_managers[idx].get_tid();}; 61 | IC3LockEntry * get_last_writer(uint64_t idx) { 62 | return cell_managers[idx].get_last_writer();}; 63 | IC3LockEntry * get_last_accessor(uint64_t idx) { 64 | return cell_managers[idx].get_last_accessor();}; 65 | void add_to_acclist(uint64_t idx, txn_man * txn, access_t type) { 66 | cell_managers[idx].add_to_acclist(txn, type);}; 67 | void rm_from_acclist(uint64_t idx, txn_man * txn, bool aborted=false) { 68 | cell_managers[idx].rm_from_acclist(txn, aborted);}; 69 | void update_version(uint64_t idx, uint64_t txn_id) { 70 | cell_managers[idx].update_version(txn_id);}; 71 | void access(row_t * local_row, uint64_t idx, Access * txn_access) { 72 | cell_managers[idx].access(local_row, txn_access);}; 73 | #else // tuple-level locking 74 | bool try_lock(); 75 | uint64_t get_tid() {return _tid;}; 76 | IC3LockEntry * get_last_writer(); 77 | IC3LockEntry * get_last_accessor(); 78 | void release() {lock = 0;}; 79 | void add_to_acclist(txn_man * txn, access_t type); 80 | void rm_from_acclist(txn_man * txn, bool aborted=false); 81 | void update_version(uint64_t txn_id) {_tid = txn_id;}; 82 | void access(row_t * local_row, Access * txn_access); 83 | #endif 84 | row_t * _row; 85 | 86 | private: 87 | #if !IC3_FIELD_LOCKING 88 | volatile uint64_t _tid; 89 | uint64_t idx; 90 | int acclist_cnt; 91 | IC3LockEntry * acclist; 92 | IC3LockEntry * acclist_tail; 93 | volatile int lock; 94 | #else 95 | Cell_ic3 * cell_managers; 96 | 
#endif 97 | }; 98 | 99 | #endif 100 | -------------------------------------------------------------------------------- /concurrency_control/row_lock.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_LOCK_H 2 | #define ROW_LOCK_H 3 | 4 | struct LockEntry { 5 | txn_man * txn; 6 | Access * access; 7 | lock_t type; 8 | lock_status status; 9 | LockEntry * next; 10 | LockEntry * prev; 11 | LockEntry(txn_man * t, Access * a): txn(t), access(a), type(LOCK_NONE), 12 | status(LOCK_DROPPED), next(NULL), prev(NULL) {}; 13 | }; 14 | 15 | class Row_lock { 16 | public: 17 | void init(row_t * row); 18 | // [DL_DETECT] txnids are the txn_ids that current txn is waiting for. 19 | RC lock_get(lock_t type, txn_man * txn, Access * access); 20 | RC lock_get(lock_t type, txn_man * txn, uint64_t* &txnids, int &txncnt, Access * access); 21 | RC lock_release(LockEntry * entry); 22 | void lock(txn_man * txn); 23 | void unlock(txn_man * txn); 24 | 25 | private: 26 | #if LATCH == LH_SPINLOCK 27 | pthread_spinlock_t * latch; 28 | #elif LATCH == LH_MUTEX 29 | pthread_mutex_t * latch; 30 | #else 31 | mcslock * latch; 32 | #endif 33 | bool blatch; 34 | 35 | bool conflict_lock(lock_t l1, lock_t l2); 36 | static LockEntry * get_entry(Access * access); 37 | static void return_entry(LockEntry * entry); 38 | row_t * _row; 39 | lock_t lock_type; 40 | UInt32 owner_cnt; 41 | UInt32 waiter_cnt; 42 | 43 | // owners is a single linked list 44 | // waiters is a double linked list 45 | // [waiters] head is the oldest txn, tail is the youngest txn. 46 | // So new txns are inserted into the tail. 
47 | LockEntry * owners; 48 | LockEntry * waiters_head; 49 | LockEntry * waiters_tail; 50 | }; 51 | 52 | #endif 53 | -------------------------------------------------------------------------------- /concurrency_control/row_mvcc.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | class table_t; 4 | class Catalog; 5 | class txn_man; 6 | 7 | // Only a constant number of versions can be maintained. 8 | // If a request accesses an old version that has been recycled, 9 | // simply abort the request. 10 | 11 | #if CC_ALG == MVCC 12 | struct WriteHisEntry { 13 | bool valid; // whether the entry contains a valid version 14 | bool reserved; // when valid == false, whether the entry is reserved by a P_REQ 15 | ts_t ts; 16 | row_t * row; 17 | }; 18 | 19 | struct ReqEntry { 20 | bool valid; 21 | TsType type; // P_REQ or R_REQ 22 | ts_t ts; 23 | txn_man * txn; 24 | ts_t time; 25 | }; 26 | 27 | 28 | class Row_mvcc { 29 | public: 30 | void init(row_t * row); 31 | RC access(txn_man * txn, TsType type, row_t * row); 32 | private: 33 | pthread_mutex_t * latch; 34 | volatile bool blatch; 35 | 36 | row_t * _row; 37 | 38 | RC conflict(TsType type, ts_t ts, uint64_t thd_id = 0); 39 | void update_buffer(txn_man * txn, TsType type); 40 | void buffer_req(TsType type, txn_man * txn, bool served); 41 | 42 | // Invariant: all valid entries in _requests have greater ts than any entry in _write_history 43 | row_t * _latest_row; 44 | ts_t _latest_wts; 45 | ts_t _oldest_wts; 46 | WriteHisEntry * _write_history; 47 | // the following is a small optimization. 48 | // the timestamp for the served prewrite request. There should be at most one 49 | // served prewrite request. 50 | bool _exists_prewrite; 51 | ts_t _prewrite_ts; 52 | uint32_t _prewrite_his_id; 53 | ts_t _max_served_rts; 54 | 55 | // _requests only contains pending requests. 
56 | ReqEntry * _requests; 57 | uint32_t _his_len; 58 | uint32_t _req_len; 59 | // Invariant: _num_versions <= 4 60 | // Invariant: _num_prewrite_reservation <= 2 61 | uint32_t _num_versions; 62 | 63 | // list = 0: _write_history 64 | // list = 1: _requests 65 | void double_list(uint32_t list); 66 | row_t * reserveRow(ts_t ts, txn_man * txn); 67 | }; 68 | 69 | #endif 70 | -------------------------------------------------------------------------------- /concurrency_control/row_occ.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_occ.h" 4 | #include "mem_alloc.h" 5 | 6 | void 7 | Row_occ::init(row_t * row) { 8 | _row = row; 9 | int part_id = row->get_part_id(); 10 | _latch = (pthread_mutex_t *) 11 | mem_allocator.alloc(sizeof(pthread_mutex_t), part_id); 12 | pthread_mutex_init( _latch, NULL ); 13 | wts = 0; 14 | blatch = false; 15 | } 16 | 17 | RC 18 | Row_occ::access(txn_man * txn, TsType type) { 19 | RC rc = RCOK; 20 | pthread_mutex_lock( _latch ); 21 | if (type == R_REQ) { 22 | if (txn->start_ts < wts) 23 | rc = Abort; 24 | else { 25 | txn->cur_row->copy(_row); 26 | rc = RCOK; 27 | } 28 | } else 29 | assert(false); 30 | pthread_mutex_unlock( _latch ); 31 | return rc; 32 | } 33 | 34 | void 35 | Row_occ::latch() { 36 | pthread_mutex_lock( _latch ); 37 | } 38 | 39 | bool 40 | Row_occ::validate(uint64_t ts) { 41 | if (ts < wts) return false; 42 | else return true; 43 | } 44 | 45 | void 46 | Row_occ::write(row_t * data, uint64_t ts) { 47 | _row->copy(data); 48 | if (PER_ROW_VALID) { 49 | assert(ts > wts); 50 | wts = ts; 51 | } 52 | } 53 | 54 | void 55 | Row_occ::release() { 56 | pthread_mutex_unlock( _latch ); 57 | } 58 | -------------------------------------------------------------------------------- /concurrency_control/row_occ.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_OCC_H 2 | #define ROW_OCC_H 3 | 4 | class 
table_t; 5 | class Catalog; 6 | class txn_man; 7 | struct TsReqEntry; 8 | 9 | class Row_occ { 10 | public: 11 | void init(row_t * row); 12 | RC access(txn_man * txn, TsType type); 13 | void latch(); 14 | // ts is the start_ts of the validating txn 15 | bool validate(uint64_t ts); 16 | void write(row_t * data, uint64_t ts); 17 | void release(); 18 | private: 19 | pthread_mutex_t * _latch; 20 | bool blatch; 21 | 22 | row_t * _row; 23 | // the last update time 24 | ts_t wts; 25 | }; 26 | 27 | #endif 28 | -------------------------------------------------------------------------------- /concurrency_control/row_silo.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_silo.h" 4 | #include "mem_alloc.h" 5 | 6 | #if CC_ALG==SILO 7 | 8 | void 9 | Row_silo::init(row_t * row) 10 | { 11 | _row = row; 12 | #if ATOMIC_WORD 13 | _tid_word = 0; 14 | #else 15 | _latch = (pthread_mutex_t *) _mm_malloc(sizeof(pthread_mutex_t), 64); 16 | pthread_mutex_init( _latch, NULL ); 17 | _tid = 0; 18 | #endif 19 | } 20 | 21 | RC 22 | Row_silo::access(txn_man * txn, TsType type, row_t * local_row) { 23 | #if ATOMIC_WORD 24 | uint64_t v = 0; 25 | uint64_t v2 = 1; 26 | while (v2 != v) { 27 | v = _tid_word; 28 | while (v & LOCK_BIT) { 29 | PAUSE 30 | v = _tid_word; 31 | } 32 | local_row->copy(_row); 33 | COMPILER_BARRIER 34 | v2 = _tid_word; 35 | } 36 | txn->last_tid = v & (~LOCK_BIT); 37 | #else 38 | lock(); 39 | local_row->copy(_row); 40 | txn->last_tid = _tid; 41 | release(); 42 | #endif 43 | return RCOK; 44 | } 45 | 46 | bool 47 | Row_silo::validate(ts_t tid, bool in_write_set) { 48 | #if ATOMIC_WORD 49 | uint64_t v = _tid_word; 50 | if (in_write_set) 51 | return tid == (v & (~LOCK_BIT)); 52 | 53 | if (v & LOCK_BIT) 54 | return false; 55 | else if (tid != (v & (~LOCK_BIT))) 56 | return false; 57 | else 58 | return true; 59 | #else 60 | if (in_write_set) 61 | return tid == _tid; 62 | if (!try_lock()) 63 
| return false; 64 | bool valid = (tid == _tid); 65 | release(); 66 | return valid; 67 | #endif 68 | } 69 | 70 | void 71 | Row_silo::write(row_t * data, uint64_t tid) { 72 | _row->copy(data); 73 | #if ATOMIC_WORD 74 | uint64_t v = _tid_word; 75 | M_ASSERT(tid > (v & (~LOCK_BIT)) && (v & LOCK_BIT), "tid=%ld, v & LOCK_BIT=%ld, v & (~LOCK_BIT)=%ld\n", tid, (v & LOCK_BIT), (v & (~LOCK_BIT))); 76 | _tid_word = (tid | LOCK_BIT); 77 | #else 78 | _tid = tid; 79 | #endif 80 | } 81 | 82 | void 83 | Row_silo::lock() { 84 | #if ATOMIC_WORD 85 | uint64_t v = _tid_word; 86 | while ((v & LOCK_BIT) || !__sync_bool_compare_and_swap(&_tid_word, v, v | LOCK_BIT)) { 87 | PAUSE 88 | v = _tid_word; 89 | } 90 | #else 91 | pthread_mutex_lock( _latch ); 92 | #endif 93 | } 94 | 95 | void 96 | Row_silo::release() { 97 | #if ATOMIC_WORD 98 | assert(_tid_word & LOCK_BIT); 99 | _tid_word = _tid_word & (~LOCK_BIT); 100 | #else 101 | pthread_mutex_unlock( _latch ); 102 | #endif 103 | } 104 | 105 | bool 106 | Row_silo::try_lock() 107 | { 108 | #if ATOMIC_WORD 109 | uint64_t v = _tid_word; 110 | if (v & LOCK_BIT) // already locked 111 | return false; 112 | return __sync_bool_compare_and_swap(&_tid_word, v, (v | LOCK_BIT)); 113 | #else 114 | return pthread_mutex_trylock( _latch ) != EBUSY; 115 | #endif 116 | } 117 | 118 | uint64_t 119 | Row_silo::get_tid() 120 | { 121 | assert(ATOMIC_WORD); 122 | return _tid_word & (~LOCK_BIT); 123 | } 124 | 125 | #endif 126 | -------------------------------------------------------------------------------- /concurrency_control/row_silo.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | class table_t; 4 | class Catalog; 5 | class txn_man; 6 | struct TsReqEntry; 7 | 8 | #if CC_ALG==SILO 9 | #define LOCK_BIT (1UL << 63) 10 | 11 | class Row_silo { 12 | public: 13 | void init(row_t * row); 14 | RC access(txn_man * txn, TsType type, row_t * local_row); 15 | 16 | bool validate(ts_t tid, bool in_write_set); 17 | void 
write(row_t * data, uint64_t tid); 18 | 19 | void lock(); 20 | void release(); 21 | bool try_lock(); 22 | uint64_t get_tid(); 23 | 24 | void assert_lock() {assert(_tid_word & LOCK_BIT); } 25 | private: 26 | #if ATOMIC_WORD 27 | volatile uint64_t _tid_word; 28 | #else 29 | pthread_mutex_t * _latch; 30 | ts_t _tid; 31 | #endif 32 | row_t * _row; 33 | }; 34 | 35 | #endif 36 | -------------------------------------------------------------------------------- /concurrency_control/row_silo_prio.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "txn.h" 3 | #include "row.h" 4 | #include "row_silo_prio.h" 5 | #include "mem_alloc.h" 6 | #include <atomic> 7 | 8 | #if CC_ALG == SILO_PRIO 9 | 10 | void 11 | Row_silo_prio::init(row_t * row) 12 | { 13 | _row = row; 14 | _tid_word.store({0, 0}, std::memory_order_relaxed); 15 | } 16 | 17 | RC 18 | Row_silo_prio::access(txn_man * txn, TsType type, row_t * local_row) { 19 | TID_prio_t v, v2; 20 | const uint32_t prio = txn->prio; 21 | bool is_reserved; 22 | v = _tid_word.load(std::memory_order_relaxed); 23 | retry: 24 | while (v.is_locked()) { 25 | PAUSE 26 | v = _tid_word.load(std::memory_order_relaxed); 27 | } 28 | // for a write, abort if the current priority is higher 29 | if (prio < v.get_prio()) { 30 | if (type != R_REQ) return Abort; 31 | } 32 | v2 = v; 33 | is_reserved = v2.acquire_prio(prio); 34 | local_row->copy(_row); 35 | if (is_reserved) { 36 | if (!_tid_word.compare_exchange_strong(v, v2, std::memory_order_acq_rel, 37 | std::memory_order_acquire)) 38 | goto retry; 39 | } else { 40 | assert (v2 == v); 41 | v = _tid_word.load(std::memory_order_acquire); 42 | if (v != v2) 43 | goto retry; 44 | } 45 | txn->last_is_reserved = is_reserved; 46 | txn->last_data_ver = v2.get_data_ver(); 47 | if (is_reserved) txn->last_prio_ver = v2.get_prio_ver(); 48 | return RCOK; 49 | } 50 | 51 | void Row_silo_prio::write(row_t * data) { 52 | _row->copy(data); 53 | } 54 | 55 | #endif
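The lock-bit encoding that `Row_silo` packs into a single 64-bit word (and that `Row_silo_prio` extends with a priority field) can be sketched in isolation. The following is a minimal, self-contained illustration, not code from this repository: `TidWord`, `kLockBit`, and the method names are invented here, and C++11 `std::atomic` stands in for the repo's `volatile` word plus `__sync_bool_compare_and_swap`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Illustrative sketch of the ATOMIC_WORD scheme used by Row_silo:
// the top bit of a 64-bit word is the lock bit, the low 63 bits hold the TID.
constexpr uint64_t kLockBit = 1ULL << 63;

struct TidWord {
    std::atomic<uint64_t> word{0};

    // Spin until the lock bit is clear, then try to set it with a CAS
    // (mirrors the CAS loop in Row_silo::lock()).
    void lock() {
        uint64_t v = word.load(std::memory_order_relaxed);
        while ((v & kLockBit) ||
               !word.compare_exchange_weak(v, v | kLockBit,
                                           std::memory_order_acquire))
            v = word.load(std::memory_order_relaxed);
    }

    // Publish a new TID and release the lock in a single store, as
    // Row_silo::write() followed by Row_silo::release() do at commit.
    void unlock_with_tid(uint64_t tid) {
        assert(word.load(std::memory_order_relaxed) & kLockBit);
        word.store(tid & ~kLockBit, std::memory_order_release);
    }

    uint64_t tid() const {
        return word.load(std::memory_order_acquire) & ~kLockBit;
    }
    bool locked() const {
        return word.load(std::memory_order_acquire) & kLockBit;
    }
};
```

Publishing the new TID and clearing the lock bit in one store is what lets a concurrent OCC reader detect a version change by re-reading the word once after copying the row.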
56 | -------------------------------------------------------------------------------- /concurrency_control/row_tictoc.cpp: -------------------------------------------------------------------------------- 1 | #include "row_tictoc.h" 2 | #include "row.h" 3 | #include "txn.h" 4 | #include "mem_alloc.h" 5 | #include 6 | 7 | #if CC_ALG==TICTOC 8 | 9 | void 10 | Row_tictoc::init(row_t * row) 11 | { 12 | _row = row; 13 | #if ATOMIC_WORD 14 | _ts_word = 0; 15 | #else 16 | _latch = (pthread_mutex_t *) _mm_malloc(sizeof(pthread_mutex_t), 64); 17 | pthread_mutex_init( _latch, NULL ); 18 | _wts = 0; 19 | _rts = 0; 20 | #endif 21 | #if TICTOC_MV 22 | _hist_wts = 0; 23 | #endif 24 | } 25 | 26 | RC 27 | Row_tictoc::access(txn_man * txn, TsType type, row_t * local_row) 28 | { 29 | #if ATOMIC_WORD 30 | uint64_t v = 0; 31 | uint64_t v2 = 1; 32 | uint64_t lock_mask = LOCK_BIT; 33 | if (WRITE_PERMISSION_LOCK && type == P_REQ) 34 | lock_mask = WRITE_BIT; 35 | 36 | while ((v2 | RTS_MASK) != (v | RTS_MASK)) { 37 | v = _ts_word; 38 | while (v & lock_mask) { 39 | PAUSE 40 | v = _ts_word; 41 | } 42 | local_row->copy(_row); 43 | COMPILER_BARRIER 44 | v2 = _ts_word; 45 | #if WRITE_PERMISSION_LOCK 46 | if (type == R_REQ) { 47 | v |= WRITE_BIT; 48 | v2 |= WRITE_BIT; 49 | } 50 | #endif 51 | } 52 | txn->last_wts = v & WTS_MASK; 53 | txn->last_rts = ((v & RTS_MASK) >> WTS_LEN) + txn->last_wts; 54 | #else 55 | lock(); 56 | txn->last_wts = _wts; 57 | txn->last_rts = _rts; 58 | local_row->copy(_row); 59 | release(); 60 | #endif 61 | return RCOK; 62 | } 63 | 64 | void 65 | Row_tictoc::write_data(row_t * data, ts_t wts) 66 | { 67 | #if ATOMIC_WORD 68 | uint64_t v = _ts_word; 69 | #if TICTOC_MV 70 | _hist_wts = v & WTS_MASK; 71 | #endif 72 | #if WRITE_PERMISSION_LOCK 73 | assert(__sync_bool_compare_and_swap(&_ts_word, v, v | LOCK_BIT)); 74 | #endif 75 | v &= ~(RTS_MASK | WTS_MASK); // clear wts and rts. 
76 | v |= wts; 77 | _ts_word = v; 78 | _row->copy(data); 79 | #if WRITE_PERMISSION_LOCK 80 | _ts_word &= (~LOCK_BIT); 81 | #endif 82 | #else 83 | #if TICTOC_MV 84 | _hist_wts = _wts; 85 | #endif 86 | _wts = wts; 87 | _rts = wts; 88 | _row->copy(data); 89 | #endif 90 | } 91 | 92 | bool 93 | Row_tictoc::renew_lease(ts_t wts, ts_t rts) 94 | { 95 | #if !ATOMIC_WORD 96 | if (_wts != wts) { 97 | #if TICTOC_MV 98 | if (wts == _hist_wts && rts < _wts) 99 | return true; 100 | #endif 101 | return false; 102 | } 103 | _rts = rts; 104 | #endif 105 | return true; 106 | } 107 | 108 | bool 109 | Row_tictoc::try_renew(ts_t wts, ts_t rts, ts_t &new_rts, uint64_t thd_id) 110 | { 111 | #if ATOMIC_WORD 112 | uint64_t v = _ts_word; 113 | uint64_t lock_mask = (WRITE_PERMISSION_LOCK)? WRITE_BIT : LOCK_BIT; 114 | if ((v & WTS_MASK) == wts && ((v & RTS_MASK) >> WTS_LEN) >= rts - wts) 115 | return true; 116 | if (v & lock_mask) 117 | return false; 118 | #if TICTOC_MV 119 | COMPILER_BARRIER 120 | uint64_t hist_wts = _hist_wts; 121 | if (wts != (v & WTS_MASK)) { 122 | if (wts == hist_wts && rts < (v & WTS_MASK)) { 123 | return true; 124 | } else { 125 | return false; 126 | } 127 | } 128 | #else 129 | if (wts != (v & WTS_MASK)) 130 | return false; 131 | #endif 132 | 133 | ts_t delta_rts = rts - wts; 134 | if (delta_rts < ((v & RTS_MASK) >> WTS_LEN)) // the rts has already been extended. 
135 | return true; 136 | bool rebase = false; 137 | if (delta_rts >= (1 << RTS_LEN)) { 138 | rebase = true; 139 | uint64_t delta = (delta_rts & ~((1 << RTS_LEN) - 1)); 140 | delta_rts &= ((1 << RTS_LEN) - 1); 141 | wts += delta; 142 | } 143 | uint64_t v2 = 0; 144 | v2 |= wts; 145 | v2 |= (delta_rts << WTS_LEN); 146 | while (true) { 147 | uint64_t pre_v = __sync_val_compare_and_swap(&_ts_word, v, v2); 148 | if (pre_v == v) 149 | return true; 150 | v = pre_v; 151 | if (rebase || (v & lock_mask) || (wts != (v & WTS_MASK))) 152 | return false; 153 | else if (rts < ((v & RTS_MASK) >> WTS_LEN)) 154 | return true; 155 | } 156 | assert(false); 157 | return false; 158 | #else 159 | #if TICTOC_MV 160 | if (wts < _hist_wts) 161 | return false; 162 | #else 163 | if (wts != _wts) 164 | return false; 165 | #endif 166 | int ret = pthread_mutex_trylock( _latch ); 167 | if (ret == EBUSY) 168 | return false; 169 | 170 | if (wts != _wts) { 171 | #if TICTOC_MV 172 | if (wts == _hist_wts && rts < _wts) { 173 | pthread_mutex_unlock( _latch ); 174 | return true; 175 | } 176 | #endif 177 | pthread_mutex_unlock( _latch ); 178 | return false; 179 | } 180 | if (rts > _rts) 181 | _rts = rts; 182 | pthread_mutex_unlock( _latch ); 183 | new_rts = rts; 184 | return true; 185 | #endif 186 | } 187 | 188 | 189 | ts_t 190 | Row_tictoc::get_wts() 191 | { 192 | #if ATOMIC_WORD 193 | return _ts_word & WTS_MASK; 194 | #else 195 | return _wts; 196 | #endif 197 | } 198 | 199 | void 200 | Row_tictoc::get_ts_word(bool &lock, uint64_t &rts, uint64_t &wts) 201 | { 202 | assert(ATOMIC_WORD); 203 | uint64_t v = _ts_word; 204 | lock = ((v & LOCK_BIT) != 0); 205 | wts = v & WTS_MASK; 206 | rts = ((v & RTS_MASK) >> WTS_LEN) + (v & WTS_MASK); 207 | } 208 | 209 | ts_t 210 | Row_tictoc::get_rts() 211 | { 212 | #if ATOMIC_WORD 213 | uint64_t v = _ts_word; 214 | return ((v & RTS_MASK) >> WTS_LEN) + (v & WTS_MASK); 215 | #else 216 | return _rts; 217 | #endif 218 | 219 | } 220 | 221 | void 222 | Row_tictoc::lock() 223 | 
{ 224 | #if ATOMIC_WORD 225 | uint64_t lock_mask = (WRITE_PERMISSION_LOCK)? WRITE_BIT : LOCK_BIT; 226 | uint64_t v = _ts_word; 227 | while ((v & lock_mask) || !__sync_bool_compare_and_swap(&_ts_word, v, v | lock_mask)) { 228 | PAUSE 229 | v = _ts_word; 230 | } 231 | #else 232 | pthread_mutex_lock( _latch ); 233 | #endif 234 | } 235 | 236 | bool 237 | Row_tictoc::try_lock() 238 | { 239 | #if ATOMIC_WORD 240 | uint64_t lock_mask = (WRITE_PERMISSION_LOCK)? WRITE_BIT : LOCK_BIT; 241 | uint64_t v = _ts_word; 242 | if (v & lock_mask) // already locked 243 | return false; 244 | return __sync_bool_compare_and_swap(&_ts_word, v, v | lock_mask); 245 | #else 246 | return pthread_mutex_trylock( _latch ) != EBUSY; 247 | #endif 248 | } 249 | 250 | void 251 | Row_tictoc::release() 252 | { 253 | #if ATOMIC_WORD 254 | uint64_t lock_mask = (WRITE_PERMISSION_LOCK)? WRITE_BIT : LOCK_BIT; 255 | _ts_word &= (~lock_mask); 256 | #else 257 | pthread_mutex_unlock( _latch ); 258 | #endif 259 | } 260 | 261 | #endif 262 | -------------------------------------------------------------------------------- /concurrency_control/row_tictoc.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | #if CC_ALG == TICTOC 6 | 7 | #if WRITE_PERMISSION_LOCK 8 | 9 | #define LOCK_BIT (1UL << 63) 10 | #define WRITE_BIT (1UL << 62) 11 | #define RTS_LEN (15) 12 | #define WTS_LEN (62 - RTS_LEN) 13 | #define WTS_MASK ((1UL << WTS_LEN) - 1) 14 | #define RTS_MASK (((1UL << RTS_LEN) - 1) << WTS_LEN) 15 | 16 | #else 17 | 18 | #define LOCK_BIT (1UL << 63) 19 | #define WRITE_BIT (1UL << 63) 20 | #define RTS_LEN (15) 21 | #define WTS_LEN (63 - RTS_LEN) 22 | #define WTS_MASK ((1UL << WTS_LEN) - 1) 23 | #define RTS_MASK (((1UL << RTS_LEN) - 1) << WTS_LEN) 24 | 25 | #endif 26 | 27 | class txn_man; 28 | class row_t; 29 | 30 | class Row_tictoc { 31 | public: 32 | void init(row_t * row); 33 | RC access(txn_man * txn, TsType type, row_t * local_row); 
34 | #if SPECULATE 35 | RC write_speculate(row_t * data, ts_t version, bool spec_read); 36 | #endif 37 | void write_data(row_t * data, ts_t wts); 38 | void write_ptr(row_t * data, ts_t wts, char *& data_to_free); 39 | bool renew_lease(ts_t wts, ts_t rts); 40 | bool try_renew(ts_t wts, ts_t rts, ts_t &new_rts, uint64_t thd_id); 41 | 42 | void lock(); 43 | bool try_lock(); 44 | void release(); 45 | 46 | ts_t get_wts(); 47 | ts_t get_rts(); 48 | void get_ts_word(bool &lock, uint64_t &rts, uint64_t &wts); 49 | private: 50 | row_t * _row; 51 | #if ATOMIC_WORD 52 | volatile uint64_t _ts_word; 53 | #else 54 | ts_t _wts; // last write timestamp 55 | ts_t _rts; // end lease timestamp 56 | pthread_mutex_t * _latch; 57 | #endif 58 | #if TICTOC_MV 59 | volatile ts_t _hist_wts; 60 | #endif 61 | }; 62 | 63 | #endif 64 | -------------------------------------------------------------------------------- /concurrency_control/row_ts.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_TS_H 2 | #define ROW_TS_H 3 | 4 | class table_t; 5 | class Catalog; 6 | class txn_man; 7 | struct TsReqEntry { 8 | txn_man * txn; 9 | // for write requests, need to have a copy of the data to write. 
10 | row_t * row; 11 | itemid_t * item; 12 | ts_t ts; 13 | TsReqEntry * next; 14 | }; 15 | 16 | class Row_ts { 17 | public: 18 | void init(row_t * row); 19 | RC access(txn_man * txn, TsType type, row_t * row); 20 | 21 | private: 22 | pthread_mutex_t * latch; 23 | bool blatch; 24 | 25 | void buffer_req(TsType type, txn_man * txn, row_t * row); 26 | TsReqEntry * debuffer_req(TsType type, txn_man * txn); 27 | TsReqEntry * debuffer_req(TsType type, ts_t ts); 28 | TsReqEntry * debuffer_req(TsType type, txn_man * txn, ts_t ts); 29 | void update_buffer(); 30 | ts_t cal_min(TsType type); 31 | TsReqEntry * get_req_entry(); 32 | void return_req_entry(TsReqEntry * entry); 33 | void return_req_list(TsReqEntry * list); 34 | 35 | row_t * _row; 36 | ts_t wts; 37 | ts_t rts; 38 | ts_t min_wts; 39 | ts_t min_rts; 40 | ts_t min_pts; 41 | 42 | TsReqEntry * readreq; 43 | TsReqEntry * writereq; 44 | TsReqEntry * prereq; 45 | uint64_t preq_len; 46 | }; 47 | 48 | #endif 49 | -------------------------------------------------------------------------------- /concurrency_control/row_vll.cpp: -------------------------------------------------------------------------------- 1 | #include "row.h" 2 | #include "row_vll.h" 3 | #include "global.h" 4 | #include "helper.h" 5 | 6 | void 7 | Row_vll::init(row_t * row) { 8 | _row = row; 9 | cs = 0; 10 | cx = 0; 11 | } 12 | 13 | bool 14 | Row_vll::insert_access(access_t type) { 15 | if (type == RD) { 16 | cs ++; 17 | return (cx > 0); 18 | } else { 19 | cx ++; 20 | return (cx > 1) || (cs > 0); 21 | } 22 | } 23 | 24 | void 25 | Row_vll::remove_access(access_t type) { 26 | if (type == RD) { 27 | assert (cs > 0); 28 | cs --; 29 | } else { 30 | assert (cx > 0); 31 | cx --; 32 | } 33 | } 34 | -------------------------------------------------------------------------------- /concurrency_control/row_vll.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_VLL_H 2 | #define ROW_VLL_H 3 | 4 | class Row_vll { 5 | public: 
6 | void init(row_t * row); 7 | // return true : the access is blocked. 8 | // return false : the access is NOT blocked 9 | bool insert_access(access_t type); 10 | void remove_access(access_t type); 11 | int get_cs() { return cs; }; 12 | private: 13 | row_t * _row; 14 | int cs; 15 | int cx; 16 | }; 17 | 18 | #endif 19 | -------------------------------------------------------------------------------- /concurrency_control/row_ww.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_WW_H 2 | #define ROW_WW_H 3 | 4 | #include "row_lock.h" 5 | 6 | class Row_ww { 7 | public: 8 | void init(row_t * row); 9 | RC lock_get(lock_t type, txn_man * txn, Access * access); 10 | RC lock_get(lock_t type, txn_man * txn, uint64_t* &txnids, int &txncnt, Access * access); 11 | RC lock_release(LockEntry * entry); 12 | void lock(txn_man * txn); 13 | void unlock(txn_man * txn); 14 | 15 | private: 16 | #if LATCH == LH_SPINLOCK 17 | pthread_spinlock_t * latch; 18 | #elif LATCH == LH_MUTEX 19 | pthread_mutex_t * latch; 20 | #else 21 | mcslock * latch; 22 | #endif 23 | bool blatch; 24 | 25 | bool conflict_lock(lock_t l1, lock_t l2); 26 | static LockEntry * get_entry(Access * access); 27 | static void return_entry(LockEntry * entry); 28 | void bring_next(); 29 | 30 | row_t * _row; 31 | // owner's lock type 32 | lock_t lock_type; 33 | UInt32 owner_cnt; 34 | UInt32 waiter_cnt; 35 | 36 | // owners is a singly linked list 37 | // waiters is a doubly linked list 38 | // [waiters] head is the oldest txn, tail is the youngest txn. 39 | // New txns are inserted at the tail.
40 | LockEntry * owners; 41 | LockEntry * waiters_head; 42 | LockEntry * waiters_tail; 43 | }; 44 | 45 | #endif 46 | -------------------------------------------------------------------------------- /concurrency_control/silo.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_silo.h" 4 | 5 | #if CC_ALG == SILO 6 | 7 | RC 8 | txn_man::validate_silo() 9 | { 10 | RC rc = RCOK; 11 | // lock write tuples in the primary key order. 12 | int write_set[wr_cnt]; 13 | int cur_wr_idx = 0; 14 | int read_set[row_cnt - wr_cnt]; 15 | int cur_rd_idx = 0; 16 | for (int rid = 0; rid < row_cnt; rid ++) { 17 | if (accesses[rid]->type == WR) 18 | write_set[cur_wr_idx ++] = rid; 19 | else 20 | read_set[cur_rd_idx ++] = rid; 21 | } 22 | 23 | // bubble sort the write set, in primary key order 24 | for (int i = wr_cnt - 1; i >= 1; i--) { 25 | for (int j = 0; j < i; j++) { 26 | if (accesses[ write_set[j] ]->orig_row->get_primary_key() > 27 | accesses[ write_set[j + 1] ]->orig_row->get_primary_key()) 28 | { 29 | int tmp = write_set[j]; 30 | write_set[j] = write_set[j+1]; 31 | write_set[j+1] = tmp; 32 | } 33 | } 34 | } 35 | 36 | int num_locks = 0; 37 | ts_t max_tid = 0; 38 | bool done = false; 39 | if (_pre_abort) { 40 | for (int i = 0; i < wr_cnt; i++) { 41 | row_t * row = accesses[ write_set[i] ]->orig_row; 42 | if (row->manager->get_tid() != accesses[write_set[i]]->tid) { 43 | rc = Abort; 44 | goto final; 45 | } 46 | } 47 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 48 | Access * access = accesses[ read_set[i] ]; 49 | if (access->orig_row->manager->get_tid() != accesses[read_set[i]]->tid) { 50 | rc = Abort; 51 | goto final; 52 | } 53 | } 54 | } 55 | 56 | // lock all rows in the write set. 
57 | if (_validation_no_wait) { 58 | while (!done) { 59 | num_locks = 0; 60 | for (int i = 0; i < wr_cnt; i++) { 61 | row_t * row = accesses[ write_set[i] ]->orig_row; 62 | if (!row->manager->try_lock()) 63 | break; 64 | row->manager->assert_lock(); 65 | num_locks ++; 66 | if (row->manager->get_tid() != accesses[write_set[i]]->tid) 67 | { 68 | rc = Abort; 69 | goto final; 70 | } 71 | } 72 | if (num_locks == wr_cnt) 73 | done = true; 74 | else { 75 | for (int i = 0; i < num_locks; i++) 76 | accesses[ write_set[i] ]->orig_row->manager->release(); 77 | if (_pre_abort) { 78 | num_locks = 0; 79 | for (int i = 0; i < wr_cnt; i++) { 80 | row_t * row = accesses[ write_set[i] ]->orig_row; 81 | if (row->manager->get_tid() != accesses[write_set[i]]->tid) { 82 | rc = Abort; 83 | goto final; 84 | } 85 | } 86 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 87 | Access * access = accesses[ read_set[i] ]; 88 | if (access->orig_row->manager->get_tid() != accesses[read_set[i]]->tid) { 89 | rc = Abort; 90 | goto final; 91 | } 92 | } 93 | } 94 | PAUSE 95 | } 96 | } 97 | } else { 98 | for (int i = 0; i < wr_cnt; i++) { 99 | row_t * row = accesses[ write_set[i] ]->orig_row; 100 | row->manager->lock(); 101 | num_locks++; 102 | if (row->manager->get_tid() != accesses[write_set[i]]->tid) { 103 | rc = Abort; 104 | goto final; 105 | } 106 | } 107 | } 108 | 109 | // validate rows in the read set 110 | // for repeatable_read, no need to validate the read set. 
111 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 112 | Access * access = accesses[ read_set[i] ]; 113 | bool success = access->orig_row->manager->validate(access->tid, false); 114 | if (!success) { 115 | rc = Abort; 116 | goto final; 117 | } 118 | if (access->tid > max_tid) 119 | max_tid = access->tid; 120 | } 121 | // validate rows in the write set 122 | for (int i = 0; i < wr_cnt; i++) { 123 | Access * access = accesses[ write_set[i] ]; 124 | bool success = access->orig_row->manager->validate(access->tid, true); 125 | if (!success) { 126 | rc = Abort; 127 | goto final; 128 | } 129 | if (access->tid > max_tid) 130 | max_tid = access->tid; 131 | } 132 | if (max_tid > _cur_tid) 133 | _cur_tid = max_tid + 1; 134 | else 135 | _cur_tid ++; 136 | final: 137 | if (rc == Abort) { 138 | for (int i = 0; i < num_locks; i++) 139 | accesses[ write_set[i] ]->orig_row->manager->release(); 140 | cleanup(rc); 141 | } else { 142 | for (int i = 0; i < wr_cnt; i++) { 143 | Access * access = accesses[ write_set[i] ]; 144 | access->orig_row->manager->write( 145 | access->data, _cur_tid ); 146 | accesses[ write_set[i] ]->orig_row->manager->release(); 147 | } 148 | cleanup(rc); 149 | } 150 | return rc; 151 | } 152 | #endif 153 | -------------------------------------------------------------------------------- /concurrency_control/silo_prio.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_silo_prio.h" 4 | 5 | #if CC_ALG == SILO_PRIO 6 | 7 | RC 8 | txn_man::validate_silo_prio() 9 | { 10 | RC rc = RCOK; 11 | // lock write tuples in the primary key order. 
12 | int cur_wr_idx = 0; 13 | int cur_rd_idx = 0; 14 | int write_set[wr_cnt]; 15 | int read_set[row_cnt - wr_cnt]; 16 | for (int rid = 0; rid < row_cnt; rid ++) { 17 | if (accesses[rid]->type == WR) 18 | write_set[cur_wr_idx ++] = rid; 19 | else 20 | read_set[cur_rd_idx ++] = rid; 21 | } 22 | 23 | // bubble sort the write set, in primary key order 24 | for (int i = wr_cnt - 1; i >= 1; i--) { 25 | for (int j = 0; j < i; j++) { 26 | if (accesses[ write_set[j] ]->orig_row->get_primary_key() > 27 | accesses[ write_set[j + 1] ]->orig_row->get_primary_key()) 28 | { 29 | int tmp = write_set[j]; 30 | write_set[j] = write_set[j+1]; 31 | write_set[j+1] = tmp; 32 | } 33 | } 34 | } 35 | 36 | int num_locks = 0; 37 | ts_t max_data_ver = 0; 38 | bool done = false; 39 | if (_pre_abort) { 40 | for (int i = 0; i < wr_cnt; i++) { 41 | row_t * row = accesses[ write_set[i] ]->orig_row; 42 | if (row->manager->get_data_ver() != accesses[write_set[i]]->data_ver) { 43 | rc = Abort; 44 | goto final; 45 | } 46 | } 47 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 48 | Access * access = accesses[ read_set[i] ]; 49 | if (access->orig_row->manager->get_data_ver() != accesses[read_set[i]]->data_ver) { 50 | rc = Abort; 51 | goto final; 52 | } 53 | } 54 | } 55 | 56 | // lock all rows in the write set. 
57 | if (_validation_no_wait) { 58 | while (!done) { 59 | num_locks = 0; 60 | for (int i = 0; i < wr_cnt; i++) { 61 | row_t * row = accesses[ write_set[i] ]->orig_row; 62 | if (row->manager->try_lock(prio) != Row_silo_prio::LOCK_STATUS::LOCK_DONE) 63 | break; 64 | row->manager->assert_lock(); 65 | num_locks ++; 66 | if (row->manager->get_data_ver() != accesses[write_set[i]]->data_ver) 67 | { 68 | rc = Abort; 69 | goto final; 70 | } 71 | } 72 | if (num_locks == wr_cnt) 73 | done = true; 74 | else { 75 | for (int i = 0; i < num_locks; i++) 76 | accesses[ write_set[i] ]->orig_row->manager->unlock(); 77 | if (_pre_abort) { 78 | num_locks = 0; 79 | for (int i = 0; i < wr_cnt; i++) { 80 | row_t * row = accesses[ write_set[i] ]->orig_row; 81 | if (row->manager->get_data_ver() != accesses[write_set[i]]->data_ver) { 82 | rc = Abort; 83 | goto final; 84 | } 85 | } 86 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 87 | Access * access = accesses[ read_set[i] ]; 88 | if (access->orig_row->manager->get_data_ver() != accesses[read_set[i]]->data_ver) { 89 | rc = Abort; 90 | goto final; 91 | } 92 | } 93 | } 94 | PAUSE 95 | } 96 | } 97 | } else { 98 | /** 99 | * This path does not work: releasing the latch requires resetting prio, 100 | * prio_ver, etc., so we simply disallow this operation. 101 | */ 102 | assert(false); 103 | for (int i = 0; i < wr_cnt; i++) { 104 | row_t * row = accesses[ write_set[i] ]->orig_row; 105 | Row_silo_prio::LOCK_STATUS ls = row->manager->lock(prio); 106 | if (ls == Row_silo_prio::LOCK_STATUS::LOCK_ERR_PRIO) { 107 | rc = Abort; 108 | goto final; 109 | } 110 | num_locks++; 111 | if (row->manager->get_data_ver() != accesses[write_set[i]]->data_ver) { 112 | rc = Abort; 113 | goto final; 114 | } 115 | } 116 | } 117 | 118 | // validate rows in the read set 119 | // for repeatable_read, no need to validate the read set.
120 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 121 | Access * access = accesses[ read_set[i] ]; 122 | bool success = access->orig_row->manager->validate(access->data_ver, false); 123 | if (!success) { 124 | rc = Abort; 125 | goto final; 126 | } 127 | if (access->data_ver > max_data_ver) 128 | max_data_ver = access->data_ver; 129 | } 130 | // validate rows in the write set 131 | for (int i = 0; i < wr_cnt; i++) { 132 | Access * access = accesses[ write_set[i] ]; 133 | bool success = access->orig_row->manager->validate(access->data_ver, true); 134 | if (!success) { 135 | rc = Abort; 136 | goto final; 137 | } 138 | if (access->data_ver > max_data_ver) 139 | max_data_ver = access->data_ver; 140 | } 141 | if (max_data_ver > _cur_data_ver) 142 | _cur_data_ver = max_data_ver + 1; 143 | else 144 | _cur_data_ver ++; 145 | final: 146 | // we release the priority and ref_cnt together with the latch 147 | // for those rows with latch acquired (all read-only row and some write rows) 148 | // we release them in cleanup() 149 | if (rc == Abort) { 150 | for (int i = 0; i < num_locks; i++) { 151 | Access * access = accesses[ write_set[i] ]; 152 | access->orig_row->manager->writer_release_abort(prio, access->prio_ver); 153 | assert(access->is_reserved || SILO_PRIO_NO_RESERVE_LOWEST_PRIO); 154 | access->is_reserved = false; 155 | } 156 | cleanup(rc); 157 | } else { 158 | for (int i = 0; i < wr_cnt; i++) { 159 | Access * access = accesses[ write_set[i] ]; 160 | access->orig_row->manager->write(access->data); 161 | access->orig_row->manager->writer_release_commit(_cur_data_ver); 162 | assert(access->is_reserved || SILO_PRIO_NO_RESERVE_LOWEST_PRIO); 163 | access->is_reserved = false; 164 | } 165 | cleanup(rc); 166 | } 167 | return rc; 168 | } 169 | #endif 170 | -------------------------------------------------------------------------------- /concurrency_control/vll.cpp: -------------------------------------------------------------------------------- 1 | #include "vll.h" 2 | 
#include "txn.h" 3 | #include "table.h" 4 | #include "row.h" 5 | #include "row_vll.h" 6 | #include "ycsb_query.h" 7 | #include "ycsb.h" 8 | #include "wl.h" 9 | #include "catalog.h" 10 | #include "mem_alloc.h" 11 | #if CC_ALG == VLL 12 | 13 | void 14 | VLLMan::init() { 15 | _txn_queue_size = 0; 16 | _txn_queue = NULL; 17 | _txn_queue_tail = NULL; 18 | } 19 | 20 | void 21 | VLLMan::vllMainLoop(txn_man * txn, base_query * query) { 22 | 23 | ycsb_query * m_query = (ycsb_query *) query; 24 | // access the indexes. This is not in the critical section 25 | for (int rid = 0; rid < m_query->request_cnt; rid ++) { 26 | ycsb_request * req = &m_query->requests[rid]; 27 | ycsb_wl * wl = (ycsb_wl *) txn->get_wl(); 28 | int part_id = wl->key_to_part( req->key ); 29 | INDEX * index = wl->the_index; 30 | itemid_t * item; 31 | item = txn->index_read(index, req->key, part_id); 32 | row_t * row = ((row_t *)item->location); 33 | // the following line adds the read/write sets to txn->accesses 34 | txn->get_row(row, req->rtype); 35 | int cs = row->manager->get_cs(); 36 | } 37 | 38 | bool done = false; 39 | while (!done) { 40 | txn_man * front_txn = NULL; 41 | uint64_t t5 = get_sys_clock(); 42 | pthread_mutex_lock(&_mutex); 43 | uint64_t tt5 = get_sys_clock() - t5; 44 | INC_STATS(txn->get_thd_id(), debug5, tt5); 45 | 46 | 47 | TxnQEntry * front = _txn_queue; 48 | if (front) 49 | front_txn = front->txn; 50 | // only one worker thread can execute the txn. 
51 | if (front_txn && front_txn->vll_txn_type == VLL_Blocked) { 52 | front_txn->vll_txn_type = VLL_Free; 53 | pthread_mutex_unlock(&_mutex); 54 | execute(front_txn, query); 55 | finishTxn( front_txn, front); 56 | } else { 57 | // _mutex will be unlocked in beginTxn() 58 | TxnQEntry * entry = NULL; 59 | int ok = beginTxn(txn, query, entry); 60 | if (ok == 2) { 61 | execute(txn, query); 62 | finishTxn(txn, entry); 63 | } 64 | assert(ok == 1 || ok == 2); 65 | done = true; 66 | } 67 | } 68 | return; 69 | } 70 | 71 | int 72 | VLLMan::beginTxn(txn_man * txn, base_query * query, TxnQEntry *& entry) { 73 | 74 | int ret = -1; 75 | if (_txn_queue_size >= TXN_QUEUE_SIZE_LIMIT) 76 | ret = 3; 77 | 78 | txn->vll_txn_type = VLL_Free; 79 | assert(WORKLOAD == YCSB); 80 | 81 | for (int rid = 0; rid < txn->row_cnt; rid ++ ) { 82 | access_t type = txn->accesses[rid]->type; 83 | if (txn->accesses[rid]->orig_row->manager->insert_access(type)) 84 | txn->vll_txn_type = VLL_Blocked; 85 | } 86 | 87 | entry = getQEntry(); 88 | LIST_PUT_TAIL(_txn_queue, _txn_queue_tail, entry); 89 | if (txn->vll_txn_type == VLL_Blocked) 90 | ret = 1; 91 | else 92 | ret = 2; 93 | pthread_mutex_unlock(&_mutex); 94 | return ret; 95 | } 96 | 97 | void 98 | VLLMan::execute(txn_man * txn, base_query * query) { 99 | RC rc; 100 | uint64_t t3 = get_sys_clock(); 101 | ycsb_query * m_query = (ycsb_query *) query; 102 | ycsb_wl * wl = (ycsb_wl *) txn->get_wl(); 103 | Catalog * schema = wl->the_table->get_schema(); 104 | uint64_t average; 105 | for (int rid = 0; rid < txn->row_cnt; rid ++) { 106 | row_t * row = txn->accesses[rid]->orig_row; 107 | access_t type = txn->accesses[rid]->type; 108 | if (type == RD) { 109 | for (int fid = 0; fid < schema->get_field_cnt(); fid++) { 110 | char * data = row->get_data(); 111 | uint64_t fval = *(uint64_t *)(&data[fid * 100]); 112 | } 113 | } else { 114 | assert(type == WR); 115 | for (int fid = 0; fid < schema->get_field_cnt(); fid++) { 116 | char * data = row->get_data(); 117 | 
*(uint64_t *)(&data[fid * 100]) = 0; 118 | } 119 | } 120 | } 121 | uint64_t tt3 = get_sys_clock() - t3; 122 | INC_STATS(txn->get_thd_id(), debug3, tt3); 123 | } 124 | 125 | void 126 | VLLMan::finishTxn(txn_man * txn, TxnQEntry * entry) { 127 | pthread_mutex_lock(&_mutex); 128 | 129 | for (int rid = 0; rid < txn->row_cnt; rid ++ ) { 130 | access_t type = txn->accesses[rid]->type; 131 | txn->accesses[rid]->orig_row->manager->remove_access(type); 132 | } 133 | LIST_REMOVE_HT(entry, _txn_queue, _txn_queue_tail); 134 | pthread_mutex_unlock(&_mutex); 135 | txn->release(); 136 | mem_allocator.free(txn, 0); 137 | } 138 | 139 | 140 | TxnQEntry * 141 | VLLMan::getQEntry() { 142 | TxnQEntry * entry = (TxnQEntry *) mem_allocator.alloc(sizeof(TxnQEntry), 0); 143 | entry->prev = NULL; 144 | entry->next = NULL; 145 | entry->txn = NULL; 146 | return entry; 147 | } 148 | 149 | void 150 | VLLMan::returnQEntry(TxnQEntry * entry) { 151 | mem_allocator.free(entry, sizeof(TxnQEntry)); 152 | } 153 | 154 | #endif 155 | -------------------------------------------------------------------------------- /concurrency_control/vll.h: -------------------------------------------------------------------------------- 1 | #ifndef _VLL_H_ 2 | #define _VLL_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | #include "query.h" 7 | 8 | class txn_man; 9 | 10 | class TxnQEntry { 11 | public: 12 | TxnQEntry * prev; 13 | TxnQEntry * next; 14 | txn_man * txn; 15 | }; 16 | 17 | class VLLMan { 18 | public: 19 | void init(); 20 | void vllMainLoop(txn_man * next_txn, base_query * query); 21 | // 1: txn is blocked 22 | // 2: txn is not blocked. Can run. 23 | // 3: txn_queue is full. 
24 | int beginTxn(txn_man * txn, base_query * query, TxnQEntry *& entry); 25 | void finishTxn(txn_man * txn, TxnQEntry * entry); 26 | void execute(txn_man * txn, base_query * query); 27 | private: 28 | TxnQEntry * _txn_queue; 29 | TxnQEntry * _txn_queue_tail; 30 | int _txn_queue_size; 31 | pthread_mutex_t _mutex; 32 | 33 | TxnQEntry * getQEntry(); 34 | void returnQEntry(TxnQEntry * entry); 35 | }; 36 | 37 | #endif 38 | -------------------------------------------------------------------------------- /config.cpp: -------------------------------------------------------------------------------- 1 | #include "config.h" 2 | 3 | TPCCTxnType g_tpcc_txn_type = TPCC_ALL; 4 | TestCases g_test_case = CONFLICT; 5 | 6 | -------------------------------------------------------------------------------- /experiments/debug.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 50000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 10, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "YCSB", 23 | "SYNTHETIC_YCSB": "true", 24 | "ZIPF_THETA": 0, 25 | "READ_PERC": 1, 26 | "POS_HS": "SPECIFIED", 27 | "SPECIFIED_RATIO": 1.0, 28 | "FLIP_RATIO": 0, 29 | "NUM_HS": 2, 30 | "FIRST_HS": "WR", 31 | "SECOND_HS": "WR", 32 | "FIXED_HS": 1, 33 | 34 | "LONG_TXN_RATIO": 0, 35 | "REQ_PER_QUERY": 16, 36 | "SYNTH_TABLE_SIZE": 100000, 37 | 38 | "UNSET_NUMA": "false", 39 | "COMPILE_ONLY": "false", 40 | "NDEBUG": "false" 41 | } 42 | -------------------------------------------------------------------------------- /experiments/default.json: 
-------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 50000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "true", 7 | "MAX_RUNTIME": 10, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "YCSB", 23 | "SYNTHETIC_YCSB": "false", 24 | "ZIPF_THETA": 0.99, 25 | "READ_PERC": 0.5, 26 | "LONG_TXN_RATIO": 0, 27 | "MAX_ROW_PER_TXN": 1000, 28 | "REQ_PER_QUERY": 16, 29 | "SYNTH_TABLE_SIZE": 10000000, 30 | 31 | "UNSET_NUMA": "false", 32 | "COMPILE_ONLY": "false", 33 | "NDEBUG": "true" 34 | } 35 | -------------------------------------------------------------------------------- /experiments/large_dataset.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 1000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 20, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "YCSB", 23 | "SYNTHETIC_YCSB": "false", 24 | "ZIPF_THETA": 0.99, 25 | "READ_PERC": 0.5, 26 | "LONG_TXN_RATIO": 0, 27 | "MAX_ROW_PER_TXN": 1000, 28 | "REQ_PER_QUERY": 16, 29 | "SYNTH_TABLE_SIZE": 100000000, 30 | 31 | "UNSET_NUMA": "false", 32 | "COMPILE_ONLY": "false", 33 | "NDEBUG": "true" 34 | } 35 | -------------------------------------------------------------------------------- 
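The experiment JSON files above are base configurations; the run scripts in this directory layer `KEY=VALUE` overrides on top of them when invoking `test.py` (e.g. `python3 test.py experiments/large_dataset.json THREAD_CNT=64 CC_ALG=SILO`). Below is a minimal sketch of how such overrides can be applied to a base config; the `load_config` helper is illustrative only, not the repo's actual `test.py` logic:

```python
# Sketch (assumption: not the repo's actual test.py): layer the KEY=VALUE
# overrides that the run scripts pass on the command line over a base
# experiment JSON, e.g.
#   python3 test.py experiments/large_dataset.json THREAD_CNT=64 CC_ALG=SILO
import json

def load_config(path, overrides):
    """Load a base JSON config, then apply KEY=VALUE overrides in order."""
    with open(path) as f:
        cfg = json.load(f)
    for kv in overrides:
        key, sep, value = kv.partition("=")
        assert sep, f"expected KEY=VALUE, got: {kv}"
        cfg[key] = value  # kept as a string, exactly as the shell passes it
    return cfg
```

Overridden values stay strings, mirroring how the shell scripts quote them; keys absent from the base file (such as `DUMP_LATENCY_FILENAME`) are simply added.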
/experiments/long_txn.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 100000, 4 | "ABORT_PENALTY": 1000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 10, 8 | "WARMUP": 100, 9 | 10 | "PF_BASIC": "true", 11 | "PF_CS": "false", 12 | "PF_ABORT": "false", 13 | 14 | "CC_ALG": "SILO_PRIO", 15 | "WW_STARV_FREE": "true", 16 | "BB_DYNAMIC_TS": "true", 17 | "BB_OPT_RAW": "true", 18 | "BB_LAST_RETIRE": 0, 19 | "BB_PRECOMMIT": "false", 20 | "BB_AUTORETIRE": "false", 21 | "BB_ALWAYS_RETIRE_READ": "true", 22 | 23 | "WORKLOAD": "YCSB", 24 | "SYNTHETIC_YCSB": "false", 25 | "ZIPF_THETA": 0.99, 26 | "READ_PERC": 0.5, 27 | "LONG_TXN_RATIO": 0.05, 28 | "LONG_TXN_READ_RATIO": 1, 29 | "MAX_ROW_PER_TXN": 1000, 30 | "REQ_PER_QUERY": 16, 31 | "SYNTH_TABLE_SIZE": 100000000, 32 | 33 | "UNSET_NUMA": "false", 34 | "COMPILE_ONLY": "false", 35 | "NDEBUG": "true" 36 | } 37 | -------------------------------------------------------------------------------- /experiments/run_all.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # assume the current working directory is `polaris/` 3 | bash experiments/run_ycsb_latency.sh # fig 1, 7 4 | bash experiments/run_ycsb_prio_sen.sh # fig 2 5 | bash experiments/run_ycsb_thread.sh # fig 3 6 | bash experiments/run_ycsb_readonly.sh # fig 4 7 | bash experiments/run_ycsb_zipf.sh # fig 5, 6 8 | bash experiments/run_tpcc_thread.sh # fig 8, 9 9 | bash experiments/run_ycsb_aria.sh # fig 10, 11 10 | -------------------------------------------------------------------------------- /experiments/run_tpcc_thread.sh: -------------------------------------------------------------------------------- 1 | exper=tpcc_thread 2 | mkdir -p results 3 | num_wh=1 4 | 5 | for thd in 1 4 8 16 24 32 40 48 56 64; do 6 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | 
data_dir="results/${exper}/TPCC-CC=${alg}-THD=${thd}-NUM_WH=${num_wh}" 8 | mkdir -p "${data_dir}" 9 | if [ "$thd" != "64" ] && [ "$thd" != "16" ]; then 10 | python3 test.py experiments/tpcc.json THREAD_CNT=${thd} CC_ALG=${alg} NUM_WH=${num_wh} | tee "${data_dir}/log" 11 | else 12 | python3 test.py experiments/tpcc.json THREAD_CNT=${thd} CC_ALG=${alg} NUM_WH=${num_wh} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 13 | fi 14 | done 15 | done 16 | 17 | num_wh=64 18 | 19 | for thd in 1 4 8 16 24 32 40 48 56 64; do 20 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 21 | data_dir="results/${exper}/TPCC-CC=${alg}-THD=${thd}-NUM_WH=${num_wh}" 22 | mkdir -p "${data_dir}" 23 | if [ "$thd" != "64" ] && [ "$thd" != "16" ]; then 24 | python3 test.py experiments/tpcc.json THREAD_CNT=${thd} CC_ALG=${alg} NUM_WH=${num_wh} | tee "${data_dir}/log" 25 | else 26 | python3 test.py experiments/tpcc.json THREAD_CNT=${thd} CC_ALG=${alg} NUM_WH=${num_wh} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 27 | fi 28 | done 29 | done 30 | 31 | python3 parse.py "${exper}" 32 | -------------------------------------------------------------------------------- /experiments/run_ycsb_aria.sh: -------------------------------------------------------------------------------- 1 | # Note: it turns out p999 of ARIA can vary a lot; to get relatively stable 2 | # results, we make the experiment duration longer and dump all latencies 3 | 4 | exper=ycsb_aria_batch 5 | mkdir -p results 6 | 7 | alg=ARIA 8 | for zipf in 0.99 0.5; do 9 | for thd in 1 4 8 16 24 32 40 48 56 64; do 10 | for batch in 1 2 4 8; do 11 | data_dir="results/${exper}/YCSB-CC=${alg}_${batch}-THD=${thd}-ZIPF=${zipf}" 12 | mkdir -p "${data_dir}" 13 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} ARIA_BATCH_SIZE=${batch} DUMP_LATENCY=true
DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" MAX_RUNTIME=40 | tee "${data_dir}/log" 14 | done 15 | done 16 | done 17 | 18 | # we then run SILO_PRIO for comparison 19 | alg=SILO_PRIO 20 | for zipf in 0.99 0.5; do 21 | for thd in 1 4 8 16 24 32 40 48 56 64; do 22 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 23 | mkdir -p "${data_dir}" 24 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" MAX_RUNTIME=40 | tee "${data_dir}/log" 25 | done 26 | done 27 | 28 | python3 parse.py "${exper}" 29 | -------------------------------------------------------------------------------- /experiments/run_ycsb_latency.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_latency 2 | mkdir -p results 3 | zipf=0.99 4 | thd=64 5 | 6 | for alg in SILO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 8 | mkdir -p "${data_dir}" 9 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 10 | done 11 | 12 | alg=SILO_PRIO 13 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 14 | mkdir -p "${data_dir}" 15 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} HIGH_PRIO_RATIO=0.05 DUMP_LATENCY=true SILO_PRIO_FIXED_PRIO=false LOW_PRIO_BOUND=7 DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 16 | 17 | alg=SILO_PRIO_FIXED 18 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 19 | mkdir -p "${data_dir}" 20 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=SILO_PRIO HIGH_PRIO_RATIO=0.05 DUMP_LATENCY=true SILO_PRIO_FIXED_PRIO=true
DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 21 | 22 | python3 parse.py "${exper}" 23 | -------------------------------------------------------------------------------- /experiments/run_ycsb_prio_sen.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_prio_sen 2 | mkdir -p results 3 | zipf=0.99 4 | thd=64 5 | alg=SILO_PRIO 6 | 7 | for pr in 0 0.05 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1; do 8 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}-PRIO_RATIO=${pr}" 9 | mkdir -p "${data_dir}" 10 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} SILO_PRIO_FIXED_PRIO=true HIGH_PRIO_RATIO=${pr} | tee "${data_dir}/log" 11 | done 12 | 13 | python3 parse.py "${exper}" 14 | -------------------------------------------------------------------------------- /experiments/run_ycsb_readonly.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_readonly 2 | mkdir -p results 3 | zipf=0.99 4 | 5 | for thd in 1 4 8 16 24 32 40 48 56 64; do 6 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 8 | mkdir -p "${data_dir}" 9 | if [ "$thd" != "64" ] && [ "$thd" != "16" ]; then 10 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} READ_PERC=1 | tee "${data_dir}/log" 11 | else 12 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} READ_PERC=1 DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 13 | fi 14 | done 15 | done 16 | 17 | python3 parse.py "${exper}" 18 | -------------------------------------------------------------------------------- /experiments/run_ycsb_thread.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_thread 2 | mkdir 
-p results 3 | zipf=0.99 4 | 5 | for thd in 1 4 8 16 24 32 40 48 56 64; do 6 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 8 | mkdir -p "${data_dir}" 9 | if [ "$thd" != "64" ] && [ "$thd" != "16" ]; then 10 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} | tee "${data_dir}/log" 11 | else 12 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 13 | fi 14 | done 15 | done 16 | 17 | python3 parse.py "${exper}" 18 | -------------------------------------------------------------------------------- /experiments/run_ycsb_zipf.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_zipf 2 | mkdir -p results 3 | thd=64 4 | 5 | for zipf in 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.99 1.1 1.2 1.3 1.4 1.5; do 6 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 8 | mkdir -p "${data_dir}" 9 | if [ "$zipf" != "0" ] && [ "$zipf" != "0.9" ] && [ "$zipf" != "0.99" ] && [ "$zipf" != "1.5" ]; then 10 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} | tee "${data_dir}/log" 11 | else 12 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 13 | fi 14 | done 15 | done 16 | 17 | python3 parse.py "${exper}" 18 | -------------------------------------------------------------------------------- /experiments/synthetic_ycsb.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 32, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 50000, 5 | 
"LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 10, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "YCSB", 23 | "SYNTHETIC_YCSB": "true", 24 | "ZIPF_THETA": 0, 25 | "READ_PERC": 1, 26 | "POS_HS": "SPECIFIED", 27 | "SPECIFIED_RATIO": 0, 28 | "FLIP_RATIO": 0, 29 | "NUM_HS": 1, 30 | "FIRST_HS": "WR", 31 | "SECOND_HS": "WR", 32 | "FIXED_HS": 0, 33 | 34 | "LONG_TXN_RATIO": 0, 35 | "REQ_PER_QUERY": 16, 36 | "SYNTH_TABLE_SIZE": 100000000, 37 | 38 | "UNSET_NUMA": "false", 39 | "COMPILE_ONLY": "false", 40 | "NDEBUG": "true" 41 | } 42 | -------------------------------------------------------------------------------- /experiments/tpcc.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 1000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 20, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "TPCC", 23 | "NUM_WH": 1, 24 | "TPCC_USER_ABORT": "true", 25 | 26 | "UNSET_NUMA": "false", 27 | "COMPILE_ONLY": "false", 28 | "NDEBUG": "true" 29 | } 30 | -------------------------------------------------------------------------------- /libs/libjemalloc.a: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/chenhao-ye/polaris/d4d7b7b4ba41ae2ac401efdbbf7f64cc4dc961a9/libs/libjemalloc.a -------------------------------------------------------------------------------- /outputs/collect_stats.py: -------------------------------------------------------------------------------- 1 | import json 2 | import pandas as pd 3 | import numpy as np 4 | import sys 5 | 6 | data = [] 7 | if len(sys.argv) > 1: 8 | fname = sys.argv[1] 9 | else: 10 | fname = "stats.json" 11 | f = open(fname) 12 | for line in f: 13 | data.append(json.loads(line.strip())) 14 | df = pd.DataFrame(data=data) 15 | df.to_csv("stats.csv", index=False) 16 | ''' 17 | df = df.dropna(how='all',axis=1) 18 | summarized = [] 19 | idx = np.where(df.columns.values == 'abort_cnt')[0][0] 20 | for cls, group in df.groupby(list(df.columns[:idx])): 21 | summarized.append(group.loc[group['throughput'].idxmax()]) 22 | summarized = pd.DataFrame(summarized, columns = df.columns).reset_index(drop=True) 23 | summarized.to_csv("stats-summarized.csv", index=False) 24 | ''' 25 | -------------------------------------------------------------------------------- /parse.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import os 3 | import re 4 | import sys 5 | import os.path 6 | from typing import List 7 | 8 | 9 | class DataPoint(): 10 | # regex for directory name, which encodes experiment metadata 11 | re_dirname = re.compile( 12 | r'\A(?P<wl>[A-Z]+)-CC=(?P<cc_alg>[A-Z_0-9]+)-THD=(?P<thread_cnt>[0-9]+)(-ZIPF=(?P<zipf_theta>[0-9.]+))?(-NUM_WH=(?P<num_wh>[0-9]+))?(-PRIO_RATIO=(?P<prio_ratio>[0-9.]+))?\Z') 13 | # regex to filter throughput 14 | re_throughput = re.compile( 15 | r'\A\[summary\] throughput=(?P<throughput>[0-9.e+]+),') 16 | # regex to filter tail latency 17 | re_tail = re.compile( 18 | r'\A\[(?P<tag>\S+):tail\]\s+txn_cnt=(?P<txn_cnt>[0-9]+)(, p50=(?P<p50>[0-9.]+))?(, p90=(?P<p90>[0-9.]+))?(, p99=(?P<p99>[0-9.]+))?(, p999=(?P<p999>[0-9.]+))?(, p9999=(?P<p9999>[0-9.]+))?') 19 | # regex to filter per-priority breakdown 20 | re_prio_breakdown = re.compile( 21 | r'\A\[prio=(?P<prio>\d+)\]\s+txn_cnt=(?P<txn_cnt>[0-9]+), abort_cnt=(?P<abort_cnt>[0-9]+), abort_time=(?P<abort_time>[0-9]+), exec_time=(?P<exec_time>[0-9]+), backoff_time=(?P<backoff_time>[0-9]+), ') 22 | 23 | def __init__(self, prefix: str, dirname: str) -> None: 24 | d = self.re_dirname.match(dirname).groupdict() 25 | self.wl = d['wl'] 26 | assert self.wl in {"YCSB", "TPCC"} 27 | self.params = d # cc_alg, thread_cnt, zipf_theta, num_wh, prio_ratio 28 | self.throughput = None 29 | self.tail = {} 30 | self.prio_breakdown = {} 31 | with open(os.path.join(prefix, dirname, "log"), 'r') as f: 32 | for line in f: 33 | if line.startswith('[summary]'): 34 | self.throughput = self.re_throughput.match(line).groupdict()[ 35 | 'throughput'] 36 | else: 37 | m = self.re_tail.match(line) 38 | if m is not None: 39 | d = m.groupdict() 40 | self.tail[d['tag']] = { 41 | k: v for k, v in d.items() if k != 'tag'} 42 | m = self.re_prio_breakdown.match(line) 43 | if m is not None: 44 | d = m.groupdict() 45 | self.prio_breakdown[d['prio']] = { 46 | k: v for k, v in d.items() if k != 'prio'} 47 | 48 | def get_base_header(self) -> List[str]: 49 | return [ 50 | p for p in ["cc_alg", "thread_cnt", "zipf_theta", "num_wh", "prio_ratio"] 51 | if self.params.get(p) 52 | ] 53 | 54 | def get_base_data(self) -> List: 55 | return [ 56 | self.params.get(p) for p in ["cc_alg", "thread_cnt", "zipf_theta", "num_wh", "prio_ratio"] 57 | if self.params.get(p) 58 | ] 59 | 60 | def get_throughput_header(self) -> List[str]: 61 | h = self.get_base_header() 62 | h.append("throughput") 63 | return h 64 | 65 | def get_throughput_data(self) -> List[str]: 66 | d = self.get_base_data() 67 | d.append(str(self.throughput)) 68 | return d 69 | 70 | def get_tail_header(self) -> List[str]: 71 | h = self.get_base_header() 72 | h.extend(['tag', 'p50', 'p99', 'p999', 'p9999']) 73 | return h 74 | 75 | def get_tail_data(self) -> List[List]: 76 | d = self.get_base_data() 77 | return [ 78 | d + [tag, str(tail['p50']), str(tail['p99']), 79 | str(tail['p999']),
str(tail['p9999'])] 80 | for tag, tail in self.tail.items() 81 | ] 82 | 83 | 84 | def parse_datapoint(prefix: str, dirname: str) -> DataPoint: 85 | if dirname.startswith("YCSB") or dirname.startswith("TPCC"): 86 | return DataPoint(prefix, dirname) 87 | else: 88 | print(f"Unknown experiment: {dirname}") 89 | 90 | 91 | def dump_throughput(datapoints: List[DataPoint], path: str, has_header: bool = True): 92 | with open(path, 'w') as f: 93 | if has_header: 94 | f.write(f"{','.join(datapoints[0].get_throughput_header())}\n") 95 | for dp in datapoints: 96 | f.write(f"{','.join(dp.get_throughput_data())}\n") 97 | 98 | 99 | def dump_tail(datapoints: List[DataPoint], path: str, has_header: bool = True): 100 | with open(path, 'w') as f: 101 | if has_header: 102 | f.write(f"{','.join(datapoints[0].get_tail_header())}\n") 103 | for dp in datapoints: 104 | for l in dp.get_tail_data(): 105 | f.write(f"{','.join(l)}\n") 106 | 107 | 108 | if __name__ == "__main__": 109 | if len(sys.argv) <= 1: 110 | print(f"Usage: {sys.argv[0]} exper1 [exper2 [exper3...]]") 111 | sys.exit(1) 112 | exper_list = sys.argv[1:] 113 | 114 | for exper in exper_list: 115 | dp_list = [ 116 | parse_datapoint(f'results/{exper}', d) 117 | for d in os.listdir(f'results/{exper}') 118 | if os.path.isdir(f'results/{exper}/{d}') 119 | ] 120 | dump_throughput(dp_list, f'results/{exper}/throughput.csv') 121 | dump_tail(dp_list, f'results/{exper}/tail.csv') 122 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib 2 | pandas 3 | numpy 4 | -------------------------------------------------------------------------------- /storage/catalog.cpp: -------------------------------------------------------------------------------- 1 | #include "catalog.h" 2 | #include "global.h" 3 | #include "helper.h" 4 | 5 | void 6 | Catalog::init(const char * table_name, int field_cnt) { 7 |
this->table_name = table_name; 8 | this->field_cnt = 0; 9 | this->_columns = new Column [field_cnt]; 10 | this->tuple_size = 0; 11 | } 12 | 13 | void Catalog::add_col(char * col_name, uint64_t size, char * type) { 14 | _columns[field_cnt].size = size; 15 | strcpy(_columns[field_cnt].type, type); 16 | strcpy(_columns[field_cnt].name, col_name); 17 | _columns[field_cnt].id = field_cnt; 18 | _columns[field_cnt].index = tuple_size; 19 | tuple_size += size; 20 | field_cnt ++; 21 | } 22 | 23 | uint64_t Catalog::get_field_id(const char * name) { 24 | UInt32 i; 25 | for (i = 0; i < field_cnt; i++) { 26 | if (strcmp(name, _columns[i].name) == 0) 27 | break; 28 | } 29 | assert (i < field_cnt); 30 | return i; 31 | } 32 | 33 | char * Catalog::get_field_type(uint64_t id) { 34 | return _columns[id].type; 35 | } 36 | 37 | char * Catalog::get_field_name(uint64_t id) { 38 | return _columns[id].name; 39 | } 40 | 41 | 42 | char * Catalog::get_field_type(char * name) { 43 | return get_field_type( get_field_id(name) ); 44 | } 45 | 46 | uint64_t Catalog::get_field_index(char * name) { 47 | return get_field_index( get_field_id(name) ); 48 | } 49 | 50 | void Catalog::print_schema() { 51 | printf("\n[Catalog] %s\n", table_name); 52 | for (UInt32 i = 0; i < field_cnt; i++) { 53 | printf("\t%s\t%s\t%ld\n", get_field_name(i), 54 | get_field_type(i), get_field_size(i)); 55 | } 56 | } 57 | -------------------------------------------------------------------------------- /storage/catalog.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | #include 5 | #include "global.h" 6 | #include "helper.h" 7 | 8 | class Column { 9 | public: 10 | Column() { 11 | this->type = new char[80]; 12 | this->name = new char[80]; 13 | } 14 | Column(uint64_t size, char * type, char * name, 15 | uint64_t id, uint64_t index) 16 | { 17 | this->size = size; 18 | this->id = id; 19 | this->index = index; 20 | this->type = new char[80]; 21 | this->name = new 
char[80]; 22 | strcpy(this->type, type); 23 | strcpy(this->name, name); 24 | }; 25 | 26 | UInt64 id; 27 | UInt32 size; 28 | UInt32 index; 29 | char * type; 30 | char * name; 31 | char pad[CL_SIZE - sizeof(uint64_t)*3 - sizeof(char *)*2]; 32 | }; 33 | 34 | class Catalog { 35 | public: 36 | // abandoned init function 37 | // field_size is the size of each field. 38 | void init(const char * table_name, int field_cnt); 39 | void add_col(char * col_name, uint64_t size, char * type); 40 | 41 | UInt32 field_cnt; 42 | const char * table_name; 43 | 44 | UInt32 get_tuple_size() { return tuple_size; }; 45 | 46 | uint64_t get_field_cnt() { return field_cnt; }; 47 | uint64_t get_field_size(int id) { return _columns[id].size; }; 48 | uint64_t get_field_index(int id) { return _columns[id].index; }; 49 | char * get_field_type(uint64_t id); 50 | char * get_field_name(uint64_t id); 51 | uint64_t get_field_id(const char * name); 52 | char * get_field_type(char * name); 53 | uint64_t get_field_index(char * name); 54 | 55 | void print_schema(); 56 | Column * _columns; 57 | UInt32 tuple_size; 58 | }; 59 | 60 | -------------------------------------------------------------------------------- /storage/index_base.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class table_t; 6 | 7 | class index_base { 8 | public: 9 | virtual RC init() { return RCOK; }; 10 | virtual RC init(uint64_t size) { return RCOK; }; 11 | 12 | virtual bool index_exist(idx_key_t key)=0; // check if the key exists.
13 | 14 | virtual RC index_insert(idx_key_t key, 15 | itemid_t * item, 16 | int part_id=-1)=0; 17 | 18 | virtual RC index_read(idx_key_t key, 19 | itemid_t * &item, 20 | int part_id=-1)=0; 21 | 22 | virtual RC index_read(idx_key_t key, 23 | itemid_t * &item, 24 | int part_id=-1, int thd_id=0)=0; 25 | 26 | // TODO implement index_remove 27 | virtual RC index_remove(idx_key_t key) { return RCOK; }; 28 | 29 | // the index is on "table". The key is the merged key of "fields" 30 | table_t * table; 31 | }; 32 | -------------------------------------------------------------------------------- /storage/index_btree.h: -------------------------------------------------------------------------------- 1 | #ifndef _BTREE_H_ 2 | #define _BTREE_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | #include "index_base.h" 7 | 8 | 9 | typedef struct bt_node { 10 | // TODO bad hack! 11 | void ** pointers; // for non-leaf nodes, point to bt_nodes 12 | bool is_leaf; 13 | idx_key_t * keys; 14 | bt_node * parent; 15 | UInt32 num_keys; 16 | bt_node * next; 17 | bool latch; 18 | pthread_mutex_t locked; 19 | latch_t latch_type; 20 | UInt32 share_cnt; 21 | } bt_node; 22 | 23 | struct glob_param { 24 | uint64_t part_id; 25 | }; 26 | 27 | class index_btree : public index_base { 28 | public: 29 | RC init(uint64_t part_cnt); 30 | RC init(uint64_t part_cnt, table_t * table); 31 | bool index_exist(idx_key_t key); // check if the key exists. 32 | RC index_insert(idx_key_t key, itemid_t * item, int part_id = -1); 33 | RC index_read(idx_key_t key, itemid_t * &item, 34 | uint64_t thd_id, int64_t part_id = -1); 35 | RC index_read(idx_key_t key, itemid_t * &item, int part_id = -1); 36 | RC index_read(idx_key_t key, itemid_t * &item); 37 | RC index_next(uint64_t thd_id, itemid_t * &item, bool samekey = false); 38 | 39 | private: 40 | // index structures may have part_cnt = 1 or PART_CNT.
41 | uint64_t part_cnt; 42 | RC make_lf(uint64_t part_id, bt_node *& node); 43 | RC make_nl(uint64_t part_id, bt_node *& node); 44 | RC make_node(uint64_t part_id, bt_node *& node); 45 | 46 | RC start_new_tree(glob_param params, idx_key_t key, itemid_t * item); 47 | RC find_leaf(glob_param params, idx_key_t key, idx_acc_t access_type, bt_node *& leaf, bt_node *& last_ex); 48 | RC find_leaf(glob_param params, idx_key_t key, idx_acc_t access_type, bt_node *& leaf); 49 | RC insert_into_leaf(glob_param params, bt_node * leaf, idx_key_t key, itemid_t * item); 50 | // handle split 51 | RC split_lf_insert(glob_param params, bt_node * leaf, idx_key_t key, itemid_t * item); 52 | RC split_nl_insert(glob_param params, bt_node * node, UInt32 left_index, idx_key_t key, bt_node * right); 53 | RC insert_into_parent(glob_param params, bt_node * left, idx_key_t key, bt_node * right); 54 | RC insert_into_new_root(glob_param params, bt_node * left, idx_key_t key, bt_node * right); 55 | 56 | int leaf_has_key(bt_node * leaf, idx_key_t key); 57 | 58 | UInt32 cut(UInt32 length); 59 | UInt32 order; // # of keys in a node (for both leaf and non-leaf) 60 | bt_node ** roots; // each partition has a different root 61 | bt_node * find_root(uint64_t part_id); 62 | 63 | bool latch_node(bt_node * node, latch_t latch_type); 64 | latch_t release_latch(bt_node * node); 65 | RC upgrade_latch(bt_node * node); 66 | // clean up all the LATCH_EX up to last_ex 67 | RC cleanup(bt_node * node, bt_node * last_ex); 68 | 69 | // the leaf and the idx within the leaf that the thread last accessed.
70 | bt_node *** cur_leaf_per_thd; 71 | UInt32 ** cur_idx_per_thd; 72 | }; 73 | 74 | #endif 75 | -------------------------------------------------------------------------------- /storage/index_hash.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "index_hash.h" 3 | #include "mem_alloc.h" 4 | #include "table.h" 5 | 6 | RC IndexHash::init(uint64_t bucket_cnt, int part_cnt) { 7 | _bucket_cnt = bucket_cnt; 8 | _bucket_cnt_per_part = bucket_cnt / part_cnt; 9 | _buckets = new BucketHeader * [part_cnt]; 10 | for (int i = 0; i < part_cnt; i++) { 11 | _buckets[i] = (BucketHeader *) _mm_malloc(sizeof(BucketHeader) * _bucket_cnt_per_part, 64); 12 | for (uint32_t n = 0; n < _bucket_cnt_per_part; n ++) 13 | _buckets[i][n].init(); 14 | } 15 | return RCOK; 16 | } 17 | 18 | RC 19 | IndexHash::init(int part_cnt, table_t * table, uint64_t bucket_cnt) { 20 | init(bucket_cnt, part_cnt); 21 | this->table = table; 22 | return RCOK; 23 | } 24 | 25 | bool IndexHash::index_exist(idx_key_t key) { 26 | assert(false); 27 | return false; 28 | } 29 | 30 | void 31 | IndexHash::get_latch(BucketHeader * bucket) { 32 | while (!ATOM_CAS(bucket->locked, false, true)) {} 33 | } 34 | 35 | void 36 | IndexHash::release_latch(BucketHeader * bucket) { 37 | bool ok = ATOM_CAS(bucket->locked, true, false); 38 | assert(ok); 39 | // XXX(zhihan): change to read/write lock 40 | //pthread_rwlock_unlock(bucket->rwlock); 41 | } 42 | 43 | void 44 | IndexHash::get_latch(BucketHeader * bucket, access_t access) { 45 | while (!ATOM_CAS(bucket->locked, false, true)) {} 46 | /* 47 | // XXX(zhihan): rwlock 48 | if (access == RD) 49 | pthread_rwlock_rdlock(bucket->rwlock); 50 | else 51 | pthread_rwlock_wrlock(bucket->rwlock); 52 | */ 53 | } 54 | 55 | RC IndexHash::index_insert(idx_key_t key, itemid_t * item, int part_id) { 56 | RC rc = RCOK; 57 | uint64_t bkt_idx = hash(key); 58 | assert(bkt_idx < _bucket_cnt_per_part); 59 | BucketHeader * cur_bkt = 
&_buckets[part_id][bkt_idx]; 60 | // 1. get the ex latch 61 | get_latch(cur_bkt, WR); 62 | // 2. update the latch list 63 | cur_bkt->insert_item(key, item, part_id); 64 | // 3. release the latch 65 | release_latch(cur_bkt); 66 | return rc; 67 | } 68 | 69 | RC IndexHash::index_read(idx_key_t key, itemid_t * &item, int part_id) { 70 | //TODO(zhihan): take read lock 71 | uint64_t bkt_idx = hash(key); 72 | assert(bkt_idx < _bucket_cnt_per_part); 73 | BucketHeader * cur_bkt = &_buckets[part_id][bkt_idx]; 74 | RC rc = RCOK; 75 | // 1. get the sh latch 76 | //get_latch(cur_bkt, RD); 77 | cur_bkt->read_item(key, item, table->get_table_name()); 78 | // 3. release the latch 79 | //release_latch(cur_bkt); 80 | return rc; 81 | 82 | } 83 | 84 | RC IndexHash::index_read(idx_key_t key, itemid_t * &item, 85 | int part_id, int thd_id) { 86 | uint64_t bkt_idx = hash(key); 87 | assert(bkt_idx < _bucket_cnt_per_part); 88 | BucketHeader * cur_bkt = &_buckets[part_id][bkt_idx]; 89 | RC rc = RCOK; 90 | // 1. get the sh latch 91 | //get_latch(cur_bkt, RD); 92 | cur_bkt->read_item(key, item, table->get_table_name()); 93 | // 3. 
release the latch 94 | //release_latch(cur_bkt); 95 | return rc; 96 | } 97 | 98 | /************** BucketHeader Operations ******************/ 99 | 100 | void BucketHeader::init() { 101 | node_cnt = 0; 102 | first_node = NULL; 103 | locked = false; 104 | // XXX(zhihan): init another rw lock 105 | rwlock = new pthread_rwlock_t; 106 | pthread_rwlock_init(rwlock, NULL); 107 | } 108 | 109 | void BucketHeader::insert_item(idx_key_t key, 110 | itemid_t * item, 111 | int part_id) 112 | { 113 | BucketNode * cur_node = first_node; 114 | BucketNode * prev_node = NULL; 115 | while (cur_node != NULL) { 116 | if (cur_node->key == key) 117 | break; 118 | prev_node = cur_node; 119 | cur_node = cur_node->next; 120 | } 121 | if (cur_node == NULL) { 122 | BucketNode * new_node = (BucketNode *) 123 | mem_allocator.alloc(sizeof(BucketNode), part_id ); 124 | new_node->init(key); 125 | new_node->items = item; 126 | if (prev_node != NULL) { 127 | new_node->next = prev_node->next; 128 | prev_node->next = new_node; 129 | } else { 130 | new_node->next = first_node; 131 | first_node = new_node; 132 | } 133 | } else { 134 | item->next = cur_node->items; 135 | cur_node->items = item; 136 | } 137 | } 138 | 139 | void BucketHeader::read_item(idx_key_t key, itemid_t * &item, const char * tname) 140 | { 141 | BucketNode * cur_node = first_node; 142 | while (cur_node != NULL) { 143 | if (cur_node->key == key) 144 | break; 145 | cur_node = cur_node->next; 146 | } 147 | M_ASSERT(cur_node != NULL, "Key does not exist!"); // check before dereferencing 148 | item = cur_node->items; 149 | } 150 | -------------------------------------------------------------------------------- /storage/index_hash.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | #include "helper.h" 5 | #include "index_base.h" 6 | 7 | //TODO make proper variables private 8 | // each BucketNode contains items sharing the same key 9 | class BucketNode { 10 | public: 11 |
BucketNode(idx_key_t key) { init(key); }; 12 | void init(idx_key_t key) { 13 | this->key = key; 14 | next = NULL; 15 | items = NULL; 16 | } 17 | idx_key_t key; 18 | // The node for the next key 19 | BucketNode * next; 20 | // NOTE. The items can be a list of items connected by the next pointer. 21 | itemid_t * items; 22 | }; 23 | 24 | // BucketHeader does concurrency control of Hash 25 | class BucketHeader { 26 | public: 27 | void init(); 28 | void insert_item(idx_key_t key, itemid_t * item, int part_id); 29 | void read_item(idx_key_t key, itemid_t * &item, const char * tname); 30 | BucketNode * first_node; 31 | uint64_t node_cnt; 32 | bool locked; 33 | pthread_rwlock_t * rwlock; 34 | }; 35 | 36 | // TODO Hash index does not support partition yet. 37 | class IndexHash : public index_base 38 | { 39 | public: 40 | RC init(uint64_t bucket_cnt, int part_cnt); 41 | RC init(int part_cnt, 42 | table_t * table, 43 | uint64_t bucket_cnt); 44 | bool index_exist(idx_key_t key); // check if the key exists.
45 | RC index_insert(idx_key_t key, itemid_t * item, int part_id=-1); 46 | // the following call returns a single item 47 | RC index_read(idx_key_t key, itemid_t * &item, int part_id=-1); 48 | RC index_read(idx_key_t key, itemid_t * &item, 49 | int part_id=-1, int thd_id=0); 50 | private: 51 | void get_latch(BucketHeader * bucket); 52 | void get_latch(BucketHeader * bucket, access_t access); 53 | void release_latch(BucketHeader * bucket); 54 | 55 | // TODO implement more complex hash function 56 | uint64_t hash(idx_key_t key) { return key % _bucket_cnt_per_part; } 57 | 58 | BucketHeader ** _buckets; 59 | uint64_t _bucket_cnt; 60 | uint64_t _bucket_cnt_per_part; 61 | }; 62 | -------------------------------------------------------------------------------- /storage/row.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | #define DECL_SET_VALUE(type) \ 6 | void set_value(int col_id, type value); 7 | 8 | #define SET_VALUE(type) \ 9 | void row_t::set_value(int col_id, type value) { \ 10 | set_value(col_id, &value); \ 11 | } 12 | 13 | #define DECL_GET_VALUE(type)\ 14 | void get_value(int col_id, type & value); 15 | 16 | #define GET_VALUE(type)\ 17 | void row_t::get_value(int col_id, type & value) { \ 18 | value = *(type *)get_value(col_id); \ 19 | } 20 | 21 | // int pos = get_schema()->get_field_index(col_id); 22 | // value = *(type *)&data[pos]; 23 | // } 24 | 25 | class Access; 26 | class table_t; 27 | class Catalog; 28 | class txn_man; 29 | class Row_lock; 30 | class Row_mvcc; 31 | class Row_hekaton; 32 | class Row_ts; 33 | class Row_occ; 34 | class Row_tictoc; 35 | class Row_silo; 36 | class Row_silo_prio; 37 | class Row_aria; 38 | class Row_vll; 39 | class Row_ww; 40 | class Row_bamboo; 41 | //class Row_bamboo_pt; 42 | class Row_ic3; 43 | #if CC_ALG == WOUND_WAIT || CC_ALG == WAIT_DIE || CC_ALG == NO_WAIT || CC_ALG == DL_DETECT 44 | struct LockEntry; 45 | #elif CC_ALG == BAMBOO 46 | 
struct BBLockEntry; 47 | #endif 48 | 49 | class row_t 50 | { 51 | public: 52 | 53 | RC init(table_t * host_table, uint64_t part_id, uint64_t row_id = 0); 54 | void init(int size); 55 | RC switch_schema(table_t * host_table); 56 | // not every row has a manager 57 | void init_manager(row_t * row); 58 | 59 | table_t * get_table(); 60 | Catalog * get_schema(); 61 | const char * get_table_name(); 62 | uint64_t get_field_cnt(); 63 | uint64_t get_tuple_size(); 64 | uint64_t get_row_id() { return _row_id; }; 65 | 66 | void copy(row_t * src); 67 | void copy(row_t * src, int idx); 68 | 69 | void set_primary_key(uint64_t key) { _primary_key = key; }; 70 | uint64_t get_primary_key() {return _primary_key; }; 71 | uint64_t get_part_id() { return _part_id; }; 72 | 73 | void set_value(int id, void * ptr); 74 | void set_value_plain(int id, void * ptr); 75 | void set_value(int id, void * ptr, int size); 76 | void set_value(const char * col_name, void * ptr); 77 | char * get_value(int id); 78 | char * get_value_plain(uint64_t id); 79 | char * get_value(char * col_name); 80 | void inc_value(int id, uint64_t val); 81 | void dec_value(int id, uint64_t val); 82 | 83 | DECL_SET_VALUE(uint64_t); 84 | DECL_SET_VALUE(int64_t); 85 | DECL_SET_VALUE(double); 86 | DECL_SET_VALUE(UInt32); 87 | DECL_SET_VALUE(SInt32); 88 | 89 | DECL_GET_VALUE(uint64_t); 90 | DECL_GET_VALUE(int64_t); 91 | DECL_GET_VALUE(double); 92 | DECL_GET_VALUE(UInt32); 93 | DECL_GET_VALUE(SInt32); 94 | 95 | 96 | void set_data(char * data, uint64_t size); 97 | char * get_data(); 98 | 99 | void free_row(); 100 | 101 | // for concurrency control. can be lock, timestamp etc. 
102 | #if CC_ALG == BAMBOO 103 | RC retire_row(BBLockEntry * lock_entry); 104 | #elif CC_ALG == IC3 105 | row_t * orig; 106 | void init_accesses(Access * access); 107 | Access * txn_access; // only used when row is a local copy 108 | #endif 109 | RC get_row(access_t type, txn_man * txn, row_t *& row, Access *access=NULL); 110 | #if CC_ALG == BAMBOO 111 | void return_row(BBLockEntry * lock_entry, RC rc); 112 | #elif CC_ALG == WOUND_WAIT 113 | void return_row(LockEntry * lock_entry, RC rc); 114 | #endif 115 | #if CC_ALG == WOUND_WAIT || CC_ALG == WAIT_DIE || CC_ALG == NO_WAIT || CC_ALG == DL_DETECT 116 | void return_row(access_t type, row_t * row, LockEntry * lock_entry); 117 | #endif 118 | void return_row(access_t type, txn_man * txn, row_t * row); 119 | 120 | #if CC_ALG == DL_DETECT || CC_ALG == NO_WAIT || CC_ALG == WAIT_DIE 121 | Row_lock * manager; 122 | #elif CC_ALG == TIMESTAMP 123 | Row_ts * manager; 124 | #elif CC_ALG == MVCC 125 | Row_mvcc * manager; 126 | #elif CC_ALG == HEKATON 127 | Row_hekaton * manager; 128 | #elif CC_ALG == OCC 129 | Row_occ * manager; 130 | #elif CC_ALG == TICTOC 131 | Row_tictoc * manager; 132 | #elif CC_ALG == SILO 133 | Row_silo * manager; 134 | #elif CC_ALG == SILO_PRIO 135 | Row_silo_prio * manager; 136 | #elif CC_ALG == ARIA 137 | Row_aria * manager; 138 | #elif CC_ALG == VLL 139 | Row_vll * manager; 140 | #elif CC_ALG == WOUND_WAIT 141 | Row_ww * manager; 142 | #elif CC_ALG == BAMBOO 143 | Row_bamboo * manager; 144 | #elif CC_ALG == IC3 145 | Row_ic3 * manager; 146 | #endif 147 | char * data; 148 | table_t * table; 149 | private: 150 | // primary key should be calculated from the data stored in the row. 
151 | uint64_t _primary_key; 152 | uint64_t _part_id; 153 | uint64_t _row_id; 154 | }; 155 | -------------------------------------------------------------------------------- /storage/table.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "table.h" 4 | #include "catalog.h" 5 | #include "row.h" 6 | #include "mem_alloc.h" 7 | 8 | void table_t::init(Catalog * schema) { 9 | this->table_name = schema->table_name; 10 | this->schema = schema; 11 | } 12 | 13 | RC table_t::get_new_row(row_t *& row) { 14 | // this function is obsolete. 15 | assert(false); 16 | return RCOK; 17 | } 18 | 19 | // the row is not stored locally. the pointer must be maintained by index structure. 20 | RC table_t::get_new_row(row_t *& row, uint64_t part_id, uint64_t &row_id) { 21 | RC rc = RCOK; 22 | cur_tab_size ++; 23 | 24 | row = (row_t *) _mm_malloc(sizeof(row_t), 64); 25 | rc = row->init(this, part_id, row_id); 26 | row->init_manager(row); 27 | 28 | return rc; 29 | } 30 | -------------------------------------------------------------------------------- /storage/table.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | // TODO sequential scan is not supported yet. 6 | // only index access is supported for table. 7 | class Catalog; 8 | class row_t; 9 | 10 | class table_t 11 | { 12 | public: 13 | void init(Catalog * schema); 14 | // row lookup should be done with index. But index does not have 15 | // records for new rows. get_new_row returns the pointer to a 16 | // new row. 
17 | RC get_new_row(row_t *& row); // this is equivalent to insert() 18 | RC get_new_row(row_t *& row, uint64_t part_id, uint64_t &row_id); 19 | 20 | void delete_row(); // TODO delete_row is not supported yet 21 | 22 | uint64_t get_table_size() { return cur_tab_size; }; 23 | Catalog * get_schema() { return schema; }; 24 | const char * get_table_name() { return table_name; }; 25 | 26 | Catalog * schema; 27 | private: 28 | const char * table_name; 29 | uint64_t cur_tab_size; 30 | char pad[CL_SIZE - sizeof(void *)*3]; 31 | }; 32 | -------------------------------------------------------------------------------- /system/amd64.h: -------------------------------------------------------------------------------- 1 | // 2 | // Implemented by authors of SILO. 3 | // 4 | 5 | #ifndef _AMD64_H_ 6 | #define _AMD64_H_ 7 | 8 | #include 9 | 10 | #define ALWAYS_INLINE __attribute__((always_inline)) 11 | 12 | inline ALWAYS_INLINE void 13 | nop_pause() { 14 | __asm volatile("pause" : :); 15 | } 16 | 17 | inline void 18 | memory_barrier() { 19 | asm volatile("mfence" : : : "memory"); 20 | } 21 | 22 | #endif /* _AMD64_H_ */ 23 | 24 | -------------------------------------------------------------------------------- /system/batch.cpp: -------------------------------------------------------------------------------- 1 | 2 | #include "batch.h" 3 | #include "wl.h" 4 | 5 | void BatchMgr::BatchBuffer::init_txn(workload* wl, thread_t* thd) { 6 | for (auto& e: batch) { 7 | RC rc = wl->get_txn_man(e.txn, thd); 8 | assert(rc == RCOK); 9 | } 10 | } 11 | -------------------------------------------------------------------------------- /system/batch.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | #include "txn.h" 5 | 6 | struct BatchEntry { 7 | txn_man* txn; // init once and reused repeatedly 8 | base_query* query; // the current query to execute 9 | RC rc; // current state; can be Abort if its reservation fails 10 |
ts_t start_ts; // if zero, this is a newly started txn 11 | // the name is a little confusing: exec_time here includes validation phases 12 | uint64_t exec_time_curr; // current execution time (not yet abort/commit) 13 | uint64_t exec_time_abort; // execution time spent on attempts that eventually aborted 14 | uint64_t txn_id; 15 | }; 16 | 17 | /* 18 | * Batch manager. 19 | * Currently only used by Aria. It manages txns batch by batch. 20 | */ 21 | class BatchMgr { 22 | struct BatchBuffer { // this should be a FIFO queue 23 | BatchEntry batch[ARIA_BATCH_SIZE] {}; 24 | int size = 0; 25 | 26 | BatchBuffer() = default; 27 | void init_txn(workload* wl, thread_t* thd); 28 | 29 | void reset() { size = 0; } 30 | void append(base_query* q) { 31 | assert(size < ARIA_BATCH_SIZE); 32 | batch[size].query = q; 33 | batch[size].rc = RCOK; 34 | batch[size].start_ts = 0; 35 | batch[size].exec_time_curr = 0; 36 | batch[size].exec_time_abort = 0; 37 | batch[size].txn_id = 0; 38 | ++size; 39 | } 40 | void append(struct BatchEntry* other) { 41 | assert(size < ARIA_BATCH_SIZE); 42 | batch[size].query = other->query; 43 | batch[size].rc = RCOK; 44 | batch[size].start_ts = other->start_ts; 45 | batch[size].exec_time_curr = 0; 46 | batch[size].exec_time_abort = other->exec_time_abort; 47 | batch[size].txn_id = other->txn_id; 48 | ++size; 49 | } 50 | BatchEntry* get(int idx) { 51 | if (idx >= size) return nullptr; // nothing to pop 52 | return &batch[idx]; 53 | } 54 | }; 55 | 56 | uint64_t batch_id; 57 | BatchBuffer batch_buf0; 58 | BatchBuffer batch_buf1; 59 | BatchBuffer* curr_batch; 60 | BatchBuffer* next_batch; 61 | 62 | public: 63 | BatchMgr(): batch_id(0), batch_buf0(), batch_buf1(), curr_batch(&batch_buf0), 64 | next_batch(&batch_buf1) {} 65 | void init_txn(workload* wl, thread_t* thd) { 66 | batch_buf0.reset(); 67 | batch_buf1.reset(); 68 | batch_buf0.init_txn(wl, thd); 69 | batch_buf1.init_txn(wl, thd); 70 | } 71 | 72 | uint64_t get_batch_id() const { return batch_id; } 73 |
74 | // get one entry from the current batch 75 | BatchEntry* get_entry(int idx) const { return curr_batch->get(idx); } 76 | // a txn aborted, put it into the next batch 77 | void put_next(BatchEntry* e) { next_batch->append(e); } 78 | 79 | // whether there is any space left on the current batch 80 | bool can_admit() { return curr_batch->size < ARIA_BATCH_SIZE; } 81 | // admit new query into the buffer 82 | void admit_new_query(base_query* q) { 83 | assert(q); 84 | assert(can_admit()); 85 | curr_batch->append(q); 86 | } 87 | 88 | // start a new batch: 89 | // next_batch becomes "curr_batch"; recycle the old one as new "next_batch" 90 | void start_new_batch() { 91 | ++batch_id; // batch_id must be nonzero 92 | // switch curr/next_batch 93 | BatchBuffer* tmp = curr_batch; 94 | curr_batch = next_batch; 95 | next_batch = tmp; 96 | next_batch->reset(); 97 | } 98 | }; 99 | -------------------------------------------------------------------------------- /system/global.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "mem_alloc.h" 3 | #include "stats.h" 4 | #include "dl_detect.h" 5 | #include "manager.h" 6 | #include "query.h" 7 | #include "plock.h" 8 | #include "occ.h" 9 | #include "vll.h" 10 | #include "aria.h" 11 | 12 | mem_alloc mem_allocator; 13 | Stats stats; 14 | DL_detect dl_detector; 15 | Manager * glob_manager; 16 | Query_queue * query_queue; 17 | Plock part_lock_man; 18 | OptCC occ_man; 19 | #if CC_ALG == VLL 20 | VLLMan vll_man; 21 | #endif 22 | 23 | bool volatile warmup_finish = false; 24 | bool volatile enable_thread_mem_pool = false; 25 | pthread_barrier_t warmup_bar; 26 | #ifndef NOGRAPHITE 27 | carbon_barrier_t enable_barrier; 28 | #endif 29 | 30 | ts_t g_abort_penalty = ABORT_PENALTY; 31 | bool g_central_man = CENTRAL_MAN; 32 | UInt32 g_ts_alloc = TS_ALLOC; 33 | bool g_key_order = KEY_ORDER; 34 | bool g_no_dl = NO_DL; 35 | ts_t g_timeout = TIMEOUT; 36 | ts_t g_dl_loop_detect = 
DL_LOOP_DETECT; 37 | bool g_ts_batch_alloc = TS_BATCH_ALLOC; 38 | UInt32 g_ts_batch_num = TS_BATCH_NUM; 39 | 40 | bool g_part_alloc = PART_ALLOC; 41 | bool g_mem_pad = MEM_PAD; 42 | UInt32 g_cc_alg = CC_ALG; 43 | ts_t g_query_intvl = QUERY_INTVL; 44 | UInt32 g_part_per_txn = PART_PER_TXN; 45 | double g_perc_multi_part = PERC_MULTI_PART; 46 | double g_read_perc = READ_PERC; 47 | double g_write_perc = WRITE_PERC; 48 | double g_zipf_theta = ZIPF_THETA; 49 | bool g_prt_lat_distr = PRT_LAT_DISTR; 50 | UInt32 g_part_cnt = PART_CNT; 51 | UInt32 g_virtual_part_cnt = VIRTUAL_PART_CNT; 52 | UInt32 g_thread_cnt = THREAD_CNT; 53 | UInt64 g_synth_table_size = SYNTH_TABLE_SIZE - (SYNTH_TABLE_SIZE % INIT_PARALLELISM); 54 | UInt32 g_req_per_query = REQ_PER_QUERY; 55 | UInt32 g_field_per_tuple = FIELD_PER_TUPLE; 56 | UInt32 g_init_parallelism = INIT_PARALLELISM; 57 | double g_last_retire = BB_LAST_RETIRE; 58 | double g_specified_ratio = SPECIFIED_RATIO; 59 | double g_flip_ratio = FLIP_RATIO; 60 | double g_long_txn_ratio = LONG_TXN_RATIO; 61 | double g_long_txn_read_ratio = LONG_TXN_READ_RATIO; 62 | 63 | UInt32 g_num_wh = NUM_WH; 64 | double g_perc_payment = PERC_PAYMENT; 65 | double g_perc_delivery = PERC_DELIVERY; 66 | double g_perc_orderstatus = PERC_ORDERSTATUS; 67 | double g_perc_stocklevel = PERC_STOCKLEVEL; 68 | double g_perc_neworder = 1 - (g_perc_payment + g_perc_delivery + g_perc_orderstatus + g_perc_stocklevel); 69 | bool g_wh_update = WH_UPDATE; 70 | char * output_file = NULL; 71 | 72 | map g_params; 73 | 74 | #if TPCC_SMALL 75 | UInt32 g_max_items = 10000; 76 | UInt32 g_cust_per_dist = 2000; 77 | #else 78 | UInt32 g_max_items = 100000; 79 | UInt32 g_cust_per_dist = 3000; 80 | #endif 81 | -------------------------------------------------------------------------------- /system/global.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "stdint.h" 4 | #include 5 | #include 6 | #include 7 | #define NDEBUG 8 | 
#include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | #include 18 | #include 19 | #include 20 | #include 21 | #include 22 | #include 23 | #include 24 | 25 | #if LATCH == LH_MCSLOCK 26 | #include "mcs_spinlock.h" 27 | #endif 28 | #include "pthread.h" 29 | #include "config.h" 30 | #include "stats.h" 31 | #include "dl_detect.h" 32 | #ifndef NOGRAPHITE 33 | #include "carbon_user.h" 34 | #endif 35 | #include "helper.h" 36 | 37 | using namespace std; 38 | 39 | class mem_alloc; 40 | class Stats; 41 | class DL_detect; 42 | class Manager; 43 | class Query_queue; 44 | class Plock; 45 | class OptCC; 46 | class VLLMan; 47 | 48 | typedef uint32_t UInt32; 49 | typedef int32_t SInt32; 50 | typedef uint64_t UInt64; 51 | typedef int64_t SInt64; 52 | 53 | typedef uint64_t ts_t; // time stamp type 54 | 55 | /******************************************/ 56 | // Global Data Structure 57 | /******************************************/ 58 | extern mem_alloc mem_allocator; 59 | extern Stats stats; 60 | extern DL_detect dl_detector; 61 | extern Manager * glob_manager; 62 | extern Query_queue * query_queue; 63 | extern Plock part_lock_man; 64 | extern OptCC occ_man; 65 | #if CC_ALG == VLL 66 | extern VLLMan vll_man; 67 | #endif 68 | 69 | extern bool volatile warmup_finish; 70 | extern bool volatile enable_thread_mem_pool; 71 | extern pthread_barrier_t warmup_bar; 72 | #ifndef NOGRAPHITE 73 | extern carbon_barrier_t enable_barrier; 74 | #endif 75 | 76 | /******************************************/ 77 | // Global Parameter 78 | /******************************************/ 79 | extern bool g_part_alloc; 80 | extern bool g_mem_pad; 81 | extern bool g_prt_lat_distr; 82 | extern UInt32 g_part_cnt; 83 | extern UInt32 g_virtual_part_cnt; 84 | extern UInt32 g_thread_cnt; 85 | extern ts_t g_abort_penalty; 86 | extern bool g_central_man; 87 | extern UInt32 g_ts_alloc; 88 | extern bool g_key_order; 89 | extern bool g_no_dl; 90 | 
extern ts_t g_timeout; 91 | extern ts_t g_dl_loop_detect; 92 | extern bool g_ts_batch_alloc; 93 | extern UInt32 g_ts_batch_num; 94 | 95 | extern map<string, string> g_params; 96 | 97 | // YCSB 98 | extern UInt32 g_cc_alg; 99 | extern ts_t g_query_intvl; 100 | extern UInt32 g_part_per_txn; 101 | extern double g_perc_multi_part; 102 | extern double g_read_perc; 103 | extern double g_write_perc; 104 | extern double g_zipf_theta; 105 | extern UInt64 g_synth_table_size; 106 | extern UInt32 g_req_per_query; 107 | extern UInt32 g_field_per_tuple; 108 | extern UInt32 g_init_parallelism; 109 | extern double g_last_retire; 110 | extern double g_specified_ratio; 111 | extern double g_flip_ratio; 112 | extern double g_long_txn_ratio; 113 | extern double g_long_txn_read_ratio; 114 | 115 | // TPCC 116 | extern UInt32 g_num_wh; 117 | extern double g_perc_payment; 118 | extern double g_perc_delivery; 119 | extern double g_perc_orderstatus; 120 | extern double g_perc_stocklevel; 121 | extern double g_perc_neworder; 122 | extern bool g_wh_update; 123 | extern char * output_file; 124 | extern UInt32 g_max_items; 125 | extern UInt32 g_cust_per_dist; 126 | 127 | enum RC { RCOK, Commit, Abort, WAIT, ERROR, FINISH}; 128 | 129 | /* Thread */ 130 | typedef uint64_t txnid_t; 131 | 132 | /* Txn */ 133 | typedef uint64_t txn_t; 134 | 135 | /* Table and Row */ 136 | typedef uint64_t rid_t; // row id 137 | typedef uint64_t pgid_t; // page id 138 | 139 | 140 | 141 | /* INDEX */ 142 | enum latch_t {LATCH_EX, LATCH_SH, LATCH_NONE}; 143 | // accessing type determines the latch type on nodes 144 | enum idx_acc_t {INDEX_INSERT, INDEX_READ, INDEX_NONE}; 145 | typedef uint64_t idx_key_t; // key id for index 146 | typedef uint64_t (*func_ptr)(idx_key_t); // part_id func_ptr(index_key); 147 | 148 | /* general concurrency control */ 149 | enum access_t {RD, WR, XP, SCAN, CM}; 150 | /* LOCK */ 151 | enum lock_t {LOCK_EX, LOCK_SH, LOCK_NONE }; 152 | enum loc_t {RETIRED, OWNERS, WAITERS, LOC_NONE}; 153 | enum lock_status
{LOCK_DROPPED, LOCK_WAITER, LOCK_OWNER, LOCK_RETIRED}; 154 | /* TIMESTAMP */ 155 | enum TsType {R_REQ, W_REQ, P_REQ, XP_REQ}; 156 | /* TXN STATUS */ 157 | // XXX(zhihan): bamboo requires the enumeration order to be unchanged 158 | enum status_t: unsigned int {RUNNING, ABORTED, COMMITED, HOLDING}; 159 | 160 | /* COMMUTATIVE OPERATIONS */ 161 | enum com_t {COM_INC, COM_DEC, COM_NONE}; 162 | 163 | 164 | #define MSG(str, args...) { \ 165 | printf("[%s : %d] " str, __FILE__, __LINE__, args); } \ 166 | // printf(args); } 167 | 168 | // principal index structure. The workload may decide to use a different 169 | // index structure for specific purposes. (e.g. non-primary key access should use hash) 170 | #if (INDEX_STRUCT == IDX_BTREE) 171 | #define INDEX index_btree 172 | #else // IDX_HASH 173 | #define INDEX IndexHash 174 | #endif 175 | 176 | /************************************************/ 177 | // constants 178 | /************************************************/ 179 | #ifndef UINT64_MAX 180 | #define UINT64_MAX 18446744073709551615UL 181 | #endif // UINT64_MAX 182 | -------------------------------------------------------------------------------- /system/helper.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "mem_alloc.h" 4 | #include "time.h" 5 | 6 | bool itemid_t::operator==(const itemid_t &other) const { 7 | return (type == other.type && location == other.location); 8 | } 9 | 10 | bool itemid_t::operator!=(const itemid_t &other) const { 11 | return !(*this == other); 12 | } 13 | 14 | void itemid_t::operator=(const itemid_t &other){ 15 | this->valid = other.valid; 16 | this->type = other.type; 17 | this->location = other.location; 18 | assert(*this == other); 19 | assert(this->valid); 20 | } 21 | 22 | void itemid_t::init() { 23 | valid = false; 24 | location = 0; 25 | next = NULL; 26 | } 27 | 28 | int get_thdid_from_txnid(uint64_t txnid) { 29 | return txnid % 
g_thread_cnt; 30 | } 31 | 32 | uint64_t get_part_id(void * addr) { 33 | return ((uint64_t)addr / PAGE_SIZE) % g_part_cnt; 34 | } 35 | 36 | uint64_t key_to_part(uint64_t key) { 37 | if (g_part_alloc) 38 | return key % g_part_cnt; 39 | else 40 | return 0; 41 | } 42 | 43 | uint64_t merge_idx_key(UInt64 key_cnt, UInt64 * keys) { 44 | UInt64 len = 64 / key_cnt; 45 | UInt64 key = 0; 46 | for (UInt32 i = 0; i < key_cnt; i++) { 47 | assert(keys[i] < (1UL << len)); 48 | key = (key << len) | keys[i]; 49 | } 50 | return key; 51 | } 52 | 53 | uint64_t merge_idx_key(uint64_t key1, uint64_t key2) { 54 | assert(key1 < (1UL << 32) && key2 < (1UL << 32)); 55 | return key1 << 32 | key2; 56 | } 57 | 58 | uint64_t merge_idx_key(uint64_t key1, uint64_t key2, uint64_t key3) { 59 | assert(key1 < (1 << 21) && key2 < (1 << 21) && key3 < (1 << 21)); 60 | return key1 << 42 | key2 << 21 | key3; 61 | } 62 | 63 | /****************************************************/ 64 | // Global Clock! 65 | /****************************************************/ 66 | /* 67 | inline uint64_t get_server_clock() { 68 | #if defined(__i386__) 69 | uint64_t ret; 70 | __asm__ __volatile__("rdtsc" : "=A" (ret)); 71 | #elif defined(__x86_64__) 72 | unsigned hi, lo; 73 | __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi)); 74 | uint64_t ret = ( (uint64_t)lo)|( ((uint64_t)hi)<<32 ); 75 | ret = (uint64_t) ((double)ret / CPU_FREQ); 76 | #else 77 | timespec * tp = new timespec; 78 | clock_gettime(CLOCK_REALTIME, tp); 79 | uint64_t ret = tp->tv_sec * 1000000000 + tp->tv_nsec; 80 | #endif 81 | return ret; 82 | } 83 | 84 | inline uint64_t get_sys_clock() { 85 | #ifndef NOGRAPHITE 86 | static volatile uint64_t fake_clock = 0; 87 | if (warmup_finish) 88 | return CarbonGetTime(); // in ns 89 | else { 90 | return ATOM_ADD_FETCH(fake_clock, 100); 91 | } 92 | #else 93 | if (TIME_ENABLE) 94 | return get_server_clock(); 95 | return 0; 96 | #endif 97 | } 98 | */ 99 | void myrand::init(uint64_t seed) { 100 | this->seed = seed; 101 | } 102
| 103 | uint64_t myrand::next() { 104 | seed = (seed * 1103515247UL + 12345UL) % (1UL<<63); 105 | return (seed / 65537) % RAND_MAX; 106 | } 107 | 108 | -------------------------------------------------------------------------------- /system/main.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "ycsb.h" 3 | #include "tpcc.h" 4 | #include "test.h" 5 | #include "thread.h" 6 | #include "manager.h" 7 | #include "mem_alloc.h" 8 | #include "query.h" 9 | #include "plock.h" 10 | #include "occ.h" 11 | #include "vll.h" 12 | #include "aria.h" 13 | 14 | void * f(void *); 15 | 16 | thread_t ** m_thds; 17 | 18 | // defined in parser.cpp 19 | void parser(int argc, char * argv[]); 20 | 21 | int main(int argc, char* argv[]) 22 | { 23 | parser(argc, argv); 24 | 25 | #ifndef NDEBUG 26 | uint64_t ts0, ts1; 27 | ts0 = get_sys_clock(); 28 | sleep(1); 29 | ts1 = get_sys_clock(); 30 | double ratio = ((double)(ts1 - ts0)) / 1000000000.0; 31 | if (ratio < 0.99 || ratio > 1.00) { 32 | fprintf(stderr, 33 | "FATAL ERROR: CPU frequency might be incorrectly configured: " 34 | "real_time/cpu_ts_time=%f\n", ratio); 35 | abort(); 36 | } 37 | #endif 38 | 39 | mem_allocator.init(g_part_cnt, MEM_SIZE / g_part_cnt); 40 | stats.init(); 41 | glob_manager = (Manager *) _mm_malloc(sizeof(Manager), 64); 42 | glob_manager->init(); 43 | if (g_cc_alg == DL_DETECT) 44 | dl_detector.init(); 45 | printf("mem_allocator initialized!\n"); 46 | 47 | #if CC_ALG == WOUND_WAIT 48 | printf("WOUND_WAIT\n"); 49 | #elif CC_ALG == NO_WAIT 50 | printf("NO_WAIT\n"); 51 | #elif CC_ALG == WAIT_DIE 52 | printf("WAIT_DIE\n"); 53 | #elif CC_ALG == BAMBOO 54 | printf("BAMBOO\n"); 55 | #elif CC_ALG == SILO 56 | printf("SILO\n"); 57 | #elif CC_ALG == SILO_PRIO 58 | printf("SILO_PRIO\n"); 59 | #elif CC_ALG == ARIA 60 | printf("ARIA\n"); 61 | #elif CC_ALG == IC3 62 | printf("IC3\n"); 63 | #endif 64 | 65 | 66 | workload * m_wl; 67 | switch (WORKLOAD) { 68 | case YCSB
: 69 | m_wl = new ycsb_wl; break; 70 | case TPCC : 71 | m_wl = new tpcc_wl; break; 72 | case TEST : 73 | m_wl = new TestWorkload; 74 | ((TestWorkload *)m_wl)->tick(); 75 | break; 76 | default: 77 | assert(false); 78 | } 79 | m_wl->init(); 80 | printf("workload initialized!\n"); 81 | 82 | 83 | uint64_t thd_cnt = g_thread_cnt; 84 | pthread_t p_thds[thd_cnt - 1]; 85 | m_thds = new thread_t * [thd_cnt]; 86 | for (uint32_t i = 0; i < thd_cnt; i++) 87 | m_thds[i] = (thread_t *) _mm_malloc(sizeof(thread_t), 64); 88 | // query_queue should be the last one to be initialized!!! 89 | // because it collects txn latency 90 | query_queue = (Query_queue *) _mm_malloc(sizeof(Query_queue), 64); 91 | if (WORKLOAD != TEST) 92 | query_queue->init(m_wl); 93 | pthread_barrier_init( &warmup_bar, NULL, g_thread_cnt ); 94 | printf("query_queue initialized!\n"); 95 | #if CC_ALG == HSTORE 96 | part_lock_man.init(); 97 | #elif CC_ALG == OCC 98 | occ_man.init(); 99 | #elif CC_ALG == VLL 100 | vll_man.init(); 101 | #elif CC_ALG == ARIA 102 | AriaCoord::init(); 103 | #endif 104 | 105 | for (uint32_t i = 0; i < thd_cnt; i++) 106 | m_thds[i]->init(i, m_wl); 107 | 108 | if (WARMUP > 0){ 109 | printf("WARMUP start!\n"); 110 | for (uint32_t i = 0; i < thd_cnt - 1; i++) { 111 | uint64_t vid = i; 112 | pthread_create(&p_thds[i], NULL, f, (void *)vid); 113 | } 114 | f((void *)(thd_cnt - 1)); 115 | for (uint32_t i = 0; i < thd_cnt - 1; i++) 116 | pthread_join(p_thds[i], NULL); 117 | printf("WARMUP finished!\n"); 118 | } 119 | warmup_finish = true; 120 | pthread_barrier_init( &warmup_bar, NULL, g_thread_cnt ); 121 | #ifndef NOGRAPHITE 122 | CarbonBarrierInit(&enable_barrier, g_thread_cnt); 123 | #endif 124 | pthread_barrier_init( &warmup_bar, NULL, g_thread_cnt ); 125 | 126 | #if CC_ALG == ARIA 127 | AriaCoord::init(); // re-init 128 | #endif 129 | 130 | // spawn and run txns again. 
131 | int64_t starttime = get_server_clock(); 132 | for (uint32_t i = 0; i < thd_cnt - 1; i++) { 133 | uint64_t vid = i; 134 | pthread_create(&p_thds[i], NULL, f, (void *)vid); 135 | } 136 | f((void *)(thd_cnt - 1)); 137 | for (uint32_t i = 0; i < thd_cnt - 1; i++) 138 | pthread_join(p_thds[i], NULL); 139 | int64_t endtime = get_server_clock(); 140 | 141 | if (WORKLOAD != TEST) { 142 | printf("PASS! SimTime = %ld\n", endtime - starttime); 143 | if (STATS_ENABLE) 144 | stats.print(); 145 | } else { 146 | ((TestWorkload *)m_wl)->summarize(); 147 | } 148 | return 0; 149 | } 150 | 151 | void * f(void * id) { 152 | uint64_t tid = (uint64_t)id; 153 | m_thds[tid]->run(); 154 | return NULL; 155 | } 156 | -------------------------------------------------------------------------------- /system/manager.cpp: -------------------------------------------------------------------------------- 1 | #include "manager.h" 2 | #include "row.h" 3 | #include "txn.h" 4 | #include "pthread.h" 5 | 6 | void Manager::init() { 7 | timestamp = (uint64_t *) _mm_malloc(sizeof(uint64_t), 64); 8 | *timestamp = 1; 9 | _last_min_ts_time = 0; 10 | _min_ts = 0; 11 | _epoch = (uint64_t *) _mm_malloc(sizeof(uint64_t), 64); 12 | _last_epoch_update_time = (ts_t *) _mm_malloc(sizeof(uint64_t), 64); 13 | *_epoch = 0; 14 | *_last_epoch_update_time = 0; 15 | all_ts = (ts_t volatile **) _mm_malloc(sizeof(ts_t *) * g_thread_cnt, 64); 16 | for (uint32_t i = 0; i < g_thread_cnt; i++) 17 | all_ts[i] = (ts_t *) _mm_malloc(sizeof(ts_t), 64); 18 | 19 | _all_txns = new txn_man * [g_thread_cnt]; 20 | for (UInt32 i = 0; i < g_thread_cnt; i++) { 21 | *all_ts[i] = UINT64_MAX; 22 | _all_txns[i] = NULL; 23 | } 24 | for (UInt32 i = 0; i < BUCKET_CNT; i++) 25 | pthread_mutex_init( &mutexes[i], NULL ); 26 | } 27 | 28 | uint64_t 29 | Manager::get_ts(uint64_t thread_id) { 30 | if (g_ts_batch_alloc) 31 | assert(g_ts_alloc == TS_CAS); 32 | uint64_t time; 33 | uint64_t starttime = get_sys_clock(); 34 | switch(g_ts_alloc) { 35 | case
TS_MUTEX : 36 | pthread_mutex_lock( &ts_mutex ); 37 | time = ++(*timestamp); 38 | pthread_mutex_unlock( &ts_mutex ); 39 | break; 40 | case TS_CAS : 41 | if (g_ts_batch_alloc) 42 | time = ATOM_FETCH_ADD((*timestamp), g_ts_batch_num); 43 | else 44 | time = ATOM_FETCH_ADD((*timestamp), 1); 45 | break; 46 | case TS_HW : 47 | #ifndef NOGRAPHITE 48 | time = CarbonGetTimestamp(); 49 | #else 50 | time = 0; 51 | assert(false); 52 | #endif 53 | break; 54 | case TS_CLOCK : 55 | time = get_sys_clock() * g_thread_cnt + thread_id; 56 | break; 57 | default : 58 | time = 0; assert(false); 59 | } 60 | INC_STATS(thread_id, time_ts_alloc, get_sys_clock() - starttime); 61 | return time; 62 | } 63 | 64 | uint64_t 65 | Manager::get_n_ts(int n) { 66 | uint64_t time = ATOM_ADD_FETCH((*timestamp), n); 67 | return time; 68 | } 69 | 70 | ts_t Manager::get_min_ts(uint64_t tid) { 71 | uint64_t now = get_sys_clock(); 72 | uint64_t last_time = _last_min_ts_time; 73 | if (tid == 0 && now - last_time > MIN_TS_INTVL) 74 | { 75 | ts_t min = UINT64_MAX; 76 | for (UInt32 i = 0; i < g_thread_cnt; i++) 77 | { // added curly braces by zhihan 78 | if (*all_ts[i] < min) 79 | min = *all_ts[i]; 80 | if (min > _min_ts) 81 | _min_ts = min; 82 | } 83 | } 84 | return _min_ts; 85 | } 86 | 87 | void Manager::add_ts(uint64_t thd_id, ts_t ts) { 88 | assert( ts >= *all_ts[thd_id] || 89 | *all_ts[thd_id] == UINT64_MAX); 90 | *all_ts[thd_id] = ts; 91 | } 92 | 93 | void Manager::set_txn_man(txn_man * txn) { 94 | int thd_id = txn->get_thd_id(); 95 | _all_txns[thd_id] = txn; 96 | } 97 | 98 | 99 | uint64_t Manager::hash(row_t * row) { 100 | uint64_t addr = (uint64_t)row / MEM_ALLIGN; 101 | return (addr * 1103515247 + 12345) % BUCKET_CNT; 102 | } 103 | 104 | void Manager::lock_row(row_t * row) { 105 | int bid = hash(row); 106 | pthread_mutex_lock( &mutexes[bid] ); 107 | } 108 | 109 | void Manager::release_row(row_t * row) { 110 | int bid = hash(row); 111 | pthread_mutex_unlock( &mutexes[bid] ); 112 | } 113 | 114 | void 115 
| Manager::update_epoch() 116 | { 117 | ts_t time = get_sys_clock(); 118 | if (time - *_last_epoch_update_time > LOG_BATCH_TIME * 1000 * 1000) { 119 | *_epoch = *_epoch + 1; 120 | *_last_epoch_update_time = time; 121 | } 122 | } 123 | -------------------------------------------------------------------------------- /system/manager.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class row_t; 6 | class txn_man; 7 | 8 | class Manager { 9 | public: 10 | void init(); 11 | // returns the next timestamp. 12 | ts_t get_ts(uint64_t thread_id); 13 | ts_t get_n_ts(int n); // book n timestamps 14 | 15 | // For MVCC. To calculate the min active ts in the system 16 | void add_ts(uint64_t thd_id, ts_t ts); 17 | ts_t get_min_ts(uint64_t tid = 0); 18 | 19 | // HACK! the following mutexes are used to model a centralized 20 | // lock/timestamp manager. 21 | void lock_row(row_t * row); 22 | void release_row(row_t * row); 23 | 24 | txn_man * get_txn_man(int thd_id) { return _all_txns[thd_id]; }; 25 | void set_txn_man(txn_man * txn); 26 | 27 | uint64_t get_epoch() { return *_epoch; }; 28 | void update_epoch(); 29 | private: 30 | // for SILO 31 | volatile uint64_t * _epoch; 32 | ts_t * _last_epoch_update_time; 33 | 34 | pthread_mutex_t ts_mutex; 35 | uint64_t * timestamp; 36 | pthread_mutex_t mutexes[BUCKET_CNT]; 37 | uint64_t hash(row_t * row); 38 | ts_t volatile * volatile * volatile all_ts; 39 | txn_man ** _all_txns; 40 | // for MVCC 41 | volatile ts_t _last_min_ts_time; 42 | ts_t _min_ts; 43 | }; 44 | -------------------------------------------------------------------------------- /system/mcs_spinlock.h: -------------------------------------------------------------------------------- 1 | // 2 | // Implemented based on MCS_lock in IC3 3 | // Modified based on http://libfbp.blogspot.com/2018/01/c-mellor-crummey-scott-mcs-lock.html 4 | // 5 | 6 | #ifndef _MCS_SPINLOCK 7 | #define _MCS_SPINLOCK 8 | 9 
| #include <atomic> 10 | #include "amd64.h" 11 | 12 | class mcslock { 13 | 14 | public: 15 | mcslock(): tail(nullptr) {}; 16 | 17 | struct mcs_node { 18 | volatile bool locked; 19 | uint8_t pad0[64 - sizeof(bool)]; 20 | // padding to separate next and locked into two cache lines 21 | volatile mcs_node* volatile next; 22 | uint8_t pad1[64 - sizeof(mcs_node *)]; 23 | mcs_node(): locked(true), next(nullptr) {} 24 | }; 25 | 26 | void acquire(mcs_node * me) { 27 | auto prior_node = tail.exchange(me, std::memory_order_acquire); 28 | // Any one there? 29 | if (prior_node != nullptr) { 30 | // memory_barrier(); 31 | // Someone there, need to link in 32 | me->locked = true; 33 | prior_node->next = me; 34 | // Make sure we do the above setting of next. 35 | // memory_barrier(); 36 | // Spin on my spin variable 37 | while (me->locked){ 38 | // memory_barrier(); 39 | nop_pause(); 40 | } 41 | assert(!me->locked); 42 | } 43 | }; 44 | 45 | void release(mcs_node * me) { 46 | if (me->next == nullptr) { 47 | mcs_node * expected = me; 48 | // No successor yet 49 | if (tail.compare_exchange_strong(expected, nullptr, 50 | std::memory_order_release, 51 | std::memory_order_relaxed)) { 52 | return; 53 | } 54 | // otherwise, another thread is in the process of trying to 55 | // acquire the lock, so spin waiting for it to finish 56 | while (me->next == nullptr) {}; 57 | } 58 | // memory_barrier(); 59 | // Unlock next one 60 | me->next->locked = false; 61 | me->next = nullptr; 62 | }; 63 | 64 | private: 65 | std::atomic<mcs_node *> tail; 66 | }; 67 | 68 | #endif 69 | 70 | -------------------------------------------------------------------------------- /system/mem_alloc.cpp: -------------------------------------------------------------------------------- 1 | #include "mem_alloc.h" 2 | #include "helper.h" 3 | #include "global.h" 4 | 5 | // Assume the data is strided across the L2 slices, stride granularity 6 | // is the size of a page 7 | void mem_alloc::init(uint64_t part_cnt, uint64_t bytes_per_part) { 8 | if 
(g_thread_cnt < g_init_parallelism) 9 | _bucket_cnt = g_init_parallelism * 4 + 1; 10 | else 11 | _bucket_cnt = g_thread_cnt * 4 + 1; 12 | pid_arena = new std::pair<pthread_t, int>[_bucket_cnt]; 13 | for (int i = 0; i < _bucket_cnt; i ++) 14 | pid_arena[i] = std::make_pair(0, 0); 15 | 16 | if (THREAD_ALLOC) { 17 | assert( !g_part_alloc ); 18 | init_thread_arena(); 19 | } 20 | } 21 | 22 | void 23 | Arena::init(int arena_id, int size) { 24 | _buffer = NULL; 25 | _arena_id = arena_id; 26 | _size_in_buffer = 0; 27 | _head = NULL; 28 | _block_size = size; 29 | } 30 | 31 | void * 32 | Arena::alloc() { 33 | FreeBlock * block; 34 | if (_head == NULL) { 35 | // not in the list. allocate from the buffer 36 | int size = (_block_size + sizeof(FreeBlock) + (MEM_ALLIGN - 1)) & ~(MEM_ALLIGN-1); 37 | if (_size_in_buffer < size) { 38 | _buffer = (char *) malloc(_block_size * 40960); 39 | _size_in_buffer = _block_size * 40960; // * 8; 40 | } 41 | block = (FreeBlock *)_buffer; 42 | block->size = _block_size; 43 | _size_in_buffer -= size; 44 | _buffer = _buffer + size; 45 | } else { 46 | block = _head; 47 | _head = _head->next; 48 | } 49 | return (void *) ((char *)block + sizeof(FreeBlock)); 50 | } 51 | 52 | void 53 | Arena::free(void * ptr) { 54 | FreeBlock * block = (FreeBlock *)((UInt64)ptr - sizeof(FreeBlock)); 55 | block->next = _head; 56 | _head = block; 57 | } 58 | 59 | void mem_alloc::init_thread_arena() { 60 | UInt32 buf_cnt = g_thread_cnt; 61 | if (buf_cnt < g_init_parallelism) 62 | buf_cnt = g_init_parallelism; 63 | _arenas = new Arena * [buf_cnt]; 64 | for (UInt32 i = 0; i < buf_cnt; i++) { 65 | _arenas[i] = new Arena[SizeNum]; 66 | for (int n = 0; n < SizeNum; n++) { 67 | assert(sizeof(Arena) == 128); 68 | _arenas[i][n].init(i, BlockSizes[n]); 69 | } 70 | } 71 | } 72 | 73 | void mem_alloc::register_thread(int thd_id) { 74 | if (THREAD_ALLOC) { 75 | pthread_mutex_lock( &map_lock ); 76 | pthread_t pid = pthread_self(); 77 | int entry = pid % _bucket_cnt; 78 | while (pid_arena[ entry ].first 
!= 0) { 79 | printf("conflict at entry %d (pid=%ld)\n", entry, pid); 80 | entry = (entry + 1) % _bucket_cnt; 81 | } 82 | pid_arena[ entry ].first = pid; 83 | pid_arena[ entry ].second = thd_id; 84 | pthread_mutex_unlock( &map_lock ); 85 | } 86 | } 87 | 88 | void mem_alloc::unregister() { 89 | if (THREAD_ALLOC) { 90 | pthread_mutex_lock( &map_lock ); 91 | for (int i = 0; i < _bucket_cnt; i ++) { 92 | pid_arena[i].first = 0; 93 | pid_arena[i].second = 0; 94 | } 95 | pthread_mutex_unlock( &map_lock ); 96 | } 97 | } 98 | 99 | int 100 | mem_alloc::get_arena_id() { 101 | int arena_id; 102 | #if NOGRAPHITE 103 | pthread_t pid = pthread_self(); 104 | int entry = pid % _bucket_cnt; 105 | while (pid_arena[entry].first != pid) { 106 | if (pid_arena[entry].first == 0) 107 | break; 108 | entry = (entry + 1) % _bucket_cnt; 109 | } 110 | arena_id = pid_arena[entry].second; 111 | #else 112 | arena_id = CarbonGetTileId(); 113 | #endif 114 | return arena_id; 115 | } 116 | 117 | int 118 | mem_alloc::get_size_id(UInt32 size) { 119 | for (int i = 0; i < SizeNum; i++) { 120 | if (size <= BlockSizes[i]) 121 | return i; 122 | } 123 | printf("size = %u\n", size); 124 | assert( false ); 125 | return 0; 126 | } 127 | 128 | 129 | void mem_alloc::free(void * ptr, uint64_t size) { 130 | if (NO_FREE) {} 131 | else if (THREAD_ALLOC) { 132 | int arena_id = get_arena_id(); 133 | FreeBlock * block = (FreeBlock *)((UInt64)ptr - sizeof(FreeBlock)); 134 | int size = block->size; 135 | int size_id = get_size_id(size); 136 | _arenas[arena_id][size_id].free(ptr); 137 | } else { 138 | std::free(ptr); 139 | } 140 | } 141 | 142 | //TODO the program should not access more than a PAGE 143 | // to guarantee correctness 144 | // lock is used for consistency (multiple threads may alloc simultaneously and 145 | // cause trouble) 146 | void * mem_alloc::alloc(uint64_t size, uint64_t part_id) { 147 | void * ptr; 148 | if (size > BlockSizes[SizeNum - 1]) 149 | ptr = malloc(size); 150 | else if (THREAD_ALLOC && 
(warmup_finish || enable_thread_mem_pool)) { 151 | int arena_id = get_arena_id(); 152 | int size_id = get_size_id(size); 153 | ptr = _arenas[arena_id][size_id].alloc(); 154 | } else { 155 | ptr = malloc(size); 156 | } 157 | return ptr; 158 | } 159 | 160 | 161 | -------------------------------------------------------------------------------- /system/mem_alloc.h: -------------------------------------------------------------------------------- 1 | #ifndef _MEM_ALLOC_H_ 2 | #define _MEM_ALLOC_H_ 3 | 4 | #include "global.h" 5 | #include <utility> 6 | 7 | const int SizeNum = 4; 8 | const UInt32 BlockSizes[] = {32, 64, 256, 1024}; 9 | 10 | typedef struct free_block { 11 | int size; 12 | struct free_block* next; 13 | } FreeBlock; 14 | 15 | class Arena { 16 | public: 17 | void init(int arena_id, int size); 18 | void * alloc(); 19 | void free(void * ptr); 20 | private: 21 | char * _buffer; 22 | int _size_in_buffer; 23 | int _arena_id; 24 | int _block_size; 25 | FreeBlock * _head; 26 | char _pad[128 - sizeof(int)*3 - sizeof(void *)*2 - 8]; 27 | }; 28 | 29 | class mem_alloc { 30 | public: 31 | void init(uint64_t part_cnt, uint64_t bytes_per_part); 32 | void register_thread(int thd_id); 33 | void unregister(); 34 | void * alloc(uint64_t size, uint64_t part_id); 35 | void free(void * block, uint64_t size); 36 | int get_arena_id(); 37 | private: 38 | void init_thread_arena(); 39 | int get_size_id(UInt32 size); 40 | 41 | // each thread has several arenas for different block sizes 42 | Arena ** _arenas; 43 | int _bucket_cnt; 44 | std::pair<pthread_t, int> * pid_arena; // max_arena_id; 45 | pthread_mutex_t map_lock; // only used for pid_to_arena update 46 | }; 47 | 48 | #endif 49 | -------------------------------------------------------------------------------- /system/parser.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | 4 | void print_usage() { 5 | printf("[usage]:\n"); 6 | printf("\t-pINT ; PART_CNT\n"); 7 | 
printf("\t-vINT ; VIRTUAL_PART_CNT\n"); 8 | printf("\t-tINT ; THREAD_CNT\n"); 9 | printf("\t-qINT ; QUERY_INTVL\n"); 10 | printf("\t-dINT ; PRT_LAT_DISTR\n"); 11 | printf("\t-aINT ; PART_ALLOC (0 or 1)\n"); 12 | printf("\t-mINT ; MEM_PAD (0 or 1)\n"); 13 | printf("\t-GaINT ; ABORT_PENALTY (in ms)\n"); 14 | printf("\t-GcINT ; CENTRAL_MAN\n"); 15 | printf("\t-GtINT ; TS_ALLOC\n"); 16 | printf("\t-GkINT ; KEY_ORDER\n"); 17 | printf("\t-GnINT ; NO_DL\n"); 18 | printf("\t-GoINT ; TIMEOUT\n"); 19 | printf("\t-GlINT ; DL_LOOP_DETECT\n"); 20 | 21 | printf("\t-GbINT ; TS_BATCH_ALLOC\n"); 22 | printf("\t-GuINT ; TS_BATCH_NUM\n"); 23 | 24 | printf("\t-o STRING ; output file\n\n"); 25 | printf(" [YCSB]:\n"); 26 | printf("\t-cINT ; PART_PER_TXN\n"); 27 | printf("\t-eINT ; PERC_MULTI_PART\n"); 28 | printf("\t-rFLOAT ; READ_PERC\n"); 29 | printf("\t-wFLOAT ; WRITE_PERC\n"); 30 | printf("\t-zFLOAT ; ZIPF_THETA\n"); 31 | printf("\t-sINT ; SYNTH_TABLE_SIZE\n"); 32 | printf("\t-RINT ; REQ_PER_QUERY\n"); 33 | printf("\t-fINT ; FIELD_PER_TUPLE\n"); 34 | printf(" [TPCC]:\n"); 35 | printf("\t-nINT ; NUM_WH\n"); 36 | printf("\t-TpFLOAT ; PERC_PAYMENT\n"); 37 | printf("\t-TuINT ; WH_UPDATE\n"); 38 | printf(" [TEST]:\n"); 39 | printf("\t-Ar ; Test READ_WRITE\n"); 40 | printf("\t-Ac ; Test CONFLICT\n"); 41 | } 42 | 43 | void parser(int argc, char * argv[]) { 44 | g_params["abort_buffer_enable"] = ABORT_BUFFER_ENABLE? 
"true" : "false"; 45 | g_params["write_copy_form"] = WRITE_COPY_FORM; 46 | g_params["validation_lock"] = VALIDATION_LOCK; 47 | g_params["pre_abort"] = PRE_ABORT; 48 | g_params["atomic_timestamp"] = ATOMIC_TIMESTAMP; 49 | 50 | for (int i = 1; i < argc; i++) { 51 | assert(argv[i][0] == '-'); 52 | if (argv[i][1] == 'a') 53 | g_part_alloc = atoi( &argv[i][2] ); 54 | else if (argv[i][1] == 'm') 55 | g_mem_pad = atoi( &argv[i][2] ); 56 | else if (argv[i][1] == 'q') 57 | g_query_intvl = atoi( &argv[i][2] ); 58 | else if (argv[i][1] == 'c') 59 | g_part_per_txn = atoi( &argv[i][2] ); 60 | else if (argv[i][1] == 'e') 61 | g_perc_multi_part = atof( &argv[i][2] ); 62 | else if (argv[i][1] == 'r') 63 | g_read_perc = atof( &argv[i][2] ); 64 | else if (argv[i][1] == 'w') 65 | g_write_perc = atof( &argv[i][2] ); 66 | else if (argv[i][1] == 'z') 67 | g_zipf_theta = atof( &argv[i][2] ); 68 | else if (argv[i][1] == 'd') 69 | g_prt_lat_distr = atoi( &argv[i][2] ); 70 | else if (argv[i][1] == 'p') 71 | g_part_cnt = atoi( &argv[i][2] ); 72 | else if (argv[i][1] == 'v') 73 | g_virtual_part_cnt = atoi( &argv[i][2] ); 74 | else if (argv[i][1] == 't') 75 | g_thread_cnt = atoi( &argv[i][2] ); 76 | else if (argv[i][1] == 's') 77 | g_synth_table_size = atoi( &argv[i][2] ); 78 | else if (argv[i][1] == 'R') 79 | g_req_per_query = atoi( &argv[i][2] ); 80 | else if (argv[i][1] == 'f') 81 | g_field_per_tuple = atoi( &argv[i][2] ); 82 | else if (argv[i][1] == 'n') 83 | g_num_wh = atoi( &argv[i][2] ); 84 | else if (argv[i][1] == 'G') { 85 | if (argv[i][2] == 'a') 86 | g_abort_penalty = atoi( &argv[i][3] ); 87 | else if (argv[i][2] == 'c') 88 | g_central_man = atoi( &argv[i][3] ); 89 | else if (argv[i][2] == 't') 90 | g_ts_alloc = atoi( &argv[i][3] ); 91 | else if (argv[i][2] == 'k') 92 | g_key_order = atoi( &argv[i][3] ); 93 | else if (argv[i][2] == 'n') 94 | g_no_dl = atoi( &argv[i][3] ); 95 | else if (argv[i][2] == 'o') 96 | g_timeout = atol( &argv[i][3] ); 97 | else if (argv[i][2] == 'l') 98 | 
g_dl_loop_detect = atoi( &argv[i][3] ); 99 | else if (argv[i][2] == 'b') 100 | g_ts_batch_alloc = atoi( &argv[i][3] ); 101 | else if (argv[i][2] == 'u') 102 | g_ts_batch_num = atoi( &argv[i][3] ); 103 | } else if (argv[i][1] == 'T') { 104 | if (argv[i][2] == 'p') 105 | g_perc_payment = atof( &argv[i][3] ); 106 | if (argv[i][2] == 'u') 107 | g_wh_update = atoi( &argv[i][3] ); 108 | } else if (argv[i][1] == 'A') { 109 | if (argv[i][2] == 'r') 110 | g_test_case = READ_WRITE; 111 | if (argv[i][2] == 'c') 112 | g_test_case = CONFLICT; 113 | } 114 | else if (argv[i][1] == 'o') { 115 | i++; 116 | output_file = argv[i]; 117 | } 118 | else if (argv[i][1] == 'h') { 119 | print_usage(); 120 | exit(0); 121 | } 122 | else if (argv[i][1] == '-') { 123 | string line(&argv[i][2]); 124 | size_t pos = line.find("="); 125 | assert(pos != string::npos); 126 | string name = line.substr(0, pos); 127 | string value = line.substr(pos + 1, line.length()); 128 | assert(g_params.find(name) != g_params.end()); 129 | g_params[name] = value; 130 | } 131 | else 132 | assert(false); 133 | } 134 | if (g_thread_cnt < g_init_parallelism) 135 | g_init_parallelism = g_thread_cnt; 136 | } 137 | -------------------------------------------------------------------------------- /system/query.cpp: -------------------------------------------------------------------------------- 1 | #include <sched.h> 2 | #include "query.h" 3 | #include "mem_alloc.h" 4 | #include "wl.h" 5 | #include "table.h" 6 | #include "ycsb_query.h" 7 | #include "tpcc_query.h" 8 | #include "tpcc_helper.h" 9 | 10 | thread_local drand48_data per_thread_rand_buf; 11 | 12 | /*************************************************/ 13 | // class Query_queue 14 | /*************************************************/ 15 | int Query_queue::_next_tid; 16 | 17 | void 18 | Query_queue::init(workload * h_wl) { 19 | all_queries = new Query_thd * [g_thread_cnt]; 20 | _wl = h_wl; 21 | _next_tid = 0; 22 | 23 | 24 | #if WORKLOAD == YCSB 25 | ycsb_query::calculateDenom(); 
26 | #elif WORKLOAD == TPCC 27 | assert(tpcc_buffer != NULL); 28 | #endif 29 | int64_t begin = get_server_clock(); 30 | pthread_t p_thds[g_thread_cnt - 1]; 31 | for (UInt32 i = 0; i < g_thread_cnt - 1; i++) { 32 | pthread_create(&p_thds[i], NULL, threadInitQuery, this); 33 | } 34 | threadInitQuery(this); 35 | for (uint32_t i = 0; i < g_thread_cnt - 1; i++) 36 | pthread_join(p_thds[i], NULL); 37 | int64_t end = get_server_clock(); 38 | printf("Query Queue Init Time %f\n", 1.0 * (end - begin) / 1000000000UL); 39 | } 40 | 41 | void 42 | Query_queue::init_per_thread(int thread_id) { 43 | all_queries[thread_id] = (Query_thd *) _mm_malloc(sizeof(Query_thd), 64); 44 | all_queries[thread_id]->init(_wl, thread_id); 45 | } 46 | 47 | base_query * 48 | Query_queue::get_next_query(uint64_t thd_id) { 49 | base_query * query = all_queries[thd_id]->get_next_query(); 50 | return query; 51 | } 52 | 53 | void * 54 | Query_queue::threadInitQuery(void * This) { 55 | Query_queue * query_queue = (Query_queue *)This; 56 | uint32_t tid = ATOM_FETCH_ADD(_next_tid, 1); 57 | 58 | // set cpu affinity 59 | set_affinity(tid); 60 | 61 | query_queue->init_per_thread(tid); 62 | return NULL; 63 | } 64 | 65 | /*************************************************/ 66 | // class Query_thd 67 | /*************************************************/ 68 | 69 | 70 | void 71 | Query_thd::init(workload * h_wl, int thread_id) { 72 | q_idx = 0; 73 | #if TPCC_USER_ABORT 74 | request_cnt = WARMUP / g_thread_cnt + MAX_TXN_PER_PART + 100 + 75 | MAX_TXN_PER_PART / 100; 76 | #else 77 | request_cnt = WARMUP / g_thread_cnt + MAX_TXN_PER_PART + 100; 78 | #endif 79 | #if ABORT_BUFFER_ENABLE 80 | request_cnt += ABORT_BUFFER_SIZE; 81 | #endif 82 | #if WORKLOAD == YCSB 83 | queries = (ycsb_query *) 84 | mem_allocator.alloc(sizeof(ycsb_query) * request_cnt, thread_id); 85 | srand48_r(thread_id + 1, &buffer); 86 | // XXX(zhihan): create a pre-allocated space for long txn 87 | if (g_long_txn_ratio > 0) { 88 | long_txn = 
(ycsb_request *) 89 | mem_allocator.alloc(sizeof(ycsb_request) * MAX_ROW_PER_TXN, 90 | thread_id); 91 | long_txn_part = (uint64_t *) 92 | mem_allocator.alloc(sizeof(uint64_t) * g_part_per_txn, thread_id); 93 | } 94 | #elif WORKLOAD == TPCC 95 | queries = (tpcc_query *) _mm_malloc(sizeof(tpcc_query) * request_cnt, 64); 96 | #endif 97 | for (UInt32 qid = 0; qid < request_cnt; qid ++) { 98 | #if WORKLOAD == YCSB 99 | new(&queries[qid]) ycsb_query(); 100 | queries[qid].init(thread_id, h_wl, this); 101 | #elif WORKLOAD == TPCC 102 | new(&queries[qid]) tpcc_query(); 103 | queries[qid].init(thread_id, h_wl); 104 | #endif 105 | } 106 | } 107 | 108 | 109 | base_query * 110 | Query_thd::get_next_query() { 111 | if (q_idx >= request_cnt-1) { 112 | q_idx = 0; 113 | assert(q_idx < request_cnt); 114 | //printf("WARNING: run out of queries, increase txn cnt per part!\n"); 115 | //return NULL; 116 | } 117 | base_query * query = &queries[q_idx++]; 118 | return query; 119 | } 120 | -------------------------------------------------------------------------------- /system/query.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class workload; 6 | class ycsb_query; 7 | class tpcc_query; 8 | class ycsb_request; 9 | 10 | extern thread_local drand48_data per_thread_rand_buf; 11 | 12 | class base_query { 13 | public: 14 | virtual void init(uint64_t thd_id, workload * h_wl) = 0; 15 | uint64_t waiting_time; 16 | uint64_t part_num; 17 | uint64_t * part_to_access; 18 | bool rerun; 19 | uint32_t num_abort = 0; 20 | uint32_t prio = 0; 21 | uint32_t max_prio = LOW_PRIO_BOUND; 22 | // note prio may be overwritten by subclass to support more complicated 23 | // priority distribution, e.g. long-running txn 24 | }; 25 | 26 | // All the queries for a particular thread. 
27 | class Query_thd { 28 | public: 29 | void init(workload * h_wl, int thread_id); 30 | base_query * get_next_query(); 31 | uint64_t q_idx; 32 | #if WORKLOAD == YCSB 33 | ycsb_query * queries; 34 | ycsb_request * long_txn; 35 | uint64_t * long_txn_part; 36 | #else 37 | tpcc_query * queries; 38 | #endif 39 | char pad[CL_SIZE - sizeof(void *) - sizeof(int)]; 40 | drand48_data buffer; 41 | uint64_t request_cnt; 42 | }; 43 | 44 | // TODO we assume a separate task queue for each thread in order to avoid 45 | // contention in a centralized query queue. In reality, a more sophisticated 46 | // queue model might be implemented. 47 | class Query_queue { 48 | public: 49 | void init(workload * h_wl); 50 | void init_per_thread(int thread_id); 51 | base_query * get_next_query(uint64_t thd_id); 52 | 53 | private: 54 | static void * threadInitQuery(void * This); 55 | 56 | Query_thd ** all_queries; 57 | workload * _wl; 58 | static int _next_tid; 59 | }; 60 | -------------------------------------------------------------------------------- /system/thread.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class workload; 6 | class base_query; 7 | class BatchMgr; 8 | 9 | class thread_t { 10 | public: 11 | uint64_t _thd_id; 12 | workload * _wl; 13 | 14 | #if CC_ALG == ARIA 15 | BatchMgr* batch_mgr; 16 | #endif 17 | 18 | constexpr uint64_t get_thd_id() { return _thd_id; } 19 | constexpr uint64_t get_host_cid() { return _host_cid; } 20 | void set_host_cid(uint64_t cid) { _host_cid = cid; } 21 | constexpr uint64_t get_cur_cid() { return _cur_cid; } 22 | void set_cur_cid(uint64_t cid) { _cur_cid = cid; } 23 | 24 | void init(uint64_t thd_id, workload * workload); 25 | // the following function must be in the form void* (*)(void*) 26 | // to run with pthread. 27 | // conversion is done within the function. 
28 | RC run(); 29 | 30 | // moved from private to global for clv 31 | ts_t get_next_ts(); 32 | ts_t get_next_n_ts(int n); 33 | 34 | 35 | private: 36 | uint64_t _host_cid; 37 | uint64_t _cur_cid; 38 | ts_t _curr_ts; 39 | 40 | RC runTest(txn_man * txn); 41 | drand48_data buffer; 42 | 43 | // added for wound wait 44 | base_query * curr_query; 45 | ts_t starttime; 46 | 47 | // A restart buffer for aborted txns. 48 | struct AbortBufferEntry { 49 | ts_t abort_time; // abort_time + penalty == ready_time 50 | ts_t ready_time; 51 | base_query * query; 52 | ts_t starttime; 53 | uint64_t exec_time_abort; // exec time that eventually aborts 54 | uint64_t backoff_time; // accumulated backoff time 55 | }; 56 | AbortBufferEntry * _abort_buffer; 57 | int _abort_buffer_size; 58 | int _abort_buffer_empty_slots; 59 | bool _abort_buffer_enable; 60 | }; 61 | -------------------------------------------------------------------------------- /system/wl.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "wl.h" 4 | #include "row.h" 5 | #include "table.h" 6 | #include "index_hash.h" 7 | #include "index_btree.h" 8 | #include "catalog.h" 9 | #include "mem_alloc.h" 10 | 11 | RC workload::init() { 12 | sim_done.store(false, std::memory_order_release); 13 | return RCOK; 14 | } 15 | 16 | RC workload::init_schema(string schema_file) { 17 | assert(sizeof(uint64_t) == 8); 18 | assert(sizeof(double) == 8); 19 | string line; 20 | ifstream fin(schema_file); 21 | Catalog * schema; 22 | while (getline(fin, line)) { 23 | if (line.compare(0, 6, "TABLE=") == 0) { 24 | string tname; 25 | tname = &line[6]; 26 | schema = (Catalog *) _mm_malloc(sizeof(Catalog), CL_SIZE); 27 | getline(fin, line); 28 | int col_count = 0; 29 | // Read all fields for this table. 
30 | vector<string> lines; 31 | while (line.length() > 1) { 32 | lines.push_back(line); 33 | getline(fin, line); 34 | } 35 | schema->init( tname.c_str(), lines.size() ); 36 | for (UInt32 i = 0; i < lines.size(); i++) { 37 | string line = lines[i]; 38 | size_t pos = 0; 39 | string token; 40 | int elem_num = 0; 41 | int size = 0; 42 | string type; 43 | string name; 44 | while (line.length() != 0) { 45 | pos = line.find(","); 46 | if (pos == string::npos) 47 | pos = line.length(); 48 | token = line.substr(0, pos); 49 | line.erase(0, pos + 1); 50 | switch (elem_num) { 51 | case 0: size = atoi(token.c_str()); break; 52 | case 1: type = token; break; 53 | case 2: name = token; break; 54 | default: assert(false); 55 | } 56 | elem_num ++; 57 | } 58 | assert(elem_num == 3); 59 | schema->add_col((char *)name.c_str(), size, (char *)type.c_str()); 60 | col_count ++; 61 | } 62 | table_t * cur_tab = (table_t *) _mm_malloc(sizeof(table_t), CL_SIZE); 63 | cur_tab->init(schema); 64 | tables[tname] = cur_tab; 65 | } else if (!line.compare(0, 6, "INDEX=")) { 66 | string iname; 67 | iname = &line[6]; 68 | getline(fin, line); 69 | 70 | vector<string> items; 71 | string token; 72 | size_t pos; 73 | while (line.length() != 0) { 74 | pos = line.find(","); 75 | if (pos == string::npos) 76 | pos = line.length(); 77 | token = line.substr(0, pos); 78 | items.push_back(token); 79 | line.erase(0, pos + 1); 80 | } 81 | 82 | string tname(items[0]); 83 | INDEX * index = (INDEX *) _mm_malloc(sizeof(INDEX), 64); 84 | new(index) INDEX(); 85 | int part_cnt = (CENTRAL_INDEX)? 
1 : g_part_cnt; 86 | if (tname == "ITEM") 87 | part_cnt = 1; 88 | #if INDEX_STRUCT == IDX_HASH 89 | #if WORKLOAD == YCSB 90 | index->init(part_cnt, tables[tname], g_synth_table_size * 2); 91 | #elif WORKLOAD == TPCC 92 | assert(tables[tname] != NULL); 93 | index->init(part_cnt, tables[tname], stoi( items[1] ) * part_cnt); 94 | #endif 95 | #else 96 | index->init(part_cnt, tables[tname]); 97 | #endif 98 | indexes[iname] = index; 99 | } 100 | } 101 | fin.close(); 102 | return RCOK; 103 | } 104 | 105 | 106 | 107 | void workload::index_insert(string index_name, uint64_t key, row_t * row) { 108 | assert(false); 109 | INDEX * index = (INDEX *) indexes[index_name]; 110 | index_insert(index, key, row); 111 | } 112 | 113 | void workload::index_insert(INDEX * index, uint64_t key, row_t * row, int64_t part_id) { 114 | uint64_t pid = part_id; 115 | if (part_id == -1) 116 | pid = get_part_id(row); 117 | itemid_t * m_item = 118 | (itemid_t *) mem_allocator.alloc( sizeof(itemid_t), pid ); 119 | m_item->init(); 120 | m_item->type = DT_row; 121 | m_item->location = row; 122 | m_item->valid = true; 123 | #ifdef NDEBUG 124 | index->index_insert(key, m_item, pid); 125 | #else 126 | assert( index->index_insert(key, m_item, pid) == RCOK ); 127 | #endif 128 | } 129 | 130 | SC_PIECE * workload::get_cedges(TPCCTxnType type, int idx) { 131 | assert(false); 132 | return NULL; 133 | } 134 | 135 | 136 | -------------------------------------------------------------------------------- /system/wl.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class row_t; 6 | class table_t; 7 | class IndexHash; 8 | class index_btree; 9 | class Catalog; 10 | class lock_man; 11 | class txn_man; 12 | class thread_t; 13 | class index_base; 14 | class Timestamp; 15 | class Mvcc; 16 | 17 | struct SC_PIECE { 18 | int txn_type; 19 | int piece_id; 20 | }; 21 | 22 | // this is the base class for all workload 23 | class workload 24 | { 
25 | public: 26 | // tables indexed by table name 27 | map<string, table_t *> tables; 28 | map<string, INDEX *> indexes; 29 | 30 | 31 | // initialize the tables and indexes. 32 | virtual RC init(); 33 | virtual RC init_schema(string schema_file); 34 | virtual RC init_table()=0; 35 | virtual RC get_txn_man(txn_man *& txn_manager, thread_t * h_thd)=0; 36 | 37 | // ic3 helpers 38 | virtual SC_PIECE * get_cedges(TPCCTxnType txn_type, int piece_id); 39 | 40 | std::atomic_bool sim_done; 41 | protected: 42 | void index_insert(string index_name, uint64_t key, row_t * row); 43 | void index_insert(INDEX * index, uint64_t key, row_t * row, int64_t part_id = -1); 44 | }; 45 | 46 | -------------------------------------------------------------------------------- /test.py: -------------------------------------------------------------------------------- 1 | import os, sys, re, os.path 2 | import platform 3 | import subprocess, datetime, time, signal, json 4 | 5 | 6 | dbms_cfg = ["config-std.h", "config.h"] 7 | 8 | def replace(filename, pattern, replacement): 9 | f = open(filename) 10 | s = f.read() 11 | f.close() 12 | s = re.sub(pattern,replacement,s) 13 | f = open(filename,'w') 14 | f.write(s) 15 | f.close() 16 | 17 | def set_ndebug(ndebug): 18 | f = open("system/global.h", "r") 19 | found = None 20 | set_ndebug = False 21 | for line in f: 22 | if "#define NDEBUG" in line: 23 | found = line 24 | if line[0] != '#': 25 | set_ndebug = True 26 | f.close() 27 | if found is None: 28 | if ndebug: 29 | # no existing NDEBUG line: define it just before the assert header 30 | replace("system/global.h", "#include <cassert>", 31 | "#define NDEBUG\n#include <cassert>") 32 | else: 33 | if ndebug: 34 | replace("system/global.h", found, "#define NDEBUG\n") 35 | else: 36 | replace("system/global.h", found, "") 37 | 38 | def compile(job): 39 | os.system("cp {} {}".format(dbms_cfg[0], dbms_cfg[1])) 40 | # define workload 41 | for param, value in job.items(): 42 | pattern = r"\#define\s*" + re.escape(param) + r'[\t ].*' 43 | replacement = "#define " + param + ' ' + str(value) 44 | replace(dbms_cfg[1], pattern, replacement) 45 | os.system("make clean > temp.out 2>&1") 46 | ret = os.system("make -j8 > temp.out 
2>&1") 47 | if ret != 0: 48 | print("ERROR in compiling, output saved in temp.out", file=sys.stderr) 49 | exit(1) 50 | else: 51 | os.system("rm -f temp.out") 52 | 53 | def run(test = '', job=None, numa=True): 54 | app_flags = "" 55 | if test == 'read_write': 56 | app_flags = "-Ar -t1" 57 | if test == 'conflict': 58 | app_flags = "-Ac -t4" 59 | if numa: 60 | os.system("numactl --interleave all ./rundb %s | tee temp.out" % app_flags) 61 | else: 62 | os.system("./rundb %s | tee temp.out" % app_flags) 63 | 64 | def eval_arg(job, arg): 65 | return ((arg in job) and (job[arg] == "true")) 66 | 67 | def parse_output(job): 68 | output = open("temp.out") 69 | success = False 70 | for line in output: 71 | line = line.strip() 72 | if "[summary]" in line: 73 | success = True 74 | for token in line.strip().split('[summary]')[-1].split(','): 75 | key, val = token.strip().split('=') 76 | job[key] = val 77 | break 78 | if success: 79 | output.close() 80 | os.system("rm -f temp.out") 81 | return job 82 | errlog = open("log/{}.log".format(datetime.datetime.now().strftime("%b-%d_%H-%M-%S-%f")), 'a+') 83 | errlog.write("{}\n".format(json.dumps(job))) 84 | output = open("temp.out") 85 | for line in output: 86 | errlog.write(line) 87 | errlog.close() 88 | output.close() 89 | os.system("rm -f temp.out") 90 | return job 91 | 92 | if __name__ == "__main__": 93 | print("usage: path/to/json [more args]", file=sys.stderr) 94 | fname = sys.argv[1] if len(sys.argv) > 1 else "" 95 | idx = 2 96 | if ".json" not in fname: 97 | fname = "experiments/default.json" 98 | idx = 1 99 | print("- read config from file: {}".format(fname), file=sys.stderr) 100 | job = json.load(open(fname)) 101 | if len(sys.argv) > idx: 102 | # has more args / overwrite existing args 103 | for item in sys.argv[idx:]: 104 | key, value = item.split("=", 1) 105 | job[key] = value 106 | if not eval_arg(job, "EXEC_ONLY"): 107 | print("- compiling...", file=sys.stderr) 108 | ndebug = eval_arg(job, "NDEBUG") 109 | set_ndebug(ndebug) 110 | if ndebug: 111 | 
print("- disable assert()", file=sys.stderr) 112 | compile(job) 113 | numa = eval_arg(job, "UNSET_NUMA") == False 114 | if not numa: 115 | print("- disable interleaving allocation across numa nodes", file=sys.stderr) 116 | if not eval_arg(job, "COMPILE_ONLY"): 117 | print("- executing...", file=sys.stderr) 118 | run("", job, numa=numa) 119 | if eval_arg(job, "OUTPUT_TO_FILE"): 120 | job = parse_output(job) 121 | stats = open("outputs/stats.json", "a+") 122 | stats.write(json.dumps(job)+"\n") 123 | stats.close() 124 | --------------------------------------------------------------------------------
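
The `compile()` function in `test.py` configures a build by regex-rewriting `#define` lines in `config.h` from the job dictionary. A minimal standalone sketch of that substitution step, using the same pattern as `test.py` (the config fragment and job values below are made up for illustration; the real `config.h` is copied from `config-std.h`):

```python
import re

# Stand-in for a config.h fragment (illustrative, not the real file).
config = (
    "#define WORKLOAD YCSB\n"
    "#define THREAD_CNT 4\n"
    "#define CC_ALG SILO\n"
)
# Hypothetical overrides, as they would come from an experiments/*.json job.
job = {"THREAD_CNT": 64, "CC_ALG": "SILO_PRIO"}

# Same rewrite as test.py's compile(): match the whole
# "#define PARAM ..." line and replace it with the job's value.
for param, value in job.items():
    pattern = r"\#define\s*" + re.escape(param) + r"[\t ].*"
    replacement = "#define " + param + " " + str(value)
    config = re.sub(pattern, replacement, config)

print(config)
```

Because the pattern consumes the rest of the line (`.*`), any existing value and trailing comment on a matched `#define` line are dropped; parameters absent from the job dict are left untouched.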