├── .gitignore
├── ARTIFACT.md
├── CMakeLists.txt
├── DBx1000_README
├── LICENSE
├── Makefile
├── README.md
├── artifact.sh
├── benchmarks
│   ├── TEST_schema.txt
│   ├── TPCC_full_schema.txt
│   ├── TPCC_short_schema.txt
│   ├── YCSB_schema.txt
│   ├── test.h
│   ├── test_txn.cpp
│   ├── test_wl.cpp
│   ├── tpcc.h
│   ├── tpcc_const.h
│   ├── tpcc_helper.cpp
│   ├── tpcc_helper.h
│   ├── tpcc_query.cpp
│   ├── tpcc_query.h
│   ├── tpcc_txn.cpp
│   ├── tpcc_wl.cpp
│   ├── ycsb.h
│   ├── ycsb_query.cpp
│   ├── ycsb_query.h
│   ├── ycsb_txn.cpp
│   └── ycsb_wl.cpp
├── concurrency_control
│   ├── aria.cpp
│   ├── aria.h
│   ├── bamboo.cpp
│   ├── dl_detect.cpp
│   ├── dl_detect.h
│   ├── hekaton.cpp
│   ├── ic3.cpp
│   ├── occ.cpp
│   ├── occ.h
│   ├── plock.cpp
│   ├── plock.h
│   ├── row_aria.cpp
│   ├── row_aria.h
│   ├── row_bamboo.cpp
│   ├── row_bamboo.h
│   ├── row_hekaton.cpp
│   ├── row_hekaton.h
│   ├── row_ic3.cpp
│   ├── row_ic3.h
│   ├── row_lock.cpp
│   ├── row_lock.h
│   ├── row_mvcc.cpp
│   ├── row_mvcc.h
│   ├── row_occ.cpp
│   ├── row_occ.h
│   ├── row_silo.cpp
│   ├── row_silo.h
│   ├── row_silo_prio.cpp
│   ├── row_silo_prio.h
│   ├── row_tictoc.cpp
│   ├── row_tictoc.h
│   ├── row_ts.cpp
│   ├── row_ts.h
│   ├── row_vll.cpp
│   ├── row_vll.h
│   ├── row_ww.cpp
│   ├── row_ww.h
│   ├── silo.cpp
│   ├── silo_prio.cpp
│   ├── tictoc.cpp
│   ├── vll.cpp
│   └── vll.h
├── config-std.h
├── config.cpp
├── experiments
│   ├── debug.json
│   ├── default.json
│   ├── large_dataset.json
│   ├── long_txn.json
│   ├── run_all.sh
│   ├── run_tpcc_thread.sh
│   ├── run_ycsb_aria.sh
│   ├── run_ycsb_latency.sh
│   ├── run_ycsb_prio_sen.sh
│   ├── run_ycsb_readonly.sh
│   ├── run_ycsb_thread.sh
│   ├── run_ycsb_zipf.sh
│   ├── synthetic_ycsb.json
│   └── tpcc.json
├── libs
│   └── libjemalloc.a
├── outputs
│   ├── .ipynb_checkpoints
│   │   └── analyze_factors-checkpoint.ipynb
│   └── collect_stats.py
├── parse.py
├── plot.py
├── requirements.txt
├── storage
│   ├── catalog.cpp
│   ├── catalog.h
│   ├── index_base.h
│   ├── index_btree.cpp
│   ├── index_btree.h
│   ├── index_hash.cpp
│   ├── index_hash.h
│   ├── row.cpp
│   ├── row.h
│   ├── table.cpp
│   └── table.h
├── system
│   ├── amd64.h
│   ├── batch.cpp
│   ├── batch.h
│   ├── global.cpp
│   ├── global.h
│   ├── helper.cpp
│   ├── helper.h
│   ├── main.cpp
│   ├── manager.cpp
│   ├── manager.h
│   ├── mcs_spinlock.h
│   ├── mem_alloc.cpp
│   ├── mem_alloc.h
│   ├── parser.cpp
│   ├── query.cpp
│   ├── query.h
│   ├── stats.cpp
│   ├── stats.h
│   ├── thread.cpp
│   ├── thread.h
│   ├── txn.cpp
│   ├── txn.h
│   ├── wl.cpp
│   └── wl.h
└── test.py
/.gitignore:
--------------------------------------------------------------------------------
1 | **.o
2 | **.d
3 | **.out
4 | **.swp
5 | **.swo
6 | run_exp.sh
7 | outputs/*.json
8 | config.h
9 | rundb
10 | cmake-build-debug
11 | .DS_STORE
12 | .idea/*
13 | perf*
14 | row_stats.csv
15 | linux/
16 | results/
17 | *.pdf
18 | 
--------------------------------------------------------------------------------
/ARTIFACT.md:
--------------------------------------------------------------------------------
1 | 
2 | # Artifact Reproduction
3 | 
4 | This document provides detailed step-by-step instructions to reproduce all experiments on a CloudLab c6420 instance with Ubuntu 20. We recommend using this environment to reproduce the experiments. To run the experiments on other hardware, make sure the machine has at least 64 logical cores, since some experiments use 64 threads. You may download a copy of the paper [here](https://dl.acm.org/doi/10.1145/3588724?cid=99660889005).
5 | 
6 | Steps 0-B to 2 are summarized in a single script, [`artifact.sh`](artifact.sh). The expected running time of this script on a CloudLab c6420 machine is 9+ hours.
7 | 
8 | ## Step 0-A: Set up c6420 Machine
9 | 
10 | CloudLab machines, by default, only have 16 GB of storage space mounted at the root, which can be insufficient. The script below mounts `/dev/sda4` to `~/workspace`. You may skip this step if not using CloudLab c6420 instances.
11 | 
12 | ```bash
13 | DISK=/dev/sda4
14 | WORKSPACE=$HOME/workspace
15 | sudo mkfs.ext4 $DISK
16 | sudo mkdir $WORKSPACE
17 | sudo mount $DISK $WORKSPACE
18 | sudo chown -R $USER $WORKSPACE
19 | echo "$DISK $WORKSPACE ext4 defaults 0 0" | sudo tee -a /etc/fstab
20 | # this directory will be mounted automatically on every reboot
21 | ```
22 | 
23 | ## Step 0-B: Install Dependencies
24 | 
25 | Download the codebase under `~/workspace`, and `cd` into the Polaris top-level directory. Then install the software dependencies:
26 | 
27 | ```bash
28 | # assume the current working directory is `polaris/`
29 | sudo apt update
30 | sudo apt install -y numactl python3-pip
31 | pip3 install -r requirements.txt
32 | ```
33 | 
34 | ## Step 1: Run All Experiments
35 | 
36 | The experiment scripts are under `experiments/`. To run all experiments:
37 | 
38 | ```bash
39 | # run all experiments; this may take 9+ hours
40 | bash experiments/run_ycsb_latency.sh   # fig 1, 7
41 | bash experiments/run_ycsb_prio_sen.sh  # fig 2
42 | bash experiments/run_ycsb_thread.sh    # fig 3
43 | bash experiments/run_ycsb_readonly.sh  # fig 4
44 | bash experiments/run_ycsb_zipf.sh      # fig 5, 6
45 | bash experiments/run_tpcc_thread.sh    # fig 8, 9
46 | bash experiments/run_ycsb_aria.sh      # fig 10, 11
47 | ```
48 | 
49 | Alternatively, simply run `bash experiments/run_all.sh`, which is a shortcut for the lines above.
50 | 
51 | The experiment results are saved under `results/`. You should find 7 subdirectories: `ycsb_latency`, `ycsb_prio_sen`, `ycsb_thread`, `ycsb_readonly`, `ycsb_zipf`, `tpcc_thread`, and `ycsb_aria_batch`.
52 | 
53 | ## Step 2: Process Experiment Data
54 | 
55 | Then parse and plot the experiment results:
56 | 
57 | ```bash
58 | # parse all experiments
59 | python3 parse.py ycsb_latency ycsb_prio_sen ycsb_thread ycsb_readonly ycsb_zipf tpcc_thread ycsb_aria_batch
60 | 
61 | # plot figures; this may take minutes
62 | python3 plot.py
63 | ```
64 | 
65 | The plots are saved in the current working directory. You should find 11 figures (with the corresponding figure numbers in the [paper](https://doi.org/10.1145/3588724)):
66 | 
67 | - fig 1: `ycsb_latency_allcc.pdf`
68 | - fig 2: `ycsb_prio_ratio_vs_throughput.pdf`
69 | - fig 3: `ycsb_thread_vs_throughput_tail.pdf`
70 | - fig 4: `ycsb_thread_vs_throughput_tail_readonly.pdf`
71 | - fig 5: `ycsb_zipf_vs_throughput_tail.pdf`
72 | - fig 6: `ycsb_latency_udprio.pdf`
73 | - fig 7: `tpcc_thread_vs_throughput_tail_wh1.pdf`
74 | - fig 8: `tpcc_thread_vs_throughput_tail_wh64.pdf`
75 | - fig 9: `ycsb_aria_thread_vs_throughput_tail_zipf0.99.pdf`
76 | - fig 10: `ycsb_aria_thread_vs_throughput_tail_zipf0.5.pdf`
77 | 
78 | If running on a CloudLab c6420 instance, all figures should be similar; the exception is fig 10: as reported in the paper, Aria p999 tail latency tends to have (relatively) high variation due to batching.
--------------------------------------------------------------------------------
/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | cmake_minimum_required(VERSION 2.8)
2 | project(Dbx1000)
3 | 
4 | SET (CMAKE_C_COMPILER "gcc")
5 | SET (CMAKE_CXX_COMPILER "g++")
6 | SET (CMAKE_CXX_FLAGS "-std=c++17 -Wno-deprecated-declarations" CACHE INTERNAL "compiler options" FORCE)
7 | SET (CMAKE_CXX_FLAGS_DEBUG "-O0 -g" CACHE INTERNAL "compiler options" FORCE)
8 | SET (CMAKE_CXX_FLAGS_RELEASE "-O3" CACHE INTERNAL "compiler options" FORCE)
9 | 
10 | add_definitions(-DNOGRAPHITE=1)
11 | 
12 | # include header files
13 | INCLUDE_DIRECTORIES(${PROJECT_SOURCE_DIR} ${PROJECT_SOURCE_DIR}/benchmarks/ ${PROJECT_SOURCE_DIR}/concurrency_control/ ${PROJECT_SOURCE_DIR}/storage/ ${PROJECT_SOURCE_DIR}/system/)
14 | # lib files
15 | #LINK_DIRECTORIES(${PROJECT_SOURCE_DIR}/libs)
16 | file(GLOB_RECURSE SRC_FILES benchmarks/*.cpp concurrency_control/*.cpp storage/*.cpp system/*.cpp config.cpp)
17 | add_executable(rundb ${SRC_FILES})
18 | target_link_libraries(rundb libpthread.so libjemalloc.so)
19
| 
--------------------------------------------------------------------------------
/DBx1000_README:
--------------------------------------------------------------------------------
1 | 
2 | DBMS BENCHMARK
3 | ------------------------------
4 | 
5 | == General Features ==
6 | 
7 | dbms is an OLTP database benchmark with the following features.
8 | 
9 | 1. Seven different concurrency control algorithms are supported.
10 | DL_DETECT[1] : deadlock detection
11 | NO_WAIT[1] : no-wait two phase locking
12 | WAIT_DIE[1] : wait-and-die two phase locking
13 | TIMESTAMP[1] : basic T/O
14 | MVCC[1] : multi-version T/O
15 | HSTORE[3] : H-STORE
16 | OCC[2] : optimistic concurrency control
17 | 
18 | [1] Philip Bernstein, Nathan Goodman, "Concurrency Control in Distributed Database Systems", Computing Surveys, June 1981
19 | [2] H.T. Kung, John Robinson, "On Optimistic Methods for Concurrency Control", Transactions on Database Systems, June 1981
20 | [3] R. Kallman et al, "H-Store: a High-Performance, Distributed Main Memory Transaction Processing System", VLDB 2008
21 | 
22 | 2. Two benchmarks are supported.
23 | 2.1 YCSB[4]
24 | 2.2 TPCC[5]
25 | Only Payment and New Order transactions are modeled.
26 | 
27 | [4] B. Cooper et al, "Benchmarking Cloud Serving Systems with YCSB", SoCC 2010
28 | [5] http://www.tpc.org/tpcc/
29 | 
30 | == Config File ==
31 | 
32 | The dbms benchmark has the following parameters in the config file. Parameters with a * sign should not be changed.
33 | 
34 | CORE_CNT : number of cores modeled in the system.
35 | PART_CNT : number of logical partitions in the system
36 | THREAD_CNT : number of threads running at the same time
37 | PAGE_SIZE : memory page size
38 | CL_SIZE : cache line size
39 | WARMUP : number of transactions to run for warmup
40 | 
41 | WORKLOAD : workload supported (TPCC or YCSB)
42 | 
43 | THREAD_ALLOC : per thread allocator.
44 | * MEM_PAD : enable memory padding to avoid false sharing.
45 | MEM_ALLIGN : allocated blocks are aligned to MEM_ALLIGN bytes
46 | 
47 | PRT_LAT_DISTR : print out latency distribution of transactions
48 | 
49 | CC_ALG : concurrency control algorithm
50 | * ROLL_BACK : roll back the modifications if a transaction aborts.
51 | 
52 | ENABLE_LATCH : enable latching in btree index
53 | * CENTRAL_INDEX : centralized index structure
54 | * CENTRAL_MANAGER : centralized lock/timestamp manager
55 | INDEX_STRCT : data structure for index.
56 | BTREE_ORDER : fanout of each B-tree node
57 | 
58 | DL_TIMEOUT_LOOP : the max waiting time in DL_DETECT. after timeout, deadlock will be detected.
59 | TS_TWR : enable Thomas Write Rule (TWR) in TIMESTAMP
60 | HIS_RECYCLE_LEN : in MVCC, history will be recycled if it is too long.
61 | MAX_WRITE_SET : the max size of a write set in OCC.
62 | 
63 | MAX_ROW_PER_TXN : max number of rows touched per transaction.
64 | QUERY_INTVL : the rate at which database queries arrive
65 | MAX_TXN_PER_PART : maximum transactions to run per partition.
66 | 
67 | // for YCSB Benchmark
68 | SYNTH_TABLE_SIZE : table size
69 | ZIPF_THETA : theta in zipfian distribution (rows accessed follow a zipfian distribution)
70 | READ_PERC :
71 | WRITE_PERC :
72 | SCAN_PERC : percentage of read/write/scan queries. they should add up to 1.
73 | SCAN_LEN : number of rows touched per scan query.
74 | PART_PER_TXN : number of logical partitions to touch per transaction
75 | PERC_MULTI_PART : percentage of multi-partition transactions
76 | REQ_PER_QUERY : number of queries per transaction
77 | FIRST_PART_LOCAL : with this being true, the first touched partition is always the local partition.
78 | 
79 | // for TPCC Benchmark
80 | NUM_WH : number of warehouses being modeled.
81 | PERC_PAYMENT : percentage of payment transactions.
82 | DIST_PER_WARE : number of districts in one warehouse
83 | MAXITEMS : number of items modeled.
84 | CUST_PER_DIST : number of customers per district
85 | ORD_PER_DIST : number of orders per district
86 | FIRSTNAME_LEN : length of first name
87 | MIDDLE_LEN : length of middle name
88 | LASTNAME_LEN : length of last name
89 | 
90 | // !! centralized CC management should be ignored.
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | ISC License
2 | 
3 | Copyright (c) 2014, Xiangyao Yu
4 | 
5 | Permission to use, copy, modify, and/or distribute this software for any
6 | purpose with or without fee is hereby granted, provided that the above
7 | copyright notice and this permission notice appear in all copies.
8 | 
9 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
10 | REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
11 | AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
12 | INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
13 | LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
14 | OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
15 | PERFORMANCE OF THIS SOFTWARE.
16 | 
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | CC=g++
2 | CFLAGS=-Wall -g -std=c++17 -fno-omit-frame-pointer
3 | 
4 | .SUFFIXES: .o .cpp .h
5 | 
6 | SRC_DIRS = ./ ./benchmarks/ ./concurrency_control/ ./storage/ ./system/
7 | INCLUDE = -I. -I./benchmarks -I./concurrency_control -I./storage -I./system
8 | 
9 | CFLAGS += $(INCLUDE) -D NOGRAPHITE=1 -O3 -Wno-unused-variable #-Werror
10 | LDFLAGS = -Wall -L. -L./libs -pthread -g -lrt -std=c++17 -O3 -ljemalloc
11 | LDFLAGS += $(CFLAGS)
12 | 
13 | CPPS = $(foreach dir, $(SRC_DIRS), $(wildcard $(dir)*.cpp))
14 | OBJS = $(CPPS:.cpp=.o)
15 | DEPS = $(CPPS:.cpp=.d)
16 | 
17 | all:rundb
18 | 
19 | rundb : $(OBJS)
20 | 	$(CC) -no-pie -o $@ $^ $(LDFLAGS)
21 | 	#$(CC) -o $@ $^ $(LDFLAGS)
22 | 
23 | -include $(OBJS:%.o=%.d)
24 | 
25 | %.d: %.cpp
26 | 	$(CC) -MM -MT $*.o -MF $@ $(CFLAGS) $<
27 | 
28 | %.o: %.cpp
29 | 	$(CC) -c $(CFLAGS) -o $@ $<
30 | 
31 | .PHONY: clean
32 | clean:
33 | 	rm -f rundb $(OBJS) $(DEPS)
34 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DBx1000-Polaris
2 | 
3 | Polaris is an optimistic concurrency control algorithm with priority support.
4 | 
5 | - [Polaris: Enabling Transaction Priority in Optimistic Concurrency Control](https://dl.acm.org/doi/10.1145/3588724?cid=99660889005). Chenhao Ye, Wuh-Chwen Hwang, Keren Chen, Xiangyao Yu.
6 | 
7 | This repository implements Polaris on top of [DBx1000](https://github.com/yxymit/DBx1000) and [DBx1000-Bamboo](https://github.com/ScarletGuo/Bamboo-Public).
8 | 
9 | - DBx1000: [Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores](http://www.vldb.org/pvldb/vol8/p209-yu.pdf). Xiangyao Yu, George Bezerra, Andrew Pavlo, Srinivas Devadas, Michael Stonebraker.
10 | - Bamboo: [Releasing Locks As Early As You Can: Reducing Contention of Hotspots by Violating Two-Phase Locking](https://doi.org/10.1145/3448016.3457294). Zhihan Guo, Kan Wu, Cong Yan, Xiangyao Yu.
11 | 
12 | These repositories implement other concurrency control algorithms (e.g., No-Wait, Wait-Die, Wound-Wait, Silo) that serve as baselines for the Polaris evaluation.
13 | 
14 | **NOTE: This README describes the general usage of this repository; to reproduce all experiments in the [paper](https://dl.acm.org/doi/10.1145/3588724?cid=99660889005), please refer to [`ARTIFACT.md`](ARTIFACT.md).**
15 | 
16 | ## Quick Start: Build & Test
17 | 
18 | To test the database:
19 | 
20 | ```shell
21 | python3 test.py experiments/default.json
22 | ```
23 | 
24 | The command above will compile the code with the configuration specified in `experiments/default.json` and run the experiments. `test.py` will read the configuration and the existing `config-std.h` to generate a new `config.h`.
25 | 
26 | You can find other configuration files (`*.json`) under `experiments/`.
27 | 
28 | ## Advanced: Configure & Run
29 | 
30 | The parameters are set by `config-std.h` and the configuration file. You can override parameters by specifying them on the command line.
31 | 
32 | ```shell
33 | python3 test.py experiments/default.json COMPILE_ONLY=true
34 | ```
35 | 
36 | This command only compiles the code and does not run the experiment.
37 | 
38 | Below are parameters that affect `test.py` behavior:
39 | 
40 | - `UNSET_NUMA`: If set to `false`, memory is allocated interleaved across NUMA nodes. Default is `false`.
41 | - `COMPILE_ONLY`: Only compile the code; do not run the experiments. Default is `false`.
42 | - `NDEBUG`: Disable all `assert`s. Default is `true`.
43 | 
44 | Below is a list of basic build parameters. They typically turn certain features on or off for evaluation purposes. The list is not exhaustive; you can find more in `config-std.h`.
45 | 
46 | - `CC_ALG`: Which concurrency control algorithm to use. Default is `SILO_PRIO`, which is an alias of Polaris\*.
47 | - `THREAD_CNT`: How many threads to use.
48 | - `WORKLOAD`: Which workload to run. Either `YCSB` or `TPCC`.
49 | - `ZIPF_THETA`: The Zipfian theta value in the YCSB workload. Only useful when `WORKLOAD=YCSB`.
50 | - `NUM_WH`: How many warehouses in the TPC-C workload. Only useful when `WORKLOAD=TPCC`.
51 | - `DUMP_LATENCY`: Whether to dump the latency of all transactions to a file. Useful for plotting latency distributions.
52 | - `DUMP_LATENCY_FILENAME`: If `DUMP_LATENCY=true`, the filename of the dump.
53 | 
54 | 
55 | Below is another list of build parameters introduced for Polaris:
56 | 
57 | - `SILO_PRIO_NO_RESERVE_LOWEST_PRIO`: Whether to turn on the lowest-priority optimization for Polaris. Default is `true`, and it should be left on (unless you want to benchmark how much this optimization gains).
58 | - `SILO_PRIO_FIXED_PRIO`: Whether to fix the priority of each transaction. If `false`, Polaris assigns priority based on its own policy.
59 | - `SILO_PRIO_ABORT_THRESHOLD_BEFORE_INC_PRIO`: Do not increment the priority until the transaction's abort counter reaches this threshold.
60 | - `SILO_PRIO_INC_PRIO_AFTER_NUM_ABORT`: After reaching the threshold, increment the priority by one for every specified number of aborts.
61 | - `HIGH_PRIO_RATIO`: The ratio of transactions that start with high (i.e., nonzero) priority. Useful to simulate the case of user-specified priority.
62 | 
63 | There are other handy tools included in this repository. `experiments/*.sh` are scripts to reproduce the experiments described in our paper. `parse.py` processes the experiment results into CSV files, and `plot.py` can visualize them.
64 | 
65 | \* **Fun fact**: Polaris is implemented based on Silo but with priority support, so it was previously termed `SILO_PRIO`. The name `POLARIS` came from a letter rearrangement of `SILO_PRIO` with an additional `A`.
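
As the README notes, `test.py` reads the JSON configuration (plus any command-line overrides) together with `config-std.h` to generate a new `config.h`. The merge step can be sketched as below. This is an illustrative re-implementation only, not the actual `test.py` logic; the function name `gen_config` and the inline sample header are made up for the example:

```python
import re

def gen_config(std_header: str, overrides: dict) -> str:
    """Return the header text with each `#define KEY VALUE` line rewritten
    for every KEY present in `overrides` (JSON config or command line)."""
    out = []
    for line in std_header.splitlines():
        m = re.match(r"(\s*#define\s+)(\w+)(\s+)\S.*", line)
        if m and m.group(2) in overrides:
            # keep the original spacing, swap in the overriding value
            line = f"{m.group(1)}{m.group(2)}{m.group(3)}{overrides[m.group(2)]}"
        out.append(line)
    return "\n".join(out)

# tiny stand-in for config-std.h
std = "#define THREAD_CNT 4\n#define CC_ALG SILO_PRIO\n#define ZIPF_THETA 0.9"
print(gen_config(std, {"THREAD_CNT": 16, "ZIPF_THETA": 0.99}))
```

Defines not mentioned in the overrides (here `CC_ALG`) pass through unchanged, which mirrors the described behavior of falling back to the defaults in `config-std.h`.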
66 | 
--------------------------------------------------------------------------------
/artifact.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # assume the current working directory is `polaris/`
3 | sudo apt update
4 | sudo apt install -y numactl python3-pip
5 | pip3 install -r requirements.txt
6 | 
7 | bash experiments/run_all.sh
8 | python3 parse.py ycsb_latency ycsb_prio_sen ycsb_thread ycsb_readonly ycsb_zipf tpcc_thread ycsb_aria_batch
9 | python3 plot.py
10 | 
--------------------------------------------------------------------------------
/benchmarks/TEST_schema.txt:
--------------------------------------------------------------------------------
1 | //size, type, name
2 | TABLE=MAIN_TABLE
3 | 4,int,F0
4 | 8,double,F1
5 | 8,uint64,F2
6 | 100,string,F3
7 | 
8 | INDEX=MAIN_INDEX
9 | MAIN_TABLE,0
10 | 
--------------------------------------------------------------------------------
/benchmarks/TPCC_full_schema.txt:
--------------------------------------------------------------------------------
1 | //size,type,name
2 | TABLE=WAREHOUSE
3 | 8,int64_t,W_ID
4 | 10,string,W_NAME
5 | 20,string,W_STREET_1
6 | 20,string,W_STREET_2
7 | 20,string,W_CITY
8 | 2,string,W_STATE
9 | 9,string,W_ZIP
10 | 8,double,W_TAX
11 | 8,double,W_YTD
12 | 
13 | TABLE=DISTRICT
14 | 8,int64_t,D_ID
15 | 8,int64_t,D_W_ID
16 | 10,string,D_NAME
17 | 20,string,D_STREET_1
18 | 20,string,D_STREET_2
19 | 20,string,D_CITY
20 | 2,string,D_STATE
21 | 9,string,D_ZIP
22 | 8,double,D_TAX
23 | 8,double,D_YTD
24 | 8,int64_t,D_NEXT_O_ID
25 | 
26 | TABLE=CUSTOMER
27 | 8,int64_t,C_ID
28 | 8,int64_t,C_D_ID
29 | 8,int64_t,C_W_ID
30 | 16,string,C_FIRST
31 | 2,string,C_MIDDLE
32 | 16,string,C_LAST
33 | 20,string,C_STREET_1
34 | 20,string,C_STREET_2
35 | 20,string,C_CITY
36 | 2,string,C_STATE
37 | 9,string,C_ZIP
38 | 16,string,C_PHONE
39 | 8,int64_t,C_SINCE
40 | 2,string,C_CREDIT
41 | 8,int64_t,C_CREDIT_LIM
42 | 8,int64_t,C_DISCOUNT
43 | 8,double,C_BALANCE
44 | 8,double,C_YTD_PAYMENT
45 | 8,uint64_t,C_PAYMENT_CNT
46 | 8,uint64_t,C_DELIVERY_CNT
47 | 500,string,C_DATA
48 | 
49 | TABLE=HISTORY
50 | 8,int64_t,H_C_ID
51 | 8,int64_t,H_C_D_ID
52 | 8,int64_t,H_C_W_ID
53 | 8,int64_t,H_D_ID
54 | 8,int64_t,H_W_ID
55 | 8,int64_t,H_DATE
56 | 8,double,H_AMOUNT
57 | 24,string,H_DATA
58 | 
59 | TABLE=NEW-ORDER
60 | 8,int64_t,NO_O_ID
61 | 8,int64_t,NO_D_ID
62 | 8,int64_t,NO_W_ID
63 | 
64 | TABLE=ORDER
65 | 8,int64_t,O_ID
66 | 8,int64_t,O_C_ID
67 | 8,int64_t,O_D_ID
68 | 8,int64_t,O_W_ID
69 | 8,int64_t,O_ENTRY_D
70 | 8,int64_t,O_CARRIER_ID
71 | 8,int64_t,O_OL_CNT
72 | 8,int64_t,O_ALL_LOCAL
73 | 
74 | TABLE=ORDER-LINE
75 | 8,int64_t,OL_O_ID
76 | 8,int64_t,OL_D_ID
77 | 8,int64_t,OL_W_ID
78 | 8,int64_t,OL_NUMBER
79 | 8,int64_t,OL_I_ID
80 | 8,int64_t,OL_SUPPLY_W_ID
81 | 8,int64_t,OL_DELIVERY_D
82 | 8,int64_t,OL_QUANTITY
83 | 8,double,OL_AMOUNT
84 | 8,int64_t,OL_DIST_INFO
85 | 
86 | TABLE=ITEM
87 | 8,int64_t,I_ID
88 | 8,int64_t,I_IM_ID
89 | 24,string,I_NAME
90 | 8,int64_t,I_PRICE
91 | 50,string,I_DATA
92 | 
93 | TABLE=STOCK
94 | 8,int64_t,S_I_ID
95 | 8,int64_t,S_W_ID
96 | 8,int64_t,S_QUANTITY
97 | 24,string,S_DIST_01
98 | 24,string,S_DIST_02
99 | 24,string,S_DIST_03
100 | 24,string,S_DIST_04
101 | 24,string,S_DIST_05
102 | 24,string,S_DIST_06
103 | 24,string,S_DIST_07
104 | 24,string,S_DIST_08
105 | 24,string,S_DIST_09
106 | 24,string,S_DIST_10
107 | 8,int64_t,S_YTD
108 | 8,int64_t,S_ORDER_CNT
109 | 8,int64_t,S_REMOTE_CNT
110 | 50,string,S_DATA
111 | 
112 | INDEX=ITEM_IDX
113 | ITEM,400000
114 | 
115 | INDEX=WAREHOUSE_IDX
116 | WAREHOUSE,100
117 | 
118 | INDEX=DISTRICT_IDX
119 | DISTRICT,1000
120 | 
121 | INDEX=CUSTOMER_ID_IDX
122 | CUSTOMER,120000
123 | 
124 | INDEX=CUSTOMER_LAST_IDX
125 | CUSTOMER,120000
126 | 
127 | INDEX=STOCK_IDX
128 | STOCK,400000
--------------------------------------------------------------------------------
/benchmarks/TPCC_short_schema.txt:
--------------------------------------------------------------------------------
1 | //size,type,name
2 | TABLE=WAREHOUSE
3 | 8,int64_t,W_ID
4 | 10,string,W_NAME
5 | 20,string,W_STREET_1
6 | 20,string,W_STREET_2
7 | 20,string,W_CITY
8 | 2,string,W_STATE
9 | 9,string,W_ZIP
10 | 8,double,W_TAX
11 | 8,double,W_YTD
12 | 
13 | TABLE=DISTRICT
14 | 8,int64_t,D_ID
15 | 8,int64_t,D_W_ID
16 | 10,string,D_NAME
17 | 20,string,D_STREET_1
18 | 20,string,D_STREET_2
19 | 20,string,D_CITY
20 | 2,string,D_STATE
21 | 9,string,D_ZIP
22 | 8,double,D_TAX
23 | 8,double,D_YTD
24 | 8,int64_t,D_NEXT_O_ID
25 | 
26 | TABLE=CUSTOMER
27 | 8,int64_t,C_ID
28 | 8,int64_t,C_D_ID
29 | 8,int64_t,C_W_ID
30 | 2,string,C_MIDDLE
31 | 16,string,C_LAST
32 | 2,string,C_STATE
33 | 2,string,C_CREDIT
34 | 8,int64_t,C_DISCOUNT
35 | 8,double,C_BALANCE
36 | 8,double,C_YTD_PAYMENT
37 | 8,uint64_t,C_PAYMENT_CNT
38 | 
39 | TABLE=HISTORY
40 | 8,int64_t,H_C_ID
41 | 8,int64_t,H_C_D_ID
42 | 8,int64_t,H_C_W_ID
43 | 8,int64_t,H_D_ID
44 | 8,int64_t,H_W_ID
45 | 8,int64_t,H_DATE
46 | 8,double,H_AMOUNT
47 | 
48 | TABLE=NEW-ORDER
49 | 8,int64_t,NO_O_ID
50 | 8,int64_t,NO_D_ID
51 | 8,int64_t,NO_W_ID
52 | 
53 | TABLE=ORDER
54 | 8,int64_t,O_ID
55 | 8,int64_t,O_C_ID
56 | 8,int64_t,O_D_ID
57 | 8,int64_t,O_W_ID
58 | 8,int64_t,O_ENTRY_D
59 | 8,int64_t,O_CARRIER_ID
60 | 8,int64_t,O_OL_CNT
61 | 8,int64_t,O_ALL_LOCAL
62 | 
63 | TABLE=ORDER-LINE
64 | 8,int64_t,OL_O_ID
65 | 8,int64_t,OL_D_ID
66 | 8,int64_t,OL_W_ID
67 | 8,int64_t,OL_NUMBER
68 | 8,int64_t,OL_I_ID
69 | 
70 | TABLE=ITEM
71 | 8,int64_t,I_ID
72 | 8,int64_t,I_IM_ID
73 | 24,string,I_NAME
74 | 8,int64_t,I_PRICE
75 | 50,string,I_DATA
76 | 
77 | TABLE=STOCK
78 | 8,int64_t,S_I_ID
79 | 8,int64_t,S_W_ID
80 | 8,int64_t,S_QUANTITY
81 | 8,int64_t,S_REMOTE_CNT
82 | 
83 | INDEX=ITEM_IDX
84 | ITEM,10000
85 | 
86 | INDEX=WAREHOUSE_IDX
87 | WAREHOUSE,1
88 | 
89 | INDEX=DISTRICT_IDX
90 | DISTRICT,10
91 | 
92 | INDEX=CUSTOMER_ID_IDX
93 | CUSTOMER,40000
94 | 
95 | INDEX=CUSTOMER_LAST_IDX
96 | CUSTOMER,40000
97 | 
98 | INDEX=STOCK_IDX
99 | STOCK,10000
100 | 
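
The schema files above share one plain-text format: `size,type,name` rows grouped under a `TABLE=<name>` header, plus `INDEX=<name>` sections whose rows list the indexed table and an expected size, with `//` starting a comment. A rough sketch of how such a file could be parsed is below; this is for illustration only and is not the repository's actual `workload::init_schema` loader:

```python
def parse_schema(text: str):
    """Parse the DBx1000 schema format: `size,type,name` rows grouped
    under TABLE=<name> headers, and `table,size` rows under INDEX=<name>."""
    tables, indexes, current = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and comments
        if line.startswith("TABLE="):
            current = tables.setdefault(line[len("TABLE="):], [])
        elif line.startswith("INDEX="):
            current = indexes.setdefault(line[len("INDEX="):], [])
        else:
            current.append(tuple(line.split(",")))
    return tables, indexes

# excerpt of TPCC_short_schema.txt
text = """//size,type,name
TABLE=WAREHOUSE
8,int64_t,W_ID
10,string,W_NAME

INDEX=WAREHOUSE_IDX
WAREHOUSE,1
"""
tables, indexes = parse_schema(text)
print(tables["WAREHOUSE"])  # [('8', 'int64_t', 'W_ID'), ('10', 'string', 'W_NAME')]
```

The short and full TPC-C schemas differ only in which columns each table declares, so one parser handles both.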
--------------------------------------------------------------------------------
/benchmarks/YCSB_schema.txt:
--------------------------------------------------------------------------------
1 | //size, type, name
2 | TABLE=MAIN_TABLE
3 | 10,string,F0
4 | 10,string,F1
5 | 10,string,F2
6 | 10,string,F3
7 | 10,string,F4
8 | 10,string,F5
9 | 10,string,F6
10 | 10,string,F7
11 | 10,string,F8
12 | 10,string,F9
13 | 
14 | INDEX=MAIN_INDEX
15 | MAIN_TABLE,0
16 | 
--------------------------------------------------------------------------------
/benchmarks/test.h:
--------------------------------------------------------------------------------
1 | #ifndef _TEST_H_
2 | #define _TEST_H_
3 | 
4 | #include "global.h"
5 | #include "txn.h"
6 | #include "wl.h"
7 | 
8 | class TestWorkload : public workload
9 | {
10 | public:
11 | 	RC init();
12 | 	RC init_table();
13 | 	RC init_schema(const char * schema_file);
14 | 	RC get_txn_man(txn_man *& txn_manager, thread_t * h_thd);
15 | 	void summarize();
16 | 	void tick() { time = get_sys_clock(); };
17 | 	INDEX * the_index;
18 | 	table_t * the_table;
19 | #if CC_ALG == IC3
20 | 	SC_PIECE * get_cedges(TPCCTxnType type, int idx) {return NULL;};
21 | #endif
22 | private:
23 | 	uint64_t time;
24 | };
25 | 
26 | class TestTxnMan : public txn_man
27 | {
28 | public:
29 | 	void init(thread_t * h_thd, workload * h_wl, uint64_t part_id);
30 | 	RC run_txn(int type, int access_num);
31 | 	RC exec_txn(base_query * m_query) { assert(false); return Abort;};
32 | private:
33 | 	RC testReadwrite(int access_num);
34 | 	RC testConflict(int access_num);
35 | 
36 | 	TestWorkload * _wl;
37 | };
38 | 
39 | #endif
40 | 
--------------------------------------------------------------------------------
/benchmarks/test_txn.cpp:
--------------------------------------------------------------------------------
1 | #include "test.h"
2 | #include "row.h"
3 | 
4 | void TestTxnMan::init(thread_t * h_thd, workload * h_wl, uint64_t thd_id) {
5 | 	txn_man::init(h_thd, h_wl, thd_id);
6 | 	_wl = (TestWorkload *) h_wl;
7 | }
8 | 
9 | RC TestTxnMan::run_txn(int type, int access_num) {
10 | 	switch(type) {
11 | 	case READ_WRITE :
12 | 		return testReadwrite(access_num);
13 | 	case CONFLICT:
14 | 		return testConflict(access_num);
15 | 	default:
16 | 		assert(false);
17 | 		return Abort;
18 | 	}
19 | }
20 | 
21 | RC TestTxnMan::testReadwrite(int access_num) {
22 | 	RC rc = RCOK;
23 | 	itemid_t * m_item;
24 | 
25 | 	m_item = index_read(_wl->the_index, 0, 0);
26 | 	row_t * row = ((row_t *)m_item->location);
27 | 	row_t * row_local = get_row(row, WR);
28 | 	if (access_num == 0) {
29 | 		char str[] = "hello";
30 | 		row_local->set_value(0, 1234);
31 | 		row_local->set_value(1, 1234.5);
32 | 		row_local->set_value(2, 8589934592UL);
33 | 		row_local->set_value(3, str);
34 | 	} else {
35 | 		int v1;
36 | 		double v2;
37 | 		uint64_t v3;
38 | 
39 | 		row_local->get_value(0, v1);
40 | 		row_local->get_value(1, v2);
41 | 		row_local->get_value(2, v3);
42 | #ifdef NDEBUG
43 | 		row_local->get_value(3);
44 | #else
45 | 		char * v4;
46 | 		v4 = row_local->get_value(3);
47 | #endif
48 | 		assert(v1 == 1234);
49 | 		assert(v2 == 1234.5);
50 | 		assert(v3 == 8589934592UL);
51 | 		assert(strcmp(v4, "hello") == 0);
52 | 	}
53 | 	rc = finish(rc);
54 | 	if (access_num == 0)
55 | 		return RCOK;
56 | 	else
57 | 		return FINISH;
58 | }
59 | 
60 | RC
61 | TestTxnMan::testConflict(int access_num)
62 | {
63 | 	RC rc = RCOK;
64 | 	itemid_t * m_item;
65 | 
66 | 	idx_key_t key;
67 | 	for (key = 0; key < 1; key ++) {
68 | 		m_item = index_read(_wl->the_index, key, 0);
69 | 		row_t * row = ((row_t *)m_item->location);
70 | 		row_t * row_local;
71 | 		row_local = get_row(row, WR);
72 | 		if (row_local) {
73 | 			char str[] = "hello";
74 | 			row_local->set_value(0, 1234);
75 | 			row_local->set_value(1, 1234.5);
76 | 			row_local->set_value(2, 8589934592UL);
77 | 			row_local->set_value(3, str);
78 | 			sleep(1);
79 | 		} else {
80 | 			rc = Abort;
81 | 			break;
82 | 		}
83 | 	}
84 | 	rc = finish(rc);
85 | 	return rc;
86 | }
87 | 
--------------------------------------------------------------------------------
/benchmarks/test_wl.cpp:
--------------------------------------------------------------------------------
1 | #include "test.h"
2 | #include "table.h"
3 | #include "row.h"
4 | #include "mem_alloc.h"
5 | #include "index_hash.h"
6 | #include "index_btree.h"
7 | #include "thread.h"
8 | 
9 | RC TestWorkload::init() {
10 | 	workload::init();
11 | 	string path;
12 | 	path = "./benchmarks/TEST_schema.txt";
13 | 	init_schema( path.c_str() );
14 | 
15 | 	init_table();
16 | 	return RCOK;
17 | }
18 | 
19 | RC TestWorkload::init_schema(const char * schema_file) {
20 | 	workload::init_schema(schema_file);
21 | 	the_table = tables["MAIN_TABLE"];
22 | 	the_index = indexes["MAIN_INDEX"];
23 | 	return RCOK;
24 | }
25 | 
26 | RC TestWorkload::init_table() {
27 | 	RC rc = RCOK;
28 | 	for (int rid = 0; rid < 10; rid ++) {
29 | 		row_t * new_row = NULL;
30 | 		uint64_t row_id;
31 | 		int part_id = 0;
32 | 		rc = the_table->get_new_row(new_row, part_id, row_id);
33 | 		assert(rc == RCOK);
34 | 		uint64_t primary_key = rid;
35 | 		new_row->set_primary_key(primary_key);
36 | 		new_row->set_value(0, rid);
37 | 		new_row->set_value(1, 0);
38 | 		new_row->set_value(2, 0);
39 | 		itemid_t * m_item = (itemid_t *) mem_allocator.alloc( sizeof(itemid_t), part_id );
40 | 		assert(m_item != NULL);
41 | 		m_item->type = DT_row;
42 | 		m_item->location = new_row;
43 | 		m_item->valid = true;
44 | 		uint64_t idx_key = primary_key;
45 | 		rc = the_index->index_insert(idx_key, m_item, 0);
46 | 		assert(rc == RCOK);
47 | 	}
48 | 	return rc;
49 | }
50 | 
51 | RC TestWorkload::get_txn_man(txn_man *& txn_manager, thread_t * h_thd) {
52 | 	txn_manager = (TestTxnMan *)
53 | 		mem_allocator.alloc( sizeof(TestTxnMan), h_thd->get_thd_id() );
54 | 	new(txn_manager) TestTxnMan();
55 | 	txn_manager->init(h_thd, this, h_thd->get_thd_id());
56 | 	return RCOK;
57 | }
58 | 
59 | void TestWorkload::summarize() {
60 | 	uint64_t curr_time = get_sys_clock();
61 | 	if (g_test_case == CONFLICT) {
62 | 		assert(curr_time - time > g_thread_cnt * 1e9);
63 | 		int total_wait_cnt = 0;
64 | 		for (UInt32 tid = 0; tid < g_thread_cnt; tid ++) {
65 | 			total_wait_cnt += stats._stats[tid]->wait_cnt;
66 | 		}
67 | 		printf("CONFLICT TEST. PASSED.\n");
68 | 	}
69 | }
70 | 
--------------------------------------------------------------------------------
/benchmarks/tpcc.h:
--------------------------------------------------------------------------------
1 | #ifndef _TPCC_H_
2 | #define _TPCC_H_
3 | 
4 | #include "wl.h"
5 | #include "txn.h"
6 | 
7 | class table_t;
8 | class INDEX;
9 | class tpcc_query;
10 | 
11 | #define IC3_TPCC_NEW_ORDER_PIECES 8
12 | #define IC3_TPCC_PAYMENT_PIECES 4
13 | #define IC3_TPCC_DELIVERY_PIECES 4
14 | 
15 | class tpcc_wl : public workload {
16 | public:
17 | 	RC init();
18 | 	RC init_table();
19 | 	RC init_schema(const char * schema_file);
20 | 	RC get_txn_man(txn_man *& txn_manager, thread_t * h_thd);
21 | 	table_t * t_warehouse;
22 | 	table_t * t_district;
23 | 	table_t * t_customer;
24 | 	table_t * t_history;
25 | 	table_t * t_neworder;
26 | 	table_t * t_order;
27 | 	table_t * t_orderline;
28 | 	table_t * t_item;
29 | 	table_t * t_stock;
30 | 
31 | 	INDEX * i_item;
32 | 	INDEX * i_warehouse;
33 | 	INDEX * i_district;
34 | 	INDEX * i_customer_id;
35 | 	INDEX * i_customer_last;
36 | 	INDEX * i_stock;
37 | 	INDEX * i_order; // key = (w_id, d_id, o_id)
38 | 	INDEX * i_orderline; // key = (w_id, d_id, o_id)
39 | 	INDEX * i_orderline_wd; // key = (w_id, d_id).
40 | 
41 | 	bool ** delivering;
42 | 	uint32_t next_tid;
43 | #if CC_ALG == IC3
44 | 	void init_scgraph();
45 | 	SC_PIECE * get_cedges(TPCCTxnType txn_type, int piece_id);
46 | 	SC_PIECE *** sc_graph;
47 | #endif
48 | private:
49 | 	uint64_t num_wh;
50 | 	void init_tab_item();
51 | 	void init_tab_wh(uint32_t wid);
52 | 	void init_tab_dist(uint64_t w_id);
53 | 	void init_tab_stock(uint64_t w_id);
54 | 	void init_tab_cust(uint64_t d_id, uint64_t w_id);
55 | 	void init_tab_hist(uint64_t c_id, uint64_t d_id, uint64_t w_id);
56 | 	void init_tab_order(uint64_t d_id, uint64_t w_id);
57 | 
58 | 	void init_permutation(uint64_t * perm_c_id, uint64_t wid);
59 | 
60 | 	static void * threadInitItem(void * This);
61 | 	static void * threadInitWh(void * This);
62 | 	static void * threadInitDist(void * This);
63 | 	static void * threadInitStock(void * This);
64 | 	static void * threadInitCust(void * This);
65 | 	static void * threadInitHist(void * This);
66 | 	static void * threadInitOrder(void * This);
67 | 
68 | 	static void * threadInitWarehouse(void * This);
69 | };
70 | 
71 | class tpcc_txn_man : public txn_man
72 | {
73 | public:
74 | 	void init(thread_t * h_thd, workload * h_wl, uint64_t part_id);
75 | 	RC exec_txn(base_query * query);
76 | private:
77 | 	tpcc_wl * _wl;
78 | 	RC exec_payment(tpcc_query * m_query);
79 | 	RC exec_new_order(tpcc_query * m_query);
80 | 	RC exec_order_status(tpcc_query * query);
81 | 	RC exec_delivery(tpcc_query * query);
82 | 	RC exec_stock_level(tpcc_query * query);
83 | 	bool has_local_row(row_t * location, access_t type, row_t * local, access_t local_type) {
84 | 		if (location == local) {
85 | 			if ((type == local_type) || (local_type == WR)) {
86 | 				return true;
87 | 			} else if (type == WR) {
88 | 				return false;
89 | 			}
90 | 		}
91 | 		return false;
92 | 	};
93 | };
94 | 
95 | #endif
96 | 
--------------------------------------------------------------------------------
/benchmarks/tpcc_const.h:
--------------------------------------------------------------------------------
1 | #if TPCC_SMALL
2 | enum {
3 | 	W_ID,
4 | 	W_NAME,
5 | 	W_STREET_1,
6 | 	W_STREET_2,
7 | 	W_CITY,
8 | 	W_STATE,
9 | 	W_ZIP,
10 | 	W_TAX,
11 | 	W_YTD
12 | };
13 | enum {
14 | 	D_ID,
15 | 	D_W_ID,
16 | 	D_NAME,
17 | 	D_STREET_1,
18 | 	D_STREET_2,
19 | 	D_CITY,
20 | 	D_STATE,
21 | 	D_ZIP,
22 | 	D_TAX,
23 | 	D_YTD,
24 | 	D_NEXT_O_ID
25 | };
26 | enum {
27 | 	C_ID,
28 | 	C_D_ID,
29 | 	C_W_ID,
30 | 	C_MIDDLE,
31 | 	C_LAST,
32 | 	C_STATE,
33 | 	C_CREDIT,
34 | 	C_DISCOUNT,
35 | 	C_BALANCE,
36 | 	C_YTD_PAYMENT,
37 | 	C_PAYMENT_CNT
38 | };
39 | enum {
40 | 	H_C_ID,
41 | 	H_C_D_ID,
42 | 	H_C_W_ID,
43 | 	H_D_ID,
44 | 	H_W_ID,
45 | 	H_DATE,
46 | 	H_AMOUNT
47 | };
48 | enum {
49 | 	NO_O_ID,
50 | 	NO_D_ID,
51 | 	NO_W_ID
52 | };
53 | enum {
54 | 	O_ID,
55 | 	O_C_ID,
56 | 	O_D_ID,
57 | 	O_W_ID,
58 | 	O_ENTRY_D,
59 | 	O_CARRIER_ID,
60 | 	O_OL_CNT,
61 | 	O_ALL_LOCAL
62 | };
63 | enum {
64 | 	OL_O_ID,
65 | 	OL_D_ID,
66 | 	OL_W_ID,
67 | 	OL_NUMBER,
68 | 	OL_I_ID
69 | };
70 | enum {
71 | 	I_ID,
72 | 	I_IM_ID,
73 | 	I_NAME,
74 | 	I_PRICE,
75 | 	I_DATA
76 | };
77 | enum {
78 | 	S_I_ID,
79 | 	S_W_ID,
80 | 	S_QUANTITY,
81 | 	S_REMOTE_CNT
82 | };
83 | #else
84 | enum {
85 | 	W_ID,
86 | 	W_NAME,
87 | 	W_STREET_1,
88 | 	W_STREET_2,
89 | 	W_CITY,
90 | 	W_STATE,
91 | 	W_ZIP,
92 | 	W_TAX,
93 | 	W_YTD
94 | };
95 | enum {
96 | 	D_ID,
97 | 	D_W_ID,
98 | 	D_NAME,
99 | 	D_STREET_1,
100 | 	D_STREET_2,
101 | 	D_CITY,
102 | 	D_STATE,
103 | 	D_ZIP,
104 | 	D_TAX,
105 | 	D_YTD,
106 | 	D_NEXT_O_ID
107 | };
108 | enum {
109 | 	C_ID,
110 | 	C_D_ID,
111 | 	C_W_ID,
112 | 	C_FIRST,
113 | 	C_MIDDLE,
114 | 	C_LAST,
115 | 	C_STREET_1,
116 | 	C_STREET_2,
117 | 	C_CITY,
118 | 	C_STATE,
119 | 	C_ZIP,
120 | 	C_PHONE,
121 | 	C_SINCE,
122 | 	C_CREDIT,
123 | 	C_CREDIT_LIM,
124 | 	C_DISCOUNT,
125 | 	C_BALANCE,
126 | 	C_YTD_PAYMENT,
127 | 	C_PAYMENT_CNT,
128 | 	C_DELIVERY_CNT,
129 | 	C_DATA
130 | };
131 | enum {
132 | 	H_C_ID,
133 | 	H_C_D_ID,
134 | 	H_C_W_ID,
135 | 	H_D_ID,
136 | 	H_W_ID,
137 | 	H_DATE,
138 | 	H_AMOUNT,
139 | 	H_DATA
140 | };
141 | enum {
142 | 	NO_O_ID,
143 | 	NO_D_ID,
144 | 	NO_W_ID
145 | };
146 | enum {
147 | 	O_ID,
148 | 
O_C_ID, 149 | O_D_ID, 150 | O_W_ID, 151 | O_ENTRY_D, 152 | O_CARRIER_ID, 153 | O_OL_CNT, 154 | O_ALL_LOCAL 155 | }; 156 | enum { 157 | OL_O_ID, 158 | OL_D_ID, 159 | OL_W_ID, 160 | OL_NUMBER, 161 | OL_I_ID, 162 | OL_SUPPLY_W_ID, 163 | OL_DELIVERY_D, 164 | OL_QUANTITY, 165 | OL_AMOUNT, 166 | OL_DIST_INFO 167 | }; 168 | enum { 169 | I_ID, 170 | I_IM_ID, 171 | I_NAME, 172 | I_PRICE, 173 | I_DATA 174 | }; 175 | enum { 176 | S_I_ID, 177 | S_W_ID, 178 | S_QUANTITY, 179 | S_DIST_01, 180 | S_DIST_02, 181 | S_DIST_03, 182 | S_DIST_04, 183 | S_DIST_05, 184 | S_DIST_06, 185 | S_DIST_07, 186 | S_DIST_08, 187 | S_DIST_09, 188 | S_DIST_10, 189 | S_YTD, 190 | S_ORDER_CNT, 191 | S_REMOTE_CNT, 192 | S_DATA 193 | }; 194 | #endif 195 | 196 | -------------------------------------------------------------------------------- /benchmarks/tpcc_helper.cpp: -------------------------------------------------------------------------------- 1 | #include "tpcc_helper.h" 2 | 3 | drand48_data ** tpcc_buffer; 4 | 5 | uint64_t distKey(uint64_t d_id, uint64_t d_w_id) { 6 | return d_w_id * DIST_PER_WARE + d_id; 7 | } 8 | 9 | uint64_t custKey(uint64_t c_id, uint64_t c_d_id, uint64_t c_w_id) { 10 | return (distKey(c_d_id, c_w_id) * g_cust_per_dist + c_id); 11 | } 12 | 13 | uint64_t orderlineKey(uint64_t w_id, uint64_t d_id, uint64_t o_id) { 14 | return distKey(d_id, w_id) * g_cust_per_dist + o_id; 15 | } 16 | 17 | uint64_t orderPrimaryKey(uint64_t w_id, uint64_t d_id, uint64_t o_id) { 18 | return orderlineKey(w_id, d_id, o_id); 19 | } 20 | 21 | uint64_t custNPKey(char * c_last, uint64_t c_d_id, uint64_t c_w_id) { 22 | uint64_t key = 0; 23 | char offset = 'A'; 24 | for (uint32_t i = 0; i < strlen(c_last); i++) 25 | key = (key << 2) + (c_last[i] - offset); 26 | key = key << 3; 27 | key += c_w_id * DIST_PER_WARE + c_d_id; 28 | return key; 29 | } 30 | 31 | uint64_t stockKey(uint64_t s_i_id, uint64_t s_w_id) { 32 | return s_w_id * g_max_items + s_i_id; 33 | } 34 | 35 | uint64_t Lastname(uint64_t num, char* 
name) { 36 | static const char *n[] = 37 | {"BAR", "OUGHT", "ABLE", "PRI", "PRES", 38 | "ESE", "ANTI", "CALLY", "ATION", "EING"}; 39 | strcpy(name, n[num/100]); 40 | strcat(name, n[(num/10)%10]); 41 | strcat(name, n[num%10]); 42 | return strlen(name); 43 | } 44 | 45 | uint64_t RAND(uint64_t max, uint64_t thd_id) { 46 | int64_t rint64 = 0; 47 | lrand48_r(tpcc_buffer[thd_id], &rint64); 48 | return rint64 % max; 49 | } 50 | 51 | uint64_t URand(uint64_t x, uint64_t y, uint64_t thd_id) { 52 | return x + RAND(y - x + 1, thd_id); 53 | } 54 | 55 | uint64_t NURand(uint64_t A, uint64_t x, uint64_t y, uint64_t thd_id) { 56 | static bool C_255_init = false; 57 | static bool C_1023_init = false; 58 | static bool C_8191_init = false; 59 | static uint64_t C_255, C_1023, C_8191; 60 | int C = 0; 61 | switch(A) { 62 | case 255: 63 | if(!C_255_init) { 64 | C_255 = (uint64_t) URand(0,255, thd_id); 65 | C_255_init = true; 66 | } 67 | C = C_255; 68 | break; 69 | case 1023: 70 | if(!C_1023_init) { 71 | C_1023 = (uint64_t) URand(0,1023, thd_id); 72 | C_1023_init = true; 73 | } 74 | C = C_1023; 75 | break; 76 | case 8191: 77 | if(!C_8191_init) { 78 | C_8191 = (uint64_t) URand(0,8191, thd_id); 79 | C_8191_init = true; 80 | } 81 | C = C_8191; 82 | break; 83 | default: 84 | M_ASSERT(false, "Error! 
NURand\n"); 85 | exit(-1); 86 | } 87 | return(((URand(0,A, thd_id) | URand(x,y, thd_id))+C)%(y-x+1))+x; 88 | } 89 | 90 | uint64_t MakeAlphaString(int min, int max, char* str, uint64_t thd_id) { 91 | char char_list[] = {'1','2','3','4','5','6','7','8','9','a','b','c', 92 | 'd','e','f','g','h','i','j','k','l','m','n','o', 93 | 'p','q','r','s','t','u','v','w','x','y','z','A', 94 | 'B','C','D','E','F','G','H','I','J','K','L','M', 95 | 'N','O','P','Q','R','S','T','U','V','W','X','Y','Z'}; 96 | uint64_t cnt = URand(min, max, thd_id); 97 | for (uint32_t i = 0; i < cnt; i++) 98 | str[i] = char_list[URand(0L, 60L, thd_id)]; 99 | for (int i = cnt; i < max; i++) 100 | str[i] = '\0'; 101 | 102 | return cnt; 103 | } 104 | 105 | uint64_t MakeNumberString(int min, int max, char* str, uint64_t thd_id) { 106 | 107 | uint64_t cnt = URand(min, max, thd_id); 108 | for (UInt32 i = 0; i < cnt; i++) { 109 | uint64_t r = URand(0L,9L, thd_id); 110 | str[i] = '0' + r; 111 | } 112 | return cnt; 113 | } 114 | 115 | uint64_t wh_to_part(uint64_t wid) { 116 | assert(g_part_cnt <= g_num_wh); 117 | return wid % g_part_cnt; 118 | } 119 | -------------------------------------------------------------------------------- /benchmarks/tpcc_helper.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | #include "global.h" 3 | #include "helper.h" 4 | 5 | uint64_t distKey(uint64_t d_id, uint64_t d_w_id); 6 | uint64_t custKey(uint64_t c_id, uint64_t c_d_id, uint64_t c_w_id); 7 | uint64_t orderlineKey(uint64_t w_id, uint64_t d_id, uint64_t o_id); 8 | uint64_t orderPrimaryKey(uint64_t w_id, uint64_t d_id, uint64_t o_id); 9 | // non-primary key 10 | uint64_t custNPKey(char * c_last, uint64_t c_d_id, uint64_t c_w_id); 11 | uint64_t stockKey(uint64_t s_i_id, uint64_t s_w_id); 12 | 13 | uint64_t Lastname(uint64_t num, char* name); 14 | 15 | extern drand48_data ** tpcc_buffer; 16 | // return random data from [0, max-1] 17 | uint64_t RAND(uint64_t max, uint64_t 
thd_id); 18 | // random number from [x, y] 19 | uint64_t URand(uint64_t x, uint64_t y, uint64_t thd_id); 20 | // non-uniform random number 21 | uint64_t NURand(uint64_t A, uint64_t x, uint64_t y, uint64_t thd_id); 22 | // random string with random length between min and max. 23 | uint64_t MakeAlphaString(int min, int max, char * str, uint64_t thd_id); 24 | uint64_t MakeNumberString(int min, int max, char* str, uint64_t thd_id); 25 | 26 | uint64_t wh_to_part(uint64_t wid); 27 | -------------------------------------------------------------------------------- /benchmarks/tpcc_query.cpp: -------------------------------------------------------------------------------- 1 | #include "query.h" 2 | #include "tpcc_query.h" 3 | #include "tpcc.h" 4 | #include "tpcc_helper.h" 5 | #include "mem_alloc.h" 6 | #include "wl.h" 7 | #include "table.h" 8 | 9 | void tpcc_query::init(uint64_t thd_id, workload * h_wl) { 10 | // base_query init 11 | num_abort = 0; 12 | double y; 13 | drand48_r(&per_thread_rand_buf, &y); 14 | prio = y < HIGH_PRIO_RATIO ? ((SILO_PRIO_MAX_PRIO + 1) / 2) : 0; 15 | max_prio = y < HIGH_PRIO_RATIO ?
SILO_PRIO_MAX_PRIO : LOW_PRIO_BOUND; 16 | // tpcc query 17 | double x = (double)(rand() % 100) / 100.0; 18 | part_to_access = (uint64_t *) 19 | mem_allocator.alloc(sizeof(uint64_t) * g_part_cnt, thd_id); 20 | if (x < g_perc_payment) { 21 | gen_payment(thd_id); 22 | } else if (x < (g_perc_payment + g_perc_delivery)) 23 | gen_delivery(thd_id); 24 | else if (x < (g_perc_payment + g_perc_delivery + g_perc_orderstatus)) 25 | gen_order_status(thd_id); 26 | else if (x < (g_perc_payment + g_perc_delivery + g_perc_orderstatus + 27 | g_perc_stocklevel)) 28 | gen_stock_level(thd_id); 29 | else 30 | gen_new_order(thd_id); 31 | } 32 | 33 | void tpcc_query::gen_payment(uint64_t thd_id) { 34 | type = TPCC_PAYMENT; 35 | if (FIRST_PART_LOCAL) 36 | w_id = thd_id % g_num_wh + 1; 37 | else 38 | w_id = URand(1, g_num_wh, thd_id % g_num_wh); 39 | d_w_id = w_id; 40 | uint64_t part_id = wh_to_part(w_id); 41 | part_to_access[0] = part_id; 42 | part_num = 1; 43 | 44 | d_id = URand(1, DIST_PER_WARE, w_id-1); 45 | h_amount = URand(1, 5000, w_id-1); 46 | int x = URand(1, 100, w_id-1); 47 | int y = URand(1, 100, w_id-1); 48 | 49 | 50 | if(x <= 85) { 51 | // home warehouse 52 | c_d_id = d_id; 53 | c_w_id = w_id; 54 | } else { 55 | // remote warehouse 56 | c_d_id = URand(1, DIST_PER_WARE, w_id-1); 57 | if(g_num_wh > 1) { 58 | while((c_w_id = URand(1, g_num_wh, w_id-1)) == w_id) {} 59 | if (wh_to_part(w_id) != wh_to_part(c_w_id)) { 60 | part_to_access[1] = wh_to_part(c_w_id); 61 | part_num = 2; 62 | } 63 | } else 64 | c_w_id = w_id; 65 | } 66 | if(y <= 60) { 67 | // by last name 68 | by_last_name = true; 69 | Lastname(NURand(255,0,999,w_id-1),c_last); 70 | } else { 71 | // by cust id 72 | by_last_name = false; 73 | c_id = NURand(1023, 1, g_cust_per_dist,w_id-1); 74 | } 75 | } 76 | 77 | void tpcc_query::gen_new_order(uint64_t thd_id) { 78 | type = TPCC_NEW_ORDER; 79 | if (FIRST_PART_LOCAL) 80 | w_id = thd_id % g_num_wh + 1; 81 | else 82 | w_id = URand(1, g_num_wh, thd_id % g_num_wh); 83 | d_id = 
URand(1, DIST_PER_WARE, w_id-1); 84 | c_id = NURand(1023, 1, g_cust_per_dist, w_id-1); 85 | rbk = URand(1, 100, w_id-1); 86 | ol_cnt = URand(5, 15, w_id-1); 87 | o_entry_d = 2013; 88 | items = (Item_no *) _mm_malloc(sizeof(Item_no) * ol_cnt, 64); 89 | remote = false; 90 | part_to_access[0] = wh_to_part(w_id); 91 | part_num = 1; 92 | 93 | for (UInt32 oid = 0; oid < ol_cnt; oid ++) { 94 | items[oid].ol_i_id = NURand(8191, 1, g_max_items, w_id-1); 95 | #if TPCC_USER_ABORT 96 | // XXX(zhihan): 1% of the New-Order transactions are chosen at random to 97 | // simulate user data entry errors and exercise the performance of 98 | // rolling back update transactions. 99 | // If this is the last item on the order and rbk = 1 (chosen from [1, 100 | // 100]), then the item number is set to an unused value. 101 | if ((oid == ol_cnt - 1) && (rbk == 1)) { 102 | items[oid].ol_i_id = 0; 103 | } 104 | #endif 105 | UInt32 x = URand(1, 100, w_id-1); 106 | if (x > 1 || g_num_wh == 1) 107 | items[oid].ol_supply_w_id = w_id; 108 | else { 109 | while((items[oid].ol_supply_w_id = URand(1, g_num_wh, w_id-1)) == w_id) {} 110 | remote = true; 111 | } 112 | items[oid].ol_quantity = URand(1, 10, w_id-1); 113 | } 114 | // Remove duplicate items 115 | for (UInt32 i = 0; i < ol_cnt; i ++) { 116 | for (UInt32 j = 0; j < i; j++) { 117 | if (items[i].ol_i_id == items[j].ol_i_id) { 118 | for (UInt32 k = i; k < ol_cnt - 1; k++) 119 | items[k] = items[k + 1]; 120 | ol_cnt --; 121 | i--; 122 | } 123 | } 124 | } 125 | for (UInt32 i = 0; i < ol_cnt; i ++) 126 | for (UInt32 j = 0; j < i; j++) 127 | assert(items[i].ol_i_id != items[j].ol_i_id); 128 | // update part_to_access 129 | for (UInt32 i = 0; i < ol_cnt; i ++) { 130 | UInt32 j; 131 | for (j = 0; j < part_num; j++ ) 132 | if (part_to_access[j] == wh_to_part(items[i].ol_supply_w_id)) 133 | break; 134 | if (j == part_num) // not found! add to it. 
135 | part_to_access[part_num ++] = wh_to_part( items[i].ol_supply_w_id ); 136 | } 137 | } 138 | 139 | void 140 | tpcc_query::gen_order_status(uint64_t thd_id) { 141 | type = TPCC_ORDER_STATUS; 142 | if (FIRST_PART_LOCAL) 143 | w_id = thd_id % g_num_wh + 1; 144 | else 145 | w_id = URand(1, g_num_wh, thd_id % g_num_wh); 146 | d_id = URand(1, DIST_PER_WARE, w_id-1); 147 | c_w_id = w_id; 148 | c_d_id = d_id; 149 | int y = URand(1, 100, w_id-1); 150 | if(y <= 60) { 151 | // by last name 152 | by_last_name = true; 153 | Lastname(NURand(255,0,999,w_id-1),c_last); 154 | } else { 155 | // by cust id 156 | by_last_name = false; 157 | c_id = NURand(1023, 1, g_cust_per_dist, w_id-1); 158 | } 159 | } 160 | 161 | void 162 | tpcc_query::gen_delivery(uint64_t thd_id) { 163 | type = TPCC_DELIVERY; 164 | } 165 | 166 | void 167 | tpcc_query::gen_stock_level(uint64_t thd_id) { 168 | type = TPCC_STOCK_LEVEL; 169 | } 170 | -------------------------------------------------------------------------------- /benchmarks/tpcc_query.h: -------------------------------------------------------------------------------- 1 | #ifndef _TPCC_QUERY_H_ 2 | #define _TPCC_QUERY_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | #include "query.h" 7 | 8 | class workload; 9 | 10 | // items of new order transaction 11 | struct Item_no { 12 | uint64_t ol_i_id; 13 | uint64_t ol_supply_w_id; 14 | uint64_t ol_quantity; 15 | }; 16 | 17 | class tpcc_query : public base_query { 18 | public: 19 | void init(uint64_t thd_id, workload * h_wl); 20 | TPCCTxnType type; 21 | /**********************************************/ 22 | // common txn input for both payment & new-order 23 | /**********************************************/ 24 | uint64_t w_id; 25 | uint64_t d_id; 26 | uint64_t c_id; 27 | /**********************************************/ 28 | // txn input for payment 29 | /**********************************************/ 30 | uint64_t d_w_id; 31 | uint64_t c_w_id; 32 | uint64_t c_d_id; 33 | char 
c_last[LASTNAME_LEN]; 34 | double h_amount; 35 | bool by_last_name; 36 | /**********************************************/ 37 | // txn input for new-order 38 | /**********************************************/ 39 | Item_no * items; 40 | uint64_t rbk; 41 | bool remote; 42 | uint64_t ol_cnt; 43 | uint64_t o_entry_d; 44 | // Input for delivery 45 | uint64_t o_carrier_id; 46 | uint64_t ol_delivery_d; 47 | // for order-status 48 | 49 | 50 | private: 51 | // warehouse id to partition id mapping 52 | // uint64_t wh_to_part(uint64_t wid); 53 | void gen_payment(uint64_t thd_id); 54 | void gen_new_order(uint64_t thd_id); 55 | void gen_order_status(uint64_t thd_id); 56 | void gen_delivery(uint64_t thd_id); 57 | void gen_stock_level(uint64_t thd_id); 58 | }; 59 | 60 | #endif 61 | -------------------------------------------------------------------------------- /benchmarks/ycsb.h: -------------------------------------------------------------------------------- 1 | #ifndef _SYNTH_BM_H_ 2 | #define _SYNTH_BM_H_ 3 | 4 | #include "wl.h" 5 | #include "txn.h" 6 | #include "global.h" 7 | #include "helper.h" 8 | 9 | class ycsb_query; 10 | 11 | class ycsb_wl : public workload { 12 | public : 13 | RC init(); 14 | RC init_table(); 15 | RC init_schema(string schema_file); 16 | RC get_txn_man(txn_man *& txn_manager, thread_t * h_thd); 17 | int key_to_part(uint64_t key); 18 | INDEX * the_index; 19 | table_t * the_table; 20 | #if CC_ALG == IC3 21 | SC_PIECE * get_cedges(TPCCTxnType type, int idx) {return NULL;}; 22 | #endif 23 | private: 24 | void init_table_parallel(); 25 | void * init_table_slice(); 26 | static void * threadInitTable(void * This) { 27 | ((ycsb_wl *)This)->init_table_slice(); 28 | return NULL; 29 | } 30 | pthread_mutex_t insert_lock; 31 | // For parallel initialization 32 | static int next_tid; 33 | }; 34 | 35 | class ycsb_txn_man : public txn_man 36 | { 37 | public: 38 | void init(thread_t * h_thd, workload * h_wl, uint64_t part_id); 39 | RC exec_txn(base_query * query); 40 | 
private: 41 | #if CC_ALG != BAMBOO 42 | uint64_t row_cnt; 43 | #endif 44 | ycsb_wl * _wl; 45 | }; 46 | 47 | #endif 48 | -------------------------------------------------------------------------------- /benchmarks/ycsb_query.h: -------------------------------------------------------------------------------- 1 | #ifndef _YCSB_QUERY_H_ 2 | #define _YCSB_QUERY_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | #include "query.h" 7 | 8 | class workload; 9 | class Query_thd; 10 | // Each ycsb_query contains several ycsb_requests, 11 | // each of which is a RD, WR to a single table 12 | 13 | class ycsb_request { 14 | public: 15 | access_t rtype; 16 | uint64_t key; 17 | char value; 18 | // only for (qtype == SCAN) 19 | UInt32 scan_len; 20 | }; 21 | 22 | class ycsb_query : public base_query { 23 | public: 24 | void init(uint64_t thd_id, workload * h_wl) { assert(false); }; 25 | void init(uint64_t thd_id, workload * h_wl, Query_thd * query_thd); 26 | static void calculateDenom(); 27 | uint64_t get_new_row(); 28 | void gen_requests(uint64_t thd_id, workload * h_wl); 29 | 30 | uint64_t request_cnt; 31 | uint64_t local_req_per_query; 32 | bool is_long; 33 | double local_read_perc; 34 | ycsb_request * requests; 35 | 36 | private: 37 | // for Zipfian distribution 38 | static double zeta(uint64_t n, double theta); 39 | uint64_t zipf(uint64_t n, double theta); 40 | 41 | static uint64_t the_n; 42 | static double denom; 43 | double zeta_2_theta; 44 | Query_thd * _query_thd; 45 | }; 46 | 47 | #endif 48 | -------------------------------------------------------------------------------- /benchmarks/ycsb_txn.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "ycsb.h" 4 | #include "ycsb_query.h" 5 | #include "wl.h" 6 | #include "thread.h" 7 | #include "table.h" 8 | #include "row.h" 9 | #include "index_hash.h" 10 | #include "index_btree.h" 11 | #include "catalog.h" 12 | #include 
"manager.h" 13 | #include "row_lock.h" 14 | #include "row_ts.h" 15 | #include "row_mvcc.h" 16 | #include "mem_alloc.h" 17 | #include "query.h" 18 | void ycsb_txn_man::init(thread_t * h_thd, workload * h_wl, uint64_t thd_id) { 19 | txn_man::init(h_thd, h_wl, thd_id); 20 | _wl = (ycsb_wl *) h_wl; 21 | } 22 | 23 | RC ycsb_txn_man::exec_txn(base_query * query) { 24 | RC rc; 25 | ycsb_query * m_query = (ycsb_query *) query; 26 | ycsb_wl * wl = (ycsb_wl *) h_wl; 27 | itemid_t * m_item = NULL; 28 | #if CC_ALG == BAMBOO && (THREAD_CNT != 1) 29 | int access_id; 30 | retire_threshold = (uint32_t) floor(m_query->request_cnt * (1 - g_last_retire)); 31 | #else 32 | row_cnt = 0; 33 | #endif 34 | for (uint32_t rid = 0; rid < m_query->request_cnt; rid ++) { 35 | ycsb_request * req = &m_query->requests[rid]; 36 | int part_id = wl->key_to_part( req->key ); 37 | bool finish_req = false; 38 | UInt32 iteration = 0; 39 | while ( !finish_req ) { 40 | if (iteration == 0) { 41 | m_item = index_read(_wl->the_index, req->key, part_id); 42 | } 43 | #if INDEX_STRUCT == IDX_BTREE 44 | else { 45 | _wl->the_index->index_next(get_thd_id(), m_item); 46 | if (m_item == NULL) 47 | break; 48 | } 49 | #endif 50 | row_t * row = ((row_t *)m_item->location); 51 | row_t * row_local; 52 | access_t type = req->rtype; 53 | //printf("[txn-%lu] start %d requests at key %lu\n", get_txn_id(), rid, req->key); 54 | row_local = get_row(row, type); 55 | if (row_local == NULL) { 56 | rc = Abort; 57 | goto final; 58 | } 59 | #if CC_ALG == BAMBOO && (THREAD_CNT != 1) 60 | access_id = row_cnt - 1; 61 | #endif 62 | 63 | // Computation // 64 | // Only do computation when there are more than 1 requests. 
65 | if (m_query->request_cnt > 1) { 66 | if (req->rtype == RD || req->rtype == SCAN) { 67 | // for (int fid = 0; fid < schema->get_field_cnt(); fid++) { 68 | int fid = 0; 69 | char * data = row_local->get_data(); 70 | __attribute__((unused)) uint64_t fval = *(uint64_t *)(&data[fid * 10]); 71 | // } 72 | } else { 73 | assert(req->rtype == WR); 74 | // for (int fid = 0; fid < schema->get_field_cnt(); fid++) { 75 | int fid = 0; 76 | #if (CC_ALG == BAMBOO) || (CC_ALG == WOUND_WAIT) 77 | char * data = row_local->get_data(); 78 | #else 79 | char * data = row->get_data(); 80 | #endif 81 | *(uint64_t *)(&data[fid * 10]) = 0; 82 | // } 83 | } 84 | } 85 | 86 | 87 | iteration ++; 88 | if (req->rtype == RD || req->rtype == WR || iteration == req->scan_len) 89 | finish_req = true; 90 | #if (CC_ALG == BAMBOO) && (THREAD_CNT != 1) 91 | // retire write txn 92 | if (finish_req && (req->rtype == WR) && (rid <= retire_threshold)) { 93 | //printf("[txn-%lu] retire %d requests\n", get_txn_id(), rid); 94 | if (retire_row(access_id) == Abort) 95 | return Abort; 96 | } 97 | #endif 98 | } 99 | } 100 | rc = RCOK; 101 | final: 102 | return rc; 103 | } 104 | 105 | -------------------------------------------------------------------------------- /benchmarks/ycsb_wl.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include "global.h" 3 | #include "helper.h" 4 | #include "ycsb.h" 5 | #include "wl.h" 6 | #include "thread.h" 7 | #include "table.h" 8 | #include "row.h" 9 | #include "index_hash.h" 10 | #include "index_btree.h" 11 | #include "catalog.h" 12 | #include "manager.h" 13 | #include "row_lock.h" 14 | #include "row_ts.h" 15 | #include "row_mvcc.h" 16 | #include "mem_alloc.h" 17 | #include "query.h" 18 | 19 | int ycsb_wl::next_tid; 20 | 21 | RC ycsb_wl::init() { 22 | workload::init(); 23 | next_tid = 0; 24 | string path = "./benchmarks/YCSB_schema.txt"; 25 | init_schema( path ); 26 | 27 | init_table_parallel(); 28 | // init_table(); 29 | 
return RCOK; 30 | } 31 | 32 | RC ycsb_wl::init_schema(string schema_file) { 33 | workload::init_schema(schema_file); 34 | the_table = tables["MAIN_TABLE"]; 35 | the_index = indexes["MAIN_INDEX"]; 36 | return RCOK; 37 | } 38 | 39 | int 40 | ycsb_wl::key_to_part(uint64_t key) { 41 | uint64_t rows_per_part = g_synth_table_size / g_part_cnt; 42 | return key / rows_per_part; 43 | } 44 | 45 | RC ycsb_wl::init_table() { 46 | RC rc = RCOK; 47 | uint64_t total_row = 0; 48 | while (true) { 49 | for (UInt32 part_id = 0; part_id < g_part_cnt; part_id ++) { 50 | if (total_row > g_synth_table_size) 51 | goto ins_done; 52 | row_t * new_row = NULL; 53 | //zhihan 54 | uint64_t row_id = get_sys_clock(); 55 | rc = the_table->get_new_row(new_row, part_id, row_id); 56 | // TODO insertion of last row may fail after the table_size 57 | // is updated. So never access the last record in a table 58 | assert(rc == RCOK); 59 | uint64_t primary_key = total_row; 60 | new_row->set_primary_key(primary_key); 61 | new_row->set_value(0, &primary_key); 62 | Catalog * schema = the_table->get_schema(); 63 | for (UInt32 fid = 0; fid < schema->get_field_cnt(); fid ++) { 64 | int field_size = schema->get_field_size(fid); 65 | char value[field_size]; 66 | for (int i = 0; i < field_size; i++) 67 | value[i] = (char)rand() % (1<<8) ; 68 | new_row->set_value(fid, value); 69 | } 70 | itemid_t * m_item = 71 | (itemid_t *) mem_allocator.alloc( sizeof(itemid_t), part_id ); 72 | assert(m_item != NULL); 73 | m_item->type = DT_row; 74 | m_item->location = new_row; 75 | m_item->valid = true; 76 | uint64_t idx_key = primary_key; 77 | rc = the_index->index_insert(idx_key, m_item, part_id); 78 | assert(rc == RCOK); 79 | total_row ++; 80 | } 81 | } 82 | ins_done: 83 | printf("[YCSB] Table \"MAIN_TABLE\" initialized.\n"); 84 | return rc; 85 | 86 | } 87 | 88 | // init table in parallel 89 | void ycsb_wl::init_table_parallel() { 90 | enable_thread_mem_pool = true; 91 | pthread_t p_thds[g_init_parallelism - 1]; 92 | for 
(UInt32 i = 0; i < g_init_parallelism - 1; i++) 93 | pthread_create(&p_thds[i], NULL, threadInitTable, this); 94 | threadInitTable(this); 95 | 96 | for (uint32_t i = 0; i < g_init_parallelism - 1; i++) { 97 | int rc = pthread_join(p_thds[i], NULL); 98 | if (rc) { 99 | printf("ERROR; return code from pthread_join() is %d\n", rc); 100 | exit(-1); 101 | } 102 | } 103 | enable_thread_mem_pool = false; 104 | mem_allocator.unregister(); 105 | } 106 | 107 | void * ycsb_wl::init_table_slice() { 108 | UInt32 tid = ATOM_FETCH_ADD(next_tid, 1); 109 | // set cpu affinity 110 | set_affinity(tid); 111 | 112 | mem_allocator.register_thread(tid); 113 | assert(g_synth_table_size % g_init_parallelism == 0); 114 | assert(tid < g_init_parallelism); 115 | while ((UInt32)ATOM_FETCH_ADD(next_tid, 0) < g_init_parallelism) {} 116 | assert((UInt32)ATOM_FETCH_ADD(next_tid, 0) == g_init_parallelism); 117 | uint64_t slice_size = g_synth_table_size / g_init_parallelism; 118 | for (uint64_t key = slice_size * tid; 119 | key < slice_size * (tid + 1); 120 | key ++ 121 | ) { 122 | row_t * new_row = NULL; 123 | //zhihan uint64_t row_id; 124 | uint64_t row_id = get_sys_clock(); 125 | int part_id = key_to_part(key); 126 | #ifdef NDEBUG 127 | the_table->get_new_row(new_row, part_id, row_id); 128 | #else 129 | RC rc = the_table->get_new_row(new_row, part_id, row_id); 130 | #endif 131 | assert(rc == RCOK); 132 | uint64_t primary_key = key; 133 | new_row->set_primary_key(primary_key); 134 | new_row->set_value(0, &primary_key); 135 | Catalog * schema = the_table->get_schema(); 136 | 137 | for (UInt32 fid = 0; fid < schema->get_field_cnt(); fid ++) { 138 | char value[6] = "hello"; 139 | new_row->set_value(fid, value); 140 | } 141 | 142 | itemid_t * m_item = 143 | (itemid_t *) mem_allocator.alloc( sizeof(itemid_t), part_id ); 144 | assert(m_item != NULL); 145 | m_item->type = DT_row; 146 | m_item->location = new_row; 147 | m_item->valid = true; 148 | uint64_t idx_key = primary_key; 149 | #ifdef NDEBUG 150 | 
the_index->index_insert(idx_key, m_item, part_id); 151 | #else 152 | rc = the_index->index_insert(idx_key, m_item, part_id); 153 | #endif 154 | assert(rc == RCOK); 155 | } 156 | return NULL; 157 | } 158 | 159 | RC ycsb_wl::get_txn_man(txn_man *& txn_manager, thread_t * h_thd){ 160 | txn_manager = (ycsb_txn_man *) 161 | _mm_malloc( sizeof(ycsb_txn_man), 64 ); 162 | new(txn_manager) ycsb_txn_man(); 163 | txn_manager->init(h_thd, this, h_thd->get_thd_id()); 164 | return RCOK; 165 | } 166 | 167 | 168 | -------------------------------------------------------------------------------- /concurrency_control/aria.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #if CC_ALG == ARIA 4 | 5 | class base_query; 6 | 7 | // coordinate threads to agree on the batch and phase. 8 | namespace AriaCoord { 9 | 10 | void init(); 11 | void register_thread(uint64_t thd_id); 12 | bool start_exec_phase(uint64_t thd_id, uint64_t batch_id, bool sim_done); 13 | void start_commit_phase(uint64_t thd_id, uint64_t batch_id); 14 | 15 | } // namespace AriaCoord 16 | 17 | #endif 18 | -------------------------------------------------------------------------------- /concurrency_control/bamboo.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by Zhihan Guo on 8/27/20.
3 | // 4 | #include "txn.h" 5 | #include "row.h" 6 | #include "row_bamboo.h" 7 | //#include "row_bamboo_pt.h" 8 | 9 | #if CC_ALG == BAMBOO 10 | RC 11 | txn_man::retire_row(int access_cnt){ 12 | return accesses[access_cnt]->orig_row->retire_row(accesses[access_cnt]->lock_entry); 13 | } 14 | #endif 15 | 16 | void 17 | txn_man::decrement_commit_barriers() { 18 | //ATOM_SUB(*addr_barriers, 1UL << 2); 19 | ATOM_SUB(commit_barriers, 1UL << 2); 20 | } 21 | 22 | void 23 | txn_man::increment_commit_barriers() { 24 | //ATOM_ADD(*addr_barriers, 1UL << 2); 25 | ATOM_ADD(commit_barriers, 1UL << 2); 26 | } 27 | -------------------------------------------------------------------------------- /concurrency_control/dl_detect.cpp: -------------------------------------------------------------------------------- 1 | #include "dl_detect.h" 2 | #include "global.h" 3 | #include "helper.h" 4 | #include "txn.h" 5 | #include "row.h" 6 | #include "manager.h" 7 | #include "mem_alloc.h" 8 | 9 | /********************************************************/ 10 | // The current txn aborts itself only if it holds less 11 | // locks than all the other txns on the loop. 
12 | // In other words, the victim should be the txn that 13 | // performs the least amount of work 14 | /********************************************************/ 15 | void DL_detect::init() { 16 | dependency = new DepThd[g_thread_cnt]; 17 | V = g_thread_cnt; 18 | } 19 | 20 | int 21 | DL_detect::add_dep(uint64_t txnid1, uint64_t * txnids, int cnt, int num_locks) { 22 | if (g_no_dl) 23 | return 0; 24 | int thd1 = get_thdid_from_txnid(txnid1); 25 | pthread_mutex_lock( &dependency[thd1].lock ); 26 | dependency[thd1].txnid = txnid1; 27 | dependency[thd1].num_locks = num_locks; 28 | 29 | for (int i = 0; i < cnt; i++) 30 | dependency[thd1].adj.push_back(txnids[i]); 31 | 32 | pthread_mutex_unlock( &dependency[thd1].lock ); 33 | return 0; 34 | } 35 | 36 | bool 37 | DL_detect::nextNode(uint64_t txnid, DetectData * detect_data) { 38 | int thd = get_thdid_from_txnid(txnid); 39 | assert( !detect_data->visited[thd] ); 40 | detect_data->visited[thd] = true; 41 | detect_data->recStack[thd] = true; 42 | 43 | pthread_mutex_lock( &dependency[thd].lock ); 44 | 45 | int lock_num = dependency[thd].num_locks; 46 | int txnid_num = dependency[thd].adj.size(); 47 | uint64_t txnids[ txnid_num ]; 48 | int n = 0; 49 | 50 | if (dependency[thd].txnid != (SInt64)txnid) { 51 | detect_data->recStack[thd] = false; 52 | pthread_mutex_unlock( &dependency[thd].lock ); 53 | return false; 54 | } 55 | 56 | for(list<uint64_t>::iterator i = dependency[thd].adj.begin(); i != dependency[thd].adj.end(); ++i) { 57 | txnids[n++] = *i; 58 | } 59 | 60 | pthread_mutex_unlock( &dependency[thd].lock ); 61 | 62 | for (n = 0; n < txnid_num; n++) { 63 | int nextthd = get_thdid_from_txnid( txnids[n] ); 64 | 65 | // next node not visited and txnid is not stale 66 | if ( detect_data->recStack[nextthd] ) { 67 | if ((SInt32)txnids[n] == dependency[nextthd].txnid) { 68 | detect_data->loop = true; 69 | detect_data->onloop = true; 70 | detect_data->loopstart = nextthd; 71 | break; 72 | } 73 | } 74 | if ( !detect_data->visited[nextthd]
&& 75 | dependency[nextthd].txnid == (SInt64) txnids[n] && 76 | nextNode(txnids[n], detect_data)) 77 | { 78 | break; 79 | } 80 | } 81 | detect_data->recStack[thd] = false; 82 | if (detect_data->loop 83 | && detect_data->onloop 84 | && lock_num < detect_data->min_lock_num) { 85 | detect_data->min_lock_num = lock_num; 86 | detect_data->min_txnid = txnid; 87 | } 88 | if (thd == detect_data->loopstart) { 89 | detect_data->onloop = false; 90 | } 91 | return detect_data->loop; 92 | } 93 | 94 | // isCyclic returns true if there is a loop AND the current txn holds the least 95 | // number of locks on that loop. 96 | bool DL_detect::isCyclic(uint64_t txnid, DetectData * detect_data) { 97 | return nextNode(txnid, detect_data); 98 | } 99 | 100 | int 101 | DL_detect::detect_cycle(uint64_t txnid) { 102 | if (g_no_dl) 103 | return 0; 104 | uint64_t starttime = get_sys_clock(); 105 | INC_GLOB_STATS(cycle_detect, 1); 106 | bool deadlock = false; 107 | 108 | int thd = get_thdid_from_txnid(txnid); 109 | DetectData * detect_data = (DetectData *) 110 | mem_allocator.alloc(sizeof(DetectData), thd); 111 | detect_data->visited = (bool * ) 112 | mem_allocator.alloc(sizeof(bool) * V, thd); 113 | detect_data->recStack = (bool * ) 114 | mem_allocator.alloc(sizeof(bool) * V, thd); 115 | for(int i = 0; i < V; i++) { 116 | detect_data->visited[i] = false; 117 | detect_data->recStack[i] = false; 118 | } 119 | 120 | detect_data->min_lock_num = 1000; 121 | detect_data->min_txnid = -1; 122 | detect_data->loop = false; 123 | 124 | if ( isCyclic(txnid, detect_data) ){ 125 | deadlock = true; 126 | INC_GLOB_STATS(deadlock, 1); 127 | int thd_to_abort = get_thdid_from_txnid(detect_data->min_txnid); 128 | if (dependency[thd_to_abort].txnid == (SInt64) detect_data->min_txnid) { 129 | txn_man * txn = glob_manager->get_txn_man(thd_to_abort); 130 | txn->lock_abort = true; 131 | } 132 | } 133 | 134 | mem_allocator.free(detect_data->visited, sizeof(bool)*V); 135 | mem_allocator.free(detect_data->recStack,
sizeof(bool)*V); 136 | mem_allocator.free(detect_data, sizeof(DetectData)); 137 | uint64_t timespan = get_sys_clock() - starttime; 138 | INC_GLOB_STATS(dl_detect_time, timespan); 139 | if (deadlock) return 1; 140 | else return 0; 141 | } 142 | 143 | void DL_detect::clear_dep(uint64_t txnid) { 144 | if (g_no_dl) 145 | return; 146 | int thd = get_thdid_from_txnid(txnid); 147 | pthread_mutex_lock( &dependency[thd].lock ); 148 | 149 | dependency[thd].adj.clear(); 150 | dependency[thd].txnid = -1; 151 | dependency[thd].num_locks = 0; 152 | 153 | pthread_mutex_unlock( &dependency[thd].lock ); 154 | } 155 | 156 | -------------------------------------------------------------------------------- /concurrency_control/dl_detect.h: -------------------------------------------------------------------------------- 1 | #ifndef _DL_DETECT_ 2 | #define _DL_DETECT_ 3 | 4 | #include <list> 5 | #include <map> 6 | #include <set> 7 | #include "pthread.h" 8 | #include "config.h" 9 | //#include "global.h" 10 | //#include "helper.h" 11 | 12 | // The dependency information per thread 13 | struct DepThd { 14 | std::list<uint64_t> adj; // adjacency list: the txn ids this txn depends on 15 | pthread_mutex_t lock; 16 | volatile int64_t txnid; // -1 means invalid 17 | int num_locks; // the # of locks that txn is currently holding 18 | char pad[2 * CL_SIZE - sizeof(int64_t) - sizeof(pthread_mutex_t) - sizeof(std::list<uint64_t>) - sizeof(int)]; 19 | }; 20 | 21 | // shared data for a particular deadlock detection 22 | struct DetectData { 23 | bool * visited; 24 | bool * recStack; 25 | bool loop; 26 | bool onloop; // the current node is on the loop 27 | int loopstart; // the starting point of the loop 28 | int min_lock_num; // the min lock num for txn in the loop 29 | uint64_t min_txnid; // the txnid that holds the min lock num 30 | }; 31 | 32 | class DL_detect { 33 | public: 34 | void init(); 35 | // return values: 36 | // 0: no deadlocks 37 | // 1: deadlock exists 38 | int detect_cycle(uint64_t txnid); 39 | // txn1 (txn_id) depends on
txns (containing cnt txns) 40 | // return values: 41 | // 0: succeed. 42 | // 16: cannot get lock 43 | int add_dep(uint64_t txnid, uint64_t * txnids, int cnt, int num_locks); 44 | // remove all outbound dependencies for txnid. 45 | // will wait for the lock until acquired. 46 | void clear_dep(uint64_t txnid); 47 | private: 48 | int V; // No. of vertices 49 | DepThd * dependency; 50 | 51 | /////////////////////////////////////////// 52 | // For deadlock detection 53 | /////////////////////////////////////////// 54 | // dl_lock is the global lock. Only used when deadlock detection happens 55 | pthread_mutex_t _lock; 56 | // return value: whether a loop is detected. 57 | bool nextNode(uint64_t txnid, DetectData * detect_data); 58 | bool isCyclic(uint64_t txnid, DetectData * detect_data); // return if "thd" is causing a cycle 59 | }; 60 | 61 | #endif 62 | -------------------------------------------------------------------------------- /concurrency_control/hekaton.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_hekaton.h" 4 | #include "manager.h" 5 | 6 | #if CC_ALG==HEKATON 7 | 8 | RC 9 | txn_man::validate_hekaton(RC rc) 10 | { 11 | uint64_t starttime = get_sys_clock(); 12 | INC_STATS(get_thd_id(), debug1, get_sys_clock() - starttime); 13 | ts_t commit_ts = glob_manager->get_ts(get_thd_id()); 14 | // validate the read set. 
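The read-set validation that follows reduces to one visibility check per read: the version a txn read must still be visible at its commit timestamp. A minimal sketch of that check under stated assumptions — `VersionLife` and `read_still_valid` are illustrative names, not this repo's API:

```cpp
#include <cstdint>

// Hekaton-style read validation, reduced to its core check: a read is still
// valid at commit time iff commit_ts falls inside the lifetime [begin, end)
// of the version the txn originally read.
struct VersionLife {
    uint64_t begin;
    uint64_t end;   // UINT64_MAX means "still the latest version"
};

bool read_still_valid(const VersionLife& v, uint64_t commit_ts) {
    return v.begin <= commit_ts && commit_ts < v.end;
}
```

If the version has since been superseded (its `end` was set below `commit_ts`), the check fails and the transaction aborts, which is what `prepare_read` reports via `Abort`.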
15 | #if ISOLATION_LEVEL == SERIALIZABLE 16 | if (rc == RCOK) { 17 | for (int rid = 0; rid < row_cnt; rid ++) { 18 | if (accesses[rid]->type == WR) 19 | continue; 20 | rc = accesses[rid]->orig_row->manager->prepare_read(this, accesses[rid]->data, commit_ts); 21 | if (rc == Abort) 22 | break; 23 | } 24 | } 25 | #endif 26 | // postprocess 27 | for (int rid = 0; rid < row_cnt; rid ++) { 28 | if (accesses[rid]->type == RD) 29 | continue; 30 | accesses[rid]->orig_row->manager->post_process(this, commit_ts, rc); 31 | } 32 | return rc; 33 | } 34 | 35 | #endif 36 | -------------------------------------------------------------------------------- /concurrency_control/occ.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "txn.h" 4 | #include "occ.h" 5 | #include "manager.h" 6 | #include "mem_alloc.h" 7 | #include "row_occ.h" 8 | 9 | 10 | set_ent::set_ent() { 11 | set_size = 0; 12 | txn = NULL; 13 | rows = NULL; 14 | next = NULL; 15 | } 16 | 17 | void OptCC::init() { 18 | tnc = 0; 19 | his_len = 0; 20 | active_len = 0; 21 | active = NULL; 22 | lock_all = false; 23 | } 24 | 25 | RC OptCC::validate(txn_man * txn) { 26 | RC rc; 27 | #if PER_ROW_VALID 28 | rc = per_row_validate(txn); 29 | #else 30 | rc = central_validate(txn); 31 | #endif 32 | return rc; 33 | } 34 | 35 | RC 36 | OptCC::per_row_validate(txn_man * txn) { 37 | RC rc = RCOK; 38 | #if CC_ALG == OCC 39 | // sort all rows accessed in primary key order. 
40 | // TODO for migration, should first sort by partition id 41 | for (int i = txn->row_cnt - 1; i > 0; i--) { 42 | for (int j = 0; j < i; j ++) { 43 | int tabcmp = strcmp(txn->accesses[j]->orig_row->get_table_name(), 44 | txn->accesses[j+1]->orig_row->get_table_name()); 45 | if (tabcmp > 0 || (tabcmp == 0 && txn->accesses[j]->orig_row->get_primary_key() > txn->accesses[j+1]->orig_row->get_primary_key())) { 46 | Access * tmp = txn->accesses[j]; 47 | txn->accesses[j] = txn->accesses[j+1]; 48 | txn->accesses[j+1] = tmp; 49 | } 50 | } 51 | } 52 | #if DEBUG_ASSERT 53 | for (int i = txn->row_cnt - 1; i > 0; i--) { 54 | int tabcmp = strcmp(txn->accesses[i-1]->orig_row->get_table_name(), 55 | txn->accesses[i]->orig_row->get_table_name()); 56 | assert(tabcmp < 0 || tabcmp == 0 && txn->accesses[i]->orig_row->get_primary_key() > 57 | txn->accesses[i-1]->orig_row->get_primary_key()); 58 | } 59 | #endif 60 | // lock all rows in the readset and writeset. 61 | // Validate each access 62 | bool ok = true; 63 | int lock_cnt = 0; 64 | for (int i = 0; i < txn->row_cnt && ok; i++) { 65 | lock_cnt ++; 66 | txn->accesses[i]->orig_row->manager->latch(); 67 | ok = txn->accesses[i]->orig_row->manager->validate( txn->start_ts ); 68 | } 69 | if (ok) { 70 | // Validation passed. 71 | // advance the global timestamp and get the end_ts 72 | txn->end_ts = glob_manager->get_ts( txn->get_thd_id() ); 73 | // write to each row and update wts 74 | txn->cleanup(RCOK); 75 | rc = RCOK; 76 | } else { 77 | txn->cleanup(Abort); 78 | rc = Abort; 79 | } 80 | 81 | for (int i = 0; i < lock_cnt; i++) 82 | txn->accesses[i]->orig_row->manager->release(); 83 | #endif 84 | return rc; 85 | } 86 | 87 | RC OptCC::central_validate(txn_man * txn) { 88 | RC rc; 89 | uint64_t start_tn = txn->start_ts; 90 | uint64_t finish_tn; 91 | set_ent ** finish_active; 92 | uint64_t f_active_len; 93 | bool valid = true; 94 | // OptCC is centralized. No need to do per partition malloc. 
95 | set_ent * wset; 96 | set_ent * rset; 97 | get_rw_set(txn, rset, wset); 98 | bool readonly = (wset->set_size == 0); 99 | set_ent * his; 100 | set_ent * ent; 101 | int n = 0; 102 | 103 | pthread_mutex_lock( &latch ); 104 | finish_tn = tnc; 105 | ent = active; 106 | f_active_len = active_len; 107 | finish_active = (set_ent**) mem_allocator.alloc(sizeof(set_ent *) * f_active_len, 0); 108 | while (ent != NULL) { 109 | finish_active[n++] = ent; 110 | ent = ent->next; 111 | } 112 | if ( !readonly ) { 113 | active_len ++; 114 | STACK_PUSH(active, wset); 115 | } 116 | his = history; 117 | pthread_mutex_unlock( &latch ); 118 | if (finish_tn > start_tn) { 119 | while (his && his->tn > finish_tn) 120 | his = his->next; 121 | while (his && his->tn > start_tn) { 122 | valid = test_valid(his, rset); 123 | if (!valid) 124 | goto final; 125 | his = his->next; 126 | } 127 | } 128 | 129 | for (UInt32 i = 0; i < f_active_len; i++) { 130 | set_ent * wact = finish_active[i]; 131 | valid = test_valid(wact, rset); 132 | if (valid) { 133 | valid = test_valid(wact, wset); 134 | } if (!valid) 135 | goto final; 136 | } 137 | final: 138 | if (valid) 139 | txn->cleanup(RCOK); 140 | mem_allocator.free(rset, sizeof(set_ent)); 141 | 142 | if (!readonly) { 143 | // only update active & tnc for non-readonly transactions 144 | pthread_mutex_lock( &latch ); 145 | set_ent * act = active; 146 | set_ent * prev = NULL; 147 | while (act->txn != txn) { 148 | prev = act; 149 | act = act->next; 150 | } 151 | assert(act->txn == txn); 152 | if (prev != NULL) 153 | prev->next = act->next; 154 | else 155 | active = act->next; 156 | active_len --; 157 | if (valid) { 158 | if (history) 159 | assert(history->tn == tnc); 160 | tnc ++; 161 | wset->tn = tnc; 162 | STACK_PUSH(history, wset); 163 | his_len ++; 164 | } 165 | pthread_mutex_unlock( &latch ); 166 | } 167 | if (valid) { 168 | rc = RCOK; 169 | } else { 170 | txn->cleanup(Abort); 171 | rc = Abort; 172 | } 173 | return rc; 174 | } 175 | 176 | RC 
OptCC::get_rw_set(txn_man * txn, set_ent * &rset, set_ent *& wset) { 177 | wset = (set_ent*) mem_allocator.alloc(sizeof(set_ent), 0); 178 | rset = (set_ent*) mem_allocator.alloc(sizeof(set_ent), 0); 179 | wset->set_size = txn->wr_cnt; 180 | rset->set_size = txn->row_cnt - txn->wr_cnt; 181 | wset->rows = (row_t **) mem_allocator.alloc(sizeof(row_t *) * wset->set_size, 0); 182 | rset->rows = (row_t **) mem_allocator.alloc(sizeof(row_t *) * rset->set_size, 0); 183 | wset->txn = txn; 184 | rset->txn = txn; 185 | 186 | UInt32 n = 0, m = 0; 187 | for (int i = 0; i < txn->row_cnt; i++) { 188 | if (txn->accesses[i]->type == WR) 189 | wset->rows[n ++] = txn->accesses[i]->orig_row; 190 | else 191 | rset->rows[m ++] = txn->accesses[i]->orig_row; 192 | } 193 | 194 | assert(n == wset->set_size); 195 | assert(m == rset->set_size); 196 | return RCOK; 197 | } 198 | 199 | bool OptCC::test_valid(set_ent * set1, set_ent * set2) { 200 | for (UInt32 i = 0; i < set1->set_size; i++) 201 | for (UInt32 j = 0; j < set2->set_size; j++) { 202 | if (set1->rows[i] == set2->rows[j]) { 203 | return false; 204 | } 205 | } 206 | return true; 207 | } 208 | -------------------------------------------------------------------------------- /concurrency_control/occ.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "row.h" 4 | 5 | // TODO For simplicity, the txn history for OCC is organized as follows: 6 | // 1. history is never deleted. 7 | // 2. history forms a singly linked list. 8 | // history head -> hist_1 -> hist_2 -> hist_3 -> ... -> hist_n 9 | // The head is always the latest and the tail the oldest. 10 | // When history is traversed, always go from head -> tail order.
11 | 12 | class txn_man; 13 | 14 | class set_ent{ 15 | public: 16 | set_ent(); 17 | UInt64 tn; 18 | txn_man * txn; 19 | UInt32 set_size; 20 | row_t ** rows; 21 | set_ent * next; 22 | }; 23 | 24 | class OptCC { 25 | public: 26 | void init(); 27 | RC validate(txn_man * txn); 28 | volatile bool lock_all; 29 | uint64_t lock_txn_id; 30 | private: 31 | 32 | // per row validation similar to Hekaton. 33 | RC per_row_validate(txn_man * txn); 34 | 35 | // parallel validation in the original OCC paper. 36 | RC central_validate(txn_man * txn); 37 | bool test_valid(set_ent * set1, set_ent * set2); 38 | RC get_rw_set(txn_man * txni, set_ent * &rset, set_ent *& wset); 39 | 40 | // "history" stores write set of transactions with tn >= smallest running tn 41 | set_ent * history; 42 | set_ent * active; 43 | uint64_t his_len; 44 | uint64_t active_len; 45 | volatile uint64_t tnc; // transaction number counter 46 | pthread_mutex_t latch; 47 | }; 48 | -------------------------------------------------------------------------------- /concurrency_control/plock.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "plock.h" 4 | #include "mem_alloc.h" 5 | #include "txn.h" 6 | 7 | /************************************************/ 8 | // per-partition Manager 9 | /************************************************/ 10 | void PartMan::init() { 11 | uint64_t part_id = get_part_id(this); 12 | waiter_cnt = 0; 13 | owner = NULL; 14 | waiters = (txn_man **) 15 | mem_allocator.alloc(sizeof(txn_man *) * g_thread_cnt, part_id); 16 | pthread_mutex_init( &latch, NULL ); 17 | } 18 | 19 | RC PartMan::lock(txn_man * txn) { 20 | RC rc; 21 | 22 | pthread_mutex_lock( &latch ); 23 | if (owner == NULL) { 24 | owner = txn; 25 | rc = RCOK; 26 | } else if (owner->get_ts() < txn->get_ts()) { 27 | int i; 28 | assert(waiter_cnt < g_thread_cnt); 29 | for (i = waiter_cnt; i > 0; i--) { 30 | if (txn->get_ts() > waiters[i - 
1]->get_ts()) { 31 | waiters[i] = txn; 32 | break; 33 | } else 34 | waiters[i] = waiters[i - 1]; 35 | } 36 | if (i == 0) 37 | waiters[i] = txn; 38 | waiter_cnt ++; 39 | ATOM_ADD(txn->ready_part, 1); 40 | rc = WAIT; 41 | } else 42 | rc = Abort; 43 | pthread_mutex_unlock( &latch ); 44 | return rc; 45 | } 46 | 47 | void PartMan::unlock(txn_man * txn) { 48 | pthread_mutex_lock( &latch ); 49 | if (txn == owner) { 50 | if (waiter_cnt == 0) 51 | owner = NULL; 52 | else { 53 | owner = waiters[0]; 54 | for (UInt32 i = 0; i < waiter_cnt - 1; i++) { 55 | assert( waiters[i]->get_ts() < waiters[i + 1]->get_ts() ); 56 | waiters[i] = waiters[i + 1]; 57 | } 58 | waiter_cnt --; 59 | ATOM_SUB(owner->ready_part, 1); 60 | } 61 | } else { 62 | bool find = false; 63 | for (UInt32 i = 0; i < waiter_cnt; i++) { 64 | if (waiters[i] == txn) 65 | find = true; 66 | if (find && i < waiter_cnt - 1) 67 | waiters[i] = waiters[i + 1]; 68 | } 69 | ATOM_SUB(txn->ready_part, 1); 70 | assert(find); 71 | waiter_cnt --; 72 | } 73 | pthread_mutex_unlock( &latch ); 74 | } 75 | 76 | /************************************************/ 77 | // Partition Lock 78 | /************************************************/ 79 | 80 | void Plock::init() { 81 | ARR_PTR(PartMan, part_mans, g_part_cnt); 82 | for (UInt32 i = 0; i < g_part_cnt; i++) 83 | part_mans[i]->init(); 84 | } 85 | 86 | RC Plock::lock(txn_man * txn, uint64_t * parts, uint64_t part_cnt) { 87 | RC rc = RCOK; 88 | ts_t starttime = get_sys_clock(); 89 | UInt32 i; 90 | for (i = 0; i < part_cnt; i ++) { 91 | uint64_t part_id = parts[i]; 92 | rc = part_mans[part_id]->lock(txn); 93 | if (rc == Abort) 94 | break; 95 | } 96 | if (rc == Abort) { 97 | for (UInt32 j = 0; j < i; j++) { 98 | uint64_t part_id = parts[j]; 99 | part_mans[part_id]->unlock(txn); 100 | } 101 | assert(txn->ready_part == 0); 102 | INC_TMP_STATS(txn->get_thd_id(), time_man, get_sys_clock() - starttime); 103 | return Abort; 104 | } 105 | if (txn->ready_part > 0) { 106 | ts_t t = 
get_sys_clock(); 107 | while (txn->ready_part > 0) {} 108 | INC_TMP_STATS(txn->get_thd_id(), time_wait, get_sys_clock() - t); 109 | #if DEBUG_WW 110 | printf("[plock] increment time wait %lu\n", get_sys_clock() - t); 111 | #endif 112 | } 113 | assert(txn->ready_part == 0); 114 | INC_TMP_STATS(txn->get_thd_id(), time_man, get_sys_clock() - starttime); 115 | return RCOK; 116 | } 117 | 118 | void Plock::unlock(txn_man * txn, uint64_t * parts, uint64_t part_cnt) { 119 | ts_t starttime = get_sys_clock(); 120 | for (UInt32 i = 0; i < part_cnt; i ++) { 121 | uint64_t part_id = parts[i]; 122 | part_mans[part_id]->unlock(txn); 123 | } 124 | INC_TMP_STATS(txn->get_thd_id(), time_man, get_sys_clock() - starttime); 125 | } 126 | -------------------------------------------------------------------------------- /concurrency_control/plock.h: -------------------------------------------------------------------------------- 1 | #ifndef _PLOCK_H_ 2 | #define _PLOCK_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | 7 | class txn_man; 8 | 9 | // Partition manager for HSTORE 10 | class PartMan { 11 | public: 12 | void init(); 13 | RC lock(txn_man * txn); 14 | void unlock(txn_man * txn); 15 | private: 16 | pthread_mutex_t latch; 17 | txn_man * owner; 18 | txn_man ** waiters; 19 | UInt32 waiter_cnt; 20 | }; 21 | 22 | // Partition Level Locking 23 | class Plock { 24 | public: 25 | void init(); 26 | // lock all partitions in parts 27 | RC lock(txn_man * txn, uint64_t * parts, uint64_t part_cnt); 28 | void unlock(txn_man * txn, uint64_t * parts, uint64_t part_cnt); 29 | private: 30 | PartMan ** part_mans; 31 | }; 32 | 33 | #endif 34 | -------------------------------------------------------------------------------- /concurrency_control/row_aria.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "txn.h" 3 | #include "row.h" 4 | #include "row_aria.h" 5 | 6 | #if CC_ALG == ARIA 7 | 8 | void 9 | Row_aria::init(row_t *
row) { 10 | _row = row; 11 | _write_resv.store(0, std::memory_order_relaxed); 12 | #if ARIA_REORDER 13 | _read_resv.store(0, std::memory_order_relaxed); 14 | #endif 15 | } 16 | 17 | RC 18 | Row_aria::access(txn_man * txn, TsType type, row_t * local_row) { 19 | if (type != R_REQ) { 20 | if (!reserve_write(txn->batch_id, txn->prio, txn->get_txn_id())) 21 | return Abort; 22 | } 23 | #if ARIA_REORDER 24 | else { 25 | reserve_read(txn->batch_id, txn->prio, txn->get_txn_id()); 26 | } 27 | #endif 28 | // when in execution phase, everything is read-only except TID, so it is safe 29 | // to copy record data without any lock 30 | #if ARIA_NOCOPY_READ 31 | // no need to make a copy because the whole database is read-only 32 | if (type == R_REQ) return RCOK; 33 | #endif 34 | local_row->copy(_row); 35 | return RCOK; 36 | } 37 | 38 | void 39 | Row_aria::write(row_t * data) { 40 | _row->copy(data); 41 | } 42 | 43 | #endif 44 | -------------------------------------------------------------------------------- /concurrency_control/row_aria.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | class table_t; 4 | class Catalog; 5 | class txn_man; 6 | struct TsReqEntry; 7 | 8 | #include <atomic> 9 | 10 | #if CC_ALG == ARIA 11 | 12 | // we implement Aria's reservation as per-record TID 13 | union TID_aria_t { 14 | uint64_t raw_bits; 15 | struct { 16 | uint64_t batch_id : ARIA_NUM_BITS_BATCH_ID; 17 | uint32_t prio : ARIA_NUM_BITS_PRIO; 18 | uint64_t txn_id : ARIA_NUM_BITS_TXN_ID; // reserved by 19 | } tid_aria; 20 | TID_aria_t() = default; 21 | TID_aria_t(uint64_t tid_bits): raw_bits(tid_bits) {} 22 | TID_aria_t(uint64_t batch_id, uint32_t prio, uint64_t txn_id): \ 23 | tid_aria({batch_id, prio, txn_id}) {} 24 | }; 25 | 26 | class Row_aria { 27 | // txns are serialized in the order given by these comparison rules: 28 | // - if txn A has higher prio than txn B, A is serialized before B 29 | // - else if txn A has lower txn_id than txn B, A is
serialized before B 30 | static bool is_order_before(uint32_t lhs_prio, uint64_t lhs_txn_id, 31 | uint32_t rhs_prio, uint64_t rhs_txn_id) 32 | { 33 | if (lhs_prio != rhs_prio) 34 | return lhs_prio > rhs_prio; 35 | return lhs_txn_id < rhs_txn_id; 36 | } 37 | 38 | public: 39 | void init(row_t * row); 40 | RC access(txn_man * txn, TsType type, row_t * local_row); 41 | void write(row_t * data); 42 | 43 | bool reserve_write(uint64_t batch_id, uint32_t prio, uint64_t txn_id) { 44 | return reserve(_write_resv, batch_id, prio, txn_id); 45 | } 46 | bool validate_write(uint64_t batch_id, uint32_t prio, uint64_t txn_id) const { 47 | return validate(_write_resv, batch_id, prio, txn_id); 48 | } 49 | 50 | #if ARIA_REORDER 51 | void reserve_read(uint64_t batch_id, uint32_t prio, uint64_t txn_id) { 52 | // we don't care about the return value for a read reservation 53 | reserve(_read_resv, batch_id, prio, txn_id); 54 | } 55 | bool validate_read(uint64_t batch_id, uint32_t prio, uint64_t txn_id) const { 56 | return validate(_read_resv, batch_id, prio, txn_id); 57 | } 58 | #endif 59 | 60 | private: 61 | bool reserve(std::atomic<TID_aria_t>& resv, uint64_t batch_id, 62 | uint32_t prio, uint64_t txn_id) 63 | { 64 | TID_aria_t new_tid(batch_id, prio, txn_id); 65 | TID_aria_t v = resv.load(std::memory_order_relaxed); 66 | retry: 67 | // if no one has reserved this record yet in this batch, 68 | // OR the txn that previously reserved the record is not ordered 69 | // before the current one (in which case we preempt it) 70 | if (v.tid_aria.batch_id != batch_id \ 71 | || !is_order_before(v.tid_aria.prio, v.tid_aria.txn_id, prio, txn_id)) 72 | { 73 | if (!resv.compare_exchange_strong(v, new_tid, 74 | std::memory_order_relaxed, std::memory_order_relaxed)) 75 | goto retry; 76 | return true; 77 | } 78 | return false; 79 | } 80 | 81 | bool validate(const std::atomic<TID_aria_t>& resv, uint64_t batch_id, 82 | uint32_t prio, uint64_t txn_id) const 83 | { 84 | TID_aria_t v = resv.load(std::memory_order_relaxed); 85 | // compared
record's TID with txn's tid: 86 | // - if reserved by a txn from another batch, no one has reserved it in the 87 | // current batch; pass 88 | if (v.tid_aria.batch_id != batch_id) return true; 89 | // - else for a validation to pass, the txn that reserves the record must 90 | // not be serialized before the current txn 91 | return !is_order_before(v.tid_aria.prio, v.tid_aria.txn_id, prio, txn_id); 92 | } 93 | 94 | private: 95 | std::atomic<TID_aria_t> _write_resv; 96 | #if ARIA_REORDER 97 | std::atomic<TID_aria_t> _read_resv; 98 | #endif 99 | row_t * _row; 100 | }; 101 | 102 | #endif 103 | -------------------------------------------------------------------------------- /concurrency_control/row_hekaton.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | #include "row_mvcc.h" 3 | 4 | class table_t; 5 | class Catalog; 6 | class txn_man; 7 | 8 | // Only a constant number of versions can be maintained. 9 | // If a request accesses an old version that has been recycled, 10 | // simply abort the request.
11 | 12 | #if CC_ALG == HEKATON 13 | 14 | struct WriteHisEntry { 15 | bool begin_txn; 16 | bool end_txn; 17 | ts_t begin; 18 | ts_t end; 19 | row_t * row; 20 | }; 21 | 22 | #define INF UINT64_MAX 23 | 24 | class Row_hekaton { 25 | public: 26 | void init(row_t * row); 27 | RC access(txn_man * txn, TsType type, row_t * row); 28 | RC prepare_read(txn_man * txn, row_t * row, ts_t commit_ts); 29 | void post_process(txn_man * txn, ts_t commit_ts, RC rc); 30 | 31 | private: 32 | volatile bool blatch; 33 | uint32_t reserveRow(txn_man * txn); 34 | void doubleHistory(); 35 | 36 | uint32_t _his_latest; 37 | uint32_t _his_oldest; 38 | WriteHisEntry * _write_history; // circular buffer 39 | bool _exists_prewrite; 40 | 41 | uint32_t _his_len; 42 | }; 43 | 44 | #endif 45 | -------------------------------------------------------------------------------- /concurrency_control/row_ic3.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | class table_t; 4 | class Catalog; 5 | class txn_man; 6 | struct TsReqEntry; 7 | class Row_ic3; 8 | 9 | #define LOCK_BIT (1UL << 63) 10 | #if CC_ALG == IC3 11 | 12 | struct IC3LockEntry { 13 | access_t type; 14 | txn_man * txn; 15 | uint64_t txn_id; 16 | IC3LockEntry * prev; 17 | IC3LockEntry * next; 18 | }; 19 | 20 | class Cell_ic3 { 21 | public: 22 | void init(row_t * orig_row, uint64_t id); 23 | /* copy to corresponding col of local row */ 24 | void access(row_t * local_row, Access *txn_access); 25 | uint64_t get_tid() {return _tid;}; 26 | void add_to_acclist(txn_man * txn, access_t type); 27 | void rm_from_acclist(txn_man * txn, bool aborted); 28 | IC3LockEntry * get_last_writer(); 29 | IC3LockEntry * get_last_accessor(); 30 | bool try_lock(); 31 | void release(); 32 | void update_version(uint64_t txn_id) {_tid = txn_id;}; 33 | private: 34 | row_t * _row; 35 | Row_ic3 * row_manager; 36 | volatile uint64_t _tid; 37 | uint64_t idx; 38 | int acclist_cnt; 39 | IC3LockEntry * acclist; 40 | 
IC3LockEntry * acclist_tail; 41 | volatile int lock; 42 | /* 43 | #if LATCH == LH_SPINLOCK 44 | pthread_spinlock_t * latch; 45 | #elif LATCH == LH_MUTEX 46 | pthread_mutex_t * latch; 47 | #else 48 | mcslock * latch; 49 | #endif 50 | */ 51 | 52 | }; 53 | 54 | class Row_ic3 { 55 | public: 56 | void init(row_t * row); 57 | #if IC3_FIELD_LOCKING 58 | bool try_lock(uint64_t idx) {return cell_managers[idx].try_lock();}; 59 | void release(uint64_t idx) {cell_managers[idx].release();}; 60 | uint64_t get_tid(uint64_t idx) {return cell_managers[idx].get_tid();}; 61 | IC3LockEntry * get_last_writer(uint64_t idx) { 62 | return cell_managers[idx].get_last_writer();}; 63 | IC3LockEntry * get_last_accessor(uint64_t idx) { 64 | return cell_managers[idx].get_last_accessor();}; 65 | void add_to_acclist(uint64_t idx, txn_man * txn, access_t type) { 66 | cell_managers[idx].add_to_acclist(txn, type);}; 67 | void rm_from_acclist(uint64_t idx, txn_man * txn, bool aborted=false) { 68 | cell_managers[idx].rm_from_acclist(txn, aborted);}; 69 | void update_version(uint64_t idx, uint64_t txn_id) { 70 | cell_managers[idx].update_version(txn_id);}; 71 | void access(row_t * local_row, uint64_t idx, Access * txn_access) { 72 | cell_managers[idx].access(local_row, txn_access);}; 73 | #else // tuple-level locking 74 | bool try_lock(); 75 | uint64_t get_tid() {return _tid;}; 76 | IC3LockEntry * get_last_writer(); 77 | IC3LockEntry * get_last_accessor(); 78 | void release() {lock = 0;}; 79 | void add_to_acclist(txn_man * txn, access_t type); 80 | void rm_from_acclist(txn_man * txn, bool aborted=false); 81 | void update_version(uint64_t txn_id) {_tid = txn_id;}; 82 | void access(row_t * local_row, Access * txn_access); 83 | #endif 84 | row_t * _row; 85 | 86 | private: 87 | #if !IC3_FIELD_LOCKING 88 | volatile uint64_t _tid; 89 | uint64_t idx; 90 | int acclist_cnt; 91 | IC3LockEntry * acclist; 92 | IC3LockEntry * acclist_tail; 93 | volatile int lock; 94 | #else 95 | Cell_ic3 * cell_managers; 96 | 
#endif 97 | }; 98 | 99 | #endif 100 | -------------------------------------------------------------------------------- /concurrency_control/row_lock.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_LOCK_H 2 | #define ROW_LOCK_H 3 | 4 | struct LockEntry { 5 | txn_man * txn; 6 | Access * access; 7 | lock_t type; 8 | lock_status status; 9 | LockEntry * next; 10 | LockEntry * prev; 11 | LockEntry(txn_man * t, Access * a): txn(t), access(a), type(LOCK_NONE), 12 | status(LOCK_DROPPED), next(NULL), prev(NULL) {}; 13 | }; 14 | 15 | class Row_lock { 16 | public: 17 | void init(row_t * row); 18 | // [DL_DETECT] txnids are the txn_ids that current txn is waiting for. 19 | RC lock_get(lock_t type, txn_man * txn, Access * access); 20 | RC lock_get(lock_t type, txn_man * txn, uint64_t* &txnids, int &txncnt, Access * access); 21 | RC lock_release(LockEntry * entry); 22 | void lock(txn_man * txn); 23 | void unlock(txn_man * txn); 24 | 25 | private: 26 | #if LATCH == LH_SPINLOCK 27 | pthread_spinlock_t * latch; 28 | #elif LATCH == LH_MUTEX 29 | pthread_mutex_t * latch; 30 | #else 31 | mcslock * latch; 32 | #endif 33 | bool blatch; 34 | 35 | bool conflict_lock(lock_t l1, lock_t l2); 36 | static LockEntry * get_entry(Access * access); 37 | static void return_entry(LockEntry * entry); 38 | row_t * _row; 39 | lock_t lock_type; 40 | UInt32 owner_cnt; 41 | UInt32 waiter_cnt; 42 | 43 | // owners is a single linked list 44 | // waiters is a double linked list 45 | // [waiters] head is the oldest txn, tail is the youngest txn. 46 | // So new txns are inserted into the tail. 
47 | LockEntry * owners; 48 | LockEntry * waiters_head; 49 | LockEntry * waiters_tail; 50 | }; 51 | 52 | #endif 53 | -------------------------------------------------------------------------------- /concurrency_control/row_mvcc.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | class table_t; 4 | class Catalog; 5 | class txn_man; 6 | 7 | // Only a constant number of versions can be maintained. 8 | // If a request accesses an old version that has been recycled, 9 | // simply abort the request. 10 | 11 | #if CC_ALG == MVCC 12 | struct WriteHisEntry { 13 | bool valid; // whether the entry contains a valid version 14 | bool reserved; // when valid == false, whether the entry is reserved by a P_REQ 15 | ts_t ts; 16 | row_t * row; 17 | }; 18 | 19 | struct ReqEntry { 20 | bool valid; 21 | TsType type; // P_REQ or R_REQ 22 | ts_t ts; 23 | txn_man * txn; 24 | ts_t time; 25 | }; 26 | 27 | 28 | class Row_mvcc { 29 | public: 30 | void init(row_t * row); 31 | RC access(txn_man * txn, TsType type, row_t * row); 32 | private: 33 | pthread_mutex_t * latch; 34 | volatile bool blatch; 35 | 36 | row_t * _row; 37 | 38 | RC conflict(TsType type, ts_t ts, uint64_t thd_id = 0); 39 | void update_buffer(txn_man * txn, TsType type); 40 | void buffer_req(TsType type, txn_man * txn, bool served); 41 | 42 | // Invariant: all valid entries in _requests have greater ts than any entry in _write_history 43 | row_t * _latest_row; 44 | ts_t _latest_wts; 45 | ts_t _oldest_wts; 46 | WriteHisEntry * _write_history; 47 | // the following is a small optimization. 48 | // the timestamp for the served prewrite request. There should be at most one 49 | // served prewrite request. 50 | bool _exists_prewrite; 51 | ts_t _prewrite_ts; 52 | uint32_t _prewrite_his_id; 53 | ts_t _max_served_rts; 54 | 55 | // _requests only contains pending requests. 
56 | ReqEntry * _requests; 57 | uint32_t _his_len; 58 | uint32_t _req_len; 59 | // Invariant: _num_versions <= 4 60 | // Invariant: _num_prewrite_reservation <= 2 61 | uint32_t _num_versions; 62 | 63 | // list = 0: _write_history 64 | // list = 1: _requests 65 | void double_list(uint32_t list); 66 | row_t * reserveRow(ts_t ts, txn_man * txn); 67 | }; 68 | 69 | #endif 70 | -------------------------------------------------------------------------------- /concurrency_control/row_occ.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_occ.h" 4 | #include "mem_alloc.h" 5 | 6 | void 7 | Row_occ::init(row_t * row) { 8 | _row = row; 9 | int part_id = row->get_part_id(); 10 | _latch = (pthread_mutex_t *) 11 | mem_allocator.alloc(sizeof(pthread_mutex_t), part_id); 12 | pthread_mutex_init( _latch, NULL ); 13 | wts = 0; 14 | blatch = false; 15 | } 16 | 17 | RC 18 | Row_occ::access(txn_man * txn, TsType type) { 19 | RC rc = RCOK; 20 | pthread_mutex_lock( _latch ); 21 | if (type == R_REQ) { 22 | if (txn->start_ts < wts) 23 | rc = Abort; 24 | else { 25 | txn->cur_row->copy(_row); 26 | rc = RCOK; 27 | } 28 | } else 29 | assert(false); 30 | pthread_mutex_unlock( _latch ); 31 | return rc; 32 | } 33 | 34 | void 35 | Row_occ::latch() { 36 | pthread_mutex_lock( _latch ); 37 | } 38 | 39 | bool 40 | Row_occ::validate(uint64_t ts) { 41 | if (ts < wts) return false; 42 | else return true; 43 | } 44 | 45 | void 46 | Row_occ::write(row_t * data, uint64_t ts) { 47 | _row->copy(data); 48 | if (PER_ROW_VALID) { 49 | assert(ts > wts); 50 | wts = ts; 51 | } 52 | } 53 | 54 | void 55 | Row_occ::release() { 56 | pthread_mutex_unlock( _latch ); 57 | } 58 | -------------------------------------------------------------------------------- /concurrency_control/row_occ.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_OCC_H 2 | #define ROW_OCC_H 3 | 4 | class 
table_t; 5 | class Catalog; 6 | class txn_man; 7 | struct TsReqEntry; 8 | 9 | class Row_occ { 10 | public: 11 | void init(row_t * row); 12 | RC access(txn_man * txn, TsType type); 13 | void latch(); 14 | // ts is the start_ts of the validating txn 15 | bool validate(uint64_t ts); 16 | void write(row_t * data, uint64_t ts); 17 | void release(); 18 | private: 19 | pthread_mutex_t * _latch; 20 | bool blatch; 21 | 22 | row_t * _row; 23 | // the last update time 24 | ts_t wts; 25 | }; 26 | 27 | #endif 28 | -------------------------------------------------------------------------------- /concurrency_control/row_silo.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_silo.h" 4 | #include "mem_alloc.h" 5 | 6 | #if CC_ALG==SILO 7 | 8 | void 9 | Row_silo::init(row_t * row) 10 | { 11 | _row = row; 12 | #if ATOMIC_WORD 13 | _tid_word = 0; 14 | #else 15 | _latch = (pthread_mutex_t *) _mm_malloc(sizeof(pthread_mutex_t), 64); 16 | pthread_mutex_init( _latch, NULL ); 17 | _tid = 0; 18 | #endif 19 | } 20 | 21 | RC 22 | Row_silo::access(txn_man * txn, TsType type, row_t * local_row) { 23 | #if ATOMIC_WORD 24 | uint64_t v = 0; 25 | uint64_t v2 = 1; 26 | while (v2 != v) { 27 | v = _tid_word; 28 | while (v & LOCK_BIT) { 29 | PAUSE 30 | v = _tid_word; 31 | } 32 | local_row->copy(_row); 33 | COMPILER_BARRIER 34 | v2 = _tid_word; 35 | } 36 | txn->last_tid = v & (~LOCK_BIT); 37 | #else 38 | lock(); 39 | local_row->copy(_row); 40 | txn->last_tid = _tid; 41 | release(); 42 | #endif 43 | return RCOK; 44 | } 45 | 46 | bool 47 | Row_silo::validate(ts_t tid, bool in_write_set) { 48 | #if ATOMIC_WORD 49 | uint64_t v = _tid_word; 50 | if (in_write_set) 51 | return tid == (v & (~LOCK_BIT)); 52 | 53 | if (v & LOCK_BIT) 54 | return false; 55 | else if (tid != (v & (~LOCK_BIT))) 56 | return false; 57 | else 58 | return true; 59 | #else 60 | if (in_write_set) 61 | return tid == _tid; 62 | if (!try_lock()) 63 
| return false; 64 | bool valid = (tid == _tid); 65 | release(); 66 | return valid; 67 | #endif 68 | } 69 | 70 | void 71 | Row_silo::write(row_t * data, uint64_t tid) { 72 | _row->copy(data); 73 | #if ATOMIC_WORD 74 | uint64_t v = _tid_word; 75 | M_ASSERT(tid > (v & (~LOCK_BIT)) && (v & LOCK_BIT), "tid=%ld, v & LOCK_BIT=%ld, v & (~LOCK_BIT)=%ld\n", tid, (v & LOCK_BIT), (v & (~LOCK_BIT))); 76 | _tid_word = (tid | LOCK_BIT); 77 | #else 78 | _tid = tid; 79 | #endif 80 | } 81 | 82 | void 83 | Row_silo::lock() { 84 | #if ATOMIC_WORD 85 | uint64_t v = _tid_word; 86 | while ((v & LOCK_BIT) || !__sync_bool_compare_and_swap(&_tid_word, v, v | LOCK_BIT)) { 87 | PAUSE 88 | v = _tid_word; 89 | } 90 | #else 91 | pthread_mutex_lock( _latch ); 92 | #endif 93 | } 94 | 95 | void 96 | Row_silo::release() { 97 | #if ATOMIC_WORD 98 | assert(_tid_word & LOCK_BIT); 99 | _tid_word = _tid_word & (~LOCK_BIT); 100 | #else 101 | pthread_mutex_unlock( _latch ); 102 | #endif 103 | } 104 | 105 | bool 106 | Row_silo::try_lock() 107 | { 108 | #if ATOMIC_WORD 109 | uint64_t v = _tid_word; 110 | if (v & LOCK_BIT) // already locked 111 | return false; 112 | return __sync_bool_compare_and_swap(&_tid_word, v, (v | LOCK_BIT)); 113 | #else 114 | return pthread_mutex_trylock( _latch ) != EBUSY; 115 | #endif 116 | } 117 | 118 | uint64_t 119 | Row_silo::get_tid() 120 | { 121 | assert(ATOMIC_WORD); 122 | return _tid_word & (~LOCK_BIT); 123 | } 124 | 125 | #endif 126 | -------------------------------------------------------------------------------- /concurrency_control/row_silo.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | class table_t; 4 | class Catalog; 5 | class txn_man; 6 | struct TsReqEntry; 7 | 8 | #if CC_ALG==SILO 9 | #define LOCK_BIT (1UL << 63) 10 | 11 | class Row_silo { 12 | public: 13 | void init(row_t * row); 14 | RC access(txn_man * txn, TsType type, row_t * local_row); 15 | 16 | bool validate(ts_t tid, bool in_write_set); 17 | void 
write(row_t * data, uint64_t tid); 18 | 19 | void lock(); 20 | void release(); 21 | bool try_lock(); 22 | uint64_t get_tid(); 23 | 24 | void assert_lock() {assert(_tid_word & LOCK_BIT); } 25 | private: 26 | #if ATOMIC_WORD 27 | volatile uint64_t _tid_word; 28 | #else 29 | pthread_mutex_t * _latch; 30 | ts_t _tid; 31 | #endif 32 | row_t * _row; 33 | }; 34 | 35 | #endif 36 | -------------------------------------------------------------------------------- /concurrency_control/row_silo_prio.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "txn.h" 3 | #include "row.h" 4 | #include "row_silo_prio.h" 5 | #include "mem_alloc.h" 6 | #include <atomic> 7 | 8 | #if CC_ALG == SILO_PRIO 9 | 10 | void 11 | Row_silo_prio::init(row_t * row) 12 | { 13 | _row = row; 14 | _tid_word.store({0, 0}, std::memory_order_relaxed); 15 | } 16 | 17 | RC 18 | Row_silo_prio::access(txn_man * txn, TsType type, row_t * local_row) { 19 | TID_prio_t v, v2; 20 | const uint32_t prio = txn->prio; 21 | bool is_reserved; 22 | v = _tid_word.load(std::memory_order_relaxed); 23 | retry: 24 | while (v.is_locked()) { 25 | PAUSE 26 | v = _tid_word.load(std::memory_order_relaxed); 27 | } 28 | // for a write, abort if the current priority is higher 29 | if (prio < v.get_prio()) { 30 | if (type != R_REQ) return Abort; 31 | } 32 | v2 = v; 33 | is_reserved = v2.acquire_prio(prio); 34 | local_row->copy(_row); 35 | if (is_reserved) { 36 | if (!_tid_word.compare_exchange_strong(v, v2, std::memory_order_acq_rel, 37 | std::memory_order_acquire)) 38 | goto retry; 39 | } else { 40 | assert (v2 == v); 41 | v = _tid_word.load(std::memory_order_acquire); 42 | if (v != v2) 43 | goto retry; 44 | } 45 | txn->last_is_reserved = is_reserved; 46 | txn->last_data_ver = v2.get_data_ver(); 47 | if (is_reserved) txn->last_prio_ver = v2.get_prio_ver(); 48 | return RCOK; 49 | } 50 | 51 | void Row_silo_prio::write(row_t * data) { 52 | _row->copy(data); 53 | } 54 | 55 | #endif
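The lock-bit encoding that `Row_silo` packs into a single 64-bit word (and that `Row_silo_prio` extends with a priority field) can be sketched in isolation. The following is a minimal, self-contained illustration, not code from this repository: `TidWord`, `kLockBit`, and the method names are invented here, and C++11 `std::atomic` stands in for the repo's `volatile` word plus `__sync_bool_compare_and_swap`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Illustrative sketch of the ATOMIC_WORD scheme used by Row_silo:
// the top bit of a 64-bit word is the lock bit, the low 63 bits hold the TID.
constexpr uint64_t kLockBit = 1ULL << 63;

struct TidWord {
    std::atomic<uint64_t> word{0};

    // Spin until the lock bit is clear, then try to set it with a CAS
    // (mirrors the CAS loop in Row_silo::lock()).
    void lock() {
        uint64_t v = word.load(std::memory_order_relaxed);
        while ((v & kLockBit) ||
               !word.compare_exchange_weak(v, v | kLockBit,
                                           std::memory_order_acquire))
            v = word.load(std::memory_order_relaxed);
    }

    // Publish a new TID and release the lock in a single store, as
    // Row_silo::write() followed by Row_silo::release() do at commit.
    void unlock_with_tid(uint64_t tid) {
        assert(word.load(std::memory_order_relaxed) & kLockBit);
        word.store(tid & ~kLockBit, std::memory_order_release);
    }

    uint64_t tid() const {
        return word.load(std::memory_order_acquire) & ~kLockBit;
    }
    bool locked() const {
        return word.load(std::memory_order_acquire) & kLockBit;
    }
};
```

Publishing the new TID and clearing the lock bit in one store is what lets a concurrent OCC reader detect a version change by re-reading the word once after copying the row.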
56 | -------------------------------------------------------------------------------- /concurrency_control/row_tictoc.cpp: -------------------------------------------------------------------------------- 1 | #include "row_tictoc.h" 2 | #include "row.h" 3 | #include "txn.h" 4 | #include "mem_alloc.h" 5 | #include 6 | 7 | #if CC_ALG==TICTOC 8 | 9 | void 10 | Row_tictoc::init(row_t * row) 11 | { 12 | _row = row; 13 | #if ATOMIC_WORD 14 | _ts_word = 0; 15 | #else 16 | _latch = (pthread_mutex_t *) _mm_malloc(sizeof(pthread_mutex_t), 64); 17 | pthread_mutex_init( _latch, NULL ); 18 | _wts = 0; 19 | _rts = 0; 20 | #endif 21 | #if TICTOC_MV 22 | _hist_wts = 0; 23 | #endif 24 | } 25 | 26 | RC 27 | Row_tictoc::access(txn_man * txn, TsType type, row_t * local_row) 28 | { 29 | #if ATOMIC_WORD 30 | uint64_t v = 0; 31 | uint64_t v2 = 1; 32 | uint64_t lock_mask = LOCK_BIT; 33 | if (WRITE_PERMISSION_LOCK && type == P_REQ) 34 | lock_mask = WRITE_BIT; 35 | 36 | while ((v2 | RTS_MASK) != (v | RTS_MASK)) { 37 | v = _ts_word; 38 | while (v & lock_mask) { 39 | PAUSE 40 | v = _ts_word; 41 | } 42 | local_row->copy(_row); 43 | COMPILER_BARRIER 44 | v2 = _ts_word; 45 | #if WRITE_PERMISSION_LOCK 46 | if (type == R_REQ) { 47 | v |= WRITE_BIT; 48 | v2 |= WRITE_BIT; 49 | } 50 | #endif 51 | } 52 | txn->last_wts = v & WTS_MASK; 53 | txn->last_rts = ((v & RTS_MASK) >> WTS_LEN) + txn->last_wts; 54 | #else 55 | lock(); 56 | txn->last_wts = _wts; 57 | txn->last_rts = _rts; 58 | local_row->copy(_row); 59 | release(); 60 | #endif 61 | return RCOK; 62 | } 63 | 64 | void 65 | Row_tictoc::write_data(row_t * data, ts_t wts) 66 | { 67 | #if ATOMIC_WORD 68 | uint64_t v = _ts_word; 69 | #if TICTOC_MV 70 | _hist_wts = v & WTS_MASK; 71 | #endif 72 | #if WRITE_PERMISSION_LOCK 73 | assert(__sync_bool_compare_and_swap(&_ts_word, v, v | LOCK_BIT)); 74 | #endif 75 | v &= ~(RTS_MASK | WTS_MASK); // clear wts and rts. 
76 | v |= wts; 77 | _ts_word = v; 78 | _row->copy(data); 79 | #if WRITE_PERMISSION_LOCK 80 | _ts_word &= (~LOCK_BIT); 81 | #endif 82 | #else 83 | #if TICTOC_MV 84 | _hist_wts = _wts; 85 | #endif 86 | _wts = wts; 87 | _rts = wts; 88 | _row->copy(data); 89 | #endif 90 | } 91 | 92 | bool 93 | Row_tictoc::renew_lease(ts_t wts, ts_t rts) 94 | { 95 | #if !ATOMIC_WORD 96 | if (_wts != wts) { 97 | #if TICTOC_MV 98 | if (wts == _hist_wts && rts < _wts) 99 | return true; 100 | #endif 101 | return false; 102 | } 103 | _rts = rts; 104 | #endif 105 | return true; 106 | } 107 | 108 | bool 109 | Row_tictoc::try_renew(ts_t wts, ts_t rts, ts_t &new_rts, uint64_t thd_id) 110 | { 111 | #if ATOMIC_WORD 112 | uint64_t v = _ts_word; 113 | uint64_t lock_mask = (WRITE_PERMISSION_LOCK)? WRITE_BIT : LOCK_BIT; 114 | if ((v & WTS_MASK) == wts && ((v & RTS_MASK) >> WTS_LEN) >= rts - wts) 115 | return true; 116 | if (v & lock_mask) 117 | return false; 118 | #if TICTOC_MV 119 | COMPILER_BARRIER 120 | uint64_t hist_wts = _hist_wts; 121 | if (wts != (v & WTS_MASK)) { 122 | if (wts == hist_wts && rts < (v & WTS_MASK)) { 123 | return true; 124 | } else { 125 | return false; 126 | } 127 | } 128 | #else 129 | if (wts != (v & WTS_MASK)) 130 | return false; 131 | #endif 132 | 133 | ts_t delta_rts = rts - wts; 134 | if (delta_rts < ((v & RTS_MASK) >> WTS_LEN)) // the rts has already been extended. 
135 | return true; 136 | bool rebase = false; 137 | if (delta_rts >= (1 << RTS_LEN)) { 138 | rebase = true; 139 | uint64_t delta = (delta_rts & ~((1 << RTS_LEN) - 1)); 140 | delta_rts &= ((1 << RTS_LEN) - 1); 141 | wts += delta; 142 | } 143 | uint64_t v2 = 0; 144 | v2 |= wts; 145 | v2 |= (delta_rts << WTS_LEN); 146 | while (true) { 147 | uint64_t pre_v = __sync_val_compare_and_swap(&_ts_word, v, v2); 148 | if (pre_v == v) 149 | return true; 150 | v = pre_v; 151 | if (rebase || (v & lock_mask) || (wts != (v & WTS_MASK))) 152 | return false; 153 | else if (rts < ((v & RTS_MASK) >> WTS_LEN)) 154 | return true; 155 | } 156 | assert(false); 157 | return false; 158 | #else 159 | #if TICTOC_MV 160 | if (wts < _hist_wts) 161 | return false; 162 | #else 163 | if (wts != _wts) 164 | return false; 165 | #endif 166 | int ret = pthread_mutex_trylock( _latch ); 167 | if (ret == EBUSY) 168 | return false; 169 | 170 | if (wts != _wts) { 171 | #if TICTOC_MV 172 | if (wts == _hist_wts && rts < _wts) { 173 | pthread_mutex_unlock( _latch ); 174 | return true; 175 | } 176 | #endif 177 | pthread_mutex_unlock( _latch ); 178 | return false; 179 | } 180 | if (rts > _rts) 181 | _rts = rts; 182 | pthread_mutex_unlock( _latch ); 183 | new_rts = rts; 184 | return true; 185 | #endif 186 | } 187 | 188 | 189 | ts_t 190 | Row_tictoc::get_wts() 191 | { 192 | #if ATOMIC_WORD 193 | return _ts_word & WTS_MASK; 194 | #else 195 | return _wts; 196 | #endif 197 | } 198 | 199 | void 200 | Row_tictoc::get_ts_word(bool &lock, uint64_t &rts, uint64_t &wts) 201 | { 202 | assert(ATOMIC_WORD); 203 | uint64_t v = _ts_word; 204 | lock = ((v & LOCK_BIT) != 0); 205 | wts = v & WTS_MASK; 206 | rts = ((v & RTS_MASK) >> WTS_LEN) + (v & WTS_MASK); 207 | } 208 | 209 | ts_t 210 | Row_tictoc::get_rts() 211 | { 212 | #if ATOMIC_WORD 213 | uint64_t v = _ts_word; 214 | return ((v & RTS_MASK) >> WTS_LEN) + (v & WTS_MASK); 215 | #else 216 | return _rts; 217 | #endif 218 | 219 | } 220 | 221 | void 222 | Row_tictoc::lock() 223 | 
{ 224 | #if ATOMIC_WORD 225 | uint64_t lock_mask = (WRITE_PERMISSION_LOCK)? WRITE_BIT : LOCK_BIT; 226 | uint64_t v = _ts_word; 227 | while ((v & lock_mask) || !__sync_bool_compare_and_swap(&_ts_word, v, v | lock_mask)) { 228 | PAUSE 229 | v = _ts_word; 230 | } 231 | #else 232 | pthread_mutex_lock( _latch ); 233 | #endif 234 | } 235 | 236 | bool 237 | Row_tictoc::try_lock() 238 | { 239 | #if ATOMIC_WORD 240 | uint64_t lock_mask = (WRITE_PERMISSION_LOCK)? WRITE_BIT : LOCK_BIT; 241 | uint64_t v = _ts_word; 242 | if (v & lock_mask) // already locked 243 | return false; 244 | return __sync_bool_compare_and_swap(&_ts_word, v, v | lock_mask); 245 | #else 246 | return pthread_mutex_trylock( _latch ) != EBUSY; 247 | #endif 248 | } 249 | 250 | void 251 | Row_tictoc::release() 252 | { 253 | #if ATOMIC_WORD 254 | uint64_t lock_mask = (WRITE_PERMISSION_LOCK)? WRITE_BIT : LOCK_BIT; 255 | _ts_word &= (~lock_mask); 256 | #else 257 | pthread_mutex_unlock( _latch ); 258 | #endif 259 | } 260 | 261 | #endif 262 | -------------------------------------------------------------------------------- /concurrency_control/row_tictoc.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | #if CC_ALG == TICTOC 6 | 7 | #if WRITE_PERMISSION_LOCK 8 | 9 | #define LOCK_BIT (1UL << 63) 10 | #define WRITE_BIT (1UL << 62) 11 | #define RTS_LEN (15) 12 | #define WTS_LEN (62 - RTS_LEN) 13 | #define WTS_MASK ((1UL << WTS_LEN) - 1) 14 | #define RTS_MASK (((1UL << RTS_LEN) - 1) << WTS_LEN) 15 | 16 | #else 17 | 18 | #define LOCK_BIT (1UL << 63) 19 | #define WRITE_BIT (1UL << 63) 20 | #define RTS_LEN (15) 21 | #define WTS_LEN (63 - RTS_LEN) 22 | #define WTS_MASK ((1UL << WTS_LEN) - 1) 23 | #define RTS_MASK (((1UL << RTS_LEN) - 1) << WTS_LEN) 24 | 25 | #endif 26 | 27 | class txn_man; 28 | class row_t; 29 | 30 | class Row_tictoc { 31 | public: 32 | void init(row_t * row); 33 | RC access(txn_man * txn, TsType type, row_t * local_row); 
34 | #if SPECULATE 35 | RC write_speculate(row_t * data, ts_t version, bool spec_read); 36 | #endif 37 | void write_data(row_t * data, ts_t wts); 38 | void write_ptr(row_t * data, ts_t wts, char *& data_to_free); 39 | bool renew_lease(ts_t wts, ts_t rts); 40 | bool try_renew(ts_t wts, ts_t rts, ts_t &new_rts, uint64_t thd_id); 41 | 42 | void lock(); 43 | bool try_lock(); 44 | void release(); 45 | 46 | ts_t get_wts(); 47 | ts_t get_rts(); 48 | void get_ts_word(bool &lock, uint64_t &rts, uint64_t &wts); 49 | private: 50 | row_t * _row; 51 | #if ATOMIC_WORD 52 | volatile uint64_t _ts_word; 53 | #else 54 | ts_t _wts; // last write timestamp 55 | ts_t _rts; // end lease timestamp 56 | pthread_mutex_t * _latch; 57 | #endif 58 | #if TICTOC_MV 59 | volatile ts_t _hist_wts; 60 | #endif 61 | }; 62 | 63 | #endif 64 | -------------------------------------------------------------------------------- /concurrency_control/row_ts.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_TS_H 2 | #define ROW_TS_H 3 | 4 | class table_t; 5 | class Catalog; 6 | class txn_man; 7 | struct TsReqEntry { 8 | txn_man * txn; 9 | // for write requests, need to have a copy of the data to write. 
10 | row_t * row; 11 | itemid_t * item; 12 | ts_t ts; 13 | TsReqEntry * next; 14 | }; 15 | 16 | class Row_ts { 17 | public: 18 | void init(row_t * row); 19 | RC access(txn_man * txn, TsType type, row_t * row); 20 | 21 | private: 22 | pthread_mutex_t * latch; 23 | bool blatch; 24 | 25 | void buffer_req(TsType type, txn_man * txn, row_t * row); 26 | TsReqEntry * debuffer_req(TsType type, txn_man * txn); 27 | TsReqEntry * debuffer_req(TsType type, ts_t ts); 28 | TsReqEntry * debuffer_req(TsType type, txn_man * txn, ts_t ts); 29 | void update_buffer(); 30 | ts_t cal_min(TsType type); 31 | TsReqEntry * get_req_entry(); 32 | void return_req_entry(TsReqEntry * entry); 33 | void return_req_list(TsReqEntry * list); 34 | 35 | row_t * _row; 36 | ts_t wts; 37 | ts_t rts; 38 | ts_t min_wts; 39 | ts_t min_rts; 40 | ts_t min_pts; 41 | 42 | TsReqEntry * readreq; 43 | TsReqEntry * writereq; 44 | TsReqEntry * prereq; 45 | uint64_t preq_len; 46 | }; 47 | 48 | #endif 49 | -------------------------------------------------------------------------------- /concurrency_control/row_vll.cpp: -------------------------------------------------------------------------------- 1 | #include "row.h" 2 | #include "row_vll.h" 3 | #include "global.h" 4 | #include "helper.h" 5 | 6 | void 7 | Row_vll::init(row_t * row) { 8 | _row = row; 9 | cs = 0; 10 | cx = 0; 11 | } 12 | 13 | bool 14 | Row_vll::insert_access(access_t type) { 15 | if (type == RD) { 16 | cs ++; 17 | return (cx > 0); 18 | } else { 19 | cx ++; 20 | return (cx > 1) || (cs > 0); 21 | } 22 | } 23 | 24 | void 25 | Row_vll::remove_access(access_t type) { 26 | if (type == RD) { 27 | assert (cs > 0); 28 | cs --; 29 | } else { 30 | assert (cx > 0); 31 | cx --; 32 | } 33 | } 34 | -------------------------------------------------------------------------------- /concurrency_control/row_vll.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_VLL_H 2 | #define ROW_VLL_H 3 | 4 | class Row_vll { 5 | public: 
6 | void init(row_t * row); 7 | // return true : the access is blocked. 8 | // return false : the access is NOT blocked 9 | bool insert_access(access_t type); 10 | void remove_access(access_t type); 11 | int get_cs() { return cs; }; 12 | private: 13 | row_t * _row; 14 | int cs; 15 | int cx; 16 | }; 17 | 18 | #endif 19 | -------------------------------------------------------------------------------- /concurrency_control/row_ww.h: -------------------------------------------------------------------------------- 1 | #ifndef ROW_WW_H 2 | #define ROW_WW_H 3 | 4 | #include "row_lock.h" 5 | 6 | class Row_ww { 7 | public: 8 | void init(row_t * row); 9 | RC lock_get(lock_t type, txn_man * txn, Access * access); 10 | RC lock_get(lock_t type, txn_man * txn, uint64_t* &txnids, int &txncnt, Access * access); 11 | RC lock_release(LockEntry * entry); 12 | void lock(txn_man * txn); 13 | void unlock(txn_man * txn); 14 | 15 | private: 16 | #if LATCH == LH_SPINLOCK 17 | pthread_spinlock_t * latch; 18 | #elif LATCH == LH_MUTEX 19 | pthread_mutex_t * latch; 20 | #else 21 | mcslock * latch; 22 | #endif 23 | bool blatch; 24 | 25 | bool conflict_lock(lock_t l1, lock_t l2); 26 | static LockEntry * get_entry(Access * access); 27 | static void return_entry(LockEntry * entry); 28 | void bring_next(); 29 | 30 | row_t * _row; 31 | // owner's lock type 32 | lock_t lock_type; 33 | UInt32 owner_cnt; 34 | UInt32 waiter_cnt; 35 | 36 | // owners is a singly linked list 37 | // waiters is a doubly linked list 38 | // [waiters] head is the oldest txn, tail is the youngest txn. 39 | // New txns are inserted at the tail.
40 | LockEntry * owners; 41 | LockEntry * waiters_head; 42 | LockEntry * waiters_tail; 43 | }; 44 | 45 | #endif 46 | -------------------------------------------------------------------------------- /concurrency_control/silo.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_silo.h" 4 | 5 | #if CC_ALG == SILO 6 | 7 | RC 8 | txn_man::validate_silo() 9 | { 10 | RC rc = RCOK; 11 | // lock write tuples in the primary key order. 12 | int write_set[wr_cnt]; 13 | int cur_wr_idx = 0; 14 | int read_set[row_cnt - wr_cnt]; 15 | int cur_rd_idx = 0; 16 | for (int rid = 0; rid < row_cnt; rid ++) { 17 | if (accesses[rid]->type == WR) 18 | write_set[cur_wr_idx ++] = rid; 19 | else 20 | read_set[cur_rd_idx ++] = rid; 21 | } 22 | 23 | // bubble sort the write set, in primary key order 24 | for (int i = wr_cnt - 1; i >= 1; i--) { 25 | for (int j = 0; j < i; j++) { 26 | if (accesses[ write_set[j] ]->orig_row->get_primary_key() > 27 | accesses[ write_set[j + 1] ]->orig_row->get_primary_key()) 28 | { 29 | int tmp = write_set[j]; 30 | write_set[j] = write_set[j+1]; 31 | write_set[j+1] = tmp; 32 | } 33 | } 34 | } 35 | 36 | int num_locks = 0; 37 | ts_t max_tid = 0; 38 | bool done = false; 39 | if (_pre_abort) { 40 | for (int i = 0; i < wr_cnt; i++) { 41 | row_t * row = accesses[ write_set[i] ]->orig_row; 42 | if (row->manager->get_tid() != accesses[write_set[i]]->tid) { 43 | rc = Abort; 44 | goto final; 45 | } 46 | } 47 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 48 | Access * access = accesses[ read_set[i] ]; 49 | if (access->orig_row->manager->get_tid() != accesses[read_set[i]]->tid) { 50 | rc = Abort; 51 | goto final; 52 | } 53 | } 54 | } 55 | 56 | // lock all rows in the write set. 
57 | if (_validation_no_wait) { 58 | while (!done) { 59 | num_locks = 0; 60 | for (int i = 0; i < wr_cnt; i++) { 61 | row_t * row = accesses[ write_set[i] ]->orig_row; 62 | if (!row->manager->try_lock()) 63 | break; 64 | row->manager->assert_lock(); 65 | num_locks ++; 66 | if (row->manager->get_tid() != accesses[write_set[i]]->tid) 67 | { 68 | rc = Abort; 69 | goto final; 70 | } 71 | } 72 | if (num_locks == wr_cnt) 73 | done = true; 74 | else { 75 | for (int i = 0; i < num_locks; i++) 76 | accesses[ write_set[i] ]->orig_row->manager->release(); 77 | if (_pre_abort) { 78 | num_locks = 0; 79 | for (int i = 0; i < wr_cnt; i++) { 80 | row_t * row = accesses[ write_set[i] ]->orig_row; 81 | if (row->manager->get_tid() != accesses[write_set[i]]->tid) { 82 | rc = Abort; 83 | goto final; 84 | } 85 | } 86 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 87 | Access * access = accesses[ read_set[i] ]; 88 | if (access->orig_row->manager->get_tid() != accesses[read_set[i]]->tid) { 89 | rc = Abort; 90 | goto final; 91 | } 92 | } 93 | } 94 | PAUSE 95 | } 96 | } 97 | } else { 98 | for (int i = 0; i < wr_cnt; i++) { 99 | row_t * row = accesses[ write_set[i] ]->orig_row; 100 | row->manager->lock(); 101 | num_locks++; 102 | if (row->manager->get_tid() != accesses[write_set[i]]->tid) { 103 | rc = Abort; 104 | goto final; 105 | } 106 | } 107 | } 108 | 109 | // validate rows in the read set 110 | // for repeatable_read, no need to validate the read set. 
111 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 112 | Access * access = accesses[ read_set[i] ]; 113 | bool success = access->orig_row->manager->validate(access->tid, false); 114 | if (!success) { 115 | rc = Abort; 116 | goto final; 117 | } 118 | if (access->tid > max_tid) 119 | max_tid = access->tid; 120 | } 121 | // validate rows in the write set 122 | for (int i = 0; i < wr_cnt; i++) { 123 | Access * access = accesses[ write_set[i] ]; 124 | bool success = access->orig_row->manager->validate(access->tid, true); 125 | if (!success) { 126 | rc = Abort; 127 | goto final; 128 | } 129 | if (access->tid > max_tid) 130 | max_tid = access->tid; 131 | } 132 | if (max_tid > _cur_tid) 133 | _cur_tid = max_tid + 1; 134 | else 135 | _cur_tid ++; 136 | final: 137 | if (rc == Abort) { 138 | for (int i = 0; i < num_locks; i++) 139 | accesses[ write_set[i] ]->orig_row->manager->release(); 140 | cleanup(rc); 141 | } else { 142 | for (int i = 0; i < wr_cnt; i++) { 143 | Access * access = accesses[ write_set[i] ]; 144 | access->orig_row->manager->write( 145 | access->data, _cur_tid ); 146 | accesses[ write_set[i] ]->orig_row->manager->release(); 147 | } 148 | cleanup(rc); 149 | } 150 | return rc; 151 | } 152 | #endif 153 | -------------------------------------------------------------------------------- /concurrency_control/silo_prio.cpp: -------------------------------------------------------------------------------- 1 | #include "txn.h" 2 | #include "row.h" 3 | #include "row_silo_prio.h" 4 | 5 | #if CC_ALG == SILO_PRIO 6 | 7 | RC 8 | txn_man::validate_silo_prio() 9 | { 10 | RC rc = RCOK; 11 | // lock write tuples in the primary key order. 
12 | int cur_wr_idx = 0; 13 | int cur_rd_idx = 0; 14 | int write_set[wr_cnt]; 15 | int read_set[row_cnt - wr_cnt]; 16 | for (int rid = 0; rid < row_cnt; rid ++) { 17 | if (accesses[rid]->type == WR) 18 | write_set[cur_wr_idx ++] = rid; 19 | else 20 | read_set[cur_rd_idx ++] = rid; 21 | } 22 | 23 | // bubble sort the write set, in primary key order 24 | for (int i = wr_cnt - 1; i >= 1; i--) { 25 | for (int j = 0; j < i; j++) { 26 | if (accesses[ write_set[j] ]->orig_row->get_primary_key() > 27 | accesses[ write_set[j + 1] ]->orig_row->get_primary_key()) 28 | { 29 | int tmp = write_set[j]; 30 | write_set[j] = write_set[j+1]; 31 | write_set[j+1] = tmp; 32 | } 33 | } 34 | } 35 | 36 | int num_locks = 0; 37 | ts_t max_data_ver = 0; 38 | bool done = false; 39 | if (_pre_abort) { 40 | for (int i = 0; i < wr_cnt; i++) { 41 | row_t * row = accesses[ write_set[i] ]->orig_row; 42 | if (row->manager->get_data_ver() != accesses[write_set[i]]->data_ver) { 43 | rc = Abort; 44 | goto final; 45 | } 46 | } 47 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 48 | Access * access = accesses[ read_set[i] ]; 49 | if (access->orig_row->manager->get_data_ver() != accesses[read_set[i]]->data_ver) { 50 | rc = Abort; 51 | goto final; 52 | } 53 | } 54 | } 55 | 56 | // lock all rows in the write set. 
57 | if (_validation_no_wait) { 58 | while (!done) { 59 | num_locks = 0; 60 | for (int i = 0; i < wr_cnt; i++) { 61 | row_t * row = accesses[ write_set[i] ]->orig_row; 62 | if (row->manager->try_lock(prio) != Row_silo_prio::LOCK_STATUS::LOCK_DONE) 63 | break; 64 | row->manager->assert_lock(); 65 | num_locks ++; 66 | if (row->manager->get_data_ver() != accesses[write_set[i]]->data_ver) 67 | { 68 | rc = Abort; 69 | goto final; 70 | } 71 | } 72 | if (num_locks == wr_cnt) 73 | done = true; 74 | else { 75 | for (int i = 0; i < num_locks; i++) 76 | accesses[ write_set[i] ]->orig_row->manager->unlock(); 77 | if (_pre_abort) { 78 | num_locks = 0; 79 | for (int i = 0; i < wr_cnt; i++) { 80 | row_t * row = accesses[ write_set[i] ]->orig_row; 81 | if (row->manager->get_data_ver() != accesses[write_set[i]]->data_ver) { 82 | rc = Abort; 83 | goto final; 84 | } 85 | } 86 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 87 | Access * access = accesses[ read_set[i] ]; 88 | if (access->orig_row->manager->get_data_ver() != accesses[read_set[i]]->data_ver) { 89 | rc = Abort; 90 | goto final; 91 | } 92 | } 93 | } 94 | PAUSE 95 | } 96 | } 97 | } else { 98 | /** 99 | * This path does not work: releasing the latch requires resetting prio, 100 | * prio_ver, etc., so we simply disallow this operation. 101 | */ 102 | assert(false); 103 | for (int i = 0; i < wr_cnt; i++) { 104 | row_t * row = accesses[ write_set[i] ]->orig_row; 105 | Row_silo_prio::LOCK_STATUS ls = row->manager->lock(prio); 106 | if (ls == Row_silo_prio::LOCK_STATUS::LOCK_ERR_PRIO) { 107 | rc = Abort; 108 | goto final; 109 | } 110 | num_locks++; 111 | if (row->manager->get_data_ver() != accesses[write_set[i]]->data_ver) { 112 | rc = Abort; 113 | goto final; 114 | } 115 | } 116 | } 117 | 118 | // validate rows in the read set 119 | // for repeatable_read, no need to validate the read set.
120 | for (int i = 0; i < row_cnt - wr_cnt; i ++) { 121 | Access * access = accesses[ read_set[i] ]; 122 | bool success = access->orig_row->manager->validate(access->data_ver, false); 123 | if (!success) { 124 | rc = Abort; 125 | goto final; 126 | } 127 | if (access->data_ver > max_data_ver) 128 | max_data_ver = access->data_ver; 129 | } 130 | // validate rows in the write set 131 | for (int i = 0; i < wr_cnt; i++) { 132 | Access * access = accesses[ write_set[i] ]; 133 | bool success = access->orig_row->manager->validate(access->data_ver, true); 134 | if (!success) { 135 | rc = Abort; 136 | goto final; 137 | } 138 | if (access->data_ver > max_data_ver) 139 | max_data_ver = access->data_ver; 140 | } 141 | if (max_data_ver > _cur_data_ver) 142 | _cur_data_ver = max_data_ver + 1; 143 | else 144 | _cur_data_ver ++; 145 | final: 146 | // we release the priority and ref_cnt together with the latch 147 | // for those rows with latch acquired (all read-only row and some write rows) 148 | // we release them in cleanup() 149 | if (rc == Abort) { 150 | for (int i = 0; i < num_locks; i++) { 151 | Access * access = accesses[ write_set[i] ]; 152 | access->orig_row->manager->writer_release_abort(prio, access->prio_ver); 153 | assert(access->is_reserved || SILO_PRIO_NO_RESERVE_LOWEST_PRIO); 154 | access->is_reserved = false; 155 | } 156 | cleanup(rc); 157 | } else { 158 | for (int i = 0; i < wr_cnt; i++) { 159 | Access * access = accesses[ write_set[i] ]; 160 | access->orig_row->manager->write(access->data); 161 | access->orig_row->manager->writer_release_commit(_cur_data_ver); 162 | assert(access->is_reserved || SILO_PRIO_NO_RESERVE_LOWEST_PRIO); 163 | access->is_reserved = false; 164 | } 165 | cleanup(rc); 166 | } 167 | return rc; 168 | } 169 | #endif 170 | -------------------------------------------------------------------------------- /concurrency_control/vll.cpp: -------------------------------------------------------------------------------- 1 | #include "vll.h" 2 | 
#include "txn.h" 3 | #include "table.h" 4 | #include "row.h" 5 | #include "row_vll.h" 6 | #include "ycsb_query.h" 7 | #include "ycsb.h" 8 | #include "wl.h" 9 | #include "catalog.h" 10 | #include "mem_alloc.h" 11 | #if CC_ALG == VLL 12 | 13 | void 14 | VLLMan::init() { 15 | _txn_queue_size = 0; 16 | _txn_queue = NULL; 17 | _txn_queue_tail = NULL; 18 | } 19 | 20 | void 21 | VLLMan::vllMainLoop(txn_man * txn, base_query * query) { 22 | 23 | ycsb_query * m_query = (ycsb_query *) query; 24 | // access the indexes. This is not in the critical section 25 | for (int rid = 0; rid < m_query->request_cnt; rid ++) { 26 | ycsb_request * req = &m_query->requests[rid]; 27 | ycsb_wl * wl = (ycsb_wl *) txn->get_wl(); 28 | int part_id = wl->key_to_part( req->key ); 29 | INDEX * index = wl->the_index; 30 | itemid_t * item; 31 | item = txn->index_read(index, req->key, part_id); 32 | row_t * row = ((row_t *)item->location); 33 | // the following line adds the read/write sets to txn->accesses 34 | txn->get_row(row, req->rtype); 35 | int cs = row->manager->get_cs(); 36 | } 37 | 38 | bool done = false; 39 | while (!done) { 40 | txn_man * front_txn = NULL; 41 | uint64_t t5 = get_sys_clock(); 42 | pthread_mutex_lock(&_mutex); 43 | uint64_t tt5 = get_sys_clock() - t5; 44 | INC_STATS(txn->get_thd_id(), debug5, tt5); 45 | 46 | 47 | TxnQEntry * front = _txn_queue; 48 | if (front) 49 | front_txn = front->txn; 50 | // only one worker thread can execute the txn. 
51 | if (front_txn && front_txn->vll_txn_type == VLL_Blocked) { 52 | front_txn->vll_txn_type = VLL_Free; 53 | pthread_mutex_unlock(&_mutex); 54 | execute(front_txn, query); 55 | finishTxn( front_txn, front); 56 | } else { 57 | // _mutex will be unlocked in beginTxn() 58 | TxnQEntry * entry = NULL; 59 | int ok = beginTxn(txn, query, entry); 60 | if (ok == 2) { 61 | execute(txn, query); 62 | finishTxn(txn, entry); 63 | } 64 | assert(ok == 1 || ok == 2); 65 | done = true; 66 | } 67 | } 68 | return; 69 | } 70 | 71 | int 72 | VLLMan::beginTxn(txn_man * txn, base_query * query, TxnQEntry *& entry) { 73 | 74 | int ret = -1; 75 | if (_txn_queue_size >= TXN_QUEUE_SIZE_LIMIT) 76 | ret = 3; 77 | 78 | txn->vll_txn_type = VLL_Free; 79 | assert(WORKLOAD == YCSB); 80 | 81 | for (int rid = 0; rid < txn->row_cnt; rid ++ ) { 82 | access_t type = txn->accesses[rid]->type; 83 | if (txn->accesses[rid]->orig_row->manager->insert_access(type)) 84 | txn->vll_txn_type = VLL_Blocked; 85 | } 86 | 87 | entry = getQEntry(); 88 | LIST_PUT_TAIL(_txn_queue, _txn_queue_tail, entry); 89 | if (txn->vll_txn_type == VLL_Blocked) 90 | ret = 1; 91 | else 92 | ret = 2; 93 | pthread_mutex_unlock(&_mutex); 94 | return ret; 95 | } 96 | 97 | void 98 | VLLMan::execute(txn_man * txn, base_query * query) { 99 | RC rc; 100 | uint64_t t3 = get_sys_clock(); 101 | ycsb_query * m_query = (ycsb_query *) query; 102 | ycsb_wl * wl = (ycsb_wl *) txn->get_wl(); 103 | Catalog * schema = wl->the_table->get_schema(); 104 | uint64_t average; 105 | for (int rid = 0; rid < txn->row_cnt; rid ++) { 106 | row_t * row = txn->accesses[rid]->orig_row; 107 | access_t type = txn->accesses[rid]->type; 108 | if (type == RD) { 109 | for (int fid = 0; fid < schema->get_field_cnt(); fid++) { 110 | char * data = row->get_data(); 111 | uint64_t fval = *(uint64_t *)(&data[fid * 100]); 112 | } 113 | } else { 114 | assert(type == WR); 115 | for (int fid = 0; fid < schema->get_field_cnt(); fid++) { 116 | char * data = row->get_data(); 117 | 
*(uint64_t *)(&data[fid * 100]) = 0; 118 | } 119 | } 120 | } 121 | uint64_t tt3 = get_sys_clock() - t3; 122 | INC_STATS(txn->get_thd_id(), debug3, tt3); 123 | } 124 | 125 | void 126 | VLLMan::finishTxn(txn_man * txn, TxnQEntry * entry) { 127 | pthread_mutex_lock(&_mutex); 128 | 129 | for (int rid = 0; rid < txn->row_cnt; rid ++ ) { 130 | access_t type = txn->accesses[rid]->type; 131 | txn->accesses[rid]->orig_row->manager->remove_access(type); 132 | } 133 | LIST_REMOVE_HT(entry, _txn_queue, _txn_queue_tail); 134 | pthread_mutex_unlock(&_mutex); 135 | txn->release(); 136 | mem_allocator.free(txn, 0); 137 | } 138 | 139 | 140 | TxnQEntry * 141 | VLLMan::getQEntry() { 142 | TxnQEntry * entry = (TxnQEntry *) mem_allocator.alloc(sizeof(TxnQEntry), 0); 143 | entry->prev = NULL; 144 | entry->next = NULL; 145 | entry->txn = NULL; 146 | return entry; 147 | } 148 | 149 | void 150 | VLLMan::returnQEntry(TxnQEntry * entry) { 151 | mem_allocator.free(entry, sizeof(TxnQEntry)); 152 | } 153 | 154 | #endif 155 | -------------------------------------------------------------------------------- /concurrency_control/vll.h: -------------------------------------------------------------------------------- 1 | #ifndef _VLL_H_ 2 | #define _VLL_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | #include "query.h" 7 | 8 | class txn_man; 9 | 10 | class TxnQEntry { 11 | public: 12 | TxnQEntry * prev; 13 | TxnQEntry * next; 14 | txn_man * txn; 15 | }; 16 | 17 | class VLLMan { 18 | public: 19 | void init(); 20 | void vllMainLoop(txn_man * next_txn, base_query * query); 21 | // 1: txn is blocked 22 | // 2: txn is not blocked. Can run. 23 | // 3: txn_queue is full. 
24 | int beginTxn(txn_man * txn, base_query * query, TxnQEntry *& entry); 25 | void finishTxn(txn_man * txn, TxnQEntry * entry); 26 | void execute(txn_man * txn, base_query * query); 27 | private: 28 | TxnQEntry * _txn_queue; 29 | TxnQEntry * _txn_queue_tail; 30 | int _txn_queue_size; 31 | pthread_mutex_t _mutex; 32 | 33 | TxnQEntry * getQEntry(); 34 | void returnQEntry(TxnQEntry * entry); 35 | }; 36 | 37 | #endif 38 | -------------------------------------------------------------------------------- /config.cpp: -------------------------------------------------------------------------------- 1 | #include "config.h" 2 | 3 | TPCCTxnType g_tpcc_txn_type = TPCC_ALL; 4 | TestCases g_test_case = CONFLICT; 5 | 6 | -------------------------------------------------------------------------------- /experiments/debug.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 50000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 10, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "YCSB", 23 | "SYNTHETIC_YCSB": "true", 24 | "ZIPF_THETA": 0, 25 | "READ_PERC": 1, 26 | "POS_HS": "SPECIFIED", 27 | "SPECIFIED_RATIO": 1.0, 28 | "FLIP_RATIO": 0, 29 | "NUM_HS": 2, 30 | "FIRST_HS": "WR", 31 | "SECOND_HS": "WR", 32 | "FIXED_HS": 1, 33 | 34 | "LONG_TXN_RATIO": 0, 35 | "REQ_PER_QUERY": 16, 36 | "SYNTH_TABLE_SIZE": 100000, 37 | 38 | "UNSET_NUMA": "false", 39 | "COMPILE_ONLY": "false", 40 | "NDEBUG": "false" 41 | } 42 | -------------------------------------------------------------------------------- /experiments/default.json: 
-------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 50000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "true", 7 | "MAX_RUNTIME": 10, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "YCSB", 23 | "SYNTHETIC_YCSB": "false", 24 | "ZIPF_THETA": 0.99, 25 | "READ_PERC": 0.5, 26 | "LONG_TXN_RATIO": 0, 27 | "MAX_ROW_PER_TXN": 1000, 28 | "REQ_PER_QUERY": 16, 29 | "SYNTH_TABLE_SIZE": 10000000, 30 | 31 | "UNSET_NUMA": "false", 32 | "COMPILE_ONLY": "false", 33 | "NDEBUG": "true" 34 | } 35 | -------------------------------------------------------------------------------- /experiments/large_dataset.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 1000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 20, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "YCSB", 23 | "SYNTHETIC_YCSB": "false", 24 | "ZIPF_THETA": 0.99, 25 | "READ_PERC": 0.5, 26 | "LONG_TXN_RATIO": 0, 27 | "MAX_ROW_PER_TXN": 1000, 28 | "REQ_PER_QUERY": 16, 29 | "SYNTH_TABLE_SIZE": 100000000, 30 | 31 | "UNSET_NUMA": "false", 32 | "COMPILE_ONLY": "false", 33 | "NDEBUG": "true" 34 | } 35 | -------------------------------------------------------------------------------- 
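The experiment JSON files above are base configurations; the run scripts in this directory layer `KEY=VALUE` overrides on top of them when invoking `test.py` (e.g. `python3 test.py experiments/large_dataset.json THREAD_CNT=64 CC_ALG=SILO`). Below is a minimal sketch of how such overrides can be applied to a base config; the `load_config` helper is illustrative only, not the repo's actual `test.py` logic:

```python
# Sketch (assumption: not the repo's actual test.py): layer the KEY=VALUE
# overrides that the run scripts pass on the command line over a base
# experiment JSON, e.g.
#   python3 test.py experiments/large_dataset.json THREAD_CNT=64 CC_ALG=SILO
import json

def load_config(path, overrides):
    """Load a base JSON config, then apply KEY=VALUE overrides in order."""
    with open(path) as f:
        cfg = json.load(f)
    for kv in overrides:
        key, sep, value = kv.partition("=")
        assert sep, f"expected KEY=VALUE, got: {kv}"
        cfg[key] = value  # kept as a string, exactly as the shell passes it
    return cfg
```

Overridden values stay strings, mirroring how the shell scripts quote them; keys absent from the base file (such as `DUMP_LATENCY_FILENAME`) are simply added.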
/experiments/long_txn.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 100000, 4 | "ABORT_PENALTY": 1000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 10, 8 | "WARMUP": 100, 9 | 10 | "PF_BASIC": "true", 11 | "PF_CS": "false", 12 | "PF_ABORT": "false", 13 | 14 | "CC_ALG": "SILO_PRIO", 15 | "WW_STARV_FREE": "true", 16 | "BB_DYNAMIC_TS": "true", 17 | "BB_OPT_RAW": "true", 18 | "BB_LAST_RETIRE": 0, 19 | "BB_PRECOMMIT": "false", 20 | "BB_AUTORETIRE": "false", 21 | "BB_ALWAYS_RETIRE_READ": "true", 22 | 23 | "WORKLOAD": "YCSB", 24 | "SYNTHETIC_YCSB": "false", 25 | "ZIPF_THETA": 0.99, 26 | "READ_PERC": 0.5, 27 | "LONG_TXN_RATIO": 0.05, 28 | "LONG_TXN_READ_RATIO": 1, 29 | "MAX_ROW_PER_TXN": 1000, 30 | "REQ_PER_QUERY": 16, 31 | "SYNTH_TABLE_SIZE": 100000000, 32 | 33 | "UNSET_NUMA": "false", 34 | "COMPILE_ONLY": "false", 35 | "NDEBUG": "true" 36 | } 37 | -------------------------------------------------------------------------------- /experiments/run_all.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # assume the current working directory is `polaris/` 3 | bash experiments/run_ycsb_latency.sh # fig 1, 7 4 | bash experiments/run_ycsb_prio_sen.sh # fig 2 5 | bash experiments/run_ycsb_thread.sh # fig 3 6 | bash experiments/run_ycsb_readonly.sh # fig 4 7 | bash experiments/run_ycsb_zipf.sh # fig 5, 6 8 | bash experiments/run_tpcc_thread.sh # fig 8, 9 9 | bash experiments/run_ycsb_aria.sh # fig 10, 11 10 | -------------------------------------------------------------------------------- /experiments/run_tpcc_thread.sh: -------------------------------------------------------------------------------- 1 | exper=tpcc_thread 2 | mkdir -p results 3 | num_wh=1 4 | 5 | for thd in 1 4 8 16 24 32 40 48 56 64; do 6 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | 
data_dir="results/${exper}/TPCC-CC=${alg}-THD=${thd}-NUM_WH=${num_wh}" 8 | mkdir -p "${data_dir}" 9 | if [ "$thd" != "64" ] && [ "$thd" != "16" ]; then 10 | python3 test.py experiments/tpcc.json THREAD_CNT=${thd} CC_ALG=${alg} NUM_WH=${num_wh} | tee "${data_dir}/log" 11 | else 12 | python3 test.py experiments/tpcc.json THREAD_CNT=${thd} CC_ALG=${alg} NUM_WH=${num_wh} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 13 | fi 14 | done 15 | done 16 | 17 | num_wh=64 18 | 19 | for thd in 1 4 8 16 24 32 40 48 56 64; do 20 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 21 | data_dir="results/${exper}/TPCC-CC=${alg}-THD=${thd}-NUM_WH=${num_wh}" 22 | mkdir -p "${data_dir}" 23 | if [ "$thd" != "64" ] && [ "$thd" != "16" ]; then 24 | python3 test.py experiments/tpcc.json THREAD_CNT=${thd} CC_ALG=${alg} NUM_WH=${num_wh} | tee "${data_dir}/log" 25 | else 26 | python3 test.py experiments/tpcc.json THREAD_CNT=${thd} CC_ALG=${alg} NUM_WH=${num_wh} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 27 | fi 28 | done 29 | done 30 | 31 | python3 parse.py "${exper}" 32 | -------------------------------------------------------------------------------- /experiments/run_ycsb_aria.sh: -------------------------------------------------------------------------------- 1 | # Note: it turns out p999 of ARIA can vary a lot; to get relatively stable 2 | # results, we make the experiment duration longer and dump all latencies 3 | 4 | exper=ycsb_aria_batch 5 | mkdir -p results 6 | 7 | alg=ARIA 8 | for zipf in 0.99 0.5; do 9 | for thd in 1 4 8 16 24 32 40 48 56 64; do 10 | for batch in 1 2 4 8; do 11 | data_dir="results/${exper}/YCSB-CC=${alg}_${batch}-THD=${thd}-ZIPF=${zipf}" 12 | mkdir -p "${data_dir}" 13 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} ARIA_BATCH_SIZE=${batch} DUMP_LATENCY=true
DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" MAX_RUNTIME=40 | tee "${data_dir}/log" 14 | done 15 | done 16 | done 17 | 18 | # we then run SILO_PRIO for comparison 19 | alg=SILO_PRIO 20 | for zipf in 0.99 0.5; do 21 | for thd in 1 4 8 16 24 32 40 48 56 64; do 22 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 23 | mkdir -p "${data_dir}" 24 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" MAX_RUNTIME=40 | tee "${data_dir}/log" 25 | done 26 | done 27 | 28 | python3 parse.py "${exper}" 29 | -------------------------------------------------------------------------------- /experiments/run_ycsb_latency.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_latency 2 | mkdir -p results 3 | zipf=0.99 4 | thd=64 5 | 6 | for alg in SILO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 8 | mkdir -p "${data_dir}" 9 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 10 | done 11 | 12 | alg=SILO_PRIO 13 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 14 | mkdir -p "${data_dir}" 15 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} HIGH_PRIO_RATIO=0.05 DUMP_LATENCY=true SILO_PRIO_FIXED_PRIO=false LOW_PRIO_BOUND=7 DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 16 | 17 | alg=SILO_PRIO_FIXED 18 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 19 | mkdir -p "${data_dir}" 20 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=SILO_PRIO HIGH_PRIO_RATIO=0.05 DUMP_LATENCY=true SILO_PRIO_FIXED_PRIO=true
DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 21 | 22 | python3 parse.py "${exper}" 23 | -------------------------------------------------------------------------------- /experiments/run_ycsb_prio_sen.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_prio_sen 2 | mkdir -p results 3 | zipf=0.99 4 | thd=64 5 | alg=SILO_PRIO 6 | 7 | for pr in 0 0.05 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1; do 8 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}-PRIO_RATIO=${pr}" 9 | mkdir -p "${data_dir}" 10 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} SILO_PRIO_FIXED_PRIO=true HIGH_PRIO_RATIO=${pr} | tee "${data_dir}/log" 11 | done 12 | 13 | python3 parse.py "${exper}" 14 | -------------------------------------------------------------------------------- /experiments/run_ycsb_readonly.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_readonly 2 | mkdir -p results 3 | zipf=0.99 4 | 5 | for thd in 1 4 8 16 24 32 40 48 56 64; do 6 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 8 | mkdir -p "${data_dir}" 9 | if [ "$thd" != "64" ] && [ "$thd" != "16" ]; then 10 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} READ_PERC=1 | tee "${data_dir}/log" 11 | else 12 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} READ_PERC=1 DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 13 | fi 14 | done 15 | done 16 | 17 | python3 parse.py "${exper}" 18 | -------------------------------------------------------------------------------- /experiments/run_ycsb_thread.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_thread 2 | mkdir 
-p results 3 | zipf=0.99 4 | 5 | for thd in 1 4 8 16 24 32 40 48 56 64; do 6 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 8 | mkdir -p "${data_dir}" 9 | if [ "$thd" != "64" ] && [ "$thd" != "16" ]; then 10 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} | tee "${data_dir}/log" 11 | else 12 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 13 | fi 14 | done 15 | done 16 | 17 | python3 parse.py "${exper}" 18 | -------------------------------------------------------------------------------- /experiments/run_ycsb_zipf.sh: -------------------------------------------------------------------------------- 1 | exper=ycsb_zipf 2 | mkdir -p results 3 | thd=64 4 | 5 | for zipf in 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.99 1.1 1.2 1.3 1.4 1.5; do 6 | for alg in SILO SILO_PRIO WOUND_WAIT NO_WAIT WAIT_DIE; do 7 | data_dir="results/${exper}/YCSB-CC=${alg}-THD=${thd}-ZIPF=${zipf}" 8 | mkdir -p "${data_dir}" 9 | if [ "$zipf" != "0" ] && [ "$zipf" != "0.9" ] && [ "$zipf" != "0.99" ] && [ "$zipf" != "1.5" ]; then 10 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} | tee "${data_dir}/log" 11 | else 12 | python3 test.py experiments/large_dataset.json THREAD_CNT=${thd} ZIPF_THETA=${zipf} CC_ALG=${alg} DUMP_LATENCY=true DUMP_LATENCY_FILENAME="\"${data_dir}/latency_dump.csv\"" | tee "${data_dir}/log" 13 | fi 14 | done 15 | done 16 | 17 | python3 parse.py "${exper}" 18 | -------------------------------------------------------------------------------- /experiments/synthetic_ycsb.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 32, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 50000, 5 | 
"LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 10, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "YCSB", 23 | "SYNTHETIC_YCSB": "true", 24 | "ZIPF_THETA": 0, 25 | "READ_PERC": 1, 26 | "POS_HS": "SPECIFIED", 27 | "SPECIFIED_RATIO": 0, 28 | "FLIP_RATIO": 0, 29 | "NUM_HS": 1, 30 | "FIRST_HS": "WR", 31 | "SECOND_HS": "WR", 32 | "FIXED_HS": 0, 33 | 34 | "LONG_TXN_RATIO": 0, 35 | "REQ_PER_QUERY": 16, 36 | "SYNTH_TABLE_SIZE": 100000000, 37 | 38 | "UNSET_NUMA": "false", 39 | "COMPILE_ONLY": "false", 40 | "NDEBUG": "true" 41 | } 42 | -------------------------------------------------------------------------------- /experiments/tpcc.json: -------------------------------------------------------------------------------- 1 | { 2 | "THREAD_CNT": 16, 3 | "MAX_TXN_PER_PART": 1000000, 4 | "ABORT_PENALTY": 1000, 5 | "LATCH": "LH_MCSLOCK", 6 | "TERMINATE_BY_COUNT": "false", 7 | "MAX_RUNTIME": 20, 8 | 9 | "PF_BASIC": "true", 10 | "PF_CS": "false", 11 | "PF_ABORT": "false", 12 | 13 | "CC_ALG": "SILO_PRIO", 14 | "WW_STARV_FREE": "true", 15 | "BB_DYNAMIC_TS": "true", 16 | "BB_OPT_RAW": "true", 17 | "BB_LAST_RETIRE": 0, 18 | "BB_PRECOMMIT": "false", 19 | "BB_AUTORETIRE": "false", 20 | "BB_ALWAYS_RETIRE_READ": "true", 21 | 22 | "WORKLOAD": "TPCC", 23 | "NUM_WH": 1, 24 | "TPCC_USER_ABORT": "true", 25 | 26 | "UNSET_NUMA": "false", 27 | "COMPILE_ONLY": "false", 28 | "NDEBUG": "true" 29 | } 30 | -------------------------------------------------------------------------------- /libs/libjemalloc.a: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/chenhao-ye/polaris/d4d7b7b4ba41ae2ac401efdbbf7f64cc4dc961a9/libs/libjemalloc.a -------------------------------------------------------------------------------- /outputs/collect_stats.py: -------------------------------------------------------------------------------- 1 | import json 2 | import pandas as pd 3 | import numpy as np 4 | import sys 5 | 6 | data = [] 7 | if len(sys.argv) > 1: 8 | fname = sys.argv[1] 9 | else: 10 | fname = "stats.json" 11 | f = open(fname) 12 | for line in f: 13 | data.append(json.loads(line.strip())) 14 | df = pd.DataFrame(data=data) 15 | df.to_csv("stats.csv", index=False) 16 | ''' 17 | df = df.dropna(how='all',axis=1) 18 | summarized = [] 19 | idx = np.where(df.columns.values == 'abort_cnt')[0][0] 20 | for cls, group in df.groupby(list(df.columns[:idx])): 21 | summarized.append(group.loc[group['throughput'].idxmax()]) 22 | summarized = pd.DataFrame(summarized, columns = df.columns).reset_index(drop=True) 23 | summarized.to_csv("stats-summarized.csv", index=False) 24 | ''' 25 | -------------------------------------------------------------------------------- /parse.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import os 3 | import re 4 | import sys 5 | import os.path 6 | from typing import List 7 | 8 | 9 | class DataPoint(): 10 | # regex for directory name, which encodes experiment metadata 11 | re_dirname = re.compile( 12 | r'\A(?P<wl>[A-Z]+)-CC=(?P<cc_alg>[A-Z_0-9]+)-THD=(?P<thread_cnt>[0-9]+)(-ZIPF=(?P<zipf_theta>[0-9.]+))?(-NUM_WH=(?P<num_wh>[0-9]+))?(-PRIO_RATIO=(?P<prio_ratio>[0-9.]+))?\Z') 13 | # regex to filter throughput 14 | re_throughput = re.compile( 15 | r'\A\[summary\] throughput=(?P<throughput>[0-9.e+]+),') 16 | # regex to filter tail latency 17 | re_tail = re.compile( 18 | r'\A\[(?P<tag>\S+):tail\]\s+txn_cnt=(?P<txn_cnt>[0-9]+)(, p50=(?P<p50>[0-9.]+))?(, p90=(?P<p90>[0-9.]+))?(, p99=(?P<p99>[0-9.]+))?(, p999=(?P<p999>[0-9.]+))?(, p9999=(?P<p9999>[0-9.]+))?') 19 | # regex to filter per-priority breakdown 20 | re_prio_breakdown = re.compile( 21 | r'\A\[prio=(?P<prio>\d+)\]\s+txn_cnt=(?P<txn_cnt>[0-9]+), abort_cnt=(?P<abort_cnt>[0-9]+), abort_time=(?P<abort_time>[0-9]+), exec_time=(?P<exec_time>[0-9]+), backoff_time=(?P<backoff_time>[0-9]+), ') 22 | 23 | def __init__(self, prefix: str, dirname: str) -> None: 24 | d = self.re_dirname.match(dirname).groupdict() 25 | self.wl = d['wl'] 26 | assert self.wl in {"YCSB", "TPCC"} 27 | self.params = d # cc_alg, thread_cnt, zipf_theta, num_wh, prio_ratio 28 | self.throughput = None 29 | self.tail = {} 30 | self.prio_breakdown = {} 31 | with open(os.path.join(prefix, dirname, "log"), 'r') as f: 32 | for line in f: 33 | if line.startswith('[summary]'): 34 | self.throughput = self.re_throughput.match(line).groupdict()[ 35 | 'throughput'] 36 | else: 37 | m = self.re_tail.match(line) 38 | if m is not None: 39 | d = m.groupdict() 40 | self.tail[d['tag']] = { 41 | k: v for k, v in d.items() if k != 'tag'} 42 | m = self.re_prio_breakdown.match(line) 43 | if m is not None: 44 | d = m.groupdict() 45 | self.prio_breakdown[d['prio']] = { 46 | k: v for k, v in d.items() if k != 'prio'} 47 | 48 | def get_base_header(self) -> List[str]: 49 | return [ 50 | p for p in ["cc_alg", "thread_cnt", "zipf_theta", "num_wh", "prio_ratio"] 51 | if self.params.get(p) 52 | ] 53 | 54 | def get_base_data(self) -> List: 55 | return [ 56 | self.params.get(p) for p in ["cc_alg", "thread_cnt", "zipf_theta", "num_wh", "prio_ratio"] 57 | if self.params.get(p) 58 | ] 59 | 60 | def get_throughput_header(self) -> List[str]: 61 | h = self.get_base_header() 62 | h.append("throughput") 63 | return h 64 | 65 | def get_throughput_data(self) -> List[str]: 66 | d = self.get_base_data() 67 | d.append(str(self.throughput)) 68 | return d 69 | 70 | def get_tail_header(self) -> List[str]: 71 | h = self.get_base_header() 72 | h.extend(['tag', 'p50', 'p99', 'p999', 'p9999']) 73 | return h 74 | 75 | def get_tail_data(self) -> List[List]: 76 | d = self.get_base_data() 77 | return [ 78 | d + [tag, str(tail['p50']), str(tail['p99']), 79 | str(tail['p999']),
str(tail['p9999'])] 80 | for tag, tail in self.tail.items() 81 | ] 82 | 83 | 84 | def parse_datapoint(prefix: str, dirname: str) -> DataPoint: 85 | if dirname.startswith("YCSB") or dirname.startswith("TPCC"): 86 | return DataPoint(prefix, dirname) 87 | else: 88 | print(f"Unknown experiment: {dirname}") 89 | 90 | 91 | def dump_throughput(datapoints: List[DataPoint], path: str, has_header: bool = True): 92 | with open(path, 'w') as f: 93 | if has_header: 94 | f.write(f"{','.join(datapoints[0].get_throughput_header())}\n") 95 | for dp in datapoints: 96 | f.write(f"{','.join(dp.get_throughput_data())}\n") 97 | 98 | 99 | def dump_tail(datapoints: List[DataPoint], path: str, has_header: bool = True): 100 | with open(path, 'w') as f: 101 | if has_header: 102 | f.write(f"{','.join(datapoints[0].get_tail_header())}\n") 103 | for dp in datapoints: 104 | for l in dp.get_tail_data(): 105 | f.write(f"{','.join(l)}\n") 106 | 107 | 108 | if __name__ == "__main__": 109 | if len(sys.argv) <= 1: 110 | print(f"Usage: {sys.argv[0]} exper1 [exper2 [exper3...]]") 111 | sys.exit(1) 112 | exper_list = sys.argv[1:] 113 | 114 | for exper in exper_list: 115 | dp_list = [ 116 | parse_datapoint(f'results/{exper}', d) 117 | for d in os.listdir(f'results/{exper}') 118 | if os.path.isdir(f'results/{exper}/{d}') 119 | ] 120 | dump_throughput(dp_list, f'results/{exper}/throughput.csv') 121 | dump_tail(dp_list, f'results/{exper}/tail.csv') 122 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib 2 | pandas 3 | numpy 4 | -------------------------------------------------------------------------------- /storage/catalog.cpp: -------------------------------------------------------------------------------- 1 | #include "catalog.h" 2 | #include "global.h" 3 | #include "helper.h" 4 | 5 | void 6 | Catalog::init(const char * table_name, int field_cnt) { 7 |
this->table_name = table_name; 8 | this->field_cnt = 0; 9 | this->_columns = new Column [field_cnt]; 10 | this->tuple_size = 0; 11 | } 12 | 13 | void Catalog::add_col(char * col_name, uint64_t size, char * type) { 14 | _columns[field_cnt].size = size; 15 | strcpy(_columns[field_cnt].type, type); 16 | strcpy(_columns[field_cnt].name, col_name); 17 | _columns[field_cnt].id = field_cnt; 18 | _columns[field_cnt].index = tuple_size; 19 | tuple_size += size; 20 | field_cnt ++; 21 | } 22 | 23 | uint64_t Catalog::get_field_id(const char * name) { 24 | UInt32 i; 25 | for (i = 0; i < field_cnt; i++) { 26 | if (strcmp(name, _columns[i].name) == 0) 27 | break; 28 | } 29 | assert (i < field_cnt); 30 | return i; 31 | } 32 | 33 | char * Catalog::get_field_type(uint64_t id) { 34 | return _columns[id].type; 35 | } 36 | 37 | char * Catalog::get_field_name(uint64_t id) { 38 | return _columns[id].name; 39 | } 40 | 41 | 42 | char * Catalog::get_field_type(char * name) { 43 | return get_field_type( get_field_id(name) ); 44 | } 45 | 46 | uint64_t Catalog::get_field_index(char * name) { 47 | return get_field_index( get_field_id(name) ); 48 | } 49 | 50 | void Catalog::print_schema() { 51 | printf("\n[Catalog] %s\n", table_name); 52 | for (UInt32 i = 0; i < field_cnt; i++) { 53 | printf("\t%s\t%s\t%ld\n", get_field_name(i), 54 | get_field_type(i), get_field_size(i)); 55 | } 56 | } 57 | -------------------------------------------------------------------------------- /storage/catalog.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | #include 5 | #include "global.h" 6 | #include "helper.h" 7 | 8 | class Column { 9 | public: 10 | Column() { 11 | this->type = new char[80]; 12 | this->name = new char[80]; 13 | } 14 | Column(uint64_t size, char * type, char * name, 15 | uint64_t id, uint64_t index) 16 | { 17 | this->size = size; 18 | this->id = id; 19 | this->index = index; 20 | this->type = new char[80]; 21 | this->name = new 
char[80]; 22 | strcpy(this->type, type); 23 | strcpy(this->name, name); 24 | }; 25 | 26 | UInt64 id; 27 | UInt32 size; 28 | UInt32 index; 29 | char * type; 30 | char * name; 31 | char pad[CL_SIZE - sizeof(uint64_t)*3 - sizeof(char *)*2]; 32 | }; 33 | 34 | class Catalog { 35 | public: 36 | // abandoned init function 37 | // field_size is the size of each field. 38 | void init(const char * table_name, int field_cnt); 39 | void add_col(char * col_name, uint64_t size, char * type); 40 | 41 | UInt32 field_cnt; 42 | const char * table_name; 43 | 44 | UInt32 get_tuple_size() { return tuple_size; }; 45 | 46 | uint64_t get_field_cnt() { return field_cnt; }; 47 | uint64_t get_field_size(int id) { return _columns[id].size; }; 48 | uint64_t get_field_index(int id) { return _columns[id].index; }; 49 | char * get_field_type(uint64_t id); 50 | char * get_field_name(uint64_t id); 51 | uint64_t get_field_id(const char * name); 52 | char * get_field_type(char * name); 53 | uint64_t get_field_index(char * name); 54 | 55 | void print_schema(); 56 | Column * _columns; 57 | UInt32 tuple_size; 58 | }; 59 | 60 | -------------------------------------------------------------------------------- /storage/index_base.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class table_t; 6 | 7 | class index_base { 8 | public: 9 | virtual RC init() { return RCOK; }; 10 | virtual RC init(uint64_t size) { return RCOK; }; 11 | 12 | virtual bool index_exist(idx_key_t key)=0; // check if the key exists.
13 | 14 | virtual RC index_insert(idx_key_t key, 15 | itemid_t * item, 16 | int part_id=-1)=0; 17 | 18 | virtual RC index_read(idx_key_t key, 19 | itemid_t * &item, 20 | int part_id=-1)=0; 21 | 22 | virtual RC index_read(idx_key_t key, 23 | itemid_t * &item, 24 | int part_id=-1, int thd_id=0)=0; 25 | 26 | // TODO implement index_remove 27 | virtual RC index_remove(idx_key_t key) { return RCOK; }; 28 | 29 | // the index is on "table". The key is the merged key of "fields" 30 | table_t * table; 31 | }; 32 | -------------------------------------------------------------------------------- /storage/index_btree.h: -------------------------------------------------------------------------------- 1 | #ifndef _BTREE_H_ 2 | #define _BTREE_H_ 3 | 4 | #include "global.h" 5 | #include "helper.h" 6 | #include "index_base.h" 7 | 8 | 9 | typedef struct bt_node { 10 | // TODO bad hack! 11 | void ** pointers; // for non-leaf nodes, point to bt_nodes 12 | bool is_leaf; 13 | idx_key_t * keys; 14 | bt_node * parent; 15 | UInt32 num_keys; 16 | bt_node * next; 17 | bool latch; 18 | pthread_mutex_t locked; 19 | latch_t latch_type; 20 | UInt32 share_cnt; 21 | } bt_node; 22 | 23 | struct glob_param { 24 | uint64_t part_id; 25 | }; 26 | 27 | class index_btree : public index_base { 28 | public: 29 | RC init(uint64_t part_cnt); 30 | RC init(uint64_t part_cnt, table_t * table); 31 | bool index_exist(idx_key_t key); // check if the key exists. 32 | RC index_insert(idx_key_t key, itemid_t * item, int part_id = -1); 33 | RC index_read(idx_key_t key, itemid_t * &item, 34 | uint64_t thd_id, int64_t part_id = -1); 35 | RC index_read(idx_key_t key, itemid_t * &item, int part_id = -1); 36 | RC index_read(idx_key_t key, itemid_t * &item); 37 | RC index_next(uint64_t thd_id, itemid_t * &item, bool samekey = false); 38 | 39 | private: 40 | // index structures may have part_cnt = 1 or PART_CNT.
41 | uint64_t part_cnt; 42 | RC make_lf(uint64_t part_id, bt_node *& node); 43 | RC make_nl(uint64_t part_id, bt_node *& node); 44 | RC make_node(uint64_t part_id, bt_node *& node); 45 | 46 | RC start_new_tree(glob_param params, idx_key_t key, itemid_t * item); 47 | RC find_leaf(glob_param params, idx_key_t key, idx_acc_t access_type, bt_node *& leaf, bt_node *& last_ex); 48 | RC find_leaf(glob_param params, idx_key_t key, idx_acc_t access_type, bt_node *& leaf); 49 | RC insert_into_leaf(glob_param params, bt_node * leaf, idx_key_t key, itemid_t * item); 50 | // handle split 51 | RC split_lf_insert(glob_param params, bt_node * leaf, idx_key_t key, itemid_t * item); 52 | RC split_nl_insert(glob_param params, bt_node * node, UInt32 left_index, idx_key_t key, bt_node * right); 53 | RC insert_into_parent(glob_param params, bt_node * left, idx_key_t key, bt_node * right); 54 | RC insert_into_new_root(glob_param params, bt_node * left, idx_key_t key, bt_node * right); 55 | 56 | int leaf_has_key(bt_node * leaf, idx_key_t key); 57 | 58 | UInt32 cut(UInt32 length); 59 | UInt32 order; // # of keys in a node (for both leaf and non-leaf) 60 | bt_node ** roots; // each partition has a different root 61 | bt_node * find_root(uint64_t part_id); 62 | 63 | bool latch_node(bt_node * node, latch_t latch_type); 64 | latch_t release_latch(bt_node * node); 65 | RC upgrade_latch(bt_node * node); 66 | // clean up all the LATCH_EX up to last_ex 67 | RC cleanup(bt_node * node, bt_node * last_ex); 68 | 69 | // the leaf and the idx within the leaf that the thread last accessed.
70 | bt_node *** cur_leaf_per_thd; 71 | UInt32 ** cur_idx_per_thd; 72 | }; 73 | 74 | #endif 75 | -------------------------------------------------------------------------------- /storage/index_hash.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "index_hash.h" 3 | #include "mem_alloc.h" 4 | #include "table.h" 5 | 6 | RC IndexHash::init(uint64_t bucket_cnt, int part_cnt) { 7 | _bucket_cnt = bucket_cnt; 8 | _bucket_cnt_per_part = bucket_cnt / part_cnt; 9 | _buckets = new BucketHeader * [part_cnt]; 10 | for (int i = 0; i < part_cnt; i++) { 11 | _buckets[i] = (BucketHeader *) _mm_malloc(sizeof(BucketHeader) * _bucket_cnt_per_part, 64); 12 | for (uint32_t n = 0; n < _bucket_cnt_per_part; n ++) 13 | _buckets[i][n].init(); 14 | } 15 | return RCOK; 16 | } 17 | 18 | RC 19 | IndexHash::init(int part_cnt, table_t * table, uint64_t bucket_cnt) { 20 | init(bucket_cnt, part_cnt); 21 | this->table = table; 22 | return RCOK; 23 | } 24 | 25 | bool IndexHash::index_exist(idx_key_t key) { 26 | assert(false); 27 | return false; 28 | } 29 | 30 | void 31 | IndexHash::get_latch(BucketHeader * bucket) { 32 | while (!ATOM_CAS(bucket->locked, false, true)) {} 33 | } 34 | 35 | void 36 | IndexHash::release_latch(BucketHeader * bucket) { 37 | bool ok = ATOM_CAS(bucket->locked, true, false); 38 | assert(ok); 39 | // XXX(zhihan): change to read/write lock 40 | //pthread_rwlock_unlock(bucket->rwlock); 41 | } 42 | 43 | void 44 | IndexHash::get_latch(BucketHeader * bucket, access_t access) { 45 | while (!ATOM_CAS(bucket->locked, false, true)) {} 46 | /* 47 | // XXX(zhihan): rwlock 48 | if (access == RD) 49 | pthread_rwlock_rdlock(bucket->rwlock); 50 | else 51 | pthread_rwlock_wrlock(bucket->rwlock); 52 | */ 53 | } 54 | 55 | RC IndexHash::index_insert(idx_key_t key, itemid_t * item, int part_id) { 56 | RC rc = RCOK; 57 | uint64_t bkt_idx = hash(key); 58 | assert(bkt_idx < _bucket_cnt_per_part); 59 | BucketHeader * cur_bkt = 
&_buckets[part_id][bkt_idx]; 60 | // 1. get the ex latch 61 | get_latch(cur_bkt, WR); 62 | // 2. update the latch list 63 | cur_bkt->insert_item(key, item, part_id); 64 | // 3. release the latch 65 | release_latch(cur_bkt); 66 | return rc; 67 | } 68 | 69 | RC IndexHash::index_read(idx_key_t key, itemid_t * &item, int part_id) { 70 | //TODO(zhihan): take read lock 71 | uint64_t bkt_idx = hash(key); 72 | assert(bkt_idx < _bucket_cnt_per_part); 73 | BucketHeader * cur_bkt = &_buckets[part_id][bkt_idx]; 74 | RC rc = RCOK; 75 | // 1. get the sh latch 76 | //get_latch(cur_bkt, RD); 77 | cur_bkt->read_item(key, item, table->get_table_name()); 78 | // 3. release the latch 79 | //release_latch(cur_bkt); 80 | return rc; 81 | 82 | } 83 | 84 | RC IndexHash::index_read(idx_key_t key, itemid_t * &item, 85 | int part_id, int thd_id) { 86 | uint64_t bkt_idx = hash(key); 87 | assert(bkt_idx < _bucket_cnt_per_part); 88 | BucketHeader * cur_bkt = &_buckets[part_id][bkt_idx]; 89 | RC rc = RCOK; 90 | // 1. get the sh latch 91 | //get_latch(cur_bkt, RD); 92 | cur_bkt->read_item(key, item, table->get_table_name()); 93 | // 3. 
release the latch 94 | //release_latch(cur_bkt); 95 | return rc; 96 | } 97 | 98 | /************** BucketHeader Operations ******************/ 99 | 100 | void BucketHeader::init() { 101 | node_cnt = 0; 102 | first_node = NULL; 103 | locked = false; 104 | // XXX(zhihan): init another rw lock 105 | rwlock = new pthread_rwlock_t; 106 | pthread_rwlock_init(rwlock, NULL); 107 | } 108 | 109 | void BucketHeader::insert_item(idx_key_t key, 110 | itemid_t * item, 111 | int part_id) 112 | { 113 | BucketNode * cur_node = first_node; 114 | BucketNode * prev_node = NULL; 115 | while (cur_node != NULL) { 116 | if (cur_node->key == key) 117 | break; 118 | prev_node = cur_node; 119 | cur_node = cur_node->next; 120 | } 121 | if (cur_node == NULL) { 122 | BucketNode * new_node = (BucketNode *) 123 | mem_allocator.alloc(sizeof(BucketNode), part_id ); 124 | new_node->init(key); 125 | new_node->items = item; 126 | if (prev_node != NULL) { 127 | new_node->next = prev_node->next; 128 | prev_node->next = new_node; 129 | } else { 130 | new_node->next = first_node; 131 | first_node = new_node; 132 | } 133 | } else { 134 | item->next = cur_node->items; 135 | cur_node->items = item; 136 | } 137 | } 138 | 139 | void BucketHeader::read_item(idx_key_t key, itemid_t * &item, const char * tname) 140 | { 141 | BucketNode * cur_node = first_node; 142 | while (cur_node != NULL) { 143 | if (cur_node->key == key) 144 | break; 145 | cur_node = cur_node->next; 146 | } 147 | M_ASSERT(cur_node != NULL, "Key does not exist!"); // check before dereferencing 148 | item = cur_node->items; 149 | } 150 | -------------------------------------------------------------------------------- /storage/index_hash.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | #include "helper.h" 5 | #include "index_base.h" 6 | 7 | //TODO make proper variables private 8 | // each BucketNode contains items sharing the same key 9 | class BucketNode { 10 | public: 11 |
BucketNode(idx_key_t key) { init(key); }; 12 | void init(idx_key_t key) { 13 | this->key = key; 14 | next = NULL; 15 | items = NULL; 16 | } 17 | idx_key_t key; 18 | // The node for the next key 19 | BucketNode * next; 20 | // NOTE. The items can be a list of items connected by the next pointer. 21 | itemid_t * items; 22 | }; 23 | 24 | // BucketHeader does concurrency control of Hash 25 | class BucketHeader { 26 | public: 27 | void init(); 28 | void insert_item(idx_key_t key, itemid_t * item, int part_id); 29 | void read_item(idx_key_t key, itemid_t * &item, const char * tname); 30 | BucketNode * first_node; 31 | uint64_t node_cnt; 32 | bool locked; 33 | pthread_rwlock_t * rwlock; 34 | }; 35 | 36 | // TODO Hash index does not support partition yet. 37 | class IndexHash : public index_base 38 | { 39 | public: 40 | RC init(uint64_t bucket_cnt, int part_cnt); 41 | RC init(int part_cnt, 42 | table_t * table, 43 | uint64_t bucket_cnt); 44 | bool index_exist(idx_key_t key); // check if the key exists.
45 | RC index_insert(idx_key_t key, itemid_t * item, int part_id=-1); 46 | // the following call returns a single item 47 | RC index_read(idx_key_t key, itemid_t * &item, int part_id=-1); 48 | RC index_read(idx_key_t key, itemid_t * &item, 49 | int part_id=-1, int thd_id=0); 50 | private: 51 | void get_latch(BucketHeader * bucket); 52 | void get_latch(BucketHeader * bucket, access_t access); 53 | void release_latch(BucketHeader * bucket); 54 | 55 | // TODO implement more complex hash function 56 | uint64_t hash(idx_key_t key) { return key % _bucket_cnt_per_part; } 57 | 58 | BucketHeader ** _buckets; 59 | uint64_t _bucket_cnt; 60 | uint64_t _bucket_cnt_per_part; 61 | }; 62 | -------------------------------------------------------------------------------- /storage/row.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | #define DECL_SET_VALUE(type) \ 6 | void set_value(int col_id, type value); 7 | 8 | #define SET_VALUE(type) \ 9 | void row_t::set_value(int col_id, type value) { \ 10 | set_value(col_id, &value); \ 11 | } 12 | 13 | #define DECL_GET_VALUE(type)\ 14 | void get_value(int col_id, type & value); 15 | 16 | #define GET_VALUE(type)\ 17 | void row_t::get_value(int col_id, type & value) { \ 18 | value = *(type *)get_value(col_id); \ 19 | } 20 | 21 | // int pos = get_schema()->get_field_index(col_id); 22 | // value = *(type *)&data[pos]; 23 | // } 24 | 25 | class Access; 26 | class table_t; 27 | class Catalog; 28 | class txn_man; 29 | class Row_lock; 30 | class Row_mvcc; 31 | class Row_hekaton; 32 | class Row_ts; 33 | class Row_occ; 34 | class Row_tictoc; 35 | class Row_silo; 36 | class Row_silo_prio; 37 | class Row_aria; 38 | class Row_vll; 39 | class Row_ww; 40 | class Row_bamboo; 41 | //class Row_bamboo_pt; 42 | class Row_ic3; 43 | #if CC_ALG == WOUND_WAIT || CC_ALG == WAIT_DIE || CC_ALG == NO_WAIT || CC_ALG == DL_DETECT 44 | struct LockEntry; 45 | #elif CC_ALG == BAMBOO 46 | 
struct BBLockEntry; 47 | #endif 48 | 49 | class row_t 50 | { 51 | public: 52 | 53 | RC init(table_t * host_table, uint64_t part_id, uint64_t row_id = 0); 54 | void init(int size); 55 | RC switch_schema(table_t * host_table); 56 | // not every row has a manager 57 | void init_manager(row_t * row); 58 | 59 | table_t * get_table(); 60 | Catalog * get_schema(); 61 | const char * get_table_name(); 62 | uint64_t get_field_cnt(); 63 | uint64_t get_tuple_size(); 64 | uint64_t get_row_id() { return _row_id; }; 65 | 66 | void copy(row_t * src); 67 | void copy(row_t * src, int idx); 68 | 69 | void set_primary_key(uint64_t key) { _primary_key = key; }; 70 | uint64_t get_primary_key() {return _primary_key; }; 71 | uint64_t get_part_id() { return _part_id; }; 72 | 73 | void set_value(int id, void * ptr); 74 | void set_value_plain(int id, void * ptr); 75 | void set_value(int id, void * ptr, int size); 76 | void set_value(const char * col_name, void * ptr); 77 | char * get_value(int id); 78 | char * get_value_plain(uint64_t id); 79 | char * get_value(char * col_name); 80 | void inc_value(int id, uint64_t val); 81 | void dec_value(int id, uint64_t val); 82 | 83 | DECL_SET_VALUE(uint64_t); 84 | DECL_SET_VALUE(int64_t); 85 | DECL_SET_VALUE(double); 86 | DECL_SET_VALUE(UInt32); 87 | DECL_SET_VALUE(SInt32); 88 | 89 | DECL_GET_VALUE(uint64_t); 90 | DECL_GET_VALUE(int64_t); 91 | DECL_GET_VALUE(double); 92 | DECL_GET_VALUE(UInt32); 93 | DECL_GET_VALUE(SInt32); 94 | 95 | 96 | void set_data(char * data, uint64_t size); 97 | char * get_data(); 98 | 99 | void free_row(); 100 | 101 | // for concurrency control. can be lock, timestamp etc. 
102 | #if CC_ALG == BAMBOO 103 | RC retire_row(BBLockEntry * lock_entry); 104 | #elif CC_ALG == IC3 105 | row_t * orig; 106 | void init_accesses(Access * access); 107 | Access * txn_access; // only used when row is a local copy 108 | #endif 109 | RC get_row(access_t type, txn_man * txn, row_t *& row, Access *access=NULL); 110 | #if CC_ALG == BAMBOO 111 | void return_row(BBLockEntry * lock_entry, RC rc); 112 | #elif CC_ALG == WOUND_WAIT 113 | void return_row(LockEntry * lock_entry, RC rc); 114 | #endif 115 | #if CC_ALG == WOUND_WAIT || CC_ALG == WAIT_DIE || CC_ALG == NO_WAIT || CC_ALG == DL_DETECT 116 | void return_row(access_t type, row_t * row, LockEntry * lock_entry); 117 | #endif 118 | void return_row(access_t type, txn_man * txn, row_t * row); 119 | 120 | #if CC_ALG == DL_DETECT || CC_ALG == NO_WAIT || CC_ALG == WAIT_DIE 121 | Row_lock * manager; 122 | #elif CC_ALG == TIMESTAMP 123 | Row_ts * manager; 124 | #elif CC_ALG == MVCC 125 | Row_mvcc * manager; 126 | #elif CC_ALG == HEKATON 127 | Row_hekaton * manager; 128 | #elif CC_ALG == OCC 129 | Row_occ * manager; 130 | #elif CC_ALG == TICTOC 131 | Row_tictoc * manager; 132 | #elif CC_ALG == SILO 133 | Row_silo * manager; 134 | #elif CC_ALG == SILO_PRIO 135 | Row_silo_prio * manager; 136 | #elif CC_ALG == ARIA 137 | Row_aria * manager; 138 | #elif CC_ALG == VLL 139 | Row_vll * manager; 140 | #elif CC_ALG == WOUND_WAIT 141 | Row_ww * manager; 142 | #elif CC_ALG == BAMBOO 143 | Row_bamboo * manager; 144 | #elif CC_ALG == IC3 145 | Row_ic3 * manager; 146 | #endif 147 | char * data; 148 | table_t * table; 149 | private: 150 | // primary key should be calculated from the data stored in the row. 
151 | uint64_t _primary_key; 152 | uint64_t _part_id; 153 | uint64_t _row_id; 154 | }; 155 | -------------------------------------------------------------------------------- /storage/table.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "table.h" 4 | #include "catalog.h" 5 | #include "row.h" 6 | #include "mem_alloc.h" 7 | 8 | void table_t::init(Catalog * schema) { 9 | this->table_name = schema->table_name; 10 | this->schema = schema; 11 | } 12 | 13 | RC table_t::get_new_row(row_t *& row) { 14 | // this function is obsolete. 15 | assert(false); 16 | return RCOK; 17 | } 18 | 19 | // the row is not stored locally. the pointer must be maintained by index structure. 20 | RC table_t::get_new_row(row_t *& row, uint64_t part_id, uint64_t &row_id) { 21 | RC rc = RCOK; 22 | cur_tab_size ++; 23 | 24 | row = (row_t *) _mm_malloc(sizeof(row_t), 64); 25 | rc = row->init(this, part_id, row_id); 26 | row->init_manager(row); 27 | 28 | return rc; 29 | } 30 | -------------------------------------------------------------------------------- /storage/table.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | // TODO sequential scan is not supported yet. 6 | // only index access is supported for table. 7 | class Catalog; 8 | class row_t; 9 | 10 | class table_t 11 | { 12 | public: 13 | void init(Catalog * schema); 14 | // row lookup should be done with index. But index does not have 15 | // records for new rows. get_new_row returns the pointer to a 16 | // new row. 
17 | RC get_new_row(row_t *& row); // this is equivalent to insert() 18 | RC get_new_row(row_t *& row, uint64_t part_id, uint64_t &row_id); 19 | 20 | void delete_row(); // TODO delete_row is not supported yet 21 | 22 | uint64_t get_table_size() { return cur_tab_size; }; 23 | Catalog * get_schema() { return schema; }; 24 | const char * get_table_name() { return table_name; }; 25 | 26 | Catalog * schema; 27 | private: 28 | const char * table_name; 29 | uint64_t cur_tab_size; 30 | char pad[CL_SIZE - sizeof(void *)*3]; 31 | }; 32 | -------------------------------------------------------------------------------- /system/amd64.h: -------------------------------------------------------------------------------- 1 | // 2 | // Implemented by authors of SILO. 3 | // 4 | 5 | #ifndef _AMD64_H_ 6 | #define _AMD64_H_ 7 | 8 | #include 9 | 10 | #define ALWAYS_INLINE __attribute__((always_inline)) 11 | 12 | inline ALWAYS_INLINE void 13 | nop_pause() { 14 | __asm volatile("pause" : :); 15 | } 16 | 17 | inline void 18 | memory_barrier() { 19 | asm volatile("mfence" : : : "memory"); 20 | } 21 | 22 | #endif /* _AMD64_H_ */ 23 | 24 | -------------------------------------------------------------------------------- /system/batch.cpp: -------------------------------------------------------------------------------- 1 | 2 | #include "batch.h" 3 | #include "wl.h" 4 | 5 | void BatchMgr::BatchBuffer::init_txn(workload* wl, thread_t* thd) { 6 | for (auto& e: batch) { 7 | RC rc = wl->get_txn_man(e.txn, thd); 8 | assert(rc == RCOK); 9 | } 10 | } 11 | -------------------------------------------------------------------------------- /system/batch.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | #include "txn.h" 5 | 6 | struct BatchEntry { 7 | txn_man* txn; // init once and reused repeatedly 8 | base_query* query; // the current query to execute 9 | RC rc; // current state; can be Abort if its reservation fails 10 |
ts_t start_ts; // if zero, this is a newly started txn 11 | // the name is a little confusing: exec_time here includes validation phases 12 | uint64_t exec_time_curr; // current execution time (not yet abort/commit) 13 | uint64_t exec_time_abort; // execution time spent on attempts that eventually aborted 14 | uint64_t txn_id; 15 | }; 16 | 17 | /* 18 | * Batch manager. 19 | * Currently only used by Aria. It manages txns batch by batch. 20 | */ 21 | class BatchMgr { 22 | struct BatchBuffer { // this should be a FIFO queue 23 | BatchEntry batch[ARIA_BATCH_SIZE] {}; 24 | int size = 0; 25 | 26 | BatchBuffer() = default; 27 | void init_txn(workload* wl, thread_t* thd); 28 | 29 | void reset() { size = 0; } 30 | void append(base_query* q) { 31 | assert(size < ARIA_BATCH_SIZE); 32 | batch[size].query = q; 33 | batch[size].rc = RCOK; 34 | batch[size].start_ts = 0; 35 | batch[size].exec_time_curr = 0; 36 | batch[size].exec_time_abort = 0; 37 | batch[size].txn_id = 0; 38 | ++size; 39 | } 40 | void append(struct BatchEntry* other) { 41 | assert(size < ARIA_BATCH_SIZE); 42 | batch[size].query = other->query; 43 | batch[size].rc = RCOK; 44 | batch[size].start_ts = other->start_ts; 45 | batch[size].exec_time_curr = 0; 46 | batch[size].exec_time_abort = other->exec_time_abort; 47 | batch[size].txn_id = other->txn_id; 48 | ++size; 49 | } 50 | BatchEntry* get(int idx) { 51 | if (idx >= size) return nullptr; // nothing to pop 52 | return &batch[idx]; 53 | } 54 | }; 55 | 56 | uint64_t batch_id; 57 | BatchBuffer batch_buf0; 58 | BatchBuffer batch_buf1; 59 | BatchBuffer* curr_batch; 60 | BatchBuffer* next_batch; 61 | 62 | public: 63 | BatchMgr(): batch_id(0), batch_buf0(), batch_buf1(), curr_batch(&batch_buf0), 64 | next_batch(&batch_buf1) {} 65 | void init_txn(workload* wl, thread_t* thd) { 66 | batch_buf0.reset(); 67 | batch_buf1.reset(); 68 | batch_buf0.init_txn(wl, thd); 69 | batch_buf1.init_txn(wl, thd); 70 | } 71 | 72 | uint64_t get_batch_id() const { return batch_id; } 73 |
74 | // get one entry from the current batch 75 | BatchEntry* get_entry(int idx) const { return curr_batch->get(idx); } 76 | // a txn aborted, put it into the next batch 77 | void put_next(BatchEntry* e) { next_batch->append(e); } 78 | 79 | // whether there is any space left on the current batch 80 | bool can_admit() { return curr_batch->size < ARIA_BATCH_SIZE; } 81 | // admit new query into the buffer 82 | void admit_new_query(base_query* q) { 83 | assert(q); 84 | assert(can_admit()); 85 | curr_batch->append(q); 86 | } 87 | 88 | // start a new batch: 89 | // next_batch becomes "curr_batch"; recycle the old one as new "next_batch" 90 | void start_new_batch() { 91 | ++batch_id; // batch_id must be nonzero 92 | // switch curr/next_batch 93 | BatchBuffer* tmp = curr_batch; 94 | curr_batch = next_batch; 95 | next_batch = tmp; 96 | next_batch->reset(); 97 | } 98 | }; 99 | -------------------------------------------------------------------------------- /system/global.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "mem_alloc.h" 3 | #include "stats.h" 4 | #include "dl_detect.h" 5 | #include "manager.h" 6 | #include "query.h" 7 | #include "plock.h" 8 | #include "occ.h" 9 | #include "vll.h" 10 | #include "aria.h" 11 | 12 | mem_alloc mem_allocator; 13 | Stats stats; 14 | DL_detect dl_detector; 15 | Manager * glob_manager; 16 | Query_queue * query_queue; 17 | Plock part_lock_man; 18 | OptCC occ_man; 19 | #if CC_ALG == VLL 20 | VLLMan vll_man; 21 | #endif 22 | 23 | bool volatile warmup_finish = false; 24 | bool volatile enable_thread_mem_pool = false; 25 | pthread_barrier_t warmup_bar; 26 | #ifndef NOGRAPHITE 27 | carbon_barrier_t enable_barrier; 28 | #endif 29 | 30 | ts_t g_abort_penalty = ABORT_PENALTY; 31 | bool g_central_man = CENTRAL_MAN; 32 | UInt32 g_ts_alloc = TS_ALLOC; 33 | bool g_key_order = KEY_ORDER; 34 | bool g_no_dl = NO_DL; 35 | ts_t g_timeout = TIMEOUT; 36 | ts_t g_dl_loop_detect = 
DL_LOOP_DETECT; 37 | bool g_ts_batch_alloc = TS_BATCH_ALLOC; 38 | UInt32 g_ts_batch_num = TS_BATCH_NUM; 39 | 40 | bool g_part_alloc = PART_ALLOC; 41 | bool g_mem_pad = MEM_PAD; 42 | UInt32 g_cc_alg = CC_ALG; 43 | ts_t g_query_intvl = QUERY_INTVL; 44 | UInt32 g_part_per_txn = PART_PER_TXN; 45 | double g_perc_multi_part = PERC_MULTI_PART; 46 | double g_read_perc = READ_PERC; 47 | double g_write_perc = WRITE_PERC; 48 | double g_zipf_theta = ZIPF_THETA; 49 | bool g_prt_lat_distr = PRT_LAT_DISTR; 50 | UInt32 g_part_cnt = PART_CNT; 51 | UInt32 g_virtual_part_cnt = VIRTUAL_PART_CNT; 52 | UInt32 g_thread_cnt = THREAD_CNT; 53 | UInt64 g_synth_table_size = SYNTH_TABLE_SIZE - (SYNTH_TABLE_SIZE % INIT_PARALLELISM); 54 | UInt32 g_req_per_query = REQ_PER_QUERY; 55 | UInt32 g_field_per_tuple = FIELD_PER_TUPLE; 56 | UInt32 g_init_parallelism = INIT_PARALLELISM; 57 | double g_last_retire = BB_LAST_RETIRE; 58 | double g_specified_ratio = SPECIFIED_RATIO; 59 | double g_flip_ratio = FLIP_RATIO; 60 | double g_long_txn_ratio = LONG_TXN_RATIO; 61 | double g_long_txn_read_ratio = LONG_TXN_READ_RATIO; 62 | 63 | UInt32 g_num_wh = NUM_WH; 64 | double g_perc_payment = PERC_PAYMENT; 65 | double g_perc_delivery = PERC_DELIVERY; 66 | double g_perc_orderstatus = PERC_ORDERSTATUS; 67 | double g_perc_stocklevel = PERC_STOCKLEVEL; 68 | double g_perc_neworder = 1 - (g_perc_payment + g_perc_delivery + g_perc_orderstatus + g_perc_stocklevel); 69 | bool g_wh_update = WH_UPDATE; 70 | char * output_file = NULL; 71 | 72 | map g_params; 73 | 74 | #if TPCC_SMALL 75 | UInt32 g_max_items = 10000; 76 | UInt32 g_cust_per_dist = 2000; 77 | #else 78 | UInt32 g_max_items = 100000; 79 | UInt32 g_cust_per_dist = 3000; 80 | #endif 81 | -------------------------------------------------------------------------------- /system/global.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "stdint.h" 4 | #include 5 | #include 6 | #include 7 | #define NDEBUG 8 | 
#include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | #include 18 | #include 19 | #include 20 | #include 21 | #include 22 | #include 23 | #include 24 | 25 | #if LATCH == LH_MCSLOCK 26 | #include "mcs_spinlock.h" 27 | #endif 28 | #include "pthread.h" 29 | #include "config.h" 30 | #include "stats.h" 31 | #include "dl_detect.h" 32 | #ifndef NOGRAPHITE 33 | #include "carbon_user.h" 34 | #endif 35 | #include "helper.h" 36 | 37 | using namespace std; 38 | 39 | class mem_alloc; 40 | class Stats; 41 | class DL_detect; 42 | class Manager; 43 | class Query_queue; 44 | class Plock; 45 | class OptCC; 46 | class VLLMan; 47 | 48 | typedef uint32_t UInt32; 49 | typedef int32_t SInt32; 50 | typedef uint64_t UInt64; 51 | typedef int64_t SInt64; 52 | 53 | typedef uint64_t ts_t; // time stamp type 54 | 55 | /******************************************/ 56 | // Global Data Structure 57 | /******************************************/ 58 | extern mem_alloc mem_allocator; 59 | extern Stats stats; 60 | extern DL_detect dl_detector; 61 | extern Manager * glob_manager; 62 | extern Query_queue * query_queue; 63 | extern Plock part_lock_man; 64 | extern OptCC occ_man; 65 | #if CC_ALG == VLL 66 | extern VLLMan vll_man; 67 | #endif 68 | 69 | extern bool volatile warmup_finish; 70 | extern bool volatile enable_thread_mem_pool; 71 | extern pthread_barrier_t warmup_bar; 72 | #ifndef NOGRAPHITE 73 | extern carbon_barrier_t enable_barrier; 74 | #endif 75 | 76 | /******************************************/ 77 | // Global Parameter 78 | /******************************************/ 79 | extern bool g_part_alloc; 80 | extern bool g_mem_pad; 81 | extern bool g_prt_lat_distr; 82 | extern UInt32 g_part_cnt; 83 | extern UInt32 g_virtual_part_cnt; 84 | extern UInt32 g_thread_cnt; 85 | extern ts_t g_abort_penalty; 86 | extern bool g_central_man; 87 | extern UInt32 g_ts_alloc; 88 | extern bool g_key_order; 89 | extern bool g_no_dl; 90 | 
extern ts_t g_timeout; 91 | extern ts_t g_dl_loop_detect; 92 | extern bool g_ts_batch_alloc; 93 | extern UInt32 g_ts_batch_num; 94 | 95 | extern map<string, string> g_params; 96 | 97 | // YCSB 98 | extern UInt32 g_cc_alg; 99 | extern ts_t g_query_intvl; 100 | extern UInt32 g_part_per_txn; 101 | extern double g_perc_multi_part; 102 | extern double g_read_perc; 103 | extern double g_write_perc; 104 | extern double g_zipf_theta; 105 | extern UInt64 g_synth_table_size; 106 | extern UInt32 g_req_per_query; 107 | extern UInt32 g_field_per_tuple; 108 | extern UInt32 g_init_parallelism; 109 | extern double g_last_retire; 110 | extern double g_specified_ratio; 111 | extern double g_flip_ratio; 112 | extern double g_long_txn_ratio; 113 | extern double g_long_txn_read_ratio; 114 | 115 | // TPCC 116 | extern UInt32 g_num_wh; 117 | extern double g_perc_payment; 118 | extern double g_perc_delivery; 119 | extern double g_perc_orderstatus; 120 | extern double g_perc_stocklevel; 121 | extern double g_perc_neworder; 122 | extern bool g_wh_update; 123 | extern char * output_file; 124 | extern UInt32 g_max_items; 125 | extern UInt32 g_cust_per_dist; 126 | 127 | enum RC { RCOK, Commit, Abort, WAIT, ERROR, FINISH}; 128 | 129 | /* Thread */ 130 | typedef uint64_t txnid_t; 131 | 132 | /* Txn */ 133 | typedef uint64_t txn_t; 134 | 135 | /* Table and Row */ 136 | typedef uint64_t rid_t; // row id 137 | typedef uint64_t pgid_t; // page id 138 | 139 | 140 | 141 | /* INDEX */ 142 | enum latch_t {LATCH_EX, LATCH_SH, LATCH_NONE}; 143 | // accessing type determines the latch type on nodes 144 | enum idx_acc_t {INDEX_INSERT, INDEX_READ, INDEX_NONE}; 145 | typedef uint64_t idx_key_t; // key id for index 146 | typedef uint64_t (*func_ptr)(idx_key_t); // part_id func_ptr(index_key); 147 | 148 | /* general concurrency control */ 149 | enum access_t {RD, WR, XP, SCAN, CM}; 150 | /* LOCK */ 151 | enum lock_t {LOCK_EX, LOCK_SH, LOCK_NONE }; 152 | enum loc_t {RETIRED, OWNERS, WAITERS, LOC_NONE}; 153 | enum lock_status
{LOCK_DROPPED, LOCK_WAITER, LOCK_OWNER, LOCK_RETIRED}; 154 | /* TIMESTAMP */ 155 | enum TsType {R_REQ, W_REQ, P_REQ, XP_REQ}; 156 | /* TXN STATUS */ 157 | // XXX(zhihan): bamboo requires the enumeration order to be unchanged 158 | enum status_t: unsigned int {RUNNING, ABORTED, COMMITED, HOLDING}; 159 | 160 | /* COMMUTATIVE OPERATIONS */ 161 | enum com_t {COM_INC, COM_DEC, COM_NONE}; 162 | 163 | 164 | #define MSG(str, args...) { \ 165 | printf("[%s : %d] " str, __FILE__, __LINE__, args); } \ 166 | // printf(args); } 167 | 168 | // principal index structure. The workload may decide to use a different 169 | // index structure for specific purposes. (e.g. non-primary key access should use hash) 170 | #if (INDEX_STRUCT == IDX_BTREE) 171 | #define INDEX index_btree 172 | #else // IDX_HASH 173 | #define INDEX IndexHash 174 | #endif 175 | 176 | /************************************************/ 177 | // constants 178 | /************************************************/ 179 | #ifndef UINT64_MAX 180 | #define UINT64_MAX 18446744073709551615UL 181 | #endif // UINT64_MAX 182 | -------------------------------------------------------------------------------- /system/helper.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "mem_alloc.h" 4 | #include "time.h" 5 | 6 | bool itemid_t::operator==(const itemid_t &other) const { 7 | return (type == other.type && location == other.location); 8 | } 9 | 10 | bool itemid_t::operator!=(const itemid_t &other) const { 11 | return !(*this == other); 12 | } 13 | 14 | void itemid_t::operator=(const itemid_t &other){ 15 | this->valid = other.valid; 16 | this->type = other.type; 17 | this->location = other.location; 18 | assert(*this == other); 19 | assert(this->valid); 20 | } 21 | 22 | void itemid_t::init() { 23 | valid = false; 24 | location = 0; 25 | next = NULL; 26 | } 27 | 28 | int get_thdid_from_txnid(uint64_t txnid) { 29 | return txnid % 
g_thread_cnt; 30 | } 31 | 32 | uint64_t get_part_id(void * addr) { 33 | return ((uint64_t)addr / PAGE_SIZE) % g_part_cnt; 34 | } 35 | 36 | uint64_t key_to_part(uint64_t key) { 37 | if (g_part_alloc) 38 | return key % g_part_cnt; 39 | else 40 | return 0; 41 | } 42 | 43 | uint64_t merge_idx_key(UInt64 key_cnt, UInt64 * keys) { 44 | UInt64 len = 64 / key_cnt; 45 | UInt64 key = 0; 46 | for (UInt32 i = 0; i < key_cnt; i++) { 47 | assert(keys[i] < (1UL << len)); 48 | key = (key << len) | keys[i]; 49 | } 50 | return key; 51 | } 52 | 53 | uint64_t merge_idx_key(uint64_t key1, uint64_t key2) { 54 | assert(key1 < (1UL << 32) && key2 < (1UL << 32)); 55 | return key1 << 32 | key2; 56 | } 57 | 58 | uint64_t merge_idx_key(uint64_t key1, uint64_t key2, uint64_t key3) { 59 | assert(key1 < (1 << 21) && key2 < (1 << 21) && key3 < (1 << 21)); 60 | return key1 << 42 | key2 << 21 | key3; 61 | } 62 | 63 | /****************************************************/ 64 | // Global Clock! 65 | /****************************************************/ 66 | /* 67 | inline uint64_t get_server_clock() { 68 | #if defined(__i386__) 69 | uint64_t ret; 70 | __asm__ __volatile__("rdtsc" : "=A" (ret)); 71 | #elif defined(__x86_64__) 72 | unsigned hi, lo; 73 | __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi)); 74 | uint64_t ret = ( (uint64_t)lo)|( ((uint64_t)hi)<<32 ); 75 | ret = (uint64_t) ((double)ret / CPU_FREQ); 76 | #else 77 | timespec * tp = new timespec; 78 | clock_gettime(CLOCK_REALTIME, tp); 79 | uint64_t ret = tp->tv_sec * 1000000000 + tp->tv_nsec; 80 | #endif 81 | return ret; 82 | } 83 | 84 | inline uint64_t get_sys_clock() { 85 | #ifndef NOGRAPHITE 86 | static volatile uint64_t fake_clock = 0; 87 | if (warmup_finish) 88 | return CarbonGetTime(); // in ns 89 | else { 90 | return ATOM_ADD_FETCH(fake_clock, 100); 91 | } 92 | #else 93 | if (TIME_ENABLE) 94 | return get_server_clock(); 95 | return 0; 96 | #endif 97 | } 98 | */ 99 | void myrand::init(uint64_t seed) { 100 | this->seed = seed; 101 | } 102
| 103 | uint64_t myrand::next() { 104 | seed = (seed * 1103515247UL + 12345UL) % (1UL<<63); 105 | return (seed / 65537) % RAND_MAX; 106 | } 107 | 108 | -------------------------------------------------------------------------------- /system/main.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "ycsb.h" 3 | #include "tpcc.h" 4 | #include "test.h" 5 | #include "thread.h" 6 | #include "manager.h" 7 | #include "mem_alloc.h" 8 | #include "query.h" 9 | #include "plock.h" 10 | #include "occ.h" 11 | #include "vll.h" 12 | #include "aria.h" 13 | 14 | void * f(void *); 15 | 16 | thread_t ** m_thds; 17 | 18 | // defined in parser.cpp 19 | void parser(int argc, char * argv[]); 20 | 21 | int main(int argc, char* argv[]) 22 | { 23 | parser(argc, argv); 24 | 25 | #ifndef NDEBUG 26 | uint64_t ts0, ts1; 27 | ts0 = get_sys_clock(); 28 | sleep(1); 29 | ts1 = get_sys_clock(); 30 | double ratio = ((double)(ts1 - ts0)) / 1000000000.0; 31 | if (ratio < 0.99 || ratio > 1.00) { 32 | fprintf(stderr, 33 | "FATAL ERROR: CPU frequency might be incorrectly configured: " 34 | "real_time/cpu_ts_time=%f\n", ratio); 35 | abort(); 36 | } 37 | #endif 38 | 39 | mem_allocator.init(g_part_cnt, MEM_SIZE / g_part_cnt); 40 | stats.init(); 41 | glob_manager = (Manager *) _mm_malloc(sizeof(Manager), 64); 42 | glob_manager->init(); 43 | if (g_cc_alg == DL_DETECT) 44 | dl_detector.init(); 45 | printf("mem_allocator initialized!\n"); 46 | 47 | #if CC_ALG == WOUND_WAIT 48 | printf("WOUND_WAIT\n"); 49 | #elif CC_ALG == NO_WAIT 50 | printf("NO_WAIT\n"); 51 | #elif CC_ALG == WAIT_DIE 52 | printf("WAIT_DIE\n"); 53 | #elif CC_ALG == BAMBOO 54 | printf("BAMBOO\n"); 55 | #elif CC_ALG == SILO 56 | printf("SILO\n"); 57 | #elif CC_ALG == SILO_PRIO 58 | printf("SILO_PRIO\n"); 59 | #elif CC_ALG == ARIA 60 | printf("ARIA\n"); 61 | #elif CC_ALG == IC3 62 | printf("IC3\n"); 63 | #endif 64 | 65 | 66 | workload * m_wl; 67 | switch (WORKLOAD) { 68 | case YCSB
: 69 | m_wl = new ycsb_wl; break; 70 | case TPCC : 71 | m_wl = new tpcc_wl; break; 72 | case TEST : 73 | m_wl = new TestWorkload; 74 | ((TestWorkload *)m_wl)->tick(); 75 | break; 76 | default: 77 | assert(false); 78 | } 79 | m_wl->init(); 80 | printf("workload initialized!\n"); 81 | 82 | 83 | uint64_t thd_cnt = g_thread_cnt; 84 | pthread_t p_thds[thd_cnt - 1]; 85 | m_thds = new thread_t * [thd_cnt]; 86 | for (uint32_t i = 0; i < thd_cnt; i++) 87 | m_thds[i] = (thread_t *) _mm_malloc(sizeof(thread_t), 64); 88 | // query_queue should be the last one to be initialized!!! 89 | // because it collects txn latency 90 | query_queue = (Query_queue *) _mm_malloc(sizeof(Query_queue), 64); 91 | if (WORKLOAD != TEST) 92 | query_queue->init(m_wl); 93 | pthread_barrier_init( &warmup_bar, NULL, g_thread_cnt ); 94 | printf("query_queue initialized!\n"); 95 | #if CC_ALG == HSTORE 96 | part_lock_man.init(); 97 | #elif CC_ALG == OCC 98 | occ_man.init(); 99 | #elif CC_ALG == VLL 100 | vll_man.init(); 101 | #elif CC_ALG == ARIA 102 | AriaCoord::init(); 103 | #endif 104 | 105 | for (uint32_t i = 0; i < thd_cnt; i++) 106 | m_thds[i]->init(i, m_wl); 107 | 108 | if (WARMUP > 0){ 109 | printf("WARMUP start!\n"); 110 | for (uint32_t i = 0; i < thd_cnt - 1; i++) { 111 | uint64_t vid = i; 112 | pthread_create(&p_thds[i], NULL, f, (void *)vid); 113 | } 114 | f((void *)(thd_cnt - 1)); 115 | for (uint32_t i = 0; i < thd_cnt - 1; i++) 116 | pthread_join(p_thds[i], NULL); 117 | printf("WARMUP finished!\n"); 118 | } 119 | warmup_finish = true; 120 | pthread_barrier_init( &warmup_bar, NULL, g_thread_cnt ); 121 | #ifndef NOGRAPHITE 122 | CarbonBarrierInit(&enable_barrier, g_thread_cnt); 123 | #endif 124 | pthread_barrier_init( &warmup_bar, NULL, g_thread_cnt ); 125 | 126 | #if CC_ALG == ARIA 127 | AriaCoord::init(); // re-init 128 | #endif 129 | 130 | // spawn and run txns again. 
131 | int64_t starttime = get_server_clock(); 132 | for (uint32_t i = 0; i < thd_cnt - 1; i++) { 133 | uint64_t vid = i; 134 | pthread_create(&p_thds[i], NULL, f, (void *)vid); 135 | } 136 | f((void *)(thd_cnt - 1)); 137 | for (uint32_t i = 0; i < thd_cnt - 1; i++) 138 | pthread_join(p_thds[i], NULL); 139 | int64_t endtime = get_server_clock(); 140 | 141 | if (WORKLOAD != TEST) { 142 | printf("PASS! SimTime = %ld\n", endtime - starttime); 143 | if (STATS_ENABLE) 144 | stats.print(); 145 | } else { 146 | ((TestWorkload *)m_wl)->summarize(); 147 | } 148 | return 0; 149 | } 150 | 151 | void * f(void * id) { 152 | uint64_t tid = (uint64_t)id; 153 | m_thds[tid]->run(); 154 | return NULL; 155 | } 156 | -------------------------------------------------------------------------------- /system/manager.cpp: -------------------------------------------------------------------------------- 1 | #include "manager.h" 2 | #include "row.h" 3 | #include "txn.h" 4 | #include "pthread.h" 5 | 6 | void Manager::init() { 7 | timestamp = (uint64_t *) _mm_malloc(sizeof(uint64_t), 64); 8 | *timestamp = 1; 9 | _last_min_ts_time = 0; 10 | _min_ts = 0; 11 | _epoch = (uint64_t *) _mm_malloc(sizeof(uint64_t), 64); 12 | _last_epoch_update_time = (ts_t *) _mm_malloc(sizeof(uint64_t), 64); 13 | *_epoch = 0; 14 | *_last_epoch_update_time = 0; 15 | all_ts = (ts_t volatile **) _mm_malloc(sizeof(ts_t *) * g_thread_cnt, 64); 16 | for (uint32_t i = 0; i < g_thread_cnt; i++) 17 | all_ts[i] = (ts_t *) _mm_malloc(sizeof(ts_t), 64); 18 | 19 | _all_txns = new txn_man * [g_thread_cnt]; 20 | for (UInt32 i = 0; i < g_thread_cnt; i++) { 21 | *all_ts[i] = UINT64_MAX; 22 | _all_txns[i] = NULL; 23 | } 24 | for (UInt32 i = 0; i < BUCKET_CNT; i++) 25 | pthread_mutex_init( &mutexes[i], NULL ); 26 | } 27 | 28 | uint64_t 29 | Manager::get_ts(uint64_t thread_id) { 30 | if (g_ts_batch_alloc) 31 | assert(g_ts_alloc == TS_CAS); 32 | uint64_t time; 33 | uint64_t starttime = get_sys_clock(); 34 | switch(g_ts_alloc) { 35 | case
TS_MUTEX : 36 | pthread_mutex_lock( &ts_mutex ); 37 | time = ++(*timestamp); 38 | pthread_mutex_unlock( &ts_mutex ); 39 | break; 40 | case TS_CAS : 41 | if (g_ts_batch_alloc) 42 | time = ATOM_FETCH_ADD((*timestamp), g_ts_batch_num); 43 | else 44 | time = ATOM_FETCH_ADD((*timestamp), 1); 45 | break; 46 | case TS_HW : 47 | #ifndef NOGRAPHITE 48 | time = CarbonGetTimestamp(); 49 | #else 50 | time = 0; 51 | assert(false); 52 | #endif 53 | break; 54 | case TS_CLOCK : 55 | time = get_sys_clock() * g_thread_cnt + thread_id; 56 | break; 57 | default : 58 | time = 0; assert(false); 59 | } 60 | INC_STATS(thread_id, time_ts_alloc, get_sys_clock() - starttime); 61 | return time; 62 | } 63 | 64 | uint64_t 65 | Manager::get_n_ts(int n) { 66 | uint64_t time = ATOM_ADD_FETCH((*timestamp), n); 67 | return time; 68 | } 69 | 70 | ts_t Manager::get_min_ts(uint64_t tid) { 71 | uint64_t now = get_sys_clock(); 72 | uint64_t last_time = _last_min_ts_time; 73 | if (tid == 0 && now - last_time > MIN_TS_INTVL) 74 | { 75 | ts_t min = UINT64_MAX; 76 | for (UInt32 i = 0; i < g_thread_cnt; i++) 77 | { // added curly braces by zhihan 78 | if (*all_ts[i] < min) 79 | min = *all_ts[i]; 80 | if (min > _min_ts) 81 | _min_ts = min; 82 | } 83 | } 84 | return _min_ts; 85 | } 86 | 87 | void Manager::add_ts(uint64_t thd_id, ts_t ts) { 88 | assert( ts >= *all_ts[thd_id] || 89 | *all_ts[thd_id] == UINT64_MAX); 90 | *all_ts[thd_id] = ts; 91 | } 92 | 93 | void Manager::set_txn_man(txn_man * txn) { 94 | int thd_id = txn->get_thd_id(); 95 | _all_txns[thd_id] = txn; 96 | } 97 | 98 | 99 | uint64_t Manager::hash(row_t * row) { 100 | uint64_t addr = (uint64_t)row / MEM_ALLIGN; 101 | return (addr * 1103515247 + 12345) % BUCKET_CNT; 102 | } 103 | 104 | void Manager::lock_row(row_t * row) { 105 | int bid = hash(row); 106 | pthread_mutex_lock( &mutexes[bid] ); 107 | } 108 | 109 | void Manager::release_row(row_t * row) { 110 | int bid = hash(row); 111 | pthread_mutex_unlock( &mutexes[bid] ); 112 | } 113 | 114 | void 115 
| Manager::update_epoch() 116 | { 117 | ts_t time = get_sys_clock(); 118 | if (time - *_last_epoch_update_time > LOG_BATCH_TIME * 1000 * 1000) { 119 | *_epoch = *_epoch + 1; 120 | *_last_epoch_update_time = time; 121 | } 122 | } 123 | -------------------------------------------------------------------------------- /system/manager.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class row_t; 6 | class txn_man; 7 | 8 | class Manager { 9 | public: 10 | void init(); 11 | // returns the next timestamp. 12 | ts_t get_ts(uint64_t thread_id); 13 | ts_t get_n_ts(int n); // book n timestamps 14 | 15 | // For MVCC. To calculate the min active ts in the system 16 | void add_ts(uint64_t thd_id, ts_t ts); 17 | ts_t get_min_ts(uint64_t tid = 0); 18 | 19 | // HACK! the following mutexes are used to model a centralized 20 | // lock/timestamp manager. 21 | void lock_row(row_t * row); 22 | void release_row(row_t * row); 23 | 24 | txn_man * get_txn_man(int thd_id) { return _all_txns[thd_id]; }; 25 | void set_txn_man(txn_man * txn); 26 | 27 | uint64_t get_epoch() { return *_epoch; }; 28 | void update_epoch(); 29 | private: 30 | // for SILO 31 | volatile uint64_t * _epoch; 32 | ts_t * _last_epoch_update_time; 33 | 34 | pthread_mutex_t ts_mutex; 35 | uint64_t * timestamp; 36 | pthread_mutex_t mutexes[BUCKET_CNT]; 37 | uint64_t hash(row_t * row); 38 | ts_t volatile * volatile * volatile all_ts; 39 | txn_man ** _all_txns; 40 | // for MVCC 41 | volatile ts_t _last_min_ts_time; 42 | ts_t _min_ts; 43 | }; 44 | -------------------------------------------------------------------------------- /system/mcs_spinlock.h: -------------------------------------------------------------------------------- 1 | // 2 | // Implemented based on MCS_lock in IC3 3 | // Modified based on http://libfbp.blogspot.com/2018/01/c-mellor-crummey-scott-mcs-lock.html 4 | // 5 | 6 | #ifndef _MCS_SPINLOCK 7 | #define _MCS_SPINLOCK 8 | 9 
| #include <atomic> 10 | #include "amd64.h" 11 | 12 | class mcslock { 13 | 14 | public: 15 | mcslock(): tail(nullptr) {}; 16 | 17 | struct mcs_node { 18 | volatile bool locked; 19 | uint8_t pad0[64 - sizeof(bool)]; 20 | // padding to separate next and locked into two cache lines 21 | volatile mcs_node* volatile next; 22 | uint8_t pad1[64 - sizeof(mcs_node *)]; 23 | mcs_node(): locked(true), next(nullptr) {} 24 | }; 25 | 26 | void acquire(mcs_node * me) { 27 | auto prior_node = tail.exchange(me, std::memory_order_acquire); 28 | // Any one there? 29 | if (prior_node != nullptr) { 30 | // memory_barrier(); 31 | // Someone there, need to link in 32 | me->locked = true; 33 | prior_node->next = me; 34 | // Make sure we do the above setting of next. 35 | // memory_barrier(); 36 | // Spin on my spin variable 37 | while (me->locked){ 38 | // memory_barrier(); 39 | nop_pause(); 40 | } 41 | assert(!me->locked); 42 | } 43 | }; 44 | 45 | void release(mcs_node * me) { 46 | if (me->next == nullptr) { 47 | mcs_node * expected = me; 48 | // No successor yet 49 | if (tail.compare_exchange_strong(expected, nullptr, 50 | std::memory_order_release, 51 | std::memory_order_relaxed)) { 52 | return; 53 | } 54 | // otherwise, another thread is in the process of trying to 55 | // acquire the lock, so spin waiting for it to finish 56 | while (me->next == nullptr) {}; 57 | } 58 | // memory_barrier(); 59 | // Unlock next one 60 | me->next->locked = false; 61 | me->next = nullptr; 62 | }; 63 | 64 | private: 65 | std::atomic<mcs_node *> tail; 66 | }; 67 | 68 | #endif 69 | 70 | -------------------------------------------------------------------------------- /system/mem_alloc.cpp: -------------------------------------------------------------------------------- 1 | #include "mem_alloc.h" 2 | #include "helper.h" 3 | #include "global.h" 4 | 5 | // Assume the data is strided across the L2 slices, stride granularity 6 | // is the size of a page 7 | void mem_alloc::init(uint64_t part_cnt, uint64_t bytes_per_part) { 8 | if 
(g_thread_cnt < g_init_parallelism) 9 | _bucket_cnt = g_init_parallelism * 4 + 1; 10 | else 11 | _bucket_cnt = g_thread_cnt * 4 + 1; 12 | pid_arena = new std::pair<pthread_t, int>[_bucket_cnt]; 13 | for (int i = 0; i < _bucket_cnt; i ++) 14 | pid_arena[i] = std::make_pair(0, 0); 15 | 16 | if (THREAD_ALLOC) { 17 | assert( !g_part_alloc ); 18 | init_thread_arena(); 19 | } 20 | } 21 | 22 | void 23 | Arena::init(int arena_id, int size) { 24 | _buffer = NULL; 25 | _arena_id = arena_id; 26 | _size_in_buffer = 0; 27 | _head = NULL; 28 | _block_size = size; 29 | } 30 | 31 | void * 32 | Arena::alloc() { 33 | FreeBlock * block; 34 | if (_head == NULL) { 35 | // not in the list. allocate from the buffer 36 | int size = (_block_size + sizeof(FreeBlock) + (MEM_ALLIGN - 1)) & ~(MEM_ALLIGN-1); 37 | if (_size_in_buffer < size) { 38 | _buffer = (char *) malloc(_block_size * 40960); 39 | _size_in_buffer = _block_size * 40960; // * 8; 40 | } 41 | block = (FreeBlock *)_buffer; 42 | block->size = _block_size; 43 | _size_in_buffer -= size; 44 | _buffer = _buffer + size; 45 | } else { 46 | block = _head; 47 | _head = _head->next; 48 | } 49 | return (void *) ((char *)block + sizeof(FreeBlock)); 50 | } 51 | 52 | void 53 | Arena::free(void * ptr) { 54 | FreeBlock * block = (FreeBlock *)((UInt64)ptr - sizeof(FreeBlock)); 55 | block->next = _head; 56 | _head = block; 57 | } 58 | 59 | void mem_alloc::init_thread_arena() { 60 | UInt32 buf_cnt = g_thread_cnt; 61 | if (buf_cnt < g_init_parallelism) 62 | buf_cnt = g_init_parallelism; 63 | _arenas = new Arena * [buf_cnt]; 64 | for (UInt32 i = 0; i < buf_cnt; i++) { 65 | _arenas[i] = new Arena[SizeNum]; 66 | for (int n = 0; n < SizeNum; n++) { 67 | assert(sizeof(Arena) == 128); 68 | _arenas[i][n].init(i, BlockSizes[n]); 69 | } 70 | } 71 | } 72 | 73 | void mem_alloc::register_thread(int thd_id) { 74 | if (THREAD_ALLOC) { 75 | pthread_mutex_lock( &map_lock ); 76 | pthread_t pid = pthread_self(); 77 | int entry = pid % _bucket_cnt; 78 | while (pid_arena[ entry ].first 
!= 0) { 79 | printf("conflict at entry %d (pid=%ld)\n", entry, pid); 80 | entry = (entry + 1) % _bucket_cnt; 81 | } 82 | pid_arena[ entry ].first = pid; 83 | pid_arena[ entry ].second = thd_id; 84 | pthread_mutex_unlock( &map_lock ); 85 | } 86 | } 87 | 88 | void mem_alloc::unregister() { 89 | if (THREAD_ALLOC) { 90 | pthread_mutex_lock( &map_lock ); 91 | for (int i = 0; i < _bucket_cnt; i ++) { 92 | pid_arena[i].first = 0; 93 | pid_arena[i].second = 0; 94 | } 95 | pthread_mutex_unlock( &map_lock ); 96 | } 97 | } 98 | 99 | int 100 | mem_alloc::get_arena_id() { 101 | int arena_id; 102 | #if NOGRAPHITE 103 | pthread_t pid = pthread_self(); 104 | int entry = pid % _bucket_cnt; 105 | while (pid_arena[entry].first != pid) { 106 | if (pid_arena[entry].first == 0) 107 | break; 108 | entry = (entry + 1) % _bucket_cnt; 109 | } 110 | arena_id = pid_arena[entry].second; 111 | #else 112 | arena_id = CarbonGetTileId(); 113 | #endif 114 | return arena_id; 115 | } 116 | 117 | int 118 | mem_alloc::get_size_id(UInt32 size) { 119 | for (int i = 0; i < SizeNum; i++) { 120 | if (size <= BlockSizes[i]) 121 | return i; 122 | } 123 | printf("size = %u\n", size); 124 | assert( false ); 125 | return 0; 126 | } 127 | 128 | 129 | void mem_alloc::free(void * ptr, uint64_t size) { 130 | if (NO_FREE) {} 131 | else if (THREAD_ALLOC) { 132 | int arena_id = get_arena_id(); 133 | FreeBlock * block = (FreeBlock *)((UInt64)ptr - sizeof(FreeBlock)); 134 | int size = block->size; 135 | int size_id = get_size_id(size); 136 | _arenas[arena_id][size_id].free(ptr); 137 | } else { 138 | std::free(ptr); 139 | } 140 | } 141 | 142 | //TODO the program should not access more than a PAGE 143 | // to guarantee correctness 144 | // lock is used for consistency (multiple threads may alloc simultaneously and 145 | // cause trouble) 146 | void * mem_alloc::alloc(uint64_t size, uint64_t part_id) { 147 | void * ptr; 148 | if (size > BlockSizes[SizeNum - 1]) 149 | ptr = malloc(size); 150 | else if (THREAD_ALLOC && 
(warmup_finish || enable_thread_mem_pool)) { 151 | int arena_id = get_arena_id(); 152 | int size_id = get_size_id(size); 153 | ptr = _arenas[arena_id][size_id].alloc(); 154 | } else { 155 | ptr = malloc(size); 156 | } 157 | return ptr; 158 | } 159 | 160 | 161 | -------------------------------------------------------------------------------- /system/mem_alloc.h: -------------------------------------------------------------------------------- 1 | #ifndef _MEM_ALLOC_H_ 2 | #define _MEM_ALLOC_H_ 3 | 4 | #include "global.h" 5 | #include <utility> 6 | 7 | const int SizeNum = 4; 8 | const UInt32 BlockSizes[] = {32, 64, 256, 1024}; 9 | 10 | typedef struct free_block { 11 | int size; 12 | struct free_block* next; 13 | } FreeBlock; 14 | 15 | class Arena { 16 | public: 17 | void init(int arena_id, int size); 18 | void * alloc(); 19 | void free(void * ptr); 20 | private: 21 | char * _buffer; 22 | int _size_in_buffer; 23 | int _arena_id; 24 | int _block_size; 25 | FreeBlock * _head; 26 | char _pad[128 - sizeof(int)*3 - sizeof(void *)*2 - 8]; 27 | }; 28 | 29 | class mem_alloc { 30 | public: 31 | void init(uint64_t part_cnt, uint64_t bytes_per_part); 32 | void register_thread(int thd_id); 33 | void unregister(); 34 | void * alloc(uint64_t size, uint64_t part_id); 35 | void free(void * block, uint64_t size); 36 | int get_arena_id(); 37 | private: 38 | void init_thread_arena(); 39 | int get_size_id(UInt32 size); 40 | 41 | // each thread has several arenas for different block sizes 42 | Arena ** _arenas; 43 | int _bucket_cnt; 44 | std::pair<pthread_t, int> * pid_arena; // max_arena_id; 45 | pthread_mutex_t map_lock; // only used for pid_to_arena update 46 | }; 47 | 48 | #endif 49 | -------------------------------------------------------------------------------- /system/parser.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | 4 | void print_usage() { 5 | printf("[usage]:\n"); 6 | printf("\t-pINT ; PART_CNT\n"); 7 | 
printf("\t-vINT ; VIRTUAL_PART_CNT\n"); 8 | printf("\t-tINT ; THREAD_CNT\n"); 9 | printf("\t-qINT ; QUERY_INTVL\n"); 10 | printf("\t-dINT ; PRT_LAT_DISTR\n"); 11 | printf("\t-aINT ; PART_ALLOC (0 or 1)\n"); 12 | printf("\t-mINT ; MEM_PAD (0 or 1)\n"); 13 | printf("\t-GaINT ; ABORT_PENALTY (in ms)\n"); 14 | printf("\t-GcINT ; CENTRAL_MAN\n"); 15 | printf("\t-GtINT ; TS_ALLOC\n"); 16 | printf("\t-GkINT ; KEY_ORDER\n"); 17 | printf("\t-GnINT ; NO_DL\n"); 18 | printf("\t-GoINT ; TIMEOUT\n"); 19 | printf("\t-GlINT ; DL_LOOP_DETECT\n"); 20 | 21 | printf("\t-GbINT ; TS_BATCH_ALLOC\n"); 22 | printf("\t-GuINT ; TS_BATCH_NUM\n"); 23 | 24 | printf("\t-o STRING ; output file\n\n"); 25 | printf(" [YCSB]:\n"); 26 | printf("\t-cINT ; PART_PER_TXN\n"); 27 | printf("\t-eINT ; PERC_MULTI_PART\n"); 28 | printf("\t-rFLOAT ; READ_PERC\n"); 29 | printf("\t-wFLOAT ; WRITE_PERC\n"); 30 | printf("\t-zFLOAT ; ZIPF_THETA\n"); 31 | printf("\t-sINT ; SYNTH_TABLE_SIZE\n"); 32 | printf("\t-RINT ; REQ_PER_QUERY\n"); 33 | printf("\t-fINT ; FIELD_PER_TUPLE\n"); 34 | printf(" [TPCC]:\n"); 35 | printf("\t-nINT ; NUM_WH\n"); 36 | printf("\t-TpFLOAT ; PERC_PAYMENT\n"); 37 | printf("\t-TuINT ; WH_UPDATE\n"); 38 | printf(" [TEST]:\n"); 39 | printf("\t-Ar ; Test READ_WRITE\n"); 40 | printf("\t-Ac ; Test CONFLICT\n"); 41 | } 42 | 43 | void parser(int argc, char * argv[]) { 44 | g_params["abort_buffer_enable"] = ABORT_BUFFER_ENABLE? 
"true" : "false"; 45 | g_params["write_copy_form"] = WRITE_COPY_FORM; 46 | g_params["validation_lock"] = VALIDATION_LOCK; 47 | g_params["pre_abort"] = PRE_ABORT; 48 | g_params["atomic_timestamp"] = ATOMIC_TIMESTAMP; 49 | 50 | for (int i = 1; i < argc; i++) { 51 | assert(argv[i][0] == '-'); 52 | if (argv[i][1] == 'a') 53 | g_part_alloc = atoi( &argv[i][2] ); 54 | else if (argv[i][1] == 'm') 55 | g_mem_pad = atoi( &argv[i][2] ); 56 | else if (argv[i][1] == 'q') 57 | g_query_intvl = atoi( &argv[i][2] ); 58 | else if (argv[i][1] == 'c') 59 | g_part_per_txn = atoi( &argv[i][2] ); 60 | else if (argv[i][1] == 'e') 61 | g_perc_multi_part = atof( &argv[i][2] ); 62 | else if (argv[i][1] == 'r') 63 | g_read_perc = atof( &argv[i][2] ); 64 | else if (argv[i][1] == 'w') 65 | g_write_perc = atof( &argv[i][2] ); 66 | else if (argv[i][1] == 'z') 67 | g_zipf_theta = atof( &argv[i][2] ); 68 | else if (argv[i][1] == 'd') 69 | g_prt_lat_distr = atoi( &argv[i][2] ); 70 | else if (argv[i][1] == 'p') 71 | g_part_cnt = atoi( &argv[i][2] ); 72 | else if (argv[i][1] == 'v') 73 | g_virtual_part_cnt = atoi( &argv[i][2] ); 74 | else if (argv[i][1] == 't') 75 | g_thread_cnt = atoi( &argv[i][2] ); 76 | else if (argv[i][1] == 's') 77 | g_synth_table_size = atoi( &argv[i][2] ); 78 | else if (argv[i][1] == 'R') 79 | g_req_per_query = atoi( &argv[i][2] ); 80 | else if (argv[i][1] == 'f') 81 | g_field_per_tuple = atoi( &argv[i][2] ); 82 | else if (argv[i][1] == 'n') 83 | g_num_wh = atoi( &argv[i][2] ); 84 | else if (argv[i][1] == 'G') { 85 | if (argv[i][2] == 'a') 86 | g_abort_penalty = atoi( &argv[i][3] ); 87 | else if (argv[i][2] == 'c') 88 | g_central_man = atoi( &argv[i][3] ); 89 | else if (argv[i][2] == 't') 90 | g_ts_alloc = atoi( &argv[i][3] ); 91 | else if (argv[i][2] == 'k') 92 | g_key_order = atoi( &argv[i][3] ); 93 | else if (argv[i][2] == 'n') 94 | g_no_dl = atoi( &argv[i][3] ); 95 | else if (argv[i][2] == 'o') 96 | g_timeout = atol( &argv[i][3] ); 97 | else if (argv[i][2] == 'l') 98 | 
g_dl_loop_detect = atoi( &argv[i][3] ); 99 | else if (argv[i][2] == 'b') 100 | g_ts_batch_alloc = atoi( &argv[i][3] ); 101 | else if (argv[i][2] == 'u') 102 | g_ts_batch_num = atoi( &argv[i][3] ); 103 | } else if (argv[i][1] == 'T') { 104 | if (argv[i][2] == 'p') 105 | g_perc_payment = atof( &argv[i][3] ); 106 | if (argv[i][2] == 'u') 107 | g_wh_update = atoi( &argv[i][3] ); 108 | } else if (argv[i][1] == 'A') { 109 | if (argv[i][2] == 'r') 110 | g_test_case = READ_WRITE; 111 | if (argv[i][2] == 'c') 112 | g_test_case = CONFLICT; 113 | } 114 | else if (argv[i][1] == 'o') { 115 | i++; 116 | output_file = argv[i]; 117 | } 118 | else if (argv[i][1] == 'h') { 119 | print_usage(); 120 | exit(0); 121 | } 122 | else if (argv[i][1] == '-') { 123 | string line(&argv[i][2]); 124 | size_t pos = line.find("="); 125 | assert(pos != string::npos); 126 | string name = line.substr(0, pos); 127 | string value = line.substr(pos + 1, line.length()); 128 | assert(g_params.find(name) != g_params.end()); 129 | g_params[name] = value; 130 | } 131 | else 132 | assert(false); 133 | } 134 | if (g_thread_cnt < g_init_parallelism) 135 | g_init_parallelism = g_thread_cnt; 136 | } 137 | -------------------------------------------------------------------------------- /system/query.cpp: -------------------------------------------------------------------------------- 1 | #include <sched.h> 2 | #include "query.h" 3 | #include "mem_alloc.h" 4 | #include "wl.h" 5 | #include "table.h" 6 | #include "ycsb_query.h" 7 | #include "tpcc_query.h" 8 | #include "tpcc_helper.h" 9 | 10 | thread_local drand48_data per_thread_rand_buf; 11 | 12 | /*************************************************/ 13 | // class Query_queue 14 | /*************************************************/ 15 | int Query_queue::_next_tid; 16 | 17 | void 18 | Query_queue::init(workload * h_wl) { 19 | all_queries = new Query_thd * [g_thread_cnt]; 20 | _wl = h_wl; 21 | _next_tid = 0; 22 | 23 | 24 | #if WORKLOAD == YCSB 25 | ycsb_query::calculateDenom(); 
26 | #elif WORKLOAD == TPCC 27 | assert(tpcc_buffer != NULL); 28 | #endif 29 | int64_t begin = get_server_clock(); 30 | pthread_t p_thds[g_thread_cnt - 1]; 31 | for (UInt32 i = 0; i < g_thread_cnt - 1; i++) { 32 | pthread_create(&p_thds[i], NULL, threadInitQuery, this); 33 | } 34 | threadInitQuery(this); 35 | for (uint32_t i = 0; i < g_thread_cnt - 1; i++) 36 | pthread_join(p_thds[i], NULL); 37 | int64_t end = get_server_clock(); 38 | printf("Query Queue Init Time %f\n", 1.0 * (end - begin) / 1000000000UL); 39 | } 40 | 41 | void 42 | Query_queue::init_per_thread(int thread_id) { 43 | all_queries[thread_id] = (Query_thd *) _mm_malloc(sizeof(Query_thd), 64); 44 | all_queries[thread_id]->init(_wl, thread_id); 45 | } 46 | 47 | base_query * 48 | Query_queue::get_next_query(uint64_t thd_id) { 49 | base_query * query = all_queries[thd_id]->get_next_query(); 50 | return query; 51 | } 52 | 53 | void * 54 | Query_queue::threadInitQuery(void * This) { 55 | Query_queue * query_queue = (Query_queue *)This; 56 | uint32_t tid = ATOM_FETCH_ADD(_next_tid, 1); 57 | 58 | // set cpu affinity 59 | set_affinity(tid); 60 | 61 | query_queue->init_per_thread(tid); 62 | return NULL; 63 | } 64 | 65 | /*************************************************/ 66 | // class Query_thd 67 | /*************************************************/ 68 | 69 | 70 | void 71 | Query_thd::init(workload * h_wl, int thread_id) { 72 | q_idx = 0; 73 | #if TPCC_USER_ABORT 74 | request_cnt = WARMUP / g_thread_cnt + MAX_TXN_PER_PART + 100 + 75 | MAX_TXN_PER_PART / 100; 76 | #else 77 | request_cnt = WARMUP / g_thread_cnt + MAX_TXN_PER_PART + 100; 78 | #endif 79 | #if ABORT_BUFFER_ENABLE 80 | request_cnt += ABORT_BUFFER_SIZE; 81 | #endif 82 | #if WORKLOAD == YCSB 83 | queries = (ycsb_query *) 84 | mem_allocator.alloc(sizeof(ycsb_query) * request_cnt, thread_id); 85 | srand48_r(thread_id + 1, &buffer); 86 | // XXX(zhihan): create a pre-allocated space for long txn 87 | if (g_long_txn_ratio > 0) { 88 | long_txn = 
(ycsb_request *) 89 | mem_allocator.alloc(sizeof(ycsb_request) * MAX_ROW_PER_TXN, 90 | thread_id); 91 | long_txn_part = (uint64_t *) 92 | mem_allocator.alloc(sizeof(uint64_t) * g_part_per_txn, thread_id); 93 | } 94 | #elif WORKLOAD == TPCC 95 | queries = (tpcc_query *) _mm_malloc(sizeof(tpcc_query) * request_cnt, 64); 96 | #endif 97 | for (UInt32 qid = 0; qid < request_cnt; qid ++) { 98 | #if WORKLOAD == YCSB 99 | new(&queries[qid]) ycsb_query(); 100 | queries[qid].init(thread_id, h_wl, this); 101 | #elif WORKLOAD == TPCC 102 | new(&queries[qid]) tpcc_query(); 103 | queries[qid].init(thread_id, h_wl); 104 | #endif 105 | } 106 | } 107 | 108 | 109 | base_query * 110 | Query_thd::get_next_query() { 111 | if (q_idx >= request_cnt-1) { 112 | q_idx = 0; 113 | assert(q_idx < request_cnt); 114 | //printf("WARNING: run out of queries, increase txn cnt per part!\n"); 115 | //return NULL; 116 | } 117 | base_query * query = &queries[q_idx++]; 118 | return query; 119 | } 120 | -------------------------------------------------------------------------------- /system/query.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class workload; 6 | class ycsb_query; 7 | class tpcc_query; 8 | class ycsb_request; 9 | 10 | extern thread_local drand48_data per_thread_rand_buf; 11 | 12 | class base_query { 13 | public: 14 | virtual void init(uint64_t thd_id, workload * h_wl) = 0; 15 | uint64_t waiting_time; 16 | uint64_t part_num; 17 | uint64_t * part_to_access; 18 | bool rerun; 19 | uint32_t num_abort = 0; 20 | uint32_t prio = 0; 21 | uint32_t max_prio = LOW_PRIO_BOUND; 22 | // note prio may be overwritten by subclass to support more complicated 23 | // priority distribution, e.g. long-running txn 24 | }; 25 | 26 | // All the queries for a particular thread. 
27 | class Query_thd { 28 | public: 29 | void init(workload * h_wl, int thread_id); 30 | base_query * get_next_query(); 31 | uint64_t q_idx; 32 | #if WORKLOAD == YCSB 33 | ycsb_query * queries; 34 | ycsb_request * long_txn; 35 | uint64_t * long_txn_part; 36 | #else 37 | tpcc_query * queries; 38 | #endif 39 | char pad[CL_SIZE - sizeof(void *) - sizeof(int)]; 40 | drand48_data buffer; 41 | uint64_t request_cnt; 42 | }; 43 | 44 | // TODO we assume a separate task queue for each thread in order to avoid 45 | // contention in a centralized query queue. In reality, a more sophisticated 46 | // queue model might be implemented. 47 | class Query_queue { 48 | public: 49 | void init(workload * h_wl); 50 | void init_per_thread(int thread_id); 51 | base_query * get_next_query(uint64_t thd_id); 52 | 53 | private: 54 | static void * threadInitQuery(void * This); 55 | 56 | Query_thd ** all_queries; 57 | workload * _wl; 58 | static int _next_tid; 59 | }; 60 | -------------------------------------------------------------------------------- /system/thread.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class workload; 6 | class base_query; 7 | class BatchMgr; 8 | 9 | class thread_t { 10 | public: 11 | uint64_t _thd_id; 12 | workload * _wl; 13 | 14 | #if CC_ALG == ARIA 15 | BatchMgr* batch_mgr; 16 | #endif 17 | 18 | constexpr uint64_t get_thd_id() { return _thd_id; } 19 | constexpr uint64_t get_host_cid() { return _host_cid; } 20 | void set_host_cid(uint64_t cid) { _host_cid = cid; } 21 | constexpr uint64_t get_cur_cid() { return _cur_cid; } 22 | void set_cur_cid(uint64_t cid) { _cur_cid = cid; } 23 | 24 | void init(uint64_t thd_id, workload * workload); 25 | // the following function must be in the form void* (*)(void*) 26 | // to run with pthread. 27 | // conversion is done within the function. 
28 | RC run(); 29 | 30 | // moved from private to global for clv 31 | ts_t get_next_ts(); 32 | ts_t get_next_n_ts(int n); 33 | 34 | 35 | private: 36 | uint64_t _host_cid; 37 | uint64_t _cur_cid; 38 | ts_t _curr_ts; 39 | 40 | RC runTest(txn_man * txn); 41 | drand48_data buffer; 42 | 43 | // added for wound wait 44 | base_query * curr_query; 45 | ts_t starttime; 46 | 47 | // A restart buffer for aborted txns. 48 | struct AbortBufferEntry { 49 | ts_t abort_time; // abort_time + penalty == ready_time 50 | ts_t ready_time; 51 | base_query * query; 52 | ts_t starttime; 53 | uint64_t exec_time_abort; // exec time that eventually aborts 54 | uint64_t backoff_time; // accumulated backoff time 55 | }; 56 | AbortBufferEntry * _abort_buffer; 57 | int _abort_buffer_size; 58 | int _abort_buffer_empty_slots; 59 | bool _abort_buffer_enable; 60 | }; 61 | -------------------------------------------------------------------------------- /system/wl.cpp: -------------------------------------------------------------------------------- 1 | #include "global.h" 2 | #include "helper.h" 3 | #include "wl.h" 4 | #include "row.h" 5 | #include "table.h" 6 | #include "index_hash.h" 7 | #include "index_btree.h" 8 | #include "catalog.h" 9 | #include "mem_alloc.h" 10 | 11 | RC workload::init() { 12 | sim_done.store(false, std::memory_order_release); 13 | return RCOK; 14 | } 15 | 16 | RC workload::init_schema(string schema_file) { 17 | assert(sizeof(uint64_t) == 8); 18 | assert(sizeof(double) == 8); 19 | string line; 20 | ifstream fin(schema_file); 21 | Catalog * schema; 22 | while (getline(fin, line)) { 23 | if (line.compare(0, 6, "TABLE=") == 0) { 24 | string tname; 25 | tname = &line[6]; 26 | schema = (Catalog *) _mm_malloc(sizeof(Catalog), CL_SIZE); 27 | getline(fin, line); 28 | int col_count = 0; 29 | // Read all fields for this table. 
30 | vector<string> lines; 31 | while (line.length() > 1) { 32 | lines.push_back(line); 33 | getline(fin, line); 34 | } 35 | schema->init( tname.c_str(), lines.size() ); 36 | for (UInt32 i = 0; i < lines.size(); i++) { 37 | string line = lines[i]; 38 | size_t pos = 0; 39 | string token; 40 | int elem_num = 0; 41 | int size = 0; 42 | string type; 43 | string name; 44 | while (line.length() != 0) { 45 | pos = line.find(","); 46 | if (pos == string::npos) 47 | pos = line.length(); 48 | token = line.substr(0, pos); 49 | line.erase(0, pos + 1); 50 | switch (elem_num) { 51 | case 0: size = atoi(token.c_str()); break; 52 | case 1: type = token; break; 53 | case 2: name = token; break; 54 | default: assert(false); 55 | } 56 | elem_num ++; 57 | } 58 | assert(elem_num == 3); 59 | schema->add_col((char *)name.c_str(), size, (char *)type.c_str()); 60 | col_count ++; 61 | } 62 | table_t * cur_tab = (table_t *) _mm_malloc(sizeof(table_t), CL_SIZE); 63 | cur_tab->init(schema); 64 | tables[tname] = cur_tab; 65 | } else if (!line.compare(0, 6, "INDEX=")) { 66 | string iname; 67 | iname = &line[6]; 68 | getline(fin, line); 69 | 70 | vector<string> items; 71 | string token; 72 | size_t pos; 73 | while (line.length() != 0) { 74 | pos = line.find(","); 75 | if (pos == string::npos) 76 | pos = line.length(); 77 | token = line.substr(0, pos); 78 | items.push_back(token); 79 | line.erase(0, pos + 1); 80 | } 81 | 82 | string tname(items[0]); 83 | INDEX * index = (INDEX *) _mm_malloc(sizeof(INDEX), 64); 84 | new(index) INDEX(); 85 | int part_cnt = (CENTRAL_INDEX)? 
1 : g_part_cnt; 86 | if (tname == "ITEM") 87 | part_cnt = 1; 88 | #if INDEX_STRUCT == IDX_HASH 89 | #if WORKLOAD == YCSB 90 | index->init(part_cnt, tables[tname], g_synth_table_size * 2); 91 | #elif WORKLOAD == TPCC 92 | assert(tables[tname] != NULL); 93 | index->init(part_cnt, tables[tname], stoi( items[1] ) * part_cnt); 94 | #endif 95 | #else 96 | index->init(part_cnt, tables[tname]); 97 | #endif 98 | indexes[iname] = index; 99 | } 100 | } 101 | fin.close(); 102 | return RCOK; 103 | } 104 | 105 | 106 | 107 | void workload::index_insert(string index_name, uint64_t key, row_t * row) { 108 | assert(false); 109 | INDEX * index = (INDEX *) indexes[index_name]; 110 | index_insert(index, key, row); 111 | } 112 | 113 | void workload::index_insert(INDEX * index, uint64_t key, row_t * row, int64_t part_id) { 114 | uint64_t pid = part_id; 115 | if (part_id == -1) 116 | pid = get_part_id(row); 117 | itemid_t * m_item = 118 | (itemid_t *) mem_allocator.alloc( sizeof(itemid_t), pid ); 119 | m_item->init(); 120 | m_item->type = DT_row; 121 | m_item->location = row; 122 | m_item->valid = true; 123 | #ifdef NDEBUG 124 | index->index_insert(key, m_item, pid); 125 | #else 126 | assert( index->index_insert(key, m_item, pid) == RCOK ); 127 | #endif 128 | } 129 | 130 | SC_PIECE * workload::get_cedges(TPCCTxnType type, int idx) { 131 | assert(false); 132 | return NULL; 133 | } 134 | 135 | 136 | -------------------------------------------------------------------------------- /system/wl.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "global.h" 4 | 5 | class row_t; 6 | class table_t; 7 | class IndexHash; 8 | class index_btree; 9 | class Catalog; 10 | class lock_man; 11 | class txn_man; 12 | class thread_t; 13 | class index_base; 14 | class Timestamp; 15 | class Mvcc; 16 | 17 | struct SC_PIECE { 18 | int txn_type; 19 | int piece_id; 20 | }; 21 | 22 | // this is the base class for all workload 23 | class workload 24 | { 
25 | public: 26 | // tables indexed by table name 27 | map<string, table_t *> tables; 28 | map<string, INDEX *> indexes; 29 | 30 | 31 | // initialize the tables and indexes. 32 | virtual RC init(); 33 | virtual RC init_schema(string schema_file); 34 | virtual RC init_table()=0; 35 | virtual RC get_txn_man(txn_man *& txn_manager, thread_t * h_thd)=0; 36 | 37 | // ic3 helpers 38 | virtual SC_PIECE * get_cedges(TPCCTxnType txn_type, int piece_id); 39 | 40 | std::atomic_bool sim_done; 41 | protected: 42 | void index_insert(string index_name, uint64_t key, row_t * row); 43 | void index_insert(INDEX * index, uint64_t key, row_t * row, int64_t part_id = -1); 44 | }; 45 | 46 | -------------------------------------------------------------------------------- /test.py: -------------------------------------------------------------------------------- 1 | import os, sys, re, os.path 2 | import platform 3 | import subprocess, datetime, time, signal, json 4 | 5 | 6 | dbms_cfg = ["config-std.h", "config.h"] 7 | 8 | def replace(filename, pattern, replacement): 9 | f = open(filename) 10 | s = f.read() 11 | f.close() 12 | s = re.sub(pattern,replacement,s) 13 | f = open(filename,'w') 14 | f.write(s) 15 | f.close() 16 | 17 | def set_ndebug(ndebug): 18 | f = open("system/global.h", "r") 19 | found = None 20 | set_ndebug = False 21 | for line in f: 22 | if "#define NDEBUG" in line: 23 | found = line 24 | if line[0] != '#': 25 | set_ndebug = True 26 | f.close() 27 | if found is None: 28 | if ndebug: 29 | # no existing NDEBUG line: define it just before the assert header 30 | replace("system/global.h", "#include <cassert>", 31 | "#define NDEBUG\n#include <cassert>") 32 | else: 33 | if ndebug: 34 | replace("system/global.h", found, "#define NDEBUG\n") 35 | else: 36 | replace("system/global.h", found, "") 37 | 38 | def compile(job): 39 | os.system("cp {} {}".format(dbms_cfg[0], dbms_cfg[1])) 40 | # define workload 41 | for param, value in job.items(): 42 | pattern = r"\#define\s*" + re.escape(param) + r'[\t ].*' 43 | replacement = "#define " + param + ' ' + str(value) 44 | replace(dbms_cfg[1], pattern, replacement) 45 | os.system("make clean > temp.out 2>&1") 46 | ret = os.system("make -j8 > temp.out 
2>&1") 47 | if ret != 0: 48 | print("ERROR in compiling, output saved in temp.out", file=sys.stderr) 49 | exit(1) 50 | else: 51 | os.system("rm -f temp.out") 52 | 53 | def run(test = '', job=None, numa=True): 54 | app_flags = "" 55 | if test == 'read_write': 56 | app_flags = "-Ar -t1" 57 | if test == 'conflict': 58 | app_flags = "-Ac -t4" 59 | if numa: 60 | os.system("numactl --interleave all ./rundb %s | tee temp.out" % app_flags) 61 | else: 62 | os.system("./rundb %s | tee temp.out" % app_flags) 63 | 64 | def eval_arg(job, arg): 65 | return ((arg in job) and (job[arg] == "true")) 66 | 67 | def parse_output(job): 68 | output = open("temp.out") 69 | success = False 70 | for line in output: 71 | line = line.strip() 72 | if "[summary]" in line: 73 | success = True 74 | for token in line.strip().split('[summary]')[-1].split(','): 75 | key, val = token.strip().split('=') 76 | job[key] = val 77 | break 78 | if success: 79 | output.close() 80 | os.system("rm -f temp.out") 81 | return job 82 | errlog = open("log/{}.log".format(datetime.datetime.now().strftime("%b-%d_%H-%M-%S-%f")), 'a+') 83 | errlog.write("{}\n".format(json.dumps(job))) 84 | output = open("temp.out") 85 | for line in output: 86 | errlog.write(line) 87 | errlog.close() 88 | output.close() 89 | os.system("rm -f temp.out") 90 | return job 91 | 92 | if __name__ == "__main__": 93 | print("usage: path/to/json [more args]", file=sys.stderr) 94 | fname = sys.argv[1] if len(sys.argv) > 1 else "" 95 | idx = 2 96 | if ".json" not in fname: 97 | fname = "experiments/default.json" 98 | idx = 1 99 | print("- read config from file: {}".format(fname), file=sys.stderr) 100 | job = json.load(open(fname)) 101 | if len(sys.argv) > idx: 102 | # has more args / overwrite existing args 103 | for item in sys.argv[idx:]: 104 | key, value = item.split("=", 1) 105 | job[key] = value 106 | if not eval_arg(job, "EXEC_ONLY"): 107 | print("- compiling...", file=sys.stderr) 108 | ndebug = eval_arg(job, "NDEBUG") 109 | set_ndebug(ndebug) 110 | if ndebug: 111 | 
print("- disable assert()", file=sys.stderr) 112 | compile(job) 113 | numa = eval_arg(job, "UNSET_NUMA") == False 114 | if not numa: 115 | print("- disable interleaving allocation across numa nodes", file=sys.stderr) 116 | if not eval_arg(job, "COMPILE_ONLY"): 117 | print("- executing...", file=sys.stderr) 118 | run("", job, numa=numa) 119 | if eval_arg(job, "OUTPUT_TO_FILE"): 120 | job = parse_output(job) 121 | stats = open("outputs/stats.json", "a+") 122 | stats.write(json.dumps(job)+"\n") 123 | stats.close() 124 | --------------------------------------------------------------------------------
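
The `compile()` function in `test.py` configures a build by regex-rewriting `#define` lines in `config.h` from the job dictionary. A minimal standalone sketch of that substitution step, using the same pattern as `test.py` (the config fragment and job values below are made up for illustration; the real `config.h` is copied from `config-std.h`):

```python
import re

# Stand-in for a config.h fragment (illustrative, not the real file).
config = (
    "#define WORKLOAD YCSB\n"
    "#define THREAD_CNT 4\n"
    "#define CC_ALG SILO\n"
)
# Hypothetical overrides, as they would come from an experiments/*.json job.
job = {"THREAD_CNT": 64, "CC_ALG": "SILO_PRIO"}

# Same rewrite as test.py's compile(): match the whole
# "#define PARAM ..." line and replace it with the job's value.
for param, value in job.items():
    pattern = r"\#define\s*" + re.escape(param) + r"[\t ].*"
    replacement = "#define " + param + " " + str(value)
    config = re.sub(pattern, replacement, config)

print(config)
```

Because the pattern consumes the rest of the line (`.*`), any existing value and trailing comment on a matched `#define` line are dropped; parameters absent from the job dict are left untouched.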