├── GraphLite-0.20 ├── Input │ ├── facebookcombined │ ├── facebookcombined_4w_1 │ ├── facebookcombined_4w_2 │ ├── facebookcombined_4w_3 │ ├── facebookcombined_4w_4 │ ├── tinygraph │ ├── tinygraph_4w_1 │ ├── tinygraph_4w_2 │ ├── tinygraph_4w_3 │ └── tinygraph_4w_4 ├── LICENSE.txt ├── Makefile ├── README.txt ├── bin │ ├── clean-output │ ├── hash-partitioner.pl │ ├── setenv │ ├── start-graphlite │ └── start-worker ├── doxygen.conf ├── engine │ ├── ChunkedList.h │ ├── FreeList.h │ ├── InputFormatter.cc │ ├── MW.begin.proto │ ├── MW.end.proto │ ├── MW.nextss_start.proto │ ├── Makefile │ ├── Master.cc │ ├── Master.h │ ├── MsgBuffer.h │ ├── Node.cc │ ├── OutputFormatter.cc │ ├── Receiver.cc │ ├── Receiver.h │ ├── Sender.cc │ ├── Sender.h │ ├── Utility.h │ ├── WM.begin.proto │ ├── WM.curss_finish.proto │ ├── WM.end.proto │ ├── WW.nodemsg_list.proto │ ├── Worker.cc │ ├── Worker.h │ └── main.cc ├── example │ ├── Makefile │ └── PageRankVertex.cc ├── include │ ├── Addr.h │ ├── Aggregator.h │ ├── AggregatorBase.h │ ├── GenericArrayIterator.h │ ├── GenericLinkIterator.h │ ├── Graph.h │ ├── GraphLite.h │ ├── InputFormatter.h │ ├── Node.h │ ├── OutputFormatter.h │ ├── Vertex.h │ └── VertexBase.h └── mainpage.dox └── README.md /GraphLite-0.20/Input/tinygraph: -------------------------------------------------------------------------------- 1 | 5 2 | 12 3 | 0 1 4 | 0 3 5 | 1 0 6 | 1 2 7 | 1 3 8 | 2 1 9 | 2 4 10 | 3 0 11 | 3 1 12 | 3 4 13 | 4 3 14 | 4 2 15 | -------------------------------------------------------------------------------- /GraphLite-0.20/Input/tinygraph_4w_1: -------------------------------------------------------------------------------- 1 | 2 2 | 4 3 | 0 1 4 | 0 3 5 | 4 3 6 | 4 2 7 | -------------------------------------------------------------------------------- /GraphLite-0.20/Input/tinygraph_4w_2: -------------------------------------------------------------------------------- 1 | 1 2 | 3 3 | 1 0 4 | 1 2 5 | 1 3 6 | 
-------------------------------------------------------------------------------- /GraphLite-0.20/Input/tinygraph_4w_3: -------------------------------------------------------------------------------- 1 | 1 2 | 2 3 | 2 1 4 | 2 4 5 | -------------------------------------------------------------------------------- /GraphLite-0.20/Input/tinygraph_4w_4: -------------------------------------------------------------------------------- 1 | 1 2 | 3 3 | 3 0 4 | 3 1 5 | 3 4 6 | -------------------------------------------------------------------------------- /GraphLite-0.20/LICENSE.txt: -------------------------------------------------------------------------------- 1 | Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 2 | Songjie Niu (niusongjie@ict.ac.cn) 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | -------------------------------------------------------------------------------- /GraphLite-0.20/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | cd engine; make; cd - 3 | cd example; make; cd - 4 | 5 | clean: 6 | cd engine; make clean; cd - 7 | cd example; make clean; cd - 8 | bin/clean-output 9 | -------------------------------------------------------------------------------- /GraphLite-0.20/README.txt: -------------------------------------------------------------------------------- 1 | ------------------------------------------------------------ 2 | Requirements 3 | ------------------------------------------------------------ 4 | 1. JDK 1.7.x 5 | 2. Hadoop 2.6.x 6 | 3. protocol buffers 7 | $ apt-get install protobuf-c-compiler libprotobuf-c0 libprotobuf-c0-dev 8 | 9 | ------------------------------------------------------------ 10 | Directory Structure 11 | ------------------------------------------------------------ 12 | bin/ scripts and graphlite executable 13 | engine/ graphlite engine source code 14 | example/ PageRank example 15 | include/ header that represents programming API 16 | 17 | Input/ a number of small example graphs 18 | Output/ empty, will contain the output of a run 19 | 20 | Makefile this can make both engine and example 21 | 22 | LICENSE.txt Apache License, Version 2.0 23 | 24 | README.txt this file 25 | 26 | ------------------------------------------------------------ 27 | Build graphlite 28 | ------------------------------------------------------------ 29 | 1. source bin/setenv 30 | 31 | (1) edit bin/setenv, set the following paths: 32 | JAVA_HOME, HADOOP_HOME, GRAPHLITE_HOME 33 | 34 | (2) $ . bin/setenv 35 | 36 | 2. build graphlite 37 | 38 | $ cd engine 39 | $ make 40 | 41 | check if bin/graphlite is successfully generated. 
42 | 43 | ------------------------------------------------------------ 44 | Compile and Run Vertex Program 45 | ------------------------------------------------------------ 46 | 47 | 1. build example 48 | 49 | $ cd example 50 | $ make 51 | 52 | check if example/PageRankVertex.so is successfully generated. 53 | 54 | 2. run example 55 | 56 | $ start-graphlite example/PageRankVertex.so Input/facebookcombined_4w Output/out 57 | 58 | PageRankVertex.cc declares 5 processes, including 1 master and 4 workers. 59 | So the input graph file is prepared as four files: Input/facebookcombined_4w_[1-4] 60 | 61 | The output of PageRank will be in: Output/out_[1-4] 62 | 63 | Workers generate log files in WorkerOut/worker*.out 64 | 65 | ------------------------------------------------------------ 66 | Write Vertex Program 67 | ------------------------------------------------------------ 68 | Please refer to PageRankVertex.cc 69 | 70 | 1. change VERTEX_CLASS_NAME(name) definition to use a different class name 71 | 72 | 2. VERTEX_CLASS_NAME(InputFormatter) can be kept as is 73 | 74 | 3. VERTEX_CLASS_NAME(OutputFormatter): this is where the output is generated 75 | 76 | 4. VERTEX_CLASS_NAME(Aggregator): you can implement other types of aggregation 77 | 78 | 5. VERTEX_CLASS_NAME(): the main vertex program with compute() 79 | 80 | 6. VERTEX_CLASS_NAME(Graph): set the running configuration here 81 | 82 | 7. Modify Makefile: 83 | EXAMPLE_ALGOS=PageRankVertex 84 | 85 | if your program is your_program.cc, then 86 | EXAMPLE_ALGOS=your_program 87 | 88 | make will produce your_program.so 89 | 90 | ------------------------------------------------------------ 91 | Use Hash Partitioner 92 | ------------------------------------------------------------ 93 | 94 | bin/hash-partitioner.pl can be used to divide a graph input 95 | file into multiple partitions. 
96 | 97 | $ hash-partitioner.pl Input/facebookcombined 4 98 | 99 | will generate: Input/facebookcombined_4w_[1-4] 100 | 101 | -------------------------------------------------------------------------------- /GraphLite-0.20/bin/clean-output: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | cd ${GRAPHLITE_HOME} 4 | rm -rf Output WorkerOut bin/graphlite 5 | mkdir Output 6 | -------------------------------------------------------------------------------- /GraphLite-0.20/bin/hash-partitioner.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl -w 2 | 3 | if ($#ARGV < 1) { 4 | print "usage: $0 \n"; 5 | exit (0); 6 | } 7 | 8 | my $source_file= $ARGV[0]; 9 | my $num_part= $ARGV[1]; 10 | 11 | my $vertex_num= `head -n 1 $source_file`; chomp($vertex_num); 12 | 13 | my @edge_cnt; 14 | my @output_file; 15 | 16 | for (my $i=1; $i<=$num_part; $i++) { 17 | $edge_cnt[$i]= 0; 18 | } 19 | 20 | open IN, "$source_file" or die "can't open $source_file!\n"; 21 | while (<IN>) { # $_ 22 | if (/(\d+) (\d+).*/) { 23 | $file_index = $1 % $num_part + 1; 24 | $edge_cnt[$file_index]++; 25 | } 26 | } 27 | close IN; 28 | 29 | my $file1_num = $vertex_num % $num_part; 30 | my $total_vertex = $vertex_num / $num_part + 1; 31 | for(my $i = 1; $i <= $file1_num; ++$i) { 32 | open $output_file[$i], "> ${source_file}_${num_part}w_$i" or die "can't open ${source_file}_${num_part}w_$i!\n"; 33 | printf {$output_file[$i]} "%d\n", $total_vertex; 34 | printf {$output_file[$i]} "%d\n", $edge_cnt[$i]; 35 | } 36 | 37 | $total_vertex = $vertex_num / $num_part; 38 | for(my $i = $file1_num + 1; $i <= $num_part; ++$i) { 39 | open $output_file[$i], "> ${source_file}_${num_part}w_$i" or die "can't open ${source_file}_${num_part}w_$i!\n"; 40 | printf {$output_file[$i]} "%d\n", $total_vertex; 41 | printf {$output_file[$i]} "%d\n", $edge_cnt[$i]; 42 | } 43 | 44 | open IN, "$source_file" or die "can't open 
$source_file!\n"; 45 | while (<IN>) { # $_ 46 | if (/(\d+) (\d+).*/) { 47 | $file_index = $1 % $num_part + 1; 48 | print {$output_file[$file_index]} "$1 $2\n"; 49 | } 50 | } 51 | close IN; 52 | 53 | for($i = 1; $i <= $num_part; ++$i) { 54 | close $output_file[$i]; 55 | } 56 | -------------------------------------------------------------------------------- /GraphLite-0.20/bin/setenv: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | export JAVA_HOME=/usr/local/java/jdk1.7.0_40 4 | 5 | export HADOOP_HOME=/home/chensm/java/hadoop-2.6.0 6 | export GRAPHLITE_HOME=/home/chensm/java/2016Q1-BDMS/graphlite/GraphLite-0.20 7 | 8 | #export HADOOP_HOME=/usr/local/java/hadoop-2.6.0 9 | #export GRAPHLITE_HOME=/home/guest/work/hw2/GraphLite-0.20 10 | 11 | # ----------------------------------------------------------------------------- 12 | # STOP: no need to change the following 13 | 14 | # java 15 | export JRE_HOME=${JAVA_HOME}/jre 16 | export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib 17 | 18 | # hadoop 19 | export CLASSPATH=$CLASSPATH:`$HADOOP_HOME/bin/hadoop classpath --glob` 20 | 21 | export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:${GRAPHLITE_HOME}/bin 22 | 23 | machine=`uname -m` 24 | if [ ${machine} == 'x86_64' ]; then 25 | export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JRE_HOME/lib/amd64/server 26 | else 27 | export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JRE_HOME/lib/i386/server 28 | fi 29 | -------------------------------------------------------------------------------- /GraphLite-0.20/bin/start-graphlite: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | if [ $# -lt 1 ]; then 4 | echo "Usage: $0 "; 5 | exit 0; 6 | fi 7 | 8 | mypwd=`pwd` 9 | 10 | mydir=`dirname $0` 11 | cd ${mydir}; mydir=`pwd`; cd ${mypwd} 12 | 13 | myalgodir=`dirname $1` 14 | cd ${myalgodir}; myalgodir=`pwd`; cd ${mypwd} 15 | 16 | myalgo=`basename $1` 17 | 18 | 19 | num=$# 20 | i=3 21 | shift 22 | 
arg=$1 23 | while [ $i -le ${num} ] 24 | do 25 | shift 26 | arg+=" "$1 27 | ((i++)) 28 | done 29 | 30 | echo "${mydir}/graphlite 0 ${mydir}/start-worker ${myalgodir}/${myalgo} ${arg}" 31 | ${mydir}/graphlite 0 ${mydir}/start-worker ${myalgodir}/${myalgo} ${arg} 32 | -------------------------------------------------------------------------------- /GraphLite-0.20/bin/start-worker: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | ulimit -c unlimited 4 | 5 | mydir=`dirname $0` 6 | 7 | . ${mydir}/setenv 8 | 9 | NUMBER=$# 10 | 11 | i=0 12 | while [ $i -lt $NUMBER ] 13 | do 14 | args[$i]=$1 15 | ((i++)) 16 | shift # now $1 is the next arg 17 | done 18 | 19 | cmd=${args[0]} 20 | for (( i=1 ; i<$NUMBER-1 ; i++ )) 21 | do 22 | cmd+=" "${args[$i]} 23 | done 24 | 25 | mkdir -p ${mydir}/../WorkerOut 26 | $cmd > ${mydir}/../WorkerOut/worker${args[$NUMBER-1]}.out 2>&1 41 | #include 42 | #include 43 | 44 | #include "Utility.h" 45 | 46 | using namespace std; 47 | 48 | /** Definition of ChunkedList class. */ 49 | class ChunkedList { 50 | public: 51 | vector<char*> m_mem_pools; /**< pointers of all memory pools */ 52 | int m_mem_pool_capacity; /**< memory pool capacity */ 53 | int m_num_ele_per_buf; /**< element number of each memory pool */ 54 | 55 | int m_cur_mem_pool; /**< current memory pool index */ 56 | char** m_pavail; /**< pointer of next element to be appended */ 57 | char** m_pavail_begin; /**< pointer of current block begin */ 58 | char** m_pavail_end; /**< pointer of whole blocks end */ 59 | 60 | public: 61 | /** Constructor. */ 62 | ChunkedList() { 63 | m_mem_pool_capacity = MEMPOOL_CAPACITY; 64 | m_num_ele_per_buf = m_mem_pool_capacity / sizeof(char*); 65 | m_cur_mem_pool = -1; 66 | m_pavail = m_pavail_begin = m_pavail_end = NULL; 67 | } 68 | 69 | /** 70 | * Allocate a new memory pool. 71 | * Called when there is no freespace on ChunkedList to append. 
72 | */ 73 | void allocateNewBlock() { 74 | char* p = (char *)memalign(64, m_mem_pool_capacity); // cache line: 64B 75 | if (!p) { perror("memalign"); exit(1); } 76 | m_mem_pools.push_back(p); 77 | 78 | ++m_cur_mem_pool; 79 | m_pavail = (char **)p; 80 | m_pavail_begin = m_pavail; 81 | m_pavail_end = m_pavail + m_num_ele_per_buf; 82 | } 83 | 84 | /** 85 | * Append a new element. 86 | * If there is no freespace on ChunkedList to append a new element, we need 87 | * to request for a new memory pool before allocation. 88 | * @param p pointer of the new element to be appended 89 | * @see allocateNewBlock() 90 | */ 91 | void append(void* p) { 92 | if (m_pavail == m_pavail_end) allocateNewBlock(); 93 | *m_pavail = (char *)p; 94 | ++m_pavail; 95 | 96 | if (m_pavail == m_pavail_end) allocateNewBlock(); 97 | else if (m_pavail == m_pavail_begin + m_num_ele_per_buf) { 98 | m_pavail_begin = (char **)m_mem_pools[++m_cur_mem_pool]; 99 | m_pavail = m_pavail_begin; 100 | } 101 | } 102 | 103 | /** 104 | * Judge whether ChunkedList is empty. 105 | * @retval 0 not empty 106 | * @retval 1 empty 107 | */ 108 | bool isEmpty() { 109 | if ( ! m_mem_pools.size() || (m_pavail == (char **)m_mem_pools[0]) ) { 110 | /* 111 | Actual condition is 112 | ( ! m_mem_pools.size() || (m_mem_pools.size() && m_pavail == mem_pool[0]) ), 113 | here cuz of "||" property. 114 | */ 115 | return 1; 116 | } 117 | return 0; 118 | } 119 | 120 | /** 121 | * Get ChunkedList tail element. 122 | * @return tail element 123 | */ 124 | char* getTail() { 125 | if (m_pavail == m_pavail_begin) { // will not remove current empty block 126 | m_pavail_begin = (char **)m_mem_pools[--m_cur_mem_pool]; 127 | m_pavail = m_pavail_begin + m_num_ele_per_buf; 128 | } 129 | 130 | --m_pavail; 131 | return *m_pavail; 132 | } 133 | 134 | /** 135 | * Get total number of elements on ChunkedList. 
136 | * @see isEmpty() 137 | * @return total number of elements 138 | */ 139 | int64_t total() { 140 | if ( isEmpty() ) return 0; 141 | 142 | int64_t r = m_num_ele_per_buf; 143 | r *= (m_cur_mem_pool + 1); 144 | r -= (m_pavail_begin + m_num_ele_per_buf - m_pavail); 145 | return r; 146 | } 147 | 148 | /** 149 | * Destructor. 150 | * Free requested memory of whole runtime. 151 | */ 152 | ~ChunkedList() { 153 | for (int i = m_mem_pools.size()-1; i>=0; i--) { 154 | ::free(m_mem_pools[i]); // system global function 155 | m_mem_pools[i] = NULL; 156 | } 157 | m_mem_pools.clear(); 158 | m_pavail = m_pavail_begin = m_pavail_end = NULL; 159 | } 160 | 161 | public: 162 | /** Definition of ChunkedList::Iterator class */ 163 | class Iterator { 164 | private: 165 | ChunkedList* m_plist; /**< pointer of ChunkedList to be iterated on */ 166 | int m_which_chunk; /**< index of chunk to be visited */ 167 | char** m_pmy_next; /**< pointer of next element on ChunkedList */ 168 | char** m_pchunk_end; /**< pointer of current chunk end */ 169 | 170 | public: 171 | /** 172 | * Constructor. 173 | * @param pl pointer of ChunkedList to be iterated on 174 | * @see init() 175 | */ 176 | Iterator(ChunkedList* pl) { init(pl); } 177 | 178 | /** 179 | * Initialization. 180 | * @param pl pointer of ChunkedList to be iterated on 181 | * @see startNextChunk() 182 | */ 183 | void init(ChunkedList* pl) { 184 | m_plist = pl; 185 | m_which_chunk = 0; 186 | startNextChunk(0); 187 | } 188 | 189 | /** 190 | * Set chunk state value when start to visit next chunk. 191 | * @param which index of chunk to be visited 192 | */ 193 | void startNextChunk(int which) { 194 | if ( which < (int)m_plist->m_mem_pools.size() ) { 195 | m_pmy_next = (char **)(m_plist->m_mem_pools[which]); 196 | m_pchunk_end = m_pmy_next + m_plist->m_num_ele_per_buf; 197 | } else { 198 | m_pmy_next = NULL; 199 | } 200 | } 201 | 202 | /** 203 | * Get next element on ChunkedList. 
204 | * @see startNextChunk() 205 | * @return next element on ChunkedList 206 | */ 207 | void* next() { 208 | if ( (m_pmy_next == NULL) || (m_pmy_next == m_plist->m_pavail) ) { 209 | return NULL; 210 | } 211 | 212 | void* ret = *m_pmy_next; 213 | ++m_pmy_next; 214 | if (m_pmy_next == m_pchunk_end) { 215 | ++m_which_chunk; 216 | startNextChunk(m_which_chunk); 217 | } 218 | return ret; 219 | } 220 | }; // definition of ChunkedList::Iterator class 221 | 222 | public: 223 | /* 224 | The previous implementation below left nowhere to delete the 225 | pointer obtained from new. To avoid this, return only the iterator pointer. 226 | Similar to Node::getGenericArrayIterator(). 227 | */ 228 | // Iterator getIterator() { return *( new Iterator(this) ); } 229 | /** 230 | * Get pointer of Iterator. 231 | * @see Iterator() 232 | * @return pointer of Iterator 233 | */ 234 | Iterator* getIterator() { return new Iterator(this); } 235 | 236 | /** 237 | * Initialize Iterator. 238 | * @param pit pointer of Iterator 239 | * @see init() 240 | */ 241 | void initIterator(Iterator* pit) { pit->init(this); } 242 | }; // definition of ChunkedList class 243 | 244 | #endif /* CHUNKEDLIST_H */ 245 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/FreeList.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file FreeList.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 
13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines the FreeList class, which manages memory allocation in the program. 26 | * 27 | * If there is no free space on the FreeList, we allocate one block of 28 | * m_mem_pool_capacity bytes, called a memory pool, split it into 29 | * items of element size, and store each address in m_chunked_list, so as to 30 | * reduce the number of allocation requests. 31 | * 32 | * When we need m_element_size bytes, we just pick one item from the FreeList; 33 | * when an item becomes unused, it is added to the FreeList again. 34 | * 35 | * @see ChunkedList 36 | * 37 | */ 38 | 39 | #ifndef FREELIST_H 40 | #define FREELIST_H 41 | 42 | #include 43 | #include 44 | 45 | #include "ChunkedList.h" 46 | #include "Utility.h" 47 | 48 | /** Definition of FreeList class. */ 49 | class FreeList { 50 | public: 51 | vector<char*> m_mem_pools; /**< pointers of all memory pools */ 52 | int m_mem_pool_capacity; /**< memory pool capacity */ 53 | int m_element_size; /**< size of FreeList element */ 54 | ChunkedList m_chunked_list; /**< pointer array to store free element address */ 55 | 56 | public: 57 | /** Constructor. */ 58 | FreeList(): m_mem_pool_capacity(MEMPOOL_CAPACITY) {} 59 | 60 | /** 61 | * Set element size. 62 | * @param ele_size memory pool element size 63 | */ 64 | void setEle(int ele_size) { m_element_size = ele_size; } 65 | 66 | /** 67 | * Recycle unused memory to freespace. 
68 | * @param p pointer of unused memory item 69 | * @see ChunkedList::append() 70 | */ 71 | void free(void* p) { 72 | m_chunked_list.append(p); 73 | } 74 | 75 | /** 76 | * Allocate a new memory pool. 77 | * If there is no freespace on FreeList, we need to request for a new 78 | * memory pool. Then split it into m_element_size items and add them to 79 | * FreeList one by one through FreeList::free(). 80 | * @see free() 81 | */ 82 | void allocateNewBlock() { 83 | char* p = (char *)memalign(64, m_mem_pool_capacity); // cache line: 64B 84 | if (!p) { 85 | perror("memalign"); 86 | exit(1); 87 | } 88 | m_mem_pools.push_back(p); 89 | 90 | /* 91 | If for-loop control condition stays as before-"i= 0; --i) { 122 | ::free(m_mem_pools[i]); // system global function in stdlib.h 123 | m_mem_pools[i] = NULL; 124 | } 125 | m_mem_pools.clear(); 126 | } 127 | }; // definition of FreeList class 128 | 129 | #endif /* FREELIST_H */ 130 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/InputFormatter.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file InputFormatter.cc 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 
22 | * 23 | * @section DESCRIPTION 24 | * 25 | * @see InputFormatter.h 26 | * 27 | */ 28 | 29 | #include "InputFormatter.h" 30 | #include "Worker.h" 31 | 32 | extern Worker worker; 33 | 34 | void InputFormatter::open(const char* pin_path) { 35 | if (worker.m_hdfs_flag) { 36 | if ( hdfsExists(worker.m_fs_handle, pin_path) ) { 37 | perror("input file"); 38 | exit(1); 39 | } 40 | 41 | m_hdfs_file = hdfsOpenFile(worker.m_fs_handle, pin_path, 42 | O_RDONLY, 0, 0, 0); 43 | if (! m_hdfs_file) { 44 | perror("Error opening input file on HDFS"); 45 | exit(1); 46 | } 47 | 48 | m_vertex_num_size = 0; 49 | m_edge_num_size = 0; 50 | m_next_edge_offset = 0; 51 | 52 | m_ptotal_vertex_line = new char[MAXINPUTLEN]; 53 | m_ptotal_edge_line = new char[MAXINPUTLEN]; 54 | getVertexNumLine(); 55 | getEdgeNumLine(); 56 | } else { 57 | m_local_file.open(pin_path); 58 | if (! m_local_file.good()) { 59 | fprintf(stderr, "Error opening local file %s\n", pin_path); 60 | exit(1); 61 | } 62 | 63 | getVertexNumLine(); 64 | getEdgeNumLine(); 65 | } 66 | 67 | m_total_vertex = 0; 68 | m_total_edge = 0; 69 | m_n_value_size = 0; 70 | m_e_value_size = 0; 71 | m_m_value_size = 0; 72 | } 73 | 74 | void InputFormatter::close() { 75 | if (worker.m_hdfs_flag) { 76 | hdfsCloseFile(worker.m_fs_handle, m_hdfs_file); 77 | } else { 78 | m_local_file.close(); 79 | } 80 | } 81 | 82 | void InputFormatter::getVertexNumLine() { 83 | if (worker.m_hdfs_flag) { 84 | // notice hdfs type: int64_t->tOffset, int32_t->tSize 85 | tOffset offset = 0; 86 | tSize num_read_bytes = hdfsPread( worker.m_fs_handle, m_hdfs_file, offset, 87 | (void*)m_ptotal_vertex_line, MAXINPUTLEN * sizeof(char) ); 88 | for (int i = 0; i < num_read_bytes; ++i) { 89 | if (m_ptotal_vertex_line[i] == '\n') { 90 | m_vertex_num_size = i + 1; 91 | break; 92 | } 93 | } 94 | } 95 | else { 96 | getline(m_local_file, m_total_vertex_line); 97 | m_ptotal_vertex_line = m_total_vertex_line.c_str(); 98 | } 99 | } 100 | 101 | void InputFormatter::getEdgeNumLine() { 
102 | if (worker.m_hdfs_flag) { 103 | // notice hdfs type: int64_t->tOffset, int32_t->tSize 104 | tOffset offset = m_vertex_num_size; 105 | tSize num_read_bytes = hdfsPread( worker.m_fs_handle, m_hdfs_file, offset, 106 | (void*)m_ptotal_edge_line, MAXINPUTLEN * sizeof(char) ); 107 | for (int i = 0; i < num_read_bytes; ++i) { 108 | if (m_ptotal_edge_line[i] == '\n') { 109 | m_edge_num_size = i + 1; 110 | break; 111 | } 112 | } 113 | m_next_edge_offset = m_vertex_num_size + m_edge_num_size; 114 | } 115 | else { 116 | getline(m_local_file, m_total_edge_line); 117 | m_ptotal_edge_line = m_total_edge_line.c_str(); 118 | } 119 | } 120 | 121 | const char* InputFormatter::getEdgeLine() { 122 | if (worker.m_hdfs_flag) { 123 | // notice hdfs type: int64_t->tOffset, int32_t->tSize 124 | tOffset offset = m_next_edge_offset; 125 | tSize num_read_bytes = hdfsPread( worker.m_fs_handle, m_hdfs_file, offset, 126 | (void*)m_buf_line, sizeof(m_buf_line) ); 127 | for (int i = 0; i < num_read_bytes; ++i) { 128 | if (m_buf_line[i] == '\n') { 129 | m_next_edge_offset += (i + 1); 130 | break; 131 | } 132 | } 133 | 134 | return (num_read_bytes>0 ? m_buf_line : 0); 135 | } 136 | else { 137 | getline(m_local_file, m_buf_string); 138 | return (m_local_file.good() ? 
m_buf_string.c_str() : NULL); 139 | } 140 | } 141 | 142 | void InputFormatter::addVertex(int64_t vid, void* pvalue, int outdegree) { 143 | worker.addVertex(vid, pvalue, outdegree); 144 | } 145 | 146 | void InputFormatter::addEdge(int64_t from, int64_t to, void* pweight) { 147 | worker.addEdge(from, to, pweight); 148 | } 149 | 150 | InputFormatter::~InputFormatter() { 151 | if (worker.m_hdfs_flag) { 152 | delete[] m_ptotal_vertex_line; 153 | delete[] m_ptotal_edge_line; 154 | } 155 | } 156 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/MW.begin.proto: -------------------------------------------------------------------------------- 1 | /** message from master to worker */ 2 | package mw; 3 | 4 | /* 5 | * Respond to whole supersteps begin. 6 | * message type: 1 7 | */ 8 | message begin { 9 | required int32 s_id = 1; /**< master id */ 10 | required int32 d_id = 2; /**< worker id */ 11 | required int32 state = 3; /**< 0: OK to begin, 1: need to wait */ 12 | } 13 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/MW.end.proto: -------------------------------------------------------------------------------- 1 | /** message from master to worker */ 2 | package mw; 3 | 4 | /* 5 | * Notify workers to end supersteps. 6 | * message type: 4 7 | */ 8 | message end { 9 | required int32 s_id = 1; /**< master id */ 10 | required int32 d_id = 2; /**< worker id */ 11 | required int32 state = 3; /**< 0: OK to end, 1: need to go on */ 12 | } 13 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/MW.nextss_start.proto: -------------------------------------------------------------------------------- 1 | /** message from master to worker */ 2 | package mw; 3 | 4 | /** 5 | * Next superstep starts. 6 | * Master sends this message to all workers before next superstep starts. 
7 | * message type: 3 8 | */ 9 | message nextss_start { 10 | required int32 s_id = 1; /**< master id */ 11 | required int32 d_id = 2; /**< worker id */ 12 | required int32 superstep = 3; /**< next superstep number */ 13 | required int64 node_msg = 4; /**< count of node messages the worker should receive before next superstep */ 14 | repeated bytes aggr_global = 5; /**< global aggregator value before next superstep */ 15 | } 16 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Makefile: -------------------------------------------------------------------------------- 1 | # ---------------------------------------------------------------------- 2 | # compiler options 3 | # ---------------------------------------------------------------------- 4 | 5 | CXX= g++ 6 | 7 | CFLAGS_COMMON=-std=c++0x -g -O3 -Wall -Wno-strict-aliasing -I../include 8 | #CFLAGS_COMMON=-std=c++0x -g -O0 -DDEBUG_LVL_1 -DDEBUG_LVL_2 -I../include 9 | 10 | INCLD_HDFS=-I$(HADOOP_HOME)/include 11 | 12 | LIB_HDFS=-L$(HADOOP_HOME)/lib/native -lhdfs 13 | LIB_JAVA= $(shell echo $(JAVA_HOME)/jre/lib/*/server/libjvm.so) 14 | 15 | LIB_GRAPHLITE=-lpthread -rdynamic -ldl -lprotobuf-c 16 | 17 | #LIB_GRAPHLITE=/usr/lib/x86_64-linux-gnu/libprotobuf-c.a -lpthread -rdynamic -ldl 18 | #LIB_GRAPHLITE=/usr/lib/i386-linux-gnu/libprotobuf-c.a -lpthread -rdynamic -ldl 19 | 20 | LIB_GRAPHALGO=-fPIC -shared 21 | 22 | # ---------------------------------------------------------------------- 23 | # target 24 | # ---------------------------------------------------------------------- 25 | 26 | all : graphlite 27 | cp graphlite ../bin 28 | 29 | # ---------------------------------------------------------------------- 30 | # graphlite 31 | # ---------------------------------------------------------------------- 32 | 33 | graphlite : WM.begin.pb-c.o MW.begin.pb-c.o WM.curss_finish.pb-c.o MW.nextss_start.pb-c.o MW.end.pb-c.o WM.end.pb-c.o WW.nodemsg_list.pb-c.o Sender.o Receiver.o 
InputFormatter.o OutputFormatter.o Node.o Worker.o Master.o main.o 34 | ${CXX} ${CFLAGS_COMMON} $^ ${INCLD_HDFS} ${LIB_HDFS} ${LIB_JAVA} ${LIB_GRAPHLITE} -o $@ 35 | 36 | # ---------------------------------------------------------------------- 37 | # protocol buffer c code 38 | # ---------------------------------------------------------------------- 39 | 40 | %.pb-c.o : %.proto 41 | protoc-c --c_out=./ $< 42 | ${CXX} -c ${CFLAGS_COMMON} ${@:.o=.c} -o $@ 43 | 44 | WM.begin.pb-c.o : WM.begin.proto 45 | 46 | MW.begin.pb-c.o : MW.begin.proto 47 | 48 | WM.curss_finish.pb-c.o : WM.curss_finish.proto 49 | 50 | MW.nextss_start.pb-c.o : MW.nextss_start.proto 51 | 52 | MW.end.pb-c.o : MW.end.proto 53 | 54 | WM.end.pb-c.o : WM.end.proto 55 | 56 | WW.nodemsg_list.pb-c.o : WW.nodemsg_list.proto 57 | 58 | # ---------------------------------------------------------------------- 59 | # C++ source code 60 | # ---------------------------------------------------------------------- 61 | 62 | %.o : %.cc 63 | ${CXX} -c ${CFLAGS_COMMON} $< ${INCLD_HDFS} -o $@ 64 | 65 | Master.h : WM.begin.pb-c.o MW.begin.pb-c.o WM.curss_finish.pb-c.o MW.nextss_start.pb-c.o MW.end.pb-c.o WM.end.pb-c.o WW.nodemsg_list.pb-c.o 66 | 67 | Worker.h : WM.begin.pb-c.o MW.begin.pb-c.o WM.curss_finish.pb-c.o MW.nextss_start.pb-c.o MW.end.pb-c.o WM.end.pb-c.o WW.nodemsg_list.pb-c.o 68 | 69 | InputFormatter.o : InputFormatter.cc ../include/InputFormatter.h Worker.h 70 | 71 | OutputFormatter.o : OutputFormatter.cc ../include/OutputFormatter.h Worker.h 72 | 73 | Node.o : Node.cc ../include/Node.h Worker.h 74 | 75 | Sender.o: Sender.cc Sender.h 76 | 77 | Receiver.o: Receiver.cc Receiver.h 78 | 79 | Worker.o : Worker.cc Worker.h 80 | 81 | Master.o : Master.cc Master.h 82 | 83 | main.o : main.cc Master.h Worker.h 84 | 85 | # ---------------------------------------------------------------------- 86 | # clean up 87 | # ---------------------------------------------------------------------- 88 | 89 | clean : 90 | rm -rf 
graphlite *.o *.so *.pb-c.[ch] 91 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Master.cc: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | 6 | #include "Master.h" 7 | 8 | extern Master master; 9 | 10 | struct timeval b_time, e_time; 11 | double elapsed = 0; 12 | 13 | void* m_receiveFun(void *args) { 14 | master.m_receiver.bindServerAddr(master.m_addr_self.port); 15 | master.m_receiver.listenClient(); 16 | master.m_receiver.acceptClient(); 17 | master.m_receiver.recvMsg(); 18 | master.m_receiver.closeAllSocket(); 19 | return NULL; 20 | } 21 | 22 | void* m_sendFun(void *args) { 23 | master.m_sender.getSocketFd(); 24 | master.m_sender.getServerAddr(master.m_paddr_table); 25 | master.m_sender.connectServer(master.m_addr_self.id); 26 | master.m_sender.sendMsg(); 27 | master.m_sender.closeAllSocket(); 28 | return NULL; 29 | } 30 | 31 | void Master::run(int argc, char* argv[]) { 32 | parseCmdArg(argc, argv); 33 | loadUserFile(argc, argv); 34 | startWorkers(argv); 35 | init(); 36 | manageSuperstep(); 37 | terminate(); 38 | } 39 | 40 | void Master::parseCmdArg(int argc, char* argv[]) { 41 | printf("parseCmdArg\n"); fflush(stdout); 42 | 43 | m_addr_self.id = atoi(argv[1]); 44 | m_pstart_file = argv[2]; 45 | m_puser_file = argv[3]; 46 | for (int i = 4; i < argc - 1; ++i) { 47 | m_algo_args += argv[i]; 48 | m_algo_args += " "; 49 | } 50 | if (argc > 4) { 51 | m_algo_args += argv[argc - 1]; 52 | } 53 | // Parsing user cmd line is below in Graph init(). 54 | int imdm = 2; 55 | switch (imdm) { 56 | case 0: 57 | m_imdm = IMDM_OPT_PLAIN; 58 | break; 59 | case 1: 60 | m_imdm = IMDM_OPT_GROUP_PREF; 61 | break; 62 | case 2: 63 | m_imdm = IMDM_OPT_SWPL_PREF; 64 | break; 65 | default: 66 | break; 67 | } 68 | } 69 | 70 | void Master::loadUserFile(int argc, char* argv[]) { 71 | printf("loadUserFile\n"); fflush(stdout); 72 | 73 | // 1. 
Open user file. 74 | m_puf_handle = dlopen(m_puser_file, RTLD_NOW); 75 | if (! m_puf_handle) { 76 | fprintf( stderr, "%s\n", dlerror() ); 77 | exit(1); 78 | } 79 | 80 | // 2. Create Graph class. 81 | GraphCreateFn create_graph; 82 | create_graph = (GraphCreateFn)dlsym(m_puf_handle, "create_graph"); 83 | m_pmy_graph = create_graph(); 84 | m_pmy_graph->init(argc - 3, (char **)&argv[3]); 85 | 86 | // 3. Read in configuration from Graph. 87 | m_machine_cnt = m_pmy_graph->m_machine_cnt; 88 | m_paddr_table = m_pmy_graph->m_paddr_table; 89 | m_hdfs_flag = m_pmy_graph->m_hdfs_flag; 90 | m_my_aggregator_cnt = m_pmy_graph->m_aggregator_cnt; 91 | m_pmy_aggregator = m_pmy_graph->m_paggregator; 92 | } 93 | 94 | void Master::startWorkers(char* argv[]) { 95 | printf("startWorkers\n"); fflush(stdout); 96 | 97 | char *curdir= get_current_dir_name(); // we cd to the curdir then call start-worker 98 | 99 | char s[SPRINTF_LEN]; 100 | int ret; 101 | for (int i = 1; i < m_machine_cnt; ++i) { 102 | if (m_algo_args == "") { 103 | /* ssh command, no srcipt for .profile 104 | sprintf(s, "ssh %s '%s %d %s > worker%d.out 2>&1 worker%d.out 2>&1 getSize(); 153 | } 154 | mw__end__init(&m_mw_end); 155 | m_pwm_end = NULL; 156 | 157 | m_pfinish_send = (int *)malloc( m_machine_cnt * sizeof(int) ); 158 | if (! m_pfinish_send) { 159 | perror("malloc"); 160 | exit(1); 161 | } 162 | 163 | m_worker_msg = (int64_t *)malloc( m_machine_cnt * sizeof(int64_t) ); 164 | if (! m_worker_msg) { 165 | perror("malloc"); 166 | exit(1); 167 | } 168 | 169 | // 3. Initialize sender and receiver. 170 | m_receiver.init(m_machine_cnt); 171 | m_sender.init(m_machine_cnt); 172 | 173 | // 4. Create send and receive thread. 
174 | if ( pthread_create(&m_pth_receive, NULL, m_receiveFun, NULL) ) { 175 | fprintf(stderr, "Error creating receive thread !\n"); 176 | exit(1); 177 | } 178 | if ( pthread_create(&m_pth_send, NULL, m_sendFun, NULL) ) { 179 | fprintf(stderr, "Error creating send thread !\n"); 180 | exit(1); 181 | } 182 | } 183 | 184 | int Master::sendBegin(int worker_id) { 185 | int ret = 1; 186 | 187 | // 1. Set message content to be sent. 188 | m_mw_begin.s_id = 0; 189 | m_mw_begin.d_id = worker_id; 190 | m_mw_begin.state = 0; 191 | 192 | // 2. Send to destination worker. 193 | if (! m_sender.m_out_buffer[worker_id].m_state) { // just test, totally empty, can write to m_out_buffer 194 | 195 | pthread_mutex_lock(&m_sender.m_out_mutex); 196 | if (! m_sender.m_out_buffer[worker_id].m_state) { // totally empty, can write to m_out_buffer 197 | m_sender.m_out_buffer[worker_id].m_msg_len = sizeof(int) + 198 | mw__begin__pack(&m_mw_begin, 199 | &m_sender.m_out_buffer[worker_id].m_buffer[2 * sizeof(int)]); 200 | m_sender.m_out_buffer[worker_id].m_buf_len = m_sender.m_out_buffer[worker_id].m_msg_len + sizeof(int); 201 | * (int *)m_sender.m_out_buffer[worker_id].m_buffer = m_sender.m_out_buffer[worker_id].m_buf_len; 202 | * (int *)&(m_sender.m_out_buffer[worker_id].m_buffer[sizeof(int)]) = MW_BEGIN; 203 | m_sender.m_out_buffer[worker_id].m_head = 0; 204 | m_sender.m_out_buffer[worker_id].m_tail = m_sender.m_out_buffer[worker_id].m_buf_len; 205 | m_sender.m_out_buffer[worker_id].m_state = 1; 206 | ret = 0; 207 | } 208 | pthread_mutex_unlock(&m_sender.m_out_mutex); 209 | } // else can't write to m_out_buffer 210 | 211 | return ret; 212 | } 213 | 214 | int Master::sendNextssstart(int worker_id) { 215 | int ret = 1; 216 | 217 | // 1. Set message content to be sent. 
218 | m_mw_nextssstart.s_id = 0; 219 | m_mw_nextssstart.d_id = worker_id; 220 | m_mw_nextssstart.node_msg = m_worker_msg[worker_id]; 221 | for (size_t i = 0; i < m_mw_nextssstart.n_aggr_global; ++i) { 222 | m_mw_nextssstart.aggr_global[i].data = (uint8_t *)( m_pmy_aggregator[i]->getGlobal() ); 223 | } 224 | 225 | // 2. Send to destination worker. 226 | if (! m_sender.m_out_buffer[worker_id].m_state) { // just test, totally empty, can write to m_out_buffer 227 | 228 | pthread_mutex_lock(&m_sender.m_out_mutex); 229 | if (! m_sender.m_out_buffer[worker_id].m_state) { // totally empty, can write to m_out_buffer 230 | m_sender.m_out_buffer[worker_id].m_msg_len = sizeof(int) + 231 | mw__nextss_start__pack(&m_mw_nextssstart, 232 | &m_sender.m_out_buffer[worker_id].m_buffer[2 * sizeof(int)]); 233 | m_sender.m_out_buffer[worker_id].m_buf_len = m_sender.m_out_buffer[worker_id].m_msg_len + sizeof(int); 234 | * (int *)m_sender.m_out_buffer[worker_id].m_buffer = m_sender.m_out_buffer[worker_id].m_buf_len; 235 | * (int *)&(m_sender.m_out_buffer[worker_id].m_buffer[sizeof(int)]) = MW_NEXTSSSTART; 236 | m_sender.m_out_buffer[worker_id].m_head = 0; 237 | m_sender.m_out_buffer[worker_id].m_tail = m_sender.m_out_buffer[worker_id].m_buf_len; 238 | m_sender.m_out_buffer[worker_id].m_state = 1; 239 | ret = 0; 240 | } 241 | pthread_mutex_unlock(&m_sender.m_out_mutex); 242 | } // else can't write to m_out_buffer 243 | 244 | return ret; 245 | } 246 | 247 | int Master::sendEnd(int worker_id) { 248 | int ret = 1; 249 | 250 | // 1. Set message content to be sent. 251 | m_mw_end.s_id = 0; 252 | m_mw_end.d_id = worker_id; 253 | m_mw_end.state = 0; 254 | 255 | // 2. Send to destination worker. 256 | 257 | if (! m_sender.m_out_buffer[worker_id].m_state) { // just test, totally empty, can write to m_out_buffer 258 | 259 | pthread_mutex_lock(&m_sender.m_out_mutex); 260 | if (! 
m_sender.m_out_buffer[worker_id].m_state) { // totally empty, can write to m_out_buffer 261 | m_sender.m_out_buffer[worker_id].m_msg_len = sizeof(int) + 262 | mw__end__pack(&m_mw_end, 263 | &m_sender.m_out_buffer[worker_id].m_buffer[2 * sizeof(int)]); 264 | m_sender.m_out_buffer[worker_id].m_buf_len = m_sender.m_out_buffer[worker_id].m_msg_len + sizeof(int); 265 | * (int *)m_sender.m_out_buffer[worker_id].m_buffer = m_sender.m_out_buffer[worker_id].m_buf_len; 266 | * (int *)&(m_sender.m_out_buffer[worker_id].m_buffer[sizeof(int)]) = MW_END; 267 | m_sender.m_out_buffer[worker_id].m_head = 0; 268 | m_sender.m_out_buffer[worker_id].m_tail = m_sender.m_out_buffer[worker_id].m_buf_len; 269 | m_sender.m_out_buffer[worker_id].m_state = 1; 270 | ret = 0; 271 | } 272 | pthread_mutex_unlock(&m_sender.m_out_mutex); 273 | } // else can't write to m_out_buffer 274 | 275 | return ret; 276 | } 277 | 278 | void Master::sendAll(int msg_type) { 279 | printf("step into sendAll\n"); 280 | 281 | int msg2send; // count of messages to send 282 | memset( m_pfinish_send, 0, m_machine_cnt * sizeof(int) ); 283 | 284 | switch (msg_type) { 285 | case MW_BEGIN: 286 | do { 287 | msg2send = 0; 288 | for (int i = 1; i < m_machine_cnt; ++i) { 289 | if (! m_pfinish_send[i]) { 290 | ++msg2send; 291 | if ( sendBegin(i) ) continue; 292 | printf("sent MW_BEGIN to worker[%d]\n", i); fflush(stdout); 293 | m_pfinish_send[i] = 1; 294 | --msg2send; 295 | } 296 | } 297 | } while (msg2send); 298 | break; 299 | case MW_NEXTSSSTART: 300 | do { 301 | msg2send = 0; 302 | for (int i = 1; i < m_machine_cnt; ++i) { 303 | if (! m_pfinish_send[i]) { 304 | ++msg2send; 305 | if ( sendNextssstart(i) ) continue; 306 | printf("sent MW_NEXTSSSTART to worker[%d]\n", i); fflush(stdout); 307 | m_pfinish_send[i] = 1; 308 | --msg2send; 309 | } 310 | } 311 | } while (msg2send); 312 | break; 313 | case MW_END: 314 | do { 315 | msg2send = 0; 316 | for (int i = 1; i < m_machine_cnt; ++i) { 317 | if (! 
m_pfinish_send[i]) { 318 | ++msg2send; 319 | if ( sendEnd(i) ) continue; 320 | printf("sent MW_END to worker[%d]\n", i); fflush(stdout); 321 | m_pfinish_send[i] = 1; 322 | --msg2send; 323 | } 324 | } 325 | } while (msg2send); 326 | break; 327 | default: 328 | fprintf(stderr, "There is no such message type !\n"); 329 | break; 330 | } 331 | } 332 | 333 | void Master::receiveMessage(int worker_id) { 334 | 335 | if (m_receiver.m_in_buffer[worker_id].m_state) { // just test, totally full, can read from m_in_buffer 336 | 337 | pthread_mutex_lock(&m_receiver.m_in_mutex); 338 | if (m_receiver.m_in_buffer[worker_id].m_state) { // totally full, can read from m_in_buffer 339 | m_receiver.m_in_buffer[worker_id].m_buf_len = * (int *)m_receiver.m_in_buffer[worker_id].m_buffer; 340 | m_receiver.m_in_buffer[worker_id].m_msg_type = * (int *)&(m_receiver.m_in_buffer[worker_id].m_buffer[sizeof(int)]); 341 | 342 | if (m_receiver.m_in_buffer[worker_id].m_buf_len) { 343 | m_receiver.m_in_buffer[worker_id].m_msg_len = m_receiver.m_in_buffer[worker_id].m_buf_len - sizeof(int); 344 | int pack_len = m_receiver.m_in_buffer[worker_id].m_msg_len - sizeof(int); 345 | 346 | switch (m_receiver.m_in_buffer[worker_id].m_msg_type) { 347 | case WM_BEGIN: 348 | m_pwm_begin = wm__begin__unpack(NULL, pack_len, &m_receiver.m_in_buffer[worker_id].m_buffer[2 * sizeof(int)]); 349 | if (m_pwm_begin->state == 0) { // this worker ready to begin 350 | ++m_ready2begin_wk; 351 | } 352 | wm__begin__free_unpacked(m_pwm_begin, NULL); 353 | break; 354 | case WM_CURSSFINISH: 355 | m_pwm_curssfinish = wm__curss_finish__unpack(NULL, pack_len, &m_receiver.m_in_buffer[worker_id].m_buffer[2 * sizeof(int)]); 356 | for (size_t i = 0; i < m_pwm_curssfinish->n_aggr_local; ++i) { 357 | m_pmy_aggregator[i]->merge(m_pwm_curssfinish->aggr_local[i].data); 358 | // printf( "m_pmy_aggregator[%ld]: %f\n", i, * (double *)m_pmy_aggregator[i]->getGlobal() ); fflush(stdout); 359 | } 360 | for (size_t i = 0; i < 
m_pwm_curssfinish->n_worker_msg; ++i) { 361 | m_worker_msg[i] += m_pwm_curssfinish->worker_msg[i]; 362 | } 363 | m_act_vertex += m_pwm_curssfinish->act_vertex; 364 | m_sent_msg += m_pwm_curssfinish->sent_msg; 365 | // printf("m_act_vertex: %ld, m_sent_msg: %ld\n", m_act_vertex, m_sent_msg); fflush(stdout); 366 | wm__curss_finish__free_unpacked(m_pwm_curssfinish, NULL); 367 | ++m_curssfinish_wk; 368 | break; 369 | case WM_END: 370 | m_pwm_end = wm__end__unpack(NULL, pack_len, &m_receiver.m_in_buffer[worker_id].m_buffer[2 * sizeof(int)]); 371 | if (m_pwm_end->state == 0) { // this worker has already ended 372 | ++m_alreadyend_wk; 373 | } 374 | wm__end__free_unpacked(m_pwm_end, NULL); 375 | break; 376 | default: 377 | fprintf(stderr, "There is no such message type !\n"); 378 | break; 379 | } 380 | 381 | m_receiver.m_in_buffer[worker_id].m_buf_len = 0; 382 | m_receiver.m_in_buffer[worker_id].m_state = 0; 383 | } 384 | } 385 | pthread_mutex_unlock(&m_receiver.m_in_mutex); 386 | 387 | } // else can't read from m_in_buffer 388 | } 389 | 390 | void Master::manageSuperstep() { 391 | printf("manageSuperstep\n"); fflush(stdout); 392 | 393 | // 1. Receive begin requests (WM_BEGIN) from all workers. 394 | m_ready2begin_wk = 1; 395 | while (m_ready2begin_wk < m_machine_cnt) { 396 | for (int i = 1; i < m_machine_cnt; ++i) receiveMessage(i); 397 | } 398 | printf("received WM_BEGIN\n"); fflush(stdout); 399 | 400 | // 2. Send begin responses (MW_BEGIN) to all workers. 401 | printf("MW_BEGIN: %d\n", MW_BEGIN); fflush(stdout); 402 | sendAll(MW_BEGIN); 403 | printf("sent MW_BEGIN\n"); fflush(stdout); 404 | 405 | // 3. Run the supersteps. 406 | m_term = 0; 407 | m_mw_nextssstart.superstep = -1; 408 | m_alreadyend_wk = 1; 409 | 410 | gettimeofday(&b_time, NULL); 411 | 412 | while (! m_term) { 413 | // 3.1 Initialize before every superstep.
414 | ++m_mw_nextssstart.superstep; 415 | printf("-----------------------------------------\n"); fflush(stdout); 416 | printf("superstep: %d\n", m_mw_nextssstart.superstep); fflush(stdout); 417 | for (int i = 0; i < m_my_aggregator_cnt; ++i) m_pmy_aggregator[i]->init(); 418 | m_curssfinish_wk = 1; 419 | 420 | // 3.2 Receive current superstep finish message from workers. 421 | memset( m_worker_msg, 0, m_machine_cnt * sizeof(int64_t) ); 422 | m_act_vertex = 0; 423 | m_sent_msg = 0; 424 | while (m_curssfinish_wk < m_machine_cnt) { 425 | for (int i = 1; i < m_machine_cnt; ++i) receiveMessage(i); 426 | } 427 | printf("received WM_CURSSFINISH\n"); fflush(stdout); 428 | 429 | // 3.3 Check if supersteps terminate. 430 | if ( ( m_pmy_graph->masterComputePerstep(m_mw_nextssstart.superstep, m_pmy_aggregator) ) || 431 | (!m_act_vertex && !m_sent_msg) ) m_term = 1; 432 | 433 | // 3.4 Send terminate/go on supersteps message to workers. 434 | if (m_term) { 435 | printf("MW_END: %d\n", MW_END); fflush(stdout); 436 | sendAll(MW_END); 437 | printf("sent MW_END\n"); fflush(stdout); 438 | 439 | // receive already end message from workers 440 | while (m_alreadyend_wk < m_machine_cnt) { 441 | for (int i = 1; i < m_machine_cnt; ++i) receiveMessage(i); 442 | } 443 | printf("received WM_END\n"); fflush(stdout); 444 | } else { 445 | printf("MW_NEXTSSSTART: %d\n", MW_NEXTSSSTART); fflush(stdout); 446 | sendAll(MW_NEXTSSSTART); 447 | printf("sent MW_NEXTSSSTART\n"); fflush(stdout); 448 | } 449 | } 450 | 451 | gettimeofday(&e_time, NULL); 452 | 453 | // 4. Set main thread to terminate. 454 | main_term = 1; 455 | } 456 | 457 | void Master::terminate() { 458 | printf("terminate\n"); fflush(stdout); 459 | 460 | // 1. Destroy Graph class. 461 | m_pmy_graph->term(); 462 | GraphDestroyFn destroy_graph; 463 | destroy_graph = (GraphDestroyFn)dlsym(m_puf_handle, "destroy_graph"); 464 | destroy_graph(m_pmy_graph); 465 | 466 | // 2. Free memory allocated. 
467 | if (m_worker_msg) free(m_worker_msg); 468 | if (m_pfinish_send) free(m_pfinish_send); 469 | if (m_mw_nextssstart.aggr_global) free(m_mw_nextssstart.aggr_global); 470 | 471 | // 3. Join the receive and send threads. 472 | void* recv_retval; 473 | pthread_join(m_pth_receive, &recv_retval); 474 | void* send_retval; 475 | pthread_join(m_pth_send, &send_retval); 476 | 477 | // 4. Output elapsed time. 478 | elapsed = (double)(e_time.tv_sec - b_time.tv_sec) + ((double)e_time.tv_usec - (double)b_time.tv_usec)/1e6; 479 | printf("elapsed time: %f seconds\n", elapsed); fflush(stdout); 480 | } 481 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Master.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Master.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines the Master class, which coordinates workers and controls 26 | * the progress of supersteps.
27 | * 28 | */ 29 | 30 | #ifndef MASTER_H 31 | #define MASTER_H 32 | 33 | #include 34 | #include 35 | 36 | #include "Utility.h" 37 | #include "Graph.h" 38 | #include "AggregatorBase.h" 39 | #include "Sender.h" 40 | #include "Receiver.h" 41 | #include "WM.begin.pb-c.h" 42 | #include "MW.begin.pb-c.h" 43 | #include "WM.curss_finish.pb-c.h" 44 | #include "MW.nextss_start.pb-c.h" 45 | #include "MW.end.pb-c.h" 46 | #include "WM.end.pb-c.h" 47 | 48 | extern int main_term; 49 | 50 | /** Definition of Master class. */ 51 | class Master { 52 | public: 53 | const char* m_pstart_file; /**< start worker script file path */ 54 | const char* m_puser_file; /**< user file path */ 55 | std::string m_algo_args; /**< algorithm arguments */ 56 | IMDM m_imdm; /**< in-message deliver method */ 57 | void* m_puf_handle; /**< handle of opened user file */ 58 | Graph* m_pmy_graph; /**< configuration class */ 59 | int m_machine_cnt; /**< machine count, one master and some workers */ 60 | Addr* m_paddr_table; /**< address table, master 0 workers from 1 */ 61 | Addr m_addr_self; /**< self address */ 62 | int m_hdfs_flag; /**< read input from hdfs or local-fs, hdfs 1 local-fs 0 */ 63 | int m_my_aggregator_cnt; /**< aggregator count */ 64 | AggregatorBase** m_pmy_aggregator; /**< pointers of AggregatorBase */ 65 | 66 | Sender m_sender; /**< to manage activities about send */ 67 | Receiver m_receiver; /**< to manage activities about receive */ 68 | pthread_t m_pth_send; /**< send thread */ 69 | pthread_t m_pth_receive; /**< receive thread */ 70 | 71 | Wm__Begin* m_pwm_begin; /**< worker requests for whole supersteps begin */ 72 | Mw__Begin m_mw_begin; /**< master responds to whole supersteps begin */ 73 | Wm__CurssFinish* m_pwm_curssfinish; /**< worker current superstep finishes */ 74 | Mw__NextssStart m_mw_nextssstart; /**< master next superstep starts */ 75 | Mw__End m_mw_end; /**< master notifies workers to end supersteps */ 76 | Wm__End* m_pwm_end; /**< worker reports to master after ending 
supersteps */ 77 | 78 | int m_term; /**< to mark if supersteps end, 1/0 yes/no */ 79 | int* m_pfinish_send; /**< to mark if the MW_* message has been sent successfully to each worker, from [1] */ 80 | int m_ready2begin_wk; /**< count of workers ready to begin */ 81 | int m_curssfinish_wk; /**< count of workers that have finished the current superstep */ 82 | int64_t* m_worker_msg; /**< count of messages for each worker in the current superstep */ 83 | int64_t m_act_vertex; /**< count of active vertices left after the current superstep */ 84 | int64_t m_sent_msg; /**< count of messages sent in the current superstep */ 85 | int m_alreadyend_wk; /**< count of workers that have already ended */ 86 | 87 | public: 88 | /** 89 | * Run function. 90 | * Entry point of the master process, composed of the methods below. 91 | * @param argc command line argument count 92 | * @param argv command line arguments 93 | */ 94 | void run(int argc, char* argv[]); 95 | 96 | /** 97 | * Parse command line arguments. 98 | * @param argc command line argument count 99 | * @param argv command line arguments 100 | */ 101 | void parseCmdArg(int argc, char* argv[]); 102 | 103 | /** 104 | * Load user file. 105 | * @param argc command line argument count 106 | * @param argv command line arguments 107 | */ 108 | void loadUserFile(int argc, char* argv[]); 109 | 110 | /** Start all worker processes. */ 111 | void startWorkers(char* argv[]); 112 | 113 | /** Initialize some global/member variables. */ 114 | void init(); 115 | 116 | /** 117 | * Send the begin response (MW_BEGIN) to a worker. 118 | * @param worker_id destination worker id 119 | * @retval 0 message written to the worker's out buffer 120 | * @retval 1 out buffer busy, message not written 121 | */ 122 | int sendBegin(int worker_id); 123 | 124 | /** 125 | * Send the next-superstep-start message (MW_NEXTSSSTART) to a worker.
126 | * @param worker_id destination worker id 127 | * @retval 0 message written to the worker's out buffer 128 | * @retval 1 out buffer busy, message not written 129 | */ 130 | int sendNextssstart(int worker_id); 131 | 132 | /** 133 | * Send the supersteps-end message (MW_END) to a worker. 134 | * @param worker_id destination worker id 135 | * @retval 0 message written to the worker's out buffer 136 | * @retval 1 out buffer busy, message not written 137 | */ 138 | int sendEnd(int worker_id); 139 | 140 | /** 141 | * Send a message of the given type to every worker, retrying until all are written. 142 | * @param msg_type message type (MW_BEGIN, MW_NEXTSSSTART, or MW_END) 143 | */ 144 | void sendAll(int msg_type); 145 | 146 | /** 147 | * Receive and handle a pending message of any type from a worker. 148 | * @param worker_id source worker id 149 | */ 150 | void receiveMessage(int worker_id); 151 | 152 | /** Manage a series of supersteps. */ 153 | void manageSuperstep(); 154 | 155 | /** Free some global/member variables. */ 156 | void terminate(); 157 | }; // definition of Master class 158 | 159 | #endif /* MASTER_H */ 160 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/MsgBuffer.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file MsgBuffer.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License.
22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines the MsgBuffer class, which stores network messages. 26 | * The Sender class holds out_buffer, of MsgBuffer type, for messages to be 27 | * sent, and the Receiver class holds in_buffer, of MsgBuffer type, for messages 28 | * being received. 29 | * 30 | * A whole network message consists of buf_len, msg_type, and the packed message 31 | * in the buffer. buf_len is the length of the whole network message in bytes, 32 | * counting the bytes of buf_len, msg_type, and the packed message; msg_len is 33 | * the length of the message content in bytes, counting the bytes of msg_type 34 | * and the packed message, so msg_len = buf_len - sizeof(int). 35 | * In m_buffer below, the first sizeof(int) bytes store the buf_len value, the 36 | * second sizeof(int) bytes store the msg_type value, and the remaining bytes 37 | * store the packed message, starting at m_buffer[2 * sizeof(int)]. 38 | * 39 | * A network message may not be sent or received completely by a single send() 40 | * or recv() call, so m_head and m_tail mark the current position. In the send 41 | * thread, m_head is the start of the part still to be sent and accumulates the 42 | * return values of send(); in the receive thread, m_tail is the position just 43 | * past the last byte already received and accumulates the return values of 44 | * recv(). 45 | * m_head plays no role in the receive thread, and likewise m_tail in the send 46 | * thread. 47 | * 48 | */ 49 | 50 | #ifndef MSGBUFFER_H 51 | #define MSGBUFFER_H 52 | 53 | #include "Utility.h" 54 | 55 | /** Definition of MsgBuffer class. */ 56 | class MsgBuffer { 57 | public: 58 | int m_state; /**< indicates buffer state: 59 | 0: for sender, totally empty; for receiver, totally/partly empty. 60 | 1: for sender, totally/partly full; for receiver, totally full.
*/ 61 | int m_head; /**< mark head position of left-to-send message */ 62 | int m_tail; /**< mark tail position of already-received message */ 63 | int m_buf_len; /**< length of whole network message */ 64 | int m_msg_type; /**< network message type */ 65 | int m_msg_len; /**< length of network message content */ 66 | uint8_t m_buffer[BUFFER_SIZE]; /**< network message buffer */ 67 | 68 | public: 69 | MsgBuffer(): m_state(0), m_head(0), m_tail(0), m_buf_len(0), m_msg_type(0), 70 | m_msg_len(0) {} 71 | }; // definition of MsgBuffer class 72 | 73 | #endif /* MSGBUFFER_H */ 74 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Node.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Node.cc 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 
22 | * 23 | * @section DESCRIPTION 24 | * 25 | * @see Node.h 26 | * 27 | */ 28 | 29 | #include 30 | #include 31 | 32 | #include "Node.h" 33 | #include "Worker.h" 34 | 35 | extern Worker worker; 36 | 37 | int Edge::e_value_size; 38 | int Edge::e_size; 39 | int Msg::m_value_size; 40 | int Msg::m_size; 41 | int Node::n_value_size; 42 | int Node::n_size; 43 | 44 | Node& Node::getNode(int64_t index) { 45 | return *( (Node *)( (char *)(worker.m_pnode) + index * Node::n_size ) ); 46 | } 47 | 48 | Edge& Node::getEdge(int64_t index) { 49 | return *( (Edge *)( (char *)(worker.m_pedge) + index * Edge::e_size ) ); 50 | } 51 | 52 | void Node::initInMsg() { 53 | new(&m_cur_in_msg) std::vector<Msg*>(); 54 | } 55 | 56 | void Node::recvNewMsg(Msg* pmsg) { 57 | m_cur_in_msg.push_back(pmsg); 58 | if (! m_active) { 59 | m_active = true; 60 | ++worker.m_wm_curssfinish.act_vertex; 61 | } 62 | } 63 | 64 | void Node::clearCurInMsg() { 65 | if ( m_cur_in_msg.size() ) { 66 | 67 | for (int64_t i = m_cur_in_msg.size() - 1; i >= 0; --i) { 68 | worker.m_free_list.free(m_cur_in_msg[i]); 69 | } 70 | 71 | // m_cur_in_msg.clear(); memory leak 72 | freeInMsgVector(); 73 | initInMsg(); 74 | } 75 | } 76 | 77 | void Node::freeInMsgVector() { 78 | (&m_cur_in_msg)->~vector(); 79 | } 80 | 81 | int Node::getSuperstep() const { return worker.m_wm_curssfinish.superstep; } 82 | 83 | int64_t Node::getVertexId() const { return m_v_id; } 84 | 85 | void Node::voteToHalt() { 86 | m_active = false; 87 | --worker.m_wm_curssfinish.act_vertex; 88 | } 89 | 90 | GenericLinkIterator* Node::getGenericLinkIterator() { 91 | return new GenericLinkIterator(&m_cur_in_msg); 92 | } 93 | 94 | /* 95 | GenericArrayIterator* Node::getGenericArrayIterator() { 96 | return new GenericArrayIterator( 97 | (char *)(worker.m_pedge) + m_edge_index * Edge::e_size, 98 | (char *)(worker.m_pedge) + (m_edge_index + m_out_degree) * Edge::e_size, 99 | Edge::e_size ); 100 | } 101 | */ 102 | 103 | void Node::sendMessageTo(int64_t dest_vertex, const char*
pmessage) { 104 | int wk = dest_vertex % (worker.m_machine_cnt - 1) + 1; // hash partition 105 | 106 | if (wk == worker.m_addr_self.id) { // self message, can be commented for debug 1 107 | 108 | Msg* pmsg = (Msg *)( worker.m_free_list.allocate() ); 109 | pmsg->s_id = m_v_id; 110 | pmsg->d_id = dest_vertex; 111 | memcpy(pmsg->message, pmessage, Msg::m_value_size); 112 | 113 | worker.recvNewNodeMsg(pmsg); // self message certainly not for next next 114 | // superstep but for next 115 | } else { // message to another worker 116 | 117 | // check if we should send messages 118 | if (worker.m_psendlist_curpos[wk] == SENDLIST_LEN) { 119 | 120 | while ( worker.sendNodeMessage(wk, worker.m_psendlist_curpos[wk]) ); 121 | // How to change into wait ? 122 | 123 | worker.m_psendlist_curpos[wk] = 0; 124 | ++worker.m_wm_curssfinish.worker_msg[wk]; 125 | } 126 | 127 | // copy the message into send message list 128 | Msg* pmsg = (Msg *)(worker.m_pww_sendlist[wk].msgs.data + worker.m_psendlist_curpos[wk] * Msg::m_size); 129 | pmsg->s_id = m_v_id; 130 | pmsg->d_id = dest_vertex; 131 | memcpy(pmsg->message, pmessage, Msg::m_value_size); 132 | ++worker.m_psendlist_curpos[wk]; 133 | // printf("wrote one piece of node2node message to sendlist[%d]\n", wk); // 134 | } 135 | 136 | ++worker.m_wm_curssfinish.sent_msg; 137 | } 138 | 139 | void Node::sendMessageToAllNeighbors(const char* pmessage) { 140 | char* pedge = (char *)(worker.m_pedge) + m_edge_index * Edge::e_size; 141 | for (int64_t i = 0; i < m_out_degree; ++i) { 142 | sendMessageTo( ( (Edge *)pedge )->to, pmessage ); 143 | pedge += Edge::e_size; 144 | } 145 | } 146 | 147 | const void* Node::getAggrGlobal(int aggr) { 148 | return worker.m_pmy_aggregator[aggr]->getGlobal(); 149 | } 150 | 151 | void Node::accumulateAggr(int aggr, const void* p) { 152 | worker.m_pmy_aggregator[aggr]->accumulate(p); 153 | } 154 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/OutputFormatter.cc: 
-------------------------------------------------------------------------------- 1 | /** 2 | * @file OutputFormatter.cc 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * @see OutputFormatter.h 26 | * 27 | */ 28 | 29 | #include "OutputFormatter.h" 30 | #include "Worker.h" 31 | 32 | extern Worker worker; 33 | 34 | void OutputFormatter::open(const char* pout_path) { 35 | if (worker.m_hdfs_flag) { 36 | m_hdfs_file = hdfsOpenFile(worker.m_fs_handle, pout_path, 37 | O_WRONLY|O_CREAT, 0, 0, 0); 38 | if (! 
m_hdfs_file) { 39 | perror("output file open"); 40 | exit(1); 41 | } 42 | } else { 43 | m_local_file.open(pout_path); 44 | } 45 | } 46 | 47 | void OutputFormatter::close() { 48 | if (worker.m_hdfs_flag) { 49 | hdfsCloseFile(worker.m_fs_handle, m_hdfs_file); 50 | } else { 51 | m_local_file.close(); 52 | } 53 | } 54 | 55 | void OutputFormatter::writeNextResLine(char* pbuffer, int len) { 56 | if (worker.m_hdfs_flag) { 57 | hdfsWrite(worker.m_fs_handle, m_hdfs_file, (void*)pbuffer, len); 58 | } else { 59 | m_local_file.write(pbuffer, len); 60 | } 61 | } 62 | 63 | void OutputFormatter::ResultIterator::getIdValue(int64_t& vid, void* pvalue) { 64 | worker.res_iter.getIdValue(vid, pvalue); 65 | } 66 | 67 | void OutputFormatter::ResultIterator::next() { 68 | worker.res_iter.next(); 69 | } 70 | 71 | bool OutputFormatter::ResultIterator::done() { 72 | return worker.res_iter.done(); 73 | } 74 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Receiver.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Receiver.cc 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 
22 | * 23 | * @section DESCRIPTION 24 | * 25 | * @see Receiver.h 26 | * 27 | */ 28 | 29 | #include 30 | #include 31 | #include 32 | #include 33 | #include 34 | #include 35 | #include 36 | 37 | #include "Receiver.h" 38 | 39 | /** Get elapsed time. */ 40 | #define elapsedTime(begin, end) \ 41 | (double)(end.tv_sec - begin.tv_sec) + \ 42 | ((double)end.tv_usec - (double)begin.tv_usec) / 1e6; 43 | 44 | void Receiver::init(int cnt) { 45 | 46 | // 1. Set client count. 47 | m_cli_cnt = cnt; 48 | 49 | // 2. Get in_buffer memory. 50 | m_in_buffer = new MsgBuffer[m_cli_cnt]; 51 | if (! m_in_buffer) { 52 | perror("Receiver: new"); 53 | exit(1); 54 | } 55 | 56 | // 3. Initialize in_mutex. 57 | m_in_mutex = PTHREAD_MUTEX_INITIALIZER; 58 | 59 | // 4. Get server self socket. 60 | m_mysock_fd = socket(AF_INET, SOCK_STREAM, 0); 61 | if (m_mysock_fd < 0) { 62 | perror("Receiver: socket"); 63 | exit(1); 64 | } 65 | } 66 | 67 | void Receiver::bindServerAddr(int port) { 68 | // Shimin mod: set sockopt 69 | int optval= 1; 70 | setsockopt(m_mysock_fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval)); 71 | 72 | struct sockaddr_in cli_addr; 73 | memset( (char *) &cli_addr, 0, sizeof(cli_addr) ); 74 | cli_addr.sin_family = AF_INET; 75 | cli_addr.sin_addr.s_addr = INADDR_ANY; 76 | cli_addr.sin_port = htons(port); 77 | 78 | int retries = 0; 79 | while (bind( m_mysock_fd, (struct sockaddr *) &cli_addr, sizeof(cli_addr) ) < 0) { 80 | if (errno != EADDRINUSE) { 81 | perror("Receiver: bind"); 82 | exit(1); 83 | } 84 | 85 | ++retries; 86 | if (retries % 10 == 0) { 87 | perror("Receiver: bind"); 88 | } 89 | 90 | sleep(1); 91 | } 92 | } 93 | 94 | void Receiver::listenClient() { 95 | listen(m_mysock_fd, m_cli_cnt); 96 | } 97 | 98 | void Receiver::acceptClient() { 99 | int sock; 100 | char buffer[SPRINTF_LEN]; 101 | int machine_no; 102 | 103 | m_sock_fd = (int *)malloc( m_cli_cnt * sizeof(int) ); 104 | if (! 
m_sock_fd) { 105 | perror("Receiver: malloc"); 106 | exit(1); 107 | } 108 | 109 | m_max_sock = 0; 110 | m_cli_addr = (struct sockaddr_in *)malloc( m_cli_cnt * sizeof(struct sockaddr_in) ); 111 | if (! m_cli_addr) { 112 | perror("Receiver: malloc"); 113 | exit(1); 114 | } 115 | 116 | int clilen = sizeof(struct sockaddr_in); 117 | for (int i = 0; i < m_cli_cnt; ++i) { 118 | sock = accept(m_mysock_fd, (struct sockaddr *)&m_cli_addr[i], (socklen_t*)&clilen); 119 | if (sock < 0) { 120 | perror("Receiver: accept"); 121 | continue; 122 | } 123 | if (sock > m_max_sock) { 124 | m_max_sock = sock; 125 | } 126 | 127 | memset( buffer, 0, sizeof(buffer) ); 128 | int64_t ret = recv(sock, buffer, sizeof(buffer), 0); 129 | if (ret >= 0) { 130 | // printf("recv bytes: %ld\n", ret); fflush(stdout); 131 | } 132 | machine_no = atoi(buffer); 133 | m_sock_fd[machine_no] = sock; 134 | } 135 | 136 | printf("Receiver: accept all client success\n"); fflush(stdout); 137 | } 138 | 139 | void Receiver::recvMsg() { 140 | int64_t recv_bytes = 0; 141 | double elapsed = 0; 142 | struct timeval b_time, e_time; 143 | 144 | fd_set fds_orig; 145 | struct timeval tv; 146 | int64_t ret; 147 | 148 | FD_ZERO(&fds_orig); 149 | for (int i = 0; i < m_cli_cnt; ++i) { 150 | FD_SET(m_sock_fd[i], &fds_orig); 151 | } 152 | 153 | // int loop = 0; 154 | int retries = 0; 155 | while (! 
main_term) { 156 | // ++loop; 157 | // printf("Receiver: loop %d\n", loop); fflush(stdout); 158 | 159 | FD_ZERO(&m_fds); 160 | m_fds = fds_orig; 161 | 162 | tv.tv_sec = 1; 163 | tv.tv_usec = 0; 164 | 165 | ret = select(m_max_sock + 1, &m_fds, NULL, NULL, &tv); // readable 166 | // printf("Receiver: select ret %d\n", ret); fflush(stdout); 167 | if (ret < 0) { 168 | perror("Receiver: select"); 169 | break; 170 | } else if (!ret) { 171 | ++retries; 172 | if (retries % 100 == 0) { 173 | printf("Receiver: timeout\n"); fflush(stdout); 174 | } 175 | 176 | sleep(1); 177 | continue; 178 | } 179 | 180 | for (int i = 0; i < m_cli_cnt; ++i) { 181 | if (! m_in_buffer[i].m_state) { // just test, at least one buffer doesn't have complete data. 182 | if ( FD_ISSET(m_sock_fd[i], &m_fds) ) { // Socket i has been set. 183 | // double check 184 | pthread_mutex_lock(&m_in_mutex); 185 | int state = m_in_buffer[i].m_state; 186 | pthread_mutex_unlock(&m_in_mutex); 187 | 188 | if (! state) { 189 | // Every message needs to call recv() at least twice, first for message length and rest for the content. 190 | if (! m_in_buffer[i].m_buf_len) { // buf_len hasn't been read in completely. 191 | // printf("Receiver: buf_len recv()\n"); fflush(stdout); 192 | gettimeofday(&b_time, NULL); 193 | ret = recv(m_sock_fd[i], &m_in_buffer[i].m_buffer[m_in_buffer[i].m_tail], 194 | sizeof(m_in_buffer[i].m_buf_len) - m_in_buffer[i].m_tail, MSG_DONTWAIT); 195 | gettimeofday(&e_time, NULL); 196 | elapsed += elapsedTime(b_time, e_time); 197 | // printf("Receiver: buf_len ret %d\n", ret); fflush(stdout); 198 | if (ret <= 0) continue; 199 | 200 | recv_bytes += ret; 201 | m_in_buffer[i].m_tail += ret; 202 | if ( m_in_buffer[i].m_tail == sizeof(m_in_buffer[i].m_buf_len) ) { // buf_len has been read in completely. 203 | m_in_buffer[i].m_buf_len = * (int *)m_in_buffer[i].m_buffer; 204 | m_in_buffer[i].m_msg_len = m_in_buffer[i].m_buf_len - sizeof(int); 205 | } 206 | } else { // buf_len has been read in completely. 
207 | // receive 208 | // printf("Receiver: recv()\n"); fflush(stdout); 209 | gettimeofday(&b_time, NULL); 210 | ret = recv(m_sock_fd[i], &m_in_buffer[i].m_buffer[m_in_buffer[i].m_tail], 211 | m_in_buffer[i].m_msg_len, MSG_DONTWAIT); 212 | gettimeofday(&e_time, NULL); 213 | elapsed += elapsedTime(b_time, e_time); 214 | // printf("Receiver: ret %d\n", ret); fflush(stdout); 215 | if (ret <= 0) continue; 216 | 217 | recv_bytes += ret; 218 | if (ret < m_in_buffer[i].m_msg_len) { 219 | m_in_buffer[i].m_tail += ret; 220 | m_in_buffer[i].m_msg_len -= ret; 221 | } else if (ret == m_in_buffer[i].m_msg_len) { 222 | m_in_buffer[i].m_head = 0; 223 | m_in_buffer[i].m_tail = 0; 224 | m_in_buffer[i].m_msg_len = 0; 225 | m_in_buffer[i].m_buf_len = 0; 226 | // memset m_out_buffer[i].m_buffer 227 | 228 | pthread_mutex_lock(&m_in_mutex); 229 | m_in_buffer[i].m_state = 1; 230 | pthread_mutex_unlock(&m_in_mutex); 231 | } 232 | } 233 | } 234 | } 235 | } 236 | } 237 | } 238 | 239 | // printf("Receiver: break select\n"); fflush(stdout); 240 | // printf("recv bytes: %ld, network bandwidth: %f MB/s\n", recv_bytes, recv_bytes / 1e6 / elapsed); 241 | // fflush(stdout); 242 | } 243 | 244 | void Receiver::closeAllSocket() { 245 | 246 | // 1. Close all socket. 247 | for (int i = 0; i < m_cli_cnt; ++i) { 248 | close(m_sock_fd[i]); 249 | } 250 | 251 | // 2. Free memory allocated. 
252 | free(m_cli_addr); 253 | free(m_sock_fd); 254 | delete[] m_in_buffer; 255 | 256 | printf("Receiver: closeAllSocket\n"); fflush(stdout); 257 | } 258 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Receiver.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Receiver.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines the Receiver class, which receives messages sent between machines. 26 | 27 | * We currently use TCP/IP sockets as the communication protocol, 28 | * and may later switch to UDP to improve performance. 29 | 30 | * In the receive thread, Receiver keeps receiving as long as there are 31 | * empty buffers. To receive efficiently, we use the select() model 32 | * and set the MSG_DONTWAIT flag in recv(). When all buffers are full, 33 | * Receiver signal()s the main thread and wait()s until signaled by main, that is, 34 | * when there is at least one empty buffer. 
35 | * 36 | * @see MsgBuffer class 37 | * 38 | */ 39 | 40 | #ifndef RECEIVER_H 41 | #define RECEIVER_H 42 | 43 | #include 44 | #include 45 | #include 46 | #include 47 | 48 | #include "MsgBuffer.h" 49 | #include "Utility.h" 50 | 51 | extern int main_term; 52 | 53 | /** Definition of Receiver class. */ 54 | class Receiver { 55 | public: 56 | int m_mysock_fd; /**< server self socket file descriptor */ 57 | int m_cli_cnt; /**< count of clients */ 58 | int* m_sock_fd; /**< sockets with all clients */ 59 | int m_max_sock; /**< maximum socket in m_sock_fd, used for select() */ 60 | struct sockaddr_in* m_cli_addr; /**< address of clients */ 61 | fd_set m_fds; /**< file descriptor set for all sockets */ 62 | MsgBuffer* m_in_buffer; /**< in buffers for all clients */ 63 | 64 | pthread_mutex_t m_in_mutex; /**< mutex for m_in_buffer */ 65 | 66 | public: 67 | /** 68 | * Initialize. 69 | * @param cnt client count 70 | */ 71 | void init(int cnt); 72 | 73 | /** 74 | * Bind the server address to the socket created in init(). 75 | * @param port server port 76 | */ 77 | void bindServerAddr(int port); 78 | 79 | /** Listen to clients. */ 80 | void listenClient(); 81 | 82 | /** Accept clients. */ 83 | void acceptClient(); 84 | 85 | /** Receive messages from all clients continuously. */ 86 | void recvMsg(); 87 | 88 | /** Close sockets with all clients. 
*/ 89 | void closeAllSocket(); 90 | }; // definition of Receiver class 91 | 92 | #endif /* RECEIVER_H */ 93 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Sender.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Sender.cc 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * @see Sender.h 26 | * 27 | */ 28 | 29 | #include 30 | #include 31 | #include 32 | #include 33 | #include 34 | #include 35 | #include 36 | 37 | #include "Sender.h" 38 | 39 | /** Get elapsed time. */ 40 | #define elapsedTime(begin, end) \ 41 | (double)(end.tv_sec - begin.tv_sec) + \ 42 | ((double)end.tv_usec - (double)begin.tv_usec) / 1e6; 43 | 44 | void Sender::init(int cnt) { 45 | 46 | // 1. Set server count. 47 | m_serv_cnt = cnt; 48 | 49 | // 2. Get out_buffer memory. 50 | m_out_buffer = new MsgBuffer[m_serv_cnt]; 51 | if (! m_out_buffer) { 52 | perror("Sender: new"); 53 | exit(1); 54 | } 55 | 56 | // 3. Initialize out_mutex. 57 | m_out_mutex = PTHREAD_MUTEX_INITIALIZER; 58 | } 59 | 60 | void Sender::getSocketFd() { 61 | m_sock_fd = (int *)malloc( m_serv_cnt * sizeof(int) ); 62 | if (! 
m_sock_fd) { 63 | perror("Sender: malloc"); 64 | exit(1); 65 | } 66 | 67 | m_max_sock = 0; 68 | for (int i = 0; i < m_serv_cnt; ++i) { 69 | m_sock_fd[i] = socket(AF_INET, SOCK_STREAM, 0); 70 | if (m_sock_fd[i] < 0) { 71 | perror("Sender: socket"); 72 | exit(1); 73 | } 74 | 75 | if (m_sock_fd[i] > m_max_sock) { 76 | m_max_sock = m_sock_fd[i]; 77 | } 78 | } 79 | } 80 | 81 | void Sender::getServerAddr(Addr *addr) { 82 | m_serv_addr = (struct sockaddr_in *)malloc( m_serv_cnt * sizeof(struct sockaddr_in) ); 83 | if (! m_serv_addr) { 84 | perror("Sender: malloc"); 85 | exit(1); 86 | } 87 | 88 | for (int i = 0; i < m_serv_cnt; ++i) { 89 | memset( (char *) &m_serv_addr[i], 0, sizeof(struct sockaddr_in) ); 90 | m_serv_addr[i].sin_family = AF_INET; 91 | struct hostent * server = gethostbyname(addr[i].hostname); 92 | if (!server) { 93 | perror("Sender: no host"); 94 | exit(1); 95 | } 96 | 97 | memcpy( (char *)&m_serv_addr[i].sin_addr.s_addr, (char *)server->h_addr, server->h_length ); 98 | m_serv_addr[i].sin_port = htons(addr[i].port); 99 | } 100 | } 101 | 102 | void Sender::connectServer(int id) { 103 | char buffer[SPRINTF_LEN]; 104 | 105 | for (int i = 0; i < m_serv_cnt; ++i) { 106 | int retries = 0; 107 | 108 | while (connect( m_sock_fd[i], (struct sockaddr *)&m_serv_addr[i], sizeof(struct sockaddr_in) ) < 0) { 109 | if (errno != ECONNREFUSED) { 110 | perror("Sender: connect"); 111 | exit(1); 112 | } 113 | 114 | ++retries; 115 | if (retries % 10 == 0) { 116 | perror("Sender: connect"); 117 | if (retries >= 60) { 118 | fprintf(stderr, "Sender cannot connect after %d retries\n", retries); 119 | // exit(1); 120 | } 121 | } 122 | 123 | sleep(1); 124 | } 125 | 126 | memset( buffer, 0, sizeof(buffer) ); 127 | sprintf(buffer, "%d", id); 128 | int64_t ret = send(m_sock_fd[i], buffer, sizeof(buffer), 0); 129 | if (ret >= 0) { 130 | // printf("sent bytes: %ld\n", ret); fflush(stdout); 131 | } 132 | } 133 | 134 | printf("Sender: connect all server success\n"); fflush(stdout); 135 | } 
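A note on the handshake in connectServer() above: it issues a single send() of SPRINTF_LEN bytes, and Receiver::acceptClient() reads it with a single recv(), yet on a TCP stream either call may transfer fewer bytes than requested. The sketch below is not part of GraphLite. The names sendAll, recvAll, roundTripId and kHandshakeLen are hypothetical, and a local socketpair() stands in for the worker-to-worker connection. It shows one possible way to loop until the fixed-size id buffer has been transferred completely.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Assumption: mirrors SPRINTF_LEN from Utility.h (size of the id buffer).
const size_t kHandshakeLen = 1000;

// Hypothetical helper: call send() repeatedly until len bytes are written.
bool sendAll(int fd, const char* buf, size_t len) {
    size_t done = 0;
    while (done < len) {
        ssize_t n = send(fd, buf + done, len - done, 0);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, retry
            return false;                  // real error
        }
        done += (size_t)n;
    }
    return true;
}

// Hypothetical helper: call recv() repeatedly until len bytes are read.
bool recvAll(int fd, char* buf, size_t len) {
    size_t done = 0;
    while (done < len) {
        ssize_t n = recv(fd, buf + done, len - done, 0);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, retry
            return false;                  // real error
        }
        if (n == 0) return false;          // peer closed before len bytes arrived
        done += (size_t)n;
    }
    return true;
}

// Sends a machine id as a zero-padded text buffer over a local socket pair
// and parses it on the other end. Returns the parsed id, or -1 on failure.
int roundTripId(int id) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;

    char out[kHandshakeLen];
    char in[kHandshakeLen];
    memset(out, 0, sizeof(out));
    snprintf(out, sizeof(out), "%d", id);  // bounded, unlike sprintf()

    int result = -1;
    if (sendAll(sv[0], out, sizeof(out)) && recvAll(sv[1], in, sizeof(in))) {
        result = atoi(in);                 // buffer is zero-padded, so it is terminated
    }
    close(sv[0]);
    close(sv[1]);
    return result;
}

// Usage: assert(roundTripId(3) == 3);
```

Whether the extra loops matter in practice depends on the network and message size; for a 1000-byte handshake a short transfer is unlikely but permitted by the socket API.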
136 | 137 | void Sender::selectSend(fd_set& fds_orig, int64_t& sent_bytes, double& elapsed) { 138 | struct timeval b_time, e_time; 139 | 140 | FD_ZERO(&m_fds); 141 | m_fds = fds_orig; 142 | 143 | struct timeval tv; 144 | int ret; 145 | 146 | tv.tv_sec = 1; 147 | tv.tv_usec = 0; 148 | 149 | ret = select(m_max_sock + 1, NULL, &m_fds, NULL, &tv); // writable 150 | // printf("Sender: select ret %d\n", ret); fflush(stdout); 151 | if (ret < 0) { 152 | perror("Sender: select"); 153 | return; 154 | } else if (! ret) { 155 | printf("Sender: timeout\n"); fflush(stdout); 156 | return; 157 | } 158 | 159 | for (int i = 0; i < m_serv_cnt; ++i) { 160 | if (m_out_buffer[i].m_state) { // Unlocked quick test: buffer i appears to have data. 161 | if ( FD_ISSET(m_sock_fd[i], &m_fds) ) { // Socket i is writable. 162 | // Re-check the state under the mutex. 163 | pthread_mutex_lock(&m_out_mutex); 164 | int state = m_out_buffer[i].m_state; 165 | pthread_mutex_unlock(&m_out_mutex); 166 | 167 | if (state) { 168 | // printf("Sender: send()\n"); fflush(stdout); 169 | gettimeofday(&b_time, NULL); 170 | ret = send(m_sock_fd[i], m_out_buffer[i].m_buffer + m_out_buffer[i].m_head, 171 | m_out_buffer[i].m_buf_len, MSG_DONTWAIT); 172 | gettimeofday(&e_time, NULL); 173 | elapsed += elapsedTime(b_time, e_time); 174 | // printf("Sender: ret %d\n", ret); fflush(stdout); 175 | if (ret <= 0) continue; 176 | 177 | sent_bytes += ret; 178 | if (ret < m_out_buffer[i].m_buf_len) { 179 | m_out_buffer[i].m_head += ret; 180 | m_out_buffer[i].m_buf_len -= ret; 181 | } else if (ret == m_out_buffer[i].m_buf_len) { 182 | m_out_buffer[i].m_head = 0; 183 | m_out_buffer[i].m_tail = 0; 184 | m_out_buffer[i].m_msg_len = 0; 185 | m_out_buffer[i].m_buf_len = 0; 186 | // memset m_out_buffer[i].m_buffer 187 | 188 | pthread_mutex_lock(&m_out_mutex); 189 | m_out_buffer[i].m_state = 0; 190 | pthread_cond_signal(&m_out_cond); 191 | pthread_mutex_unlock(&m_out_mutex); 192 | } 193 | } 194 | } 195 | } 196 | } 197 | } 198 | 199 | void 
Sender::sendMsg() { 200 | int64_t sent_bytes = 0; 201 | double elapsed = 0; 202 | 203 | printf("Sender::sendMsg()\n"); fflush(stdout); 204 | 205 | fd_set fds_orig; 206 | FD_ZERO(&fds_orig); 207 | 208 | for (int i = 0; i < m_serv_cnt; ++i) { 209 | FD_SET(m_sock_fd[i], &fds_orig); 210 | } 211 | 212 | while (! main_term) selectSend(fds_orig, sent_bytes, elapsed); 213 | // printf("Sender: break select\n"); fflush(stdout); 214 | 215 | // one last send after main_term is set 216 | selectSend(fds_orig, sent_bytes, elapsed); 217 | 218 | // printf("sent bytes: %ld, network bandwidth: %f MB/s\n", sent_bytes, sent_bytes / 1e6 / elapsed); 219 | // fflush(stdout); 220 | } 221 | 222 | void Sender::closeAllSocket() { 223 | 224 | // 1. Close all sockets. 225 | for (int i = 0; i < m_serv_cnt; ++i) { 226 | close(m_sock_fd[i]); 227 | } 228 | 229 | // 2. Free allocated memory. 230 | free(m_serv_addr); 231 | free(m_sock_fd); 232 | delete[] m_out_buffer; 233 | 234 | printf("Sender: closeAllSocket\n"); fflush(stdout); 235 | } 236 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Sender.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Sender.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines the Sender class, which sends messages between machines. 26 | 27 | * We currently use TCP/IP sockets as the communication protocol, 28 | * and may later switch to UDP to improve performance. 29 | 30 | * In the send thread, Sender keeps sending as long as there are messages 31 | * in the buffers. To send efficiently, we use the select() model 32 | * and set the MSG_DONTWAIT flag in send(). When all buffers are empty, Sender 33 | * signal()s the main thread and wait()s until signaled by main, that is, when there 34 | * is at least one full buffer. 35 | * 36 | * @see MsgBuffer class 37 | * 38 | */ 39 | 40 | #ifndef SENDER_H 41 | #define SENDER_H 42 | 43 | #include 44 | #include 45 | #include 46 | #include 47 | 48 | #include "MsgBuffer.h" 49 | #include "Utility.h" 50 | 51 | extern int main_term; 52 | 53 | /** Definition of Sender class. */ 54 | class Sender { 55 | public: 56 | int m_serv_cnt; /**< count of servers */ 57 | int* m_sock_fd; /**< sockets with all servers */ 58 | int m_max_sock; /**< maximum socket in m_sock_fd, used for select() */ 59 | struct sockaddr_in* m_serv_addr; /**< address of servers */ 60 | fd_set m_fds; /**< file descriptor set for all sockets */ 61 | MsgBuffer* m_out_buffer; /**< out buffers for all servers */ 62 | 63 | pthread_cond_t m_out_cond; /**< condition variable for m_out_buffer */ 64 | pthread_mutex_t m_out_mutex; /**< mutex for m_out_buffer */ 65 | 66 | public: 67 | /** 68 | * Initialize. 69 | * @param cnt server count 70 | */ 71 | void init(int cnt); 72 | 73 | /** Get sockets for all servers. */ 74 | void getSocketFd(); 75 | 76 | /** 77 | * Get server addresses. 78 | * @param addr server address table 79 | */ 80 | void getServerAddr(Addr* addr); 81 | 82 | /** 83 | * Build connection with all servers. 
84 | * @param id id of this machine 85 | */ 86 | void connectServer(int id); 87 | 88 | /** 89 | * Select sockets and send messages. 90 | * @param fds_orig original file descriptors 91 | * @retval sent_bytes running total of bytes sent 92 | * @retval elapsed running total of time spent in send(), in seconds 93 | */ 94 | void selectSend(fd_set& fds_orig, int64_t& sent_bytes, double& elapsed); 95 | 96 | /** Send messages to all servers continuously. */ 97 | void sendMsg(); 98 | 99 | /** Close sockets with all servers. */ 100 | void closeAllSocket(); 101 | }; // definition of Sender class 102 | 103 | #endif /* SENDER_H */ 104 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Utility.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Utility.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines constants and message types. 
26 | * 27 | */ 28 | 29 | #ifndef UTILITY_H 30 | #define UTILITY_H 31 | 32 | #include "Addr.h" 33 | 34 | #define SPRINTF_LEN 1000 // used for sprintf() buffer 35 | // #define MEMPOOL_CAPACITY (1<<8) // memory pool capacity in ChunkedList&FreeList 36 | #define MEMPOOL_CAPACITY (1<<20) // memory pool capacity in ChunkedList&FreeList; parenthesized so the macro expands safely inside larger expressions 37 | // #define SENDLIST_LEN 2 // length of sendlist in Worker, in messages 38 | #define SENDLIST_LEN 10000 // length of sendlist in Worker, in messages 39 | #define BUFFER_SIZE 1000000 // size of Sender/Receiver buffer, in bytes 40 | /* 41 | * SENDLIST_LEN*Msg::m_size<=BUFFER_SIZE must hold. For pack(), 42 | * SENDLIST_LEN*Msg::m_size is the message length before packing; BUFFER_SIZE 43 | * must be at least the packed length, and the packed length is assumed to be 44 | * no larger than the length before packing. 45 | */ 46 | 47 | /** An enum for the in-message delivery method. */ 48 | typedef enum InMsgDeliverMethod { 49 | IMDM_OPT_PLAIN, 50 | IMDM_OPT_GROUP_PREF, 51 | IMDM_OPT_SWPL_PREF 52 | } IMDM; 53 | 54 | /** An enum for message types; only the members are used, no instance is declared. */ 55 | enum MessageType { 56 | WM_BEGIN, /**< worker requests that the supersteps begin */ 57 | MW_BEGIN, /**< master responds that the supersteps begin */ 58 | WM_CURSSFINISH, /**< worker current superstep finishes */ 59 | MW_NEXTSSSTART, /**< master next superstep starts */ 60 | MW_END, /**< master notifies workers to end supersteps */ 61 | WM_END, /**< worker reports to master after ending supersteps */ 62 | WW_NODEMSGLIST /**< worker Node2Node message list */ 63 | /* 64 | * Note that there is no WW_FINISHSENDNODEMSG message type, because it is 65 | * simply WW_NODEMSGLIST with num_msgs=0. The name is used only in comments 66 | * for illustration. 
67 | */ 68 | }; 69 | 70 | #endif 71 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/WM.begin.proto: -------------------------------------------------------------------------------- 1 | /** message from worker to master */ 2 | package wm; 3 | 4 | /** 5 | * Request for whole supersteps begin. 6 | * message type: 0 7 | */ 8 | message begin { 9 | required int32 s_id = 1; /**< worker id */ 10 | required int32 d_id = 2; /**< master id */ 11 | required int32 state = 3; /**< 0: ready to begin, 1: not ready to begin */ 12 | } 13 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/WM.curss_finish.proto: -------------------------------------------------------------------------------- 1 | /** message from worker to master */ 2 | package wm; 3 | 4 | /** 5 | * Current superstep finishes. 6 | * Worker sends this message to master after current superstep computing finishes. 7 | * message type: 2 8 | */ 9 | message curss_finish { 10 | required int32 s_id = 1; /**< worker id */ 11 | required int32 d_id = 2; /**< master id */ 12 | required int32 superstep = 3; /**< current superstep number */ 13 | required int64 compute = 4; /**< count of computations in current superstep */ 14 | required int64 recv_msg = 5; /**< count of messages received in current superstep */ 15 | required int64 sent_msg = 6; /**< count of messages sent in current superstep */ 16 | repeated int64 worker_msg = 7; /**< count of messages for each worker in current superstep */ 17 | required int64 act_vertex = 8; /**< count of active vertices left after current superstep */ 18 | repeated bytes aggr_local = 9; /**< local aggregator value after current superstep */ 19 | } 20 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/WM.end.proto: -------------------------------------------------------------------------------- 1 | /** message from worker to master 
*/ 2 | package wm; 3 | 4 | /* 5 | * Report to master after ending supersteps. 6 | * message type: 5 7 | */ 8 | message end { 9 | required int32 s_id = 1; /**< worker id */ 10 | required int32 d_id = 2; /**< master id */ 11 | required int32 state = 3; /**< 0: ended successfully, 1: ended unsuccessfully */ 12 | } 13 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/WW.nodemsg_list.proto: -------------------------------------------------------------------------------- 1 | /** message from worker to worker */ 2 | package ww; 3 | 4 | /** 5 | * Node2Node message list. 6 | * A worker may send a series of these messages to the dest-worker during a superstep. 7 | * If num_msgs is 0, this is a signal that the worker has finished 8 | * sending node2node messages to the dest-worker in the current superstep. 9 | * message type: 6 10 | */ 11 | message nodemsg_list { 12 | required int32 s_id = 1; /**< source worker id */ 13 | required int32 d_id = 2; /**< destination worker id */ 14 | required int32 superstep = 3; /**< current superstep number */ 15 | required int32 num_msgs = 4; /**< number of node messages on the list */ 16 | required int32 msg_size = 5; /**< size of one piece of node message */ 17 | required bytes msgs = 6; /**< content of compressed node messages */ 18 | } 19 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Worker.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Worker.cc 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 
13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * @see Worker.h 26 | * 27 | */ 28 | 29 | /** 30 | * 1. There are some assumptions: 31 | * (1) Vertex id counts continuously from 0; 32 | * (2) Input file has format as below. 33 | * ------------------------------------ 34 | * vertex number 35 | * edge number 36 | * edge_source_1, edge_destination_1 37 | * edge_source_1, edge_destination_2 38 | * . 39 | * . 40 | * edge_source_1, edge_destination_m1 41 | * edge_source_2, edge_destination_1 42 | * edge_source_2, edge_destination_2 43 | * . 44 | * . 45 | * edge_source_2, edge_destination_m2 46 | * . 47 | * . 48 | * . 49 | * edge_source_n, edge_destination_1 50 | * edge_source_n, edge_destination_2 51 | * . 52 | * . 53 | * edge_source_n, edge_destination_mn 54 | * ------------------------------------ 55 | * Notice that edge_source_i is in ascending order, and that n may be 56 | * smaller than the vertex number, which means some vertices may have 57 | * no outedges. 58 | * 2. We currently hash-partition the whole graph, so that the vertex 59 | * with id vid and the edges from this vertex are located at 60 | * the worker with id vid%worker_cnt+1, and the vertex index in the node array 61 | * on this worker is vid/worker_cnt. 62 | * Worker id counts from 1 and vertex index of node counts from 0. 63 | * 3. If we change graph partition method later, addVertex() and node array 64 | * initialization in readInput() should be changed accordingly. 
65 | */ 66 | 67 | #include 68 | #include 69 | // #include 70 | 71 | #include "Worker.h" 72 | 73 | #define prefetcht0(mem_var) \ 74 | __asm__ __volatile__ ("prefetcht0 %0": :"m"(mem_var)) 75 | #define prefetcht1(mem_var) \ 76 | __asm__ __volatile__ ("prefetcht1 %0": :"m"(mem_var)) 77 | #define prefetcht2(mem_var) \ 78 | __asm__ __volatile__ ("prefetcht2 %0": :"m"(mem_var)) 79 | #define prefetchnta(mem_var) \ 80 | __asm__ __volatile__ ("prefetchnta %0": :"m"(mem_var)) 81 | 82 | #define prefetch(ptr) prefetcht0(*((char *)(ptr))) 83 | 84 | extern Worker worker; 85 | 86 | // struct timeval b_time, e_time; 87 | // double elapsed = 0; 88 | 89 | void* w_receiveFun(void *args) { 90 | worker.m_receiver.bindServerAddr(worker.m_addr_self.port); 91 | worker.m_receiver.listenClient(); 92 | worker.m_receiver.acceptClient(); 93 | worker.m_receiver.recvMsg(); 94 | worker.m_receiver.closeAllSocket(); 95 | return NULL; 96 | } 97 | 98 | void* w_sendFun(void *args) { 99 | worker.m_sender.getSocketFd(); 100 | worker.m_sender.getServerAddr(worker.m_paddr_table); 101 | worker.m_sender.connectServer(worker.m_addr_self.id); 102 | worker.m_sender.sendMsg(); 103 | worker.m_sender.closeAllSocket(); 104 | return NULL; 105 | } 106 | 107 | void Worker::run(int argc, char* argv[]) { 108 | parseCmdArg(argv); 109 | loadUserFile(argc, argv); 110 | init(); 111 | // gettimeofday(&b_time, NULL); 112 | readInput(); 113 | // gettimeofday(&e_time, NULL); 114 | // elapsed = (double)(e_time.tv_sec - b_time.tv_sec) + ((double)e_time.tv_usec - (double)b_time.tv_usec)/1e6; 115 | // printf("readInput() elapsed: %f\n", elapsed); fflush(stdout); 116 | performSuperstep(); 117 | writeOutput(); 118 | terminate(); 119 | } 120 | 121 | void Worker::parseCmdArg(char* argv[]) { 122 | printf("parseCmdArg\n"); fflush(stdout); 123 | 124 | m_addr_self.id = atoi(argv[1]); 125 | m_puser_file = argv[2]; 126 | int imdm = 2; 127 | switch (imdm) { 128 | case 0: 129 | m_imdm = IMDM_OPT_PLAIN; 130 | break; 131 | case 1: 132 | m_imdm 
= IMDM_OPT_GROUP_PREF; 133 | break; 134 | case 2: 135 | m_imdm = IMDM_OPT_SWPL_PREF; 136 | break; 137 | default: 138 | break; 139 | } 140 | } 141 | 142 | void Worker::loadUserFile(int argc, char* argv[]) { 143 | printf("loadUserFile\n"); fflush(stdout); 144 | 145 | // 1. Open user file. 146 | m_puf_handle = dlopen(m_puser_file, RTLD_NOW); 147 | if (! m_puf_handle) { 148 | fprintf( stderr, "%s\n", dlerror() ); 149 | exit(1); 150 | } 151 | 152 | // 2. Create Graph class. 153 | GraphCreateFn create_graph; 154 | create_graph = (GraphCreateFn)dlsym(m_puf_handle, "create_graph"); 155 | m_pmy_graph = create_graph(); 156 | m_pmy_graph->init(argc - 2, (char **)&argv[2]); 157 | 158 | // 3. Read in configuration from Graph. 159 | m_machine_cnt = m_pmy_graph->m_machine_cnt; 160 | m_paddr_table = m_pmy_graph->m_paddr_table; 161 | m_hdfs_flag = m_pmy_graph->m_hdfs_flag; 162 | m_pfs_host = m_pmy_graph->m_pfs_host; 163 | m_fs_port = m_pmy_graph->m_fs_port; 164 | 165 | // input/output file name convention 166 | m_pin_path = (char *)malloc(SPRINTF_LEN); 167 | if (! m_pin_path) { 168 | perror("malloc"); 169 | exit(1); 170 | } 171 | sprintf(m_pin_path, "%s_%d", m_pmy_graph->m_pin_path, m_addr_self.id); 172 | m_pout_path = (char *)malloc(SPRINTF_LEN); 173 | if (! m_pout_path) { 174 | perror("malloc"); 175 | exit(1); 176 | } 177 | sprintf(m_pout_path, "%s_%d", m_pmy_graph->m_pout_path, m_addr_self.id); 178 | 179 | m_pmy_in_formatter = m_pmy_graph->m_pin_formatter; 180 | m_pmy_out_formatter = m_pmy_graph->m_pout_formatter; 181 | m_my_aggregator_cnt = m_pmy_graph->m_aggregator_cnt; 182 | m_pmy_aggregator = m_pmy_graph->m_paggregator; 183 | m_pmy_vertex = m_pmy_graph->m_pver_base; 184 | } 185 | 186 | void Worker::init() { 187 | printf("init\n"); fflush(stdout); 188 | 189 | // 1. Get self address from address table through id. 190 | strcpy(m_addr_self.hostname, m_paddr_table[m_addr_self.id].hostname); 191 | m_addr_self.port = m_paddr_table[m_addr_self.id].port; 192 | 193 | // 2. 
Connect to hdfs. 194 | if (m_hdfs_flag) { 195 | m_fs_handle = hdfsConnect(m_pfs_host, m_fs_port); 196 | if (! m_fs_handle) { 197 | perror("connect to hdfs"); 198 | exit(1); 199 | } 200 | } 201 | 202 | // 3. Get node/edge/message size from user vertex. 203 | Node::n_value_size = m_pmy_vertex->getVSize(); 204 | Node::n_size = offsetof(Node, value) + Node::n_value_size; 205 | Edge::e_value_size = m_pmy_vertex->getESize(); 206 | Edge::e_size = offsetof(Edge, weight) + Edge::e_value_size; 207 | Msg::m_value_size = m_pmy_vertex->getMSize(); 208 | Msg::m_size = offsetof(Msg, message) + Msg::m_value_size; 209 | 210 | // 4. Initialize freelist element size. 211 | m_free_list.setEle(Msg::m_size); 212 | 213 | // 5. Initialize aggregators for superstep 0. 214 | for (int i = 0; i < m_my_aggregator_cnt; ++i) { 215 | m_pmy_aggregator[i]->init(); 216 | } 217 | 218 | // 6. Initialize messages with master. 219 | wm__begin__init(&m_wm_begin); 220 | m_pmw_begin = NULL; 221 | wm__curss_finish__init(&m_wm_curssfinish); 222 | m_wm_curssfinish.n_worker_msg = m_machine_cnt; 223 | m_wm_curssfinish.worker_msg = (int64_t *)malloc( m_machine_cnt * sizeof(int64_t) ); 224 | if (! m_wm_curssfinish.worker_msg) { 225 | perror("malloc"); 226 | exit(1); 227 | } 228 | m_wm_curssfinish.n_aggr_local = m_my_aggregator_cnt; 229 | m_wm_curssfinish.aggr_local = (ProtobufCBinaryData *)malloc( 230 | (m_my_aggregator_cnt < 1 ? 1 : m_my_aggregator_cnt) * sizeof(ProtobufCBinaryData) ); 231 | if (! m_wm_curssfinish.aggr_local) { 232 | perror("malloc"); 233 | exit(1); 234 | } 235 | for (size_t i = 0; i < m_wm_curssfinish.n_aggr_local; ++i) { 236 | m_wm_curssfinish.aggr_local[i].len = m_pmy_aggregator[i]->getSize(); 237 | } 238 | m_pmw_nextssstart = NULL; 239 | m_pmw_end = NULL; 240 | wm__end__init(&m_wm_end); 241 | 242 | // 7. Initialize messages with workers. 243 | 244 | // send message list 245 | // Actually [0] for master is not used. 
246 | m_pww_sendlist = (Ww__NodemsgList *)malloc( m_machine_cnt * sizeof(Ww__NodemsgList) ); 247 | int msgs_max_len = SENDLIST_LEN * Msg::m_size; 248 | char* msgs_buf = (char *)malloc((m_machine_cnt-1) * msgs_max_len); 249 | if (! m_pww_sendlist || ! msgs_buf) { 250 | perror("malloc"); 251 | exit(1); 252 | } 253 | for (int i = 1; i < m_machine_cnt; ++i) { // 0 is master 254 | ww__nodemsg_list__init(&m_pww_sendlist[i]); 255 | 256 | m_pww_sendlist[i].num_msgs = 0; 257 | m_pww_sendlist[i].msg_size = Msg::m_size; 258 | m_pww_sendlist[i].msgs.len = 0; 259 | m_pww_sendlist[i].msgs.data = (uint8_t*) msgs_buf; 260 | msgs_buf += msgs_max_len; 261 | } 262 | 263 | m_psendlist_curpos = (size_t *)malloc( m_machine_cnt * sizeof(size_t) ); 264 | if (! m_psendlist_curpos) { 265 | perror("malloc"); 266 | exit(1); 267 | } 268 | 269 | m_pfinish_send = (int *)malloc( m_machine_cnt * sizeof(int) ); 270 | if (! m_pfinish_send) { 271 | perror("malloc"); 272 | exit(1); 273 | } 274 | 275 | // receive message list 276 | // Actually [0] for master is not used. 277 | m_pww_recvlist = (Ww__NodemsgList **)malloc( m_machine_cnt * sizeof(Ww__NodemsgList *) ); 278 | if (! m_pww_recvlist) { 279 | perror("malloc"); 280 | exit(1); 281 | } 282 | 283 | // 8. Initialize sender and receiver. 284 | m_receiver.init(m_machine_cnt); 285 | m_sender.init(m_machine_cnt); 286 | 287 | // 9. Create send and receive thread. 288 | if ( pthread_create(&m_pth_receive, NULL, w_receiveFun, NULL) ) { 289 | fprintf(stderr, "Error creating receive thread !\n"); 290 | exit(1); 291 | } 292 | if ( pthread_create(&m_pth_send, NULL, w_sendFun, NULL) ) { 293 | fprintf(stderr, "Error creating send thread !\n"); 294 | exit(1); 295 | } 296 | } 297 | 298 | /* 299 | pnode can't accumulate by Node::n_size cuz previous and current vertices 300 | maynot be stored continuously in node array in consideration of vertices 301 | with no outedges. 
302 | */ 303 | void Worker::addVertex(int64_t vid, void* pvalue, int64_t outdegree) { 304 | int worker_cnt = m_machine_cnt - 1; 305 | int64_t index = vid / worker_cnt; // local index under hash partition 306 | Node* pnode = (Node *)( (char *)m_pnode + index * Node::n_size ); 307 | pnode->m_out_degree = outdegree; 308 | pnode->m_edge_index = m_edge_cnt; 309 | m_edge_cnt += (pnode->m_out_degree); 310 | memcpy(pnode->value, pvalue, Node::n_value_size); 311 | } 312 | 313 | /* 314 | m_pcur_edge can advance by Edge::e_size because consecutive edges 315 | are stored contiguously in the edge array. 316 | */ 317 | void Worker::addEdge(int64_t from, int64_t to, void* pweight) { 318 | m_pcur_edge->from = from; 319 | m_pcur_edge->to = to; 320 | memcpy(m_pcur_edge->weight, pweight, Edge::e_value_size); 321 | m_pcur_edge = (Edge *)( (char *)m_pcur_edge + Edge::e_size ); 322 | } 323 | 324 | void Worker::readInput() { 325 | printf("readInput\n"); fflush(stdout); 326 | 327 | // 1. Open input file and get basic information about graph. 328 | m_pmy_in_formatter->open(m_pin_path); 329 | m_total_vertex = m_pmy_in_formatter->getVertexNum(); 330 | m_total_edge = m_pmy_in_formatter->getEdgeNum(); 331 | m_pmy_in_formatter->getVertexValueSize(); 332 | m_pmy_in_formatter->getEdgeValueSize(); 333 | m_pmy_in_formatter->getMessageValueSize(); 334 | 335 | // 2. Allocate memory for vertices. 336 | m_pnode = (Node *)malloc(m_total_vertex * Node::n_size); 337 | if (! m_pnode) { 338 | perror("malloc"); 339 | exit(1); 340 | } 341 | 342 | // 3. Initialize node array.
cur_vid related to hash partition 343 | char* p = (char *)m_pnode; 344 | int64_t cur_vid = m_addr_self.id - 1; 345 | int worker_cnt = m_machine_cnt - 1; 346 | for (int64_t i = 0; i < m_total_vertex; 347 | ++i, p += Node::n_size, cur_vid += worker_cnt) { 348 | Node* pnode = (Node *)p; 349 | pnode->m_active = true; 350 | pnode->m_v_id = cur_vid; 351 | pnode->m_out_degree = 0; 352 | pnode->m_edge_index = -1; 353 | memset(pnode->value, 0, Node::n_value_size); 354 | pnode->initInMsg(); 355 | } 356 | 357 | // 4. Allocate memory for edges. 358 | m_pedge = (Edge *)malloc(m_total_edge * Edge::e_size); 359 | if (! m_pedge) { 360 | perror("malloc"); 361 | exit(1); 362 | } 363 | m_edge_cnt = 0; 364 | 365 | // 5. Check that vertex/edge/message types are consistent between user file 366 | // and input file. These are not errno failures, so report with fprintf. 367 | if (Node::n_value_size != m_pmy_in_formatter->m_n_value_size) { 368 | fprintf(stderr, "vertex type inconsistent between user file and input file\n"); 369 | exit(1); 370 | } 371 | if (Edge::e_value_size != m_pmy_in_formatter->m_e_value_size) { 372 | fprintf(stderr, "edge type inconsistent between user file and input file\n"); 373 | exit(1); 374 | } 375 | if (Msg::m_value_size != m_pmy_in_formatter->m_m_value_size) { 376 | fprintf(stderr, "message type inconsistent between user file and input file\n"); 377 | exit(1); 378 | } 379 | 380 | // Notice that it's useless to assign cur_node/cur_edge in constructor. 381 | // The proper time is after memory allocation for vertices/edges.
382 | m_pcur_edge = m_pedge; 383 | m_pmy_in_formatter->loadGraph(); 384 | 385 | m_pmy_in_formatter->close(); 386 | } 387 | 388 | void Worker::recvNewNodeMsg(Msg* pmsg) { 389 | switch (m_imdm) { 390 | case IMDM_OPT_PLAIN: 391 | case IMDM_OPT_GROUP_PREF: 392 | case IMDM_OPT_SWPL_PREF: 393 | { 394 | m_pnext_all_in_msg_chunklist->append(pmsg); 395 | } 396 | break; 397 | default: 398 | break; 399 | } 400 | } 401 | 402 | void Worker::recvNewNodeMsg2(Msg* pmsg) { 403 | switch (m_imdm) { 404 | case IMDM_OPT_PLAIN: 405 | case IMDM_OPT_GROUP_PREF: 406 | case IMDM_OPT_SWPL_PREF: 407 | { 408 | m_pnext2_all_in_msg_chunklist->append(pmsg); 409 | } 410 | break; 411 | default: 412 | break; 413 | } 414 | } 415 | 416 | void Worker::deliverAllNewNodeMsg() { 417 | Node* np; 418 | 419 | // Deliver messages to next_in_msg in each node. 420 | switch (m_imdm) { 421 | case IMDM_OPT_PLAIN: { // This uses the data structure but does not optimize. 422 | Msg *mp; 423 | int worker_cnt = m_machine_cnt - 1; 424 | 425 | ChunkedList::Iterator* pit = m_pnext_all_in_msg_chunklist->getIterator(); 426 | for ( mp = (Msg *)pit->next(); mp; mp = (Msg *)pit->next() ) { 427 | int64_t index = mp->d_id / worker_cnt; 428 | np = (Node *)( (char *)m_pnode + index * Node::n_size ); 429 | np->recvNewMsg(mp); 430 | } 431 | 432 | // for the new superstep 433 | delete pit; 434 | delete m_pnext_all_in_msg_chunklist; 435 | m_pnext_all_in_msg_chunklist = m_pnext2_all_in_msg_chunklist; 436 | m_pnext2_all_in_msg_chunklist = new ChunkedList(); 437 | } 438 | break; 439 | 440 | case IMDM_OPT_GROUP_PREF: { 441 | Msg *mp; 442 | int worker_cnt = m_machine_cnt - 1; 443 | ChunkedList::Iterator* pit = m_pnext_all_in_msg_chunklist->getIterator(); 444 | 445 | // preparation 446 | const int pref_group_size = 8; // This is a tunable parameter 447 | int64_t total_num = m_pnext_all_in_msg_chunklist->total(); 448 | 449 | Msg *msg[pref_group_size]; 450 | Node *nd[pref_group_size]; 451 | 452 | // for each group 453 | for (int64_t i = 
pref_group_size; i < total_num; i += pref_group_size) { 454 | 455 | // (1) obtain the Msg* and prefetch 456 | for (int j = 0; j < pref_group_size; j++) { 457 | msg[j] = (Msg *)pit->next(); 458 | prefetch(msg[j]); 459 | } 460 | 461 | // (2) compute node and prefetch 462 | for (int j = 0; j < pref_group_size; j++) { 463 | int64_t index = msg[j]->d_id / worker_cnt; 464 | nd[j] = (Node *)( (char *)m_pnode + index * Node::n_size ); 465 | prefetch(nd[j]); 466 | } 467 | 468 | // (3) deliver message 469 | for (int j = 0; j < pref_group_size; j++) { 470 | nd[j]->recvNewMsg(msg[j]); 471 | } 472 | } 473 | 474 | // process the rest of the messages 475 | for ( mp = (Msg *)pit->next(); mp; mp = (Msg *)pit->next() ) { 476 | int64_t index = mp->d_id / worker_cnt; 477 | np = (Node *)( (char *)m_pnode + index * Node::n_size ); 478 | np->recvNewMsg(mp); 479 | } 480 | 481 | // for the new superstep 482 | delete pit; 483 | delete m_pnext_all_in_msg_chunklist; 484 | m_pnext_all_in_msg_chunklist = m_pnext2_all_in_msg_chunklist; 485 | m_pnext2_all_in_msg_chunklist = new ChunkedList(); 486 | } 487 | break; 488 | 489 | case IMDM_OPT_SWPL_PREF: { 490 | int worker_cnt = m_machine_cnt - 1; 491 | ChunkedList::Iterator* pit = m_pnext_all_in_msg_chunklist->getIterator(); 492 | int64_t total_num = m_pnext_all_in_msg_chunklist->total(); 493 | 494 | // preparation 495 | const int pref_dist = 4; // This is a tunable parameter 496 | const int pref_group_size = 16; // pref_group_size >= pref_dist*3 497 | // must be a power of 2 498 | Msg *msg[pref_group_size]; 499 | Node *nd[pref_group_size]; 500 | int j1, j2, j3; // j2 = j1 + pref_dist, j3 = j2 + pref_dist 501 | 502 | int64_t i, i_end; 503 | 504 | // start up 505 | i = 0; 506 | i_end = 2*pref_dist; 507 | if (i_end > total_num) i_end = total_num; 508 | 509 | for (; i < i_end; i++) { 510 | 511 | // (1) obtain the Msg* and prefetch 512 | j1 = i%pref_group_size; 513 | msg[j1] = (Msg *)pit->next(); 514 | prefetch(msg[j1]); 515 | 516 | // (2) compute node 
and prefetch 517 | if (i >= pref_dist) { 518 | j2= (i - pref_dist)%pref_group_size; 519 | int64_t index = msg[j2]->d_id / worker_cnt; 520 | nd[j2] = (Node *)( (char *)m_pnode + index * Node::n_size ); 521 | prefetch(nd[j2]); 522 | } 523 | 524 | } 525 | 526 | // main pipeline 527 | i_end = total_num; 528 | for (; i < i_end; i++) { 529 | 530 | // (1) obtain the Msg* and prefetch 531 | j1 = i%pref_group_size; 532 | msg[j1] = (Msg *)pit->next(); 533 | prefetch(msg[j1]); 534 | 535 | // (2) compute node and prefetch 536 | j2 = (i - pref_dist)%pref_group_size; 537 | int64_t index = msg[j2]->d_id / worker_cnt; 538 | nd[j2] = (Node *)( (char *)m_pnode + index * Node::n_size ); 539 | prefetch(nd[j2]); 540 | 541 | // (3) deliver message 542 | j3 = (i - 2 * pref_dist) % pref_group_size; 543 | nd[j3]->recvNewMsg(msg[j3]); 544 | } 545 | 546 | // drain down 547 | i_end = total_num + 2 * pref_dist; 548 | for (; i < i_end; i++) { 549 | 550 | // (2) compute node and prefetch 551 | if (i >= pref_dist && i - pref_dist < total_num) { 552 | // "&&" has lower priority than ">=" "-" "<" . 553 | j2 = (i - pref_dist) % pref_group_size; 554 | int64_t index = msg[j2]->d_id / worker_cnt; 555 | nd[j2] = (Node *)( (char *)m_pnode + index * Node::n_size ); 556 | prefetch(nd[j2]); 557 | } 558 | 559 | // (3) deliver message 560 | if (i >= 2 * pref_dist) { 561 | j3 = (i - 2 * pref_dist) % pref_group_size; 562 | nd[j3]->recvNewMsg(msg[j3]); 563 | } 564 | } 565 | 566 | // for the new superstep 567 | delete pit; 568 | delete m_pnext_all_in_msg_chunklist; 569 | m_pnext_all_in_msg_chunklist = m_pnext2_all_in_msg_chunklist; 570 | m_pnext2_all_in_msg_chunklist = new ChunkedList(); 571 | } 572 | break; 573 | default: 574 | break; 575 | } 576 | } 577 | 578 | void Worker::sendBegin() { 579 | 580 | // 1. Set message content to be sent. 581 | m_wm_begin.s_id = m_addr_self.id; 582 | m_wm_begin.d_id = 0; 583 | m_wm_begin.state = 0; 584 | 585 | // 2. Send to master. 
586 | pthread_mutex_lock(&m_sender.m_out_mutex); 587 | while (m_sender.m_out_buffer[0].m_state) { 588 | pthread_cond_wait(&m_sender.m_out_cond, &m_sender.m_out_mutex); 589 | } 590 | 591 | // totally empty, can write to m_out_buffer 592 | m_sender.m_out_buffer[0].m_msg_len = sizeof(int) + 593 | wm__begin__pack(&m_wm_begin, 594 | &m_sender.m_out_buffer[0].m_buffer[2 * sizeof(int)]); 595 | m_sender.m_out_buffer[0].m_buf_len = m_sender.m_out_buffer[0].m_msg_len + sizeof(int); 596 | * (int *)m_sender.m_out_buffer[0].m_buffer = m_sender.m_out_buffer[0].m_buf_len; 597 | * (int *)&(m_sender.m_out_buffer[0].m_buffer[sizeof(int)]) = WM_BEGIN; 598 | m_sender.m_out_buffer[0].m_head = 0; 599 | m_sender.m_out_buffer[0].m_tail = m_sender.m_out_buffer[0].m_buf_len; 600 | m_sender.m_out_buffer[0].m_state = 1; 601 | 602 | pthread_mutex_unlock(&m_sender.m_out_mutex); 603 | } 604 | 605 | void Worker::sendCurssfinish() { 606 | 607 | // 1. Set message content to be sent. 608 | m_wm_curssfinish.s_id = m_addr_self.id; 609 | m_wm_curssfinish.d_id = 0; 610 | for (size_t i = 0; i < m_wm_curssfinish.n_aggr_local; ++i) { 611 | m_wm_curssfinish.aggr_local[i].data = (uint8_t *)( m_pmy_aggregator[i]->getLocal() ); 612 | } 613 | 614 | // 2. Send to master. 
615 | pthread_mutex_lock(&m_sender.m_out_mutex); 616 | while (m_sender.m_out_buffer[0].m_state) { 617 | pthread_cond_wait(&m_sender.m_out_cond, &m_sender.m_out_mutex); 618 | } 619 | 620 | // totally empty, can write to m_out_buffer 621 | m_sender.m_out_buffer[0].m_msg_len = sizeof(int) + 622 | wm__curss_finish__pack(&m_wm_curssfinish, 623 | &m_sender.m_out_buffer[0].m_buffer[2 * sizeof(int)]); 624 | m_sender.m_out_buffer[0].m_buf_len = m_sender.m_out_buffer[0].m_msg_len + sizeof(int); 625 | * (int *)m_sender.m_out_buffer[0].m_buffer = m_sender.m_out_buffer[0].m_buf_len; 626 | * (int *)&(m_sender.m_out_buffer[0].m_buffer[sizeof(int)]) = WM_CURSSFINISH; 627 | m_sender.m_out_buffer[0].m_head = 0; 628 | m_sender.m_out_buffer[0].m_tail = m_sender.m_out_buffer[0].m_buf_len; 629 | m_sender.m_out_buffer[0].m_state = 1; 630 | 631 | pthread_mutex_unlock(&m_sender.m_out_mutex); 632 | } 633 | 634 | void Worker::sendEnd() { 635 | 636 | // 1. Set message content to be sent. 637 | m_wm_end.s_id = m_addr_self.id; 638 | m_wm_end.d_id = 0; 639 | m_wm_end.state = 0; 640 | 641 | // 2. Send to master. 
642 | pthread_mutex_lock(&m_sender.m_out_mutex); 643 | while (m_sender.m_out_buffer[0].m_state) { 644 | pthread_cond_wait(&m_sender.m_out_cond, &m_sender.m_out_mutex); 645 | } 646 | 647 | // totally empty, can write to m_out_buffer 648 | m_sender.m_out_buffer[0].m_msg_len = sizeof(int) + 649 | wm__end__pack(&m_wm_end, 650 | &m_sender.m_out_buffer[0].m_buffer[2 * sizeof(int)]); 651 | m_sender.m_out_buffer[0].m_buf_len = m_sender.m_out_buffer[0].m_msg_len + sizeof(int); 652 | * (int *)m_sender.m_out_buffer[0].m_buffer = m_sender.m_out_buffer[0].m_buf_len; 653 | * (int *)&(m_sender.m_out_buffer[0].m_buffer[sizeof(int)]) = WM_END; 654 | m_sender.m_out_buffer[0].m_head = 0; 655 | m_sender.m_out_buffer[0].m_tail = m_sender.m_out_buffer[0].m_buf_len; 656 | m_sender.m_out_buffer[0].m_state = 1; 657 | 658 | pthread_mutex_unlock(&m_sender.m_out_mutex); 659 | } 660 | 661 | int Worker::sendNodeMessage(int worker_id, int num_msg) { 662 | int ret = 1; 663 | 664 | // 1. Get new messages in current superstep from other workers if necessary. 665 | for (int j = 1; j < m_machine_cnt; ++j) receiveMessage(j); 666 | 667 | // 2. Set message content to be sent. 668 | m_pww_sendlist[worker_id].s_id = m_addr_self.id; 669 | m_pww_sendlist[worker_id].d_id = worker_id; 670 | m_pww_sendlist[worker_id].superstep = m_wm_curssfinish.superstep; 671 | m_pww_sendlist[worker_id].num_msgs = num_msg; 672 | m_pww_sendlist[worker_id].msgs.len = num_msg * m_pww_sendlist[worker_id].msg_size; 673 | // m_pww_sendlist[worker_id].msgs.data has been set in Node::sendMessageTo() 674 | 675 | // 3. Send to dest-worker. 676 | if (! m_sender.m_out_buffer[worker_id].m_state) { // just test, totally empty, can write to m_out_buffer 677 | 678 | pthread_mutex_lock(&m_sender.m_out_mutex); 679 | if (! 
m_sender.m_out_buffer[worker_id].m_state) { // totally empty, can write to m_out_buffer 680 | m_sender.m_out_buffer[worker_id].m_msg_len = sizeof(int) + 681 | ww__nodemsg_list__pack(&m_pww_sendlist[worker_id], 682 | &m_sender.m_out_buffer[worker_id].m_buffer[2 * sizeof(int)]); 683 | m_sender.m_out_buffer[worker_id].m_buf_len = m_sender.m_out_buffer[worker_id].m_msg_len + sizeof(int); 684 | * (int *)m_sender.m_out_buffer[worker_id].m_buffer = m_sender.m_out_buffer[worker_id].m_buf_len; 685 | * (int *)&(m_sender.m_out_buffer[worker_id].m_buffer[sizeof(int)]) = WW_NODEMSGLIST; 686 | m_sender.m_out_buffer[worker_id].m_head = 0; 687 | m_sender.m_out_buffer[worker_id].m_tail = m_sender.m_out_buffer[worker_id].m_buf_len; 688 | m_sender.m_out_buffer[worker_id].m_state = 1; 689 | ret = 0; 690 | } 691 | pthread_mutex_unlock(&m_sender.m_out_mutex); 692 | } // else can't write to m_out_buffer 693 | 694 | return ret; 695 | } 696 | 697 | void Worker::receiveMessage(int machine_id) { 698 | 699 | if (m_receiver.m_in_buffer[machine_id].m_state) { // just test, totally full, can read from m_in_buffer 700 | 701 | pthread_mutex_lock(&m_receiver.m_in_mutex); 702 | if (m_receiver.m_in_buffer[machine_id].m_state) { // totally full, can read from m_in_buffer 703 | m_receiver.m_in_buffer[machine_id].m_buf_len = * (int *)m_receiver.m_in_buffer[machine_id].m_buffer; 704 | m_receiver.m_in_buffer[machine_id].m_msg_type = * (int *)&(m_receiver.m_in_buffer[machine_id].m_buffer[sizeof(int)]); 705 | 706 | if (m_receiver.m_in_buffer[machine_id].m_buf_len) { 707 | m_receiver.m_in_buffer[machine_id].m_msg_len = m_receiver.m_in_buffer[machine_id].m_buf_len - sizeof(int); 708 | int pack_len = m_receiver.m_in_buffer[machine_id].m_msg_len - sizeof(int); 709 | 710 | switch (m_receiver.m_in_buffer[machine_id].m_msg_type) { 711 | case MW_BEGIN: 712 | m_pmw_begin = mw__begin__unpack(NULL, pack_len, &m_receiver.m_in_buffer[machine_id].m_buffer[2 * sizeof(int)]); 713 | if (m_pmw_begin->state == 0) { // this 
worker ready to begin 714 | m_from_master = 1; 715 | } 716 | mw__begin__free_unpacked(m_pmw_begin, NULL); 717 | break; 718 | case MW_NEXTSSSTART: 719 | // We keep the m_pmw_nextssstart until the next use, e.g. visit ->superstep/node_msg/aggr_global. 720 | if (m_pmw_nextssstart) { 721 | mw__nextss_start__free_unpacked(m_pmw_nextssstart, NULL); 722 | m_pmw_nextssstart = NULL; 723 | } 724 | m_pmw_nextssstart = mw__nextss_start__unpack(NULL, pack_len, &m_receiver.m_in_buffer[machine_id].m_buffer[2 * sizeof(int)]); 725 | m_node_msg = m_pmw_nextssstart->node_msg; 726 | // m_node_msg can be used to control time of worker waiting 727 | // to receive messages from other workers, but here we use 728 | // m_finishnn_wk to control. 729 | for (int i = 0; i < m_my_aggregator_cnt; ++i) { 730 | m_pmy_aggregator[i]->setGlobal(m_pmw_nextssstart->aggr_global[i].data); 731 | } 732 | m_from_master = 1; 733 | break; 734 | case MW_END: 735 | m_pmw_end = mw__end__unpack(NULL, pack_len, &m_receiver.m_in_buffer[machine_id].m_buffer[2 * sizeof(int)]); 736 | if (m_pmw_end->state == 0) { // OK to end 737 | m_term = 1; 738 | } 739 | mw__end__free_unpacked(m_pmw_end, NULL); 740 | m_from_master = 1; 741 | break; 742 | case WW_NODEMSGLIST: 743 | m_pww_recvlist[machine_id] = ww__nodemsg_list__unpack(NULL, pack_len, &m_receiver.m_in_buffer[machine_id].m_buffer[2 * sizeof(int)]); 744 | if (m_pww_recvlist[machine_id]->num_msgs == 0) { 745 | ++m_finishnn_wk; 746 | } else { 747 | int ss = m_pww_recvlist[machine_id]->superstep; 748 | if (ss == m_wm_curssfinish.superstep) { 749 | 750 | Msg* pnode_msg; 751 | char* p = (char *)(m_pww_recvlist[machine_id]->msgs.data); 752 | for (int i = 0; i < m_pww_recvlist[machine_id]->num_msgs; ++i, p+=Msg::m_size) { 753 | pnode_msg = (Msg *)( m_free_list.allocate() ); 754 | memcpy(pnode_msg, p, Msg::m_size); 755 | recvNewNodeMsg(pnode_msg); 756 | } 757 | 758 | } else if (ss == m_wm_curssfinish.superstep + 1) { 759 | 760 | Msg* pnode_msg; 761 | char* p = (char 
*)(m_pww_recvlist[machine_id]->msgs.data); 762 | for (int i = 0; i < m_pww_recvlist[machine_id]->num_msgs; ++i, p+=Msg::m_size) { 763 | pnode_msg = (Msg *)( m_free_list.allocate() ); 764 | memcpy(pnode_msg, p, Msg::m_size); 765 | recvNewNodeMsg2(pnode_msg); 766 | } 767 | 768 | } 769 | 770 | // This is the total number of received node-node messages 771 | m_wm_curssfinish.recv_msg += m_pww_recvlist[machine_id]->num_msgs; 772 | } 773 | ww__nodemsg_list__free_unpacked(m_pww_recvlist[machine_id], NULL); 774 | break; 775 | default: 776 | fprintf(stderr, "There is no such message type !\n"); 777 | break; 778 | } 779 | 780 | m_receiver.m_in_buffer[machine_id].m_buf_len = 0; 781 | m_receiver.m_in_buffer[machine_id].m_state = 0; 782 | } 783 | } 784 | pthread_mutex_unlock(&m_receiver.m_in_mutex); 785 | 786 | } // else can't read from m_in_buffer 787 | } 788 | 789 | void Worker::performSuperstep() { 790 | printf("performSuperstep\n"); fflush(stdout); 791 | 792 | // 1. Initialize before the first superstep. 793 | // The in-message lists are empty. 794 | switch (m_imdm) { 795 | case IMDM_OPT_PLAIN: 796 | case IMDM_OPT_GROUP_PREF: 797 | case IMDM_OPT_SWPL_PREF: 798 | { 799 | m_pnext_all_in_msg_chunklist = new ChunkedList(); 800 | m_pnext2_all_in_msg_chunklist = new ChunkedList(); 801 | } 802 | break; 803 | default: 804 | break; 805 | } 806 | m_wm_curssfinish.superstep = -1; 807 | m_wm_curssfinish.act_vertex = m_total_vertex; 808 | 809 | // 2. Send requests for whole supersteps begin to master. 810 | sendBegin(); 811 | printf("sent WM_BEGIN\n"); fflush(stdout); 812 | 813 | // 3. Receive master respond to whole supersteps begin. 814 | m_from_master = 0; 815 | while (! m_from_master) receiveMessage(0); 816 | printf("received MW_BEGIN\n"); fflush(stdout); 817 | 818 | // 4. Run into supersteps. 819 | int msg2send; // count of messages to send, used in 4.4 & 4.5 820 | m_term = 0; 821 | while (! m_term) { 822 | 823 | // 4.1 Initialize before every superstep. 
824 | ++m_wm_curssfinish.superstep; 825 | printf("-----------------------------------------\n"); fflush(stdout); 826 | printf("superstep: %d\n", m_wm_curssfinish.superstep); fflush(stdout); 827 | m_wm_curssfinish.compute = 0; 828 | m_wm_curssfinish.recv_msg = 0; 829 | m_wm_curssfinish.sent_msg = 0; 830 | memset( m_wm_curssfinish.worker_msg, 0, m_machine_cnt * sizeof(*m_wm_curssfinish.worker_msg) ); // element size must match the allocation in init() 831 | memset( m_psendlist_curpos, 0, m_machine_cnt * sizeof(size_t) ); 832 | memset( m_pfinish_send, 0, m_machine_cnt * sizeof(int) ); 833 | m_finishnn_wk = 2; // start at 2: master and myself send no finish signal 834 | // m_finishnn_wk = 1; // excluding master only, for debug 2 835 | 836 | // 4.2 Deliver node messages to current in-message list. 837 | deliverAllNewNodeMsg(); 838 | 839 | // 4.3 Local compute. 840 | // printf("before compute()\n"); fflush(stdout); 841 | char* p = (char *)m_pnode; 842 | for (int64_t i = 0; i < m_total_vertex; ++i, p += Node::n_size) { 843 | Node* pnode = (Node *)p; 844 | 845 | if (pnode->m_active) { // active node 846 | m_pmy_vertex->setMe(pnode); 847 | 848 | GenericLinkIterator* pgeneric_link_iterator = 849 | pnode->getGenericLinkIterator(); 850 | m_pmy_vertex->compute(pgeneric_link_iterator); 851 | delete pgeneric_link_iterator; 852 | 853 | pnode->clearCurInMsg(); 854 | ++m_wm_curssfinish.compute; 855 | } 856 | 857 | // get new messages in current superstep from other workers if necessary 858 | for (int j = 1; j < m_machine_cnt; ++j) receiveMessage(j); 859 | } // end of local compute 860 | // printf("after compute()\n"); fflush(stdout); 861 | 862 | // 4.4 Send last node2node message to some workers.
863 | do { 864 | msg2send = 0; 865 | for (int i = 1; i < m_machine_cnt; ++i) { 866 | if (m_psendlist_curpos[i] > 0) { 867 | ++msg2send; 868 | if ( sendNodeMessage(i, m_psendlist_curpos[i]) ) continue; 869 | printf("sent WW_NODEMSGLIST, %lld msgs to worker[%d]\n", 870 | (long long)m_psendlist_curpos[i], i); fflush(stdout); 871 | ++m_wm_curssfinish.worker_msg[i]; 872 | m_psendlist_curpos[i] = 0; 873 | --msg2send; 874 | } 875 | } 876 | } while (msg2send); 877 | 878 | // 4.5 Notify the other workers that this worker has finished sending node messages. 879 | /* 880 | Only a flag array is needed here. Ignoring that memset over size_t costs 881 | more than over int, m_psendlist_curpos could be reused for convenience. 882 | If all flags in the program are later changed to a bit representation, 883 | these two flag arrays can be combined into one. 884 | memset( m_psendlist_curpos, -1, m_machine_cnt * sizeof(size_t) ); 885 | m_pfinish_send[i] = 0; 886 | */ 887 | do { 888 | msg2send = 0; 889 | for (int i = 1; i < m_machine_cnt; ++i) { 890 | if (i == m_addr_self.id) continue; // do not send the finish signal to myself; can be commented out for debug 3 891 | if (! m_pfinish_send[i]) { 892 | ++msg2send; 893 | if ( sendNodeMessage(i, 0) ) continue; // argument 0 marks this as a signal message 894 | printf("sent WW_FINISHSENDNODEMSG to worker[%d]\n", i); fflush(stdout); 895 | m_pfinish_send[i] = 1; 896 | --msg2send; 897 | } 898 | } 899 | } while (msg2send); 900 | 901 | // 4.6 Wait for all the other workers' signals of having finished 902 | // sending node messages. 903 | while (m_finishnn_wk < m_machine_cnt) { 904 | for (int j = 1; j < m_machine_cnt; ++j) receiveMessage(j); 905 | } 906 | printf("received all WW_NODEMSGLIST && WW_FINISHNODEMSG\n"); fflush(stdout); 907 | 908 | // 4.7 Send current superstep finish to master.
909 | sendCurssfinish(); 910 | printf("sent WM_CURSSFINISH\n"); fflush(stdout); 911 | 912 | // 4.8 Clear aggregator values before the barrier rather than in 4.1, 913 | // because the global value will be set later and used in compute() of the next 914 | // superstep. 915 | for (int i = 0; i < m_my_aggregator_cnt; ++i) m_pmy_aggregator[i]->init(); 916 | 917 | // 4.9 Wait for barrier message from master. 918 | m_from_master = 0; 919 | while (! m_from_master) receiveMessage(0); 920 | if (m_term) { 921 | printf("received MW_END\n"); fflush(stdout); 922 | sendEnd(); 923 | printf("sent WM_END\n"); fflush(stdout); 924 | } else { 925 | printf("received MW_NEXTSSSTART\n"); fflush(stdout); 926 | } 927 | } // end of while 928 | 929 | // 5. Signal the main thread to terminate. 930 | main_term = 1; 931 | } 932 | 933 | void Worker::writeOutput() { 934 | printf("writeOutput\n"); fflush(stdout); 935 | 936 | m_pmy_out_formatter->open(m_pout_path); 937 | 938 | res_iter.init(m_pnode, m_total_vertex); 939 | 940 | m_pmy_out_formatter->writeResult(); 941 | 942 | m_pmy_out_formatter->close(); 943 | } 944 | 945 | 946 | void Worker::terminate() { 947 | printf("terminate\n"); fflush(stdout); 948 | 949 | // 1. Disconnect from hdfs. 950 | if ( m_hdfs_flag && hdfsDisconnect(m_fs_handle) ) { 951 | perror("disconnect from hdfs"); 952 | exit(1); 953 | } 954 | 955 | // 2. Destroy Graph class. 956 | if (m_pin_path) free(m_pin_path); 957 | if (m_pout_path) free(m_pout_path); 958 | 959 | m_pmy_graph->term(); 960 | GraphDestroyFn destroy_graph; 961 | destroy_graph = (GraphDestroyFn)dlsym(m_puf_handle, "destroy_graph"); 962 | destroy_graph(m_pmy_graph); 963 | 964 | // 3. Free memory allocated.
965 | if (m_pnext_all_in_msg_chunklist) delete m_pnext_all_in_msg_chunklist; 966 | if (m_pnext2_all_in_msg_chunklist) delete m_pnext2_all_in_msg_chunklist; 967 | if (m_pedge) free(m_pedge); 968 | 969 | char* p = (char *)m_pnode; 970 | for (int64_t i = 0; i < m_total_vertex; ++i, p += Node::n_size) { 971 | Node* pnode = (Node *)p; 972 | pnode->freeInMsgVector(); 973 | } 974 | if (m_pnode) free(m_pnode); 975 | 976 | if (m_pmw_nextssstart) { 977 | mw__nextss_start__free_unpacked(m_pmw_nextssstart, NULL); 978 | } 979 | if (m_pww_recvlist) free(m_pww_recvlist); 980 | if (m_pfinish_send) free(m_pfinish_send); 981 | if (m_psendlist_curpos) free(m_psendlist_curpos); 982 | if (m_pww_sendlist[1].msgs.data) free(m_pww_sendlist[1].msgs.data); // msgs_buf 983 | if (m_pww_sendlist) free(m_pww_sendlist); 984 | if (m_wm_curssfinish.aggr_local) free(m_wm_curssfinish.aggr_local); 985 | if (m_wm_curssfinish.worker_msg) free(m_wm_curssfinish.worker_msg); 986 | 987 | // 4. Quit receive & send threads. 988 | void* recv_retval; 989 | pthread_join(m_pth_receive, &recv_retval); 990 | void* send_retval; 991 | pthread_join(m_pth_send, &send_retval); 992 | } 993 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/Worker.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Worker.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 
13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines the Worker class, which manages local computation on a subgraph. 26 | * Note that we choose hdfs as the file system. 27 | * 28 | */ 29 | 30 | #ifndef WORKER_H 31 | #define WORKER_H 32 | 33 | #include 34 | #include 35 | 36 | #include "hdfs.h" 37 | #include "Utility.h" 38 | #include "Graph.h" 39 | #include "InputFormatter.h" 40 | #include "OutputFormatter.h" 41 | #include "AggregatorBase.h" 42 | #include "VertexBase.h" 43 | #include "Node.h" 44 | #include "FreeList.h" 45 | #include "ChunkedList.h" 46 | #include "Sender.h" 47 | #include "Receiver.h" 48 | #include "WM.begin.pb-c.h" 49 | #include "MW.begin.pb-c.h" 50 | #include "WM.curss_finish.pb-c.h" 51 | #include "MW.nextss_start.pb-c.h" 52 | #include "MW.end.pb-c.h" 53 | #include "WM.end.pb-c.h" 54 | #include "WW.nodemsg_list.pb-c.h" 55 | 56 | extern int main_term; 57 | 58 | /** Definition of Worker class. */ 59 | class Worker { 60 | public: 61 | /** Definition of ResultIterator class. */ 62 | class ResultIterator { 63 | public: 64 | char* m_pbegin; /**< pointer of iterator begin position */ 65 | char* m_pend; /**< pointer of iterator end position */ 66 | int m_element_size; /**< size of element */ 67 | 68 | public: 69 | /** 70 | * Initialization.
71 | * @param pnode pointer to the first node of the node array 72 | * @param total_vertex number of vertices in the node array 73 | */ 74 | void init(Node* pnode, int64_t total_vertex) { 75 | m_pbegin = (char *)pnode; 76 | m_pend = (char *)pnode + total_vertex * Node::n_size; 77 | m_element_size = Node::n_size; 78 | } 79 | 80 | /** 81 | * Get vertex value result after computation. 82 | * @param vid reference that receives the vertex id 83 | * @param pvalue buffer that receives the vertex value 84 | */ 85 | void getIdValue(int64_t& vid, void* pvalue) { // Reference is necessary. 86 | Node* pnode = (Node *)m_pbegin; 87 | vid = pnode->m_v_id; 88 | memcpy(pvalue, pnode->value, Node::n_value_size); 89 | } 90 | 91 | /** Go to visit next element. */ 92 | void next() { m_pbegin += m_element_size; } 93 | 94 | /** 95 | * Judge if iterator terminates or not. 96 | * @retval true done 97 | * @retval false not 98 | */ 99 | bool done() { return (m_pbegin >= m_pend); } 100 | }; 101 | 102 | public: 103 | const char* m_puser_file; /**< user file path */ 104 | IMDM m_imdm; /**< in-message deliver method */ 105 | void* m_puf_handle; /**< handle of opened user file */ 106 | Graph* m_pmy_graph; /**< configuration class */ 107 | int m_machine_cnt; /**< machine count, one master and some workers */ 108 | Addr* m_paddr_table; /**< address table, master 0 workers from 1 */ 109 | Addr m_addr_self; /**< self address */ 110 | int m_hdfs_flag; /**< read input from hdfs or local-fs, hdfs 1 local-fs 0 */ 111 | hdfsFS m_fs_handle; /**< handle of file system, here hdfs */ 112 | const char* m_pfs_host; /**< hdfs host */ 113 | int m_fs_port; /**< hdfs port */ 114 | char* m_pin_path; /**< input file path */ 115 | char* m_pout_path; /**< output file path */ 116 | int m_my_aggregator_cnt; /**< aggregator count */ 117 | AggregatorBase** m_pmy_aggregator; /**< pointers of AggregatorBase */ 118 | InputFormatter* m_pmy_in_formatter; /**< pointer of InputFormatter */ 119 | OutputFormatter* m_pmy_out_formatter; /**< pointer of OutputFormatter */ 120 | VertexBase*
m_pmy_vertex; /**< pointer of VertexBase */ 121 | int64_t m_total_vertex; /**< total vertex number in local subgraph */ 122 | int64_t m_total_edge; /**< total edge number in local subgraph */ 123 | Node* m_pnode; /**< node array */ 124 | Edge* m_pedge; /**< edge array */ 125 | int64_t m_edge_cnt; /**< current edge count to help add vertices */ 126 | Edge* m_pcur_edge; /**< current position in edge to help add edges */ 127 | FreeList m_free_list; /**< freelist to store node-node messages */ 128 | // Msg* m_pnext_all_in_msg; /**< next superstep messages in IMDM_SEQ */ 129 | // Msg* m_pnext2_all_in_msg; /**< next next superstep messages in IMDM_SEQ */ 130 | ChunkedList* m_pnext_all_in_msg_chunklist; /**< next superstep messages in IMDM_OPT */ 131 | ChunkedList* m_pnext2_all_in_msg_chunklist; /**< next next superstep messages in IMDM_OPT */ 132 | ResultIterator res_iter; /**< computation result iterator */ 133 | 134 | Sender m_sender; /**< to manage activities about send */ 135 | Receiver m_receiver; /**< to manage activities about receive */ 136 | pthread_t m_pth_send; /**< send thread */ 137 | pthread_t m_pth_receive; /**< receive thread */ 138 | 139 | Wm__Begin m_wm_begin; /**< worker requests for whole supersteps begin */ 140 | Mw__Begin* m_pmw_begin; /**< master responds to whole supersteps begin */ 141 | Wm__CurssFinish m_wm_curssfinish; /**< worker current superstep finishes */ 142 | Mw__NextssStart* m_pmw_nextssstart; /**< master next superstep starts */ 143 | Mw__End* m_pmw_end; /**< master notifies workers to end supersteps */ 144 | Wm__End m_wm_end; /**< worker reports to master after ending supersteps */ 145 | Ww__NodemsgList* m_pww_sendlist; /**< node2node message send list to other workers */ 146 | size_t* m_psendlist_curpos; /**< to record current insert-position of sendlist for every worker, from [1] */ 147 | Ww__NodemsgList** m_pww_recvlist; /**< node2node message receive list from other workers */ 148 | 149 | int m_term; /**< to mark if supersteps 
end, 1/0 yes/no */ 150 | int m_from_master; /**< to mark if received message from master, 1/0 yes/no */ 151 | int* m_pfinish_send; /**< to mark if WW_LARVEREDGELIST/WW_FINISHSENDNODEMSG message sent successfully to every worker, from [1] */ 152 | int m_finishnn_wk; /**< count of workers having finished sending node2node messages in current superstep */ 153 | int64_t m_node_msg; /**< count of node messages the worker should receive before next superstep */ 154 | 155 | public: 156 | /** 157 | * Run function. 158 | * Worker process entrance, which consists of child methods. 159 | * @param argc command line argument number 160 | * @param argv command line arguments 161 | */ 162 | void run(int argc, char* argv[]); 163 | 164 | /** 165 | * Parse command line arguments. 166 | * @param argv command line arguments 167 | */ 168 | void parseCmdArg(char* argv[]); 169 | 170 | /** 171 | * Load user file. 172 | * @param argc command line argument number 173 | * @param argv command line arguments 174 | */ 175 | void loadUserFile(int argc, char* argv[]); 176 | 177 | /** Initialize some global/member variables. */ 178 | void init(); 179 | 180 | /** 181 | * Add a vertex to system storage for graph. 182 | * @param vid vertex id 183 | * @param pvalue pointer of vertex value 184 | * @param outdegree vertex outdegree 185 | */ 186 | void addVertex(int64_t vid, void* pvalue, int64_t outdegree); 187 | 188 | /** 189 | * Add an edge to system storage for graph. 190 | * @param from edge source vertex id 191 | * @param to edge destination vertex id 192 | * @param pweight pointer of edge weight 193 | */ 194 | void addEdge(int64_t from, int64_t to, void* pweight); 195 | 196 | /** Read graph from input file. */ 197 | void readInput(); 198 | 199 | /** 200 | * Receive a new piece of node-node message for next superstep. 201 | * @param pmsg pointer of the message 202 | */ 203 | void recvNewNodeMsg(Msg* pmsg); 204 | 205 | /** 206 | * Receive a new piece of node-node message for next next superstep. 
207 | * @param pmsg pointer of the message 208 | */ 209 | void recvNewNodeMsg2(Msg* pmsg); 210 | 211 | /** Deliver all new node-node messages to destination node. */ 212 | void deliverAllNewNodeMsg(); 213 | 214 | /** Send request to master for whole supersteps begin. */ 215 | void sendBegin(); 216 | 217 | /** Send current superstep finish message to master. */ 218 | void sendCurssfinish(); 219 | 220 | /** Send supersteps end message to master. */ 221 | void sendEnd(); 222 | 223 | /** 224 | * Send node2node message to a worker. 225 | * @param worker_id destination machine id 226 | * @param num_msg number of node messages 227 | * @retval 0 sent successfully 228 | * @retval 1 send failed 229 | */ 230 | int sendNodeMessage(int worker_id, int num_msg); 231 | 232 | /** 233 | * Receive all kinds of messages from master or a worker. 234 | * @param machine_id source machine id 235 | */ 236 | void receiveMessage(int machine_id); 237 | 238 | /** Perform a series of supersteps. */ 239 | void performSuperstep(); 240 | 241 | /** Write computation results to output file. */ 242 | void writeOutput(); 243 | 244 | /** Free some global/member variables. */ 245 | void terminate(); 246 | }; // definition of Worker class 247 | 248 | #endif /* WORKER_H */ 249 | -------------------------------------------------------------------------------- /GraphLite-0.20/engine/main.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file main.cc 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 
13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * The main entrance for master and workers. 26 | * 27 | */ 28 | 29 | #include <stdio.h> 30 | #include <stdlib.h> 31 | #include <string.h> 32 | 33 | #include "Master.h" 34 | #include "Worker.h" 35 | 36 | Master master; 37 | Worker worker; 38 | 39 | int main_term = 0; // main thread terminate flag, only read by sender/receiver 40 | // to decide when to break their loops; no mutex needed 41 | 42 | void usage(char *cmd) 43 | { 44 | printf("Master Usage: %s [algorithm args]\n", cmd); 45 | printf("Worker Usage: %s [algorithm args]\n", cmd); 46 | exit(1); 47 | } 48 | 49 | int main(int argc, char* argv[]) { 50 | 51 | if (argc < 3) usage(argv[0]); 52 | 53 | int machine_id= atoi(argv[1]); 54 | 55 | if (strcmp(argv[1], "0") == 0) { 56 | 57 | printf("master run\n"); 58 | if (argc < 4) usage(argv[0]); 59 | master.run(argc, argv); 60 | 61 | } else if ( machine_id > 0) { 62 | 63 | printf("worker run\n"); 64 | worker.run(argc, argv); 65 | 66 | } else { 67 | usage(argv[0]); 68 | } 69 | 70 | return 0; 71 | } 72 | -------------------------------------------------------------------------------- /GraphLite-0.20/example/Makefile: -------------------------------------------------------------------------------- 1 | # ---------------------------------------------------------------------- 2 | # compiler options 3 | # ---------------------------------------------------------------------- 4 | 5 | CXX= g++ 6 | 7 | CFLAGS_COMMON=-std=c++0x -g -O3 -I${HADOOP_HOME}/include -I${GRAPHLITE_HOME}/include 8 | LIB_GRAPHALGO=-fPIC -shared 9 | 10 | # 
---------------------------------------------------------------------- 11 | # target 12 | # ---------------------------------------------------------------------- 13 | 14 | all : example 15 | 16 | # ---------------------------------------------------------------------- 17 | # example graph algorithms 18 | # ---------------------------------------------------------------------- 19 | 20 | EXAMPLE_ALGOS=PageRankVertex.so 21 | 22 | example: ${EXAMPLE_ALGOS} 23 | 24 | %.so : %.cc 25 | ${CXX} ${CFLAGS_COMMON} $< ${LIB_GRAPHALGO} -o $@ 26 | 27 | PageRankVertex.so : PageRankVertex.cc 28 | 29 | # ---------------------------------------------------------------------- 30 | # clean up 31 | # ---------------------------------------------------------------------- 32 | 33 | clean : 34 | rm -rf *.so 35 | -------------------------------------------------------------------------------- /GraphLite-0.20/example/PageRankVertex.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file PageRankVertex.cc 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file implements the PageRank algorithm using graphlite API. 
26 | * 27 | */ 28 | 29 | #include <stdio.h> 30 | #include <stdlib.h> 31 | #include <math.h> 32 | 33 | #include "GraphLite.h" 34 | 35 | #define VERTEX_CLASS_NAME(name) PageRankVertex##name 36 | 37 | #define EPS 1e-6 38 | 39 | class VERTEX_CLASS_NAME(InputFormatter): public InputFormatter { 40 | public: 41 | int64_t getVertexNum() { 42 | unsigned long long n; 43 | sscanf(m_ptotal_vertex_line, "%llu", &n); 44 | m_total_vertex= n; 45 | return m_total_vertex; 46 | } 47 | int64_t getEdgeNum() { 48 | unsigned long long n; 49 | sscanf(m_ptotal_edge_line, "%llu", &n); 50 | m_total_edge= n; 51 | return m_total_edge; 52 | } 53 | int getVertexValueSize() { 54 | m_n_value_size = sizeof(double); 55 | return m_n_value_size; 56 | } 57 | int getEdgeValueSize() { 58 | m_e_value_size = sizeof(double); 59 | return m_e_value_size; 60 | } 61 | int getMessageValueSize() { 62 | m_m_value_size = sizeof(double); 63 | return m_m_value_size; 64 | } 65 | void loadGraph() { 66 | if (m_total_edge <= 0) return; 67 | 68 | unsigned long long last_vertex; 69 | unsigned long long from; 70 | unsigned long long to; 71 | double weight = 0; 72 | 73 | double value = 1; 74 | int outdegree = 0; 75 | 76 | const char *line= getEdgeLine(); 77 | 78 | // Note: modify this if an edge weight is to be read 79 | // modify the 'weight' variable 80 | 81 | sscanf(line, "%llu %llu", &from, &to); 82 | addEdge(from, to, &weight); 83 | 84 | last_vertex = from; 85 | ++outdegree; 86 | for (int64_t i = 1; i < m_total_edge; ++i) { 87 | line= getEdgeLine(); 88 | 89 | // Note: modify this if an edge weight is to be read 90 | // modify the 'weight' variable 91 | 92 | sscanf(line, "%llu %llu", &from, &to); 93 | if (last_vertex != from) { 94 | addVertex(last_vertex, &value, outdegree); 95 | last_vertex = from; 96 | outdegree = 1; 97 | } else { 98 | ++outdegree; 99 | } 100 | addEdge(from, to, &weight); 101 | } 102 | addVertex(last_vertex, &value, outdegree); 103 | } 104 | }; 105 | 106 | class VERTEX_CLASS_NAME(OutputFormatter): public OutputFormatter { 107 | 
public: 108 | void writeResult() { 109 | int64_t vid; 110 | double value; 111 | char s[1024]; 112 | 113 | for (ResultIterator r_iter; ! r_iter.done(); r_iter.next() ) { 114 | r_iter.getIdValue(vid, &value); 115 | int n = sprintf(s, "%lld: %f\n", (long long)vid, value); 116 | writeNextResLine(s, n); 117 | } 118 | } 119 | }; 120 | 121 | // An aggregator that records a double value to compute a sum 122 | class VERTEX_CLASS_NAME(Aggregator): public Aggregator<double> { 123 | public: 124 | void init() { 125 | m_global = 0; 126 | m_local = 0; 127 | } 128 | void* getGlobal() { 129 | return &m_global; 130 | } 131 | void setGlobal(const void* p) { 132 | m_global = * (double *)p; 133 | } 134 | void* getLocal() { 135 | return &m_local; 136 | } 137 | void merge(const void* p) { 138 | m_global += * (double *)p; 139 | } 140 | void accumulate(const void* p) { 141 | m_local += * (double *)p; 142 | } 143 | }; 144 | 145 | class VERTEX_CLASS_NAME(): public Vertex <double, double, double> { 146 | public: 147 | void compute(MessageIterator* pmsgs) { 148 | double val; 149 | if (getSuperstep() == 0) { 150 | val= 1.0; 151 | } else { 152 | if (getSuperstep() >= 2) { 153 | double global_val = * (double *)getAggrGlobal(0); 154 | if (global_val < EPS) { 155 | voteToHalt(); return; 156 | } 157 | } 158 | 159 | double sum = 0; 160 | for ( ; ! 
pmsgs->done(); pmsgs->next() ) { 161 | sum += pmsgs->getValue(); 162 | } 163 | val = 0.15 + 0.85 * sum; 164 | 165 | double acc = fabs(getValue() - val); 166 | accumulateAggr(0, &acc); 167 | } 168 | * mutableValue() = val; 169 | const int64_t n = getOutEdgeIterator().size(); 170 | sendMessageToAllNeighbors(val / n); 171 | } 172 | }; 173 | 174 | class VERTEX_CLASS_NAME(Graph): public Graph { 175 | public: 176 | VERTEX_CLASS_NAME(Aggregator)* aggregator; 177 | 178 | public: 179 | // argv[0]: PageRankVertex.so 180 | // argv[1]: <input path> 181 | // argv[2]: <output path> 182 | void init(int argc, char* argv[]) { 183 | 184 | setNumHosts(5); 185 | setHost(0, "localhost", 1411); 186 | setHost(1, "localhost", 1421); 187 | setHost(2, "localhost", 1431); 188 | setHost(3, "localhost", 1441); 189 | setHost(4, "localhost", 1451); 190 | 191 | if (argc < 3) { 192 | printf ("Usage: %s <input path> <output path>\n", argv[0]); 193 | exit(1); 194 | } 195 | 196 | m_pin_path = argv[1]; 197 | m_pout_path = argv[2]; 198 | 199 | aggregator = new VERTEX_CLASS_NAME(Aggregator)[1]; 200 | regNumAggr(1); 201 | regAggr(0, &aggregator[0]); 202 | } 203 | 204 | void term() { 205 | delete[] aggregator; 206 | } 207 | }; 208 | 209 | /* STOP: do not change the code below. 
*/ 210 | extern "C" Graph* create_graph() { 211 | Graph* pgraph = new VERTEX_CLASS_NAME(Graph); 212 | 213 | pgraph->m_pin_formatter = new VERTEX_CLASS_NAME(InputFormatter); 214 | pgraph->m_pout_formatter = new VERTEX_CLASS_NAME(OutputFormatter); 215 | pgraph->m_pver_base = new VERTEX_CLASS_NAME(); 216 | 217 | return pgraph; 218 | } 219 | 220 | extern "C" void destroy_graph(Graph* pobject) { 221 | delete ( VERTEX_CLASS_NAME()* )(pobject->m_pver_base); 222 | delete ( VERTEX_CLASS_NAME(OutputFormatter)* )(pobject->m_pout_formatter); 223 | delete ( VERTEX_CLASS_NAME(InputFormatter)* )(pobject->m_pin_formatter); 224 | delete ( VERTEX_CLASS_NAME(Graph)* )pobject; 225 | } 226 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/Addr.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Addr.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines struct Addr for storing machine information. 26 | * 27 | */ 28 | 29 | #ifndef _ADDR_H 30 | #define _ADDR_H 31 | 32 | #define NAME_LEN 128 // machine hostname length 33 | 34 | /** Machine address structure. 
*/ 35 | typedef struct Addr { 36 | int id; /**< machine id: master 0, worker from 1 */ 37 | char hostname[NAME_LEN]; /**< machine hostname */ 38 | int port; /**< machine port for process */ 39 | } Addr; 40 | 41 | #endif 42 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/Aggregator.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Aggregator.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defined Aggregator class to aggregate values among workers during 26 | * supersteps. Template argument is aggregated value type. 27 | * 28 | * Each worker accumulates values from vertices on it, and sends the local 29 | * result to master. Master merges values from all workers to a global value 30 | * and controls the process of supersteps through that to some degree. 31 | * 32 | * Different value types or methods correspond to different kinds of aggregator. 33 | * Master and workers may keep a series of different aggregators. 
34 | * 35 | * @see AggregatorBase class 36 | * 37 | */ 38 | 39 | #ifndef AGGREGATOR_H 40 | #define AGGREGATOR_H 41 | 42 | #include "AggregatorBase.h" 43 | 44 | /** Definition of Aggregator class. */ 45 | template<typename AggrValue> 46 | class Aggregator: public AggregatorBase { 47 | public: 48 | AggrValue m_global; /**< aggregator global value of AggrValue type */ 49 | AggrValue m_local; /**< aggregator local value of AggrValue type */ 50 | 51 | public: 52 | virtual void init() = 0; 53 | virtual int getSize() const { 54 | return sizeof(AggrValue); 55 | } 56 | virtual void* getGlobal() = 0; 57 | virtual void setGlobal(const void* p) = 0; 58 | virtual void* getLocal() = 0; 59 | virtual void merge(const void* p) = 0; 60 | virtual void accumulate(const void* p) = 0; 61 | }; // definition of Aggregator class 62 | 63 | #endif /* AGGREGATOR_H */ 64 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/AggregatorBase.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file AggregatorBase.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines AggregatorBase class to load user Aggregator class. 
26 | * AggregatorBase is an abstract class with virtual methods. 27 | * 28 | * There is a 3-layer inheritance hierarchy for aggregators: AggregatorBase is 29 | * on top, the Aggregator class template is in the middle, and the user 30 | * Aggregator class is at the bottom. 31 | * 32 | * Both AggregatorBase and Aggregator can be abstract classes (containing pure 33 | * virtual methods), and the pure virtual methods in AggregatorBase could also 34 | * be written as ordinary virtual methods. 35 | * The difference is that pure virtual methods need no implementation, while 36 | * ordinary virtual methods must be implemented like any other function. 37 | * 38 | */ 39 | 40 | #ifndef AGGREGATORBASE_H 41 | #define AGGREGATORBASE_H 42 | 43 | /** Definition of AggregatorBase class. */ 44 | class AggregatorBase { 45 | public: 46 | /** Initialize, mainly for aggregator value. */ 47 | virtual void init() = 0; 48 | 49 | /** 50 | * Get aggregator value type size. 51 | * @return aggregator value type size. 52 | */ 53 | virtual int getSize() const = 0; 54 | 55 | /** 56 | * Get aggregator global value. 57 | * @return pointer of aggregator global value 58 | */ 59 | virtual void* getGlobal() = 0; 60 | 61 | /** 62 | * Set aggregator global value. 63 | * @param p pointer of value to set global as 64 | */ 65 | virtual void setGlobal(const void* p) = 0; 66 | 67 | /** 68 | * Get aggregator local value. 69 | * @return pointer of aggregator local value 70 | */ 71 | virtual void* getLocal() = 0; 72 | 73 | /** 74 | * Merge method for global. 75 | * @param p pointer of value to be merged 76 | */ 77 | virtual void merge(const void* p) = 0; 78 | 79 | /** 80 | * Accumulate method for local. 
81 | * @param p pointer of value to be accumulated 82 | */ 83 | virtual void accumulate(const void* p) = 0; 84 | }; // definition of AggregatorBase class 85 | 86 | #endif /* AGGREGATORBASE_H */ 87 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/GenericArrayIterator.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file GenericArrayIterator.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defined GenericArrayIterator class to iterate on array structure. 26 | * 27 | */ 28 | 29 | #ifndef GENERICARRAYITRATOR_H 30 | #define GENERICARRAYITRATOR_H 31 | 32 | /** Definition of GenericArrayIterator class. */ 33 | class GenericArrayIterator { 34 | public: 35 | char* m_pbegin; /**< pointer of iterator begin position */ 36 | char* m_pend; /**< pointer of iterator end position */ 37 | int m_element_size; /**< size of array element */ 38 | 39 | public: 40 | /** 41 | * Constructor. 
42 | * @param pbegin iterator begin position 43 | * @param pend iterator end position 44 | * @param size array element size 45 | */ 46 | GenericArrayIterator(char* pbegin, char* pend, int size): 47 | m_pbegin(pbegin), m_pend(pend), m_element_size(size) {} 48 | 49 | /** 50 | * Get iterator size. 51 | * @return count of elements to visit 52 | */ 53 | int64_t size() { return (int64_t)(m_pend - m_pbegin) / m_element_size; } 54 | 55 | /** 56 | * Get current element position. 57 | * @return pointer of current element 58 | */ 59 | char* current() { return m_pbegin; } 60 | 61 | /** Go to visit next element. */ 62 | void next() { m_pbegin += m_element_size; } 63 | /** Go to visit next K element. */ 64 | void next(int64_t k) { m_pbegin += k * m_element_size; } 65 | 66 | /** 67 | * Judge if iterator terminates or not. 68 | * @retval true done 69 | * @retval false not 70 | */ 71 | bool done() { return (m_pbegin >= m_pend); } 72 | }; // definition of GenericArrayIterator class 73 | 74 | #endif /* GENERICARRAYITRATOR_H */ 75 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/GenericLinkIterator.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file GenericLinkIterator.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defines the node-node Message struct and the GenericLinkIterator 26 | * class, which iterates on a link structure implemented as a pointer vector. 27 | * Mainly for in-message list in Node.h. 28 | * 29 | * @see Node class 30 | * 31 | */ 32 | 33 | #ifndef GENERICLINKITERATOR_H 34 | #define GENERICLINKITERATOR_H 35 | 36 | #include 37 | #include 38 | #include 39 | 40 | using namespace std; 41 | 42 | /** Definition of node-node Message struct. */ 43 | typedef struct Msg { 44 | int64_t s_id; /**< source vertex id of the message */ 45 | int64_t d_id; /**< destination vertex id of the message */ 46 | char message[0]; /**< start position of memory to store message content */ 47 | 48 | static int m_value_size; /**< message value size in bytes */ 49 | static int m_size; /**< total size of a piece of message, sizeof(Msg) + m_value_size */ 50 | } Msg; 51 | 52 | /** Definition of GenericLinkIterator class. */ 53 | class GenericLinkIterator { 54 | public: 55 | vector<Msg*>* m_pvector; /**< pointer of vector to be iterated on */ 56 | int64_t m_vector_size; /**< vector size */ 57 | int64_t m_cur_index; /**< index of current element in vector */ 58 | 59 | public: 60 | /** 61 | * Constructor. 62 | * @param pvector pointer of vector to be iterated on 63 | */ 64 | GenericLinkIterator(vector<Msg*>* pvector): m_pvector(pvector) { 65 | m_vector_size = pvector->size(); 66 | m_cur_index = 0; 67 | } 68 | 69 | /** 70 | * Get current element. 71 | * @return current element of vector 72 | */ 73 | char* getCurrent() { return (char *)(*m_pvector)[m_cur_index]; } 74 | 75 | /** Go to visit next element. */ 76 | void next() { ++m_cur_index; } 77 | 78 | /** 79 | * Judge if iterator terminates or not. 
80 | * @retval true done 81 | * @retval false not 82 | */ 83 | bool done() { return m_cur_index == m_vector_size; } 84 | }; // definition of GenericLinkIterator class 85 | 86 | #endif /* GENERICLINKITERATOR_H */ 87 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/Graph.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Graph.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defined Graph class as an interface for users to config arguments 26 | * and control system runtime to some degree. 27 | * 28 | * System needs to load user files dynamically. That is, we can't know details 29 | * about subclass implemented by users before whole program linked up. So by 30 | * means of C++ language features, some pointers of abstract classes are 31 | * necessary. 32 | * 33 | */ 34 | 35 | #ifndef GRAPH_H 36 | #define GRAPH_H 37 | 38 | #include 39 | 40 | #include 41 | #include 42 | 43 | #include "VertexBase.h" 44 | #include "InputFormatter.h" 45 | #include "OutputFormatter.h" 46 | #include "AggregatorBase.h" 47 | #include "Addr.h" 48 | 49 | 50 | /** Definition of Graph class. 
*/ 51 | class Graph { 52 | public: 53 | int m_machine_cnt; /**< machine count, one master and some workers */ 54 | Addr* m_paddr_table; /**< address table, master 0 workers from 1 */ 55 | int m_hdfs_flag; /**< read input from hdfs or local-fs, hdfs 1 local-fs 0 */ 56 | const char* m_pfs_host; /**< hdfs host */ 57 | int m_fs_port; /**< hdfs port */ 58 | const char* m_pin_path; /**< input file path */ 59 | const char* m_pout_path; /**< output file path */ 60 | InputFormatter* m_pin_formatter; /**< pointer of InputFormatter */ 61 | OutputFormatter* m_pout_formatter; /**< pointer of OutputFormatter */ 62 | int m_aggregator_cnt; /**< aggregator count */ 63 | AggregatorBase** m_paggregator; /**< pointers of AggregatorBase */ 64 | VertexBase* m_pver_base; /**< pointer of VertexBase */ 65 | 66 | public: 67 | void setNumHosts(int num_hosts) { 68 | if (num_hosts <= 0) return; 69 | 70 | m_machine_cnt = num_hosts; 71 | m_paddr_table = new Addr[num_hosts]; 72 | } 73 | 74 | void setHost(int id, const char *hostname, int port) { 75 | if (id<0 || id>=m_machine_cnt) return; 76 | 77 | m_paddr_table[id].id = id; 78 | strncpy(m_paddr_table[id].hostname, hostname, NAME_LEN-1); 79 | m_paddr_table[id].hostname[NAME_LEN-1]= '\0'; 80 | m_paddr_table[id].port = port; 81 | } 82 | 83 | /** 84 | * Format of hosts_str: localhost:1500,localhost:1501,localhost:1502 85 | * The first address for master. 
86 | * */ 87 | void setupHosts(const char* hosts_str) { 88 | // printf("\nsetup hosts:\n"); 89 | std::string hs = std::string(hosts_str); 90 | int num_hosts = std::count(hs.begin(), hs.end(), ':'); 91 | setNumHosts(num_hosts); 92 | 93 | int hostname_pos = 0; 94 | int port_pos = 0; 95 | for (int id = 0; id < num_hosts; ++id) { 96 | port_pos = hs.find_first_of(":", hostname_pos) + 1; 97 | m_paddr_table[id].id = id; 98 | hs.copy(m_paddr_table[id].hostname, port_pos - hostname_pos - 1, hostname_pos); 99 | m_paddr_table[id].hostname[port_pos - hostname_pos - 1] = '\0'; 100 | hostname_pos = hs.find_first_of(",", port_pos) + 1; 101 | m_paddr_table[id].port = std::stoi( 102 | hs.substr(port_pos, hostname_pos - port_pos - 1), 103 | NULL 104 | ); 105 | // printf("host %d: %s:%d\n", id, m_paddr_table[id].hostname, m_paddr_table[id].port); fflush(stdout); 106 | } 107 | } 108 | 109 | void regNumAggr(int num) { 110 | if (num <= 0) return; 111 | 112 | m_aggregator_cnt= num; 113 | m_paggregator= new AggregatorBase*[num]; 114 | } 115 | 116 | void regAggr(int id, AggregatorBase *aggr) { 117 | if (id<0 || id>=m_aggregator_cnt) return; 118 | 119 | m_paggregator[id]= aggr; 120 | } 121 | 122 | Graph(){ 123 | setNumHosts(1); 124 | setHost(0, "localhost", 1411); 125 | 126 | m_hdfs_flag= 0; // use local file by default 127 | 128 | m_aggregator_cnt= 0; 129 | m_paggregator= NULL; 130 | } 131 | 132 | /** 133 | * Initialize, virtual method. All arguments from command line. 134 | * @param argc algorithm argument number 135 | * @param argv algorithm arguments 136 | */ 137 | virtual void init(int argc, char* argv[]) {} 138 | 139 | /** 140 | * Master computes per superstep to judge if supersteps terminate, 141 | * virtual method. 
142 | * @param superstep current superstep number 143 | * @param paggr_base aggregator base pointer 144 | * @retval 1 supersteps terminate 145 | * @retval 0 supersteps not terminate 146 | */ 147 | virtual int masterComputePerstep(int superstep, AggregatorBase** paggr_base) { 148 | return 0; 149 | } 150 | 151 | /** Terminate, virtual method. */ 152 | virtual void term() {} 153 | 154 | /** Destructor, virtual method. */ 155 | virtual ~Graph() { 156 | if(m_paggregator) delete[] m_paggregator; 157 | if(m_paddr_table) delete[] m_paddr_table; 158 | } 159 | }; // definition of Graph class 160 | 161 | /** 162 | * A type definition for function pointer. 163 | * Function has no arguments and return value of Graph* type. 164 | */ 165 | typedef Graph* (* GraphCreateFn)(); 166 | 167 | /** 168 | * A type definition for function pointer. 169 | * Function has one argument of Graph* type and no return value. 170 | */ 171 | typedef void (* GraphDestroyFn)(Graph*); 172 | 173 | #endif /* GRAPH_H */ 174 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/GraphLite.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file GraphLite.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file includes multiple headers. A Vertex program needs only to 26 | * include GraphLite.h. 27 | * 28 | */ 29 | 30 | #include "InputFormatter.h" 31 | #include "OutputFormatter.h" 32 | #include "Aggregator.h" 33 | #include "Graph.h" 34 | #include "Vertex.h" 35 | 36 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/InputFormatter.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file InputFormatter.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defined InputFormatter class to provide input interface for user, 26 | * mainly to load local subgraph, supporting reading from both local file system 27 | * and hdfs. 28 | * 29 | * We provide abstract methods for users to override, and also some helpful 30 | * interfaces implemented by system. 
31 | * 32 | */ 33 | 34 | #ifndef INPUTFORMATTER_H 35 | #define INPUTFORMATTER_H 36 | 37 | #define MAXINPUTLEN 100 38 | 39 | #include 40 | #include 41 | #include 42 | 43 | #include "hdfs.h" 44 | 45 | /** Definition of InputFormatter class. */ 46 | // 47 | // GraphLite engine will call an InputFormatter as follows: 48 | // 49 | // open(); 50 | // getVertexNum(); getEdgeNum(); 51 | // getVertexValueSize(); getEdgeValueSize(); getMessageValueSize(); 52 | // loadGraph(); 53 | // close(); 54 | // 55 | class InputFormatter { 56 | public: 57 | hdfsFile m_hdfs_file; /**< input file handle, for hdfs */ 58 | std::ifstream m_local_file; /**< input file stream, for local file system */ 59 | const char* m_ptotal_vertex_line; /**< pointer of total vertex count line */ 60 | const char* m_ptotal_edge_line; /**< pointer of total edge count line */ 61 | int64_t m_total_vertex; /**< total vertex count of local subgraph */ 62 | int64_t m_total_edge; /**< total edge count of local subgraph */ 63 | int m_n_value_size; /**< vertex value type size */ 64 | int m_e_value_size; /**< edge value type size */ 65 | int m_m_value_size; /**< message value type size */ 66 | int m_vertex_num_size; /**< vertex number size */ 67 | int m_edge_num_size; /**< edge number size */ 68 | tOffset m_next_edge_offset; /**< offset for next edge */ 69 | std::string m_total_vertex_line; /**< buffer of local total vertex count line */ 70 | std::string m_total_edge_line; /**< buffer of local total edge count line */ 71 | char m_buf_line[MAXINPUTLEN]; /**< buffer of hdfs file line */ 72 | std::string m_buf_string; 73 | 74 | public: 75 | /** 76 | * Open input file, virtual method. 77 | * @param pin_path input file path 78 | */ 79 | virtual void open(const char* pin_path); 80 | 81 | /** Close input file, virtual method. */ 82 | virtual void close(); 83 | 84 | /** 85 | * Get vertex number, pure virtual method. 
86 | * @return total vertex number in local subgraph 87 | */ 88 | virtual int64_t getVertexNum() = 0; 89 | 90 | /** Get vertex number line */ 91 | void getVertexNumLine(); 92 | 93 | /** 94 | * Get edge number, pure virtual method. 95 | * @return total edge number in local subgraph 96 | */ 97 | virtual int64_t getEdgeNum() = 0; 98 | 99 | /** Get edge number line */ 100 | void getEdgeNumLine(); 101 | 102 | /** 103 | * Get vertex value type size, pure virtual method. 104 | * @return vertex value type size 105 | */ 106 | virtual int getVertexValueSize() = 0; 107 | 108 | /** 109 | * Get edge value type size, pure virtual method. 110 | * @return edge value type size 111 | */ 112 | virtual int getEdgeValueSize() = 0; 113 | 114 | /** 115 | * Get message value type size, pure virtual method. 116 | * @return message value type size 117 | */ 118 | virtual int getMessageValueSize() = 0; 119 | 120 | /** 121 | * Get edge line, for user. 122 | * Read from current file offset. 123 | * @return a string of edge in local subgraph 124 | */ 125 | const char* getEdgeLine(); 126 | 127 | /** 128 | * Add one vertex to Node array. 129 | * @param vid vertex id 130 | * @param pvalue pointer of vertex value 131 | * @param outdegree vertex outdegree 132 | */ 133 | void addVertex(int64_t vid, void* pvalue, int outdegree); 134 | 135 | /** 136 | * Add one edge to Edge array. 137 | * @param from edge source vertex id 138 | * @param to edge destination vertex id 139 | * @param pweight pointer of edge weight 140 | */ 141 | void addEdge(int64_t from, int64_t to, void* pweight); 142 | 143 | /** Load local subgraph, pure virtual method. */ 144 | virtual void loadGraph() = 0; 145 | 146 | /** Destructor. 
*/ 147 | ~InputFormatter(); 148 | }; // definition of InputFormatter class 149 | 150 | #endif /* INPUTFORMATTER_H */ 151 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/Node.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Node.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defined Edge struct and Node class. node-node Message struct is in 26 | * GenericLinkIterator.h. 27 | * Node class implements interfaces of Vertex class to hide inner details from 28 | * users. 29 | * 30 | * @see Edge struct 31 | * @see Msg struct 32 | * @see GenericLinkIterator class 33 | * @see GenericArrayIterator class 34 | * 35 | */ 36 | 37 | #ifndef NODE_H 38 | #define NODE_H 39 | 40 | #include 41 | #include 42 | 43 | #include "GenericLinkIterator.h" 44 | #include "GenericArrayIterator.h" 45 | 46 | /** Definition of Edge struct. 
*/ 47 | typedef struct Edge { 48 | int64_t from; /**< start vertex id of the edge */ 49 | int64_t to; /**< end vertex id of the edge */ 50 | char weight[0]; /**< start positon of memory to store edge weight */ 51 | 52 | static int e_value_size; /**< edge value size in character */ 53 | static int e_size; /**< total size of an edge, sizeof(Edge) + e_value_size */ 54 | } Edge; 55 | 56 | /** Definition of Node class. */ 57 | class Node { 58 | public: 59 | bool m_active; /**< vertex state: active m_active=1, inactive m_active=0 */ 60 | int64_t m_v_id; /**< vertex id */ 61 | int64_t m_out_degree; /**< vertex outdegree */ 62 | int64_t m_edge_index; /**< index of first edge from this vertex in edge array */ 63 | std::vector m_cur_in_msg; /**< current superstep in-message pointer list */ 64 | char value[0]; /**< start positon of memory to store node value */ 65 | 66 | public: 67 | static int n_value_size; /**< node value size in character */ 68 | static int n_size; /**< total size of a node, sizeof(Node) + n_value_size */ 69 | 70 | public: 71 | /** 72 | * Get a Node structure of index(begin at 0) in node array. 73 | * @param index node index 74 | * @return a Node structure of index in node array 75 | */ 76 | static Node& getNode(int64_t index); 77 | 78 | /** 79 | * Get an Eode structure of index in edge array. 80 | * @param index edge index 81 | * @return an Edge structure of index in edge array 82 | */ 83 | static Edge& getEdge(int64_t index); 84 | 85 | /** Initialize pointers of in-message lists. */ 86 | void initInMsg(); 87 | 88 | /** 89 | * Receive a new piece of message for next superstep. 90 | * Link the message to next_in_msg list. 91 | * @param pmsg pointer of the message 92 | */ 93 | void recvNewMsg(Msg* pmsg); 94 | 95 | /** Free current in-message list to freelist. */ 96 | void clearCurInMsg(); 97 | 98 | /** Free memory of in-message lists vector allocation. */ 99 | void freeInMsgVector(); 100 | 101 | /** 102 | * Get current superstep number. 
103 | * @return current superstep number 104 | */ 105 | int getSuperstep() const; 106 | 107 | /** 108 | * Get vertex id. 109 | * @return vertex id 110 | */ 111 | int64_t getVertexId() const; 112 | 113 | /** 114 | * Vote to halt. 115 | * Change vertex state to be inactive. 116 | */ 117 | void voteToHalt(); 118 | 119 | /** 120 | * Get a generic link iterator. 121 | * @return a pointer of GenericLinkIterator 122 | */ 123 | GenericLinkIterator* getGenericLinkIterator(); 124 | 125 | /** 126 | * Get a generic array iterator. 127 | * @return a pointer of GenericArrayIterator 128 | */ 129 | // GenericArrayIterator* getGenericArrayIterator(); 130 | 131 | /** 132 | * Send a piece of node-node message to target vertex. 133 | * @param dest_vertex destination vertex id 134 | * @param pmessage pointer of the message to be sent 135 | */ 136 | void sendMessageTo(int64_t dest_vertex, const char* pmessage); 137 | 138 | /** 139 | * Send a piece of node-node message to all outedge-target vertex. 140 | * Call sendMessageTo() for all outedges. 141 | * @param pmessage pointer of the message to be sent 142 | * @see sendMessageTo() 143 | */ 144 | void sendMessageToAllNeighbors(const char* pmessage); 145 | 146 | /** 147 | * Get global value of some aggregator. 148 | * @param aggr index of aggregator, count from 0 149 | */ 150 | const void* getAggrGlobal(int aggr); 151 | 152 | /** 153 | * Accumulate local value of some aggregator. 
154 | * @param aggr index of aggregator, count from 0 155 | */ 156 | void accumulateAggr(int aggr, const void* p); 157 | }; // definition of Node class 158 | 159 | #endif /* NODE_H */ 160 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/OutputFormatter.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file OutputFormatter.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defined OutputFormatter class to provide output interface for user, 26 | * mainly to write local subgraph computation result. 27 | * 28 | * We provide an abstract method for users to override, and also some helpful 29 | * interfaces implemented by system. 30 | * 31 | */ 32 | 33 | #ifndef OUTPUTFORMATTER_H 34 | #define OUTPUTFORMATTER_H 35 | 36 | #include 37 | #include 38 | 39 | #include "hdfs.h" 40 | 41 | /** Definition of OutputFormatter class. */ 42 | class OutputFormatter { 43 | public: 44 | /** Definition of ResultIterator class. */ 45 | class ResultIterator { 46 | public: 47 | /** 48 | * Get the result after computation. 
49 | * @param vid reference of vertex id to be got 50 | * @param pvalue pointer of vertex value 51 | */ 52 | void getIdValue(int64_t& vid, void* pvalue); 53 | 54 | /** Go to visit next element. */ 55 | void next(); 56 | 57 | /** 58 | * Judge if iterator terminates or not. 59 | * @retval true done 60 | * @retval false not 61 | */ 62 | bool done(); 63 | }; 64 | 65 | public: 66 | hdfsFile m_hdfs_file; /**< output file handle, for hdfs */ 67 | std::ofstream m_local_file; /**< output file stream, for local file system */ 68 | 69 | public: 70 | /** 71 | * Open output file, virtual method. 72 | * @param pout_path output file path 73 | */ 74 | virtual void open(const char* pout_path); 75 | 76 | /** Close output file, virtual method. */ 77 | virtual void close(); 78 | 79 | /** 80 | * Write next result line, for user. 81 | * Write to current file offset. 82 | * @param pbuffer buffer of result line in string 83 | * @param len length of result line string 84 | */ 85 | void writeNextResLine(char* pbuffer, int len); 86 | 87 | /** Write local subgraph computation result, pure virtual method. */ 88 | virtual void writeResult() = 0; 89 | 90 | /** Destructor, virtual method. */ 91 | virtual ~OutputFormatter() {} 92 | }; // definition of OutputFormatter class 93 | 94 | #endif /* OUTPUTFORMATTER_H */ 95 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/Vertex.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file Vertex.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 
13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defined Vertex class to provide computation interface for users. 26 | * Template arguments are value types of vertex, edge and message. 27 | * 28 | * Given template arguments, Vertex/Edge/Msg value types will be definite. So we 29 | * define MessageIterator(subclass of GenericLinkIterator) and OutEdgeIterator 30 | * (subclass of GenericArrayIterator) as inner classes. 31 | * 32 | * Almost all methods in Vertex class will call corresponding implementation 33 | * in Node class. Users need to inherit Vertex class, fill in template arguments 34 | * and override virtual compute() method. In consideration of only access to 35 | * base class before whole program linked up, namely system files and user files, 36 | * compute(GenericLinkIterator*) need to cast to pure virtual function 37 | * compute(MessageIterator*) to fit interface, which will execute at each active 38 | * vertex during every superstep. 39 | * 40 | * @see VertexBase class 41 | * @see GenericLinkIterator class 42 | * @see GenericArrayIterator class 43 | * @see Node class 44 | * @see Vertex::GenericLinkIterator 45 | * @see Vertex::GenericArrayIterator 46 | * 47 | */ 48 | 49 | #ifndef VERTEX_H 50 | #define VERTEX_H 51 | 52 | #include 53 | 54 | #include "VertexBase.h" 55 | #include "GenericLinkIterator.h" 56 | #include "GenericArrayIterator.h" 57 | #include "Node.h" 58 | 59 | /** Definition of Vertex class. 
*/ 60 | template 61 | class Vertex: public VertexBase { 62 | public: 63 | 64 | /** Definition of MessageIterator class. */ 65 | class MessageIterator: public GenericLinkIterator { 66 | public: 67 | /** 68 | * Get the source vertex of the current message. 69 | * @see getCurrent() 70 | * @return the source vertex 71 | */ 72 | int64_t getSrc() { return ( (Msg *)getCurrent() )->s_id; } 73 | 74 | /** 75 | * Get the destination vertex of the current message. 76 | * @see getCurrent() 77 | * @return the destination vertex 78 | */ 79 | int64_t getDst() { return ( (Msg *)getCurrent() )->d_id; } 80 | 81 | /** 82 | * Get message value. 83 | * @return message value 84 | */ 85 | const MessageValue& getValue() { 86 | return *( (MessageValue *)( getCurrent() + offsetof(Msg, message) ) ); 87 | } 88 | }; 89 | 90 | /** Definition of OutEdgeIterator class. */ 91 | class OutEdgeIterator: public GenericArrayIterator { 92 | public: 93 | /** 94 | * Constructor with arguments. 95 | * @param pbegin iterator begin position 96 | * @param pend iterator end position 97 | * @param size array element size 98 | */ 99 | OutEdgeIterator(char* pbegin, char* pend, int size): 100 | GenericArrayIterator(pbegin, pend, size) {} 101 | 102 | /** 103 | * Get current edge target vertex id. 104 | * @see current() 105 | * @return target vertex id 106 | */ 107 | int64_t target() { 108 | char* p = current(); 109 | return ( (Edge *)p )->to; 110 | } 111 | /** 112 | * Get edge value. 113 | * @see current() 114 | * @return edge value 115 | */ 116 | const EdgeValue& getValue() { 117 | char* p = current(); 118 | return *( (EdgeValue *)( (Edge *)p )->weight ); 119 | } 120 | }; 121 | 122 | public: 123 | /** 124 | * Compute at active vertex, pure virtual method. 125 | * @param pmsgs specialized pointer of received message iterator 126 | */ 127 | virtual void compute(MessageIterator* pmsgs) = 0; // Virtual is necessary. 128 | 129 | /** 130 | * Compute at active vertex. 
131 | * @see compute(MessageIterator*) 132 | * @param pmsgs generic pointer of received message iterator 133 | */ 134 | void compute(GenericLinkIterator* pmsgs) { 135 | compute( (MessageIterator *)pmsgs ); // cast to compute() below 136 | } 137 | 138 | /** 139 | * Get vertex value type size. 140 | * @return vertex value type size 141 | */ 142 | int getVSize() const { 143 | return sizeof(VertexValue); 144 | } 145 | 146 | /** 147 | * Get edge value type size. 148 | * @return edge value type size 149 | */ 150 | int getESize() const { 151 | return sizeof(EdgeValue); 152 | } 153 | 154 | /** 155 | * Get message value type size. 156 | * @return message value type size 157 | */ 158 | int getMSize() const { 159 | return sizeof(MessageValue); 160 | } 161 | 162 | /** 163 | * Get current superstep number. 164 | * @see Node::getSuperstep() 165 | * @return current superstep number 166 | */ 167 | int getSuperstep() const { 168 | return m_pme->getSuperstep(); 169 | } 170 | 171 | /** 172 | * Get vertex id. 173 | * @see Node::getVertexId() 174 | * @return vertex id 175 | */ 176 | int64_t getVertexId() const { 177 | return m_pme->getVertexId(); 178 | } 179 | 180 | /** 181 | * Vote to halt. 182 | * @see Node::voteToHalt() 183 | * Change vertex state to be inactive. 184 | */ 185 | void voteToHalt() { 186 | return m_pme->voteToHalt(); 187 | } 188 | 189 | /** 190 | * Get vertex value. 191 | * @see Node::value 192 | * @return vertex value 193 | */ 194 | const VertexValue& getValue() { 195 | return *( (VertexValue *)m_pme->value ); 196 | } // Type cast has lower priority than "->" operator. 197 | 198 | /** 199 | * Mutate vertex value. 200 | * @see Node::value 201 | * @return vertex value position 202 | */ 203 | VertexValue* mutableValue() { 204 | return (VertexValue *)m_pme->value; 205 | } // Type cast has lower priority than "->" operator. 206 | 207 | /** 208 | * Get an out-edge iterator. 
209 | * @see OutEdgeIterator::OutEdgeIterator() 210 | * @return an object of OutEdgeIterator class 211 | */ 212 | OutEdgeIterator getOutEdgeIterator() { 213 | OutEdgeIterator out_edge_iterator( 214 | (char *)&(m_pme->getEdge(m_pme->m_edge_index)), 215 | (char *)&(m_pme->getEdge(m_pme->m_edge_index + m_pme->m_out_degree)), 216 | Edge::e_size ); 217 | return out_edge_iterator; 218 | } 219 | 220 | /** 221 | * Send a piece of node-node message to target vertex. 222 | * @param dest_vertex destination vertex id 223 | * @param message content of the message to be sent 224 | * @see Node::sendMessageTo() 225 | */ 226 | void sendMessageTo(int64_t dest_vertex, const MessageValue& message) { 227 | m_pme->sendMessageTo(dest_vertex, (const char *)&message); 228 | } 229 | 230 | /** 231 | * Send a piece of node-node message to all outedge-target vertex. 232 | * @param message content of the message to be sent 233 | * @see Node::sendMessageToAllNeighbors() 234 | */ 235 | void sendMessageToAllNeighbors(const MessageValue& message) { 236 | m_pme->sendMessageToAllNeighbors( (const char *)&message ); 237 | } 238 | 239 | /** 240 | * Get global value of some aggregator. 241 | * @param aggr index of aggregator, count from 0 242 | * @see Node::getAggrGlobal() 243 | */ 244 | const void* getAggrGlobal(int aggr) { 245 | return m_pme->getAggrGlobal(aggr); 246 | } 247 | 248 | /** 249 | * Accumulate local value of some aggregator. 250 | * @param aggr index of aggregator, count from 0 251 | * @see Node::accumulateAggr() 252 | */ 253 | void accumulateAggr(int aggr, const void* p) { 254 | m_pme->accumulateAggr(aggr, p); 255 | } 256 | 257 | /** Destructor, virtual method. 
*/ 258 | virtual ~Vertex() {} 259 | }; // definition of Vertex class 260 | 261 | #endif /* VERTEX_H */ 262 | -------------------------------------------------------------------------------- /GraphLite-0.20/include/VertexBase.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file VertexBase.h 3 | * @author Songjie Niu, Shimin Chen 4 | * @version 0.1 5 | * 6 | * @section LICENSE 7 | * 8 | * Copyright 2016 Shimin Chen (chensm@ict.ac.cn) and 9 | * Songjie Niu (niusongjie@ict.ac.cn) 10 | * 11 | * Licensed under the Apache License, Version 2.0 (the "License"); 12 | * you may not use this file except in compliance with the License. 13 | * You may obtain a copy of the License at 14 | * 15 | * http://www.apache.org/licenses/LICENSE-2.0 16 | * 17 | * Unless required by applicable law or agreed to in writing, software 18 | * distributed under the License is distributed on an "AS IS" BASIS, 19 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | * See the License for the specific language governing permissions and 21 | * limitations under the License. 22 | * 23 | * @section DESCRIPTION 24 | * 25 | * This file defined VertexBase class to load user Vertex class. VertexBase is 26 | * an abstract class with virtual methods. Node class is pointed to inside, 27 | * which implements inner details for corresponding Vertex. 28 | * 29 | * There is a 3-layer inheritance hierarchy about Vertex: VertexBase is on top, 30 | * Vertex class of template is in the middle, and user Vertex class is at the 31 | * bottom. 32 | * 33 | * @see Node class 34 | * @see GenericLinkIterator class 35 | * 36 | */ 37 | 38 | #ifndef VERTEXBASE_H 39 | #define VERTEXBASE_H 40 | 41 | #include "GenericLinkIterator.h" 42 | 43 | class Node; 44 | 45 | /** Definition of VertexBase class. */ 46 | class VertexBase { 47 | public: 48 | Node* m_pme; /**< pointer of Node class */ 49 | 50 | public: 51 | /** 52 | * Set pointer to Node class. 
53 | * @param p corresponding Node structure position 54 | */ 55 | void setMe(Node* p) { m_pme = p; } 56 | 57 | /** 58 | * Get vertex value type size, pure virtual method. 59 | * @return vertex value type size 60 | */ 61 | virtual int getVSize() const = 0; 62 | 63 | /** 64 | * Get edge value type size, pure virtual method. 65 | * @return edge value type size 66 | */ 67 | virtual int getESize() const = 0; 68 | 69 | /** 70 | * Get message value type size, pure virtual method. 71 | * @return message value type size 72 | */ 73 | virtual int getMSize() const = 0; 74 | 75 | /** 76 | * Get vertex id. 77 | * @return vertex id 78 | */ 79 | virtual int64_t getVertexId() const = 0; 80 | 81 | /** 82 | * Get current superstep number, pure virtual method. 83 | * @return current superstep number 84 | */ 85 | virtual int getSuperstep() const = 0; 86 | 87 | /** 88 | * Vote to halt, pure virtual method. 89 | * Change vertex state to be inactive. 90 | */ 91 | virtual void voteToHalt() = 0; 92 | 93 | /** 94 | * Compute at active vertex, pure virtual method. 95 | * @param pmsgs generic pointer of received message iterator 96 | */ 97 | virtual void compute(GenericLinkIterator* pmsgs) = 0; 98 | }; // definition of VertexBase class 99 | 100 | #endif /* VERTEXBASE_H */ 101 | -------------------------------------------------------------------------------- /GraphLite-0.20/mainpage.dox: -------------------------------------------------------------------------------- 1 | /** 2 | \mainpage The mainpage documentation 3 | 4 |
  5 | ------------------------------------------------------------
  6 | Requirements
  7 | ------------------------------------------------------------
  8 | 1. JDK 1.7.x
  9 | 2. Hadoop 2.6.x
 10 | 3. protocol buffers
 11 |    $ apt-get install protobuf-c-compiler libprotobuf-c0 libprotobuf-c0-dev
 12 | 
 13 | ------------------------------------------------------------
 14 | Directory Structure
 15 | ------------------------------------------------------------
 16 | bin/         scripts and graphlite executable
 17 | engine/      graphlite engine source code     
 18 | example/     PageRank example
 19 | include/     headers that define the programming API
 20 | 
 21 | Input/       a number of small example graphs
 22 | Output/      empty, will contain the output of a run
 23 | 
 24 | Makefile     this can make both engine and example
 25 | 
 26 | LICENSE.txt  Apache License, Version 2.0
 27 | 
 28 | README.txt   this file
 29 | 
 30 | ------------------------------------------------------------
 31 | Build graphlite
 32 | ------------------------------------------------------------
 33 | 1. set up the environment with bin/setenv
 34 | 
 35 |    (1) edit bin/setenv, set the following paths:
 36 |        JAVA_HOME, HADOOP_HOME, GRAPHLITE_HOME
 37 | 
 38 |    (2) $ . bin/setenv
 39 | 
 40 | 2. build graphlite
 41 | 
 42 |    $ cd engine
 43 |    $ make
 44 | 
 45 |    check that bin/graphlite has been generated.
 46 | 
 47 | ------------------------------------------------------------
 48 | Compile and Run Vertex Program
 49 | ------------------------------------------------------------
 50 | 
 51 | 1. build example
 52 | 
 53 |    $ cd example
 54 |    $ make
 55 | 
 56 |    check that example/PageRankVertex.so has been generated.
 57 |    
 58 | 2. run example
 59 | 
 60 |    $ start-graphlite example/PageRankVertex.so Input/facebookcombined_4w Output/out
 61 | 
 62 |    PageRankVertex.cc declares 5 processes: 1 master and 4 workers.
 63 |    The input graph is therefore split into four files, one per worker: Input/facebookcombined_4w_[1-4]
 64 | 
 65 |    The output of PageRank will be in: Output/out_[1-4]
 66 | 
 67 |    Workers generate log files in WorkOut/worker*.out
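The partition files named above appear to share one simple text layout, judging
by the Input/tinygraph_4w_* samples: the first line holds the local vertex
count, the second the local edge count, and each remaining line holds one
"from to" pair. The following is a minimal sketch of a reader for that layout.
It is not the engine's InputFormatter; Subgraph and readSubgraph are
hypothetical names used only for illustration, and the sketch assumes
unweighted edges.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for one partition file; not a GraphLite type.
struct Subgraph {
    int64_t num_vertices = 0;                        // value of line 1
    int64_t num_edges = 0;                           // value of line 2
    std::vector<std::pair<int64_t, int64_t>> edges;  // (from, to) pairs
};

// Reads the assumed layout: vertex count, edge count, then "from to" pairs.
// Returns false if the header is missing or the number of edges read does
// not match the declared edge count.
bool readSubgraph(std::istream& in, Subgraph& g) {
    if (!(in >> g.num_vertices >> g.num_edges)) return false;
    g.edges.clear();
    int64_t from = 0;
    int64_t to = 0;
    while (in >> from >> to) g.edges.emplace_back(from, to);
    return static_cast<int64_t>(g.edges.size()) == g.num_edges;
}
```

To try it, pass a std::ifstream opened on a file such as Input/tinygraph_4w_1,
or a std::istringstream holding the same text.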
 68 | 
 69 | ------------------------------------------------------------
 70 | Write Vertex Program
 71 | ------------------------------------------------------------
 72 | Please refer to example/PageRankVertex.cc
 73 | 
 74 | 1. change VERTEX_CLASS_NAME(name) definition to use a different class name
 75 | 
 76 | 2. VERTEX_CLASS_NAME(InputFormatter) can be kept as is
 77 | 
 78 | 3. VERTEX_CLASS_NAME(OutputFormatter): this is where the output is generated
 79 | 
 80 | 4. VERTEX_CLASS_NAME(Aggregator): you can implement other types of aggregation
 81 | 
 82 | 5. VERTEX_CLASS_NAME(): the main vertex program with compute()
 83 |    
 84 | 6. VERTEX_CLASS_NAME(Graph): set the running configuration here
 85 | 
 86 | 7. Modify Makefile:
 87 |    EXAMPLE_ALGOS=PageRankVertex
 88 | 
 89 |    if your program is your_program.cc, then 
 90 |    EXAMPLE_ALGOS=your_program
 91 | 
 92 |    make will produce your_program.so
 93 | 
 94 | ------------------------------------------------------------
 95 | Use Hash Partitioner
 96 | ------------------------------------------------------------
 97 | 
 98 |  bin/hash-partitioner.pl can be used to divide a graph input
 99 |  file into multiple partitions.
100 | 
101 |   $ hash-partitioner.pl Input/facebookcombined 4
102 | 
103 |   will generate: Input/facebookcombined_4w_[1-4]
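The sample files Input/tinygraph_4w_[1-4] suggest how edges are assigned:
every edge goes to the partition of its source vertex, and vertex v lands in
partition (v mod 4) + 1. The sketch below reproduces that rule in C++. It is
inferred from the sample files only and is not a port of
bin/hash-partitioner.pl; partitionOf and partitionEdges are hypothetical
names, and vertex ids are assumed to be non-negative.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using EdgePair = std::pair<int64_t, int64_t>;  // (from, to)

// Assumed rule: partition number (1-based) = vid mod num_workers + 1.
int partitionOf(int64_t vid, int num_workers) {
    return static_cast<int>(vid % num_workers) + 1;
}

// Groups edges by the partition of their source vertex.
// Element i of the result holds the edges of partition i + 1.
std::vector<std::vector<EdgePair>> partitionEdges(
        const std::vector<EdgePair>& edges, int num_workers) {
    std::vector<std::vector<EdgePair>> parts(num_workers);
    for (const EdgePair& e : edges) {
        parts[partitionOf(e.first, num_workers) - 1].push_back(e);
    }
    return parts;
}
```

Applied to the 12 edges of Input/tinygraph with 4 workers, this rule yields
4, 3, 2 and 3 edges for partitions 1 to 4, which matches the edge counts in
the tinygraph_4w_* files.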
104 | 
105 | 
106 | 107 | */ 108 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | GraphLite version 0.20 2 | --------------------------------------------------------------------------------