├── Figure2.jpg ├── README.md ├── mock_code.py ├── random_modular_generator_variable_modules.py └── sequence_generator.py /Figure2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/prathasah/random-modular-network-generator/238f3689b6299e25d64313fa2117a3cab97f3807/Figure2.jpg -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Random Modular Network Generator 2 | ================================ 3 | This is the source code used for the following paper: 4 | 5 | Sah, Pratha*, Lisa O. Singh, Aaron Clauset, and Shweta Bansal. "Exploring community structure in biological networks with random graphs." BMC Bioinformatics 2014, 15:220. doi:10.1186/1471-2105-15-220 6 | 7 | This Python script generates undirected, simple, connected graphs with a specified degree distribution and pattern of communities, while maintaining a graph structure that is as random as possible. 8 | 9 | email: ps875@georgetown.edu, sb753@georgetown.edu 10 | 11 | Please cite the paper above if you use our code in any form or create a derivative work. 12 | 13 | Sample Output Graph 14 | ================================ 15 | 16 | ![alt tag](https://github.com/prathasah/random-modular-network-generator/blob/master/Figure2.jpg) 17 | 18 | Modular random graphs with network size = 150, 375 edges, 3 modules (or communities) of size *s* = 50, and a power-law degree distribution, with modularity values of: a) *Q* = 0.1; b) *Q* = 0.3; and c) *Q* = 0.6. 19 | 20 | From Figure 2, Sah *et al.* (2014) 21 | 22 | Dependencies 23 | ================================ 24 | * [Python 2.7](http://python.org/) 25 | * [Networkx 2.2](https://networkx.github.io/) 26 | 27 | Installation instructions for Networkx can be found [here](https://networkx.github.io/). The generator also requires the sequence_generator.py script, which is provided. 
28 | 29 | Usage 30 | ================================ 31 | 32 | For a quick demo of the code, open the script "random_modular_generator_variable_modules.py" and scroll down to the *main* function. Adjust the model parameters, the degree distribution function, and the module size distribution function, then run the script in a terminal using the command: 33 | 34 | `$ python random_modular_generator_variable_modules.py` 35 | 36 | 37 | 38 | Sample Code 39 | ================================ 40 | For users unfamiliar with Python, I have uploaded a sample code file (mock_code.py) demonstrating how the graph generator can be imported and used in a script. The mock code can be run using the command: 41 | 42 | `$ python mock_code.py` 43 | 44 | Output 45 | ================================ 46 | Running random_modular_generator_variable_modules.py saves the adjacency matrix of the generated random modular graph under the filename 'adjacency_matrix.txt'. The graph is also saved in GraphML format under the filename "random_modular_graph.graphml". The GraphML file can be opened in Gephi (http://gephi.github.io/) for graph visualization. Note that Gephi currently does not have a layout plugin to visualize modular graphs. In the near future, I am planning to add a function to the random-modular-network-generator.py code that assigns each node a coordinate allowing easy visualization of modules (I have done this in Figure 2). Shoot us an email if you need this feature sooner rather than later. 
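The saved adjacency matrix can be read back without any extra packages. The snippet below is a minimal sketch, not part of the repository: `read_adjacency_edges` is a hypothetical helper name, and it assumes the file layout produced by `np.savetxt(..., delimiter="\t")` with its default format, i.e. one matrix row per line, tab-separated, values in scientific notation. The demo writes a tiny 3-node file by hand in that layout instead of running the generator.

```python
import os
import tempfile


def read_adjacency_edges(path, delimiter="\t"):
    """Return the undirected edges (i, j), i < j, stored in a delimited 0/1 matrix file."""
    edges = []
    row = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines, e.g. a trailing newline
            values = [float(v) for v in line.split(delimiter)]
            # the matrix of an undirected graph is symmetric,
            # so reading the upper triangle is enough
            edges.extend((row, col) for col, v in enumerate(values)
                         if col > row and v != 0)
            row += 1
    return edges


# Demo on a hand-written file (the path graph 0-1-2) in the assumed layout.
sample_rows = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
sample_path = os.path.join(tempfile.mkdtemp(), "adjacency_matrix.txt")
with open(sample_path, "w") as handle:
    for r in sample_rows:
        handle.write("\t".join("%.18e" % x for x in r) + "\n")

sample_edges = read_adjacency_edges(sample_path)
print(sample_edges)
```

The resulting edge list can be passed to any graph library or written out in another format; for the file written by the main script, call `read_adjacency_edges("adjacency_matrix.txt")`.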
47 | 48 | ========= 49 | 50 | 51 | 52 | 53 | -------------------------------------------------------------------------------- /mock_code.py: -------------------------------------------------------------------------------- 1 | ################################################### 2 | #mock code to test modular graph generator 3 | 4 | ################################################### 5 | 6 | #####importing functions 7 | 8 | #the generator requires networkx package to be installed 9 | import networkx as nx 10 | #importing modular graph generator by Sah2014 11 | import random_modular_generator_variable_modules as rmg 12 | #importing sequence generator by Sah2014 13 | import sequence_generator as sg 14 | 15 | ################################################################################################ 16 | # Enter the network size(N), average network degree (d), total modules in the network (m), and modularity (Q) 17 | N= 1000 18 | d= 10 19 | m= 10 20 | Q= 0.85 21 | 22 | # specify the degree distribution of the graph. In its current format the code can generate 23 | # four well-known degree distributions found in biological networks - scalefree, geometric, poisson and regular distribution 24 | sfunction = sg.poisson_sequence 25 | 26 | # specify the distribution of module size. The distribution can be scalefree, geometric, poisson and regular distribution (or any arbitrary sequence) 27 | #in its simplest form, specify module size to be regular, which implies that all modules are of equal size 28 | modfunction = sg.regular_sequence 29 | 30 | # generate the graph! 31 | 32 | print "Generating a simple poisson random modular graph with modularity(Q)= " + str(Q) 33 | print "Graph has " + str(N) + " nodes, " +str(m)+ " modules, and a network mean degree of " + str(d) 34 | print "Generating graph....." 
35 | G = rmg.generate_modular_networks(N, sfunction, modfunction, Q, m, d, verbose=True) 36 | #nx.write_graphml(G, "random_modular_graph_Q"+str(Q)+"_N"+str(N)+"_d"+str(d)+"_m"+str(m)+".graphml") 37 | nx.write_graphml(G, "random_graph_poisson_N"+str(N)+"_d"+str(d)+".graphml") 38 | -------------------------------------------------------------------------------- /random_modular_generator_variable_modules.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | """ 4 | This module generates random modular graphs 5 | """ 6 | __author__ = "Pratha Sah and Shweta Bansal" 7 | __copyright__ = "Copyright (C) 2013 Pratha Sah" 8 | __license__ = "GPL" 9 | __version__ = "1.0" 10 | __maintainer__ = "Pratha Sah" 11 | __email__ = "ps875@georgetown.edu" 12 | 13 | 14 | import networkx as nx 15 | import random as rnd 16 | import numpy as np 17 | import sequence_generator as sg # used by the __main__ demo at the bottom of this file 18 | import matplotlib.pyplot as plt 19 | 20 | ############################################################################# 21 | 22 | # Change log 23 | # 25 July 2013: Added a function that allows variable module size according to 24 | # distribution modfunction defined by the user 25 | # 10 March 2015: Fixed code. The code now generates graphs at Q=0. 26 | # 27 | ############################################################################# 28 | 29 | #enter the degree distribution, modularity, total network size, number of modules, and the mean degree 30 | def generate_modular_networks(N, sfunction, modfunction, Q, m, avg_degree, verbose=False, **kwds): 31 | """This function generates a modular random connected graph with a specified 32 | degree distribution and number of modules. 33 | Q is the desired value of modularity as defined by Newman (2004) 34 | N is the network size (i.e. 
the total number of nodes in the network) 35 | d is the average network degree 36 | sfunction is the degree distribution of the graph 37 | modfunction is the distribution of module size 38 | """ 39 | #iterate till the module sequence is realizable 40 | is_valid_module_seq= False 41 | scale=0 42 | while not is_valid_module_seq: 43 | 44 | # assign nodes to modules based on module size distribution function 45 | mod_nodes = {} 46 | nc = (1.0*N)/m # average module size 47 | # wd_hat = average within-degree when module sizes are kept constant. This value is used to fix minimum size of modules 48 | wd_hat = avg_degree*(Q+(1./m)) 49 | scale+=0.5 50 | mod_nodes = assign_module_membership(scale, N, nc, m, wd_hat, modfunction) 51 | mod_sizes = [len(mod_nodes[x]) for x in mod_nodes.keys()] 52 | 53 | 54 | # calculate tolerance for module-level within-degree and average-degree 55 | # 0.005 is the tolerance on Q 56 | tol = 0.1 * avg_degree/(1.0* (1-sum([(num/(1.0*N))**2 for num in mod_sizes]))) 57 | #print ("tol=="), tol 58 | #Qmax = 1.0 - (sum([(num/(1.0*N))**2 for num in mod_sizes])) 59 | #print ("Qmax=="), Qmax 60 | #if Q >= Qmax: 61 | # raise ValueError ("Q value exceeds Qmax for the chosen number of modules. Select a lower value") 62 | G = nx. 
Graph() 63 | G.add_nodes_from(range(0, N)) 64 | 65 | 66 | # Calculate the average within-degree of the network 67 | wd1 = avg_degree*(Q + (sum([(num/(1.0*N))**2 for num in mod_sizes]))) 68 | wd = round(wd1, 2) 69 | 70 | # Network generated in 5 steps 71 | # Step 1: Created total-degree list 72 | # Step 2: Create within-degree list 73 | # Step 3: Create between-degree list 74 | # Step 4: Connect outstubs (create between-edges) 75 | # Step 5: Connect instubs (create within-edges) 76 | 77 | connect_trial=100 # set initial connect_trial high to enter the while loop 78 | graph_connected = False 79 | outedge_graphical = False 80 | while connect_trial >= 10 or outedge_graphical == False or graph_connected == False: 81 | connect_trial = 0 82 | print ("generating degree lists......") 83 | #assigns total-degree to each node 84 | degree_list = create_total_degree_sequence (N, sfunction, avg_degree, mod_nodes, 0.01*tol, max_tries=1000, **kwds) 85 | 86 | 87 | #assigns within-degree to each node 88 | print ("generating indegree lists......") 89 | indegree_list = create_indegree_sequence(N, m, sfunction, mod_nodes, wd, degree_list, tol, **kwds) 90 | if len(indegree_list)==0: 91 | is_valid_module_seq= False 92 | break 93 | else: is_valid_module_seq= True 94 | 95 | 96 | #compute between-degree by formula d=wd+bd 97 | outdegree_list = create_outdegree_sequence(degree_list, indegree_list) 98 | # check if the outdegree (i.e between-degree) list is graphical 99 | outedge_graphical = is_graphical(outdegree_list, mod_nodes, m) 100 | print ("outdegree list graphical?"), outedge_graphical 101 | if outedge_graphical == True: 102 | print ("connecting out nodes..............") 103 | #connect nodes between modules using outedge list 104 | connect_trial = connect_out_nodes(G, m, mod_nodes, outdegree_list,connect_trial, indegree_list) 105 | 106 | # connect within-module edges only when between-module edges are connected. 
107 | if len(G.edges()) > 1: 108 | print ("connecting in nodes..............") 109 | #connect nodes within a module using the inedge list 110 | 111 | connect_in_nodes(G, m, mod_nodes, indegree_list, outdegree_list) 112 | graph_connected = nx.is_connected(G) # check if the graph is connected 113 | 114 | Q1 = test_modularity_variable_mod(G, mod_nodes, verbose) 115 | return G 116 | 117 | ############################################################################# 118 | 119 | def assign_module_membership(scale, N, nc, m, wd_hat, modfunction): 120 | """assigns nodal membership to module according to the specified distribution""" 121 | 122 | mod_nodes = {} 123 | estimate_se = np.std(modfunction(m, nc, seqtype="modulesize"))/np.sqrt(m) 124 | 125 | min_size = int(round(wd_hat + (scale*estimate_se))) 126 | valid_seq = False 127 | while not valid_seq: 128 | trialseq=modfunction(m-1, nc, seqtype="modulesize") #generate m-1 number of random nos. 129 | seq=[max(min_size,s) for s in trialseq] 130 | valid_seq= (N-sum(seq)) >= min_size # so that the last module is at least size=min_size 131 | seq.append(N-sum(seq)) 132 | 133 | init=0 134 | for mod in xrange(m): 135 | mod_nodes[mod]=[x for x in range (init, init+seq[mod])] 136 | init=init+seq[mod] 137 | 138 | return mod_nodes 139 | 140 | ############################################################################# 141 | 142 | 143 | #assign each node with degree based on user-defined distribution 144 | def create_total_degree_sequence (n, sfunction, avg_degree, mod_nodes, tolerance, max_tries=2000, **kwds): 145 | 146 | """ 147 | Creates a total-degree sequence.Ensures that the minimum degree is 1 and 148 | the max degree is 1 less than the number of nodes and that the average 149 | degree of the sequence generated is within a tolerance of 0.05. 
150 | 151 | `n`: number of nodes 152 | `sfunction`: a sequence generating function with signature (number of 153 | nodes, mean) 154 | `avg_degree`: mean degree 155 | `max_tries`: maximum number of tries before dying. 156 | 157 | 8Nov 2013: function now assigns total degree to the nodes module-wise, to 158 | ensure that mean(d(k))= mean(d). ie., mean degree of each module is equal 159 | to the network mean degree 160 | 161 | """ 162 | 163 | 164 | seqlist=[] 165 | # this loop assumes modules are indexed sequentially from 0 to K-1 166 | for mod in xrange(len(mod_nodes.keys())): 167 | tries = 0 168 | max_deg = len(mod_nodes[mod]) -1 169 | is_valid_seq = False 170 | tol = 5.0 171 | while tol > tolerance or (not is_valid_seq) or (tries > max_tries): 172 | trialseq = sfunction(len(mod_nodes[mod]), avg_degree, seqtype="degree") 173 | seq = [min(max_deg, max( int(round(s)), 1 )) for s in trialseq] 174 | is_valid_seq = nx.is_valid_degree_sequence_havel_hakimi(seq) 175 | 176 | if not is_valid_seq and sum(seq)%2 !=0: 177 | x = rnd.choice(xrange(len(seq))) 178 | seq[x] += 1 179 | is_valid_seq = nx.is_valid_degree_sequence_havel_hakimi(seq) 180 | # check if d_k (bar) = d(bar) 181 | tol = abs(avg_degree - np.mean(seq)) 182 | tries += 1 183 | if (tries > max_tries): 184 | raise nx.NetworkXError, \ 185 | "Exceeded max (%d) attempts at a valid sequence."%max_tries 186 | 187 | seqlist.append(seq) 188 | deg_list = [val for sublist in seqlist for val in sublist] 189 | return deg_list 190 | 191 | 192 | ############################################################################# 193 | 194 | #assign each node with within-degree based on user-defined distribution 195 | 196 | def create_indegree_sequence(n, m , sfunction, mod_nodes, wd, degree_list, tolerance, **kwds): 197 | 198 | """ 199 | Creates indegree sequence. 
200 | Ensures that (i)the within-module degree of node is less than or equal to 201 | its total degree; (ii) the within-module degree sequence is graphical 202 | nodes; and (iii) the average within module degree of the sequence 203 | generated is within a tolerance of 0.05 204 | 205 | 'n': number of nodes 206 | 'sfunction': a sequence generating function with signature (number of 207 | nodes, mean) 208 | 'wd' = mean within-module degree, m= total modules in the network 209 | 'nc'=average community size 210 | 'mod_nodes'= dictionary of nodal membership to communities 211 | 'avg_degree': mean degree 212 | 'degree_list'= total degree sequence 213 | """ 214 | is_valid_seq = False 215 | is_valid_indegree = False 216 | is_valid_module_size=False 217 | is_valid_outdegree = True 218 | tol = 5.0 219 | connect_trial=0 220 | # Return empty list if the main loop breaks 221 | indegree_list=[] 222 | 223 | while ((not is_valid_seq) or (not is_valid_indegree) or (not is_valid_module_size) or (not is_valid_outdegree)): 224 | 225 | indegree_seq = sfunction(n, wd, seqtype="indegree") 226 | indegree_sort = list(np.sort(indegree_seq)) 227 | degree_sort = list(np.sort(degree_list)) 228 | is_valid_indegree= all([indegree_sort[i] <= degree_sort[i] for i in range(n)])==True 229 | mod_sizes = [len(mod_nodes[x]) for x in mod_nodes.keys()] 230 | is_valid_module_size = min(mod_sizes) >= max(indegree_seq) 231 | is_valid_seq=True 232 | 233 | 234 | if not is_valid_module_size: connect_trial+=1 235 | if is_valid_indegree and is_valid_seq and is_valid_module_size: 236 | #assign within-degree to a node such that wd(i)<=d(i) 237 | indegree_list = sort_inedge(indegree_sort, degree_sort, degree_list, n) 238 | #if network has two modules then sum(outdegree) for module 1 has to be equal to sum(outdegree) of module 2 239 | if m==2: 240 | mod1, mod2 = mod_nodes.keys() 241 | is_valid_outdegree = sum([(degree_list[i] -indegree_list[i]) for i in mod_nodes[mod1]]) == sum([(degree_list[i] -indegree_list[i]) for 
i in mod_nodes[mod2]]) 242 | 243 | 244 | for module in mod_nodes: 245 | seq = [indegree_list[i] for i in mod_nodes[module]] 246 | 247 | while (sum(seq)%2) != 0: 248 | # choose a random node in the module 249 | node_add_degree = rnd.choice([i for i in mod_nodes[module]]) 250 | # ensure that wd<=d and wd< (module size -1)after adding a within-degree to the node 251 | if indegree_list[node_add_degree] < degree_list[node_add_degree] and indegree_list[node_add_degree] < len(mod_nodes[module]): 252 | indegree_list[node_add_degree] += 1 253 | seq = [indegree_list[i] for i in mod_nodes[module]] 254 | 255 | is_valid_seq = nx.is_valid_degree_sequence_havel_hakimi(seq) 256 | tol = abs(wd - (sum(seq)/(1.0*len(seq)))) #ensure that wd_k(bar) = wd(bar) 257 | if (not is_valid_seq) and tol>tolerance: 258 | break 259 | 260 | 261 | if connect_trial>10:break 262 | 263 | return indegree_list 264 | 265 | ############################################################################# 266 | 267 | #assign each node with between-degree based on formula d=wd+bd 268 | def create_outdegree_sequence(degree_seq, indegree_seq): 269 | """Creates out(between-module) degree sequence""" 270 | outdegree_seq = [x-y for x, y in zip(degree_seq,indegree_seq)] 271 | return outdegree_seq 272 | 273 | ############################################################################# 274 | 275 | #Connect instubs (form within-edges) 276 | def connect_in_nodes(G, m, mod_nodes, indegree_list, outdegree_list): 277 | """Connects within-module stubs (or half edges) using a modified version of 278 | Havel-Hakimi algorithm""" 279 | a = 0 280 | 281 | while a < m: 282 | GI = nx.Graph() # treat each module as an independent graph 283 | edge1 = {} 284 | 285 | #create dictionary with key=node id and value=within-degree 286 | for num in range (min(mod_nodes[a]), (max(mod_nodes[a])+1)): 287 | edge1[num] = indegree_list[num] 288 | 289 | # sorts within-degrees in descending order keeping the node identity intact 290 | edge1_sorted = 
sorted(edge1.items(), key = lambda x: x[1], reverse=True) 291 | 292 | # creates a tuple of (node#, within-degree) 293 | nodelist, deglist = [[z[i] for z in edge1_sorted] for i in (0, 1)] 294 | node_edge = zip(deglist, nodelist) 295 | node_edge = [(x, y) for x, y in node_edge if x > 0] 296 | 297 | # Connection of instubs based on Havel-Hakimi algorithm. 298 | #Terminates when number of instubs=0 299 | while node_edge: 300 | node_edge.sort(reverse = True) 301 | deg1, node1 = node_edge[0] #choose node with highest within-degree 302 | if deg1 > 0: 303 | #connect the stubs of the node(i) to the next "wd(i)" nodes in the sorted list. 304 | for num in range(1, 1+deg1): 305 | deg2, node2 = node_edge[num] 306 | GI.add_edge(node1, node2) 307 | node_edge = [(x-1, y) if y == node2 else (x, y) for x, y in node_edge] # reducing the degree (stubs) the next "wd(i)" nodes by 1 308 | 309 | node_edge = node_edge[1:] #remove the node1 from the list 310 | node_edge = [(x, y) for x, y in node_edge if x > 0] # remove node if the within-degree (instub) hits zero 311 | 312 | randomize_graph(GI) # remove degree correlations by edge-randomization 313 | 314 | if nx.is_connected(GI) == False: # reconnect graph if disconnected 315 | connect_module_graph(GI, outdegree_list) 316 | 317 | 318 | #integrate the sub-graph to the main graph G. 
319 | G.add_edges_from(GI.edges()) 320 | a += 1 321 | 322 | ############################################################################# 323 | 324 | #Connect outstubs (form between-edges) 325 | def connect_out_nodes(G, m, mod_nodes, outdegree_list, connect_trial, indegree_list): 326 | """Connects between-module stubs (or half edges) using a modified version of 327 | Havel-Hakimi algorithm""" 328 | nbunch = G.edges() 329 | # additonal check: to ensure that Graph G does not have any 330 | # pre-existing edges from earlier steps 331 | G.remove_edges_from(nbunch) 332 | is_valid_connection = False 333 | 334 | # maximum attemp to connect outstubs=10 335 | while is_valid_connection == False and connect_trial < 10: 336 | is_valid_connection = True 337 | trial = 0 338 | outnodelist = [] 339 | 340 | # creates a tuple of (node#, degree, module#) 341 | for num in xrange(len(G.nodes())): 342 | my = [key for key in mod_nodes.keys() if num in mod_nodes[key]] 343 | outnodelist.append((num, outdegree_list[num], my[0])) 344 | 345 | #Connect outstubs using a modified version of Havel-Hakimi algorithm 346 | #terminate when all the outstubs are connected 347 | while outnodelist: 348 | 349 | if is_valid_connection == False: break 350 | rnd.shuffle(outnodelist) 351 | outnodelist = [(x, y, z) for x, y, z in outnodelist if y != 0] # removes outnodes with zero outedges 352 | 353 | # select module with the highest between-degree = hfm 354 | outmod_tot = [(sum([y for x, y, z in outnodelist if z == a]), a) for a in xrange(m)] 355 | outmod_tot.sort(reverse = True) 356 | hfm = outmod_tot[0][1] 357 | 358 | # Select node (=node1) in module=hfm which has the highest between-degree 359 | possible_node1 = [(x, y) for x, y, z in outnodelist if z == hfm] 360 | possible_node1 = sorted(possible_node1, key = lambda x:x[1], reverse = True) 361 | node1, deg1 = possible_node1[0] 362 | 363 | 364 | # connect all the outstubs of node1 to random possible nodes 365 | # Criteria for selecting possible nodes 366 | 
# (a) Nodes cannot belong to the same module as node1 367 | # (b) No multi-edges allowed 368 | for degrees in xrange(deg1): 369 | if is_valid_connection == False: break 370 | # criteria (a) and (b) 371 | node_exclude = set(G.neighbors(node1)).union(set(mod_nodes[hfm])) 372 | possible_node2 = [(x, y, z) for x, y, z in outnodelist if x not in node_exclude] 373 | # list of possible nodes that node1 can connect to. 374 | 375 | possible_node2 = [(x, y) for x, y, z in possible_node2] 376 | 377 | #terminate if there are no possible nodes left for node 1 to connenct to. 378 | if len(possible_node2) > 0: 379 | is_isolates_avoided = False 380 | is_valid_connection = True 381 | 382 | # prevent nodes with 0within- module edge and one between- 383 | #module edge to connect to one another 384 | #Avoids formation of disconnected graphs 385 | trial=0 386 | # preference to nodes with indegree>0 to connect to nodes to indegree= 0 387 | if indegree_list[node1]!=0 and [indegree_list [node] for (node,deg) in possible_node2].count(0)>1: 388 | possible_node2 = [(node, deg) for node, deg in possible_node2 if indegree_list [node]==0] 389 | while not is_isolates_avoided: 390 | node2, deg2 = rnd.choice(list(possible_node2)) 391 | trial+=1 392 | is_isolates_avoided = indegree_list [node1] != 0 or indegree_list[node2] != 0 393 | 394 | if is_isolates_avoided: trial = 0 395 | 396 | # terminate if the attempt of finding possible nodes exceed 10000 397 | if trial == 50000: 398 | is_valid_connection, connect_trial = remove_outedges(G, connect_trial) 399 | break 400 | 401 | # on termination remove all the between-edges added and try again 402 | else: 403 | is_valid_connection, connect_trial = remove_outedges(G, connect_trial) 404 | break 405 | 406 | if is_valid_connection == True: 407 | G.add_edge(node1, node2) 408 | # reduces the degree of node1 and node 2by 1 in outnodelist 409 | outnodelist = [(x, y, z) if x != node2 else (x, y-1, z) for x, y, z in outnodelist] 410 | 411 | # remove node if the 
between-degree (outstub) hits zero 412 | outnodelist = [(x, y, z) for x, y, z in outnodelist if y > 0 and x != node1] 413 | 414 | # remove degree correlations by edge-randomization 415 | if len(G.edges()) > 0: 416 | randomize_graph_outedges(G, mod_nodes, indegree_list, outdegree_list) 417 | 418 | return connect_trial 419 | 420 | ############################################################################# 421 | 422 | def remove_outedges(G, connect_trial): 423 | """Removes all the within-module edges of the graph""" 424 | edges_added = G.edges() 425 | G.remove_edges_from(edges_added) 426 | connect_trial += 1 427 | state = False 428 | 429 | return state, connect_trial 430 | 431 | ##################################################################3########### 432 | 433 | def is_graphical(edgelist, mod_nodes, m): 434 | """Check if the between-module degree sequence is graphically realizable 435 | using algorithm by Chungphaisan (1974). The algorithm allows for the 436 | assumption of modules to beindividual nodes and allows multiple-edges 437 | between the nodes.""" 438 | state = True 439 | b = sum(edgelist) 440 | n = m 441 | s = {} 442 | for a in range(0, m): 443 | s[a+1] = sum(edgelist[min(mod_nodes[a]):(max(mod_nodes[a])+1)]) 444 | if b%2 != 0: 445 | state = False 446 | if m==2: 447 | if s[1]!=s[2]:state=False 448 | else: 449 | lhs=[] 450 | rhs=[] 451 | for j in range(1, n+1): 452 | lhs.append(sum([s[i] for i in range (1, j+1)])-(b*j*(j-1))) 453 | rhs.append(sum([min((j*b), s[i]) for i in range(j+1, n+1)])) 454 | 455 | if lhs > rhs:state = False 456 | return state 457 | 458 | ############################################################################# 459 | 460 | #assign within-degree to a node such that wd(i)<=d(i) 461 | def sort_inedge(inedge_sort, edge_sort, degree_list,n): 462 | """Assign within-module degree to nodes from the list of random-numbers 463 | sequence generated with the constraint that within-module degree is less 464 | than or equal to the total 
nodal degree.""" 465 | indegree_list_dict = {} 466 | indegree_list1 = {} 467 | degree_list1 = {} 468 | nodelist = [x for x in xrange(n)] 469 | rnd.shuffle(nodelist) # so that node are chosen at random 470 | 471 | for i in range(n): 472 | indegree_list1[i] = inedge_sort[i] 473 | degree_list1[i] = edge_sort[i] 474 | 475 | while nodelist: 476 | node_i = nodelist.pop() #chose a random node 477 | for key, value in degree_list1.items(): 478 | #check for the rank of its total-degree in the sorted 479 | #total-degree list 480 | if value == degree_list[node_i]: 481 | #assign within-degree with the same rank (=r) in the sorted 482 | #within-degree list 483 | indegree_list_dict[node_i] = indegree_list1[key] 484 | del indegree_list1[key] 485 | del degree_list1[key] 486 | break 487 | indegree_list = [value for key, value in indegree_list_dict.items()] 488 | return indegree_list 489 | 490 | ############################################################################# 491 | 492 | def randomize_graph(G): 493 | """randomize a network using double-edged swaps. 494 | Note: This is used to randomize only the within-edges. 
A separate algorithm 495 | (randomize_graph_outedges) is used to randomize between-edges""" 496 | 497 | size = G.size() # number of edges in graph 498 | its = 1 499 | for counter in range(its): 500 | nx.double_edge_swap(G, nswap = 5*size, max_tries = 10000*size) 501 | 502 | ############################################################################# 503 | 504 | 505 | def randomize_graph_outedges(G, mod_nodes, indegree_list, outdegree_list): 506 | """randomize between-edges using double-edge swaps""" 507 | size = G.size() 508 | double_edge_swap_outedges(G, mod_nodes, indegree_list, outdegree_list, nswap = 5*size,max_tries = 10000*size) 509 | 510 | ############################################################################# 511 | 512 | def double_edge_swap_outedges(G, mod_nodes, indegree_list, outdegree_list, nswap, max_tries): 513 | """Randomizes between-modul edges of the graph. This function is similiar 514 | to the generic double-edge-swap technique with an additonal constraint: 515 | Swaps that create within-module edges are not allowed.""" 516 | 517 | if len(G) < 4: raise nx.NetworkXError("Graph has less than four nodes.") 518 | n = 0 519 | swapcount = 0 520 | #keys, degrees = zip(*G.degree().items()) # nodes, degree 521 | keys, degrees = zip(*G.degree()) # nodes, degree 522 | cdf = nx.utils.cumulative_distribution(degrees) # cdf of degree 523 | while swapcount < nswap: 524 | (ui, xi) = nx.utils.discrete_sequence(2, cdistribution=cdf) 525 | if ui == xi : 526 | continue # same source, skip 527 | u = keys[ui] # convert index to label 528 | x = keys[xi] 529 | u_mod = [module for module in mod_nodes if u in mod_nodes[module]] 530 | u_mod = u_mod[0] 531 | 532 | if x in mod_nodes[u_mod]: 533 | continue # same module, skip 534 | 535 | # choose target uniformly from neighbors 536 | v = rnd.choice(list(G[u])) 537 | y = rnd.choice(list(G[x])) 538 | 539 | v_mod = [module for module in mod_nodes if v in mod_nodes[module]] 540 | v_mod = v_mod[0] 541 | 542 | if v == y or y 
in mod_nodes[v_mod]: 543 | continue # same target or same module, skip 544 | if (x not in G[u]) and (y not in G[v]): # don't create parallel edges 545 | if (indegree_list[u]+indegree_list[x]!=0) and (indegree_list[v]+indegree_list[y]!=0): 546 | 547 | G.add_edge(u, x) 548 | G.add_edge(v, y) 549 | G.remove_edge(u, v) 550 | G.remove_edge(x, y) 551 | swapcount += 1 552 | if n >= max_tries: 553 | e = ('Maximum number of swap attempts (%s) exceeded '%n + 554 | 'before desired swaps achieved (%s).'%nswap) 555 | raise nx.NetworkXAlgorithmError(e) 556 | n += 1 557 | return G 558 | 559 | ############################################################################# 560 | 561 | def connect_module_graph(G, outdegree_list): 562 | """Connect disconnected modules. Note: This function cannot be used to 563 | connect the entire modular graph.""" 564 | cc_tot = list(nx.connected_components(G)) # cc returns the connected components of G as lists cc[0], cc[1], etc. 565 | isolated_comp, outedge_comp, isolated_comp_count, outedge_comp_count = partition_network_components(cc_tot, outdegree_list) 566 | 567 | while isolated_comp_count > 0: #while G is not connected, reduce number of components 568 | # pick a random node in the largest component cc[0] that has degree > 1 569 | node1 = rnd.choice(isolated_comp[0]) 570 | # pick a node in another component whose degree >1 571 | node2 = rnd.choice(outedge_comp[rnd.choice([x for x in xrange(outedge_comp_count)])]) 572 | while G.degree(node2) <= 1: 573 | node2 = rnd.choice(outedge_comp[rnd.choice([x for x in xrange(outedge_comp_count)])]) 574 | 575 | # pick neighbors of node1 and node2 576 | nbr1 = rnd.choice([n for n in G.neighbors(node1)]) 577 | nbr2 = rnd.choice([n for n in G.neighbors(node2)]) 578 | 579 | # swap connections between node1,nbr1 with connections between node2,nbr2 580 | # to attempt to connect the two components 581 | G.remove_edges_from([(node1, nbr1), (node2, nbr2)]) 582 | G.add_edges_from([(node1, node2), (nbr1, nbr2)]) 583 | 
584 | cc_tot = list(nx.connected_components(G)) 585 | isolated_comp, outedge_comp, isolated_comp_count, outedge_comp_count = partition_network_components(cc_tot, outdegree_list) 586 | 587 | ############################################################################# 588 | 589 | def partition_network_components(cc_tot, outdegree_list): 590 | """Partitions network disconnected components of a module into: 591 | (a) components with no within-module edges and hence isolates, and 592 | (b) components with atleast one within-module edge""" 593 | tot_component_count = len(cc_tot) 594 | isolated_comp = {} 595 | outedge_comp = {} 596 | count1 = 0 597 | count2 = 0 598 | for component in xrange(tot_component_count): 599 | outedge_stubs = False 600 | for node in cc_tot[component]: 601 | if outdegree_list[node] > 0: 602 | outedge_stubs = True 603 | 604 | if outedge_stubs is False and len(cc_tot[component]) > 1: 605 | isolated_comp[count1] = list(cc_tot[component]) 606 | count1 += 1 607 | elif outedge_stubs is True and len(cc_tot[component]) > 1: 608 | outedge_comp[count2] = list(cc_tot[component]) 609 | count2 += 1 610 | return isolated_comp, outedge_comp, len(isolated_comp), len(outedge_comp) 611 | ################################################################################ 612 | 613 | def adjust_indegree_seq(mod_nodes, indegree_seq): 614 | """checks which module does node belongs to and returns the module size""" 615 | mod_size=[(len(mod_nodes[s]), s) for s in mod_nodes] 616 | mod_size.sort() 617 | adjust_indegree_seq={} 618 | for mods in mod_size[:-1]: 619 | max_deg_valid=False 620 | total_deg_valid=False 621 | while not(max_deg_valid) or not(total_deg_valid): 622 | indegree_topop=[x for x in indegree_seq] 623 | rnd.shuffle(indegree_seq) 624 | adjust_indegree_seq[mods[1]]=[indegree_topop.pop() for nodes in mod_nodes[mods[1]]] 625 | max_deg_valid= max(adjust_indegree_seq[mods[1]]) 1: #while G is not connected, reduce number of components 725 | 726 | # pick a random node 
        # in the largest component cc[0] that has degree > 1
        largest = list(cc[0])
        node1 = rnd.choice(largest)
        while G.degree(node1) == 1:
            node1 = rnd.choice(largest)

        # pick a node in another component
        node2 = rnd.choice(list(cc[1]))

        # pick neighbors of node1 and node2
        # (G.neighbors returns an iterator in networkx 2.x, so convert it to a list first)
        nbr1 = rnd.choice(list(G.neighbors(node1)))
        nbr2 = rnd.choice(list(G.neighbors(node2)))

        # swap connections between node1,nbr1 with connections between node2,nbr2
        # to attempt to connect the two components
        G.remove_edges_from([(node1,nbr1),(node2,nbr2)])
        G.add_edges_from([(node1,node2),(nbr1,nbr2)])

        # sort by size so that cc[0] is again the largest component
        cc = sorted(nx.connected_components(G), key=len, reverse=True)
        component_count = len(cc)


#############################################################################

if __name__ == "__main__":

    """Main function to mimic C++ version behavior"""

    ### Enter the network size (N), average network degree (d), total modules in the network (m), and modularity (Q)
    N=2000
    d=10
    m=10
    Q= 0.4

    # specify the degree distribution of the graph. In its current form the code can generate
    # four well-known degree distributions found in biological networks: scalefree, geometric, poisson and regular
    degfunction = sg.poisson_sequence

    # specify the distribution of module sizes. The distribution can be scalefree, geometric, poisson or regular (or any arbitrary sequence)
    # in its simplest form, specify the module size to be regular, which implies that all modules are of equal size
    modfunction = sg.regular_sequence

    try :
        print "Generating random modular graph with modularity(Q) = " +str(Q)
        print "The graph has " + str(N) + " nodes, " + str(m)+ " modules, and a network mean degree of " +str(d)
        print "Generating graph....."
        G = generate_modular_networks(N, degfunction, modfunction, Q, m, d)

        print "Saving adjacency matrix under the filename adjacency_matrix.txt"
        adjacency_matrix = nx.to_numpy_matrix(G)
        np.savetxt("adjacency_matrix.txt", adjacency_matrix, delimiter="\t")

        print "writing graph in the GraphML format under the filename random_modular_graph.graphml"
        # The graphml file can be uploaded in Gephi (http://gephi.github.io/) for graph visualization.
        nx.write_graphml(G, "random_modular_graph.graphml")

    except (IndexError, IOError):
        print "try again"
#############################################################################

--------------------------------------------------------------------------------
/sequence_generator.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python

"""
This module generates random number sequences from the given functions
"""
__author__ = "Pratha Sah and Shweta Bansal"
__copyright__ = "Copyright (C) 2013 Pratha Sah"
__license__ = "GPL"
__version__ = "1.0"
__maintainer__ = "Pratha Sah"
__email__ = "ps875@georgetown.edu"

import numpy as np
import math
import random as rnd

###################################################################################################

def regular_sequence (N, mean, seqtype):
    """returns a sequence with N entries, each equal to the mean rounded to the nearest integer"""
    seq=[int(round(mean))]*(N)
    return seq

##############################################################################################
def poisson_sequence (N, mean, seqtype):
    """returns a sequence with N entries and values following a Poisson distribution with mean = mean"""
    seq= list(np.random.poisson(lam=mean, size=(N)))
    return seq

##############################################################################################

def scalefree_sequence (N, mean, seqtype):
    """returns a sequence with N entries and values following a Power-law distribution with mean = mean"""
    if seqtype == "modulesize": alpha = 10
    else: alpha = 1/(1.0*mean)*10
    tol = 5 # set initial tolerance high to enter the loop
    while tol > 0.2:
        seq1 = list(np.random.power(alpha, size=(N)))
        seq1 = [abs(1-x) for x in seq1]
        # scale the draws to the integer range used for this sequence type
        # (elif keeps the "modulesize" result from being overwritten by the final else branch)
        if seqtype == "modulesize": seq = [int(num*(N*mean-1)) for num in seq1]
        elif seqtype == "simple_degree": seq = [int(num*(N-2))+1 for num in seq1]
        else: seq = [int(num*(N-1)) for num in seq1]
        tol = abs(mean-np.mean(seq))
        # check if the average of the generated sequence is close to the requested mean. Tolerance = 0.2 deviation from the mean.
        if tol > 0.2:
            if mean-np.mean(seq) > 0.05: alpha -= 0.05
            else: alpha += 0.05

    return seq

##############################################################################################

def geometric_sequence(N, mean, seqtype):
    """returns a sequence with N entries and values following a geometric distribution with mean = mean"""
    avg = mean
    calc_mean = 0

    # adjust avg in steps of 0.1 until the sample mean is within 0.05 of the requested mean
    while abs(mean-calc_mean) > 0.05:
        if (mean-calc_mean) > 0.05:
            avg += 0.1
        else:
            avg -= 0.1
        p = 1.0/avg
        # inverse-transform sampling: log(U)/log(1-p), rounded to the nearest integer
        seq = [math.log(np.random.random())/math.log(1-p) for i in xrange(N)]
        seq = [int(round(x)) for x in seq]
        calc_mean = sum(seq)/(1.0*len(seq)) # compute avg degree of generated degree sequence

    return seq

##############################################################################################

--------------------------------------------------------------------------------
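The resample-until-close loop used by `geometric_sequence` can be tried in isolation. The following is a minimal sketch, not part of the repository: it assumes Python 3 and the standard library only, and the names `geometric_draws` and `tuned_geometric_sequence`, together with the `tol`, `step`, `max_trials` and `seed` parameters, are hypothetical and introduced here for illustration. Unlike the original, it draws `1 - U` so that `log(0)` cannot occur, and it gives up after a fixed number of trials instead of looping indefinitely.

```python
import math
import random


def geometric_draws(n, avg, rng):
    """Draw n non-negative integers by inverse-transform sampling.

    With p = 1/avg, log(U)/log(1 - p) is exponentially distributed;
    rounding it to the nearest integer gives an approximately
    geometric value.
    """
    p = 1.0 / avg
    # 1 - rng.random() lies in (0, 1], which avoids log(0)
    return [int(round(math.log(1.0 - rng.random()) / math.log(1.0 - p)))
            for _ in range(n)]


def tuned_geometric_sequence(n, mean, tol=0.05, step=0.1,
                             max_trials=1000, seed=None):
    """Resample, nudging avg by step, until the sample mean is within tol of mean."""
    rng = random.Random(seed)
    avg = mean + step
    for _ in range(max_trials):
        seq = geometric_draws(n, avg, rng)
        sample_mean = sum(seq) / float(n)
        if abs(mean - sample_mean) <= tol:
            return seq
        # raise avg if the sample came out too small, lower it otherwise
        avg += step if sample_mean < mean else -step
        # keep p = 1/avg strictly below 1 so that log(1 - p) is defined
        avg = max(avg, 1.0 + step)
    raise RuntimeError("no sequence within tolerance after max_trials draws")


if __name__ == "__main__":
    degrees = tuned_geometric_sequence(2000, 10, seed=1)
    print(len(degrees), sum(degrees) / float(len(degrees)))
```

Because the draws are rounded rather than floored, the sample mean for a given `avg` sits slightly below `avg`, which is why the loop has to search for a suitable value instead of using `avg = mean` directly.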