├── README.md
└── code.py


/README.md:
--------------------------------------------------------------------------------
This program was developed from the following paper:

Jiang B. and Ding M. (2015), Defining least community as a homogeneous group in complex networks, Physica A, 428, 154-160.

Simply put, it obtains homogeneous communities in complex networks by applying head/tail breaks to edge betweenness, which has a heavy-tailed distribution. The program is based on NetworkX 1.8, and its input and output files follow the Pajek file format (.net).


How to use:

1. Open the .net file in a text editor, copy all edges under the line "*edges", and paste them into a new text file.
2. Edit the lines of the code marked by comments to set the input file location, the division rule, and the output file location.
3. Run the program, then replace the edges in the original .net file with the edges from the output file. (It is best to keep a copy of the original .net file for later comparison.)
4. Load the changed .net file in Pajek, click the draw network button, then press "Ctrl + K"; all the homogeneous groups will show up. For further information about the detected homogeneous groups, click Network > Create Partition > Components > Weak in Pajek and set minimum size = 1; the number of groups and their member nodes will then be listed.

Note that the capacity of this program is limited to around 30,000 edges, and it may take a couple of minutes or longer for a bigger graph (say > 10,000 edges).
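
Steps 1 and 3 above are manual copy-and-paste operations. The following is a minimal sketch of how they could be scripted instead. It is not part of the original program: the function names are hypothetical, and it assumes the edge section of the .net file starts with a line beginning with "*edges" (matched case-insensitively) and runs until the next line starting with "*" or the end of the file.

```python
def read_pajek_edges(net_path):
    """Return the non-empty lines found under the *edges section (step 1)."""
    edges = []
    in_edges = False
    with open(net_path) as f:
        for line in f:
            stripped = line.strip()
            if stripped.startswith("*"):
                # A new section begins; only "*edges" is collected.
                in_edges = stripped.lower().startswith("*edges")
                continue
            if in_edges and stripped:
                edges.append(stripped)
    return edges


def write_edge_file(edges, edge_path):
    """Write one edge per line, the layout that code.py reads as input."""
    with open(edge_path, "w") as f:
        for edge in edges:
            f.write(edge + "\n")


def replace_pajek_edges(net_path, edge_path, out_path):
    """Copy net_path to out_path, swapping in the edges from edge_path (step 3)."""
    with open(edge_path) as f:
        new_edges = [ln.strip() for ln in f if ln.strip()]
    with open(net_path) as src, open(out_path, "w") as dst:
        skipping = False
        for line in src:
            stripped = line.strip()
            if stripped.startswith("*"):
                # Section headers are always kept; the old edge lines are dropped.
                skipping = stripped.lower().startswith("*edges")
                dst.write(stripped + "\n")
                if skipping:
                    for edge in new_edges:
                        dst.write(edge + "\n")
                continue
            if not skipping:
                dst.write(line)
```

With hypothetical file names, `write_edge_file(read_pajek_edges("graph.net"), "edge.txt")` would prepare the input for the program, and `replace_pajek_edges("graph.net", "OutputEdge.txt", "graph_groups.net")` would build the file to load in Pajek while leaving the original .net file untouched.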


--------------------------------------------------------------------------------
/code.py:
--------------------------------------------------------------------------------
import timeit

import networkx as nx


def getMean(values):
    # Arithmetic mean of the edge-betweenness values.
    return float(sum(values)) / float(len(values))


def HeadTailCommunityDetection(G, finaledgelist):
    # Process each connected component separately (NetworkX 1.8 API).
    H = nx.connected_component_subgraphs(G)
    for subgraph in H:
        # Unnormalized, unweighted edge betweenness (arguments passed positionally).
        result = nx.edge_betweenness(subgraph, False, None)
        # list() keeps edges and values indexable under Python 3 as well.
        edges = list(result.keys())
        values = list(result.values())
        mean = getMean(values)
        edgelist = []
        edgetemp = subgraph.edges()
        if len(edgetemp) <= 2:
            # Components with at most two edges are kept as they are.
            for edge in edgetemp:
                finaledgelist.append(edge)
        else:
            # The tail consists of the edges whose betweenness is at or below the mean.
            for index in range(len(values)):
                if values[index] <= mean:
                    edgelist.append(edges[index])
            # Head/tail division rule, given as the tail share:
            # for a 40/60 rule, the value is 0.6 as below. Change it here.
            if (float(len(edgelist)) / float(len(edges))) <= 0.6:
                for edge in edgelist:
                    finaledgelist.append(edge)
            else:
                # Rebuild a graph from the tail edges and divide it again.
                Gsub = nx.Graph()
                for edge in edgelist:
                    Gsub.add_edge(edge[0], edge[1])
                HeadTailCommunityDetection(Gsub, finaledgelist)


def HeadTailInitiator():
    G = nx.Graph()
    with open("sampleNetFile\\Protein\\edge.txt", "r") as ins:  # input file path
        for line in ins:
            # split() tolerates repeated or leading spaces in the edge lines.
            words = line.split()
            if len(words) < 2:
                continue  # skip blank or incomplete lines
            G.add_edge(int(words[0]), int(words[1]))
    finaledgelist = []
    start = timeit.default_timer()
    HeadTailCommunityDetection(G, finaledgelist)
    print("done!")
    stop = timeit.default_timer()
    print("Processing time (in seconds): " + str(stop - start))
    with open("OutputEdge.txt", "w") as text_file:  # output file path
        for edge in finaledgelist:
            text_file.write(str(edge[0]) + " " + str(edge[1]) + "\n")


HeadTailInitiator()

--------------------------------------------------------------------------------
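
The division rule applied inside HeadTailCommunityDetection can be illustrated on a plain list of numbers, without any graph. The sketch below is an illustration only, not the authors' code: the helper names and the `tail_limit` parameter are hypothetical, and the default of 0.6 simply mirrors the 40/60 rule mentioned in the code comment.

```python
def head_tail_split(values):
    """Split values around their arithmetic mean.

    The head holds the values above the mean; the tail holds the values
    at or below it, matching the `<= mean` test in code.py.
    """
    mean = sum(values) / float(len(values))
    head = [v for v in values if v > mean]
    tail = [v for v in values if v <= mean]
    return head, tail


def should_divide_again(values, tail_limit=0.6):
    """True when the tail share exceeds tail_limit.

    This is the case in which code.py rebuilds a graph from the tail
    edges and recurses; otherwise the tail edges are written out.
    """
    _, tail = head_tail_split(values)
    return len(tail) / float(len(values)) > tail_limit


skewed = [1, 1, 1, 2, 2, 3, 10, 40]   # mean 7.5, tail share 6/8 = 0.75
even = [1, 2, 3, 4]                   # mean 2.5, tail share 2/4 = 0.5
print(should_divide_again(skewed))    # True
print(should_divide_again(even))      # False
```

In the program itself the values are edge betweenness scores, so the head corresponds to the few high-betweenness edges that are removed at each level.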