├── .gitattributes
├── .gitignore
├── README.head.md
├── README.md
├── README.tail.md
├── eigenvimrc.py
├── fig.png
├── plugin
    └── eigenvimrc.vim
└── util.py

/.gitignore:
--------------------------------------------------------------------------------
*.pyc

--------------------------------------------------------------------------------
/README.head.md:
--------------------------------------------------------------------------------
TODO2: create phylogenetic tree of vimrcs
TODO3: add all options from option.c (see src/nvim/option.c)
TODO4: cover the entire github periodically

This script scrapes vimrcs hosted on GitHub and returns the most commonly used vim configurations.

```python eigenvimrc.py```

The "voting" process may iterate as follows:

```python
def vote(default):
    vimrcs = people_curate_their_vimrc()
    sleep(a_month)
    new_default = most_common_50_percent(vimrcs)
    if new_default != default:
        return vote(new_default)
    else:
        return new_default
```

# Install
1. Make sure pathogen is available and ```execute pathogen#infect()``` is in ```~/.vimrc```
2. ```cd ~/.vim/bundle```
3. ```git clone git://github.com/rht/eigenvimrc.git```

# Result
```set nocompatible``` > ```syntax on```



--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
TODO2: create phylogenetic tree of vimrcs
TODO3: add all options from option.c (see src/nvim/option.c)
TODO4: cover the entire github periodically

This script scrapes vimrcs hosted on GitHub and returns the most commonly used vim configurations.

```python eigenvimrc.py```

The "voting" process may iterate as follows:

```python
def vote(default):
    vimrcs = people_curate_their_vimrc()
    sleep(a_month)
    new_default = most_common_50_percent(vimrcs)
    if new_default != default:
        return vote(new_default)
    else:
        return new_default
```

# Install
1. Make sure pathogen is available and ```execute pathogen#infect()``` is in ```~/.vimrc```
2. ```cd ~/.vim/bundle```
3. ```git clone git://github.com/rht/eigenvimrc.git```

# Result
```set nocompatible``` > ```syntax on```


Most common vim config out of 13759 vimrc's

0. ```set nocompatible``` 83.33%
1. ```syntax on``` 79.78%
2. ```set number``` 74.55%
3. ```set expandtab``` 71.10%
4. ```set incsearch``` 68.44%
5. ```set hlsearch``` 67.92%
6. ```set laststatus=2``` 67.76%
7. ```filetype plugin indent on``` 67.59%
8. ```set ruler``` 60.38%
9. ```set ignorecase``` 59.67%
10. ```set autoindent``` 53.00%
11. ```set showcmd``` 50.05%
12. ```set backspace=indent,eol,start``` 47.30%
13. ```set wildmenu``` 45.68%
14. ```set background=dark``` 40.45%
15. ```let mapleader=","``` 39.67%
16. ```set smartcase``` 38.73%
17. ```filetype off``` 37.95%
18. ```set tabstop=2``` 36.87%
19. ```set shiftwidth=4``` 36.79%
20. ```set showmatch``` 36.66%
21. ```set tabstop=4``` 36.55%
22. ```set cursorline``` 36.49%
23. ```set t_Co=256``` 35.90%
24. ```set nobackup``` 34.89%
25. ```set shiftwidth=2``` 34.49%
26. ```set encoding=utf-8``` 33.50%
27. ```set mouse=a``` 32.01%
28. ```set hidden``` 30.77%
29. ```set showmode``` 27.84%
30. ```set smarttab``` 27.20%
31. ```set noswapfile``` 26.27%
32. ```set nowrap``` 26.16%
33. ```set title``` 25.41%
34. ```autocmd!``` 25.02%
35. ```set list``` 25.00%
36. ```set noerrorbells``` 23.56%
37. ```filetype plugin on``` 23.10%
38. ```syntax enable``` 22.85%
39. ```set smartindent``` 22.26%
40. ```set softtabstop=4``` 21.98%
41. ```set scrolloff=3``` 21.14%
42. ```filetype on``` 21.08%
43. ```set relativenumber``` 19.26%
44. ```autocmd BufReadPost *``` 18.46%
45. ```set autoread``` 18.45%
46. ```set softtabstop=2``` 18.16%
47. ```colorscheme solarized``` 18.00%
48. ```set ttyfast``` 17.73%
49. ```set clipboard=unnamed``` 17.62%
50. ```filetype indent on``` 17.41%
51. ```set gdefault``` 16.73%
52. ```set undofile``` 14.96%
53. ```set visualbell``` 14.88%
54. ```set nostartofline``` 14.38%
55. ```set nowritebackup``` 14.37%
56. ```set splitright``` 14.36%
57. ```set history=1000``` 14.35%
58. ```set splitbelow``` 13.79%
59. ```set rtp+=~/.vim/bundle/vundle/``` 13.55%
60. ```set wrap``` 13.18%
61. ```set history=50``` 12.84%
62. ```set modeline``` 12.42%
63. ```set shortmess=atI``` 12.38%
64. ```set guioptions-=T``` 11.91%
65. ```set shiftround``` 11.75%
66. ```set undodir=~/.vim/undo``` 11.74%
67. ```set backupdir=~/.vim/backups``` 10.76%
68. ```set wildmode=list:longest``` 10.75%
69. ```set esckeys``` 10.75%
70. ```set encoding=utf-8 nobomb``` 10.67%
71. ```let save_cursor=getpos(".")``` 10.59%
72. ```set directory=~/.vim/swaps``` 10.36%
73. ```let old_query=getreg('/')``` 10.32%
74. ```set backspace=2``` 9.46%
75. ```set lazyredraw``` 9.14%
76. ```set rtp+=~/.vim/bundle/Vundle.vim``` 9.13%
77. ```nnoremap k gk``` 8.77%
78. ```nnoremap j gj``` 8.76%
79. ```set backup``` 8.71%
80. ```colorscheme molokai``` 8.23%
81. ```set binary``` 8.10%
82. ```set noeol``` 8.06%
83. ```set linebreak``` 7.99%
84. ```set nofoldenable``` 7.98%
85. ```set numberwidth=5``` 7.92%
86. ```Plugin 'gmarik/Vundle.vim'``` 7.87%
87. ```set novisualbell``` 7.78%
88. ```let g:airline_powerline_fonts=1``` 7.76%
89. ```set fileencoding=utf-8``` 7.74%
90. ```set wildmode=list:longest,list:full``` 7.66%
91. ```set exrc``` 7.58%
92. ```function! StripWhitespace()``` 7.44%
93. ```set secure``` 7.44%
94. ```autocmd BufNewFile,BufRead *.json setfiletype json syntax=javascript``` 7.40%
95. ```noremap W :w !sudo tee % > /dev/null``` 7.33%
96. ```vnoremap < >gv``` 7.17%
99. ```set foldmethod=indent``` 7.07%

# Colorscheme stat
0. solarized 25.00%
1. molokai 11.44%
2. desert 6.19%
3. badwolf 4.33%
4. jellybeans 3.99%
5. github 2.88%
6. default 1.86%
7. hybrid 1.84%
8. vividchalk 1.72%
9. railscasts 1.71%

# Plugin manager stat

vam: 0.34%
vundle: 10.39%
neobundle: 4.40%
others or none: 69.96%
dein: 0.00%
pathogen: 14.91%


# Plot
Strangely, the distribution doesn't follow a power law, likely because some settings are highly correlated with others.
![plot](fig.png)

# Data

Last updated Mar 31 2017.
Repository list is queried from [http://ghtorrent.org/dblite/](http://ghtorrent.org/dblite/)

```SELECT * FROM projects WHERE language = 'VimL' AND ((name = 'dotfiles') OR (name = 'vimrc'))```

--------------------------------------------------------------------------------
/README.tail.md:
--------------------------------------------------------------------------------


# Plot
Strangely, the distribution doesn't follow a power law, likely because some settings are highly correlated with others.
![plot](fig.png)

# Data
Repository list is queried from [http://ghtorrent.org/dblite/](http://ghtorrent.org/dblite/)

```SELECT * FROM projects WHERE language = 'VimL' AND ((name = 'dotfiles') OR (name = 'vimrc'))```

--------------------------------------------------------------------------------
/eigenvimrc.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
from __future__ import print_function
import os
import json
import requests
import collections
import time
import sys

import util

# requires internet connection; time intensive
step2 = 0  # 2 for ghtorrent.csv, 1 for github.json

# doesn't require internet connection
step3 = step4 = step5 = 1

api_url = "https://api.github.com/search/repositories"
content_url = "https://raw.githubusercontent.com/"
data_dir = 'data'

def get_colorscheme_stat(vimrcs):
    # tally the argument of every `colorscheme <name>` line
    colorschemes = []
    outstr = []
    for line in vimrcs:
        if line.startswith("colorscheme"):
            words = line.split(' ')
            if len(words) > 1:
                colorschemes.append(words[1])
    total = len(colorschemes)
    cs = collections.Counter(colorschemes).most_common(10)
    for n, i in enumerate(cs):
        outstr.append("%d. %s %.2f%%" % (n, i[0], i[1] * 100. / total))
    return '\n\n# Colorscheme stat\n' + '\n'.join(outstr) + '\n'

def get_stat(vimrcs, total_vimrcs):
    # vimrcs is the flattened list of sanitized lines from all files
    outstr = []
    outstr.append("Most common vim config out of " + str(total_vimrcs) + " vimrc's\n")

    eigenvimrc = collections.Counter(vimrcs).most_common(80)
    for n, i in enumerate(eigenvimrc):
        outstr.append("%d. ```%s``` %.2f%%" % (n, i[0], i[1] * 100. / total_vimrcs))
    return '\n'.join(outstr), eigenvimrc

def sanitize_line(line):
    # strip leading and trailing whitespace
    sanitized_line = line.strip()
    # strip whitespace surrounding '=' operator
    sanitized_line = sanitized_line.replace(" = ", "=")
    # format short-hand keyword
    sanitized_line = util.keyword_reformat(sanitized_line)
    # strip trailing comment
    # detection is done by checking if the number of quotes
    # is odd
    if line.count('"') % 2 != 0:
        sanitized_line = sanitized_line[:sanitized_line.rfind('"')].strip()
    return sanitized_line

class pm_stat:  # pm stands for plugin manager
    def __init__(self):
        self.pms = {
            'pathogen': 0,
            'vundle': 0,
            'vam': 0,
            'neobundle': 0,
            'dein': 0,
            'plug': 0,
            'others or none': 0}

    def get_pm_type(self, text):
        # the first matching manager wins; each vimrc is counted once
        if 'call pathogen#infect' in text:
            self.pms['pathogen'] += 1
        elif 'call vundle#begin' in text:
            self.pms['vundle'] += 1
        elif 'call vam#ActivateAddons' in text:
            self.pms['vam'] += 1
        elif 'call neobundle#begin' in text:
            self.pms['neobundle'] += 1
        elif 'call dein#begin' in text:
            self.pms['dein'] += 1
        elif 'call plug#begin' in text:
            self.pms['plug'] += 1
        else:
            self.pms['others or none'] += 1

    def out(self):
        total = sum(self.pms.values())
        outstr = "\n# Plugin manager stat\n\n"
        # items() works on both Python 2 and 3 (iteritems() is Python 2 only)
        for k, v in self.pms.items():
            p = v * 100. / total
            outstr += "%s: %.2f%%\n" % (k, p)
        return outstr

tic = time.time()
# step 2: scrape vimrc's
if step2:
    print("downloading vimrc's")
    # compare by value; `is` on integers relies on interpreter caching
    if step2 == 1:
        data = json.load(open("github.json"))["items"]
    elif step2 == 2:
        data = []
        import csv
        with open("ghtorrent.csv") as csvfile:
            reader = csv.reader(csvfile, delimiter=',')
            next(reader, None)  # skip the header
            for row in reader:
                repo_name = row[0].replace("https://api.github.com/repos/", "")
                data.append({"full_name": repo_name})
    counter = 0
    offset = 0  # number of repositories to skip, e.g. when resuming

    def save_text(repo, request):
        with open(data_dir + '/' + repo['full_name'].replace('/', ''), 'wb') \
                as f:
            f.write(request.text.encode('utf8'))

    # essentially `mkdir -p data`
    if not os.path.isdir(data_dir):
        os.makedirs(data_dir)

    for i in data:
        # `>=` so that the first repository is not skipped when offset is 0
        if counter >= offset:
            r = requests.get(content_url + i['full_name'] + '/master/.vimrc')
            if r.status_code == 200:
                save_text(i, r)
            else:
                r2 = requests.get(content_url + i['full_name'] + '/master/vimrc')
                if r2.status_code == 200:
                    save_text(i, r2)

        counter += 1
        print("%d %.2f%%" % (counter, counter * 1. / len(data) * 100))

# step 3: process data
if step3:
    vimrcs = []
    total_vimrcs = 0
    pms = pm_stat()
    excluded_keywords = ['endfunction', 'endfunc', 'call', 'if ', 'else',
                         'endif', 'return', 'augroup', 'Bundle', 'execute',
                         '\\ }', '\\', 'endfun', 'endf', 'try', 'endtry',
                         'endwhile', 'catch', 'end', 'au!']

    filenames = os.listdir(data_dir)
    for vimrc in filenames:
        txt = open(data_dir + '/' + vimrc).read().split('\n')
        # skip files that are too short to be a real vimrc
        if len(txt) < 10:
            continue
        total_vimrcs += 1
        if total_vimrcs % 200 == 0:
            print("Processed %.2f%%" % (total_vimrcs * 100. / len(filenames)))
            sys.stdout.write('\033[F')

        pms.get_pm_type('\n'.join(txt))
        sanitized_txt = []
        for line in txt:
            if (not line.startswith('\"')) and \
                    not any(s in line for s in excluded_keywords):
                sanitized_line = sanitize_line(line)
                # only append when line is not empty
                if len(sanitized_line) > 0:
                    sanitized_txt.append(sanitized_line)
        vimrcs.append(sanitized_txt)

    # flatten the nested list
    vimrcs = [item for sublist in vimrcs for item in sublist]

    head = open("README.head.md").read()
    tail = open("README.tail.md").read()
    outstr, eigenvimrc = get_stat(vimrcs, total_vimrcs)
    with open("README.md", "w") as outfile:
        outfile.write(head + outstr + get_colorscheme_stat(vimrcs) + pms.out() + tail)

# step 4: generate eigenvimrc.vim
if step4:
    if not step3:
        print("this step depends on step #3, please enable step3")
        sys.exit()
    # essentially `mkdir -p plugin`
    if not os.path.isdir("plugin"):
        os.makedirs("plugin")
    # text mode, because str lines are written
    with open('plugin/eigenvimrc.vim', 'w') as f:
        for i in eigenvimrc:
            if i[1] * 100. / total_vimrcs >= 40:  # 40% most used
                f.write(i[0] + '\n')

# step 5: generate plot for analysis
if step5:
    if not step3:
        print("this step depends on step #3, please enable step3")
        sys.exit()
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt
    plt.style.use('ggplot')
    import pylab
    y = [i[1] for i in eigenvimrc]
    x = range(1, len(y) + 1)
    pylab.scatter(x, y, c='r')
    pylab.ylabel("Number of usage")

    # power law fit: a straight line in log-log space
    import scipy.optimize as optimize
    logx = pylab.log10(x)
    logy = pylab.log10(y)

    def fitfunc(p, x):
        return p[0] + p[1] * x

    errfunc = lambda p, x, y: (y - fitfunc(p, x)) ** 2
    powerlaw = lambda x, amp, index: amp * (x**index)
    pinit = [1.0, -1.0]
    out = optimize.leastsq(errfunc, pinit,
                           args=(logx, logy), full_output=1)
    pfinal = out[0]
    covar = out[1]
    index = pfinal[1]
    amp = 10.0**pfinal[0]
    print(covar)

    pylab.plot(x, powerlaw(x, amp, index))
    pylab.legend(['power law fit', 'data'])
    pylab.xlim([1, x[-1]])

    pylab.savefig('fig.png')

print('elapsed: %.2f' % (time.time() - tic))

--------------------------------------------------------------------------------
/fig.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rht/eigenvimrc/a01e4bb2a84d93f1588b21584d7e7dbbb87c2497/fig.png
--------------------------------------------------------------------------------
/plugin/eigenvimrc.vim:
--------------------------------------------------------------------------------
set nocompatible
syntax on
set number
set expandtab
set incsearch
set hlsearch
set laststatus=2
filetype plugin indent on
set ruler
set ignorecase
set autoindent
set showcmd
set backspace=indent,eol,start
set wildmenu
set background=dark

--------------------------------------------------------------------------------
/util.py:
--------------------------------------------------------------------------------
import re


def keyword_reformat(line):
    # expand abbreviated option names so that equivalent settings are
    # counted together
    formatted_line = line
    # TODO: complete this
    dic = {"nu": "number",
           "rnu": "relativenumber"}
    dic_with_params = {"ts": "tabstop",
                       "sts": "softtabstop"}
    dic_others = {"syn on": "syntax on"}

    # items() works on both Python 2 and 3 (iteritems() is Python 2 only)
    for i, j in dic.items():
        formatted_line = re.sub("set "+i+'$', "set "+j, formatted_line)

    for i, j in dic_with_params.items():
        formatted_line = re.sub("set "+i+'=', "set "+j+"=", formatted_line)

    for i, j in dic_others.items():
        formatted_line = re.sub(i, j, formatted_line)
    return formatted_line

--------------------------------------------------------------------------------
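Note on the counting in eigenvimrc.py: `get_stat` tallies lines from the flattened list of all files, so a setting that appears twice in one vimrc is counted twice, even though the percentage is taken over the number of vimrcs. A minimal sketch of a per-file variant follows, assuming each vimrc is already a list of sanitized lines. The helper `setting_share` and the `sample` data are hypothetical and not part of the repository.

```python
import collections


def setting_share(vimrcs, top=3):
    """Return (setting, percent) pairs, counting each setting once per vimrc.

    `vimrcs` is a list of vimrcs, each a list of sanitized lines
    (hypothetical input shape, chosen for this sketch).
    """
    counts = collections.Counter()
    for lines in vimrcs:
        # set() drops duplicates within a single vimrc
        counts.update(set(lines))
    total = len(vimrcs)
    return [(s, n * 100.0 / total) for s, n in counts.most_common(top)]


# made-up sample data, for illustration only
sample = [
    ["set number", "set number", "syntax on"],
    ["set number", "set hidden"],
    ["syntax on"],
    ["set number"],
]
for setting, pct in setting_share(sample):
    print("%s %.2f%%" % (setting, pct))
```

With this variant no setting can exceed 100%, which the line-based count does not guarantee when a setting is repeated within a file.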