├── .gitignore ├── README.md ├── data ├── README.md ├── crawl_graph.py ├── songs.py ├── sp_songs.py ├── spotipy.py └── trim_sp_songs.py ├── new-web ├── callback.html ├── deploy ├── dist │ ├── fonts │ │ ├── glyphicons-halflings-regular.eot │ │ ├── glyphicons-halflings-regular.svg │ │ ├── glyphicons-halflings-regular.ttf │ │ └── glyphicons-halflings-regular.woff │ ├── sp-bootstrap.css │ └── sp-bootstrap.min.css ├── dr-deploy ├── gindex.html ├── gstyles.css ├── images │ ├── frog.jpg │ ├── lz.png │ ├── missing-1.jpg │ ├── missing.png │ ├── pause.png │ └── play.png ├── index.html ├── lib │ └── underscore-min.js ├── ss.png └── styles.css ├── new_crawler ├── artist_graph.py ├── blacklist.csv ├── build_db.py ├── db.py ├── flask_server.py ├── foo.js ├── g2 │ ├── edges.js │ └── nodes.js ├── rdb.py ├── search.py ├── shell.py ├── sim_crawl.py ├── spotipy_util.py ├── start_simple_server └── stringutils.py ├── server ├── deploy ├── graph.py ├── sdeploy ├── server.py └── web.conf └── web ├── callback.html ├── deploy ├── dist ├── fonts │ ├── glyphicons-halflings-regular.eot │ ├── glyphicons-halflings-regular.svg │ ├── glyphicons-halflings-regular.ttf │ └── glyphicons-halflings-regular.woff ├── sp-bootstrap.css └── sp-bootstrap.min.css ├── dr-deploy ├── images ├── frog.jpg ├── lz.png ├── missing-1.jpg ├── missing.png ├── pause.png └── play.png ├── index.html ├── lib └── underscore-min.js ├── ss.png └── styles.css /.gitignore: -------------------------------------------------------------------------------- 1 | *.dat 2 | *.pyc 3 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Boil The Frog 2 | 3 | This is the source for a web app called BoilTheFrog that creates seamless 4 | Spotify playlists between any two artists. The app uses data from 5 | the Echo Nest and Spotify to create the playlists. 6 | 7 | The app is online at [Boil The Frog](http://static.echonest.com/BoilTheFrog) 8 | 9 |

10 | 11 | 12 | # The Server 13 | 14 | Boil the Frog has a server component that provides a web API used by the app. The web API has two main entry points: 15 | 16 | * find_path - finds a path between two artists 17 | * similar - shows the similar artists for an artist 18 | 19 | The server relies on pre-crawled artist similarity data from the Echo Nest and song data (including links to album art and audio previews) from Spotify. There are two python scripts that gather this data: 20 | 21 | * crawl_graph.py - crawls the artist similarity data. It takes 12 to 24 hours to crawl the data for about 100 to 150K artists. 22 | 23 | * sp_songs.py - crawls the top songs for each artist from Spotify. This takes about 12 hours to run. 24 | 25 | The output data from these two scripts are loaded by the server and used to build a graph (via networkx) that is used to satisfy the find_path requests. 26 | 27 | The server relies on cherrypy and networkx. 28 | 29 | 30 | # The Web App 31 | The web app is a relatively simple app that solicits artist names from the user, calls the find_path method on the server to get the path and displays the path to the user. The web audio api is used to manage playback. The playlist can be saved to Spotify if the user allows it. The authentication code is based on this [example](https://github.com/possan/playlistcreator-example). 32 | 33 | -------------------------------------------------------------------------------- /data/README.md: -------------------------------------------------------------------------------- 1 | # Boil The Frog Crawlers 2 | 3 | These are the tools for crawling the data for the Boil The Frog app. There are 3 Programs: 4 | 5 | * crawl_graph.py - this program crawls the similarity data. It will build a connected graph of artists based upon Echo Nest similarity 6 | * sp_songs.py - this program crawls the Spotify Web API for top song info (album art and preview urls) for each artist 7 | * trim_sp_songs - trims the song data to exactly match the artist data. 8 | 9 | ## How to use: 10 | 11 | ### Crawl the data: 12 | 13 | ` % python crawl_graph.py s0.dat ` 14 | 15 | This crawls the artist similarity graph. This can take 12 to 24 hours depending on the depth of the crawl. You can change the depth by changing min_hotttnesss. Lower values mean more artists. 16 | 17 | If the program crashes (i.e. network goes down or you close your laptop), you can restart from where you left off like so: 18 | 19 | ` % python crawl_graph.py s0.dat s1.dat` 20 | 21 | ### crawl the songs 22 | Get all the songs for the artist data with the command 23 | 24 | ` %python sp_songs.py s0.dat` 25 | 26 | You can restart this if/when it crashes. It will recover from where it left off. This creates a file called song-data.dat 27 | 28 | song-data.dat grows to include songs for all artists, but you may want to trim the songs to exactly match a particular artist graph. use 29 | 30 | ` %python trim_sp_songs.py s0.dat` 31 | 32 | to create a file called trimmed-song-data.dat with songs from song-data.dat that match the artists in s0.dat. 33 | 34 | ### Typical sequence 35 | 36 | ` % python crawl_graph s0.dat 37 | % python crawl_graph s0.dat s1.dat 38 | % python crawl_graph s1.dat s2.dat 39 | 40 | % python sp_songs.py s2.dat 41 | % python sp_songs.py s2.dat 42 | 43 | % python trim_sp_songs.py s2.dat 44 | 45 | % cp s2.dat ../server/full_spotify.dat 46 | % cp trimmed-song-data.dat ../server/spotify_songs.dat` 47 | 48 | 49 | -------------------------------------------------------------------------------- /data/crawl_graph.py: -------------------------------------------------------------------------------- 1 | import sys 2 | reload(sys) 3 | sys.setdefaultencoding("utf-8") 4 | 5 | import os 6 | import urllib2 7 | import time 8 | import codecs 9 | 10 | from pyechonest import artist, config, util 11 | config.TRACE_API_CALLS=False 12 | 13 | 14 | queue = [] 15 | sim_done = set() 16 | S = '' 17 | RS = ' ' 18 | min_hotttnesss = .4 19 | idspace = 'spotify-WW' 20 | idspace = 'spotifyv2-ZZ' 21 | 22 | 23 | def get_hottest(): 24 | maxh = 0 25 | for i in xrange(len(queue)): 26 | hot = queue[i][0] 27 | if hot > maxh: 28 | best = i 29 | maxh = hot 30 | return queue.pop(best) 31 | 32 | def enc(s): 33 | return s 34 | # return s.encode('UTF-8') 35 | 36 | def short_fid(fid): 37 | return fid.split(':')[-1] 38 | 39 | def crawl(): 40 | while len(queue) > 0: 41 | hot, id, fid, name = get_hottest() 42 | sfid = short_fid(fid) 43 | if id in sim_done: 44 | continue 45 | print "%4d %4d %.3f %s" % (len(sim_done), len(queue), hot, name) 46 | 47 | print >> outfile, 'artist', S, id, S, name, S, sfid, S, hot 48 | try: 49 | sims = artist.similar(ids=[id], results=12, buckets=['hotttnesss', 'familiarity', 'id:' + idspace], limit=True, min_hotttnesss=min_hotttnesss) 50 | for s in sims: 51 | proc_artist(s) 52 | for s in sims: 53 | simfid = s.get_foreign_id(idspace=idspace) 54 | ssimfid = short_fid(simfid) 55 | try: 56 | print >>outfile, 'sim', S, id, S, name, S, s.id, S, s.name, S, s.hotttnesss, S, ssimfid 57 | if s.id not in sim_done: 58 | queue.append( (s.hotttnesss, s.id, simfid, s.name) ) 59 | except UnicodeDecodeError: 60 | print "skipping artist because of unicode error", s.id 61 | sim_done.add(id) 62 | except urllib2.URLError: 63 | time.sleep(10) 64 | except util.EchoNestAPIError: 65 | time.sleep(5) 66 | 67 | 68 | def proc_artist(a): 69 | if a.id not in sim_done: 70 | fid = a.get_foreign_id(idspace=idspace) 71 | 72 | def queue_by_name(name): 73 | results = artist.search(name, buckets=['hotttnesss', 'familiarity', 'id:' + idspace], results=1, limit=True) 74 | for r in results: 75 | fid = r.get_foreign_id(idspace=idspace) 76 | queue.append( (r.hotttnesss, r.id, fid, r.name) ) 77 | 78 | 79 | def load(path): 80 | tqueue = [] 81 | for line in open(path): 82 | line = line.strip() 83 | print >> outfile, line 84 | fields = line.split(RS) 85 | if fields[0] == 'artist': 86 | sim_done.add(fields[1]) 87 | name = fields[2] 88 | hot = float(fields[4]) 89 | print "%4d %4d %.3f %s" % (len(sim_done), len(queue), hot, name) 90 | elif fields[0] == 'sim': 91 | id = fields[3] 92 | name = fields[4] 93 | hot = float(fields[5]) 94 | fid = fields[6] 95 | if not id in sim_done: 96 | tqueue.append( (hot, id, fid, name) ) 97 | 98 | for fields in tqueue: 99 | if not fields[1] in sim_done: 100 | queue.append(fields) 101 | 102 | 103 | if __name__ == '__main__': 104 | outpath = sys.argv[-1] 105 | if not os.path.exists(outpath): 106 | outfile = open(outpath, 'w') 107 | if len(sys.argv) == 3: 108 | load(sys.argv[-2]) 109 | else: 110 | queue_by_name('weezer') 111 | queue_by_name('lady gaga') 112 | queue_by_name('miles davis') 113 | queue_by_name('Led Zeppelin') 114 | queue_by_name('Explosions in the sky') 115 | queue_by_name('kanye west') 116 | crawl() 117 | else: 118 | print "won't override", outpath 119 | 120 | -------------------------------------------------------------------------------- /data/songs.py: -------------------------------------------------------------------------------- 1 | import requests 2 | import pprint 3 | import simplejson as json 4 | import atexit 5 | import os 6 | import sys 7 | 8 | ECHO_NEST_API_KEY='EHY4JJEGIOFA1RCJP' 9 | 10 | playlist = 'http://developer.echonest.com/api/v4/playlist/static?api_key=%s&artist_id=%s&results=10&type=artist&bucket=id:rdio-US&bucket=tracks&limit=true&bucket=audio_summary' 11 | 12 | search = 'http://developer.echonest.com/api/v4/song/search?api_key=%s&artist_id=%s&results=10&sort=song_hotttnesss-desc&bucket=id:rdio-US&bucket=tracks&limit=true&bucket=audio_summary' 13 | 14 | path = 'song-data.dat' 15 | 16 | def fetch_songs(id): 17 | url = playlist % (ECHO_NEST_API_KEY, id) 18 | r = requests.get(url) 19 | results = None 20 | if r.status_code == 200: 21 | results = json.loads(r.text) 22 | # pprint.pprint(results) 23 | results = results['response']['songs'] 24 | return results 25 | 26 | def fetch_songs2(id): 27 | url = search % (ECHO_NEST_API_KEY, id) 28 | r = requests.get(url) 29 | results = None 30 | if r.status_code == 200: 31 | results = json.loads(r.text) 32 | # pprint.pprint(results) 33 | results = results['response']['songs'] 34 | return results 35 | 36 | def fetch_rdio_tracks(id): 37 | songs = fetch_songs(id) 38 | if songs == None or len(songs) == 0: 39 | songs = fetch_songs2(id) 40 | all = [] 41 | for s in songs: 42 | rs = {} 43 | rs['energy'] = s['audio_summary']['energy'] 44 | rs['title'] = s['title'] 45 | rs['id'] = s['id'] 46 | rdio_id = s['tracks'][0]['foreign_id'] 47 | rdio_id = rdio_id.replace('rdio-US:track:', '') 48 | rs['rdio_id'] = rdio_id 49 | all.append(rs) 50 | return all 51 | 52 | hash = {} 53 | dirty = False 54 | 55 | def flush_hash(): 56 | global dirty 57 | if dirty: 58 | dirty = False 59 | print 'flushing hash' 60 | out = open(path, 'w') 61 | shash = json.dumps(hash) 62 | print >> out, shash 63 | out.close() 64 | 65 | def load_hash(): 66 | hash = {} 67 | if os.path.exists(path): 68 | file = open(path) 69 | shash = file.read() 70 | hash = json.loads(shash) 71 | file.close() 72 | return hash 73 | 74 | 75 | def load_graph(path): 76 | RS = ' ' 77 | artists = [] 78 | seen = set() 79 | 80 | for i, line in enumerate(open(path)): 81 | fields = line.strip().split(RS) 82 | if i % 100000 == 0: 83 | print i, fields[2] 84 | if fields[0] == 'artist': 85 | artist = { 'id' : fields[1], 'name' : fields[2], 'sid' : fields[3], 'hot': float(fields[4]) } 86 | elif fields[0] == 'sim': 87 | artist = { 'id' : fields[3], 'name' : fields[4], 'sid' : fields[6], 'hot': float(fields[5]) } 88 | if not artist['id'] in seen: 89 | artists.append(artist) 90 | seen.add(artist['id']) 91 | artists.sort(reverse=True, key=lambda a: a['hot']) 92 | return artists 93 | 94 | def crawl_songs(path): 95 | global hash, dirty 96 | 97 | artists = load_graph(path) 98 | 99 | hash = load_hash() 100 | atexit.register(flush_hash) 101 | 102 | total = len(artists) 103 | for i, a in enumerate(artists): 104 | id = a['id'] 105 | if not id in hash: 106 | print "%d/%d %.2f %s" % (i, total, a['hot'], a['name']) 107 | songs = fetch_rdio_tracks(id) 108 | hash[id] = songs 109 | dirty = True 110 | if i % 1000 == 0: 111 | flush_hash() 112 | 113 | if __name__ == '__main__': 114 | crawl_songs(sys.argv[1]) 115 | -------------------------------------------------------------------------------- /data/sp_songs.py: -------------------------------------------------------------------------------- 1 | import requests 2 | import pprint 3 | import simplejson as json 4 | import atexit 5 | import os 6 | import sys 7 | 8 | import spotipy 9 | 10 | 11 | spotify = spotipy.Spotify() 12 | 13 | ECHO_NEST_API_KEY='EHY4JJEGIOFA1RCJP' 14 | 15 | playlist = 'http://developer.echonest.com/api/v4/playlist/static?api_key=%s&artist_id=%s&results=10&type=artist&bucket=id:rdio-US&bucket=tracks&limit=true&bucket=audio_summary' 16 | 17 | search = 'http://developer.echonest.com/api/v4/song/search?api_key=%s&artist_id=%s&results=10&sort=song_hotttnesss-desc&bucket=id:rdio-US&bucket=tracks&limit=true&bucket=audio_summary' 18 | 19 | path = 'song-data.dat' 20 | 21 | def fetch_songs(id): 22 | url = playlist % (ECHO_NEST_API_KEY, id) 23 | r = requests.get(url) 24 | results = None 25 | if r.status_code == 200: 26 | results = json.loads(r.text) 27 | # pprint.pprint(results) 28 | results = results['response']['songs'] 29 | return results 30 | 31 | def fetch_songs2(id): 32 | url = search % (ECHO_NEST_API_KEY, id) 33 | r = requests.get(url) 34 | results = None 35 | if r.status_code == 200: 36 | results = json.loads(r.text) 37 | # pprint.pprint(results) 38 | results = results['response']['songs'] 39 | return results 40 | 41 | def fetch_rdio_tracks(id): 42 | songs = fetch_songs(id) 43 | if songs == None or len(songs) == 0: 44 | songs = fetch_songs2(id) 45 | all = [] 46 | for s in songs: 47 | rs = {} 48 | rs['energy'] = s['audio_summary']['energy'] 49 | rs['title'] = s['title'] 50 | rs['id'] = s['id'] 51 | rdio_id = s['tracks'][0]['foreign_id'] 52 | rdio_id = rdio_id.replace('rdio-US:track:', '') 53 | rs['rdio_id'] = rdio_id 54 | all.append(rs) 55 | return all 56 | 57 | 58 | def get_best_album_art(images, min_width): 59 | best = images[0] 60 | for image in images: 61 | if image['width'] < min_width: 62 | break 63 | best = image 64 | return best['url']; 65 | 66 | def get_simple_id(sid): 67 | fields = sid.split(':') 68 | if len(fields) == 3: 69 | return fields[2] 70 | else: 71 | return sid 72 | 73 | def fetch_spotify_tracks(id): 74 | max_tracks_per_artist = 5 75 | response = spotify.artist_top_tracks(id) 76 | try: 77 | tracks = response['tracks'] 78 | except spotify.SpotifyException: 79 | tracks = [] 80 | print 'exception for', id 81 | 82 | if tracks == None or len(tracks) == 0: 83 | print 'missing tracks for', id 84 | all = [] 85 | for track in tracks: 86 | rs = {} 87 | rs['energy'] = .5 88 | rs['title'] = track['name'] 89 | rs['id'] = track['id'] 90 | 91 | if 'preview_url' not in track: 92 | continue 93 | 94 | if 'album' not in track or 'images' not in track['album']: 95 | continue 96 | 97 | images = track['album']['images'] 98 | if len(images) == 0: 99 | continue 100 | 101 | rs['audio'] = track['preview_url'] 102 | rs['album_art'] = get_best_album_art(images, 250) 103 | 104 | all.append(rs) 105 | if len(all) >= max_tracks_per_artist: 106 | break 107 | 108 | return all 109 | 110 | hash = {} 111 | dirty = False 112 | 113 | def flush_hash(): 114 | global dirty 115 | if dirty: 116 | dirty = False 117 | print 'flushing hash' 118 | out = open(path, 'w') 119 | shash = json.dumps(hash) 120 | print >> out, shash 121 | out.close() 122 | 123 | def load_hash(): 124 | hash = {} 125 | if os.path.exists(path): 126 | file = open(path) 127 | shash = file.read() 128 | hash = json.loads(shash) 129 | file.close() 130 | return hash 131 | 132 | # artist ARX6TAQ11C8A415850 Lady Gaga 1HY2Jd0NmPuamShAr6KMms 0.839888 133 | # sim ARX6TAQ11C8A415850 Lady Gaga ARORMBJ1241B9CDB1A Ke$ha 0.749993 6LqNN22kT3074XbTVUrhzX 134 | def load_graph(path): 135 | RS = ' ' 136 | artists = [] 137 | seen = set() 138 | 139 | for i, line in enumerate(open(path)): 140 | fields = line.strip().split(RS) 141 | if i % 100000 == 0: 142 | print i, fields[2] 143 | if fields[0] == 'artist' and len(fields) > 4: 144 | artist = { 'id' : fields[1], 'name' : fields[2], 'sid' : fields[3], 'hot': float(fields[4]) } 145 | elif fields[0] == 'sim' and len(fields) > 6: 146 | artist = { 'id' : fields[3], 'name' : fields[4], 'sid' : fields[6], 'hot': float(fields[5]) } 147 | else: 148 | continue 149 | 150 | if not artist['id'] in seen: 151 | artists.append(artist) 152 | seen.add(artist['id']) 153 | 154 | artists.sort(reverse=True, key=lambda a: a['hot']) 155 | return artists 156 | 157 | def crawl_songs(path): 158 | global hash, dirty 159 | 160 | artists = load_graph(path) 161 | 162 | hash = load_hash() 163 | atexit.register(flush_hash) 164 | 165 | total = len(artists) 166 | for i, a in enumerate(artists): 167 | sid = a['sid'] 168 | sid = get_simple_id(sid) 169 | if not sid in hash: 170 | print "%d/%d %.2f %s" % (i, total, a['hot'], a['name']) 171 | songs = fetch_spotify_tracks(sid) 172 | hash[sid] = songs 173 | dirty = True 174 | if i % 1000 == 0: 175 | flush_hash() 176 | 177 | if __name__ == '__main__': 178 | crawl_songs(sys.argv[1]) 179 | -------------------------------------------------------------------------------- /data/spotipy.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import print_function 4 | 5 | import base64 6 | import requests 7 | import simplejson as json 8 | 9 | __all__ = ['oauth2'] 10 | 11 | ''' A simple and thin Python library for the Spotify Web API 12 | ''' 13 | 14 | class SpotifyException(Exception): 15 | def __init__(self, http_status, code, msg): 16 | self.http_status = http_status 17 | self.code = code 18 | self.msg = msg 19 | 20 | def __str__(self): 21 | return u'http status: {0}, code:{1} - {2}'.format( 22 | self.http_status, self.code, self.msg) 23 | 24 | 25 | class Spotify(object): 26 | ''' 27 | Example usage: 28 | 29 | import spotipy 30 | 31 | urn = 'spotify:artist:3jOstUTkEu2JkjvRdBA5Gu' 32 | sp = spotipy.Spotify() 33 | 34 | sp.trace = True # turn on tracing 35 | 36 | artist = sp.artist(urn) 37 | print(artist) 38 | 39 | user = sp.user('plamere') 40 | print(user) 41 | ''' 42 | 43 | trace = False 44 | _auth = None 45 | 46 | def __init__(self, auth=None): 47 | self.prefix = 'https://api.spotify.com/v1/' 48 | self._auth = auth 49 | 50 | def _auth_headers(self): 51 | if self._auth: 52 | return {'Authorization': 'Bearer {0}'.format(self._auth)} 53 | else: 54 | return None 55 | 56 | def _internal_call(self, verb, method, params): 57 | if not method.startswith('http'): 58 | url = self.prefix + method 59 | else: 60 | url = method 61 | args = dict(params=params) 62 | headers = self._auth_headers() 63 | r = requests.request(verb, url, headers=headers, **args) 64 | if self.trace: 65 | print() 66 | print(verb, r.url) 67 | if r.status_code != 200: 68 | raise SpotifyException(r.status_code, -1, u'the requested resource could not be found: ' + r.url) 69 | results = r.json() 70 | if self.trace: 71 | print('RESP', results) 72 | print() 73 | return results 74 | 75 | def get(self, method, args=None, **kwargs): 76 | if args: 77 | kwargs.update(args) 78 | return self._internal_call('GET', method, kwargs) 79 | 80 | def post(self, method, payload=None, **kwargs): 81 | args = dict(params=kwargs) 82 | if not method.startswith('http'): 83 | url = self.prefix + method 84 | else: 85 | url = method 86 | headers = self._auth_headers() 87 | headers['Content-Type'] = 'application/json' 88 | print('headers', headers) 89 | if payload: 90 | r = requests.post(url, headers=headers, data=json.dumps(payload), **args) 91 | else: 92 | r = requests.post(url, headers=headers, **args) 93 | if self.trace: 94 | print() 95 | print("POST", r.url) 96 | print("DATA", json.dumps(payload)) 97 | if r.status_code != 200: 98 | raise SpotifyException(r.status_code, -1, u'the requested resource could not be found: ' + r.url) 99 | results = r.json() 100 | if self.trace: 101 | print('RESP', results) 102 | print() 103 | return results 104 | 105 | def next(self, result): 106 | ''' returns the next result given a result 107 | ''' 108 | if result['next']: 109 | return self.get(result['next']) 110 | else: 111 | return None 112 | 113 | def previous(self, result): 114 | ''' returns the previous result given a result 115 | ''' 116 | if result['previous']: 117 | return self.get(result['previous']) 118 | else: 119 | return None 120 | 121 | def _warn(self, msg): 122 | print('warning:' + msg, file=sys.stderr) 123 | 124 | def track(self, track_id): 125 | ''' returns a single track given the track's ID, URN or URL 126 | ''' 127 | 128 | trid = self._get_id('track', track_id) 129 | return self.get('tracks/' + trid) 130 | 131 | def tracks(self, tracks): 132 | ''' returns a list of tracks given the track IDs, URNs, or URLs 133 | ''' 134 | 135 | tlist = [self._get_id('track', t) for t in tracks] 136 | return self.get('tracks/?ids=' + ','.join(tlist)) 137 | 138 | def artist(self, artist_id): 139 | ''' returns a single artist given the artist's ID, URN or URL 140 | ''' 141 | 142 | trid = self._get_id('artist', artist_id) 143 | return self.get('artists/' + trid) 144 | 145 | 146 | def artists(self, artists): 147 | ''' returns a list of artists given the artist IDs, URNs, or URLs 148 | ''' 149 | 150 | tlist = [self._get_id('artist', a) for a in artists] 151 | return self.get('artists/?ids=' + ','.join(tlist)) 152 | 153 | def artist_albums(self, artist_id, album_type=None, limit=20, offset=0): 154 | ''' Get Spotify catalog information about an artist’s albums 155 | ''' 156 | 157 | trid = self._get_id('artist', artist_id) 158 | return self.get('artists/' + trid + '/albums', album_type=album_type, limit=limit, offset=offset) 159 | 160 | def artist_top_tracks(self, artist_id, country='US'): 161 | ''' Get Spotify catalog information about an artist’s top 10 tracks by country. 162 | ''' 163 | 164 | trid = self._get_id('artist', artist_id) 165 | return self.get('artists/' + trid + '/top-tracks', country=country) 166 | 167 | def album(self, album_id): 168 | ''' returns a single album given the album's ID, URN or URL 169 | ''' 170 | 171 | trid = self._get_id('album', album_id) 172 | return self.get('albums/' + trid) 173 | 174 | def album_tracks(self, album_id): 175 | ''' Get Spotify catalog information about an album’s tracks 176 | ''' 177 | 178 | trid = self._get_id('album', album_id) 179 | return self.get('albums/' + trid + '/tracks/') 180 | 181 | def albums(self, albums): 182 | ''' returns a list of albums given the album IDs, URNs, or URLs 183 | ''' 184 | 185 | tlist = [self._get_id('album', a) for a in albums] 186 | return self.get('albums/?ids=' + ','.join(tlist)) 187 | 188 | def search(self, q, limit=10, offset=0, type='track'): 189 | ''' searches for an item 190 | ''' 191 | return self.get('search', q=q, limit=limit, offset=offset, type=type) 192 | 193 | def user(self, user_id): 194 | ''' Gets basic profile information about a Spotify User 195 | ''' 196 | return self.get('users/' + user_id) 197 | 198 | def user_playlists(self, user): 199 | ''' Gets playlists of a user 200 | ''' 201 | return self.get("users/%s/playlists" % user) 202 | 203 | def user_playlist(self, user, playlist_id, fields=None): 204 | ''' Gets playlist of a user 205 | ''' 206 | return self.get("users/%s/playlists/%s" % (user, playlist_id), fields=fields) 207 | 208 | def user_playlist_create(self, user, name, public=True): 209 | ''' Creates a playlist for a user 210 | ''' 211 | data = {'name':name, 'public':True } 212 | return self.post("users/%s/playlists" % (user,), payload = data) 213 | 214 | def user_playlist_add_tracks(self, user, playlist_id, uris, position=None): 215 | ''' Adds tracks to a playlist 216 | ''' 217 | data = {'uris':uris} 218 | uri_param = ','.join(uris) 219 | # return self.post("users/%s/playlists/%s/tracks" % (user,playlist_id), payload = data, position=position) 220 | return self.post("users/%s/playlists/%s/tracks" % (user,playlist_id), uris=uri_param, position=position) 221 | 222 | def me(self): 223 | ''' returns info about me 224 | ''' 225 | return self.get('me/') 226 | 227 | def _get_id(self, type, id): 228 | fields = id.split(':') 229 | if len(fields) == 3: 230 | if type != fields[1]: 231 | self._warn('expected id of type ' + type + ' but found type ' + fields[2] + " " + id) 232 | return fields[2] 233 | fields = id.split('/') 234 | if len(fields) >= 3: 235 | itype = fields[-2] 236 | if type != itype: 237 | self._warn('expected id of type ' + type + ' but found type ' + itype + " " + id) 238 | return fields[-1] 239 | return id 240 | -------------------------------------------------------------------------------- /data/trim_sp_songs.py: -------------------------------------------------------------------------------- 1 | import requests 2 | import pprint 3 | import simplejson as json 4 | import atexit 5 | import os 6 | import sys 7 | 8 | path = 'song-data.dat' 9 | opath = 'trimmed-song-data.dat' 10 | 11 | 12 | hash = {} 13 | 14 | def flush_hash(hash): 15 | print 'flushing hash' 16 | out = open(opath, 'w') 17 | shash = json.dumps(hash) 18 | print >> out, shash 19 | out.close() 20 | 21 | def load_hash(): 22 | hash = {} 23 | if os.path.exists(path): 24 | file = open(path) 25 | shash = file.read() 26 | hash = json.loads(shash) 27 | file.close() 28 | return hash 29 | 30 | def load_graph(path): 31 | RS = ' ' 32 | artists = [] 33 | seen = set() 34 | 35 | for i, line in enumerate(open(path)): 36 | fields = line.strip().split(RS) 37 | if i % 100000 == 0: 38 | print i, fields[2] 39 | if fields[0] == 'artist' and len(fields) > 4: 40 | artist = { 'id' : fields[1], 'name' : fields[2], 'sid' : fields[3], 'hot': float(fields[4]) } 41 | elif fields[0] == 'sim' and len(fields) > 6: 42 | artist = { 'id' : fields[3], 'name' : fields[4], 'sid' : fields[6], 'hot': float(fields[5]) } 43 | else: 44 | continue 45 | 46 | if not artist['id'] in seen: 47 | artists.append(artist) 48 | seen.add(artist['id']) 49 | 50 | artists.sort(reverse=True, key=lambda a: a['hot']) 51 | print 'total artists', len(artists) 52 | return artists 53 | 54 | def trim_songs(path): 55 | artists = load_graph(path) 56 | hash = load_hash() 57 | hash2 = {} 58 | 59 | total = len(artists) 60 | for i, a in enumerate(artists): 61 | sid = a['sid'] 62 | if sid in hash: 63 | hash2[sid] = hash[sid] 64 | flush_hash(hash2) 65 | print 'total artists with songs', len(hash2) 66 | 67 | if __name__ == '__main__': 68 | trim_songs(sys.argv[1]) 69 | -------------------------------------------------------------------------------- /new-web/callback.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 121 | 142 | 143 | 144 |

145 |

146 |

147 |

Creating playlist...

148 |

149 | 158 | 162 |

163 |

164 | 165 | 166 | 167 | -------------------------------------------------------------------------------- /new-web/deploy: -------------------------------------------------------------------------------- 1 | s3cmd sync --acl-public * s3://static.echonest.com/BoilTheFrog/ 2 | -------------------------------------------------------------------------------- /new-web/dist/fonts/glyphicons-halflings-regular.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/dist/fonts/glyphicons-halflings-regular.eot -------------------------------------------------------------------------------- /new-web/dist/fonts/glyphicons-halflings-regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/dist/fonts/glyphicons-halflings-regular.ttf -------------------------------------------------------------------------------- /new-web/dist/fonts/glyphicons-halflings-regular.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/dist/fonts/glyphicons-halflings-regular.woff -------------------------------------------------------------------------------- /new-web/dr-deploy: -------------------------------------------------------------------------------- 1 | s3cmd sync --dry-run --acl-public * s3://static.echonest.com/BoilTheFrog/ 2 | -------------------------------------------------------------------------------- /new-web/gindex.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Boil the Frog 5 | 6 | 7 | 8 | 9 | 10 | 11 | 25 | 26 | 37 | 38 | 39 | 40 | 55 | 56 | 57 |

58 |

59 |

Boil the Frog

60 |

Create a seamless playlist between any two artists

61 |

62 | 63 | 64 | 65 | 66 | 67 |

68 |

69 |

70 | 71 | 72 | 74 | 75 | 76 |

77 |

78 |

79 | 80 |

81 |

82 |

Boil the Frog Gallery

83 | 84 |
85 |

86 |

87 | Some of the most popular artist paths made with Boil The Frog: 88 |

89 |

90 |

Miley Cyrus to Miles Davis 92 |
Justin Bieber to Jimi Hendrix 93 |
Patti Smith to the Smiths 94 |
Elvis to Elvis 95 |
The Carter Family to Rammstein 96 |
Kanye West to Taylor Swift 97 |
Cage the Elephant to John Cage 98 |
Ryan Adams to Bryan Adams 99 |
Lorde to God is an Astronaut 100 | 104 |

105 |

106 |

107 | 108 |

109 |

110 |

About Boil the Frog

111 | 112 |
113 |

114 | boiling frog 115 | Boil the Frog lets you create a playlist of tracks that gradually takes you from one music style 116 | to another. It's like the proverbial frog in the pot of water. If you heat up the pot slowly enough, the 117 | frog will never notice that he's being made into a stew and jump out of the pot. With a Boil the 118 | frog playlist you can do the same, but with music. You can generate a playlist that will take the 119 | listener from one style of music to the other, without the listener ever noticing that they are being made 120 | into a stew. 121 |

122 |
123 | 124 |

How does it work?

125 |

126 | To create a Boil The Frog playlist, just type in the names of two artists and a playlist will be 127 | generated that takes you gradually, step by step, from the first artist to the second artist. You can 128 | click on any track to hear the track. Click on the first track to hear the whole playlist. If you don't 129 | like a particular artist you can route around that particular artist by clicking the 'bypass' button. 130 | The 'New Track' button will select a different track for an artist. 131 |

132 |

133 | Boil the Frog plays 30 second versions of your tracks. When you find a playlist you like you can save it 134 | to Spotify to listen to the full-length versions. 135 |

136 | 137 |

Here are some examples

138 |

147 | 148 |

How does it really work?

149 | 150 |

151 | To create this app, The Echo Nest artist similarity info is 152 | used to build an artist similarity graph of about 153 | 100,000 of the most popular artists. Each artist in the graph is connected to it's most similar neighbors 154 | according to the Echo Nest artist similarity algorithm. 155 |

156 | image graph 157 |
158 |

159 | When a playlist between two artists is created, the graph is used to find the path between the two artists. 160 | The path isn't necessarily the shortest path through the graph. Instead, priority is given to paths that 161 | travel through artists of similar popularity. If you start and end with a popular artist, you are more 162 | likely to find a path that takes you though other popular artists, and if you start with a long-tail artist 163 | you will likely find a path through other long-tail artists. 164 |

165 | Once the path of artists is found, we need to select the best tracks for the playlist. To do this, we pick 166 | a well-known track for each artist that minimizes the difference in energy between this track, the previous 167 | track and the next track. 168 |

169 | Once we have selected the best tracks, we build a playlist using Spotify's nifty web api. 170 | 171 | 172 |

Who made this?

173 |

174 | This app was built by Paul Lamere. If you like this sort of 175 | thing you may be interested in my blog at Music Machinery. 176 |

177 |

178 |

179 | 180 |

181 |

182 |
183 |

Path created in 10 ms.

184 |

185 | 186 | 195 | 196 | 197 | 198 | 840 | 841 | 842 | 843 | 844 | 859 | 860 | 861 | 862 | -------------------------------------------------------------------------------- /new-web/gstyles.css: -------------------------------------------------------------------------------- 1 | .non-hero-unit { 2 | margin-top:66px; 3 | padding-bottom:30px; 4 | color: #fff; 5 | background-color: #84BD00; 6 | text-align:center; 7 | } 8 | 9 | .hero-unit h1 { 10 | font-size:64px; 11 | } 12 | 13 | #gallery h1 { 14 | color: #84BD00; 15 | } 16 | 17 | #about h1 { 18 | color: #84BD00; 19 | } 20 | 21 | #about h2 { 22 | color: #84BD00; 23 | } 24 | 25 | 26 | .reg-unit { 27 | color:white; 28 | margin-top:66px; 29 | /*background-color: #84BD00;*/ 30 | background-color: #6F7073; 31 | } 32 | 33 | .navbar-nav li a { 34 | cursor:pointer; 35 | } 36 | 37 | .artist:hover { 38 | cursor:pointer; 39 | } 40 | 41 | #options li a:hover { 42 | background-color: #727272; 43 | } 44 | 45 | .option-active { 46 | /*background-color: #6f7073;*/ 47 | color: #84BD00 !important; 48 | } 49 | 50 | /* 51 | .reg-unit a { 52 | color:red; 53 | } 54 | 55 | .reg-unit a:hover { 56 | color:red; 57 | } 58 | 59 | .reg-unit a:visited { 60 | color:orange; 61 | } 62 | */ 63 | 64 | 65 | 66 | #gallery { 67 | display:none; 68 | } 69 | 70 | #time-info { 71 | display:none; 72 | } 73 | 74 | .gallery-list { 75 | font-size:24px; 76 | } 77 | 78 | #about { 79 | display:none; 80 | } 81 | 82 | 83 | 84 | #main { 85 | margin-top:20px; 86 | margin-left:8px; 87 | } 88 | 89 | .adiv { 90 | width:294px; 91 | height:344px; 92 | background-size:100%; 93 | background-repeat:no-repeat; 94 | overflow:hidden; 95 | position:relative; 96 | /*background-color: #122;*/ 97 | margin-bottom:2px; 98 | padding:4px; 99 | background-color:#ddd; 100 | } 101 | 102 | #go { 103 | } 104 | 105 | #xbuttons { 106 | width:100%; 107 | margin-left:auto; 108 | margin-right:auto; 109 | text-align:center; 110 | display:none; 111 | } 112 | 113 | #tweet-span { 114 | position:relative; 115 | top:10px; 116 | } 117 | 118 | .tadiv { 119 | display:inline-block; 120 | width:310px; 121 | height:364px; 122 | position:relative; 123 | /*background-color: #122;*/ 124 | margin:3px; 125 | } 126 | 127 | 128 | .is-current { 129 | background-color:#e36b23; 130 | font-size:18px; 131 | } 132 | 133 | #list { 134 | margin-left:4px; 135 | text-align:center; 136 | } 137 | 138 | .adiv:hover { 139 | background-color: #84BD00; 140 | } 141 | 142 | #info { 143 | margin-left:10px; 144 | width:100%; 145 | text-align:center; 146 | margin-bottom:10px; 147 | height:32px; 148 | font-size:18px; 149 | } 150 | 151 | 152 | .track-info { 153 | position:absolute; 154 | bottom: 4px; 155 | margin-left:6px; 156 | margin-right:6px; 157 | line-height:18px; 158 | font-size:14px; 159 | height:40px; 160 | overflow:hidden; 161 | } 162 | 163 | .playbutton { 164 | width:100px; 165 | position:absolute; 166 | top:100px; 167 | left:100px; 168 | } 169 | 170 | .buttons { 171 | opacity:.8; 172 | } 173 | 174 | #footer { 175 | margin-top:10px; 176 | margin-left:20px; 177 | margin-bottom:20px; 178 | } 179 | 180 | 181 | 182 | .album-label { 183 | overflow:hidden; 184 | height:18px; 185 | text-overflow:ellipsis; 186 | width:280px; 187 | white-space:nowrap; 188 | } 189 | 190 | .change { 191 | float:left; 192 | } 193 | 194 | .bypass { 195 | float:right; 196 | } 197 | 198 | #search { 199 | margin-bottom: 20px; 200 | } 201 | 202 | #search input { 203 | color:black; 204 | } 205 | 206 | .faq { 207 | margin-right:10px; 208 | } 209 | 210 | #frog-image { 211 | margin-right:15px; 212 | margin-top:5px; 213 | float:left; 214 | border-radius:15px; 215 | 216 | } 217 | 218 | #lz-graph { 219 | float:right; 220 | border-radius:15px; 221 | margin-left:10px; 222 | margin-bottom:20px; 223 | } 224 | 225 | #time-info { 226 | margin-right:20px; 227 | font-size:8px; 228 | } 229 | 230 | .empty-link { 231 | cursor:pointer; 232 | } 233 | -------------------------------------------------------------------------------- /new-web/images/frog.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/images/frog.jpg -------------------------------------------------------------------------------- /new-web/images/lz.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/images/lz.png -------------------------------------------------------------------------------- /new-web/images/missing-1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/images/missing-1.jpg -------------------------------------------------------------------------------- /new-web/images/missing.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/images/missing.png -------------------------------------------------------------------------------- /new-web/images/pause.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/images/pause.png -------------------------------------------------------------------------------- /new-web/images/play.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/images/play.png -------------------------------------------------------------------------------- /new-web/lib/underscore-min.js: -------------------------------------------------------------------------------- 1 | (function(){var n=this,t=n._,r={},e=Array.prototype,u=Object.prototype,i=Function.prototype,a=e.push,o=e.slice,c=e.concat,l=u.toString,f=u.hasOwnProperty,s=e.forEach,p=e.map,v=e.reduce,h=e.reduceRight,g=e.filter,d=e.every,m=e.some,y=e.indexOf,b=e.lastIndexOf,x=Array.isArray,_=Object.keys,j=i.bind,w=function(n){return n instanceof w?n:this instanceof w?(this._wrapped=n,void 0):new w(n)};"undefined"!=typeof exports?("undefined"!=typeof module&&module.exports&&(exports=module.exports=w),exports._=w):n._=w,w.VERSION="1.4.3";var A=w.each=w.forEach=function(n,t,e){if(null!=n)if(s&&n.forEach===s)n.forEach(t,e);else if(n.length===+n.length){for(var u=0,i=n.length;i>u;u++)if(t.call(e,n[u],u,n)===r)return}else for(var a in n)if(w.has(n,a)&&t.call(e,n[a],a,n)===r)return};w.map=w.collect=function(n,t,r){var e=[];return null==n?e:p&&n.map===p?n.map(t,r):(A(n,function(n,u,i){e[e.length]=t.call(r,n,u,i)}),e)};var O="Reduce of empty array with no initial value";w.reduce=w.foldl=w.inject=function(n,t,r,e){var u=arguments.length>2;if(null==n&&(n=[]),v&&n.reduce===v)return e&&(t=w.bind(t,e)),u?n.reduce(t,r):n.reduce(t);if(A(n,function(n,i,a){u?r=t.call(e,r,n,i,a):(r=n,u=!0)}),!u)throw new TypeError(O);return r},w.reduceRight=w.foldr=function(n,t,r,e){var u=arguments.length>2;if(null==n&&(n=[]),h&&n.reduceRight===h)return e&&(t=w.bind(t,e)),u?n.reduceRight(t,r):n.reduceRight(t);var i=n.length;if(i!==+i){var a=w.keys(n);i=a.length}if(A(n,function(o,c,l){c=a?a[--i]:--i,u?r=t.call(e,r,n[c],c,l):(r=n[c],u=!0)}),!u)throw new TypeError(O);return r},w.find=w.detect=function(n,t,r){var e;return E(n,function(n,u,i){return t.call(r,n,u,i)?(e=n,!0):void 0}),e},w.filter=w.select=function(n,t,r){var e=[];return null==n?e:g&&n.filter===g?n.filter(t,r):(A(n,function(n,u,i){t.call(r,n,u,i)&&(e[e.length]=n)}),e)},w.reject=function(n,t,r){return w.filter(n,function(n,e,u){return!t.call(r,n,e,u)},r)},w.every=w.all=function(n,t,e){t||(t=w.identity);var u=!0;return null==n?u:d&&n.every===d?n.every(t,e):(A(n,function(n,i,a){return(u=u&&t.call(e,n,i,a))?void 0:r}),!!u)};var E=w.some=w.any=function(n,t,e){t||(t=w.identity);var u=!1;return null==n?u:m&&n.some===m?n.some(t,e):(A(n,function(n,i,a){return u||(u=t.call(e,n,i,a))?r:void 0}),!!u)};w.contains=w.include=function(n,t){return null==n?!1:y&&n.indexOf===y?-1!=n.indexOf(t):E(n,function(n){return n===t})},w.invoke=function(n,t){var r=o.call(arguments,2);return w.map(n,function(n){return(w.isFunction(t)?t:n[t]).apply(n,r)})},w.pluck=function(n,t){return w.map(n,function(n){return n[t]})},w.where=function(n,t){return w.isEmpty(t)?[]:w.filter(n,function(n){for(var r in t)if(t[r]!==n[r])return!1;return!0})},w.max=function(n,t,r){if(!t&&w.isArray(n)&&n[0]===+n[0]&&65535>n.length)return Math.max.apply(Math,n);if(!t&&w.isEmpty(n))return-1/0;var e={computed:-1/0,value:-1/0};return A(n,function(n,u,i){var a=t?t.call(r,n,u,i):n;a>=e.computed&&(e={value:n,computed:a})}),e.value},w.min=function(n,t,r){if(!t&&w.isArray(n)&&n[0]===+n[0]&&65535>n.length)return Math.min.apply(Math,n);if(!t&&w.isEmpty(n))return 1/0;var e={computed:1/0,value:1/0};return A(n,function(n,u,i){var a=t?t.call(r,n,u,i):n;e.computed>a&&(e={value:n,computed:a})}),e.value},w.shuffle=function(n){var t,r=0,e=[];return A(n,function(n){t=w.random(r++),e[r-1]=e[t],e[t]=n}),e};var F=function(n){return w.isFunction(n)?n:function(t){return t[n]}};w.sortBy=function(n,t,r){var e=F(t);return w.pluck(w.map(n,function(n,t,u){return{value:n,index:t,criteria:e.call(r,n,t,u)}}).sort(function(n,t){var r=n.criteria,e=t.criteria;if(r!==e){if(r>e||void 0===r)return 1;if(e>r||void 0===e)return-1}return n.indexi;){var o=i+a>>>1;u>r.call(e,n[o])?i=o+1:a=o}return i},w.toArray=function(n){return n?w.isArray(n)?o.call(n):n.length===+n.length?w.map(n,w.identity):w.values(n):[]},w.size=function(n){return null==n?0:n.length===+n.length?n.length:w.keys(n).length},w.first=w.head=w.take=function(n,t,r){return null==n?void 0:null==t||r?n[0]:o.call(n,0,t)},w.initial=function(n,t,r){return o.call(n,0,n.length-(null==t||r?1:t))},w.last=function(n,t,r){return null==n?void 0:null==t||r?n[n.length-1]:o.call(n,Math.max(n.length-t,0))},w.rest=w.tail=w.drop=function(n,t,r){return o.call(n,null==t||r?1:t)},w.compact=function(n){return w.filter(n,w.identity)};var R=function(n,t,r){return A(n,function(n){w.isArray(n)?t?a.apply(r,n):R(n,t,r):r.push(n)}),r};w.flatten=function(n,t){return R(n,t,[])},w.without=function(n){return w.difference(n,o.call(arguments,1))},w.uniq=w.unique=function(n,t,r,e){w.isFunction(t)&&(e=r,r=t,t=!1);var u=r?w.map(n,r,e):n,i=[],a=[];return A(u,function(r,e){(t?e&&a[a.length-1]===r:w.contains(a,r))||(a.push(r),i.push(n[e]))}),i},w.union=function(){return w.uniq(c.apply(e,arguments))},w.intersection=function(n){var t=o.call(arguments,1);return w.filter(w.uniq(n),function(n){return w.every(t,function(t){return w.indexOf(t,n)>=0})})},w.difference=function(n){var t=c.apply(e,o.call(arguments,1));return w.filter(n,function(n){return!w.contains(t,n)})},w.zip=function(){for(var n=o.call(arguments),t=w.max(w.pluck(n,"length")),r=Array(t),e=0;t>e;e++)r[e]=w.pluck(n,""+e);return r},w.object=function(n,t){if(null==n)return{};for(var r={},e=0,u=n.length;u>e;e++)t?r[n[e]]=t[e]:r[n[e][0]]=n[e][1];return r},w.indexOf=function(n,t,r){if(null==n)return-1;var e=0,u=n.length;if(r){if("number"!=typeof r)return e=w.sortedIndex(n,t),n[e]===t?e:-1;e=0>r?Math.max(0,u+r):r}if(y&&n.indexOf===y)return n.indexOf(t,r);for(;u>e;e++)if(n[e]===t)return e;return-1},w.lastIndexOf=function(n,t,r){if(null==n)return-1;var e=null!=r;if(b&&n.lastIndexOf===b)return e?n.lastIndexOf(t,r):n.lastIndexOf(t);for(var u=e?r:n.length;u--;)if(n[u]===t)return u;return-1},w.range=function(n,t,r){1>=arguments.length&&(t=n||0,n=0),r=arguments[2]||1;for(var e=Math.max(Math.ceil((t-n)/r),0),u=0,i=Array(e);e>u;)i[u++]=n,n+=r;return i};var I=function(){};w.bind=function(n,t){var r,e;if(n.bind===j&&j)return j.apply(n,o.call(arguments,1));if(!w.isFunction(n))throw new TypeError;return r=o.call(arguments,2),e=function(){if(!(this instanceof e))return n.apply(t,r.concat(o.call(arguments)));I.prototype=n.prototype;var u=new I;I.prototype=null;var i=n.apply(u,r.concat(o.call(arguments)));return Object(i)===i?i:u}},w.bindAll=function(n){var t=o.call(arguments,1);return 0==t.length&&(t=w.functions(n)),A(t,function(t){n[t]=w.bind(n[t],n)}),n},w.memoize=function(n,t){var r={};return t||(t=w.identity),function(){var e=t.apply(this,arguments);return w.has(r,e)?r[e]:r[e]=n.apply(this,arguments)}},w.delay=function(n,t){var r=o.call(arguments,2);return setTimeout(function(){return n.apply(null,r)},t)},w.defer=function(n){return w.delay.apply(w,[n,1].concat(o.call(arguments,1)))},w.throttle=function(n,t){var r,e,u,i,a=0,o=function(){a=new Date,u=null,i=n.apply(r,e)};return function(){var c=new Date,l=t-(c-a);return r=this,e=arguments,0>=l?(clearTimeout(u),u=null,a=c,i=n.apply(r,e)):u||(u=setTimeout(o,l)),i}},w.debounce=function(n,t,r){var e,u;return function(){var i=this,a=arguments,o=function(){e=null,r||(u=n.apply(i,a))},c=r&&!e;return clearTimeout(e),e=setTimeout(o,t),c&&(u=n.apply(i,a)),u}},w.once=function(n){var t,r=!1;return function(){return r?t:(r=!0,t=n.apply(this,arguments),n=null,t)}},w.wrap=function(n,t){return function(){var r=[n];return a.apply(r,arguments),t.apply(this,r)}},w.compose=function(){var n=arguments;return function(){for(var t=arguments,r=n.length-1;r>=0;r--)t=[n[r].apply(this,t)];return t[0]}},w.after=function(n,t){return 0>=n?t():function(){return 1>--n?t.apply(this,arguments):void 0}},w.keys=_||function(n){if(n!==Object(n))throw new TypeError("Invalid object");var t=[];for(var r in n)w.has(n,r)&&(t[t.length]=r);return t},w.values=function(n){var t=[];for(var r in n)w.has(n,r)&&t.push(n[r]);return t},w.pairs=function(n){var t=[];for(var r in n)w.has(n,r)&&t.push([r,n[r]]);return t},w.invert=function(n){var t={};for(var r in n)w.has(n,r)&&(t[n[r]]=r);return t},w.functions=w.methods=function(n){var t=[];for(var r in n)w.isFunction(n[r])&&t.push(r);return t.sort()},w.extend=function(n){return A(o.call(arguments,1),function(t){if(t)for(var r in t)n[r]=t[r]}),n},w.pick=function(n){var t={},r=c.apply(e,o.call(arguments,1));return A(r,function(r){r in n&&(t[r]=n[r])}),t},w.omit=function(n){var t={},r=c.apply(e,o.call(arguments,1));for(var u in n)w.contains(r,u)||(t[u]=n[u]);return t},w.defaults=function(n){return A(o.call(arguments,1),function(t){if(t)for(var r in t)null==n[r]&&(n[r]=t[r])}),n},w.clone=function(n){return w.isObject(n)?w.isArray(n)?n.slice():w.extend({},n):n},w.tap=function(n,t){return t(n),n};var S=function(n,t,r,e){if(n===t)return 0!==n||1/n==1/t;if(null==n||null==t)return n===t;n instanceof w&&(n=n._wrapped),t instanceof w&&(t=t._wrapped);var u=l.call(n);if(u!=l.call(t))return!1;switch(u){case"[object String]":return n==t+"";case"[object Number]":return n!=+n?t!=+t:0==n?1/n==1/t:n==+t;case"[object Date]":case"[object Boolean]":return+n==+t;case"[object RegExp]":return n.source==t.source&&n.global==t.global&&n.multiline==t.multiline&&n.ignoreCase==t.ignoreCase}if("object"!=typeof n||"object"!=typeof t)return!1;for(var i=r.length;i--;)if(r[i]==n)return e[i]==t;r.push(n),e.push(t);var a=0,o=!0;if("[object Array]"==u){if(a=n.length,o=a==t.length)for(;a--&&(o=S(n[a],t[a],r,e)););}else{var c=n.constructor,f=t.constructor;if(c!==f&&!(w.isFunction(c)&&c instanceof c&&w.isFunction(f)&&f instanceof f))return!1;for(var s in n)if(w.has(n,s)&&(a++,!(o=w.has(t,s)&&S(n[s],t[s],r,e))))break;if(o){for(s in t)if(w.has(t,s)&&!a--)break;o=!a}}return r.pop(),e.pop(),o};w.isEqual=function(n,t){return S(n,t,[],[])},w.isEmpty=function(n){if(null==n)return!0;if(w.isArray(n)||w.isString(n))return 0===n.length;for(var t in n)if(w.has(n,t))return!1;return!0},w.isElement=function(n){return!(!n||1!==n.nodeType)},w.isArray=x||function(n){return"[object Array]"==l.call(n)},w.isObject=function(n){return n===Object(n)},A(["Arguments","Function","String","Number","Date","RegExp"],function(n){w["is"+n]=function(t){return l.call(t)=="[object "+n+"]"}}),w.isArguments(arguments)||(w.isArguments=function(n){return!(!n||!w.has(n,"callee"))}),w.isFunction=function(n){return"function"==typeof n},w.isFinite=function(n){return isFinite(n)&&!isNaN(parseFloat(n))},w.isNaN=function(n){return w.isNumber(n)&&n!=+n},w.isBoolean=function(n){return n===!0||n===!1||"[object Boolean]"==l.call(n)},w.isNull=function(n){return null===n},w.isUndefined=function(n){return void 0===n},w.has=function(n,t){return f.call(n,t)},w.noConflict=function(){return n._=t,this},w.identity=function(n){return n},w.times=function(n,t,r){for(var e=Array(n),u=0;n>u;u++)e[u]=t.call(r,u);return e},w.random=function(n,t){return null==t&&(t=n,n=0),n+(0|Math.random()*(t-n+1))};var T={escape:{"&":"&","<":"<",">":">",'"':""","'":"'","/":"/"}};T.unescape=w.invert(T.escape);var M={escape:RegExp("["+w.keys(T.escape).join("")+"]","g"),unescape:RegExp("("+w.keys(T.unescape).join("|")+")","g")};w.each(["escape","unescape"],function(n){w[n]=function(t){return null==t?"":(""+t).replace(M[n],function(t){return T[n][t]})}}),w.result=function(n,t){if(null==n)return null;var r=n[t];return w.isFunction(r)?r.call(n):r},w.mixin=function(n){A(w.functions(n),function(t){var r=w[t]=n[t];w.prototype[t]=function(){var n=[this._wrapped];return a.apply(n,arguments),z.call(this,r.apply(w,n))}})};var N=0;w.uniqueId=function(n){var t=""+ ++N;return n?n+t:t},w.templateSettings={evaluate:/<%([\s\S]+?)%>/g,interpolate:/<%=([\s\S]+?)%>/g,escape:/<%-([\s\S]+?)%>/g};var q=/(.)^/,B={"'":"'","\\":"\\","\r":"r","\n":"n"," ":"t","\u2028":"u2028","\u2029":"u2029"},D=/\\|'|\r|\n|\t|\u2028|\u2029/g;w.template=function(n,t,r){r=w.defaults({},r,w.templateSettings);var e=RegExp([(r.escape||q).source,(r.interpolate||q).source,(r.evaluate||q).source].join("|")+"|$","g"),u=0,i="__p+='";n.replace(e,function(t,r,e,a,o){return i+=n.slice(u,o).replace(D,function(n){return"\\"+B[n]}),r&&(i+="'+\n((__t=("+r+"))==null?'':_.escape(__t))+\n'"),e&&(i+="'+\n((__t=("+e+"))==null?'':__t)+\n'"),a&&(i+="';\n"+a+"\n__p+='"),u=o+t.length,t}),i+="';\n",r.variable||(i="with(obj||{}){\n"+i+"}\n"),i="var __t,__p='',__j=Array.prototype.join,print=function(){__p+=__j.call(arguments,'');};\n"+i+"return __p;\n";try{var a=Function(r.variable||"obj","_",i)}catch(o){throw o.source=i,o}if(t)return a(t,w);var c=function(n){return a.call(this,n,w)};return c.source="function("+(r.variable||"obj")+"){\n"+i+"}",c},w.chain=function(n){return w(n).chain()};var z=function(n){return this._chain?w(n).chain():n};w.mixin(w),A(["pop","push","reverse","shift","sort","splice","unshift"],function(n){var t=e[n];w.prototype[n]=function(){var r=this._wrapped;return t.apply(r,arguments),"shift"!=n&&"splice"!=n||0!==r.length||delete r[0],z.call(this,r)}}),A(["concat","join","slice"],function(n){var t=e[n];w.prototype[n]=function(){return z.call(this,t.apply(this._wrapped,arguments))}}),w.extend(w.prototype,{chain:function(){return this._chain=!0,this},value:function(){return this._wrapped}})}).call(this); -------------------------------------------------------------------------------- /new-web/ss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/new-web/ss.png -------------------------------------------------------------------------------- /new-web/styles.css: -------------------------------------------------------------------------------- 1 | .hero-unit { 2 | margin-top:6px; 3 | padding-bottom:30px; 4 | color: #fff; 5 | background-color: #1ED760; 6 | text-align:center; 7 | } 8 | 9 | .hero-unit h1 { 10 | font-weight:lighter; 11 | font-size:64px; 12 | } 13 | 14 | .btn-narrow :hover{ 15 | background:green; 16 | width:12px !important; 17 | } 18 | 19 | .artist-popularity { 20 | font-size:7px; 21 | } 22 | 23 | #gallery h1 { 24 | color: #1ED760; 25 | } 26 | 27 | #about h1 { 28 | color: #1ED760; 29 | } 30 | 31 | #about h2 { 32 | color: #1ED760; 33 | } 34 | 35 | 36 | .reg-unit { 37 | color:white; 38 | margin-top:6px; 39 | /*background-color: #1ED760;*/ 40 | /*background-color: #6F7073;*/ 41 | background-color: #444; 42 | } 43 | 44 | .navbar-nav li a { 45 | cursor:pointer; 46 | } 47 | 48 | .artist:hover { 49 | cursor:pointer; 50 | } 51 | 52 | #options li a:hover { 53 | background-color: #727272; 54 | } 55 | 56 | .option-active { 57 | /*background-color: #6f7073;*/ 58 | color: #1ED760 !important; 59 | } 60 | 61 | /* 62 | .reg-unit a { 63 | color:red; 64 | } 65 | 66 | .reg-unit a:hover { 67 | color:red; 68 | } 69 | 70 | .reg-unit a:visited { 71 | color:orange; 72 | } 73 | */ 74 | 75 | 76 | 77 | #gallery { 78 | display:none; 79 | } 80 | 81 | #time-info { 82 | display:none; 83 | } 84 | 85 | .gallery-list { 86 | font-size:24px; 87 | } 88 | 89 | #about { 90 | display:none; 91 | font-weight:lighter !important; 92 | } 93 | 94 | 95 | 96 | #main { 97 | margin-top:20px; 98 | margin-left:8px; 99 | } 100 | 101 | .adiv { 102 | width:300px; 103 | height:350px; 104 | background-size:100%; 105 | background-repeat:no-repeat; 106 | overflow:hidden; 107 | position:relative; 108 | /*background-color: #122;*/ 109 | margin-bottom:2px; 110 | /*padding:4px; */ 111 | background-color:#ddd; 112 | } 113 | 114 | #go { 115 | } 116 | 117 | #xbuttons { 118 | width:100%; 119 | margin-left:auto; 120 | margin-right:auto; 121 | text-align:center; 122 | display:none; 123 | } 124 | 125 | #tweet-span { 126 | position:relative; 127 | top:10px; 128 | } 129 | 130 | .tadiv { 131 | display:inline-block; 132 | width:310px; 133 | height:364px; 134 | position:relative; 135 | /*background-color: #122;*/ 136 | margin:3px; 137 | } 138 | 139 | 140 | .is-current { 141 | background-color:#e36b23; 142 | font-size:18px; 143 | } 144 | 145 | #list { 146 | margin-left:4px; 147 | text-align:center; 148 | } 149 | 150 | .adiv:hover { 151 | background-color: #1ED760; 152 | } 153 | 154 | #info { 155 | margin-left:10px; 156 | width:100%; 157 | text-align:center; 158 | margin-bottom:10px; 159 | height:32px; 160 | font-size:18px; 161 | } 162 | 163 | 164 | .track-info { 165 | position:absolute; 166 | bottom: 4px; 167 | margin-left:6px; 168 | margin-right:6px; 169 | line-height:18px; 170 | font-size:14px; 171 | height:40px; 172 | overflow:hidden; 173 | } 174 | 175 | .playbutton { 176 | width:100px; 177 | position:absolute; 178 | top:100px; 179 | left:100px; 180 | } 181 | 182 | .buttons { 183 | opacity:.8; 184 | } 185 | 186 | #footer { 187 | margin-top:10px; 188 | margin-left:20px; 189 | margin-bottom:20px; 190 | } 191 | 192 | 193 | 194 | .album-label { 195 | overflow:hidden; 196 | height:18px; 197 | text-overflow:ellipsis; 198 | width:280px; 199 | white-space:nowrap; 200 | } 201 | 202 | .change { 203 | float:left; 204 | } 205 | 206 | .bypass { 207 | float:right; 208 | } 209 | 210 | #search { 211 | margin-bottom: 20px; 212 | } 213 | 214 | #search input { 215 | color:black; 216 | } 217 | 218 | .faq { 219 | margin-right:10px; 220 | } 221 | 222 | #frog-image { 223 | margin-right:15px; 224 | margin-top:5px; 225 | float:left; 226 | border-radius:15px; 227 | 228 | } 229 | 230 | #lz-graph { 231 | float:right; 232 | border-radius:15px; 233 | margin-left:10px; 234 | margin-bottom:20px; 235 | } 236 | 237 | #time-info { 238 | margin-right:20px; 239 | font-size:8px; 240 | } 241 | 242 | .empty-link { 243 | cursor:pointer; 244 | } 245 | 246 | #the-plot : { 247 | width:1000px; 248 | height:600px; 249 | } 250 | -------------------------------------------------------------------------------- /new_crawler/artist_graph.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import time 3 | import networkx as nx 4 | import json 5 | import rocksdb 6 | import collections 7 | import search 8 | 9 | 10 | class ArtistGraph: 11 | def __init__(self, db_path='rocks.db'): 12 | self.db = rocksdb.DB(db_path, rocksdb.Options(), read_only=True) 13 | self.trace = True 14 | self.searcher = search.Searcher(exact=True) 15 | 16 | #important configs 17 | self.skip_artists_with_no_tracks = True 18 | self.simple_edges = False 19 | self.max_edges_per_artist = 4 20 | self.pop_weight = 100.00 21 | self.min_popularity = 30 22 | 23 | 24 | self.artist_blacklist = set() 25 | self.edge_blacklist = collections.defaultdict(set) 26 | self.load_blacklist() 27 | 28 | self.load_graph() 29 | 30 | 31 | def load_blacklist(self): 32 | f = open("blacklist.csv") 33 | for lineno, line in enumerate(f): 34 | line = line.strip() 35 | if len(line) == 0: 36 | continue 37 | if line[0] == '#': 38 | continue 39 | fields = [f.strip() for f in line.split(',')] 40 | if len(fields) > 1 and fields[0] == 'artist': 41 | aid = to_aid(fields[1]) 42 | self.artist_blacklist.add(aid) 43 | elif fields[0] == 'edge' and len(fields) > 2: 44 | aid1 = to_aid(fields[1]) 45 | aid2 = to_aid(fields[2]) 46 | self.edge_blacklist[aid1].add(aid2) 47 | self.edge_blacklist[aid2].add(aid1) 48 | else: 49 | print "unknown blacklist type", fields[0], "at line", lineno 50 | 51 | def load_graph(self): 52 | self.G = nx.Graph() 53 | popularity = collections.defaultdict(int) 54 | skips = set() 55 | 56 | it = self.db.itervalues() 57 | it.seek_to_first() 58 | missing = [] 59 | 60 | print "loading popularity" 61 | for i, tartist_js in enumerate(it): 62 | artist = json.loads(tartist_js) 63 | popularity[artist['id']] = artist['popularity'] 64 | if self.skip_artists_with_no_tracks: 65 | if 'tracks' not in artist or len(artist['tracks']) == 0: 66 | skips.add(artist['id']) 67 | print len(popularity), "artists", "skipping", len(skips) 68 | 69 | print "bulding graph" 70 | nnodes = 0 71 | it.seek_to_first() 72 | for i, tartist_js in enumerate(it): 73 | artist = json.loads(tartist_js) 74 | node = artist['id'] 75 | if node in skips: 76 | continue 77 | 78 | if node in self.artist_blacklist: 79 | print 'skipped artist', node 80 | continue 81 | 82 | 83 | pop = popularity[node] 84 | if pop < self.min_popularity: 85 | continue 86 | 87 | nnodes += 1 88 | self.index(artist['name'],node) 89 | if 'edges' in artist: 90 | edges = [edge for edge in artist['edges'] if self.is_good_edge(node, edge, skips, popularity)] 91 | 92 | if self.simple_edges: 93 | # just weighed by order 94 | edges = edges[:self.max_edges_per_artist] 95 | nedges = float(len(edges)) 96 | weighted_edges = [(1.0 + nedge / nedges, edge) for nedge, edge in enumerate(edges)] 97 | else: 98 | weighted_edges = [ (1 + self.pop_weight * abs(pop - popularity[edge]) / 100.0, edge) for edge in edges] 99 | weighted_edges.sort() 100 | weighted_edges = weighted_edges[:self.max_edges_per_artist] 101 | 102 | for weight, target in weighted_edges: 103 | self.add_edge(node, target, weight) 104 | 105 | 106 | if self.trace and nnodes % 1000 == 0: 107 | print "loading %d artists" % (nnodes, ) 108 | 109 | print "nodes", self.G.number_of_nodes() 110 | print "edges", self.G.number_of_edges() 111 | components = list(nx.connected_components(self.G)) 112 | print "connnected components", len(components) 113 | clens = [len(c) for c in components] 114 | clens.sort(reverse=True) 115 | for cl in clens: 116 | print cl, 117 | 118 | def is_good_edge(self, src, dest, skips, popularity): 119 | if dest in skips: 120 | return False 121 | 122 | if dest in self.artist_blacklist: 123 | return False 124 | 125 | if dest in self.edge_blacklist[src]: 126 | return False 127 | 128 | if popularity[dest] < self.min_popularity: 129 | return False 130 | 131 | return True 132 | 133 | 134 | def index(self, name, aid): 135 | self.searcher.add(name, aid) 136 | 137 | def search(self, name): 138 | if name == None or len(name) == 0: 139 | return None 140 | if name.startswith('spotify:artist:'): 141 | fields = name.split(':') 142 | if len(fields) == 3: 143 | return fields[2] 144 | 145 | matches = self.searcher.search(name) 146 | for match in matches: 147 | if match in self.G: 148 | return match 149 | return None 150 | 151 | def get_artist(self, aid): 152 | tjs = self.db.get(aid) 153 | if tjs: 154 | tartist = json.loads(tjs) 155 | if not 'edges' in tartist: 156 | tartist['edges'] = [] 157 | if not 'incoming_edges' in tartist: 158 | tartist['incoming_edges'] = [] 159 | else: 160 | tartist = None 161 | return tartist 162 | 163 | def get_skipset(self): 164 | skips = set() 165 | it = self.db.itervalues() 166 | it.seek_to_first() 167 | missing = [] 168 | for i, tartist_js in enumerate(it): 169 | artist = json.loads(tartist_js) 170 | if 'tracks' not in artist or len(artist['tracks']) == 0: 171 | skips.add(artist['id']) 172 | if self.trace: 173 | print "found %d artists with no tracks" % (len(skips),) 174 | return skips 175 | 176 | def path(self, source_name, target_name, skipset=set()): 177 | def get_weight(src, dest, attrs): 178 | if src in skipset or dest in skipset: 179 | # print "gw", srx, dest, attrs, 10000 180 | return 10000 181 | # print "gw", src, dest, attrs, 1 182 | return attrs['weight'] 183 | 184 | results = { 185 | 'status': 'ok' 186 | } 187 | 188 | if len(source_name) == 0: 189 | results['status'] = 'error' 190 | results['reason'] = "No artist given" 191 | else: 192 | source_aid = self.search(source_name) 193 | if source_aid == None: 194 | results['status'] = 'error' 195 | results['reason'] = "Can't find " + source_name 196 | 197 | target_aid = self.search(target_name) 198 | if target_aid == None: 199 | results['status'] = 'error' 200 | results['reason'] = "Can't find " + target_name 201 | 202 | print "s=t", source_aid, target_aid 203 | if source_aid not in self.G: 204 | results['status'] = 'error' 205 | results['reason'] = "Can't find " + source_name + " in the artist graph" 206 | 207 | if target_aid not in self.G: 208 | results['status'] = 'error' 209 | results['reason'] = "Can't find " + target_name + " in the artist graph" 210 | 211 | if source_aid and target_aid and results['status'] == 'ok': 212 | start = time.time() 213 | if len(skipset) > 0: 214 | rpath = nx.dijkstra_path(self.G, source_aid, target_aid, get_weight) 215 | score = len(rpath) 216 | else: 217 | score, rpath = nx.bidirectional_dijkstra(self.G, source_aid, target_aid) 218 | pdelta = time.time() - start 219 | results['score'] = score 220 | populated_path = [self.get_artist(aid) for aid in rpath] 221 | fdelta = time.time() - start 222 | 223 | results['status'] = 'ok' 224 | results['raw_path'] = rpath 225 | results['path'] = populated_path 226 | results['pdelta'] = pdelta * 1000 227 | results['fdelta'] = fdelta * 1000 228 | return results 229 | 230 | def add_node(self, node): 231 | if node not in self.G: 232 | self.G.add_node(node) 233 | 234 | def add_edge(self, source, target, weight): 235 | self.add_node(source) 236 | self.add_node(target) 237 | self.G.add_edges_from([(source, target, {"weight": weight})]) 238 | 239 | 240 | def normalize_name(self, name): 241 | name = name.lower().strip() 242 | return name 243 | 244 | def an(self, aid): 245 | #return self.get_artist(aid)['name'] 246 | artist = self.get_artist(aid) 247 | return "%s(%d)" % (artist['name'], artist['popularity']) 248 | 249 | def edge_check(self, uri): 250 | aid = to_aid(uri) 251 | artist = self.get_artist(aid) 252 | 253 | print "edge check", self.an(aid) 254 | 255 | combined = set() 256 | combined.union(artist['edges']) 257 | combined.union(artist['incoming_edges']) 258 | 259 | print "combined:" 260 | for aid in combined: 261 | print " ", self.an(aid) 262 | print 263 | 264 | print "outgoing:" 265 | for aid in artist['edges']: 266 | if aid not in combined: 267 | print " ", self.an(aid) 268 | print 269 | print "incoming:" 270 | for aid in artist['incoming_edges']: 271 | if aid not in combined: 272 | print " ", self.an(aid) 273 | print 274 | 275 | def sim_check(self, uri): 276 | aid = to_aid(uri) 277 | artist = self.get_artist(aid) 278 | 279 | if not 'edges' in artist: 280 | print "leaf node, nothing to do" 281 | return 282 | 283 | sim_counts = collections.Counter() 284 | osim_counts = collections.Counter() 285 | 286 | simset = set(artist['edges']) 287 | print "sim_check", self.an(aid) 288 | print 289 | print "normal sims" 290 | for i, edge in enumerate(artist['edges']): 291 | print " %d %s %s" % (i, edge, self.an(edge)) 292 | sim_artist = self.get_artist(edge) 293 | if 'edges' in sim_artist: 294 | for sedge in sim_artist['edges']: 295 | if sedge in simset: 296 | sim_counts[sedge] += 1 297 | osim_counts[sedge] += 1 298 | print 299 | 300 | print "ranked sims" 301 | print artist['name'] 302 | for edge, count in sim_counts.most_common(): 303 | print " %d %s %s"% (count, edge, self.an(edge)) 304 | 305 | print 306 | print "sim neighborhooed" 307 | for edge, count in osim_counts.most_common(): 308 | if count > 1: 309 | print "%d %s %s"% (count, edge, self.an(edge)) 310 | 311 | 312 | def to_aid(uri_or_aid): 313 | if uri_or_aid: 314 | fields = uri_or_aid.split(':') 315 | if len(fields) == 3: 316 | return fields[2] 317 | return uri_or_aid 318 | 319 | if __name__ == '__main__': 320 | args = sys.argv[1:] 321 | uris = [] 322 | 323 | ag = ArtistGraph() 324 | 325 | while args: 326 | arg = args.pop(0) 327 | if arg == '--path': 328 | pass 329 | 330 | 331 | -------------------------------------------------------------------------------- /new_crawler/blacklist.csv: -------------------------------------------------------------------------------- 1 | # the black list. you can block artists like so: 2 | 3 | artist, spotify:artist:61zv3hX7l838ZyhaDyAx8S, gary glitter 4 | artist, spotify:artist:1faxe25Wp3Nk43xVVxsdSB, billy davis, too many bad sims due to ambiguous artist 5 | 6 | edge, spotify:artist:1uKR3ihZmv8a93heLPYKQ8,spotify:artist:4NgfOZCL9Ml67xzM0xzIvC, janice to janice joplin 7 | -------------------------------------------------------------------------------- /new_crawler/build_db.py: -------------------------------------------------------------------------------- 1 | import db 2 | import json 3 | 4 | def process_file(in_path): 5 | f = open(in_path) 6 | 7 | for line in f: 8 | line.strip() 9 | 10 | 11 | if __name__ == '__main__': 12 | 13 | in_path = sys.argv[1] 14 | db_path = sys.argv[1] 15 | -------------------------------------------------------------------------------- /new_crawler/db.py: -------------------------------------------------------------------------------- 1 | import json 2 | import sys 3 | import os 4 | 5 | artists = {} 6 | edges = {} 7 | 8 | artist_by_name = {} 9 | 10 | def get_artist(uri): 11 | if uri in artists: 12 | return artists[uri] 13 | else: 14 | return None 15 | 16 | def get_artists(uris): 17 | return [artists[uri] for uri in uris if uri in artists] 18 | 19 | def get_artist_name(uri): 20 | artist = get_artist(uri) 21 | if artist: 22 | return artist['name'] 23 | else: 24 | return None 25 | 26 | def get_artists_with_edges(uris): 27 | ret_artists = [] 28 | for uri in uris: 29 | artist = get_artist(uri) 30 | if artist: 31 | ret_artists.append(artist) 32 | edges = get_edges(uri) 33 | if edges: 34 | artist['edges'] = edges 35 | return ret_artists 36 | 37 | def get_edges(uri): 38 | if uri in edges: 39 | return edges[uri] 40 | else: 41 | return None 42 | 43 | def get_all_edges(): 44 | return edges 45 | 46 | def get_all_artists(): 47 | return artists 48 | 49 | 50 | def load_db(prefix="g1"): 51 | if len(artists) == 0: 52 | f = open(prefix + "/nodes.js") 53 | for line in f: 54 | try: 55 | artist = json.loads(line.strip()) 56 | artists[artist['uri']] = artist 57 | nname = normalize_name(artist['name']) 58 | artist_by_name[nname] = artist 59 | except: 60 | print "skipped bad line in db", line 61 | print "loaded", len(artists), "artists" 62 | 63 | if len(edges) == 0: 64 | f = open(prefix + "/edges.js") 65 | for line in f: 66 | try: 67 | edge = json.loads(line.strip()) 68 | for uri, targets in edge.items(): 69 | edges[uri] = targets 70 | except: 71 | print "skipped bad edge in db", line 72 | print "loaded", len(edges), "edges" 73 | 74 | 75 | def normalize_name(n): 76 | return ''.join(e.lower() for e in n if e.isalnum()) 77 | 78 | def get_artist_by_name(name): 79 | nname = normalize_name(name) 80 | print "nname", nname 81 | if nname in artist_by_name: 82 | return artist_by_name[nname] 83 | else: 84 | return None 85 | 86 | 87 | if __name__ == '__main__': 88 | load_db() 89 | 90 | for uri in sys.argv[1:]: 91 | print uri 92 | print json.dumps(get_artist(uri), indent=4) 93 | print json.dumps(get_edges(uri), indent=4) 94 | print 95 | 96 | -------------------------------------------------------------------------------- /new_crawler/flask_server.py: -------------------------------------------------------------------------------- 1 | """ the http server for SFC 2 | """ 3 | import sys 4 | import logging 5 | import atexit 6 | import time 7 | import collections 8 | 9 | from flask import Flask, request, jsonify 10 | from flask_cors import cross_origin 11 | from werkzeug.contrib.fixers import ProxyFix 12 | import artist_graph 13 | 14 | APP = Flask(__name__) 15 | APP.debug = False 16 | APP.trace = False 17 | APP.testing = False 18 | APP.ag = artist_graph.ArtistGraph() 19 | 20 | @APP.route('/frog/path') 21 | @cross_origin() 22 | def api_path(): 23 | start = time.time() 24 | src = request.args.get("src", None) 25 | dest = request.args.get("dest", None) 26 | skips = request.args.get("skips", None) 27 | 28 | if skips and len(skips) > 0: 29 | skipset = set(skips.split(',')) 30 | else: 31 | skipset = set() 32 | 33 | if src and dest: 34 | results = APP.ag.path(src, dest, skipset) 35 | if results['status'] == 'ok' and results['path']: 36 | src_name = results['path'][0]['name'] 37 | src_id = results['path'][0]['id'] 38 | dest_name = results['path'][-1]['name'] 39 | dest_id = results['path'][-1]['id'] 40 | text = "From " + src_name + " to " + dest_name 41 | add_to_history(src_id, dest_id, text, skips) 42 | else: 43 | results = { 44 | "status": "error", 45 | "reason": "missing src and/or dest", 46 | } 47 | return jsonify(results) 48 | 49 | 50 | history = [] 51 | max_history = 100 52 | popular = collections.Counter() 53 | popular_text = {} 54 | 55 | def add_to_history(src, dest, text, skips): 56 | global history 57 | if not found_in_history(src, dest): 58 | history.append( (src, dest, skips, text, time.time()) ) 59 | history = history[:max_history] 60 | 61 | key = src + ":::" + dest 62 | popular[key] += 1 63 | popular_text[key] = text 64 | 65 | 66 | def found_in_history(src, dest): 67 | for hsrc, hdest, skips, text, ts in history: 68 | if src == hsrc and dest == hdest: 69 | return True 70 | return False 71 | 72 | 73 | @APP.route('/frog/history') 74 | @cross_origin() 75 | def api_get_history(): 76 | out = [] 77 | for hist in reversed(history): 78 | src, dest, skips, text, ts = hist 79 | h = { 80 | "src": src, 81 | "dest": dest, 82 | "skips": skips, 83 | "text": text, 84 | "ts": ts, 85 | } 86 | out.append(h) 87 | results = { 88 | "status": 'ok', 89 | "history": out 90 | } 91 | return jsonify(results) 92 | 93 | @APP.route('/frog/popular') 94 | @cross_origin() 95 | def api_get_popular(): 96 | pop = [] 97 | for key, count in popular.most_common(100): 98 | src, dest = key.split(':::') 99 | text = popular_text[key] 100 | h = { 101 | "src": src, 102 | "dest": dest, 103 | "text": text, 104 | "count": count 105 | } 106 | pop.append(h) 107 | results = { 108 | "status": 'ok', 109 | "popular": pop 110 | } 111 | return jsonify(results) 112 | 113 | 114 | @APP.route('/frog/get') 115 | @cross_origin() 116 | def api_get(): 117 | """ get artist info for the given aids/uris 118 | """ 119 | start = time.time() 120 | aids = request.args.get("aids", None) 121 | 122 | if aids and len(aids) > 0: 123 | artist_ids = aids.split(',') 124 | out = [] 125 | for artist_id in artist_ids: 126 | artist = APP.ag.get_artist(artist_id) 127 | out.append(artist) 128 | 129 | results = { 130 | "status": "ok", 131 | "artists": out 132 | } 133 | else: 134 | results = { 135 | "status": "error", 136 | "reason": "no input artist ids given" 137 | } 138 | 139 | fdelta = time.time() - start 140 | results['fdelta'] = fdelta 141 | return jsonify(results) 142 | 143 | @APP.route('/frog/sims') 144 | @cross_origin() 145 | def api_sims(): 146 | """ get sim artists for the given aid 147 | """ 148 | start = time.time() 149 | name = request.args.get("artist", None) 150 | 151 | aid = APP.ag.search(name) 152 | 153 | out = [] 154 | if aid: 155 | artist = APP.ag.get_artist(aid) 156 | for edge in artist["edges"]: 157 | out.append(APP.ag.get_artist(edge)) 158 | 159 | results = { 160 | "status": "ok", 161 | "sims": out, 162 | "seed": artist 163 | } 164 | else: 165 | results = { 166 | "status": "error", 167 | "reason": "can't find artist " + name 168 | } 169 | 170 | fdelta = time.time() - start 171 | results['fdelta'] = fdelta 172 | return jsonify(results) 173 | 174 | 175 | #@APP.errorhandler(Exception) 176 | def handle_invalid_usage(error): 177 | """ implements the standard error processing 178 | """ 179 | print "error", error 180 | results = {'status': 'internal_error', "message": str(error)} 181 | return jsonify(results) 182 | 183 | 184 | APP.wsgi_app = ProxyFix(APP.wsgi_app) 185 | 186 | 187 | 188 | def shutdown(): 189 | """ performs any server shutdown cleanup 190 | """ 191 | print 'shutting down server ...' 192 | print 'done' 193 | 194 | 195 | if __name__ == '__main__': 196 | APP.debug = False 197 | APP.trace = False 198 | APP.wsgi = False 199 | HOST = '0.0.0.0' 200 | PORT = 3457 # debugging 201 | PORT = 4682 202 | 203 | logging.basicConfig( 204 | stream=sys.stdout, 205 | level=logging.INFO, 206 | format='%(asctime)s %(levelname)s %(message)s') 207 | 208 | atexit.register(shutdown) 209 | 210 | for arg in sys.argv[1:]: 211 | if arg == '--debug': 212 | APP.debug = True 213 | if arg == '--trace': 214 | APP.trace = True 215 | if APP.debug: 216 | print 'debug mode', 'host', HOST, 'port', PORT 217 | APP.run(threaded=False, debug=True, host=HOST, port=PORT) 218 | elif APP.wsgi: 219 | from gevent.wsgi import WSGIServer 220 | print 'WSGI production mode', 'port', PORT 221 | print 'WSGI production mode - ready' 222 | HTTP_SERVER = WSGIServer(('', PORT), APP) 223 | HTTP_SERVER.serve_forever() 224 | else: 225 | print 'production mode', 'port/host', PORT, HOST 226 | print 'production mode - ready' 227 | APP.run(threaded=True, debug=False, host=HOST, port=PORT) 228 | -------------------------------------------------------------------------------- /new_crawler/foo.js: -------------------------------------------------------------------------------- 1 | { 2 | "external_urls": { 3 | "spotify": "https://open.spotify.com/artist/3hE8S8ohRErocpkY7uJW4a" 4 | }, 5 | "followers": { 6 | "href": null, 7 | "total": 393621 8 | }, 9 | "genres": [ 10 | "gothic metal", 11 | "gothic symphonic metal", 12 | "power metal", 13 | "symphonic metal" 14 | ], 15 | "href": "https://api.spotify.com/v1/artists/3hE8S8ohRErocpkY7uJW4a", 16 | "id": "3hE8S8ohRErocpkY7uJW4a", 17 | "images": [ 18 | { 19 | "height": 640, 20 | "url": "https://i.scdn.co/image/ffaf90c9047adffbccc3af6f6f783ec608ced282", 21 | "width": 640 22 | }, 23 | { 24 | "height": 320, 25 | "url": "https://i.scdn.co/image/7312779b10c0a90d1ef61f29addb0b1f9b17c3b3", 26 | "width": 320 27 | }, 28 | { 29 | "height": 160, 30 | "url": "https://i.scdn.co/image/452dda43bb548452614feb72a87ad93fb6515f7a", 31 | "width": 160 32 | } 33 | ], 34 | "name": "Within Temptation", 35 | "popularity": 64, 36 | "type": "artist", 37 | "uri": "spotify:artist:3hE8S8ohRErocpkY7uJW4a" 38 | } 39 | -------------------------------------------------------------------------------- /new_crawler/rdb.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import rocksdb 3 | import time 4 | import collections 5 | import json 6 | import random 7 | import spotipy 8 | import spotipy_util as util 9 | from spotipy.oauth2 import SpotifyClientCredentials 10 | 11 | js_nodes = 'g2/nodes.js' 12 | js_edges = 'g2/edges.js' 13 | js_nodes = 'g4/nodes.js' 14 | js_edges = 'g4/edges.js' 15 | db_path = 'rocks2.db' 16 | read_only = False 17 | total_runs = 1 18 | 19 | 20 | server_side_credentials = False 21 | 22 | if server_side_credentials: 23 | client_credentials_manager = SpotifyClientCredentials() 24 | spotify = spotipy.Spotify(client_credentials_manager=client_credentials_manager) 25 | else: 26 | scope = '' 27 | username = 'plamere' 28 | token = util.prompt_for_user_token(username, scope, use_web_browser=False) 29 | if token: 30 | spotify = spotipy.Spotify(auth=token) 31 | else: 32 | print "can't get token" 33 | 34 | 35 | def to_tiny_artist(artist): 36 | image = None 37 | if len(artist['images']) > 0: 38 | image = artist['images'][0]['url'] 39 | ta = { 40 | "id": artist['id'], 41 | "name": artist['name'], 42 | "followers": artist['followers']['total'], 43 | "popularity": artist['popularity'], 44 | "image": image, 45 | "genres": artist['genres'] 46 | } 47 | return ta 48 | 49 | def to_tiny_track(track): 50 | if len(track['album']['images']) > 0: 51 | image = track['album']['images'][0]['url'] 52 | else: 53 | image = None 54 | 55 | tt = { 56 | "id": track['id'], 57 | "name": track['name'], 58 | "audio": track['preview_url'], 59 | "image": image 60 | } 61 | return tt 62 | 63 | 64 | def build(): 65 | db = rocksdb.DB(db_path, rocksdb.Options(create_if_missing=True)) 66 | edge_map = load_edges(js_edges) 67 | 68 | f = open(js_nodes) 69 | for i, line in enumerate(f): 70 | try: 71 | artist = json.loads(line.strip()) 72 | tiny_artist = to_tiny_artist(artist) 73 | if tiny_artist['id'] in edge_map: 74 | tiny_artist['edges'] = edge_map[tiny_artist['id']] 75 | else: 76 | tiny_artist['edges'] = [] 77 | db.put(tiny_artist['id'], json.dumps(tiny_artist)) 78 | print i, tiny_artist['name'] 79 | except: 80 | print "trouble with artist", line 81 | continue 82 | f.close() 83 | 84 | def dump_nodes(): 85 | f = open(js_nodes) 86 | for i, line in enumerate(f): 87 | try: 88 | artist = json.loads(line.strip()) 89 | tiny_artist = to_tiny_artist(artist) 90 | print "%7d %s" % (tiny_artist['followers'], tiny_artist['name']) 91 | except: 92 | print "trouble with", line 93 | continue 94 | f.close() 95 | 96 | def load_edges(path): 97 | edge_map = {} 98 | f = open(path) 99 | for line in f: 100 | try: 101 | edge_dict = json.loads(line.strip()) 102 | for uri, edges in edge_dict.items(): 103 | tid = uri_to_tid(uri) 104 | tedges = [] 105 | for edge in edges: 106 | tedges.append(uri_to_tid(edge)) 107 | edge_map[tid] = tedges 108 | except: 109 | print "trouble with edge", line 110 | continue 111 | f.close() 112 | return edge_map 113 | 114 | 115 | def add_track_info(): 116 | db = rocksdb.DB(db_path, rocksdb.Options(create_if_missing=False)) 117 | it = db.itervalues() 118 | it.seek_to_first() 119 | missing = [] 120 | for i, tartist_js in enumerate(it): 121 | tartist = json.loads(tartist_js) 122 | if 'tracks' not in tartist or len(tartist['tracks']) == 0 or has_no_audio(tartist['tracks']): 123 | missing.append(tartist) 124 | 125 | print "artists missing tracks", len(missing) 126 | missing.sort(key=lambda a:a['followers'], reverse=True) 127 | for i, artist in enumerate(missing): 128 | add_tracks(artist) 129 | if not 'incoming_edges' in artist: 130 | artist['incoming_edges'] = [] 131 | if not 'edges' in artist: 132 | artist['edges'] = [] 133 | db.put(artist['id'], json.dumps(artist)) 134 | print i, len(missing), len(artist['tracks']), artist['followers'], artist['id'], artist['name'] 135 | print "artists missing tracks", len(missing) 136 | 137 | def port_tracks(old_db, new_db): 138 | odb = rocksdb.DB(old_db, rocksdb.Options(create_if_missing=False)) 139 | ndb = rocksdb.DB(new_db, rocksdb.Options(create_if_missing=False)) 140 | it = ndb.itervalues() 141 | it.seek_to_first() 142 | for i, tartist_js in enumerate(it): 143 | tartist = json.loads(tartist_js) 144 | oartist = get_artist(odb, tartist['id']) 145 | if oartist and 'tracks' in oartist and len(oartist['tracks']) > 0: 146 | tartist['tracks'] = oartist['tracks'] 147 | ndb.put(tartist['id'], json.dumps(tartist)) 148 | print i, len(tartist['tracks']) 149 | 150 | def has_no_audio(tracks): 151 | for track in tracks: 152 | if 'audio' in track and track['audio'] != None: 153 | return False 154 | return True 155 | 156 | def has_no_audio(tracks): 157 | for track in tracks: 158 | if 'audio' in track and track['audio'] != None: 159 | return False 160 | return True 161 | 162 | 163 | 164 | def add_incoming_edges(): 165 | 166 | db = rocksdb.DB(db_path, rocksdb.Options(create_if_missing=False)) 167 | it = db.itervalues() 168 | it.seek_to_first() 169 | 170 | incoming_edges = collections.defaultdict(list) 171 | all_ids = set() 172 | 173 | for i, tartist_js in enumerate(it): 174 | tartist = json.loads(tartist_js) 175 | source = tartist['id'] 176 | all_ids.add(source) 177 | if i % 1000 == 0: 178 | print i, tartist['name'] 179 | if 'edges' in tartist: 180 | for edge in tartist['edges']: 181 | incoming_edges[edge].append(source) 182 | 183 | for i, aid in enumerate(all_ids): 184 | artist = get_artist(db, aid) 185 | artist['incoming_edges'] = incoming_edges[artist['id']] 186 | db.put(artist['id'], json.dumps(artist)) 187 | if i % 1000 == 0: 188 | print i, artist['name'], len(artist['incoming_edges']) 189 | 190 | def add_tracks(artist): 191 | results = spotify.artist_top_tracks(artist['id'], country='SE') 192 | #print json.dumps(results, indent=4) 193 | tracks = [] 194 | for track in results['tracks']: 195 | ttrack = to_tiny_track(track) 196 | tracks.append(ttrack) 197 | artist['tracks'] = tracks 198 | 199 | def id_to_uri(tid): 200 | if not tid.startswith('spotify:artist:'): 201 | return 'spotify:artist:' + tid 202 | return tid 203 | 204 | def uri_to_tid(uri): 205 | return uri.split(':')[-1] 206 | 207 | def get_artist(db, uri_or_tid): 208 | tid = uri_to_tid(uri_or_tid) 209 | tjs = db.get(tid) 210 | if tjs: 211 | tartist = json.loads(tjs) 212 | else: 213 | tartist = None 214 | return tartist 215 | 216 | def test_getter(total_runs=1): 217 | db = rocksdb.DB(db_path, rocksdb.Options(), read_only=True) 218 | errs = 0 219 | total_time = 0 220 | count = 0 221 | f = open(js_nodes) 222 | tartists = [] 223 | for i, line in enumerate(f): 224 | artist = json.loads(line.strip()) 225 | tiny_artist = to_tiny_artist(artist) 226 | tartists.append(tiny_artist) 227 | 228 | while total_runs: 229 | random.shuffle(tartists) 230 | for i, tiny_artist in enumerate(tartists): 231 | start = time.time() 232 | tartist = get_artist(db, tiny_artist['id']) 233 | delta = time.time() - start 234 | total_time += delta 235 | 236 | count += 1 237 | if tiny_artist['name'] == tartist['name']: 238 | print i, total_runs, errs, tiny_artist['name'], '==', tartist['name'] 239 | else: 240 | print 'MISMATCH', i, total_runs, errs, tiny_artist['name'], '==', tartist['name'] 241 | errs += 1 242 | total_runs -= 1 243 | f.close() 244 | 245 | print "errors", errs 246 | print "total_time", total_time, "ms per read", total_time * 1000 / count 247 | 248 | if __name__ == '__main__': 249 | args = sys.argv[1:] 250 | uris = [] 251 | while args: 252 | arg = args.pop(0) 253 | 254 | if arg == '--build': 255 | build() 256 | elif arg == '--artist' and args and not args[0].startswith('--'): 257 | uris.append(args.pop(0)) 258 | elif arg == '--dump': 259 | db = rocksdb.DB(db_path, rocksdb.Options(), read_only=True) 260 | for uri in uris: 261 | artist = get_artist(db, uri) 262 | print json.dumps(artist, indent=4) 263 | print 264 | elif arg == '--dump-nodes': 265 | dump_nodes() 266 | elif arg == '--add-tracks': 267 | add_track_info() 268 | elif arg == '--add-incoming-edges': 269 | add_incoming_edges() 270 | 271 | elif arg == '--port-tracks': 272 | old_path = args.pop(0) 273 | port_tracks(old_path, db_path) 274 | elif arg == '--db' and args and not args[0].startswith('--'): 275 | db_path = args.pop(0) 276 | elif arg == '--test': 277 | test_getter(total_runs) 278 | elif arg == '--ptest': 279 | artist = { 'id': '6jJ0s89eD6GaHleKKya26X' } 280 | add_tracks(artist) 281 | print json.dumps(artist, indent=4) 282 | elif arg == '--runs' and args: 283 | total_runs = int(args.pop(0)) 284 | -------------------------------------------------------------------------------- /new_crawler/search.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # This Python file uses the following encoding: utf-8 3 | import bisect 4 | import collections 5 | import re 6 | import stringutils 7 | import spotipy 8 | from spotipy.oauth2 import SpotifyClientCredentials 9 | 10 | client_credentials_manager = SpotifyClientCredentials() 11 | spotify = spotipy.Spotify(client_credentials_manager=client_credentials_manager) 12 | 13 | 14 | class Searcher: 15 | 16 | def __init__(self, exact=False): 17 | self.items = collections.defaultdict(list) 18 | self.names = [] 19 | self.exact = exact 20 | 21 | def add(self, name, o): 22 | # print 'adding', name, o 23 | s = de_norm(name) 24 | if o not in self.items[s]: 25 | self.items[s].append(o) 26 | bisect.insort_left(self.names, s) 27 | 28 | def search(self, s, force_exact=False): 29 | exact = self.exact or force_exact 30 | 31 | org_name = s 32 | s = de_norm(s) 33 | p = bisect.bisect_left(self.names, de_norm(s)) 34 | 35 | matches = [] 36 | for i in xrange(p, len(self.names)): 37 | if exact and self.names[i] == s: 38 | matches.append( (len(self.names[i]) - len(s), self.names[i]) ) 39 | elif not exact and self.names[i].startswith(s): 40 | matches.append( (len(self.names[i]) - len(s), self.names[i]) ) 41 | else: 42 | break 43 | 44 | matches.sort() 45 | results = [] 46 | 47 | # TODO: don't add dups 48 | for l, name in matches: 49 | for o in self.items[name]: 50 | results.append(o) 51 | if len(results) == 0: 52 | print 'ssearch', s 53 | sresults = spotify.search(q=s, type='artist') 54 | for item in sresults['artists']['items']: 55 | aid = item['id'] 56 | print ' ss', item['name'] 57 | self.add(org_name, aid) 58 | self.add(item['name'], aid) 59 | results.append(aid) 60 | return results 61 | 62 | 63 | def de_norm(name, space=''): 64 | ''' Dan Ellis normalization 65 | ''' 66 | s = name 67 | s = s.replace("'", "") 68 | s = s.replace(".", "") 69 | s = strip_accents(s) 70 | s = s.lower() 71 | s = re.sub(r'&', ' and ', s) 72 | s = re.sub(r'^the ', '', s) 73 | s = re.sub(r'[\W+]', '_', s) 74 | s = re.sub(r'_+', '_', s) 75 | s = s.strip('_') 76 | s = s.replace('_', space) 77 | 78 | # if we've normalized away everything 79 | # keep it. 80 | if len(s) == 0: 81 | s = name 82 | return s 83 | 84 | def de_equals(n1, n2): 85 | return n1 == n2 or de_norm(n1) == de_norm(n2) 86 | 87 | def de_match(n1, n2): 88 | if de_equals(n1, n2): 89 | return True 90 | else: 91 | dn1 = de_norm(n1) 92 | dn2 = de_norm(n2) 93 | return dn1.find(dn2) >= 0 or dn2.find(dn1) >= 0 94 | 95 | def strip_accents(s): 96 | return stringutils.unaccent(s) 97 | 98 | def test_norm(s): 99 | print s, de_norm(s) 100 | 101 | def norm_test(): 102 | test_norm("N'sync") 103 | test_norm("D'Angelo") 104 | test_norm("R. Kelly") 105 | test_norm("P.J. Harvey") 106 | test_norm("Beyoncé") 107 | test_norm("The Bangles") 108 | test_norm("Run-D.M.C.") 109 | test_norm("The Presidents of the United States of America") 110 | test_norm("Emerson Lake & Palmer") 111 | test_norm("Emerson, Lake & Palmer") 112 | test_norm("Emerson, Lake and Palmer") 113 | test_norm("Emerson Lake and Palmer") 114 | 115 | 116 | if __name__ == '__main__': 117 | norm_test() 118 | -------------------------------------------------------------------------------- /new_crawler/shell.py: -------------------------------------------------------------------------------- 1 | import cmd 2 | import artist_graph 3 | import simplejson as json 4 | import time 5 | 6 | class ArtistGraphShell(cmd.Cmd): 7 | prompt = "ag% " 8 | ag = artist_graph.ArtistGraph() 9 | raw = False 10 | skips = set() 11 | 12 | def do_test(self, line): 13 | print self.my_redis 14 | print 'hello world' 15 | 16 | def do_EOF(self, line): 17 | return True 18 | 19 | def do_toggle_raw(self, line): 20 | self.raw = not self.raw 21 | return True 22 | 23 | def do_skip(self, line): 24 | if len(line) == 0: 25 | for s in self.skips: 26 | print self.an(s), 27 | print 28 | elif line == 'clear': 29 | self.skips = set() 30 | else: 31 | artists = line.split(",") 32 | for artist in artists: 33 | aid = self.ag.search(line) 34 | if aid: 35 | self.skips.add(aid) 36 | 37 | def an(self, aid): 38 | return self.ag.get_artist(aid)['name'] 39 | 40 | def do_path(self, line): 41 | artists = line.split(",") 42 | if len(artists) == 2: 43 | results = self.ag.path(artists[0].strip(), artists[1].strip(), self.skips) 44 | if self.raw: 45 | print json.dumps(results, indent=4) 46 | if results['status'] == 'ok': 47 | dump_path(results['path']) 48 | print "time:", results['pdelta'], results['fdelta'] 49 | else: 50 | print results['status'] 51 | print results['reason'] 52 | else: 53 | print "usage: path artist1, artist2" 54 | 55 | def do_edge_check(self, line): 56 | aid = self.ag.search(line) 57 | if aid: 58 | self.ag.edge_check(aid) 59 | 60 | def do_sim_check(self, line): 61 | aid = self.ag.search(line) 62 | if aid: 63 | self.ag.sim_check(aid) 64 | 65 | 66 | def dump_path(path): 67 | for i, artist in enumerate(path): 68 | print "%2d %2d %s %s" % (i, artist['popularity'], artist['id'], artist['name']) 69 | 70 | if __name__ == '__main__': 71 | ArtistGraphShell().cmdloop() 72 | -------------------------------------------------------------------------------- /new_crawler/sim_crawl.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import json 3 | import spotipy 4 | import db 5 | from spotipy.oauth2 import SpotifyClientCredentials 6 | 7 | max_artists = 1000000 8 | superseeds = 50000 9 | client_credentials_manager = SpotifyClientCredentials() 10 | spotify = spotipy.Spotify(client_credentials_manager=client_credentials_manager) 11 | 12 | known_artists = set() 13 | expanded_artists = set() 14 | queue = [] 15 | 16 | ta = 'spotifh:artist:7dGJo4pcD2V6oG8kP0tJRR' # troublesome artist 17 | def check_ta(where, artist): 18 | if artist['uri'] == ta: 19 | print 'TA', where 20 | 21 | def check_ta_uri(where, uri): 22 | if uri == ta: 23 | print 'TAU', where 24 | 25 | def queue_append(artist): 26 | check_ta('queue_append', artist) 27 | queue.append( (artist['followers']['total'], artist['uri'], artist['name'])) 28 | 29 | def queue_sort(): 30 | queue.sort(reverse=True) 31 | 32 | def process_queue(nodefile, edgefile): 33 | edge_count = 0 34 | 35 | queue_sort() 36 | while queue and len(known_artists) < max_artists: 37 | followers, uri, artist_name = queue.pop(0) 38 | print len(queue), followers, uri, artist_name 39 | if uri in expanded_artists: 40 | print " done" 41 | check_ta_uri('already expanded', uri) 42 | continue 43 | 44 | expanded_artists.add(uri) 45 | results = spotify.artist_related_artists(uri) 46 | if not results['artists']: 47 | print "NO SIMS FOR", artist_name 48 | check_ta_uri('goit sims', uri) 49 | for sim_artist in results['artists']: 50 | print " %s => %s" % (artist_name, sim_artist['name']) 51 | 52 | sim_uris = [] 53 | for sim_artist in results['artists']: 54 | edge_count += 1 55 | sim_uri = sim_artist['uri'] 56 | if sim_uri not in known_artists: 57 | known_artists.add(sim_uri) 58 | print "%5d/%-7d %7d %s %3d %7d %s" % (len(known_artists), len(queue), edge_count, sim_uri, 59 | sim_artist['popularity'], sim_artist['followers']['total'], sim_artist['name']) 60 | queue_append(sim_artist) 61 | print >> nodefile, json.dumps(sim_artist) 62 | sim_uris.append(sim_artist['uri']) 63 | queue_sort() 64 | 65 | check_ta_uri('appended sims', uri) 66 | dict = { uri: sim_uris } 67 | print >> edgefile, json.dumps(dict) 68 | 69 | # print " %s %s => %s %s" % (artist['uri'], artist['name'], sim_artist['uri'], sim_artist['name']) 70 | 71 | def load_external_artist_list(top, nodefile, dbpath=None): 72 | if dbpath: 73 | db.load_db(dbpath) 74 | for i, line in enumerate(open('top_artists.txt')): 75 | if i < top: 76 | fields = line.strip().split() 77 | uri = fields[0] 78 | count = int(fields[1]) 79 | name = ' '.join(fields[2:]) 80 | 81 | if uri not in known_artists: 82 | print "NEW", i, uri, count, name 83 | artist = None 84 | if dbpath: 85 | artist = db.get_artist(uri) 86 | if not artist: 87 | artist = spotify.artist(uri) 88 | else: 89 | print " cache hit for", name 90 | known_artists.add(uri) 91 | queue_append(artist) 92 | print >> nodefile, json.dumps(artist) 93 | else: 94 | break 95 | 96 | if __name__ == '__main__': 97 | 98 | seeds = [ 99 | 'spotify:artist:3hE8S8ohRErocpkY7uJW4a', # within temptation 100 | 'spotify:artist:0kbYTNQb4Pb1rPbbaF0pT4', # miles davis 101 | 'spotify:artist:3WrFJ7ztbogyGnTHbHJFl2', # the beatles 102 | 'spotify:artist:6eUKZXaKkcviH0Ku9w2n3V', # ed sheeran 103 | 'spotify:artist:36QJpDe2go2KgaRleHCDTp', # led zeppelin 104 | ] 105 | 106 | args = sys.argv[1:] 107 | prefix = "./" 108 | 109 | while args: 110 | arg = args.pop(0) 111 | 112 | if arg == '--path': 113 | prefix = args.pop(0) 114 | 115 | elif arg == '--load': 116 | db.load_db(prefix) 117 | 118 | artists = db.get_all_artists() 119 | 120 | for uri, artist in artists.items(): 121 | known_artists.add(uri) 122 | 123 | edges = db.get_all_edges() 124 | 125 | for source, targets in edges.items(): 126 | check_ta_uri('load expanded', source) 127 | expanded_artists.add(source) 128 | 129 | for uri, artist in artists.items(): 130 | if uri not in expanded_artists: 131 | queue_append(artist) 132 | 133 | for source, targets in edges.items(): 134 | for target in targets: 135 | if target not in expanded_artists: 136 | artist = db.get_artist(target) 137 | if artist: 138 | queue_append(artist) 139 | else: 140 | print "trouble on restart, unknown artist", artist 141 | 142 | nodefile = open(prefix + '/nodes.js', 'a') 143 | edgefile = open(prefix + '/edges.js', 'a') 144 | #load_external_artist_list(superseeds, nodefile) 145 | queue_sort() 146 | process_queue(nodefile, edgefile) 147 | 148 | elif arg == '--fresh': 149 | nodefile = open(prefix + '/nodes.js', 'w') 150 | edgefile = open(prefix + '/edges.js', 'w') 151 | for seed in seeds: 152 | artist = spotify.artist(seed) 153 | known_artists.add(seed) 154 | queue_append(artist) 155 | print >> nodefile, json.dumps(artist) 156 | 157 | process_queue(nodefile, edgefile) 158 | 159 | elif arg == '--superseeds': 160 | seed_count = 100 161 | if args: 162 | seed_count = int(args.pop(0)) 163 | nodefile = open(prefix + '/nodes.js', 'w') 164 | edgefile = open(prefix + '/edges.js', 'w') 165 | load_external_artist_list(seed_count, nodefile, "g2") 166 | process_queue(nodefile, edgefile) 167 | 168 | -------------------------------------------------------------------------------- /new_crawler/spotipy_util.py: -------------------------------------------------------------------------------- 1 | 2 | # shows a user's playlists (need to be authenticated via oauth) 3 | 4 | from __future__ import print_function 5 | import os 6 | from spotipy import oauth2 7 | import spotipy 8 | 9 | def prompt_for_user_token(username, scope=None, client_id = None, 10 | client_secret = None, redirect_uri = None, cache_path = None, use_web_browser=True): 11 | ''' prompts the user to login if necessary and returns 12 | the user token suitable for use with the spotipy.Spotify 13 | constructor 14 | 15 | Parameters: 16 | 17 | - username - the Spotify username 18 | - scope - the desired scope of the request 19 | - client_id - the client id of your app 20 | - client_secret - the client secret of your app 21 | - redirect_uri - the redirect URI of your app 22 | - cache_path - path to location to save tokens 23 | 24 | ''' 25 | 26 | if not client_id: 27 | client_id = os.getenv('SPOTIPY_CLIENT_ID') 28 | 29 | if not client_secret: 30 | client_secret = os.getenv('SPOTIPY_CLIENT_SECRET') 31 | 32 | if not redirect_uri: 33 | redirect_uri = os.getenv('SPOTIPY_REDIRECT_URI') 34 | 35 | if not client_id: 36 | print(''' 37 | You need to set your Spotify API credentials. You can do this by 38 | setting environment variables like so: 39 | 40 | export SPOTIPY_CLIENT_ID='your-spotify-client-id' 41 | export SPOTIPY_CLIENT_SECRET='your-spotify-client-secret' 42 | export SPOTIPY_REDIRECT_URI='your-app-redirect-url' 43 | 44 | Get your credentials at 45 | https://developer.spotify.com/my-applications 46 | ''') 47 | raise spotipy.SpotifyException(550, -1, 'no credentials set') 48 | 49 | cache_path = cache_path or ".cache-" + username 50 | sp_oauth = oauth2.SpotifyOAuth(client_id, client_secret, redirect_uri, 51 | scope=scope, cache_path=cache_path) 52 | 53 | # try to get a valid token for this user, from the cache, 54 | # if not in the cache, the create a new (this will send 55 | # the user to a web page where they can authorize this app) 56 | 57 | token_info = sp_oauth.get_cached_token() 58 | 59 | if not token_info: 60 | print(''' 61 | 62 | User authentication requires interaction with your 63 | web browser. Once you enter your credentials and 64 | give authorization, you will be redirected to 65 | a url. Paste that url you were directed to to 66 | complete the authorization. 67 | 68 | ''') 69 | auth_url = sp_oauth.get_authorize_url() 70 | try: 71 | if use_web_browser: 72 | import webbrowser 73 | webbrowser.open(auth_url) 74 | print("Opened %s in your browser" % auth_url) 75 | else: 76 | print("Please navigate here: %s" % auth_url) 77 | except: 78 | print("Please navigate here: %s" % auth_url) 79 | 80 | print() 81 | print() 82 | try: 83 | response = raw_input("Enter the URL you were redirected to: ") 84 | except NameError: 85 | response = input("Enter the URL you were redirected to: ") 86 | 87 | print() 88 | print() 89 | 90 | code = sp_oauth.parse_response_code(response) 91 | token_info = sp_oauth.get_access_token(code) 92 | # Auth'ed API request 93 | if token_info: 94 | return token_info['access_token'] 95 | else: 96 | return None 97 | -------------------------------------------------------------------------------- /new_crawler/start_simple_server: -------------------------------------------------------------------------------- 1 | source credentials.sh 2 | export PBL_CACHE=REDIS 3 | # python flask_server.py 4 | #gunicorn flask_server:app -b localhost:8000 5 | until python flask_server.py; do 6 | echo "flask server crashed with exit code $?. Respawning.." >&2 7 | sleep 1 8 | done 9 | -------------------------------------------------------------------------------- /new_crawler/stringutils.py: -------------------------------------------------------------------------------- 1 | import datetime 2 | import hashlib 3 | import htmlentitydefs 4 | import logging 5 | import random 6 | import re 7 | import string 8 | import sys 9 | import time 10 | from types import BooleanType, FloatType, IntType, ListType, LongType, StringType, UnicodeType 11 | import unicodedata 12 | import uuid 13 | import urllib 14 | 15 | from chartable import simplerchars 16 | 17 | COMMENT = re.compile(r'') 18 | 19 | # doesn't handle higher range control chars - will interfere with unicode chars: 20 | # + range(127,160) 21 | control_chars = ''.join(map(unichr, range(0,32))) 22 | CONTROL_CHARS = re.compile('[%s]' % re.escape(control_chars)) 23 | 24 | logger = logging.getLogger(__name__) 25 | 26 | class NestOpener(urllib.FancyURLopener): 27 | version = "nestReader/0.2 (discovery; http://the.echonest.com/reader.html; reader at echonest.com)" 28 | 29 | def random_alphanumeric(length): 30 | chars = string.letters + string.digits 31 | return ''.join(random.choice(chars) for i in xrange(length)) 32 | 33 | def randomString(length=10): 34 | return ''.join(random.choice(string.letters) for x in xrange(length)) 35 | 36 | def randomType(): 37 | return random.choice(["artist","track","release","doc"]) 38 | 39 | def randomInt(max=1000): 40 | return random.randint(0,max) 41 | 42 | def randomFloat(): 43 | return random.random() 44 | 45 | def randomUUID(): 46 | return str(uuid.uuid4()) 47 | 48 | def randomDocument(): 49 | return {"name":randomString(15), "enid":randomUUID(), "type":randomType(), "grackleCount":randomInt(), "hotttnesss":randomFloat()} 50 | 51 | def bandNameNormalize(name): 52 | # Does name normalization for myspace etc name matching 53 | out = name.lower() 54 | out = re.sub(r' group', '', out) 55 | out = re.sub(r' band', '', out) 56 | out = re.sub(r' and ', ' ', out) 57 | out = re.sub(r'$.*?$', '', out) 58 | out = re.sub(r'\[.*?\]', '', out) 59 | out = re.sub(r'[\-\,\.\&\$\%\!\@\#\*\:\"\'\?\;]',' ', out) 60 | out = re.sub(r'\ {2,}', ' ', out) 61 | out = re.sub(r'^ ', '', out) 62 | out = re.sub(r' $', '', out) 63 | out = re.sub(r'^the', '', out) 64 | out = re.sub(r'^ ', '', out) 65 | out = re.sub(r' $', '', out) 66 | out = out[:25] 67 | out = unaccent(out, erase_unrecognized=False) 68 | return out 69 | 70 | def uncomma(s, dumb=False): 71 | if type(s) not in (StringType, UnicodeType): 72 | raise ValueError, "Argument must be a string." 73 | if ', ' not in s: 74 | return s 75 | 76 | if dumb: 77 | return re.sub('(.*?), ([^ ]*)', r'\2 \1', s) 78 | 79 | a, b = s.split(', ', 1) 80 | suffix = '' 81 | 82 | for amp in [' & ', ' And ', ' and ', ' AND ', ' / ', ' + ']: 83 | if amp in b: 84 | commable, suffix = b.split(amp, 1) 85 | suffix = amp + suffix 86 | return "%s %s%s" % (commable, a, suffix) 87 | return "%s %s" % (b, a) 88 | 89 | def str2bool(s): 90 | if(isinstance(s,bool)): 91 | return s 92 | if s in ['Y', 'y']: 93 | return True 94 | if s in ['N', 'n']: 95 | return False 96 | if s in ['True', 'true']: 97 | return True 98 | elif s in ['False', 'false']: 99 | return False 100 | else: 101 | raise ValueError, "Bool-looking string required." 102 | 103 | def delist(item): 104 | if item == []: 105 | return '' 106 | if type(item) is ListType: 107 | return item[0] 108 | return item 109 | 110 | def summarize_string(s, length=50): 111 | ss = str(s) 112 | if len(ss) > length: 113 | return '%s: %s ... [%s more chars]' % (type(s), ss[:length], len(ss) - length) 114 | else: 115 | return s 116 | 117 | def reallyunicode(s, encoding="utf-8"): 118 | """ 119 | Try the user's encoding first, then others in order; break the loop as 120 | soon as we find an encoding we can read it with. If we get to ascii, 121 | include the "replace" argument so it can't fail (it'll just turn 122 | everything fishy into question marks). 123 | 124 | Usually this will just try utf-8 twice, because we will rarely if ever 125 | specify an encoding. But we could! 126 | """ 127 | if type(s) is StringType: 128 | for args in ((encoding,), ('utf-8',), ('latin-1',), ('ascii', 'replace')): 129 | try: 130 | s = s.decode(*args) 131 | break 132 | except UnicodeDecodeError: 133 | continue 134 | if type(s) is not UnicodeType: 135 | raise ValueError, "%s is not a string at all." % s 136 | return s 137 | 138 | def reallyUTF8(s): 139 | return reallyunicode(s).encode("utf-8") 140 | 141 | def unfancy(s): 142 | "Removes smartquotes, smartellipses, and nbsps. Always returns Unicode." 143 | simplerpunc = {145: "'", 146: "'", 147: '"', 148: '"', 133: '...', 160: ' ', 173: '-', 144 | 8211: "-", 8212: "--", 8216:"'", 8217: "'", 8220:'"', 8221:'"', 8222: '"', 8230: '...'} 145 | ret = "".join([simplerpunc.get(ord(char), char) for char in reallyunicode(s)]) 146 | return ret 147 | 148 | def unaccent(s, erase_unrecognized=True): 149 | """Removes umlauts and accents, etc. Unless erase_unrecognized=False, 150 | any characters that don't have an ASCII simplified form are 151 | removed entirely.""" 152 | ## The dict "simplerchars" is in another file because it's 153 | ## so huge. See import statement at top. 154 | if not isinstance(s,basestring): 155 | raise ValueError, "unaccent argument %s must be a string." % str(s) 156 | unistr = reallyunicode(s) 157 | ret = u'' 158 | for c in unistr: 159 | if c in simplerchars: 160 | ret += simplerchars[c] 161 | else: 162 | decomp = unicodedata.normalize('NFKD', c) 163 | basechar = decomp[0] 164 | ## These will all be unicode characters, so 165 | ## technically none of them are in string.printable. 166 | ## But "in" uses equality, not identity, and 167 | ## since u'a' == 'a' it will all work out. 168 | if not erase_unrecognized or basechar in string.printable: 169 | ret += basechar 170 | return ret 171 | 172 | def convertentity(m): 173 | """Convert a HTML entity into normal string (UTF-8)""" 174 | prefix, entity = m.groups() 175 | try: 176 | if prefix != '#': 177 | ## Look up name, change it to a unicode code point (integer). 178 | entity = htmlentitydefs.name2codepoint[entity] 179 | else: 180 | if entity.startswith('x'): 181 | entity = int(entity[1:], 16) 182 | else: 183 | entity = int(entity) 184 | except (KeyError, ValueError): 185 | ## Give back original unchanged. 186 | return "&%s%s;" % (prefix, entity) 187 | 188 | return unichr(int(entity)) 189 | 190 | def decode_htmlentities(string): 191 | """Uses converentity to convert a string containing 192 | HTML entitites in a string into normal strings (UTF-8.)""" 193 | entity_re = re.compile("&(#?)(\d{1,5}|\w{1,8});") 194 | return entity_re.subn(convertentity, reallyunicode(string))[0] 195 | 196 | def convertentities(s): 197 | """Convert a HTML quoted string into normal string (UTF-8). 198 | Works with &#XX; and with > etc.""" 199 | s = reallyunicode(s) 200 | rep = re.compile(r'&(#?)([a-zA-Z0-9]+?);') 201 | unquoted = rep.sub(convertentity,s) 202 | return unquoted 203 | 204 | def unquotehtml(s): 205 | unquoted = convertentities(s) 206 | return unfancy(unquoted).encode('utf-8') 207 | 208 | # YES I WANT TO CALL IT CHOMP 209 | def chomp(str): 210 | return str.rstrip('\r\n') 211 | 212 | def long_time(t=None): 213 | if t is None: 214 | t = datetime.datetime.now() 215 | st = datetime.datetime.isoformat(t) 216 | st = re.sub(r"[\-\:\.A-Za-z]","",st) 217 | st = st[0:17] 218 | return st 219 | 220 | def solr_time(when=None): 221 | "Returns solr-specific UTC time string (1995-12-31T23:59:59.999Z)." 222 | if when is None: 223 | when = time.gmtime() 224 | return time.strftime('%Y-%m-%dT%H:%M:%SZ', when) 225 | 226 | timeToLongTime = long_time 227 | timeToSolrTime = solr_time 228 | 229 | def readable_time(t=None): 230 | "Returns second-accuracy time string suitable for sorting or reading by humans, like '2009-04-24-12-45-04'." 231 | if t is None: 232 | t = time.localtime() 233 | return time.strftime('%Y-%m-%d-%H-%M-%S', t) 234 | 235 | def MD5(text): 236 | ## We will convert to UTF-8 if given Unicode, 237 | ## BUT if fed a bytestring we just checksum it, 238 | ## so don't go MD5ing strings in encodings 239 | ## incompatible with UTF-8 and expecting it 240 | ## to work out okay! 241 | if type(text) == UnicodeType: 242 | sys.stderr.write("md5 of Unicode requested - encoding to UTF-8.\n") 243 | sys.stderr.write("text (asciified for display): %s\n" % ascii(text)) 244 | text = text.encode('utf-8') 245 | m = hashlib.md5() 246 | m.update(text) 247 | return m.hexdigest() 248 | 249 | def parseEntererLine(line): 250 | line = line.rstrip("\r\n") 251 | s = line.split(' ### ') 252 | l = {} 253 | if(len(s) == 9): 254 | l["foreignID_AR"] = s[0] 255 | l["foreignID_RE"] = s[1] 256 | l["foreignID_TR"] = s[2] 257 | l["name_AR"] = s[3] 258 | l["name_RE"] = s[4] 259 | l["name_TR"] = s[5] 260 | l["type"] = s[6] 261 | l["tagname"] = s[7] 262 | l["tagvalue"] = s[8] 263 | else: 264 | logger.error("Can't parse line %s got %d out of it", line, len(s)) 265 | l = None 266 | return l 267 | 268 | def makeNiceLucene(text): 269 | #http://lucene.apache.org/java/docs/queryparsersyntax.html#Escaping%20Special%20Characters 270 | text = re.sub(r'\bAND\b', '\AND', text) 271 | text = re.sub(r'\bOR\b', '\OR', text) 272 | text = re.sub(r'\bNOT\b', '\NOT', text) 273 | return re.sub(r"([\+\-\&\|\!\{\}\[\]\;\^\"\~\*\?\:\\])",r"\\\1", text) 274 | 275 | def normalizeString(text): 276 | "Does ryan mckinley text normalization" 277 | digitMap = ["zero","one","two","three","four","five","six","seven","eight","nine"] 278 | LOWASCII = range(0,128) 279 | charList = (['E', chr(129) ,',', 'f', ',', '.', 't', '+', '^', '%', 'S', '<','D', ' ', 'Z', ' ', ' ', '\'', '\'', '\"','\"', '-','-', '-', '-', ' ', 's', '>', 'c', ' ', 'z', 'Y',' ', '!', 'c', 'L', '.', 'Y', '|', 'S', '.', 'c',' ', '<', '-', '-', 'R', '-', 'o', '+', '.','3','\'', 'u', 'P','.',',', '1', 'o', '>', '.', '.', ' ', '?', 'A', 'A','A', 'A', 'A', 'A', 'A', 'C','E', 'E', 'E', 'E', 'I','I', 'I', 'I', 'D', 'N', 'O', 'O','O','O', 'O', 'x','O','U','U', 'U', 'U', 'Y','P','B', 'a', 'a','a','a', 'a', 'a', 'A', 'c', 'e', 'e','e','e','i','i','i', 'i','o', 'n', 'o', 'o', 'o', 'o', 'o', '/','o', 'u','u', 'u', 'u', 'y', 'p', 'y']) 280 | for c in charList: 281 | LOWASCII.append(ord(c)) 282 | 283 | # Strip whitespace at ends 284 | r = text.strip().lower() 285 | r = re.sub(r"[\/\,\:\.\&\<\>\:\;\-\_\+]"," ",r) 286 | words = text.split() 287 | newphrase = [] 288 | for w in words: 289 | if(w=="and" or w=="the" or w=="of" or w=="und"): 290 | pass 291 | else: 292 | newword = "" 293 | for c in w: 294 | if(ord(c)>1 and ord(c)<255): 295 | c = chr(LOWASCII[ord(c)]).lower() 296 | if(ord(c)>=ord('a') and ord(c) <= ord('z')): 297 | newword = newword + c 298 | if(ord(c)>=ord('0') and ord(c) <= ord('9')): 299 | if(len(newword)>0): 300 | newphrase.append(newword) 301 | newphrase.append(digitMap[ord(c)-ord('0')]) 302 | 303 | if(len(newword)>0): 304 | newphrase.append(newword) 305 | 306 | normalized = " ".join(newphrase) 307 | return normalized 308 | 309 | def undo_wtf8(s): 310 | try: 311 | if type(s) == str: 312 | s2 = s.decode('utf-8') 313 | else: 314 | s2 = s 315 | s3 = s2.encode('raw-unicode-escape') 316 | s4 = s3.decode('utf-8') 317 | return s4 318 | except (UnicodeEncodeError, UnicodeDecodeError): 319 | return s 320 | 321 | def undo_windows_wtf8(s): 322 | try: 323 | if type(s) == str: 324 | s2 = s.decode('utf-8') 325 | else: 326 | s2 = s 327 | s3 = s2.encode('windows-1252') 328 | s4 = s3.decode('utf-8') 329 | return s4 330 | except (UnicodeEncodeError, UnicodeDecodeError): 331 | return s 332 | 333 | def utf8(text): 334 | return unicode(text, "utf8", errors="replace") 335 | 336 | def ascii(text, errors='ignore'): 337 | if type(text) not in (StringType, UnicodeType): 338 | raise ValueError, "Argument %s must be string or Unicode!" % str(text) 339 | text = unaccent(text, erase_unrecognized=True) 340 | ## unaccent returns Unicode-- should be ascii safe, but 341 | ## in case it's not... 342 | return text.encode('ascii', errors) 343 | 344 | def cleanup(text): 345 | ltRem = text.replace("\r","").replace("\n","") 346 | ltRem = re.sub(r" {2,}"," ",ltRem) 347 | ltRem = re.sub(r"\<.{1,20}\>","",ltRem) 348 | return ltRem 349 | 350 | def striphtml(text): 351 | return re.sub('<.*?>', '', text) 352 | 353 | def clean(html): 354 | """strip html and unquotehtml""" 355 | for tag in ['
', '
', '

']: 356 | html = html.replace(tag, ' ') 357 | html = COMMENT.sub('', html) 358 | return unquotehtml(htmlstripper.stripHTML(html,'UTF-8')) 359 | 360 | def link(url, timeout=5, version=False): 361 | """save URL link to temp file, return html 362 | if it fails, retry after timeout=5 seconds; use input ua""" 363 | if version: 364 | NestOpener.version = version 365 | myOpener = NestOpener() 366 | try: 367 | page = myOpener.open(url) 368 | except (IOError, AttributeError): 369 | time.sleep(timeout) 370 | try: 371 | page = myOpener.open(url) 372 | except (IOError, AttributeError): 373 | return False 374 | try: 375 | html = page.read() 376 | except Exception: 377 | time.sleep(timeout) 378 | try: 379 | html = page.read() 380 | except Exception: 381 | logger.exception('SCRAPPY: After waiting page could not be read.') 382 | return "" 383 | return reallyunicode(html) 384 | 385 | 386 | 387 | def istyperight(doc): 388 | for (field, val) in doc.items(): 389 | ## Some docs in sands have None in them (how?) so 390 | ## we need to clean them to re-add them. 391 | if val is None: 392 | doc.pop(field) 393 | if isinstance(val, list): 394 | while None in val: 395 | val.remove(None) 396 | 397 | if not is_right_type(field, val): 398 | logger.error("field %s and value %s were not right type", field, val) 399 | if isinstance(val, list): 400 | logger.error("BAD TYPE; DID NOT ADD DOCUMENT. Field '%s' had value %s, which is %s.", field, val, set(type(x) for x in val)) 401 | else: 402 | logger.error("BAD TYPE; DID NOT ADD DOCUMENT. Field '%s' had value %s, which is %s.", field, val, type(val)) 403 | return False 404 | 405 | return True 406 | 407 | def is_right_type(fieldname, value): 408 | OurDateType = type(datetime.datetime.today()) 409 | if fieldname in ['thingID', 'url', 'id'] or fieldname.startswith('_'): 410 | return type(value) in (StringType, UnicodeType) 411 | if fieldname in ['indexed', 'modified']: 412 | return type(value) == OurDateType 413 | if fieldname in ['score']: 414 | return type(value) == FloatType 415 | 416 | righttypes = {'i_': (IntType,) , 417 | 'f_': (FloatType,IntType) , 418 | 's_': (StringType, UnicodeType) , 419 | 'v_': (StringType, UnicodeType) , 420 | 't_': (StringType, UnicodeType) , 421 | 'n_': (StringType, UnicodeType) , 422 | 'b_': (BooleanType,) , 423 | 'd_': (OurDateType,) , 424 | 'l_': (IntType, LongType) } 425 | rightfuncs = {'i_': int, 426 | 'f_': float, 427 | 'l_': long, 428 | 'b_': str2bool} 429 | 430 | prefix = fieldname[:2] 431 | if prefix not in righttypes: 432 | raise ValueError("Field called %s has an invalid prefix for sands." % fieldname, 'uknown doc ID') 433 | if type(value) is ListType: 434 | return bool(False not in [is_right_type(fieldname, x) for x in value]) 435 | if type(value) in (StringType, UnicodeType) and prefix in rightfuncs: 436 | try: 437 | rightfuncs[prefix](value) 438 | ## What the HELL people 439 | if prefix == 'f_' and str(float(value)) in ['inf', '-inf', 'nan']: 440 | return False 441 | sys.stderr.write("Warning: string '%s' being added with prefix '%s'.\n" % (value, prefix)) 442 | return True 443 | except ValueError: 444 | return False 445 | ## Finally, if the prefix is valid and it's not an array 446 | ## AND it's not a string that Solr will numericalize, 447 | ## then answer the obvious way: is it the right type? 448 | return type(value) in righttypes[prefix] 449 | 450 | def remove_control_chars(s): 451 | return CONTROL_CHARS.sub('', s) 452 | 453 | def is_valid_unicode_xml_char(character): 454 | ''' 455 | test whether a unicode character can exist in an xml document, 456 | according to the characters specified in: 457 | 458 | Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */ 459 | 460 | as defined in: 461 | http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char 462 | 463 | raises TypeError `character` is not unicode 464 | ''' 465 | 466 | if not isinstance(character, unicode): 467 | raise TypeError('character must be unicode') 468 | 469 | # if len(character) != 1: 470 | # raise ValueError('character must be a single character: %s' % character) 471 | 472 | if character in (u'\u0009', u'\u000A', u'\u000D'): 473 | return True 474 | 475 | if character < u'\u0020': 476 | return False 477 | 478 | if character > u'\uD7FF' and character < u'\uE000': 479 | return False 480 | 481 | if character > u'\uFFFD' and character < u'\U00010000': 482 | return False 483 | 484 | if character > u'\U0010FFFF': 485 | return False 486 | 487 | return True 488 | 489 | def truncate_words(s, num, end_text='...'): 490 | """Truncates a string after a certain number of words. Takes an optional 491 | argument of what should be used to notify that the string has been 492 | truncated, defaults to ellipsis (...)""" 493 | s = reallyunicode(s) 494 | length = int(num) 495 | words = s.split() 496 | if len(words) > length: 497 | words = words[:length] 498 | if not words[-1].endswith(end_text): 499 | words.append(end_text) 500 | return u' '.join(words) 501 | -------------------------------------------------------------------------------- /server/deploy: -------------------------------------------------------------------------------- 1 | deploy_labs "graph.py full_spotify.dat server.py spotify_songs.dat web.conf" BoilTheFrog 2 | -------------------------------------------------------------------------------- /server/graph.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import time 3 | import networkx as nx 4 | import search 5 | import random 6 | import os 7 | import pprint 8 | import simplejson as json 9 | 10 | RS = ' ' 11 | artists ={} 12 | aid_to_sid = {} 13 | max_edges_per_node = 4 14 | min_hotttnesss = .5 15 | 16 | G = nx.Graph() 17 | 18 | searcher = search.Searcher() 19 | songs = {} 20 | skips = set() 21 | 22 | skip_artists_with_no_songs = True 23 | 24 | 25 | def stats(): 26 | print 'nodes', G.number_of_nodes() 27 | print 'edges', G.number_of_edges() 28 | cc = nx.connected_components(G) 29 | print 'components', len(cc) 30 | 31 | print ' ', 32 | for c in cc: 33 | print len(c), 34 | print 35 | #print 'diameter', nx.diameter(G) 36 | 37 | def get_artist(id): 38 | artist = artists[id] 39 | return artist 40 | 41 | 42 | def get_random_artist(): 43 | id = random.choice(G.nodes()) 44 | return artists[id] 45 | 46 | def add_artist(artist): 47 | id = artist['id'] 48 | if not id in artists: 49 | artists[id] = artist 50 | searcher.add(artist['name'], artist) 51 | G.add_node(id) 52 | if id in songs: 53 | artist['songs'] = songs[id] 54 | # print 'found', len(artist['songs']), 'songs for', artist['name'] 55 | 56 | def add_edge(sid, did, weight): 57 | G.add_edge(sid, did, weight=weight) 58 | 59 | 60 | def npair(n1, n2): 61 | return n1 + ' // ' + n2 62 | 63 | 64 | def load_skiplist(path): 65 | for line in open(path): 66 | fields = line.strip().split(RS) 67 | if len(fields) == 4: 68 | skips.add(npair(fields[0], fields[2])) 69 | skips.add(npair(fields[2], fields[0])) 70 | else: 71 | skips.add(fields[0]) 72 | 73 | 74 | def has_songs(id): 75 | return id in songs and len(songs[id]) > 0 76 | 77 | def skipped(n1, n2): 78 | if n1 in skips: 79 | return True 80 | if n2 in skips: 81 | return True 82 | 83 | if skip_artists_with_no_songs: 84 | if not has_songs(n1): 85 | return True 86 | 87 | if not has_songs(n2): 88 | return True 89 | 90 | skip = npair(n1, n2) in skips 91 | return skip 92 | 93 | 94 | def get_edge_weight(id1, id2): 95 | hot1 = artists[id1]['hot'] 96 | hot2 = artists[id2]['hot'] 97 | edge_weight = 1 + int(100 * (abs(hot1 - hot2))) 98 | # edge_weight = 1 + int(1000 * (abs(hot1 - hot2))) 99 | return edge_weight 100 | 101 | 102 | def add_ids(aid, sid): 103 | if aid in aid_to_sid: 104 | if sid != aid_to_sid[aid]: 105 | print 'mismatched ids', aid, sid 106 | sys.exit(-1) 107 | else: 108 | aid_to_sid[aid] = sid 109 | 110 | 111 | def load_graph(path): 112 | last_source = '' 113 | edge_count = 0 114 | for i, line in enumerate(open(path)): 115 | fields = line.strip().split(RS) 116 | if i % 100000 == 0: 117 | print i, fields[2] 118 | if fields[0] == 'artist': 119 | aid = fields[1] 120 | # sid = fields[3].split(':')[2] 121 | sid = fields[3] 122 | add_ids(aid, sid) 123 | hot = float(fields[4]) 124 | artist = { 'id' : sid, 'name' : fields[2], 'hot': hot } 125 | if has_songs(sid) and hot >= min_hotttnesss: 126 | add_artist(artist) 127 | elif fields[0] == 'sim' and len(fields) > 5: 128 | source_aid = fields[1] 129 | source_sid = aid_to_sid[source_aid] 130 | if source_sid <> last_source: 131 | last_source = source_sid 132 | edge_count = 0 133 | 134 | if edge_count < max_edges_per_node: 135 | dest_aid = fields[3] 136 | #dest_sid = fields[6].split(':')[2] 137 | dest_sid = fields[6] 138 | add_ids(dest_aid, dest_sid) 139 | 140 | if not skipped(source_sid, dest_sid) and source_sid in artists: 141 | source = artists[source_sid] 142 | shot = float(fields[5]) 143 | dest = { 'id' : dest_sid, 'name' : fields[4], 'hot': shot } 144 | if has_songs(dest['id']) and shot >= min_hotttnesss: 145 | add_artist(dest) 146 | edge_weight = get_edge_weight(source_sid, dest_sid) 147 | add_edge(source_sid, dest['id'], edge_weight) 148 | edge_count += 1 149 | 150 | def find_artist(name): 151 | results = searcher.search(name) 152 | if len(results) > 0: 153 | return results[0] 154 | return None 155 | 156 | def is_id(name_or_id): 157 | return len(name_or_id) == 18 and name_or_id.startswith('AR') 158 | 159 | def sim_artist(name_or_id): 160 | if is_id(name_or_id): 161 | a = artists[name_or_id] 162 | else: 163 | a = find_artist(name_or_id) 164 | if a: 165 | id = a['id'] 166 | return id, G[id].keys() 167 | return None, None 168 | 169 | 170 | def sims(artist): 171 | return [get_artist(id) for id in G[artist['id']]] 172 | 173 | def find_path(n1, n2, skip = []): 174 | start = time.time() 175 | path = None 176 | status = 'ok' 177 | 178 | a1 = find_artist(n1) 179 | a2 = find_artist(n2) 180 | 181 | if not a1: 182 | status = "Can't find " + n1 183 | if not a2: 184 | status = "Can't find " + n2 185 | 186 | if a1 and a2: 187 | if skip and len(skip) > 0: 188 | # graph = G.copy() 189 | graph = G 190 | else: 191 | graph = G 192 | 193 | remove_nodes(graph, skip) 194 | try: 195 | l, path = nx.bidirectional_dijkstra(graph, a1['id'], a2['id'], 'weight') 196 | 197 | except nx.NetworkXNoPath: 198 | status = 'No path found between ' + n1 + " and " + n2; 199 | 200 | restore_nodes(graph, skip) 201 | 202 | print 'find_path took %s seconds' % (time.time() - start,) 203 | return status, path 204 | 205 | def qfind(a1, a2): 206 | start = time.time() 207 | path = None 208 | 209 | if a1 and a2: 210 | graph = G 211 | try: 212 | l, path = nx.bidirectional_dijkstra(graph, a1['id'], a2['id'], 'weight') 213 | except nx.NetworkXNoPath: 214 | pass 215 | return path 216 | 217 | def remove_nodes(graph, nodes): 218 | if nodes: 219 | for n in nodes: 220 | for other, edge in graph[n].items(): 221 | edge['weight'] = 10000000 222 | 223 | def restore_nodes(graph, nodes): 224 | if nodes: 225 | for n in nodes: 226 | for other, edge in graph[n].items(): 227 | edge['weight'] = get_edge_weight(n, other) 228 | 229 | 230 | def sp(n1, n2, skip=[]): 231 | print '===', n1, 'to', n2, 'with', len(skip), 'skips', '===' 232 | iskip = [] 233 | for n in skip: 234 | artist = find_artist(n) 235 | if artist: 236 | iskip.append(artist['id']) 237 | 238 | status, path = find_path(n1, n2, iskip) 239 | 240 | if path: 241 | for a in path: 242 | print artists[a]['name'] 243 | pprint.pprint( artists[a]) 244 | print 245 | 246 | else: 247 | print status 248 | 249 | edges = set() 250 | 251 | def edge_exists(a1, a2): 252 | n1 = a1+ '--' + a2 253 | found = n1 in edges 254 | if not found: 255 | n2 = a2 + '--' + a1 256 | edges.add(n1) 257 | edges.add(n2) 258 | return found 259 | 260 | def gv(n1, n2, skip=[]): 261 | 262 | gv = open('graph.gv', 'w') 263 | print >>gv, "digraph {" 264 | iskip = [] 265 | for n in skip: 266 | artist = find_artist(n) 267 | if artist: 268 | iskip.append(artist['id']) 269 | 270 | status, path = find_path(n1, n2, iskip) 271 | 272 | extra = 4 273 | if path: 274 | last = None 275 | for a in path: 276 | if last: 277 | neighbors = list(G[last].keys()) 278 | neighbors.remove(a) 279 | #for n, attr in G[last].items()[:2]: 280 | for n in neighbors[0:extra]: 281 | if not edge_exists(last, n): 282 | print >>gv, q(last), '->', q(n) + ';' 283 | print >>gv, q(last), '->', q(a), '[color=red,style=bold];' 284 | edge_exists(last, a) 285 | for n in neighbors[extra: extra + 4]: 286 | if not edge_exists(last, n): 287 | print >>gv, q(last), '->', q(n) + ';' 288 | print >>gv, q(a), '[color=red,style=bold];' 289 | last = a 290 | else: 291 | print status 292 | print >>gv, "}" 293 | gv.close() 294 | 295 | def q(a): 296 | return '"' + artists[a]['name'] + '"' 297 | 298 | def init(): 299 | global songs 300 | 301 | #load_skiplist('skip_list.dat') 302 | songs = load_song_data('spotify_songs.dat') 303 | load_graph('full_spotify.dat') 304 | #load_graph('tiny_spotify.dat') 305 | stats() 306 | 307 | def load_song_data(path): 308 | hash = {} 309 | if os.path.exists(path): 310 | file = open(path) 311 | shash = file.read() 312 | hash = json.loads(shash) 313 | file.close() 314 | print 'loaded', len(hash), 'songs from', path 315 | return hash 316 | 317 | def test(): 318 | sp('Miley Cyrus', 'Miles Davis') 319 | sp('Miley Cyrus', 'Miles Davis', [ 'Beth Orton'] ) 320 | sp('Miley Cyrus', 'Miles Davis', [ 'Beth Orton', 'Miles Davis' ] ) 321 | sp('Miley Cyrus', 'Miles Davis') 322 | 323 | def test1(): 324 | sp('Miley Cyrus', 'Britney Spears') 325 | 326 | def test2(): 327 | #gv('Miley Cyrus', 'Miles Davis') 328 | #gv('Cannibal Corpse', 'Dora the Explorer') 329 | gv('Kenny G', 'Nile') 330 | 331 | if __name__ == '__main__': 332 | init() 333 | test1() 334 | #test2() 335 | -------------------------------------------------------------------------------- /server/sdeploy: -------------------------------------------------------------------------------- 1 | deploy_labs "graph.py server.py web.conf" BoilTheFrog 2 | -------------------------------------------------------------------------------- /server/server.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import cherrypy 4 | import ConfigParser 5 | import urllib2 6 | import simplejson as json 7 | import webtools 8 | import time 9 | 10 | 11 | import graph 12 | 13 | 14 | class ArtistGraphServer(object): 15 | def __init__(self, config): 16 | self.production_mode = config.getboolean('settings', 'production') 17 | graph.init() 18 | 19 | 20 | def find_path(self, start, end, skips=None, callback=None, _=''): 21 | cherrypy.response.headers["Access-Control-Allow-Origin"] = "*" 22 | if callback: 23 | cherrypy.response.headers['Content-Type']= 'text/javascript' 24 | else: 25 | cherrypy.response.headers['Content-Type']= 'application/json' 26 | 27 | start_time = time.time() 28 | skips = make_list(skips) 29 | status, path = graph.find_path(start, end, skips) 30 | results = {} 31 | results['status'] = status 32 | if path: 33 | results['path'] = [graph.get_artist(id) for id in path] 34 | results['time'] = time.time() - start_time 35 | return to_json(results, callback) 36 | find_path.exposed = True 37 | 38 | def similar(self, artist, callback=None, _=''): 39 | cherrypy.response.headers["Access-Control-Allow-Origin"] = "*" 40 | if callback: 41 | cherrypy.response.headers['Content-Type']= 'text/javascript' 42 | else: 43 | cherrypy.response.headers['Content-Type']= 'application/json' 44 | 45 | start_time = time.time() 46 | results = {} 47 | seed, sims = graph.sim_artist(artist) 48 | if sims: 49 | results['status'] = 'ok' 50 | results['seed'] = graph.get_artist(seed) 51 | results['sims'] = [graph.get_artist(id) for id in sims] 52 | else: 53 | results['status'] = 'error' 54 | results['time'] = time.time() - start_time 55 | return to_json(results, callback) 56 | similar.exposed = True 57 | 58 | def random(self, callback=None, _=''): 59 | cherrypy.response.headers["Access-Control-Allow-Origin"] = "*" 60 | if callback: 61 | cherrypy.response.headers['Content-Type']= 'text/javascript' 62 | else: 63 | cherrypy.response.headers['Content-Type']= 'application/json' 64 | 65 | results = graph.get_random_artist() 66 | return to_json(results, callback) 67 | random.exposed = True 68 | 69 | 70 | 71 | def make_list(item): 72 | if item and not isinstance(item, list): 73 | item = [ item ] 74 | return item 75 | 76 | def to_json(dict, callback=None): 77 | results = json.dumps(dict, sort_keys=True, indent = 4) 78 | if callback: 79 | results = callback + "(" + results + ")" 80 | return results 81 | 82 | if __name__ == '__main__': 83 | urllib2.install_opener(urllib2.build_opener()) 84 | conf_path = os.path.abspath('web.conf') 85 | print 'reading config from', conf_path 86 | cherrypy.config.update(conf_path) 87 | 88 | config = ConfigParser.ConfigParser() 89 | config.read(conf_path) 90 | production_mode = config.getboolean('settings', 'production') 91 | 92 | current_dir = os.path.dirname(os.path.abspath(__file__)) 93 | # Set up site-wide config first so we get a log if errors occur. 94 | 95 | if production_mode: 96 | print "Starting in production mode" 97 | cherrypy.config.update({'environment': 'production', 98 | 'log.error_file': 'simdemo.log', 99 | 'log.screen': True}) 100 | else: 101 | print "Starting in development mode" 102 | cherrypy.config.update({'noenvironment': 'production', 103 | 'log.error_file': 'site.log', 104 | 'log.screen': True}) 105 | 106 | conf = webtools.get_export_map_for_directory("static") 107 | cherrypy.quickstart(ArtistGraphServer(config), '/ArtistGraphServer', config=conf) 108 | 109 | -------------------------------------------------------------------------------- /server/web.conf: -------------------------------------------------------------------------------- 1 | [global] 2 | #server.socket_host = '127.0.0.1' 3 | server.socket_host = '0.0.0.0' 4 | #server.socket_host = 'localhost' 5 | server.socket_port = 8444 6 | server.thread_pool = 10 7 | 8 | [settings] 9 | production = True 10 | -------------------------------------------------------------------------------- /web/callback.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 121 | 141 | 142 | 143 |

144 |

145 |

146 |

Creating playlist...

147 |

148 | 157 | 161 |

162 |

163 | 164 | 165 | -------------------------------------------------------------------------------- /web/deploy: -------------------------------------------------------------------------------- 1 | s3cmd sync --acl-public * s3://static.echonest.com/BoilTheFrog/ 2 | -------------------------------------------------------------------------------- /web/dist/fonts/glyphicons-halflings-regular.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/dist/fonts/glyphicons-halflings-regular.eot -------------------------------------------------------------------------------- /web/dist/fonts/glyphicons-halflings-regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/dist/fonts/glyphicons-halflings-regular.ttf -------------------------------------------------------------------------------- /web/dist/fonts/glyphicons-halflings-regular.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/dist/fonts/glyphicons-halflings-regular.woff -------------------------------------------------------------------------------- /web/dr-deploy: -------------------------------------------------------------------------------- 1 | s3cmd sync --dry-run --acl-public * s3://static.echonest.com/BoilTheFrog/ 2 | -------------------------------------------------------------------------------- /web/images/frog.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/images/frog.jpg -------------------------------------------------------------------------------- /web/images/lz.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/images/lz.png -------------------------------------------------------------------------------- /web/images/missing-1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/images/missing-1.jpg -------------------------------------------------------------------------------- /web/images/missing.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/images/missing.png -------------------------------------------------------------------------------- /web/images/pause.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/images/pause.png -------------------------------------------------------------------------------- /web/images/play.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/images/play.png -------------------------------------------------------------------------------- /web/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Boil the Frog 5 | 6 | 7 | 8 | 9 | 10 | 11 | 25 | 26 | 37 | 38 | 39 | 40 | 41 | 55 | 56 | 57 |

58 |

59 |

Boil the Frog

60 |

Create a seamless playlist between any two artists

61 |

Woah! The Frog has escaped. Boil The Frog will be offline for a bit while we try to catch a new one.

62 |

But really, some APIs have changed and some servers have been moved, so we need to make some repairs. Back online 63 | soon!

64 | 82 |

83 |

84 | 85 |

86 |

87 |

Boil the Frog Gallery

88 | 89 |
90 |

91 |

92 | Some of the most popular artist paths made with Boil The Frog: 93 |

94 |

95 |

Miley Cyrus to Miles Davis 97 |
Justin Bieber to Jimi Hendrix 98 |
Patti Smith to the Smiths 99 |
Elvis to Elvis 100 |
The Carter Family to Rammstein 101 |
Kanye West to Taylor Swift 102 |
Cage the Elephant to John Cage 103 |
Ryan Adams to Bryan Adams 104 |
Lorde to God is an Astronaut 105 | 109 |

110 |

111 |

112 | 113 |

114 |

115 |

About Boil the Frog

116 | 117 |
118 |

119 | boiling frog 120 | Boil the Frog lets you create a playlist of songs that gradually takes you from one music style 121 | to another. It's like the proverbial frog in the pot of water. If you heat up the pot slowly enough, the 122 | frog will never notice that he's being made into a stew and jump out of the pot. With a Boil the 123 | frog playlist you can do the same, but with music. You can generate a playlist that will take the 124 | listener from one style of music to the other, without the listener ever noticing that they are being made 125 | into a stew. 126 |

127 |
128 | 129 |

How does it work?

130 |

131 | To create a Boil The Frog playlist, just type in the names of two artists and a playlist will be 132 | generated that takes you gradually, step by step, from the first artist to the second artist. You can 133 | click on any song to hear the song. Click on the first song to hear the whole playlist. If you don't 134 | like a particular artist you can route around that particular artist by clicking the 'bypass' button. 135 | The 'New Track' button will select a different track for an artist. 136 |

137 |

138 | Boil the Frog plays 30 second versions of your songs. When you find a playlist you like you can save it 139 | to Spotify to listen to the full-length versions. 140 |

141 | 142 |

Here are some examples

143 |

152 | 153 |

How does it really work?

154 | 155 |

156 | To create this app, The Echo Nest artist similarity info is 157 | used to build an artist similarity graph of about 158 | 100,000 of the most popular artists. Each artist in the graph is connected to it's most similar neighbors 159 | according to the Echo Nest artist similarity algorithm. 160 |

161 | image graph 162 |
163 |

164 | When a playlist between two artists is created, the graph is used to find the path between the two artists. 165 | The path isn't necessarily the shortest path through the graph. Instead, priority is given to paths that 166 | travel through artists of similar popularity. If you start and end with a popular artist, you are more 167 | likely to find a path that takes you though other popular artists, and if you start with a long-tail artist 168 | you will likely find a path through other long-tail artists. 169 |

170 | Once the path of artists is found, we need to select the best songs for the playlist. To do this, we pick 171 | a well-known song for each artist that minimizes the difference in energy between this song, the previous 172 | song and the next song. 173 |

174 | Once we have selected the best songs, we build a playlist using Spotify's nifty web api. 175 | 176 | 177 |

Who made this?

178 |

179 | This app was built by Paul Lamere. If you like this sort of 180 | thing you may be interested in my blog at Music Machinery. 181 |

182 |

183 |

184 | 185 |

186 |

187 |
188 |

Path created in 10 ms.

189 |

190 | 191 | 200 | 201 | 202 | 203 | 825 | 826 | 827 | 842 | 843 | 844 | 845 | -------------------------------------------------------------------------------- /web/lib/underscore-min.js: -------------------------------------------------------------------------------- 1 | (function(){var n=this,t=n._,r={},e=Array.prototype,u=Object.prototype,i=Function.prototype,a=e.push,o=e.slice,c=e.concat,l=u.toString,f=u.hasOwnProperty,s=e.forEach,p=e.map,v=e.reduce,h=e.reduceRight,g=e.filter,d=e.every,m=e.some,y=e.indexOf,b=e.lastIndexOf,x=Array.isArray,_=Object.keys,j=i.bind,w=function(n){return n instanceof w?n:this instanceof w?(this._wrapped=n,void 0):new w(n)};"undefined"!=typeof exports?("undefined"!=typeof module&&module.exports&&(exports=module.exports=w),exports._=w):n._=w,w.VERSION="1.4.3";var A=w.each=w.forEach=function(n,t,e){if(null!=n)if(s&&n.forEach===s)n.forEach(t,e);else if(n.length===+n.length){for(var u=0,i=n.length;i>u;u++)if(t.call(e,n[u],u,n)===r)return}else for(var a in n)if(w.has(n,a)&&t.call(e,n[a],a,n)===r)return};w.map=w.collect=function(n,t,r){var e=[];return null==n?e:p&&n.map===p?n.map(t,r):(A(n,function(n,u,i){e[e.length]=t.call(r,n,u,i)}),e)};var O="Reduce of empty array with no initial value";w.reduce=w.foldl=w.inject=function(n,t,r,e){var u=arguments.length>2;if(null==n&&(n=[]),v&&n.reduce===v)return e&&(t=w.bind(t,e)),u?n.reduce(t,r):n.reduce(t);if(A(n,function(n,i,a){u?r=t.call(e,r,n,i,a):(r=n,u=!0)}),!u)throw new TypeError(O);return r},w.reduceRight=w.foldr=function(n,t,r,e){var u=arguments.length>2;if(null==n&&(n=[]),h&&n.reduceRight===h)return e&&(t=w.bind(t,e)),u?n.reduceRight(t,r):n.reduceRight(t);var i=n.length;if(i!==+i){var a=w.keys(n);i=a.length}if(A(n,function(o,c,l){c=a?a[--i]:--i,u?r=t.call(e,r,n[c],c,l):(r=n[c],u=!0)}),!u)throw new TypeError(O);return r},w.find=w.detect=function(n,t,r){var e;return E(n,function(n,u,i){return t.call(r,n,u,i)?(e=n,!0):void 0}),e},w.filter=w.select=function(n,t,r){var e=[];return null==n?e:g&&n.filter===g?n.filter(t,r):(A(n,function(n,u,i){t.call(r,n,u,i)&&(e[e.length]=n)}),e)},w.reject=function(n,t,r){return w.filter(n,function(n,e,u){return!t.call(r,n,e,u)},r)},w.every=w.all=function(n,t,e){t||(t=w.identity);var u=!0;return null==n?u:d&&n.every===d?n.every(t,e):(A(n,function(n,i,a){return(u=u&&t.call(e,n,i,a))?void 0:r}),!!u)};var E=w.some=w.any=function(n,t,e){t||(t=w.identity);var u=!1;return null==n?u:m&&n.some===m?n.some(t,e):(A(n,function(n,i,a){return u||(u=t.call(e,n,i,a))?r:void 0}),!!u)};w.contains=w.include=function(n,t){return null==n?!1:y&&n.indexOf===y?-1!=n.indexOf(t):E(n,function(n){return n===t})},w.invoke=function(n,t){var r=o.call(arguments,2);return w.map(n,function(n){return(w.isFunction(t)?t:n[t]).apply(n,r)})},w.pluck=function(n,t){return w.map(n,function(n){return n[t]})},w.where=function(n,t){return w.isEmpty(t)?[]:w.filter(n,function(n){for(var r in t)if(t[r]!==n[r])return!1;return!0})},w.max=function(n,t,r){if(!t&&w.isArray(n)&&n[0]===+n[0]&&65535>n.length)return Math.max.apply(Math,n);if(!t&&w.isEmpty(n))return-1/0;var e={computed:-1/0,value:-1/0};return A(n,function(n,u,i){var a=t?t.call(r,n,u,i):n;a>=e.computed&&(e={value:n,computed:a})}),e.value},w.min=function(n,t,r){if(!t&&w.isArray(n)&&n[0]===+n[0]&&65535>n.length)return Math.min.apply(Math,n);if(!t&&w.isEmpty(n))return 1/0;var e={computed:1/0,value:1/0};return A(n,function(n,u,i){var a=t?t.call(r,n,u,i):n;e.computed>a&&(e={value:n,computed:a})}),e.value},w.shuffle=function(n){var t,r=0,e=[];return A(n,function(n){t=w.random(r++),e[r-1]=e[t],e[t]=n}),e};var F=function(n){return w.isFunction(n)?n:function(t){return t[n]}};w.sortBy=function(n,t,r){var e=F(t);return w.pluck(w.map(n,function(n,t,u){return{value:n,index:t,criteria:e.call(r,n,t,u)}}).sort(function(n,t){var r=n.criteria,e=t.criteria;if(r!==e){if(r>e||void 0===r)return 1;if(e>r||void 0===e)return-1}return n.indexi;){var o=i+a>>>1;u>r.call(e,n[o])?i=o+1:a=o}return i},w.toArray=function(n){return n?w.isArray(n)?o.call(n):n.length===+n.length?w.map(n,w.identity):w.values(n):[]},w.size=function(n){return null==n?0:n.length===+n.length?n.length:w.keys(n).length},w.first=w.head=w.take=function(n,t,r){return null==n?void 0:null==t||r?n[0]:o.call(n,0,t)},w.initial=function(n,t,r){return o.call(n,0,n.length-(null==t||r?1:t))},w.last=function(n,t,r){return null==n?void 0:null==t||r?n[n.length-1]:o.call(n,Math.max(n.length-t,0))},w.rest=w.tail=w.drop=function(n,t,r){return o.call(n,null==t||r?1:t)},w.compact=function(n){return w.filter(n,w.identity)};var R=function(n,t,r){return A(n,function(n){w.isArray(n)?t?a.apply(r,n):R(n,t,r):r.push(n)}),r};w.flatten=function(n,t){return R(n,t,[])},w.without=function(n){return w.difference(n,o.call(arguments,1))},w.uniq=w.unique=function(n,t,r,e){w.isFunction(t)&&(e=r,r=t,t=!1);var u=r?w.map(n,r,e):n,i=[],a=[];return A(u,function(r,e){(t?e&&a[a.length-1]===r:w.contains(a,r))||(a.push(r),i.push(n[e]))}),i},w.union=function(){return w.uniq(c.apply(e,arguments))},w.intersection=function(n){var t=o.call(arguments,1);return w.filter(w.uniq(n),function(n){return w.every(t,function(t){return w.indexOf(t,n)>=0})})},w.difference=function(n){var t=c.apply(e,o.call(arguments,1));return w.filter(n,function(n){return!w.contains(t,n)})},w.zip=function(){for(var n=o.call(arguments),t=w.max(w.pluck(n,"length")),r=Array(t),e=0;t>e;e++)r[e]=w.pluck(n,""+e);return r},w.object=function(n,t){if(null==n)return{};for(var r={},e=0,u=n.length;u>e;e++)t?r[n[e]]=t[e]:r[n[e][0]]=n[e][1];return r},w.indexOf=function(n,t,r){if(null==n)return-1;var e=0,u=n.length;if(r){if("number"!=typeof r)return e=w.sortedIndex(n,t),n[e]===t?e:-1;e=0>r?Math.max(0,u+r):r}if(y&&n.indexOf===y)return n.indexOf(t,r);for(;u>e;e++)if(n[e]===t)return e;return-1},w.lastIndexOf=function(n,t,r){if(null==n)return-1;var e=null!=r;if(b&&n.lastIndexOf===b)return e?n.lastIndexOf(t,r):n.lastIndexOf(t);for(var u=e?r:n.length;u--;)if(n[u]===t)return u;return-1},w.range=function(n,t,r){1>=arguments.length&&(t=n||0,n=0),r=arguments[2]||1;for(var e=Math.max(Math.ceil((t-n)/r),0),u=0,i=Array(e);e>u;)i[u++]=n,n+=r;return i};var I=function(){};w.bind=function(n,t){var r,e;if(n.bind===j&&j)return j.apply(n,o.call(arguments,1));if(!w.isFunction(n))throw new TypeError;return r=o.call(arguments,2),e=function(){if(!(this instanceof e))return n.apply(t,r.concat(o.call(arguments)));I.prototype=n.prototype;var u=new I;I.prototype=null;var i=n.apply(u,r.concat(o.call(arguments)));return Object(i)===i?i:u}},w.bindAll=function(n){var t=o.call(arguments,1);return 0==t.length&&(t=w.functions(n)),A(t,function(t){n[t]=w.bind(n[t],n)}),n},w.memoize=function(n,t){var r={};return t||(t=w.identity),function(){var e=t.apply(this,arguments);return w.has(r,e)?r[e]:r[e]=n.apply(this,arguments)}},w.delay=function(n,t){var r=o.call(arguments,2);return setTimeout(function(){return n.apply(null,r)},t)},w.defer=function(n){return w.delay.apply(w,[n,1].concat(o.call(arguments,1)))},w.throttle=function(n,t){var r,e,u,i,a=0,o=function(){a=new Date,u=null,i=n.apply(r,e)};return function(){var c=new Date,l=t-(c-a);return r=this,e=arguments,0>=l?(clearTimeout(u),u=null,a=c,i=n.apply(r,e)):u||(u=setTimeout(o,l)),i}},w.debounce=function(n,t,r){var e,u;return function(){var i=this,a=arguments,o=function(){e=null,r||(u=n.apply(i,a))},c=r&&!e;return clearTimeout(e),e=setTimeout(o,t),c&&(u=n.apply(i,a)),u}},w.once=function(n){var t,r=!1;return function(){return r?t:(r=!0,t=n.apply(this,arguments),n=null,t)}},w.wrap=function(n,t){return function(){var r=[n];return a.apply(r,arguments),t.apply(this,r)}},w.compose=function(){var n=arguments;return function(){for(var t=arguments,r=n.length-1;r>=0;r--)t=[n[r].apply(this,t)];return t[0]}},w.after=function(n,t){return 0>=n?t():function(){return 1>--n?t.apply(this,arguments):void 0}},w.keys=_||function(n){if(n!==Object(n))throw new TypeError("Invalid object");var t=[];for(var r in n)w.has(n,r)&&(t[t.length]=r);return t},w.values=function(n){var t=[];for(var r in n)w.has(n,r)&&t.push(n[r]);return t},w.pairs=function(n){var t=[];for(var r in n)w.has(n,r)&&t.push([r,n[r]]);return t},w.invert=function(n){var t={};for(var r in n)w.has(n,r)&&(t[n[r]]=r);return t},w.functions=w.methods=function(n){var t=[];for(var r in n)w.isFunction(n[r])&&t.push(r);return t.sort()},w.extend=function(n){return A(o.call(arguments,1),function(t){if(t)for(var r in t)n[r]=t[r]}),n},w.pick=function(n){var t={},r=c.apply(e,o.call(arguments,1));return A(r,function(r){r in n&&(t[r]=n[r])}),t},w.omit=function(n){var t={},r=c.apply(e,o.call(arguments,1));for(var u in n)w.contains(r,u)||(t[u]=n[u]);return t},w.defaults=function(n){return A(o.call(arguments,1),function(t){if(t)for(var r in t)null==n[r]&&(n[r]=t[r])}),n},w.clone=function(n){return w.isObject(n)?w.isArray(n)?n.slice():w.extend({},n):n},w.tap=function(n,t){return t(n),n};var S=function(n,t,r,e){if(n===t)return 0!==n||1/n==1/t;if(null==n||null==t)return n===t;n instanceof w&&(n=n._wrapped),t instanceof w&&(t=t._wrapped);var u=l.call(n);if(u!=l.call(t))return!1;switch(u){case"[object String]":return n==t+"";case"[object Number]":return n!=+n?t!=+t:0==n?1/n==1/t:n==+t;case"[object Date]":case"[object Boolean]":return+n==+t;case"[object RegExp]":return n.source==t.source&&n.global==t.global&&n.multiline==t.multiline&&n.ignoreCase==t.ignoreCase}if("object"!=typeof n||"object"!=typeof t)return!1;for(var i=r.length;i--;)if(r[i]==n)return e[i]==t;r.push(n),e.push(t);var a=0,o=!0;if("[object Array]"==u){if(a=n.length,o=a==t.length)for(;a--&&(o=S(n[a],t[a],r,e)););}else{var c=n.constructor,f=t.constructor;if(c!==f&&!(w.isFunction(c)&&c instanceof c&&w.isFunction(f)&&f instanceof f))return!1;for(var s in n)if(w.has(n,s)&&(a++,!(o=w.has(t,s)&&S(n[s],t[s],r,e))))break;if(o){for(s in t)if(w.has(t,s)&&!a--)break;o=!a}}return r.pop(),e.pop(),o};w.isEqual=function(n,t){return S(n,t,[],[])},w.isEmpty=function(n){if(null==n)return!0;if(w.isArray(n)||w.isString(n))return 0===n.length;for(var t in n)if(w.has(n,t))return!1;return!0},w.isElement=function(n){return!(!n||1!==n.nodeType)},w.isArray=x||function(n){return"[object Array]"==l.call(n)},w.isObject=function(n){return n===Object(n)},A(["Arguments","Function","String","Number","Date","RegExp"],function(n){w["is"+n]=function(t){return l.call(t)=="[object "+n+"]"}}),w.isArguments(arguments)||(w.isArguments=function(n){return!(!n||!w.has(n,"callee"))}),w.isFunction=function(n){return"function"==typeof n},w.isFinite=function(n){return isFinite(n)&&!isNaN(parseFloat(n))},w.isNaN=function(n){return w.isNumber(n)&&n!=+n},w.isBoolean=function(n){return n===!0||n===!1||"[object Boolean]"==l.call(n)},w.isNull=function(n){return null===n},w.isUndefined=function(n){return void 0===n},w.has=function(n,t){return f.call(n,t)},w.noConflict=function(){return n._=t,this},w.identity=function(n){return n},w.times=function(n,t,r){for(var e=Array(n),u=0;n>u;u++)e[u]=t.call(r,u);return e},w.random=function(n,t){return null==t&&(t=n,n=0),n+(0|Math.random()*(t-n+1))};var T={escape:{"&":"&","<":"<",">":">",'"':""","'":"'","/":"/"}};T.unescape=w.invert(T.escape);var M={escape:RegExp("["+w.keys(T.escape).join("")+"]","g"),unescape:RegExp("("+w.keys(T.unescape).join("|")+")","g")};w.each(["escape","unescape"],function(n){w[n]=function(t){return null==t?"":(""+t).replace(M[n],function(t){return T[n][t]})}}),w.result=function(n,t){if(null==n)return null;var r=n[t];return w.isFunction(r)?r.call(n):r},w.mixin=function(n){A(w.functions(n),function(t){var r=w[t]=n[t];w.prototype[t]=function(){var n=[this._wrapped];return a.apply(n,arguments),z.call(this,r.apply(w,n))}})};var N=0;w.uniqueId=function(n){var t=""+ ++N;return n?n+t:t},w.templateSettings={evaluate:/<%([\s\S]+?)%>/g,interpolate:/<%=([\s\S]+?)%>/g,escape:/<%-([\s\S]+?)%>/g};var q=/(.)^/,B={"'":"'","\\":"\\","\r":"r","\n":"n"," ":"t","\u2028":"u2028","\u2029":"u2029"},D=/\\|'|\r|\n|\t|\u2028|\u2029/g;w.template=function(n,t,r){r=w.defaults({},r,w.templateSettings);var e=RegExp([(r.escape||q).source,(r.interpolate||q).source,(r.evaluate||q).source].join("|")+"|$","g"),u=0,i="__p+='";n.replace(e,function(t,r,e,a,o){return i+=n.slice(u,o).replace(D,function(n){return"\\"+B[n]}),r&&(i+="'+\n((__t=("+r+"))==null?'':_.escape(__t))+\n'"),e&&(i+="'+\n((__t=("+e+"))==null?'':__t)+\n'"),a&&(i+="';\n"+a+"\n__p+='"),u=o+t.length,t}),i+="';\n",r.variable||(i="with(obj||{}){\n"+i+"}\n"),i="var __t,__p='',__j=Array.prototype.join,print=function(){__p+=__j.call(arguments,'');};\n"+i+"return __p;\n";try{var a=Function(r.variable||"obj","_",i)}catch(o){throw o.source=i,o}if(t)return a(t,w);var c=function(n){return a.call(this,n,w)};return c.source="function("+(r.variable||"obj")+"){\n"+i+"}",c},w.chain=function(n){return w(n).chain()};var z=function(n){return this._chain?w(n).chain():n};w.mixin(w),A(["pop","push","reverse","shift","sort","splice","unshift"],function(n){var t=e[n];w.prototype[n]=function(){var r=this._wrapped;return t.apply(r,arguments),"shift"!=n&&"splice"!=n||0!==r.length||delete r[0],z.call(this,r)}}),A(["concat","join","slice"],function(n){var t=e[n];w.prototype[n]=function(){return z.call(this,t.apply(this._wrapped,arguments))}}),w.extend(w.prototype,{chain:function(){return this._chain=!0,this},value:function(){return this._wrapped}})}).call(this); -------------------------------------------------------------------------------- /web/ss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/plamere/BoilTheFrog/1f2cb60a1f08d9cec3a06675a633ca35f09c8708/web/ss.png -------------------------------------------------------------------------------- /web/styles.css: -------------------------------------------------------------------------------- 1 | .hero-unit { 2 | margin-top:66px; 3 | padding-bottom:30px; 4 | color: #fff; 5 | background-color: #84BD00; 6 | text-align:center; 7 | } 8 | 9 | .hero-unit h1 { 10 | font-size:64px; 11 | } 12 | 13 | #gallery h1 { 14 | color: #84BD00; 15 | } 16 | 17 | #about h1 { 18 | color: #84BD00; 19 | } 20 | 21 | #about h2 { 22 | color: #84BD00; 23 | } 24 | 25 | 26 | .reg-unit { 27 | color:white; 28 | margin-top:66px; 29 | /*background-color: #84BD00;*/ 30 | background-color: #6F7073; 31 | } 32 | 33 | .navbar-nav li a { 34 | cursor:pointer; 35 | } 36 | 37 | .artist:hover { 38 | cursor:pointer; 39 | } 40 | 41 | #options li a:hover { 42 | background-color: #727272; 43 | } 44 | 45 | .option-active { 46 | /*background-color: #6f7073;*/ 47 | color: #84BD00 !important; 48 | } 49 | 50 | /* 51 | .reg-unit a { 52 | color:red; 53 | } 54 | 55 | .reg-unit a:hover { 56 | color:red; 57 | } 58 | 59 | .reg-unit a:visited { 60 | color:orange; 61 | } 62 | */ 63 | 64 | 65 | 66 | #gallery { 67 | display:none; 68 | } 69 | 70 | #time-info { 71 | display:none; 72 | } 73 | 74 | .gallery-list { 75 | font-size:24px; 76 | } 77 | 78 | #about { 79 | display:none; 80 | } 81 | 82 | 83 | 84 | #main { 85 | margin-top:20px; 86 | margin-left:8px; 87 | } 88 | 89 | .adiv { 90 | width:294px; 91 | height:344px; 92 | background-size:100%; 93 | background-repeat:no-repeat; 94 | overflow:hidden; 95 | position:relative; 96 | /*background-color: #122;*/ 97 | margin-bottom:2px; 98 | padding:4px; 99 | background-color:#ddd; 100 | } 101 | 102 | #go { 103 | } 104 | 105 | #xbuttons { 106 | width:100%; 107 | margin-left:auto; 108 | margin-right:auto; 109 | text-align:center; 110 | display:none; 111 | } 112 | 113 | #tweet-span { 114 | position:relative; 115 | top:10px; 116 | } 117 | 118 | .tadiv { 119 | display:inline-block; 120 | width:310px; 121 | height:364px; 122 | position:relative; 123 | /*background-color: #122;*/ 124 | margin:3px; 125 | } 126 | 127 | 128 | .is-current { 129 | background-color:#e36b23; 130 | font-size:18px; 131 | } 132 | 133 | #list { 134 | margin-left:4px; 135 | text-align:center; 136 | } 137 | 138 | .adiv:hover { 139 | background-color: #84BD00; 140 | } 141 | 142 | #info { 143 | margin-left:10px; 144 | width:100%; 145 | text-align:center; 146 | margin-bottom:10px; 147 | height:32px; 148 | font-size:18px; 149 | } 150 | 151 | 152 | .song-info { 153 | position:absolute; 154 | bottom: 4px; 155 | margin-left:6px; 156 | margin-right:6px; 157 | line-height:18px; 158 | font-size:14px; 159 | height:40px; 160 | overflow:hidden; 161 | } 162 | 163 | .playbutton { 164 | width:100px; 165 | position:absolute; 166 | top:100px; 167 | left:100px; 168 | } 169 | 170 | .buttons { 171 | opacity:.8; 172 | } 173 | 174 | #footer { 175 | margin-top:10px; 176 | margin-left:20px; 177 | margin-bottom:20px; 178 | } 179 | 180 | 181 | 182 | .album-label { 183 | overflow:hidden; 184 | height:18px; 185 | text-overflow:ellipsis; 186 | width:280px; 187 | white-space:nowrap; 188 | } 189 | 190 | .change { 191 | float:left; 192 | } 193 | 194 | .bypass { 195 | float:right; 196 | } 197 | 198 | #search { 199 | margin-bottom: 20px; 200 | } 201 | 202 | #search input { 203 | color:black; 204 | } 205 | 206 | .faq { 207 | margin-right:10px; 208 | } 209 | 210 | #frog-image { 211 | margin-right:15px; 212 | margin-top:5px; 213 | float:left; 214 | border-radius:15px; 215 | 216 | } 217 | 218 | #lz-graph { 219 | float:right; 220 | border-radius:15px; 221 | margin-left:10px; 222 | margin-bottom:20px; 223 | } 224 | 225 | #time-info { 226 | margin-right:20px; 227 | font-size:8px; 228 | } 229 | 230 | .empty-link { 231 | cursor:pointer; 232 | } 233 | --------------------------------------------------------------------------------

Creating playlist...

Done!

Trouble!

Boil the Frog

Boil the Frog Gallery

About Boil the Frog

How does it work?

Here are some examples

How does it really work?

Who made this?

Creating playlist...

Done!

Trouble!

Boil the Frog

Boil the Frog Gallery

About Boil the Frog

How does it work?

Here are some examples

How does it really work?

Who made this?