├── .gitignore ├── .travis.yml ├── LICENSE ├── MANIFEST.in ├── README.md ├── requirements.txt ├── setup.py ├── soundscrape ├── .gitignore ├── __init__.py └── soundscrape.py ├── test.sh └── tests └── test.py /.gitignore: -------------------------------------------------------------------------------- 1 | env/ 2 | *.DS_Store 3 | *.pyc 4 | *.bak 5 | build/ 6 | dist/ 7 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | python: 3 | - "3.4" 4 | - "3.5" 5 | - "3.8" 6 | - "3.9" 7 | # command to install dependencies 8 | install: 9 | - "pip install setuptools --upgrade; python setup.py install" 10 | # command to run tests 11 | script: nosetests 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2013 Rich Jones 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of 6 | this software and associated documentation files (the "Software"), to deal in 7 | the Software without restriction, including without limitation the rights to 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 17 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 18 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 19 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 20 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 21 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include README.md LICENSE requirements.txt 2 | recursive-include soundscrape *.py 3 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ![SoundScrape!](http://i.imgur.com/nHAt2ow.png) 2 | 3 | SoundScrape [![Build Status](https://travis-ci.org/Miserlou/SoundScrape.svg)](https://travis-ci.org/Miserlou/SoundScrape) [![Python 3](https://img.shields.io/badge/Python-3-brightgreen.svg)](https://pypi.python.org/pypi/soundscrape/) [![PyPI](https://img.shields.io/pypi/v/soundscrape.svg)](https://pypi.python.org/pypi/SoundScrape) 4 | ============== 5 | 6 | **SoundScrape** makes it super easy to download artists from SoundCloud (and Bandcamp and MixCloud) - even those which don't have download links! It automatically creates ID3 tags as well (including album art), which is handy. 
7 | 8 | Usage 9 | --------- 10 | 11 | First, install it: 12 | 13 | ```bash 14 | pip install soundscrape 15 | ``` 16 | 17 | Note that if you are having problems, please first try updating to the latest version: 18 | 19 | ```bash 20 | pip install soundscrape --upgrade 21 | ``` 22 | 23 | Then, just call soundscrape and the name of the artist you want to scrape: 24 | 25 | ```bash 26 | soundscrape rabbit-i-am 27 | ``` 28 | 29 | And you're done! Hooray! Files are stored as mp3s in the format **Artist name - Track title.mp3**. 30 | 31 | You can also use the *-n* argument to only download a certain number of songs. 32 | 33 | ```bash 34 | soundscrape rabbit-i-am -n 3 35 | ``` 36 | 37 | Sets 38 | ------- 39 | 40 | Soundscrape can also download sets, but you have to include the full URL of the set you want to download: 41 | 42 | ```bash 43 | soundscrape https://soundcloud.com/vsauce-awesome/sets/awesome 44 | ``` 45 | 46 | Groups 47 | -------- 48 | 49 | Soundscrape can also download tracks from SoundCloud groups with the *-g* argument. 50 | 51 | ```bash 52 | soundscrape chopped-and-screwed -gn 2 53 | ``` 54 | 55 | Tracks 56 | -------- 57 | 58 | Soundscrape can also download specific tracks with *-t*: 59 | 60 | ```bash 61 | soundscrape foolsgoldrecs -t danny-brown-dip 62 | ``` 63 | 64 | or with just the straight URL: 65 | 66 | ```bash 67 | soundscrape https://soundcloud.com/foolsgoldrecs/danny-brown-dip 68 | ``` 69 | 70 | Likes 71 | -------- 72 | 73 | Soundscrape can also download all of an Artist's Liked items with *-l*: 74 | 75 | ```bash 76 | soundscrape troyboi -l 77 | ``` 78 | 79 | or with just the straight URL: 80 | 81 | ```bash 82 | soundscrape https://soundcloud.com/troyboi/likes 83 | ``` 84 | 85 | High-Quality Downloads Only 86 | -------- 87 | 88 | By default, SoundScrape will try to rip everything it can. However, if you only want to download tracks that have an official download available (which are typically at a higher-quality 320kbps bitrate), you can use the *-d* argument. 89 | 90 | ```bash 91 | soundscrape sly-dogg -d 92 | ``` 93 | 94 | Keep Preview Tracks 95 | -------- 96 | 97 | By default, SoundScrape will skip the 30-second preview tracks that SoundCloud now provides. You can choose to keep these preview snippets with the *-k* argument. 98 | 99 | ```bash 100 | soundscrape chromeo -k 101 | ``` 102 | 103 | Folders 104 | -------- 105 | 106 | By default, SoundScrape aims to act like _wget_, downloading in place in the current directory. With the *-f* argument, however, SoundScrape acts more like a download manager and sorts songs into the following format: 107 | 108 | ``` 109 | ./ARTIST_NAME - ALBUM_NAME/SONG_NUMBER - SONG_TITLE.mp3 110 | ``` 111 | 112 | It will also skip previously downloaded tracks. 113 | 114 | ```bash 115 | soundscrape murdercitydevils -f 116 | ``` 117 | 118 | Bandcamp 119 | -------- 120 | 121 | SoundScrape can also pull down albums from Bandcamp. For Bandcamp pages, use the *-b* argument along with an artist's username or a specific URL. It only downloads one album at a time. This works with all of the other arguments, except *-d* as Bandcamp streams only come at one bitrate, as far as I can tell. 122 | 123 | Note: Currently, when using the *-n* argument, the limit is evaluated for each album separately. 
124 | 125 | ```bash 126 | soundscrape warsaw -b -f 127 | ``` 128 | 129 | This also works for non-Bandcamp URLs that are hosted on Bandcamp: 130 | 131 | ```bash 132 | soundscrape -b http://music.monstercat.com/ 133 | ``` 134 | 135 | Note that the full URL must be included. 136 | 137 | Mixcloud 138 | -------- 139 | 140 | SoundScrape can also grab mixes from Mixcloud. This feature is extremely expermental and is in no way guaranteed to work! 141 | 142 | Finds the original mp3 of a mix and grabs that (with tags and album art) if it can, or else just gets the raw m4a stream. 143 | 144 | Mixcloud currently only takes an invidiual mix. Capacity for a whole artist's profile due shortly. 145 | 146 | ```bash 147 | soundscrape https://www.mixcloud.com/corenewsuploads/flume-essential-mix-2015-10-03/ -of 148 | ``` 149 | 150 | Audiomack 151 | -------- 152 | 153 | Just for fun, SoundScrape can also download individual songs from Audiomack. Not that you'd ever want to. 154 | 155 | ```bash 156 | soundscrape -a http://www.audiomack.com/song/bottomfeedermusic/top-shottas 157 | ``` 158 | 159 | MusicBed 160 | -------- 161 | 162 | For some strange reason, it also works for MusicBed.com. Thanks @brachna for this feature. 163 | 164 | ```bash 165 | soundscrape https://www.musicbed.com/albums/be-still/2828 166 | ``` 167 | 168 | Opening Files 169 | -------- 170 | 171 | As a convenience method, SoundScrape can automatically _'open'_ files that it downloads. This uses your system's 'open' command for file associations. 172 | 173 | ```bash 174 | soundscrape lorn -of 175 | ``` 176 | 177 | Issues 178 | ------- 179 | 180 | There's probably a lot more that can be done to improve this. Please file issues if you find them! 181 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | args>=0.1.0 2 | clint>=0.3.2 3 | demjson>=2.2.2 4 | fudge>=1.0.3 5 | nose>=1.3.7 6 | requests[security]>=2.9.0 7 | setuptools>=18.0.0 8 | simplejson>=3.3.1 9 | soundcloud>=0.4.1 10 | wheel>=0.24.0 11 | mutagen>=1.31.0 12 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import os 2 | import setuptools 3 | import soundscrape 4 | import sys 5 | 6 | from setuptools import setup 7 | 8 | # To support 2/3 installation 9 | setup_version = int(setuptools.__version__.split('.')[0]) 10 | if setup_version < 18: 11 | print("Please upgrade your setuptools to install SoundScrape: ") 12 | print("pip install -U pip wheel setuptools") 13 | quit() 14 | 15 | # Set external files 16 | try: 17 | from pypandoc import convert 18 | README = convert('README.md', 'rst') 19 | except ImportError: 20 | README = open(os.path.join(os.path.dirname(__file__), 'README.md')).read() 21 | 22 | with open(os.path.join(os.path.dirname(__file__), 'requirements.txt')) as f: 23 | required = f.read().splitlines() 24 | 25 | # allow setup.py to be run from any path 26 | os.chdir(os.path.normpath(os.path.join(os.path.abspath(__file__), os.pardir))) 27 | 28 | setup( 29 | name='soundscrape', 30 | version=soundscrape.__version__, 31 | packages=['soundscrape'], 32 | install_requires=required, 33 | extras_require={ ':python_version < "3.0"': [ 'wsgiref>=0.1.2', ], }, 34 | include_package_data=True, 35 | license='MIT License', 36 | description='Scrape an artist from SoundCloud', 37 | long_description=README, 38 | 
url='https://github.com/Miserlou/SoundScrape', 39 | author='Rich Jones', 40 | author_email='rich@openwatch.net', 41 | entry_points={ 42 | 'console_scripts': [ 43 | 'soundscrape = soundscrape.soundscrape:main', 44 | ] 45 | }, 46 | classifiers=[ 47 | 'Environment :: Console', 48 | 'License :: OSI Approved :: Apache Software License', 49 | 'Operating System :: OS Independent', 50 | 'Programming Language :: Python', 51 | 'Programming Language :: Python :: 3.4', 52 | 'Programming Language :: Python :: 3.5', 53 | 'Programming Language :: Python :: 3.7', 54 | 'Programming Language :: Python :: 3.8', 55 | 'Programming Language :: Python :: 3.9', 56 | 'Topic :: Internet :: WWW/HTTP', 57 | 'Topic :: Internet :: WWW/HTTP :: Dynamic Content', 58 | ], 59 | ) 60 | -------------------------------------------------------------------------------- /soundscrape/.gitignore: -------------------------------------------------------------------------------- 1 | *.mp3 -------------------------------------------------------------------------------- /soundscrape/__init__.py: -------------------------------------------------------------------------------- 1 | __version__ = '0.31.0' 2 | -------------------------------------------------------------------------------- /soundscrape/soundscrape.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/env python 2 | import argparse 3 | import demjson 4 | import html 5 | import os 6 | import re 7 | import requests 8 | import soundcloud 9 | import sys 10 | import urllib 11 | 12 | from clint.textui import colored, puts, progress 13 | from datetime import datetime 14 | from mutagen.mp3 import MP3, EasyMP3 15 | from mutagen.id3 import APIC, WXXX 16 | from mutagen.id3 import ID3 as OldID3 17 | from subprocess import Popen, PIPE 18 | from os.path import dirname, exists, join 19 | from os import access, mkdir, W_OK 20 | 21 | if sys.version_info.minor < 4: 22 | html_unescape = html.parser.HTMLParser().unescape 23 | else: 24 | html_unescape = html.unescape 25 | 26 | #################################################################### 27 | 28 | # Please be nice with this! 29 | CLIENT_ID = 'a3dd183a357fcff9a6943c0d65664087' 30 | CLIENT_SECRET = '7e10d33e967ad42574124977cf7fa4b7' 31 | MAGIC_CLIENT_ID = 'b45b1aa10f1ac2941910a7f0d10f8e28' 32 | 33 | AGGRESSIVE_CLIENT_ID = 'OmTFHKYSMLFqnu2HHucmclAptedxWXkq' 34 | APP_VERSION = '1481046241' 35 | 36 | #################################################################### 37 | 38 | 39 | def main(): 40 | """ 41 | Main function. 42 | 43 | Converts arguments to Python and processes accordingly. 44 | 45 | """ 46 | 47 | # Hack related to #58 48 | if sys.platform == "win32": 49 | os.system("chcp 65001"); 50 | 51 | parser = argparse.ArgumentParser(description='SoundScrape. 
Scrape an artist from SoundCloud.\n') 52 | parser.add_argument('artist_url', metavar='U', type=str, nargs='*', 53 | help='An artist\'s SoundCloud username or URL') 54 | parser.add_argument('-n', '--num-tracks', type=int, default=sys.maxsize, 55 | help='The number of tracks to download') 56 | parser.add_argument('-g', '--group', action='store_true', 57 | help='Use if downloading tracks from a SoundCloud group') 58 | parser.add_argument('-b', '--bandcamp', action='store_true', 59 | help='Use if downloading from Bandcamp rather than SoundCloud') 60 | parser.add_argument('-m', '--mixcloud', action='store_true', 61 | help='Use if downloading from Mixcloud rather than SoundCloud') 62 | parser.add_argument('-a', '--audiomack', action='store_true', 63 | help='Use if downloading from Audiomack rather than SoundCloud') 64 | parser.add_argument('-c', '--hive', action='store_true', 65 | help='Use if downloading from Hive.co rather than SoundCloud') 66 | parser.add_argument('-l', '--likes', action='store_true', 67 | help='Download all of a user\'s Likes.') 68 | parser.add_argument('-L', '--login', type=str, default='soundscrape123@mailinator.com', 69 | help='Set login') 70 | parser.add_argument('-d', '--downloadable', action='store_true', 71 | help='Only fetch tracks with a Downloadable link.') 72 | parser.add_argument('-t', '--track', type=str, default='', 73 | help='The name of a specific track by an artist') 74 | parser.add_argument('-f', '--folders', action='store_true', 75 | help='Organize saved songs in folders by artists') 76 | parser.add_argument('-p', '--path', type=str, default='', 77 | help='Set directory path where downloads should be saved to') 78 | parser.add_argument('-P', '--password', type=str, default='soundscraperocks', 79 | help='Set password') 80 | parser.add_argument('-o', '--open', action='store_true', 81 | help='Open downloaded files after downloading.') 82 | parser.add_argument('-k', '--keep', action='store_true', 83 | help='Keep 30-second preview tracks') 84 | parser.add_argument('-v', '--version', action='store_true', default=False, 85 | help='Display the current version of SoundScrape') 86 | 87 | args = parser.parse_args() 88 | vargs = vars(args) 89 | 90 | if vargs['version']: 91 | import pkg_resources 92 | version = pkg_resources.require("soundscrape")[0].version 93 | print(version) 94 | return 95 | 96 | if not vargs['artist_url']: 97 | parser.error('Please supply an artist\'s username or URL!') 98 | 99 | if sys.version_info < (3,0,0): 100 | vargs['artist_url'] = urllib.quote(vargs['artist_url'][0], safe=':/') 101 | else: 102 | vargs['artist_url'] = urllib.parse.quote(vargs['artist_url'][0], safe=':/') 103 | 104 | artist_url = vargs['artist_url'] 105 | 106 | if not exists(vargs['path']): 107 | if not access(dirname(vargs['path']), W_OK): 108 | vargs['path'] = '' 109 | else: 110 | mkdir(vargs['path']) 111 | 112 | if 'bandcamp.com' in artist_url or vargs['bandcamp']: 113 | process_bandcamp(vargs) 114 | elif 'mixcloud.com' in artist_url or vargs['mixcloud']: 115 | process_mixcloud(vargs) 116 | elif 'audiomack.com' in artist_url or vargs['audiomack']: 117 | process_audiomack(vargs) 118 | elif 'hive.co' in artist_url or vargs['hive']: 119 | process_hive(vargs) 120 | elif 'musicbed.com' in artist_url: 121 | process_musicbed(vargs) 122 | else: 123 | process_soundcloud(vargs) 124 | 125 | 126 | #################################################################### 127 | # SoundCloud 128 | #################################################################### 129 | 130 | 131 | def 
process_soundcloud(vargs): 132 | """ 133 | Main SoundCloud path. 134 | """ 135 | 136 | artist_url = vargs['artist_url'] 137 | track_permalink = vargs['track'] 138 | keep_previews = vargs['keep'] 139 | folders = vargs['folders'] 140 | 141 | id3_extras = {} 142 | one_track = False 143 | likes = False 144 | client = get_client() 145 | if 'soundcloud' not in artist_url.lower(): 146 | if vargs['group']: 147 | artist_url = 'https://soundcloud.com/groups/' + artist_url.lower() 148 | elif len(track_permalink) > 0: 149 | one_track = True 150 | track_url = 'https://soundcloud.com/' + artist_url.lower() + '/' + track_permalink.lower() 151 | else: 152 | artist_url = 'https://soundcloud.com/' + artist_url.lower() 153 | if vargs['likes'] or 'likes' in artist_url.lower(): 154 | likes = True 155 | 156 | if 'likes' in artist_url.lower(): 157 | artist_url = artist_url[0:artist_url.find('/likes')] 158 | likes = True 159 | 160 | if one_track: 161 | num_tracks = 1 162 | else: 163 | num_tracks = vargs['num_tracks'] 164 | 165 | try: 166 | if one_track: 167 | resolved = client.get('/resolve', url=track_url, limit=200) 168 | 169 | elif likes: 170 | userId = str(client.get('/resolve', url=artist_url).id) 171 | 172 | resolved = client.get('/users/' + userId + '/favorites', limit=200, linked_partitioning=1) 173 | next_href = False 174 | if(hasattr(resolved, 'next_href')): 175 | next_href = resolved.next_href 176 | while (next_href): 177 | 178 | resolved2 = requests.get(next_href).json() 179 | if('next_href' in resolved2): 180 | next_href = resolved2['next_href'] 181 | else: 182 | next_href = False 183 | resolved2 = soundcloud.resource.ResourceList(resolved2['collection']) 184 | resolved.collection.extend(resolved2) 185 | resolved = resolved.collection 186 | 187 | else: 188 | resolved = client.get('/resolve', url=artist_url, limit=200) 189 | 190 | except Exception as e: # HTTPError? 191 | 192 | # SoundScrape is trying to prevent us from downloading this. 193 | # We're going to have to stop trusting the API/client and 194 | # do all our own scraping. Boo. 195 | 196 | if '404 Client Error' in str(e): 197 | puts(colored.red("Problem downloading [404]: ") + colored.white("Item Not Found")) 198 | return None 199 | 200 | message = str(e) 201 | item_id = message.rsplit('/', 1)[-1].split('.json')[0].split('?client_id')[0] 202 | hard_track_url = get_hard_track_url(item_id) 203 | 204 | track_data = get_soundcloud_data(artist_url) 205 | puts_safe(colored.green("Scraping") + colored.white(": " + track_data['title'])) 206 | 207 | filenames = [] 208 | filename = sanitize_filename(track_data['artist'] + ' - ' + track_data['title'] + '.mp3') 209 | 210 | if folders: 211 | name_path = join(vargs['path'], track_data['artist']) 212 | if not exists(name_path): 213 | mkdir(name_path) 214 | filename = join(name_path, filename) 215 | else: 216 | filename = join(vargs['path'], filename) 217 | 218 | if exists(filename): 219 | puts_safe(colored.yellow("Track already downloaded: ") + colored.white(track_data['title'])) 220 | return None 221 | 222 | filename = download_file(hard_track_url, filename) 223 | tagged = tag_file(filename, 224 | artist=track_data['artist'], 225 | title=track_data['title'], 226 | year='2018', 227 | genre='', 228 | album='', 229 | artwork_url='') 230 | 231 | if not tagged: 232 | wav_filename = filename[:-3] + 'wav' 233 | os.rename(filename, wav_filename) 234 | filename = wav_filename 235 | 236 | filenames.append(filename) 237 | 238 | else: 239 | 240 | aggressive = False 241 | 242 | # This is is likely a 'likes' page. 
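        # A plain resource list (no 'kind' attribute) is what the likes lookup
        # produces, so it is used as the track list directly. Anything else is
        # dispatched on resolved.kind: 'artist', 'playlist', 'track', 'group',
        # or, as a fallback, treated as a user whose tracks are fetched.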
243 | if not hasattr(resolved, 'kind'): 244 | tracks = resolved 245 | else: 246 | if resolved.kind == 'artist': 247 | artist = resolved 248 | artist_id = str(artist.id) 249 | tracks = client.get('/users/' + artist_id + '/tracks', limit=200) 250 | elif resolved.kind == 'playlist': 251 | id3_extras['album'] = resolved.title 252 | if resolved.tracks != []: 253 | tracks = resolved.tracks 254 | else: 255 | tracks = get_soundcloud_api_playlist_data(resolved.id)['tracks'] 256 | tracks = tracks[:num_tracks] 257 | aggressive = True 258 | for track in tracks: 259 | download_track(track, resolved.title, keep_previews, folders, custom_path=vargs['path']) 260 | 261 | elif resolved.kind == 'track': 262 | tracks = [resolved] 263 | elif resolved.kind == 'group': 264 | group = resolved 265 | group_id = str(group.id) 266 | tracks = client.get('/groups/' + group_id + '/tracks', limit=200) 267 | else: 268 | artist = resolved 269 | artist_id = str(artist.id) 270 | tracks = client.get('/users/' + artist_id + '/tracks', limit=200) 271 | if tracks == [] and artist.track_count > 0: 272 | aggressive = True 273 | filenames = [] 274 | 275 | # this might be buggy 276 | data = get_soundcloud_api2_data(artist_id) 277 | 278 | for track in data['collection']: 279 | 280 | if len(filenames) >= num_tracks: 281 | break 282 | 283 | if track['type'] == 'playlist': 284 | track['playlist']['tracks'] = track['playlist']['tracks'][:num_tracks] 285 | for playlist_track in track['playlist']['tracks']: 286 | album_name = track['playlist']['title'] 287 | filename = download_track(playlist_track, album_name, keep_previews, folders, filenames, custom_path=vargs['path']) 288 | if filename: 289 | filenames.append(filename) 290 | else: 291 | d_track = track['track'] 292 | filename = download_track(d_track, custom_path=vargs['path']) 293 | if filename: 294 | filenames.append(filename) 295 | 296 | if not aggressive: 297 | filenames = download_tracks(client, tracks, num_tracks, vargs['downloadable'], vargs['folders'], vargs['path'], 298 | id3_extras=id3_extras) 299 | 300 | if vargs['open']: 301 | open_files(filenames) 302 | 303 | 304 | def get_client(): 305 | """ 306 | Return a new SoundCloud Client object. 307 | """ 308 | client = soundcloud.Client(client_id=CLIENT_ID) 309 | return client 310 | 311 | def download_track(track, album_name=u'', keep_previews=False, folders=False, filenames=[], custom_path=''): 312 | """ 313 | Given a track, force scrape it. 314 | """ 315 | 316 | hard_track_url = get_hard_track_url(track['id']) 317 | 318 | # We have no info on this track whatsoever. 319 | if not 'title' in track: 320 | return None 321 | 322 | if not keep_previews: 323 | if (track.get('duration', 0) < track.get('full_duration', 0)): 324 | puts_safe(colored.yellow("Skipping preview track") + colored.white(": " + track['title'])) 325 | return None 326 | 327 | # May not have a "full name" 328 | name = track['user'].get('full_name', '') 329 | if name == '': 330 | name = track['user']['username'] 331 | 332 | filename = sanitize_filename(name + ' - ' + track['title'] + '.mp3') 333 | 334 | if folders: 335 | name_path = join(custom_path, name) 336 | if not exists(name_path): 337 | mkdir(name_path) 338 | filename = join(name_path, filename) 339 | else: 340 | filename = join(custom_path, filename) 341 | 342 | if exists(filename): 343 | puts_safe(colored.yellow("Track already downloaded: ") + colored.white(track['title'])) 344 | return None 345 | 346 | # Skip already downloaded track. 
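    # (i.e. one whose filename was already produced earlier in this run,
    # for example by another playlist in the same scrape.)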
347 | if filename in filenames: 348 | return None 349 | 350 | if hard_track_url: 351 | puts_safe(colored.green("Scraping") + colored.white(": " + track['title'])) 352 | else: 353 | # Region coded? 354 | puts_safe(colored.yellow("Unable to download") + colored.white(": " + track['title'])) 355 | return None 356 | 357 | filename = download_file(hard_track_url, filename) 358 | tagged = tag_file(filename, 359 | artist=name, 360 | title=track['title'], 361 | year=track['created_at'][:4], 362 | genre=track['genre'], 363 | album=album_name, 364 | artwork_url=track['artwork_url']) 365 | if not tagged: 366 | wav_filename = filename[:-3] + 'wav' 367 | os.rename(filename, wav_filename) 368 | filename = wav_filename 369 | 370 | return filename 371 | 372 | def download_tracks(client, tracks, num_tracks=sys.maxsize, downloadable=False, folders=False, custom_path='', id3_extras={}): 373 | """ 374 | Given a list of tracks, iteratively download all of them. 375 | 376 | """ 377 | 378 | filenames = [] 379 | 380 | for i, track in enumerate(tracks): 381 | 382 | # "Track" and "Resource" objects are actually different, 383 | # even though they're the same. 384 | if isinstance(track, soundcloud.resource.Resource): 385 | 386 | try: 387 | 388 | t_track = {} 389 | t_track['downloadable'] = track.downloadable 390 | t_track['streamable'] = track.streamable 391 | t_track['title'] = track.title 392 | t_track['user'] = {'username': track.user['username']} 393 | t_track['release_year'] = track.release 394 | t_track['genre'] = track.genre 395 | t_track['artwork_url'] = track.artwork_url 396 | if track.downloadable: 397 | t_track['stream_url'] = track.download_url 398 | else: 399 | if downloadable: 400 | puts_safe(colored.red("Skipping") + colored.white(": " + track.title)) 401 | continue 402 | if hasattr(track, 'stream_url'): 403 | t_track['stream_url'] = track.stream_url 404 | else: 405 | t_track['direct'] = True 406 | streams_url = "https://api.soundcloud.com/i1/tracks/%s/streams?client_id=%s&app_version=%s" % ( 407 | str(track.id), AGGRESSIVE_CLIENT_ID, APP_VERSION) 408 | response = requests.get(streams_url).json() 409 | t_track['stream_url'] = response['http_mp3_128_url'] 410 | 411 | track = t_track 412 | except Exception as e: 413 | puts_safe(colored.white(track.title) + colored.red(' is not downloadable.')) 414 | continue 415 | 416 | if i > num_tracks - 1: 417 | continue 418 | try: 419 | if not track.get('stream_url', False): 420 | puts_safe(colored.white(track['title']) + colored.red(' is not downloadable.')) 421 | continue 422 | else: 423 | track_artist = sanitize_filename(track['user']['username']) 424 | track_title = sanitize_filename(track['title']) 425 | track_filename = track_artist + ' - ' + track_title + '.mp3' 426 | 427 | if folders: 428 | track_artist_path = join(custom_path, track_artist) 429 | if not exists(track_artist_path): 430 | mkdir(track_artist_path) 431 | track_filename = join(track_artist_path, track_filename) 432 | else: 433 | track_filename = join(custom_path, track_filename) 434 | 435 | if exists(track_filename): 436 | puts_safe(colored.yellow("Track already downloaded: ") + colored.white(track_title)) 437 | continue 438 | 439 | puts_safe(colored.green("Downloading") + colored.white(": " + track['title'])) 440 | 441 | 442 | if track.get('direct', False): 443 | location = track['stream_url'] 444 | else: 445 | stream = client.get(track['stream_url'], allow_redirects=False, limit=200) 446 | if hasattr(stream, 'location'): 447 | location = stream.location 448 | else: 449 | location = stream.url 
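                    # At this point 'location' holds the concrete media URL
                    # (the redirect target of the stream URL, or a direct link),
                    # which is downloaded and then tagged below.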
450 | 451 | filename = download_file(location, track_filename) 452 | tagged = tag_file(filename, 453 | artist=track['user']['username'], 454 | title=track['title'], 455 | year=track['release_year'], 456 | genre=track['genre'], 457 | album=id3_extras.get('album', None), 458 | artwork_url=track['artwork_url']) 459 | 460 | if not tagged: 461 | wav_filename = filename[:-3] + 'wav' 462 | os.rename(filename, wav_filename) 463 | filename = wav_filename 464 | 465 | filenames.append(filename) 466 | except Exception as e: 467 | puts_safe(colored.red("Problem downloading ") + colored.white(track['title'])) 468 | puts_safe(str(e)) 469 | 470 | return filenames 471 | 472 | 473 | 474 | def get_soundcloud_data(url): 475 | """ 476 | Scrapes a SoundCloud page for a track's important information. 477 | 478 | Returns: 479 | dict: of audio data 480 | 481 | """ 482 | 483 | data = {} 484 | 485 | request = requests.get(url) 486 | 487 | title_tag = request.text.split('')[1].split('</title')[0] 488 | data['title'] = title_tag.split(' by ')[0].strip() 489 | data['artist'] = title_tag.split(' by ')[1].split('|')[0].strip() 490 | # XXX Do more.. 491 | 492 | return data 493 | 494 | 495 | def get_soundcloud_api2_data(artist_id): 496 | """ 497 | Scrape the new API. Returns the parsed JSON response. 498 | """ 499 | 500 | v2_url = "https://api-v2.soundcloud.com/stream/users/%s?limit=500&client_id=%s&app_version=%s" % ( 501 | artist_id, AGGRESSIVE_CLIENT_ID, APP_VERSION) 502 | response = requests.get(v2_url) 503 | parsed = response.json() 504 | 505 | return parsed 506 | 507 | def get_soundcloud_api_playlist_data(playlist_id): 508 | """ 509 | Scrape the new API. Returns the parsed JSON response. 510 | """ 511 | 512 | url = "https://api.soundcloud.com/playlists/%s?representation=full&client_id=02gUJC0hH2ct1EGOcYXQIzRFU91c72Ea&app_version=1467724310" % ( 513 | playlist_id) 514 | response = requests.get(url) 515 | parsed = response.json() 516 | 517 | return parsed 518 | 519 | def get_hard_track_url(item_id): 520 | """ 521 | Hard-scrapes a track. 522 | """ 523 | 524 | streams_url = "https://api.soundcloud.com/i1/tracks/%s/streams/?client_id=%s&app_version=%s" % ( 525 | item_id, AGGRESSIVE_CLIENT_ID, APP_VERSION) 526 | response = requests.get(streams_url) 527 | json_response = response.json() 528 | 529 | if response.status_code == 200: 530 | hard_track_url = json_response['http_mp3_128_url'] 531 | return hard_track_url 532 | else: 533 | return None 534 | 535 | #################################################################### 536 | # Bandcamp 537 | #################################################################### 538 | 539 | 540 | def process_bandcamp(vargs): 541 | """ 542 | Main BandCamp path. 
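    Accepts either a full Bandcamp (or Bandcamp-hosted) URL or a bare artist
    name, which is expanded to https://<name>.bandcamp.com/music.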
543 | """ 544 | 545 | artist_url = vargs['artist_url'] 546 | 547 | if 'bandcamp.com' in artist_url or ('://' in artist_url and vargs['bandcamp']): 548 | bc_url = artist_url 549 | else: 550 | bc_url = 'https://' + artist_url + '.bandcamp.com/music' 551 | 552 | filenames = scrape_bandcamp_url( 553 | bc_url, 554 | num_tracks=vargs['num_tracks'], 555 | folders=vargs['folders'], 556 | custom_path=vargs['path'], 557 | ) 558 | 559 | # check if we have lists inside a list, which indicates the 560 | # scraping has gone recursive, so we must format the output 561 | # ( reference: http://stackoverflow.com/a/5251706 ) 562 | if any(isinstance(elem, list) for elem in filenames): 563 | # first, remove any empty sublists inside our outter list 564 | # ( reference: http://stackoverflow.com/a/19875634 ) 565 | filenames = [sub for sub in filenames if sub] 566 | # now, make sure we "flatten" the list 567 | # ( reference: http://stackoverflow.com/a/11264751 ) 568 | filenames = [val for sub in filenames for val in sub] 569 | 570 | if vargs['open']: 571 | open_files(filenames) 572 | 573 | return 574 | 575 | 576 | # Largely borrowed from Ronier's bandcampscrape 577 | def scrape_bandcamp_url(url, num_tracks=sys.maxsize, folders=False, custom_path=''): 578 | """ 579 | Pull out artist and track info from a Bandcamp URL. 580 | 581 | Returns: 582 | list: filenames to open 583 | """ 584 | 585 | filenames = [] 586 | album_data = get_bandcamp_metadata(url) 587 | 588 | # If it's a list, we're dealing with a list of Album URLs, 589 | # so we call the scrape_bandcamp_url() method for each one 590 | if type(album_data) is list: 591 | for album_url in album_data: 592 | filenames.append( 593 | scrape_bandcamp_url( 594 | album_url, num_tracks, folders, custom_path 595 | ) 596 | ) 597 | return filenames 598 | 599 | artist = album_data.get("artist") 600 | album_name = album_data.get("album_title") 601 | 602 | if folders: 603 | if album_name: 604 | directory = artist + " - " + album_name 605 | else: 606 | directory = artist 607 | directory = sanitize_filename(directory) 608 | directory = join(custom_path, directory) 609 | if not exists(directory): 610 | mkdir(directory) 611 | 612 | for i, track in enumerate(album_data["trackinfo"]): 613 | 614 | if i > num_tracks - 1: 615 | continue 616 | 617 | try: 618 | track_name = track["title"] 619 | if track["track_num"]: 620 | track_number = str(track["track_num"]).zfill(2) 621 | else: 622 | track_number = None 623 | if track_number and folders: 624 | track_filename = '%s - %s.mp3' % (track_number, track_name) 625 | else: 626 | track_filename = '%s.mp3' % (track_name) 627 | track_filename = sanitize_filename(track_filename) 628 | 629 | if folders: 630 | path = join(directory, track_filename) 631 | else: 632 | path = join(custom_path, sanitize_filename(artist) + ' - ' + track_filename) 633 | 634 | if exists(path): 635 | puts_safe(colored.yellow("Track already downloaded: ") + colored.white(track_name)) 636 | continue 637 | 638 | if not track['file']: 639 | puts_safe(colored.yellow("Track unavailble for scraping: ") + colored.white(track_name)) 640 | continue 641 | 642 | puts_safe(colored.green("Downloading") + colored.white(": " + track_name)) 643 | path = download_file(track['file']['mp3-128'], path) 644 | 645 | album_year = album_data['album_release_date'] 646 | if album_year: 647 | album_year = datetime.strptime(album_year, "%d %b %Y %H:%M:%S GMT").year 648 | 649 | tag_file(path, 650 | artist, 651 | track_name, 652 | album=album_name, 653 | year=album_year, 654 | genre=album_data['genre'], 
655 | artwork_url=album_data['artFullsizeUrl'], 656 | track_number=track_number, 657 | url=album_data['url']) 658 | 659 | filenames.append(path) 660 | 661 | except Exception as e: 662 | puts_safe(colored.red("Problem downloading ") + colored.white(track_name)) 663 | print(e) 664 | return filenames 665 | 666 | 667 | def extract_embedded_json_from_attribute(request, attribute, debug=False): 668 | """ 669 | Extract JSON object embedded in an element's attribute value. 670 | 671 | The JSON is "sloppy". The native python JSON parser often can't deal, 672 | so we use the more tolerant demjson instead. 673 | 674 | Args: 675 | request (obj:`requests.Response`): HTTP GET response from which to extract 676 | attribute (str): name of the attribute holding the desired JSON object 677 | debug (bool, optional): whether to print debug messages 678 | 679 | Returns: 680 | The embedded JSON object as a dict, or None if extraction failed 681 | """ 682 | try: 683 | embed = request.text.split('{}="'.format(attribute))[1] 684 | embed = html_unescape( 685 | embed.split('"')[0] 686 | ) 687 | output = demjson.decode(embed) 688 | if debug: 689 | print( 690 | 'extracted JSON: ' 691 | + demjson.encode( 692 | output, 693 | compactly=False, 694 | indent_amount=2, 695 | ) 696 | ) 697 | except Exception as e: 698 | output = None 699 | if debug: 700 | print(e) 701 | return output 702 | 703 | 704 | def get_bandcamp_metadata(url): 705 | """ 706 | Read information from Bandcamp embedded JavaScript object notation. 707 | The method may return a list of URLs (indicating this is probably a "main" page which links to one or more albums), 708 | or a JSON if we can already parse album/track info from the given url. 709 | """ 710 | request = requests.get(url) 711 | output = {} 712 | try: 713 | for attr in ['data-tralbum', 'data-embed']: 714 | output.update( 715 | extract_embedded_json_from_attribute( 716 | request, attr 717 | ) 718 | ) 719 | # if the JSON parser failed, we should consider it's a "/music" page, 720 | # so we generate a list of albums/tracks and return it immediately 721 | except Exception as e: 722 | regex_all_albums = r'<a href="(/(?:album|track)/[^>]+)">' 723 | all_albums = re.findall(regex_all_albums, request.text, re.MULTILINE) 724 | album_url_list = list() 725 | for album in all_albums: 726 | album_url = re.sub(r'music/?$', '', url) + album 727 | album_url_list.append(album_url) 728 | return album_url_list 729 | # if the JSON parser was successful, use a regex to get all tags 730 | # from this album/track, join them and set it as the "genre" 731 | regex_tags = r'<a class="tag" href[^>]+>([^<]+)</a>' 732 | tags = re.findall(regex_tags, request.text, re.MULTILINE) 733 | # make sure we treat integers correctly with join() 734 | # according to http://stackoverflow.com/a/7323861 735 | # (very unlikely, but better safe than sorry!) 736 | output['genre'] = ' '.join(s for s in tags) 737 | 738 | try: 739 | artUrl = request.text.split("\"tralbumArt\">")[1].split("\">")[0].split("href=\"")[1] 740 | output['artFullsizeUrl'] = artUrl 741 | except: 742 | puts_safe(colored.red("Couldn't get full artwork") + "") 743 | output['artFullsizeUrl'] = None 744 | 745 | return output 746 | 747 | 748 | #################################################################### 749 | # Mixcloud 750 | #################################################################### 751 | 752 | 753 | def process_mixcloud(vargs): 754 | """ 755 | Main MixCloud path. 
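    Accepts a full Mixcloud URL or a bare username, which is expanded to
    https://mixcloud.com/<username>.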
756 | """ 757 | 758 | artist_url = vargs['artist_url'] 759 | 760 | if 'mixcloud.com' in artist_url: 761 | mc_url = artist_url 762 | else: 763 | mc_url = 'https://mixcloud.com/' + artist_url 764 | 765 | filenames = scrape_mixcloud_url(mc_url, num_tracks=vargs['num_tracks'], folders=vargs['folders'], custom_path=vargs['path']) 766 | 767 | if vargs['open']: 768 | open_files(filenames) 769 | 770 | return 771 | 772 | 773 | def scrape_mixcloud_url(mc_url, num_tracks=sys.maxsize, folders=False, custom_path=''): 774 | """ 775 | Returns: 776 | list: filenames to open 777 | 778 | """ 779 | 780 | try: 781 | data = get_mixcloud_data(mc_url) 782 | except Exception as e: 783 | puts_safe(colored.red("Problem downloading ") + mc_url) 784 | print(e) 785 | return [] 786 | 787 | filenames = [] 788 | 789 | track_artist = sanitize_filename(data['artist']) 790 | track_title = sanitize_filename(data['title']) 791 | track_filename = track_artist + ' - ' + track_title + data['mp3_url'][-4:] 792 | 793 | if folders: 794 | track_artist_path = join(custom_path, track_artist) 795 | if not exists(track_artist_path): 796 | mkdir(track_artist_path) 797 | track_filename = join(track_artist_path, track_filename) 798 | if exists(track_filename): 799 | puts_safe(colored.yellow("Skipping") + colored.white(': ' + data['title'] + " - it already exists!")) 800 | return [] 801 | else: 802 | track_filename = join(custom_path, track_filename) 803 | 804 | puts_safe(colored.green("Downloading") + colored.white( 805 | ': ' + data['artist'] + " - " + data['title'] + " (" + track_filename[-4:] + ")")) 806 | download_file(data['mp3_url'], track_filename) 807 | if track_filename[-4:] == '.mp3': 808 | tag_file(track_filename, 809 | artist=data['artist'], 810 | title=data['title'], 811 | year=data['year'], 812 | genre="Mix", 813 | artwork_url=data['artwork_url']) 814 | filenames.append(track_filename) 815 | 816 | return filenames 817 | 818 | 819 | def get_mixcloud_data(url): 820 | """ 821 | Scrapes a Mixcloud page for a track's important information. 822 | 823 | Returns: 824 | dict: containing audio data 825 | 826 | """ 827 | 828 | data = {} 829 | request = requests.get(url) 830 | preview_mp3_url = request.text.split('m-preview="')[1].split('" m-preview-light')[0] 831 | song_uuid = request.text.split('m-preview="')[1].split('" m-preview-light')[0].split('previews/')[1].split('.mp3')[0] 832 | 833 | # Fish for the m4a.. 834 | for server in range(1, 23): 835 | # Ex: https://stream6.mixcloud.com/c/m4a/64/1/2/0/9/30fe-23aa-40da-9bf3-4bee2fba649d.m4a 836 | mp3_url = "https://stream" + str(server) + ".mixcloud.com/c/m4a/64/" + song_uuid + '.m4a' 837 | try: 838 | if requests.head(mp3_url).status_code == 200: 839 | if '?' 
in mp3_url: 840 | mp3_url = mp3_url.split('?')[0] 841 | break 842 | except Exception as e: 843 | continue 844 | 845 | full_title = request.text.split("<title>")[1].split(" | Mixcloud")[0] 846 | title = full_title.split(' by ')[0].strip() 847 | artist = full_title.split(' by ')[1].strip() 848 | 849 | img_thumbnail_url = request.text.split('m-thumbnail-url="')[1].split(" ng-class")[0] 850 | artwork_url = img_thumbnail_url.replace('60/', '300/').replace('60/', '300/').replace('//', 'https://').replace('"', 851 | '') 852 | 853 | data['mp3_url'] = mp3_url 854 | data['title'] = title 855 | data['artist'] = artist 856 | data['artwork_url'] = artwork_url 857 | data['year'] = None 858 | 859 | return data 860 | 861 | 862 | #################################################################### 863 | # Audiomack 864 | #################################################################### 865 | 866 | 867 | def process_audiomack(vargs): 868 | """ 869 | Main Audiomack path. 870 | """ 871 | 872 | artist_url = vargs['artist_url'] 873 | 874 | if 'audiomack.com' in artist_url: 875 | mc_url = artist_url 876 | else: 877 | mc_url = 'https://audiomack.com/' + artist_url 878 | 879 | filenames = scrape_audiomack_url(mc_url, num_tracks=vargs['num_tracks'], folders=vargs['folders'], custom_path=vargs['path']) 880 | 881 | if vargs['open']: 882 | open_files(filenames) 883 | 884 | return 885 | 886 | 887 | def scrape_audiomack_url(mc_url, num_tracks=sys.maxsize, folders=False, custom_path=''): 888 | """ 889 | Returns: 890 | list: filenames to open 891 | 892 | """ 893 | 894 | try: 895 | data = get_audiomack_data(mc_url) 896 | except Exception as e: 897 | puts_safe(colored.red("Problem downloading ") + mc_url) 898 | print(e) 899 | 900 | filenames = [] 901 | 902 | track_artist = sanitize_filename(data['artist']) 903 | track_title = sanitize_filename(data['title']) 904 | track_filename = track_artist + ' - ' + track_title + '.mp3' 905 | 906 | if folders: 907 | track_artist_path = join(custom_path, track_artist) 908 | if not exists(track_artist_path): 909 | mkdir(track_artist_path) 910 | track_filename = join(track_artist_path, track_filename) 911 | if exists(track_filename): 912 | puts_safe(colored.yellow("Skipping") + colored.white(': ' + data['title'] + " - it already exists!")) 913 | return [] 914 | else: 915 | track_filename = join(custom_path, track_filename) 916 | 917 | puts_safe(colored.green("Downloading") + colored.white(': ' + data['artist'] + " - " + data['title'])) 918 | download_file(data['mp3_url'], track_filename) 919 | tag_file(track_filename, 920 | artist=data['artist'], 921 | title=data['title'], 922 | year=data['year'], 923 | genre=None, 924 | artwork_url=data['artwork_url']) 925 | filenames.append(track_filename) 926 | 927 | return filenames 928 | 929 | 930 | def get_audiomack_data(url): 931 | """ 932 | Scrapes a Mixcloud page for a track's important information. 
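    The markup parsed here is Audiomack's: the download link, the
    artist/title spans, and the lightbox artwork URL.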
933 | 934 | Returns: 935 | dict: containing audio data 936 | 937 | """ 938 | 939 | data = {} 940 | request = requests.get(url) 941 | 942 | mp3_url = request.text.split('class="player-icon download-song" title="Download" href="')[1].split('"')[0] 943 | artist = request.text.split('<span class="artist">')[1].split('</span>')[0].strip() 944 | title = request.text.split('<span class="artist">')[1].split('</span>')[1].split('</h1>')[0].strip() 945 | artwork_url = request.text.split('<a class="lightbox-trigger" href="')[1].split('" data')[0].strip() 946 | 947 | data['mp3_url'] = mp3_url 948 | data['title'] = title 949 | data['artist'] = artist 950 | data['artwork_url'] = artwork_url 951 | data['year'] = None 952 | 953 | return data 954 | 955 | 956 | #################################################################### 957 | # Hive.co 958 | #################################################################### 959 | 960 | 961 | def process_hive(vargs): 962 | """ 963 | Main Hive.co path. 964 | """ 965 | 966 | artist_url = vargs['artist_url'] 967 | 968 | if 'hive.co' in artist_url: 969 | mc_url = artist_url 970 | else: 971 | mc_url = 'https://www.hive.co/downloads/download/' + artist_url 972 | 973 | filenames = scrape_hive_url(mc_url, num_tracks=vargs['num_tracks'], folders=vargs['folders'], custom_path=vargs['path']) 974 | 975 | if vargs['open']: 976 | open_files(filenames) 977 | 978 | return 979 | 980 | 981 | def scrape_hive_url(mc_url, num_tracks=sys.maxsize, folders=False, custom_path=''): 982 | """ 983 | Scrape a Hive.co download page. 984 | 985 | Returns: 986 | list: filenames to open 987 | 988 | """ 989 | 990 | try: 991 | data = get_hive_data(mc_url) 992 | except Exception as e: 993 | puts_safe(colored.red("Problem downloading ") + mc_url) 994 | print(e) 995 | 996 | filenames = [] 997 | 998 | # track_artist = sanitize_filename(data['artist']) 999 | # track_title = sanitize_filename(data['title']) 1000 | # track_filename = track_artist + ' - ' + track_title + '.mp3' 1001 | 1002 | # if folders: 1003 | # track_artist_path = join(custom_path, track_artist) 1004 | # if not exists(track_artist_path): 1005 | # mkdir(track_artist_path) 1006 | # track_filename = join(track_artist_path, track_filename) 1007 | # if exists(track_filename): 1008 | # puts_safe(colored.yellow("Skipping") + colored.white(': ' + data['title'] + " - it already exists!")) 1009 | # return [] 1010 | 1011 | # puts_safe(colored.green("Downloading") + colored.white(': ' + data['artist'] + " - " + data['title'])) 1012 | # download_file(data['mp3_url'], track_filename) 1013 | # tag_file(track_filename, 1014 | # artist=data['artist'], 1015 | # title=data['title'], 1016 | # year=data['year'], 1017 | # genre=None, 1018 | # artwork_url=data['artwork_url']) 1019 | # filenames.append(track_filename) 1020 | 1021 | return filenames 1022 | 1023 | 1024 | def get_hive_data(url): 1025 | """ 1026 | 1027 | Scrapes a Mixcloud page for a track's important information. 1028 | 1029 | Returns a dict of data. 
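    The field extraction below is currently commented out, so the returned
    dict is empty.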
1030 | 1031 | """ 1032 | 1033 | data = {} 1034 | request = requests.get(url) 1035 | 1036 | # import pdb 1037 | # pdb.set_trace() 1038 | 1039 | # mp3_url = request.text.split('class="player-icon download-song" title="Download" href="')[1].split('"')[0] 1040 | # artist = request.text.split('<span class="artist">')[1].split('</span>')[0].strip() 1041 | # title = request.text.split('<span class="artist">')[1].split('</span>')[1].split('</h1>')[0].strip() 1042 | # artwork_url = request.text.split('<a class="lightbox-trigger" href="')[1].split('" data')[0].strip() 1043 | 1044 | # data['mp3_url'] = mp3_url 1045 | # data['title'] = title 1046 | # data['artist'] = artist 1047 | # data['artwork_url'] = artwork_url 1048 | # data['year'] = None 1049 | 1050 | return data 1051 | 1052 | 1053 | #################################################################### 1054 | # MusicBed 1055 | #################################################################### 1056 | 1057 | 1058 | def process_musicbed(vargs): 1059 | """ 1060 | Main MusicBed path. 1061 | """ 1062 | 1063 | # let's validate given MusicBed url 1064 | validated = False 1065 | if vargs['artist_url'].startswith( 'https://www.musicbed.com/' ): 1066 | splitted = vargs['artist_url'][len('https://www.musicbed.com/'):].split( '/' ) 1067 | if len( splitted ) == 3: 1068 | if ( splitted[0] == 'artists' or splitted[0] == 'albums' or splitted[0] == 'songs' ) and splitted[2].isdigit(): 1069 | validated = True 1070 | 1071 | if not validated: 1072 | puts( colored.red( 'process_musicbed: you provided incorrect MusicBed url. Aborting.' ) ) 1073 | puts( colored.white( 'Please make sure that url is either artist-url, album-url or song-url.' ) ) 1074 | puts( colored.white( 'Example of correct artist-url: https://www.musicbed.com/artists/lights-motion/5188' ) ) 1075 | puts( colored.white( 'Example of correct album-url: https://www.musicbed.com/albums/be-still/2828' ) ) 1076 | puts( colored.white( 'Example of correct song-url: https://www.musicbed.com/songs/be-still/24540' ) ) 1077 | return 1078 | 1079 | filenames = scrape_musicbed_url(vargs['artist_url'], vargs['login'], vargs['password'], num_tracks=vargs['num_tracks'], folders=vargs['folders'], custom_path=vargs['path']) 1080 | 1081 | if vargs['open']: 1082 | open_files(filenames) 1083 | 1084 | 1085 | def scrape_musicbed_url(url, login, password, num_tracks=sys.maxsize, folders=False, custom_path=''): 1086 | """ 1087 | Scrapes provided MusicBed url. 1088 | Uses requests' Session object in order to store cookies. 1089 | Requires login and password information. 1090 | If provided url is of pattern 'https://www.musicbed.com/artists/<string>/<number>' - a number of albums will be downloaded. 1091 | If provided url is of pattern 'https://www.musicbed.com/albums/<string>/<number>' - only one album will be downloaded. 1092 | If provided url is of pattern 'https://www.musicbed.com/songs/<string>/<number>' - will be treated as one album (but download only 1st track). 1093 | Metadata and urls are obtained from JavaScript data that's treated as JSON data. 1094 | 1095 | Returns: 1096 | list: filenames to open 1097 | """ 1098 | 1099 | session = requests.Session() 1100 | 1101 | response = session.get( url ) 1102 | if response.status_code != 200: 1103 | puts( colored.red( 'scrape_musicbed_url: couldn\'t open provided url. Status code: ' + str( response.status_code ) + '. Aborting.' 
) ) 1104 | session.close() 1105 | return [] 1106 | 1107 | albums = [] 1108 | # let's determine what url type we got 1109 | # '/artists/' - search for and download many albums 1110 | # '/albums/' - means we're downloading 1 album 1111 | # '/songs/' - means 1 album as well, but we're forcing num_tracks=1 in order to download only first relevant track 1112 | if url.startswith( 'https://www.musicbed.com/artists/' ): 1113 | # a hackjob code to get a list of available albums 1114 | main_index = 0 1115 | while response.text.find( 'https://www.musicbed.com/albums/', main_index ) != -1: 1116 | start_index = response.text.find( 'https://www.musicbed.com/albums/', main_index ) 1117 | end_index = response.text.find( '">', start_index ) 1118 | albums.append( response.text[start_index:end_index] ) 1119 | main_index = end_index 1120 | elif url.startswith( 'https://www.musicbed.com/songs/' ): 1121 | albums.append( url ) 1122 | num_tracks = 1 1123 | else: # url.startswith( 'https://www.musicbed.com/albums/' ) 1124 | albums.append( url ) 1125 | 1126 | # let's get our token and try to login (csrf_token seems to be present on every page) 1127 | token = response.text.split( 'var csrf_token = "' )[1].split( '";' )[0] 1128 | details = { '_token': token, 'login': login, 'password': password } 1129 | response = session.post( 'https://www.musicbed.com/ajax/login', data=details ) 1130 | if response.status_code != 200: 1131 | puts( colored.red( 'scrape_musicbed_url: couldn\'t login. Aborting. ' ) + colored.white( 'Couldn\'t access login page.' ) ) 1132 | session.close() 1133 | return [] 1134 | login_response_data = demjson.decode( response.text ) 1135 | if not login_response_data['body']['status']: 1136 | puts( colored.red( 'scrape_musicbed_url: couldn\'t login. Aborting. ' ) + colored.white( 'Did you provide correct login and password?' ) ) 1137 | session.close() 1138 | return [] 1139 | 1140 | # now let's actually scrape collected pages 1141 | filenames = [] 1142 | for each_album_url in albums: 1143 | response = session.get( each_album_url ) 1144 | if response.status_code != 200: 1145 | puts_safe( colored.red( 'scrape_musicbed_url: couldn\'t open url: ' + each_album_url + 1146 | '. Status code: ' + str( response.status_code ) + '. Skipping.' ) ) 1147 | continue 1148 | 1149 | # actually not a JSON, but a JS object, but so far so good 1150 | json = response.text.split( 'App.components.SongRows = ' )[1].split( '</script>' )[0] 1151 | data = demjson.decode( json ) 1152 | 1153 | song_count = 1 1154 | for each_song in data['loadedSongs']: 1155 | if song_count > num_tracks: 1156 | break 1157 | 1158 | try: 1159 | url, params = each_song['playback_url'].split( '?' 
) 1160 | 1161 | details = dict() 1162 | for each_param in params.split( '&' ): 1163 | name, value = each_param.split( '=' ) 1164 | details.update( { name: value } ) 1165 | # musicbed warns about it if it's not fixed 1166 | details['X-Amz-Credential'] = details['X-Amz-Credential'].replace( '%2F', '/' ) 1167 | 1168 | directory = custom_path 1169 | if folders: 1170 | sanitized_artist = sanitize_filename( each_song['album']['data']['artist']['data']['name'] ) 1171 | sanitized_album = sanitize_filename( each_song['album']['data']['name'] ) 1172 | directory = join( directory, sanitized_artist + ' - ' + sanitized_album ) 1173 | if not exists( directory ): 1174 | mkdir( directory ) 1175 | filename = join( directory, str( song_count ) + ' - ' + sanitize_filename( each_song['name'] ) + '.mp3' ) 1176 | 1177 | if exists( filename ): 1178 | puts_safe( colored.yellow( 'Skipping' ) + colored.white( ': ' + each_song['name'] + ' - it already exists!' ) ) 1179 | song_count += 1 1180 | continue 1181 | 1182 | puts_safe( colored.green( 'Downloading' ) + colored.white( ': ' + each_song['name'] ) ) 1183 | path = download_file( url, filename, session=session, params=details ) 1184 | 1185 | # example of genre_string: 1186 | # "<a href=\"https://www.musicbed.com/genres/ambient/2\">Ambient</a> <a href=\"https://www.musicbed.com/genres/cinematic/4\">Cinematic</a>" 1187 | genres = '' 1188 | for each in each_song['genre_string'].split( '</a>' ): 1189 | if ( each != "" ): 1190 | genres += each.split( '">' )[1] + '/' 1191 | genres = genres[:-1] # removing last '/ 1192 | 1193 | tag_file(path, 1194 | each_song['album']['data']['artist']['data']['name'], 1195 | each_song['name'], 1196 | album=each_song['album']['data']['name'], 1197 | year=int( each_song['album']['data']['released_at'].split( '-' )[0] ), 1198 | genre=genres, 1199 | artwork_url=each_song['album']['data']['imageObject']['data']['paths']['original'], 1200 | track_number=str( song_count ), 1201 | url=each_song['song_url']) 1202 | 1203 | filenames.append( path ) 1204 | song_count += 1 1205 | except: 1206 | puts_safe( colored.red( 'Problem downloading ' ) + colored.white( each_song['name'] ) + '. Skipping.' ) 1207 | song_count += 1 1208 | 1209 | session.close() 1210 | 1211 | return filenames 1212 | 1213 | 1214 | #################################################################### 1215 | # File Utility 1216 | #################################################################### 1217 | 1218 | 1219 | def download_file(url, path, session=None, params=None): 1220 | """ 1221 | Download an individual file. 1222 | """ 1223 | 1224 | if url[0:2] == '//': 1225 | url = 'https://' + url[2:] 1226 | 1227 | # Use a temporary file so that we don't import incomplete files. 1228 | tmp_path = path + '.tmp' 1229 | 1230 | if session and params: 1231 | r = session.get( url, params=params, stream=True ) 1232 | elif session and not params: 1233 | r = session.get( url, stream=True ) 1234 | else: 1235 | r = requests.get(url, stream=True) 1236 | with open(tmp_path, 'wb') as f: 1237 | total_length = int(r.headers.get('content-length', 0)) 1238 | for chunk in progress.bar(r.iter_content(chunk_size=1024), expected_size=(total_length / 1024) + 1): 1239 | if chunk: # filter out keep-alive new chunks 1240 | f.write(chunk) 1241 | f.flush() 1242 | 1243 | os.rename(tmp_path, path) 1244 | 1245 | return path 1246 | 1247 | 1248 | def tag_file(filename, artist, title, year=None, genre=None, artwork_url=None, album=None, track_number=None, url=None): 1249 | """ 1250 | Attempt to put ID3 tags on a file. 
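    Text frames are written with EasyMP3; album art (APIC) and the URL
    (WXXX) are added in a second pass with the raw ID3 interface. Returns
    True on success and False if the file could not be tagged, in which
    case callers rename the download to .wav.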
1251 | 1252 | Args: 1253 | artist (str): 1254 | title (str): 1255 | year (int): 1256 | genre (str): 1257 | artwork_url (str): 1258 | album (str): 1259 | track_number (str): 1260 | filename (str): 1261 | url (str): 1262 | """ 1263 | 1264 | try: 1265 | audio = EasyMP3(filename) 1266 | audio.tags = None 1267 | audio["artist"] = artist 1268 | audio["title"] = title 1269 | if year: 1270 | audio["date"] = str(year) 1271 | if album: 1272 | audio["album"] = album 1273 | if track_number: 1274 | audio["tracknumber"] = track_number 1275 | if genre: 1276 | audio["genre"] = genre 1277 | if url: # saves the tag as WOAR 1278 | audio["website"] = url 1279 | audio.save() 1280 | 1281 | if artwork_url: 1282 | 1283 | artwork_url = artwork_url.replace('https', 'http') 1284 | 1285 | mime = 'image/jpeg' 1286 | if '.jpg' in artwork_url: 1287 | mime = 'image/jpeg' 1288 | if '.png' in artwork_url: 1289 | mime = 'image/png' 1290 | 1291 | if '-large' in artwork_url: 1292 | new_artwork_url = artwork_url.replace('-large', '-t500x500') 1293 | try: 1294 | image_data = requests.get(new_artwork_url).content 1295 | except Exception as e: 1296 | # No very large image available. 1297 | image_data = requests.get(artwork_url).content 1298 | else: 1299 | image_data = requests.get(artwork_url).content 1300 | 1301 | audio = MP3(filename, ID3=OldID3) 1302 | audio.tags.add( 1303 | APIC( 1304 | encoding=3, # 3 is for utf-8 1305 | mime=mime, 1306 | type=3, # 3 is for the cover image 1307 | desc='Cover', 1308 | data=image_data 1309 | ) 1310 | ) 1311 | audio.save() 1312 | 1313 | # because there is software that doesn't seem to use WOAR we save url tag again as WXXX 1314 | if url: 1315 | audio = MP3(filename, ID3=OldID3) 1316 | audio.tags.add( WXXX( encoding=3, url=url ) ) 1317 | audio.save() 1318 | 1319 | return True 1320 | 1321 | except Exception as e: 1322 | puts(colored.red("Problem tagging file: ") + colored.white("Is this file a WAV?")) 1323 | return False 1324 | 1325 | def open_files(filenames): 1326 | """ 1327 | Call the system 'open' command on a file. 1328 | """ 1329 | command = ['open'] + filenames 1330 | process = Popen(command, stdout=PIPE, stderr=PIPE) 1331 | stdout, stderr = process.communicate() 1332 | 1333 | 1334 | def sanitize_filename(filename): 1335 | """ 1336 | Make sure filenames are valid paths. 1337 | 1338 | Returns: 1339 | str: 1340 | """ 1341 | sanitized_filename = re.sub(r'[/\\:*?"<>|]', '-', filename) 1342 | sanitized_filename = sanitized_filename.replace('&', 'and') 1343 | sanitized_filename = sanitized_filename.replace('"', '') 1344 | sanitized_filename = sanitized_filename.replace("'", '') 1345 | sanitized_filename = sanitized_filename.replace("/", '') 1346 | sanitized_filename = sanitized_filename.replace("\\", '') 1347 | 1348 | # Annoying. 
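    # A leading dot would make the file hidden on Unix-like systems, so it is
    # replaced with a literal 'dot' prefix.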
1349 | if sanitized_filename.startswith('.'): 1350 | sanitized_filename = u'dot' + sanitized_filename[1:] 1351 | 1352 | return sanitized_filename 1353 | 1354 | def puts_safe(text): 1355 | if sys.platform == "win32": 1356 | if sys.version_info < (3,0,0): 1357 | puts(text) 1358 | else: 1359 | puts(text.encode(sys.stdout.encoding, errors='replace').decode()) 1360 | else: 1361 | puts(text) 1362 | 1363 | 1364 | #################################################################### 1365 | # Main 1366 | #################################################################### 1367 | 1368 | if __name__ == '__main__': 1369 | try: 1370 | sys.exit(main()) 1371 | except Exception as e: 1372 | print(e) 1373 | -------------------------------------------------------------------------------- /test.sh: -------------------------------------------------------------------------------- 1 | #! /bin/bash 2 | nosetests 3 | -------------------------------------------------------------------------------- /tests/test.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os 3 | import sys 4 | import unittest 5 | 6 | from mutagen.mp3 import EasyMP3 7 | from soundscrape.soundscrape import get_client 8 | from soundscrape.soundscrape import process_soundcloud 9 | from soundscrape.soundscrape import process_bandcamp 10 | 11 | 12 | def rm_mp3(): 13 | """ Delete all ``*.mp3`` files in the current directory. 14 | """ 15 | for f in glob.glob('*.mp3'): 16 | os.unlink(f) 17 | 18 | 19 | class TestSoundscrape(unittest.TestCase): 20 | 21 | ## 22 | # Basic Tests 23 | ## 24 | 25 | def test_test(self): 26 | self.assertTrue(True) 27 | 28 | def test_get_client(self): 29 | client = get_client() 30 | self.assertTrue(bool(client)) 31 | 32 | def test_soundcloud(self): 33 | rm_mp3() 34 | mp3_count = len(glob.glob1('', "*.mp3")) 35 | vargs = {'path':'', 'folders': False, 'group': False, 'track': '', 'num_tracks': 9223372036854775807, 'bandcamp': False, 'downloadable': False, 'likes': False, 'open': False, 'artist_url': 'https://soundcloud.com/fzpz/revised', 'keep': True} 36 | process_soundcloud(vargs) 37 | new_mp3_count = len(glob.glob1('', "*.mp3")) 38 | self.assertTrue(new_mp3_count > mp3_count) 39 | rm_mp3() 40 | 41 | def test_soundcloud_hard(self): 42 | rm_mp3() 43 | mp3_count = len(glob.glob1('', "*.mp3")) 44 | vargs = {'path':'', 'folders': False, 'group': False, 'track': '', 'num_tracks': 1, 'bandcamp': False, 'downloadable': False, 'likes': False, 'open': False, 'artist_url': 'puptheband', 'keep': False} 45 | process_soundcloud(vargs) 46 | new_mp3_count = len(glob.glob1('', "*.mp3")) 47 | self.assertTrue(new_mp3_count > mp3_count) 48 | self.assertTrue(new_mp3_count == 1) # This used to be 3, but is now 'Not available in United States.' 49 | rm_mp3() 50 | 51 | def test_soundcloud_hard_2(self): 52 | rm_mp3() 53 | mp3_count = len(glob.glob1('', "*.mp3")) 54 | vargs = {'path':'', 'folders': False, 'group': False, 'track': '', 'num_tracks': 1, 'bandcamp': False, 'downloadable': False, 'likes': False, 'open': False, 'artist_url': 'https://soundcloud.com/lostdogz/snuggles-chapstick', 'keep': False} 55 | process_soundcloud(vargs) 56 | new_mp3_count = len(glob.glob1('', "*.mp3")) 57 | self.assertTrue(new_mp3_count > mp3_count) 58 | self.assertTrue(new_mp3_count == 1) # This used to be 3, but is now 'Not available in United States.' 59 | rm_mp3() 60 | 61 | # The test URL for this is no longer a WAV. Need a new testcase.
62 | # 63 | # def test_soundcloud_wav(self): 64 | # for f in glob.glob('*.wav'): 65 | # os.unlink(f) 66 | 67 | # wav_count = len(glob.glob1('', "*.wav")) 68 | # vargs = {'path':'', 'folders': False, 'group': False, 'track': '', 'num_tracks': 1, 'bandcamp': False, 'downloadable': False, 'likes': False, 'open': False, 'artist_url': 'https://soundcloud.com/coastal/major-lazer-aerosol-can-coastal-flip', 'keep': False} 69 | # process_soundcloud(vargs) 70 | # new_wav_count = len(glob.glob1('', "*.wav")) 71 | # self.assertTrue(new_wav_count > wav_count) 72 | # self.assertTrue(new_wav_count == 1) 73 | 74 | # for f in glob.glob('*.wav'): 75 | # os.unlink(f) 76 | 77 | def test_bandcamp(self): 78 | rm_mp3() 79 | mp3_count = len(glob.glob1('', "*.mp3")) 80 | vargs = {'path':'', 'folders': False, 'group': False, 'track': '', 'num_tracks': 9223372036854775807, 'bandcamp': False, 'downloadable': False, 'likes': False, 'open': False, 'artist_url': 'https://atenrays.bandcamp.com/track/who-u-think'} 81 | process_bandcamp(vargs) 82 | new_mp3_count = len(glob.glob1('', "*.mp3")) 83 | self.assertTrue(new_mp3_count > mp3_count) 84 | rm_mp3() 85 | 86 | def test_bandcamp_slashes(self): 87 | rm_mp3() 88 | mp3_count = len(glob.glob1('', "*.mp3")) 89 | vargs = {'path':'', 'folders': False, 'group': False, 'track': '', 'num_tracks': 9223372036854775807, 'bandcamp': False, 'downloadable': False, 'likes': False, 'open': False, 'artist_url': 'https://defill.bandcamp.com/track/amnesia-chamber-harvest-skit'} 90 | process_bandcamp(vargs) 91 | new_mp3_count = len(glob.glob1('', "*.mp3")) 92 | self.assertTrue(new_mp3_count > mp3_count) 93 | rm_mp3() 94 | 95 | def test_bandcamp_html_entities(self): 96 | rm_mp3() 97 | vargs = {'path': '', 'folders': False, 'num_tracks': sys.maxsize, 'open': False, 'artist_url': 'https://anaalnathrakh.bandcamp.com/track/man-at-c-a-bonus-track'} 98 | process_bandcamp(vargs) 99 | mp3s = glob.glob('*.mp3') 100 | self.assertEquals(1, len(mp3s)) 101 | fn = mp3s[0] 102 | self.assertTrue('CandA' in fn) 103 | t = EasyMP3(fn)['title'] 104 | self.assertTrue('C&A' in t[0]) 105 | rm_mp3() 106 | 107 | 108 | # def test_musicbed(self): 109 | # for f in glob.glob('*.mp3'): 110 | # os.unlink(f) 111 | 112 | # mp3_count = len(glob.glob1('', "*.mp3")) 113 | # vargs = {'login':'musicbedtest@gmail.com', 'password':'oo6alY9T', 'path':'', 'folders': False, 'group': False, 'track': '', 'num_tracks': 9223372036854775807, 'bandcamp': False, 'downloadable': False, 'likes': False, 'open': False, 'artist_url': 'https://www.musicbed.com/albums/be-still/2828'} 114 | # process_musicbed(vargs) 115 | # new_mp3_count = len(glob.glob1('', "*.mp3")) 116 | # self.assertTrue(new_mp3_count > mp3_count) 117 | 118 | # for f in glob.glob('*.mp3'): 119 | # os.unlink(f) 120 | 121 | def test_mixcloud(self): 122 | """ 123 | MixCloud is being blocked from Travis, interestingly. 
124 | """ 125 | 126 | # rm_mp3() 127 | # for f in glob.glob('*.m4a'): 128 | # os.unlink(f) 129 | 130 | # shortest mix I could find that was still semi tolerable 131 | #mp3_count = len(glob.glob1('', "*.mp3")) 132 | #m4a_count = len(glob.glob1('', "*.m4a")) 133 | #vargs = {'path':'', 'folders': False, 'group': False, 'track': '', 'num_tracks': 9223372036854775807, 'bandcamp': False, 'downloadable': False, 'likes': False, 'open': False, 'artist_url': 'https://www.mixcloud.com/Bobby_T_FS15/coffee-cigarettes-saturday-morning-hip-hop-fix/'} 134 | #process_mixcloud(vargs) 135 | #new_mp3_count = len(glob.glob1('', "*.mp3")) 136 | #new_m4a_count = len(glob.glob1('', "*.m4a")) 137 | #self.assertTrue((new_mp3_count > mp3_count) or (new_m4a_count > m4a_count)) 138 | 139 | # rm_mp3() 140 | # for f in glob.glob('*.m4a'): 141 | # os.unlink(f) 142 | 143 | # def test_audiomack(self): 144 | # for f in glob.glob('*.mp3'): 145 | # os.unlink(f) 146 | 147 | # mp3_count = len(glob.glob1('', "*.mp3")) 148 | # vargs = {'path':'', 'folders': False, 'group': False, 'track': '', 'num_tracks': 9223372036854775807, 'bandcamp': False, 'audiomack': True, 'downloadable': False, 'likes': False, 'open': False, 'artist_url': 'https://www.audiomack.com/song/bottomfeedermusic/power'} 149 | # process_audiomack(vargs) 150 | # new_mp3_count = len(glob.glob1('', "*.mp3")) 151 | # self.assertTrue(new_mp3_count > mp3_count) 152 | 153 | # for f in glob.glob('*.mp3'): 154 | # os.unlink(f) 155 | 156 | if __name__ == '__main__': 157 | unittest.main() 158 | --------------------------------------------------------------------------------