├── .github
│   └── ISSUE_TEMPLATE
│       ├── bug_report.md
│       └── feature_request.md
├── .gitignore
├── README.md
├── api.md
├── kemono-dl.py
├── requirements.txt
└── src
    ├── __init__.py
    ├── args.py
    ├── helper.py
    ├── logger.py
    ├── main.py
    └── version.py

/.github/ISSUE_TEMPLATE/bug_report.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Bug report
3 | about: Create a bug report
4 | labels: bug
5 | 
6 | ---
7 | 
8 | 
9 | 
10 | 
11 | 
12 | 
13 | ### Version
14 | 
15 | Version:
16 | 
17 | ### Your Command
18 | 
19 | ```bash
20 | 
21 | Please replace with the command used.
22 | 
23 | ```
24 | 
25 | ### Description of bug
26 | 
27 | 
28 | 
29 | ### How To Reproduce
30 | 
31 | 
32 | 
33 | ### Error messages and tracebacks
34 | 
35 | ```python
36 | 
37 | Please replace with errors or tracebacks.
38 | 
39 | ```
40 | 
41 | ### Additional comments
42 | 
43 | 
44 | 
45 | 
46 | 
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/feature_request.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Feature Request
3 | about: Suggest a feature
4 | labels: "feature request"
5 | 
6 | ---
7 | ### Description
8 | 
9 | 
10 | 
11 | ### Service, User ID, Post ID
12 | 
13 | 
14 | - Site:
15 | - Service:
16 | - User ID:
17 | - Post ID:
18 | 
19 | ### Additional comments
20 | 
21 | 
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | venv/
2 | Downloads/
3 | yt_dlp_temp/
4 | *cookies.txt
5 | *.pyc
6 | *.log
7 | *.bat
8 | links.txt
9 | test.py
10 | archive.txt
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # kemono-dl
2 | A downloader tool for kemono.party and coomer.party.
3 | 
4 | ## How to use
5 | 1. Install Python 3. (Disable the path length limit during install.)
6 | 2. Download the source code for the [latest release](https://github.com/AplhaSlayer1964/kemono-dl/releases/latest) and extract it.
7 | 3. Install the requirements with `pip install -r requirements.txt`
8 |    - If the command doesn't run, try adding `python -m`, `python3 -m`, or `py -m` to the front.
9 | 4. Get a cookie.txt file from kemono.party/coomer.party
10 |    - You can get a cookie text file on Chrome with [this extension](https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid?hl=en).
11 |    - A cookie.txt file is required to use the downloader!
12 | 5. Run `python kemono-dl.py --cookies "cookie.txt" --links https://kemono.party/SERVICE/user/USERID`
13 |    - If the script doesn't run, try replacing `python` with `python3` or `py`.
14 | 
15 | # Command Line Options
16 | 
17 | ## Required!
18 | 
19 | `--cookies FILE`
20 | Takes a cookie file or a comma-separated list of cookie files. Used to get around the DDoS protection. Your cookie file must have been exported while logged in to use the favorite options.
21 | 
22 | ## What posts to download
23 | 
24 | `--links LINKS`
25 | Takes a URL or a comma-separated list of URLs.
26 | `--from-file FILE`
27 | Reads in a file with URLs separated by new lines. Lines starting with # will not be read in.
28 | `--kemono-fav-users SERVICE`
29 | Downloads favorite users from kemono.party of the specified type or types separated by a comma. Types include: all, patreon, fanbox, gumroad, subscribestar, dlsite, fantia. Your cookie file must have been exported while logged in for this to work.
30 | `--coomer-fav-users SERVICE`
31 | Downloads favorite users from coomer.party of the specified type or types separated by a comma. Types include: all, onlyfans. Your cookie file must have been exported while logged in for this to work.
32 | `--kemono-fav-posts`
33 | Downloads favorite posts from kemono.party. Your cookie file must have been exported while logged in for this to work.
34 | `--coomer-fav-posts`
35 | Downloads favorite posts from coomer.party. Your cookie file must have been exported while logged in for this to work.
36 | 
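For example, a typical command combining these selection options might look like the following (the cookie file name, user ID, and post ID are placeholders):

```bash
python kemono-dl.py --cookies "cookie.txt" \
  --links "https://kemono.party/patreon/user/12345,https://kemono.party/patreon/user/12345/post/67890" \
  --kemono-fav-posts
```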
37 | ## What files to download
38 | 
39 | `--inline`
40 | Download the inline images from the post content.
41 | `--content`
42 | Write the post content to an HTML file. The HTML file includes comments if `--comments` is passed.
43 | `--comments`
44 | Write the post comments to an HTML file.
45 | `--json`
46 | Write the post json to a file.
47 | `--extract-links`
48 | Write extracted links from post content to a text file.
49 | `--dms`
50 | Write user DMs to an HTML file. Only works when a user URL is passed.
51 | `--icon`
52 | Download the user's profile icon. Only works when a user URL is passed.
53 | `--banner`
54 | Download the user's profile banner. Only works when a user URL is passed.
55 | `--yt-dlp` (UNDER CONSTRUCTION)
56 | Try to download the post embed with yt-dlp.
57 | `--skip-attachments`
58 | Do not download post attachments.
59 | `--overwrite`
60 | Overwrite any previously created files.
61 | 
62 | ## Output
63 | 
64 | `--dirname-pattern PATTERN`
65 | Set the file path pattern for where files are downloaded. See [Output Patterns](https://github.com/AplhaSlayer1964/kemono-dl#output-patterns=) for more detail.
66 | `--filename-pattern PATTERN`
67 | Set the file name pattern for attachments. See [Output Patterns](https://github.com/AplhaSlayer1964/kemono-dl#output-patterns=) for more detail.
68 | `--inline-filename-pattern PATTERN`
69 | Set the file name pattern for inline images. See [Output Patterns](https://github.com/AplhaSlayer1964/kemono-dl#output-patterns=) for more detail.
70 | `--other-filename-pattern PATTERN`
71 | Set the file name pattern for post content, extracted links, and json. See [Output Patterns](https://github.com/AplhaSlayer1964/kemono-dl#output-patterns=) for more detail.
72 | `--user-filename-pattern PATTERN`
73 | Set the file name pattern for icon, banner, and dms. See [Output Patterns](https://github.com/AplhaSlayer1964/kemono-dl#output-patterns=) for more detail.
74 | `--date-strf-pattern PATTERN`
75 | Set the date strf pattern variable. See [Output Patterns](https://github.com/AplhaSlayer1964/kemono-dl#output-patterns=) for more detail.
76 | `--restrict-names`
77 | Restrict all file and folder names to the ASCII character set.
78 | 
79 | ## Download Filters
80 | 
81 | `--archive FILE`
82 | Only download posts that are not recorded in the archive file.
83 | `--date YYYYMMDD`
84 | Only download posts published on this date.
85 | `--datebefore YYYYMMDD`
86 | Only download posts published before this date.
87 | `--dateafter YYYYMMDD`
88 | Only download posts published after this date.
89 | `--user-updated-datebefore YYYYMMDD`
90 | Only download user posts if the user was updated before this date.
91 | `--user-updated-dateafter YYYYMMDD`
92 | Only download user posts if the user was updated after this date.
93 | `--min-filesize SIZE`
94 | Only download attachments or inline images larger than this file size. (ex #gb | #mb | #kb | #b)
95 | `--max-filesize SIZE`
96 | Only download attachments or inline images smaller than this file size. (ex #gb | #mb | #kb | #b)
97 | `--only-filetypes EXT`
98 | Only download attachments or inline images with the given file type(s). Takes a file extension or a comma-separated list of file extensions. (ex mp4,jpg,gif,zip)
99 | `--skip-filetypes EXT`
100 | Only download attachments or inline images without the given file type(s). Takes a file extension or a comma-separated list of file extensions. (ex mp4,jpg,gif,zip)
101 | 
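For instance, a command combining several of these filters might look like this (the URL, date, size, and extensions are illustrative):

```bash
python kemono-dl.py --cookies "cookie.txt" \
  --links https://kemono.party/patreon/user/12345 \
  --dateafter 20220101 --max-filesize 100mb --only-filetypes mp4,jpg,png
```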
102 | ## Other
103 | 
104 | `--help`
105 | Print all available options and exit.
106 | `--version`
107 | Print the version and exit.
108 | `--verbose`
109 | Display debug information and copy output to a file.
110 | `--quiet`
111 | Suppress printing except for warnings, errors, and exceptions.
112 | `--simulate`
113 | Simulate the given command and do not write to disk.
114 | `--no-part-files`
115 | Do not save attachments or inline images as .part files while downloading. Partially downloaded files will not be resumed if the program stops.
116 | `--yt-dlp-args ARGS` (UNDER CONSTRUCTION)
117 | The arguments yt-dlp will use when downloading. Formatted as a Python dictionary object.
118 | `--post-timeout SEC`
119 | The time in seconds to wait between downloading posts. (default: 0)
120 | `--retry COUNT`
121 | The number of times to retry / resume downloading a file. (default: 5)
122 | `--ratelimit-sleep SEC`
123 | The time in seconds to wait after being rate limited. (default: 120)
124 | 
125 | # Notes
126 | - Expected link formats:
127 |   - `https://{site}.party/{service}/user/{user_id}`
128 |   - `https://{site}.party/{service}/user/{user_id}/post/{post_id}`
129 | - By default files are saved as .part files until completed.
130 | - I assume the .party site has the correct hash for attachments. In rare cases this may not be true.
131 |   - If the server hash is incorrect, the file will remain a .part file.
132 |   - You can remove the .part from the file name and check whether it downloaded correctly.
133 |   - If it is correct but the downloader said the hash was wrong, please report it in the [pinned issue]() so I can report it to the .party site.
134 | - Some files do not have the file size in the response header and will not be downloaded when using `--min-filesize` or `--max-filesize`.
135 |   - `.pdf` is a known file type that will never return a file size from response headers.
136 | - Gumroad posts do not provide a published date, so `--date`, `--datebefore`, and `--dateafter` will always skip Gumroad posts.
137 | - Files will not be overwritten by default.
138 | - Inline images' default names are the file hash.
139 | - To get `--yt-dlp` to work, please follow its installation [guide](https://github.com/yt-dlp/yt-dlp#installation=).
140 | - For `--yt-dlp-args ARGS`, refer to the available [options](https://github.com/yt-dlp/yt-dlp/blob/master/yt_dlp/YoutubeDL.py#L181).
141 | 
142 | # Output Patterns
143 | 
144 | ## Variables
145 | 
146 | The pattern options allow you to modify the file path and file name using variables from the post. `--dirname-pattern` is the base file path for all post files.
147 | All file name patterns are appended to the end of the `--dirname-pattern`. File name patterns may also contain subfolder paths specific to that type of file, as with the default pattern for `--inline-filename-pattern`.
148 | 
149 | All variables referring to dates are controlled by `--date-strf-pattern`. Standard Python datetime strftime() format codes can be found [here](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes).
150 | 
151 | ### All Options
152 | - `{site}`
153 |   The .party site the post is hosted on. (i.e. kemono.party or coomer.party)
154 | - `{service}`
155 |   The service of the post.
156 | - `{user_id}`
157 |   The user id of the poster.
158 | - `{username}`
159 |   The user name of the poster.
160 | - `{id}`
161 |   The post id.
162 | - `{title}`
163 |   The post title.
164 | - `{published}`
165 |   The published date of the post.
166 | - `{added}`
167 |   The date the post was added to the .party site.
168 | - `{updated}`
169 |   The date the post was last updated on the .party site.
170 | - `{user_updated}`
171 |   The date the user was last updated on the .party site.
172 | 
173 | ### Only file names
174 | - `{ext}`
175 |   The file extension.
176 | - `{filename}`
177 |   The original file name.
178 | - `{index}`
179 |   The file's index order. Only for `--filename-pattern` and `--inline-filename-pattern`.
180 | - `{hash}`
181 |   The hash of the file. Only for `--filename-pattern` and `--inline-filename-pattern`.
182 | 
183 | 
184 | ## Default Patterns
185 | `--dirname-pattern`
186 | ```python
187 | "Downloads\{service}\{username} [{user_id}]"
188 | ```
189 | `--filename-pattern`
190 | ```python
191 | "[{published}] [{id}] {title}\{index}_{filename}.{ext}"
192 | ```
193 | `--inline-filename-pattern`
194 | ```python
195 | "[{published}] [{id}] {title}\inline\{index}_{filename}.{ext}"
196 | ```
197 | `--other-filename-pattern`
198 | ```python
199 | "[{published}] [{id}] {title}\[{id}]_{filename}.{ext}"
200 | ```
201 | `--user-filename-pattern`
202 | ```python
203 | "[{user_id}]_{filename}.{ext}"
204 | ```
205 | `--date-strf-pattern`
206 | ```python
207 | "%Y%m%d"
208 | ```
209 | 
210 | ## Examples
211 | TODO
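As one illustration (the service, creator name, IDs, date, and file name below are made up), the default `--dirname-pattern` joined with the default `--filename-pattern` would place an attachment at a path like:

```python
"Downloads\patreon\SomeArtist [12345]\[20220410] [6789012] Some Post Title\1_artwork.png"
```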
212 | 
--------------------------------------------------------------------------------
/api.md:
--------------------------------------------------------------------------------
1 | ## USERS
2 | ```python
3 | # api call: /api/{service}/user/{user_id}?o={chunk}
4 | # chunk starts at 0 and increments by 25
5 | # returns a list of post data (see POSTS)
6 | ```
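A minimal sketch of paging through this endpoint with `requests` (the service and user ID are placeholders, and the session is assumed to carry the logged-in cookies described in the README):

```python
import requests

def get_all_posts(session: requests.Session, service: str, user_id: str) -> list:
    posts, chunk = [], 0
    while True:
        url = f"https://kemono.party/api/{service}/user/{user_id}?o={chunk}"
        page = session.get(url, timeout=300).json()
        posts.extend(page)
        if len(page) < 25:  # a short (or empty) chunk means the last page
            return posts
        chunk += 25
```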
7 | ## POSTS
8 | ```python
9 | # api call: /api/{service}/user/{user_id}/post/{post_id}
10 | # returns a dictionary of the post data
11 | 
12 | post # dict
13 |     ['title'] # str
14 |     ['added'] # str, datetime object
15 |     ['edited'] # str, datetime object
16 |     ['id'] # str
17 |     ['user'] # str
18 |     ['published'] # str, datetime object
19 |     ['attachments'] # list of dict
20 |         ['name'] # str
21 |         ['path'] # str
22 |     ['file'] # dict
23 |         ['name'] # str
24 |         ['path'] # str
25 |     ['content'] # str, html
26 |     ['shared_file'] # bool
27 |     ['embed'] # dict
28 |         ['description'] # str
29 |         ['subject'] # str
30 |         ['url'] # str
31 | ```
32 | ## DISCORD CHANNELS
33 | ```python
34 | # api call: /api/discord/channels/lookup?q={server_id}
35 | # returns a list of dictionaries containing channel names and ids
36 | 
37 | channel # dict
38 |     ['id'] # str
39 |     ['name'] # str
40 | ```
41 | ## DISCORD CHANNEL POSTS
42 | ```python
43 | # api call: /api/discord/channel/{channel_id}?skip={skip}
44 | # skip starts at 0 and increments by 10
45 | # returns a list of dictionaries containing each post's data
46 | 
47 | post # dict
48 |     ['added'] # str, datetime object
49 |     ['attachments'] # list of dict
50 |         ['isImage'] # str
51 |         ['name'] # str
52 |         ['path'] # str
53 |     ['author'] # dict
54 |         ['avatar'] # str
55 |         ['discriminator'] # str
56 |         ['id'] # str
57 |         ['public_flags'] # int
58 |         ['username'] # str
59 |     ['channel'] # str
60 |     ['content'] # str, html
61 |     ['edited'] # ???
62 |     ['embeds'] # list of dict
63 |         ['description'] # str
64 |         ['thumbnail'] # dict
65 |             ['height'] # int
66 |             ['proxy_url'] # str
67 |             ['url'] # str
68 |             ['width'] # int
69 |         ['title'] # str
70 |         ['type'] # str
71 |         ['url'] # str
72 |     ['id'] # str
73 |     ['mentions'] # list of dict
74 |         ['avatar'] # str
75 |         ['discriminator'] # str
76 |         ['id'] # str
77 |         ['public_flags'] # int
78 |         ['username'] # str
79 |     ['published'] # str, datetime object
80 |     ['server'] # str
81 | ```
82 | ## CREATORS
83 | ```python
84 | # api call: /api/creators
85 | # returns a list of dictionaries of user data
86 | 
87 | creator # dict
88 |     ['id'] # str
89 |     ['indexed'] # str
90 |     ['name'] # str
91 |     ['service'] # str
92 |     ['updated'] # str
93 | ```
94 | ## FAVORITES
95 | ```python
96 | # api call: /api/favorites?type={type}
97 | # type can be post or artist
98 | # (artist) returns a list of dictionaries with user data
99 | 
100 | favorite_user # dict
101 |     ['faved_seq'] # int
102 |     ['id'] # str
103 |     ['indexed'] # str, datetime object
104 |     ['name'] # str
105 |     ['service'] # str
106 |     ['updated'] # str, datetime object
107 | 
108 | # (post) returns a list of dictionaries with post data
109 | 
110 | favorite_post # dict, same as post
111 |     ['faved_seq'] # int
112 | ```
113 | 
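A sketch of reading the favorites list (assumes a `cookie.txt` exported while logged in, as described in the README; the field names follow the listing above):

```python
import requests
from http.cookiejar import MozillaCookieJar

cookies = MozillaCookieJar()
cookies.load("cookie.txt")  # must have been exported while logged in

response = requests.get("https://kemono.party/api/favorites?type=artist",
                        cookies=cookies, timeout=300)
for artist in response.json():
    print(artist["service"], artist["id"], artist["name"])
```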
--------------------------------------------------------------------------------
/kemono-dl.py:
--------------------------------------------------------------------------------
1 | from src.main import main
2 | 
3 | if __name__ == '__main__':
4 |     main()
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | beautifulsoup4==4.11.1
2 | Pillow==9.1.0
3 | requests==2.27.1
4 | yt_dlp==2022.4.8
--------------------------------------------------------------------------------
/src/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AlphaSlayer1964/kemono-dl/5bfab2ee925c3dcd092bbe4ad0532fe504b34cc7/src/__init__.py
--------------------------------------------------------------------------------
/src/args.py:
--------------------------------------------------------------------------------
1 | import os
2 | import datetime
3 | import re
4 | import argparse
5 | from http.cookiejar import MozillaCookieJar, LoadError
6 | 
7 | from .version import __version__
8 | 
9 | def get_args():
10 | 
11 |     ap = argparse.ArgumentParser()
12 | 
13 |     ap.add_argument("--cookies",
14 |                     metavar="FILE", type=str, default=None, required=True,
15 |                     help="Takes a cookie file or a comma-separated list of cookie files. Used to get around the DDoS protection. Your cookie file must have been exported while logged in to use the favorite options.")
16 | 
17 | 
18 | 
19 |     ap.add_argument("--links",
20 |                     metavar="LINKS", type=str, default=None,
21 |                     help="Takes a URL or a comma-separated list of URLs.")
22 | 
23 |     ap.add_argument("--from-file",
24 |                     metavar="FILE", type=str, default=None,
25 |                     help="Reads in a file with URLs separated by new lines. Lines starting with # will not be read in.")
26 | 
27 |     ap.add_argument("--kemono-fav-users",
28 |                     metavar="SERVICE", type=str, default=None,
29 |                     help="Downloads favorite users from kemono.party of the specified type or types separated by a comma. Types include: all, patreon, fanbox, gumroad, subscribestar, dlsite, fantia. Your cookie file must have been exported while logged in for this to work.")
30 | 
31 |     ap.add_argument("--coomer-fav-users",
32 |                     metavar="SERVICE", type=str, default=None,
33 |                     help="Downloads favorite users from coomer.party of the specified type or types separated by a comma. Types include: all, onlyfans. Your cookie file must have been exported while logged in for this to work.")
34 | 
35 |     ap.add_argument("--kemono-fav-posts",
36 |                     action='store_true', default=False,
37 |                     help="Downloads favorite posts from kemono.party. Your cookie file must have been exported while logged in for this to work.")
38 | 
39 |     ap.add_argument("--coomer-fav-posts",
40 |                     action='store_true', default=False,
41 |                     help="Downloads favorite posts from coomer.party. Your cookie file must have been exported while logged in for this to work.")
42 | 
43 | 
44 | 
45 |     ap.add_argument("--inline",
46 |                     action='store_true', default=False,
47 |                     help="Download the inline images from the post content.")
48 | 
49 |     ap.add_argument("--content",
50 |                     action='store_true', default=False,
51 |                     help="Write the post content to an HTML file. The HTML file includes comments if `--comments` is passed.")
52 | 
53 |     ap.add_argument("--comments",
54 |                     action='store_true', default=False,
55 |                     help="Write the post comments to an HTML file.")
56 | 
57 |     ap.add_argument("--json",
58 |                     action='store_true', default=False,
59 |                     help="Write the post json to a file.")
60 | 
61 |     ap.add_argument("--extract-links",
62 |                     action='store_true', default=False,
63 |                     help="Write extracted links from post content to a text file.")
64 | 
65 |     ap.add_argument("--dms",
66 |                     action='store_true', default=False,
67 |                     help="Write user DMs to an HTML file. Only works when a user URL is passed.")
68 | 
69 |     ap.add_argument("--icon",
70 |                     action='store_true', default=False,
71 |                     help="Download the user's profile icon. Only works when a user URL is passed.")
72 | 
73 |     ap.add_argument("--banner",
74 |                     action='store_true', default=False,
75 |                     help="Download the user's profile banner. Only works when a user URL is passed.")
76 | 
77 |     ap.add_argument("--yt-dlp",
78 |                     action='store_true', default=False,
79 |                     help="Try to download the post embed with yt-dlp.")
80 | 
81 |     ap.add_argument("--skip-attachments",
82 |                     action='store_true', default=False,
83 |                     help="Do not download post attachments.")
84 | 
85 |     ap.add_argument("--overwrite",
86 |                     action='store_true', default=False,
87 |                     help="Overwrite any previously created files.")
88 | 
89 | 
90 | 
91 |     ap.add_argument("--dirname-pattern",
92 |                     metavar="DIRNAME_PATTERN", type=str, default='Downloads\{service}\{username} [{user_id}]',
93 |                     help="Set the file path pattern for where files are downloaded. See Output Patterns for more detail.")
94 | 
95 |     ap.add_argument("--filename-pattern",
96 |                     metavar="FILENAME_PATTERN", type=str, default='[{published}] [{id}] {title}\{index}_{filename}.{ext}',
97 |                     help="Set the file name pattern for attachments. See Output Patterns for more detail.")
98 | 
99 |     ap.add_argument("--inline-filename-pattern",
100 |                     metavar="INLINE_FILENAME_PATTERN", type=str, default='[{published}] [{id}] {title}\inline\{index}_{filename}.{ext}',
101 |                     help="Set the file name pattern for inline images. See Output Patterns for more detail.")
102 | 
103 |     ap.add_argument("--other-filename-pattern",
104 |                     metavar="OTHER_FILENAME_PATTERN", type=str, default='[{published}] [{id}] {title}\[{id}]_{filename}.{ext}',
105 |                     help="Set the file name pattern for post content, extracted links, and json. See Output Patterns for more detail.")
See Output Patterns for more detail.") 106 | 107 | ap.add_argument("--user-filename-pattern", 108 | metavar="USER_FILENAME_PATTERN", type=str, default='[{user_id}]_{filename}.{ext}', 109 | help="Set the file name pattern for icon, banner and dms. See Output Patterns for more detail.") 110 | 111 | ap.add_argument("--date-strf-pattern", 112 | metavar="DATE_STRF_PATTERN", type=str, default='%Y%m%d', 113 | help="Set the date strf pattern variable. See Output Patterns for more detail.") 114 | 115 | ap.add_argument("--restrict-names", 116 | action='store_true', default=False, 117 | help='Set all file and folder names to be limited to only the ascii character set.') 118 | 119 | 120 | 121 | ap.add_argument("--archive", 122 | metavar="FILE", type=str, default=None, 123 | help="Only download posts that are not recorded in the archive file.") 124 | 125 | ap.add_argument("--date", 126 | metavar="YYYYMMDD", type=str, default=None, 127 | help="Only download posts published from this date.") 128 | 129 | ap.add_argument("--datebefore", 130 | metavar="YYYYMMDD", type=str, default=None, 131 | help="Only download posts published before this date.") 132 | 133 | ap.add_argument("--dateafter", 134 | metavar="YYYYMMDD", type=str, default=None, 135 | help="Only download posts published after this date.") 136 | 137 | ap.add_argument("--user-updated-datebefore", 138 | metavar="YYYYMMDD", type=str, default=None, 139 | help="Only download user posts if the user was updated before this date.") 140 | 141 | ap.add_argument("--user-updated-dateafter", 142 | metavar="YYYYMMDD", type=str, default=None, 143 | help="Only download user posts if the user was updated after this date.") 144 | 145 | ap.add_argument("--min-filesize", 146 | metavar="SIZE", type=str, default=None, 147 | help="Only download attachments or inline images with greater than this file size. (ex #gb | #mb | #kb | #b)") 148 | 149 | ap.add_argument("--max-filesize", 150 | metavar="SIZE", type=str, default=None, 151 | help="Only download attachments or inline images with less than this file size. (ex #gb | #mb | #kb | #b)") 152 | 153 | ap.add_argument("--only-filetypes", 154 | metavar="EXT", type=str, default=[], 155 | help="Only download attachments or inline images with the given file type(s). Takes a file extensions or list of file extensions separated by a comma. (ex mp4,jpg,gif,zip)") 156 | 157 | ap.add_argument("--skip-filetypes", 158 | metavar="EXT", type=str, default=[], 159 | help="Only download attachments or inline images without the given file type(s). Takes a file extensions or list of file extensions separated by a comma. (ex mp4,jpg,gif,zip)") 160 | 161 | 162 | 163 | ap.add_argument("--version", 164 | action='version', version=str(__version__), 165 | help="Print the version and exit.") 166 | 167 | ap.add_argument("--verbose", 168 | action='store_true', default=False, 169 | help="Display debug information and copies output to a file.") 170 | 171 | ap.add_argument("--quiet", 172 | action='store_true', default=False, 173 | help="Suppress printing except for warnings, errors, and exceptions.") 174 | 175 | ap.add_argument("--simulate", 176 | action='store_true', default=False, 177 | help="Simulate the given command and do not write to disk.") 178 | 179 | ap.add_argument("--no-part-files", 180 | action='store_true', default=False, 181 | help="Do not save attachments or inline images as .part files while downloading. Files partially downloaded will not be resumed if program stops. 
") 182 | 183 | ap.add_argument("--yt-dlp-args", 184 | metavar="YT_DLP_ARGS", type=str, default=None, 185 | help="The args yt-dlp will use to download with. Formatted as a python dictionary object. ") 186 | 187 | ap.add_argument("--post-timeout", 188 | metavar="SEC", type=int, default=0, 189 | help="The time in seconds to wait between downloading posts. (default: 0)") 190 | 191 | ap.add_argument("--retry", 192 | metavar="COUNT", type=int, default=5, 193 | help="The amount of times to retry / resume downloading a file. (default: 5)") 194 | 195 | ap.add_argument("--ratelimit-sleep", 196 | metavar="SEC", type=int, default=120, 197 | help="The time in seconds to wait after being ratelimited (default: 120)") 198 | 199 | ap.add_argument("--user-agent", 200 | metavar="UA", type=str, default='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36', 201 | help="Set a custom user agent") 202 | 203 | args = vars(ap.parse_args()) 204 | 205 | # takes a comma seperated lost of cookie files and loads them into a cookie jar 206 | if args['cookies']: 207 | cookie_files = [s.strip() for s in args["cookies"].split(",")] 208 | args['cookies'] = MozillaCookieJar() 209 | loaded = 0 210 | for cookie_file in cookie_files: 211 | try: 212 | args['cookies'].load(cookie_file) 213 | loaded += 1 214 | except LoadError: 215 | print(F"Unable to load cookie {cookie_file}") 216 | except FileNotFoundError: 217 | print(F"Unable to find cookie {cookie_file}") 218 | if loaded == 0: 219 | print("No cookies loaded | exiting"), exit() 220 | 221 | # takes a comma seperated string of links and converts them to a list 222 | if args['links']: 223 | args['links'] = [s.strip().split('?')[0] for s in args["links"].split(",")] 224 | else: 225 | args['links'] = [] 226 | 227 | # takes a file and converts it to a list 228 | if args['from_file']: 229 | if not os.path.exists(args['from_file']): 230 | print(f"--from-file {args['from_file']} does not exist") 231 | with open(args['from_file'],'r') as f: 232 | # lines starting with '#' are ignored 233 | args['from_file'] = [line.rstrip().split('?')[0] for line in f if line[0] != '#' and line.strip() != ''] 234 | else: 235 | args['from_file'] = [] 236 | 237 | if args['archive']: 238 | # the archive file doesn't need to exist but the directory does 239 | if not os.path.isdir(os.path.dirname(os.path.abspath(args['archive']))): 240 | print(f"--archive {args['archive']} directory does not exist"), quit() 241 | 242 | if args['only_filetypes'] and args['skip_filetypes']: 243 | print('--only-filetypes and --skip-filetypes can not be given together'), quit() 244 | # takes a comma seperated string of extentions and converts them to a list 245 | if args['only_filetypes']: 246 | args['only_filetypes'] = [s.strip().lower() for s in args["only_filetypes"].split(",")] 247 | # takes a comma seperated string of extentions and converts them to a list 248 | if args['skip_filetypes']: 249 | args['skip_filetypes'] = [s.strip().lower() for s in args["skip_filetypes"].split(",")] 250 | 251 | def check_date(args, key): 252 | try: 253 | args[key] = datetime.datetime.strptime(args[key], r'%Y%m%d') 254 | except: 255 | print(f"--{key} {args[key]} is an invalid date | correct format: YYYYMMDD"), exit() 256 | 257 | if args['date']: 258 | check_date(args, 'date') 259 | if args['datebefore']: 260 | check_date(args, 'datebefore') 261 | if args['dateafter']: 262 | check_date(args, 'dateafter') 263 | if args['user_updated_datebefore']: 264 | check_date(args, 
265 |     if args['user_updated_dateafter']:
266 |         check_date(args, 'user_updated_dateafter')
267 | 
268 |     def check_size(args, key):
269 |         found = re.search(r'([0-9]+)(gb|mb|kb|b)', args[key].lower())  # converts a size string to bytes
270 |         if found:
271 |             if found.group(2) == 'b':
272 |                 args[key] = int(found.group(1))
273 |             elif found.group(2) == 'kb':
274 |                 args[key] = int(found.group(1)) * 10**3
275 |             elif found.group(2) == 'mb':
276 |                 args[key] = int(found.group(1)) * 10**6
277 |             elif found.group(2) == 'gb':
278 |                 args[key] = int(found.group(1)) * 10**9
279 |             return
280 |         print(f"--{key} {args[key]} is an invalid size | correct format: ex 1b 1kb 1mb 1gb"), quit()
281 | 
282 |     if args['max_filesize']:
283 |         check_size(args, 'max_filesize')
284 |     if args['min_filesize']:
285 |         check_size(args, 'min_filesize')
286 | 
287 |     if args['kemono_fav_users']:
288 |         temp = []
289 |         for s in args["kemono_fav_users"].split(","):
290 |             if s.strip().lower() in {'all', 'patreon', 'fanbox', 'gumroad', 'subscribestar', 'dlsite', 'fantia'}:
291 |                 temp.append(s.strip().lower())
292 |             else:
293 |                 print(f"--kemono-fav-users {s.strip()} is not a valid option")
294 |         if len(temp) == 0:
295 |             print(f"--kemono-fav-users no valid options were passed")
296 |         args['kemono_fav_users'] = temp
297 | 
298 |     if args['coomer_fav_users']:
299 |         temp = []
300 |         for s in args["coomer_fav_users"].split(","):
301 |             if s.strip().lower() in {'all', 'onlyfans'}:
302 |                 temp.append(s.strip().lower())
303 |             else:
304 |                 print(f"--coomer-fav-users {s.strip()} is not a valid option")
305 |         if len(temp) == 0:
306 |             print(f"--coomer-fav-users no valid options were passed")
307 |         args['coomer_fav_users'] = temp
308 | 
309 |     return args
--------------------------------------------------------------------------------
/src/helper.py:
--------------------------------------------------------------------------------
1 | import re
2 | import hashlib
3 | import os
4 | import time
5 | 
6 | def parse_url(url):
7 |     # parse urls
8 |     downloadable = re.search(r'^https://(kemono\.party|coomer\.party)/([^/]+)/user/([^/]+)($|/post/([^/]+)$)',url)
9 |     if not downloadable:
10 |         return None
11 |     return downloadable.group(1)
12 | 
13 | # create path from template pattern
14 | def compile_post_path(post_variables, template, ascii):
15 |     drive, tail = os.path.splitdrive(template)
16 |     tail = tail[1:] if tail[0] in {'/','\\'} else tail
17 |     tail_split = re.split(r'\\|/', tail)
18 |     cleaned_path = drive + os.path.sep if drive else ''
19 |     for folder in tail_split:
20 |         if ascii:
21 |             cleaned_path = os.path.join(cleaned_path, restrict_ascii(clean_folder_name(folder.format(**post_variables))))
22 |         else:
23 |             cleaned_path = os.path.join(cleaned_path, clean_folder_name(folder.format(**post_variables)))
24 |     return cleaned_path
25 | 
26 | # create file path from template pattern
27 | def compile_file_path(post_path, post_variables, file_variables, template, ascii):
28 |     file_split = re.split(r'\\|/', template)
29 |     if len(file_split) > 1:
30 |         for folder in file_split[:-1]:
31 |             if ascii:
32 |                 post_path = os.path.join(post_path, restrict_ascii(clean_folder_name(folder.format(**file_variables, **post_variables))))
33 |             else:
34 |                 post_path = os.path.join(post_path, clean_folder_name(folder.format(**file_variables, **post_variables)))
35 |     if ascii:
36 |         cleaned_file = restrict_ascii(clean_file_name(file_split[-1].format(**file_variables, **post_variables)))
37 |     else:
38 |         cleaned_file = clean_file_name(file_split[-1].format(**file_variables, **post_variables))
39 |     return 
os.path.join(post_path, cleaned_file) 40 | 41 | # get file hash 42 | def get_file_hash(file:str): 43 | sha256_hash = hashlib.sha256() 44 | with open(file,"rb") as f: 45 | for byte_block in iter(lambda: f.read(4096),b""): 46 | sha256_hash.update(byte_block) 47 | return sha256_hash.hexdigest().lower() 48 | 49 | # clean folder name for windows 50 | def clean_folder_name(folder_name:str): 51 | if not folder_name.rstrip(): 52 | folder_name = '_' 53 | return re.sub(r'[\x00-\x1f\\/:\"*?<>\|]|\.$','_',folder_name.rstrip())[:248] 54 | 55 | # clean file name for windows 56 | def clean_file_name(file_name:str): 57 | if not file_name: 58 | file_name = '_' 59 | file_name = re.sub(r'[\x00-\x1f\\/:\"*?<>\|]','_', file_name) 60 | file_name, file_extension = os.path.splitext(file_name) 61 | return file_name[:255-len(file_extension)-5] + file_extension 62 | 63 | def restrict_ascii(string:str): 64 | return re.sub(r'[^\x21-\x7f]','_',string) 65 | 66 | def check_date(post_date, date, datebefore, dateafter): 67 | if date: 68 | if date == post_date: 69 | return False 70 | if datebefore and dateafter: 71 | if dateafter <= post_date <= datebefore: 72 | return False 73 | elif datebefore: 74 | if datebefore >= post_date: 75 | return False 76 | elif dateafter: 77 | if dateafter <= post_date: 78 | return False 79 | return True 80 | 81 | # prints download bar 82 | def print_download_bar(total:int, downloaded:int, resumed:int, start): 83 | time_diff = time.time() - start 84 | if time_diff == 0.0: 85 | time_diff = 0.000001 86 | done = 50 87 | 88 | rate = (downloaded-resumed)/time_diff 89 | 90 | eta = time.strftime("%H:%M:%S", time.gmtime((total-downloaded) / rate)) 91 | 92 | if rate/2**10 < 100: 93 | rate = (round(rate/2**10, 1), 'KB') 94 | elif rate/2**20 < 100: 95 | rate = (round(rate/2**20, 1), 'MB') 96 | else: 97 | rate = (round(rate/2**30, 1), 'GB') 98 | 99 | if total: 100 | done = int(50*downloaded/total) 101 | if total/2**10 < 100: 102 | total = (round(total/2**10, 1), 'KB') 103 | downloaded = round(downloaded/2**10,1) 104 | elif total/2**20 < 100: 105 | total = (round(total/2**20, 1), 'MB') 106 | downloaded = round(downloaded/2**20,1) 107 | else: 108 | total = (round(total/2**30, 1), 'GB') 109 | downloaded = round(downloaded/2**30,1) 110 | else: 111 | if downloaded/2**10 < 100: 112 | total = ('???', 'KB') 113 | downloaded = round(downloaded/2**10,1) 114 | elif downloaded/2**20 < 100: 115 | total = ('???', 'MB') 116 | downloaded = round(downloaded/2**20,1) 117 | else: 118 | total = ('???', 'GB') 119 | downloaded = round(downloaded/2**30,1) 120 | 121 | bar_fill = '='*done 122 | bar_empty = ' '*(50-done) 123 | overlap_buffer = ' '*15 124 | print(f'[{bar_fill}{bar_empty}] {downloaded}/{total[0]} {total[1]} at {rate[0]} {rate[1]}/s ETA {eta}{overlap_buffer}', end='\r') 125 | 126 | # redo this 127 | # def check_version(): 128 | # try: 129 | # current_version = datetime.datetime.strptime(__version__, r'%Y.%m.%d') 130 | # except: 131 | # current_version = datetime.datetime.strptime(__version__, r'%Y.%m.%d.%H') 132 | # github_api_url = 'https://api.github.com/repos/AplhaSlayer1964/kemono-dl/releases/latest' 133 | # try: 134 | # latest_tag = requests.get(url=github_api_url, timeout=300).json()['tag_name'] 135 | # except: 136 | # logger.error("Failed to check latest version of kemono-dl") 137 | # return 138 | # try: 139 | # latest_version = datetime.datetime.strptime(latest_tag, r'%Y.%m.%d') 140 | # except: 141 | # latest_version = datetime.datetime.strptime(latest_tag, r'%Y.%m.%d.%H') 142 | # if current_version < 
latest_version: 143 | # logger.debug(f"Using kemono-dl {__version__} while latest release is kemono-dl {latest_tag}") 144 | # logger.warning(f"A newer version of kemono-dl is available. Please update to the latest release at https://github.com/AplhaSlayer1964/kemono-dl/releases/latest") -------------------------------------------------------------------------------- /src/logger.py: -------------------------------------------------------------------------------- 1 | import logging 2 | 3 | from .args import get_args 4 | 5 | args = get_args() 6 | 7 | if args['verbose']: 8 | # clear log file 9 | file = open('debug.log','w') 10 | file.close() 11 | 12 | logging.getLogger("requests").setLevel(logging.WARNING) 13 | logging.getLogger("urllib3").setLevel(logging.WARNING) 14 | 15 | logger = logging.getLogger('kemono-dl') 16 | 17 | logger.setLevel(logging.INFO) 18 | if args['quiet']: 19 | logger.setLevel(logging.WARNING) 20 | if args['verbose']: 21 | logger.setLevel(logging.DEBUG) 22 | 23 | file_format = logging.Formatter('%(asctime)s:%(levelname)s:%(message)s') 24 | stream_format = logging.Formatter('%(levelname)s:%(message)s') 25 | 26 | file_handler = logging.FileHandler('debug.log', encoding="utf-16") 27 | file_handler.setFormatter(file_format) 28 | 29 | stream_handler = logging.StreamHandler() 30 | stream_handler.setFormatter(stream_format) 31 | 32 | if args['verbose']: 33 | logger.addHandler(file_handler) 34 | logger.addHandler(stream_handler) -------------------------------------------------------------------------------- /src/main.py: -------------------------------------------------------------------------------- 1 | import requests 2 | from requests.adapters import HTTPAdapter, Retry 3 | import re 4 | import os 5 | from bs4 import BeautifulSoup 6 | import time 7 | import datetime 8 | from PIL import Image 9 | from io import BytesIO 10 | import json 11 | 12 | from .args import get_args 13 | from .logger import logger 14 | from .version import __version__ 15 | from .helper import get_file_hash, print_download_bar, check_date, parse_url, compile_post_path, compile_file_path 16 | from .my_yt_dlp import my_yt_dlp 17 | 18 | class downloader: 19 | 20 | def __init__(self, args): 21 | self.input_urls = args['links'] + args['from_file'] 22 | # list of completed posts from current session 23 | self.comp_posts = [] 24 | # list of creators info 25 | self.creators = [] 26 | 27 | # requests variables 28 | self.headers = {'User-Agent': args['user_agent']} 29 | self.cookies = args['cookies'] 30 | self.timeout = 300 31 | 32 | # file/folder naming 33 | self.download_path_template = args['dirname_pattern'] 34 | self.filename_template = args['filename_pattern'] 35 | self.inline_filename_template = args['inline_filename_pattern'] 36 | self.other_filename_template = args['other_filename_pattern'] 37 | self.user_filename_template = args['user_filename_pattern'] 38 | self.date_strf_pattern = args['date_strf_pattern'] 39 | self.yt_dlp_args = args['yt_dlp_args'] 40 | self.restrict_ascii = args['restrict_names'] 41 | 42 | self.archive_file = args['archive'] 43 | self.archive_list = [] 44 | self.post_errors = 0 45 | 46 | # controls what to download/save 47 | self.attachments = not args['skip_attachments'] 48 | self.inline = args['inline'] 49 | self.content = args['content'] 50 | self.extract_links = args['extract_links'] 51 | self.comments = args['comments'] 52 | self.json = args['json'] 53 | self.yt_dlp = args['yt_dlp'] 54 | self.k_fav_posts = args['kemono_fav_posts'] 55 | self.c_fav_posts = args['coomer_fav_posts'] 56 
| self.k_fav_users = args['kemono_fav_users']
57 |         self.c_fav_users = args['coomer_fav_users']
58 |         self.icon_banner = []
59 |         if args['icon']:
60 |             self.icon_banner.append('icon')
61 |         if args['banner']:
62 |             self.icon_banner.append('banner')
63 |         self.dms = args['dms']
64 | 
65 |         # controls files to ignore
66 |         self.overwrite = args['overwrite']
67 |         self.only_ext = args['only_filetypes']
68 |         self.not_ext = args['skip_filetypes']
69 |         self.max_size = args['max_filesize']
70 |         self.min_size = args['min_filesize']
71 | 
72 |         # controls posts to ignore
73 |         self.date = args['date']
74 |         self.datebefore = args['datebefore']
75 |         self.dateafter = args['dateafter']
76 |         self.user_up_datebefore = args['user_updated_datebefore']
77 |         self.user_up_dateafter = args['user_updated_dateafter']
78 | 
79 |         # other
80 |         self.retry = args['retry']
81 |         self.no_part = args['no_part_files']
82 |         self.ratelimit_sleep = args['ratelimit_sleep']
83 |         self.post_timeout = args['post_timeout']
84 |         self.simulate = args['simulate']
85 | 
86 |         self.session = requests.Session()
87 |         retries = Retry(
88 |             total=self.retry,
89 |             backoff_factor=0.1,
90 |             status_forcelist=[ 500, 502, 503, 504 ]
91 |         )
92 |         self.session.mount('https://', HTTPAdapter(max_retries=retries))
93 |         self.session.mount('http://', HTTPAdapter(max_retries=retries))
94 | 
95 |         self.start_download()
96 | 
97 |     def get_creators(self, domain:str):
98 |         # get site creators
99 |         creators_api = f"https://{domain}/api/creators/"
100 |         logger.debug(f"Getting creator json from {creators_api}")
101 |         return self.session.get(url=creators_api, cookies=self.cookies, headers=self.headers, timeout=self.timeout).json()
102 | 
103 |     def get_user(self, user_id:str, service:str):
104 |         for creator in self.creators:
105 |             if creator['id'] == user_id and creator['service'] == service:
106 |                 return creator
107 |         return None
108 | 
109 |     def get_favorites(self, domain:str, fav_type:str, services:list = None):
110 |         fav_api = f'https://{domain}/api/favorites?type={fav_type}'
111 |         logger.debug(f"Getting favorite json from {fav_api}")
112 |         response = self.session.get(url=fav_api, headers=self.headers, cookies=self.cookies, timeout=self.timeout)
113 |         if response.status_code == 401:
114 |             logger.error(f"{response.status_code} {response.reason} | Bad cookie file")
115 |             return
116 |         if not response.ok:
117 |             logger.error(f"{response.status_code} {response.reason}")
118 |             return
119 |         for favorite in response.json():
120 |             if fav_type == 'post':
121 |                 self.get_post(f"https://{domain}/{favorite['service']}/user/{favorite['user']}/post/{favorite['id']}")
122 |             if fav_type == 'artist':
123 |                 if not (favorite['service'] in services or 'all' in services):
124 |                     logger.info(f"Skipping user {favorite['name']} | Service {favorite['service']} was not requested")
125 |                     continue
126 |                 self.get_post(f"https://{domain}/{favorite['service']}/user/{favorite['id']}")
127 | 
128 |     def get_post(self, url:str):
129 |         found = re.search(r'(https://(kemono\.party|coomer\.party)/)(([^/]+)/user/([^/]+)($|/post/[^/]+))', url)
130 |         if not found:
131 |             logger.error(f"Unable to find url parameters for {url}")
132 |             return
133 |         api = f"{found.group(1)}api/{found.group(3)}"
134 |         site = found.group(2)
135 |         service = found.group(4)
136 |         user_id = found.group(5)
137 |         is_post = found.group(6)
138 |         user = self.get_user(user_id, service)
139 |         if not user:
140 |             logger.error(f"Unable to find user info in creators list | {service} | {user_id}")
141 |             return
142 |         if not is_post:
143 |             if self.skip_user(user):
144 |                 return
145 |         logger.info(f"Downloading posts from {site} | {service} | {user['name']} | {user['id']}")
146 |         chunk = 0
147 |         first = True
148 |         while True:
149 |             if is_post:
150 |                 logger.debug(f"Requesting post json from: {api}")
151 |                 json = self.session.get(url=api, cookies=self.cookies, headers=self.headers, timeout=self.timeout).json()
152 |             else:
153 |                 logger.debug(f"Requesting user json from: {api}?o={chunk}")
154 |                 json = self.session.get(url=f"{api}?o={chunk}", cookies=self.cookies, headers=self.headers, timeout=self.timeout).json()
155 |             if not json:
156 |                 if is_post:
157 |                     logger.error(f"Unable to find post json for {api}")
158 |                 elif chunk == 0:
159 |                     logger.error(f"Unable to find user json for {api}?o={chunk}")
160 |                 return # completed
161 |             for post in json:
162 |                 post = self.clean_post(post, user, site)
163 |                 # only download once
164 |                 if not is_post and first:
165 |                     self.download_icon_banner(post, self.icon_banner)
166 |                     if self.dms:
167 |                         self.write_dms(post)
168 |                     first = False
169 |                 if self.skip_post(post):
170 |                     continue
171 |                 try:
172 |                     self.download_post(post)
173 |                     if self.post_timeout:
174 |                         logger.info(f"Sleeping for {self.post_timeout} seconds.")
175 |                         time.sleep(self.post_timeout)
176 |                 except:
177 |                     logger.exception("Unable to download post | service:{service} user_id:{user_id} post_id:{id}".format(**post['post_variables']))
178 |                 self.comp_posts.append("https://{site}/{service}/user/{user_id}/post/{id}".format(**post['post_variables']))
179 |             if len(json) < 25:
180 |                 return # completed
181 |             chunk += 25
182 | 
183 | 
184 |     def download_icon_banner(self, post:dict, img_types:list):
185 |         for img_type in img_types:
186 |             if post['post_variables']['service'] in {'dlsite'}:
187 |                 logger.warning(f"Profile {img_type}s are not supported for {post['post_variables']['service']} users")
188 |                 return
189 |             if post['post_variables']['service'] in {'gumroad'} and img_type == 'banner':
190 |                 logger.warning(f"Profile {img_type}s are not supported for {post['post_variables']['service']} users")
191 |                 return
192 |             image_url = "https://{site}/{img_type}s/{service}/{user_id}".format(img_type=img_type, **post['post_variables'])
193 |             response = self.session.get(url=image_url,headers=self.headers, cookies=self.cookies, timeout=self.timeout)
194 |             try:
195 |                 image = Image.open(BytesIO(response.content))
196 |                 file_variables = {
197 |                     'filename':img_type,
198 |                     'ext':image.format.lower()
199 |                 }
200 |                 file_path = compile_file_path(post['post_path'], post['post_variables'], file_variables, self.user_filename_template, self.restrict_ascii)
201 |                 if os.path.exists(file_path):
202 |                     logger.info(f"Skipping: {os.path.split(file_path)[1]} | File already exists")
203 |                     return
204 |                 logger.info(f"Downloading: {os.path.split(file_path)[1]}")
205 |                 logger.debug(f"Downloading to: {file_path}")
206 |                 if not self.simulate:
207 |                     if not os.path.exists(os.path.split(file_path)[0]):
208 |                         os.makedirs(os.path.split(file_path)[0])
209 |                     image.save(file_path, format=image.format)
210 |             except:
211 |                 logger.error(f"Unable to download profile {img_type} for {post['post_variables']['username']}")
212 | 
213 |     def write_dms(self, post:dict):
214 |         # no api method to get dms so scraping them from html (not future proof)
215 |         post_url = "https://{site}/{service}/user/{user_id}/dms".format(**post['post_variables'])
216 |         response = self.session.get(url=post_url, allow_redirects=True, headers=self.headers, cookies=self.cookies, timeout=self.timeout)
217 |         page_soup = BeautifulSoup(response.text, 
'html.parser') 218 | if page_soup.find("div", {"class": "no-results"}): 219 | logger.info("No DMs found for https://{site}/{service}/user/{user_id}".format(**post['post_variables'])) 220 | return 221 | dms_soup = page_soup.find("div", {"class": "card-list__items"}) 222 | file_variables = { 223 | 'filename':'direct messages', 224 | 'ext':'html' 225 | } 226 | file_path = compile_file_path(post['post_path'], post['post_variables'], file_variables, self.user_filename_template, self.restrict_ascii) 227 | self.write_to_file(file_path, dms_soup.prettify()) 228 | 229 | def get_inline_images(self, post, content_soup): 230 | # only get images that are hosted by the .party site 231 | inline_images = [inline_image for inline_image in content_soup.find_all("img") if inline_image['src'][0] == '/'] 232 | for index, inline_image in enumerate(inline_images): 233 | file = {} 234 | filename, file_extension = os.path.splitext(inline_image['src'].rsplit('/')[-1]) 235 | m = re.search(r'[a-zA-Z0-9]{64}', inline_image['src']) 236 | file_hash = m.group(0) if m else None 237 | file['file_variables'] = { 238 | 'filename': filename, 239 | 'ext': file_extension[1:], 240 | 'url': f"https://{post['post_variables']['site']}/data{inline_image['src']}", 241 | 'hash': file_hash, 242 | 'index': f"{index + 1}".zfill(len(str(len(inline_images)))) 243 | } 244 | file['file_path'] = compile_file_path(post['post_path'], post['post_variables'], file['file_variables'], self.inline_filename_template, self.restrict_ascii) 245 | # set local image location in html 246 | inline_image['src'] = file['file_path'] 247 | post['inline_images'].append(file) 248 | return content_soup 249 | 250 | def compile_content_links(self, post, content_soup, embed_links): 251 | href_links = content_soup.find_all(href=True) 252 | post['links']['text'] = embed_links 253 | for href_link in href_links: 254 | post['links']['text'] += f"{href_link['href']}\n" 255 | post['links']['file_variables'] = { 256 | 'filename':'links', 257 | 'ext':'txt' 258 | } 259 | post['links']['file_path'] = compile_file_path(post['post_path'], post['post_variables'], post['links']['file_variables'], self.other_filename_template, self.restrict_ascii) 260 | 261 | def get_comments(self, post_variables:dict): 262 | try: 263 | # no api method to get comments so using from html (not future proof) 264 | post_url = "https://{site}/{service}/user/{user_id}/post/{id}".format(**post_variables) 265 | response = self.session.get(url=post_url, allow_redirects=True, headers=self.headers, cookies=self.cookies, timeout=self.timeout) 266 | page_soup = BeautifulSoup(response.text, 'html.parser') 267 | comment_soup = page_soup.find("div", {"class": "post__comments"}) 268 | no_comments = re.search('([^ ]+ does not support comment scraping yet\.|No comments found for this post\.)',comment_soup.text) 269 | if no_comments: 270 | logger.debug(no_comments.group(1).strip()) 271 | return '' 272 | return comment_soup.prettify() 273 | except: 274 | self.post_errors += 1 275 | logger.exception("Failed to get post comments") 276 | 277 | def compile_post_content(self, post, content_soup, comment_soup, embed): 278 | post['content']['text'] = f"{content_soup}\n{embed}\n{comment_soup}" 279 | post['content']['file_variables'] = { 280 | 'filename':'content', 281 | 'ext':'html' 282 | } 283 | post['content']['file_path'] = compile_file_path(post['post_path'], post['post_variables'], post['content']['file_variables'], self.other_filename_template, self.restrict_ascii) 284 | 285 | def clean_post(self, post:dict, user:dict, 
domain:str): 286 | new_post = {} 287 | # set post variables 288 | new_post['post_variables'] = {} 289 | new_post['post_variables']['title'] = post['title'] 290 | new_post['post_variables']['id'] = post['id'] 291 | new_post['post_variables']['user_id'] = post['user'] 292 | new_post['post_variables']['username'] = user['name'] 293 | new_post['post_variables']['site'] = domain 294 | new_post['post_variables']['service'] = post['service'] 295 | new_post['post_variables']['added'] = datetime.datetime.strptime(post['added'], r'%a, %d %b %Y %H:%M:%S %Z').strftime(self.date_strf_pattern) if post['added'] else None 296 | new_post['post_variables']['updated'] = datetime.datetime.strptime(post['edited'], r'%a, %d %b %Y %H:%M:%S %Z').strftime(self.date_strf_pattern) if post['edited'] else None 297 | new_post['post_variables']['user_updated'] = datetime.datetime.strptime(user['updated'], r'%a, %d %b %Y %H:%M:%S %Z').strftime(self.date_strf_pattern) if user['updated'] else None 298 | new_post['post_variables']['published'] = datetime.datetime.strptime(post['published'], r'%a, %d %b %Y %H:%M:%S %Z').strftime(self.date_strf_pattern) if post['published'] else None 299 | 300 | new_post['post_path'] = compile_post_path(new_post['post_variables'], self.download_path_template, self.restrict_ascii) 301 | 302 | new_post['attachments'] = [] 303 | if self.attachments: 304 | # add post file to front of attachments list if it doesn't already exist 305 | if post['file'] and not post['file'] in post['attachments']: 306 | post['attachments'].insert(0, post['file']) 307 | # loop over attachments and set file variables 308 | for index, attachment in enumerate(post['attachments']): 309 | file = {} 310 | filename, file_extension = os.path.splitext(attachment['name']) 311 | m = re.search(r'[a-zA-Z0-9]{64}', attachment['path']) 312 | file_hash = m.group(0) if m else None 313 | file['file_variables'] = { 314 | 'filename': filename, 315 | 'ext': file_extension[1:], 316 | 'url': f"https://{domain}/data{attachment['path']}?f={attachment['name']}", 317 | 'hash': file_hash, 318 | 'index': f"{index + 1}".zfill(len(str(len(post['attachments'])))) 319 | } 320 | file['file_path'] = compile_file_path(new_post['post_path'], new_post['post_variables'], file['file_variables'], self.filename_template, self.restrict_ascii) 321 | new_post['attachments'].append(file) 322 | 323 | new_post['inline_images'] = [] 324 | content_soup = BeautifulSoup(post['content'], 'html.parser') 325 | if self.inline: 326 | content_soup = self.get_inline_images(new_post, content_soup) 327 | 328 | comment_soup = self.get_comments(new_post['post_variables']) if self.comments else '' 329 | 330 | new_post['content'] = {'text':None,'file_variables':None, 'file_path':None} 331 | embed = "{subject}\n{url}\n{description}".format(**post['embed']) if post['embed'] else '' 332 | if (self.content or self.comments) and (content_soup or comment_soup or embed): 333 | self.compile_post_content(new_post, content_soup.prettify(), comment_soup, embed) 334 | 335 | new_post['links'] = {'text':None,'file_variables':None, 'file_path':None} 336 | embed_url = "{url}\n".format(**post['embed']) if post['embed'] else '' 337 | if self.extract_links: 338 | self.compile_content_links(new_post, content_soup, embed_url) 339 | 340 | return new_post 341 | 342 | def download_post(self, post:dict): 343 | # might look buggy if title has new lines in it 344 | logger.info("Downloading Post | {title}".format(**post['post_variables'])) 345 | logger.debug("Post URL: 
https://{site}/{service}/user/{user_id}/post/{id}".format(**post['post_variables']))
346 |         self.download_attachments(post)
347 |         self.download_inline(post)
348 |         self.write_content(post)
349 |         self.write_links(post)
350 |         if self.json:
351 |             self.write_json(post)
352 |         self.download_yt_dlp(post)
353 |         self.write_archive(post)
354 |         self.post_errors = 0
355 | 
356 |     def download_attachments(self, post:dict):
357 |         # download the post attachments
358 |         for file in post['attachments']:
359 |             try:
360 |                 self.download_file(file, retry=self.retry)
361 |             except:
362 |                 self.post_errors += 1
363 |                 logger.exception(f"Failed to download: {file['file_path']}")
364 | 
365 |     def download_inline(self, post:dict):
366 |         # download the post inline files
367 |         for file in post['inline_images']:
368 |             try:
369 |                 self.download_file(file, retry=self.retry)
370 |             except:
371 |                 self.post_errors += 1
372 |                 logger.exception(f"Failed to download: {file['file_path']}")
373 | 
374 |     def write_content(self, post:dict):
375 |         # write post content
376 |         if post['content']['text']:
377 |             try:
378 |                 self.write_to_file(post['content']['file_path'], post['content']['text'])
379 |             except:
380 |                 self.post_errors += 1
381 |                 logger.exception(f"Failed to save content")
382 | 
383 |     def write_links(self, post:dict):
384 |         # write post content links
385 |         if post['links']['text']:
386 |             try:
387 |                 self.write_to_file(post['links']['file_path'], post['links']['text'])
388 |             except:
389 |                 self.post_errors += 1
390 |                 logger.exception(f"Failed to save content links")
391 | 
392 |     def write_json(self, post:dict):
393 |         try:
394 |             # add this to clean post function
395 |             file_variables = {
396 |                 'filename':'json',
397 |                 'ext':'json'
398 |             }
399 |             file_path = compile_file_path(post['post_path'], post['post_variables'], file_variables, self.other_filename_template, self.restrict_ascii)
400 |             self.write_to_file(file_path, post)
401 |         except:
402 |             self.post_errors += 1
403 |             logger.exception(f"Failed to save json")
404 | 
405 |     def write_to_file(self, file_path, file_content):
406 |         # check if file exists and if should overwrite
407 |         if os.path.exists(file_path) and not self.overwrite:
408 |             logger.info(f"Skipping: {os.path.split(file_path)[1]} | File already exists")
409 |             return
410 |         logger.info(f"Writing: {os.path.split(file_path)[1]}")
411 |         logger.debug(f"Writing to: {file_path}")
412 |         if not self.simulate:
413 |             # create folder path if it doesn't exist
414 |             if not os.path.exists(os.path.split(file_path)[0]):
415 |                 os.makedirs(os.path.split(file_path)[0])
416 |             # write to file
417 |             if isinstance(file_content, dict):
418 |                 with open(file_path,'w') as f:
419 |                     json.dump(file_content, f, indent=4, sort_keys=True)
420 |             else:
421 |                 with open(file_path,'wb') as f:
422 |                     f.write(file_content.encode("utf-16"))
423 | 
424 |     def download_file(self, file:dict, retry:int):
425 |         # download a file
426 |         if self.skip_file(file):
427 |             return
428 | 
429 |         part_file = f"{file['file_path']}.part" if not self.no_part else file['file_path']
430 | 
431 |         logger.info(f"Downloading: {os.path.split(file['file_path'])[1]}")
432 |         logger.debug(f"Downloading from: {file['file_variables']['url']}")
433 |         logger.debug(f"Downloading to: {part_file}")
434 | 
435 |         # try to resume part file
436 |         resume_size = 0
437 |         if os.path.exists(part_file) and not self.overwrite:
438 |             resume_size = os.path.getsize(part_file)
439 |             logger.info(f"Trying to resume partial download | Resume size: {resume_size} bytes")
440 | 
441 |         try:
442 |             response = 
self.session.get(url=file['file_variables']['url'], stream=True, headers={**self.headers,'Range':f"bytes={resume_size}-"}, cookies=self.cookies, timeout=self.timeout)
443 |         except:
444 |             logger.exception(f"Failed to get response: {file['file_variables']['url']} | Retrying")
445 |             if retry > 0:
446 |                 self.download_file(file, retry=retry-1)
447 |                 return
448 |             logger.error(f"Failed to get response: {file['file_variables']['url']} | All retries failed")
449 |             self.post_errors += 1
450 |             return
451 | 
452 |         # response status code checking
453 |         if response.status_code == 404:
454 |             logger.error(f"Failed to download: {os.path.split(file['file_path'])[1]} | 404 Not Found")
455 |             self.post_errors += 1
456 |             return
457 | 
458 |         if response.status_code == 403:
459 |             logger.error(f"Failed to download: {os.path.split(file['file_path'])[1]} | 403 Forbidden")
460 |             self.post_errors += 1
461 |             return
462 | 
463 |         if response.status_code == 416:
464 |             logger.warning(f"Failed to download: {os.path.split(file['file_path'])[1]} | 416 Range Not Satisfiable | Assuming broken server hash value")
465 |             content_length = int(self.session.get(url=file['file_variables']['url'], stream=True, headers=self.headers, cookies=self.cookies, timeout=self.timeout).headers.get('content-length', 0))
466 |             if content_length == resume_size:
467 |                 logger.debug("Correct amount of bytes downloaded | Assuming download completed successfully")
468 |                 if self.overwrite:
469 |                     os.replace(part_file, file['file_path'])
470 |                 else:
471 |                     os.rename(part_file, file['file_path'])
472 |                 return
473 |             logger.error("Incorrect amount of bytes downloaded | Something went so wrong I have no idea what happened | Removing file")
474 |             os.remove(part_file)
475 |             self.post_errors += 1
476 |             return
477 | 
478 |         if response.status_code == 429:
479 |             logger.warning(f"Failed to download: {os.path.split(file['file_path'])[1]} | 429 Too Many Requests | Sleeping for {self.ratelimit_sleep} seconds")
480 |             time.sleep(self.ratelimit_sleep)
481 |             if retry > 0:
482 |                 self.download_file(file, retry=retry-1)
483 |                 return
484 |             logger.error(f"Failed to download: {os.path.split(file['file_path'])[1]} | 429 Too Many Requests | All retries failed")
485 |             self.post_errors += 1
486 |             return
487 |         if not response.ok:
488 |             logger.error(f"Failed to download: {os.path.split(file['file_path'])[1]} | {response.status_code} {response.reason}")
489 |             self.post_errors += 1
490 |             return
491 | 
492 |         total = int(response.headers.get('content-length', 0))
493 |         if total:
494 |             total += resume_size
495 | 
496 |         if not self.simulate:
497 |             if not os.path.exists(os.path.split(file['file_path'])[0]):
498 |                 os.makedirs(os.path.split(file['file_path'])[0])
499 |             with open(part_file, 'ab') as f:
500 |                 start = time.time()
501 |                 downloaded = resume_size
502 |                 for chunk in response.iter_content(chunk_size=1024*1024):
503 |                     downloaded += len(chunk)
504 |                     f.write(chunk)
505 |                     print_download_bar(total, downloaded, resume_size, start)
506 |                 print()
507 | 
508 |             # verify download
509 |             local_hash = get_file_hash(part_file)
510 |             logger.debug(f"Local file hash: {local_hash}")
511 |             logger.debug(f"Server file hash: {file['file_variables']['hash']}")
512 |             if local_hash != file['file_variables']['hash']:
513 |                 logger.warning(f"File hash did not match server! | Retrying")
514 |                 if retry > 0:
515 |                     self.download_file(file, retry=retry-1)
516 |                     return
517 |                 logger.error(f"File hash did not match server! | All retries failed")
| All retries failed") 518 | self.post_errors += 1 519 | return 520 | # remove .part from file name 521 | if self.overwrite: 522 | os.replace(part_file, file['file_path']) 523 | else: 524 | os.rename(part_file, file['file_path']) 525 | 526 | def download_yt_dlp(self, post:dict): 527 | # download from video streaming site 528 | # if self.yt_dlp and post['embed']: 529 | pass 530 | # my_yt_dlp(post['embed']['url'], post['post_path'], self.yt_dlp_args) 531 | 532 | def load_archive(self): 533 | # load archived posts 534 | if self.archive_file and os.path.exists(self.archive_file): 535 | with open(self.archive_file,'r') as f: 536 | self.archive_list = f.read().splitlines() 537 | 538 | def write_archive(self, post:dict): 539 | if self.archive_file and self.post_errors == 0 and not self.simulate: 540 | with open(self.archive_file,'a') as f: 541 | f.write("https://{site}/{service}/user/{user_id}/post/{id}".format(**post['post_variables']) + '\n') 542 | 543 | def skip_user(self, user:dict): 544 | # check last update date 545 | if self.user_up_datebefore or self.user_up_dateafter: 546 | if check_date(datetime.datetime.strptime(user['updated'], r'%a, %d %b %Y %H:%M:%S %Z'), None, self.user_up_datebefore, self.user_up_dateafter): 547 | logger.info("Skipping user | user updated date not in range") 548 | return True 549 | return False 550 | 551 | def skip_post(self, post:dict): 552 | # check if the post should be downloaded 553 | if self.archive_file: 554 | if "https://{site}/{service}/user/{user_id}/post/{id}".format(**post['post_variables']) in self.archive_list: 555 | logger.info("Skipping post | post already archived") 556 | return True 557 | 558 | if self.date or self.datebefore or self.dateafter: 559 | if not post['post_variables']['published']: 560 | logger.info("Skipping post | post published date not in range") 561 | return True 562 | elif check_date(datetime.datetime.strptime(post['post_variables']['published'], self.date_strf_pattern), self.date, self.datebefore, self.dateafter): 563 | logger.info("Skipping post | post published date not in range") 564 | return True 565 | 566 | if "https://{site}/{service}/user/{user_id}/post/{id}".format(**post['post_variables']) in self.comp_posts: 567 | logger.info("Skipping post | post was already downloaded this session") 568 | return True 569 | 570 | return False 571 | 572 | def skip_file(self, file:dict): 573 | # check if file exists 574 | if not self.overwrite: 575 | if os.path.exists(file['file_path']): 576 | logger.info(f"Skipping: {os.path.split(file['file_path'])[1]} | File already exists") 577 | return True 578 | 579 | # check file name extention 580 | if self.only_ext: 581 | if not file['file_variables']['ext'].lower() in self.only_ext: 582 | logger.info(f"Skipping: {os.path.split(file['file_path'])[1]} | File extention {file['file_variables']['ext']} not found in include list {self.only_ext}") 583 | return True 584 | if self.not_ext: 585 | if file['file_variables']['ext'].lower() in self.not_ext: 586 | logger.info(f"Skipping: {os.path.split(file['file_path'])[1]} | File extention {file['file_variables']['ext']} found in exclude list {self.not_ext}") 587 | return True 588 | 589 | # check file size 590 | if self.min_size or self.max_size: 591 | file_size = requests.get(file['file_variables']['url'], cookies=self.cookies, stream=True).headers.get('content-length', 0) 592 | if int(file_size) == 0: 593 | logger.info(f"Skipping: {os.path.split(file['file_path'])[1]} | File size not included in file header") 594 | return True 595 | if self.min_size and 
    def start_download(self):
        # start the download process
        self.load_archive()

        urls = []
        domains = []

        for url in self.input_urls:
            domain = parse_url(url)
            if not domain:
                logger.warning(f"URL is not downloadable | {url}")
                continue
            urls.append(url)
            if domain not in domains:
                domains.append(domain)

        if self.k_fav_posts or self.k_fav_users:
            if 'kemono.party' not in domains:
                domains.append('kemono.party')
        if self.c_fav_posts or self.c_fav_users:
            if 'coomer.party' not in domains:
                domains.append('coomer.party')

        for domain in domains:
            try:
                self.creators += self.get_creators(domain)
            except Exception:
                logger.exception(f"Unable to get list of creators from {domain}")
        if not self.creators:
            logger.error("No creator information was retrieved | Exiting")
            exit()

        if self.k_fav_posts:
            try:
                self.get_favorites('kemono.party', 'post', retry=self.retry)
            except Exception:
                logger.exception("Unable to get favorite posts from kemono.party")
        if self.c_fav_posts:
            try:
                self.get_favorites('coomer.party', 'post', retry=self.retry)
            except Exception:
                logger.exception("Unable to get favorite posts from coomer.party")
        if self.k_fav_users:
            try:
                self.get_favorites('kemono.party', 'artist', self.k_fav_users)
            except Exception:
                logger.exception("Unable to get favorite users from kemono.party")
        if self.c_fav_users:
            try:
                self.get_favorites('coomer.party', 'artist', self.c_fav_users)
            except Exception:
                logger.exception("Unable to get favorite users from coomer.party")

        for url in urls:
            try:
                self.get_post(url)
            except Exception:
                logger.exception(f"Unable to get posts for {url}")

def main():
    downloader(get_args())
--------------------------------------------------------------------------------
/src/my_yt_dlp.py:
--------------------------------------------------------------------------------
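# Thin wrapper around yt-dlp used for post embeds (the --yt-dlp option is
# still under construction): downloads a single url into file_path and cleans
# up the yt_dlp_temp working folder afterwards.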
Could not download URL {url}") 24 | return -------------------------------------------------------------------------------- /src/version.py: -------------------------------------------------------------------------------- 1 | __version__ = '2022.04.28' 2 | --------------------------------------------------------------------------------