├── .gitignore
├── README.md
├── incremental_upload_to_aliyun_oss.py
└── oss_config.json


/.gitignore:
--------------------------------------------------------------------------------
# config files #
################
oss_config.json

# OS generated files #
######################
.DS_Store

# PyCharm generated files #
######################
.idea/

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# aliyun-oss-sync

### Introduction
An incremental upload script for Aliyun (Alibaba Cloud) OSS.

The logic is simple. The script recursively walks a local directory and checks whether each file already exists in OSS. If the file does not exist, it is uploaded directly. If it exists, the script compares the Content-MD5 values; a mismatch means the file content has changed, so the file is uploaded again and OSS automatically overwrites the object with the same name.

Note that the Content-MD5 value is fetched with the HTTP HEAD method. Only the Content-MD5 header field is needed, so there is no reason to fetch the response body with GET; this both speeds up the check and saves OSS traffic. A single thread walks the directory and submits an upload task for each file to a thread pool. Uploading is I/O-bound, so the uploads run on multiple threads, using Python's default worker count. In Python 3.5 to 3.7 that default is:

```python
max_workers = (os.cpu_count() or 1) * 5
```

Python 3.8 and later default to `min(32, (os.cpu_count() or 1) + 4)` instead.

About the ossDomain value: if you run this script on an ECS instance in the same region as the bucket, use the internal domain, which is faster and saves traffic fees; otherwise use the public domain.

Requirements:
Python 3.5+ (the script calls `ThreadPoolExecutor()` without `max_workers`, which earlier versions do not accept), plus the `oss2` and `requests` packages.

Usage:

Download [incremental_upload_to_aliyun_oss.py](https://storage.tianshuang.me/aliyun-oss-sync/incremental_upload_to_aliyun_oss.py)

Download [oss_config.json](https://storage.tianshuang.me/aliyun-oss-sync/oss_config.json)

Replace the template values in oss_config.json with your own OSS configuration, put the two files in the same directory, and run:

```Bash
python3 incremental_upload_to_aliyun_oss.py
```

Suggestions are welcome.

--------------------------------------------------------------------------------
/incremental_upload_to_aliyun_oss.py:
--------------------------------------------------------------------------------
import base64
import hashlib
import json
import os
import sys
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote

import oss2
import requests

REQUIRED_CONFIG_KEYS = ('accessKeyId', 'accessKeySecret', 'endpoint', 'bucketName', 'ossDomain', 'localDir')


def content_md5(local_file_path):
    """Return the base64-encoded MD5 digest of a file, the format used by the Content-MD5 header."""
    m = hashlib.md5()
    with open(local_file_path, 'rb') as file:
        # Read in chunks so large files are never loaded into memory at once.
        while True:
            d = file.read(8192)
            if not d:
                break
            m.update(d)
    return str(base64.b64encode(m.digest()), 'utf-8')


def upload_file_to_aliyun_oss(local_file_path):
    """Upload one file unless OSS already holds an object with the same Content-MD5."""
    if is_windows:
        local_file_path = local_file_path.replace('\\', '/')
    if local_file_path.endswith('.DS_Store') or not os.path.isfile(local_file_path):
        return
    # The object key is the file path relative to local_dir (which ends with a slash).
    oss_object_key = local_file_path[len(local_dir):]
    # HEAD returns only the headers, so no object data is downloaded.
    # quote() percent-encodes characters such as '#' or '?' that would otherwise break the URL.
    oss_response = requests.head('https://' + oss_domain + '/' + quote(oss_object_key))
    # .get() avoids a KeyError when the response carries no Content-MD5 header.
    if oss_response.status_code == 200 and content_md5(local_file_path) == oss_response.headers.get('Content-MD5'):
        return

    print('uploading: ' + local_file_path)
    result = bucket.put_object_from_file(oss_object_key, local_file_path)
    if result.status != 200:
        # Raise instead of calling exit(): a SystemExit raised inside a pool worker
        # is stored on the future and would not stop the program.
        raise RuntimeError('upload error, response information: ' + str(result))


if __name__ == '__main__':
    with open('oss_config.json') as oss_config_file:
        oss_config = json.load(oss_config_file)
    for key in REQUIRED_CONFIG_KEYS:
        if key not in oss_config:
            raise ValueError('No ' + key + ' in oss_config.json')
    if not str(oss_config['localDir']).strip().endswith('/'):
        raise ValueError('localDir must end with a slash, example: /Users/Poison/blog/public/')

    is_windows = os.name == 'nt'

    auth = oss2.Auth(oss_config['accessKeyId'], oss_config['accessKeySecret'])
    bucket = oss2.Bucket(auth, oss_config['endpoint'], oss_config['bucketName'])
    oss_domain = oss_config['ossDomain']
    # Use the same stripped value that was validated above, so key offsets stay correct.
    local_dir = str(oss_config['localDir']).strip()

    futures = []
    with ThreadPoolExecutor() as executor:
        for dirpath, dirnames, filenames in os.walk(local_dir):
            for filename in filenames:
                futures.append(executor.submit(upload_file_to_aliyun_oss, os.path.join(dirpath, filename)))

    # Exceptions raised in worker threads only surface when the future's result is requested.
    failed = False
    for future in futures:
        try:
            future.result()
        except Exception as error:
            print('upload failed: ' + str(error))
            failed = True
    if failed:
        sys.exit(1)

--------------------------------------------------------------------------------
/oss_config.json:
--------------------------------------------------------------------------------
{
  "accessKeyId": "",
  "accessKeySecret": "",
  "endpoint": "",
  "bucketName": "",
  "ossDomain": "",
  "localDir": ""
}

--------------------------------------------------------------------------------
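The skip-or-upload rule described in the README can be tried offline, without an OSS bucket. The following is a minimal sketch, not part of the repository: `needs_upload` and the `chunk_size` parameter are hypothetical names introduced here, and the HEAD response is stood in for by a plain status code and a dict of headers. It assumes the remote Content-MD5 header holds the base64-encoded MD5 digest, as the script itself does.

```python
import base64
import hashlib
import os
import tempfile


def content_md5(path, chunk_size=8192):
    """Return the base64-encoded MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode('utf-8')


def needs_upload(path, status_code, headers):
    """Hypothetical helper: decide from a HEAD response whether a file must be uploaded.

    `headers` is a plain dict here, so the key lookup is case-sensitive;
    the headers object returned by `requests` is case-insensitive.
    """
    if status_code != 200:
        # The object could not be confirmed to exist, so upload it.
        return True
    # Upload when the remote digest is missing or differs from the local one.
    return headers.get('Content-MD5') != content_md5(path)


# Small demonstration with a temporary file standing in for a local file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello')
demo_path = tmp.name
try:
    local_digest = content_md5(demo_path)
    print('missing object:', needs_upload(demo_path, 404, {}))
    print('same content:', needs_upload(demo_path, 200, {'Content-MD5': local_digest}))
    print('changed content:', needs_upload(demo_path, 200, {'Content-MD5': 'stale-digest'}))
finally:
    os.remove(demo_path)
```

Keeping the decision in a function that takes only a status code and headers makes the rule easy to test on its own; the real script would pass in the values from its `requests.head` call.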