├── .github
│   └── workflows
│       └── release.yaml
├── README.md
├── json
│   └── live.json
├── live
│   ├── live.m3u
│   └── live.txt
├── liveRoom
│   ├── 卫视.txt
│   ├── 央视.txt
│   └── 港澳台.txt
├── main.py
├── pylive
│   ├── __init__.py
│   ├── liveSpider.py
│   └── mp4Info.py
└── requirements.txt

/.github/workflows/release.yaml:
--------------------------------------------------------------------------------
name: Live Spider
on:
  workflow_dispatch:
jobs:
  Spider:
    runs-on: ubuntu-latest
    env:
      TZ: Asia/Shanghai
    permissions:
      contents: write
      security-events: write
      pull-requests: write
    steps:
      - name: Clone TVSpider Repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - name: Setup Python
        uses: actions/setup-python@v3
        with:
          python-version: 3.8

      - name: Upgrade Pip
        run: |
          python -m pip install --upgrade pip

      - name: Install requirements
        run: |
          pip install -r ./requirements.txt
      - name: Live Spider
        run: |
          python ./main.py

      - name: Commit Live Config files
        run: |
          git config --global user.email ${{ secrets.GT_EMAIL }}
          git config --global user.name ${{ secrets.GH_ACTOR }}
          git add json
          git add live
          commit_msg="* Update live sources"
          git commit -a -m "$commit_msg"

      - name: Push Live Config Changes To Github
        uses: ad-m/github-push-action@master
        with:
          github_token: ${{ github.Token }}
          directory: .
          branch: main
          force: true
        env:
          GITHUB_ACTOR: ${{ secrets.GH_ACTOR }}

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# LiveSpider
Live stream source crawler

## Installation
```bash
pip install -r requirements.txt
```

## Running the script

```bash
python main.py
```

> Run it locally, or run it through Workflows

## Workflows

The following secrets need to be added under Actions:

```bash
GT_EMAIL: email address
GH_ACTOR: author name
```
> Synchronization starts automatically every day at 00:00 Beijing time

--------------------------------------------------------------------------------
/json/live.json:
--------------------------------------------------------------------------------
{}
--------------------------------------------------------------------------------
/live/live.txt:
--------------------------------------------------------------------------------
#EXTM3U

--------------------------------------------------------------------------------
/liveRoom/卫视.txt:
--------------------------------------------------------------------------------
北京卫视
湖南卫视
浙江卫视
江苏卫视
东方卫视
河南卫视
山东卫视
河北卫视
辽宁卫视
安徽卫视
重庆卫视
四川卫视
天津卫视
吉林卫视
黑龙江卫视
云南卫视
广东卫视
广西卫视
贵州卫视
深圳卫视
湖北卫视
东南卫视
厦门卫视
山西卫视
陕西卫视
甘肃卫视
内蒙古卫视
宁夏卫视
新疆卫视
西藏卫视
青海卫视
三沙卫视
金鹰卡通
嘉佳卡通
炫动卡通
旅游卫视
星空卫视
--------------------------------------------------------------------------------
/liveRoom/央视.txt:
--------------------------------------------------------------------------------
CCTV1
CCTV2
CCTV3
CCTV4
CCTV5
CCTV5+
CCTV6
CCTV7
CCTV8
CCTV9
CCTV10
CCTV11
CCTV12
CCTV13
CCTV14
CCTV15
CCTV16
CCTV17
--------------------------------------------------------------------------------
/liveRoom/港澳台.txt:
--------------------------------------------------------------------------------
香港卫视
凤凰香港
凤凰中文
凤凰资讯
亚洲新闻
新加坡中文
新加坡卫视
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @File : searchLive.py
# @Author : jade
# @Date : 2024/5/7 13:58
# @Email : jadehh@1ive.com
# @Software : Samples
# @Desc : Main entry point for the live stream search
from pylive.liveSpider import LiveSpider

if __name__ == '__main__':
    liveSpider = LiveSpider()
    liveSpider.run()

--------------------------------------------------------------------------------
/pylive/__init__.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @File : __init__.py.py
# @Author : jade
# @Date : 2024/5/7 13:57
# @Email : jadehh@1ive.com
# @Software : Samples
# @Desc :

--------------------------------------------------------------------------------
/pylive/liveSpider.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @File : liveSpider.py
# @Author : jade
# @Date : 2024/5/7 13:59
# @Email : jadehh@1ive.com
# @Software : Samples
# @Desc : Live stream crawler
import os
import time
import urllib.parse
import m3u8
import requests
from lxml import etree
from jade import JadeLogging, CreateSavePath
from m3u8 import M3U8
from pylive.mp4Info import Mp4Info
import threading
import json
import logging
requests.packages.urllib3.disable_warnings()
JadeLog = JadeLogging("log", Level="INFO")


class MyThread(threading.Thread):
    # Thread wrapper that keeps the target's return value in self.result
    def __init__(self, target, args=()):
        super(MyThread, self).__init__()
        self.target = target
        self.args = args
        self.result = None

    def run(self):
        self.result = self.target(*self.args)


class LiveSpider(object):
    def __init__(self):
        self.siteUrl = "http://tonkiang.us"
        self.liveRoomPath = "liveRoom"
        self.maxPage = 3
        self.maxReconnect = 3
        self.maxTsDownloadTimes = 3
        self.reconnect = 0
        self.sleepTime = 1
        self.timeout = 3
        self.maxSize = 1024 * 1024 * 10
        self.headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36",
            "host":"tonkiang.us",
            "Accept": "*/*",
            "Connection":"keep-alive",
            "Content-Type": "application/x-www-form-urlencoded",
            "Cookie":"_ga=GA1.1.561799003.1715167785; HstCfa4853344=1715167785578; HstCmu4853344=1715167785578;; HstCnv4853344=5; HstCns4853344=8; REFERER=16363174; HstCla4853344=1716384255718; HstPn4853344=2; HstPt4853344=32; _ga_JNMLRB3QLF=GS1.1.1716384250.5.1.1716384258.0.0.0"
        }
        self.tmpPath = CreateSavePath("tmp")
        self.saveJsonPath = CreateSavePath("json")
        self.saveLivePath = CreateSavePath("live")
        self.saveXmlPath = CreateSavePath("xml")
        # Channel groups are processed in this order; each key must match a file name in liveRoom/
        self.sortKeys = ["央视", "卫视", "港澳台"]
        super().__init__()


    def sortFileList(self, fileList):
        # Keep the first file matching each key, in sortKeys order
        newFileList = []
        for key in self.sortKeys:
            for fileName in fileList:
                if key in fileName:
                    newFileList.append(fileName)
                    break
        return newFileList

    def getSearchResult(self):
        fileList = self.sortFileList(os.listdir(self.liveRoomPath))
        playJson = {}
        playObj = {}
        index = 1
        for fileName in fileList:
            playList = []
            with open(os.path.join(self.liveRoomPath, fileName), "rb") as f:
                # The file name without its extension becomes the group title
                fileName = fileName.split(".")[0]
                contentList = f.readlines()
                for content in contentList:
                    name = str(content, encoding="utf-8").strip()
                    if name:
                        playUrl = self.spiderSearch(name)
                        if len(playUrl.strip()) > 0:
                            playJson[name.lower()] = playUrl
                            writeContent = '#EXTINF:-1 tvg-id="{}" tvg-name="{}" tvg-logo="{}" group-title="{}",{}\nvideo://{}\n'.format(
                                index, name,
                                "https://epg.112114.xyz/logo/{}.png".format(name), fileName,
                                name, playUrl)
                            playList.append(writeContent)
                            index = index + 1
                playObj[fileName] = playList

        if len(playObj.keys()) > 0:
            with open(os.path.join(self.saveLivePath, "live.txt"), "wb") as f1:
                f1.write("#EXTM3U\n".encode("utf-8"))
                for key in playObj.keys():
                    for playUrl in playObj[key]:
                        f1.write(playUrl.encode("utf-8"))

            with open(os.path.join(self.saveJsonPath, "live.json"), "wb") as f2:
                f2.write(json.dumps(playJson, indent=4, ensure_ascii=False).encode("utf-8"))


    def getParams(self, name):
        return {"search": name, "Submit": " "}

    def fetch(self, url, cookies, headers, params, verify):
        # GET with up to maxReconnect retries on non-200 responses
        try:
            res = requests.get(url,cookies=cookies, headers=headers, params=params, verify=verify)
            if res.status_code == 200:
                self.reconnect = 0
                return res
            elif res.status_code != 200 and self.reconnect < self.maxReconnect:
                time.sleep(self.sleepTime)
                self.reconnect = self.reconnect + 1
                JadeLog.WARNING("GET request failed, retry attempt {}".format(self.reconnect))
                return self.fetch(url,cookies, headers, params, verify)
            else:
                self.reconnect = 0
                JadeLog.ERROR("GET request failed, maximum retries exceeded, please check the connection: {}".format(url))
                return None
        except Exception as e:
            JadeLog.ERROR("GET request failed")
            raise e

    def getCookies(self):
        res = self.fetch(self.siteUrl,None,self.headers,None,True)
        return res.cookies

    def post(self, url,cookies, headers, data, verify):
        # POST with up to maxReconnect retries on non-200 responses
        try:
            res = requests.post(url, cookies=cookies,headers=headers,
                                data=data, verify=verify)
            if res.status_code == 200:
                self.reconnect = 0
                return res
            elif res.status_code != 200 and self.reconnect < self.maxReconnect:
                time.sleep(self.sleepTime)
                self.reconnect = self.reconnect + 1
                JadeLog.WARNING("POST request failed, retry attempt {}".format(self.reconnect))
                return self.post(url,cookies, headers, data, verify)
            else:
                self.reconnect = 0
                JadeLog.ERROR("POST request failed, maximum retries exceeded, please check the connection: {}".format(url))
                return None
        except Exception as e:
            JadeLog.ERROR("POST request failed")
            raise e


    def m3u8Get(self, url):
        try:
            res = requests.get(url, stream=True, timeout=self.timeout)
            for chunk in res.iter_content(chunk_size=2048):  # only the first non-empty chunk (up to 2048 bytes) is returned
                if chunk:
                    res.close()
                    return chunk
            return None
        except Exception as e:
            JadeLog.ERROR("Playback URL: {}, request failed".format(url))
            return None

    def tsGet(self, url):
        try:
            start = time.time()
            res = requests.get(url, stream=True)
            res.raise_for_status()
            body = []
            for chunk in res.iter_content(1024):  # Adjust this value to provide more or less granularity.
                body.append(chunk)
                if time.time() > (start + self.timeout):
                    break  # You can set an error value here as well if you want.
            return b''.join(body)
        except Exception as e:
            JadeLog.ERROR("GET request failed, reason: {}".format(e))


    def downloadTsUrl(self, segments):
        # Measure the download speed and the resolution of the ts files
        startTime = time.time()
        totalSize = 0
        index = 1
        threadList = []
        for segment in segments:
            downloadStartTime = time.time()
            JadeLog.DEBUG("Downloading file: {}".format(segment.absolute_uri))
            data = self.tsGet(segment.absolute_uri)
            use_time = time.time() - downloadStartTime
            JadeLog.DEBUG("File: {}, download finished, elapsed: {}s".format(segment.absolute_uri, ("%.2f" % (use_time))))
            if data is not None and use_time < 5:
                mp4Info = Mp4Info(data)
                totalSize = totalSize + len(data)
                index = index + 1
                if index > self.maxTsDownloadTimes:
                    break
            else:
                return 0, 0, "0*0"
        totalTime = time.time() - startTime
        averageSpeed = totalSize / totalTime / (1024 * 1024)  # convert to MB/s
        # Resolution is width * height (the original multiplied the width by itself)
        return averageSpeed, mp4Info.videoWidth * mp4Info.videoHeight, mp4Info.aspect

    def downloadLiveUrl(self, m3u8Url):
        startTime = time.time()
        try:
            res = requests.get(m3u8Url, stream=True, timeout=self.timeout)
            data = b""
            for chunk in res.iter_content(chunk_size=1024 * 1024):  # 1 MiB chunks
                if chunk:
                    data = data + chunk
                    if len(data) >= self.maxSize:
                        res.close()
                        break
            mp4Info = Mp4Info(data)
            averageSpeed = len(data) / (time.time() - startTime) / (1024 * 1024)  # convert to MB/s
            # Resolution is width * height (the original multiplied the width by itself)
            return averageSpeed, mp4Info.videoWidth * mp4Info.videoHeight, mp4Info.aspect
        except Exception:
            return 0, 0, "0*0"

    def downloadM3U8Url(self, m3u8Url):
        res = self.m3u8Get(m3u8Url)
        try:
            if res:
                if b"EXTM3U" in res:
                    m3u8Object = M3U8(str(res, encoding="utf-8"), base_uri=m3u8Url.rsplit('/', 1)[0])
                    if 0 < len(m3u8Object.segments) < 10:  # guard against playlists that are not live streams
                        return self.downloadTsUrl(m3u8Object.segments)
                    else:
                        return 0, 0, "0*0"
                else:
                    return self.downloadLiveUrl(m3u8Url)
            else:
                return 0, 0, "0*0"
        except Exception:
            return 0, 0, "0*0"

    def selectBestUrl(self, name, m3u8List):
        maxQulity = 0
        maxQulityUrl = ""
        startTime = time.time()
        JadeLog.INFO("Name: {}, searching for the best playback URL".format(name), True)
        threadList = []
        bestSpeed = 0
        bestAspect = ""
        # Probe every candidate URL in its own thread
        for m3u8Url in m3u8List:
            mythread = MyThread(self.downloadM3U8Url, args=(m3u8Url,))
            mythread.start()
            threadList.append((mythread, m3u8Url))
        for (thread, m3u8Url) in threadList:
            thread.join()
            speed, resolution, aspect = thread.result
            if speed > 0:
                JadeLog.INFO(
                    "Playback URL: {}, speed: {}MB/s, resolution: {}, test finished".format(m3u8Url, ("%.2f" % speed), aspect))
            else:
                JadeLog.WARNING("Playback URL: {}, resource unavailable".format(m3u8Url))
            # Only URLs faster than 0.5 MB/s compete; the score is resolution * speed
            if speed > 0.5:
                if resolution * speed > maxQulity:
                    maxQulity = resolution * speed
                    maxQulityUrl = m3u8Url
                    bestSpeed = speed
                    bestAspect = aspect
        JadeLog.INFO("Name: {}, best playback URL: {}, speed: {}MB/s, resolution: {}, elapsed: {}s".format(name, maxQulityUrl,
                                                                                  ("%.2f" % bestSpeed),
                                                                                  bestAspect, ("%.2f" % (
                                                                                      time.time() - startTime))))
        return maxQulityUrl

    def parseXML(self, name, html, m3u8List):
        root = etree.HTML(html)
        result_divs = root.xpath("//div[@class='resultplus']")
        for div in result_divs:
            a = div.xpath(".//a/div")
            for element in div.xpath(".//tba"):
                if element.text is not None:
                    if (a[0].text.strip().lower().replace("-", "").replace("K", "") == name.lower()):
                        m3u8List.append(element.text.strip())
        return m3u8List

    def spiderSearch(self, name):
        m3u8List = []
        for i in range(self.maxPage):
            if i > 0:
                url = self.siteUrl + "/?page={}&channel={}".format(i+1,name)
                response = self.fetch(url,None,self.headers,None,verify=True)
            else:
                response = self.post(self.siteUrl,None,headers=self.headers,data={"seerch":name,"Submit":"","city":"MTE0NjIxNDA4NjIxeHh4"},verify=True)
            if response:
                self.writeXml("{}_{}".format(name,i),response.content)
                self.parseXML(name, response.text, m3u8List)
        return self.selectBestUrl(name, m3u8List)

    def writeXml(self,name,content):
        with open(os.path.join(self.saveXmlPath,name + ".xml"),"wb") as f:
            f.write(content)

    def run(self):
        self.getSearchResult()
        JadeLog.release()


--------------------------------------------------------------------------------
/pylive/mp4Info.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @File : mp4Info.py
# @Author : jade
# @Date : 2024/5/7 14:56
# @Email : jadehh@1ive.com
# @Software : Samples
# @Desc :
import io
from pymediainfo import MediaInfo
class Mp4Info():
    def __init__(self, data):
        self.media_info = self.get_media_info(data)
        self.duration = 0
        self.size = 0
        self.type = ""
        self.videoCodec = ""
        self.videoWidth = 0
        self.videoHeight = 0
        self.get_genera_info()
        self.aspect = "{}*{}".format(self.videoWidth, self.videoHeight)

    def get_media_info(self, data):
        # Parse the in-memory bytes without writing them to disk
        return MediaInfo.parse(io.BufferedReader(io.BytesIO(data)))

    def get_genera_info(self):
        for track in self.media_info.tracks:
            if track.track_type == 'General':
                if track.duration:
                    self.duration = track.duration
                if track.file_size:
                    self.size = track.file_size
                if track.internet_media_type:
                    self.type = track.internet_media_type
            if track.track_type == "Video":
                if track.internet_media_type:
                    self.videoCodec = track.internet_media_type
                if track.sampled_height:
                    self.videoHeight = int(track.sampled_height)
                if track.sampled_width:
                    self.videoWidth = int(track.sampled_width)
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
lxml
pymediainfo
m3u8
https://github.com/jadehh/pythonTools/releases/download/JadeV2.1.7/jade-2.1.7-py3-none-any.whl
requests
--------------------------------------------------------------------------------
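
Note on the ranking rule in `selectBestUrl`: the method skips any URL whose measured speed is at or below 0.5 MB/s and, among the rest, keeps the one with the largest `resolution * speed`. The sketch below isolates that rule as a pure function so it can be checked without any network access. `pick_best` and its `min_speed` parameter are hypothetical names introduced here for illustration; they are not part of the repository.

```python
from typing import Iterable, Tuple


def pick_best(candidates: Iterable[Tuple[str, float, int]],
              min_speed: float = 0.5) -> str:
    """Hypothetical helper mirroring the ranking rule in selectBestUrl.

    Each candidate is (url, speed in MB/s, resolution in pixels).
    Returns the URL with the highest resolution * speed among candidates
    faster than min_speed, or "" when none qualifies.
    """
    best_url = ""
    best_score = 0.0
    for url, speed, resolution in candidates:
        score = resolution * speed
        if speed > min_speed and score > best_score:
            best_score = score
            best_url = url
    return best_url


if __name__ == "__main__":
    sample = [
        ("http://example.invalid/a.m3u8", 0.4, 1920 * 1080),  # too slow, skipped
        ("http://example.invalid/b.m3u8", 1.0, 1280 * 720),
        ("http://example.invalid/c.m3u8", 0.6, 1920 * 1080),
    ]
    print(pick_best(sample))
```

Because the score multiplies the two factors, a slower 1080p source can outrank a faster 720p one as long as it clears the speed threshold.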