├── tests ├── __init__.py └── test_crawler.py ├── utils ├── __init__.py ├── git_util.py └── file_util.py ├── img ├── 2021-01-01.png ├── 2021-01-02.png ├── 2021-01-03.png ├── 2021-01-04.png ├── 2021-01-05.png ├── 2021-01-06.png ├── 2021-01-07.png ├── 2021-01-08.png ├── 2021-01-09.png ├── 2021-01-10.png ├── 2021-01-11.png ├── 2021-01-12.png ├── 2021-01-13.png ├── 2021-01-14.png └── 2021-01-15.png ├── .gitignore ├── requirements.txt ├── config.py ├── repository_info.py ├── main.py ├── README.md ├── crawler.py ├── 2021-01-16.md ├── 2021-01-14.md ├── 2021-01-13.md ├── 2021-01-11.md ├── 2021-01-05.md ├── 2021-01-02.md ├── 2021-01-01.md ├── 2021-01-10.md ├── 2021-01-06.md └── 2021-01-09.md /tests/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /img/2021-01-01.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-01.png -------------------------------------------------------------------------------- /img/2021-01-02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-02.png -------------------------------------------------------------------------------- /img/2021-01-03.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-03.png -------------------------------------------------------------------------------- /img/2021-01-04.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-04.png -------------------------------------------------------------------------------- /img/2021-01-05.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-05.png -------------------------------------------------------------------------------- /img/2021-01-06.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-06.png -------------------------------------------------------------------------------- /img/2021-01-07.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-07.png -------------------------------------------------------------------------------- /img/2021-01-08.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-08.png -------------------------------------------------------------------------------- /img/2021-01-09.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-09.png -------------------------------------------------------------------------------- /img/2021-01-10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-10.png -------------------------------------------------------------------------------- /img/2021-01-11.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-11.png 
-------------------------------------------------------------------------------- /img/2021-01-12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-12.png -------------------------------------------------------------------------------- /img/2021-01-13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-13.png -------------------------------------------------------------------------------- /img/2021-01-14.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-14.png -------------------------------------------------------------------------------- /img/2021-01-15.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fgksgf/GitHub-Trending-Crawler/HEAD/img/2021-01-15.png -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .idea/ 2 | __pycache__/ 3 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | certifi==2019.11.28 2 | chardet==3.0.4 3 | cssselect==1.1.0 4 | cycler==0.10.0 5 | docopt==0.6.2 6 | idna==2.8 7 | jieba==0.39 8 | kiwisolver==1.1.0 9 | lxml==4.6.2 10 | matplotlib==3.0.2 11 | numpy==1.18.1 12 | Pillow==7.1.0 13 | pyparsing==2.4.6 14 | pyquery==1.4.0 15 | pytest==5.3.5 16 | python-dateutil==2.8.0 17 | requests==2.22.0 18 | six==1.14.0 19 | urllib3==1.25.8 20 | wordcloud==1.5.0 21 | 
-------------------------------------------------------------------------------- /tests/test_crawler.py: -------------------------------------------------------------------------------- 1 | from unittest import TestCase 2 | 3 | from crawler import GitHubCrawler 4 | 5 | 6 | class TestGitHubCrawler(TestCase): 7 | def test_crawl(self): 8 | crawler = GitHubCrawler() 9 | self.assertIsNotNone(crawler.crawl('python')) 10 | self.assertIsNotNone(crawler.crawl('c')) 11 | self.assertIsNotNone(crawler.crawl('java')) 12 | 13 | def test_run(self): 14 | # TODO: check if the markdown file and the wordcloud image exist 15 | self.assertTrue(True) 16 | -------------------------------------------------------------------------------- /utils/git_util.py: -------------------------------------------------------------------------------- 1 | import datetime 2 | import os 3 | 4 | from config import MD_FILE_NAME, IMG_FILE_NAME 5 | 6 | 7 | def git_add_commit_push(): 8 | """ 9 | Upload the markdown file and the image to GitHub by git. 
10 | """ 11 | date = datetime.datetime.now().strftime('%Y-%m-%d') 12 | 13 | md_name = MD_FILE_NAME.format(name=date) 14 | img_name = IMG_FILE_NAME.format(name=date) 15 | 16 | cmd_git_add = 'git add {file1} && git add {file2}'.format(file1=md_name, file2=img_name) 17 | cmd_git_commit = 'git commit -m "{date}"'.format(date=date) 18 | cmd_git_push = 'git push -u origin master' 19 | 20 | os.system(cmd_git_add) 21 | os.system(cmd_git_commit) 22 | os.system(cmd_git_push) 23 | -------------------------------------------------------------------------------- /config.py: -------------------------------------------------------------------------------- 1 | # programming languages you are interested in and want to crawl 2 | LANGUAGES = ['python', 'java', 'unknown', 'javascript', 'html', 'go'] 3 | 4 | # frequency of trending 5 | FREQUENCY = ['daily', 'weekly', 'monthly'] 6 | 7 | # wordcloud chart config 8 | WC_BG_COLOR = 'white' # background color 9 | WC_WIDTH = 800 10 | WC_HEIGHT = 600 11 | WC_MARGIN = 2 12 | WC_FONT_PATH = 'MSYH.TTC' # use this font to ensure Chinese words can be shown 13 | WC_RANDOM_STATE = 20 14 | 15 | # proxy pool api 16 | PROXY_POOL_API = { 17 | 'get_a_proxy': '', 18 | } 19 | 20 | DAY = 24 * 60 * 60 21 | 22 | HEADERS = { 23 | 'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.96', 24 | 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8', 25 | 'Accept-Encoding': 'gzip,deflate,sdch', 26 | 'Accept-Language': 'zh-CN,zh;q=0.8' 27 | } 28 | 29 | TRENDING_URL = 'https://github.com/trending/{language}?since={frequency}' 30 | 31 | MD_FILE_NAME = '{name}.md' 32 | 33 | IMG_FILE_NAME = 'img/{name}.png' 34 | -------------------------------------------------------------------------------- /repository_info.py: -------------------------------------------------------------------------------- 1 | class RepoInfo: 2 | """ 3 | Information of the repository. 
4 | """ 5 | 6 | GITHUB_URL = 'https://github.com' 7 | 8 | def __init__(self, href, stars, desc): 9 | # href: '/Username/RepoName' 10 | p = href.rindex('/') 11 | 12 | # owner of the repository 13 | self.owner = href[:p] 14 | 15 | # name of the repository 16 | self.name = href[p + 1:] 17 | 18 | # the complete url of the repository 19 | self.url = self.GITHUB_URL + href 20 | 21 | # status about how many stars it got, a string 22 | self.star_status = stars 23 | 24 | # description of the repository 25 | self.desc = desc 26 | 27 | def convert_to_md(self): 28 | """ 29 | Convert repository info into markdown format. 30 | 31 | :return: 32 | """ 33 | pattern = u"+ [{name}]({url})(**{stars}**): {desc}\n" 34 | return pattern.format(name=self.name, url=self.url, desc=self.desc, stars=self.star_status) 35 | 36 | def __str__(self): 37 | return str({'name': self.name, 'url': self.url, 'star_status': self.star_status, 'desc': self.desc}) 38 | -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | """GitHub Trending Crawler. 2 | 3 | Usage: 4 | main.py (-h | --help) 5 | main.py (-v | --version) 6 | main.py [-l | --loop] [-p | --push] [--frequency=] 7 | 8 | Options: 9 | -h --help Show this screen. 10 | -v --version Show version. 11 | -l --loop Run this program cyclically. 12 | -p --push Use git to push the markdown and the image. 13 | --frequency= Speed in knots [default: daily]. 
14 | 15 | """ 16 | import time 17 | 18 | from docopt import docopt 19 | 20 | from config import DAY, LANGUAGES 21 | from crawler import GitHubCrawler 22 | from utils.git_util import git_add_commit_push 23 | 24 | if __name__ == '__main__': 25 | arguments = docopt(__doc__, version='GitHub Trending Crawler V1.5') 26 | 27 | frequency = {'daily': DAY, 28 | 'weekly': 7 * DAY, 29 | 'monthly': 30 * DAY} 30 | 31 | f = arguments['--frequency'] 32 | 33 | if f not in frequency.keys(): 34 | print("The parameter --frequency should be 'daily', 'weekly' or 'monthly'.") 35 | else: 36 | crawler = GitHubCrawler(frequency=f) 37 | while True: 38 | crawler.run(langs=LANGUAGES) 39 | 40 | if arguments['--push']: 41 | git_add_commit_push() 42 | 43 | if arguments['--loop']: 44 | if f in frequency.keys(): 45 | time.sleep(frequency.get(f)) 46 | else: 47 | time.sleep(DAY) 48 | else: 49 | break 50 | -------------------------------------------------------------------------------- /utils/file_util.py: -------------------------------------------------------------------------------- 1 | import codecs 2 | import jieba 3 | from wordcloud import WordCloud 4 | 5 | from config import IMG_FILE_NAME, WC_BG_COLOR, WC_RANDOM_STATE, WC_FONT_PATH, WC_MARGIN, WC_HEIGHT, WC_WIDTH 6 | 7 | 8 | def create_markdown(date, filename): 9 | """ 10 | Create a markdown file. 11 | 12 | :param date: today's date. 13 | :param filename: the markdown file's name. 14 | """ 15 | with open(filename, 'w') as f: 16 | f.write("# " + date + "\n") 17 | f.write("See what the GitHub community is most excited about.\n") 18 | 19 | 20 | def generate_wordcloud(descriptions, filename, debug=False): 21 | """ 22 | Generate a wordcloud chart according to all descriptions of repositories. 23 | 24 | :param descriptions: a list containing all of today's repository descriptions. 25 | :param filename: the name of the wordcloud chart. 
26 | :param debug: if True, display the chart with matplotlib. 27 | """ 28 | # join all descriptions into one string, separated by spaces 29 | text = ' '.join(descriptions) 30 | 31 | # use jieba to do Chinese word segmentation first 32 | word_list = jieba.cut(text, cut_all=False) 33 | f = " ".join(word_list) 34 | 35 | # set the word cloud's attributes 36 | wc = WordCloud(background_color=WC_BG_COLOR, 37 | width=WC_WIDTH, 38 | height=WC_HEIGHT, 39 | margin=WC_MARGIN, 40 | font_path=WC_FONT_PATH, # Use this font to ensure Chinese words can be shown 41 | random_state=WC_RANDOM_STATE).generate(f) 42 | 43 | if debug: 44 | # use matplotlib to display the wordcloud chart for debugging 45 | import matplotlib.pyplot as plt 46 | plt.figure() 47 | plt.imshow(wc, interpolation='bilinear') 48 | plt.axis("off") 49 | plt.show() 50 | 51 | # save the chart as a png file 52 | path = IMG_FILE_NAME.format(name=filename) 53 | wc.to_file(path) 54 | return path 55 | 56 | 57 | def append_infos_to_md(filename, language, infos): 58 | """ 59 | Append all repos' information of a language to the markdown file. 60 | 61 | :param filename: the markdown file's name 62 | :param language: the programming language 63 | :param infos: a list of `RepoInfo` objects for the language 64 | """ 65 | # use codecs to solve the utf-8 encoding problem like Chinese 66 | with codecs.open(filename, "a", "utf-8") as f: 67 | f.write('\n## {language}\n'.format(language=language)) 68 | for info in infos: 69 | f.write(info.convert_to_md()) 70 | 71 | 72 | def append_img_to_md(img_path, md_path): 73 | """ 74 | Append the wordcloud image to the markdown file. 
75 | 76 | :param img_path: the image's path and name 77 | :param md_path: the markdown file's path and name 78 | """ 79 | with codecs.open(md_path, "a", "utf-8") as f: 80 | f.write('\n## WordCloud\n') 81 | f.write('![]({path})\n'.format(path=img_path)) 82 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # GitHub-Trending-Crawler 2 | 3 | Crawling [GitHub Trending Pages](https://github.com/trending/) every day. 4 | 5 | ## Introduction 6 | 7 | The program is highly recommended to be deployed on a Linux server. Every day it crawls information about popular GitHub repositories in the languages you are interested in, records that information in a markdown file, and generates a wordcloud image from the repositories' descriptions. 8 | 9 | This crawler is designed to help me keep track of the latest trends in technology and discover new and interesting repositories. In fact, reading the newest markdown file has become part of my daily routine. More importantly, it boosts my GitHub contribution graph :P 10 | 11 | The idea was inspired by [LJ147](https://github.com/LJ147/GithubTrending). 12 | 13 | ## Requirements 14 | 15 | + python 3.6+ 16 | + git 17 | + screen 18 | + unzip 19 | 20 | ## Configuration 21 | 22 | + **Fork my repo or create your own repo** for uploading markdown files. 23 | 24 | + If you don't have SSH keys, [generate a new SSH key and add it to the ssh-agent](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/). 25 | 26 | ## Usage on Linux 27 | 28 | ``` bash 29 | $ sudo apt install -y unzip screen python3-pip 30 | $ sudo apt-get install -y python-tk python3-tk 31 | 32 | # the `release` branch is stable and contains only the code. 
33 | $ wget https://github.com/fgksgf/GitHub-Trending-Crawler/archive/release.zip 34 | $ unzip release.zip 35 | $ cd GitHub-Trending-Crawler-release/ 36 | $ mkdir img 37 | $ git init 38 | $ git remote add origin <your-repo-url> 39 | 40 | # using a virtual environment is highly recommended 41 | $ pip3 install -r requirements.txt 42 | ``` 43 | 44 | 1. Switch to the repository directory and type `screen` at the command prompt. A screen session will open with an interface exactly like the normal command prompt. 45 | 46 | 2. Inside the screen session you can do all your work just as in a normal CLI environment. Since `screen` is an application, it also has its own commands and parameters. 47 | 48 | 3. Now we can run the program: `python3 main.py -p -l` 49 | 50 | 4. While the program is running, you can press `Ctrl + A` and then `d` to detach the screen. After that you can disconnect your SSH session. 51 | 52 | 5. When you want to check the status of the crawler, just reconnect to your server via ssh. Then use the command `screen -r` to restore the screen. _For more information about the `screen` command, you can visit [here](https://www.tecmint.com/screen-command-examples-to-manage-linux-terminals/)._ 53 | 54 | ## CLI Options 55 | 56 | ```bash 57 | python3 main.py (-h | --help) 58 | python3 main.py (-v | --version) 59 | python3 main.py [-l | --loop] [-p | --push] [--frequency=<freq>] 60 | 61 | Options: 62 | -h --help Show this screen. 63 | -v --version Show version. 64 | -l --loop Run this program cyclically. 65 | -p --push Use git to push the markdown and the image. 66 | --frequency=<freq> The frequency of crawling [default: daily]. 
67 | ``` 68 | 69 | ## Change Logs 70 | 71 | ### V1.5 (2020-02-22) 72 | 73 | + Refactor code with object-oriented methods 74 | + Split single python file into several files 75 | + Improve exception handling 76 | + Add logging feature 77 | + Use `docopt` to enhance command-line usage 78 | + Update requirements 79 | -------------------------------------------------------------------------------- /crawler.py: -------------------------------------------------------------------------------- 1 | import datetime 2 | import logging 3 | import sys 4 | 5 | import requests 6 | from pyquery import PyQuery as pq 7 | 8 | from config import TRENDING_URL, HEADERS, MD_FILE_NAME, LANGUAGES, FREQUENCY, PROXY_POOL_API 9 | from repository_info import RepoInfo 10 | from utils.file_util import append_infos_to_md, create_markdown, generate_wordcloud, append_img_to_md 11 | 12 | 13 | class GitHubCrawler: 14 | """ 15 | GitHub trending crawler 16 | """ 17 | 18 | def __init__(self, frequency='daily', use_proxy=False): 19 | self.frequency = frequency 20 | self.use_proxy = use_proxy 21 | 22 | # init logger 23 | self.logger = logging.getLogger(__name__) 24 | self.logger.setLevel(level=logging.INFO) 25 | 26 | # output logs to console 27 | stream_handler = logging.StreamHandler(sys.stdout) 28 | stream_handler.setLevel(level=logging.INFO) 29 | formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s', 30 | datefmt='%Y/%m/%d %H:%M:%S') 31 | stream_handler.setFormatter(formatter) 32 | self.logger.addHandler(stream_handler) 33 | 34 | def get_random_proxy(self): 35 | """ 36 | Get a random proxy. 
37 | 38 | :return: a proxy dict or None 39 | """ 40 | proxies = { 41 | 'http': '', 42 | 'https': '', 43 | } 44 | try: 45 | response = requests.get(PROXY_POOL_API['get_a_proxy']) 46 | if response.status_code == 200: 47 | proxy = 'http://' + str(response.json()['proxy']) 48 | proxies['http'] = proxy 49 | proxies['https'] = proxy 50 | except Exception: 51 | self.logger.error('Failed to get a proxy!') 52 | return None 53 | else: 54 | return proxies 55 | 56 | def parse(self, dollar): 57 | """ 58 | Parse and extract useful information from the html by pyquery. 59 | If the trending page changes, just modify this method. 60 | 61 | :param dollar: the pyquery object, just like $ in jquery 62 | :return: a list of `RepoInfo` objects 63 | """ 64 | repo_infos = [] 65 | try: 66 | articles = dollar( 67 | '.explore-pjax-container.container-lg.p-responsive.pt-6 > div > div:nth-child(2)').children() 68 | for i in range(len(articles)): 69 | article = articles.eq(i) 70 | 71 | # href: '/Username/RepoName' 72 | href = article('.lh-condensed a').attr('href') 73 | 74 | # the description about the repo 75 | desc = article('.col-9.text-gray.my-1.pr-4').text().strip().replace('\n', '') 76 | 77 | # how many stars it got 78 | stars = article('div.f6.text-gray.mt-2 > span.d-inline-block.float-sm-right').text().strip() 79 | 80 | repo_infos.append(RepoInfo(href=href, stars=stars, desc=desc)) 81 | except Exception: 82 | self.logger.error("The GitHub trending page changed, can't parse!") 83 | 84 | return repo_infos 85 | 86 | def crawl(self, lang): 87 | """ 88 | Crawl the trending page of a programming language. 
89 | 90 | :param lang: programming language name 91 | :return: a list of `RepoInfo` objects, or None/an empty list on failure 92 | """ 93 | ret = None 94 | try: 95 | url = TRENDING_URL.format(language=lang, frequency=self.frequency) 96 | 97 | proxies = None 98 | if self.use_proxy: 99 | proxies = self.get_random_proxy() 100 | if proxies: 101 | r = requests.get(url, headers=HEADERS, proxies=proxies) 102 | else: 103 | r = requests.get(url, headers=HEADERS) 104 | 105 | # if the status code is not 200, raise an error 106 | r.raise_for_status() 107 | 108 | # use pyquery to parse the html 109 | ret = self.parse(pq(r.text)) 110 | 111 | if not ret: 112 | raise ValueError('Failed to parse!') 113 | except requests.ConnectionError: 114 | self.logger.error("Connection Error.") 115 | except Exception as e: 116 | self.logger.error('%s', e.args) 117 | self.logger.error('Failed to crawl: %s', lang) 118 | else: 119 | self.logger.info('Done: %s', lang) 120 | return ret 121 | 122 | def run(self, langs=LANGUAGES): 123 | """ 124 | Crawl the GitHub trending pages of the given languages. 
125 | 126 | :param langs: list of programming language names to be crawled 127 | """ 128 | # get today's date 129 | today_date = datetime.datetime.now().strftime('%Y-%m-%d') 130 | filename = MD_FILE_NAME.format(name=today_date) 131 | 132 | # Create markdown file 133 | create_markdown(today_date, filename) 134 | 135 | descriptions = [] 136 | for lang in langs: 137 | infos = self.crawl(lang) 138 | # skip languages that failed to crawl 139 | if not infos: 140 | continue 141 | append_infos_to_md(filename, lang, infos) 142 | for info in infos: 143 | descriptions.append(info.desc) 144 | 145 | path = generate_wordcloud(descriptions=descriptions, filename=today_date) 146 | append_img_to_md(img_path=path, md_path=filename) 147 | self.logger.info("Finished crawling: %s", today_date) 148 | -------------------------------------------------------------------------------- /2021-01-16.md: -------------------------------------------------------------------------------- 1 | # 2021-01-16 2 | See what the GitHub community is most excited about. 3 | 4 | ## python 5 | + [dagster](https://github.com/dagster-io/dagster)(**52 stars today**): A data orchestrator for machine learning, analytics, and ETL. 6 | + [hacktricks](https://github.com/carlospolop/hacktricks)(**222 stars today**): Welcome to the page where you will find each trick/technique/whatever I have learnt in CTFs, real life apps, and reading researches and news. 7 | + [d2l-vn](https://github.com/mlbvn/d2l-vn)(**22 stars today**): Một cuốn sách tương tác về học sâu có mã nguồn, toán và thảo luận. Đề cập đến nhiều framework phổ biến (TensorFlow, Pytorch & MXNet) và được sử dụng tại 175 trường Đại học. 8 | + [ansible](https://github.com/ansible/ansible)(**23 stars today**): Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy and maintain. Automate everything from code deployment to network configuration to cloud management, in a language that approaches plain English, using SSH, with no agents to install on remote systems. https://docs.ansible.com. 
9 | + [transformers](https://github.com/huggingface/transformers)(**82 stars today**): 🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0. 10 | + [mlflow](https://github.com/mlflow/mlflow)(**12 stars today**): Open source platform for the machine learning lifecycle 11 | + [locust](https://github.com/locustio/locust)(**117 stars today**): Scalable user load testing tool written in Python 12 | + [airflow](https://github.com/apache/airflow)(**32 stars today**): Apache Airflow - A platform to programmatically author, schedule, and monitor workflows 13 | + [algorithms](https://github.com/keon/algorithms)(**16 stars today**): Minimal examples of data structures and algorithms in Python 14 | + [black](https://github.com/psf/black)(**12 stars today**): The uncompromising Python code formatter 15 | + [ray](https://github.com/ray-project/ray)(**31 stars today**): An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library. 16 | + [RushCoupon](https://github.com/pujie1216/RushCoupon)(**10 stars today**): No virus 17 | + [jumpserver](https://github.com/jumpserver/jumpserver)(**35 stars today**): JumpServer 是全球首款开源的堡垒机,是符合 4A 的专业运维安全审计系统。 18 | + [redash](https://github.com/getredash/redash)(**14 stars today**): Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data. 19 | + [jax](https://github.com/google/jax)(**19 stars today**): Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more 20 | + [nuclei-templates](https://github.com/projectdiscovery/nuclei-templates)(**6 stars today**): Community curated list of templates for the nuclei engine to find a security vulnerability in application. 
21 | + [faker](https://github.com/joke2k/faker)(**15 stars today**): Faker is a Python package that generates fake data for you. 22 | + [fastapi](https://github.com/tiangolo/fastapi)(**48 stars today**): FastAPI framework, high performance, easy to learn, fast to code, ready for production 23 | + [awx](https://github.com/ansible/awx)(**4 stars today**): AWX Project 24 | + [tinygrad](https://github.com/geohot/tinygrad)(**17 stars today**): You like pytorch? You like micrograd? You love tinygrad!❤️ 25 | + [edx-platform](https://github.com/edx/edx-platform)(**2 stars today**): The Open edX platform, the software that powers edX! 26 | + [client](https://github.com/wandb/client)(**10 stars today**): 🔥A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API. 27 | + [isort](https://github.com/PyCQA/isort)(**3 stars today**): A Python utility / library to sort imports. 28 | + [pyyaml](https://github.com/yaml/pyyaml)(**6 stars today**): Canonical source repository for PyYAML 29 | + [pydantic](https://github.com/samuelcolvin/pydantic)(**11 stars today**): Data parsing and validation using Python type hints 30 | 31 | ## java 32 | + [Logi-KafkaManager](https://github.com/didi/Logi-KafkaManager)(**108 stars today**): 一站式Apache Kafka集群指标监控与运维管控平台 33 | + [mybatis-plus-samples](https://github.com/baomidou/mybatis-plus-samples)(**9 stars today**): MyBatis-Plus Samples 34 | + [GitHub-Chinese-Top-Charts](https://github.com/kon9chunkit/GitHub-Chinese-Top-Charts)(**39 stars today**): 🇨🇳GitHub中文排行榜,帮助你发现高分优秀中文项目、更高效地吸收国人的优秀经验成果;榜单每周更新一次,敬请关注! 35 | + [Mindustry](https://github.com/Anuken/Mindustry)(**26 stars today**): A sandbox tower defense game 36 | + [APIJSON](https://github.com/Tencent/APIJSON)(**34 stars today**): 🏆码云最有价值开源项目🚀后端接口和文档自动化,前端(客户端) 定制返回 JSON 的数据和结构!🏆Gitee Most Valuable Project🚀A JSON Transmission Protocol and an ORM Library for automatically providing APIs and Docs. 
37 | + [SmartTubeNext](https://github.com/yuliskov/SmartTubeNext)(**13 stars today**): Better YouTube experience on Android TV 38 | + [elasticsearch](https://github.com/elastic/elasticsearch)(**25 stars today**): Open Source, Distributed, RESTful Search Engine 39 | + [jenkins](https://github.com/jenkinsci/jenkins)(**14 stars today**): Jenkins automation server 40 | + [nacos](https://github.com/alibaba/nacos)(**40 stars today**): an easy-to-use dynamic service discovery, configuration and service management platform for building cloud native applications. 41 | + [zfile](https://github.com/zhaojun1998/zfile)(**39 stars today**): 在线云盘、网盘、OneDrive、云存储、私有云、对象存储、h5ai 42 | + [languagetool](https://github.com/languagetool-org/languagetool)(**13 stars today**): Style and Grammar Checker for 25+ Languages 43 | + [soul](https://github.com/dromara/soul)(**37 stars today**): High-Performance Java API Gateway 44 | + [geoserver](https://github.com/geoserver/geoserver)(**3 stars today**): Official GeoServer repository 45 | + [dynamic-datasource-spring-boot-starter](https://github.com/baomidou/dynamic-datasource-spring-boot-starter)(**9 stars today**): dynamic datasource for springboot 多数据源 动态数据源 主从分离 读写分离 分布式事务 https://dynamic-datasource.com/ 46 | + [testing-samples](https://github.com/android/testing-samples)(**3 stars today**): A collection of samples demonstrating different frameworks and techniques for automated testing 47 | + [ghidra](https://github.com/NationalSecurityAgency/ghidra)(**26 stars today**): Ghidra is a software reverse engineering (SRE) framework 48 | + [kafka](https://github.com/apache/kafka)(**13 stars today**): Mirror of Apache Kafka 49 | + [TelegramBots](https://github.com/rubenlagus/TelegramBots)(**6 stars today**): Java library to create bots using Telegram Bots API 50 | + [vhr](https://github.com/lenve/vhr)(**27 stars today**): 微人事是一个前后端分离的人力资源管理系统,项目采用SpringBoot+Vue开发。 51 | + [zaproxy](https://github.com/zaproxy/zaproxy)(**6 stars today**): The OWASP ZAP 
core project 52 | + [pulsar](https://github.com/apache/pulsar)(**6 stars today**): Apache Pulsar - distributed pub-sub messaging system 53 | + [tink](https://github.com/google/tink)(**7 stars today**): Tink is a multi-language, cross-platform, open source library that provides cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse. 54 | + [aws-lambda-developer-guide](https://github.com/awsdocs/aws-lambda-developer-guide)(**4 stars today**): The AWS Lambda Developer Guide 55 | + [react-native-video](https://github.com/react-native-video/react-native-video)(**6 stars today**): A