├── README.md
└── BSC_Contract_Scraper.py

/README.md:
--------------------------------------------------------------------------------
# Binance Smart Chain Contract Scraper + Contract Evaluator
Pulls BscScan's feed of newly-verified Binance Smart Chain contracts every 30 seconds, then checks their contract code for links to socials.
Returns only those that include socials information, and then submits each contract address to TokenSniffer to evaluate contract legitimacy.

Sample execution:
![2b423cea3307c40b307fdfdfe2528592](https://user-images.githubusercontent.com/62744506/149968695-6a91dc12-be82-408b-9082-ff5796896391.png)

It's common practice to include links to social media such as Telegram somewhere in the contract for added transparency. The idea of the scraper is to return new
contracts just as they come out that have a website, social media, etc., which demonstrates some level of ambition - just a website is often enough for decently high market caps.

Generally the contracts returned will never have high evaluation scores, as they have only just come out and TokenSniffer's evaluation criteria are based on
factors like buying fees, contract ownership renouncements, liquidity locks, etc., which in most cases are set some time after contract verification.

Upon startup you'll have to solve the captcha that appears in the ChromeDriver window to bypass TokenSniffer's bot protection.
Contracts returned with scores around 40-50 are generally the ones to keep an eye on, as their overall evaluation is bound to rise if the team goes on to do the things mentioned above
after launching. In general, I advise trading the tokens returned by the scraper with caution; many of them are scams and will just lose you your funds.
However, as this scraper returns contracts right when they come out, there is ample opportunity to be very early to good projects too. DYOR!

Future versions might use frameworks like Puppeteer to bypass the captcha so the driver can run headlessly without that annoying Chrome window.
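
The socials check itself boils down to a keyword search over the contract's code page on BscScan. Below is a minimal sketch of that idea; the function name and the trimmed-down headers are illustrative, and the full logic, including the TokenSniffer evaluation, lives in BSC_Contract_Scraper.py:

```python
import requests

# Keywords that suggest a project links to socials or a website
KEYWORDS = ["t.me/", "twitter.com", "medium.com", ".finance"]
HEADERS = {"User-Agent": "Mozilla/5.0"}


def has_socials(address: str) -> bool:
    # Fetch the verified-contract page for this address and flag it
    # if any social/website keyword appears anywhere in the page source
    page = requests.get("https://bscscan.com/address/" + address + "#code", headers=HEADERS)
    return any(keyword in page.text for keyword in KEYWORDS)
```

Addresses that pass this check are then opened on TokenSniffer in the Selenium-driven browser to pull their evaluation score.
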
To run:
1. Install Google Chrome.
2. Install the requests, selenium and undetected-chromedriver packages into a Python 3 environment of your choice.
3. Install the Chrome webdriver from: https://chromedriver.chromium.org/home
4. Pass the directory location of your chromedriver.exe as a string argument into the Service object on line 29:

![ce4e75df3c012907aee2c9559c250f30](https://user-images.githubusercontent.com/62744506/149964319-ce99bc2f-46ff-4cb7-8f89-c5791a8cb489.png)

5. Run BSC_Contract_Scraper.py

New contracts will be printed to the console; you can then open the contract code to find the links to any social media or website.

--------------------------------------------------------------------------------
/BSC_Contract_Scraper.py:
--------------------------------------------------------------------------------
#! python3
#
# Pulls the BSC Scan feed of newly-verified contracts every 30 seconds, then checks their contract code for links to socials.
# Returns only those that include socials information, and then submits each contract address to TokenSniffer to evaluate contract legitimacy.
# It's common practice to include links to social media such as Telegram somewhere in the contract for added transparency. The idea of the scraper is to return new
# contracts just as they come out that have a website, social media, etc., which demonstrates some level of ambition - just a website is often enough for decently high market caps.
# Generally the contracts returned will never have high evaluation scores, as they have only just come out and TokenSniffer's evaluation criteria are based on
# factors like buying fees, contract ownership renouncements, liquidity locks, etc., which in most cases are set some time after contract verification.

# Upon startup you'll have to solve the captcha that appears in the ChromeDriver window to bypass TokenSniffer's bot protection.
# Contracts returned with scores around 40-50 are generally the ones to keep an eye on, as their overall evaluation is bound to rise if the team goes on to do the things mentioned above
# after launching. In general, I advise trading the tokens returned by the scraper with caution; many of them are scams and will just lose you your funds.
# However, as this scraper returns contracts right when they come out, there is ample opportunity to be very early to good projects too. DYOR!
# author Greek, GG

import requests
import time
import undetected_chromedriver as uc
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service


URL = "https://bscscan.com/"
contractsFeedURL = URL + "contractsVerified"

options = uc.ChromeOptions()
options.add_argument('--headless')

browser = uc.Chrome(use_subprocess=True, options=options)

payload = {}
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/83.0.4103.116 Safari/537.36'}

addresses = []      # To store all the addresses pulled from the contracts feed
leadAddresses = []  # To store only those addresses whose contracts include socials information
keywords = ["t.me/", "twitter.com", "medium.com", ".finance"]  # Keywords being scraped for in smart contract headers; can include any keyword you want to query for


def FeedScan():  # Update the BSC Scan verified contracts feed and scrape the addresses of new additions

    res = requests.get(contractsFeedURL, headers=headers, data=payload)  # Load the BSC Scan newly verified contracts page
    contractsFeed = res.text.split("<tbody>")[1]  # Break out the table of contracts from the page
    for i in range(1, 26):  # Iterate over each entry in the table, i.e. the newest 25 records
        addrSplit = contractsFeed.split("href=\'/address/")[i].split("#code\'")[0]  # Break out the address of each individual token
        if addrSplit not in addresses:  # Check whether the address has already been parsed
            addresses.append(addrSplit)


def ContractCheck(address):  # Check the contract code of a given token address and see if it contains links to socials

    contractURL = URL + "address/" + address + "#code"

    tokenSnifferUrl = "https://tokensniffer.com/token/bsc/" + address
    bscScanRequest = requests.get(contractURL, headers=headers, data=payload)

    try:
        contractText = bscScanRequest.text.split("id=\'editor\' style=\'margin-top: 5px;\'>")[1].split("