├── LICENSE
├── Makefile
├── README.md
├── publishfeed
│   ├── .gitignore
│   ├── __init__.py
│   ├── config.py
│   ├── databases
│   │   └── .gitkeep
│   ├── feeds.yml.skel
│   ├── helpers.py
│   ├── main.py
│   ├── models.py
│   ├── test_data
│   │   └── feedparser_data.py
│   ├── tests.py
│   └── twitter.py
└── requirements.txt

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2017 Marcelo Canina

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
#virtualenv:
#	mkvirtualenv -p /usr/bin/python3.6 twitter_feeder
install:
	pip install -r requirements.txt
freeze:
	pip freeze > requirements.txt
tests:
	cd publishfeed; python -m unittest discover
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
Publish Feed
============

A publisher of articles from websites' RSS feeds to Twitter.


**Table of Contents**

- [Publish Feed](#publish-feed)
    - [Overview](#overview)


# Overview

This app currently performs two tasks:

- download RSS content from several sources listed in `feeds.yml`
- publish their titles and links to Twitter

This is an easy way to get your [blog posts automatically tweeted](https://simpleit.rocks/automatically-tweet-new-blog-posts-based-in-rss/).

# Installation

## Install dependencies

Install Pip dependencies: `make install`

## Get Twitter credentials

Create a Twitter app and generate keys
with **read and write permissions**.
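
Before wiring the keys into `feeds.yml`, they can be sanity-checked with a
short tweepy snippet (a sketch, not part of this repo; replace the
placeholder values with the generated keys):

~~~ python
import tweepy

auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_KEY', 'ACCESS_SECRET')
api = tweepy.API(auth)

user = api.verify_credentials()  # returns False when the credentials are bad
print(user.screen_name if user else 'invalid credentials')
~~~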

## Set up credentials and feeds

Customize `feeds.yml.skel` and save it as `feeds.yml`:

    cd publishfeed
    cp feeds.yml.skel feeds.yml

For example, defining feeds for two different Twitter accounts,
`simpleitrocks` and `reddit`, would
look like:

    simpleitrocks: #twitter handle
      twitter:
        consumer_key: 'XXXXXX'
        consumer_secret: 'XXXXXXX'
        access_key: 'XXXXXX'
        access_secret: 'XXXXXX'
      urls:
        - https://simpleit.rocks/feed
      hashtags: '#TechTutorials'
    reddit:
      twitter:
        consumer_key: 'XXXXXX'
        consumer_secret: 'XXXXXXX'
        access_key: 'XXXXXX'
        access_secret: 'XXXXXX'
      urls:
        - http://www.reddit.com/.rss
        - http://www.reddit.com/r/news/.rss
        - http://www.reddit.com/user/marcanuy/.rss
      hashtags: '#RedditFrontPage'

# Running

There are two commands available:

~~~ bash
$ python main.py -h

usage: main.py [-h] [-g | -t] feed

Process and publish RSS Feeds articles

positional arguments:
  feed            Index from feeds.yml

optional arguments:
  -h, --help      show this help message and exit
  -g, --getfeeds  Download and save new articles from the list of feeds
  -t, --tweet     Tweet unpublished articles from this list of feeds
~~~

- `python main.py <feed> --getfeeds`

  Download all the pages from the URLs defined in `urls` and save the
  new ones. E.g.: `python main.py reddit --getfeeds`

- `python main.py <feed> --tweet`

  Tweet the oldest unpublished page (previously downloaded with
  `--getfeeds`). E.g.: `python main.py reddit --tweet`

## Cronjobs

Set up two cronjobs to publish new feed content automatically:

~~~ bash
crontab -e
~~~

~~~ cronjob
# hourly download new pages
0 * * * * workon publishfeed; cd publishfeed; python main.py reddit --getfeeds

# tweet every 15 minutes if there are unpublished pages
*/15 * * * * workon publishfeed; cd publishfeed; python main.py reddit --tweet
~~~

**Make sure you configure the tweet cronjob with at least 2 minutes
between runs so your credentials won't be suspended.**

# Design

`feeds.yml` populates the **FeedSet** model; then, for each URL, new
content is created as **RSSContent** instances (using SQLAlchemy) and saved in
per-account `databases/rss_<account>.db` *SQLite* databases.

To tweet a new post, we get the oldest unpublished page from
**RSSContent**, publish it and change its status.
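
The tweeting step boils down to one query plus a status update. A minimal
sketch of the idea (the actual code lives in `publishfeed/helpers.py`;
`session`, `twitter` and the account's `hashtags` are assumed to be already
set up):

~~~ python
# pick the oldest unpublished entry, tweet it, mark it as published
rsscontent = (session.query(RSSContent)
                     .filter_by(published=False)
                     .order_by(RSSContent.dateAdded.asc())
                     .first())
if rsscontent:
    twitter.update_status("{} {} {}".format(
        rsscontent.title, rsscontent.url, hashtags))
    rsscontent.published = True
~~~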

# Questions

Do you think there is something missing or that can be improved? Feel
free to open issues and/or contribute!

# License

MIT Licensed.
--------------------------------------------------------------------------------
/publishfeed/.gitignore:
--------------------------------------------------------------------------------
__pycache__
*.pyc
databases/
feeds.yml
test.db
--------------------------------------------------------------------------------
/publishfeed/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/marcanuy/publishfeed/b2f333c319d6dab9ab8b3095f89a4382c9edd857/publishfeed/__init__.py
--------------------------------------------------------------------------------
/publishfeed/config.py:
--------------------------------------------------------------------------------

# Tweet budget: tweets are capped at 140 characters and Twitter's t.co
# wrapper makes every URL count as 22 characters, regardless of its length.
TWEET_MAX_LENGTH = 140
TWEET_URL_LENGTH = 22
TWEET_IMG_LENGTH = 0 #23

DB_TEST_URL = 'sqlite://' # in memory
#DB_TEST_URL = 'sqlite:///test.db' # file
--------------------------------------------------------------------------------
/publishfeed/databases/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/marcanuy/publishfeed/b2f333c319d6dab9ab8b3095f89a4382c9edd857/publishfeed/databases/.gitkeep
--------------------------------------------------------------------------------
/publishfeed/feeds.yml.skel:
--------------------------------------------------------------------------------
simpleitrocks: #twitter handle
  twitter:
    consumer_key: ''
    consumer_secret: ''
    access_key: ''
    access_secret: ''
  urls:
    - #https://simpleit.rocks/feed
  hashtags: ''
--------------------------------------------------------------------------------
/publishfeed/helpers.py:
--------------------------------------------------------------------------------
from models import RSSContent, FeedSet
from twitter import Twitter
import yaml
import config
import feedparser
from datetime import datetime

class Helper:
    def __init__(self, session, data):
        self.session = session
        if isinstance(data, dict):
            self.data = data
        else:
            # data is an account name: load that section of feeds.yml
            with open('feeds.yml', 'r') as f:
                self.data = yaml.safe_load(f)[data]

class FeedSetHelper(Helper):

    def get_pages_from_feeds(self):
        feed = FeedSet(self.data)
        for url in feed.urls:
            parsed_feed = feedparser.parse(url)
            for entry in parsed_feed.entries:
                # if the feed page is not stored yet, add it as RSSContent
                q = self.session.query(RSSContent).filter_by(url=entry.link)
                exists = self.session.query(q.exists()).scalar() # returns True or False
                if not exists:
                    item_title = entry.title
                    item_url = entry.link #.encode('utf-8')
                    # RFC 2822 standard: Wed, 7 Jun 2017 16:25:41 +0000
                    item_date = datetime.strptime(entry.published, "%a, %d %b %Y %H:%M:%S +0000")
                    item = RSSContent(url=item_url, title=item_title, dateAdded=item_date)
                    self.session.add(item)

class RSSContentHelper(Helper):

    def get_oldest_unpublished_rsscontent(self, session):
        rsscontent = session.query(RSSContent).filter_by(published=False).order_by(RSSContent.dateAdded.asc()).first()
        return rsscontent

    def _calculate_tweet_length(self):
        tweet_net_length = config.TWEET_MAX_LENGTH - config.TWEET_URL_LENGTH - config.TWEET_IMG_LENGTH
        hashtag_length = len(self.data['hashtags'])
        body_length = tweet_net_length - hashtag_length
        return body_length
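
    # Example: with hashtags '#TechTutorials' (14 chars) the room left for
    # the title is 140 - 22 (t.co URL) - 0 (image) - 14 = 104 characters.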

    def tweet_rsscontent(self, rsscontent):
        credentials = self.data['twitter']
        twitter = Twitter(**credentials)

        body_length = self._calculate_tweet_length()
        tweet_body = rsscontent.title[:body_length]
        tweet_url = rsscontent.url
        tweet_hashtag = self.data['hashtags']
        tweet_text = "{} {} {}".format(tweet_body, tweet_url, tweet_hashtag)
        twitter.update_status(tweet_text)
        rsscontent.published = True
        self.session.flush()
--------------------------------------------------------------------------------
/publishfeed/main.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import argparse

from helpers import RSSContentHelper, FeedSetHelper
from models import create_tables
from contextlib import contextmanager
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

@contextmanager
def session_scope(account):
    """Provide a transactional scope around a series of operations."""
    session = db_session(account)
    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise
    finally:
        session.close()

def db_session(account):
    """
    Performs the database connection using the database of the
    account selected in feeds.yml.
    Returns a sqlalchemy session instance.
    """
    db_path = 'databases/rss_{}.db'.format(account)
    engine = create_engine("sqlite:///{}".format(db_path))
    Session = sessionmaker(bind=engine)
    session = Session()
    return session

def getfeeds(account):
    """
    Download and save articles from feeds.
    """
    create_tables(account)
    with session_scope(account) as session:
        helper = FeedSetHelper(session, account)
        helper.get_pages_from_feeds()

def tweet(account):
    """
    Tweet the oldest unpublished article, if any.
    """
    with session_scope(account) as session:
        helper = RSSContentHelper(session, account)
        rsscontent = helper.get_oldest_unpublished_rsscontent(session)
        if rsscontent:
            helper.tweet_rsscontent(rsscontent)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Process and publish RSS Feeds articles')
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-g", "--getfeeds", action='store_true', help="Download and save new articles from the list of feeds")
    group.add_argument("-t", "--tweet", action='store_true', help="Tweet unpublished articles from this list of feeds")
    parser.add_argument("feed", help="Index from feeds.yml")
    args = parser.parse_args()

    account = args.feed
    if args.getfeeds:
        getfeeds(account)
    elif args.tweet:
        tweet(account)
--------------------------------------------------------------------------------
/publishfeed/models.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from datetime import datetime

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine, Column, Integer, String, DateTime, Boolean


Base = declarative_base()

def db_connect(account):
    """
    Performs the database connection using the database of the
    given account.
    Returns a sqlalchemy engine instance.
    """
    db_path = 'databases/rss_{}.db'.format(account)
    return create_engine("sqlite:///{}".format(db_path))

def create_tables(account):
    """Create the database tables for the given account."""
    engine = db_connect(account)
    Base.metadata.create_all(engine)
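
# Example usage (sketch; 'reddit' is one of the accounts from the README):
#
#   create_tables('reddit')        # creates tables in databases/rss_reddit.db
#   engine = db_connect('reddit')  # engine bound to that same SQLite file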

class FeedSet:
    """
    feeds.yml item wrapper
    """
    def __init__(self, data):
        if isinstance(data, dict):
            self.data = data

    @property
    def twitter_keys(self):
        return self.data['twitter']

    @property
    def urls(self):
        return self.data['urls']

    @property
    def hashtags(self):
        return self.data['hashtags']

class RSSContent(Base):
    __tablename__ = 'rsscontent'

    id = Column(Integer, primary_key=True)
    url = Column(String)
    title = Column(String)
    # pass the callable (not datetime.now()) so the default is evaluated
    # per row, not once at import time
    dateAdded = Column(DateTime, default=datetime.now)
    published = Column(Boolean, unique=False, default=False)

    def __repr__(self):
        return "<RSSContent dateAdded={} title={} url={} published={}>".format(self.dateAdded, self.title, self.url, self.published)

    def __init__(self, url, title, dateAdded=None, published=False):
        self.url = url
        self.title = title
        self.dateAdded = dateAdded
        self.published = published
--------------------------------------------------------------------------------
/publishfeed/test_data/feedparser_data.py:
--------------------------------------------------------------------------------
from munch import munchify

# a real response example without time structs (the published_parsed and
# updated_parsed fields were removed)

feedparser_parse_response = {'feed': {'title': 'Hacker News', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Hacker News'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://news.ycombinator.com/'}], 'link': 'https://news.ycombinator.com/', 'subtitle': 'Links for the intellectually curious, ranked by readers.', 'subtitle_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Links for the intellectually curious, ranked by readers.'}}, 'entries': [{'title': 'In 1957, Five Men Agreed to Stand Under an Exploding Nuclear Bomb (2012)', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'In 1957, Five Men Agreed to Stand Under an Exploding Nuclear Bomb (2012)'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.npr.org/sections/krulwich/2012/07/16/156851175/five-men-agree-to-stand-directly-under-an-exploding-nuclear-bomb'}], 'link': 'http://www.npr.org/sections/krulwich/2012/07/16/156851175/five-men-agree-to-stand-directly-under-an-exploding-nuclear-bomb', 'published': 'Wed, 7 Jun 2017 16:25:41 +0000', 'comments': 'https://news.ycombinator.com/item?id=14507673', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Americans from Both Political Parties Overwhelmingly Support Net Neutrality', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Americans from Both Political Parties Overwhelmingly Support Net Neutrality'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.mozilla.org/blog/2017/06/06/new-mozilla-poll-americans-political-parties-overwhelmingly-support-net-neutrality/'}], 'link': 
'https://blog.mozilla.org/blog/2017/06/06/new-mozilla-poll-americans-political-parties-overwhelmingly-support-net-neutrality/', 'published': 'Wed, 7 Jun 2017 18:54:13 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508921', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Options vs. Cash', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Options vs. Cash'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://danluu.com/startup-options/'}], 'link': 'https://danluu.com/startup-options/', 'published': 'Wed, 7 Jun 2017 11:39:04 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505378', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Performance Improvements in .NET Core', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Performance Improvements in .NET Core'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blogs.msdn.microsoft.com/dotnet/2017/06/07/performance-improvements-in-net-core/'}], 'link': 'https://blogs.msdn.microsoft.com/dotnet/2017/06/07/performance-improvements-in-net-core/', 'published': 'Wed, 7 Jun 2017 16:56:06 +0000', 'comments': 'https://news.ycombinator.com/item?id=14507936', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Oldest Fossils of Homo Sapiens Found in Morocco', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Oldest Fossils of Homo Sapiens Found in Morocco'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.nytimes.com/2017/06/07/science/human-fossils-morocco.html'}], 'link': 'https://www.nytimes.com/2017/06/07/science/human-fossils-morocco.html', 'published': 'Wed, 7 Jun 2017 17:08:05 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508029', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Atlas of Lie Groups and Representations', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Atlas of Lie Groups and Representations'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.liegroups.org/software/documentation/atlasofliegroups-docs/index.html'}], 'link': 'http://www.liegroups.org/software/documentation/atlasofliegroups-docs/index.html', 'published': 'Wed, 7 Jun 2017 10:34:20 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505047', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'A Brief History of the UUID', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'A Brief History of the UUID'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://segment.com/blog/a-brief-history-of-the-uuid/'}], 'link': 'https://segment.com/blog/a-brief-history-of-the-uuid/', 'published': 'Wed, 7 Jun 2017 17:51:36 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508413', 'summary': 
'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Apple Announces Full WebRTC Support in Safari 11', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Apple Announces Full WebRTC Support in Safari 11'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.peer5.com/apple-announces-support-for-webrtc-in-safari-11/'}], 'link': 'https://blog.peer5.com/apple-announces-support-for-webrtc-in-safari-11/', 'published': 'Wed, 7 Jun 2017 19:17:45 +0000', 'comments': 'https://news.ycombinator.com/item?id=14509100', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Reducers, transducers and core.async in Clojure', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Reducers, transducers and core.async in Clojure'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://eli.thegreenplace.net/2017/reducers-transducers-and-coreasync-in-clojure/'}], 'link': 'http://eli.thegreenplace.net/2017/reducers-transducers-and-coreasync-in-clojure/', 'published': 'Wed, 7 Jun 2017 13:04:39 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506012', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Software Companies Tech Competency Matrix', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Software Companies Tech Competency Matrix'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://geshan.com.np/blog/2017/06/software-companies-tech-competency-matrix/'}], 'link': 'https://geshan.com.np/blog/2017/06/software-companies-tech-competency-matrix/', 'published': 'Wed, 7 Jun 2017 14:12:06 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506563', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'The future of MDN: a focus on web docs', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'The future of MDN: a focus on web docs'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.mozilla.org/opendesign/future-mdn-focus-web-docs/'}], 'link': 'https://blog.mozilla.org/opendesign/future-mdn-focus-web-docs/', 'published': 'Wed, 7 Jun 2017 08:38:27 +0000', 'comments': 'https://news.ycombinator.com/item?id=14504604', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': "Stanford's therapy chatbot for depression", 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': "Stanford's therapy chatbot for depression"}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.businessinsider.fr/us/stanford-therapy-chatbot-depression-anxiety-woebot-2017-6/'}], 'link': 'http://www.businessinsider.fr/us/stanford-therapy-chatbot-depression-anxiety-woebot-2017-6/', 'published': 'Wed, 7 Jun 2017 07:20:44 +0000', 'comments': 'https://news.ycombinator.com/item?id=14504306', 'summary': 'Comments', 'summary_detail': {'type': 
'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Conformity Excuses', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Conformity Excuses'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.overcomingbias.com/2017/06/conformity-excuses.html'}], 'link': 'http://www.overcomingbias.com/2017/06/conformity-excuses.html', 'published': 'Wed, 7 Jun 2017 09:36:47 +0000', 'comments': 'https://news.ycombinator.com/item?id=14504822', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Set up a malware analysis lab with VirtualBox, INetSim and Burp', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Set up a malware analysis lab with VirtualBox, INetSim and Burp'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.christophetd.fr/set-up-your-own-malware-analysis-lab-with-virtualbox-inetsim-and-burp/'}], 'link': 'https://blog.christophetd.fr/set-up-your-own-malware-analysis-lab-with-virtualbox-inetsim-and-burp/', 'published': 'Wed, 7 Jun 2017 11:42:36 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505406', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Show HN: ProximityHash – Geohashes in Proximity', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Show HN: ProximityHash – Geohashes in Proximity'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://github.com/ashwin711/proximityhash'}], 'link': 'https://github.com/ashwin711/proximityhash', 'published': 'Wed, 7 Jun 2017 18:13:06 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508594', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'The Boolean Satisfiability Problem and SAT Solvers', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'The Boolean Satisfiability Problem and SAT Solvers'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://0a.io/boolean-satisfiability-problem-or-sat-in-5-minutes/'}], 'link': 'http://0a.io/boolean-satisfiability-problem-or-sat-in-5-minutes/', 'published': 'Wed, 7 Jun 2017 18:07:30 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508546', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Coursera raises Series D', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Coursera raises Series D'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://techcrunch.com/2017/06/07/online-learning-startup-coursera-raises-64m-at-an-800m-valuation/'}], 'link': 'https://techcrunch.com/2017/06/07/online-learning-startup-coursera-raises-64m-at-an-800m-valuation/', 'published': 'Wed, 7 Jun 2017 13:47:23 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506383', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, 
{'title': 'Another “don’t cargo cult” article', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Another “don’t cargo cult” article'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb'}], 'link': 'https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb', 'published': 'Wed, 7 Jun 2017 17:34:19 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508264', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Ghost in the Shell FUI Design', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Ghost in the Shell FUI Design'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.hudsandguis.com/home/2017/4/17/ghostintheshell-fui'}], 'link': 'http://www.hudsandguis.com/home/2017/4/17/ghostintheshell-fui', 'published': 'Wed, 7 Jun 2017 06:41:53 +0000', 'comments': 'https://news.ycombinator.com/item?id=14504163', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'A day without JavaScript', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'A day without JavaScript'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://sonniesedge.co.uk/blog/a-day-without-javascript'}], 'link': 'https://sonniesedge.co.uk/blog/a-day-without-javascript', 'published': 'Wed, 7 Jun 2017 11:28:09 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505315', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Cache Organization in Intel CPUs (2009)', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Cache Organization in Intel CPUs (2009)'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://duartes.org/gustavo/blog/post/intel-cpu-caches/'}], 'link': 'http://duartes.org/gustavo/blog/post/intel-cpu-caches/', 'published': 'Wed, 7 Jun 2017 13:50:34 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506401', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': "Einstein's Philosophy of Science (2014)", 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': "Einstein's Philosophy of Science (2014)"}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://plato.stanford.edu/entries/einstein-philscience/'}], 'link': 'https://plato.stanford.edu/entries/einstein-philscience/', 'published': 'Tue, 6 Jun 2017 18:38:23 +0000', 'comments': 'https://news.ycombinator.com/item?id=14499850', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Getting started with the F# and .Net ecosystem', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Getting started with the F# and .Net ecosystem'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.prigrammer.com/?p=363'}], 'link': 
'http://www.prigrammer.com/?p=363', 'published': 'Wed, 7 Jun 2017 13:34:58 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506287', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'BuildZoom (YC W13 – build your dream home) is hiring a VP of Sales', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'BuildZoom (YC W13 – build your dream home) is hiring a VP of Sales'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://jobs.lever.co/buildzoom'}], 'link': 'https://jobs.lever.co/buildzoom', 'published': 'Wed, 7 Jun 2017 17:22:09 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508141', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Edison, Clarence Dally, and the Hidden Perils of X-Rays (1903)', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Edison, Clarence Dally, and the Hidden Perils of X-Rays (1903)'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://web.archive.org/web/20120218234715/http://home.gwi.net/~dnb/read/edison/edison_xrays.htm'}], 'link': 'http://web.archive.org/web/20120218234715/http://home.gwi.net/~dnb/read/edison/edison_xrays.htm', 'published': 'Wed, 7 Jun 2017 03:04:41 +0000', 'comments': 'https://news.ycombinator.com/item?id=14503416', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Pharo 6.0 Released', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Pharo 6.0 Released'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://pharo.org/news/pharo6.0-released'}], 'link': 'http://pharo.org/news/pharo6.0-released', 'published': 'Wed, 7 Jun 2017 07:00:25 +0000', 'comments': 'https://news.ycombinator.com/item?id=14504244', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'A curated list of design systems, pattern libraries, and more', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'A curated list of design systems, pattern libraries, and more'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://github.com/alexpate/awesome-design-systems'}], 'link': 'https://github.com/alexpate/awesome-design-systems', 'published': 'Wed, 7 Jun 2017 13:57:04 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506458', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Comcast Has Always Opposed Internet Freedom', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comcast Has Always Opposed Internet Freedom'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.eff.org/deeplinks/2017/06/dont-be-fooled-comcast-pr-machine-it-has-always-opposed-open-internet'}], 'link': 'https://www.eff.org/deeplinks/2017/06/dont-be-fooled-comcast-pr-machine-it-has-always-opposed-open-internet', 'published': 'Wed, 7 Jun 2017 14:46:30 +0000', 'comments': 
'https://news.ycombinator.com/item?id=14506853', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Dasung Paperlike Pro: E-Ink Monitor with HDMI connector [video]', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Dasung Paperlike Pro: E-Ink Monitor with HDMI connector [video]'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.youtube.com/watch?v=wj2Lvuc28k0'}], 'link': 'https://www.youtube.com/watch?v=wj2Lvuc28k0', 'published': 'Wed, 7 Jun 2017 12:33:22 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505762', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'List of Printers Which Do or Do Not Display Tracking Dots', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'List of Printers Which Do or Do Not Display Tracking Dots'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.eff.org/pages/list-printers-which-do-or-do-not-display-tracking-dots'}], 'link': 'https://www.eff.org/pages/list-printers-which-do-or-do-not-display-tracking-dots', 'published': 'Tue, 6 Jun 2017 22:12:51 +0000', 'comments': 'https://news.ycombinator.com/item?id=14501894', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}], 'bozo': 0, 'headers': {'Date': 'Wed, 07 Jun 2017 20:01:20 GMT', 'Content-Type': 'application/rss+xml', 'Transfer-Encoding': 'chunked', 'Connection': 'close', 'Set-Cookie': '__cfduid=d1c55ac10a1500623f632fe9f4feca9441496865679; expires=Thu, 07-Jun-18 20:01:19 GMT; path=/; domain=.ycombinator.com; HttpOnly', 'Cache-Control': 'private', 'X-Frame-Options': 'DENY', 'Strict-Transport-Security': 'max-age=31556900; includeSubDomains', 'Server': 'cloudflare-nginx', 'CF-RAY': '36b633e12e3c680f-EZE'}, 'href': 'https://news.ycombinator.com/rss', 'status': 200, 'encoding': 'utf-8', 'version': 'rss20', 'namespaces': {1}}
fake_response = munchify(feedparser_parse_response)
--------------------------------------------------------------------------------
/publishfeed/tests.py:
--------------------------------------------------------------------------------
import unittest
from models import FeedSet, Base, RSSContent
import config
import sqlalchemy
from sqlalchemy.orm import sessionmaker
from unittest.mock import patch
from test_data.feedparser_data import fake_response
from helpers import FeedSetHelper

class TestFeedSet(unittest.TestCase):
    def setUp(self):
        url = config.DB_TEST_URL
        if not url:
            self.skipTest("No database URL set")
        engine = sqlalchemy.create_engine(url)
        Base.metadata.drop_all(engine)
        Base.metadata.create_all(engine)
        Session = sessionmaker(bind=engine)
        self.session = Session()


    feedparser_fake_response = fake_response

    def feed_data_dict(self):
        data = {
            'urls': ['https://news.ycombinator.com/rss'],
            'hashtags': '#example',
            'twitter': {
                'consumer_key': 'XXXXXXXXXXX',
                'access_secret': 'XXXXXXXXXXXXXX',
                'consumer_secret': 'XXXXXXXXXXXXXX',
                'access_key': 'XXXXXXXXXXXX'
            },
            'name': 'SimpleItRocks'
        }
        return data
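
    # The tests below patch feedparser.parse to return the canned response
    # above (fake_response), so they run without any network access.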

    def test_get_twitter_credentials(self):
        data = self.feed_data_dict()
        feed = FeedSet(data)
        keys = feed.twitter_keys

        self.assertIsInstance(keys, dict)
        self.assertIn('consumer_key', keys)
        self.assertIn('access_key', keys)
        self.assertIn('consumer_secret', keys)
        self.assertIn('access_secret', keys)

    def test_urls(self):
        data = self.feed_data_dict()
        feed = FeedSet(data)
        urls = feed.urls

        self.assertIsInstance(urls, list)


    @patch('feedparser.parse', return_value=feedparser_fake_response)
    def test_save_new_pages(self, feedparser_parse_mock):

        self.assertEqual(len(self.session.query(RSSContent).all()), 0)
        helper = FeedSetHelper(self.session, self.feed_data_dict())
        helper.get_pages_from_feeds()
        self.assertNotEqual(len(self.session.query(RSSContent).all()), 0)

    @patch('feedparser.parse', return_value=feedparser_fake_response)
    def test_not_save_existing_pages(self, feedparser_parse_mock):
        # presave an item that is present in the retrieved feed, to check that
        # it is not saved again after downloading new feeds
        entry = fake_response.entries[0]
        items_count = len(fake_response.entries)
        rsscontent = RSSContent(title=entry.title, url=entry.link)
        self.session.add(rsscontent)
        self.assertEqual(len(self.session.query(RSSContent).all()), 1)
        helper = FeedSetHelper(self.session, self.feed_data_dict())

        helper.get_pages_from_feeds()

        self.assertEqual(len(self.session.query(RSSContent).all()), items_count, "Entries count has changed")

if __name__ == '__main__':
    unittest.main()
--------------------------------------------------------------------------------
/publishfeed/twitter.py:
--------------------------------------------------------------------------------
import tweepy

class Twitter:
    """Thin wrapper around the tweepy API, built from OAuth credentials."""

    def __init__(self, consumer_key, consumer_secret, access_key, access_secret):
        auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
        auth.set_access_token(access_key, access_secret)
        self.api = tweepy.API(auth)


    def update_status(self, text):
        return self.api.update_status(text)
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
beautifulsoup4==4.6.0
certifi==2017.4.17
chardet==3.0.3
feedparser==5.2.1
idna==2.5
munch==2.1.1
oauthlib==2.0.2
PyYAML==5.4
requests==2.20.0
requests-oauthlib==0.8.0
six==1.10.0
SQLAlchemy==1.3.0
tweepy==3.5.0
urllib3==1.26.5
--------------------------------------------------------------------------------