├── LICENSE
├── Makefile
├── README.md
├── publishfeed
│   ├── .gitignore
│   ├── __init__.py
│   ├── config.py
│   ├── databases
│   │   └── .gitkeep
│   ├── feeds.yml.skel
│   ├── helpers.py
│   ├── main.py
│   ├── models.py
│   ├── test_data
│   │   └── feedparser_data.py
│   ├── tests.py
│   └── twitter.py
└── requirements.txt
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2017 Marcelo Canina
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | #virtualenv:
2 | # mkvirtualenv -p /usr/bin/python3.6 twitter_feeder
3 | install:
4 | pip install -r requirements.txt
5 | freeze:
6 | pip freeze > requirements.txt
7 | tests:
8 | cd publishfeed; python -m unittest discover
9 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Publish Feed
2 | ============
3 |
4 | A publisher of articles from websites' RSS feeds to Twitter.
5 |
6 |
7 | **Table of Contents**
8 |
9 | - [Publish Feed](#publish-feed)
10 | - [Overview](#overview)
11 | - [Installation](#installation)
12 | - [Running](#running)
13 | - [Design](#design)
14 |
15 | # Overview
16 |
17 | This app currently performs two tasks:
18 |
19 | - download RSS content from several sources listed in `feeds.yml`
20 | - publish their titles and links to Twitter
21 |
22 | This is an easy way to get your [blog posts automatically tweeted](https://simpleit.rocks/automatically-tweet-new-blog-posts-based-in-rss/).
23 |
24 | # Installation
25 |
26 | ## Install dependencies
27 |
28 | Install the pip dependencies with `make install`.
29 |
30 | ## Get Twitter credentials
31 |
32 | Create a Twitter app at <https://apps.twitter.com> and generate keys
33 | with **read and write permissions**.
34 |
35 | ## Set up credentials and feeds
36 |
37 | Customize `feeds.yml.skel` and save it as `feeds.yml`:
38 |
39 | cd publishfeed
40 | cp feeds.yml.skel feeds.yml
41 |
42 | For example, defining feeds for two different Twitter accounts,
43 | `@simpleitrocks` and `@reddit`, would
44 | look like:
45 |
46 | simpleitrocks: # twitter handle
47 | twitter:
48 | consumer_key: 'XXXXXX'
49 | consumer_secret: 'XXXXXXX'
50 | access_key: 'XXXXXX'
51 | access_secret: 'XXXXXX'
52 | urls:
53 | - https://simpleit.rocks/feed
54 | hashtags: '#TechTutorials'
55 | reddit:
56 | twitter:
57 | consumer_key: 'XXXXXX'
58 | consumer_secret: 'XXXXXXX'
59 | access_key: 'XXXXXX'
60 | access_secret: 'XXXXXX'
61 | urls:
62 | - http://www.reddit.com/.rss
63 | - http://www.reddit.com/r/news/.rss
64 | - http://www.reddit.com/user/marcanuy/.rss
65 | hashtags: '#RedditFrontPage'
66 |
67 | # Running
68 |
69 | There are two commands available:
70 |
71 | ~~~ bash
72 | $ python main.py
73 |
74 | usage: main.py [-h] [-g | -t] feed
75 |
76 | Process and publish RSS Feeds articles
77 |
78 | positional arguments:
79 | feed Index from feeds.yml
80 |
81 | optional arguments:
82 | -h, --help show this help message and exit
83 | -g, --getfeeds Download and save new articles from the list of feeds
84 | -t, --tweet Tweet unpublished articles from this list of feeds
85 | ~~~
86 |
87 | - `python main.py --getfeeds`
88 |
89 | Download all the pages from the URLs defined in `urls` and save the
90 | new ones. E.g.: python main.py reddit --getfeeds
91 |
92 | - `python main.py --tweet`
93 |
94 | Tweet the oldest unpublished page (previously downloaded with
95 | `--getfeeds`). E.g.: python main.py reddit --tweet
96 |
97 | ## Cronjobs
98 |
99 | Set up two cronjobs to publish new feed content automatically:
100 |
101 | ~~~ bash
102 | crontab -e
103 | ~~~
104 |
105 | ~~~ cronjob
106 | # hourly download new pages
107 | 0 * * * * workon publishfeed; cd publishfeed; python main.py reddit --getfeeds
108 |
109 | # tweet every 15 minutes if there are unpublished pages
110 | */15 * * * * workon publishfeed; cd publishfeed; python main.py reddit --tweet
111 | ~~~
112 |
113 | **Make sure the tweet cronjob leaves at least 2 minutes between
114 | runs so your credentials won't be suspended.**
115 |
116 | # Design
117 |
118 | `feeds.yml` populates the **FeedSet** model; then, for each URL, new
119 | content is stored as **RSSContent** instances (using SQLAlchemy) in a
120 | per-account `databases/rss_<account>.db` *SQLite* database.
121 |
122 | To tweet a new post, we get the oldest unpublished page from
123 | **RSSContent**, publish it and change its status.
124 |
125 | # Questions
126 |
127 | Do you think something is missing or could be improved? Feel
128 | free to open issues and/or contribute!
129 |
130 | # License
131 |
132 | MIT Licensed.
133 |
--------------------------------------------------------------------------------
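
A note on the configuration flow described above: each top-level key in `feeds.yml` is one account, and its block is handed to the `FeedSet` wrapper in `publishfeed/models.py`. A minimal sketch, assuming the `reddit` key from the README example and that it runs from inside `publishfeed/`:

~~~ python
# Sketch: how one feeds.yml block becomes a FeedSet.
import yaml
from models import FeedSet

with open('feeds.yml') as f:
    data = yaml.safe_load(f)['reddit']  # one account's block

feed = FeedSet(data)
print(feed.urls)          # e.g. ['http://www.reddit.com/.rss', ...]
print(feed.hashtags)      # e.g. '#RedditFrontPage'
print(feed.twitter_keys)  # dict with consumer/access keys and secrets
~~~
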
/publishfeed/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__
2 | *.pyc
3 | databases/
4 | feeds.yml
5 | test.db
--------------------------------------------------------------------------------
/publishfeed/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/marcanuy/publishfeed/b2f333c319d6dab9ab8b3095f89a4382c9edd857/publishfeed/__init__.py
--------------------------------------------------------------------------------
/publishfeed/config.py:
--------------------------------------------------------------------------------
1 |
2 | TWEET_MAX_LENGTH = 140
3 | TWEET_URL_LENGTH = 22
4 | TWEET_IMG_LENGTH = 0  # 23 when a tweet attaches an image
5 |
6 | DB_TEST_URL = 'sqlite://' # in memory
7 | #DB_TEST_URL = 'sqlite:///test.db' # file
8 |
--------------------------------------------------------------------------------
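
For reference, this is how the constants above combine in `RSSContentHelper._calculate_tweet_length` (see `helpers.py`); the hashtag string is an assumption taken from the README example:

~~~ python
# Worked example of the tweet-length budget built from the constants above.
TWEET_MAX_LENGTH = 140  # tweet limit at the time this was written
TWEET_URL_LENGTH = 22   # t.co wraps every URL to a fixed 22 characters
TWEET_IMG_LENGTH = 0    # would be 23 if tweets attached an image

hashtags = '#RedditFrontPage'
body_budget = TWEET_MAX_LENGTH - TWEET_URL_LENGTH - TWEET_IMG_LENGTH - len(hashtags)
print(body_budget)  # 140 - 22 - 0 - 16 = 102 characters left for the title
~~~
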
/publishfeed/databases/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/marcanuy/publishfeed/b2f333c319d6dab9ab8b3095f89a4382c9edd857/publishfeed/databases/.gitkeep
--------------------------------------------------------------------------------
/publishfeed/feeds.yml.skel:
--------------------------------------------------------------------------------
1 | simpleitrocks: # twitter handle
2 | twitter:
3 | consumer_key: ''
4 | consumer_secret: ''
5 | access_key: ''
6 | access_secret: ''
7 | urls:
8 | - #https://simpleit.rocks/feed
9 | hashtags: ''
10 |
--------------------------------------------------------------------------------
/publishfeed/helpers.py:
--------------------------------------------------------------------------------
1 | from models import RSSContent, FeedSet
2 | from twitter import Twitter
3 | import yaml
4 | import config
5 | import feedparser
6 | from datetime import datetime
7 |
8 | class Helper:
9 | def __init__(self, session, data):
10 | self.session = session
11 | if isinstance(data, dict):
12 | self.data = data
13 | else:
14 | with open('feeds.yml', 'r') as f:
15 | self.data = yaml.safe_load(f)[data]
16 |
17 | class FeedSetHelper(Helper):
18 |
19 | def get_pages_from_feeds(self):
20 | feed = FeedSet(self.data)
21 | for url in feed.urls:
22 | parsed_feed = feedparser.parse(url)
23 | for entry in parsed_feed.entries:
24 | # if the feed page does not exist yet, add it as RSSContent
25 | q = self.session.query(RSSContent).filter_by(url = entry.link)
26 | exists = self.session.query(q.exists()).scalar() # returns True or False
27 | if not exists:
28 | item_title = entry.title
29 | item_url = entry.link #.encode('utf-8')
30 | #RFC 2822 standard: Wed, 7 Jun 2017 16:25:41 +0000
31 | item_date = datetime.strptime(entry.published, "%a, %d %b %Y %H:%M:%S +0000")
32 | item = RSSContent(url=item_url, title=item_title, dateAdded = item_date)
33 | self.session.add(item)
34 |
35 | class RSSContentHelper(Helper):
36 |
37 | def get_oldest_unpublished_rsscontent(self, session):
38 | rsscontent = session.query(RSSContent).filter_by(published = 0).order_by(RSSContent.dateAdded.asc()).first()
39 | return rsscontent
40 |
41 | def _calculate_tweet_length(self):
42 | tweet_net_length = config.TWEET_MAX_LENGTH - config.TWEET_URL_LENGTH - config.TWEET_IMG_LENGTH
43 | hashtag_length = len(self.data['hashtags'])
44 | body_length = tweet_net_length - hashtag_length
45 | return body_length
46 |
47 | def tweet_rsscontent(self, rsscontent):
48 | credentials = self.data['twitter']
49 | twitter = Twitter(**credentials)
50 |
51 | body_length = self._calculate_tweet_length()
52 | tweet_body = rsscontent.title[:body_length]
53 | tweet_url = rsscontent.url
54 | tweet_hashtag = self.data['hashtags']
55 | tweet_text = "{} {} {}".format(tweet_body, tweet_url, tweet_hashtag)
56 | twitter.update_status(tweet_text)
57 | rsscontent.published = True
58 | self.session.flush()
59 |
--------------------------------------------------------------------------------
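
The two helpers are normally driven by `main.py`, but they can be exercised directly. A sketch, assuming a `feeds.yml` with a `reddit` key and that it runs from inside `publishfeed/` (note that `tweet_rsscontent` really posts to Twitter):

~~~ python
# Sketch: driving FeedSetHelper and RSSContentHelper without main.py.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from helpers import FeedSetHelper, RSSContentHelper
from models import create_tables

create_tables('reddit')  # ensure databases/rss_reddit.db has the tables
engine = create_engine('sqlite:///databases/rss_reddit.db')
session = sessionmaker(bind=engine)()

fetcher = FeedSetHelper(session, 'reddit')  # loads the 'reddit' block of feeds.yml
fetcher.get_pages_from_feeds()              # parse each URL, stage new entries
session.commit()

tweeter = RSSContentHelper(session, 'reddit')
item = tweeter.get_oldest_unpublished_rsscontent(session)
if item:
    tweeter.tweet_rsscontent(item)  # posts to Twitter and flips the published flag
session.commit()
~~~
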
/publishfeed/main.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 |
4 | import config
5 | import argparse
6 |
7 | from helpers import RSSContentHelper, FeedSetHelper
8 | from twitter import Twitter
9 | from models import FeedSet,create_tables
10 | from contextlib import contextmanager
11 | from sqlalchemy import create_engine
12 | from sqlalchemy.orm import sessionmaker
13 |
14 | @contextmanager
15 | def session_scope(account):
16 | """Provide a transactional scope around a series of operations."""
17 | session = db_session(account)
18 | try:
19 | yield session
20 | session.commit()
21 | except:
22 | session.rollback()
23 | raise
24 | finally:
25 | session.close()
26 |
27 | def db_session(account):
28 | """
29 | Performs database connection using database settings from the
30 | account selected in feeds.yml.
31 | Returns sqlalchemy session instance
32 | """
33 | db_path = 'databases/rss_{}.db'.format(account)
34 | engine = create_engine("sqlite:///{}".format(db_path))
35 | Session = sessionmaker(bind=engine)
36 | session = Session()
37 | return session
38 |
39 | def getfeeds(account):
40 | """
41 | Download and save articles from feeds
42 | """
43 | create_tables(account)
44 | with session_scope(account) as session:
45 | helper = FeedSetHelper(session, account)
46 | helper.get_pages_from_feeds()
47 |
48 | def tweet(account):
49 | with session_scope(account) as session:
50 | helper = RSSContentHelper(session, account)
51 | rsscontent = helper.get_oldest_unpublished_rsscontent(session)
52 | if rsscontent:
53 | helper.tweet_rsscontent(rsscontent)
54 |
55 | if __name__ == '__main__':
56 | parser = argparse.ArgumentParser(description='Process and publish RSS Feeds articles')
57 | group = parser.add_mutually_exclusive_group()
58 | group.add_argument("-g", "--getfeeds", action='store_true', help="Download and save new articles from the list of feeds")
59 | group.add_argument("-t", "--tweet", action='store_true', help="Tweet unpublished articles from this list of feeds")
60 | parser.add_argument("feed", help="Index from feeds.yml")
61 | args = parser.parse_args()
62 |
63 | account = args.feed
64 | if args.getfeeds:
65 | getfeeds(account)
66 | elif args.tweet:
67 | tweet(account)
68 |
69 |
--------------------------------------------------------------------------------
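
`session_scope` is reusable outside the two built-in commands: it commits on success, rolls back on any exception, and always closes the session. A small sketch, assuming the `reddit` database already exists:

~~~ python
# Sketch: an ad-hoc query through main.session_scope.
from main import session_scope
from models import RSSContent

with session_scope('reddit') as session:
    pending = session.query(RSSContent).filter_by(published=False).count()
    print('{} articles waiting to be tweeted'.format(pending))
~~~
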
/publishfeed/models.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | import yaml
4 |
5 | from time import mktime, strftime
6 | from sqlalchemy.ext.declarative import declarative_base
7 | from sqlalchemy import create_engine, Column, Integer, String, DateTime, Boolean
8 |
9 |
10 | Base = declarative_base()
11 |
12 | from datetime import datetime
13 |
14 | def db_connect(account):
15 | """
16 | Performs database connection using the per-account database file.
17 | Returns sqlalchemy engine instance
18 | """
19 | db_path = 'databases/rss_{}.db'.format(account)
20 | return create_engine("sqlite:///{}".format(db_path))
21 |
22 | def create_tables(account):
23 | """"""
24 | engine = db_connect(account)
25 | Base.metadata.create_all(engine)
26 |
27 | class FeedSet:
28 | """
29 | feeds.yml item wrapper
30 | """
31 | def __init__(self, data):
32 | if isinstance(data, dict):
33 | self.data = data
34 |
35 | @property
36 | def twitter_keys(self):
37 | return self.data['twitter']
38 |
39 | @property
40 | def urls(self):
41 | return self.data['urls']
42 |
43 | @property
44 | def hashtags(self):
45 | return self.data['hashtags']
46 |
47 | class RSSContent(Base):
48 | __tablename__ = 'rsscontent'
49 |
50 | id = Column(Integer, primary_key=True)
51 | url = Column(String)
52 | title = Column(String)
53 | dateAdded = Column(DateTime, default=datetime.now)  # callable: evaluated per insert, not at import
54 | published = Column(Boolean, unique=False, default=False)
55 |
56 | def __repr__(self):
57 | return "<RSSContent(dateAdded={}, title={}, url={}, published={})>".format(self.dateAdded, self.title, self.url, self.published)
58 |
59 | def __init__(self, url, title, dateAdded=None, published=False):
60 | self.url = url
61 | self.title = title
62 | self.dateAdded = dateAdded
63 | self.published = published
64 |
--------------------------------------------------------------------------------
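
A minimal round trip through the model, assuming a hypothetical `example` account (it creates `databases/rss_example.db` on the spot):

~~~ python
# Sketch: create the per-account database and store one RSSContent row.
from datetime import datetime
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from models import RSSContent, create_tables

create_tables('example')  # creates databases/rss_example.db if missing
engine = create_engine('sqlite:///databases/rss_example.db')
session = sessionmaker(bind=engine)()

session.add(RSSContent(url='https://example.com/post',
                       title='Hello world',
                       dateAdded=datetime.now()))
session.commit()
print(session.query(RSSContent).filter_by(published=False).count())  # 1
~~~
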
/publishfeed/test_data/feedparser_data.py:
--------------------------------------------------------------------------------
1 | from munch import munchify
2 | # a real response example without time structs (from published_parsed and update-parsed)
3 |
4 | feedparser_parse_response = {'feed': {'title': 'Hacker News', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Hacker News'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://news.ycombinator.com/'}], 'link': 'https://news.ycombinator.com/', 'subtitle': 'Links for the intellectually curious, ranked by readers.', 'subtitle_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Links for the intellectually curious, ranked by readers.'}}, 'entries': [{'title': 'In 1957, Five Men Agreed to Stand Under an Exploding Nuclear Bomb (2012)', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'In 1957, Five Men Agreed to Stand Under an Exploding Nuclear Bomb (2012)'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.npr.org/sections/krulwich/2012/07/16/156851175/five-men-agree-to-stand-directly-under-an-exploding-nuclear-bomb'}], 'link': 'http://www.npr.org/sections/krulwich/2012/07/16/156851175/five-men-agree-to-stand-directly-under-an-exploding-nuclear-bomb', 'published': 'Wed, 7 Jun 2017 16:25:41 +0000', 'comments': 'https://news.ycombinator.com/item?id=14507673', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Americans from Both Political Parties Overwhelmingly Support Net Neutrality', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Americans from Both Political Parties Overwhelmingly Support Net Neutrality'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.mozilla.org/blog/2017/06/06/new-mozilla-poll-americans-political-parties-overwhelmingly-support-net-neutrality/'}], 'link': 'https://blog.mozilla.org/blog/2017/06/06/new-mozilla-poll-americans-political-parties-overwhelmingly-support-net-neutrality/', 'published': 'Wed, 7 Jun 2017 18:54:13 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508921', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Options vs. Cash', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Options vs. 
Cash'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://danluu.com/startup-options/'}], 'link': 'https://danluu.com/startup-options/', 'published': 'Wed, 7 Jun 2017 11:39:04 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505378', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Performance Improvements in .NET Core', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Performance Improvements in .NET Core'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blogs.msdn.microsoft.com/dotnet/2017/06/07/performance-improvements-in-net-core/'}], 'link': 'https://blogs.msdn.microsoft.com/dotnet/2017/06/07/performance-improvements-in-net-core/', 'published': 'Wed, 7 Jun 2017 16:56:06 +0000', 'comments': 'https://news.ycombinator.com/item?id=14507936', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Oldest Fossils of Homo Sapiens Found in Morocco', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Oldest Fossils of Homo Sapiens Found in Morocco'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.nytimes.com/2017/06/07/science/human-fossils-morocco.html'}], 'link': 'https://www.nytimes.com/2017/06/07/science/human-fossils-morocco.html', 'published': 'Wed, 7 Jun 2017 17:08:05 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508029', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Atlas of Lie Groups and Representations', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Atlas of Lie Groups and Representations'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.liegroups.org/software/documentation/atlasofliegroups-docs/index.html'}], 'link': 'http://www.liegroups.org/software/documentation/atlasofliegroups-docs/index.html', 'published': 'Wed, 7 Jun 2017 10:34:20 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505047', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'A Brief History of the UUID', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'A Brief History of the UUID'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://segment.com/blog/a-brief-history-of-the-uuid/'}], 'link': 'https://segment.com/blog/a-brief-history-of-the-uuid/', 'published': 'Wed, 7 Jun 2017 17:51:36 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508413', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Apple Announces Full WebRTC Support in Safari 11', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Apple Announces Full WebRTC Support in Safari 11'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.peer5.com/apple-announces-support-for-webrtc-in-safari-11/'}], 'link': 
'https://blog.peer5.com/apple-announces-support-for-webrtc-in-safari-11/', 'published': 'Wed, 7 Jun 2017 19:17:45 +0000', 'comments': 'https://news.ycombinator.com/item?id=14509100', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Reducers, transducers and core.async in Clojure', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Reducers, transducers and core.async in Clojure'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://eli.thegreenplace.net/2017/reducers-transducers-and-coreasync-in-clojure/'}], 'link': 'http://eli.thegreenplace.net/2017/reducers-transducers-and-coreasync-in-clojure/', 'published': 'Wed, 7 Jun 2017 13:04:39 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506012', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Software Companies Tech Competency Matrix', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Software Companies Tech Competency Matrix'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://geshan.com.np/blog/2017/06/software-companies-tech-competency-matrix/'}], 'link': 'https://geshan.com.np/blog/2017/06/software-companies-tech-competency-matrix/', 'published': 'Wed, 7 Jun 2017 14:12:06 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506563', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'The future of MDN: a focus on web docs', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'The future of MDN: a focus on web docs'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.mozilla.org/opendesign/future-mdn-focus-web-docs/'}], 'link': 'https://blog.mozilla.org/opendesign/future-mdn-focus-web-docs/', 'published': 'Wed, 7 Jun 2017 08:38:27 +0000', 'comments': 'https://news.ycombinator.com/item?id=14504604', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': "Stanford's therapy chatbot for depression", 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': "Stanford's therapy chatbot for depression"}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.businessinsider.fr/us/stanford-therapy-chatbot-depression-anxiety-woebot-2017-6/'}], 'link': 'http://www.businessinsider.fr/us/stanford-therapy-chatbot-depression-anxiety-woebot-2017-6/', 'published': 'Wed, 7 Jun 2017 07:20:44 +0000', 'comments': 'https://news.ycombinator.com/item?id=14504306', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Conformity Excuses', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Conformity Excuses'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.overcomingbias.com/2017/06/conformity-excuses.html'}], 'link': 'http://www.overcomingbias.com/2017/06/conformity-excuses.html', 'published': 'Wed, 7 Jun 2017 09:36:47 +0000', 
'comments': 'https://news.ycombinator.com/item?id=14504822', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Set up a malware analysis lab with VirtualBox, INetSim and Burp', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Set up a malware analysis lab with VirtualBox, INetSim and Burp'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.christophetd.fr/set-up-your-own-malware-analysis-lab-with-virtualbox-inetsim-and-burp/'}], 'link': 'https://blog.christophetd.fr/set-up-your-own-malware-analysis-lab-with-virtualbox-inetsim-and-burp/', 'published': 'Wed, 7 Jun 2017 11:42:36 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505406', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Show HN: ProximityHash – Geohashes in Proximity', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Show HN: ProximityHash – Geohashes in Proximity'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://github.com/ashwin711/proximityhash'}], 'link': 'https://github.com/ashwin711/proximityhash', 'published': 'Wed, 7 Jun 2017 18:13:06 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508594', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'The Boolean Satisfiability Problem and SAT Solvers', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'The Boolean Satisfiability Problem and SAT Solvers'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://0a.io/boolean-satisfiability-problem-or-sat-in-5-minutes/'}], 'link': 'http://0a.io/boolean-satisfiability-problem-or-sat-in-5-minutes/', 'published': 'Wed, 7 Jun 2017 18:07:30 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508546', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Coursera raises Series D', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Coursera raises Series D'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://techcrunch.com/2017/06/07/online-learning-startup-coursera-raises-64m-at-an-800m-valuation/'}], 'link': 'https://techcrunch.com/2017/06/07/online-learning-startup-coursera-raises-64m-at-an-800m-valuation/', 'published': 'Wed, 7 Jun 2017 13:47:23 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506383', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Another “don’t cargo cult” article', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Another “don’t cargo cult” article'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb'}], 'link': 'https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb', 'published': 'Wed, 7 Jun 2017 17:34:19 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508264', 'summary': 
'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Ghost in the Shell FUI Design', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Ghost in the Shell FUI Design'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.hudsandguis.com/home/2017/4/17/ghostintheshell-fui'}], 'link': 'http://www.hudsandguis.com/home/2017/4/17/ghostintheshell-fui', 'published': 'Wed, 7 Jun 2017 06:41:53 +0000', 'comments': 'https://news.ycombinator.com/item?id=14504163', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'A day without JavaScript', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'A day without JavaScript'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://sonniesedge.co.uk/blog/a-day-without-javascript'}], 'link': 'https://sonniesedge.co.uk/blog/a-day-without-javascript', 'published': 'Wed, 7 Jun 2017 11:28:09 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505315', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Cache Organization in Intel CPUs (2009)', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Cache Organization in Intel CPUs (2009)'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://duartes.org/gustavo/blog/post/intel-cpu-caches/'}], 'link': 'http://duartes.org/gustavo/blog/post/intel-cpu-caches/', 'published': 'Wed, 7 Jun 2017 13:50:34 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506401', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': "Einstein's Philosophy of Science (2014)", 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': "Einstein's Philosophy of Science (2014)"}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://plato.stanford.edu/entries/einstein-philscience/'}], 'link': 'https://plato.stanford.edu/entries/einstein-philscience/', 'published': 'Tue, 6 Jun 2017 18:38:23 +0000', 'comments': 'https://news.ycombinator.com/item?id=14499850', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Getting started with the F# and .Net ecosystem', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Getting started with the F# and .Net ecosystem'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://www.prigrammer.com/?p=363'}], 'link': 'http://www.prigrammer.com/?p=363', 'published': 'Wed, 7 Jun 2017 13:34:58 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506287', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'BuildZoom (YC W13 – build your dream home) is hiring a VP of Sales', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'BuildZoom (YC W13 – build your dream 
home) is hiring a VP of Sales'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://jobs.lever.co/buildzoom'}], 'link': 'https://jobs.lever.co/buildzoom', 'published': 'Wed, 7 Jun 2017 17:22:09 +0000', 'comments': 'https://news.ycombinator.com/item?id=14508141', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Edison, Clarence Dally, and the Hidden Perils of X-Rays (1903)', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Edison, Clarence Dally, and the Hidden Perils of X-Rays (1903)'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://web.archive.org/web/20120218234715/http://home.gwi.net/~dnb/read/edison/edison_xrays.htm'}], 'link': 'http://web.archive.org/web/20120218234715/http://home.gwi.net/~dnb/read/edison/edison_xrays.htm', 'published': 'Wed, 7 Jun 2017 03:04:41 +0000', 'comments': 'https://news.ycombinator.com/item?id=14503416', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Pharo 6.0 Released', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Pharo 6.0 Released'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'http://pharo.org/news/pharo6.0-released'}], 'link': 'http://pharo.org/news/pharo6.0-released', 'published': 'Wed, 7 Jun 2017 07:00:25 +0000', 'comments': 'https://news.ycombinator.com/item?id=14504244', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'A curated list of design systems, pattern libraries, and more', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'A curated list of design systems, pattern libraries, and more'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://github.com/alexpate/awesome-design-systems'}], 'link': 'https://github.com/alexpate/awesome-design-systems', 'published': 'Wed, 7 Jun 2017 13:57:04 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506458', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Comcast Has Always Opposed Internet Freedom', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comcast Has Always Opposed Internet Freedom'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.eff.org/deeplinks/2017/06/dont-be-fooled-comcast-pr-machine-it-has-always-opposed-open-internet'}], 'link': 'https://www.eff.org/deeplinks/2017/06/dont-be-fooled-comcast-pr-machine-it-has-always-opposed-open-internet', 'published': 'Wed, 7 Jun 2017 14:46:30 +0000', 'comments': 'https://news.ycombinator.com/item?id=14506853', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'Dasung Paperlike Pro: E-Ink Monitor with HDMI connector [video]', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Dasung Paperlike Pro: E-Ink Monitor with HDMI connector [video]'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 
'https://www.youtube.com/watch?v=wj2Lvuc28k0'}], 'link': 'https://www.youtube.com/watch?v=wj2Lvuc28k0', 'published': 'Wed, 7 Jun 2017 12:33:22 +0000', 'comments': 'https://news.ycombinator.com/item?id=14505762', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}, {'title': 'List of Printers Which Do or Do Not Display Tracking Dots', 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'List of Printers Which Do or Do Not Display Tracking Dots'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.eff.org/pages/list-printers-which-do-or-do-not-display-tracking-dots'}], 'link': 'https://www.eff.org/pages/list-printers-which-do-or-do-not-display-tracking-dots', 'published': 'Tue, 6 Jun 2017 22:12:51 +0000', 'comments': 'https://news.ycombinator.com/item?id=14501894', 'summary': 'Comments', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'https://news.ycombinator.com/rss', 'value': 'Comments'}}], 'bozo': 0, 'headers': {'Date': 'Wed, 07 Jun 2017 20:01:20 GMT', 'Content-Type': 'application/rss+xml', 'Transfer-Encoding': 'chunked', 'Connection': 'close', 'Set-Cookie': '__cfduid=d1c55ac10a1500623f632fe9f4feca9441496865679; expires=Thu, 07-Jun-18 20:01:19 GMT; path=/; domain=.ycombinator.com; HttpOnly', 'Cache-Control': 'private', 'X-Frame-Options': 'DENY', 'Strict-Transport-Security': 'max-age=31556900; includeSubDomains', 'Server': 'cloudflare-nginx', 'CF-RAY': '36b633e12e3c680f-EZE'}, 'href': 'https://news.ycombinator.com/rss', 'status': 200, 'encoding': 'utf-8', 'version': 'rss20', 'namespaces': {1}}
5 | fake_response = munchify(feedparser_parse_response)
6 |
--------------------------------------------------------------------------------
/publishfeed/tests.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | from models import FeedSet, Base, RSSContent
3 | import config
4 | import sqlalchemy
5 | from sqlalchemy.orm import sessionmaker
6 | from unittest.mock import MagicMock
7 | from test_data.feedparser_data import fake_response
8 | from helpers import RSSContentHelper, FeedSetHelper
9 |
10 | class TestFeedSet(unittest.TestCase):
11 | def setUp(self):
12 | url = config.DB_TEST_URL
13 | if not url:
14 | self.skipTest("No database URL set")
15 | engine = sqlalchemy.create_engine(url)
16 | Base.metadata.drop_all(engine)
17 | Base.metadata.create_all(engine)
18 | Session = sessionmaker(bind=engine)
19 | self.session = Session()
20 |
21 |
22 | feedparser_fake_response = fake_response
23 |
24 | def feed_data_dict(self):
25 | data = {
26 | 'urls': ['https://news.ycombinator.com/rss'],
27 | 'hashtags': '#example',
28 | 'twitter': {
29 | 'consumer_key': 'XXXXXXXXXXX',
30 | 'access_secret': 'XXXXXXXXXXXXXX',
31 | 'consumer_secret': 'XXXXXXXXXXXXXX',
32 | 'access_key': 'XXXXXXXXXXXX'
33 | },
34 | 'name': 'SimpleItRocks'
35 | }
36 | return data
37 |
38 |
39 |
40 | def test_get_twitter_credentials(self):
41 | data = self.feed_data_dict()
42 | feed = FeedSet(data)
43 | keys = feed.twitter_keys
44 |
45 | self.assertIsInstance(keys, dict)
46 | self.assertIn('consumer_key', keys)
47 | self.assertIn('access_key', keys)
48 | self.assertIn('consumer_secret', keys)
49 | self.assertIn('access_secret', keys)
50 |
51 | def test_urls(self):
52 | data = self.feed_data_dict()
53 | feed = FeedSet(data)
54 | urls = feed.urls
55 |
56 | self.assertIsInstance(urls, list)
57 |
58 |
59 | @unittest.mock.patch('feedparser.parse', return_value=feedparser_fake_response)
60 | def test_save_new_pages(self, feedparser_fake_response):
61 |
62 | self.assertEqual(len(self.session.query(RSSContent).all()), 0)
63 | helper = FeedSetHelper(self.session, self.feed_data_dict())
64 | helper.get_pages_from_feeds()
65 | self.assertNotEqual(len(self.session.query(RSSContent).all()), 0)
66 |
67 | @unittest.mock.patch('feedparser.parse', return_value=feedparser_fake_response)
68 | def test_not_save_existing_pages(self, feedparser_fake_response):
69 | # presave an item that is present in the retrieved feed, to check if it
70 | # has not been saved after downloading new feeds
71 | entry = fake_response.entries[0]
72 | items_count = len(fake_response.entries)
73 | rsscontent = RSSContent(title=entry.title, url=entry.link)
74 | self.session.add(rsscontent)
75 | self.assertEqual(len(self.session.query(RSSContent).all()), 1)
76 | helper = FeedSetHelper(self.session, self.feed_data_dict())
77 |
78 | helper.get_pages_from_feeds()
79 |
80 | self.assertEqual(len(self.session.query(RSSContent).all()), items_count, "Entries count has changed")
81 |
82 | if __name__ == '__main__':
83 | unittest.main()
84 |
--------------------------------------------------------------------------------
/publishfeed/twitter.py:
--------------------------------------------------------------------------------
1 | import tweepy
2 |
3 | class Twitter:
4 |
5 | def __init__(self, consumer_key, consumer_secret, access_key, access_secret):
6 | auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
7 | auth.set_access_token(access_key, access_secret)
8 | self.api = tweepy.API(auth)
9 |
10 |
11 | def update_status(self, text):
12 | return self.api.update_status(text)
13 |
14 |
--------------------------------------------------------------------------------
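
Used in isolation, the wrapper looks like the following sketch (placeholder credentials; this performs a real tweet through tweepy's OAuth 1.0a flow):

~~~ python
# Sketch: posting one status through the Twitter wrapper.
from twitter import Twitter

twitter = Twitter(consumer_key='XXXXXX',
                  consumer_secret='XXXXXX',
                  access_key='XXXXXX',
                  access_secret='XXXXXX')
twitter.update_status('Hello from publishfeed!')
~~~
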
/requirements.txt:
--------------------------------------------------------------------------------
1 | beautifulsoup4==4.6.0
2 | certifi==2017.4.17
3 | chardet==3.0.3
4 | feedparser==5.2.1
5 | idna==2.5
6 | munch==2.1.1
7 | oauthlib==2.0.2
8 | PyYAML==5.4
9 | requests==2.20.0
10 | requests-oauthlib==0.8.0
11 | six==1.10.0
12 | SQLAlchemy==1.3.0
13 | tweepy==3.5.0
14 | urllib3==1.26.5
15 |
--------------------------------------------------------------------------------