├── .gitignore ├── LICENSE ├── Procfile ├── README.md ├── app.py └── requirements.txt /.gitignore: -------------------------------------------------------------------------------- 1 | venv_dash 2 | *.pyc 3 | .DS_Store 4 | .env 5 | .vscode -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Keerthan Vantakala 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | 
--------------------------------------------------------------------------------
/Procfile:
--------------------------------------------------------------------------------
1 | web: gunicorn app:server
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Crypto Sentiment Analysis
2 | 
3 | Welcome!
4 | 
5 | This is a project to measure and analyze how sentiment on various social media and news sites corresponds with market fluctuations for cryptocurrencies.
6 | 
7 | The main purposes of this project are to:
8 | 
9 | 1) Create a useful application for those interested in cryptocurrency.
10 | 
11 | 2) Get hands-on experience with various tools/technology I'm interested in.
12 | 
13 | I've broken the project up into two repositories: one holds the frontend code for the web application, the other holds the backend code.
14 | 
15 | Backend: https://github.com/vantaka2/crypto_sentiment_analysis-
16 | 
17 | Frontend: https://github.com/vantaka2/crypto_dash_app
18 | 
19 | Here is a link to the current state of the web app:
20 | https://crypto-sentiment-keerthan.herokuapp.com/
21 | 
22 | (Please keep in mind this is running on Heroku's free tier, where applications are put to sleep after 30 minutes of inactivity, so the first person to visit after it has gone to sleep will have to wait a minute or two for the app to start up again.)
23 | 
24 | ## Frontend
25 | 
26 | ### Framework
27 | 
28 | I've chosen to build the frontend with Plotly Dash, a Python framework for building analytical web applications without writing JavaScript.
29 | 
30 | Check out the project here:
31 | 
32 | https://github.com/plotly/dash/
33 | 
34 | https://plot.ly/products/dash/
35 | 
36 | ### Hosting
37 | 
38 | The application is currently being hosted on Heroku.
For the time being I am using the free tier until the application is in a better state or there is more interest in it. Some options moving forward are to move it to Heroku's Standard plan or to run it on the AWS server I am already paying for to do the backend work.
39 | 
40 | You can find the application here:
41 | https://crypto-sentiment-keerthan.herokuapp.com/
42 | 
43 | ### Design
44 | 
45 | I am not a UI/UX person, so if you have feedback on how the design of the application can be improved, please let me know via GitHub issues!
46 | 
47 | For now I've chosen the color scheme after a quick Google search for color combinations that work well in dashboards.
48 | 
49 | Source: https://aesalazar.com/blog/3-professional-color-combinations-for-dashboards-or-mobile-bi-applications/
50 | 
51 | HEX color codes used: #F1F1F1, #202020, #7E909A, #1C4E80, #A5D8DD, #EA6A47, #0091D5
52 | 
53 | ## Backend
54 | 
55 | ### Server
56 | 
57 | The backend processes run on two AWS servers:
58 | 
59 | one m5.large and one t2.micro.
60 | 
61 | The m5.large runs Apache Airflow, which performs the ETL processing from various data sources.
62 | 
63 | The t2.micro hosts a PostgreSQL database.
64 | 
65 | ### Tools/Tech
66 | 
67 | #### Apache Airflow
68 | 
69 | Airflow is a platform to programmatically author, schedule, and monitor workflows.
70 | 
71 | Check out Airflow at: https://github.com/apache/incubator-airflow
72 | 
73 | #### PostgreSQL
74 | 
75 | I'm using PostgreSQL as my database.
76 | 
77 | Version: PostgreSQL 9.5.3
78 | 
79 | https://www.postgresql.org/
80 | 
81 | ### Data
82 | 
83 | #### Coinmarketcap
84 | 
85 | I'm using the coinmarketcap API to obtain data on the current price and market cap of all the coins. They have an awesome API, so check it out!
86 | 
87 | https://coinmarketcap.com/api/
88 | 
89 | #### Reddit
90 | 
91 | I'm using the PRAW library to access the reddit API.
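Not every post pulled through praw is kept: the dashboard's `reddit_posts` query in `app.py` filters down to submissions from the last 7 days with a score above 10. Here is a stdlib-only sketch of that filtering step applied to already-fetched records; the `keep_hot_posts` helper and the sample posts are hypothetical illustrations, not code from either repo.

```python
from datetime import datetime, timedelta

def keep_hot_posts(posts, min_score=10, days=7, now=None):
    """Keep posts from the last `days` days whose score exceeds `min_score`,
    mirroring the score/date filter in app.py's reddit_posts query."""
    if now is None:
        now = datetime.utcnow()
    cutoff = now - timedelta(days=days)
    return [p for p in posts if p["score"] > min_score and p["created"] >= cutoff]

now = datetime(2018, 4, 1)
posts = [
    {"title": "Nano listed on a new exchange", "score": 42, "created": datetime(2018, 3, 30)},
    {"title": "old thread", "score": 99, "created": datetime(2018, 3, 1)},
    {"title": "low-score post", "score": 3, "created": datetime(2018, 3, 31)},
]
print([p["title"] for p in keep_hot_posts(posts, now=now)])
# -> ['Nano listed on a new exchange']
```

In the app itself this window and threshold live in SQL, which keeps the filtering close to the data; the Python version above is only for illustration.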
92 | 
93 | Check out PRAW here: https://github.com/praw-dev/praw
94 | 
95 | Info on Reddit's API: https://www.reddit.com/dev/api/
96 | 
97 | 
--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
1 | import dash
2 | import dash_core_components as dcc
3 | import dash_html_components as html
4 | import pandas as pd
5 | from dash.dependencies import Input, Output
6 | import os
7 | import plotly.graph_objs as go
8 | 
9 | 
10 | 
11 | app = dash.Dash(__name__)
12 | server = app.server
13 | 
14 | 
15 | ## Bootstrap-style CSS (from https://github.com/amyoshino/DASH_Tutorial_ARGO_Labs/blob/master/app.py)
16 | app.css.append_css(
17 |     {'external_url':'https://cdn.rawgit.com/plotly/dash-app-stylesheets/2d266c578d2a6e8850ebce48fdb52759b2aef506/stylesheet-oil-and-gas.css'})
18 | sql_con = os.environ.get('pg_db')  # Postgres connection string, read from the pg_db environment variable
19 | print("set sql_con")
20 | def mentions_marketcap(pg_conn=sql_con):
21 |     sql = """select e.name, count(distinct a.post_id) as post_count, c.market_cap_usd, b.created::date
22 |     from coin.xref_post_to_coin a
23 |     inner join coin.dim_reddit_post b
24 |         on a.post_id = b.post_id
25 |     inner join (select id, avg(market_cap_usd) as market_cap_usd, insert_timestamp::date from coin.price_24h where insert_timestamp >= current_date -8 group by 1,3) c
26 |         on a.coin_id = c.id and b.created::date = c.insert_timestamp::date
27 |     inner join coin.coin_rank d
28 |         on a.coin_id = d.id
29 |     inner join coin.dim_coin e
30 |         on a.coin_id = e.id
31 |     where coin_id is not null and created >= current_date -8 and d.current_rank < 101
32 |     group by 1,3,4;"""
33 |     df = pd.read_sql(sql, pg_conn)
34 |     return df
35 | 
36 | ## get market cap data frame
37 | def market_cap_df(pg_conn=sql_con):
38 |     """Returns the dataframe used for marketcap graphs"""
39 |     sql = """
40 |     select id, name, current_rank, last_updated, insert_timestamp, market_cap_usd
41 |     from coin.mc_graph_data
42 |     group by
1,2,3,4,5,6
43 |     """
44 |     df = pd.read_sql(sql, pg_conn)
45 |     return df
46 | ## reddit post queries:
47 | def reddit_posts(pg_conn=sql_con):
48 |     """Returns reddit posts from the last 7 days with score > 10, joined to sentiment."""
49 |     sql = """select title,
50 |     --c.source, a.post_id,
51 |     b.score, c.sentiment, c.confidence
52 |     --, array_agg(d.keyword) as keyword
53 |     from coin.dim_reddit_post a
54 |     inner join (select max(score) as score, post_id from coin.reddit_post_trends group by 2) b
55 |         on a.post_id = b.post_id
56 |     inner join coin.sentiment c
57 |         on a.post_id = c.source_id
58 |     inner join (select post_id, unnest(keyword) as keyword from coin.xref_post_to_coin d group by 1,2) d
59 |         on a.post_id = d.post_id
60 |     where created >= current_date -7
61 |         and b.score > 10
62 |         and d.keyword != '{null}'
63 |     group by 1,2,3,4"""
64 |     df = pd.read_sql(sql, pg_conn)
65 |     return df
66 | 
67 | def reddit_agg_by_day(pg_conn=sql_con):
68 |     """Queries database for reddit post counts aggregated by day."""
69 |     sql = """select num_posts, name, created, sentiment
70 |     from coin.reddit_post_by_day_agg"""
71 |     df = pd.read_sql(sql, pg_conn)
72 |     return df
73 | 
74 | def reddit_trends_df(pg_conn=sql_con):
75 |     """Queries database for reddit post trends data."""
76 |     sql = """select post_id, created, title, diff, score, num_comments, name
77 |     from coin.reddit_trends
78 |     where diff <= 1000 """
79 |     df = pd.read_sql(sql, pg_conn)
80 |     return df
81 | 
82 | df_rt = reddit_trends_df(sql_con)
83 | df_mc = market_cap_df(sql_con)
84 | coin_list = list(df_mc['name'].unique())
85 | df_red_agg = reddit_agg_by_day(sql_con)
86 | df_post = reddit_posts(sql_con)
87 | df_scatter = mentions_marketcap()
88 | print("ran all sql")
89 | ## layout
90 | app.layout = html.Div([
91 |     ## Top header: author, title & submit-feedback button
92 |     html.Div(
93 |         [
94 |             html.Div("created by: Keerthan Vantakala", style={'color':'#ffffff','textAlign':'center','marginTop': 5},
95 |                 className='two columns'),
96 |             html.Div(
97 |                 children='Cryptocurrency Sentiment Analysis',
98 |                 style =
dict(backgroundColor="#1C4E80",
99 |                     color='#ffffff',
100 |                     textAlign='center',
101 |                     fontSize=25),
102 |                 className='eight columns'),
103 |             html.Div(html.A('https://github.com/vantaka2', href="https://github.com/vantaka2/crypto_dash_app", style={'color':'#ffffff','textAlign':'center'}
104 |                 ), style={'marginTop': 5},
105 |                 className='two columns'),
106 |         ], className="row", style={'backgroundColor':'#1C4E80'}
107 |     ),
108 |     ## Filters
109 |     html.Div(
110 |         [
111 |             html.Div(
112 |                 [
113 |                     html.Div("Search by Coin - please limit to 10 coins, this app is running on a weak server! :("),
114 |                     dcc.Dropdown(
115 |                         id='coin_select',
116 |                         options=[
117 |                             {'label':i, 'value':i}
118 |                             for i in coin_list
119 |                         ],
120 |                         multi=True
121 |                     ),
122 |                 ], className='five columns'
123 |             ),
124 |             html.Div(
125 |                 [
126 |                     html.Div("Quick Coin Select",
127 |                         style={'text-align':'center'}),
128 |                     dcc.RadioItems(
129 |                         id='quick_filter',
130 |                         options=[
131 |                             {'label':'Top 5', 'value':5},
132 |                             {'label':'Top 10', 'value':10}
133 |                         ],
134 |                         labelStyle={'display':'inline-block'},
135 |                         style={'text-align':'center'}
136 |                     ),
137 |                 ], className='two columns'
138 |             ),
139 |             html.Div(
140 |                 [
141 |                     html.Div("Date Filter",
142 |                         style={'text-align':'center'}),
143 |                     dcc.RadioItems(
144 |                         id='date_filter',
145 |                         options=[
146 |                             {'label':'Last 7 Days', 'value':7},
147 |                             {'label':'Last 24 Hours', 'value':1},
148 |                         ],
149 |                         value=7,
150 |                         labelStyle={'display': 'inline-block',
151 |                             'text-align':'center'},
152 |                         style={'text-align':'center'}
153 |                     ),
154 |                 ], className='two columns'
155 |             ),
156 |             html.Div(
157 |                 [
158 |                     html.P(html.A(html.Button('Submit Feedback'), href="https://github.com/vantaka2/crypto_dash_app/issues/new",
159 |                     ), style={'textAlign':'center','marginTop': '5'}),
160 |                 ], className='two columns'),
161 |             html.Div(
162 |                 [
163 |                     html.P(html.A(html.Button('FAQ'), href="https://medium.com/@keerthanvantakala/faq-for-crypto-currency-sentiment-dashboard-582624a00d89",
164 | 
                    ), style={'textAlign':'center', 'float':'right','marginTop': '5'})
165 |                 ], className='one columns'),
166 | 
167 |     ], className="row", style={'marginTop': 5,'marginRight':15, 'marginLeft':15}
168 |     ),
169 |     ### KPI metrics: Market Cap, MC percent change, Mentions & Mentions by Sentiment
170 |     html.Div(
171 |         [
172 |             html.Div(
173 |                 [
174 |                     html.Div(children="Market Cap",
175 |                         style={'textAlign':'center','fontSize':20}),
176 |                     html.Div([
177 |                         html.Div(id='display_total_mc',
178 |                             style={'textAlign':'center'}
179 |                         ),
180 |                     ]),
181 | 
182 |                 ], className='three columns'
183 |             ),
184 |             html.Div(
185 |                 [
186 |                     html.Div(children="Market Cap % Change",
187 |                         style={'textAlign':'center','fontSize':20}),
188 |                     html.Div(id='display_pct_change',
189 |                         style={'textAlign':'center'})
190 |                 ], className='three columns'
191 |             ),
192 |             html.Div(
193 |                 [
194 |                     html.Div(children="Mentions",
195 |                         style={'textAlign':'center','fontSize':20}),
196 |                     html.Div(id='reddit_mentions',
197 |                         style={'textAlign':'center',
198 |                         })
199 |                 ], className='three columns'
200 |             ),
201 |             html.Div(
202 |                 [
203 |                     html.Div(children="Mentions by Sentiment",
204 |                         style={'textAlign':'center','fontSize':20}),
205 |                     html.Div(id='sentiment_cnt',
206 |                         style={'textAlign':'center'})
207 |                 ], className='three columns'
208 |             ),
209 |         ], className="row", style={'marginTop': 5}
210 |     ),
211 | 
212 |     ## Total MC chart & MC Percent Change
213 |     html.Div(
214 |         [
215 |             html.Div(
216 |                 [
217 |                     dcc.Graph(
218 |                         id='total_mc'
219 |                     ),
220 |                 ], className='six columns',
221 |             ),
222 |             html.Div(
223 |                 [
224 |                     dcc.Graph(
225 |                         id='mc_by_coin'
226 |                     ),
227 |                 ], className='six columns'
228 |             )
229 |         ], className="row", style={'marginTop': 5,'marginRight':15, 'marginLeft':15}
230 |     ),
231 |     html.Div(
232 |         [
233 |             html.Div(
234 |                 [ dcc.Tabs(
235 |                     tabs=[
236 |                         {'label': 'Reddit Post Trends', 'value': 1},
237 |                         {'label': 'Sentiment by Coin', 'value': 2},
238 |                         {'label': 'Mentions by Day', 'value': 3}
239 |                     ],
240 |                     value=2,
241 |                     id='tabs',
242 |                 ),
243 |                 html.Div(id='tab-output')
244 |                 ], className='six columns'
245 |             ),
246 |             html.Div(
247 |                 [dcc.Graph(id='scatterpolot'),
248 |                 ], className='six columns'
249 |             ),
250 |         ], className="row", style={'marginTop': 5,'marginRight':15, 'marginLeft':15}
251 |     )
252 | ],
253 | style={'backgroundColor':'#F1F1F1'}
254 | )
255 | 
256 | ## Callbacks
257 | 
258 | @app.callback(
259 |     dash.dependencies.Output('coin_select', 'value'),
260 |     [dash.dependencies.Input('quick_filter', 'value')])
261 | def set_coin_select(qf_value):
262 |     if qf_value is None:
263 |         value = ['Nano', 'NEO', 'Walton','Ethereum','SALT','VeChain','Dent']
264 |     else:
265 |         value = df_mc[df_mc['current_rank'] <= qf_value]['name'].unique()
266 |     return value
267 | 
268 | @app.callback(
269 |     dash.dependencies.Output('display_pct_change', 'children'),
270 |     [dash.dependencies.Input('coin_select', 'value'),
271 |     dash.dependencies.Input('date_filter', 'value')])
272 | def pct_change(coin_select, date_filter):
273 |     df = filter_df(df_mc, coin_select, date_filter)
274 |     print(coin_select)
275 |     start = df[df['insert_timestamp'] == df.min()['insert_timestamp']].sum()['market_cap_usd']
276 |     end = df[df['insert_timestamp'] == df.max()['insert_timestamp']].sum()['market_cap_usd']
277 |     pct_change = round(((end-start)/start)*100)
278 |     return ' {} % '.format(pct_change)
279 | 
280 | @app.callback(
281 |     dash.dependencies.Output('reddit_mentions', 'children'),
282 |     [dash.dependencies.Input('coin_select', 'value'),
283 |     dash.dependencies.Input('date_filter', 'value')])
284 | def mentions(coin_select, date_filter):
285 |     df_reddit = filter_reddit(df_red_agg, coin_select, date_filter)
286 |     return df_reddit.sum()['num_posts']
287 | 
288 | @app.callback(
289 |     dash.dependencies.Output('display_total_mc', 'children'),
290 |     [dash.dependencies.Input('coin_select', 'value')])
291 | def mc_total(coin_select):
292 |     df = df_mc[df_mc['insert_timestamp'] == df_mc.max()['insert_timestamp']]
293 | 
    df_stg = df[df['name'].isin(coin_select)]
294 |     tmc = int(df_stg.sum()['market_cap_usd']/1000000)
295 |     return '{:,} MM'.format(tmc)
296 | 
297 | @app.callback(
298 |     dash.dependencies.Output('sentiment_cnt', 'children'),
299 |     [dash.dependencies.Input('coin_select', 'value'),
300 |     dash.dependencies.Input('date_filter', 'value')])
301 | def mentions_by_sentiment(coin_select, date_filter):
302 |     df_reddit = filter_reddit(df_red_agg, coin_select, date_filter)
303 |     df3 = df_reddit.groupby('sentiment', as_index=False).sum()
304 |     negative_cnt = int(df3[df3['sentiment'] == 'Negative']['num_posts'])
305 |     positive_cnt = int(df3[df3['sentiment'] == 'Positive']['num_posts'])
306 |     neutral_cnt = int(df3[df3['sentiment'] == 'Neutral']['num_posts'])
307 |     return 'Positive: {} Neutral: {} Negative: {}'.format(positive_cnt, neutral_cnt, negative_cnt)
308 | 
309 | #total_MC_Graph
310 | @app.callback(
311 |     dash.dependencies.Output('total_mc', 'figure'),
312 |     [dash.dependencies.Input('coin_select', 'value'),
313 |     dash.dependencies.Input('date_filter', 'value')])
314 | def update_total_mc(coin_select, date_filter):
315 |     df_total_mc = filter_df(df_mc, coin_select, date_filter)
316 |     data = [{
317 |         'x':df_total_mc.groupby('insert_timestamp', as_index=False).agg('sum').sort_values('insert_timestamp')['insert_timestamp'],
318 |         'y':df_total_mc.groupby('insert_timestamp', as_index=False).agg('sum').sort_values('insert_timestamp')['market_cap_usd'],
319 |         'type': 'line',
320 |         'name': 'Total MC'}]
321 |     return {'data':data,
322 |         'layout':{
323 |             'title': 'Market Cap'}
324 |     }
325 | @app.callback(
326 |     dash.dependencies.Output('scatterpolot', 'figure'),
327 |     [dash.dependencies.Input('coin_select', 'value'),
328 |     dash.dependencies.Input('date_filter', 'value')])
329 | def scatter_plot(coin_select, datefilter):
330 |     df = filter_reddit(df_scatter, coin_select, datefilter)
331 |     data = [
332 |         go.Scatter(
333 |             y=df[df['name'] == i]['post_count'],
334 |             x=df[df['name'] ==
i]['market_cap_usd'],
335 |             opacity=0.8,
336 |             hovertext=df[df['name'] == i]['created'],
337 |             mode='markers',
338 |             marker=dict(size=15),
339 |             name=i
340 | 
341 |         ) for i in coin_select
342 |     ]
343 |     layout = go.Layout(
344 |         title='Mentions vs Marketcap',
345 |         xaxis=dict(
346 |             title='Marketcap (Log Scale)',
347 | 
348 |             type='log',
349 |             autorange=True,
350 | 
351 |         ),
352 |         hovermode='closest',
353 |         yaxis=dict(
354 |             title='Mention Count',
355 | 
356 |             autorange=True
357 |         )
358 |     )
359 |     figure = {'data':data,
360 |         'layout':layout}
361 |     return figure
362 | 
363 | @app.callback(
364 |     dash.dependencies.Output('mc_by_coin', 'figure'),
365 |     [dash.dependencies.Input('coin_select', 'value'),
366 |     dash.dependencies.Input('date_filter', 'value')])
367 | def update_mc_by_coin(coin_select, date_filter):
368 |     df_coin_mc_stg = filter_df(df_mc, coin_select, date_filter)
369 |     df_coin_mc = df_coin_mc_stg.sort_values(by=['id','insert_timestamp'])
370 |     data = [
371 |         go.Scatter(
372 |             x=df_coin_mc[df_coin_mc['name'] == i]['last_updated'],
373 |             y=df_coin_mc[df_coin_mc['name'] == i]['pct_change'],
374 |             mode='lines',  # 'lines' is the valid plotly mode (was 'line')
375 |             opacity=0.8,
376 |             name=i
377 |         ) for i in coin_select
378 |     ]
379 |     layout = go.Layout(
380 |         title='Market Cap % Change by Coin',
381 |         yaxis=dict(
382 |             title='Percent Change',
383 |         ),
384 |         hovermode='closest',
385 |         margin=dict(
386 |             l=50,
387 |             r=50,
388 |             t=50,
389 |             b=50,
390 |             pad=10
391 |         ),
392 |         xaxis={'title':''}
393 |     )
394 |     figure = {'data':data,
395 |         'layout':layout}
396 |     return figure
397 | ##reddit agg graph
398 | 
399 | 
400 | 
401 | #reddit post trends
402 | @app.callback(
403 |     dash.dependencies.Output('tab-output', 'children'),
404 |     [dash.dependencies.Input('coin_select', 'value'),
405 |     dash.dependencies.Input('date_filter', 'value'),
406 |     dash.dependencies.Input('tabs','value')])
407 | def update_tabs(coin_select, date_filter, tabs):
408 |     if tabs == 1:
409 |         df_2 = filter_reddit(df_rt, coin_select, date_filter)
410 | 
        df_trends = df_2.sort_values(['diff']).reset_index(drop=True)
411 |         posts = list(df_trends['post_id'].unique())
412 |         print(df_trends)
413 |         data2 = [
414 |             go.Scatter(
415 |                 x=df_trends[df_trends['post_id'] == i]['diff'],
416 |                 y=df_trends[df_trends['post_id'] == i]['score'],
417 |                 mode='lines',  # 'lines' is the valid plotly mode (was 'line')
418 |                 opacity=0.8,
419 |                 name=str(df_trends[df_trends['post_id'] == i]['name'].unique()[0]),
420 |                 hovertext=str(df_trends[df_trends['post_id'] == i]['title'].unique()[0])
421 |             ) for i in posts
422 |         ]
423 |         layout = go.Layout(
424 |             title='Reddit Post Trends',
425 |             yaxis=dict(
426 |                 title='Score'
427 |             ),
428 |             hovermode='closest',
429 |             showlegend=False
430 |         )
431 |         figure1 = {
432 |             'data':data2,
433 |             'layout':layout
434 |         }
435 |         return html.Div([
436 |             dcc.Graph(
437 |                 id='graph',
438 |                 figure=figure1
439 |             )
440 |         ])
441 |     elif tabs == 2:
442 |         df_reddit = filter_reddit(df_red_agg, coin_select, date_filter)
443 |         df_reddit2 = df_reddit.groupby(by=['sentiment','name'], as_index=False).sum()
444 |         sentiment = ['Neutral','Negative','Positive']
445 |         data = [
446 |             go.Bar(
447 |                 x=df_reddit2[df_reddit2['sentiment'] == i]['name'],
448 |                 y=df_reddit2[df_reddit2['sentiment'] == i]['num_posts'],
449 |                 name=i
450 |             ) for i in sentiment
451 |         ]
452 |         layout = go.Layout(
453 |             title='Sentiment By Coin',
454 |             yaxis=dict(
455 |                 title='Mention Count'
456 |             )
457 |         )
458 |         figure1 = {'data':data,
459 |             'layout':layout}
460 |         return html.Div([
461 |             dcc.Graph(id='graph',
462 |                 figure=figure1)])
463 | 
464 |     elif tabs == 3:
465 |         df_reddit = filter_reddit(df_red_agg, coin_select, date_filter)
466 |         df_reddit2 = df_reddit.groupby(by=['created','name'], as_index=False).sum()
467 |         data = [
468 |             go.Bar(
469 |                 x=df_reddit2[df_reddit2['name'] == i]['created'],
470 |                 y=df_reddit2[df_reddit2['name'] == i]['num_posts'],
471 |                 name=i#,
472 |                 #hovertext=df_reddit[df_reddit['name'] == i]['sentiment']
473 |             ) for i in coin_select
474 |         ]
475 |         layout = go.Layout(
476 |             title='Mentions
per Day',
477 |             barmode='stack',
478 |             yaxis=dict(title='Mention Count'),
479 |             hovermode='closest'
480 |         )
481 |         figure1 = {'data':data,
482 |             'layout':layout}
483 |         return html.Div([
484 |             dcc.Graph(id='graph',
485 |                 figure=figure1)])
486 | 
487 | 
488 | 
489 | # helper: filter price data by coin and date, adding pct_change from the first value in the window
490 | def filter_df(df=None, coin_select=None, date_filter=None):
491 |     date_cutoff = df.max()['insert_timestamp'] - pd.Timedelta(days=date_filter)
492 |     # coin filter
493 |     df_stg = df[df['name'].isin(coin_select)]
494 |     # date filter
495 |     df_stg_2 = df_stg[df_stg['insert_timestamp'] >= date_cutoff]
496 |     coin_list = list(df_stg_2['name'].unique())
497 |     frame = []
498 |     for i in coin_list:
499 |         df_stg_3 = df_stg_2[df_stg_2['name'] == i].copy()  # copy so the assignment below avoids SettingWithCopyWarning
500 |         base = df_stg_3[df_stg_3['insert_timestamp'] == df_stg_3.min()['insert_timestamp']]['market_cap_usd'].reset_index(drop=True)[0]
501 |         df_stg_3['pct_change'] = ((df_stg_3['market_cap_usd'] - base) / base) * 100
502 |         frame.append(df_stg_3)
503 |     df_stg_4 = pd.concat(frame)
504 |     return df_stg_4
505 | 
506 | 
507 | def filter_reddit(df=None, coin_select=None, date_filter=None):
508 |     df_stg = df[df['name'].isin(coin_select)]
509 |     date_cutoff = df.max()['created'] - pd.Timedelta(days=date_filter)
510 |     df_stg_2 = df_stg[df_stg['created'] >= date_cutoff]
511 |     return df_stg_2
512 | 
513 | if __name__ == '__main__':
514 |     app.run_server(debug=True)
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | certifi==2018.1.18
2 | chardet==3.0.4
3 | click==6.7
4 | dash==0.21.0
5 | dash-core-components==0.21.0rc1
6 | dash-html-components==0.9.0
7 | dash-renderer==0.11.3
8 | decorator==4.2.1
9 | Flask==0.12.2
10 | Flask-Compress==1.4.0
11 | gunicorn==19.7.1
12 | idna==2.6
13 | ipython-genutils==0.2.0
14 | itsdangerous==0.24
15 | Jinja2==2.10
16 | jsonschema==2.6.0
17 | jupyter-core==4.4.0
18 | MarkupSafe==1.0
19 | 
nbformat==4.4.0 20 | numpy==1.14.1 21 | pandas==0.22.0 22 | plotly==2.4.1 23 | psycopg2==2.7.4 24 | python-dateutil==2.6.1 25 | pytz==2018.3 26 | requests==2.18.4 27 | six==1.11.0 28 | SQLAlchemy==1.2.4 29 | traitlets==4.3.2 30 | urllib3==1.22 31 | Werkzeug==0.14.1 32 | --------------------------------------------------------------------------------