├── .gitignore
├── Commands.txt
├── GDPR Compliance.pdf
├── README.md
├── airFlowDAG
    └── mastadon_dag.py
├── analysis.ipynb
├── dataCollection
    ├── __pycache__
    │   └── loadData.cpython-39.pyc
    ├── data_example.json
    ├── last_toot_id.txt
    └── loadData.py
├── images
    ├── airflow2.PNG
    ├── gantt.PNG
    └── workflow.png
├── loadingIntoHbase
    └── insertion.py
├── mapReduce
    └── python
    │   ├── mapper.py
    │   └── reducer.py
└── requirements.txt


/.gitignore:
--------------------------------------------------------------------------------
# Environment file
.env


--------------------------------------------------------------------------------
/Commands.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/Commands.txt


--------------------------------------------------------------------------------
/GDPR Compliance.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/GDPR Compliance.pdf


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/README.md


--------------------------------------------------------------------------------
/airFlowDAG/mastadon_dag.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/airFlowDAG/mastadon_dag.py


--------------------------------------------------------------------------------
/analysis.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/analysis.ipynb


--------------------------------------------------------------------------------
/dataCollection/__pycache__/loadData.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/dataCollection/__pycache__/loadData.cpython-39.pyc


--------------------------------------------------------------------------------
/dataCollection/data_example.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/dataCollection/data_example.json


--------------------------------------------------------------------------------
/dataCollection/last_toot_id.txt:
--------------------------------------------------------------------------------
111291292042375273


--------------------------------------------------------------------------------
/dataCollection/loadData.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/dataCollection/loadData.py


--------------------------------------------------------------------------------
/images/airflow2.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/images/airflow2.PNG


--------------------------------------------------------------------------------
/images/gantt.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/images/gantt.PNG


--------------------------------------------------------------------------------
/images/workflow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/images/workflow.png


--------------------------------------------------------------------------------
/loadingIntoHbase/insertion.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/loadingIntoHbase/insertion.py


--------------------------------------------------------------------------------
/mapReduce/python/mapper.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/mapReduce/python/mapper.py


--------------------------------------------------------------------------------
/mapReduce/python/reducer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SAAD-BEN/Mastadon_data_analysis_Airflow_Hadoop_Hbase/HEAD/mapReduce/python/reducer.py


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
Mastodon.py
python-dotenv
hdfs
pandas
happybase


--------------------------------------------------------------------------------