├── .gitattributes
├── README.md
└── images
    ├── bmbf_pf_funding_logos.svg
    └── start.gif


/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Teledash
*Research and analysis software for Telegram*

Teledash is a web application that simplifies research and analysis of content on Telegram.

![landing page](images/start.gif)

## Repositories

* [Frontend](https://github.com/democ-de/teledash-frontend)
* [Backend](https://github.com/democ-de/teledash-backend)

## Features
One or more Telegram accounts can be linked to Teledash. Their content is downloaded and processed periodically.

### Search
The web interface lets you search all channels, groups and chats using various parameters. For example, you can search for messages from a certain period in certain channels, or for messages from specific users.

### Text and speech recognition
Automated text recognition (OCR) is used to recognize and save text in images. In addition, voice messages can be transcribed automatically (ASR) in the background and stored as searchable text. The quality of the results depends strongly on the quality of the audio recording and on the speech model used. The models for both text and speech recognition can be manually improved or trained if necessary.

### Metrics
Teledash regularly collects statistical data on the activity and growth of channels and groups, enabling quantitative analysis.

### Storage
Media such as videos, photos, and voice messages can be automatically downloaded and stored in a MinIO instance or in S3-compatible cloud storage.
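
How stored objects are named is not described here. As a minimal sketch, assuming a hypothetical key layout of `chats/<chat_id>/messages/<message_id>/<sha256><ext>` (the function name, parameters, and layout are illustrative and not taken from Teledash), a deterministic object key for a downloaded media file could be derived with the Python standard library:

```python
import hashlib
from pathlib import PurePosixPath


def media_object_key(chat_id: int, message_id: int, filename: str, data: bytes) -> str:
    """Build a deterministic object key for one media file.

    The key layout is a hypothetical convention, not Teledash's actual scheme.
    """
    # Hash the content so that re-downloading the same file yields the same key.
    digest = hashlib.sha256(data).hexdigest()
    # Keep the original file extension (lowercased); empty string if there is none.
    suffix = PurePosixPath(filename).suffix.lower()
    return f"chats/{chat_id}/messages/{message_id}/{digest}{suffix}"
```

Because the key is a plain string, the same layout could be used with either MinIO or another S3-compatible store.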

### Export
All collected data can also be accessed and filtered via a REST API so that third-party software can process the content further. API endpoints can be tested using Swagger. Optionally, complete data sets can be exported as CSV or JSON with mongoexport.

### Future development
Teledash will continue to be developed and tested in journalistic and scientific contexts. Feel free to [get in touch](mailto:teledash@democ.de).

## Terminology
- __Chats__ are groups, supergroups and channels. (Private conversations are not scraped or stored by Teledash.)
- __Users__ are users and bots.
- __Messages__ are messages that contain text or media (attachments).

## Citation
Please cite Teledash in your publications if you used it for your research:
```BibTeX
@misc{teledash_2022,
    title={Teledash – analysis and research software for Telegram},
    url={https://github.com/democ-de/teledash},
    author={Weichbrodt, Gregor and Stanjek, Grischa},
    year={2022}
}
```

## Acknowledgements

* Funded from September 2021 until February 2022 by ![logos of the Bundesministerium für Bildung und Forschung (BMBF), Prototype Fund and OKF-Deutschland](images/bmbf_pf_funding_logos.svg)


--------------------------------------------------------------------------------
/images/bmbf_pf_funding_logos.svg:
--------------------------------------------------------------------------------
[SVG markup not preserved; title: BMBF-Logo]


--------------------------------------------------------------------------------
/images/start.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/democ-de/teledash/2146eb1a62dc20a6e7d8775470d88c811892eeb8/images/start.gif


--------------------------------------------------------------------------------