├── artifacts
    ├── pdf.png
    ├── nuogat.png
    └── nougat process.png
└── README.md


/artifacts/pdf.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/inuwamobarak/nougat/HEAD/artifacts/pdf.png


--------------------------------------------------------------------------------
/artifacts/nuogat.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/inuwamobarak/nougat/HEAD/artifacts/nuogat.png


--------------------------------------------------------------------------------
/artifacts/nougat process.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/inuwamobarak/nougat/HEAD/artifacts/nougat process.png


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Nougat: Revolutionizing OCR for Scientific Documents

![nuogat](https://github.com/inuwamobarak/nougat/assets/65142149/bb997529-11c9-4e3e-baf4-f0d479936037)

## About Nougat

Nougat is a Transformer-based OCR model that converts complex scientific documents, usually stored as PDFs, into machine-readable Markdown. Developed by researchers at Meta AI, Nougat uses an encoder-decoder architecture trained end to end, with the goal of making scientific knowledge more accessible and usable.

## Key Features

- **Transformer Architecture:** Nougat uses a Swin Transformer as its vision encoder and an mBART-based text decoder, allowing end-to-end transcription of scientific PDFs.

- **End-to-End Training:** Nougat needs no multi-stage OCR pipeline. The model takes the raw pixels of a page image as input and generates Markdown text as output.

- **Bridging the Gap:** Nougat does more than transcribe: it turns documents written for human readers into text that machines can process, making scientific knowledge easier to access and use.

```bash
git clone https://github.com/inuwamobarak/nougat.git
```
--------------------------------------------------------------------------------