├── .Dockerignore
├── .gitignore
├── Dockerfile
├── README.md
├── app.py
└── requirements.txt

/.Dockerignore:
--------------------------------------------------------------------------------
.history
venv
__pycache__
.idea
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
.history
venv
__pycache__
.idea
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
FROM python:3.10-slim

WORKDIR /python-docker

COPY requirements.txt requirements.txt
RUN apt-get update && apt-get install git -y
# If you are experiencing `ImportError: cannot import name 'soft_unicode' from 'markupsafe'`, uncomment the line below
# RUN pip3 install markupsafe==2.0.1
RUN pip3 install -r requirements.txt
RUN pip3 install "git+https://github.com/openai/whisper.git"
RUN apt-get install -y ffmpeg

COPY . .

EXPOSE 5000

CMD [ "python3", "-m", "flask", "run", "--host=0.0.0.0"]
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
## What is Whisper?

Whisper is a state-of-the-art automatic speech recognition system from OpenAI, trained on 680,000 hours
of multilingual and multitask supervised data collected from the web. This large and diverse
dataset leads to improved robustness to accents, background noise and technical language. In
addition, it enables transcription in multiple languages, as well as translation from those
languages into English. OpenAI released the models and code to serve as a foundation for building useful
applications that leverage speech recognition.
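Before wrapping it in a web service, Whisper can be called directly from Python. A minimal sketch — assuming the `openai-whisper` package and ffmpeg are installed, and with `"audio.mp3"` as a purely illustrative placeholder path:

```python
def transcribe_file(path: str, model_name: str = "base") -> str:
    """Return the transcript of the audio file at `path` using Whisper."""
    # Imported lazily so the sketch can be loaded without the package installed.
    import whisper

    model = whisper.load_model(model_name)  # downloads the model weights on first use
    result = model.transcribe(path)         # ffmpeg decodes the audio behind the scenes
    return result["text"]

# Usage (not executed here): transcribe_file("audio.mp3")
```

The Flask route built in this tutorial does essentially the same thing, with the uploaded file written to a temporary path first.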

## How to start with Docker
1. First of all, if you are planning to run the container on your local machine, you need to have Docker installed.
You can find the installation instructions [here](https://docs.docker.com/get-docker/).
2. Create a folder for our files; let's call it `whisper-api`.
3. Create a file called `requirements.txt` and add `flask` to it.
4. Create a file called `Dockerfile`.

In the Dockerfile we will add the following lines:

```dockerfile
FROM python:3.10-slim

WORKDIR /python-docker

COPY requirements.txt requirements.txt
RUN apt-get update && apt-get install git -y
RUN pip3 install -r requirements.txt
RUN pip3 install "git+https://github.com/openai/whisper.git"
RUN apt-get install -y ffmpeg

COPY . .

EXPOSE 5000

CMD [ "python3", "-m", "flask", "run", "--host=0.0.0.0"]
```
### So what is happening exactly in the Dockerfile?
1. Choose a Python 3.10 slim image as our base image.
2. Create a working directory called `python-docker`.
3. Copy our `requirements.txt` file into the working directory.
4. Update the apt package manager and install git.
5. Install the requirements from the `requirements.txt` file.
6. Install the whisper package from GitHub.
7. Install ffmpeg.
8. Expose port 5000 and run the Flask server.

## How to create our route
1. Create a file called `app.py` where we import all the necessary packages and initialize the Flask app and Whisper.
2.
Add the following lines to the file:

```python
from flask import Flask, abort, request
from tempfile import NamedTemporaryFile
import whisper
import torch

# Run on GPU if an NVIDIA GPU is available, otherwise fall back to CPU
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Load the Whisper model:
model = whisper.load_model("base", device=DEVICE)

app = Flask(__name__)
```
3. Now we need to create a route that will accept a POST request with a file in it.
4. Add the following lines to the `app.py` file:

```python
@app.route("/")
def hello():
    return "Whisper Hello World!"


@app.route('/whisper', methods=['POST'])
def handler():
    if not request.files:
        # If the user didn't submit any files, return a 400 (Bad Request) error.
        abort(400)

    # For each file, let's store the results in a list of dictionaries.
    results = []

    # Loop over every file that the user submitted.
    for filename, handle in request.files.items():
        # Create a temporary file.
        # The location of the temporary file is available in `temp.name`.
        temp = NamedTemporaryFile()
        # Write the user's uploaded file to the temporary file. Saving by name
        # (rather than passing the file object) makes sure the data is flushed
        # to disk; the file is deleted when `temp` goes out of scope.
        handle.save(temp.name)
        # Let's get the transcript of the temporary file.
        result = model.transcribe(temp.name)
        # Now we can store the result object for this file.
        results.append({
            'filename': filename,
            'transcript': result['text'],
        })

    # This will be automatically converted to JSON.
    return {'results': results}
```

## How to run the container?
1. Open a terminal and navigate to the folder where you created the files.
2. Run the following command to build the container:

```bash
docker build -t whisper-api .
```
3.
Run the following command to run the container:

```bash
docker run -p 5000:5000 whisper-api
```

If you are running into errors on macOS, add `RUN pip3 install markupsafe==2.0.1` to the Dockerfile.

## How to run the container with [Podman](https://podman.io/):

```bash
cd /tmp
git clone https://github.com/lablab-ai/whisper-api-flask whisper
cd whisper
mv Dockerfile Containerfile
podman build --network="host" -t whisper .
podman run --network="host" -p 5000:5000 whisper
```

Then run:

```bash
curl -F "file=@/path/to/filename.mp3" http://localhost:5000/whisper
```

As a result, you should get a JSON object with the transcript in it.

## How to test the API?
1. You can test the API by sending a POST request to the route `http://localhost:5000/whisper` with a file in it. The body should be form-data.
2. You can use the following curl command to test the API:

```bash
curl -F "file=@/path/to/file" http://localhost:5000/whisper
```
3. As a result, you should get a JSON object with the transcript in it.

## How to deploy the API?
This API can be deployed anywhere Docker can be used. Just keep in mind that this setup currently uses the CPU for processing the audio files.
If you want to use a GPU, you need to change the Dockerfile and share the GPU. I won't go deeper into this, as this is an introduction.
[Docker GPU](https://docs.docker.com/config/containers/resource_constraints/)

You can find the whole code [here]()

**Thank you** for reading!
If you enjoyed this tutorial, you can find more and continue reading
[on our tutorial page](https://lablab.ai/t/)

---

[![Artificial Intelligence Hackathons, tutorials and Boilerplates](https://storage.googleapis.com/lablab-static-eu/images/github/lablab-banner.jpg)](https://lablab.ai)

## Join the LabLab Discord

![Discord Banner 1](https://discordapp.com/api/guilds/877056448956346408/widget.png?style=banner1)
On the lablab Discord, we discuss this repo and many other topics related to artificial intelligence! Check out the upcoming [Artificial Intelligence Hackathons](https://lablab.ai) events

[![Accelerating innovation through acceleration](https://storage.googleapis.com/lablab-static-eu/images/github/nn-group-loggos.jpg)](https://newnative.ai)
--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
from flask import Flask, abort, request
from flask_cors import CORS
from tempfile import NamedTemporaryFile
import whisper
import torch

# Run on GPU if an NVIDIA GPU is available, otherwise fall back to CPU
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Load the Whisper model:
model = whisper.load_model("base", device=DEVICE)

app = Flask(__name__)
CORS(app)

@app.route("/")
def hello():
    return "Whisper Hello World!"


@app.route('/whisper', methods=['POST'])
def handler():
    if not request.files:
        # If the user didn't submit any files, return a 400 (Bad Request) error.
        abort(400)

    # For each file, let's store the results in a list of dictionaries.
    results = []

    # Loop over every file that the user submitted.
    for filename, handle in request.files.items():
        # Create a temporary file.
        # The location of the temporary file is available in `temp.name`.
        temp = NamedTemporaryFile()
        # Write the user's uploaded file to the temporary file. Saving by name
        # (rather than passing the file object) makes sure the data is flushed
        # to disk; the file is deleted when `temp` goes out of scope.
        handle.save(temp.name)
        # Let's get the transcript of the temporary file.
        result = model.transcribe(temp.name)
        # Now we can store the result object for this file.
        results.append({
            'filename': filename,
            'transcript': result['text'],
        })

    # This will be automatically converted to JSON.
    return {'results': results}
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
flask
flask-cors
luminoth
--------------------------------------------------------------------------------