├── highlighter-sliding-window ├── src │ ├── main │ │ ├── resources │ │ │ ├── application.properties │ │ │ └── logback.xml │ │ └── java │ │ │ └── kafkastreams │ │ │ └── example │ │ │ ├── .DS_Store │ │ │ └── KStream │ │ │ ├── Consumer.java │ │ │ ├── Producer.java │ │ │ └── KStreams.java │ └── test │ │ └── java │ │ └── kafkastreams │ │ └── example │ │ └── ExampleApplicationTests.java ├── settings.gradle ├── gradle │ └── wrapper │ │ ├── gradle-wrapper.jar │ │ └── gradle-wrapper.properties ├── .gitignore ├── build.gradle ├── gradlew.bat └── gradlew ├── .DS_Store ├── youtube-live ├── .env.example ├── chat │ └── tmp.text ├── requirements.txt ├── Dockerfile ├── README.md ├── docker-compose.yaml └── app.py ├── assets ├── logo.png ├── influx_01.png ├── influx_02.png ├── influx_03.png ├── influx_04.png ├── influx_05.png ├── influx_06.png ├── influx_07.png ├── influx_08.png ├── influx_09.png ├── influx_10.png ├── influx_11.png ├── influx_12.png ├── influx_13.png ├── influx_14.png └── architecture.png ├── influx ├── requirements.txt ├── delete_bucket.py ├── delete_measurement.py ├── read_binary.py ├── insert_text.py ├── explanation.md ├── insert_binary.py ├── Readme.md ├── influxDB_video.py └── influxDB_range.py ├── spark-streaming ├── requirements.txt ├── README.md └── spark-batch.py ├── terraform ├── README.md └── main.tf ├── video-stream ├── requirements.txt ├── README.md ├── producer.py ├── docker-compose.yaml ├── trigger_request.py ├── main.py ├── sqs_trigger_task.py └── sdk-test.ipynb ├── timeframe └── normalize_time.py ├── README.md ├── .gitignore └── video-upload └── lambda_videoUpload.py /highlighter-sliding-window/src/main/resources/application.properties: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /highlighter-sliding-window/settings.gradle: -------------------------------------------------------------------------------- 1 | rootProject.name = 
'example' 2 | -------------------------------------------------------------------------------- /.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/.DS_Store -------------------------------------------------------------------------------- /youtube-live/.env.example: -------------------------------------------------------------------------------- 1 | GCP_KEY=GCP_KEY 2 | OPENAI_API_KEY=OPENAI_API_KEY 3 | -------------------------------------------------------------------------------- /youtube-live/chat/tmp.text: -------------------------------------------------------------------------------- 1 | /* only for upload folder, you can delete for free */ -------------------------------------------------------------------------------- /assets/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/logo.png -------------------------------------------------------------------------------- /influx/requirements.txt: -------------------------------------------------------------------------------- 1 | confluent_kafka==2.5.0 2 | influxdb_client==1.45.0 3 | python-dotenv==1.0.1 -------------------------------------------------------------------------------- /assets/influx_01.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_01.png -------------------------------------------------------------------------------- /assets/influx_02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_02.png -------------------------------------------------------------------------------- /assets/influx_03.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_03.png -------------------------------------------------------------------------------- /assets/influx_04.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_04.png -------------------------------------------------------------------------------- /assets/influx_05.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_05.png -------------------------------------------------------------------------------- /assets/influx_06.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_06.png -------------------------------------------------------------------------------- /assets/influx_07.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_07.png -------------------------------------------------------------------------------- /assets/influx_08.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_08.png -------------------------------------------------------------------------------- /assets/influx_09.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_09.png -------------------------------------------------------------------------------- /assets/influx_10.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_10.png -------------------------------------------------------------------------------- /assets/influx_11.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_11.png -------------------------------------------------------------------------------- /assets/influx_12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_12.png -------------------------------------------------------------------------------- /assets/influx_13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_13.png -------------------------------------------------------------------------------- /assets/influx_14.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_14.png -------------------------------------------------------------------------------- /assets/architecture.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/architecture.png -------------------------------------------------------------------------------- /spark-streaming/requirements.txt: -------------------------------------------------------------------------------- 1 | pyspark 2 | numpy==1.26.4 3 | pandas 4 | influxdb-client 5 | openai-whisper -------------------------------------------------------------------------------- /terraform/README.md: -------------------------------------------------------------------------------- 1 | ## Run 2 | 3 | ```bash 4 | terraform init 5 | terraform apply 6 | ``` 7 | 8 | ## 
Destroy 9 | ```bash 10 | terraform destroy 11 | ``` 12 | -------------------------------------------------------------------------------- /youtube-live/requirements.txt: -------------------------------------------------------------------------------- 1 | youtube-dl 2 | pytchat 3 | pafy 4 | pandas 5 | python-dotenv 6 | fastapi 7 | kafka-python 8 | uvicorn 9 | pymongo 10 | DateTime -------------------------------------------------------------------------------- /highlighter-sliding-window/gradle/wrapper/gradle-wrapper.jar: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/highlighter-sliding-window/gradle/wrapper/gradle-wrapper.jar -------------------------------------------------------------------------------- /highlighter-sliding-window/src/main/java/kafkastreams/example/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/highlighter-sliding-window/src/main/java/kafkastreams/example/.DS_Store -------------------------------------------------------------------------------- /youtube-live/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM python:3.11-slim 2 | 3 | WORKDIR /app 4 | 5 | COPY requirements.txt ./ 6 | RUN pip install --no-cache-dir -r requirements.txt 7 | 8 | COPY . . 
9 | EXPOSE 80 10 | CMD ["uvicorn", "app:app", "--reload"] 11 | -------------------------------------------------------------------------------- /highlighter-sliding-window/gradle/wrapper/gradle-wrapper.properties: -------------------------------------------------------------------------------- 1 | distributionBase=GRADLE_USER_HOME 2 | distributionPath=wrapper/dists 3 | distributionUrl=https\://services.gradle.org/distributions/gradle-7.6-bin.zip 4 | zipStoreBase=GRADLE_USER_HOME 5 | zipStorePath=wrapper/dists 6 | -------------------------------------------------------------------------------- /youtube-live/README.md: -------------------------------------------------------------------------------- 1 | ## Youtube-live-chat 2 | Given the URL of a live YouTube video, saves that video's live chat to /chat in JSON format. 3 | 4 | ### Run 5 | 1. Create a .env file and add your GCP key. 6 | 2. Set the YouTube live URL in docker-compose.yaml. 7 | 3. Run docker-compose. 8 | ``` 9 | docker-compose up --build 10 | ``` -------------------------------------------------------------------------------- /youtube-live/docker-compose.yaml: -------------------------------------------------------------------------------- 1 | version: '3.8' 2 | services: 3 | app: 4 | build: 5 | context: . 
6 | dockerfile: Dockerfile 7 | image: youtube-live-app 8 | volumes: 9 | - ./chat:/app/chat 10 | environment: 11 | - YOUTUBE_URL="https://www.youtube.com/watch?v=ujB0bnmvCkk" -------------------------------------------------------------------------------- /highlighter-sliding-window/src/test/java/kafkastreams/example/ExampleApplicationTests.java: -------------------------------------------------------------------------------- 1 | package kafkastreams.example; 2 | 3 | import org.junit.jupiter.api.Test; 4 | import org.springframework.boot.test.context.SpringBootTest; 5 | 6 | @SpringBootTest 7 | class ExampleApplicationTests { 8 | 9 | @Test 10 | void contextLoads() { 11 | } 12 | 13 | } 14 | -------------------------------------------------------------------------------- /spark-streaming/README.md: -------------------------------------------------------------------------------- 1 | ### Spark Streaming 2 | 3 | 1. Load the video binary stored in the Kafka topic into Spark 4 | 2. Use a Spark window to accumulate 30 seconds of binary data and preprocess it in batches 5 | 1) Concatenate the 30 seconds of binary data and save it as video.ts → convert video.ts to video.mp4 with ffmpeg 6 | 2) Transcribe the audio in the mp4 to text with whisper 7 | 3) Add the transcribed text to the Spark DataFrame as a new column 8 | 3. 
Use a Spark query with foreachBatch() to write each preprocessed batch to InfluxDB -------------------------------------------------------------------------------- /highlighter-sliding-window/src/main/resources/logback.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | %d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} - %msg%n 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | -------------------------------------------------------------------------------- /video-stream/requirements.txt: -------------------------------------------------------------------------------- 1 | attrs==24.2.0 2 | certifi==2024.7.4 3 | charset-normalizer==3.3.2 4 | confluent-kafka==2.5.0 5 | h11==0.14.0 6 | idna==3.7 7 | isodate==0.6.1 8 | lxml==5.3.0 9 | outcome==1.3.0.post0 10 | pycountry==24.6.1 11 | pycryptodome==3.20.0 12 | PySocks==1.7.1 13 | requests==2.32.3 14 | setuptools==72.1.0 15 | six==1.16.0 16 | sniffio==1.3.1 17 | sortedcontainers==2.4.0 18 | streamlink==6.9.0 19 | trio==0.26.2 20 | trio-websocket==0.11.1 21 | typing_extensions==4.12.2 22 | urllib3==2.2.2 23 | websocket-client==1.8.0 24 | wheel==0.43.0 25 | wsproto==1.2.0 26 | -------------------------------------------------------------------------------- /highlighter-sliding-window/.gitignore: -------------------------------------------------------------------------------- 1 | HELP.md 2 | .gradle 3 | build/ 4 | !gradle/wrapper/gradle-wrapper.jar 5 | !**/src/main/**/build/ 6 | !**/src/test/**/build/ 7 | 8 | ### STS ### 9 | .apt_generated 10 | .classpath 11 | .factorypath 12 | .project 13 | .settings 14 | .springBeans 15 | .sts4-cache 16 | bin/ 17 | !**/src/main/**/bin/ 18 | !**/src/test/**/bin/ 19 | 20 | ### IntelliJ IDEA ### 21 | .idea 22 | *.iws 23 | *.iml 24 | *.ipr 25 | out/ 26 | !**/src/main/**/out/ 27 | !**/src/test/**/out/ 28 | 29 | ### NetBeans ### 30 | /nbproject/private/ 31 | /nbbuild/ 32 | /dist/ 33 | /nbdist/ 34 | /.nb-gradle/ 35 | 36 | ### VS Code ### 37 | .vscode/ 38 | 
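Note on spark-streaming/README.md, step 2.1: the README describes collecting 30 seconds of video binary, concatenating it into video.ts, and converting that file to video.mp4 with ffmpeg. The sketch below illustrates that step in plain Python, without Spark. It is not taken from spark-batch.py. The function names `concat_by_window` and `ffmpeg_remux_command`, the `(timestamp, chunk)` record shape, and the use of tumbling (non-overlapping) windows are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

WINDOW_SECONDS = 30  # window length stated in the README


def concat_by_window(
    records: Iterable[Tuple[float, bytes]],
    window_seconds: int = WINDOW_SECONDS,
) -> Dict[int, bytes]:
    """Group (timestamp, chunk) records into tumbling windows.

    Hypothetical helper: returns {window_start_second: concatenated_bytes},
    with chunks joined in timestamp order inside each window.
    """
    buckets: Dict[int, List[Tuple[float, bytes]]] = defaultdict(list)
    for ts, chunk in records:
        # Floor the timestamp to the start of its window.
        window_start = int(ts // window_seconds) * window_seconds
        buckets[window_start].append((ts, chunk))
    return {
        start: b"".join(chunk for _, chunk in sorted(items, key=lambda r: r[0]))
        for start, items in sorted(buckets.items())
    }


def ffmpeg_remux_command(ts_path: str, mp4_path: str) -> List[str]:
    """Build an ffmpeg argument list that remuxes a .ts file into .mp4.

    Assumes stream copy (-c copy) is sufficient; re-encoding may be needed
    for some inputs. -y overwrites an existing output file.
    """
    return ["ffmpeg", "-y", "-i", ts_path, "-c", "copy", mp4_path]


if __name__ == "__main__":
    sample = [(31.0, b"C"), (0.0, b"A"), (12.5, b"B")]
    for start, payload in concat_by_window(sample).items():
        print(start, len(payload))
    print(" ".join(ffmpeg_remux_command("video.ts", "video.mp4")))
```

In a real pipeline, the bytes for each window would be written to video.ts and the returned argument list passed to `subprocess.run(..., check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file paths.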
-------------------------------------------------------------------------------- /video-stream/README.md: -------------------------------------------------------------------------------- 1 | ## Overview 2 | 3 | 4 | - [x] Streamlink on EC2 5 | - [x] Send messages from EC2 to SQS 6 | - [x] Implement the consumer function in Lambda 7 | - [ ] Configure the Lambda trigger 8 | - [x] Save the video to S3 9 | 10 | ## Run 11 | ```bash 12 | python main.py