├── highlighter-sliding-window
│   ├── src
│   │   ├── main
│   │   │   ├── resources
│   │   │   │   ├── application.properties
│   │   │   │   └── logback.xml
│   │   │   └── java
│   │   │       └── kafkastreams
│   │   │           └── example
│   │   │               ├── .DS_Store
│   │   │               └── KStream
│   │   │                   ├── Consumer.java
│   │   │                   ├── Producer.java
│   │   │                   └── KStreams.java
│   │   └── test
│   │       └── java
│   │           └── kafkastreams
│   │               └── example
│   │                   └── ExampleApplicationTests.java
│   ├── settings.gradle
│   ├── gradle
│   │   └── wrapper
│   │       ├── gradle-wrapper.jar
│   │       └── gradle-wrapper.properties
│   ├── .gitignore
│   ├── build.gradle
│   ├── gradlew.bat
│   └── gradlew
├── .DS_Store
├── youtube-live
│   ├── .env.example
│   ├── chat
│   │   └── tmp.text
│   ├── requirements.txt
│   ├── Dockerfile
│   ├── README.md
│   ├── docker-compose.yaml
│   └── app.py
├── assets
│   ├── logo.png
│   ├── influx_01.png
│   ├── influx_02.png
│   ├── influx_03.png
│   ├── influx_04.png
│   ├── influx_05.png
│   ├── influx_06.png
│   ├── influx_07.png
│   ├── influx_08.png
│   ├── influx_09.png
│   ├── influx_10.png
│   ├── influx_11.png
│   ├── influx_12.png
│   ├── influx_13.png
│   ├── influx_14.png
│   └── architecture.png
├── influx
│   ├── requirements.txt
│   ├── delete_bucket.py
│   ├── delete_measurement.py
│   ├── read_binary.py
│   ├── insert_text.py
│   ├── explanation.md
│   ├── insert_binary.py
│   ├── Readme.md
│   ├── influxDB_video.py
│   └── influxDB_range.py
├── spark-streaming
│   ├── requirements.txt
│   ├── README.md
│   └── spark-batch.py
├── terraform
│   ├── README.md
│   └── main.tf
├── video-stream
│   ├── requirements.txt
│   ├── README.md
│   ├── producer.py
│   ├── docker-compose.yaml
│   ├── trigger_request.py
│   ├── main.py
│   ├── sqs_trigger_task.py
│   └── sdk-test.ipynb
├── timeframe
│   └── normalize_time.py
├── README.md
├── .gitignore
└── video-upload
    └── lambda_videoUpload.py
/highlighter-sliding-window/src/main/resources/application.properties:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/highlighter-sliding-window/settings.gradle:
--------------------------------------------------------------------------------
1 | rootProject.name = 'example'
2 |
--------------------------------------------------------------------------------
/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/.DS_Store
--------------------------------------------------------------------------------
/youtube-live/.env.example:
--------------------------------------------------------------------------------
1 | GCP_KEY=GCP_KEY
2 | OPENAI_API_KEY=OPENAI_API_KEY
3 |
--------------------------------------------------------------------------------
/youtube-live/chat/tmp.text:
--------------------------------------------------------------------------------
1 | /* placeholder so that the empty folder gets uploaded; safe to delete */
--------------------------------------------------------------------------------
/assets/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/logo.png
--------------------------------------------------------------------------------
/influx/requirements.txt:
--------------------------------------------------------------------------------
1 | confluent_kafka==2.5.0
2 | influxdb_client==1.45.0
3 | python-dotenv==1.0.1
--------------------------------------------------------------------------------
/assets/influx_01.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_01.png
--------------------------------------------------------------------------------
/assets/influx_02.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_02.png
--------------------------------------------------------------------------------
/assets/influx_03.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_03.png
--------------------------------------------------------------------------------
/assets/influx_04.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_04.png
--------------------------------------------------------------------------------
/assets/influx_05.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_05.png
--------------------------------------------------------------------------------
/assets/influx_06.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_06.png
--------------------------------------------------------------------------------
/assets/influx_07.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_07.png
--------------------------------------------------------------------------------
/assets/influx_08.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_08.png
--------------------------------------------------------------------------------
/assets/influx_09.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_09.png
--------------------------------------------------------------------------------
/assets/influx_10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_10.png
--------------------------------------------------------------------------------
/assets/influx_11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_11.png
--------------------------------------------------------------------------------
/assets/influx_12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_12.png
--------------------------------------------------------------------------------
/assets/influx_13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_13.png
--------------------------------------------------------------------------------
/assets/influx_14.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/influx_14.png
--------------------------------------------------------------------------------
/assets/architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/assets/architecture.png
--------------------------------------------------------------------------------
/spark-streaming/requirements.txt:
--------------------------------------------------------------------------------
1 | pyspark
2 | numpy==1.26.4
3 | pandas
4 | influxdb-client
5 | openai-whisper
--------------------------------------------------------------------------------
/terraform/README.md:
--------------------------------------------------------------------------------
1 | ## Run
2 |
3 | ```bash
4 | terraform init
5 | terraform apply
6 | ```
7 |
8 | ## Destroy
9 | ```bash
10 | terraform destroy
11 | ```
12 |
--------------------------------------------------------------------------------
/youtube-live/requirements.txt:
--------------------------------------------------------------------------------
1 | youtube-dl
2 | pytchat
3 | pafy
4 | pandas
5 | python-dotenv
6 | fastapi
7 | kafka-python
8 | uvicorn
9 | pymongo
10 | DateTime
--------------------------------------------------------------------------------
/highlighter-sliding-window/gradle/wrapper/gradle-wrapper.jar:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/highlighter-sliding-window/gradle/wrapper/gradle-wrapper.jar
--------------------------------------------------------------------------------
/highlighter-sliding-window/src/main/java/kafkastreams/example/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YBIGTA/24th-de-highlighter/HEAD/highlighter-sliding-window/src/main/java/kafkastreams/example/.DS_Store
--------------------------------------------------------------------------------
/youtube-live/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM python:3.11-slim
2 |
3 | WORKDIR /app
4 |
5 | COPY requirements.txt ./
6 | RUN pip install --no-cache-dir -r requirements.txt
7 |
8 | COPY . .
9 | EXPOSE 80
10 | CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "80", "--reload"]
11 |
--------------------------------------------------------------------------------
/highlighter-sliding-window/gradle/wrapper/gradle-wrapper.properties:
--------------------------------------------------------------------------------
1 | distributionBase=GRADLE_USER_HOME
2 | distributionPath=wrapper/dists
3 | distributionUrl=https\://services.gradle.org/distributions/gradle-7.6-bin.zip
4 | zipStoreBase=GRADLE_USER_HOME
5 | zipStorePath=wrapper/dists
6 |
--------------------------------------------------------------------------------
/youtube-live/README.md:
--------------------------------------------------------------------------------
1 | ## Youtube-live-chat
2 | Given the URL of a live YouTube video, saves that video's live chat to /chat in JSON format.
3 |
4 | ### Run
5 | 1. Create a .env file and add your GCP key.
6 | 2. Set the YouTube live URL in docker-compose.yaml.
7 | 3. Run docker-compose.
8 | ```
9 | docker-compose up --build
10 | ```
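
The README says the live chat is saved to /chat as JSON. A minimal sketch of that write step is shown below, using only the standard library. The function name `save_chat_batch`, the file naming scheme, and the message fields are assumptions for illustration and are not taken from app.py.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_chat_batch(messages, out_dir="chat"):
    """Write one batch of chat messages to a timestamped JSON file; return its path.

    `messages` is a list of JSON-serializable dicts (hypothetical shape).
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    # Microsecond-resolution UTC timestamp keeps file names unique and sortable.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    target = out_path / f"chat_{stamp}.json"
    # ensure_ascii=False keeps non-ASCII chat text (e.g. Korean) readable on disk.
    target.write_text(
        json.dumps(messages, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return target
```

Because docker-compose.yaml mounts ./chat to /app/chat, files written this way inside the container would also appear in the local chat folder.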
--------------------------------------------------------------------------------
/youtube-live/docker-compose.yaml:
--------------------------------------------------------------------------------
1 | version: '3.8'
2 | services:
3 |   app:
4 |     build:
5 |       context: .
6 |       dockerfile: Dockerfile
7 |     image: youtube-live-app
8 |     volumes:
9 |       - ./chat:/app/chat
10 |     environment:
11 |       - YOUTUBE_URL=https://www.youtube.com/watch?v=ujB0bnmvCkk
--------------------------------------------------------------------------------
/highlighter-sliding-window/src/test/java/kafkastreams/example/ExampleApplicationTests.java:
--------------------------------------------------------------------------------
1 | package kafkastreams.example;
2 |
3 | import org.junit.jupiter.api.Test;
4 | import org.springframework.boot.test.context.SpringBootTest;
5 |
6 | @SpringBootTest
7 | class ExampleApplicationTests {
8 |
9 |     @Test
10 |     void contextLoads() {
11 |     }
12 |
13 | }
14 |
--------------------------------------------------------------------------------
/spark-streaming/README.md:
--------------------------------------------------------------------------------
1 | ### Spark Streaming
2 |
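3 | 1. Load the video binary stored in the Kafka topic into Spark.
4 | 2. Use a Spark window to accumulate 30 seconds of binary data and preprocess it batch by batch:
5 |    1) Concatenate the 30 seconds of binary data and save it as video.ts, then convert video.ts to video.mp4 with ffmpeg.
6 |    2) Transcribe the audio of the mp4 to text with whisper.
7 |    3) Add the transcribed text to the Spark DataFrame as a new column.
8 | 3. Write each preprocessed batch to InfluxDB from the Spark query using foreachBatch().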
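
The 30-second windowing in step 2 can be illustrated without Spark. The sketch below is plain Python and not the code in spark-batch.py: it groups timestamped binary chunks into 30-second tumbling windows and concatenates each window's bytes in timestamp order, which is the input that step 2.1 would write out as video.ts. The names `window_start` and `assemble_windows` and the `(timestamp, bytes)` chunk shape are assumptions.

```python
from collections import defaultdict

WINDOW_SECONDS = 30  # window length taken from the README


def window_start(ts_seconds, size=WINDOW_SECONDS):
    """Return the start (in seconds) of the tumbling window containing ts_seconds."""
    return int(ts_seconds // size) * size


def assemble_windows(chunks, size=WINDOW_SECONDS):
    """Group (timestamp_seconds, payload_bytes) pairs into windows.

    Returns a dict mapping each window start to the concatenation of its
    payloads, ordered by timestamp within the window.
    """
    buckets = defaultdict(list)
    for ts, payload in chunks:
        buckets[window_start(ts, size)].append((ts, payload))
    return {
        start: b"".join(p for _, p in sorted(items, key=lambda item: item[0]))
        for start, items in sorted(buckets.items())
    }
```

Sorting inside each window matters because Kafka only guarantees ordering within a partition, so chunks may arrive out of order when they are spread across partitions.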
--------------------------------------------------------------------------------
/highlighter-sliding-window/src/main/resources/logback.xml:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 | %d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} - %msg%n
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
--------------------------------------------------------------------------------
/video-stream/requirements.txt:
--------------------------------------------------------------------------------
1 | attrs==24.2.0
2 | certifi==2024.7.4
3 | charset-normalizer==3.3.2
4 | confluent-kafka==2.5.0
5 | h11==0.14.0
6 | idna==3.7
7 | isodate==0.6.1
8 | lxml==5.3.0
9 | outcome==1.3.0.post0
10 | pycountry==24.6.1
11 | pycryptodome==3.20.0
12 | PySocks==1.7.1
13 | requests==2.32.3
14 | setuptools==72.1.0
15 | six==1.16.0
16 | sniffio==1.3.1
17 | sortedcontainers==2.4.0
18 | streamlink==6.9.0
19 | trio==0.26.2
20 | trio-websocket==0.11.1
21 | typing_extensions==4.12.2
22 | urllib3==2.2.2
23 | websocket-client==1.8.0
24 | wheel==0.43.0
25 | wsproto==1.2.0
26 |
--------------------------------------------------------------------------------
/highlighter-sliding-window/.gitignore:
--------------------------------------------------------------------------------
1 | HELP.md
2 | .gradle
3 | build/
4 | !gradle/wrapper/gradle-wrapper.jar
5 | !**/src/main/**/build/
6 | !**/src/test/**/build/
7 |
8 | ### STS ###
9 | .apt_generated
10 | .classpath
11 | .factorypath
12 | .project
13 | .settings
14 | .springBeans
15 | .sts4-cache
16 | bin/
17 | !**/src/main/**/bin/
18 | !**/src/test/**/bin/
19 |
20 | ### IntelliJ IDEA ###
21 | .idea
22 | *.iws
23 | *.iml
24 | *.ipr
25 | out/
26 | !**/src/main/**/out/
27 | !**/src/test/**/out/
28 |
29 | ### NetBeans ###
30 | /nbproject/private/
31 | /nbbuild/
32 | /dist/
33 | /nbdist/
34 | /.nb-gradle/
35 |
36 | ### VS Code ###
37 | .vscode/
38 |
--------------------------------------------------------------------------------
/video-stream/README.md:
--------------------------------------------------------------------------------
1 | ## Overview
2 |
3 |
4 | - [x] Streamlink on EC2
5 | - [x] Send messages from EC2 to SQS
6 | - [x] Implement the consumer function in Lambda
7 | - [ ] Configure the Lambda trigger
8 | - [x] Save video to S3
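
For the Lambda consumer item above, a minimal sketch of a handler that reads SQS-delivered messages is shown below. It relies only on the documented shape of an SQS event (a `Records` list whose entries carry a string `body`). Treating the body as JSON, and the `segment` field used in the usage example, are assumptions and may not match lambda_videoUpload.py.

```python
import json


def extract_messages(event):
    """Return the decoded JSON body of every SQS record in a Lambda event."""
    return [json.loads(record["body"]) for record in event.get("Records", [])]


def lambda_handler(event, context):
    """Entry point: decode the batch and report how many messages were seen."""
    messages = extract_messages(event)
    # A real handler would act on each message here, for example by
    # uploading the referenced video segment to S3.
    return {"processed": len(messages)}
```

Calling `lambda_handler({"Records": [{"body": "{\"segment\": \"a.ts\"}"}]}, None)` locally exercises the parsing path without any AWS resources.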
9 |
10 | ## Run
11 | ```bash
12 | python main.py