├── .gitignore
├── README.md
├── download
│   ├── README.md
│   ├── __pycache__
│   │   └── download_youtube.cpython-36.pyc
│   ├── audios
│   │   ├── README.md
│   │   ├── __pycache__
│   │   │   └── audio_chunks.cpython-36.pyc
│   │   ├── audio_Spectrum.py
│   │   ├── audio_chunks.py
│   │   ├── audio_total_length.py
│   │   ├── export_audioChunks.py
│   │   └── get_chuncks_text.py
│   ├── convert_wav.py
│   ├── download_list.py
│   ├── download_youtube.py
│   └── urlList.txt
└── requirements.txt
/.gitignore:
--------------------------------------------------------------------------------
## Ignore File

*.mp4
*.wav
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# A Repository Made After Getting Hooked on 책 읽어주는 딥러닝 (Deep Learning That Reads Books Aloud)
책 읽어주는 딥러닝 looked too difficult, so I started this repository to see whether I could build something similar in a simpler way while studying.

* * *

### 1. Data collection and audio extraction
The download directory contains a program that collects YouTube videos.
It also contains code that extracts the audio from the collected videos and saves it as wav files.
[Go](https://github.com/clianor/voice-speaker-tensorflow/tree/master/download)
* * *

### 2. Splitting the audio
There is a program that splits the extracted audio files into chunks.
[Go](https://github.com/clianor/voice-speaker-tensorflow/tree/master/download/audios)

* * *
--------------------------------------------------------------------------------
/download/README.md:
--------------------------------------------------------------------------------
# Data download & Export wav files

### File descriptions
- download_youtube.py
  - Run as ``` python download_youtube.py <URL> ``` to download a single YouTube video.
  - Videos are saved to the videos folder, which is created if it does not exist.
- download_list.py
  - Run as ``` python download_list.py ``` to read urlList.txt and download the YouTube video at each URL listed in it.
- convert_wav.py
  - Run as ``` python convert_wav.py videos audios ``` to extract the audio from each video and save it as a wav file in the data folder under the audios directory.

* * *

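download_youtube.py names each downloaded file after whatever follows `watch?v=` in the URL, so a URL that carries extra query parameters (for example `&t=10s`) keeps those characters in the file name. As a minimal sketch of one alternative, assuming URLs of the usual `watch?v=<id>` shape, the ID can be read from the parsed query string instead. `video_id` is a hypothetical helper and is not part of this repository.

```python
from urllib.parse import urlparse, parse_qs


def video_id(url):
    """Return the value of the `v` query parameter, or None if it is absent."""
    # strip() removes the trailing newline left by reading urlList.txt line by line.
    query = parse_qs(urlparse(url.strip()).query)
    values = query.get("v")
    return values[0] if values else None


if __name__ == "__main__":
    print(video_id("https://www.youtube.com/watch?v=yVxDSnTFN6o&t=10s"))  # yVxDSnTFN6o
```
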
### Execution order
1. ``` python download_list.py ```
2. ``` python convert_wav.py videos audios ```

* * *
--------------------------------------------------------------------------------
/download/__pycache__/download_youtube.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/clianor/voice-speaker-tensorflow/537e80be65eaa35f997093cee2be84035c66a188/download/__pycache__/download_youtube.cpython-36.pyc
--------------------------------------------------------------------------------
/download/audios/README.md:
--------------------------------------------------------------------------------
# Export Audio Chunks

### File descriptions
- audio_Spectrum.py
  - Run as ``` python audio_Spectrum.py ```. Open the file and edit the librosa.load call to view the spectrum of a specific audio file.
- audio_chunks.py
  - Run as ``` python audio_chunks.py ```. As with audio_Spectrum.py, open the file and edit the function arguments before running.
- export_audioChunks.py
  - Run as ``` python export_audioChunks.py ``` to split every wav file in the data directory into chunks that meet the conditions and save them to the chuncks directory.
* * *

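The "conditions" used when splitting are a minimum silence length and a loudness threshold. The sketch below illustrates that idea on a plain list of per-millisecond loudness values. It is a simplified illustration and not pydub's implementation (for instance, it keeps no padding around chunks); `split_on_quiet` and its input format are assumptions made for this example.

```python
def split_on_quiet(levels_db, min_silence_len=500, silence_thresh=-50):
    """Return (start, end) index ranges of the non-silent stretches.

    A run of at least `min_silence_len` consecutive values below
    `silence_thresh` is treated as a silence that separates two chunks.
    """
    chunks = []
    start = None    # start index of the chunk currently being built
    quiet_run = 0   # length of the current run of quiet values
    for i, level in enumerate(levels_db):
        if level < silence_thresh:
            quiet_run += 1
            if start is not None and quiet_run >= min_silence_len:
                # The silence is long enough: close the chunk where the quiet run began.
                chunks.append((start, i - quiet_run + 1))
                start = None
        else:
            if start is None:
                start = i
            quiet_run = 0
    if start is not None:
        # Close the last chunk, dropping any short quiet tail.
        chunks.append((start, len(levels_db) - quiet_run))
    return chunks


if __name__ == "__main__":
    # 3 loud values, 5 quiet values, 2 loud values; silence must last 4 values.
    levels = [-20] * 3 + [-60] * 5 + [-20] * 2
    print(split_on_quiet(levels, min_silence_len=4))  # [(0, 3), (8, 10)]
```

A duration filter like the 0.8 second rule in audio_chunks.py could then be applied to the returned ranges.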
### Execution order
1. Run ``` python audio_Spectrum.py ``` first to check the spectrum of the audio.
2. Run ``` python export_audioChunks.py ``` to extract the audio chunks.
3. Run ``` python audio_total_length.py ``` to compute the total length of the extracted chunks. (unit: sec)
* * *

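Step 3 sums the duration of every chunk. audio_total_length.py does this with pydub; as a rough sketch of the same calculation, assuming the chunks are uncompressed PCM wav files, the standard-library `wave` module is enough. `total_wav_seconds` is a hypothetical name used only for this example.

```python
import os
import wave


def total_wav_seconds(directory):
    """Sum the durations, in seconds, of all .wav files in `directory`."""
    total = 0.0
    for name in sorted(os.listdir(directory)):
        if os.path.splitext(name)[1].lower() != ".wav":
            continue
        with wave.open(os.path.join(directory, name), "rb") as wav:
            # Duration = number of frames / frames per second.
            total += wav.getnframes() / wav.getframerate()
    return total


if __name__ == "__main__":
    # Assumes the chuncks directory produced by export_audioChunks.py exists.
    if os.path.isdir("chuncks"):
        print("Audio Files Total Length :", total_wav_seconds("chuncks"), "sec")
```
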
### Speech to text
- Uses the Google Speech API; see the reference below.
  - https://cloud.google.com/speech-to-text/docs/reference/libraries?hl=ko
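
get_chuncks_text.py writes alignment.json as a mapping from chunk file name to transcript, storing an empty string when nothing was recognised. The following is a minimal sketch, under that assumption about the file's shape, of dropping the empty entries before further use. `usable_transcripts` is a hypothetical helper, not part of this repository.

```python
import json


def usable_transcripts(alignment_json):
    """Parse alignment JSON text and keep only entries with a non-empty transcript."""
    data = json.loads(alignment_json)
    return {name: text.strip() for name, text in data.items() if text.strip()}


if __name__ == "__main__":
    sample = '{"a0.wav": "안녕하세요", "a1.wav": "", "a2.wav": "  "}'
    print(usable_transcripts(sample))  # {'a0.wav': '안녕하세요'}
```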
--------------------------------------------------------------------------------
/download/audios/__pycache__/audio_chunks.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/clianor/voice-speaker-tensorflow/537e80be65eaa35f997093cee2be84035c66a188/download/audios/__pycache__/audio_chunks.cpython-36.pyc
--------------------------------------------------------------------------------
/download/audios/audio_Spectrum.py:
--------------------------------------------------------------------------------
import librosa
import librosa.display
import numpy as np
import matplotlib.pyplot as plt

# Edit this path to inspect a different audio file.
y, sr = librosa.load("data/yVxDSnTFN6o.wav")

# melspectrogram returns a power spectrogram by default,
# so convert it to decibels with power_to_db.
S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
log_S = librosa.power_to_db(S, ref=np.max)

plt.figure(figsize=(12, 4))
librosa.display.specshow(log_S, sr=sr, x_axis='time', y_axis='mel')
plt.title('mel power spectrogram')
plt.colorbar(format='%+02.0f dB')
plt.tight_layout()
plt.show()
--------------------------------------------------------------------------------
/download/audios/audio_chunks.py:
--------------------------------------------------------------------------------
import os
from pydub import AudioSegment
from pydub.silence import split_on_silence

def AudioChuncks(filePath, format):
    # Create the output directory if it does not exist yet.
    os.makedirs("chuncks", exist_ok=True)

    sound = AudioSegment.from_file(os.path.join("data", filePath), format=format)
    # Downmix to mono.
    sound = sound.set_channels(1)

    chunks = split_on_silence(
        sound,
        # Split wherever the silence lasts at least 0.5 s (500 ms).
        min_silence_len=500,
        # Treat anything quieter than -50 dBFS as silence.
        silence_thresh=-50,
    )

    soundCount = 0
    for chunk in chunks:
        # Keep only chunks longer than 0.8 s.
        if chunk.duration_seconds > 0.8:
            outName = os.path.splitext(filePath)[0] + "{0}.wav".format(soundCount)
            chunk.export(os.path.join("chuncks", outName), format="wav")
            soundCount += 1

if __name__ == "__main__":
    AudioChuncks("yVxDSnTFN6o&t.wav", "wav")
--------------------------------------------------------------------------------
/download/audios/audio_total_length.py:
--------------------------------------------------------------------------------
import os
from os.path import splitext
from pydub import AudioSegment

# Collect every wav chunk written by export_audioChunks.py.
audioFiles = [fileName for fileName in os.listdir("chuncks") if splitext(fileName)[1] == ".wav"]
totalLength = 0.
for fn in audioFiles:
    totalLength += AudioSegment.from_file(os.path.join("chuncks", fn), format="wav").duration_seconds

print("Audio Files Total Length :", totalLength, "sec")
--------------------------------------------------------------------------------
/download/audios/export_audioChunks.py:
--------------------------------------------------------------------------------
import os
from os.path import splitext
from audio_chunks import AudioChuncks

# Split every wav file found in the data directory into chunks.
wavFiles = [fileName for fileName in os.listdir("data") if splitext(fileName)[1] == ".wav"]
for fileName in wavFiles:
    AudioChuncks(fileName, "wav")
--------------------------------------------------------------------------------
/download/audios/get_chuncks_text.py:
--------------------------------------------------------------------------------
import io
import os
import json
from progressbar import ProgressBar

# The enums/types modules belong to the older (pre-2.0) google-cloud-speech client.
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types


def getAudioText(fn):
    client = speech.SpeechClient()

    file_name = os.path.join('chuncks', fn)
    with io.open(file_name, 'rb') as audio_file:
        content = audio_file.read()
        audio = types.RecognitionAudio(content=content)

    config = types.RecognitionConfig(language_code='ko-KR')
    response = client.recognize(config, audio)
    try:
        result = response.results[0].alternatives[0].transcript
    except IndexError:
        # Nothing was recognised in this chunk.
        result = ""
    return result

if __name__ == "__main__":
    bar = ProgressBar()
    data = dict()
    for fn in bar(sorted(os.listdir("chuncks"))):
        data[fn] = getAudioText(fn)

    # Write UTF-8 explicitly because ensure_ascii=False keeps the Korean text as-is.
    with open("alignment.json", "w", encoding="utf-8") as f:
        f.write(json.dumps(data, ensure_ascii=False, indent="\t"))
--------------------------------------------------------------------------------
/download/convert_wav.py:
--------------------------------------------------------------------------------
import os, sys
from moviepy.editor import VideoFileClip

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Please provide the video directory and the audio directory.")
        sys.exit()

    video_dir, audio_dir = sys.argv[1], sys.argv[2]
    # wav files go to <audio directory>/data; create it if it is missing.
    out_dir = os.path.join(audio_dir, "data")
    os.makedirs(out_dir, exist_ok=True)

    files = [f for f in os.listdir(video_dir) if os.path.splitext(f)[1] == ".mp4"]
    for filename in files:
        try:
            video_clip = VideoFileClip(os.path.join(video_dir, filename))
            audio_clip = video_clip.audio
            out_path = os.path.join(out_dir, os.path.splitext(filename)[0] + ".wav")
            print("\npath:", out_path)
            audio_clip.write_audiofile(out_path)
            audio_clip.close()
        except Exception as e:
            print(filename, e)
--------------------------------------------------------------------------------
/download/download_list.py:
--------------------------------------------------------------------------------
from download_youtube import DownloadYoutube

if __name__ == "__main__":
    try:
        with open("urlList.txt", "r") as f:
            # Strip newlines and skip blank lines so each entry is a clean URL.
            urls = [line.strip() for line in f if line.strip()]

        for url in urls:
            DownloadYoutube(url)
    except Exception as e:
        print(e)
--------------------------------------------------------------------------------
/download/download_youtube.py:
--------------------------------------------------------------------------------
import os, re, sys
import pytube

def DownloadYoutube(url):
    try:
        parent_dir = "videos"
        # Create the videos folder if it does not exist yet.
        if not os.path.isdir(parent_dir):
            os.mkdir(parent_dir)

        yt = pytube.YouTube(url)
        # streams.all() matches the pytube 9.2.2 release pinned in requirements.txt.
        vids = yt.streams.all()

        vids[0].download(parent_dir)

        # Rename the download after the text that follows "watch?v=" in the URL.
        new_filename = re.findall(r"watch\?v=(\S+)", url)[0] + ".mp4"
        default_filename = vids[0].default_filename
        try:
            os.rename(os.path.join(parent_dir, default_filename), os.path.join(parent_dir, new_filename))
        except FileExistsError as e:
            print(new_filename, "already exists.")
            os.remove(os.path.join(parent_dir, default_filename))
    except Exception as e:
        print(e)
# end def

if __name__ == "__main__":
    DownloadYoutube(sys.argv[1])
--------------------------------------------------------------------------------
/download/urlList.txt:
--------------------------------------------------------------------------------
https://www.youtube.com/watch?v=cL7LXYA1uI4
https://www.youtube.com/watch?v=lpJH4X6WiIY
https://www.youtube.com/watch?v=YdsWnEXKLsY
https://www.youtube.com/watch?v=QmOAZM4t_js
https://www.youtube.com/watch?v=7_91FYi0nlI
https://www.youtube.com/watch?v=yVxDSnTFN6o
https://www.youtube.com/watch?v=0oJABnYXeoY
https://www.youtube.com/watch?v=nRH6CedeUoo
https://www.youtube.com/watch?v=tTRG60X0svo
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
audioread==2.1.6
backcall==0.1.0
certifi==2018.4.16
chardet==3.0.4
colorama==0.3.9
cycler==0.10.0
decorator==4.3.0
future==0.16.0
idna==2.7
imageio==2.3.0
ipython==6.4.0
ipython-genutils==0.2.0
jedi==0.12.1
joblib==0.12.1
kiwisolver==1.0.1
librosa==0.6.1
llvmlite==0.24.0
matplotlib==2.2.2
moviepy==0.2.3.5
numba==0.39.0
numpy==1.14.5
parso==0.3.1
pickleshare==0.7.4
Pillow==5.2.0
prompt-toolkit==1.0.15
pydub==0.22.1
Pygments==2.2.0
pyparsing==2.2.0
python-dateutil==2.7.3
pytube==9.2.2
pytz==2018.5
requests==2.19.1
resampy==0.2.1
scikit-learn==0.19.2
scipy==1.1.0
simplegeneric==0.8.1
six==1.11.0
tqdm==4.23.4
traitlets==4.3.2
urllib3==1.23
wcwidth==0.1.7
--------------------------------------------------------------------------------