├── CITISEN中文說明書.pdf
├── LICENSE
├── README.md
└── images
├── CITISEN_qrcode.png
├── CITISEN_qrcode_google.png
├── app_BNC.png
├── app_SE.png
├── app_main.png
├── app_recording_a.png
├── app_recording_b.png
├── app_recording_c.png
├── app_uploading.png
└── qr.ioi.tw.png
/CITISEN中文說明書.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/CITISEN中文說明書.pdf
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 kuluchen
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # CITISEN
2 |
4 |
5 |
6 | ## Introduction
7 | In this work, we present CITISEN, a deep learning-based speech signal-processing mobile application. CITISEN provides three functions: speech enhancement (SE), model adaptation (MA), and background noise conversion (BNC). Together, these make CITISEN a platform for using and evaluating SE models and for flexibly extending those models to new noise environments and users. For SE, CITISEN downloads pretrained SE models from the cloud server and uses them to reduce noise components in instant or saved recordings provided by users. When CITISEN encounters noisy speech from unknown speakers or with unknown noise types, the MA function improves SE performance: a few audio files of the unseen speakers or noise types are recorded, uploaded to the cloud server, and used to adapt the pretrained SE model. Finally, for BNC, CITISEN removes the original background noise using an SE model and then mixes the processed speech signal with a new background noise. The novel BNC function can be used to evaluate SE performance under specific conditions, cover people’s tracks, and provide entertainment.
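
The BNC step described above (enhance first, then mix in a new background) can be illustrated with a minimal sketch. This is not CITISEN's implementation: the function names `rms` and `mix_with_noise`, the `snr_db` parameter, and the choice to loop the noise to the speech length are all assumptions made for illustration, and the enhanced speech is taken as a given list of float samples.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_with_noise(enhanced, noise, snr_db):
    """Mix enhanced speech with a new background noise at a target SNR in dB.

    Hypothetical helper: the noise is looped to cover the utterance and then
    scaled so that 20*log10(rms(enhanced) / rms(scaled_noise)) == snr_db.
    """
    if not enhanced or not noise:
        raise ValueError("both signals must be non-empty")
    # Loop (or truncate) the noise so it matches the speech length.
    looped = [noise[i % len(noise)] for i in range(len(enhanced))]
    noise_rms = rms(looped)
    if noise_rms == 0.0:
        # Silent noise: nothing to mix in.
        return list(enhanced)
    gain = rms(enhanced) / (noise_rms * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(enhanced, looped)]


# Toy signals standing in for an SE model output and a recorded noise clip.
speech = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
new_noise = [0.5, -0.5, 0.25, -0.25]
converted = mix_with_noise(speech, new_noise, snr_db=10)
```

A lower `snr_db` makes the new background more prominent relative to the speech; the value to use in practice depends on the listening scenario being simulated.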
8 |
9 |
10 | ## User interface and usage
11 |
12 | ### Four main pages in CITISEN
13 |
15 |
16 | The CITISEN application has four pages: "Speech Enhancement", "Background Noise Conversion", "Model Adaptation" (the "uploading" page described below), and "Recording". The name of the current page is shown at the top left of the application, and the navigation buttons for the pages are along the bottom.
17 |
18 |
19 | ### Speech Enhancement page
20 |
22 |
23 | The “gender” button in the upper-right corner is used to specify the user’s gender. Pressing the “model switch” button opens a list of SE models from which users can select a different model. After pressing the “preview” button, users hear their original instant recording; after pressing the “activate” button, they hear the enhanced instant recording.
24 |
25 | ### Background Noise Conversion page
26 |
27 |
28 |
30 |
31 | Pressing the “sound switch” button opens a list of background noises. After pressing the “record noise” button, users can record and save a new noise signal. After pressing the “activate” button, users hear the enhanced instant recording mixed with the selected background noise. Note that the “gender” and “model switch” buttons work the same way as on the “speech enhancement” page.
32 |
33 | ### Uploading page
34 |
35 |
37 |
38 | The “uploading” page is used to upload data for the Model Adaptation function. Because CITISEN supports both unknown-noise adaptation and new-speaker adaptation, there are two upload buttons: “record speech” and “record noise.” Users press either button to start recording and press it again to stop. CITISEN then opens a submission window in which users can name the audio file and upload the recording to the server.
39 |
40 | ### Recording page
41 |
42 | A. Recording page of CITISEN (Part I). A new audio file can be recorded after pressing the “record new” button. The file can then be named and saved in a pop-up submission window.
43 |
44 |
46 |
47 | B. Recording page of CITISEN (Part II). By pressing the “choose file” button, users can choose an audio file in a pop-up window.
48 |
49 |
51 |
52 | C. Recording page of CITISEN (Part III). Users can choose an SE model type and an SE model by using the “gender” and “model switch” buttons. In addition, users can evaluate the SE results visually and aurally.
53 |
54 |
56 |
57 |
58 | The “recording” page supports conventional recording and SE model evaluation. Specifically, on this page users can save a speech signal, play it back, and run SE on it. First, users record new audio by pressing the “record new” button, which takes them to a processing page. After stopping the recording with the “stop” button, users can name and save it. This workflow is shown in Fig. A. Next, users choose an audio file, a model mode, and an SE model with the “choose file,” “gender,” and “model switch” buttons, respectively. Finally, pressing the “run” button generates an enhanced speech signal. Because CITISEN displays both the noisy spectrogram and the enhanced spectrogram, users can evaluate the SE results visually. Users can also evaluate the results aurally by pressing the “play” and “stop” buttons to listen to the original and the enhanced speech signals. Fig. B and Fig. C show the “recording” page in more detail.
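
To make the visual comparison above concrete, the following is a minimal sketch of how a magnitude spectrogram can be computed from a list of samples. The README does not say how CITISEN computes its spectrograms, so everything here is an assumption for illustration: the function name `magnitude_spectrogram`, the Hann window, and the `frame_len` and `hop` values. It uses a direct DFT from the standard library, which is slow but dependency-free.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames, each holding frame_len // 2 + 1 magnitudes.

    Hypothetical helper: a short-time Fourier transform with a Hann window,
    evaluated with a direct DFT per frame.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # DFT bin k of the windowed frame.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Toy input: a pure tone that falls exactly on DFT bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = magnitude_spectrogram(tone)
peak_bins = [frame.index(max(frame)) for frame in spec]
```

Computing this for both the noisy and the enhanced signal and comparing the two frame by frame mirrors the side-by-side view described above; energy that disappears from the enhanced spectrogram corresponds to the removed noise.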
59 |
60 |
61 | ## Download
62 |
63 | * Download the APK for Android.
64 |
65 | || Google drive | Dropbox (New) |
66 | |:------:|:------:|:------:|
67 | |URL| [link](https://drive.google.com/file/d/1FyfM3gcELodCzqN3zodJXp4hRjGQS4oF/view)|[link](https://www.dropbox.com/s/0e9mmwua8bfau2f/denoiser_220413.apk)|
68 | |QR code | | |
69 |
70 |
71 |
72 | * [Download] CITISEN for iOS.
73 |
74 | ## Paper
75 | * See the [paper](https://arxiv.org/pdf/2008.09264.pdf) for more details.
76 |
77 | ## Results and demo
78 | * You can listen to some samples on the [Demo webpage](https://bio-asp-lab.github.io/CITISEN_demo/).
79 |
80 | ## Citations
81 | @ARTICLE{citisen2022, \
82 |   author={Chen, Yu-Wen and Hung, Kuo-Hsuan and Li, You-Jin and Kang, Alexander Chao-Fu and Lai, Ya-Hsin and Liu, Kai-Chun and Fu, Szu-Wei and Wang, Syu-Siang and Tsao, Yu}, \
83 | journal={IEEE Access}, \
84 | title={CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application}, \
85 | year={2022}, \
86 | doi={10.1109/ACCESS.2022.3153469}}
87 |
88 |
89 | ## License
90 | * CITISEN is released under the MIT License. See LICENSE for details.
91 |
92 | ## Acknowledgments
93 | * [Bio-ASP Lab](https://bio-asplab.citi.sinica.edu.tw), CITI, Academia Sinica, Taipei, Taiwan
94 |
95 |
--------------------------------------------------------------------------------
/images/CITISEN_qrcode.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/CITISEN_qrcode.png
--------------------------------------------------------------------------------
/images/CITISEN_qrcode_google.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/CITISEN_qrcode_google.png
--------------------------------------------------------------------------------
/images/app_BNC.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_BNC.png
--------------------------------------------------------------------------------
/images/app_SE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_SE.png
--------------------------------------------------------------------------------
/images/app_main.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_main.png
--------------------------------------------------------------------------------
/images/app_recording_a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_recording_a.png
--------------------------------------------------------------------------------
/images/app_recording_b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_recording_b.png
--------------------------------------------------------------------------------
/images/app_recording_c.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_recording_c.png
--------------------------------------------------------------------------------
/images/app_uploading.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_uploading.png
--------------------------------------------------------------------------------
/images/qr.ioi.tw.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/qr.ioi.tw.png
--------------------------------------------------------------------------------