├── CITISEN中文說明書.pdf
├── LICENSE
├── README.md
└── images
    ├── CITISEN_qrcode.png
    ├── CITISEN_qrcode_google.png
    ├── app_BNC.png
    ├── app_SE.png
    ├── app_main.png
    ├── app_recording_a.png
    ├── app_recording_b.png
    ├── app_recording_c.png
    ├── app_uploading.png
    └── qr.ioi.tw.png

/CITISEN中文說明書.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/CITISEN中文說明書.pdf

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 kuluchen
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # CITISEN
2 | CITISEN video
4 |
5 |
6 | ## Introduction
7 | In this work, we present CITISEN, a deep learning-based speech signal-processing mobile application. CITISEN provides three functions: speech enhancement (SE), model adaptation (MA), and background noise conversion (BNC). These functions allow CITISEN to serve as a platform for using and evaluating SE models and for flexibly extending the models to handle various noise environments and users. For SE, CITISEN downloads pretrained SE models from the cloud server and then uses these models to reduce noise components in instant or saved recordings provided by users. When CITISEN encounters noisy speech signals with unknown speakers or noise types, the MA function allows it to improve SE performance: a few audio files of the unseen speakers or noise types are recorded, uploaded to the cloud server, and used to adapt the pretrained SE model. Finally, for BNC, CITISEN removes the original background noise using an SE model and then mixes the processed speech signal with new background noise. The novel BNC function can be used to evaluate SE performance under specific conditions, cover people’s tracks, and provide entertainment.
8 |
9 |
10 | ## User interface and usage
11 |
12 | ### Four main pages in CITISEN
13 | main
15 |
16 | The CITISEN application has four pages: "Speech Enhancement", "Background Noise Conversion", "Model Adaptation", and "Recording". The page name is shown at the top-left of the screen, and the navigation buttons for the pages are shown at the bottom.
17 |
18 |
19 | ### Speech Enhancement page
20 | main
22 |
23 | The “gender” button in the upper-right corner is used to specify the user’s gender.
Pressing the “model switch” button opens a list of SE models, from which users can select a different model. After pressing the “preview” button, users hear their original instant recording; after pressing the “activate” button, they hear the enhanced instant recording.
24 |
25 | ### Background Noise Conversion page
26 |
27 |
28 | main
30 |
31 | Pressing the “sound switch” button opens a list of background noises. After pressing the “record noise” button, users can record and save a new noise signal. After pressing the “activate” button, users hear the enhanced instant recording mixed with the specified background noise. Note that the “gender” and “model switch” buttons have the same functions as on the “speech enhancement” page.
32 |
33 | ### Uploading page
34 |
35 | main
37 |
38 | The “uploading” page is used to upload data for the Model Adaptation function. Because CITISEN supports both unknown-noise adaptation and new-speaker adaptation, there are two upload buttons: “record speech” and “record noise.” Pressing either button starts a recording, and pressing it again stops the recording. CITISEN then opens a submission window in which users can name the audio file and upload the recorded audio to the server.
39 |
40 | ### Recording page
41 |
42 | A. Recording page of CITISEN (Part I). A new audio file can be recorded after pressing the “record new” button. The file can then be named and saved in a pop-up submission window.
43 |
44 | main
46 |
47 | B. Recording page of CITISEN (Part II). By pressing the “choose file” button, users can choose an audio file in a pop-up window.
48 |
49 | main
51 |
52 | C. Recording page of CITISEN (Part III). Users can choose an SE model type and an SE model by using the “gender” and “model switch” buttons. In addition, users can evaluate the SE results visually and aurally.
53 |
54 | main
56 |
57 |
58 | The “recording” page supports conventional recording and SE model evaluation. Specifically, on the “recording” page, users can save a speech signal, play it back, and run SE on it. First, users record new audio by pressing the “record new” button, and CITISEN redirects to a processing page. After ending the recording by pressing the “stop” button, users can name and save the recording. This workflow is shown in Fig. A. Then, users choose an audio file, a model mode, and an SE model with the “choose file,” “gender,” and “model switch” buttons, respectively. Finally, pressing the “run” button generates an enhanced speech signal. Because CITISEN displays both the noisy spectrogram and the enhanced spectrogram, users can evaluate the SE results visually. Users can also evaluate the results aurally by pressing the “play” and “stop” buttons to listen to the original and enhanced speech signals. Fig. B and Fig. C show the “recording” page in more detail.
59 |
60 |
61 | ## Download
62 |
63 | * Download apk for Android.
64 |
65 | || Google drive | Dropbox (New) |
66 | |:------:|:------:|:------:|
67 | |URL| [link](https://drive.google.com/file/d/1FyfM3gcELodCzqN3zodJXp4hRjGQS4oF/view)|[link](https://www.dropbox.com/s/0e9mmwua8bfau2f/denoiser_220413.apk)|
68 | |QR code |main|main|
69 |
70 |
71 |
72 | * [Download] CITISEN for iOS.
73 |
74 | ## Paper
75 | * See [Paper](https://arxiv.org/pdf/2008.09264.pdf) for more details.
76 |
77 | ## Results and demo
78 | * You can listen to some samples on the [Demo webpage](https://bio-asp-lab.github.io/CITISEN_demo/).
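
The Introduction describes BNC as removing the original background noise with an SE model and then mixing the processed speech with new background noise. The following is a minimal sketch of the mixing step only, assuming the speech has already been enhanced. The function names `rms` and `mix_at_snr`, the looping of short noise clips, and the use of a target signal-to-noise ratio to set the noise level are illustrative assumptions, not taken from the CITISEN implementation.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of audio samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Mix enhanced speech with a new background noise at a target SNR (dB).

    Hypothetical helper: the SNR-based scaling is an assumption made for
    this illustration, not CITISEN's documented behaviour.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")

    # Loop the noise clip so that it covers the whole speech signal.
    repeats = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * repeats)[: len(speech)]

    noise_level = rms(noise)
    if noise_level == 0.0:
        return list(speech)  # silent noise: nothing to add

    # Scale the noise so that rms(speech) / rms(scaled noise) = 10^(snr_db / 20).
    gain = rms(speech) / (noise_level * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


# Usage: a 440 Hz tone standing in for enhanced speech, plus a synthetic noise clip.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noise = [((t * 37) % 11 - 5) / 5 for t in range(500)]
mixed = mix_at_snr(speech, noise, snr_db=10.0)
```

A lower `snr_db` makes the new background noise more prominent relative to the speech.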
79 |
80 | ## Citations
81 | @ARTICLE{citisen2022, \
82 | author={Chen, Yu-Wen and Hung, Kuo-Hsuan and Li, You-Jin and Kang, Alexander Chao-Fu and Lai, Ya-Hsin and Liu, Kai-Chun and Fu, Szu-Wei and Wang, Syu-Siang and Tsao, Yu}, \
83 | journal={IEEE Access}, \
84 | title={CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application}, \
85 | year={2022}, \
86 | doi={10.1109/ACCESS.2022.3153469}}
87 |
88 |
89 | ## License
90 | * CITISEN is released under the MIT License. See LICENSE for more details.
91 |
92 | ## Acknowledgments
93 | * [Bio-ASP Lab](https://bio-asplab.citi.sinica.edu.tw), CITI, Academia Sinica, Taipei, Taiwan
94 |
95 |

--------------------------------------------------------------------------------
/images/CITISEN_qrcode.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/CITISEN_qrcode.png

--------------------------------------------------------------------------------
/images/CITISEN_qrcode_google.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/CITISEN_qrcode_google.png

--------------------------------------------------------------------------------
/images/app_BNC.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_BNC.png

--------------------------------------------------------------------------------
/images/app_SE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_SE.png

--------------------------------------------------------------------------------
/images/app_main.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_main.png

--------------------------------------------------------------------------------
/images/app_recording_a.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_recording_a.png

--------------------------------------------------------------------------------
/images/app_recording_b.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_recording_b.png

--------------------------------------------------------------------------------
/images/app_recording_c.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_recording_c.png

--------------------------------------------------------------------------------
/images/app_uploading.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/app_uploading.png

--------------------------------------------------------------------------------
/images/qr.ioi.tw.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yuwchen/CITISEN/5a77eab888dbb3d87fdfa7fe1a2c1d25f0a2683f/images/qr.ioi.tw.png

--------------------------------------------------------------------------------