├── LICENSE
├── M5Unified_StackChan_ChatGPT_Google
│   ├── .gitignore
│   ├── .vscode
│   │   └── extensions.json
│   ├── include
│   │   └── README
│   ├── lib
│   │   └── README
│   ├── platformio.ini
│   ├── src
│   │   ├── Audio.cpp
│   │   ├── Audio.h
│   │   ├── AudioOutputM5Speaker.h
│   │   ├── CloudSpeechClient.cpp
│   │   ├── CloudSpeechClient.h
│   │   ├── main.cpp
│   │   ├── network_param.h
│   │   └── rootCACertificate.h
│   └── test
│       └── README
├── README.md
├── README_en.md
└── images
    ├── image1.png
    ├── image2.png
    ├── image3.png
    └── image4.png
/LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 robo8080 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE.
22 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/.gitignore: -------------------------------------------------------------------------------- 1 | .pio 2 | .vscode/.browse.c_cpp.db* 3 | .vscode/c_cpp_properties.json 4 | .vscode/launch.json 5 | .vscode/ipch 6 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/.vscode/extensions.json: -------------------------------------------------------------------------------- 1 | { 2 | // See http://go.microsoft.com/fwlink/?LinkId=827846 3 | // for the documentation about the extensions.json format 4 | "recommendations": [ 5 | "platformio.platformio-ide" 6 | ], 7 | "unwantedRecommendations": [ 8 | "ms-vscode.cpptools-extension-pack" 9 | ] 10 | } 11 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/include/README: -------------------------------------------------------------------------------- 1 | 2 | This directory is intended for project header files. 3 | 4 | A header file is a file containing C declarations and macro definitions 5 | to be shared between several project source files. You request the use of a 6 | header file in a project source file (C, C++, etc.) located in the `src` folder 7 | by including it, with the C preprocessing directive `#include`. 8 | 9 | ```src/main.c 10 | 11 | #include "header.h" 12 | 13 | int main (void) 14 | { 15 | ... 16 | } 17 | ``` 18 | 19 | Including a header file produces the same results as copying the header file 20 | into each source file that needs it. Such copying would be time-consuming 21 | and error-prone. With a header file, the related declarations appear 22 | in only one place. If they need to be changed, they can be changed in one 23 | place, and programs that include the header file will automatically use the 24 | new version when next recompiled.
The header file eliminates the labor of 25 | finding and changing all the copies as well as the risk that a failure to 26 | find one copy will result in inconsistencies within a program. 27 | 28 | In C, the usual convention is to give header files names that end with `.h`. 29 | It is most portable to use only letters, digits, dashes, and underscores in 30 | header file names, and at most one dot. 31 | 32 | Read more about using header files in the official GCC documentation: 33 | 34 | * Include Syntax 35 | * Include Operation 36 | * Once-Only Headers 37 | * Computed Includes 38 | 39 | https://gcc.gnu.org/onlinedocs/cpp/Header-Files.html 40 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/lib/README: -------------------------------------------------------------------------------- 1 | 2 | This directory is intended for project-specific (private) libraries. 3 | PlatformIO will compile them to static libraries and link them into the executable file. 4 | 5 | The source code of each library should be placed in its own separate directory 6 | ("lib/your_library_name/[here are source files]"). 7 | 8 | For example, see the structure of the following two libraries `Foo` and `Bar`: 9 | 10 | |--lib 11 | | | 12 | | |--Bar 13 | | | |--docs 14 | | | |--examples 15 | | | |--src 16 | | | |- Bar.c 17 | | | |- Bar.h 18 | | | |- library.json (optional, custom build options, etc) https://docs.platformio.org/page/librarymanager/config.html 19 | | | 20 | | |--Foo 21 | | | |- Foo.c 22 | | | |- Foo.h 23 | | | 24 | | |- README --> THIS FILE 25 | | 26 | |- platformio.ini 27 | |--src 28 | |- main.c 29 | 30 | and the contents of `src/main.c`: 31 | ``` 32 | #include <Foo.h> 33 | #include <Bar.h> 34 | 35 | int main (void) 36 | { 37 | ... 38 | } 39 | 40 | ``` 41 | 42 | The PlatformIO Library Dependency Finder will automatically find the dependent 43 | libraries by scanning the project source files.
44 | 45 | More information about PlatformIO Library Dependency Finder 46 | - https://docs.platformio.org/page/librarymanager/ldf.html 47 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/platformio.ini: -------------------------------------------------------------------------------- 1 | ; PlatformIO Project Configuration File 2 | ; 3 | ; Build options: build flags, source filter 4 | ; Upload options: custom upload port, speed and extra flags 5 | ; Library options: dependencies, extra library storages 6 | ; Advanced options: extra scripting 7 | ; 8 | ; Please visit documentation for the other options and examples 9 | ; https://docs.platformio.org/page/projectconf.html 10 | 11 | [env:m5stack-core2] 12 | platform = espressif32@^6.2.0 13 | board = m5stack-core2 14 | framework = arduino 15 | upload_speed = 1500000 16 | monitor_speed = 115200 17 | ;board_build.partitions = no_ota.csv 18 | board_build.partitions = huge_app.csv 19 | monitor_filters = esp32_exception_decoder 20 | lib_deps = 21 | m5stack/M5Unified @ 0.1.9 22 | earlephilhower/ESP8266Audio @ ^1.9.7 23 | meganetaaan/M5Stack-Avatar@0.8.6 24 | arminjo/ServoEasing@^2.4.0 25 | madhephaestus/ESP32Servo@^0.9.0 26 | bblanchon/ArduinoJson @ ^6 27 | ESP32WebServer 28 | ESPmDNS 29 | https://github.com/horihiro/esp8266-google-tts 30 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/src/Audio.cpp: -------------------------------------------------------------------------------- 1 | #include <M5Unified.h> 2 | #include "Audio.h" 3 | 4 | Audio::Audio() { 5 | wavData = (typeof(wavData))heap_caps_malloc(record_size * sizeof(int16_t), MALLOC_CAP_8BIT); 6 | memset(wavData, 0, record_size * sizeof(int16_t)); 7 | } 8 | 9 | Audio::~Audio() { 10 | heap_caps_free(wavData); // allocated with heap_caps_malloc(), so it must be released with heap_caps_free(), not delete 11 | } 12 | 13 | void Audio::CreateWavHeader(byte* header, int waveDataSize){ 14 | header[0] = 'R'; 15 | header[1] = 'I'; 16 | header[2] = 'F'; 17 |
header[3] = 'F'; 18 | unsigned int fileSizeMinus8 = waveDataSize + 44 - 8; 19 | header[4] = (byte)(fileSizeMinus8 & 0xFF); 20 | header[5] = (byte)((fileSizeMinus8 >> 8) & 0xFF); 21 | header[6] = (byte)((fileSizeMinus8 >> 16) & 0xFF); 22 | header[7] = (byte)((fileSizeMinus8 >> 24) & 0xFF); 23 | header[8] = 'W'; 24 | header[9] = 'A'; 25 | header[10] = 'V'; 26 | header[11] = 'E'; 27 | header[12] = 'f'; 28 | header[13] = 'm'; 29 | header[14] = 't'; 30 | header[15] = ' '; 31 | header[16] = 0x10; // fmt chunk size = 16 (PCM) 32 | header[17] = 0x00; 33 | header[18] = 0x00; 34 | header[19] = 0x00; 35 | header[20] = 0x01; // format = 1 (linear PCM) 36 | header[21] = 0x00; 37 | header[22] = 0x01; // channels = 1 (mono) 38 | header[23] = 0x00; 39 | header[24] = 0x80; // sample rate = 16000 (0x3E80) 40 | header[25] = 0x3E; 41 | header[26] = 0x00; 42 | header[27] = 0x00; 43 | header[28] = 0x00; // bytes/sec = 16000 x 2 x 1 = 32000 (0x7D00) 44 | header[29] = 0x7D; 45 | header[30] = 0x00; 46 | header[31] = 0x00; 47 | header[32] = 0x02; // block align = 2 bytes (16-bit mono) 48 | header[33] = 0x00; 49 | header[34] = 0x10; // bits per sample = 16 50 | header[35] = 0x00; 51 | header[36] = 'd'; 52 | header[37] = 'a'; 53 | header[38] = 't'; 54 | header[39] = 'a'; 55 | header[40] = (byte)(waveDataSize & 0xFF); 56 | header[41] = (byte)((waveDataSize >> 8) & 0xFF); 57 | header[42] = (byte)((waveDataSize >> 16) & 0xFF); 58 | header[43] = (byte)((waveDataSize >> 24) & 0xFF); 59 | } 60 | 61 | void Audio::Record() { 62 | CreateWavHeader(paddedHeader, wavDataSize); 63 | M5.Mic.begin(); 64 | int rec_record_idx; 65 | for (rec_record_idx = 0; rec_record_idx < record_number; rec_record_idx++) { 66 | auto data = &wavData[rec_record_idx * record_length]; 67 | M5.Mic.record(data, record_length, record_samplerate); 68 | } 69 | M5.Mic.end(); 70 | } 71 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/src/Audio.h: -------------------------------------------------------------------------------- 1 | #ifndef _AUDIO_H 2 | #define
_AUDIO_H 3 | 4 | //#include "I2S.h" 5 | #include <M5Unified.h> 6 | 7 | // 16-bit, mono, 16000 Hz, linear PCM 8 | class Audio { 9 | // static constexpr const size_t record_number = 256; 10 | // static constexpr const size_t record_length = 200; 11 | // static constexpr const size_t record_size = record_number * record_length; 12 | // static constexpr const size_t record_samplerate = 16000; 13 | static const int headerSize = 44; 14 | // static size_t rec_record_idx; 15 | // static const int i2sBufferSize = 6000; 16 | // char i2sBuffer[i2sBufferSize]; 17 | void CreateWavHeader(byte* header, int waveDataSize); 18 | 19 | public: 20 | static constexpr const size_t record_number = 300; 21 | // static constexpr const size_t record_number = 256; 22 | // static constexpr const size_t record_length = 200; 23 | static constexpr const size_t record_length = 150; 24 | static constexpr const size_t record_size = record_number * record_length; 25 | static constexpr const size_t record_samplerate = 16000; 26 | // static size_t rec_record_idx; 27 | static const int wavDataSize = record_number * record_length * 2; // Size in bytes. Recording time is about 2.8 seconds (300 * 150 samples / 16000 Hz). 28 | // static const int dividedWavDataSize = i2sBufferSize/4; 29 | // static int16_t *wavData; // It's divided. Because large continuous memory area can't be allocated in esp32. 30 | int16_t *wavData; // Recording buffer; heap-allocated because a contiguous block this large can't be a static array on the ESP32. 31 | byte paddedHeader[headerSize + 4] = {0}; // The size must be a multiple of 3 for Base64 encoding; the number of added bytes must be even because the wave data is 16-bit.
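An editor's aside on the two size constraints in the comment above (a sketch, not part of the original source): Base64 emits 4 output characters per 3 input bytes, so only a buffer whose length is a multiple of 3 encodes with no trailing `=` padding, and only such buffers can be encoded chunk-by-chunk and concatenated into one valid stream, which is what the HTTP body writer relies on. Padding the 44-byte header to 48 bytes, and sending the samples in chunks of 150 samples (300 bytes), keeps every piece a multiple of 3:

```cpp
#include <cstddef>

// Encoded length of n input bytes: 4 chars per 3-byte group, rounded up.
constexpr std::size_t base64_len(std::size_t n) { return 4 * ((n + 2) / 3); }

// Padding-free, concatenation-safe encoding requires length % 3 == 0.
constexpr bool chunk_concat_safe(std::size_t n) { return n % 3 == 0; }
```

With these, the padded 48-byte header encodes to exactly 64 characters and each 300-byte record chunk to exactly 400, while the raw 44-byte header alone would have needed `=` padding mid-stream.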
32 | 33 | Audio(); 34 | ~Audio(); 35 | void Record(); 36 | }; 37 | 38 | #endif // _AUDIO_H 39 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/src/AudioOutputM5Speaker.h: -------------------------------------------------------------------------------- 1 | #include <AudioOutput.h> 2 | 3 | /// set M5Speaker virtual channel (0-7) 4 | //static constexpr uint8_t m5spk_virtual_channel = 0; 5 | 6 | class AudioOutputM5Speaker : public AudioOutput 7 | { 8 | public: 9 | AudioOutputM5Speaker(m5::Speaker_Class* m5sound, uint8_t virtual_sound_channel = 0) 10 | { 11 | _m5sound = m5sound; 12 | _virtual_ch = virtual_sound_channel; 13 | } 14 | virtual ~AudioOutputM5Speaker(void) {}; 15 | virtual bool begin(void) override { return true; } 16 | virtual bool ConsumeSample(int16_t sample[2]) override 17 | { 18 | if (_tri_buffer_index < tri_buf_size) 19 | { 20 | _tri_buffer[_tri_index][_tri_buffer_index ] = sample[0]; // the left sample is written to both slots, 21 | _tri_buffer[_tri_index][_tri_buffer_index+1] = sample[0]; // so the mono TTS stream plays on both channels 22 | _tri_buffer_index += 2; 23 | 24 | return true; 25 | } 26 | 27 | flush(); 28 | return false; 29 | } 30 | virtual void flush(void) override 31 | { 32 | if (_tri_buffer_index) 33 | { 34 | _m5sound->playRaw(_tri_buffer[_tri_index], _tri_buffer_index, hertz, true, 1, _virtual_ch); 35 | _tri_index = _tri_index < 2 ?
_tri_index + 1 : 0; 36 | _tri_buffer_index = 0; 37 | ++_update_count; 38 | } 39 | } 40 | virtual bool stop(void) override 41 | { 42 | flush(); 43 | _m5sound->stop(_virtual_ch); 44 | for (size_t i = 0; i < 3; ++i) 45 | { 46 | memset(_tri_buffer[i], 0, tri_buf_size * sizeof(int16_t)); 47 | } 48 | ++_update_count; 49 | return true; 50 | } 51 | 52 | const int16_t* getBuffer(void) const { return _tri_buffer[(_tri_index + 2) % 3]; } 53 | const uint32_t getUpdateCount(void) const { return _update_count; } 54 | 55 | protected: 56 | m5::Speaker_Class* _m5sound; 57 | uint8_t _virtual_ch; 58 | static constexpr size_t tri_buf_size = 640; 59 | int16_t _tri_buffer[3][tri_buf_size]; 60 | size_t _tri_buffer_index = 0; 61 | size_t _tri_index = 0; 62 | size_t _update_count = 0; 63 | }; 64 | 65 | #define FFT_SIZE 256 66 | class fft_t 67 | { 68 | float _wr[FFT_SIZE + 1]; 69 | float _wi[FFT_SIZE + 1]; 70 | float _fr[FFT_SIZE + 1]; 71 | float _fi[FFT_SIZE + 1]; 72 | uint16_t _br[FFT_SIZE + 1]; 73 | size_t _ie; 74 | 75 | public: 76 | fft_t(void) 77 | { 78 | #ifndef M_PI 79 | #define M_PI 3.141592653 80 | #endif 81 | _ie = logf( (float)FFT_SIZE ) / log(2.0) + 0.5; 82 | static constexpr float omega = 2.0f * M_PI / FFT_SIZE; 83 | static constexpr int s4 = FFT_SIZE / 4; 84 | static constexpr int s2 = FFT_SIZE / 2; 85 | for ( int i = 1 ; i < s4 ; ++i) 86 | { 87 | float f = cosf(omega * i); 88 | _wi[s4 + i] = f; 89 | _wi[s4 - i] = f; 90 | _wr[ i] = f; 91 | _wr[s2 - i] = -f; 92 | } 93 | _wi[s4] = _wr[0] = 1; 94 | 95 | size_t je = 1; 96 | _br[0] = 0; 97 | _br[1] = FFT_SIZE / 2; 98 | for ( size_t i = 0 ; i < _ie - 1 ; ++i ) 99 | { 100 | _br[ je << 1 ] = _br[ je ] >> 1; 101 | je = je << 1; 102 | for ( size_t j = 1 ; j < je ; ++j ) 103 | { 104 | _br[je + j] = _br[je] + _br[j]; 105 | } 106 | } 107 | } 108 | 109 | void exec(const int16_t* in) 110 | { 111 | memset(_fi, 0, sizeof(_fi)); 112 | for ( size_t j = 0 ; j < FFT_SIZE / 2 ; ++j ) 113 | { 114 | float basej = 0.25 * (1.0-_wr[j]); 115 | size_t r = 
FFT_SIZE - j - 1; 116 | 117 | /// Apply a Hann window and convert stereo to mono. 118 | _fr[_br[j]] = basej * (in[j * 2] + in[j * 2 + 1]); 119 | _fr[_br[r]] = basej * (in[r * 2] + in[r * 2 + 1]); 120 | } 121 | 122 | size_t s = 1; 123 | size_t i = 0; 124 | do 125 | { 126 | size_t ke = s; 127 | s <<= 1; 128 | size_t je = FFT_SIZE / s; 129 | size_t j = 0; 130 | do 131 | { 132 | size_t k = 0; 133 | do 134 | { 135 | size_t l = s * j + k; 136 | size_t m = ke * (2 * j + 1) + k; 137 | size_t p = je * k; 138 | float Wxmr = _fr[m] * _wr[p] + _fi[m] * _wi[p]; 139 | float Wxmi = _fi[m] * _wr[p] - _fr[m] * _wi[p]; 140 | _fr[m] = _fr[l] - Wxmr; 141 | _fi[m] = _fi[l] - Wxmi; 142 | _fr[l] += Wxmr; 143 | _fi[l] += Wxmi; 144 | } while ( ++k < ke) ; 145 | } while ( ++j < je ); 146 | } while ( ++i < _ie ); 147 | } 148 | 149 | uint32_t get(size_t index) 150 | { 151 | return (index < FFT_SIZE / 2) ? (uint32_t)sqrtf(_fr[ index ] * _fr[ index ] + _fi[ index ] * _fi[ index ]) : 0u; 152 | } 153 | }; 154 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/src/CloudSpeechClient.cpp: -------------------------------------------------------------------------------- 1 | #include "CloudSpeechClient.h" 2 | #include "network_param.h" 3 | #include <ArduinoJson.h> 4 | #include <base64.h> 5 | 6 | CloudSpeechClient::CloudSpeechClient(Authentication authentication) { 7 | this->authentication = authentication; 8 | client.setCACert(root_ca); 9 | client.setTimeout( 10000 ); 10 | if (!client.connect(server_stt, 443)) Serial.println("Connection failed!"); 11 | } 12 | 13 | CloudSpeechClient::~CloudSpeechClient() { 14 | client.stop(); 15 | } 16 | 17 | void CloudSpeechClient::PrintHttpBody2(Audio* audio) { 18 | String enc = base64::encode(audio->paddedHeader, sizeof(audio->paddedHeader)); 19 | enc.replace("\n", ""); // remove the newlines the encoder inserts 20 | client.print(enc); // HttpBody2 21 | char* wavData = (char*)audio->wavData; 22 | for (int j = 0; j < audio->record_number; j++) { 23 |
enc = base64::encode((byte*)&wavData[j*audio->record_length*2], audio->record_length*2); 24 | enc.replace("\n", ""); // remove the newlines the encoder inserts 25 | client.print(enc); //Serial.print(enc); // HttpBody2 26 | delay(10); 27 | } 28 | // Serial.printf("PrintHttpBody2=%d",len); 29 | } 30 | 31 | String CloudSpeechClient::Transcribe(Audio* audio) { 32 | String HttpBody1 = "{\"config\":{\"encoding\":\"LINEAR16\",\"sampleRateHertz\":16000,\"languageCode\":\""+LANG_CODE+"\"},\"audio\":{\"content\":\""; 33 | String HttpBody3 = "\"}}\r\n\r\n"; 34 | int httpBody2Length = (audio->wavDataSize + sizeof(audio->paddedHeader))*4/3; // 4/3 is from base64 encoding 35 | String ContentLength = String(HttpBody1.length() + httpBody2Length + HttpBody3.length()); 36 | // Serial.printf("HttpBody1=%d httpBody2Length=%d HttpBody3=%d \n",HttpBody1.length(),httpBody2Length,HttpBody3.length()); 37 | String HttpHeader; 38 | if (authentication == USE_APIKEY) 39 | HttpHeader = String("POST /v1/speech:recognize?key=") + GOOGLE_API_KEY 40 | + String(" HTTP/1.1\r\nHost: speech.googleapis.com\r\nContent-Type: application/json\r\nContent-Length: ") + ContentLength + String("\r\n\r\n"); 41 | else if (authentication == USE_ACCESSTOKEN) 42 | HttpHeader = String("POST /v1/speech:recognize HTTP/1.1\r\nHost: speech.googleapis.com\r\nContent-Type: application/json\r\nAuthorization: Bearer ") 43 | + AccessToken + String("\r\nContent-Length: ") + ContentLength + String("\r\n\r\n"); 44 | client.print(HttpHeader); //Serial.print(HttpHeader); 45 | client.print(HttpBody1); //Serial.print(HttpBody1); 46 | PrintHttpBody2(audio); 47 | client.print(HttpBody3); //Serial.print(HttpBody3); 48 | uint32_t waitStart = millis(); while (!client.available() && millis() - waitStart < 10000) delay(10); // wait for the response, but don't spin forever if the connection dies 49 | // Skip HTTP headers 50 | char endOfHeaders[] = "\r\n\r\n"; 51 | if (!client.find(endOfHeaders)) { 52 | Serial.println(F("Invalid response")); 53 | return String(""); 54 | } 55 | if(client.available())client.read(); 56 | if(client.available())client.read(); 57 | if(client.available())client.read(); 58 | 59 |
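For reference, a successful response body from the Cloud Speech-to-Text v1 `speech:recognize` endpoint has the following documented shape (values here are illustrative), which is what the parsing code walks with `results[0].alternatives[0].transcript`:

```json
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "こんにちは",
          "confidence": 0.94
        }
      ]
    }
  ]
}
```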
// Parse JSON object 60 | StaticJsonDocument<500> jsonBuffer; 61 | DeserializationError error = deserializeJson(jsonBuffer, client); 62 | //root.prettyPrintTo(Serial); //Serial.println(""); 63 | String result = ""; 64 | if (error) { 65 | Serial.println("Parsing failed!"); 66 | return result; 67 | } else { 68 | String json_string; 69 | serializeJsonPretty(jsonBuffer, json_string); 70 | Serial.println("===================="); 71 | Serial.println(json_string); 72 | Serial.println("===================="); 73 | // root.prettyPrintTo(Serial); 74 | const char* text = jsonBuffer["results"][0]["alternatives"][0]["transcript"]; 75 | Serial.print("\n認識結果:"); 76 | if(text) { 77 | result = String(text); 78 | Serial.println((char *)text); 79 | } 80 | else { 81 | Serial.println("NG"); 82 | } 83 | } 84 | return result; 85 | } 86 | 87 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/src/CloudSpeechClient.h: -------------------------------------------------------------------------------- 1 | #ifndef _CLOUDSPEECHCLIENT_H 2 | #define _CLOUDSPEECHCLIENT_H 3 | #include <WiFiClientSecure.h> 4 | #include "Audio.h" 5 | 6 | enum Authentication { 7 | USE_ACCESSTOKEN, 8 | USE_APIKEY 9 | }; 10 | 11 | class CloudSpeechClient { 12 | WiFiClientSecure client; 13 | void PrintHttpBody2(Audio* audio); 14 | Authentication authentication; 15 | 16 | public: 17 | CloudSpeechClient(Authentication authentication); 18 | ~CloudSpeechClient(); 19 | String Transcribe(Audio* audio); 20 | }; 21 | 22 | #endif // _CLOUDSPEECHCLIENT_H 23 | 24 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/src/main.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | //#include 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | //#define USE_DOGFACE 9 | #ifdef USE_DOGFACE 10 | #include 11 | #endif 12 | 13 | #include 14 | #include 15 | #include 16
| #include "AudioOutputM5Speaker.h" 17 | #include 18 | #include 19 | #include <ServoEasing.hpp> // https://github.com/ArminJo/ServoEasing 20 | 21 | #include 22 | #include 23 | #include "rootCACertificate.h" 24 | #include 25 | #include 26 | #include 27 | #include 28 | #include "CloudSpeechClient.h" 29 | 30 | // Default LANG_CODE 31 | String LANG_CODE = "ja-JP"; 32 | //String LANG_CODE = "en-US"; 33 | //String LANG_CODE = "es-MX"; 34 | 35 | // Maximum number of questions and answers to keep 36 | const int MAX_HISTORY = 5; 37 | 38 | // Data structure holding the past questions and answers 39 | std::deque<String> chatHistory; 40 | 41 | #define USE_SDCARD 42 | #define WIFI_SSID "SET YOUR WIFI SSID" 43 | #define WIFI_PASS "SET YOUR WIFI PASS" 44 | #define OPENAI_APIKEY "SET YOUR OPENAI APIKEY" 45 | #define GOOGL_APIKEY "SET YOUR GOOGL APIKEY" 46 | 47 | #define USE_SERVO 48 | #ifdef USE_SERVO 49 | #if defined(ARDUINO_M5STACK_Core2) 50 | // #define SERVO_PIN_X 13 //Core2 PORT C 51 | // #define SERVO_PIN_Y 14 52 | #define SERVO_PIN_X 33 //Core2 PORT A 53 | #define SERVO_PIN_Y 32 54 | #elif defined( ARDUINO_M5STACK_FIRE ) 55 | #define SERVO_PIN_X 21 56 | #define SERVO_PIN_Y 22 57 | #elif defined( ARDUINO_M5Stack_Core_ESP32 ) 58 | #define SERVO_PIN_X 21 59 | #define SERVO_PIN_Y 22 60 | #endif 61 | #endif 62 | 63 | TTS tts; 64 | HTTPClient http; 65 | WiFiClient client; 66 | 67 | /// set M5Speaker virtual channel (0-7) 68 | static constexpr uint8_t m5spk_virtual_channel = 0; 69 | using namespace m5avatar; 70 | Avatar avatar; 71 | const Expression expressions_table[] = { 72 | Expression::Neutral, 73 | Expression::Happy, 74 | Expression::Sleepy, 75 | Expression::Doubt, 76 | Expression::Sad, 77 | Expression::Angry 78 | }; 79 | 80 | ESP32WebServer server(80); 81 | 82 | String OPENAI_API_KEY = ""; 83 | extern String GOOGLE_API_KEY; 84 | 85 | char* text1 = "みなさんこんにちは、私の名前はスタックチャンです、よろしくね。"; 86 | char* text2 = "Hello everyone, my name is Stack Chan, nice to meet you."; 87 | 88 | // C++11 multiline string constants are neato...
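The comment above refers to C++11 raw string literals, which this sketch uses to embed whole multi-line HTML pages in flash (`PROGMEM`). A minimal illustration of the construct (an editor's example, not code from the sketch) using the same `KEWL` delimiter the firmware picked:

```cpp
#include <cstring>

// R"TAG(...)TAG" preserves quotes, backslashes, and newlines verbatim,
// so multi-line HTML can be embedded without any escaping.
const char* kPage = R"KEWL(<title>AI Stack-chan</title>
They said "hello" with no escaping.)KEWL";

bool page_contains(const char* needle) {
    return std::strstr(kPage, needle) != nullptr;
}
```

The delimiter (`KEWL` here) can be any short token; it only has to be chosen so that `)KEWL"` never occurs inside the literal itself.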
89 | static const char HEAD[] PROGMEM = R"KEWL( 90 | 91 | 92 | 93 | 94 | AI Stack-chan 95 | )KEWL"; 96 | 97 | 98 | static const char APIKEY_HTML[] PROGMEM = R"KEWL( 99 | 100 | 101 | 102 | 103 | API Key Settings 104 | 105 | 106 |

API Key Settings

107 |
108 | 109 |
110 | 111 |
112 | 113 |
114 | 142 | 143 | )KEWL"; 144 | 145 | static const char ROLE_HTML[] PROGMEM = R"KEWL( 146 | 147 | 148 | 149 | role-setting 150 | 151 | 152 | 159 | 160 | 161 |

role-setting

162 |
163 |
164 |

165 | 166 |
167 | 191 | 192 | )KEWL"; 193 | String speech_text = ""; 194 | String speech_text_buffer = ""; 195 | //DynamicJsonDocument chat_doc(1024); 196 | DynamicJsonDocument chat_doc(1024*10); 197 | String json_ChatString = "{\"model\": \"gpt-3.5-turbo-0613\",\"messages\": [{\"role\": \"user\", \"content\": \"""\"}]}"; 198 | 199 | bool init_chat_doc(const char *data) 200 | { 201 | DeserializationError error = deserializeJson(chat_doc, data); 202 | if (error) { 203 | Serial.println("DeserializationError"); 204 | return false; 205 | } 206 | String json_str; //= JSON.stringify(chat_doc); 207 | serializeJsonPretty(chat_doc, json_str); // output the string to the serial port 208 | Serial.println(json_str); 209 | return true; 210 | } 211 | 212 | void handleRoot() { 213 | server.send(200, "text/plain", "hello from m5stack!"); 214 | } 215 | 216 | void handleNotFound(){ 217 | String message = "File Not Found\n\n"; 218 | message += "URI: "; 219 | message += server.uri(); 220 | message += "\nMethod: "; 221 | message += (server.method() == HTTP_GET)?"GET":"POST"; 222 | message += "\nArguments: "; 223 | message += server.args(); 224 | message += "\n"; 225 | for (uint8_t i=0; i<server.args(); i++){ 226 | message += " " + server.argName(i) + ": " + server.arg(i) + "\n"; 227 | } 228 | 229 | server.send(404, "text/plain", String("") + message + String("")); 230 | } 231 | 232 | //void VoiceText_tts(char *text,char *tts_parms) ; 233 | void google_tts(char *text, char *lang); 234 | void handle_speech() { 235 | String message = server.arg("say"); 236 | String expression = server.arg("expression"); 237 | int expr = 0; 238 | Serial.println(expression); 239 | if(expression != ""){ 240 | expr = expression.toInt(); 241 | if(expr < 0) expr = 0; 242 | if(expr > 5) expr = 5; 243 | } 244 | // message = message + "\n"; 245 | Serial.println(message); 246 | //////////////////////////////////////// 247 | // Speak the message 248 | //////////////////////////////////////// 249 | avatar.setExpression(expressions_table[expr]); 250 | google_tts((char*)message.c_str(),(char*)LANG_CODE.c_str()); 251 | // avatar.setExpression(expressions_table[0]); 252 | server.send(200, "text/plain",
String("OK")); 253 | } 254 | 255 | String https_post_json(const char* url, const char* json_string, const char* root_ca) { 256 | String payload = ""; 257 | WiFiClientSecure *client = new WiFiClientSecure; 258 | if(client) { 259 | client -> setCACert(root_ca); 260 | { 261 | // Add a scoping block for HTTPClient https to make sure it is destroyed before WiFiClientSecure *client is 262 | HTTPClient https; 263 | // https.setTimeout( 25000 ); 264 | https.setTimeout( 50000 ); 265 | 266 | Serial.print("[HTTPS] begin...\n"); 267 | if (https.begin(*client, url)) { // HTTPS 268 | Serial.print("[HTTPS] POST...\n"); 269 | // start connection and send HTTP header 270 | https.addHeader("Content-Type", "application/json"); 271 | // https.addHeader("Authorization", "Bearer YOUR_API_KEY"); 272 | https.addHeader("Authorization", String("Bearer ") + OPENAI_API_KEY); 273 | int httpCode = https.POST((uint8_t *)json_string, strlen(json_string)); 274 | 275 | // httpCode will be negative on error 276 | if (httpCode > 0) { 277 | // HTTP header has been send and Server response header has been handled 278 | Serial.printf("[HTTPS] POST... code: %d\n", httpCode); 279 | 280 | // file found at server 281 | if (httpCode == HTTP_CODE_OK || httpCode == HTTP_CODE_MOVED_PERMANENTLY) { 282 | payload = https.getString(); 283 | } 284 | } else { 285 | Serial.printf("[HTTPS] POST... 
failed, error: %s\n", https.errorToString(httpCode).c_str()); 286 | } 287 | https.end(); 288 | } else { 289 | Serial.printf("[HTTPS] Unable to connect\n"); 290 | } 291 | // End extra scoping block 292 | } 293 | delete client; 294 | } else { 295 | Serial.println("Unable to create client"); 296 | } 297 | return payload; 298 | } 299 | 300 | String chatGpt(String json_string) { 301 | String response = ""; 302 | // String json_string = "{\"model\": \"gpt-3.5-turbo\",\"messages\": [{\"role\": \"user\", \"content\": \"" + text + "\"},{\"role\": \"system\", \"content\": \"あなたは「スタックちゃん」と言う名前の小型ロボットとして振る舞ってください。\"},{\"role\": \"system\", \"content\": \"あなたはの使命は人々の心を癒すことです。\"},{\"role\": \"system\", \"content\": \"幼い子供の口調で話してください。\"}]}"; 303 | avatar.setExpression(Expression::Doubt); 304 | if(LANG_CODE == "ja-JP") { 305 | avatar.setSpeechText("考え中…"); 306 | }else{ 307 | avatar.setSpeechText("I'm thinking..."); 308 | } 309 | String ret = https_post_json("https://api.openai.com/v1/chat/completions", json_string.c_str(), root_ca_openai); 310 | avatar.setExpression(Expression::Neutral); 311 | avatar.setSpeechText(""); 312 | Serial.println(ret); 313 | if(ret != ""){ 314 | DynamicJsonDocument doc(2000); 315 | DeserializationError error = deserializeJson(doc, ret.c_str()); 316 | if (error) { 317 | Serial.print(F("deserializeJson() failed: ")); 318 | Serial.println(error.f_str()); 319 | avatar.setExpression(Expression::Sad); 320 | if(LANG_CODE == "ja-JP") { 321 | avatar.setSpeechText("エラーです"); 322 | response = "エラーです"; 323 | } else { 324 | avatar.setSpeechText("Error."); 325 | response = "Error."; 326 | } 327 | delay(1000); 328 | avatar.setSpeechText(""); 329 | avatar.setExpression(Expression::Neutral); 330 | }else{ 331 | const char* data = doc["choices"][0]["message"]["content"]; 332 | Serial.println(data); 333 | response = String(data); 334 | } 335 | } else { 336 | avatar.setExpression(Expression::Sad); 337 | if(LANG_CODE == "ja-JP") { 338 | avatar.setSpeechText("わかりません"); 339 | 
response = "わかりません"; 340 | } else { 341 | avatar.setSpeechText("I don't understand."); 342 | response = "I don't understand."; 343 | } 344 | delay(1000); 345 | avatar.setSpeechText(""); 346 | avatar.setExpression(Expression::Neutral); 347 | } 348 | return response; 349 | } 350 | 351 | String InitBuffer = ""; 352 | 353 | void handle_chat() { 354 | static String response = ""; 355 | String text = server.arg("text"); 356 | Serial.println(InitBuffer); 357 | init_chat_doc(InitBuffer.c_str()); 358 | // Add the question to the chat history 359 | chatHistory.push_back(text); 360 | // If the history exceeds the maximum, drop the oldest question and answer 361 | if (chatHistory.size() > MAX_HISTORY * 2) 362 | { 363 | chatHistory.pop_front(); 364 | chatHistory.pop_front(); 365 | } 366 | 367 | for (int i = 0; i < chatHistory.size(); i++) 368 | { 369 | JsonArray messages = chat_doc["messages"]; 370 | JsonObject systemMessage1 = messages.createNestedObject(); 371 | if(i % 2 == 0) { 372 | systemMessage1["role"] = "user"; 373 | } else { 374 | systemMessage1["role"] = "assistant"; 375 | } 376 | systemMessage1["content"] = chatHistory[i]; 377 | } 378 | 379 | String json_string; 380 | serializeJson(chat_doc, json_string); 381 | if(speech_text=="" && speech_text_buffer == "") { 382 | response = chatGpt(json_string); 383 | speech_text = response; 384 | // Add the reply to the chat history 385 | chatHistory.push_back(response); 386 | } else { 387 | response = "busy"; 388 | } 389 | // Serial.printf("chatHistory.max_size %d \n",chatHistory.max_size()); 390 | // Serial.printf("chatHistory.size %d \n",chatHistory.size()); 391 | // for (int i = 0; i < chatHistory.size(); i++) 392 | // { 393 | // Serial.print(i); 394 | // Serial.println("= "+chatHistory[i]); 395 | // } 396 | serializeJsonPretty(chat_doc, json_string); 397 | Serial.println("===================="); 398 | Serial.println(json_string); 399 | Serial.println("===================="); 400 | server.send(200, "text/html", String(HEAD)+String("")+response+String("")); 401 | } 402 | 403 | 404 | String Role_JSON = ""; 405 |
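The history scheme used by `handle_chat()` above can be summarized in a small host-side sketch (an editor's illustration with invented names, not code from the firmware): questions and answers are appended alternately, so even indices map to the "user" role and odd indices to "assistant", and trimming always removes a whole question/answer pair from the front so the parity invariant survives.

```cpp
#include <deque>
#include <string>

// Even indices are questions ("user"), odd indices are replies ("assistant").
std::string role_for(size_t index) {
    return (index % 2 == 0) ? "user" : "assistant";
}

// Keep at most max_pairs question/answer pairs, dropping the oldest pair whole.
void trim_history(std::deque<std::string>& history, size_t max_pairs) {
    while (history.size() > max_pairs * 2) {
        history.pop_front();  // oldest question
        history.pop_front();  // its answer
    }
}
```

Removing entries two at a time is what keeps the even/odd role mapping valid after trimming; popping a single entry would silently swap every remaining turn's role.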
void exec_chatGPT(String text) { 406 | static String response = ""; 407 | init_chat_doc(Role_JSON.c_str()); 408 | 409 | String role = chat_doc["messages"][0]["role"]; 410 | if(role == "user") {chat_doc["messages"][0]["content"] = text;} 411 | String json_string; 412 | serializeJson(chat_doc, json_string); 413 | 414 | response = chatGpt(json_string); 415 | speech_text = response; 416 | // server.send(200, "text/html", String(HEAD)+String("")+response+String("")); 417 | } 418 | 419 | void handle_apikey() { 420 | // Read the page and send it to the client 421 | server.send(200, "text/html", APIKEY_HTML); 422 | } 423 | 424 | void handle_apikey_set() { 425 | // Reject anything other than POST 426 | if (server.method() != HTTP_POST) { 427 | return; 428 | } 429 | // openai 430 | String openai = server.arg("openai"); 431 | // google 432 | String google = server.arg("google"); 433 | 434 | OPENAI_API_KEY = openai; 435 | // tts_user = voicetext; 436 | GOOGLE_API_KEY = google; 437 | Serial.println(openai); 438 | Serial.println(google); 439 | 440 | uint32_t nvs_handle; 441 | if (ESP_OK == nvs_open("apikey", NVS_READWRITE, &nvs_handle)) { 442 | nvs_set_str(nvs_handle, "openai", openai.c_str()); 443 | nvs_set_str(nvs_handle, "google", google.c_str()); 444 | nvs_close(nvs_handle); 445 | } 446 | server.send(200, "text/plain", String("OK")); 447 | } 448 | 449 | void handle_role() { 450 | // Read the page and send it to the client 451 | server.send(200, "text/html", ROLE_HTML); 452 | } 453 | 454 | bool save_json(){ 455 | // Mount SPIFFS 456 | if(!SPIFFS.begin(true)){ 457 | Serial.println("An Error has occurred while mounting SPIFFS"); 458 | return false; 459 | } 460 | 461 | // Create or open the JSON file 462 | File file = SPIFFS.open("/data.json", "w"); 463 | if(!file){ 464 | Serial.println("Failed to open file for writing"); 465 | return false; 466 | } 467 | 468 | // Serialize the JSON data and write it to the file 469 | serializeJson(chat_doc, file); 470 | file.close(); 471 | return true; 472 | } 473 | 474 | /** 475 |
アプリからテキスト(文字列)と共にRoll情報が配列でPOSTされてくることを想定してJSONを扱いやすい形に変更 476 | * 出力形式をJSONに変更 477 | */ 478 | void handle_role_set() { 479 | // POST以外は拒否 480 | if (server.method() != HTTP_POST) { 481 | return; 482 | } 483 | String role = server.arg("plain"); 484 | if (role != "") { 485 | init_chat_doc(InitBuffer.c_str()); 486 | JsonArray messages = chat_doc["messages"]; 487 | JsonObject systemMessage1 = messages.createNestedObject(); 488 | systemMessage1["role"] = "system"; 489 | systemMessage1["content"] = role; 490 | } else { 491 | init_chat_doc(json_ChatString.c_str()); 492 | //会話履歴をクリア 493 | chatHistory.clear(); 494 | } 495 | InitBuffer=""; 496 | serializeJson(chat_doc, InitBuffer); 497 | Serial.println("InitBuffer = " + InitBuffer); 498 | Role_JSON = InitBuffer; 499 | 500 | // JSONデータをspiffsへ出力する 501 | save_json(); 502 | 503 | // 整形したJSONデータを出力するHTMLデータを作成する 504 | String html = "
<pre>";
 505 |   serializeJsonPretty(chat_doc, html);
 506 |   html += "</pre>
"; 507 | 508 | // HTMLデータをシリアルに出力する 509 | Serial.println(html); 510 | server.send(200, "text/html", html); 511 | // server.send(200, "text/plain", String("OK")); 512 | }; 513 | void handle_role_set2() { 514 | // POST以外は拒否 515 | if (server.method() != HTTP_POST) { 516 | return; 517 | } 518 | String role = server.arg("plain"); 519 | if (role != "") { 520 | JsonArray messages = chat_doc["messages"]; 521 | JsonObject systemMessage1 = messages.createNestedObject(); 522 | systemMessage1["role"] = "system"; 523 | systemMessage1["content"] = role; 524 | } else { 525 | init_chat_doc(json_ChatString.c_str()); 526 | } 527 | 528 | // JSONデータをspiffsへ出力する 529 | save_json(); 530 | 531 | // 整形したJSONデータを出力するHTMLデータを作成する 532 | String html = "
<pre>";
 533 |   serializeJsonPretty(chat_doc, html);
 534 |   html += "</pre>
"; 535 | // String json_str; //= JSON.stringify(chat_doc); 536 | // serializeJsonPretty(chat_doc, json_str); // 文字列をシリアルポートに出力する 537 | // Serial.println(json_str); 538 | // server.send(200, "text/html", String(HEAD)+String("")+json_str+String("")); 539 | 540 | // HTMLデータをシリアルに出力する 541 | Serial.println(html); 542 | server.send(200, "text/html", html); 543 | // server.send(200, "text/plain", String("OK")); 544 | }; 545 | 546 | // 整形したJSONデータを出力するHTMLデータを作成する 547 | void handle_role_get() { 548 | 549 | String html = "
<pre>";
 550 |   serializeJsonPretty(chat_doc, html);
 551 |   html += "</pre>
"; 552 | 553 | // HTMLデータをシリアルに出力する 554 | Serial.println(html); 555 | server.send(200, "text/html", String(HEAD) + html); 556 | }; 557 | 558 | void handle_face() { 559 | String expression = server.arg("expression"); 560 | expression = expression + "\n"; 561 | Serial.println(expression); 562 | switch (expression.toInt()) 563 | { 564 | case 0: avatar.setExpression(Expression::Neutral); break; 565 | case 1: avatar.setExpression(Expression::Happy); break; 566 | case 2: avatar.setExpression(Expression::Sleepy); break; 567 | case 3: avatar.setExpression(Expression::Doubt); break; 568 | case 4: avatar.setExpression(Expression::Sad); break; 569 | case 5: avatar.setExpression(Expression::Angry); break; 570 | } 571 | server.send(200, "text/plain", String("OK")); 572 | } 573 | 574 | void handle_setting() { 575 | String value = server.arg("volume"); 576 | String lang = server.arg("lang"); 577 | if(lang != "") LANG_CODE = lang; 578 | // volume = volume + "\n"; 579 | Serial.println(value); 580 | if(value == "") value = "180"; 581 | size_t volume = value.toInt(); 582 | { 583 | uint32_t nvs_handle; 584 | if (ESP_OK == nvs_open("setting", NVS_READWRITE, &nvs_handle)) { 585 | if(volume > 255) volume = 255; 586 | nvs_set_u32(nvs_handle, "volume", volume); 587 | if(lang != "") nvs_set_str(nvs_handle, "lang", (char*)lang.c_str()); 588 | nvs_close(nvs_handle); 589 | } 590 | } 591 | M5.Speaker.setVolume(volume); 592 | M5.Speaker.setChannelVolume(m5spk_virtual_channel, volume); 593 | server.send(200, "text/plain", String("OK")); 594 | } 595 | 596 | /// set M5Speaker virtual channel (0-7) 597 | //static constexpr uint8_t m5spk_virtual_channel = 0; 598 | static AudioOutputM5Speaker out(&M5.Speaker, m5spk_virtual_channel); 599 | AudioGeneratorMP3 *mp3; 600 | AudioFileSourceBuffer *buff = nullptr; 601 | AudioFileSourcePROGMEM *file = nullptr; 602 | uint8_t mp3buff[1024*60]; 603 | 604 | // Called when a metadata event occurs (i.e. an ID3 tag, an ICY block, etc. 
605 | void MDCallback(void *cbData, const char *type, bool isUnicode, const char *string) 606 | { 607 | const char *ptr = reinterpret_cast<const char *>(cbData); 608 | (void) isUnicode; // Punt this ball for now 609 | // Note that the type and string may be in PROGMEM, so copy them to RAM for printf 610 | char s1[32], s2[64]; 611 | strncpy_P(s1, type, sizeof(s1)); 612 | s1[sizeof(s1)-1]=0; 613 | strncpy_P(s2, string, sizeof(s2)); 614 | s2[sizeof(s2)-1]=0; 615 | Serial.printf("METADATA(%s) '%s' = '%s'\n", ptr, s1, s2); 616 | Serial.flush(); 617 | } 618 | 619 | // Called when there's a warning or error (like a buffer underflow or decode hiccup) 620 | void StatusCallback(void *cbData, int code, const char *string) 621 | { 622 | const char *ptr = reinterpret_cast<const char *>(cbData); 623 | // Note that the string may be in PROGMEM, so copy it to RAM for printf 624 | char s1[64]; 625 | strncpy_P(s1, string, sizeof(s1)); 626 | s1[sizeof(s1)-1]=0; 627 | Serial.printf("STATUS(%s) '%d' = '%s'\n", ptr, code, s1); 628 | Serial.flush(); 629 | } 630 | 631 | #ifdef USE_SERVO 632 | #define START_DEGREE_VALUE_X 90 633 | //#define START_DEGREE_VALUE_Y 90 634 | #define START_DEGREE_VALUE_Y 85 // 635 | ServoEasing servo_x; 636 | ServoEasing servo_y; 637 | #endif 638 | 639 | void lipSync(void *args) 640 | { 641 | float gazeX, gazeY; 642 | int level = 0; 643 | DriveContext *ctx = (DriveContext *)args; 644 | Avatar *avatar = ctx->getAvatar(); 645 | for (;;) 646 | { 647 | level = abs(*out.getBuffer()); 648 | if(level<100) level = 0; 649 | if(level > 15000) 650 | { 651 | level = 15000; 652 | } 653 | float open = (float)level/15000.0; 654 | avatar->setMouthOpenRatio(open); 655 | avatar->getGaze(&gazeY, &gazeX); 656 | avatar->setRotation(gazeX * 5); 657 | delay(50); 658 | } 659 | } 660 | 661 | bool servo_home = false; 662 | 663 | void servo(void *args) 664 | { 665 | float gazeX, gazeY; 666 | DriveContext *ctx = (DriveContext *)args; 667 | Avatar *avatar = ctx->getAvatar(); 668 | for (;;) 669 | { 670 | #ifdef 
USE_SERVO 671 | if(!servo_home) 672 | { 673 | avatar->getGaze(&gazeY, &gazeX); 674 | servo_x.setEaseTo(START_DEGREE_VALUE_X + (int)(15.0 * gazeX)); 675 | if(gazeY < 0) { 676 | int tmp = (int)(10.0 * gazeY); 677 | if(tmp > 10) tmp = 10; 678 | servo_y.setEaseTo(START_DEGREE_VALUE_Y + tmp); 679 | } else { 680 | servo_y.setEaseTo(START_DEGREE_VALUE_Y + (int)(10.0 * gazeY)); 681 | } 682 | } else { 683 | // avatar->setRotation(gazeX * 5); 684 | // float b = avatar->getBreath(); 685 | servo_x.setEaseTo(START_DEGREE_VALUE_X); 686 | // servo_y.setEaseTo(START_DEGREE_VALUE_Y + b * 5); 687 | servo_y.setEaseTo(START_DEGREE_VALUE_Y); 688 | } 689 | synchronizeAllServosStartAndWaitForAllServosToStop(); 690 | #endif 691 | delay(50); 692 | } 693 | } 694 | 695 | void Servo_setup() { 696 | #ifdef USE_SERVO 697 | if (servo_x.attach(SERVO_PIN_X, START_DEGREE_VALUE_X, DEFAULT_MICROSECONDS_FOR_0_DEGREE, DEFAULT_MICROSECONDS_FOR_180_DEGREE)) { 698 | Serial.print("Error attaching servo x"); 699 | } 700 | if (servo_y.attach(SERVO_PIN_Y, START_DEGREE_VALUE_Y, DEFAULT_MICROSECONDS_FOR_0_DEGREE, DEFAULT_MICROSECONDS_FOR_180_DEGREE)) { 701 | Serial.print("Error attaching servo y"); 702 | } 703 | servo_x.setEasingType(EASE_QUADRATIC_IN_OUT); 704 | servo_y.setEasingType(EASE_QUADRATIC_IN_OUT); 705 | setSpeedForAllServos(30); 706 | 707 | servo_x.setEaseTo(START_DEGREE_VALUE_X); 708 | servo_y.setEaseTo(START_DEGREE_VALUE_Y); 709 | synchronizeAllServosStartAndWaitForAllServosToStop(); 710 | #endif 711 | } 712 | 713 | void google_tts(char *text, char *lang) { 714 | Serial.println("tts Start"); 715 | String link = "http" + tts.getSpeechUrl(text, lang).substring(5); 716 | Serial.println(link); 717 | 718 | http.begin(client, link); 719 | http.setReuse(true); 720 | int code = http.GET(); 721 | if (code != HTTP_CODE_OK) { 722 | http.end(); 723 | // cb.st(STATUS_HTTPFAIL, PSTR("Can't open HTTP request")); 724 | return ; 725 | } 726 | WiFiClient *ttsclient = http.getStreamPtr(); 727 | ttsclient->setTimeout( 
10000 ); 728 | if (ttsclient->available() > 0) { 729 | int i = 0; 730 | int len = sizeof(mp3buff); 731 | int count = 0; 732 | 733 | bool data_end = false; 734 | while (!data_end) { 735 | if(ttsclient->available() > 0) { 736 | 737 | int bytesread = ttsclient->read(&mp3buff[i], len); 738 | // Serial.printf("%d Bytes Read\n",bytesread); 739 | i = i + bytesread; 740 | if(i > sizeof(mp3buff)) 741 | { 742 | break; 743 | } else { 744 | len = len - bytesread; 745 | if(len <= 0) break; 746 | } 747 | 748 | } 749 | { 750 | Serial.printf(" %d Bytes Read\n",i); 751 | int lastms = millis(); 752 | data_end = true; 753 | while (millis()-lastms < 600) { //データ終わりか待ってみる 754 | if (ttsclient->available() > 0) {data_end = false; break;} 755 | yield(); 756 | } 757 | } 758 | 759 | } 760 | 761 | Serial.printf("Total %d Bytes Read\n",i); 762 | ttsclient->stop(); 763 | http.end(); 764 | file = new AudioFileSourcePROGMEM(mp3buff, i); 765 | mp3->begin(file, &out); 766 | } 767 | } 768 | 769 | struct box_t 770 | { 771 | int x; 772 | int y; 773 | int w; 774 | int h; 775 | int touch_id = -1; 776 | 777 | void setupBox(int x, int y, int w, int h) { 778 | this->x = x; 779 | this->y = y; 780 | this->w = w; 781 | this->h = h; 782 | } 783 | bool contain(int x, int y) 784 | { 785 | return this->x <= x && x < (this->x + this->w) 786 | && this->y <= y && y < (this->y + this->h); 787 | } 788 | }; 789 | static box_t box_servo; 790 | static box_t box_stt; 791 | 792 | void Wifi_setup() { 793 | // 前回接続時情報で接続する 794 | while (WiFi.status() != WL_CONNECTED) { 795 | M5.Display.print("."); 796 | Serial.print("."); 797 | delay(500); 798 | // 10秒以上接続できなかったら抜ける 799 | if ( 10000 < millis() ) { 800 | break; 801 | } 802 | } 803 | M5.Display.println(""); 804 | Serial.println(""); 805 | // 未接続の場合にはSmartConfig待受 806 | if ( WiFi.status() != WL_CONNECTED ) { 807 | WiFi.mode(WIFI_STA); 808 | WiFi.beginSmartConfig(); 809 | M5.Display.println("Waiting for SmartConfig"); 810 | Serial.println("Waiting for SmartConfig"); 811 | while 
(!WiFi.smartConfigDone()) { 812 | delay(500); 813 | M5.Display.print("#"); 814 | Serial.print("#"); 815 | // 30秒以上接続できなかったら抜ける 816 | if ( 30000 < millis() ) { 817 | Serial.println(""); 818 | Serial.println("Reset"); 819 | ESP.restart(); 820 | } 821 | } 822 | // Wi-fi接続 823 | M5.Display.println(""); 824 | Serial.println(""); 825 | M5.Display.println("Waiting for WiFi"); 826 | Serial.println("Waiting for WiFi"); 827 | while (WiFi.status() != WL_CONNECTED) { 828 | delay(500); 829 | M5.Display.print("."); 830 | Serial.print("."); 831 | // 60秒以上接続できなかったら抜ける 832 | if ( 60000 < millis() ) { 833 | Serial.println(""); 834 | Serial.println("Reset"); 835 | ESP.restart(); 836 | } 837 | } 838 | } 839 | } 840 | 841 | // void info_spiffs(){ 842 | // FSInfo fs_info; 843 | // SPIFFS.info(fs_info); 844 | // Serial.print("SPIFFS Total bytes: "); 845 | // Serial.println(fs_info.totalBytes); 846 | // Serial.print("SPIFFS Used bytes: "); 847 | // Serial.println(fs_info.usedBytes); 848 | // Serial.print("SPIFFS Free bytes: "); 849 | // Serial.println(fs_info.totalBytes - fs_info.usedBytes); 850 | // } 851 | 852 | void setup() 853 | { 854 | auto cfg = M5.config(); 855 | 856 | cfg.external_spk = true; /// use external speaker (SPK HAT / ATOMIC SPK) 857 | //cfg.external_spk_detail.omit_atomic_spk = true; // exclude ATOMIC SPK 858 | //cfg.external_spk_detail.omit_spk_hat = true; // exclude SPK HAT 859 | cfg.internal_mic = true; 860 | 861 | M5.begin(cfg); 862 | 863 | { /// custom setting 864 | auto spk_cfg = M5.Speaker.config(); 865 | /// Increasing the sample_rate will improve the sound quality instead of increasing the CPU load. 866 | spk_cfg.sample_rate = 96000; // default:64000 (64kHz) e.g. 
48000 , 50000 , 80000 , 96000 , 100000 , 128000 , 144000 , 192000 , 200000 867 | spk_cfg.task_pinned_core = APP_CPU_NUM; 868 | M5.Speaker.config(spk_cfg); 869 | } 870 | M5.Speaker.begin(); 871 | 872 | Servo_setup(); 873 | M5.Lcd.setTextSize(2); 874 | Serial.println("Connecting to WiFi"); 875 | WiFi.disconnect(); 876 | WiFi.softAPdisconnect(true); 877 | WiFi.mode(WIFI_STA); 878 | #ifndef USE_SDCARD 879 | WiFi.begin(WIFI_SSID, WIFI_PASS); 880 | OPENAI_API_KEY = String(OPENAI_APIKEY); 881 | GOOGLE_API_KEY = String(GOOGL_APIKEY); 882 | #else 883 | /// settings 884 | if (SD.begin(GPIO_NUM_4, SPI, 25000000)) { 885 | /// wifi 886 | auto fs = SD.open("/wifi.txt", FILE_READ); 887 | if(fs) { 888 | size_t sz = fs.size(); 889 | char buf[sz + 1]; 890 | fs.read((uint8_t*)buf, sz); 891 | buf[sz] = 0; 892 | fs.close(); 893 | 894 | int y = 0; 895 | for(int x = 0; x < sz; x++) { 896 | if(buf[x] == 0x0a || buf[x] == 0x0d) 897 | buf[x] = 0; 898 | else if (!y && x > 0 && !buf[x - 1] && buf[x]) 899 | y = x; 900 | } 901 | WiFi.begin(buf, &buf[y]); 902 | } else { 903 | WiFi.begin(); 904 | } 905 | 906 | uint32_t nvs_handle; 907 | if (ESP_OK == nvs_open("apikey", NVS_READWRITE, &nvs_handle)) { 908 | /// radiko-premium 909 | fs = SD.open("/apikey.txt", FILE_READ); 910 | if(fs) { 911 | size_t sz = fs.size(); 912 | char buf[sz + 1]; 913 | fs.read((uint8_t*)buf, sz); 914 | buf[sz] = 0; 915 | fs.close(); 916 | int y = 0; 917 | for(int x = 0; x < sz; x++) { 918 | if(buf[x] == 0x0a || buf[x] == 0x0d) 919 | buf[x] = 0; 920 | else if (!y && x > 0 && !buf[x - 1] && buf[x]) 921 | y = x; 922 | } 923 | nvs_set_str(nvs_handle, "openai", buf); 924 | nvs_set_str(nvs_handle, "google", &buf[y]); 925 | Serial.println(buf); 926 | Serial.println(&buf[y]); 927 | } 928 | 929 | nvs_close(nvs_handle); 930 | } 931 | SD.end(); 932 | } else { 933 | WiFi.begin(); 934 | } 935 | 936 | { 937 | uint32_t nvs_handle; 938 | if (ESP_OK == nvs_open("apikey", NVS_READONLY, &nvs_handle)) { 939 | Serial.println("nvs_open"); 940 | 
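// The API-key reads below use the usual two-pass nvs_get_str() pattern: a
// first call with a null output buffer only reports the required length
// (which already includes the terminating NUL), then a buffer of that size
// is allocated and a second call fills it. A minimal sketch of the pattern
// (hypothetical key name, for illustration only):
//
//   size_t len = 0;
//   if (ESP_OK == nvs_get_str(nvs_handle, "openai", nullptr, &len) && len) {
//     char buf[len];                       // len already counts the NUL
//     nvs_get_str(nvs_handle, "openai", buf, &len);
//   }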
941 | size_t length1; 942 | size_t length2; 943 | if(ESP_OK == nvs_get_str(nvs_handle, "openai", nullptr, &length1) && ESP_OK == nvs_get_str(nvs_handle, "google", nullptr, &length2) && length1 && length2) { 944 | Serial.println("nvs_get_str"); 945 | char openai_apikey[length1 + 1]; 946 | char google_apikey[length2 + 1]; 947 | if(ESP_OK == nvs_get_str(nvs_handle, "openai", openai_apikey, &length1) && ESP_OK == nvs_get_str(nvs_handle, "google", google_apikey, &length2)) { 948 | OPENAI_API_KEY = String(openai_apikey); 949 | GOOGLE_API_KEY = String(google_apikey); 950 | Serial.println(OPENAI_API_KEY); 951 | Serial.println(GOOGLE_API_KEY); 952 | } 953 | } 954 | nvs_close(nvs_handle); 955 | } 956 | } 957 | 958 | #endif 959 | { 960 | uint32_t nvs_handle; 961 | if (ESP_OK == nvs_open("setting", NVS_READONLY, &nvs_handle)) { 962 | size_t volume = 180; 963 | nvs_get_u32(nvs_handle, "volume", &volume); 964 | if(volume > 255) volume = 255; 965 | M5.Speaker.setVolume(volume); 966 | M5.Speaker.setChannelVolume(m5spk_virtual_channel, volume); 967 | 968 | size_t length1; 969 | if(ESP_OK == nvs_get_str(nvs_handle, "lang", nullptr, &length1) && length1) { 970 | Serial.println("nvs_get_str"); 971 | char google_lang[length1 + 1]; 972 | if(ESP_OK == nvs_get_str(nvs_handle, "lang", google_lang, &length1)) { 973 | LANG_CODE = String(google_lang); 974 | Serial.println(LANG_CODE); 975 | } 976 | } 977 | nvs_close(nvs_handle); 978 | } else { 979 | if (ESP_OK == nvs_open("setting", NVS_READWRITE, &nvs_handle)) { 980 | size_t volume = 180; 981 | // LANG_CODE = "en-US"; 982 | nvs_set_u32(nvs_handle, "volume", volume); 983 | nvs_set_str(nvs_handle, "lang", (char*)LANG_CODE.c_str()); 984 | nvs_close(nvs_handle); 985 | M5.Speaker.setVolume(volume); 986 | M5.Speaker.setChannelVolume(m5spk_virtual_channel, volume); 987 | 988 | } 989 | } 990 | } 991 | 992 | M5.Lcd.print("Connecting"); 993 | Wifi_setup(); 994 | M5.Lcd.println("\nConnected"); 995 | Serial.printf_P(PSTR("Go to 
http://")); 996 | M5.Lcd.print("Go to http://"); 997 | Serial.println(WiFi.localIP()); 998 | M5.Lcd.println(WiFi.localIP()); 999 | 1000 | if (MDNS.begin("m5stack")) { 1001 | Serial.println("MDNS responder started"); 1002 | M5.Lcd.println("MDNS responder started"); 1003 | } 1004 | delay(1000); 1005 | server.on("/", handleRoot); 1006 | 1007 | server.on("/inline", [](){ 1008 | server.send(200, "text/plain", "this works as well"); 1009 | }); 1010 | 1011 | // And as regular external functions: 1012 | server.on("/speech", handle_speech); 1013 | server.on("/face", handle_face); 1014 | server.on("/chat", handle_chat); 1015 | server.on("/apikey", handle_apikey); 1016 | server.on("/setting", handle_setting); 1017 | server.on("/apikey_set", HTTP_POST, handle_apikey_set); 1018 | server.on("/role", handle_role); 1019 | server.on("/role_set", HTTP_POST, handle_role_set); 1020 | server.on("/role_get", handle_role_get); 1021 | server.onNotFound(handleNotFound); 1022 | 1023 | init_chat_doc(json_ChatString.c_str()); 1024 | // SPIFFSをマウントする 1025 | if(SPIFFS.begin(true)){ 1026 | // JSONファイルを開く 1027 | File file = SPIFFS.open("/data.json", "r"); 1028 | if(file){ 1029 | DeserializationError error = deserializeJson(chat_doc, file); 1030 | if(error){ 1031 | Serial.println("Failed to deserialize JSON"); 1032 | init_chat_doc(json_ChatString.c_str()); 1033 | } 1034 | serializeJson(chat_doc, InitBuffer); 1035 | Role_JSON = InitBuffer; 1036 | String json_str; 1037 | serializeJsonPretty(chat_doc, json_str); // 文字列をシリアルポートに出力する 1038 | Serial.println(json_str); 1039 | } else { 1040 | Serial.println("Failed to open file for reading"); 1041 | init_chat_doc(json_ChatString.c_str()); 1042 | } 1043 | } else { 1044 | Serial.println("An Error has occurred while mounting SPIFFS"); 1045 | } 1046 | 1047 | server.begin(); 1048 | Serial.println("HTTP server started"); 1049 | M5.Lcd.println("HTTP server started"); 1050 | 1051 | Serial.printf_P(PSTR("/ to control the chatGpt Server.\n")); 1052 | M5.Lcd.print("/ 
to control the chatGpt Server.\n"); 1053 | delay(3000); 1054 | 1055 | audioLogger = &Serial; 1056 | mp3 = new AudioGeneratorMP3(); 1057 | // mp3->RegisterStatusCB(StatusCallback, (void*)"mp3"); 1058 | 1059 | #ifdef USE_DOGFACE 1060 | static Face* face = new DogFace(); 1061 | static ColorPalette* cp = new ColorPalette(); 1062 | cp->set(COLOR_PRIMARY, TFT_BLACK); //AtaruFace 1063 | cp->set(COLOR_SECONDARY, TFT_WHITE); 1064 | cp->set(COLOR_BACKGROUND, TFT_WHITE); 1065 | avatar.setFace(face); 1066 | avatar.setColorPalette(*cp); 1067 | avatar.init(8); //Color Depth8 1068 | #else 1069 | avatar.init(); 1070 | #endif 1071 | avatar.addTask(lipSync, "lipSync"); 1072 | avatar.addTask(servo, "servo"); 1073 | avatar.setSpeechFont(&fonts::efontJA_16); 1074 | box_servo.setupBox(80, 120, 80, 80); 1075 | box_stt.setupBox(0, 0, M5.Display.width(), 60); 1076 | randomSeed(millis()); 1077 | } 1078 | 1079 | String keywords[] = {"(Neutral)", "(Happy)", "(Sleepy)", "(Doubt)", "(Sad)", "(Angry)"}; 1080 | void addPeriodBeforeKeyword(String &input, String keywords[], int numKeywords) { 1081 | int prevIndex = 0; 1082 | for (int i = 0; i < numKeywords; i++) { 1083 | int index = input.indexOf(keywords[i]); 1084 | while (index != -1) { 1085 | if(LANG_CODE == "ja-JP") { 1086 | if (index > 0 && input.charAt(index-1) != '。') { 1087 | input = input.substring(0, index) + "。" + input.substring(index); 1088 | } 1089 | } else { 1090 | if (index > 0 && input.charAt(index-1) != '.') { 1091 | input = input.substring(0, index) + "." 
+ input.substring(index); 1092 | } 1093 | } 1094 | prevIndex = index + keywords[i].length() + 1; // update prevIndex to after the keyword and period 1095 | index = input.indexOf(keywords[i], prevIndex); 1096 | } 1097 | } 1098 | // Serial.println(input); 1099 | } 1100 | 1101 | int expressionIndx = -1; 1102 | String expressionString[] = {"Neutral","Happy","Sleepy","Doubt","Sad","Angry",""}; 1103 | String emotion_parms[]= { 1104 | "&emotion_level=2&emotion=happiness", 1105 | "&emotion_level=3&emotion=happiness", 1106 | "&emotion_level=2&emotion=sadness", 1107 | "&emotion_level=1&emotion=sadness", 1108 | "&emotion_level=4&emotion=sadness", 1109 | "&emotion_level=4&emotion=anger"}; 1110 | int tts_emotion_no = 0; 1111 | //String random_words[18] = {"あなたは誰","楽しい","怒った","可愛い","悲しい","眠い","ジョークを言って","泣きたい","怒ったぞ","こんにちは","お疲れ様","詩を書いて","疲れた","お腹空いた","嫌いだ","苦しい","俳句を作って","歌をうたって"}; 1112 | String random_words[18] = {"楽しい","お菓子食べたい","あなたは誰","可愛い","悲しい","眠い","ジョークを言って","賢いね","笑い","こんにちは","お疲れ様","詩を書いて","疲れた","お腹空いた","大好き","気持ち良いね","俳句を作って","歌をうたって"}; 1113 | int random_time = -1; 1114 | bool random_speak = true; 1115 | 1116 | void getExpression(String &sentence, int &expressionIndx){ 1117 | Serial.println("sentence="+sentence); 1118 | int startIndex = sentence.indexOf("("); 1119 | if(startIndex >= 0) { 1120 | int endIndex = sentence.indexOf(")", startIndex); 1121 | if(endIndex > 0) { 1122 | String extractedString = sentence.substring(startIndex + 1, endIndex); // 括弧を含まない部分文字列を抽出 1123 | // Serial.println("extractedString="+extractedString); 1124 | sentence.remove(startIndex, endIndex - startIndex + 1); // 括弧を含む部分文字列を削除 1125 | // Serial.println("sentence="+sentence); 1126 | if(extractedString != "") { 1127 | expressionIndx = 0; 1128 | while(1) { 1129 | if(expressionString[expressionIndx] == extractedString) 1130 | { 1131 | avatar.setExpression(expressions_table[expressionIndx]); 1132 | break; 1133 | } 1134 | if(expressionString[expressionIndx] == "") { 1135 | expressionIndx = -1; 
1136 | break; 1137 | } 1138 | expressionIndx++; 1139 | } 1140 | } else { 1141 | expressionIndx = -1; 1142 | } 1143 | } 1144 | } 1145 | } 1146 | 1147 | String separator_tbl[2][7] = {{"。","?","!","、",""," ",""},{":",",",".","?","!","\n",""}}; 1148 | 1149 | int search_separator(String text, int tbl){ 1150 | int i = 0; 1151 | int dotIndex_min = 1000; 1152 | int dotIndex; 1153 | while(separator_tbl[tbl][i] != ""){ 1154 | dotIndex = text.indexOf(separator_tbl[tbl][i++]); 1155 | if((dotIndex != -1)&&(dotIndex < dotIndex_min)) dotIndex_min = dotIndex; 1156 | } 1157 | if(dotIndex_min == 1000) return -1; 1158 | else return dotIndex_min; 1159 | 1160 | } 1161 | // int search_separator1(String text, int tbl){ 1162 | // int i = 0; 1163 | // int dotIndex; 1164 | // while(separator_tbl[tbl][i] != ""){ 1165 | // dotIndex = text.indexOf(separator_tbl[tbl][i++]); 1166 | // if(dotIndex != -1) return dotIndex; 1167 | // } 1168 | // return -1; 1169 | // } 1170 | 1171 | void loop() 1172 | { 1173 | static int lastms = 0; 1174 | static int lastms1 = 0; 1175 | 1176 | if (random_time >= 0 && millis() - lastms1 > random_time) 1177 | { 1178 | lastms1 = millis(); 1179 | random_time = 40000 + 1000 * random(30); 1180 | if (!mp3->isRunning() && speech_text=="" && speech_text_buffer == "") { 1181 | exec_chatGPT(random_words[random(18)]); 1182 | } 1183 | } 1184 | 1185 | if (M5.BtnA.wasPressed()&&(!mp3->isRunning())) 1186 | { 1187 | M5.Speaker.tone(1000, 100); 1188 | String tmp; 1189 | String lang; 1190 | if(random_speak) { 1191 | if(LANG_CODE == "ja-JP") { 1192 | tmp = "独り言始めます。"; 1193 | lang = "ja-JP"; 1194 | } else { 1195 | tmp = "I'll start talking to myself."; 1196 | lang = "en-US"; 1197 | } 1198 | lastms1 = millis(); 1199 | random_time = 40000 + 1000 * random(30); 1200 | } else { 1201 | if(LANG_CODE == "ja-JP") { 1202 | tmp = "独り言やめます。"; 1203 | lang = "ja-JP"; 1204 | } else { 1205 | tmp = "I'll stop talking to myself."; 1206 | lang = "en-US"; 1207 | } 1208 | random_time = -1; 1209 | } 1210 | 
random_speak = !random_speak; 1211 | avatar.setExpression(Expression::Happy); 1212 | // google_tts((char*)tmp.c_str(),"ja-JP"); 1213 | // google_tts((char*)tmp.c_str(),"en-US"); 1214 | google_tts((char*)tmp.c_str(),(char*)lang.c_str()); 1215 | avatar.setExpression(Expression::Neutral); 1216 | Serial.println("mp3 begin"); 1217 | } 1218 | 1219 | // if (Serial.available()) { 1220 | // char kstr[256]; 1221 | // size_t len = Serial.readBytesUntil('\r', kstr, 256); 1222 | // kstr[len]=0; 1223 | // avatar.setExpression(Expression::Happy); 1224 | // VoiceText_tts(kstr, tts_parms2); 1225 | // avatar.setExpression(Expression::Neutral); 1226 | // } 1227 | 1228 | M5.update(); 1229 | #if defined(ARDUINO_M5STACK_Core2) 1230 | auto count = M5.Touch.getCount(); 1231 | if (count) 1232 | { 1233 | auto t = M5.Touch.getDetail(); 1234 | if (t.wasPressed()) 1235 | { 1236 | if (box_stt.contain(t.x, t.y)&&(!mp3->isRunning())) 1237 | { 1238 | M5.Speaker.tone(1000, 100); 1239 | delay(200); 1240 | bool prev_servo_home = servo_home; 1241 | random_speak = true; 1242 | random_time = -1; 1243 | #ifdef USE_SERVO 1244 | servo_home = true; 1245 | #endif 1246 | avatar.setExpression(Expression::Happy); 1247 | if(LANG_CODE == "ja-JP") { 1248 | avatar.setSpeechText("御用でしょうか?"); 1249 | }else{ 1250 | avatar.setSpeechText("May I help you?"); 1251 | } 1252 | M5.Speaker.end(); 1253 | Serial.println("\r\nRecord start!\r\n"); 1254 | Audio* audio = new Audio(); 1255 | audio->Record(); 1256 | Serial.println("Record end\r\n"); 1257 | Serial.println("音声認識開始"); 1258 | if(LANG_CODE == "ja-JP") { 1259 | avatar.setSpeechText("わかりました"); 1260 | }else{ 1261 | avatar.setSpeechText("I understand."); 1262 | } 1263 | CloudSpeechClient* cloudSpeechClient = new CloudSpeechClient(USE_APIKEY); 1264 | String ret = cloudSpeechClient->Transcribe(audio); 1265 | delete cloudSpeechClient; 1266 | delete audio; 1267 | delay(500); 1268 | #ifdef USE_SERVO 1269 | servo_home = prev_servo_home; 1270 | #endif 1271 | Serial.println("音声認識終了"); 
1272 | Serial.println("音声認識結果"); 1273 | if(ret != "") { 1274 | //M5.Lcd.println(ret); 1275 | if (!mp3->isRunning() && speech_text=="" && speech_text_buffer == "") { 1276 | exec_chatGPT(ret); 1277 | } 1278 | } else { 1279 | Serial.println("音声認識失敗"); 1280 | avatar.setExpression(Expression::Sad); 1281 | if(LANG_CODE == "ja-JP") { 1282 | avatar.setSpeechText("聞き取れませんでした"); 1283 | } else { 1284 | avatar.setSpeechText("I didn't hear it."); 1285 | //response = "I don't understand."; 1286 | } 1287 | delay(2000); 1288 | avatar.setSpeechText(""); 1289 | avatar.setExpression(Expression::Neutral); 1290 | } 1291 | M5.Speaker.begin(); 1292 | } 1293 | #ifdef USE_SERVO 1294 | if (box_servo.contain(t.x, t.y)) 1295 | { 1296 | servo_home = !servo_home; 1297 | M5.Speaker.tone(1000, 100); 1298 | } 1299 | #endif 1300 | } 1301 | } 1302 | #endif 1303 | 1304 | if (M5.BtnC.wasPressed()) 1305 | { 1306 | M5.Speaker.tone(1000, 100); 1307 | avatar.setExpression(Expression::Happy); 1308 | if(LANG_CODE == "ja-JP") { 1309 | google_tts(text1,"ja-JP"); 1310 | } else { 1311 | google_tts(text2,"en-US"); 1312 | } 1313 | avatar.setExpression(Expression::Neutral); 1314 | Serial.println("mp3 begin"); 1315 | } 1316 | 1317 | if(speech_text != ""){ 1318 | speech_text_buffer = speech_text; 1319 | speech_text = ""; 1320 | addPeriodBeforeKeyword(speech_text_buffer, keywords, 6); 1321 | Serial.println("-----------------------------"); 1322 | Serial.println(speech_text_buffer); 1323 | //--------------------------------- 1324 | String sentence = speech_text_buffer; 1325 | int dotIndex; 1326 | if(LANG_CODE == "ja-JP") { 1327 | dotIndex = search_separator(speech_text_buffer, 0); 1328 | //dotIndex = speech_text_buffer.indexOf("。"); 1329 | } else { 1330 | dotIndex = search_separator(speech_text_buffer, 1); 1331 | //dotIndex =speech_text_buffer.indexOf("."); 1332 | } 1333 | if (dotIndex != -1) { 1334 | if(LANG_CODE == "ja-JP") { 1335 | dotIndex += 3; 1336 | }else{ 1337 | dotIndex += 2; 1338 | } 1339 | sentence = 
speech_text_buffer.substring(0, dotIndex); 1340 | Serial.println(sentence); 1341 | speech_text_buffer = speech_text_buffer.substring(dotIndex); 1342 | }else{ 1343 | speech_text_buffer = ""; 1344 | } 1345 | //---------------- 1346 | getExpression(sentence, expressionIndx); 1347 | //---------------- 1348 | if(expressionIndx < 0) avatar.setExpression(Expression::Happy); 1349 | google_tts((char*)sentence.c_str(), (char*)LANG_CODE.c_str()); 1350 | if(expressionIndx < 0) avatar.setExpression(Expression::Neutral); 1351 | } 1352 | 1353 | if (mp3->isRunning()) { 1354 | // if (millis()-lastms > 1000) { 1355 | // lastms = millis(); 1356 | // Serial.printf("Running for %d ms...\n", lastms); 1357 | // Serial.flush(); 1358 | // } 1359 | if (!mp3->loop()) { 1360 | mp3->stop(); 1361 | if(file != nullptr){delete file; file = nullptr;} 1362 | Serial.println("mp3 stop"); 1363 | // avatar.setExpression(Expression::Neutral); 1364 | if(speech_text_buffer != ""){ 1365 | String sentence = speech_text_buffer; 1366 | int dotIndex; 1367 | if(LANG_CODE == "ja-JP") { 1368 | dotIndex = search_separator(speech_text_buffer, 0); 1369 | //dotIndex =speech_text_buffer.indexOf("。"); 1370 | } else { 1371 | dotIndex = search_separator(speech_text_buffer, 1); 1372 | //dotIndex =speech_text_buffer.indexOf("."); 1373 | } 1374 | if (dotIndex != -1) { 1375 | if(LANG_CODE == "ja-JP") { 1376 | dotIndex += 3; 1377 | }else{ 1378 | dotIndex += 2; 1379 | } 1380 | sentence = speech_text_buffer.substring(0, dotIndex); 1381 | Serial.println(sentence); 1382 | speech_text_buffer = speech_text_buffer.substring(dotIndex); 1383 | }else{ 1384 | speech_text_buffer = ""; 1385 | } 1386 | //---------------- 1387 | getExpression(sentence, expressionIndx); 1388 | //---------------- 1389 | if(expressionIndx < 0) avatar.setExpression(Expression::Happy); 1390 | google_tts((char*)sentence.c_str(), (char*)LANG_CODE.c_str()); 1391 | if(expressionIndx < 0) avatar.setExpression(Expression::Neutral); 1392 | } else { 1393 | 
avatar.setExpression(Expression::Neutral); 1394 | expressionIndx = -1; 1395 | } 1396 | } 1397 | } else { 1398 | server.handleClient(); 1399 | } 1400 | //delay(100); 1401 | } 1402 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/src/network_param.h: -------------------------------------------------------------------------------- 1 | #ifndef _NETWORK_PARAM_H 2 | #define _NETWORK_PARAM_H 3 | 4 | const char* server_stt = "speech.googleapis.com"; 5 | 6 | // To get the certificate for your region run: 7 | // openssl s_client -showcerts -connect speech.googleapis.com:443 8 | // Copy the certificate (all lines between and including ---BEGIN CERTIFICATE--- 9 | // and --END CERTIFICATE--) to root.cert and put here on the root_cert variable. 10 | // certificate for https://speech.googleapis.com 11 | // GlobalSign Root CA, valid until Fri Jan 28 2028, size: 1927 bytes 12 | const char* root_ca = \ 13 | "-----BEGIN CERTIFICATE-----\n" \ 14 | "MIIFYjCCBEqgAwIBAgIQd70NbNs2+RrqIQ/E8FjTDTANBgkqhkiG9w0BAQsFADBX\n" \ 15 | "MQswCQYDVQQGEwJCRTEZMBcGA1UEChMQR2xvYmFsU2lnbiBudi1zYTEQMA4GA1UE\n" \ 16 | "CxMHUm9vdCBDQTEbMBkGA1UEAxMSR2xvYmFsU2lnbiBSb290IENBMB4XDTIwMDYx\n" \ 17 | "OTAwMDA0MloXDTI4MDEyODAwMDA0MlowRzELMAkGA1UEBhMCVVMxIjAgBgNVBAoT\n" \ 18 | "GUdvb2dsZSBUcnVzdCBTZXJ2aWNlcyBMTEMxFDASBgNVBAMTC0dUUyBSb290IFIx\n" \ 19 | "MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAthECix7joXebO9y/lD63\n" \ 20 | "ladAPKH9gvl9MgaCcfb2jH/76Nu8ai6Xl6OMS/kr9rH5zoQdsfnFl97vufKj6bwS\n" \ 21 | "iV6nqlKr+CMny6SxnGPb15l+8Ape62im9MZaRw1NEDPjTrETo8gYbEvs/AmQ351k\n" \ 22 | "KSUjB6G00j0uYODP0gmHu81I8E3CwnqIiru6z1kZ1q+PsAewnjHxgsHA3y6mbWwZ\n" \ 23 | "DrXYfiYaRQM9sHmklCitD38m5agI/pboPGiUU+6DOogrFZYJsuB6jC511pzrp1Zk\n" \ 24 | "j5ZPaK49l8KEj8C8QMALXL32h7M1bKwYUH+E4EzNktMg6TO8UpmvMrUpsyUqtEj5\n" \ 25 | "cuHKZPfmghCN6J3Cioj6OGaK/GP5Afl4/Xtcd/p2h/rs37EOeZVXtL0m79YB0esW\n" \ 26 | "CruOC7XFxYpVq9Os6pFLKcwZpDIlTirxZUTQAs6qzkm06p98g7BAe+dDq6dso499\n" \ 27 | 
"iYH6TKX/1Y7DzkvgtdizjkXPdsDtQCv9Uw+wp9U7DbGKogPeMa3Md+pvez7W35Ei\n" \ 28 | "Eua++tgy/BBjFFFy3l3WFpO9KWgz7zpm7AeKJt8T11dleCfeXkkUAKIAf5qoIbap\n" \ 29 | "sZWwpbkNFhHax2xIPEDgfg1azVY80ZcFuctL7TlLnMQ/0lUTbiSw1nH69MG6zO0b\n" \ 30 | "9f6BQdgAmD06yK56mDcYBZUCAwEAAaOCATgwggE0MA4GA1UdDwEB/wQEAwIBhjAP\n" \ 31 | "BgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBTkrysmcRorSCeFL1JmLO/wiRNxPjAf\n" \ 32 | "BgNVHSMEGDAWgBRge2YaRQ2XyolQL30EzTSo//z9SzBgBggrBgEFBQcBAQRUMFIw\n" \ 33 | "JQYIKwYBBQUHMAGGGWh0dHA6Ly9vY3NwLnBraS5nb29nL2dzcjEwKQYIKwYBBQUH\n" \ 34 | "MAKGHWh0dHA6Ly9wa2kuZ29vZy9nc3IxL2dzcjEuY3J0MDIGA1UdHwQrMCkwJ6Al\n" \ 35 | "oCOGIWh0dHA6Ly9jcmwucGtpLmdvb2cvZ3NyMS9nc3IxLmNybDA7BgNVHSAENDAy\n" \ 36 | "MAgGBmeBDAECATAIBgZngQwBAgIwDQYLKwYBBAHWeQIFAwIwDQYLKwYBBAHWeQIF\n" \ 37 | "AwMwDQYJKoZIhvcNAQELBQADggEBADSkHrEoo9C0dhemMXoh6dFSPsjbdBZBiLg9\n" \ 38 | "NR3t5P+T4Vxfq7vqfM/b5A3Ri1fyJm9bvhdGaJQ3b2t6yMAYN/olUazsaL+yyEn9\n" \ 39 | "WprKASOshIArAoyZl+tJaox118fessmXn1hIVw41oeQa1v1vg4Fv74zPl6/AhSrw\n" \ 40 | "9U5pCZEt4Wi4wStz6dTZ/CLANx8LZh1J7QJVj2fhMtfTJr9w4z30Z209fOU0iOMy\n" \ 41 | "+qduBmpvvYuR7hZL6Dupszfnw0Skfths18dG9ZKb59UhvmaSGZRVbNQpsg3BZlvi\n" \ 42 | "d0lIKO2d1xozclOzgjXPYovJJIultzkMu34qQb9Sz/yilrbCgj8=\n" \ 43 | "-----END CERTIFICATE-----\n" \ 44 | ""; 45 | 46 | // Getting Access Token : 47 | // At first, you should get service account key (JSON file). 48 | // Type below command in Google Cloud Shell to get AccessToken: 49 | // $ gcloud auth activate-service-account --key-file=KEY_FILE (KEY_FILE is your service account key file) 50 | // $ gcloud auth print-access-token 51 | // The Access Token is expired in an hour. 52 | // Google recommends to use Access Token. 53 | const String AccessToken = ""; 54 | 55 | // It is also possible to use "API Key" instead of "Access Token". It doesn't have time limit. 
56 | //const String ApiKey = ""; 57 | String GOOGLE_API_KEY = ""; 58 | extern String LANG_CODE; 59 | 60 | // see https://cloud.google.com/docs/authentication?hl=ja#getting_credentials_for_server-centric_flow 61 | // see https://qiita.com/basi/items/3623a576b754f738138e (Japanese) 62 | 63 | #endif // _NETWORK_PARAM_H 64 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/src/rootCACertificate.h: -------------------------------------------------------------------------------- 1 | // certificate for https://api.openai.com 2 | // Baltimore CyberTrust Root, valid until Wed Jan 01 2025, size: 1379 bytes 3 | const char* root_ca_openai = \ 4 | "-----BEGIN CERTIFICATE-----\n" \ 5 | "MIIDzTCCArWgAwIBAgIQCjeHZF5ftIwiTv0b7RQMPDANBgkqhkiG9w0BAQsFADBa\n" \ 6 | "MQswCQYDVQQGEwJJRTESMBAGA1UEChMJQmFsdGltb3JlMRMwEQYDVQQLEwpDeWJl\n" \ 7 | "clRydXN0MSIwIAYDVQQDExlCYWx0aW1vcmUgQ3liZXJUcnVzdCBSb290MB4XDTIw\n" \ 8 | "MDEyNzEyNDgwOFoXDTI0MTIzMTIzNTk1OVowSjELMAkGA1UEBhMCVVMxGTAXBgNV\n" \ 9 | "BAoTEENsb3VkZmxhcmUsIEluYy4xIDAeBgNVBAMTF0Nsb3VkZmxhcmUgSW5jIEVD\n" \ 10 | "QyBDQS0zMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEua1NZpkUC0bsH4HRKlAe\n" \ 11 | "nQMVLzQSfS2WuIg4m4Vfj7+7Te9hRsTJc9QkT+DuHM5ss1FxL2ruTAUJd9NyYqSb\n" \ 12 | "16OCAWgwggFkMB0GA1UdDgQWBBSlzjfq67B1DpRniLRF+tkkEIeWHzAfBgNVHSME\n" \ 13 | "GDAWgBTlnVkwgkdYzKz6CFQ2hns6tQRN8DAOBgNVHQ8BAf8EBAMCAYYwHQYDVR0l\n" \ 14 | "BBYwFAYIKwYBBQUHAwEGCCsGAQUFBwMCMBIGA1UdEwEB/wQIMAYBAf8CAQAwNAYI\n" \ 15 | "KwYBBQUHAQEEKDAmMCQGCCsGAQUFBzABhhhodHRwOi8vb2NzcC5kaWdpY2VydC5j\n" \ 16 | "b20wOgYDVR0fBDMwMTAvoC2gK4YpaHR0cDovL2NybDMuZGlnaWNlcnQuY29tL09t\n" \ 17 | "bmlyb290MjAyNS5jcmwwbQYDVR0gBGYwZDA3BglghkgBhv1sAQEwKjAoBggrBgEF\n" \ 18 | "BQcCARYcaHR0cHM6Ly93d3cuZGlnaWNlcnQuY29tL0NQUzALBglghkgBhv1sAQIw\n" \ 19 | "CAYGZ4EMAQIBMAgGBmeBDAECAjAIBgZngQwBAgMwDQYJKoZIhvcNAQELBQADggEB\n" \ 20 | "AAUkHd0bsCrrmNaF4zlNXmtXnYJX/OvoMaJXkGUFvhZEOFp3ArnPEELG4ZKk40Un\n" \ 21 | 
"+ABHLGioVplTVI+tnkDB0A+21w0LOEhsUCxJkAZbZB2LzEgwLt4I4ptJIsCSDBFe\n" \ 22 | "lpKU1fwg3FZs5ZKTv3ocwDfjhUkV+ivhdDkYD7fa86JXWGBPzI6UAPxGezQxPk1H\n" \ 23 | "goE6y/SJXQ7vTQ1unBuCJN0yJV0ReFEQPaA1IwQvZW+cwdFD19Ae8zFnWSfda9J1\n" \ 24 | "CZMRJCQUzym+5iPDuI9yP+kHyCREU3qzuWFloUwOxkgAyXVjBYdwRVKD05WdRerw\n" \ 25 | "6DEdfgkfCv4+3ao8XnTSrLE=\n" \ 26 | "-----END CERTIFICATE-----\n" \ 27 | ""; 28 | -------------------------------------------------------------------------------- /M5Unified_StackChan_ChatGPT_Google/test/README: -------------------------------------------------------------------------------- 1 | 2 | This directory is intended for PlatformIO Test Runner and project tests. 3 | 4 | Unit Testing is a software testing method by which individual units of 5 | source code, sets of one or more MCU program modules together with associated 6 | control data, usage procedures, and operating procedures, are tested to 7 | determine whether they are fit for use. Unit testing finds problems early 8 | in the development cycle. 9 | 10 | More information about PlatformIO Unit Testing: 11 | - https://docs.platformio.org/en/latest/advanced/unit-testing/index.html 12 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # M5Unified_StackChan_ChatGPT_Google 2 | Google Cloud Speech to Text を使って音声で会話できるスタックチャンです。多言語対応です。 3 | 4 | [English](README_en.md)
5 | 6 | 7 | ![画像1](images/image1.png)

8 | 9 | これは @mongonta555 さんの[スタックチャン M5GoBottom版組み立てキット](https://raspberrypi.mongonta.com/about-products-stackchan-m5gobottom-version/ "Title")に対応したスタックチャンファームです。

10 | 11 | --- 12 | 13 | ### M5GoBottom版スタックチャン本体を作るのに必要な物、及び作り方 ### 14 | こちらを参照してください。
15 | * [スタックチャン M5GoBottom版組み立てキット](https://raspberrypi.mongonta.com/about-products-stackchan-m5gobottom-version/ "Title")
16 | 17 | ### プログラムをビルドするのに必要な物 ### 18 | * [M5Stack Core2](http://www.m5stack.com/ "Title")
19 | * VSCode
20 | * PlatformIO
21 | 22 | 使用しているライブラリ等は"platformio.ini"を参照してください。
23 | 24 | --- 25 | 26 | ### サーボモーターを使用するGPIO番号の設定 ### 27 | * main.cppの42行目付近で、サーボモーターを使用するGPIO番号を設定してください。 28 | 29 | 30 | ### ChatGPTのAPIキーの取得 ### 31 | 32 | ChatGPTのAPIキー取得方法は以下の通りです。(詳細はこのページ一番下のリンクを参照してください。) 33 | 34 | * [OpenAIのウェブサイト](https://openai.com/ "Title")にアクセスして、アカウントを作成します。メールアドレスと携帯電話番号が必要です。 35 | * アカウント作成後、APIキーを発行します。APIキーは有料ですが、無料期間やクレジットがあります。<br>
36 | 37 | ### Google Cloud Speech to Text のAPIキーの取得 ### 38 | 39 | Google Cloud Speech to TextのAPIキー取得方法は以下の通りです。(詳細はこのページ一番下のリンクを参照してください。) 40 | 41 | * [Google Cloud Platformのウェブサイト](https://cloud.google.com/?hl=ja/ "Title")にアクセスして、アカウントを作成します。メールアドレスと携帯電話番号が必要です。カードの登録が必須ですが、無料トライアルや無料枠があります。 42 | * アカウント作成後、APIキーを取得します。
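取得したAPIキーの使われ方のイメージとして、Google Cloud Speech-to-Text v1 の speech:recognize エンドポイントへ送るリクエストを組み立てる例を示します。ファームウェア本体のコードではなく、APIキー・音声データを仮の値とした動作確認用のスケッチです。

```python
# Google Cloud Speech-to-Text v1 への認識リクエストを組み立てる例。
# APIキーと音声データはダミー(仮の値)で、実際の送信はコメントアウトしています。
import base64
import json

def build_stt_request(api_key, wav_bytes, lang="ja-JP"):
    """APIキー方式で speech:recognize に POST する URL と JSON ボディを作る。"""
    url = "https://speech.googleapis.com/v1/speech:recognize?key=" + api_key
    body = json.dumps({
        "config": {
            "encoding": "LINEAR16",    # 16bit PCM を想定
            "sampleRateHertz": 16000,
            "languageCode": lang,      # 後述の /setting?lang=... と同じ言語コード
        },
        "audio": {"content": base64.b64encode(wav_bytes).decode("ascii")},
    })
    return url, body

url, body = build_stt_request("YOUR_GOOGLE_STT_APIKEY", b"\x00\x00")
# import urllib.request
# req = urllib.request.Request(url, body.encode("utf-8"),
#                              {"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read())
```

言語コードには、このREADME後半で説明する /setting?lang= と同じものを指定します。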
43 | 44 | --- 45 | 46 | ### 使い方 ### 47 | * SDカードのルートに以下の2つのファイルを作成しておくと、使用できるようになります。
48 | 【設定が上手くいったらSDカードは必ず抜いておいてください。】
49 | 50 | 1. wifi.txtファイル:ファイル名は"wifi.txt"で、中身は次の通りです。
51 | YOUR_WIFI_SSID
52 | YOUR_WIFI_PASS
53 | 54 | 2. apikey.txtファイル:ファイル名は"apikey.txt"で、中身は次の通りです。
55 | YOUR_OPENAI_APIKEY
56 | YOUR_GOOGLE_STT_APIKEY
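上記の2ファイルをPC側で作成する場合の最小スケッチです。SDカードのマウント先パスは環境依存の仮の値です。

```python
# wifi.txt と apikey.txt を SD カードのルートに書き出す例。
# マウント先パスは環境依存の仮の値なので、実際のパスに置き換えてください。
import os

def write_stackchan_config(sd_root, ssid, wifi_pass, openai_key, google_key):
    """1行目・2行目にそれぞれの値を書いた wifi.txt / apikey.txt を作成する。"""
    with open(os.path.join(sd_root, "wifi.txt"), "w", encoding="utf-8") as f:
        f.write(ssid + "\n" + wifi_pass + "\n")
    with open(os.path.join(sd_root, "apikey.txt"), "w", encoding="utf-8") as f:
        f.write(openai_key + "\n" + google_key + "\n")

# 使用例("/Volumes/SD" は仮のマウント先):
# write_stackchan_config("/Volumes/SD", "YOUR_WIFI_SSID", "YOUR_WIFI_PASS",
#                        "YOUR_OPENAI_APIKEY", "YOUR_GOOGLE_STT_APIKEY")
```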
57 | 58 | * M5Stackが以前にWi-Fiに接続していた場合は、SDカードがなくても自動的にWi-Fiに接続されます。<br>
59 | この場合、ブラウザで"http://XXX.XXX.XXX.XXX/apikey"にアクセスし、APIキーを設定できます。
60 | (XXX.XXX.XXX.XXXはAIスタックチャンの起動時に表示されるIPアドレスです。)<br>
61 | 62 | * スタックチャンの額にタッチすると聞き取りを開始します。
63 | 聞き取り時間は3秒程度です。
64 | 65 | * ブラウザで"http://xxxx.xxxx.xxxx.xxxx/role"にアクセスすると、ロールを設定できます。
66 | (xxxx.xxxx.xxxx.xxxxはAIスタックチャンの起動時に表示されるIPアドレスです。)
67 | テキストエリアに何も入力せずに送信すると、以前に設定されたロールが削除されます。

68 | ロール情報は自動的にspiffsに保存されます。
69 |
70 | 71 | * ブラウザで"http://xxxx.xxxx.xxxx.xxxx/role_get"にアクセスすると、現在設定しているロールを取得できます。
72 | 73 | * スピーカーの音量を調整できます。

74 | 例:http://xxxx.xxxx.xxxx.xxxx/setting?volume=180
75 | volumeの値は0~255 76 | 77 | * AIスタックチャンの表情を会話内容に合わせて変更できます。
78 | ロール設定で以下の2行をそのまま入力してください。

79 | (Happy)のように、必ず括弧で囲んで感情の種類を表し、返答の先頭に付けてください。
80 | 感情の種類には、Neutral、Happy、Sleepy、Doubt、Sad、Angryがあります。

81 | 他にもロールを設定する際は、これらの2行を最後にしてください。
82 | 出来ればこの2行のみでやってみてください。
83 | ロールを増やすと失敗しやすくなります。
84 |
85 | 86 | * 独り言モードを追加しました。ランダムな時間間隔で、ランダムに喋ります。
87 | 感情表現機能と組み合わせると楽しいです。
88 | ボタンAで独り言モードをON/OFFできます。
89 | 独り言モードでも従来通りスマホから会話できます。
90 |
91 | 92 | * 直近5回分の会話の履歴を保存する機能があります。
93 | 94 | * 音声認識プログラムは別途ユーザーが用意する必要があります。
95 | 音声認識プログラムからこのようにhttpコマンドでテキスト(UTF-8)を渡します。
96 | (テキストはURLエンコードして渡してください。)

97 | http://XXX.XXX.XXX.XXX/chat?text=こんにちは

98 | XXX.XXX.XXX.XXXの部分は起動時に表示されるM5StackのIPアドレスに置き換えてください。
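URLエンコードの部分は、例えばPythonでは次のように書けます(IPアドレスは説明用の仮の値です)。

```python
# テキストを URL エンコードして /chat エンドポイントの URL を組み立てる例。
# IP アドレス 192.168.0.10 は説明用の仮の値です。
from urllib.parse import quote

def chat_url(ip, text):
    """UTF-8 の日本語テキストをパーセントエンコードして /chat の URL を作る。"""
    return "http://{}/chat?text={}".format(ip, quote(text, encoding="utf-8"))

print(chat_url("192.168.0.10", "こんにちは"))
# http://192.168.0.10/chat?text=%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF
```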

99 | * 上記と同様にしてブラウザを使ってアクセスすることも出来ます。

100 | ![画像2](images/image2.png)
101 | 102 | * 私は音声認識にiPhoneのショートカット機能を使っています。
103 | このように簡単に音声認識が使えます。

104 | ![画像3](images/image3.png)
105 | 106 | * M5Stack Core2の画面中央付近にタッチするとスタックチャンの首振りを止められます。
107 | 108 | * M5Stack Core2のボタンCを押すと、音声合成のテストが出来ます。
109 | 110 | --- 111 | 112 | ### 使用する言語の設定方法 ### 113 | 114 | 次の2つの設定をします。

115 | 116 | 1. ChatGPTの設定をする

117 | 言語を英語に設定する例:ブラウザで "http://xxxx.xxxx.xxxx.xxxx/role" にアクセスして、以下のようにロールを設定します。

118 | ![画像4](images/image4.png)

119 | 120 | 2. TTSの言語設定をする

121 | 言語を英語に設定する例:ブラウザで以下のように設定します。

http://xxxx.xxxx.xxxx.xxxx/setting?lang=en-US

en-US が英語の場合の言語コードです。その他の言語コードは下記リンクを参照してください。

122 | 123 | * [Cloud Speech-to-Text 言語サポート](https://cloud.google.com/speech-to-text/docs/languages?hl=ja/ "Title")

124 | 125 | --- 126 | 127 | ### ChatGPTのAPIキー取得の参考リンク ### 128 | 129 | * [ChatGPT API利用方法の簡単解説](https://qiita.com/mikito/items/b69f38c54b362c20e9e6/ "Title")
130 | 131 | ### Google Cloud Speech to TextのAPIキー取得の参考リンク ### 132 | 133 | * [Speech-to-Text APIキーの取得/登録方法について](https://nicecamera.kidsplates.jp/help/feature/transcription/apikey/ "Title")
134 | 135 | ### ChatGPTのキャラクター設定の参考リンク ### 136 | 137 | * [ChatGPTのAPIでキャラクター設定を試してみた](https://note.com/it_navi/n/nf5f702b36a75#8e42f887-fb07-4367-9f3f-ab7f119eb064/ "Title")
138 |

139 | 140 | -------------------------------------------------------------------------------- /README_en.md: -------------------------------------------------------------------------------- 1 | # M5Unified_StackChan_ChatGPT_Google 2 | Stack Chan for spoken conversations using Google Cloud Speech to Text. It is multilingual. 3 | 4 | ![Image1](images/image1.png)

5 | 6 | This is Stack-chan firmware compatible with @mongonta555's [StackChan M5GoBottom Version Assembly Kit](https://raspberrypi.mongonta.com/about-products-stackchan-m5gobottom-version/ "Title").<br>

7 | 8 | --- 9 | 10 | ### Materials and instructions required to make a M5GoBottom version StackChan main body ### 11 | Please refer to this.
12 | 13 | * [StackChan M5GoBottom Version Assembly Kit](https://raspberrypi.mongonta.com/about-products-stackchan-m5gobottom-version/ "Title")
14 | 15 | ### Materials required to build the program ### 16 | * [M5Stack Core2](http://www.m5stack.com/ "Title")
17 | * VSCode
18 | * PlatformIO
19 | 20 | Please refer to "platformio.ini" for the libraries used, etc.
21 | 22 | --- 23 | 24 | ### Setting the GPIO number to use the servo motor ### 25 | * Please set the GPIO numbers used for the servo motors around line 50 of main.cpp. 26 | 27 | 28 | ### Getting the ChatGPT API key ### 29 | 30 | The method of obtaining a ChatGPT API key is as follows. (For details, please refer to the link at the bottom of this page.) 31 | 32 | * Access the [OpenAI website](https://openai.com/ "Title") and create an account. An email address and mobile phone number are required. 33 | * After creating an account, issue an API key. The API key is paid, but there is a free trial period and free credits.<br>
34 | 35 | ### Obtaining an API key for Google Cloud Speech to Text ### 36 | 37 | Please follow the instructions below to obtain an API key for Google Cloud Speech to Text. (See the link at the bottom of this page for details.) 38 | 39 | * Go to the [Google Cloud Platform website](https://cloud.google.com/?hl=ja/ "Title") and create an account. You will need an email address and a cell phone number. Card registration is required, but free trials and free slots are available. 40 | * After creating an account, you will get an API key.
41 | 42 | --- 43 | 44 | ### Usage ### 45 | * If you create the following two files at the root of the SD card, you can use it.
46 | [Be sure to remove the SD card once the setup is successful].
47 | 48 | 1. wifi.txt file: The file name is "wifi.txt", and the contents are as follows.
49 | YOUR_WIFI_SSID
50 | YOUR_WIFI_PASS
51 | 52 | 2. apikey.txt file: The file name is "apikey.txt", and the contents are as follows.
53 | YOUR_OPENAI_APIKEY
54 | YOUR_GOOGLE_STT_APIKEY
55 | 56 | * If the M5Stack was previously connected to Wi-Fi, it will automatically connect to Wi-Fi without the need for an SD card.
57 | In this case, access "http://XXX.XXX.XXX.XXX/apikey" in your browser and set the API key.
58 | (XXX.XXX.XXX.XXX is the IP address displayed when AI Stack-chan starts up.)<br>
59 | 60 | * Touch Stack-chan's forehead to start listening.<br>
61 | Listening time is about 3 seconds.
62 | 63 | * You can set the role by accessing "http://xxxx.xxxx.xxxx.xxxx/role" in your browser.
64 | (xxxx.xxxx.xxxx.xxxx is the IP address displayed when AI Stack Chan is started.)
65 | If you send it without entering anything in the text area, the previously set role will be deleted.

66 | Role information is automatically saved to spiffs.
67 |
68 | 69 | * You can get the currently set role by accessing "http://xxxx.xxxx.xxxx.xxxx/role_get" in your browser.
70 | 71 | * You can adjust the volume of the speaker.

72 | Example:http://xxxx.xxxx.xxxx.xxxx/setting?volume=180
73 | The value of volume is 0 to 255. 74 | 75 | * You can change the expression of AI Stack-chan to match the conversation content.
76 | Please enter the following two lines as they are in the role settings.

77 | Enclose the emotion type in parentheses, as in (Happy), and attach it to the beginning of the response.
78 | The emotion types include Neutral, Happy, Sleepy, Doubt, Sad, and Angry.

79 | When setting up other roles, please include these two lines at the end.
80 | If possible, try using only these two lines.
81 | Increasing the number of roles will make it more likely to fail.
82 |
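As a sketch of the idea above (not the firmware's actual code), a response prefixed with an emotion tag can be split into the expression and the spoken text; the helper name and regex here are illustrative assumptions:

```python
# Illustrative sketch: split a reply like "(Happy) Hello!" into an emotion
# tag and the spoken text. Not the firmware's actual implementation.
import re

EMOTIONS = {"Neutral", "Happy", "Sleepy", "Doubt", "Sad", "Angry"}

def split_emotion(reply):
    """Return (emotion, text); fall back to Neutral when no valid tag leads."""
    m = re.match(r"^\((\w+)\)\s*(.*)", reply, re.S)
    if m and m.group(1) in EMOTIONS:
        return m.group(1), m.group(2)
    return "Neutral", reply  # no valid tag: keep a neutral face

print(split_emotion("(Happy) Hello!"))  # ('Happy', 'Hello!')
```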
83 | 84 | * Added a soliloquy mode. It speaks randomly at random intervals. It is fun when combined with the emotion expression function. Press button A to turn soliloquy mode on/off. Even in soliloquy mode, you can still talk to it from your smartphone as before.
85 |
86 | 87 | * It has a function to save the history of the last five conversations.
88 | 89 | * You need to separately prepare a speech recognition program. Pass text (UTF-8) with an HTTP command from the speech recognition program as follows (please pass the text after URL encoding):

90 | http://XXX.XXX.XXX.XXX/chat?text=hello

91 | Replace XXX.XXX.XXX.XXX with the M5Stack's IP address shown at startup.<br>

92 | * You can also access it using a browser in the same way as above.

93 | ![Image2](images/image2.png)
94 | 95 | * I use the iPhone's Shortcuts feature for speech recognition, which makes speech recognition easy to use.<br>

96 | ![Image3](images/image3.png)
97 | 98 | * Touching near the center of the M5Stack Core2's screen stops Stack-chan's head-swinging motion.<br>
99 | 100 | * Pressing button C on the M5Stack Core2 allows you to test speech synthesis.
101 | 102 | --- 103 | 104 | ### Setting the Language ### 105 | 106 | Please follow these two steps:<br>

107 | 108 | 1. Set up ChatGPT configuration

109 | For example, to set the language to English, access "http://xxxx.xxxx.xxxx.xxxx/role" on your browser and set the role as shown below.

110 | ![Image4](images/image4.png)

111 | 112 | 2. Set up the TTS language configuration<br>

113 | For example, to set the language to English, configure as follows on your browser:

http://xxxx.xxxx.xxxx.xxxx/setting?lang=en-US

en-US is the language code for English. For other language codes, please refer to the link below.
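Settings such as the language above and the speaker volume can also be scripted instead of typed into a browser; a small sketch (the IP address is a placeholder):

```python
# Build URLs for Stack-chan's /setting endpoint (language, volume, ...).
# 192.168.0.10 is a placeholder IP address.
from urllib.parse import urlencode

def setting_url(ip, **params):
    """Return a /setting URL with the given query parameters."""
    return "http://{}/setting?{}".format(ip, urlencode(params))

print(setting_url("192.168.0.10", lang="en-US"))  # STT/TTS language code
print(setting_url("192.168.0.10", volume=180))    # speaker volume, 0-255
```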

114 | 115 | * [Cloud Speech-to-Text Language Support](https://cloud.google.com/speech-to-text/docs/languages?hl=ja/ "Title")

116 | 117 | --- 118 | 119 | ### Reference links for obtaining ChatGPT API key ### 120 | 121 | * [A simple explanation of how to use ChatGPT API](https://qiita.com/mikito/items/b69f38c54b362c20e9e6/ "Title")
122 | 123 | ### Reference link for obtaining an API key for Google Cloud Speech to Text ### 124 | 125 | * [How to get/register a Speech-to-Text API key](https://nicecamera.kidsplates.jp/help/feature/transcription/apikey/ "Title")
126 | 127 | 128 | ### Reference links for ChatGPT character configuration ### 129 | 130 | * [Trying out character configuration with the ChatGPT API](https://note.com/it_navi/n/nf5f702b36a75#8e42f887-fb07-4367-9f3f-ab7f119eb064/ "Title")<br>
131 |

132 | 133 | -------------------------------------------------------------------------------- /images/image1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robo8080/M5Unified_StackChan_ChatGPT_Google/8e9a71d524f32c52239c1310cf9c3b49e6fab989/images/image1.png -------------------------------------------------------------------------------- /images/image2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robo8080/M5Unified_StackChan_ChatGPT_Google/8e9a71d524f32c52239c1310cf9c3b49e6fab989/images/image2.png -------------------------------------------------------------------------------- /images/image3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robo8080/M5Unified_StackChan_ChatGPT_Google/8e9a71d524f32c52239c1310cf9c3b49e6fab989/images/image3.png -------------------------------------------------------------------------------- /images/image4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robo8080/M5Unified_StackChan_ChatGPT_Google/8e9a71d524f32c52239c1310cf9c3b49e6fab989/images/image4.png --------------------------------------------------------------------------------