ESP8266 Development Board: Baidu Online Speech Recognition and IoT Smart Control

Updated 2024-12-04 · 747 KB ZIP
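Resource summary: "ESP8266 Baidu online speech recognition Wi-Fi development board, web provisioning, IoT voice dialogue: circuit design"

The board comes in two versions, for use with and without a network connection. The offline version implements recording and playback on the ESP8266: whatever the user says is played back by the board. The full-featured version adds Baidu online speech recognition, so users can control smart home appliances by voice, ask for the time, the calendar, and the weather forecast, and hold a simple voice dialogue.

The board's workflow consists of the following steps:

1. Web provisioning: the board creates a Wi-Fi hotspot named clock_mac. Connect to this hotspot from a phone or computer, open ***.***.*.* in a browser to reach the provisioning page, and enter the Wi-Fi connection settings.
2. Baidu online speech recognition: the board can recognize arbitrary spoken commands from the user.
3. Smart appliance control: based on the recognition result, the board controls smart appliances, for example switching lights on and off or turning on the TV.
4. Voice playback: the board can announce the time, calendar information, and the weather forecast.

To support these features, the board relies on the following hardware components and software resources:

- ESP8266 chip: a low-cost Wi-Fi chip with a complete TCP/IP stack and microcontroller functionality.
- WM8978 audio codec: a high-fidelity audio codec that supports I2S and exchanges audio signals with the ESP8266.
- I2S: a serial bus protocol used to transfer audio data between the ESP8266 and the WM8978.
- SDK source code: a software development kit with software components, libraries, example code, and documentation for building ESP8266-based IoT applications.
- Technical support: support services are offered to help developers use the board.

Development environment: the project recommends Eclipse, an open-source integrated development environment (IDE) widely used for Java, C/C++, PHP, and other languages.

Video tutorials: the development team is still updating its video tutorials, which are meant to help users understand and use the board.

Schematics and detailed instructions can be downloaded through the provided Baidu cloud drive link. The archive contains a user guide, a schematic PDF, and two image files, which may be photos of the board or screenshots of the schematic. It also contains a text file with the download address for the ESP8266 Baidu speech recognition board materials, which may point to the source code or other related resources.

Judging by its tags (IoT, speech recognition, circuit design), the board targets voice dialogue systems in the IoT market. It combines the low-cost Wi-Fi connectivity of the ESP8266 with Baidu speech recognition to offer a complete solution for smart home and voice assistant applications, simplifying development while giving users an interactive experience of IoT technology.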
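Step 3 of the workflow maps the recognized text to an appliance action. The summary does not document how the firmware does this, so the following is only a minimal sketch of one common approach: keyword matching on the recognized string. The command names, the keyword table, and the function `match_command` are all hypothetical, and the sketch assumes the recognition result arrives as a lowercase, NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command set; the board's real command list is not documented. */
typedef enum {
    CMD_NONE = 0,
    CMD_LIGHT_ON,
    CMD_LIGHT_OFF,
    CMD_TV_ON,
    CMD_TELL_TIME,
    CMD_TELL_WEATHER
} voice_cmd_t;

typedef struct {
    const char *keyword;   /* substring to look for in the recognized text */
    voice_cmd_t cmd;       /* command to return when the substring is found */
} cmd_rule_t;

/* Hypothetical keyword table. Rules are checked in order; the first match wins. */
static const cmd_rule_t rules[] = {
    { "turn on the light",  CMD_LIGHT_ON     },
    { "turn off the light", CMD_LIGHT_OFF    },
    { "turn on the tv",     CMD_TV_ON        },
    { "what time",          CMD_TELL_TIME    },
    { "weather",            CMD_TELL_WEATHER },
};

/* Return the command for the first rule whose keyword occurs in text,
   or CMD_NONE when no rule matches. */
voice_cmd_t match_command(const char *text)
{
    assert(text != NULL);
    for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
        if (strstr(text, rules[i].keyword) != NULL) {
            return rules[i].cmd;
        }
    }
    return CMD_NONE;
}
```

With this table, `match_command("please turn on the light")` yields `CMD_LIGHT_ON` and unrelated text yields `CMD_NONE`. Because `strstr` compares bytes, the same approach would also work for UTF-8 Chinese keywords, provided the table and the recognition result use the same encoding.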
Uploaded 2017-09-12
ESP8266 audio playback

    //Headers are not shown in this excerpt. The code relies on the ESP8266 RTOS SDK (FreeRTOS,
    //lwIP sockets), libmad, and the project's I2S and SPI RAM FIFO drivers. streamHost, streamPath,
    //streamPort, AP_NAME, AP_PASS and the ADD_DEL_* constants are defined elsewhere in the project.

    //Priorities of the reader and the decoder thread. Higher = higher prio.
    #define PRIO_READER 11
    #define PRIO_MAD 1

    //The mp3 read buffer size. 2106 bytes should be enough for up to 48KHz mp3s according to the
    //sox sources. Used by libmad.
    #define READBUFSZ (2106)
    static char readBuf[READBUFSZ];

    static long bufUnderrunCt;

    //Reformat the 16-bit mono sample to a format we can send to I2S.
    static int sampToI2s(short s) {
        //We can send a 32-bit sample to the I2S subsystem and the DAC will neatly split it up in 2
        //16-bit analog values, one for left and one for right.

        //Duplicate 16-bit sample to both the L and R channel
        int samp=s;
        samp=(samp)&0xffff;
        samp=(samp<<16)|samp;
        return samp;
    }

    //[Source text damaged here. A second sample-conversion routine is only partly preserved: the
    //surviving fragment clips the sample to 65535, looks up an output word in a table (table name
    //and contents lost), and saves the rounding error (samp&0x7ff) for the next call.]

    //2nd order delta-sigma DAC
    //See http://www.beis.de/Elektronik/DeltaSigma/DeltaSigma.html for a nice explanation
    static int sampToI2sDeltaSigma(short s) {
        int x;
        int val=0;
        int w;
        static int i1v=0, i2v=0;
        static int outReg=0;
        for (x=0; x<32; x++) {
            //The next two statements and the first comparison are reconstructed from context;
            //the source text is damaged at this point.
            val<<=1; //next bit
            w=s;
            if (outReg>0) w-=32767; else w+=32767; //Difference 1
            w+=i1v; i1v=w; //Integrator 1
            if (outReg>0) w-=32767; else w+=32767; //Difference 2
            w+=i2v; i2v=w; //Integrator 2
            outReg=w; //register
            if (w>0) val|=1; //comparator
        }
        return val;
    }

    //Calculate the number of samples that we add or delete. Added samples means a slightly lower
    //playback rate, deleted samples means we increase playout speed a bit. This returns an
    //8.24 fixed-point number
    int recalcAddDelSamp(int oldVal) {
        int ret;
        static long prevUdr=0; //static, so the underrun count persists between calls
        static int cnt;
        int i;
        static int minFifoFill=0;

        i=spiRamFifoFill();
        if (i<minFifoFill) minFifoFill=i;

        //Do the rest of the calculations plusminus every 100mS (assuming a sample rate of 44KHz)
        cnt++;
        if (cnt<1500) return oldVal;
        cnt=0;

        if (spiRamFifoLen()<10*1024) {
            //The FIFO is very small. We can't do calculations on how much it's filled on average,
            //so another algorithm is called for.
            int tgt=1600; //we want an average of this amount of bytes as the average minimum buffer fill
            //Calculate underruns this cycle
            int udr=spiRamGetUnderrunCt()-prevUdr;
            //If we have underruns, the minimum buffer fill has been lower than 0.
            if (udr!=0) minFifoFill=-1;
            //If we're below our target decrease playback speed, and vice-versa.
            ret=oldVal+((minFifoFill-tgt)*ADD_DEL_BUFFPERSAMP_NOSPIRAM);
            prevUdr+=udr;
            minFifoFill=9999;
        } else {
            //We have a larger FIFO; we can adjust according to the FIFO fill rate.
            int tgt=spiRamFifoLen()/2;
            ret=(spiRamFifoFill()-tgt)*ADD_DEL_BUFFPERSAMP;
        }
        return ret;
    }

    //This routine is called by the NXP modifications of libmad. It passes us (for the mono synth)
    //32 16-bit samples.
    void render_sample_block(short *short_sample_buff, int no_samples) {
        //Signed 8.24 fixed point number: the amount of samples we need to add or delete
        //in every 32-sample block
        static int sampAddDel=0;
        //Remainder of sampAddDel cumulatives
        static int sampErr=0;
        int i;
        int samp;

    #ifdef ADD_DEL_SAMPLES
        sampAddDel=recalcAddDelSamp(sampAddDel);
    #endif

        sampErr+=sampAddDel;
        //The loop header, the sample conversion and the "output" branches below are reconstructed
        //from context; the source text is damaged here.
        for (i=0; i<no_samples; i++) {
    #if defined(DELTA_SIGMA_HACK)
            samp=sampToI2sDeltaSigma(short_sample_buff[i]);
    #else
            samp=sampToI2s(short_sample_buff[i]);
    #endif
            if (sampErr>(1<<24)) {
                sampErr-=(1<<24);
                //...and don't output an i2s sample
            } else if (sampErr<-(1<<24)) {
                sampErr+=(1<<24);
                //...and output two samples instead of one.
                i2sPushSample(samp);
                i2sPushSample(samp);
            } else {
                //Just output the sample.
                i2sPushSample(samp);
            }
        }
    }

    //[Source text lost here: any code between render_sample_block() and input() is missing.]

    //Refill the read buffer from the SPI RAM FIFO and hand it to libmad.
    //The function header and local declarations are reconstructed from the call in tskmad().
    static enum mad_flow ICACHE_FLASH_ATTR input(struct mad_stream *stream) {
        int n, i;
        int rem;
        //Shift remaining contents of buf to the front
        rem=stream->bufend-stream->next_frame;
        memmove(readBuf, stream->next_frame, rem);

        while (rem<sizeof(readBuf)) {
            n=(sizeof(readBuf)-rem); //Calculate amount of bytes we need to fill buffer.
            i=spiRamFifoFill();
            if (i<n) n=i; //If the fifo can give us less, only take that amount
            if (n==0) { //Can't take anything?
                //Wait until there is enough data in the buffer. This only happens when the data feed
                //rate is too low, and shouldn't normally be needed!
                // printf("Buf uflow, need %d bytes.\n", sizeof(readBuf)-rem);
                bufUnderrunCt++;
                //We both silence the output as well as wait a while by pushing silent samples into
                //the i2s system. This waits for about 200mS.
                //The sample count is lost in the source; 8820 samples is about 200 ms at 44.1 kHz.
                for (n=0; n<8820; n++) i2sPushSample(0);
            } else {
                //Read some bytes from the FIFO to re-fill the buffer (reconstructed).
                spiRamFifoRead(&readBuf[rem], n);
                rem+=n;
            }
        }

        //Let MAD decode the buffer (reconstructed; tskmad() notes that input() calls mad_stream_buffer).
        mad_stream_buffer(stream, readBuf, sizeof(readBuf));
        return MAD_FLOW_CONTINUE;
    }

    //Routine to print out a decoder error. The header and format string are reconstructed; only
    //the printf arguments survive in the source.
    static enum mad_flow ICACHE_FLASH_ATTR error(void *data, struct mad_stream *stream, struct mad_frame *frame) {
        printf("dec err 0x%04x (%s)\n", stream->error, mad_stream_errorstr(stream));
        return MAD_FLOW_CONTINUE;
    }

    //This is the main mp3 decoding task. It will grab data from the input buffer FIFO in the SPI ram and
    //output it to the I2S port.
    void ICACHE_FLASH_ATTR tskmad(void *pvParameters) {
        int r;
        struct mad_stream *stream;
        struct mad_frame *frame;
        struct mad_synth *synth;

        //Allocate structs needed for mp3 decoding
        stream=malloc(sizeof(struct mad_stream));
        frame=malloc(sizeof(struct mad_frame));
        synth=malloc(sizeof(struct mad_synth));

        //A FreeRTOS task must not return, so delete the task on failure.
        if (stream==NULL) { printf("MAD: malloc(stream) failed\n"); vTaskDelete(NULL); return; }
        if (synth==NULL) { printf("MAD: malloc(synth) failed\n"); vTaskDelete(NULL); return; }
        if (frame==NULL) { printf("MAD: malloc(frame) failed\n"); vTaskDelete(NULL); return; }

        //Initialize I2S
        i2sInit();

        bufUnderrunCt=0;
        printf("MAD: Decoder start.\n");
        //Initialize mp3 parts
        mad_stream_init(stream);
        mad_frame_init(frame);
        mad_synth_init(synth);
        while(1) {
            input(stream); //calls mad_stream_buffer internally
            while(1) {
                r=mad_frame_decode(frame, stream);
                if (r==-1) {
                    if (!MAD_RECOVERABLE(stream->error)) {
                        //We're most likely out of buffer and need to call input() again
                        break;
                    }
                    error(NULL, stream, frame);
                    continue;
                }
                mad_synth_frame(synth, frame);
            }
        }
    }

    //Resolve a host name to an IPv4 address. Returns 1 on success, 0 on failure.
    int getIpForHost(const char *host, struct sockaddr_in *ip) {
        struct hostent *he;
        struct in_addr **addr_list;
        he=gethostbyname(host);
        if (he==NULL) return 0;
        addr_list=(struct in_addr **)he->h_addr_list;
        if (addr_list[0]==NULL) return 0;
        ip->sin_family=AF_INET;
        memcpy(&ip->sin_addr, addr_list[0], sizeof(ip->sin_addr));
        return 1;
    }

    //Open a connection to a webserver and request an URL. Yes, this possibly is one of the worst ways
    //to do this, but RAM is at a premium here, and this works for most of the cases.
    int ICACHE_FLASH_ATTR openConn(const char *streamHost, const char *streamPath) {
        while(1) {
            struct sockaddr_in remote_ip;
            bzero(&remote_ip, sizeof(struct sockaddr_in));
            if (!getIpForHost(streamHost, &remote_ip)) {
                vTaskDelay(1000/portTICK_RATE_MS);
                continue;
            }
            int sock=socket(PF_INET, SOCK_STREAM, 0);
            if (sock==-1) {
                continue;
            }
            remote_ip.sin_port = htons(streamPort);
            printf("Connecting to server %s...\n", ipaddr_ntoa((const ip_addr_t*)&remote_ip.sin_addr.s_addr));
            if (connect(sock, (struct sockaddr *)(&remote_ip), sizeof(struct sockaddr))!=0) {
                close(sock);
                printf("Conn err.\n");
                vTaskDelay(1000/portTICK_RATE_MS);
                continue;
            }
            //Cobble together HTTP request
            write(sock, "GET ", 4);
            write(sock, streamPath, strlen(streamPath));
            write(sock, " HTTP/1.0\r\nHost: ", 17);
            write(sock, streamHost, strlen(streamHost));
            write(sock, "\r\n\r\n", 4);
            //We ignore the headers that the server sends back... it's pretty dirty in general to do
            //that, but it works here because the MP3 decoder skips it because it isn't valid MP3 data.
            return sock;
        }
    }

    //Reader task. This will try to read data from a TCP socket into the SPI fifo buffer.
    void ICACHE_FLASH_ATTR tskreader(void *pvParameters) {
        int madRunning=0;
        char wbuf[64];
        int n;
        int fd;
        int c=0;
        while(1) {
            fd=openConn(streamHost, streamPath);
            printf("Reading into SPI RAM FIFO...\n");
            do {
                n=read(fd, wbuf, sizeof(wbuf));
                if (n>0) spiRamFifoWrite(wbuf, n);
                c+=n;
                //The source text is damaged from here to the end of the loop. The half-full
                //threshold and the decoder stack size below are assumptions.
                if ((!madRunning) && (spiRamFifoFree()<spiRamFifoLen()/2)) {
                    //Enough data is buffered. Start up the MP3 decoder task.
                    if (xTaskCreate(tskmad, "tskmad", 2100, NULL, PRIO_MAD, NULL)!=pdPASS) printf("Error creating MAD task!\n");
                    madRunning=1;
                }
            } while (n>0);
            close(fd);
            printf("Connection closed.\n");
        }
    }

    //Simple task to connect to an access point, initialize i2s and fire up the reader task.
    void ICACHE_FLASH_ATTR tskconnect(void *pvParameters) {
        //Wait a few secs for the stack to settle down
        vTaskDelay(3000/portTICK_RATE_MS);

        //Go to station mode
        wifi_station_disconnect();
        if (wifi_get_opmode() != STATION_MODE) {
            wifi_set_opmode(STATION_MODE);
        }

        //Connect to the defined access point.
        struct station_config *config=malloc(sizeof(struct station_config));
        memset(config, 0x00, sizeof(struct station_config));
        sprintf(config->ssid, AP_NAME);
        sprintf(config->password, AP_PASS);
        wifi_station_set_config(config);
        wifi_station_connect();
        free(config);

        //Fire up the reader task. The reader task will fire up the MP3 decoder as soon
        //as it has read enough MP3 data.
        if (xTaskCreate(tskreader, "tskreader", 230, NULL, PRIO_READER, NULL)!=pdPASS) printf("Error creating reader task!\n");
        //We're done. Delete this task.
        vTaskDelete(NULL);
    }

    //We need this to tell the OS we're running at a higher clock frequency.
    extern void os_update_cpu_frequency(int mhz);

    void ICACHE_FLASH_ATTR user_init(void) {
        //Tell hardware to run at 160MHz instead of 80MHz
        //This actually is not needed in normal situations... the hardware is quick enough to do
        //MP3 decoding at 80MHz. It, however, seems to help with receiving data over long and/or
        //unstable links, so you may want to turn it on. Also, the delta-sigma code seems to need a
        //bit more speed than the other solutions to keep up with the output samples, so it's also
        //enabled there.
    #if defined(DELTA_SIGMA_HACK)
        SET_PERI_REG_MASK(0x3ff00014, BIT(0));
        os_update_cpu_frequency(160);
    #endif

        //Set the UART to 115200 baud
        UART_SetBaudrate(0, 115200);

        //Initialize the SPI RAM chip communications and see if it actually retains some bytes. If it
        //doesn't, warn user.
        if (!spiRamFifoInit()) {
            printf("\n\nSPI RAM chip fail!\n");
            while(1);
        }
        printf("\n\nHardware initialized. Waiting for network.\n");
        xTaskCreate(tskconnect, "tskconnect", 200, NULL, 3, NULL);
    }
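The sample add/drop logic in render_sample_block is easier to follow in isolation. The sketch below is a host-side model, not part of the firmware: it accumulates an 8.24 fixed-point correction once per block and counts how many samples would be pushed to I2S. The function name `count_output_samples` and the macro `ONE_SAMPLE` are hypothetical. The sketch also assumes that the damaged negative branch outputs the sample twice, which is what the "added samples" comment on recalcAddDelSamp suggests.

```c
#include <assert.h>
#include <stdint.h>

#define ONE_SAMPLE (1 << 24)   /* 1.0 in 8.24 fixed point */

/* Model of the add/drop loop: returns how many output samples are produced
   for `blocks` blocks of `block_len` input samples when `add_del` (8.24
   fixed point) is added to the running error once per block.
   A positive error drops samples (faster playout); a negative error
   duplicates samples (slower playout). */
long count_output_samples(int blocks, int block_len, int32_t add_del)
{
    assert(blocks >= 0 && block_len > 0);
    int32_t err = 0;   /* running remainder, carried across blocks */
    long out = 0;
    for (int b = 0; b < blocks; b++) {
        err += add_del;
        for (int i = 0; i < block_len; i++) {
            if (err > ONE_SAMPLE) {
                err -= ONE_SAMPLE;   /* drop this sample */
            } else if (err < -ONE_SAMPLE) {
                err += ONE_SAMPLE;   /* output this sample twice */
                out += 2;
            } else {
                out += 1;            /* output this sample once */
            }
        }
    }
    return out;
}
```

With a correction of zero the output count equals the input count. With a correction of a quarter sample per block, the error first exceeds one whole sample in the fifth block and then every fourth block after that, so the stream is shortened by a single sample at a time rather than in audible bursts.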