Voice Recognition MCP Service
This service provides voice recognition and text extraction capabilities through both stdio and MCP modes.
Features
- Voice recognition from file
- Voice recognition from base64 encoded data
- Text extraction
- Support for both stdio and MCP modes
- Structured voice recognition results
Project Structure
- voice_service.py - Core service implementation
- stdio_server.py - stdio mode entry point
- mcp_server.py - MCP mode entry point
- build.py - Build script for executables
- build_exec.sh - Build execution script
- test_*.sh - Test scripts for different functionalities
Installation
- Clone the repository:
git clone https://github.com/AIO-2030/mcp_voice_identify.git
cd mcp_voice_identify
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables in .env:
API_URL=your_api_url
API_KEY=your_api_key
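As an illustration of how the two variables above might be consumed, here is a minimal standard-library sketch. The helper names `parse_env_text` and `load_settings` are hypothetical and do not come from this repository; the service itself may load `.env` differently (for example through a dotenv package).

```python
import os


def parse_env_text(text: str) -> dict:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_path: str = ".env") -> dict:
    """Return API_URL and API_KEY; real environment variables take precedence."""
    file_values = {}
    if os.path.exists(env_path):
        with open(env_path, encoding="utf-8") as handle:
            file_values = parse_env_text(handle.read())

    settings = {}
    for name in ("API_URL", "API_KEY"):
        value = os.environ.get(name) or file_values.get(name)
        if not value:
            raise RuntimeError(f"{name} is not set in the environment or in {env_path}")
        settings[name] = value
    return settings
```

Failing fast when either value is missing keeps a misconfigured deployment from surfacing later as an opaque API error.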
Usage
stdio Mode
- Run the service:
python stdio_server.py
- Send JSON-RPC requests via stdin:
{
"jsonrpc": "2.0",
"method": "help",
"params": {},
"id": 1
}
- Or use the executable:
./dist/voice_stdio
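The stdio exchange above can also be driven from a script. The sketch below builds the same `help` request and pipes it to the server. It assumes the server writes one JSON response per line to stdout and exits when stdin closes; neither behavior is confirmed by this README, and the helper names are hypothetical.

```python
import json
import subprocess
from typing import Optional


def build_request(method: str, params: Optional[dict] = None, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request as a single line of JSON."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or {},
        "id": request_id,
    }
    return json.dumps(payload)


def call_stdio_server(request_line: str,
                      command=("python", "stdio_server.py"),
                      timeout: float = 30.0) -> dict:
    """Send one request on stdin and return the first JSON line from stdout."""
    completed = subprocess.run(
        list(command),
        input=request_line + "\n",
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    # Assumption: the response is the first non-empty stdout line.
    for line in completed.stdout.splitlines():
        if line.strip():
            return json.loads(line)
    raise RuntimeError("the server produced no output")


if __name__ == "__main__":
    print(call_stdio_server(build_request("help")))
```

To target the built executable instead, pass `command=("./dist/voice_stdio",)`.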
MCP Mode
- Run the service:
python mcp_server.py
- Or use the executable:
./dist/voice_mcp
Voice Recognition Results
The service parses the tagged label_result string into structured fields. The two examples below show the same response before and after restructuring:
Original API Response
{
"jsonrpc": "2.0",
"result": {
"message": "input processed successfully",
"results": "test test test",
"label_result": "<|en|><|EMO_UNKNOWN|><|Speech|><|woitn|>test test test"
},
"id": 1
}
Restructured Response
{
"jsonrpc": "2.0",
"result": {
"message": "input processed successfully",
"results": "test test test",
"label_result": {
"lan": "en",
"emo": "unknown",
"type": "speech",
"speaker": "woitn",
"text": "test test test"
}
},
"id": 1
}
Label Result Fields
The label_result field contains the following structured information:
Field | Description | Example Value |
---|---|---|
lan | Language code | "en" |
emo | Emotion state | "unknown" |
type | Audio type | "speech" |
speaker | Speaker identifier | "woitn" |
text | Recognized text content | "test test test" |
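On the client side, the fields in this table can be read into a small typed object. The following is a sketch only: `LabelResult` and `read_label_result` are hypothetical names, and the error branch relies on the standard JSON-RPC 2.0 `error` member rather than on anything this service documents.

```python
from dataclasses import dataclass


@dataclass
class LabelResult:
    lan: str      # language code, e.g. "en"
    emo: str      # emotion state, e.g. "unknown"
    type: str     # audio type, e.g. "speech"
    speaker: str  # speaker identifier
    text: str     # recognized text content


def read_label_result(response: dict) -> LabelResult:
    """Extract the structured label_result from a restructured response."""
    if "error" in response:
        raise RuntimeError(f"JSON-RPC error: {response['error']}")
    label = response["result"]["label_result"]
    names = ("lan", "emo", "type", "speaker", "text")
    return LabelResult(**{name: label[name] for name in names})


if __name__ == "__main__":
    sample = {
        "jsonrpc": "2.0",
        "result": {
            "message": "input processed successfully",
            "results": "test test test",
            "label_result": {
                "lan": "en",
                "emo": "unknown",
                "type": "speech",
                "speaker": "woitn",
                "text": "test test test",
            },
        },
        "id": 1,
    }
    print(read_label_result(sample))
```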
Special Labels
The service recognizes and processes the following special labels in the original response:
- <|en|> - Language code
- <|EMO_UNKNOWN|> - Emotion state
- <|Speech|> - Audio type
- <|woitn|> - Speaker identifier
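One way to turn the tagged string into the structured form is sketched below. It is not the repository's implementation. It assumes the labels always appear in the order language, emotion, audio type, speaker; that the emotion label carries an `EMO_` prefix; and that every value except the speaker is lowercased. All three assumptions are inferred from the single example in this document.

```python
import re

# Matches one <|...|> label and captures its content.
LABEL_PATTERN = re.compile(r"<\|([^|]*)\|>")

# Assumed positional meaning of the labels, taken from the example above.
FIELD_ORDER = ("lan", "emo", "type", "speaker")


def parse_label_result(raw: str) -> dict:
    """Split a tagged label_result string into named fields plus the text."""
    labels = LABEL_PATTERN.findall(raw)
    text = LABEL_PATTERN.sub("", raw).strip()

    parsed = {}
    for name, value in zip(FIELD_ORDER, labels):
        if name == "emo" and value.upper().startswith("EMO_"):
            value = value[len("EMO_"):]  # "EMO_UNKNOWN" -> "UNKNOWN"
        # Keep the speaker identifier as-is; lowercase the other labels.
        parsed[name] = value if name == "speaker" else value.lower()
    parsed["text"] = text
    return parsed


if __name__ == "__main__":
    print(parse_label_result("<|en|><|EMO_UNKNOWN|><|Speech|><|woitn|>test test test"))
```

If a string carries fewer than four labels, the sketch simply omits the missing fields instead of guessing their values.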
Building Executables
- Make the build script executable:
chmod +x build_exec.sh
- Build stdio mode executable:
./build_exec.sh
- Build MCP mode executable:
./build_exec.sh mcp
The executables will be created at:
- stdio mode: dist/voice_stdio
- MCP mode: dist/voice_mcp
Testing
Run the test scripts:
chmod +x test_*.sh
./test_help.sh
./test_voice_file.sh
./test_voice_base64.sh
License
This project is licensed under the MIT License - see the LICENSE file for details.
Project Details
- yangsenessa/mcp_voice_identify
- MIT License
- Last Updated: 4/15/2025