Voice Recognition MCP Service
This service provides voice recognition and text extraction capabilities through both stdio and MCP modes.
Features
- Voice recognition from file
- Voice recognition from base64 encoded data
- Text extraction
- Support for both stdio and MCP modes
- Structured voice recognition results
Project Structure
- voice_service.py - Core service implementation
- stdio_server.py - stdio mode entry point
- mcp_server.py - MCP mode entry point
- build.py - Build script for executables
- build_exec.sh - Build execution script
- test_*.sh - Test scripts for different functionalities
Installation
- Clone the repository:
git clone https://github.com/AIO-2030/mcp_voice_identify.git
cd mcp_voice_identify
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables in .env:
API_URL=your_api_url
API_KEY=your_api_key
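How the service reads these values is not shown in this document. A minimal sketch, assuming the .env file uses plain KEY=value lines and that a stdlib-only loader is acceptable; the helper name `load_env` is hypothetical, and the project itself may rely on a library such as python-dotenv instead:

```python
import os
from pathlib import Path


def load_env(path=".env"):
    """Parse simple KEY=value lines into a dict (hypothetical helper)."""
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    config = load_env()
    # Fall back to the process environment when .env lacks a key.
    api_url = config.get("API_URL") or os.environ.get("API_URL")
    api_key = config.get("API_KEY") or os.environ.get("API_KEY")
    print("API_URL set:", bool(api_url), "| API_KEY set:", bool(api_key))
```

Printing only whether each value is set, rather than the value itself, keeps the API key out of terminal logs.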
Usage
stdio Mode
- Run the service:
python stdio_server.py
- Send JSON-RPC requests via stdin:
{
"jsonrpc": "2.0",
"method": "help",
"params": {},
"id": 1
}
- Or use the executable:
./dist/voice_stdio
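The help request shown above can also be sent from a script. A minimal sketch, assuming the server reads one JSON-RPC message from stdin, writes its reply to stdout, and exits when stdin closes; the helper names `build_request` and `call_stdio` are hypothetical, and the exact message framing the server expects is not confirmed by this document:

```python
import json
import subprocess
import sys


def build_request(method, params=None, request_id=1):
    """Serialize a JSON-RPC 2.0 request as a single line."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params or {},
        "id": request_id,
    })


def call_stdio(command, method, params=None, timeout=30):
    """Run the server once, send one request, and return the raw stdout text."""
    proc = subprocess.run(
        command,
        input=build_request(method, params) + "\n",
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.stdout


if __name__ == "__main__":
    # Assumes stdio_server.py is in the current directory.
    print(call_stdio([sys.executable, "stdio_server.py"], "help"))
```

To target the built executable instead, pass `["./dist/voice_stdio"]` as the command.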
MCP Mode
- Run the service:
python mcp_server.py
- Or use the executable:
./dist/voice_mcp
Voice Recognition Results
The service converts the labeled output of the recognition API into structured fields. The two examples below show the same response before and after restructuring:
Original API Response
{
"jsonrpc": "2.0",
"result": {
"message": "input processed successfully",
"results": "test test test",
"label_result": "<|en|><|EMO_UNKNOWN|><|Speech|><|woitn|>test test test"
},
"id": 1
}
Restructured Response
{
"jsonrpc": "2.0",
"result": {
"message": "input processed successfully",
"results": "test test test",
"label_result": {
"lan": "en",
"emo": "unknown",
"type": "speech",
"speaker": "woitn",
"text": "test test test"
}
},
"id": 1
}
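A client consuming this response might pull out the fields it needs as follows. This is a sketch under the assumption that replies follow the restructured shape shown above and that failures use the standard JSON-RPC `error` member; the function name `extract_text` is hypothetical:

```python
import json


def extract_text(response_line):
    """Return (text, language) from a restructured JSON-RPC response."""
    response = json.loads(response_line)
    if "error" in response:
        # JSON-RPC 2.0 reports failures in an "error" object.
        raise RuntimeError(response["error"].get("message", "unknown error"))
    result = response["result"]
    label = result.get("label_result")
    if isinstance(label, dict):
        return label.get("text", result.get("results")), label.get("lan")
    # Fall back to the plain transcript when labels are not structured.
    return result.get("results"), None
```

Falling back to `results` means the same function still returns the transcript if it is handed the original, unstructured response.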
Label Result Fields
The label_result field contains the following structured information:
| Field | Description | Example Value |
|---|---|---|
| lan | Language code | “en” |
| emo | Emotion state | “unknown” |
| type | Audio type | “speech” |
| speaker | Speaker identifier | “woitn” |
| text | Recognized text content | “test test test” |
Special Labels
The service recognizes and processes the following special labels in the original response:
- <|en|> - Language code
- <|EMO_UNKNOWN|> - Emotion state
- <|Speech|> - Audio type
- <|woitn|> - Speaker identifier
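The conversion from the labeled string to the structured fields can be sketched as follows. This is not the project's actual implementation; it assumes the four labels always appear in the order language, emotion, audio type, speaker, and that emotion labels carry an `EMO_` prefix. The function name `parse_label_result` is hypothetical:

```python
import re

# Matches one <|...|> label at a given position.
LABEL_PATTERN = re.compile(r"<\|([^|]*)\|>")
# Assumed label order, taken from the example response.
FIELD_ORDER = ("lan", "emo", "type", "speaker")


def parse_label_result(raw):
    """Split '<|en|><|EMO_UNKNOWN|><|Speech|><|woitn|>text' into a dict."""
    labels = []
    pos = 0
    # Consume leading labels only; anything after them is the transcript.
    while len(labels) < len(FIELD_ORDER):
        match = LABEL_PATTERN.match(raw, pos)
        if match is None:
            break
        labels.append(match.group(1))
        pos = match.end()
    parsed = dict(zip(FIELD_ORDER, labels))
    if "emo" in parsed:
        emo = parsed["emo"]
        # Drop the assumed EMO_ prefix, then normalize case.
        if emo.startswith("EMO_"):
            emo = emo[len("EMO_"):]
        parsed["emo"] = emo.lower()
    if "type" in parsed:
        parsed["type"] = parsed["type"].lower()
    parsed["text"] = raw[pos:].strip()
    return parsed


print(parse_label_result("<|en|><|EMO_UNKNOWN|><|Speech|><|woitn|>test test test"))
# {'lan': 'en', 'emo': 'unknown', 'type': 'speech', 'speaker': 'woitn', 'text': 'test test test'}
```

Matching labels only at the start of the string avoids misreading a `<|...|>` sequence that happens to occur inside the transcript itself.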
Building Executables
- Make the build script executable:
chmod +x build_exec.sh
- Build stdio mode executable:
./build_exec.sh
- Build MCP mode executable:
./build_exec.sh mcp
The executables will be created at:
- stdio mode: dist/voice_stdio
- MCP mode: dist/voice_mcp
Testing
Run the test scripts:
chmod +x test_*.sh
./test_help.sh
./test_voice_file.sh
./test_voice_base64.sh
License
This project is licensed under the MIT License - see the LICENSE file for details.
Voice Recognition Service
Project Details
- yangsenessa/mcp_voice_identify
- MIT License
- Last Updated: 4/15/2025