Machine Learning Researcher
KGEN
Job: Machine Learning Researcher / AI Voice Researcher - Model Evaluation & Data Strategy
About the Role
We are building structured, high-quality voice datasets for frontier AI companies working on speech-to-text, speech-to-speech, and multimodal AI systems.
We are looking for an AI Voice Researcher who can evaluate datasets across evolving speech models, identify performance gaps, and translate those insights into structured data strategy.
This role sits at the intersection of research, benchmarking, and data intelligence.
What You’ll Own
Cross-Model Dataset Evaluation
- Benchmark voice datasets across ASR and speech models (Whisper, Deepgram, Google STT, etc.)
- Measure performance using WER, CER, MOS, robustness, latency, and error patterns (see the sketch after this list)
- Design structured experiments to understand how different datasets impact model accuracy
- Compare performance across multilingual, dialect-heavy, emotional, and noisy speech data
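For illustration, a minimal sketch of what a single dataset-vs-model benchmark run could look like, assuming openai-whisper and jiwer are available; the file paths, model size, and reference transcripts below are placeholders, not real evaluation data:

```python
# A sketch only: assumes openai-whisper and jiwer are installed
# (pip install openai-whisper jiwer); paths and transcripts are placeholders.
import whisper
import jiwer

model = whisper.load_model("base")  # model size chosen for illustration

samples = [
    # (audio_path, reference_transcript) -- placeholder entries
    ("clips/sample_001.wav", "turn the volume down please"),
    ("clips/sample_002.wav", "what's the weather like tomorrow"),
]

references, hypotheses = [], []
for audio_path, reference in samples:
    result = model.transcribe(audio_path)          # run ASR on one clip
    hypotheses.append(result["text"].strip().lower())
    references.append(reference.lower())

# Corpus-level word and character error rates for this dataset/model pair
print("WER:", jiwer.wer(references, hypotheses))
print("CER:", jiwer.cer(references, hypotheses))
```

The same loop can be repeated per model (Whisper, Deepgram, Google STT, etc.) to compare how each dataset shifts accuracy across systems.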
Model Gap Analysis
- Identify where speech models underperform:
  - Accents and dialects
  - Code-switching
  - Emotional speech
  - Low-resource languages
  - Background noise scenarios
- Quantify model weaknesses through structured analysis (see the sketch below)
- Map performance gaps to specific dataset requirements
You will help define what data the models actually need next.
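For illustration, a minimal sketch of slicing per-utterance error rates by metadata tags to surface weak spots; the WER values and tags are hypothetical placeholders standing in for real evaluation output:

```python
# A sketch only: the per-utterance WER values and metadata tags below are
# made-up placeholders, not measured results.
import pandas as pd

results = pd.DataFrame({
    "wer":    [0.08, 0.21, 0.35, 0.12, 0.44],
    "accent": ["us", "indian", "indian", "us", "scottish"],
    "noise":  ["clean", "clean", "street", "clean", "street"],
})

# Mean WER per (accent, noise) slice; the worst slices point to the data
# that should be collected or sourced next.
gap_report = (results.groupby(["accent", "noise"])["wer"]
                     .mean()
                     .sort_values(ascending=False))
print(gap_report)
```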
Dataset Scoring & Supplier Quality Framework
- Build a standardized dataset quality scoring rubric (a sketch of a weighted rubric follows this list)
- Define measurable evaluation criteria:
  - Audio clarity
  - Speaker diversity
  - Annotation accuracy
  - Emotion depth
  - Accent coverage
- Tag and rank suppliers based on objective quality signals
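For illustration, a minimal sketch of a weighted quality rubric over the criteria above; the weights and example scores are placeholders, not a fixed standard:

```python
# A sketch only: criteria weights and example scores are illustrative.
RUBRIC_WEIGHTS = {
    "audio_clarity":       0.25,
    "speaker_diversity":   0.20,
    "annotation_accuracy": 0.25,
    "emotion_depth":       0.15,
    "accent_coverage":     0.15,
}

def score_dataset(criterion_scores: dict) -> float:
    """Combine per-criterion scores (0 to 1) into one weighted quality score."""
    return sum(RUBRIC_WEIGHTS[name] * criterion_scores[name] for name in RUBRIC_WEIGHTS)

# Example: scoring one hypothetical supplier submission
print(score_dataset({
    "audio_clarity": 0.9,
    "speaker_diversity": 0.7,
    "annotation_accuracy": 0.85,
    "emotion_depth": 0.6,
    "accent_coverage": 0.8,
}))
```

Scores like this make it straightforward to tag and rank suppliers on the same objective signals.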
Benchmarking Reports & Strategic Insights
- Publish structured benchmarking reports
- Track performance shifts as new models are released
- Stay updated on evolving speech model architectures
- Provide outside-in insights to support conversations with AI research teams
What We’re Looking For
- 2–6 years of experience in speech AI, audio ML, or applied AI research
- Strong understanding of ASR / TTS systems and model behavior
- Experience running experiments and benchmarking models
- Strong Python and ML experimentation skills
- Ability to design structured evaluation frameworks
Technical Skills
- Python (mandatory)
- PyTorch / TensorFlow
- Whisper / SpeechBrain / Kaldi or similar
- Familiarity with WER, CER, MOS, and SNR metrics
- Experience working with multilingual datasets
Ideal Mindset
- Deeply curious about how models fail
- Analytical and detail-oriented
- Outside-in thinker
- Comfortable reading research papers and testing new APIs
- Strong written communication skills
What Success Looks Like
- Clear benchmarking framework across multiple speech models
- Published internal evaluation reports
- Identified model gaps tied to structured data recommendations
- Dataset quality scoring system operational
- Measurable improvement in supplier differentiation and data strategy
Ready to build the future of AI + Humans?
📩 Send your profile to: apurva@kgen.io or hr@humynlabs.ai
Learn more: https://kgen.io
Follow us: https://x.com/KGeN_Community | https://x.com/KGeN_IO
