Files
maternal-app/maternal-web/lib/voice

Voice Intent Classification

Accurate classification of voice commands for hands-free activity tracking.

Overview

The voice intent classification system converts natural-language voice commands into structured data for feeding, sleep, and diaper tracking. It combines regex-based pattern matching (to detect the intent) with entity extraction (to pull out details such as amounts, durations, sides, and times).

Supported Intents

1. Feeding

Track bottle feeding, breastfeeding, and solid food consumption.

Subtypes:

  • bottle - Bottle feeding with formula or pumped milk
  • breast_left - Breastfeeding from left side
  • breast_right - Breastfeeding from right side
  • breast_both - Breastfeeding from both sides
  • solid - Solid food or meals

Extractable Entities:

  • Amount (ml, oz, tbsp)
  • Duration (minutes)
  • Side (left, right, both)
  • Time (absolute or relative)

Examples:

"Fed baby 120 ml"
"Gave him 4 ounces"
"Nursed on left breast for 15 minutes"
"Breastfed on both sides for 20 minutes"
"Baby ate solid food"
"Had breakfast"

2. Sleep

Track naps and nighttime sleep.

Subtypes:

  • nap - Daytime nap
  • night - Nighttime sleep

Extractable Entities:

  • Duration (minutes)
  • Type (nap or night)
  • Time (start/end)

Examples:

"Baby fell asleep for a nap"
"Napped for 45 minutes"
"Put baby down for bedtime"
"Baby is sleeping through the night"
"Baby woke up"

3. Diaper

Track diaper changes.

Subtypes:

  • wet - Wet diaper (urine)
  • dirty - Dirty diaper (bowel movement)
  • both - Both wet and dirty
  • dry - Dry/clean diaper

Extractable Entities:

  • Type (wet, dirty, both)
  • Time (when changed)

Examples:

"Changed wet diaper"
"Dirty diaper change"
"Changed a wet and dirty diaper"
"Baby had a bowel movement"
"Diaper had both poop and pee"

Usage

Basic Classification

import { classifyIntent } from '@/lib/voice/intentClassifier';

const result = classifyIntent("Fed baby 120 ml");

console.log(result.intent); // 'feeding'
console.log(result.confidence); // 0.9
console.log(result.structuredData);
// {
//   type: 'bottle',
//   amount: 120,
//   unit: 'ml'
// }

With Validation

import { classifyIntent, validateClassification } from '@/lib/voice/intentClassifier';

const result = classifyIntent(userInput);

if (validateClassification(result)) {
  // Confidence >= 0.3 and intent is known
  createActivity(result.structuredData);
} else {
  // Low confidence or unknown intent
  showError("Could not understand command");
}
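
The check described in the comments above amounts to something like the following sketch. The real implementation lives in @/lib/voice/intentClassifier; the type shapes here are illustrative, not the library's actual definitions.

```typescript
// Illustrative sketch only; not the library's actual code.
type Intent = 'feeding' | 'sleep' | 'diaper' | 'unknown';

interface ClassificationResult {
  intent: Intent;
  confidence: number; // 0..1
}

const MIN_CONFIDENCE = 0.3;

function validateClassification(result: ClassificationResult): boolean {
  // Accept only known intents that clear the minimum confidence threshold.
  return result.intent !== 'unknown' && result.confidence >= MIN_CONFIDENCE;
}
```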

Confidence Levels

import { getConfidenceLevel } from '@/lib/voice/intentClassifier';

const level = getConfidenceLevel(0.85); // 'high'
// 'high':   >= 0.8
// 'medium': >= 0.5 and < 0.8
// 'low':    < 0.5
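
Given those bands, the helper can be pictured as this small sketch (illustrative, not the library source):

```typescript
type ConfidenceLevel = 'high' | 'medium' | 'low';

// Map a numeric confidence (0..1) onto the three documented bands.
function getConfidenceLevel(confidence: number): ConfidenceLevel {
  if (confidence >= 0.8) return 'high';
  if (confidence >= 0.5) return 'medium';
  return 'low';
}
```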

API Endpoint

POST /api/voice/transcribe

Classify text input. Audio transcription is planned but not yet implemented; audio requests currently return VOICE_AUDIO_NOT_IMPLEMENTED.

Text Input:

curl -X POST http://localhost:3030/api/voice/transcribe \
  -H "Content-Type: application/json" \
  -d '{"text": "Fed baby 120ml"}'

Response:

{
  "success": true,
  "transcription": "Fed baby 120ml",
  "classification": {
    "intent": "feeding",
    "confidence": 0.9,
    "confidenceLevel": "high",
    "entities": [
      {
        "type": "amount",
        "value": 120,
        "confidence": 0.9,
        "text": "120 ml"
      }
    ],
    "structuredData": {
      "type": "bottle",
      "amount": 120,
      "unit": "ml"
    }
  }
}
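
From client code, the same request can be made with fetch. This is a sketch: it assumes the dev server on port 3030 from the curl example above and a response shaped like the JSON just shown.

```typescript
// Sketch of a client-side call; assumes the local dev server from the
// curl example. Adjust the origin for your deployment.
async function classifyText(text: string): Promise<unknown> {
  const res = await fetch('http://localhost:3030/api/voice/transcribe', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) {
    throw new Error(`Transcribe request failed with status ${res.status}`);
  }
  return res.json();
}
```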

GET /api/voice/transcribe

Get supported commands and examples.

curl http://localhost:3030/api/voice/transcribe

Pattern Matching

The classifier uses regex patterns to detect intents:

Feeding Patterns

  • Fed/feed/gave + amount + unit
  • Bottle feeding keywords
  • Breastfeeding keywords (nursed, nursing)
  • Solid food keywords (ate, breakfast, lunch, dinner)

Sleep Patterns

  • Sleep/nap keywords
  • Fell asleep / woke up
  • Bedtime / night sleep

Diaper Patterns

  • Diaper/nappy keywords
  • Changed diaper
  • Wet/dirty/poop/pee keywords
  • Bowel movement / BM
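
The matching step can be sketched as below. The keyword lists are a simplified stand-in for the classifier's INTENT_PATTERNS table, which is more extensive.

```typescript
// Simplified stand-in for the classifier's INTENT_PATTERNS table.
const PATTERNS: Record<string, RegExp[]> = {
  feeding: [
    /\b(fed|feed|gave|nursed|nursing|bottle|breastfed)\b/i,
    /\b(ate|breakfast|lunch|dinner|solid)\b/i,
  ],
  sleep: [/\b(sleep|sleeping|nap|napped|asleep|bedtime|woke)\b/i],
  diaper: [/\b(diaper|nappy|wet|dirty|poop|pee|soiled|bm)\b/i, /bowel movement/i],
};

// Return the first intent with a matching pattern, else 'unknown'.
function matchIntent(text: string): string {
  for (const [intent, patterns] of Object.entries(PATTERNS)) {
    if (patterns.some((p) => p.test(text))) return intent;
  }
  return 'unknown';
}
```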

Entity Extraction

Amount Extraction

Recognizes:

  • 120 ml, 120ml, 120 milliliters
  • 4 oz, 4oz, 4 ounces
  • 2 tbsp, 2 tablespoons
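
A sketch of the kind of regex that handles these forms, with unit normalization (the library's actual ENTITY_PATTERNS may differ):

```typescript
const AMOUNT_PATTERN =
  /(\d+(?:\.\d+)?)\s*(ml|milliliters?|oz|ounces?|tbsp|tablespoons?)\b/i;

// Extract a numeric amount and normalize the unit to ml/oz/tbsp.
function extractAmount(text: string): { amount: number; unit: string } | null {
  const match = AMOUNT_PATTERN.exec(text);
  if (!match) return null;
  const raw = match[2].toLowerCase();
  const unit = raw.startsWith('milli')
    ? 'ml'
    : raw.startsWith('ounce')
      ? 'oz'
      : raw.startsWith('tablespoon')
        ? 'tbsp'
        : raw;
  return { amount: parseFloat(match[1]), unit };
}
```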

Duration Extraction

Recognizes:

  • 15 minutes, 15 mins, 15min
  • for 20 minutes
  • lasted 30 minutes
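
A minimal sketch of duration extraction covering the forms listed above:

```typescript
const DURATION_PATTERN = /(\d+)\s*(?:minutes?|mins?)\b/i;

// Pull a duration in minutes out of phrases like "for 20 minutes".
function extractDurationMinutes(text: string): number | null {
  const match = DURATION_PATTERN.exec(text);
  return match ? parseInt(match[1], 10) : null;
}
```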

Time Extraction

Recognizes:

  • Absolute: at 3:30 pm, 10 am
  • Relative: 30 minutes ago, 2 hours ago
  • Contextual: just now, a moment ago
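
Relative forms are resolved against the current time. A sketch covering only the "N minutes/hours ago" case (absolute and contextual forms are handled separately in the classifier):

```typescript
// Resolve phrases like "30 minutes ago" or "2 hours ago" to a Date.
// Only the relative "ago" form is handled in this sketch.
function extractRelativeTime(text: string, now: Date = new Date()): Date | null {
  const match = /(\d+)\s*(minutes?|mins?|hours?|hrs?)\s+ago/i.exec(text);
  if (!match) return null;
  const value = parseInt(match[1], 10);
  const msPerUnit = /^h/i.test(match[2]) ? 3_600_000 : 60_000;
  return new Date(now.getTime() - value * msPerUnit);
}
```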

Side Extraction (Breastfeeding)

Recognizes:

  • left breast, left side, left boob
  • right breast, right side
  • both breasts, both sides
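
Side extraction reduces to a few keyword checks; a sketch:

```typescript
type Side = 'left' | 'right' | 'both';

// "both" is checked first so "both breasts" is not misread as left/right.
function extractSide(text: string): Side | null {
  if (/\bboth\b/i.test(text)) return 'both';
  if (/\bleft\b/i.test(text)) return 'left';
  if (/\bright\b/i.test(text)) return 'right';
  return null;
}
```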

Type Extraction (Diaper)

Recognizes:

  • Wet: wet, pee, urine
  • Dirty: dirty, poop, poopy, soiled, bowel movement, bm
  • Combination: detects both keywords for mixed diapers
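
The mixed-diaper case falls out of running the wet and dirty checks independently; a sketch:

```typescript
type DiaperType = 'wet' | 'dirty' | 'both';

// Combine independent wet/dirty checks to detect mixed diapers.
function extractDiaperType(text: string): DiaperType | null {
  const wet = /\b(wet|pee|urine)\b/i.test(text);
  const dirty =
    /\b(dirty|poopy?|soiled|bm)\b/i.test(text) || /bowel movement/i.test(text);
  if (wet && dirty) return 'both';
  if (wet) return 'wet';
  if (dirty) return 'dirty';
  return null;
}
```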

Common Mishears & Corrections

The system handles common voice recognition errors:

  • "mils" (meant "ml"): the amount pattern includes "ml" variations
  • "ounce says" (meant "ounces"): the pattern matches "ounce" or "oz"
  • "left side" vs "left breast": same meaning, both phrasings recognized
  • "poopy" vs "poop": same meaning, multiple variations supported

Confidence Scoring

Confidence is calculated based on:

  • Pattern matches: More matches = higher confidence
  • Entity extraction: Successfully extracted entities boost confidence
  • Ambiguity: Conflicting signals reduce confidence

Minimum confidence threshold: 0.3 (30%)
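
One plausible way to combine these signals is sketched below. This is a hypothetical formula for illustration, not the library's actual scoring algorithm.

```typescript
// Hypothetical scoring: a base score from pattern matches, a boost per
// extracted entity, capped at 1.0. Not the library's actual formula.
function scoreConfidence(patternMatches: number, entityCount: number): number {
  if (patternMatches === 0) return 0; // nothing matched: unknown intent
  const base = 0.5 + 0.1 * Math.min(patternMatches - 1, 3);
  const entityBoost = 0.1 * Math.min(entityCount, 3);
  return Math.min(base + entityBoost, 1);
}
```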

Testing

Run the test suite:

node scripts/test-voice-intent.mjs

Test Coverage:

  • 25 test cases
  • Feeding: 8 tests (bottle, breast, solid)
  • Sleep: 6 tests (nap, night, duration)
  • Diaper: 7 tests (wet, dirty, both)
  • Edge cases: 4 tests

Multi-Language Support

Currently supports English only. Planned languages:

  • Spanish (es-ES)
  • French (fr-FR)
  • Portuguese (pt-BR)
  • Chinese (zh-CN)

Each language will have localized patterns and keywords.

Integration with Whisper API

For audio transcription, integrate OpenAI Whisper:

import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function transcribeAudio(audioFile: File): Promise<string> {
  const transcription = await openai.audio.transcriptions.create({
    file: audioFile,
    model: 'whisper-1',
    language: 'en', // Optional: specify language
  });

  return transcription.text;
}

Future Enhancements

  • Audio transcription with Whisper API
  • Multi-language support (5 languages)
  • Context-aware classification (user history)
  • Custom vocabulary (child names, brand names)
  • Clarification prompts for ambiguous commands
  • Machine learning-based classification
  • Offline voice recognition fallback
  • Voice feedback confirmation

Troubleshooting

Q: Classification returns 'unknown' for valid commands

  • Check if keywords are in supported patterns
  • Verify minimum confidence threshold (0.3)
  • Add variations to INTENT_PATTERNS

Q: Entities not extracted correctly

  • Check regex patterns in ENTITY_PATTERNS
  • Verify unit formatting (spaces, abbreviations)
  • Test with simplified command first

Q: Confidence too low despite correct intent

  • Multiple pattern matches boost confidence
  • Add more specific patterns for common phrases
  • Adjust confidence calculation algorithm

Error Codes

  • VOICE_INVALID_INPUT - Missing or invalid text input
  • VOICE_AUDIO_NOT_IMPLEMENTED - Audio transcription not yet available
  • VOICE_INVALID_CONTENT_TYPE - Wrong Content-Type header
  • VOICE_CLASSIFICATION_FAILED - Could not classify intent
  • VOICE_TRANSCRIPTION_FAILED - General transcription error