Building AI-Powered Meeting Intelligence: Lessons from EVA Meet
by Hamzah Ejaz, Software Engineer
Building EVA Meet at CogniCloud taught me invaluable lessons about integrating multiple AI services into a cohesive, enterprise-grade platform. Here's what I learned architecting a system that processes live conversations, fact-checks in real-time, and generates actionable insights.
The Challenge
When we started EVA Meet, the goal was ambitious: create a meeting intelligence platform that could:
- Transcribe conversations in real time with near-perfect accuracy
- Fact-check claims as they're discussed
- Generate intelligent summaries and extract action items
- Deliver everything with sub-2-second latency
- Scale to enterprise requirements
Architecture Overview
Multi-AI Orchestration
The core challenge was orchestrating three different AI services:
Deepgram for Transcription
// Real-time transcription with Deepgram (JS SDK v3)
import { createClient, LiveTranscriptionEvents } from '@deepgram/sdk'

const deepgram = createClient(process.env.DEEPGRAM_API_KEY)
const connection = deepgram.listen.live({
  model: 'nova-2',
  language: 'en',
  smart_format: true,
  punctuate: true,
})

connection.on(LiveTranscriptionEvents.Transcript, (data) => {
  const transcript = data.channel.alternatives[0].transcript
  if (transcript) {
    // Emit to WebSocket clients
    io.emit('transcription', {
      text: transcript,
      confidence: data.channel.alternatives[0].confidence,
      timestamp: Date.now(),
    })
  }
})
Perplexity AI for Fact-Checking
async function factCheck(claim: string) {
  const response = await fetch('https://api.perplexity.ai/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.PERPLEXITY_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'pplx-70b-online',
      messages: [{
        role: 'user',
        content: `Fact-check this claim and provide sources: "${claim}"`,
      }],
    }),
  })
  if (!response.ok) {
    throw new Error(`Perplexity request failed: ${response.status}`)
  }
  return await response.json()
}
GPT-4 for Summarization
import OpenAI from 'openai'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

async function generateSummary(transcript: string) {
  const completion = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{
      role: 'system',
      content: 'You are an expert at analyzing meeting transcripts...',
    }, {
      role: 'user',
      content: `Summarize this meeting and extract action items:\n\n${transcript}`,
    }],
    temperature: 0.3,
  })
  return completion.choices[0].message.content
}
Real-time Infrastructure
WebSocket Architecture
Achieving sub-2-second latency required careful WebSocket design:
io.on('connection', (socket) => {
  console.log('Client connected:', socket.id)

  socket.on('join-meeting', async ({ meetingId, userId }) => {
    socket.join(meetingId)
    // Send historical transcript
    const history = await getTranscriptHistory(meetingId)
    socket.emit('transcript-history', history)
  })

  socket.on('audio-chunk', async (audioData) => {
    // Stream to the Deepgram live connection created above
    deepgramConnection.send(audioData)
  })

  socket.on('request-factcheck', async ({ claim, meetingId }) => {
    const result = await factCheck(claim)
    io.to(meetingId).emit('factcheck-result', result)
  })
})
Performance Optimization
Streaming Responses
Instead of waiting for complete AI responses, we stream results:
const stream = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [...],
  stream: true,
})

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content
  if (content) {
    socket.emit('summary-chunk', content)
  }
}
Key Learnings
1. API Rate Limiting & Costs
Managing multiple AI APIs requires careful rate limiting:
const rateLimiter = new RateLimiter({
  openai: { requests: 500, per: 'minute' },
  perplexity: { requests: 100, per: 'minute' },
  deepgram: { minutes: 10000, per: 'month' },
})

async function callWithRateLimit(service, fn) {
  await rateLimiter.wait(service)
  return await fn()
}
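RateLimiter above is our own thin wrapper. As an illustration only (not our production code, which also has to track Deepgram's per-month minutes), a minimal in-memory sliding-window version of wait might look like this:

// Minimal in-memory sliding-window limiter (illustrative sketch; a production
// version needs shared state, e.g. counters in Redis, to work across instances)
class SimpleRateLimiter {
  private timestamps = new Map<string, number[]>()

  constructor(private limits: Record<string, { requests: number; perMs: number }>) {}

  async wait(service: string): Promise<void> {
    const { requests, perMs } = this.limits[service]
    for (;;) {
      const now = Date.now()
      // Keep only the requests still inside the window
      const recent = (this.timestamps.get(service) ?? []).filter((t) => now - t < perMs)
      if (recent.length < requests) {
        recent.push(now)
        this.timestamps.set(service, recent)
        return
      }
      // Sleep until the oldest request ages out, then re-check
      await new Promise((resolve) => setTimeout(resolve, recent[0] + perMs - now))
    }
  }
}

const limiter = new SimpleRateLimiter({
  openai: { requests: 500, perMs: 60_000 },
  perplexity: { requests: 100, perMs: 60_000 },
})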
2. Error Handling & Fallbacks
Enterprise systems need robust error handling:
async function transcribeWithFallback(audio) {
  try {
    return await deepgram.transcribe(audio)
  } catch (error) {
    logger.error('Deepgram failed:', error)
    // Fall back to an alternative service
    return await whisperAPI.transcribe(audio)
  }
}
3. Context Window Management
GPT-4 has token limits. We implemented smart context management:
function buildContextWindow(transcript: string, maxTokens = 7000) {
  const messages = splitIntoMessages(transcript)
  const context: string[] = []
  let tokenCount = 0

  // Take the most recent messages that fit in the context window
  for (let i = messages.length - 1; i >= 0; i--) {
    const messageTokens = estimateTokens(messages[i])
    if (tokenCount + messageTokens > maxTokens) break
    context.unshift(messages[i])
    tokenCount += messageTokens
  }
  return context
}
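The estimateTokens helper above doesn't need to be exact. A crude heuristic (roughly four characters per token for English text; this ratio is an assumption, and a real tokenizer such as the tiktoken package is more accurate) is enough to stay safely under the limit:

// Rough token estimate: ~4 characters per token for English text.
// Swap in a real tokenizer (e.g. the tiktoken package) for accuracy.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}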
Production Challenges Solved
Scalability
- Implemented connection pooling for WebSockets
- Used Redis for distributed caching (see the sketch after this list)
- Deployed on Kubernetes for auto-scaling
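On the Redis point: a common way to make Socket.IO scale horizontally is the @socket.io/redis-adapter package, which fans events out across instances over Redis pub/sub. A minimal wiring sketch (illustrative, not our exact setup):

import { createClient } from 'redis'
import { Server } from 'socket.io'
import { createAdapter } from '@socket.io/redis-adapter'

// Two Redis connections: one to publish, one to subscribe
const pubClient = createClient({ url: process.env.REDIS_URL })
const subClient = pubClient.duplicate()
await Promise.all([pubClient.connect(), subClient.connect()])

// Events emitted on one instance now reach clients connected to any instance
const io = new Server({
  adapter: createAdapter(pubClient, subClient),
})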
Data Privacy
- End-to-end encryption for audio streams
- GDPR-compliant data retention policies
- Secure API key management with Vault (see the sketch after this list)
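For the Vault piece, a sketch of reading keys at startup with the node-vault client (the secret path and field names here are hypothetical):

import nodeVault from 'node-vault'

const vault = nodeVault({
  apiVersion: 'v1',
  endpoint: process.env.VAULT_ADDR,
  token: process.env.VAULT_TOKEN,
})

// KV v2 secrets come back nested under result.data.data
const result = await vault.read('secret/data/eva-meet/api-keys')
const { DEEPGRAM_API_KEY, OPENAI_API_KEY } = result.data.data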
Reliability
- Implemented circuit breakers for AI APIs
- Added retry logic with exponential backoff (sketched after this list)
- Built health check endpoints for monitoring
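As an illustration of the retry point, a minimal backoff helper (the withRetry name and parameters are ours for illustration; the jitter avoids thundering-herd retries):

// Retry with exponential backoff plus a little jitter (illustrative sketch)
async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3, baseMs = 500): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn()
    } catch (error) {
      if (attempt >= maxAttempts) throw error
      const delayMs = baseMs * 2 ** (attempt - 1) + Math.random() * 100
      await new Promise((resolve) => setTimeout(resolve, delayMs))
    }
  }
}

// Usage: wrap any flaky AI call
const summary = await withRetry(() => generateSummary(transcript))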
Results
- Sub-2-second latency for real-time features
- 95%+ accuracy in transcription and action item extraction
- Zero downtime during peak usage
- Served 100+ concurrent meetings without degradation
Takeaways for Developers
- Start simple: Don't optimize prematurely. Get it working, then make it fast.
- Monitor everything: AI APIs can fail in unexpected ways. Comprehensive logging saved us multiple times.
- Test with real data: Mock data doesn't reveal edge cases in natural language processing.
- Budget wisely: AI API costs can spiral quickly. Implement usage tracking early (a minimal sketch follows this list).
- User feedback loops: The best improvements came from watching how users actually used the features.
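On the budgeting point, even a crude per-service counter surfaces cost spikes early. A sketch (illustrative; a real version would persist or log these numbers):

// Crude in-memory usage tracker (illustrative)
const usage: Record<string, { calls: number; tokens: number }> = {}

function trackUsage(service: string, tokens = 0) {
  const entry = (usage[service] ??= { calls: 0, tokens: 0 })
  entry.calls += 1
  entry.tokens += tokens
}

// e.g. after a summarization call:
// trackUsage('openai', completion.usage?.total_tokens ?? 0)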
Next Steps
We're exploring:
- Custom LLM fine-tuning for domain-specific meetings
- Multi-language support
- Integration with more meeting platforms
- On-premise deployment for security-sensitive clients
Building EVA Meet was one of the most challenging and rewarding projects of my career. The intersection of AI and real-time systems presents unique challenges, but the impact on productivity is transformative.
Want to discuss AI integration strategies? Get in touch.