Building AI-Powered Meeting Intelligence: Lessons from EVA Meet

by Hamzah Ejaz, Software Engineer

Building EVA Meet at CogniCloud taught me invaluable lessons about integrating multiple AI services into a cohesive, enterprise-grade platform. Here's what I learned architecting a system that processes live conversations, fact-checks claims in real time, and generates actionable insights.

The Challenge

When we started EVA Meet, the goal was ambitious: create a meeting intelligence platform that could:

  • Transcribe conversations with near-perfect accuracy in real time
  • Fact-check claims as they're discussed
  • Generate intelligent summaries and extract action items
  • Deliver everything with sub-2-second latency
  • Scale to enterprise requirements

Architecture Overview

Multi-AI Orchestration

The core challenge was orchestrating three different AI services:

Deepgram for Transcription

// Real-time transcription with Deepgram (SDK v3)
import { createClient, LiveTranscriptionEvents } from '@deepgram/sdk'

const deepgram = createClient(process.env.DEEPGRAM_API_KEY)

const connection = deepgram.listen.live({
  model: 'nova-2',
  language: 'en',
  smart_format: true,
  punctuate: true,
})

// The SDK emits a Transcript event each time Deepgram returns results
connection.on(LiveTranscriptionEvents.Transcript, (data) => {
  const transcript = data.channel.alternatives[0].transcript
  if (transcript) {
    // Broadcast to connected WebSocket clients
    io.emit('transcription', {
      text: transcript,
      confidence: data.channel.alternatives[0].confidence,
      timestamp: Date.now()
    })
  }
})

Perplexity AI for Fact-Checking

async function factCheck(claim: string) {
  const response = await fetch('https://api.perplexity.ai/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.PERPLEXITY_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'pplx-70b-online',
      messages: [{
        role: 'user',
        content: `Fact-check this claim and provide sources: "${claim}"`
      }]
    })
  })

  // Surface HTTP failures instead of parsing an error body as the result
  if (!response.ok) {
    throw new Error(`Perplexity API error: ${response.status}`)
  }

  return await response.json()
}

GPT-4 for Summarization

import OpenAI from 'openai'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

async function generateSummary(transcript: string) {
  const completion = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{
      role: 'system',
      content: 'You are an expert at analyzing meeting transcripts...'
    }, {
      role: 'user',
      content: `Summarize this meeting and extract action items:\n\n${transcript}`
    }],
    temperature: 0.3,
  })

  return completion.choices[0].message.content
}

Real-time Infrastructure

WebSocket Architecture

Achieving sub-2-second latency required careful WebSocket design:

io.on('connection', (socket) => {
  console.log('Client connected:', socket.id)

  socket.on('join-meeting', async ({ meetingId, userId }) => {
    socket.join(meetingId)

    // Send historical transcript
    const history = await getTranscriptHistory(meetingId)
    socket.emit('transcript-history', history)
  })

  socket.on('audio-chunk', async (audioData) => {
    // Forward raw audio to this meeting's Deepgram live connection
    // (the `connection` created in the transcription snippet above)
    deepgramConnection.send(audioData)
  })

  socket.on('request-factcheck', async ({ claim, meetingId }) => {
    const result = await factCheck(claim)
    io.to(meetingId).emit('factcheck-result', result)
  })
})

Performance Optimization

Streaming Responses

Instead of waiting for complete AI responses, we stream results to the client as they arrive:

const stream = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [...],
  stream: true,
})

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content
  if (content) {
    socket.emit('summary-chunk', content)
  }
}

Key Learnings

1. API Rate Limiting & Costs

Managing multiple AI APIs requires careful rate limiting:

// RateLimiter is a thin wrapper that tracks per-service quotas
const rateLimiter = new RateLimiter({
  openai: { requests: 500, per: 'minute' },
  perplexity: { requests: 100, per: 'minute' },
  deepgram: { minutes: 10000, per: 'month' }
})

async function callWithRateLimit(service, fn) {
  await rateLimiter.wait(service)
  return await fn()
}
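
For the per-minute cases, such a limiter reduces to a few lines. Here's a minimal sliding-window sketch, illustrative rather than our production code (the Deepgram monthly quota is a billing limit better tracked outside the request path):

type Limit = { requests: number; perMs: number }

class SlidingWindowLimiter {
  private calls = new Map<string, number[]>()

  constructor(private limits: Record<string, Limit>) {}

  async wait(service: string): Promise<void> {
    const { requests, perMs } = this.limits[service]
    for (;;) {
      const now = Date.now()
      // Keep only the timestamps still inside the window
      const recent = (this.calls.get(service) ?? []).filter(t => now - t < perMs)
      if (recent.length < requests) {
        recent.push(now)
        this.calls.set(service, recent)
        return
      }
      // Sleep until the oldest call falls out of the window
      await new Promise(resolve => setTimeout(resolve, recent[0] + perMs - now))
    }
  }
}

const limiter = new SlidingWindowLimiter({
  openai: { requests: 500, perMs: 60_000 },
  perplexity: { requests: 100, perMs: 60_000 },
})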

2. Error Handling & Fallbacks

Enterprise systems need robust error handling:

// `deepgram.transcribe` and `whisperAPI.transcribe` are simplified
// wrappers around the respective SDK calls
async function transcribeWithFallback(audio) {
  try {
    return await deepgram.transcribe(audio)
  } catch (error) {
    logger.error('Deepgram failed:', error)
    // Fall back to OpenAI Whisper so transcription keeps flowing
    return await whisperAPI.transcribe(audio)
  }
}

3. Context Window Management

GPT-4's context window is limited (8K tokens for the base model), so we implemented smart context management, capping input at 7,000 tokens to leave headroom for the response:

function buildContextWindow(transcript: string, maxTokens = 7000) {
  const messages = splitIntoMessages(transcript)
  const context: string[] = []
  let tokenCount = 0

  // Take most recent messages that fit in context
  for (let i = messages.length - 1; i >= 0; i--) {
    const messageTokens = estimateTokens(messages[i])
    if (tokenCount + messageTokens > maxTokens) break
    context.unshift(messages[i])
    tokenCount += messageTokens
  }

  return context
}
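
The `splitIntoMessages` and `estimateTokens` helpers are elided above. A rough character-count heuristic (about four characters per token for English) is usually close enough for windowing; a real tokenizer such as tiktoken gives exact counts. A minimal sketch, assuming the transcript stores one speaker turn per line:

// Rough heuristic: ~4 characters per token for English text.
// Swap in a real tokenizer (e.g. tiktoken) for exact counts.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Assumes one speaker turn per transcript line
function splitIntoMessages(transcript: string): string[] {
  return transcript.split('\n').filter(line => line.trim().length > 0)
}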

Production Challenges Solved

Scalability

  • Implemented connection pooling for WebSockets
  • Used Redis for distributed caching (see the sketch after this list)
  • Deployed on Kubernetes for auto-scaling
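
As one concrete example of the caching layer, fact-check results cache well because the same claim tends to resurface within and across meetings. A sketch using node-redis, where the key scheme and one-hour TTL are illustrative choices rather than our exact configuration:

import { createClient } from 'redis'

const redis = createClient({ url: process.env.REDIS_URL })
await redis.connect()

// Return a cached verdict when the same claim was checked recently,
// otherwise hit Perplexity and cache the result for an hour
async function cachedFactCheck(claim: string) {
  const key = `factcheck:${claim.trim().toLowerCase()}`
  const cached = await redis.get(key)
  if (cached) return JSON.parse(cached)

  const result = await factCheck(claim)
  await redis.set(key, JSON.stringify(result), { EX: 3600 })
  return result
}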

Data Privacy

  • End-to-end encryption for audio streams
  • GDPR-compliant data retention policies
  • Secure API key management with Vault

Reliability

  • Implemented circuit breakers for AI APIs
  • Added retry logic with exponential backoff (a minimal sketch follows this list)
  • Built health check endpoints for monitoring
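
The retry helper is small enough to show in full. The retry count, backoff base, and jitter below are representative defaults, not our exact production values:

// Retry a flaky call with exponential backoff plus jitter
async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn()
    } catch (error) {
      lastError = error
      if (attempt === maxRetries) break
      // 1s, 2s, 4s, ... plus up to 250ms of jitter to avoid thundering herds
      const delay = 1000 * 2 ** attempt + Math.random() * 250
      await new Promise(resolve => setTimeout(resolve, delay))
    }
  }
  throw lastError
}

// Usage: wrap any AI call that can fail transiently
const summary = await withRetry(() => generateSummary(transcript))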

Results

  • < 2 second latency for real-time features
  • 95%+ accuracy in transcription and action item extraction
  • Zero downtime during peak usage
  • Served 100+ concurrent meetings without degradation

Takeaways for Developers

  1. Start simple: Don't optimize prematurely. Get it working, then make it fast.
  2. Monitor everything: AI APIs can fail in unexpected ways. Comprehensive logging saved us multiple times.
  3. Test with real data: Mock data doesn't reveal edge cases in natural language processing.
  4. Budget wisely: AI API costs can spiral quickly. Implement usage tracking early.
  5. User feedback loops: The best improvements came from watching how users actually used the features.

Next Steps

We're exploring:

  • Custom LLM fine-tuning for domain-specific meetings
  • Multi-language support
  • Integration with more meeting platforms
  • On-premise deployment for security-sensitive clients

Building EVA Meet was one of the most challenging and rewarding projects of my career. The intersection of AI and real-time systems presents unique challenges, but the impact on productivity is transformative.

Want to discuss AI integration strategies? Get in touch.
