FinSAH Model: Our Custom BERT-Based Conversational Intelligence

Published on May 28, 2025 by Syed Ali Hassan

FinSAH is our internal conversational intelligence layer built on top of a lean, battle-tested Transformer backbone (DistilBERT) augmented with domain-tuned retrieval, structured corpora, and deterministic FAQ logic. Instead of depending on heavyweight external LLM providers for every turn, we engineered a hybrid stack that keeps answers grounded, fast, and privacy-conscious.

Why We Built FinSAH

  • Determinism & Reliability: Avoid brittle hallucinations by constraining answer space to verified internal knowledge.
  • Cost Efficiency: Lightweight extractive QA beats per-token generation costs for high-volume support / advisory workflows.
  • Data Residency & Privacy: Keep sensitive profile and project data local without shipping full context to third parties.
  • Composable Architecture: Swap or layer retrieval sources (FAQ, profile, PDF, legal, blog) without retraining the core model.

High-Level Architecture

  1. Query Normalization: Light cleaning, lowercasing, and token filtering (stopwords trimmed only for the scoring stage; the raw query is preserved for answering).
  2. Multi-Corpus Retrieval: Keyword & token overlap scoring across: FAQ dataset, Profile (resume-derived), Legal/Process docs, Blog knowledge, PDF-ingested corpus.
  3. Stage Selection: FAQ match (fast path) → general cross-corpus retrieval → profile-focused fallback → conversation-history synthesis → safe fallback summary (a routing sketch follows this list).
  4. Extractive QA: DistilBERT-based QA (context window assembled from top-N retrieved chunks).
  5. Answer Synthesis / Sanitization: If QA is low-confidence, we synthesize a concise, multi-source stitched response—never returning "I do not know".
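
A minimal sketch of that cascade, assuming a hypothetical Stages interface whose members stand in for the FAQ matcher, the overlap retriever, the DistilBERT QA call, the history distiller, and the fallback summarizer (none of these names are the production API):

```typescript
// Illustrative types only; names and shapes are assumptions, not the production schema.
interface Chunk {
  id: string;                                                      // e.g. "profile-basic"
  sourceType: 'faq' | 'profile' | 'legal' | 'blog' | 'pdf';
  text: string;
  score: number;
}

interface Answer { text: string; sources: Chunk[]; stage: string; }

interface Stages {
  matchFaq(query: string): Answer | null;                          // deterministic FAQ fast path
  retrieve(query: string, corpus: 'general' | 'profile'): Chunk[]; // keyword / overlap retrieval
  extractiveQA(query: string, chunks: Chunk[]): Promise<{ text: string; confident: boolean }>;
  historyChunk(history: string[]): Chunk | null;                   // recent turns as pseudo-context
  fallbackSummary(query: string): string;                          // stitched multi-source summary
}

async function answerQuery(raw: string, history: string[], s: Stages): Promise<Answer> {
  const faq = s.matchFaq(raw);                                     // Stage: FAQ match
  if (faq) return faq;

  for (const corpus of ['general', 'profile'] as const) {          // Stage: cross-corpus, then profile
    const chunks = s.retrieve(raw, corpus);
    const qa = await s.extractiveQA(raw, chunks);
    if (qa.confident) return { text: qa.text, sources: chunks.slice(0, 6), stage: corpus };
  }

  const hist = s.historyChunk(history);                            // Stage: history synthesis
  if (hist) {
    const qa = await s.extractiveQA(raw, [hist]);
    if (qa.confident) return { text: qa.text, sources: [hist], stage: 'history' };
  }

  return { text: s.fallbackSummary(raw), sources: [], stage: 'fallback' }; // never "I do not know"
}
```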

Data Preparation Pipeline

  • FAQ Curation: High-frequency questions on strategy, pricing, timelines, integrations, security, and IP ownership manually authored → deterministic answers with a semantic-scoring fallback.
  • Resume / Profile Ingestion: PDF parsed → chunked into semantic atomic sections (basic info, roles, achievements) → indexed with IDs (e.g., profile-basic).
  • Blog & Legal Docs: Existing structured content tokenized and stored with sourceType metadata for traceability in answers.
  • Chunk Scoring: Simple overlap heuristic (token Jaccard + weighted keyword hits) keeps implementation transparent and inspectable.
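
A simplified version of that overlap heuristic; the stopword list and the keyword weight are illustrative placeholders, assuming each chunk carries a small set of hand-tagged keywords:

```typescript
// Token Jaccard overlap plus weighted keyword hits; weights and stopwords are illustrative.
const STOPWORDS = new Set(['the', 'a', 'an', 'is', 'of', 'to', 'and', 'for']);

function tokenize(text: string): Set<string> {
  return new Set(
    text.toLowerCase().split(/[^a-z0-9]+/).filter(t => t.length > 1 && !STOPWORDS.has(t))
  );
}

function scoreChunk(query: string, chunkText: string, keywords: string[] = []): number {
  const q = tokenize(query);
  const c = tokenize(chunkText);

  // Token Jaccard: shared tokens over total distinct tokens.
  let shared = 0;
  for (const t of q) if (c.has(t)) shared++;
  const union = q.size + c.size - shared;
  const jaccard = union === 0 ? 0 : shared / union;

  // Weighted keyword hits: boost chunks whose tagged keywords appear in the query.
  const keywordHits = keywords.filter(k => q.has(k.toLowerCase())).length;

  return jaccard + 0.5 * keywordHits; // 0.5 is an assumed weight, not the production value
}
```

Chunks from every corpus run through the same scorer, which is what keeps the ranking inspectable end to end.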

Model Customization Strategy

We did not blindly fine-tune BERT on small subsets that would invite overfitting. Instead, we focused on strategically wrapping a robust pretrained extractive QA backbone:

  • Prompt Assembly: Concatenate top contextual chunks with clear separators and a concise directive statement.
  • Confidence Heuristics: Length + pattern checks (e.g., avoidance of generic uncertainty phrases) trigger an alternate retrieval stage (sketched after this list).
  • History Integration: Recent turns distilled into a pseudo-context block if direct retrieval is weak.
  • Source Attribution: Each answer surfaces up to six scored chunks with source type tags for auditability.
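
Below is a sketch of the prompt assembly and the low-confidence check, assuming the extractive QA step runs DistilBERT through transformers.js; the checkpoint name, separator, directive wording, and thresholds are assumptions rather than the production values:

```typescript
import { pipeline } from '@xenova/transformers'; // assumed runtime for the DistilBERT QA step

const SEPARATOR = '\n---\n';
const UNCERTAIN = [/i (do not|don't) know/i, /not sure/i, /cannot (answer|determine)/i];

// Concatenate the top retrieved chunks with clear separators and a concise directive.
function assembleContext(chunks: { text: string }[], maxChunks = 4): string {
  const body = chunks.slice(0, maxChunks).map(c => c.text).join(SEPARATOR);
  return `Answer using only the context below.${SEPARATOR}${body}`;
}

// Length + pattern checks: short or hedging answers push the router to the next stage.
function isConfident(answer: string, score: number): boolean {
  if (!answer || answer.trim().length < 12) return false;
  if (UNCERTAIN.some(rx => rx.test(answer))) return false;
  return score >= 0.25; // assumed threshold
}

async function extractiveQA(question: string, chunks: { text: string }[]) {
  // In practice the pipeline would be created once and cached, not rebuilt per call.
  const qa = await pipeline('question-answering', 'Xenova/distilbert-base-uncased-distilled-squad');
  const out = (await qa(question, assembleContext(chunks))) as { answer: string; score: number };
  return { text: out.answer, confident: isConfident(out.answer, out.score) };
}
```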

Why Not Immediate Fine-Tuning?

Fine-tuning extractive QA models on narrow proprietary corpora can lead to degraded generalization and brittle answers. Our architecture achieves high precision via retrieval quality & deterministic layers first. We reserve fine-tuning for future phases once we accumulate higher-quality interaction logs and validated target answer spans.

Extensibility Roadmap

  • Semantic Vector Retrieval: Introduce hybrid scoring (BM25 + embeddings) while retaining transparency (a blending sketch follows this list).
  • Light Supervised Fine-Tune: Use curated Q&A pairs harvested from production interactions.
  • Adaptive FAQ Expansion: Automatically propose new FAQ entries based on clustering of unmatched queries.
  • Reinforcement Signals: Track helpfulness votes & dwell time for ranking adjustments.
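
For the hybrid-retrieval item, one transparent way to blend lexical and semantic evidence is a weighted sum of a normalized BM25 score and an embedding cosine similarity; the normalization and the default 0.5 weight are illustrative assumptions:

```typescript
// Cosine similarity between a query embedding and a chunk embedding.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
  return na && nb ? dot / (Math.sqrt(na) * Math.sqrt(nb)) : 0;
}

// Blend a pre-computed BM25 score (from an existing lexical index) with the semantic score.
function hybridScore(
  bm25: number, maxBm25: number,
  queryVec: number[], chunkVec: number[],
  lexicalWeight = 0.5,
): number {
  const lexical = maxBm25 > 0 ? bm25 / maxBm25 : 0; // normalize BM25 to [0, 1]
  const semantic = cosine(queryVec, chunkVec);
  return lexicalWeight * lexical + (1 - lexicalWeight) * semantic;
}
```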

Results So Far

  • Latency kept low (single lightweight QA inference + minimal chunk assembly).
  • No generic "I don't know" responses—context synthesis guarantees a constructive answer.
  • Answers consistently grounded with visible sources for trust & compliance review.

FinSAH demonstrates that, with careful orchestration, retrieval quality and explainability can outperform naive large-model prompting in focused domains. It is a foundation we will iterate on toward more adaptive, learning-driven behavior.
