FinSAH Model: Our Custom BERT-Based Conversational Intelligence
Published on May 28, 2025 by Syed Ali Hassan
FinSAH is our internal conversational intelligence layer built on top of a lean, battle-tested Transformer backbone (DistilBERT) augmented with domain-tuned retrieval, structured corpora, and deterministic FAQ logic. Instead of depending on heavyweight external LLM providers for every turn, we engineered a hybrid stack that keeps answers grounded, fast, and privacy-conscious.
Why We Built FinSAH
- Determinism & Reliability: Avoid brittle hallucinations by constraining answer space to verified internal knowledge.
- Cost Efficiency: Lightweight extractive QA beats per-token generation costs for high-volume support / advisory workflows.
- Data Residency & Privacy: Keep sensitive profile and project data local without shipping full context to third parties.
- Composable Architecture: Swap or layer retrieval sources (FAQ, profile, PDF, legal, blog) without retraining the core model.
High-Level Architecture
- Query Normalization: Light cleaning, lowercasing, token filtering (stopwords trimmed only for scoring stage—raw query preserved for answering).
- Multi-Corpus Retrieval: Keyword & token overlap scoring across: FAQ dataset, Profile (resume-derived), Legal/Process docs, Blog knowledge, PDF-ingested corpus.
- Stage Selection: Fast path: FAQ match → General cross-corpus → Profile-focused fallback → Conversation history synthesis → Safe fallback summary.
- Extractive QA: DistilBERT-based QA over a context window assembled from the top-N retrieved chunks (see the sketch after this list).
- Answer Synthesis / Sanitization: If QA is low-confidence, we synthesize a concise, multi-source stitched response—never returning "I do not know".
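To ground the extractive stage, here is a minimal sketch of assembling the top retrieved chunks into a context window and running DistilBERT QA over it. It assumes the Hugging Face transformers question-answering pipeline and a standard distilled SQuAD checkpoint; the 0.3 score threshold, the chunk tuple shape, and the fallback stitching are illustrative rather than production values.

```python
from transformers import pipeline

# Lightweight extractive backbone; the exact checkpoint is an assumption here.
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

def answer(query: str, chunks: list[tuple[float, str, str]]) -> dict:
    """chunks: (score, text, source_type) tuples produced by the retrieval stage."""
    top = chunks[:6]
    context = "\n---\n".join(text for _, text, _ in top)
    result = qa(question=query, context=context)
    if result["score"] < 0.3 or len(result["answer"].split()) < 3:
        # Low confidence: stitch a concise multi-source summary from the top
        # chunks instead of returning "I do not know".
        stitched = " ".join(text.split(". ")[0].rstrip(".") + "." for _, text, _ in top[:3])
        return {"answer": stitched, "sources": [src for _, _, src in top]}
    return {"answer": result["answer"], "sources": [src for _, _, src in top]}
```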
Data Preparation Pipeline
- FAQ Curation: High-frequency strategic, pricing, timeline, integration, security, IP ownership questions manually authored → deterministic answers with semantic scoring fallback.
- Resume / Profile Ingestion: PDF parsed → chunked into semantic atomic sections (basic info, roles, achievements) → indexed with IDs (e.g., profile-basic).
- Blog & Legal Docs: Existing structured content tokenized and stored with sourceType metadata for traceability in answers.
- Chunk Scoring: A simple overlap heuristic (token Jaccard + weighted keyword hits) keeps the implementation transparent and inspectable; a sketch follows this list.
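Below is a minimal sketch of that overlap heuristic, assuming plain token Jaccard plus a small bonus for weighted domain keywords; the keyword list, weights, and the 0.1 blending factor are illustrative placeholders rather than our production values.

```python
import re

# Illustrative keyword weights and stopword list; production values differ.
KEYWORD_WEIGHTS = {"pricing": 2.0, "security": 2.0, "timeline": 1.5, "integration": 1.5}
STOPWORDS = {"the", "a", "an", "is", "of", "to", "and", "for", "in"}

def tokenize(text: str) -> set[str]:
    """Lowercase, split on non-alphanumerics, drop stopwords (scoring stage only)."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}

def score_chunk(query: str, chunk: str) -> float:
    """Token Jaccard overlap plus a weighted bonus for shared domain keywords."""
    q, c = tokenize(query), tokenize(chunk)
    if not q or not c:
        return 0.0
    jaccard = len(q & c) / len(q | c)
    keyword_bonus = sum(KEYWORD_WEIGHTS.get(t, 0.0) for t in q & c)
    return jaccard + 0.1 * keyword_bonus
```

Ranking is then a plain sort, e.g. `sorted(chunks, key=lambda ch: score_chunk(query, ch["text"]), reverse=True)[:6]`, which keeps every score directly inspectable.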
Model Customization Strategy
We did not blindly fine-tune BERT on small datasets that invite overfitting. Instead, we focused on strategically wrapping a robust pretrained extractive QA backbone:
- Prompt Assembly: Concatenate top contextual chunks with clear separators and a concise directive statement.
- Confidence Heuristics: Length + pattern checks (e.g., avoidance of generic uncertainty phrases) trigger an alternate retrieval stage (see the sketch after this list).
- History Integration: Recent turns distilled into a pseudo-context block if direct retrieval is weak.
- Source Attribution: Each answer surfaces up to six scored chunks with source type tags for auditability.
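Here is a compact sketch of the prompt assembly and confidence heuristics described above, assuming chunks arrive as dicts with `text` and `source_type` keys; the uncertainty phrases and the minimum-length threshold are illustrative.

```python
# Generic uncertainty phrases whose presence marks an answer as low confidence.
UNCERTAIN_PHRASES = ("i don't know", "i do not know", "not sure", "cannot answer")

def assemble_context(chunks: list[dict], limit: int = 6) -> str:
    """Concatenate top chunks with separators and source-type tags for auditability."""
    parts = [f"[{c['source_type']}] {c['text']}" for c in chunks[:limit]]
    return "\n---\n".join(parts)

def is_confident(answer: str, min_tokens: int = 3) -> bool:
    """Length + pattern checks; a failure triggers the next retrieval stage."""
    text = answer.strip().lower()
    if len(text.split()) < min_tokens:
        return False
    return not any(phrase in text for phrase in UNCERTAIN_PHRASES)
```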
Why Not Immediate Fine-Tuning?
Fine-tuning extractive QA models on narrow proprietary corpora can lead to degraded generalization and brittle answers. Our architecture achieves high precision via retrieval quality & deterministic layers first. We reserve fine-tuning for future phases once we accumulate higher-quality interaction logs and validated target answer spans.
Extensibility Roadmap
- Semantic Vector Retrieval: Introduce hybrid scoring (BM25 + embeddings) while retaining transparency (sketched after this list).
- Light Supervised Fine-Tune: Use curated Q&A pairs harvested from production interactions.
- Adaptive FAQ Expansion: Automatically propose new FAQ entries based on clustering of unmatched queries.
- Reinforcement Signals: Track helpfulness votes & dwell time for ranking adjustments.
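As a rough sketch of where the hybrid-scoring item could land, the snippet below blends BM25 with embedding similarity while keeping both signals visible. It assumes the rank_bm25 and sentence-transformers packages, and the MiniLM checkpoint plus the 50/50 blend are assumptions rather than settled choices.

```python
import numpy as np
from rank_bm25 import BM25Okapi                         # assumed lexical scorer
from sentence_transformers import SentenceTransformer   # assumed embedding model

def hybrid_scores(query: str, docs: list[str], alpha: float = 0.5) -> np.ndarray:
    """Blend normalized BM25 and cosine-similarity scores; alpha is illustrative."""
    tokenized = [d.lower().split() for d in docs]
    lexical = np.array(BM25Okapi(tokenized).get_scores(query.lower().split()))

    model = SentenceTransformer("all-MiniLM-L6-v2")
    doc_vecs = model.encode(docs, normalize_embeddings=True)
    q_vec = model.encode([query], normalize_embeddings=True)[0]
    semantic = doc_vecs @ q_vec  # cosine similarity via normalized dot products

    def norm(x: np.ndarray) -> np.ndarray:
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)

    # Min-max normalize each signal so lexical and semantic scores blend cleanly.
    return alpha * norm(lexical) + (1 - alpha) * norm(semantic)
```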
Results So Far
- Latency kept low (single lightweight QA inference + minimal chunk assembly).
- No generic "I don't know" responses—context synthesis guarantees a constructive answer.
- Answers consistently grounded with visible sources for trust & compliance review.
FinSAH demonstrates that with careful orchestration, retrieval quality and explainability can outperform naive large model prompting for focused domains. It is a foundation we will iterate toward more adaptive and learning-driven behavior.
Ready to Implement?
Get a personalized AI roadmap, free ROI calculator, and expert guidance tailored to your business needs.