From SharePoint Chaos to Searchable Knowledge
How we untangled 8 years of orphaned sites for a 200-person consulting firm.
The Situation
A mid-size management consulting firm came to us with a familiar problem: 8 years of SharePoint accumulation. That meant 47 team sites, 3 different folder naming conventions, an unknown number of duplicate documents, and 200 consultants each spending an average of 45 minutes per day searching for internal documents.
Phase 1: Content Audit
We ran an automated content audit across all 47 SharePoint sites, classifying documents by type, project, and freshness. The audit found 142,000 documents in total, of which 38% were duplicates and 22% were orphaned (no identifiable owner).
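To make the audit step concrete, here is a minimal sketch of how orphaned and stale documents might be counted. It is illustrative only: the `DocRecord` fields, the `audit` function, and the two-year staleness threshold are assumptions for the example, not the tooling or rules used on this project.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass
class DocRecord:
    """Hypothetical audit record; field names are assumptions."""
    path: str
    owner: Optional[str]  # None when no owner can be resolved
    modified: date


def freshness(doc: DocRecord, today: date, stale_after_days: int = 730) -> str:
    """Label a document 'stale' if untouched for longer than the threshold."""
    age_days = (today - doc.modified).days
    return "stale" if age_days > stale_after_days else "fresh"


def audit(docs: Iterable[DocRecord], today: date) -> dict:
    """Count total, orphaned (no owner), and stale documents."""
    summary = {"total": 0, "orphaned": 0, "stale": 0}
    for doc in docs:
        summary["total"] += 1
        if doc.owner is None:
            summary["orphaned"] += 1
        if freshness(doc, today) == "stale":
            summary["stale"] += 1
    return summary
```

A real audit would also classify by type and project and would read ownership from the platform's own metadata; this sketch only shows the counting logic.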
Phase 2: Deduplication & Normalization
We built a deduplication pipeline that used file hashing to catch exact duplicates and semantic similarity to catch near-duplicates. For each set of duplicates, we kept the most recent version and archived the rest. Then we normalized formats, standardized metadata, and tagged each document by project, client, and knowledge domain.
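The sketch below illustrates the two ideas under stated assumptions. Exact duplicates are grouped by SHA-256 content hash and the newest version in each group is kept. For near-duplicates, the example swaps in a simple word-shingle Jaccard score as a lexical stand-in for semantic similarity; an embedding-based comparison would replace it in practice. The `FileVersion` type and all function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass
class FileVersion:
    """Hypothetical file record; field names are assumptions."""
    path: str
    content: bytes
    modified: datetime


def content_hash(content: bytes) -> str:
    """SHA-256 digest of the raw bytes; identical files share a digest."""
    return hashlib.sha256(content).hexdigest()


def dedupe_exact(files: List[FileVersion]) -> Tuple[List[FileVersion], List[FileVersion]]:
    """Group by hash, keep the most recent file per group, archive the rest."""
    groups = {}
    for f in files:
        groups.setdefault(content_hash(f.content), []).append(f)
    keep, archive = [], []
    for versions in groups.values():
        versions.sort(key=lambda v: v.modified, reverse=True)
        keep.append(versions[0])
        archive.extend(versions[1:])
    return keep, archive


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard overlap of k-word shingles (a lexical stand-in, not semantic)."""
    def shingles(text: str) -> set:
        words = text.lower().split()
        return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)
```

Pairs whose score exceeds a chosen threshold would be flagged as near-duplicates for review; the threshold is a tuning decision and none is implied here.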
Phase 3: AI-Powered Search
With clean data, we deployed an AI search interface. Consultants could ask natural language questions and get cited answers with source links. Chunking strategies specific to each document type meant proposals, deliverables, and memos were each split in a way suited to their structure.
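As a rough illustration of type-specific chunking, the sketch below picks a chunk size and overlap per document type and splits text into overlapping word windows. The `ChunkPolicy` type, the `POLICIES` table, and every number in it are placeholder assumptions, not the settings used in this deployment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChunkPolicy:
    size: int     # words per chunk
    overlap: int  # words shared between consecutive chunks


# Placeholder values for illustration only.
POLICIES = {
    "proposal": ChunkPolicy(size=300, overlap=50),
    "deliverable": ChunkPolicy(size=500, overlap=75),
    "memo": ChunkPolicy(size=150, overlap=20),
}


def chunk(text: str, doc_type: str) -> List[str]:
    """Split text into overlapping word windows using the type's policy."""
    policy = POLICIES[doc_type]
    words = text.split()
    step = policy.size - policy.overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + policy.size]))
        if start + policy.size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

A production pipeline would more likely split on structural boundaries such as headings or sections; fixed word windows are used here only to keep the example short.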
Results
- 82% reduction in average search time (45 min → 8 min/day)
- 3x faster proposal creation for new consultants
- Payback period of under 3 months
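The search-time figure can be checked with simple arithmetic. The snippet below uses only numbers stated above (45 and 8 minutes per day, 200 consultants); the firm-wide hours figure is a derived estimate that assumes every consultant sees the average saving.

```python
before_min = 45    # average search time per consultant per day, before
after_min = 8      # average search time per consultant per day, after
consultants = 200

saved_per_person = before_min - after_min            # minutes saved per day
reduction = saved_per_person / before_min             # fractional reduction
firm_hours_per_day = saved_per_person * consultants / 60

print(f"Reduction: {reduction:.0%}")
print(f"Saved per consultant: {saved_per_person} min/day")
print(f"Saved firm-wide: {firm_hours_per_day:.1f} hours/day")
```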