© Copyright Factored 2019 - 2024. All Rights Reserved.
Case-Studies: Manufacturing
Factored developed a graph-based solution that detects variations of supplier names and generates a representative name for each group. The model was deployed as an API on Google Cloud.
The Challenge:
Companies are free to name suppliers however they like in their own ERP systems. When several ERP systems are integrated, the same supplier ends up recorded under multiple names, which creates duplicate records and undermines large-scale analysis. The task is to identify the variations of each supplier name and group them under a single, normalized supplier name.
The Solution:
Our solution fetched external data from Bing, which helped correctly identify tricky cases such as subsidiaries, mergers, and acquisitions. Our proprietary algorithm then built a graph from this data, applied a hierarchical clustering approach, and finished with a deduplication and post-processing step. The model achieved an adjusted mutual information (AMI) score of 91%; AMI is a chance-corrected measure of agreement between two groupings. This significantly improved data quality, and the client can now detect transactions related to the same supplier even when its name is recorded differently.
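To make the graph-and-grouping idea concrete, here is a minimal sketch in Python. It is not Factored's proprietary algorithm: it swaps in plain string similarity from the standard library in place of the Bing-derived data, and connected components in place of hierarchical clustering. The suffix list, the 0.85 threshold, and every function name are assumptions chosen for illustration, and a string-only approach like this would not catch subsidiaries, mergers, or acquisitions.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical list of legal-form tokens to ignore when comparing names.
LEGAL_SUFFIXES = {
    "inc", "incorporated", "llc", "ltd", "limited",
    "corp", "corporation", "co", "company", "gmbh",
}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal-form tokens."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] between the normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_names(names: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Treat names as graph nodes, link pairs whose similarity reaches the
    threshold, and return the connected components (via union-find)."""
    parent = list(range(len(names)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similarity(names[i], names[j]) >= threshold:
                parent[find(i)] = find(j)

    groups: dict[int, list[str]] = defaultdict(list)
    for i, name in enumerate(names):
        groups[find(i)].append(name)
    return list(groups.values())


def representative(group: list[str]) -> str:
    """Pick the name most similar, in total, to the rest of its group."""
    return max(group, key=lambda n: sum(similarity(n, m) for m in group))


if __name__ == "__main__":
    suppliers = [
        "Acme Inc.", "ACME Incorporated", "Acmee, Inc",
        "Globex Corp", "Globex Corporation",
    ]
    for group in group_names(suppliers):
        print(representative(group), "<-", group)
```

The pairwise comparison is quadratic in the number of names, so a sketch like this only suits small inputs; at scale, candidate pairs would need to be narrowed first, for example by blocking on shared tokens.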
Tech Stack & Skills:
BigQuery, Dataflow, Bing, Dedupe, StarSpace, Machine Learning.