Product repositories: how to reconcile them?
Simplify repository management with data reconciliation.


The need
Obstacles to product repositories
Our customer, a major player in the consumer credit market, wanted to offer a one-click online financing plan to all used car buyers.
While most partner sites selling used vehicles use Argus (and sometimes JATO) as their automotive repository, the algorithms that generate our customer's financing plans were based on a different repository: EUROTAX.
For a private buyer to receive a financing plan within seconds, each entry in one repository had to be matched to exactly one entry in the other. The repositories share no common key, and differences in how they describe vehicles made this matching non-trivial.
Proposed solution
Reconcile product repositories with Tale of Data
Use of special "full-text" joins designed by Tale of Data (approx. 100,000 entries per repository):
- Creation of a composite key for each repository by concatenating several fields (e.g. model, long version label, number of doors, year of commissioning, etc.).
- Each composite key is matched with the composite keys in the other repositories that share the most "words" with it. Words are weighted according to their rarity in the composite key corpus (principle: the rarer a shared word is in the corpus, the more credible the match).
- Elimination of multiple matches using numerical "arbitration" fields (such as price incl. VAT or CO2 emission level): these fields are not standardized enough to be included in the composite key, but they are very effective for choosing when a vehicle from one repository is matched with several vehicles from another. The candidate with the closest price and CO2 emission level is selected.
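The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Tale of Data's implementation: the field names, the sample records, the IDF-style rarity weight and the relative-distance arbitration rule are all hypothetical choices made for the example.

```python
import math
from collections import Counter

# Hypothetical field names: the composite key is built from these.
KEY_FIELDS = ("model", "version", "doors", "year")
# Hypothetical numeric arbitration fields, kept out of the composite key.
ARBITRATION_FIELDS = ("price", "co2")


def composite_key(record):
    """Concatenate the key fields and split the result into lowercase words."""
    text = " ".join(str(record[f]) for f in KEY_FIELDS)
    return set(text.lower().split())


def rarity_weights(keys):
    """Give rarer words a higher weight (IDF-style formula, an assumption)."""
    n = len(keys)
    doc_freq = Counter(word for key in keys for word in key)
    return {word: math.log(1 + n / df) for word, df in doc_freq.items()}


def arbitration_distance(a, b):
    """Sum of relative gaps on the arbitration fields (an assumed rule)."""
    return sum(
        abs(a[f] - b[f]) / max(abs(a[f]), 1e-9) for f in ARBITRATION_FIELDS
    )


def match(source, target):
    """Map each source id to the id of its best target entry, if any."""
    src_keys = [composite_key(r) for r in source]
    tgt_keys = [composite_key(r) for r in target]
    weights = rarity_weights(src_keys + tgt_keys)

    result = {}
    for rec, key in zip(source, src_keys):
        # Score = total weight of the words shared by the two composite keys.
        scores = [
            sum(weights[w] for w in sorted(key & tk)) for tk in tgt_keys
        ]
        best = max(scores, default=0.0)
        if best == 0.0:
            continue  # no word in common with any target entry
        tied = [t for t, s in zip(target, scores) if math.isclose(s, best)]
        # Multiple matches: keep the one closest on price and CO2.
        winner = min(tied, key=lambda t: arbitration_distance(rec, t))
        result[rec["id"]] = winner["id"]
    return result


# Made-up sample records for illustration only.
argus_like = [
    {"id": "A1", "model": "Clio", "version": "1.5 dCi Business",
     "doors": 5, "year": 2019, "price": 18500, "co2": 95},
]
eurotax_like = [
    {"id": "T1", "model": "Clio", "version": "1.5 dCi 90 Business",
     "doors": 5, "year": 2019, "price": 18400, "co2": 96},
    {"id": "T2", "model": "Clio", "version": "1.5 dCi 90 Business",
     "doors": 5, "year": 2019, "price": 21000, "co2": 110},
    {"id": "T3", "model": "Megane", "version": "1.3 TCe Intens",
     "doors": 5, "year": 2019, "price": 22000, "co2": 120},
]

matches = match(argus_like, eurotax_like)
```

In this sample, T1 and T2 have identical composite keys and tie on the word score, so the arbitration fields decide: T1 is closer to A1 on both price and CO2. Comparing every pair as done here is quadratic and would need an inverted index or similar optimization at around 100,000 entries per repository.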


Benefits
The benefits of reconciling product repositories
Thanks to the involvement of business experts (who have in-depth knowledge of automotive repositories), the fields used in the composite key and the arbitration fields were optimally determined.
The rate of unique matches rose:
- From 55% with the first approach, in which the customer's Data Scientists coded string matching algorithms in Python. The business had repeatedly rejected these algorithms over several months.
- To 95% with the composite key approach proposed by Tale of Data, which involved the business teams.
- Because the remaining 5% of multiple matches showed no significant difference in the financing plan generated, the customer's business teams validated the Tale of Data approach within a week.
Product Benefits
Integrate
Seamless integration of AI and No-Code technology for effective data refinement.
Quality
Achieve superior data quality faster and at a lower cost, while demonstrating the tangible value of investing in data excellence.
