Does xDM support deterministic or probabilistic matching?
Best Answer
Cédric BLANC said over 2 years ago
Semarchy xDM leverages both deterministic and probabilistic matching.
Deterministic matching means "matching using rules." That can lead to some confusion, because all xDM matching is based on rules. However, xDM rules can contain probabilistic algorithms for 'fuzzy matched' entities. A rule may include exact matching (name1 = name2), fuzzy matching (phonetic(name1) = phonetic(name2)), or both.
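To make the difference between the two kinds of condition concrete, here is a minimal Python sketch. It is not xDM code: `phonetic` in the answer is shorthand, and the `soundex`, `exact_match`, and `fuzzy_match` functions below are hypothetical stand-ins that use a simplified American Soundex encoding.

```python
def soundex(name: str) -> str:
    """Simplified American Soundex: first letter plus three digits."""
    codes = {}
    for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in group:
            codes[ch] = digit

    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""

    result = letters[0].upper()
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = codes.get(ch, "")  # vowels and other letters have no code
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]  # pad or truncate to four characters


def exact_match(name1: str, name2: str) -> bool:
    # Deterministic condition: name1 = name2
    return name1 == name2


def fuzzy_match(name1: str, name2: str) -> bool:
    # Fuzzy condition: phonetic(name1) = phonetic(name2), Soundex assumed here
    return soundex(name1) == soundex(name2)


print(exact_match("Smith", "Smyth"), fuzzy_match("Smith", "Smyth"))
```

"Smith" and "Smyth" fail the exact condition but share the same Soundex code, so only the fuzzy condition treats them as candidates for a match.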
Fuzzy Matched Entities use a Matcher to automatically detect duplicates using fuzzy matching algorithms such as:
• Metaphone & Double Metaphone
• Soundex
• Edit Distance & Edit Distance Similarity
• Jaro-Winkler & Jaro-Winkler Similarity
• NGRAMs
• Levenshtein
• etc.
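As an illustration of what an edit-distance style algorithm computes, the following Python sketch implements the classic Levenshtein distance and a percentage similarity derived from it. The function names are hypothetical, and normalising by the longer string's length is an assumption made for this example, not a statement about how xDM computes its similarity functions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def edit_distance_similarity(a: str, b: str) -> float:
    """Similarity as a percentage (assumed normalisation: longer length)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - edit_distance(a, b) / longest)


print(edit_distance("kitten", "sitting"))
print(round(edit_distance_similarity("kitten", "sitting"), 1))
```

A rule could then compare such a similarity against a cut-off value instead of requiring strict equality.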
Multiple Match Rules in a matcher allow you to define any number of conditions required for considering two records a match. Each condition has its own Matching Score. This score represents the percentage of confidence you put in a match that occurs based on that rule. Records are considered matched when the aggregate score of all conditions exceeds the thresholds you define.
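The interplay of rules, scores, and thresholds can be sketched in Python as follows. Everything here is hypothetical: the `MatchRule` class, the field names, and the scores are made up for illustration, and the sketch assumes that the aggregate for a pair is the highest score among the rules that fire and that a pair matches when that score is at or above the threshold. The answer above does not specify how xDM aggregates scores.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass
class MatchRule:
    name: str
    condition: Callable[[Record, Record], bool]
    score: int  # percentage of confidence placed in this rule


def pair_score(a: Record, b: Record, rules: List[MatchRule]) -> int:
    # Assumption: the aggregate is the best score among the rules that fire.
    fired = [rule.score for rule in rules if rule.condition(a, b)]
    return max(fired, default=0)


def is_match(a: Record, b: Record, rules: List[MatchRule], threshold: int) -> bool:
    # Assumption: a pair matches when its score is at or above the threshold.
    return pair_score(a, b, rules) >= threshold


def last_name(record: Record) -> str:
    return record["name"].split()[-1].lower()


rules = [
    MatchRule("same email",
              lambda a, b: a["email"].lower() == b["email"].lower(), 95),
    MatchRule("same last name and zip",
              lambda a, b: last_name(a) == last_name(b) and a["zip"] == b["zip"], 70),
]

r1 = {"name": "Jon Smith", "email": "jon@example.com", "zip": "75001"}
r2 = {"name": "John Smith", "email": "jon@example.com", "zip": "75001"}
r3 = {"name": "Jane Smith", "email": "jane@example.com", "zip": "75001"}

print(pair_score(r1, r2, rules), is_match(r1, r2, rules, threshold=80))
print(pair_score(r1, r3, rules), is_match(r1, r3, rules, threshold=80))
```

With a threshold of 80, the first pair matches through the high-confidence email rule, while the second pair only triggers the weaker name-and-zip rule and stays below the threshold.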
Matching Groups are created using matching transitivity. Matching Transitivity means:
If A matches B and B matches C, then A, B, and C are in the same matching group. Each matching group has a Confidence Score expressing the level of confidence across the group of matching records. This score is the average of the individual match scores in the group.
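Transitive grouping and the averaged confidence score can be sketched with a small union-find structure in Python. This is an illustration of the idea only, not xDM's implementation. The function name and input shape are hypothetical, and giving a single-record group a confidence of `None` is an assumption of this sketch.

```python
from collections import defaultdict
from typing import List, Optional, Tuple


def build_match_groups(
    records: List[str],
    matches: List[Tuple[str, str, float]],
) -> List[Tuple[List[str], Optional[float]]]:
    """Group records transitively and average the match scores per group."""
    parent = {r: r for r in records}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # If A matches B and B matches C, all three end up under one root.
    for a, b, _ in matches:
        parent[find(a)] = find(b)

    members = defaultdict(list)
    for r in records:
        members[find(r)].append(r)

    scores = defaultdict(list)
    for a, _, score in matches:
        scores[find(a)].append(score)

    groups = []
    for root, ids in members.items():
        group_scores = scores.get(root, [])
        # Assumption: a record with no matches has no confidence score.
        confidence = sum(group_scores) / len(group_scores) if group_scores else None
        groups.append((sorted(ids), confidence))
    return sorted(groups, key=lambda g: g[0])


groups = build_match_groups(
    ["A", "B", "C", "D"],
    [("A", "B", 90), ("B", "C", 70)],
)
print(groups)
```

Here A and C never match directly, yet they land in the same group through B, and the group's confidence is the average of the two individual match scores.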