Data Deduplication
Identify and Eliminate Duplicates Fast
On average, a database contains 8-10% duplicate records. These duplicates result in waste and inefficiencies and cloud your ability to get a single, accurate view of the customer.
Melissa's MatchUp is the most powerful and accurate matching and deduping solution on the market for combating duplicate records. What sets it apart from the rest is its intelligent parsing capability, which understands and parses the various components of domestic and international addresses. By combining deep domain knowledge of international address formats with advanced fuzzy matching techniques, MatchUp gives you the ability to identify and merge/purge even the most difficult-to-spot duplicate records.
- Eliminate clutter and duplicates that prevent a clear view of your customers
- Increase the accuracy of your database – saving you time and money
- Reduce postage and mailing costs by eliminating duplicates using advanced matching technology
How MatchUp Works
MatchUp employs a matchcode to determine if two records should be considered duplicates. MatchUp uses a predefined matchcode, or one that you have created using the Matchcode Editor.
The following matchcode components (data types) are available for use in identifying duplicates:
- Prefix
- First Name
- Middle Name
- Last Name
- Suffix
- Gender
- First/Nickname
- Middle/Nickname
- Department/Title
- Company
- Company Acronym
- Street Number
- Street Pre-Directional
- Street Name
- Street Suffix
- Street Post-Directional
- PO Box™
- Street Secondary
- Address
- City
- State/Province
- ZIP9
- ZIP5
- ZIP+4®
- Postal Code
- Country
- Phone/Fax
- Email Address
- Credit Card Number
- Date
- Numeric
- Proximity
- General ID
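As a rough illustration of the matchcode idea described above, a matchcode can be thought of as a list of components, each paired with a comparison rule, where two records count as duplicates only if every component agrees. The Python sketch below is not MatchUp's API or its matchcode format. The `Component` class, the `records_match` function, the rules, and the field names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Rule = Callable[[str, str], bool]


def exact(a: str, b: str) -> bool:
    """Case-insensitive equality after trimming whitespace."""
    return a.strip().lower() == b.strip().lower()


def first_n(n: int) -> Rule:
    """Build a rule that compares only the first n characters."""
    def rule(a: str, b: str) -> bool:
        return a.strip().lower()[:n] == b.strip().lower()[:n]
    return rule


@dataclass(frozen=True)
class Component:
    name: str   # hypothetical field name in the record
    rule: Rule  # how the two values are compared


def records_match(matchcode: List[Component],
                  r1: Dict[str, str], r2: Dict[str, str]) -> bool:
    """Two records match only if every component's rule is satisfied.

    Note: in this simplified sketch, two blank values also count as equal.
    """
    return all(c.rule(r1.get(c.name, ""), r2.get(c.name, ""))
               for c in matchcode)


# Hypothetical matchcode: ZIP5 + last name + first 3 letters of first name
# + street number.
MATCHCODE = [
    Component("zip5", exact),
    Component("last_name", exact),
    Component("first_name", first_n(3)),
    Component("street_number", exact),
]

rec_a = {"first_name": "Robert", "last_name": "Smith",
         "street_number": "22", "zip5": "92688"}
rec_b = {"first_name": "Rob", "last_name": "SMITH ",
         "street_number": "22", "zip5": "92688"}
rec_c = {"first_name": "Linda", "last_name": "Smith",
         "street_number": "22", "zip5": "92688"}

print(records_match(MATCHCODE, rec_a, rec_b))  # True
print(records_match(MATCHCODE, rec_a, rec_c))  # False
```

Here "Robert" and "Rob" are treated as the same person because only the first three characters of the first name are compared, while "Linda" at the same address is kept as a separate record.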
Fuzzy Matching
MatchUp combines Melissa’s deep domain knowledge of contact data with over 20 fuzzy matching algorithms to match similar records and quickly dedupe your database.
MatchUp employs the following fuzzy matching algorithms to identify “non-exact matching” duplicate records:
- Phonetex
- Soundex
- Containment
- Frequency
- Fast Near
- Accurate Near
- Frequency Near
- UTF-8 Near
- Vowels Only
- Consonants Only
- Alphas Only
- Numerics Only
- MD Keyboard
- Jaro
- Jaro-Winkler
- n-Gram
- Needleman-Wunsch
- Dice’s Coefficient
- Smith-Waterman-Gotoh
- Jaccard Similarity Coefficient
- Overlap Coefficient
- Longest Common Substring
- Double Metaphone
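To give a feel for how a few of the listed measures score "non-exact" matches, the sketch below implements the textbook definitions of the Jaccard similarity coefficient, Dice's coefficient, and the overlap coefficient over sets of character bigrams. It is an assumption that bigrams are a reasonable unit here; MatchUp's own implementations, tokenization, and thresholds are not documented in this text and may differ. The function names are hypothetical.

```python
def _normalize(text: str) -> str:
    """Lowercase and keep only letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def bigrams(text: str) -> set:
    """Set of adjacent character pairs in the normalized text."""
    s = _normalize(text)
    return {s[i:i + 2] for i in range(len(s) - 1)}


def _prepare(a: str, b: str):
    """Return (shortcut_score, set_a, set_b); shortcut_score is None
    unless the comparison can be decided without set arithmetic."""
    if _normalize(a) == _normalize(b):
        return 1.0, None, None
    x, y = bigrams(a), bigrams(b)
    if not x or not y:  # too short to form a bigram
        return 0.0, None, None
    return None, x, y


def jaccard(a: str, b: str) -> float:
    """|X & Y| / |X | Y|"""
    score, x, y = _prepare(a, b)
    return score if score is not None else len(x & y) / len(x | y)


def dice(a: str, b: str) -> float:
    """2 * |X & Y| / (|X| + |Y|)"""
    score, x, y = _prepare(a, b)
    return score if score is not None else 2 * len(x & y) / (len(x) + len(y))


def overlap(a: str, b: str) -> float:
    """|X & Y| / min(|X|, |Y|)"""
    score, x, y = _prepare(a, b)
    return score if score is not None else len(x & y) / min(len(x), len(y))


print(jaccard("Jonathan", "Jonathon"))            # 0.625
print(round(dice("Jonathan", "Jonathon"), 3))     # 0.769
print(round(overlap("Jonathan", "Jonathon"), 3))  # 0.833
```

In practice a score like these would be compared against a threshold chosen per field; the three measures rank the same pair differently, which is one reason a matching tool offers several.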
Global Merge / Purge & Deduping
The World Edition of MatchUp supports 12 countries, including Canada, Germany, the U.K., and Poland. MatchUp’s advanced deduping can see through diacritic equivalents to Latin characters and recognize keywords that mean the same thing but are written differently (e.g., Germany and DEU).
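A minimal sketch of both ideas, assuming a simple approach built on Python's standard `unicodedata` module: diacritics are folded to plain Latin letters, and country keywords are mapped to one canonical code. This is not how MatchUp is implemented; the alias table, the extra letter mappings, and the function names are hypothetical and deliberately tiny.

```python
import unicodedata

# Letters that have no Unicode decomposition need an explicit mapping.
# Hypothetical, incomplete table for illustration.
_SPECIAL = str.maketrans({"ß": "ss", "Ł": "L", "ł": "l", "Ø": "O", "ø": "o"})

# Hypothetical alias table: different spellings of the same country.
COUNTRY_ALIASES = {
    "germany": "DE", "deutschland": "DE", "deu": "DE", "de": "DE",
    "poland": "PL", "polska": "PL", "pol": "PL", "pl": "PL",
}


def fold_diacritics(text: str) -> str:
    """Replace accented letters with their unaccented Latin equivalents."""
    decomposed = unicodedata.normalize("NFKD", text.translate(_SPECIAL))
    # Drop the combining marks left behind by decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_country(value: str) -> str:
    """Map a country keyword to a canonical code when it is known."""
    key = fold_diacritics(value).strip().lower()
    return COUNTRY_ALIASES.get(key, key.upper())


print(fold_diacritics("Müller"))   # Muller
print(fold_diacritics("Łódź"))     # Lodz
print(normalize_country("Germany") == normalize_country("DEU"))  # True
```

Folding "ü" to "u" is only one convention; German data sometimes uses "ue" instead, which a production matcher would also need to account for.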
Unique Matching Scenarios
MatchUp has some unique attributes which can be employed to help identify duplicates in some interesting ways.
1. Survivorship for Golden Record Creation
MatchUp can select the best elements from multiple records to survive consolidation, which is ideal for creating golden records for a single customer view. Available in Microsoft SQL Server Integration Services (SSIS) and Pentaho PDI.
2. Proximity Matching
MatchUp’s patented distance algorithm uses latitude-longitude coordinates and proximity thresholds to identify duplicate records that are geographically close together. For instance, using location attributes, MatchUp can detect matching records at different addresses (for example, a company with two different entrances) that are within a specified distance of each other.
3. Householding
MatchUp can identify and consolidate records that are members of the same household to better understand customer relationships, lifecycle, and needs. You can also use MatchUp to bring together multiple business accounts into “corporate families” to build insight and better evaluate the total sales relationship. Householding can also be used to eliminate unnecessary multiple mailings to the same household to cut down on wasted print, production, and postage costs.
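The first two scenarios above can be sketched in a few lines of Python. For proximity, the sketch substitutes the well-known haversine great-circle formula, which is not MatchUp's patented algorithm. For survivorship, it assumes one simple rule: for each field, keep the non-empty value from the most recently updated record. The field names (`lat`, `lon`, `updated`) and function names are hypothetical.

```python
import math
from typing import Dict, List

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # min() guards against rounding pushing h slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def within_proximity(r1: Dict, r2: Dict, threshold_km: float) -> bool:
    """True when two records lie within threshold_km of each other."""
    return haversine_km(r1["lat"], r1["lon"],
                        r2["lat"], r2["lon"]) <= threshold_km


def golden_record(group: List[Dict]) -> Dict:
    """Assumed survivorship rule: per field, the newest non-empty value wins.

    'updated' is a hypothetical ISO-8601 date string, so plain string
    ordering matches chronological ordering.
    """
    newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
    fields = {f for r in group for f in r if f != "updated"}
    golden = {}
    for f in sorted(fields):
        for r in newest_first:
            if r.get(f):
                golden[f] = r[f]
                break
    return golden


# Two entrances of the same building, roughly 110 m apart.
front = {"lat": 33.6400, "lon": -117.6000}
back = {"lat": 33.6410, "lon": -117.6000}
print(within_proximity(front, back, threshold_km=0.2))   # True
print(within_proximity(front, back, threshold_km=0.05))  # False

group = [
    {"name": "R. Smith", "phone": "", "email": "rsmith@example.com",
     "updated": "2021-03-01"},
    {"name": "Robert Smith", "phone": "555-0100", "email": "",
     "updated": "2023-07-15"},
]
print(golden_record(group))
```

The golden record takes the name and phone number from the newer record and falls back to the older record for the email address, since the newer one left it blank. Real survivorship rules are usually configurable per field (longest value, most complete, trusted source, and so on).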
Three Ways to Dedupe Your Data
MatchUp offers three methods of operation (or ways to match records):
1. Read / Write Deduping
Compares records in one or more databases at once. Each unique group will have one record that receives an “output” status; the other matching records receive a “duplicate” status. Ideal for matching entire databases at one time.
2. Incremental Deduping
Enables real-time matching by comparing each record as it comes in (for example, from a web form or call center) against the existing master database. If the incoming record is not a duplicate, it can be added.
3. Hybrid Deduping
Provides a combination of the first two methods with the flexibility to customize the process to match an incoming record against a small cluster of potential matches. With hybrid deduping you can store the match keys in a proprietary manner. Ideal for real-time data entry or batch processing of entire lists.
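The difference between the first two modes can be sketched as follows. This is an illustration only, assuming exact comparison on a single hypothetical match key (last name, first initial, ZIP5); it does not use MatchUp's API, and the function and class names are made up. The batch function tags the first record of each group as "output" and the rest as "duplicate"; the incremental class checks one incoming record at a time against the keys already in the master set.

```python
from typing import Dict, Iterable, List, Tuple


def match_key(record: Dict[str, str]) -> str:
    """Hypothetical match key: last name + first initial + ZIP5."""
    return "|".join([
        record["last_name"].strip().lower(),
        record["first_name"].strip().lower()[:1],
        record["zip5"].strip(),
    ])


def dedupe_batch(records: Iterable[Dict[str, str]]) -> List[Tuple[str, Dict]]:
    """Read/write style: tag every record in one pass over the whole list."""
    seen = set()
    tagged = []
    for rec in records:
        key = match_key(rec)
        status = "duplicate" if key in seen else "output"
        seen.add(key)
        tagged.append((status, rec))
    return tagged


class IncrementalDeduper:
    """Incremental style: test each incoming record against the master keys."""

    def __init__(self, master: Iterable[Dict[str, str]] = ()):
        self._keys = {match_key(rec) for rec in master}

    def add_if_new(self, record: Dict[str, str]) -> bool:
        """Return True and remember the record if it is not a duplicate."""
        key = match_key(record)
        if key in self._keys:
            return False
        self._keys.add(key)
        return True


records = [
    {"first_name": "Robert", "last_name": "Smith", "zip5": "92688"},
    {"first_name": "Rob", "last_name": "SMITH", "zip5": "92688"},
    {"first_name": "Linda", "last_name": "Smith", "zip5": "92688"},
]
print([status for status, _ in dedupe_batch(records)])
# ['output', 'duplicate', 'output']

deduper = IncrementalDeduper(master=records[:1])
print(deduper.add_if_new(records[1]))  # False (already in the master set)
print(deduper.add_if_new(records[2]))  # True  (new record, now added)
```

Hybrid deduping is not shown; per the description above, it would let the caller store the match keys and decide which small cluster of candidates each incoming record is compared against.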
Request a Demonstration
A demonstration with one of our representatives gives you a first-hand look at our products in action. Request one today.