Access the full text.
Sign up today, get DeepDyve free for 14 days.
Record linkage, also known as database matching or entity resolution, is now recognised as a core step in the KDD process. Data mining projects increasingly require that information from several sources is combined before the actual mining can be conducted. Also of increasing interest is the deduplication of a single database. The objectives of record linkage and deduplication are to identify, match and merge all records that relate to the same real-world entities. Because real-world data is commonly 'dirty', data cleaning is an important first step in many deduplication, record linkage, and data mining project. In this paper, an overview of the Febrl (Freely Extensible Biomedical Record Linkage) system is provided, and the results of a recent survey of Febrl users is discussed. Febrl includes a variety of functionalities required for data cleaning, deduplication and record linkage, and it provides a graphical user interface that facilitates its application for users who do not have programming experience.
ACM SIGKDD Explorations Newsletter – Association for Computing Machinery
Published: Nov 16, 2009
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.