A powerful and modular toolkit for record linkage and duplicate detection in Python
-
Updated
Feb 21, 2024 - Python
A powerful and modular toolkit for record linkage and duplicate detection in Python
Learned string similarity for entity names using optimal transport.
📐 Hidden alignment conditional random field for classifying string pairs.
Similar string search in Levenshtein distance
A Python library for calculating string distances using C extensions (with a pure Python fallback)
A python implementation of a variety of text/string distance and similarity metrics. No GPL!
Deduplicate data using fuzzy and deterministic matching rules.
CRF Edit Distance
Fast(ish) string similarity for one vs many comparisons.
Python wrapper for Rust's strsim library
Detects similarities between strings & generates similarity hash
Common string similarity algorithm implementations.
🐍 Python implementations of common string distance and similarity algorithms
program that prints the longest substring in which the letters occur in alphabetical order
Add a description, image, and links to the string-distance topic page so that developers can more easily learn about it.
To associate your repository with the string-distance topic, visit your repo's landing page and select "manage topics."