~icefox/cf_issues#16: 
Are there duplicate or near-duplicate crates in existence?

Need to do some research into detecting such things. What we really want is the exact OPPOSITE of a checksum: an algorithm that gives us a graded "how close are these to each other?" answer instead of a yes/no match. Even better if we can feed it richer features than raw text: syntax trees, bag-of-words style featurization, I dunno.
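
As a starting point for that research, here is a rough sketch of one possible "opposite of a checksum": Jaccard similarity over token shingles. This is only one candidate technique, not a decision. The function names (`tokens`, `shingles`, `jaccard`) and the shingle size `k` are made up for illustration, and the tokenizer is a crude stand-in for real featurization such as a syntax tree.

```rust
use std::collections::HashSet;

/// Crude bag-of-words style tokenizer: split on anything that is not
/// alphanumeric or '_', drop empty pieces, lowercase the rest.
/// (Hypothetical helper; a real version might walk a syntax tree instead.)
fn tokens(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric() && c != '_')
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Set of all runs of `k` consecutive tokens ("shingles").
/// Inputs shorter than `k` tokens become a single shingle; empty input
/// gives an empty set.
fn shingles(text: &str, k: usize) -> HashSet<Vec<String>> {
    let toks = tokens(text);
    let k = k.max(1); // slice::windows panics on a window size of 0
    if toks.is_empty() {
        return HashSet::new();
    }
    if toks.len() < k {
        return std::iter::once(toks).collect();
    }
    toks.windows(k).map(|w| w.to_vec()).collect()
}

/// Jaccard similarity of the two shingle sets: |A ∩ B| / |A ∪ B|.
/// 1.0 means identical shingle sets, 0.0 means nothing in common.
/// Two empty inputs are treated as identical (an assumption of this sketch).
fn jaccard(a: &str, b: &str, k: usize) -> f64 {
    let sa = shingles(a, k);
    let sb = shingles(b, k);
    if sa.is_empty() && sb.is_empty() {
        return 1.0;
    }
    let inter = sa.intersection(&sb).count() as f64;
    let union = sa.union(&sb).count() as f64;
    inter / union
}
```

Calling `jaccard(src_a, src_b, 2)` on two source strings gives a score between 0.0 and 1.0 that drops gradually as the texts diverge, where a checksum would flip on the first changed byte. Comparing every pair of crates this way would be quadratic, so something like MinHash on top of the shingle sets might be needed at scale; that is also an open question.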

Status
REPORTED
Submitter
~icefox
Assigned to
No-one
Submitted
1 year, 7 months ago
Updated
1 year, 7 months ago
Labels
AREA-Analysis TYPE-Design research