
hellrail t1_isjxlxf wrote

You need to find a method to turn these names into feature vectors, such that similar names cluster together naturally in feature space. Start with standard string similarities to get the feature vector. If that does not result in sufficiently unambiguous clusters, proceed with lemmatization methods, and if that is still not sufficient, try out some pretrained models to generate the feature encoding.
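
A rough sketch of the first step (string similarities as the feature vector), using only the Python standard library. The sample names, the 0.7 threshold, and the helper names `similarity`, `feature_vectors`, and `cluster` are made up for illustration, and the threshold-based single-linkage grouping is just one simple way to read clusters off the vectors, so tune or swap it for your data:

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means identical after lowercasing."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def feature_vectors(names):
    """Row i holds the similarity of names[i] to every name in the list."""
    return [[similarity(a, b) for b in names] for a in names]


def cluster(names, threshold=0.7):
    """Group names linked by a pairwise similarity >= threshold (single linkage)."""
    vectors = feature_vectors(names)
    parent = list(range(len(names)))  # union-find forest, one node per name

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if vectors[i][j] >= threshold:
                parent[find(j)] = find(i)  # merge the two groups

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return list(groups.values())


# Hypothetical sample data, not from the original question.
names = ["Jon Smith", "John Smith", "Jhon Smith", "Acme Corp", "ACME Corp."]
clusters = cluster(names)
print(clusters)
```

With this sample list the three Smith spellings end up in one group and the two Acme spellings in another. If real data does not separate this cleanly, that is the point where the later steps (lemmatization, then a pretrained encoder) come in.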
