Submitted by AutoModerator t3_z07o4c in MachineLearning
scarbchaser t1_iy4htr4 wrote
I'm new to this so any help is appreciated. Been looking for resources but maybe I'm using the wrong keywords.
What's the best way to approach building a dataset of related technologies, like synonyms in the English language but for other things?
Example: Java, JDK, Android, and JDK7 can all be "java" related, and also "programming", "tech", etc.
Where would one start setting this up, almost like tags? Are there already existing datasets?
What if I wanted to do calculations later, or build some type of inference on Java, but have it apply to everything related to it as well?
Thanks, and sorry if this is ambiguous. I'm not sure where to begin.
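To make the tag idea above concrete, here is a minimal Python sketch, assuming a small hand-built hierarchy. The `PARENTS` mapping and the helper names `expand_tags` and `related_to` are hypothetical, invented for illustration; they are not an existing dataset or library.

```python
# Hypothetical hand-built taxonomy: each term maps to its direct parent tags.
PARENTS = {
    "jdk7": ["jdk"],
    "jdk": ["java"],
    "android": ["java"],
    "java": ["programming"],
    "programming": ["tech"],
}


def expand_tags(term):
    """Return the term plus every ancestor tag reachable from it."""
    seen = set()
    stack = [term.lower()]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        stack.extend(PARENTS.get(current, []))
    return seen


def related_to(tag):
    """Return the tag plus every known term that rolls up to it."""
    tag = tag.lower()
    return {term for term in PARENTS if tag in expand_tags(term)} | {tag}


if __name__ == "__main__":
    print(sorted(expand_tags("jdk7")))
    print(sorted(related_to("java")))
```

With something like this, anything computed for "java" could be applied to every term in `related_to("java")`, and each raw term could be tagged with `expand_tags(term)`.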