bankCC
bankCC t1_ix9o1ln wrote
Reply to [D] Simple Questions Thread by AutoModerator
Which approach would be best for classifying text into 2 categories when my dataset is really small and unbalanced (4000 vs. 250 examples), with each text containing around 200-300 words?
Most of the time just one or two words decide the classification. I could just do a keyword search, but misspelled words might slip through, and the dictionary would be pretty big and computationally expensive to compare against each file. So I thought ML would be a better idea.
Maybe a CNN, but the dataset seems way too small to get acceptable results.
Any hints are welcome, tyvm.
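For illustration, here is a minimal sketch of the keyword-search baseline described above, made tolerant of misspellings using Python's standard `difflib`. The keyword list, the 0.8 cutoff, and the function names are hypothetical placeholders, not taken from the actual data.

```python
import difflib
import re

# Hypothetical keyword list and similarity cutoff (placeholders, not from the real dataset).
KEYWORDS = ["refund", "chargeback"]
CUTOFF = 0.8


def fuzzy_keyword_hits(text, keywords=KEYWORDS, cutoff=CUTOFF):
    """Return the set of keywords that some token in the text closely matches."""
    # Lowercase the text and keep alphabetic tokens only; a set avoids repeated work.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    hits = set()
    for token in tokens:
        # get_close_matches keeps keywords whose similarity ratio to the token is >= cutoff.
        for match in difflib.get_close_matches(token, keywords, n=1, cutoff=cutoff):
            hits.add(match)
    return hits


def classify(text):
    """Label a text 1 if any keyword (or a near misspelling of one) appears, else 0."""
    return 1 if fuzzy_keyword_hits(text) else 0
```

With these placeholder keywords, `classify("I want a refnd please")` is flagged even though "refund" is misspelled. Each token is still compared against every keyword, so this does not remove the cost concern for a large dictionary; it only shows how misspellings could be caught without a trained model.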
bankCC t1_ixc6lk0 wrote
Reply to comment by Gazorpazzor in [D] Simple Questions Thread by AutoModerator
Thank you very much for the answer! I highly appreciate it. You gave me a really good base to start from. Huge thanks!