
VicentRS t1_iu28e37 wrote

Hello! I am currently in a small ML competition that my college lab is doing for fun. The challenge is to predict product prices. One of the columns in the dataset is the product's description and there's another one with the name.

In my head, products that include words like "phone" in the name or the description will tend to be more expensive than, say, a product called or described as "pencil". How should I featurize those columns to follow that logic?


merouane_nz t1_iu3i90o wrote

If you don't have a column like "product family", try to extract this information from the name/description. For example, map anything like "phone", "smartphone", "iphone", etc. to "phone", then drop the original name/description columns.
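A minimal sketch of that idea in Python, under some assumptions: the keyword map `FAMILY_KEYWORDS`, the function name `extract_family`, and the fallback label "other" are all made up for illustration, and the matching is a crude exact-token lookup (it will not catch plurals or misspellings).

```python
import re

# Hypothetical keyword map (family -> words that signal it).
# In practice you would build this by inspecting your own data.
FAMILY_KEYWORDS = {
    "phone": {"phone", "smartphone", "iphone"},
    "pencil": {"pencil"},
}


def extract_family(name, description=""):
    """Return a coarse product family from the name/description text."""
    text = f"{name} {description}".lower()
    # Split into alphanumeric tokens so "iPhone 13" -> ["iphone", "13"].
    tokens = re.findall(r"[a-z0-9]+", text)
    for family, keywords in FAMILY_KEYWORDS.items():
        if any(token in keywords for token in tokens):
            return family
    # Fallback label for products that match no known family (an assumption).
    return "other"


print(extract_family("iPhone 13", "Apple smartphone, 128 GB"))
print(extract_family("HB pencil", "pack of 12"))
print(extract_family("Desk lamp", "LED, adjustable"))
```

The resulting family label is a categorical feature, so it would still need to be encoded (one-hot, for example) before most models can use it.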
