Only_Television2030 t1_isl60kc wrote
Reply to [D] Simple Questions Thread by AutoModerator
I have a list of sentences. Examples:
${INS1}, Watch our latest webinar about flu vaccine
Do you think patients would like to go up to 250 days without an attack?
Watch our latest webinar about flu vaccine
??? See if more of your patients are ready for vaccine
Important news for your invaccinated patients
Important news for your inv?ccinated patients
...
I have around 30k sentences, and around 85% of them are considered 'good'. By 'good' I mean sentences with no strange characters or character sequences, such as '${INS1}', '???', or a '?' inside a word. Any other sentence is considered 'bad'. I need to find patterns that characterize 'good' sentences so that I can identify and exclude 'bad' ones in the future, since the list will grow and new kinds of 'bad' sentences might appear.
Is there any way to identify 'good' sentences using regex, Python/R libraries, or any other tool?
Thank you
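One possible starting point, as a minimal sketch rather than a definitive solution: combine a few regex rules for the known 'bad' patterns with a whitelist of characters that 'good' sentences are allowed to contain. The rule names, the `find_problems` and `is_good` helpers, and the allowed character set are all assumptions made for illustration. They are derived only from the examples above and would need tuning against the real 30k sentences.

```python
import re

# Hypothetical rules, based only on the 'bad' examples in the post.
BAD_PATTERNS = {
    "template_placeholder": re.compile(r"\$\{[^}]*\}"),   # e.g. ${INS1}
    "repeated_question_marks": re.compile(r"\?{2,}"),      # e.g. ???
    "question_mark_inside_word": re.compile(r"\w\?\w"),    # e.g. inv?ccinated
}

# Assumed whitelist: ASCII letters, digits, whitespace, common punctuation.
# Anything outside this set is flagged, which can catch 'bad' characters
# that none of the explicit rules above were written for.
ALLOWED_CHARS = re.compile(r"[A-Za-z0-9\s.,;:!?'\"()%&/-]*")


def find_problems(sentence):
    """Return the names of all rules that the sentence violates."""
    problems = [name for name, pattern in BAD_PATTERNS.items()
                if pattern.search(sentence)]
    if not ALLOWED_CHARS.fullmatch(sentence):
        problems.append("unexpected_character")
    return problems


def is_good(sentence):
    """A sentence counts as 'good' when no rule is violated."""
    return not find_problems(sentence)


if __name__ == "__main__":
    examples = [
        "${INS1}, Watch our latest webinar about flu vaccine",
        "Do you think patients would like to go up to 250 days without an attack?",
        "??? See if more of your patients are ready for vaccine",
        "Important news for your inv?ccinated patients",
    ]
    for sentence in examples:
        print(is_good(sentence), find_problems(sentence), sentence)
```

Returning the rule names instead of a bare True/False makes it easier to review why a sentence was rejected and to add rules as new kinds of 'bad' sentences appear. The whitelist is deliberately strict, so legitimate sentences containing characters outside it (accented letters, for example) would be flagged until the set is extended.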