Submitted by cm_34978 t3_100rbhp in MachineLearning
low_effort_shit-post t1_j2mpzz5 wrote
Reply to comment by VacuousWaffle in [D] Data cleaning techniques for PDF documents with semantically meaningful parts by cm_34978
We get PDF feeds all the time, with promises (which have financial implications) that a proper data feed will follow. Usually we kick the can down the road, and when we get the feed we just pull it in and it moves through our usual ETL process within a day. PDFs are to be ignored.