
SwabianStargazer t1_ixe8q19 wrote

Hi. I am a software engineer working on mostly backend stuff but now need to dip into ML territory for the first time. I have zero experience and need some pointers to identify the right topics to research for my use case.

We have test data for machines that repeat the same task over and over during a long stress-test run. Let’s say we sample features like temperature, motor rpm and motor voltage at 30 Hz during this time. So the result of a test run is, e.g., 10 hours of data containing the same procedure 10,000 times.

I now want to analyze the data for outliers to identify problems during the test. For example, I want to identify the test cycles that had abnormally high temperature etc. The result should be something like a timestamp and a label, so that I can see which of the 10,000 cycles should be inspected further by a human.

Another thing I am interested in is a way to automatically split the data into 10,000 separate cycles, so we can see when each cycle started and when it ended (remember there are 10,000 cycles in the data).

What would be the basic approach to achieve these things? Which methods and models should I look into and research?

Thanks in advance for all pointers and help!


trnka t1_ixeuv51 wrote

You could try outlier detection to identify unusual test cycles. That said, I've heard it often works better if you can label even a small amount of data as anomalous or not: an outlier detection method doesn't know which features matter, and labeled data can teach the model which ones do.

Feature representation might be tricky but a simple way to start is min, max, avg, stddev of each sensor.
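
Here is a rough sketch of that starting point, in case it helps. It assumes the data has already been split into equal-length cycles (108 samples each, which is what 10 hours / 10,000 cycles at 30 Hz works out to) and uses made-up temperature data with one cycle deliberately running hot. The scoring is a robust z-score (distance from the median in MAD units), which is just one simple choice I picked for illustration; the function names are hypothetical too.

```python
import numpy as np

SAMPLE_RATE_HZ = 30  # sampling rate mentioned in the question

def summarize_cycles(cycles):
    """cycles: array of shape (n_cycles, n_samples) for one sensor.
    Returns shape (n_cycles, 4): min, max, mean, stddev per cycle."""
    return np.column_stack([
        cycles.min(axis=1),
        cycles.max(axis=1),
        cycles.mean(axis=1),
        cycles.std(axis=1),
    ])

def robust_z(features):
    """Per-feature distance from the median, in units of 1.4826 * MAD."""
    med = np.median(features, axis=0)
    mad = np.median(np.abs(features - med), axis=0)
    return np.abs(features - med) / (1.4826 * mad + 1e-12)

# Synthetic stand-in data (an assumption, not real machine data):
# 1000 cycles of 108 samples, with cycle 417 running 8 degrees hot.
rng = np.random.default_rng(0)
n_cycles, n_samples = 1000, 108
temperature = rng.normal(60.0, 1.0, size=(n_cycles, n_samples))
temperature[417] += 8.0

feats = summarize_cycles(temperature)
# One score per cycle: the most extreme of its four summary features.
scores = robust_z(feats).max(axis=1)
ranked = np.argsort(scores)[::-1]  # most unusual cycles first

# Print the top candidates with a start timestamp for a human to inspect.
for c in ranked[:5]:
    start_s = c * n_samples / SAMPLE_RATE_HZ
    print(f"cycle {c}: starts at {start_s:.1f} s, score {scores[c]:.1f}")
```

With several sensors you would stack the summary columns of each sensor side by side before scoring. Ranking and reviewing the top few is probably easier to live with than a hard threshold at first.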

To segment the test cycles, you could turn it into a machine learning problem: predict whether time T is the start of a cycle, trained on some labeled data. I imagine that getting good results will depend on how you represent the features of "before time T" and "after time T".
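
A minimal sketch of that idea, under heavy assumptions: the signal is a synthetic sawtooth (real sensor traces will look different), the features are just the mean of the `w` samples before T and the mean of the `w` samples after T, and the classifier is a nearest-centroid rule that I chose only because it fits in a few lines. The first 20 cycles play the role of hand-labeled data, and all names here are hypothetical.

```python
import numpy as np

def window_features(x, w):
    """For every time t with w samples on each side, return
    [mean of x[t-w:t], mean of x[t:t+w]] ("before T" and "after T")."""
    t = np.arange(w, len(x) - w + 1)
    csum = np.concatenate([[0.0], np.cumsum(x)])
    before = (csum[t] - csum[t - w]) / w
    after = (csum[t + w] - csum[t]) / w
    return t, np.column_stack([before, after])

def pick_starts(t, score):
    """Keep one time per run of consecutive positive scores: the best one."""
    idx = np.flatnonzero(score > 0)
    if idx.size == 0:
        return np.array([], dtype=int)
    runs = np.split(idx, np.flatnonzero(np.diff(idx) > 1) + 1)
    return np.array([t[r[np.argmax(score[r])]] for r in runs])

# Synthetic signal (an assumption): 40 cycles of 50 samples, each ramping
# from 0 to 1 and then dropping back, plus a little noise.
rng = np.random.default_rng(0)
period, n_cycles, w = 50, 40, 5
n = period * n_cycles
x = (np.arange(n) % period) / period + rng.normal(0.0, 0.02, n)
true_starts = np.arange(0, n, period)

t, F = window_features(x, w)
y = np.isin(t, true_starts)   # label: is time t the start of a cycle?
train = t < 20 * period       # pretend only the first 20 cycles are labeled

# "Training": one centroid per class in feature space.
pos_c = F[train & y].mean(axis=0)
neg_c = F[train & ~y].mean(axis=0)

# Positive score means closer to the "start" centroid than to the other one.
score = np.linalg.norm(F - neg_c, axis=1) - np.linalg.norm(F - pos_c, axis=1)

detected = pick_starts(t[~train], score[~train])
print(detected)
```

Times right next to a true start also look start-like, which is why `pick_starts` collapses each run of neighbouring hits into its single best-scoring time. Consecutive detected starts then give you the begin and end of each cycle.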

Not my area of expertise but I hope this helps!
