I'm trying to figure out whether the various versions of YOLO, such as YOLOv7, are better than the various versions of RCNN in terms of accuracy alone, assuming speed is not much of an issue. Let's say I'm trying to detect various objects on a 2D floor plan, and I only care about accuracy.

How would a classifier that goes square by square to find the objects perform? This may not be as efficient as the standard object detection models, but would it be more accurate if I am willing to throw as much compute power at this brute-force approach as it needs?

Comments

SeucheAchat9115 t1_iyusz2k wrote on December 4, 2022 at 9:09 AM

#856,686

I guess on COCO the best accuracy is achieved by transformer networks like Swin, but I would assume your dataset is not as big as COCO, so transformers might not generalize well.

somebodyenjoy OP t1_iyuz8tx wrote on December 4, 2022 at 10:42 AM

#856,920

Replying to SeucheAchat9115 (#856,686)

In this case, what would be the better option?

SeucheAchat9115 t1_iyuzb7c wrote on December 4, 2022 at 10:43 AM

#856,923

Replying to somebodyenjoy (#856,920)

I guess YOLOv7 is a good choice, but depending on your institute, be aware of the license of the code.

killver t1_iyv1zfo wrote on December 4, 2022 at 11:21 AM

#856,993

Replying to somebodyenjoy (#856,920)

EfficientDet, if you care about the license.

somebodyenjoy OP t1_iyv3jls wrote on December 4, 2022 at 11:43 AM

#857,029

Replying to SeucheAchat9115 (#856,923)

What would be the accuracy of the brute-force approach, i.e. the sliding-window approach? Would the accuracy be better than all the others?

SeucheAchat9115 t1_iyv5t91 wrote on December 4, 2022 at 12:14 PM

#857,088

Replying to somebodyenjoy (#857,029)

Sliding-window approaches are "conventional" image processing techniques, which are not competitive anymore.

somebodyenjoy OP t1_iyv5zlq wrote on December 4, 2022 at 12:16 PM

#857,093

Replying to SeucheAchat9115 (#857,088)

Maybe in terms of speed, but what about accuracy? Wouldn't it make sense that a classifier going around the image would be more accurate? Is there any research or are there any articles comparing the modern algorithms to sliding windows?

SeucheAchat9115 t1_iyv631b wrote on December 4, 2022 at 12:18 PM

#857,096

Replying to somebodyenjoy (#857,093)

Deep learning classifiers based on convolutions also go around the whole image. And sliding-window approaches are not competitive anymore in terms of accuracy either.
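
To illustrate the point that a convolution also visits every position of the image, here is a minimal pure-Python sketch of a single 2D cross-correlation (what deep learning libraries usually call a convolution). The toy image, the kernel values, and the function name `cross_correlate` are made up for illustration; this is not how any particular detector is implemented.

```python
def cross_correlate(image, kernel):
    """Slide `kernel` over every valid position of `image` and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + dy][x + dx] * kernel[dy][dx]
                for dy in range(kh) for dx in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

# Toy 4x4 image with a bright 2x2 blob in the middle (illustrative values).
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# Hypothetical 2x2 kernel that responds to bright patches.
kernel = [[1, 1], [1, 1]]

response = cross_correlate(image, kernel)
for row in response:
    print(row)
```

The response is strongest where the kernel lines up with the blob, which is the same "look at every location" behaviour as a sliding window, except that the computation is shared across overlapping positions.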

somebodyenjoy OP t1_iyv79r7 wrote on December 4, 2022 at 12:33 PM

#857,136

Replying to SeucheAchat9115 (#857,096)

I understand. I was asking: if we take something like AlexNet and train it on a specific object, like a dog-or-not detector, and then make this detector go around the entire image in a brute-force manner, would that be more accurate than the current object detection models?
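
For concreteness, the brute-force idea can be sketched roughly like this. It is only a toy in pure Python: `mean_intensity` is a hypothetical stand-in for a trained classifier such as AlexNet, and the window size, stride, and threshold are arbitrary assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end coordinates exclusive

def sliding_windows(width: int, height: int, win: int, stride: int) -> List[Box]:
    """Enumerate every square window of side `win` at the given stride."""
    return [
        (x, y, x + win, y + win)
        for y in range(0, height - win + 1, stride)
        for x in range(0, width - win + 1, stride)
    ]

def detect(image: Sequence[Sequence[float]], win: int, stride: int,
           score_fn: Callable[[list], float], threshold: float):
    """Score each window with the classifier and keep those above the threshold."""
    height, width = len(image), len(image[0])
    hits = []
    for (x0, y0, x1, y1) in sliding_windows(width, height, win, stride):
        crop = [list(row[x0:x1]) for row in image[y0:y1]]
        score = score_fn(crop)
        if score >= threshold:
            hits.append(((x0, y0, x1, y1), score))
    return hits

def mean_intensity(crop: list) -> float:
    """Hypothetical stand-in for a trained 'object or not' classifier."""
    return sum(map(sum, crop)) / (len(crop) * len(crop[0]))

# Toy 6x6 image with a bright 2x2 object at rows 2-3, columns 3-4.
image = [[0.0] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (3, 4):
        image[y][x] = 1.0

detections = detect(image, win=2, stride=1, score_fn=mean_intensity, threshold=0.5)
best = max(detections, key=lambda d: d[1])
print(len(detections), best)
```

Even in this toy case the single object produces several overlapping hits, so some form of non-maximum suppression is needed afterwards, and a fixed window size only fits objects of one scale and aspect ratio. Extra compute alone does not solve those localization issues.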

SeucheAchat9115 t1_iyv7fcl wrote on December 4, 2022 at 12:34 PM

#857,146

Replying to somebodyenjoy (#857,136)

No, because the object detector can solve the problem in a single forward pass. Today's deep-learning-based object detectors like YOLO or RCNN + Swin are very good choices for a detection task.

somebodyenjoy OP t1_iyv8ila wrote on December 4, 2022 at 12:47 PM

#857,181

Replying to SeucheAchat9115 (#857,146)

You mean to say they can do better in terms of accuracy even tho they detect in a single forward pass?

SeucheAchat9115 t1_iyv8t0s wrote on December 4, 2022 at 12:51 PM

#857,188

Replying to somebodyenjoy (#857,181)

Yes, because deep learning is way better than conventional methods.

Flag_Red t1_iyvot70 wrote on December 4, 2022 at 3:20 PM

#857,787

Replying to somebodyenjoy (#857,181)

He doesn't know.

bernhard-lehner t1_iywb1n5 wrote on December 4, 2022 at 5:56 PM

#858,551

If compute doesn't seem to be an issue, why not try what works best on your data?
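
One rough way to run that comparison is to score each candidate's boxes against hand-labelled ground truth. The sketch below assumes boxes in `(x0, y0, x1, y1)` form and uses simple greedy matching at an IoU threshold of 0.5; it is a simplified precision/recall check, not the COCO mAP protocol, and the example boxes and model names are made up.

```python
def area(box):
    """Area of an (x0, y0, x1, y1) box; zero if the box is degenerate."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])

def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match predictions (highest score first) to unused ground-truth boxes."""
    unmatched = list(ground_truth)
    true_positives = 0
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        best = max(unmatched, key=lambda g: iou(box, g), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Made-up labels and predictions for two hypothetical models on one image.
ground_truth = [(3, 2, 5, 4), (0, 0, 2, 2)]
model_a = [((3, 2, 5, 4), 0.9), ((0, 0, 2, 2), 0.8)]
model_b = [((3, 2, 5, 4), 1.0), ((2, 2, 4, 4), 0.5), ((4, 2, 6, 4), 0.5)]

print("model_a:", precision_recall(model_a, ground_truth))
print("model_b:", precision_recall(model_b, ground_truth))
```

Running the same function over a held-out set of your own floor plans for each candidate (YOLOv7, an RCNN variant, a sliding-window classifier) gives a like-for-like accuracy comparison on the data that actually matters.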