saw79 t1_iz0158r wrote
I don't think it makes sense these days to implement a CNN architecture from scratch for a standard problem (e.g., classification), except as a learning exercise. A common set of classification networks that I use as a go-to are the EfficientNet architectures. Usually I use the `timm` library (for PyTorch), and instantiating the model is just 1 line of code (see its docs). You can either load it pretrained (on ImageNet) or randomly initialized, and further fine-tune it yourself. EfficientNet has variants B0–B7 that give increasing performance at the cost of computation/size. If you're in TensorFlow-land I'm sure there's something analogous. Both TF and PT have model zoos in official packages too, like `torchvision.models` or whatever.
saw79 t1_ixiusbb wrote
Reply to How to efficiently re-train a classification model with an addition of a new class? by kingfung1120
Your model should output 3 logits: one for `class_a`, one for `class_b`, and one for `class_c`.
When you use data from the 1st dataset:

- penalize `class_a` outputs for samples with `class_b` and `anything_but_a_b` labels
- penalize `class_b` outputs for samples with `class_a` and `anything_but_a_b` labels
- penalize `class_c` outputs for samples with `class_a` and `class_b` labels
When you use data from the 2nd dataset:

- penalize `class_a` outputs for samples with `class_c` labels
- penalize `class_b` outputs for samples with `class_c` labels
- penalize `class_c` outputs for samples with `not_class_c` labels
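One way to implement this selective penalization (my own sketch, not the commenter's code) is to treat the 3 logits as independent sigmoid outputs and mask out the loss on any logit the label says nothing about:

```python
import torch
import torch.nn.functional as F

# Hypothetical encoding of the scheme above: per-logit binary targets plus a
# mask. mask = 0 means the label carries no information about that logit, so
# it contributes nothing to the loss. Logit order: [class_a, class_b, class_c].
LABELS = {
    # 1st dataset
    "class_a":          ([1., 0., 0.], [1., 1., 1.]),
    "class_b":          ([0., 1., 0.], [1., 1., 1.]),
    "anything_but_a_b": ([0., 0., 0.], [1., 1., 0.]),  # class_c left unsupervised
    # 2nd dataset
    "class_c":          ([0., 0., 1.], [1., 1., 1.]),
    "not_class_c":      ([0., 0., 0.], [0., 0., 1.]),  # class_a/b left unsupervised
}

def masked_loss(logits, label_names):
    targets = torch.tensor([LABELS[n][0] for n in label_names])
    mask = torch.tensor([LABELS[n][1] for n in label_names])
    per_logit = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    # Average only over the supervised logits
    return (per_logit * mask).sum() / mask.sum()
```

With this encoding, an `anything_but_a_b` sample pushes down the `class_a` and `class_b` logits but leaves `class_c` completely alone, which is exactly the asymmetry described above.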
saw79 t1_irwr0ae wrote
Reply to comment by sqweeeeeeeeeeeeeeeps in I made densify –– a tool for enriching point cloud datasets by jsonathan
You're correct, but that's not what he's doing.
saw79 t1_ir23cl5 wrote
Reply to comment by XecutionStyle in A wild question? Why CNNs are not aware of visual quality? [D] by ThoughtOk5558
All I meant by nebulous was that he didn't have a concrete idea for what to actually use as visual quality, and you've nicely described how it's actually a very deep inference that we as humans make with our relatively advanced brains.
I did not mean that it's conceptually something that can't exist. I think we're very much in agreement.
saw79 t1_ir1a2tb wrote
Reply to comment by ThoughtOk5558 in A wild question? Why CNNs are not aware of visual quality? [D] by ThoughtOk5558
Oh ok cool. Is your code anywhere? What kind of energy model? I have experience with other types of deep generative models, but I'm actually just starting to learn about EBMs myself.
saw79 t1_ir13tsx wrote
In addition to the other commenter's [good] point about your nebulous "visual quality" idea, a couple of other comments on what you're seeing:

- Frankly, your generative model doesn't seem very good. If your generated samples don't look anything like CIFAR images, I would stop here. Your model's p(x) is clearly very different from CIFAR's p(x).

- Why are "standard"/discriminative models' confidence scores high? This is a hugely important drawback of discriminative models, and one reason why generative models are interesting in the first place. Discriminative models model p(y|x) (class given data), but don't know anything about p(x). Generative models model p(x, y) = p(y|x) p(x); i.e., they generally have access to the prior p(x) and can assess whether an image x can even be understood by the model in the first place. These types of models would (hopefully, if done correctly) give low confidence on "crappy" images.
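A toy 1-D illustration of that distinction (two unit-variance Gaussian classes; all numbers are made up for the example):

```python
import math

def gauss(x, mu, sigma=1.0):
    # Gaussian density N(x; mu, sigma^2)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def posterior_and_evidence(x, mus=(-2.0, 2.0), prior=0.5):
    # Generative model of two equally likely classes:
    #   p(x)   = sum_y p(y) p(x|y)
    #   p(y|x) = p(y) p(x|y) / p(x)
    likelihoods = [gauss(x, mu) for mu in mus]
    p_x = sum(prior * lk for lk in likelihoods)
    posteriors = [prior * lk / p_x for lk in likelihoods]
    return posteriors, p_x

# In-distribution point: confident posterior AND reasonable p(x)
post_in, px_in = posterior_and_evidence(2.0)

# Garbage point far from both classes: the discriminative part p(y|x) is
# still near-certain, but the generative part p(x) flags it as nonsense
post_out, px_out = posterior_and_evidence(10.0)
```

A purely discriminative model only ever sees the posterior, so it reports near-total confidence on the garbage point; the generative model's tiny p(x) is what lets it say "I've never seen anything like this."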
saw79 t1_izjv9wa wrote
Reply to Does anyone know how to get the NxNx12 from the input image - is it just using reshape function or is there any other function that can be used by Actual-Performer-832
Sometimes that beautiful one-liner just isn't worth it compared to something like
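The snippet the comment compared against isn't preserved in this copy, so purely as an illustration: assuming (my guess, not stated in the thread) the NxNx12 comes from moving each 2×2 spatial block of a (2N, 2N, 3) RGB image into the channel dimension (space-to-depth), a spelled-out multi-step version might look like:

```python
import numpy as np

def space_to_depth(img, block=2):
    # Assumed interpretation: (2N, 2N, 3) image -> (N, N, 12) by folding each
    # 2x2 spatial block into channels. A bare reshape alone would scramble
    # pixels; the intermediate transpose keeps each block's pixels together.
    h, w, c = img.shape
    x = img.reshape(h // block, block, w // block, block, c)  # split into blocks
    x = x.transpose(0, 2, 1, 3, 4)                            # block dims last
    return x.reshape(h // block, w // block, block * block * c)
```

Three obvious lines with named steps, versus one dense chained call, is usually the better trade when you (or a reviewer) have to check which pixel landed in which channel.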