Submitted by AutoModerator t3_xznpoh in MachineLearning

Please post your questions here instead of creating a new thread. Encourage others who create new posts for questions to post here instead!

The thread will stay alive until the next one, so keep posting after the date in the title.

Thanks to everyone for answering questions in the previous thread!

17

Comments

Tobiwan663 t1_iro5hgz wrote

Hello Dear ML community,

I am a machine learning student looking for an interesting research topic; more specifically, I am interested in modeling algorithmic thinking with neural networks. Of course, reinforcement learning methods come to mind, but they mostly make use of tree search plus value/policy function(s) modeled by neural networks. To me such an RL setting does not sound very promising for general AI, because the only known generally intelligent system (the brain) does not appear to use tree search explicitly; any search seems rather to be a product of its general intelligence emerging from neural activity. Do you know of any research sub-areas which try to understand these questions?

Appreciate any hints!

2

Dimitri_3gg t1_iropv0e wrote

Computational neuroscience for machine learning - the study of the brain and its computations, aimed at improving the currently naive simplifications used in ANNs. Deep learning is miles behind the human brain in aspects such as learning and actual deep understanding.

Genetic programming for deep learning - a fun methodology of guided randomization for learning neural networks.

Predictive coding - Rao & Ballard. A more advanced take on the MLP which forward-propagates errors between predictions and observations rather than the input itself. Spratling (2017) is a good review, but Rao and Ballard (1999) is the fundamental paper.
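A toy numpy sketch of the predictive-coding idea (my own single-layer simplification, not Rao & Ballard's exact hierarchical formulation): the latent state r is updated to explain away the error between the input and the top-down prediction, and the generative weights learn from that same error.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=16)                   # input (e.g. an image patch)
U = rng.normal(scale=0.1, size=(16, 4))   # generative (top-down) weights
r = np.zeros(4)                           # latent causes

lr_r, lr_U = 0.1, 0.01
for _ in range(100):
    e = x - U @ r                         # prediction error, propagated instead of the raw input
    r += lr_r * (U.T @ e - 0.1 * r)       # inference: adjust r to explain the error (weak prior on r)
    U += lr_U * np.outer(e, r)            # learning: slow weight update driven by the same error
```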

4

vpk_vision t1_irqumk7 wrote

Can I use a global threshold for clustering after training a Person-Reid NN with triplet Loss?

Assume that I have N classes in my training data and that I train a person re-ID NN using triplet loss. At inference time I compute scores (using Euclidean distance) as follows:

            Class A  Class A  Class B  Class B  Class C  Class C
    Class A   0       0.6      0.8      0.79     0.88     0.88
    Class A   0.6     0        0.71     0.71     0.87     0.87
    Class B   0.8     0.71     0        0.70     0.86     0.86
    Class B   0.79    0.71     0.70     0        0.85     0.85
    Class C   0.88    0.87     0.86     0.85     0        0.80
    Class C   0.88    0.87     0.86     0.85     0.80     0

The above is a hypothetical N*N score matrix that I have constructed.

Rows 1 and 2 (columns 1 and 2): Class A

Rows 3 and 4 (columns 3 and 4): Class B

Rows 5 and 6 (columns 5 and 6): Class C

The only constraint I have used is that the intra-class distances should be smaller than the inter-class distances (which is what triplet loss encourages). However, a single threshold cannot be used in this case: for example, a threshold of 0.6 would work for Class A but not for Class C. Is my understanding correct, or am I missing something? Thanks a lot in advance.
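For concreteness, a hedged sketch of the check being described: given embeddings and identities, a single global threshold exists only if every intra-class distance is below every inter-class distance. The embeddings below are random stand-ins for the Re-ID network outputs.

```python
import numpy as np
from scipy.spatial.distance import cdist

embeddings = np.random.rand(6, 128)              # stand-in for the Re-ID embeddings
labels = np.array(["A", "A", "B", "B", "C", "C"])

dist = cdist(embeddings, embeddings)              # the N x N score matrix above
same = labels[:, None] == labels[None, :]
iu = np.triu_indices_from(dist, k=1)              # unique pairs only

intra = dist[iu][same[iu]]
inter = dist[iu][~same[iu]]
print("single global threshold possible:", intra.max() < inter.min())
```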

2

Normal_Flan_1269 t1_irr4uqr wrote

Is a statistics department's research in nonparametric statistics and statistical learning considered "machine learning"? Is there overlap? [D]

Most of the "machine learning" research I've seen has been in computer science departments. However, I've seen a good number of statistics departments that have overlapping research areas, like:

"high-dimensional statistics", "nonparametric statistics", "statistical learning".

I was wondering whether the kind of research statisticians do in these areas is considered machine learning, or whether it is more statistical methodology.

2

Neither-Awareness855 t1_irrqo5q wrote

Is it worth waiting to see how Intel's Arc GPUs do in machine learning compared to Nvidia's already-supported GPUs? Or does the amount of library support for Nvidia outweigh the upcoming support for Intel Arc?

16 GB of VRAM in the A770 vs. 12 GB of VRAM in the 3060.

I did read that TensorFlow and PyTorch are working on using Intel's Arc XMX engines, but there is no date for when that will be done.

2

mardabx t1_irt4kob wrote

What type of network would be best suited to turning a source (e.g. an image bitmap) into a definition of a system of known components (in that example, a component would be an image processing operation) needed to regenerate an approximation of that source?

2

ray3425 t1_irt8sfe wrote

Does anyone know any NLP implementations that reliably translate English into more specific/academic language? E.g. ball -> sphere.

2

whengreg t1_irtma8g wrote

What's a good way to go from almost 0 knowledge in Machine Learning to at least vaguely knowing what I'm doing? I tried the "Supervised Machine Learning: Regression and Classification" class on Coursera, and bounced off. I got some of the actual concepts, but whenever it got to the point of "now write some code", I wasn't able to manage, and found it difficult to gain knowledge from the lectures.

I have a background in software development, with most day-to-day development in Python. I'd like something that would take less than a couple weeks to get something useful from.

2

notEVOLVED t1_irx0frl wrote

Is there a YouTube channel dedicated to explaining the CODE of ML paper implementations, like for example the code of Stable Diffusion or OpenAI's Whisper?

I really want to understand some of the code for these papers but am having a hard time trying to understand what's going on in some parts of them.

2

vroomwaddle t1_irywjex wrote

What's the current state of deep learning on tabular data? Are there any good libraries that are ready for prime time here? I've seen a few things around like TabNet, but nothing seems mainstream enough to have a Keras/TensorFlow implementation.

As an XGBoost curmudgeon, I'm hoping to get similar performance but greater flexibility in model architecture and output format from a deep learning approach.

3

coffeecoffeecoffeee t1_is1pv58 wrote

I'm looking for advice on identifying clusters of people, each of whom has longitudinal data.

I have data structured as a multivariate time series of exactly 28 days for each of a large number of people. (The days themselves differ from person to person, but each person's days are always consecutive and a given person's Day D is the same day for every observation in the multivariate time series). Each person-day is associated with a bunch of nonnegative counts, many of which are 0.

For further clarification, a given person's data looks something like this, where Obs d corresponds to the observation of a given feature on Day d: "Feature A: [10, 9, 0, 2, 0, 0, ..., obs27a, 3], Feature B: [38, 12, 0, 3, 0, 0, ..., obs27b, 0], Feature C: [12, 6, 0, 10, 0, 0, ...obs27c, 13]".

What are some recommended approaches for identifying clusters of people when the data is structured like this? I've considered mixture modeling with a random effect on person, but it's not obvious how to fit one when there's no response variable. I've also looked into self-organizing maps, but they look like they're for clustering time series rather than individuals who have longitudinal data. I also recently discovered the Croston method for demand forecasting of intermittent time series, which is a modified EWMA, but it sounds like it's more useful for smoothing, and I'd still have to figure out how to cluster the smoothed time series.
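A simple baseline worth comparing against (a sketch under the assumption that flattening or summarizing each person's 28-day block is acceptable; shapes and counts below are made up):

```python
import numpy as np
from sklearn.cluster import KMeans

n_people, n_days, n_features = 500, 28, 3
data = np.random.poisson(1.0, size=(n_people, n_days, n_features))  # stand-in count data

# Option A: raw flattening (keeps the day-by-day alignment, 28 * n_features dims per person)
flat = data.reshape(n_people, -1)

# Option B: per-person summaries (more robust to day-level noise and zero-inflation)
summaries = np.concatenate([
    data.mean(axis=1),           # average daily count per feature
    data.std(axis=1),            # variability per feature
    (data == 0).mean(axis=1),    # fraction of zero days per feature
], axis=1)

clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(summaries)
```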

2

Mmm36sa t1_is20xv6 wrote

I have a dataset of ~13k entries, 1025 features, 28 classes, cleaned. I did feature selection, then scaling, then fitted an MLPClassifier, and with some hyperparameter tuning got a 75% score.

I'm looking for ideas to improve my results. The MLPClassifier got the highest score compared to random forest, histogram gradient boosting, or SVM on a stratified sample. Oh, and I can't use TensorFlow on my hardware.

1

SCP_radiantpoison t1_is29rva wrote

Does anyone have experience running state-of-the-art neural networks with OpenCV's dnn module?

I'm trying to restore some old family photos and plan to use these GitHub projects. I see they have pretrained models available. Can I use the cv2.dnn module to run them inside my own script? And if so, do I have to preprocess the images, or how do I proceed?

https://github.com/microsoft/Bringing-Old-Photos-Back-to-Life
https://github.com/jantic/DeOldify
https://github.com/xinntao/Real-ESRGAN

2

mobani t1_is4dxah wrote

I am looking for an alternative to the face-vid2vid from NVlabs. https://nvlabs.github.io/face-vid2vid/

Sadly the demo site was discontinued. Are there any alternatives that can do face poses, or just face frontalization?

2

BAMFmartinFTW t1_is4vn3s wrote

Hi, I want to know if the following case is possible with ML, for my project.

For my school project about logistics, my idea was to measure what percentage of the cargo hold is loaded with goods (a 12-ton truck with two axles and a white canvas around the cargo hold). I thought this approach was interesting because a single sensor in the middle of the cargo hold would give faulty readings if the cargo is loaded unevenly, and the contents of the cargo hold consist of bin bags (so it's a soft, unstructured load).

So I thought of hanging a camera in the cargo hold (light comes through the canvas) and using ML to train a model of how full the cargo hold is. The cargo gets weighed when unloading, and every time some cargo gets loaded an estimate is made of the weight that was added.

Would it be feasible to mount a camera and train the model with the unloading weights, and perhaps also with the estimated weights? Or does it sound like too much of a hassle, and would lidar be a more realistic approach, in which case I would look for another project case?

Thank you in advance

2

liljontz t1_is5iqfm wrote

What do you actually have to do to train an AI? I've heard the term a lot and was wondering what actually goes into it.

3

ThrowThisShitAway10 t1_is5n9pc wrote

  1. Have a dataset and a model with trainable weights (neural network)
  2. input data -> network -> prediction data
  3. loss = loss function(prediction, truth)
  4. Perform backpropagation with the loss to update the weights in the neural network. Over time this will minimize the loss and allow the model to "learn" from the data and truth values you provide

The input data could be images of animals and the truth might be a classification on what kind of animal ("dog", "cat", "pig").
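A minimal PyTorch sketch of steps 1-4 above (the data here is random stand-in "animal images", so the model and sizes are purely illustrative):

```python
import torch
import torch.nn as nn

images = torch.randn(256, 3, 32, 32)             # step 1: a dataset...
labels = torch.randint(0, 3, (256,))             # ...with truth values: 0="dog", 1="cat", 2="pig"
dataloader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(images, labels), batch_size=32, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 3))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    for x, y in dataloader:
        preds = model(x)                 # step 2: input data -> network -> prediction
        loss = loss_fn(preds, y)         # step 3: loss = loss function(prediction, truth)
        optimizer.zero_grad()
        loss.backward()                  # step 4: backpropagation...
        optimizer.step()                 # ...and a weight update that gradually minimizes the loss
```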

3

ThrowThisShitAway10 t1_is5onty wrote

I think the data would be rather noisy, and you'd have to collect a lot of it.

It would be nice if you could collect the sensor data from the single sensor in the middle of the cargo as well as the camera data. This way you have a good prior (approximation) for the weight. So instead of trying to predict the weight using camera data alone, you just have to predict the difference between the sensor weight and the true weight.

3

liljontz t1_is5ut6l wrote

Thank you for this answer! It was very helpful. I'm really new to code in general, but my goal is to learn how to make a song lyric generator; all the ones online are multi-purpose, and I want one dedicated to just that. Again, thank you!!

3

zeXas_99 t1_is73ckp wrote

I have a school project on ML. The project is building a model that detects human faces and predicts age, gender, ethnicity, and emotion, and deploying it to the web as a web application with an API. My first question is which framework is better, and why: Flask or Django? My second question is which we should start with first: building the web application and API, or building the model? I'm responsible for both the web development and part of the model; the rest of my teammates won't be responsible for web development.

2

Sbadabam278 t1_is75ja9 wrote

How can I learn the theory behind diffusion models (and stable diffusion) properly?

I have read the papers, but to me they gloss over a huge amount of information and are hard to make sense of at the moment.

Let's take the original diffusion paper, "Deep Unsupervised Learning using Nonequilibrium Thermodynamics".

They start with a data point x0 and then apply a "Markov diffusion kernel" (i.e. adding a zero-mean Gaussian random variable) T times, until the samples converge to a fixed distribution (also normal). Then they want to learn a "reverse distribution" p that inverts the process, by learning the mean and variance of the reverse-process distribution at each step.

So first of all, we already know the mean and variance of each forward step. Why are we trying to estimate them? Are we trying to find "fake" means and variances which push the stable state towards the "manifold" of realistic-looking data points? If so, some other things in the paper don't make sense to me (things like "the forward and reverse processes are identical if the variance is small" - wtf are you talking about).

Another point is: what is the significance of this process in the first place? The forward process is mathematically equivalent to just adding a single Gaussian random variable with higher variance. Why is having many steps important, and why can't we learn to denoise directly from the final state in a single step?
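For reference, the closed form behind that equivalence (a sketch in the ᾱ notation used by later DDPM papers; the 2015 paper's exact parameterization may differ slightly):

$$q(x_t \mid x_0) = \mathcal{N}\big(\sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t)I\big), \qquad \bar\alpha_t = \prod_{s=1}^{t}(1-\beta_s),$$

i.e. $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$, which is exactly "one Gaussian with larger variance".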

There are many more questions I have about the paper, so my main question is: how do people make sense of it? I’m having a hard time even finding out which topics I should research.

I'm not an expert in probability / Markov chains / math in general, but I think I can say I'm not a complete newbie either. What is the expected background one should have to read and understand these articles, and do you have any pointers on how to acquire it?

Thanks!

2

Lajamerr_Mittesdine t1_is89dnx wrote

I have a project idea and would like some feedback on feasibility.

I want to create a ML model that I would use in a subsequent model training loop.

This first model would take an image of x-by-x dimensions as input and then output instructions for a custom image-creation tool: the steps needed to re-create the image.

The instructions would be semi-human-readable but mostly just for the program to interpret; they would look like the following and be passed as arguments to the custom image-creation tool.

> 412, 123 #FF00FF - this would turn this one pixel fuchsia
> 130, 350 ; 150, 400 #000000 - this would turn this rectangle of pixels on the canvas black

And many more complex tools would be available to take in as arguments.

The reward function would have two stages. The first stage is how close the image is to the original, which would be easy to compute. The second stage would reward instruction minimization, i.e. 5,000 steps to recreate the image would be rewarded more highly than 10,000 steps.

It would also be easy to set the upper bound for recreating the image to the total pixel count of that image, so the run can be killed if it reaches the limit without producing the 1:1 image it was given as input.

The program would also allow, as an input option, the ability to define custom functions, which the model would also be able to do. One thing that would incentivize the model to create and use its custom functions is that the reward would be tweaked so that calling a function it has defined counts as fewer instructions than issuing those instructions individually.

This first model is all about training it to recreate images 1:1 in the smallest number of discrete instructions possible, for any arbitrary image.

This model/program would then be used in a second models training loop which I would like to keep secret for now.

1

ThrowThisShitAway10 t1_is90ox7 wrote

There's some papers on this. They usually refer to these commands as a "domain-specific language". I know of this article https://arxiv.org/pdf/2006.08381.pdf where they define some basic functions to start and then it attempts to learn higher-order functions while building a program to solve a specified task.

There was an interesting Kaggle competition a few years back by Francois Chollet where competitors had to come up with a method that can generate short programs to solve simple tasks. https://www.kaggle.com/competitions/abstraction-and-reasoning-challenge It ended up being quite challenging

3

C0hentheBarbarian t1_is96iqz wrote

Highly recommend this post by Jay Alammar. He has one of the best tutorials on how transformers work too (IMO) and this one is up there. I have worked with CV very sporadically recently but his post along with some of the links he has on there explained things to me pretty well. The only math background I can recommend off the top of my head is the probability calculation for lower/upper bounds - you can look up how VAEs work there or the post I linked has resources to understand the same.

2

Ripcord999 t1_is9hugv wrote

I am experienced in software and have worked on many technologies.

I want to start learning ML. What would be a good approach?

1

pulszero t1_is9l0s8 wrote

Can the new Intel Arc A770 GPU be used for machine learning?

1

Narigah t1_isap81a wrote

Hello guys, I'm quite new to Machine Learning but I have a kind of challenge for an academic paper.

My data is a time series and I have to make predictions about specific positions in the time series. As an example, I have an array of floats with 350 positions; there is a pattern to certain positions that I need my model to figure out, based on their values and the surrounding values. In my training examples I would have the array of floats and the correct marked positions (e.g. positions 35, 86, 150, 240, 351). It doesn't need to always get the exact position, but it should get as close as possible.

Do you guys know of anything similar to this that I can study? Or do you recommend any approach? I'm kind of stuck on figuring out how to define the loss and the precision, as it doesn't need to hit the exact position of the label, just be as close as possible.

Thanks in advance for any help!

2

Antique_Appearance62 t1_isaqa7n wrote

Can someone point me to a GitHub repository/project where two samples from the same user are compared with each other and true is returned if they match?

1

Next-Conclusion-3071 t1_isbh322 wrote

I am getting my master's in computer science with an emphasis in machine learning.

I really, really want to be a machine learning engineer. I love the math, I love the code, and everything about what that job is.

What is the best way, as a new graduate, to seek and prepare for a job in this field?

Is it simply apply, have projects, and use kaggle?

Or is there more to it?

Also, what networks or organizations can I join to start networking in the area?

3

whydontigetbetter01 t1_isc8q3e wrote

Are there any constraints on developing ML projects regarding the platform? My friends and I want to develop a mobile app for both iOS and Android - basically a yoga app that checks your form/posture on camera while you are doing yoga poses and gives feedback on your form. We don't know much about ML; we are going to learn as we go, and we will train our own model.

Is there something we should take into consideration? We thought we could implement it in Flutter in order to deploy it on both iOS and Android, but we are not sure - is that doable, training your own model and then building a mobile app with Flutter that uses the trained model? We are trying to settle things like this before starting implementation, but we have no clue whether we face any constraints. Can anyone help us?

1

grid_world t1_isc9zs1 wrote

Variational Autoencoder automatic latent dimensionality selection

For a given dataset (say, CIFAR-10), if you intentionally make the latent space dimensionality large, say 1000-d, I am assuming that during learning the model will automatically not use the dimensions it doesn't need to optimize the reconstruction and KL-divergence losses. Consequently, those unused variables will be equal to, or very close to, standard Gaussian distributions. Is my hand-wavy thought correct? And if yes, are there any research papers which show this?
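One way to test this empirically (a hedged sketch): the KL term of a Gaussian VAE decomposes per latent dimension, and dimensions the model doesn't use should have per-dimension KL near 0 (posterior ≈ N(0, 1)). The mu/logvar below are random stand-ins for a trained encoder's outputs on a CIFAR-10 batch.

```python
import torch

batch_size, latent_dim = 256, 1000
mu = torch.randn(batch_size, latent_dim) * 0.05   # near-zero means ~ mostly unused dimensions
logvar = torch.zeros(batch_size, latent_dim)      # unit variances

# analytic KL( N(mu, sigma^2) || N(0, 1) ) per dimension, averaged over the batch
kl_per_dim = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).mean(dim=0)

active = (kl_per_dim > 0.01).sum()                # crude threshold for "used" dimensions
print(f"{active.item()} of {latent_dim} latent dimensions look active")
```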

1

iamikka t1_isdpee5 wrote

Hi guys, I am building an open-source project and have to train multiple models for very long hours. Is there any way to get some free GPU resources for an open-source project?

0

Sbadabam278 t1_iseer5o wrote

Thank you for the resources, it is a nice explanation! However, I was looking for more of a technical understanding - which topics should I read in order to follow and understand the original paper?

1

Nyanraltotlapun t1_isefd1c wrote

Hi. I have time-series data. I am trying to do all sorts of things with it: forecasting and classification with RNNs and fully connected models.

The first question is: can neural networks capture the speed of change of values - both RNNs and fully connected ones? Should I try to feed the networks the derivatives of my values, or could that actually worsen their performance?

Second question: how should I normalize the derivatives? My first idea is to take the absolute values of the derivatives and encode the sign as separate features (two features, for positive and negative). Does that sound reasonable? I am afraid of my data becoming too complex.

1

DurianNo2306 t1_isgq7ty wrote

Hi guys, I'm a 46-year-old farmer in a small mountain village. I learned machine learning so I could use it to better manage my small budget. My nephew said I could make a good income if I worked for a company, and showed me the youngest billionaire in AI, from the company Scale AI. So I would love to know what they are doing and what services they offer. With all the scattered information about them, could someone clarify a little bit? Thank you in advance.

1

ABCDofDataScience t1_isi9ke4 wrote

Question: What exactly does PyTorch's super(My_Neural_Network, self).__init__() do, such that we need to include it in every neural network's __init__() method?
After looking it up online, all I found is that it "initializes some special properties that are required for the neural network", but I couldn't find a solid answer that describes it in detail.

1

itsyourboiirow t1_iskrfyz wrote

If you are doing it to learn and for fun, I would look into a recurrent neural network (RNN) or a long short-term memory (LSTM) model for generation. They're really good at picking up patterns in text. I'm sure it would do well with enough training data.
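A minimal character-level LSTM sketch of that idea in PyTorch (hedged: a real lyric generator would need an actual corpus, better tokenization, and sampling code on top of this):

```python
import torch
import torch.nn as nn

text = "these are stand-in lyrics to learn from "   # replace with your lyric corpus
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}

class CharLSTM(nn.Module):
    def __init__(self, vocab, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, 32)
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)

    def forward(self, x):
        out, _ = self.lstm(self.embed(x))
        return self.head(out)                       # next-character logits at every position

ids = torch.tensor([stoi[c] for c in text]).unsqueeze(0)
model = CharLSTM(len(chars))
loss_fn = nn.CrossEntropyLoss()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(100):                                # teach it to predict each next character
    logits = model(ids[:, :-1])
    loss = loss_fn(logits.reshape(-1, len(chars)), ids[:, 1:].reshape(-1))
    optim.zero_grad()
    loss.backward()
    optim.step()
```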

2

Only_Television2030 t1_isl60kc wrote

I have a list of sentences. Examples:

  1. ${INS1}, Watch our latest webinar about flu vaccine
  2. Do you think patients would like to go up to 250 days without an attack?
  3. Watch our latest webinar about flu vaccine
  4. ??? See if more of your patients are ready for vaccine
  5. Important news for your invaccinated patients
  6. Important news for your inv?ccinated patients
  7. ...

I have around 30k sentences, and around 85% of these are considered 'good'. By good I mean sentences with no strange characters or sequences of characters such as '${INS1}', '???', or a '?' inside a word, etc.; otherwise the sentence is considered 'bad'. I need to find 'good' patterns so I can identify and exclude 'bad' sentences in the future, as the list of sentences will grow and new 'bad' sentences might appear.

Is there any way to identify 'good' sentences using regex, libraries in Python/R, or any other tool (something along the lines of the sketch below)?

Thank you
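A rough sketch of the rule-based filter being described (the patterns are guesses from the examples above, not a complete definition of "bad"):

```python
import re

bad_patterns = [
    r"\$\{[^}]*\}",      # template placeholders like ${INS1}
    r"\?{2,}",           # runs of question marks, e.g. "???"
    r"\b\w*\?\w+\b",     # a '?' inside a word, e.g. "inv?ccinated"
]

def is_bad(sentence: str) -> bool:
    return any(re.search(p, sentence) for p in bad_patterns)

sentences = [
    "${INS1}, Watch our latest webinar about flu vaccine",
    "Watch our latest webinar about flu vaccine",
    "Important news for your inv?ccinated patients",
]
good = [s for s in sentences if not is_bad(s)]   # keeps only the middle sentence
```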
1

TomaszA3 t1_islah3h wrote

Is every case of biological processing with learning basically a neural network, or a neural network done differently?

1

keto-ejh t1_ism5huh wrote

Hi! I have a complex machine learning task due 10/28 and I’m stuck. Looking for a tutor who can help me, will pay $100/hr. Please let me know (DM) if you are qualified and can help! Thanks!!

1

Last-Autumn-Leaf t1_ismbgv5 wrote

I'm a soon-to-be graduate and I'm looking for a job or an internship at a big tech company. Do you have any tips besides the classic LeetCode grinding?

1

Important_Put8366 t1_ismycko wrote

4090 now or wait for 4090ti?

I am interested in using and training stable diffusion models (specifically the recent NovelAI leak), so I need a new graphics card.

The 4090 has 24 GB of VRAM, and the 4090 Ti, I hear, will have 48 GB. It seems to me that getting a 4090 Ti would be much better, because large language models and diffusion models eat a lot of VRAM. I currently own a 1070, so I can do some generation but not training.

Does anyone have any idea when Nvidia will release the 4090 Ti? If I need to wait another half a year, I might as well just get a 4090.

1

MerlinTrashMan t1_iso5oeo wrote

I am using lag columns in my feature engineering to provide more information when it is available. I have lags at times of (-1, -2, -3, -5, -8, -13, -20, -30, -45, -65, -90) minutes. My problem is that it is possible for the -5 to -90 lags to not have occurred yet. My current code uses the value of -4 for all the values past -5, and I am concerned that even though I have a time-of-day feature, it is not getting associated with the lag columns to lower their relevance at low time-of-day values. What are some approaches to reduce/resolve this issue?

1

ThrowThisShitAway10 t1_iso6km8 wrote

This is a feature of Python, not just PyTorch. We use the super() function because we want our class to inherit the attributes of its parent. For your PyTorch module to work, you have to inherit from the nn.Module class and run its __init__. It's not a big deal.
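A minimal illustration of what that call buys you (a sketch): nn.Module's own __init__ sets up the parameter/submodule registries, hooks, and training flag, which is what lets self.fc below be registered and show up in model.parameters().

```python
import torch.nn as nn

class MyNeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()             # without this line, assigning self.fc raises an error
        self.fc = nn.Linear(10, 2)     # registered as a submodule by nn.Module's __setattr__

    def forward(self, x):
        return self.fc(x)

model = MyNeuralNetwork()
print(sum(p.numel() for p in model.parameters()))   # 10*2 weights + 2 biases = 22
```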

1

Nyanraltotlapun t1_iso6lqp wrote

For example, I encoded it as such. Different features have different scales and I need to normalize them somehow. But because differential encoding produces signed values, I have a problem with it. I am afraid that with normalization I will lose the information about direction (sign).

1

princesengar t1_isojs4e wrote

Hi everyone, I am an SAP developer currently at TCS with only one year of work experience. I want to start my career in machine learning, but I am not able to find machine learning jobs for freshers. I have good knowledge of, and hands-on experience with, machine learning projects. Can someone suggest where (or how) to look for ML jobs? You can reach me on LinkedIn: https://www.linkedin.com/in/prashant-singh-3755041a0

1

Unusual_Variation_32 t1_isorxfm wrote

Hi everyone!

So I have one true/false question:

Does L2 regularization (ridge) reduce both the training and test error? I assume not, since ridge regression won't improve the error, but I'm not 100% sure.

Can you explain this please?

1

Puzzleheaded-Me-41 t1_ispt7mc wrote

Hey everyone!

I'm new to diffusion models and I'm on a quest to develop a text-to-image stable diffusion model of my own. I'm in need of all the relevant resources that will help me understand and build the model. Any leads?

1

Voldemort_15 t1_isqi3rr wrote

Hello all,
I run:
model.train()
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Epoch 1/400: 0%| | 0/400 [00:00<?, ?it/s]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-139-c72315b99576> in <module>
----> 1 model.train()
46 frames
/usr/local/lib/python3.7/dist-packages/torch/distributions/distribution.py in __init__(self, batch_shape, event_shape, validate_args)
54 if not valid.all():
55 raise ValueError(
---> 56 f"Expected parameter {param} "
57 f"({type(value).__name__} of shape {tuple(value.shape)}) "
58 f"of distribution {repr(self)} "
ValueError: Expected parameter loc (Tensor of shape (128, 10)) of distribution Normal(loc: torch.Size([128, 10]), scale: torch.Size([128, 10])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
grad_fn=<AddmmBackward0>)
Would you have advice in this case to fix the error? I appreciate your help!
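A hedged debugging sketch for this kind of failure: the loc/scale feeding the Normal have become NaN, which usually traces back to exploding gradients, a too-high learning rate, or bad values in the inputs a few steps earlier. Names in the comments are illustrative, not from your code.

```python
import torch

torch.autograd.set_detect_anomaly(True)   # raises at the op that first produced NaN/inf

def check(name, t):
    if torch.isnan(t).any() or torch.isinf(t).any():
        raise RuntimeError(f"{name} contains NaN/inf")

# Inside the training step you could call, e.g.:
#   check("inputs", batch); check("loc", loc); check("scale", scale)
# Also worth trying: a lower learning rate and gradient clipping, e.g.
#   torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```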

1

Select-Shopping4606 t1_isr2ijn wrote

hi everyone.

Consider a multivariate problem such as weather prediction using linear models.

With x1, x2, x3, x4, x5 used to predict the weather y, how do we find how much we need to increase or decrease x2 to reach our desired threshold on y?

Is the only way to manually rearrange y = wx + b?

Thanks for kind suggestions and directions.

1

nadia-nahar t1_ist51r2 wrote

I am looking for open-source machine learning applications or products for end-users (not demos, libraries, or dev tools) for my research work. What are the ML applications you have encountered or worked on in open-source?

1

Rei_Moriaty t1_istd4un wrote

I wanted to know how to continue learning and work towards breaking into data science/ML while working, as my current job is quite toxic and requires me to work almost 12 hours daily. Does anyone have any suggestions?

1

EManO13 t1_isuroz9 wrote

I want to use an LSTM to predict a value that is only released at the end of a day. Say I have minute data for stock trades, and I want to forecast the highest trade of the day. So it is a forecasting problem up until the point where the data is trending down; then it is more of a "what would the highest trade be if our observed sample is this?" problem. Do I make all 1,440 data points of a day have the same target value, or just the last one, so that I predict only the last value of the day? I'm in the preprocessing phase and would appreciate insight.

1

your-mom-was-burned t1_iswgcym wrote

I have a zip file that has two folders with txt files in them. How can I use these txt files to train a model? One folder has texts which I need to label as YES, and the other has texts which I need to label as NO. The model needs to return YES or NO. Help please.
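One straightforward way to do this (a sketch; the folder names are assumptions for the two unzipped folders): read the .txt files, label them by folder, and fit a bag-of-words classifier.

```python
from pathlib import Path
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

texts, labels = [], []
for folder, label in [("yes_texts", "YES"), ("no_texts", "NO")]:   # hypothetical folder names
    for f in Path(folder).glob("*.txt"):
        texts.append(f.read_text(encoding="utf-8", errors="ignore"))
        labels.append(label)

X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.2, random_state=0)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))              # accuracy on held-out files
print(clf.predict(["some new document"]))     # returns "YES" or "NO"
```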

1

JuanG024 t1_iswt0yv wrote

I have trained a model with YOLOv7 and I want to improve it. Which method should I use to improve it?
a. Use the best model and train it on the new batch of images.
b. Use the best model and train it on all images plus the new batch of images.
c. Use the last model and train it on the new batch of images.
d. Use the last model and train it on all images plus the new batch of images.

2

Known_Ad_5120 t1_isyq0pi wrote

Feature Importance and Threshold Moving
Problem Type : Binary Classification
Dataset : Imbalanced

The current sklearn pipeline uses an XGBoost model and involves moving the threshold from 0.5 to a considerably higher value, like 0.8-0.9.

Is it viable to use XGBoost's feature importance metrics for identifying the relevant features? If not, what would be a better alternative?
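For reference, a sketch of pulling XGBoost's built-in importances (the importances come from the fitted trees, so moving the classification threshold afterwards doesn't change them; "gain" is usually more informative than the default split-count "weight", and permutation importance is a common alternative). The data below is a random stand-in.

```python
import numpy as np
from xgboost import XGBClassifier

X_train = np.random.rand(1000, 10)                    # stand-ins for the real pipeline's data
y_train = (np.random.rand(1000) < 0.1).astype(int)    # imbalanced labels

model = XGBClassifier(scale_pos_weight=9.0, eval_metric="logloss")
model.fit(X_train, y_train)

gain = model.get_booster().get_score(importance_type="gain")
for feature, score in sorted(gain.items(), key=lambda kv: -kv[1])[:20]:
    print(feature, round(score, 3))
```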

2

ShowMeUrNips t1_isz1d51 wrote

I only have a tiny bit of experience with StyleGAN3, but have they been able to fix the issues with side-profile images of faces? I'm a novice, go easy on me.

Thank you.

1

VoyagerExpress t1_iszfebl wrote

I am currently working on a project with an unbounded multi-task optimization problem. Essentially, my model outputs a tensor which leads to an SNR-type loss (for people familiar with wireless communications jargon, the signal and interference vectors are columns of this tensor), and I would like to improve this SNR up to some required value. Do you have any suggestions for loss functions I could use? Right now I am trying (model_output_snr - required SNR)^2, basically an MSE loss with respect to the required minimum SNR. This doesn't change the fact that the problem itself is unbounded and unsupervised. I am new to this style of learning paradigm, since I am used to having data with inputs and labels.

I have tried a bunch of architectures to solve this problem, but fundamentally the training losses look super erratic and are not improving at all, even after thousands of epochs.

Are there any precursors to this kind of ML technique - anything I should look out for? Really, any help would be great at this point, thanks! The problem itself is similar to a convex optimization problem statement, but the maximization objective is non-convex due to the inherent non-linearities in the activation functions. Is there some theoretical limit on this kind of learning problem that makes the approach (using ML instead of convex optimization) pointless in the first place?

1

iridium__ t1_it2zlb1 wrote

I'm a mechanical engineering student (mechatronics, actually) and I was thinking about doing a machine learning project for my final project, but I don't have a clear idea of what I would do.

I was thinking about lane detection for self-driving cars or some kind of classification network for conveyor belts or something similar.

Any ideas for the project?

1

seiqooq t1_it3ixpk wrote

Correct me if I'm wrong, but you say you'd like to improve your SNR up to some value; it sounds like you could simply formulate this as a 1D maximization problem rather than a 2D optimization problem. In that case, reinforcement learning and genetic algorithms are high on the list of solutions.

1

seiqooq t1_it3wkzy wrote

Try to think of this in terms of how you will use the model. It sounds like a day-trading model, correct me if I’m wrong. In this case, you’ll want to ask the question of “based on todays trading patterns, should I sell now, or is the peak still likely to come?”.

See if this helps your problem formulation and therefore your labeling.

As a side note, most models are not sophisticated enough to capture the extreme complexity of stock behavior. If this is your first foray into stock prediction, I’d recommend tempering expectations.

1

seiqooq t1_it3zp9b wrote

It’s useful to think of regularization simply as offering a way to punish/reward a system for exhibiting some behavior during training. Barring overfitting, if this leads to improvements in training error, you can expect improvements in test error as well.

2

Puzzleheaded-Me-41 t1_it40s6f wrote

Question: how does one actually give text embeddings to a machine learning model? I'm trying to create a stable diffusion model clone, like DALL-E 2. I've searched various sources about text embeddings but couldn't find the techniques. Any suggestions?
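A hedged sketch of one common approach (the one the Stable Diffusion family uses): encode the prompt with a pretrained text encoder and hand the per-token embeddings to the image model, typically via cross-attention. The model name and shapes below are illustrative.

```python
import torch
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

tokens = tokenizer(["a photo of an astronaut riding a horse"],
                   padding="max_length", max_length=77, return_tensors="pt")
with torch.no_grad():
    out = text_encoder(**tokens)

cond = out.last_hidden_state    # (1, 77, 512): per-token embeddings, fed to cross-attention layers
pooled = out.pooler_output      # (1, 512): single-vector summary, used by some conditioning schemes
```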

1

seiqooq t1_it40uw6 wrote

It’s a bit of a rabbit hole, but this is required for autograd to create the reverse computation graph (enables backpropagation). PyTorch has great videos on YouTube if you want to dig in, just search PyTorch autograd.

1

seiqooq t1_it443kx wrote

I think you’re just about there with an answer. Assuming each occurrence is weighted evenly you could approach this a few ways:

  1. Use binary labeling such that the output vector looks like [0, 0, 0, 0, 1, 0, 0, …, 1] and is of length 350. You can think of this as representing the true goal of finding the exact positions. Then, during optimization, you can determine a threshold or other logic to handle all of the fuzzy predictions that will inevitably result from training.

  2. Assign fuzzy labels scaling inversely with the distance from the target point, e.g. [0, 0.1, 0.5, 1, 0.5, 0.1, 0, …] (a sketch of this is below). The same thresholding can be done here as well.

Assuming locale is important for classification, I’d consider using convolutions as well to extract useful information from neighboring data points.
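A rough sketch of option 2: turning a list of marked positions into a length-350 fuzzy target vector that decays with distance from each position (the positions and the Gaussian width are illustrative).

```python
import numpy as np

length = 350
positions = [35, 86, 150, 240, 340]     # the marked positions for one training example
idx = np.arange(length)

target = np.zeros(length)
for p in positions:
    target = np.maximum(target, np.exp(-0.5 * ((idx - p) / 2.0) ** 2))  # Gaussian bump, width ~2

# target is 1.0 at each marked position and falls off smoothly, so with an MSE or BCE loss
# near-misses are penalized less than far misses.
```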

2

le_bebop t1_it4npm4 wrote

Question: Any advice on probabilistic regression with small data (~500 instances, 14 features)?
I'm using XGBoost, trying to avoid overfitting with hyperparameter optimization (with hyperopt) to reduce the average validation score on 5-fold CV, but it still overfits somewhat (average CV train MAPE 2.85; average CV test MAPE 15.36; test MAPE 18).
I've read that Bayesian models are recommended for regression on small data, but I'm not familiar (yet) with these models. Could you give any tips or advice for achieving robust generalization in small-data regression? Or recommend a Bayesian library so I can try it.
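If you want a quick Bayesian baseline without leaving sklearn, BayesianRidge is a low-effort starting point on a 500 x 14 problem (a sketch with random stand-in data; it's a linear model, so treat it as a sanity check on the linear part of the signal rather than a drop-in replacement for XGBoost).

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import cross_val_score

X = np.random.rand(500, 14)      # stand-ins for the real features and target
y = np.random.rand(500)

model = BayesianRidge()
print(cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_percentage_error"))

model.fit(X, y)
mean, std = model.predict(X[:5], return_std=True)   # probabilistic predictions with uncertainty
```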

1

dearnot t1_it6iueu wrote

Question: Consider a stock valued at 10.00 USD in 2010, 75.00 USD in 2015, and 150.00 USD in 2020, and it continues to grow to this day.

Given that decision-tree-based algorithms like XGBoost generate the tree (split the values) based on ranges, I don't understand how a tree built on past data (e.g. years 2000-2015) could be in any way applicable to future price predictions (e.g. years 2015-2080).

Could somebody confirm that feature normalization is truly not required for data that grows beyond the original (fit/train) range over time?

Do I need to run the raw stock price through some log or sigmoid function before training, or is XGBoost actually smart enough to deal with this kind of data automatically?

Edit: to clarify, I have read everywhere, including the official forums, that feature normalization is not required when training decision tree models. In my case I am using the XGBoost library, which trains the model with a gradient-boosted decision tree algorithm, but I think this question applies to any other tool that uses a DT-based algorithm.

2

DeepNonseNse t1_it6yta0 wrote

>to clarify. I have read it everywhere, including the official forums - that feature normalization is not required when training the decision trees model

All the XGBoost decision tree splits are of the form [feature] >= [threshold], thus any order-preserving normalization/transformation (log, sigmoid, z-scoring, min-max, etc.) won't have any impact on the results. But if the order is not preserved, creating new transformed features can be beneficial.

Without doing any transformations or changes to the modelling procedure, with training data covering 2000-2014 and test data covering 2015-2080, the predictions would be similar to the values seen in 2014, as you originally suspected. There isn't any hidden built-in magic to deal with the data shift.

One common way to tackle this type of time-series problem is to switch to autoregressive(-type) modelling. So, instead of using raw stock prices directly, use yearly change percentages.
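A sketch of that autoregressive framing: model year-over-year changes instead of raw prices, so the target stays in a range the trees have actually seen (the prices below are made up).

```python
import numpy as np

prices = np.array([10.0, 12.0, 20.0, 45.0, 75.0, 90.0, 120.0, 150.0])  # yearly closing prices
returns = prices[1:] / prices[:-1] - 1.0     # relative yearly changes: +0.20, +0.67, ...

# Features become lagged returns rather than lagged prices:
X = np.stack([returns[:-2], returns[1:-1]], axis=1)   # last two yearly returns
y = returns[2:]                                        # next yearly return to predict

# A predicted return r maps back to a price forecast: next_price = last_price * (1 + r)
```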

1

frappuccino_o t1_it7gj27 wrote

Hi! Any text-to-speech folks in here? I'm trying to reimplement YourTTS by Coqui-AI and I'm not getting nearly the same quality. Furthermore, their speaker-consistency loss seems to be buggy and only the authors know how to run it properly. Has anyone worked on implementing that too? Would be nice to connect and get some help, lol.

Cheers.

1

vardonir t1_it7wwgq wrote

I'm trying to find models for voice cloning, but all I seem to find are for text-to-speech. There has to be something that converts the voice audio-to-audio, right? The intonation/emotion of the speaker in the source audio would be lost if it went through TTS, so I don't want it to go through that.

(It doesn't have to be real-time.)

1

WykopKropkaPeEl t1_it8oi22 wrote

Is it possible to train a text-generative model on someone's message history and then get a pretty good estimate of how they would respond to something written to them?

1

disibio1991 t1_it9epik wrote

Has there been any talk about teams creating and/or using high-quality annotations for image training data? So that, for example, an image of a person is not just captioned with facial expression, race, and general age, but with much more - country, income, marital status, health status, 'Big Five' personality taxonomy, and so on.

Another example: an image of a tree on a hill with descriptors for exact geolocation, age, altitude, and shade/sunlight position.

Edit: okay, found something - the 'Civilian American and European Surface Anthropometry Resource' dataset.

1

ForIgogassake t1_ita9pus wrote

Hello everyone! I'm trying to train a model for DeepSpeech using the Common Voice dataset, but because I'm a complete beginner, I'm having some issues following the steps in the given guideline. I'm stuck where I have to use the DeepSpeech importer, because I don't know how to execute that Ubuntu command while using Windows, or where and to which folder I should extract the dataset in order for the script to run (I'm not a Python beginner), or how to run the script - I tried it in two IDEs but it didn't work. I would really appreciate your help with my project.

Image to address where I am stuck

Thank you

1

_eXpose t1_itbj6ho wrote

What is the difference between self-supervised learning and active learning?

Are they somehow related or two completely different areas?

1

coinclink t1_itcn0eo wrote

Where does someone with a basic understanding of ML and MLOps start with training a computer vision model that identifies unknown features in fixed-scene imagery and allows the features to be labeled as they are identified? Ideally, I'd like to start with a model that knows nothing and use a human-in-the-loop method to slowly train the model to recognize distinct patterns as features.

1

zeromodz12 t1_itd5zfz wrote

Hi, I am looking to create a tool for my management where they can enter a question and the tool will convert the question to a SQL statement and execute it against our database, which will then return the answer. I understand GPT-3 has this functionality, but using it is not free, and I am looking for a free solution. I searched through Hugging Face Transformers and have not found anything. Any advice on what I could use?

1

BigmacMcWhopperson t1_itef8se wrote

OK, so:

After training a model, can you reuse it by replacing "scratch" as a checkpoint with a .pth file? Ringing anybody's bells? I'm totally on my own here, just loving the new toys - thanks for any advice. Also, a question: is there a Discord? Thanks.

1

fr4nl4u t1_itfkbcw wrote

Active learning is when a labeler's input is requested during training to deal optimally with the model's uncertainty. The approach belongs to the general family of semi-supervised tasks, as it iteratively labels a minimal number of new examples to train the model.

1

kappesas t1_itg06i2 wrote

I have 3 soil sensors (which measure temperature) at three different depths (1, 2 and 3 m); therefore I have 3 classes.

It is hourly data. Now I want to see if there is a significant difference between these classes. Which method can I use?
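A simple first check (a sketch): one-way ANOVA across the three depths, with the Kruskal-Wallis test as the non-parametric fallback. Note that hourly readings are strongly autocorrelated, so treat the p-values as a rough screen rather than exact; the data below is a random stand-in.

```python
import numpy as np
from scipy import stats

depth_1m = np.random.normal(15.0, 2.0, 24 * 30)   # stand-ins for a month of hourly readings
depth_2m = np.random.normal(13.5, 1.5, 24 * 30)
depth_3m = np.random.normal(12.0, 1.0, 24 * 30)

f_stat, p_anova = stats.f_oneway(depth_1m, depth_2m, depth_3m)
h_stat, p_kw = stats.kruskal(depth_1m, depth_2m, depth_3m)
print(p_anova, p_kw)                               # small p -> the depths differ significantly
```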

1