sckuzzle t1_iw8jv72 wrote

Why are you using a "model" / MLPs at all for this? This is a strictly data processing problem with no model creation required.

Just process your data by throwing away 75% of it, then take the max, then check if each value is equal to the maximum.

Something like (python):

import numpy as np

def process_data(input_array):
  every_fourth = []
  for i in range(len(input_array)):
    if (i+1)%4==0:
      every_fourth.append(input_array[i])
  max_value = max(every_fourth)
  matching_values = (np.array(every_fourth) == max_value)
  return matching_values
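
The same steps can also be written with NumPy slicing instead of an explicit loop. This is only a sketch: the name process_data_vectorized is made up for illustration, and it assumes the input has at least four elements.

```python
import numpy as np

def process_data_vectorized(input_array):
    # Hypothetical vectorized variant of process_data above.
    arr = np.asarray(input_array)
    # Indices 3, 7, 11, ... are exactly the ones where (i + 1) % 4 == 0.
    every_fourth = arr[3::4]
    # Compare each kept value against the maximum of the kept values.
    return every_fourth == every_fourth.max()
```

The slice arr[3::4] keeps the same elements as the loop, so the result should match process_data for any input the loop version accepts.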
8

ole72444 OP t1_iwa5adf wrote

Yes, I understand it looks like a data preprocessing problem (and it actually is). But this is a toy example to test whether NNs can actually generalise functions of this sophistication.

2

ContributionWild5778 t1_iw95dyj wrote

Agreed. It's a data pre-processing step; making a model do this would be a very complicated task.

1

ole72444 OP t1_iwa5ljr wrote

I'm trying to see if NNs can actually generalise such functions. I'm using the preprocessing that you've recommended to create the ground truth labels
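
Generating the ground truth labels that way might look roughly like the sketch below. The function name make_dataset, the input length of 16, the uniform random inputs, and the seed are all assumptions for illustration; the thread does not say how the actual data is produced.

```python
import numpy as np

def make_dataset(n_samples, length=16, seed=0):
    # Hypothetical helper: random inputs plus labels from the preprocessing rule.
    rng = np.random.default_rng(seed)
    X = rng.random((n_samples, length))          # assumed input distribution
    every_fourth = X[:, 3::4]                    # keep every fourth value per row
    # Label is 1.0 where a kept value equals the row's maximum kept value.
    y = every_fourth == every_fourth.max(axis=1, keepdims=True)
    return X, y.astype(np.float32)

X, y = make_dataset(8)
print(X.shape, y.shape)
```

Each row of y then marks which of the kept positions holds the maximum, which is the target a network would be trained to reproduce.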

1

HMasterSunday t1_iw9allv wrote

Also: numpy.split can cut a numpy array into several pieces, so it simplifies to:

import numpy as np

def process_data(input_array):
  cut_array = np.split(input_array, len(input_array) // 4)
  max_array = []
  for cut in cut_array:
    max_array.append(max(cut))
  return max_array

much shorter, used this method recently so it's on the front of my mind

edit: don't know how to format on here, sorry

1

sckuzzle t1_iw9fe1i wrote

Writing "short" code isn't always a good thing. Yes, your suggestion has fewer lines, but:

  • It takes ~6 times as long to run

  • It does not return the correct output (split does not take every nth value, but rather groups it into n groups)

I'm absolutely not claiming my code was optimized, but it clearly showed the steps required to calculate the necessary output, so it was easy to understand. "Short" code is often much harder to follow, and that often leads to bugs (as seen here). Also, depending on how you write it, it often takes longer to run (the way it was implemented requires extra steps that aren't necessary).
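
For anyone who wants to check the timing themselves, a rough harness could look like this. The function names, the 4000-element random input, and the repeat count are arbitrary choices for illustration, and the actual ratio will depend on the machine and the array size.

```python
import timeit
import numpy as np

def every_fourth_loop(input_array):
    # Loop version: keep every fourth value, then compare against the max.
    every_fourth = [input_array[i] for i in range(len(input_array)) if (i + 1) % 4 == 0]
    return np.array(every_fourth) == max(every_fourth)

def group_max_split(input_array):
    # Split version: cut into consecutive groups of four, take each group's max.
    cut_array = np.split(input_array, len(input_array) // 4)
    return [max(cut) for cut in cut_array]

rng = np.random.default_rng(0)
data = rng.random(4000)  # hypothetical test input; length divisible by 4

loop_time = timeit.timeit(lambda: every_fourth_loop(data), number=20)
split_time = timeit.timeit(lambda: group_max_split(data), number=20)
print(f"loop:  {loop_time:.4f} s")
print(f"split: {split_time:.4f} s")
```

Note that the two functions do not compute the same thing, so this only compares running time, not correctness.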

1

HMasterSunday t1_iw9qr8l wrote

Interesting. I didn't time both approaches; I'll do that more often. As for your other point, though, my code does account for that already: the number of individual cuts is 1/4 of the length of the full array (len(input_array)/4), so it splits it up into arrays of length 4 anyway. That much I do know at least.

1

sckuzzle t1_iwaia67 wrote

> As per your other point though, my code does account for that already

Have you tried running it? It returns [3.0, 8.0, 12.0, 8.0]. The intended output is [False, False, True, False]. OP didn't ask for it to be split into groups of four; they asked for every fourth value to be taken.
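
To make the difference concrete, here is a small side-by-side sketch. The 16-element input is made up (the original test input isn't shown in this thread); it was chosen so that both quoted outputs come out, and the function names are illustrative.

```python
import numpy as np

def every_fourth_is_max(input_array):
    # Intended behaviour: keep every fourth value, flag the maximum.
    every_fourth = np.asarray(input_array)[3::4]
    return every_fourth == every_fourth.max()

def max_of_groups(input_array):
    # What the split-based version computes: the max of each group of four.
    cut_array = np.split(np.asarray(input_array), len(input_array) // 4)
    return [float(max(cut)) for cut in cut_array]

# Hypothetical input, not the one actually used above.
example = np.array([1, 2, 3, 0, 8, 1, 2, 5, 4, 12, 6, 7, 8, 3, 2, 1], dtype=float)

print(max_of_groups(example))                 # [3.0, 8.0, 12.0, 8.0]
print(every_fourth_is_max(example).tolist())  # [False, False, True, False]
```

The every-fourth values here are 0, 5, 7 and 1, so only the third is flagged, while the group maxima come from entirely different positions in the array.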

1