Sadgasm1 OP t1_j1u9eh8 wrote
Reply to comment by mateusb12 in Principal Components Analysis, k-means clustering and actual outcome of the 2022 Fifa world cup matches [OC] by Sadgasm1
lol
Each row in the data is a match, so ‘team1’ represents a win for the “home” team, ‘team2’ represents a win for the “away” team and ‘draw’ represents a draw.
TheGoatzart t1_j1xwe1m wrote
But home and away are totally arbitrary in this context, with the exception of Qatar. So how does that characteristic provide any benefit from an analysis or presentation standpoint?
Sadgasm1 OP t1_j1yfh7h wrote
Oh yeah generally youre right, but the data was also ‘doubled’ in this way. Because each row is a match, the variables are present for each team separately. For example: there’s attempts from outside the box by team1, and attempts from outside the box by team2. The pattern of relation should be symmetrical because these differences are arbitrary as you say.
Viewing a single comment thread. View all comments