Submitted by AutoModerator t3_yj6tsc in dataisbeautiful

Anybody can post a question related to data visualization or discussion in the monthly topical threads. Meta questions are fine too, but if you want a more direct line to the mods, click here

If you have a general question you need answered, or a discussion you'd like to start, feel free to make a top-level comment.

Beginners are encouraged to ask basic questions, so please be patient responding to people who might not know as much as yourself.


To view all Open Discussion threads, click here.

To view all topical threads, click here.

Want to suggest a topic? Click here.

33

Comments

TempoMentalWriter t1_iup47gy wrote

I'm looking for a software package that would allow me to do a timeline visualization, similar to what is done on Wikipedia for, say, musical groups.

https://imgur.com/a/bJPSFrv for an example.

1

Zibbulon t1_iuxlro9 wrote

Hello, I am in the process of learning Matplotlib and it's been very fun so far. Do you know where I can find basic public databases or files that contain various information on which to practice making graphs?

Thank you.

1

FeistyConstruction60 t1_iv8v9z7 wrote

People, I would like to know your opinion about the collection of our data without our awareness.

1

ShounakDas t1_ivbtdhz wrote

What are the best data visualization tools that can also export to mp4? Using Python and APIs.

1

Cloudkid78 t1_ivdjb8t wrote

That's honestly great! (Say hi guys, this screenshot is for my Statistics and Analysis homework).

1

RustyKarma076 t1_ivf3mox wrote

Can we make r/DataIsBeautiful good again?

I see way too many posts on this sub that boil down to lazy attempts at trying to make a political claim. The data isn’t that neat, or fun to look at. All it is, is a bar graph form of “hey guys look how little Fox News talks about school shootings.” or “hey guys look how awful Christians are.”

The point of the sub is to show the data, and the well done presentation of the data. If the focus of your post is politics, and specifically how one political group is wrong/hypocritical, and makes little to no attempt at presenting data in an interesting way, it should be deleted.

This is a great example

7

worksofter t1_ivl9uej wrote

I have some data where the values decrease the further along both the x and y axes you go. What is an attractive yet effective way of showing this data? The goal is that somebody can look at where they are on the x axis, and where they are on the y axis, and find the 'meeting point' for the answer (sorry to be vague). Would appreciate any help. Thank you

https://imgur.com/cQOqRbk

1

Helique t1_ivqap2t wrote

I am looking for a new job, are there any standardized data formats for recording the job search process? I have seen lots of people create visualizations, and it would be cool if we could standardize on data formats.

1

pappasmurf91 t1_ivu2fzv wrote

So, sort of an odd request considering this group's name. But I'm a teacher looking to talk about graphs and how they can be used for good and bad. I was wondering if people had "data is ugly" examples? If I need to make this into a different thread, let me know.

1

talkingtunataco501 t1_ivu3g41 wrote

For 2.5 years, I collected this data.

  • Drinking: whether I drank on a day or not, how much I had to drink, whether I did it for fun or stress, and whether I had a hangover or not the next day
  • Marijuana: whether I had pot on a day or not, the method I took (smoke or edible), whether I did it for fun or stress, and whether I had a hangover or not the next day

I started to collect this data to track some interesting things. From this, I figured out that I get alcohol-induced migraines. I also found out that on average I drank 1.05 times per week with 3.1 drinks per week, and I do pot an average of 1.93 times per week. This is during that particular 2.5 year period.

What are some good visualizations that I can do with this data?

2

Infamousscorpion t1_ivv0j91 wrote

I've wondered the same thing in the past and was thinking of making my own. The big picture idea is that it is called a 'Sankey' diagram.

https://plotly.com/python/sankey-diagram/

Hope that helps
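
A minimal sketch of the data side of that idea, assuming a hypothetical job-search log where each application is recorded as the ordered list of stages it went through. The stage names and the `sankey_links` helper are made up for illustration and are not a standard format. It only counts stage-to-stage transitions and returns the label, source, target and value lists that a Sankey diagram is built from.

```python
from collections import Counter

# Hypothetical job-search log: one entry per application, listing the
# stages it passed through in order. Stage names are illustrative only.
applications = [
    ["Applied", "No response"],
    ["Applied", "Phone screen", "Rejected"],
    ["Applied", "Phone screen", "Interview", "Offer"],
    ["Applied", "Rejected"],
    ["Applied", "Phone screen", "Interview", "Rejected"],
]


def sankey_links(paths):
    """Count stage-to-stage transitions and return Sankey-style lists."""
    counts = Counter()
    for path in paths:
        # Pair each stage with the one that follows it.
        for src, dst in zip(path, path[1:]):
            counts[(src, dst)] += 1

    # Node labels in first-seen order, so the indices are stable.
    labels = []
    for src, dst in counts:
        for name in (src, dst):
            if name not in labels:
                labels.append(name)

    index = {name: i for i, name in enumerate(labels)}
    source = [index[src] for src, _ in counts]
    target = [index[dst] for _, dst in counts]
    value = list(counts.values())
    return labels, source, target, value


labels, source, target, value = sankey_links(applications)
for s, t, v in zip(source, target, value):
    print(f"{labels[s]} -> {labels[t]}: {v}")
```

These four lists line up with the `node.label`, `link.source`, `link.target` and `link.value` inputs of Plotly's `go.Sankey`, so the same log could feed the chart described on the linked page.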

2

Summoarpleaz t1_ivyugrv wrote

THANK YOU. I subbed to this not to exclusively read about politics or, really, to read about the data, but to look at smart and beautifully presented data. Idk why it has boiled down to bar charts.

1

RustyKarma076 t1_ivyz6ks wrote

There’s nothing inherently wrong if the post is about politics. Even if the data is very clearly trying to make a point about one political group. I’m fine with that, as long as the data is “beautiful.”

There’s nothing interesting about bar graphs and pie charts. That’s all they are

edit: like this

It’s the top post of the day and it’s just a wall of text to complain about billionaires. There’s nothing interesting about it

1

Summoarpleaz t1_ivyzgp3 wrote

Yes agreed. I have no issue with politics but it seems like the posts have basically gone down in the data visualization aspect in favor of making a political point. Again, wouldn’t be an issue if only the data visualization part were punched up a notch.

1

_artbreaker t1_iw7038j wrote

I have been thinking recently, especially with COP happening, that this subreddit could create something that collectively visualises and tracks the scale of climate impact as it's happening. Potentially as a micro-website.

Like how many countries have experienced drought, flood, unprecedented storms, famine etc. This could also be linked to news stories around the world.

I feel like the best thing about this sub is the way it can help convey complex information at scale. We get news articles around floods in Pakistan, droughts in China, but what does that all look like together?

2

SullyPanda76cl t1_iw9reue wrote

Hi... can we pin a "101" thread on animating data?

I see that's a very common question for newbie enthusiasts (like me)

2

vitaliyh t1_iwasqsm wrote

How would one go about making a heatmap of persons earning more than $100,000 per year, assuming each person is a 50-mile circle rather than a dot?...

1

MatteDambro t1_iwb9om1 wrote

How can I get data on luxury goods inflation over 20 years?

If it is not aggregated, can I get a price index over the years for:

  • 5-star hotels
  • Luxury fashion
  • Sports cars
  • Yachts
  • Villas

1

skipjack- t1_iwf5zx6 wrote

So I'm relatively new to this sub (and reddit as a whole), but I have a few interactive visualizations I'd like to share and get feedback on. I have two questions before I move forward though, both relating to the "Title" and image limitation...

  • The sourcing and description must be in the image?
  • Is it ok for me to respond to my own post with a link to the interactive version?

1

gftmc t1_iwh951i wrote

I'm looking to do some clustering over time. What kind of tools would you suggest? (I can program well enough)

Specifically, I have a group of people and connections. The connections strengthen or weaken over time. I want to show how strong a connection between two individuals is, and cluster them together the stronger the connections are.

1

skipjack- t1_iwjk9er wrote

Ok looks like the answer to the first question is "no":

> [OC] posts must state the data source(s) and tool(s) used in a top-level comment.

And for the second, yes (implied by the same quote above I think).

2

zebulon_20 t1_iwkcd09 wrote

[Serious] What's the most tattoo-worthy function, line of code, or plot?

1

sinncross t1_iwo5uy6 wrote

Hi, I'm wondering what is the best way to graph a survey question related to ranked choices from 3 different demographics?

E.g.: Rank the following fruit in order of deliciousness: apples, watermelon, lemons. And the answers come from kids, teenagers and adults.

1

melent3303 t1_iwpmvc0 wrote

What would be the best way to visualize this data set:

Black Friday Death Count

It has data on: year, location, # of injured, and # of deaths.

1

Historical-Jello1745 t1_ix9wd2g wrote

Health data question: With things like smart watches, calorie tracking apps and sleep cycle alarm apps - users are gathering a lot of data about their daily habits. While a lot of it might be quite dirty (e.g. sleep alarm apps may miss records for weekends, or automatic time guesses may be off, users may lie to calorie tracking apps) - in aggregate the data has to be somewhat useful, no?

Wondering:

  1. Why there aren't any major projects asking for this data?
  2. What are the risks/implications of using this data - even if donated freely by volunteers?
  3. What could such data actually be useful for? (e.g. HR data for a million people over 5 years, OR sleep timing and quality for similar sets of people, OR complete food logs from fitness enthusiasts or chronic dieters)

1

FearlessHead8689 t1_ixh45p1 wrote

Hi!

I am looking to make a career change to data analytics in the next 2-5 years. Right now, I still work with data quite regularly in my current role and often make basic visuals in Excel.

I would like to up my data visual game! I assume Excel won't cut it, so what programs should I start practicing with?

Thanks!

1

qthrow12 t1_ixmqifm wrote

For any data people here, I have a question about how to present some data.

I've got a query that returns periods of vacation time for a group of people over many years (5 in this case).

It also identifies when there's a stat day.

My data looks like this, for example. This would be 1 row of data:

startdate - 2018-01-01 00:00:00.000

enddate - 2018-01-02 00:00:00.000

count of members - 2

Stat Day found (within that range) - Y

I'm trying to figure out how to compare 2018 to 19 to 20 to 21 etc.

How would I best display this?

My end goal is to hopefully be able to use this to say, for example, that in the 2nd week of February you typically have X members off on vacation, so a manager could prepare.

The problem I'm having figuring this out is that every year moves everything forward a day, and you can't just add those movements so they all balance out: a day that was in week 1 of a month might be part of week 2 now, but the stat fell in week 1 regardless, and that's the week people might take off.

I hope this makes sense. Thank you so much for your time.

1

Cosmic-Rookie t1_ixn6w9p wrote

I'd love to see somebody make a visualisation of how many countries actually did boycott the Qatar World Championship broadcasts. There seem to be mixed results, with some posts stating that Germany had half as many viewers as expected and Denmark had more than they had during a previous match against Russia.

1

TurtleChomps t1_iy6ffgh wrote

Would love to know software used for the beautiful charts (static), maps etc… and where can we learn it? Any classes specific to data viz or

1

Hayk94 t1_iy7r404 wrote

Hey guys, a data viz noob here so excuse my foolish question.

Say we have a chart with bars and lines combined. And it's a dual Y axis chart. One axis is for the bars and the other one for the line. Now my understanding is that the two axes should have some correlation, shouldn't they? Is there any scenario where it makes sense for the axes not to be correlated?

Also, any links to reading material on the rules for those types of charts, and whether the axes should be correlated or not, would be appreciated.

1

YOLO4JESUS420SWAG t1_iy8hygr wrote

Where can I make a request? I love this sub but don't know shit about data mining nor how to make cool graphics like I see here. After hitting 8bn people on earth I was hoping to see a visual of something like the last 100 years global map of what areas exploded and by how much. It'd be so cool to see.

1

beingsubmitted t1_iycmkvz wrote

I'm a little in between, here. I've posted OC a couple of times, and I'm working on a new one currently.

I for one do think aesthetics should be part of it, absolutely. Having posted OC before, though, you will get a lot of pushback for aesthetic choices if your colors don't perfectly interpolate or anything can be interpreted as not strictly accurate. There's a degree to which accuracy and aesthetics are at odds. For example, a pie chart can be perfectly accurate, but if certain colors are too similar or one color stands out without good reason, you will be accused of manipulating data or not presenting it honestly.

On the other hand, I also think some data just are beautiful, regardless of aesthetics. Interesting data are fun to look at on their own, and the data itself is part of the beauty, like "r/oddlysatisfying". I think sometimes a post isn't about the data visualization being beautiful, but the data itself.

Ultimately, though, it's a matter of voting, particularly on new/rising posts. Given that only about one in a thousand viewers or less of a post actually up or downvote, individuals have a lot of power to reshape the sub, by voting in new/rising. If more aesthetically-minded posts are lifted, the OC will follow suit.

1

Atomicityy t1_iyd9odo wrote

Can someone recommend a noob friendly program to analyze data? Will said program also help me transform my findings into charts and graphs?

I want to analyse my Spotify Wrapped playlists from 2016 til now. Examples: how each decade/genre is represented, nationalities of the artists, which songs return over the years, etc.

1

zestyping t1_iyewudy wrote

Members and mods, let's talk about misleading visualizations.

The subreddit description says DataIsBeautiful is for visualizations that effectively convey information.

To me, that means highly misleading visualizations shouldn't qualify. Don't we all want a high-quality subreddit?

Of course, we all make mistakes sometimes; I'm not talking about missing one point out of 100, or plotting a point a few pixels off to the side, the kind of thing that could be fixed with a small correction. I'm talking about egregious designs, where a visual element is 2 or 3 or even 10 times larger than it should be. Whether or not there's an intent to mislead, these are simply low-quality visualizations, and we'd be better off without them.

What do you think? Should this be addressed in the FAQ? Should there be a guideline?

1