Submitted by AutoModerator t3_10qu5oa in dataisbeautiful

Anybody can post a question related to data visualization or discussion in the monthly topical threads. Meta questions are fine too, but if you want a more direct line to the mods, click here

If you have a general question you need answered, or a discussion you'd like to start, feel free to make a top-level comment.

Beginners are encouraged to ask basic questions, so please be patient responding to people who might not know as much as yourself.


To view all Open Discussion threads, click here.

To view all topical threads, click here.

Want to suggest a topic? Click here.

68

Comments

Zaskoda t1_j6xrp6n wrote

Is there a nice list of open data providers anywhere? There are so many nifty APIs providing open data across the Internet. I figure there must be a nice index listing a bunch of them but I'm not sure where to look for such a thing. This seemed like the kind of sub that would know.

2

Ancient-Bread-3236 t1_j724k7f wrote

Does anybody know a good (possibly free) tool for visualising interactions between people in a timeline?

Looking for something that kinda looks like this: https://www.chartgeek.com/wp-content/uploads/2019/04/timeline-game-of-thrones.jpg

2

PsychologicalEgg9377 t1_j95zs8b wrote

A network graph using something like graphviz might be a better method. The problem is, the result represents a snapshot in time. So you'd likely have to animate it or otherwise make it interactive with a timeline slider.
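One way to prepare data for the slider idea is to split the interaction log into one edge set per time step and render each step separately. The sketch below is only an illustration: the names, the step numbers, and the helper functions `snapshots` and `to_dot` are made up for this example, and it writes plain Graphviz DOT text rather than calling any Graphviz library.

```python
from collections import defaultdict

# Hypothetical interaction log: (time step, person, person).
interactions = [
    (1, "Ann", "Bob"),
    (1, "Bob", "Ann"),  # same pair again, should not create a second edge
    (1, "Bob", "Cy"),
    (2, "Ann", "Cy"),
]


def snapshots(log):
    """Group interactions into one set of undirected edges per time step."""
    frames = defaultdict(set)
    for step, a, b in log:
        frames[step].add(tuple(sorted((a, b))))  # Ann-Bob equals Bob-Ann
    return dict(sorted(frames.items()))


def to_dot(edges):
    """Render one snapshot as an undirected graph in Graphviz DOT syntax."""
    lines = ["graph interactions {"]
    for a, b in sorted(edges):
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


frames = snapshots(interactions)
for step, edges in frames.items():
    print(f"// step {step}")
    print(to_dot(edges))
```

Each printed block could be saved as its own file and rendered as one frame, so stepping through the frames plays the role of the timeline slider.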

1

monstrodolagoness t1_j8iqqhc wrote

I'm afraid to make a new topic/discussion and be inappropriate for this sub. But, let me ask you guys... what software do you use to make these amazing-looking plots? I'm trying my best with python, but I'm not sure if I'm on the right path.

My goal is to make plots for business reports, but not sure about the quality.

Any help would be deeply appreciated! And thank you in advance.

2

crasyeyez t1_j6s7gxm wrote

Can anyone help me think of a better way to visualize the info in the attached?

I need to show the rankings by country for the various types of renewable energy. The key point to convey is the countries and their rankings.

https://imgur.com/a/prl9mDj

1

ForeverMorning0426 t1_j84lux5 wrote

A tabbed bar graph, with an animation when users switch tabs. Use each country's rank as the value, so that a taller bar means a higher rank. Show the countries' flags along the x-axis, and use the tabs to control which type of renewable energy the bar graph shows.

1

Illustrious-Fox4063 t1_j9n5xk1 wrote

A stacked bar graph, although it is not the best option because it is hard to compare the amounts of each type across countries. A user would still get an impression of the overall makeup of each country's renewable energy usage. Stacked bars also do not convey the overall quantity of each country's energy usage or the total for each renewable type.

Side-by-side bars are another option, as they allow you to have a uniform axis for the total amount of energy and then group the bars by either country or renewable type.

1

NickEcommerce t1_j6sa318 wrote

I have a bunch of rows that each contain a product name, and then the number of items sold for each of the last 12 months. How can I highlight which months are above average for the row? In Excel that kind of conditional formatting gets applied to the entire dataset, but each row needs its own average calculation. I could apply a fresh conditional format to each row, but with more than 1,000 rows it's a big pain in the backside!

1

crimeo t1_j6ysk0j wrote

Not really a proper answer, but a loophole/workaround:

  • Make another copy of the whole table, but this time each row normalized (subtract minimum from the row then divide by (maximum - minimum)) so every row now goes 0 to 1.

  • Apply a single conditional format to the entire thing, since now each row is apples to apples and you only need one

  • Use this to visually navigate instead or to sort, and the left table to see the raw numbers
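For anyone who would rather script it, the same row-by-row normalization could be sketched in plain Python as below. The product names and sales numbers are invented for illustration, and `normalize_row` and `above_row_average` are just helper names chosen for this example.

```python
def normalize_row(values):
    """Min-max scale one row so its smallest value maps to 0 and its largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat row has no spread; return zeros rather than divide by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def above_row_average(values):
    """Flag the entries that exceed the mean of their own row."""
    mean = sum(values) / len(values)
    return [v > mean for v in values]


# Hypothetical monthly unit sales for two products (illustrative numbers only).
sales = {
    "Widget": [1, 4, 10, 250, 120, 30, 5, 2, 1, 8, 60, 200],
    "Gadget": [40, 42, 38, 41, 45, 39, 44, 40, 43, 37, 46, 41],
}

normalized = {name: normalize_row(row) for name, row in sales.items()}
flags = {name: above_row_average(row) for name, row in sales.items()}

for name in sales:
    print(name, [round(x, 2) for x in normalized[name]], flags[name])
```

Inside Excel itself, a single formula-based rule with a column-locked range, for example `=B2>AVERAGE($B2:$M2)` assuming the months sit in columns B to M starting at row 2, should also compare each cell against its own row's average.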

2

NickEcommerce t1_j711z9i wrote

That's a great idea, thank you! Some of my numbers are so vast in range it didn't occur to me to normalise. They're sales figures, so in a poor month an item might sell 1, but in a good month it might sell 250, so when figuring out seasonality I am finding it tough to pick out some "winning" months for a given product.

1

Cypherazul_0 t1_j6xoyrn wrote

I really want to make a linked data map that webs out to show all the interlinked pieces of data, specifically the subjects covered on a podcast. Does anyone have ideas for a place where I can make something like that?

1

Zambooka100 t1_j70dug7 wrote

I am working on a project where I have been provided a data set of auto loans for a specific period of time with many loan parameters such as credit score, debt to income, loan to value, original term, interest rate, payment amount, vehicle year, make and model, probability of default, unsecured DTI, etc.

I need to identify patterns in auto loan charge-offs (vehicle surrenders/repo). I can look at anything I want at all.

I’ve come up with some ideas: basic comparison of a number of parameters between charge-offs and non-charge-offs. Interest accrual on loans prior to charge off vs the calculated loss of the vehicle and how large the offset is. Number of months loan was working with collections prior to default and changes made to loan payments.

I am just looking for additional ideas on what I could show with this data. I’m happy to provide additional information if anyone is interested.

1

Ol_grans t1_j72gs6t wrote

Hey Folks! I am looking for help in visualizing theoretical public transportation routes for a given US metropolitan area.

We have pretty lackluster service and I want to poll the public and ask them where they commute to.

Questions would be something like:

  1. What neighborhood do you live in (field/drop-down)
  2. Enter up to 3 locations you frequent in a month that you would prefer to take public transportation to (address field)
     2a. How often do you commute to locations 1-3 (daily, weekly, monthly)
  3. What is your current commute time in minutes (integer)

Given this data, how could I process/visualize these trends? I would like to say "wow! A lot of people need rapid transport from towns A <-> B and towns B <-> C but not so much for towns A<->C!"
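A rough first step for data like this is to count estimated trips per pair of places, treating A to B and B to A as the same corridor. The sketch below assumes the address answers have already been mapped to town or neighborhood names; the sample responses and the trips-per-month weights for the answers to question 2a are invented for illustration.

```python
from collections import Counter

# Assumed weights: rough trips per month for each answer option in question 2a.
TRIPS_PER_MONTH = {"daily": 20, "weekly": 4, "monthly": 1}


def corridor_demand(responses):
    """Sum estimated monthly trips per unordered pair of places."""
    demand = Counter()
    for home, destination, frequency in responses:
        if home == destination:
            continue  # trips within one area do not form a corridor
        pair = tuple(sorted((home, destination)))  # A<->B equals B<->A
        demand[pair] += TRIPS_PER_MONTH[frequency]
    return demand


# Hypothetical survey rows: (home area, destination area, frequency).
responses = [
    ("A", "B", "daily"),
    ("B", "A", "weekly"),
    ("B", "C", "daily"),
    ("C", "B", "daily"),
    ("A", "C", "monthly"),
]

demand = corridor_demand(responses)
for pair, trips in demand.most_common():
    print(pair, trips)
```

The resulting counts could then be drawn as lines between places on a map, with line thickness scaled by the trip total, which makes heavy corridors like B <-> C stand out from light ones like A <-> C.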

1

donuthorse t1_j78ppul wrote

Someone is trying to hurt and kill dogs in my home town! I need help!

A little bit of background: I live in a town in Sweden with around 340,000 people. Between December 2020 and today, we've had over 150 known attempts to hurt and kill dogs.

The perpetrator, in most cases, deploys small baked bread buns containing sharp, handmade "stars" made out of pieces of tin cans. Sometimes the buns are dropped in plain sight, sometimes near bushes / under leaves etc.

Now, I have collected all the police reports, gone through them all and entered them into Excel with the date, the time when reported (where applicable), the address where it happened, etc.

Can someone help me to visualize this somehow? Maybe there is some kind of pattern? Maybe I'm grasping at straws here.. but maybe someone can help me?

Thank you

1

RattisTheRat t1_j8r5ggs wrote

From what you’ve written, I think viewing this as a line chart over time would be helpful.

A step further: if your town is split into areas (i.e. up town, down town, east block, west block, etc.) and you view each area as its own time series, you might see a pattern in events across the different areas.

Edit: you might see that an event that occurred in the ‘west block’ often comes with an event in the ‘up town’

1

PsychologicalEgg9377 t1_j94589f wrote

This is awful to hear. If it's the same person or group, you might find some pattern based around weekdays, holidays, time of day, etc. Try labeling them based on date/time and see if you can run some summary statistics. If it's kids, it might increase during holidays. If it's someone jobless, there may be no pattern. This is all speculation, but it's a starting point.
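The date/time labeling could start as simply as counting reports per weekday and per hour. This is only a sketch with made-up timestamps; the real values would come from the spreadsheet, and the timestamp format shown is an assumption.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical report timestamps (format assumed to be YYYY-MM-DD HH:MM).
reports = [
    "2021-01-02 08:15",
    "2021-01-09 07:50",
    "2021-01-13 17:30",
    "2021-02-06 08:05",
]

parsed = [datetime.strptime(stamp, "%Y-%m-%d %H:%M") for stamp in reports]

# Count incidents per weekday name and per hour of day.
by_weekday = Counter(WEEKDAYS[moment.weekday()] for moment in parsed)
by_hour = Counter(moment.hour for moment in parsed)

print(by_weekday.most_common())
print(sorted(by_hour.items()))
```

If one weekday or hour clearly dominates the counts, that is the kind of pattern worth plotting and passing along.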

Machine learning might be able to offer some insights as well, but that might be too steep a learning curve.

Good luck in finding them

1

SupermarketOk8234 t1_j7q6h5t wrote

Can somebody help me with how to install my VE type1? Please!

1

tan_tan_tanuki t1_j7riie7 wrote

I am a fifth-grade teacher about to teach (extremely basic, obviously) statistics and probability concepts to kids. My student group includes many who respond well to visual approaches to math. Can anyone here recommend any beginner- or child-friendly websites that generate beautiful and intuitive graphs?

1

PsychologicalEgg9377 t1_j943mp3 wrote

It's not a very satisfying answer, but Excel might be your best bet. You likely already have it on your work computer and there are tons of resources for creative ideas. You can download tons of data already in a spreadsheet usable form (csv mostly). Most services will either have a steep learning curve, or be too basic to be useful.

2

RattisTheRat t1_j8r5vfj wrote

Not sure what fifth grade is here, but I find Khan Academy super helpful, even now as an adult.

1

yankee29 t1_j7tk82q wrote

Hey everyone,

I recently tried to visualise some of my research findings in a classic two-dimensional plot in R. Problematically, some of my observations share the exact same values for both the X- and Y-dimension, leading to a perfect overlap in the graphic.

I would like to fix said overlap, making sure that all dots are clearly visible. Whilst this should not be too much of an issue, for some reason, I can't seem to find a workable solution that makes all dots visible.

Does anyone have design ideas, or similar, for how to plot my observations in a better way that makes all overlapping data points visible?

Many thanks in advance

1

PsychologicalEgg9377 t1_j9432iq wrote

This is common. There's a technical term for it (jitter). There's a balance between making your datapoints visible and changing their values too much.
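To show the idea, here is a minimal sketch in Python (the question is about R, where ggplot2's `geom_jitter` does this for you). The `jitter` helper, the sample points, and the `amount` value are all made up for this example; the right amount depends on the scale of the data.

```python
import random


def jitter(points, amount=0.1, seed=0):
    """Return a copy of (x, y) points, each shifted by uniform noise in [-amount, amount]."""
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]


# Hypothetical observations; three of them share the exact same coordinates.
observations = [(1, 2), (1, 2), (1, 2), (3, 4)]
spread = jitter(observations, amount=0.1)

for original, moved in zip(observations, spread):
    print(original, "->", (round(moved[0], 3), round(moved[1], 3)))
```

A smaller `amount` keeps the dots closer to their true values but leaves more overlap, which is the balance mentioned above.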

1

Airborne18th t1_j860hma wrote

I have an Excel file with upstream and downstream content: the data is in 3 columns, where the left column has the upstream name, the 2nd column has the current location, and the 3rd column has the downstream location. I'm hoping it can be read and visualized to show the relationships (paths). I'm wondering if there are a few very easy (no-code) options for viewing the data (interactive or not) as a relationship map. Any suggestions would be appreciated.

1

superavg t1_j8blast wrote

What program/tool are people using for the data videos posted that include moving graphics along with the data?

1

Trick_Read t1_j8nwrds wrote

Guys, I suck at data. Really want to improve this and the visualisation skill. Where to begin?

1

PsychologicalEgg9377 t1_j942tn7 wrote

Excel. I dislike Excel for many reasons, but it does allow you to quickly visualize data and get practice with the process. If you can program, R+ggplot2 or Python+Pandas+Seaborn.

1

Good_Sage t1_j8rhve2 wrote

What are the most famous or well-known methods to properly visualise data? I am willing to learn different methods or use different websites and explore various options. I am sorry if this question has been asked before, but I really want to know better ways to show data besides normal charts. (Please link me to previous threads if this has been mentioned before)

1

PsychologicalEgg9377 t1_j942krp wrote

I'm from an academic background and used to use a lot of R. There's a library called ggplot2 that is very formal and structured. Many other plotting libraries and methods are very disjoint, but ggplot2 gives you a good foundation because it's based on a lot of plotting theory. It depends if you program or not.

I found this PDF on datacamp that is very high level. I'm not sure I agree with all of them, but it's probably a good start.

https://s3.amazonaws.com/assets.datacamp.com/email/other/Data+Visualizations+-+DataCamp.pdf

2

Good_Sage t1_j943ql3 wrote

Thanks! I will take a look at that. So I am assuming there is no particular website that can do all the plotting and you would have to program that? I am good at programming but definitely not at a high level. This might as well be a long-procrastinated project for me when I get some free time. If there are some more libraries (because there seem to be a lot of cool graphs in this subreddit) please do let me know!

1

PsychologicalEgg9377 t1_j945pha wrote

The two most common stacks I see are Python+Pandas+Seaborn or R+ggplot2.

Python has the added advantage that there's big industry demand, so you would be more likely to find a job with Python experience.

2

Rezurrected188 t1_j8x299e wrote

What's the best way to, on Android, make one of those charts where you color in days on a calendar to track events?

1

PsychologicalEgg9377 t1_j941tlq wrote

I created an interactive plot written in JavaScript and Plotly that I'd like to post here. I have it hosted on a server. Is it possible to embed it as an iframe or similar? Or will I have to just post a hyperlink?

I generally don't click off-site links when scrolling reddit, so I suspect others have the same habit.

1

levinikee t1_j9drtmj wrote

Does anyone else remember a graph where OP tracked their heart rate while in a cab ride to the airport with their girlfriend?

I distinctly remember it because I thought it was so bittersweet, but I can't seem to find it anymore.

1

Nik64 t1_j9fbeog wrote

How does one come up with good ideas to visualise? The key goal is to tell a story through data, but I am struggling to find something interesting worth visualising - something that catches people's eyes - which hasn't been done before.

Can anyone give me any tips?

1

SheLookedLevel18 t1_j9oh20u wrote

It feels like this sub has moved away from the “beautiful” part and seems to just upvote the “is data” element. Many of the posts that get upvoted are the most basic graphs.

1

burnt-store-studio t1_j9t5h98 wrote

Good morning; I am looking for a dataset of NCAAM basketball games including the timestamps for shots and, with hope, running scores.

This [http://academics.smcvt.edu/jtrono/BBallArchive.htm] is a great dataset but doesn't include the fidelity I'd need.

The NCAA-related datasets pointed to by Kaggle are also missing what I'd like.

I expect the AWS NCAA stats would have this (and tons more) data, but (a) I don't know how to get even a sample of it to check, and (b) if it costs, I'm sheepishly not able to bring money into this obscure theory.

I'm trying to test a theory about scoring in the last ten minutes of games. I'd make a beautiful graph if only I could find the data :)

If you have any information, I'd be so grateful!

Thanks!

[Edit: grammar]

1