brightbehaviorist t1_iu8tmct wrote on October 29, 2022 at 1:29 PM

Another big problem with this kind of data mining is that it doesn’t account for the base rate of route use.

Say there’s a very popular, busy pedestrian route that thousands of people walk every day. Over 6 months, there are four crimes in the database for that stretch, but that’s out of hundreds of thousands of safely completed trips. Another route is much quieter, through neighborhood streets instead of up the main drag. It only has one crime in the database for the same period, but that’s out of only a few thousand safely completed trips. If the app is just telling you that 1 < 4, it will recommend the quieter route, but that’s not necessarily the safer one. What you’d really want to know is the ratio of trips interrupted by crime : trips attempted. Otherwise the app is just reacting to where the people are and sending you away from them, and it’s common sense that an empty street is often more dangerous than a populated one.
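To make the arithmetic concrete, here is a minimal Python sketch. The crime counts (4 and 1) come from the example above; the trip counts (300,000 and 3,000) are assumptions picked only to stand in for "hundreds of thousands" and "a few thousand", and `incident_rate` is a hypothetical helper, not anything from the app.

```python
def incident_rate(crimes: int, trips: int) -> float:
    """Return crimes per attempted trip (the base-rate-adjusted risk)."""
    if trips <= 0:
        raise ValueError("trips must be positive")
    return crimes / trips


# Crime counts are from the example; trip counts are ASSUMED for illustration.
busy_crimes, busy_trips = 4, 300_000
quiet_crimes, quiet_trips = 1, 3_000

busy_rate = incident_rate(busy_crimes, busy_trips)
quiet_rate = incident_rate(quiet_crimes, quiet_trips)

# Ranking by raw count versus ranking by rate gives opposite answers.
safer_by_count = "quiet" if quiet_crimes < busy_crimes else "busy"
safer_by_rate = "quiet" if quiet_rate < busy_rate else "busy"

print(f"busy route:  {busy_rate * 100_000:.1f} crimes per 100k trips")
print(f"quiet route: {quiet_rate * 100_000:.1f} crimes per 100k trips")
print(f"safer by count: {safer_by_count}, safer by rate: {safer_by_rate}")
```

Under these assumed trip counts the quiet route wins on raw count but has a per-trip rate about 25 times higher. With different trip counts the result could go the other way, which is exactly why the denominator is needed.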

That’s on top of the substantial problems with the database data that others have pointed to.

Hreed1 t1_iu8v6d4 wrote on October 29, 2022 at 1:43 PM

Brilliant points all around.

As you pointed out, this app completely hides the fact that the overwhelming majority of walking trips happen without any incident. Therefore, any incidents reported appear more like noise in the full context of the data, rendering the predictive power of the model useless, and consequently rendering this app obsolete at the start.

agentxstealth OP t1_iu914o5 wrote on October 29, 2022 at 2:31 PM

Many studies have shown that past crimes are indicative of future crimes at similar locations, which is what inspired me to build the app. Hence the model definitely has some predictive value and is not obsolete at all, imo.

Hreed1 t1_iu95c5m wrote on October 29, 2022 at 3:03 PM

Okay, but HOW MUCH predictive power?

“Some predictive value” is no different from “almost no predictive value at all”

Can this app accurately predict whether or not someone will have an incident on any given path? Of course not, bc if the data was that granular the police would just intervene and prevent the crime from happening in the first place - and of course, this is ridiculous.

The amount of data you would need to prove that simply does not exist.

And if you think simply saying “many studies” is convincing, then you are sorely mistaken. Aren’t you trying to promote this app? Are you really seeking to make people safer or is this just a cash grab, profiting off of some people’s innate fears?

agentxstealth OP t1_iu96n8f wrote on October 29, 2022 at 3:12 PM

I did extensive market research before pursuing this idea, and seeing all the studies on the correlation is what inspired me to build it. In fact similar apps also use a similar principle to calculate areas where future crimes are likely to occur. An example of such an app is WalkSafe+. It's your choice if you want to believe that.

You're right that the app cannot predict whether someone will have an incident, but it definitely lowers the chance a considerable amount. Hence "some predictive value" is most certainly different from "almost no predictive value at all." It's your choice whether to download the app if you don't think it's benefitting you. I don't think you speak for everyone though.

Hreed1 t1_iu9sb9a wrote on October 29, 2022 at 5:47 PM

Very unconvincing.

-What is the predictive power? Precision and Specificity, confidence intervals?
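For anyone unsure what those metrics would look like in practice, here is a rough Python sketch. The confusion-matrix counts are entirely hypothetical (they are not measurements of this app), and the function names are made up for illustration. The interval is a standard Wilson score interval for a proportion.

```python
import math


def precision(tp: int, fp: int) -> float:
    """Share of segments flagged as dangerous that actually saw a crime."""
    return tp / (tp + fp)


def specificity(tn: int, fp: int) -> float:
    """Share of crime-free segments that were correctly left unflagged."""
    return tn / (tn + fp)


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# HYPOTHETICAL counts: street segments flagged vs. whether a crime followed.
tp, fp, fn, tn = 8, 92, 12, 888

prec = precision(tp, fp)
spec = specificity(tn, fp)
lo, hi = wilson_interval(tp, tp + fp)

print(f"precision:   {prec:.3f} (95% CI {lo:.3f} to {hi:.3f})")
print(f"specificity: {spec:.3f}")
```

The point of reporting an interval is that with only a handful of true positives the precision estimate is very uncertain, so a claim like "lowers the chance a considerable amount" would need numbers of this kind behind it.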

“definitely lowers the chance a considerable amount”

Explain. How did you come to this conclusion? What are you even considering is “a considerable amount”?

Can you even currently calculate the correlation between your “danger score” and predicted crime?

You’re selling this app as safety, a mistake many apps and service solutions make. But this app does not actually make people safer. You haven’t demonstrated this and it doesn’t look like you intend to. It looks like you don’t care if this app actually makes people safer - just worried about the “market” and the download rate. Which is sorta okay…..(because anyone relying purely on an app to be safe is insane in the first place)

You’re right, this app isn’t for me, and idk who it’s for. I wish you luck in hopes that this improves and that you really rethink your strategy. Nevertheless, making an app is a big task - so congrats on getting this far.

[deleted] t1_iu90vvd wrote on October 29, 2022 at 2:30 PM

[deleted]

brightbehaviorist t1_iu964wd wrote on October 29, 2022 at 3:09 PM

It’s just not true that the absolute number of crimes “should” correlate with the ratio of trips interrupted by crime : trips attempted, even if both routes have been taken at least some minimum number of times (which your model doesn’t have any way of knowing, anyway). If you don’t understand this very basic bit of data science, it’s totally reckless for you to be offering people advice on where to walk.

Look, there were 485 murders in NYC in 2021, compared to 337 in Baltimore in the same period. But we all know better than to say that Baltimore is safer than NYC, because NYC has many times more people in it than we do. You have to correct for the base rate of population by comparing murders/100k residents or something. When you do that, you see that the count doesn’t correlate with the relative risk ratio at all!
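A quick Python sketch of that per-100k correction. The murder counts are the ones quoted above; the population figures are rough, rounded assumptions (about 8.5 million for NYC and about 580,000 for Baltimore), so treat the resulting rates as ballpark only.

```python
def per_100k(count: int, population: int) -> float:
    """Convert a raw count into a rate per 100,000 residents."""
    return count / population * 100_000


# Murder counts are from the comment above; populations are ASSUMED round figures.
cities = {
    "NYC": {"murders": 485, "population": 8_500_000},
    "Baltimore": {"murders": 337, "population": 580_000},
}

rates = {name: per_100k(c["murders"], c["population"]) for name, c in cities.items()}

for name, rate in rates.items():
    print(f"{name}: {cities[name]['murders']} murders, {rate:.1f} per 100k residents")
```

With these assumed populations, NYC has the higher count but Baltimore's rate comes out roughly ten times higher, so the ordering by count and the ordering by rate are reversed.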

If you don’t have the information you need for the denominator of the relative risk ratio, there’s no amount of “testing” that can show your model works.

agentxstealth OP t1_iu98mce wrote on October 29, 2022 at 3:27 PM

This is a good point, but as I stated previously, current crime hotspots do indeed have predictive value of where future crimes will occur. Moreover, I am not forcing the user to take a specific route, they have a plethora of options to choose based on their preferences for both speed and safety. If you don't believe me you don't have to download the app.

Also your whole comparison of Baltimore to NYC is completely unfounded because the user will be choosing between routes that go through the same city, with all route options having been taken a similar number of times. Hence if one route has a significantly greater number of crime hotspots that will almost certainly correlate with the crime:trips attempted ratio. Also as I said previously, the user can choose between multiple options and I am not forcing him to take any specific route.

Edit: brightbehaviorist I totally understand what you're saying, that the crime:trips ratio would be a better indicator of safety than just crimes, and I am working on integrating this into the algo. However, I think it is still very useful to see local crime hotspots along the route you're taking, and as I have said before, the user can choose between a plethora of options based on their preferences (like google maps). Thanks for your feedback!

[deleted] t1_iu8zzzq wrote on October 29, 2022 at 2:23 PM

[deleted]

agentxstealth OP t1_iu8z95b wrote on October 29, 2022 at 2:17 PM

I am literally using a database of real-time verified crime data that gets updated hourly with new, very recent crime reports that get factored into the algorithm to determine the safest route. How is that not "backed up by data"? You are also right that a lot of the time the safety scores for each route are very similar, which is why the user has the option to choose between different routes (they are not just being forced to take one route) and pick one that optimizes their preferences for both safety and speed.

MaxipadMassacre t1_iu9cl1i wrote on October 29, 2022 at 3:56 PM

Please show me where BPD is posting “real-time verified crime data that gets updated hourly.”

agentxstealth OP t1_iu9ctqv wrote on October 29, 2022 at 3:58 PM

Not using BPD

MaxipadMassacre t1_iu9d1lk wrote on October 29, 2022 at 3:59 PM

So then where is this data coming from? Lots of us here have trouble trusting BPD reporting in the first place, but if it’s not their data then whose? Pretty sure the only source of crime data for the city would be from police

agentxstealth OP t1_iu9dnkp wrote on October 29, 2022 at 4:03 PM

From an api that performs web scraping on several datasets across the US. Based on the testing I have done the crimes are extremely recent, from within the last day to the past week, and new ones are inputted each day that can change the recommended route. I don't feel comfortable sharing the exact source as that might give too much info away for competitors.

Also if you don't trust the reporting/data source you can still use the app to see the locations of nearby crimes, and the user is not restricted to one route; he has a plethora of options to optimize his preferences for both safety and speed.

MaxipadMassacre t1_iu9e2wo wrote on October 29, 2022 at 4:07 PM

If these are public datasets you should have no problem sharing them and letting potential customers see where exactly you’re pulling data from. Otherwise, they have no reason to trust it. I see you’ve posted in multiple other cities trying to shill this same service. Hopefully, the fact that this post has gotten the most traction in two weeks, and the fact that most here are refuting your claims or want to see the data shows you that your idea, while well intentioned, is not feasible. At least not the way you’re presenting it.

agentxstealth OP t1_iu9epfz wrote on October 29, 2022 at 4:11 PM

If I share the exact dataset that will literally be a recipe for someone to build a similar app, especially if I do that in a reddit post. If you want, you can email me at streetwisesafe@gmail.com and we can chat over email there and I can be more specific. Also every claim that has been presented I have refuted, so the app is definitely both well-intentioned and feasible.

You are the only person who has asked to see the actual data source and cannot trust it without knowing that, not "most here."

MaxipadMassacre t1_iu9exid wrote on October 29, 2022 at 4:13 PM

I disagree wholeheartedly and have no interest in keeping this conversation with a brick wall going over email. Your app’s download numbers will reflect the flaws in your concept.

agentxstealth OP t1_iu9ff7p wrote on October 29, 2022 at 4:16 PM

Any logical reason why you "disagree wholeheartedly?" Also if you download the app and click the info icon it explains how the data is being collected. I just don't want to share that on a reddit post is all.

Ever felt unsafe walking along the streets of Baltimore late at night?

Hreed1 t1_iu8nhy3 wrote on October 29, 2022 at 12:32 PM