Hacker News
OpenTraffic: a free, global traffic speed data set linked to OpenStreetMap (opentraffic.io)
170 points by executesorder66 on Nov 24, 2015 | 26 comments



Where is the data?

Will full dumps be made available under a permissive license or will we have to hope that "free" and "open" mean "we have an API and might serve you something if we feel like it" as it too often does?


Data is - I believe - currently held on Conveyal's (the consultancy spearheading this) db. As I understand it, this has been developed in tandem with some World Bank initiatives, and a test version is already operational with traffic data the WB has accrued in a number of international cities. Recent pushes wrt this project have been aimed at getting US cities on board, which (from personal experience) is much, much harder than getting buy-in on this type of work in cities abroad.

And wrt anon data versus specific trip data - that is anonymized when the trips are uploaded to the system. There will be no way to identify specific trips with the tool.


> And wrt anon data versus specific trip data - that is anonymized when the trips are uploaded to the system. There will be no way to identify specific trips with the tool.

There is no way to 100% anonymise trips. Some will be re-identified.


Data related to a single trip will not be kept - data on route segments is the product of aggregates, as I understand it.
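
To illustrate what "product of aggregates" could mean in practice, here is a minimal Python sketch. It is not taken from the OpenTraffic code: the function name, the `(segment_id, speed_kmh)` input shape, and the `min_count` threshold (publishing a segment only once several observations exist) are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def aggregate_segment_speeds(observations, min_count=3):
    """Average speeds per road segment, discarding per-trip detail.

    observations: iterable of (segment_id, speed_kmh) pairs.
    min_count: hypothetical threshold; segments with fewer observations
    are not published, so a single trip cannot stand out.
    """
    by_segment = defaultdict(list)
    for segment_id, speed_kmh in observations:
        by_segment[segment_id].append(speed_kmh)

    # Only the per-segment mean leaves this function, never the raw points.
    return {
        segment_id: mean(speeds)
        for segment_id, speeds in by_segment.items()
        if len(speeds) >= min_count
    }
```

Only per-segment averages come out, so nothing in the result ties a speed to an individual trip.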


Even if they release the computed average road speeds, it would be difficult for them to release the individual traces while maintaining user privacy.

After all, if a GPS trace starts at my home and ends at my work, that's going to be pretty easy to de-anonymise.


That should be a choice up to each person with a sensible default. You could let users define home/work zones where no data would be shared. You could add a bit of delay to the data transfer to be able to say "don't send anything from the first and last X minutes of moving".

OSM is full of GPS traces volunteered by thousands of people. The NYC taxi data set is not that different and includes exactly the data you describe.

I very much find the taxi data an inconsiderate breach of people's privacy, as I doubt they were aware that this data might become available. But I consider volunteered detailed trace data perfectly fine and actually necessary for an open platform, as you cannot reproduce results otherwise.
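
A rough Python sketch of the two client-side filters suggested above (private home/work zones, plus dropping the first and last X minutes). Everything here is hypothetical: the names, the `(timestamp_s, lat, lon)` point format, and the simplification of measuring the X minutes from the first and last recorded point rather than from when movement actually starts or stops.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def trim_trace(points, trim_seconds=300, private_zones=()):
    """Return only the points that are safe to share.

    points: list of (timestamp_s, lat, lon), sorted by time.
    trim_seconds: drop this much time at the start and end of the trace.
    private_zones: iterable of (lat, lon, radius_m) the user has marked
    as home/work; points inside any zone are never shared.
    """
    if not points:
        return []
    earliest = points[0][0] + trim_seconds
    latest = points[-1][0] - trim_seconds

    kept = []
    for timestamp_s, lat, lon in points:
        if timestamp_s < earliest or timestamp_s > latest:
            continue  # inside the trimmed head or tail of the trip
        if any(haversine_m(lat, lon, zlat, zlon) <= radius_m
               for zlat, zlon, radius_m in private_zones):
            continue  # inside a user-defined private zone
        kept.append((timestamp_s, lat, lon))
    return kept
```

Filtering on the device before upload means the excluded points never reach a server at all, which is the point of making it the user's choice.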


Their architecture plan makes this clear: the idea is to publicly share only the anonymized data:

https://github.com/opentraffic/architecture


But someone is keeping data that can be de-anonymized? That sounds dangerous.


If it's based on OSM road geometry data (and/or linked to it), then I think the OSM licence (the ODbL) kicks in and the derived database would also have to be under the ODbL.


I wish them the best! These sorts of data are extremely powerful for local governments and advocacy for safer and better-designed streets.

There’s been a big push in Pittsburgh for improved traffic safety around the University of Pittsburgh and CMU due to several recent cyclist and pedestrian deaths. I’m not affiliated with the city or any local advocacy organizations, but as a concerned cyclist with some programming chops, I figured I could put some numbers to the problem myself with video tracking [0]. This impromptu weekend hack has garnered way more attention than I ever expected [1]. Being able to see long-term behaviors would be even more powerful.

0. https://github.com/mbauman/TrafficSpeed

1. http://www.post-gazette.com/news/transportation/2015/11/19/P...


It's just as likely that this data will be used to increase speed limits, under the theory that limits should be set at the 85th percentile of speed. Using data like this to make streets safe for people is anathema to the American driver.
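
For reference, the 85th percentile figure is simple to compute from observed speeds. A small Python sketch, assuming a plain list of speed observations for one road; the function name is made up, and the inclusive interpolation method is just one of several percentile conventions, not necessarily the one traffic engineers use.

```python
from statistics import quantiles


def percentile_speed(speeds, pct=85):
    """Speed at or below which roughly `pct` percent of observations fall."""
    if not 1 <= pct <= 99:
        raise ValueError("pct must be between 1 and 99")
    if len(speeds) < 2:
        raise ValueError("need at least two speed observations")
    # quantiles(..., n=100) returns the 99 cut points between percentiles,
    # so the pct-th percentile sits at index pct - 1.
    cut_points = quantiles(speeds, n=100, method="inclusive")
    return cut_points[pct - 1]
```

Because the statistic follows whatever drivers actually do, a road that feels fast produces a high 85th percentile no matter what the posted limit is.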


That's probably true in many locations. Here, though, the city and university are interested in infrastructure changes that will reduce the "natural speed" of the road… effectively tackling the 85th percentile from the other direction.


They provide a link [0] to a list of open-source tools that they have made for this, but after having a look at them, I realized there is a lot of work still to be done.

[0] https://github.com/opentraffic


Building critical mass will be hard without Google Waze's promotion budget, but remember where OpenStreetMap stood at its birth!


It can be done, but it will take time. My folks, who commute by car every day, loved Waze until it was bought. They ended up feeling that the quality went way down and that they were locked out of the fruits of their labor: my mom was banned for making a few mistakes! Maybe there is a growing group of people who would be ready for an open alternative.


If this means that apps like OsmAnd can do traffic-aware navigation, it will help the entire OSM ecosystem. I try to use OsmAnd but I often have to use Google Maps in order to route around traffic in my city.

They should really work with the major OSM app vendors to get live data from mobile devices.


Can I install an app on my phone to contribute to this data during my commutes?


I think it is a nice possibility, but for now it doesn't exist. The project is only looking at providers who have access to large GPS datasets.


What's the authoritative source for this data?


I think at the moment it is still a plan.

If it is successfully open source, there could be multiple organizations/groups that are maintaining traffic pools. For example, there isn't huge benefit to pooling traffic data between Australia and North America. Perhaps it would be convenient, but that's about it.


The benefit is that I can then take my phone with a navigation app with real time traffic updates that works in the US and use it in Aus without changing anything (mobile data costs aside).


Well the app needs to know where to find traffic data anyway, it could have a different source per region.

That said, I'm not sure whether it's easier to have a big database or a separated one.

With separation, many people could host parts of the traffic data. As a student I could even host something without getting huge costs.

Without separation, you don't have the complexity of splitting per location.

With separation, databases can be smaller and you can put a database with only what you need in that region. You don't want to have requests from Australia being answered (load balanced) to a European server anyway, so if you host Australia's part in Australia you'll have lower latency and a less complex database.

Without separation, you don't have to keep multiple separate databases. It's a simpler, more unified design.

I don't know which is better.
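
For what it's worth, the "different source per region" option is not much code on the app side. A minimal Python sketch, assuming each regional pool publishes a bounding box and a base URL; the region names, boxes, and `example.org` URLs below are invented placeholders, not real OpenTraffic endpoints.

```python
from typing import NamedTuple, Optional, Sequence


class RegionSource(NamedTuple):
    name: str
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float
    url: str  # hypothetical base URL of that region's traffic pool


# Placeholder registry; real boxes and hosts would come from the pools.
SOURCES = [
    RegionSource("australia", -44.0, 112.0, -10.0, 154.0,
                 "https://au.traffic.example.org"),
    RegionSource("north_america", 15.0, -170.0, 72.0, -50.0,
                 "https://na.traffic.example.org"),
]


def source_for(lat: float, lon: float,
               sources: Sequence[RegionSource] = SOURCES) -> Optional[RegionSource]:
    """Return the first regional source whose bounding box contains the point."""
    for source in sources:
        if (source.min_lat <= lat <= source.max_lat
                and source.min_lon <= lon <= source.max_lon):
            return source
    return None  # no pool covers this location
```

With a lookup like this, the same app works in both countries without the user changing anything, while each region's data can still be hosted separately.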


Oh, you mean Car Pools

I'll show myself out


they are working with the World Bank to source data from taxi firms and other companies which maintain these datasets


this is pretty awesome, if/when it ever picks up. the possibilities of this far surpass just the basic "what's the traffic jam today" of personal use cases. by making such data publicly available, it can significantly influence public policies and much more.

good thing things like this are finally getting off the ground. pretty much the only thing still tying me up to google maps at least...


it's not very open if I can't see the data



