Data Analysis of a Cyber-Physical Public Transit Monitoring System
Faculty Advisor Name
Dr. Samy El-Tawab
Description
Public transportation in midsize college towns (e.g., Harrisonburg, Virginia, home of James Madison University) has become increasingly vital as residential and commuter populations grow each year. This research proposes a cyber-physical system that monitors the quality of service of the transit bus system around James Madison University (JMU). Using Internet of Things (IoT) devices, in this case Raspberry Pi single-board computers, we create a network of smart nodes that collect data on bus routing efficiency and ridership. The stored data can then be used to improve bus routes and reduce traffic congestion in Harrisonburg and, in the future, in similar college towns.
This paper begins the data analysis for the first deployment of IoT nodes at seven JMU bus shelters during the spring of 2017. The stations were selected based on WiFi coverage and on which locations were the busiest around the university. Each data collection station uses a network sniffing program called TShark to compile and submit the MAC addresses, timestamps, and signal strengths of wireless devices connected to the official JMU wireless network. The data collected over multiple weeks of deployment is stored in a cloud storage database hosted by Amazon Web Services, which allows for data analysis and convenient access.
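As a minimal sketch of how a node might turn captured output into records, the Python below parses one line of tab-separated text into a timestamp, MAC address, and signal strength. It assumes TShark was run in its field output mode (`-T fields`) with three columns in that order; the column order, the `Sighting` type, and the `parse_line` helper are illustrative assumptions, not the project's actual code.

```python
from typing import NamedTuple, Optional


class Sighting(NamedTuple):
    """One observed frame: hypothetical record layout for this sketch."""
    epoch: float                 # capture time, seconds since the Unix epoch
    mac: str                     # source MAC address, lowercased
    signal_dbm: Optional[int]    # signal strength in dBm, if reported


def parse_line(line: str) -> Optional[Sighting]:
    """Parse 'epoch<TAB>mac<TAB>signal'; return None for malformed lines."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3 or not parts[0] or not parts[1]:
        return None
    epoch_text, mac, signal_text = parts
    try:
        epoch = float(epoch_text)
    except ValueError:
        return None
    # The signal column may be empty or hold several comma-separated
    # readings; this sketch keeps only the first one.
    first = signal_text.split(",")[0].strip()
    try:
        signal = int(first) if first else None
    except ValueError:
        signal = None
    return Sighting(epoch, mac.lower(), signal)
```

Malformed lines are dropped rather than raising, on the assumption that a long-running node should keep collecting even when an occasional frame is incomplete.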
Using multiple queries, we obtained the ridership frequency and the average waiting time at each of the seven bus stations. Frequencies at a given bus stop can be broken down into weekly, daily, or hourly intervals by filtering on timestamp values. A rider’s wait time can be calculated as the difference between the arrival and departure times of their unique MAC address at a specific node. Like ridership frequency, wait times can be assessed by time of day, by day of the week, or across different nodes.
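The two calculations described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the queries used in the study: the record layout `(device_id, node_id, timestamp)`, the sample data, and all function names are hypothetical, and a device's arrival and departure are approximated by its first and last sighting at a node.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical sample records: (device_id, node_id, ISO 8601 timestamp).
SIGHTINGS = [
    ("dev_a", "node_1", "2017-03-01T08:00:00"),
    ("dev_a", "node_1", "2017-03-01T08:04:00"),
    ("dev_a", "node_1", "2017-03-01T08:09:00"),
    ("dev_b", "node_1", "2017-03-01T08:05:00"),
    ("dev_b", "node_1", "2017-03-01T08:08:00"),
    ("dev_a", "node_2", "2017-03-01T17:30:00"),
    ("dev_a", "node_2", "2017-03-01T17:42:00"),
]


def wait_times(sightings):
    """Map (device_id, node_id) to the time between first and last sighting."""
    first_last = {}
    for device, node, stamp in sightings:
        t = datetime.fromisoformat(stamp)
        key = (device, node)
        if key not in first_last:
            first_last[key] = (t, t)
        else:
            first, last = first_last[key]
            first_last[key] = (min(first, t), max(last, t))
    return {key: last - first for key, (first, last) in first_last.items()}


def average_wait_by_node(sightings):
    """Map node_id to the mean wait time in seconds across its devices."""
    waits = defaultdict(list)
    for (_device, node), wait in wait_times(sightings).items():
        waits[node].append(wait.total_seconds())
    return {node: sum(values) / len(values) for node, values in waits.items()}


def riders_per_hour(sightings, node_id):
    """Map each hour to the number of distinct devices seen at one node."""
    seen = defaultdict(set)
    for device, node, stamp in sightings:
        if node == node_id:
            t = datetime.fromisoformat(stamp)
            seen[t.replace(minute=0, second=0, microsecond=0)].add(device)
    return {hour: len(devices) for hour, devices in seen.items()}
```

Grouping by a different truncation of the timestamp (day or week instead of hour) gives the other breakdowns. A device that visits the same node more than once would need its sightings split into separate visits first, which this sketch does not attempt.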
Several security and privacy concerns arise with the collection, transmission, and storage of data in the cloud (e.g., the privacy of riders’ MAC addresses and the tracking of a specific bus rider in the system). Additional security measures have been suggested to strengthen privacy protection. The primary tool implemented in this project is the Secure Hash Algorithm SHA-256. Using this one-way function, all MAC addresses recorded by our Raspberry Pi devices are digested into unique but unidentifiable values, which are then stored in place of the MAC addresses. This preserves individuals’ privacy while still allowing anonymous querying of unique passengers in our system.
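The hashing step can be illustrated with Python's standard `hashlib` module. This is a minimal sketch, not the project's implementation: the `anonymize_mac` name and the normalization of case and separators before hashing are assumptions made here so that the same device always yields the same digest.

```python
import hashlib


def anonymize_mac(mac: str) -> str:
    """Return the SHA-256 hex digest of a MAC address.

    Assumption for this sketch: the address is lowercased and '-'
    separators are replaced with ':' so that different spellings of
    the same address produce the same digest.
    """
    normalized = mac.strip().lower().replace("-", ":")
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()
```

Because the digest is deterministic, counting distinct digests still counts distinct devices, which is what keeps the ridership and wait-time queries possible after the raw addresses are discarded.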