Abstract: | Big data applications are developing rapidly. This thesis uses the government's Open Data real-time bus information to obtain the live positions of buses and evaluates road traffic conditions from those positions. We build a cloud city traffic state assessment system on a big data architecture, using the highly available and highly scalable cloud packages Apache Hadoop and Apache Spark, and propose an effective architecture for processing big data. The system provides real-time bus locations and real-time traffic conditions, especially conditions nearby, through open data, GPS, GPRS, and cloud technologies. To locate congestion points in Taichung City, three clustering methods, Fuzzy C-Means, K-Means, and DBSCAN, are applied to find where traffic jams and heavy traffic concentrate. For Taiwan Boulevard, the main road in Taichung City, the system applies a moving average to the bus positioning data and compares the real-time average speed of buses with the historical average speed to assess the traffic state in real time. Three experiments guide the implementation. The first compares Hadoop and Spark and indicates that Spark has the better computing performance, so Spark is chosen as the system's computing framework. The second compares HDFS read and write speeds under different numbers of replicas to find the most suitable setting for the system. The third compares the clustering performance of DBSCAN, K-Means, and Fuzzy C-Means, and on the basis of its results K-Means is adopted as the system's clustering method. The resulting services are presented through web front-end technology, and the system has been applied to Taiwan Boulevard, where it successfully assesses the traffic state. |
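The comparison between a real-time moving-average speed and a historical average speed, as described in the abstract, might be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis implementation: the class name `SegmentSpeedMonitor`, the window size, the speed values, and the two ratio thresholds are all hypothetical and do not come from the source.

```python
from collections import deque


class SegmentSpeedMonitor:
    """Hypothetical helper: tracks recent bus speeds on one road segment."""

    def __init__(self, historical_avg_kmh, window=5,
                 congested_ratio=0.5, busy_ratio=0.8):
        # historical_avg_kmh: long-term average speed for this segment.
        # The ratio thresholds are assumed values chosen for illustration.
        self.historical_avg_kmh = historical_avg_kmh
        self.congested_ratio = congested_ratio
        self.busy_ratio = busy_ratio
        # A bounded deque keeps only the latest `window` samples.
        self.samples = deque(maxlen=window)

    def moving_average(self):
        """Simple moving average over the samples currently in the window."""
        return sum(self.samples) / len(self.samples)

    def update(self, speed_kmh):
        """Add a new speed sample and return the current traffic state."""
        self.samples.append(speed_kmh)
        ratio = self.moving_average() / self.historical_avg_kmh
        if ratio < self.congested_ratio:
            return "congested"
        if ratio < self.busy_ratio:
            return "busy"
        return "smooth"


if __name__ == "__main__":
    monitor = SegmentSpeedMonitor(historical_avg_kmh=40.0, window=3)
    for speed in [38.0, 30.0, 10.0, 8.0, 12.0]:
        print(speed, monitor.update(speed))
```

A simple moving average smooths out single noisy position reports, so one slow bus (for example, one stopped at a station) does not immediately flag the segment as congested.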