Abstract: | Cloud computing and services allow users to access a wide range of services at any time, from anywhere, on demand, and from any device. Cloud computing is a model that provides convenient, on-demand access to computing resources offered over a network, including networks, servers, storage, applications, and services. The spread of cloud services has generated very large volumes of data, and the storage, processing, and analysis of big data have become a key research direction for future technology development. Storing and processing data at this scale is not feasible without distributed computing and a distributed file system. This thesis uses open-source software to compare two well-known distributed file systems, Hadoop and Ceph, in terms of file upload and download performance, the ability to transfer large and small files, and fault tolerance. In 60 transfer tests with different file sizes, Ceph clearly outperformed Hadoop in only 2; in the remaining tests, Hadoop performed better. These results indicate that Hadoop, which has already been adopted and deployed by industry, is currently the more stable and better-performing system. Ceph is not yet recommended for production environments, although it has considerable room for future growth. |