NVIDIA's CUDA is a general-purpose, scalable parallel programming model for writing highly parallel applications. It provides several key abstractions: a hierarchy of thread blocks, shared memory, and barrier synchronization. The model has proven quite successful for programming multithreaded many-core GPUs and scales transparently to hundreds of cores; scientists throughout industry and academia already use CUDA to achieve dramatic speedups on production and research codes. GPU-based clusters are likely to play an important role in future cloud computing centers, because some compute-intensive applications may require both CPUs and GPUs. In this thesis, we use PCI pass-through technology to give virtual machines in a virtualized environment direct access to an NVIDIA graphics card, so that they can use CUDA for high-performance computing. A virtual machine then has not only virtual CPUs but also a physical GPU for computation, and its performance is expected to improve substantially. We measure the difference in CUDA performance between virtual machines and physical machines, and examine whether the number of CPUs assigned to a virtual machine affects CUDA performance. Finally, we compare two open-source hypervisors to determine whether they differ in CUDA performance under PCI pass-through. These experiments show which environment delivers the best CUDA performance in a virtualized setting.