NVIDIA GPU Software
-
Exploring utilisation of GPU for database applications
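This study explores possible applications of GPU technology to accelerate database access. We used approximate text search based on n-gram statistics as a test bed for GPU-accelerated algorithms. Two solutions, a hybrid CPU/GPU algorithm and a pure GPU query-processing algorithm, were investigated and benchmarked against a baseline CPU algorithm as well as an algorithmically optimized CPU version. The hybrid algorithm performed poorly for most queries; only a moderate speedup was achievable for long queries with a high error level. The pure GPU algorithm, on the other hand, achieved speedups of up to 18x. The application of GPU acceleration to a wider range of database problems is discussed.
Posted by service orderble, 2010/7/2 1:42 AM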
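The entry above describes approximate text search based on n-gram statistics. As a rough, CPU-only illustration of that idea (not the paper's GPU algorithm), the sketch below scores strings by the overlap of their character trigrams. The function names, the padding scheme, and the choice of the Dice coefficient as the similarity measure are all assumptions made for this example.

```python
from collections import Counter


def ngrams(text, n=3):
    """Return a multiset of character n-grams, padded so word edges count."""
    padded = " " * (n - 1) + text.lower() + " " * (n - 1)
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def similarity(a, b, n=3):
    """Dice coefficient over shared n-grams (hypothetical choice of measure)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    shared = sum((ga & gb).values())          # multiset intersection
    total = sum(ga.values()) + sum(gb.values())
    return 2.0 * shared / total if total else 0.0


def search(query, corpus, threshold=0.5):
    """Return (score, entry) pairs at or above the threshold, best first."""
    scored = [(similarity(query, entry), entry) for entry in corpus]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)
```

Calling `search("database", ["graphics", "databse", "database"])` ranks the exact match first, keeps the misspelled `"databse"`, and drops `"graphics"`, which shares no trigrams with the query.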
-
gVirtuS: A GPGPU transparent virtualization component
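gVirtuS tries to fill the gap between in-house computing clusters equipped with GPGPU devices and pay-for-use high-performance virtual clusters deployed through public or private computing clouds. gVirtuS allows an instanced virtual machine to access GPGPUs in a transparent way, with an overhead only slightly greater than that of a real machine/GPGPU setup. gVirtuS is hypervisor independent and, even though it currently virtualizes GPUs based on NVIDIA CUDA, it is not limited to a specific brand of technology. The performance of the gVirtuS components is assessed through a test suite in different deployment scenarios, such as providing GPGPU power to cloud-computing-based HPC clusters and sharing remotely hosted GPGPUs among HPC nodes ...
Posted by service orderble, 2010/6/29 8:02 PM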
-
A GPU approach to FDTD for radio coverage prediction
A well-known method for computing radio wave propagation is the finite-difference time-domain (FDTD) model, which solves Maxwell's equations on a discrete grid. With the development of new programmable graphics hardware, novel solutions for computing electromagnetic fields are being implemented on GPUs. In this paper, a GPU implementation of FDTD is developed that achieves a speedup of more than 100x over an implementation running on an AMD Athlon 64 X2 Dual Core 4600 ...
Posted by service orderble, 2010/6/29 8:15 PM
-
B Flash Finder
This program harnesses the power of NVIDIA's CUDA technology to get fast search results from an entire hard drive in seconds ...
Posted by service orderble, 2010/6/29 8:20 PM
-
Accelerating Biomedical Signal Processing Algorithms with Parallel Programming on Graphic Processor Units
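This paper investigates the benefits gained by adopting Graphics Processing Unit (GPU) parallel programming in the field of biomedical signal processing. Differences in execution time when computing the Correlation Dimension (CD) of multivariate neurophysiological recordings and the Skin Conductance Level (SCL) are reported by comparing several common programming environments. Moreover, as indicated in this study, combining parallel programming with special design techniques that deal with memory management issues, such as data transfers between memory and the GPU, may further speed up processing. Execution time can therefore be minimized through appropriate parallel architecture design, by a factor that may reach ...
Posted by service orderble, 2010/6/29 8:23 PM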
-
NBSymple: a symplectic N-body code for astrophysical simulations using TESLA GPUs
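NBSymple is a new parallel code that exploits the joint performance of multicore CPUs and GPUs by means of OpenMP and CUDA, respectively. It numerically integrates the equations of motion of a set of N particles interacting via Newtonian gravity. Time integration is performed by a high-precision algorithm that guarantees time reversibility and excellent energy conservation. We tested the code in various cases, using single- and double-precision arithmetic as well as a software "double-single" precision, which appears to be a good compromise between accuracy and speed on the Tesla C1060 GPU ...
Posted by service orderble, 2010/6/30 12:37 AM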
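The entry above mentions time integration that is time reversible and conserves energy well. The listing does not say which scheme NBSymple uses, so the sketch below swaps in the standard second-order kick-drift-kick leapfrog, a simple symplectic integrator, with a direct-sum softened gravity. The function names, `G = 1` units, and the softening length are assumptions for illustration only.

```python
import math


def accelerations(pos, mass, G=1.0, soft=1e-3):
    """Direct-sum Newtonian accelerations, O(N^2), with Plummer softening."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(c * c for c in d) + soft * soft
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_r3
    return acc


def leapfrog_step(pos, vel, mass, dt):
    """One kick-drift-kick step; returns new positions and velocities."""
    acc = accelerations(pos, mass)
    vel = [[v[k] + 0.5 * dt * a[k] for k in range(3)] for v, a in zip(vel, acc)]
    pos = [[p[k] + dt * v[k] for k in range(3)] for p, v in zip(pos, vel)]
    acc = accelerations(pos, mass)
    vel = [[v[k] + 0.5 * dt * a[k] for k in range(3)] for v, a in zip(vel, acc)]
    return pos, vel


def total_energy(pos, vel, mass, G=1.0, soft=1e-3):
    """Kinetic plus softened potential energy of the whole system."""
    kinetic = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(mass, vel))
    potential = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            r2 = sum((pos[j][k] - pos[i][k]) ** 2 for k in range(3))
            potential -= G * mass[i] * mass[j] / math.sqrt(r2 + soft * soft)
    return kinetic + potential
```

Stepping forward, negating all velocities, and stepping the same number of times returns the system to its starting positions up to round-off, which is the time-reversibility property the entry refers to.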
-
Incompressible Flow Computations on the NCSA Lincoln Tesla Cluster
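We investigate three strategies for MPI-CUDA implementation and explore the efficiency and scalability of incompressible flow computations on the Lincoln Tesla cluster at the National Center for Supercomputing Applications (NCSA). We exploit some of the advanced features of MPI and CUDA programming to overlap both GPU data transfer and MPI communication with computation on the GPU. We sustain approximately 2.4 teraflops on the 64 nodes of the NCSA Lincoln Tesla cluster using 128 GPUs with a total of 30,720 processing elements. Our results demonstrate that multi-GPU clusters can substantially accelerate computational fluid dynamics (CFD) simulations ...
Posted by service orderble, 2010/6/30 12:39 AM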
-
Simulation Game of Life on GPU.
Simulation of the Game of Life on a GPU via CUDA. It uses the shared-memory technique. Speed Up: 8X. Code. Author(s ...
Posted by service orderble, 2010/6/30 12:40 AM
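For readers unfamiliar with the Game of Life, the update rule that each GPU thread would evaluate per cell can be sketched on the CPU as follows. This is only an illustration of the rule, not the listed CUDA code: the shared-memory tiling is not shown, and the wrap-around (toroidal) boundary and the function name `life_step` are assumptions.

```python
def life_step(grid):
    """Advance a 2-D grid of 0/1 cells by one Game of Life generation.

    Edges wrap around (toroidal boundary), an assumption of this sketch.
    """
    rows, cols = len(grid), len(grid[0])

    def live_neighbours(r, c):
        return sum(
            grid[(r + dr) % rows][(c + dc) % cols]
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
        )

    new_grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            n = live_neighbours(r, c)
            # Birth on exactly 3 neighbours; survival on 2 or 3.
            alive = n == 3 or (grid[r][c] == 1 and n == 2)
            row.append(1 if alive else 0)
        new_grid.append(row)
    return new_grid
```

Because every cell reads only its eight neighbours from the previous generation, all cells can be updated independently, which is what makes the problem a natural fit for one-thread-per-cell GPU execution.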
-
Massively parallel forward modeling of scalar and tensor gravimetry data
Geophysical modeling code for calculating the first and second derivatives of the gravitational potential of a three-dimensional mass distribution.
Posted by service orderble, 2010/6/30 12:42 AM
-
CNS: a GPU-based framework for simulating cortically-organized networks
A general GPU-based framework for the fast simulation of "cortically-organized" networks, defined as networks consisting of n-dimensional layers of similar cells.
Posted by service orderble, 2010/6/30 12:44 AM
-
Best-effort semantic document search on GPUs
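Semantic indexing is a popular technique used to access and organize large amounts of unstructured text data. We describe an optimized implementation of semantic indexing and document search on manycore GPU platforms. We observed that a parallel implementation of semantic indexing on a 128-core Tesla C870 GPU is only 2.4x faster than a sequential implementation on a 2.4 GHz Intel Xeon processor. We attribute this less than spectacular speedup to a mismatch between the workload characteristics of semantic indexing and the unique architectural features of GPUs. Compared with the regular numerical computations that have been ported to GPUs with great success, our semantic indexing algorithm (the recently proposed Supervised Semantic Indexing algorithm, SSI) has interesting characteristics: the amount of parallelism in each training instance is data dependent, and each iteration involves the product of a dense matrix with a sparse vector, resulting in random memory access patterns ...
Posted by service orderble, 2010/6/30 12:45 AM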
-
High performance conjugate gradient solver on multi-GPU clusters using hypergraph partitioning
Motivated by high computation power and low price per performance ratio of GPUs, GPU accelerated clusters are being ...
Posted by service orderble, 2010/6/21 3:57 PM
-
Real-Time Multi-Agent Path Planning on Arbitrary Surfaces
Path planning is an active topic in the literature, and efficient navigation over non-planar surfaces is an open research question ...
Posted by service orderble, 2010/6/21 3:55 PM
-
Realtime free surface fluid simulation and visualization
Implementation of a free surface fluid simulation and visualization using the Lattice Boltzmann method. OpenCL 1.0 is used for the fluid simulation ...
Posted by service orderble, 2010/6/21 3:38 PM
-
Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA
We port a high-order finite-element application that performs the numerical simulation of seismic ...
Posted by service orderble, 2010/6/21 3:37 PM
-
Design and implementation of the software architecture for a 3-D reconstruction system in medical imaging
The design and implementation of the reconstruction system in medical X-ray imaging is ...
Posted by service orderble, 2010/6/21 3:35 PM
-
Performance analysis of accelerated image registration using GPGPU
This paper presents a performance analysis of an accelerated 2-D rigid image registration implementation that employs the Compute Unified Device Architecture ...
Posted by service orderble, 2010/6/21 3:33 PM
-
A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction
Programmers for GPGPU face rapidly changing substrate of programming abstractions, execution models, and hardware implementations ...
Posted by service orderble, 2010/6/21 3:31 PM
-
Thermal analysis of multiprocessor SoC applications by simulation and verification
Overheating of computer chips leads to degradation of performance and reliability. Therefore, preventing chips from overheating in spite of increased ...
Posted by service orderble, 2010/6/21 3:29 PM
-
Exploring NVIDIA-CUDA for video coding
Speed Up: 100X. Paper. Author(s): Aleksandar Colic, Hari Kalva, Borko Furht. Organization Type: Academia. Organization: Florida Atlantic University. Platform: N/A. Software ...
Posted by service orderble, 2010/6/21 3:28 PM
-
Iterative induced dipoles computation for molecular mechanics on GPUs
In this work, we present a first step towards the efficient implementation of polarizable molecular mechanics force fields with GPU acceleration ...
Posted by service orderble, 2010/6/21 3:26 PM
-
Computational visual attention systems and their cognitive foundations: A survey
Based on concepts of the human visual system, computational visual attention systems aim to detect regions of interest in images ...
Posted by service orderble, 2010/6/21 3:25 PM
-
High-performance cone beam reconstruction using CUDA compatible GPUs
Compute unified device architecture (CUDA) is a software development platform that allows us to run C-like programs on the nVIDIA ...
Posted by service orderble, 2010/6/21 3:23 PM
-
Teaching design & analysis of multi-core parallel algorithms using CUDA
One of the dominant trends in microprocessor architecture in recent years is continually increasing chip-level parallelism. However, many undergraduate ...
Posted by service orderble, 2010/6/21 3:21 PM
-
An asymmetric distributed shared memory model for heterogeneous parallel systems
Heterogeneous computing combines general purpose CPUs with accelerators to efficiently execute both sequential control-intensive and data-parallel phases of ...
Posted by service orderble, 2010/6/21 3:20 PM
-
Accelerating the local outlier factor algorithm on a GPU for intrusion detection systems
The Local Outlier Factor (LOF) is a very powerful anomaly detection method available in machine learning and ...
Posted by service orderble, 2010/6/21 2:54 PM
-
FreePipe: a programmable parallel rendering architecture for efficient multi-fragment effects
In the past decade, modern GPUs have provided increasing programmability with vertex, geometry and fragment shaders. However, many classical ...
Posted by service orderble, 2010/6/21 2:52 PM
-
The Scalable Heterogeneous Computing (SHOC) benchmark suite
Scalable heterogeneous computing systems, which are composed of a mix of compute devices, such as commodity multicore processors, graphics processors, reconfigurable processors, and ...
Posted by service orderble, 2010/6/21 2:50 PM
-
Accelerating MATLAB Image Processing Toolbox functions on GPUs
In this paper, we present our effort in developing an open-source GPU (graphics processing units) code library for the MATLAB Image ...
Posted by service orderble, 2010/6/21 2:49 PM
-
A symbolic verifier for CUDA programs
We present a preliminary automated verifier based on mechanical decision procedures which is able to prove functional correctness of CUDA programs and guarantee to ...
Posted by service orderble, 2010/6/21 2:44 PM
-
A breadth-first course in multicore and manycore programming
The technique of scaling hardware performance through increasing the number of cores on a chip requires programmers to learn to write ...
Posted by service orderble, 2010/6/21 2:46 PM
-
A symbolic verifier for CUDA programs
We present a preliminary automated verifier based on mechanical decision procedures which is able to prove functional correctness of CUDA programs and guarantee to ...
Posted by service orderble, 2010/6/21 2:39 PM
-
CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences
Biological sequence comparison is a very important operation in Bioinformatics. Even though there do exist exact methods to compare ...
Posted by service orderble, 2010/6/21 2:47 PM
-
A simulation of large-scale groundwater flow on CUDA-enabled GPUs
This paper presents a simulation method for large-scale groundwater flow on CUDA-enabled GPUs. The discretization method for ...
Posted by service orderble, 2010/6/21 2:38 PM
-
Cortical architectures on a GPGPU
As the number of devices available per chip continues to increase, the computational potential of future computer architectures grows likewise. While this is a clear ...
Posted by service orderble, 2010/6/21 2:36 PM
-
Modeling GPU-CPU workloads and systems
Heterogeneous systems, systems with multiple processors tailored for specialized tasks, are challenging programming environments. While it may be possible for domain experts to optimize ...
Posted by service orderble, 2010/6/21 2:35 PM
-
Small-Ruleset Regular Expression Matching on GPGPUs: Quantitative Performance Analysis and Optimization
We explore the intersection between an emerging class of architectures and a prominent workload: GPGPUs (General-Purpose Graphics ...
Posted by service orderble, 2010/6/21 2:34 PM
-
Fault Table Computation on GPUs
In this paper, we explore the implementation of fault table generation on a Graphics Processing Unit (GPU). A fault table is essential for fault diagnosis ...
Posted by service orderble, 2010/6/21 2:33 PM
-
Parallel GPU-based data-dependent triangulations
In this paper we introduce a new technique for data-dependent triangulation which is suitable for implementation on a GPU. Our solution is based ...
Posted by service orderble, 2010/6/21 2:31 PM
-
LBM based flow simulation using GPU computing processor
Graphics Processing Units (GPUs), originally developed for computer games, now provide computational power for scientific applications. In this paper, we develop a ...
Posted by service orderble, 2010/6/21 2:30 PM
-
42 TFlops hierarchical N-body simulations on GPUs with applications in both astrophysics and turbulence
As an entry for the 2009 Gordon Bell price/performance prize, we present the results ...
Posted by service orderble, 2010/6/21 2:29 PM
-
CUDA renderer: a programmable graphics pipeline
Modern GPUs provide gradually increasing programmability on vertex shader, geometry shader and fragment shader in the past decade. However, many classical problems such as ...
Posted by service orderble, 2010/6/21 2:26 PM
-
Taming irregular EDA applications on GPUs
Recently general purpose computing on graphic processing units (GPUs) is rising as an exciting new trend in high-performance computing. Thus it is appealing ...
Posted by service orderble, 2010/6/21 2:25 PM
-
Using common graphics hardware for multi-agent traffic simulation with CUDA
Today's graphics processing units (GPU) have tremendous resources when it comes to raw computing power. The simulation of ...
Posted by service orderble, 2010/6/21 2:23 PM
-
Boids that see: Using self-occlusion for simulating large groups on GPUs
Behavioral models have been used in the entertainment industry to increase the realism in the simulation of large ...
Posted by service orderble, 2010/6/21 2:22 PM
-
A configurable simulation environment for the efficient simulation of large-scale spiking neural networks on graphics processors
Neural network simulators that take into account the spiking behavior of neurons are ...
Posted by service orderble, 2010/6/21 2:20 PM
-
Triangular matrix inversion on Graphics Processing Unit
Dense matrix inversion is a basic procedure in many linear algebra algorithms. A computationally arduous step in most dense matrix inversion methods is ...
Posted by service orderble, 2010/6/21 2:18 PM
-
Massively parallel Linux laptops, workstations and clusters with CUDA
Unleash the GPU within! Paper. Author(s): Robert Farber. Organization Type: N/A. Organization: N/A. Platform: N/A. Software License ...
Posted by service orderble, 2010/6/21 2:15 PM
-
Nifty_reg
Global and local medical image registration using CUDA. The global alignment is based on a block-matching technique and the local warping on a cubic B-spline deformation ...
Posted by service orderble, 2010/6/21 2:13 PM
-
Acceleration of the Smith-Waterman Algorithm using Single and Multiple Graphics Processors
Finding regions of similarity between two very long data streams is a computationally intensive problem referred to as ...
Posted by service orderble, 2010/6/21 2:12 PM
-
Performance and Scalability of GPU-Based Convolutional Neural Networks
In this paper we present the implementation of a framework for accelerating training and classification of arbitrary Convolutional Neural Networks (CNNs ...
Posted by service orderble, 2010/6/21 2:10 PM
-
Cusp: A sparse matrix library for CUDA
Cusp is a library for sparse linear algebra and graph computations on CUDA. Cusp provides a flexible, high-level interface for manipulating sparse ...
Posted by service orderble, 2010/6/21 2:05 PM
-
Simulation and Visualization of the Saint-Venant System using GPUs
This paper describes the efficient implementation of three second order accurate explicit schemes that solve the shallow water equations. The ...
Posted by service orderble, 2010/6/21 1:59 PM
-
State-of-the-Art in Heterogeneous Computing
This extensive survey (33 pages, over 180 references) gives an overview of hardware and software tools for the Cell Broadband Engine, Graphics Processing ...
Posted by service orderble, 2010/6/21 1:57 PM
-
An MPI-CUDA Implementation for Massively Parallel
Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose parallel computing platforms that can accelerate simulation science applications ...
Posted by service orderble, 2010/6/21 1:56 PM
-
Multi-target C++ implementation of parallel skeletons
This paper presents the design of an efficient multi-target (CPU+GPU) implementation for the Parallel_for skeleton. Emerging massively parallel architectures promise ...
Posted by service orderble, 2010/6/21 2:16 PM
-
Using GPU on HPC Applications to Satisfy Low-Power Computational Requirement
High-performance, low-power computing is required to reduce the computer infrastructure needed for large multi-physics calculations ...
Posted by service orderble, 2010/6/21 1:54 PM
-
AntiPlanet Reflections
AntiPlanet Reflections is a first-person "Doom"-style 3D shooter game set in a fantastic extraterrestrial world built of spheres, shadows and infinite reflections. AntiPlanet scenes are fully dynamic ...
Posted by service orderble, 2010/6/21 2:11 PM
-
Toward efficient GPU-accelerated N-body simulations
N-body algorithms are applicable to a number of common problems in computational physics including gravitation, electrostatics, and fluid dynamics. Fast algorithms (those ...
Posted by service orderble, 2010/6/21 1:52 PM
-
Porting of an Edge-Based CFD Solver to GPUs
Graphics processing units (GPUs) are increasingly becoming a mainstream platform for high performance computational fluid dynamics. This paper describes the porting ...
Posted by service orderble, 2010/6/21 1:50 PM
-
Accelerating H.264 inter prediction in a GPU by using CUDA
H.264/AVC defines a very efficient algorithm for the inter prediction but it takes too much time. With ...
Posted by service orderble, 2010/6/21 1:49 PM
-
A GPU-enabled solver for time-constrained linear sum assignment problems
This paper deals with solving large instances of the Linear Sum Assignment Problems (LSAPs) under realtime constraints, using Graphical ...
Posted by service orderble, 2010/6/21 1:41 PM
-
Real Time Simulation of Tissue Cutting Based on GPU and CUDA for Surgical Training
A novel approach to the simulation of soft tissue cutting in a virtual reality endoscopic simulator ...
Posted by service orderble, 2010/6/21 1:39 PM
-
Preliminary implementation of VQ image coding using GPGPU
GPGPU (general purpose computing on graphic processing unit) attracts a great deal of attention, that is used for general-purpose computations like ...
Posted by service orderble, 2010/6/21 1:35 PM
-
hiCUDA: High-Level GPGPU Programming
Graphics Processing Units (GPUs) have become a competitive accelerator for applications outside the graphics domain, driven by improvements in GPU programmability. Although the Compute Unified ...
Posted by service orderble, 2010/6/21 1:34 PM
-
CUDA Based GPU Programming to Simulate 3D Tissue Deformation
The medical training systems based on virtual simulation are highly desired since minimally invasive surgical techniques have become popular to patients ...
Posted by service orderble, 2010/6/21 1:28 PM
-
Offloading Region Matching of Data Distribution Management with CUDA
Data distribution management (DDM) aims to reduce the transmission of irrelevant data between High Level Architecture (HLA) compliant simulators by taking ...
Posted by service orderble, 2010/6/21 1:29 PM
-
Parallel Iterative Linear Solvers on GPU: A Financial Engineering Case
In many numerical applications resulting from computational science and engineering problems, the solution of sparse linear systems is the most ...
Posted by service orderble, 2010/6/21 1:20 PM
-
IP routing processing with graphic processors
Throughput and programmability have always been the central, but generally conflicting concerns for modern IP router designs. Current high performance routers depend on proprietary ...
Posted by service orderble, 2010/6/21 1:23 PM
-
Frame-based parallelization of MPEG-4 on compute unified device architecture (CUDA)
Due to its object based nature, flexible features and provision for user interaction, MPEG-4 encoder is highly ...
Posted by service orderble, 2010/6/21 1:24 PM
-
A Work-Efficient GPU Algorithm for Level Set Segmentation
Level set segmentation is a powerful computational method for identifying complex objects in n-dimensional images. We present a novel level ...
Posted by service orderble, 2010/6/21 1:05 PM
-
Bayesian Real-Time Perception Algorithms on GPU
Real-time implementation of a Bayesian framework for robotic multisensory perception using the Compute Unified Device Architecture (CUDA). Speed Up: 30,000X ...
Posted by service orderble, 2010/6/21 1:03 PM
-
Application-guided tool development for architecturally diverse computation
Architecturally diverse computation exploits non-traditional computing platforms (e.g., field-programmable gate arrays, graphics processors, heterogeneous chip multiprocessors) to execute user ...
Posted by service orderble, 2010/6/21 12:54 PM
-
Non-blocking programming on multi-core graphics processors: extended abstract
This paper investigates the synchronization power of coalesced memory accesses, a family of memory access mechanisms introduced in recent large ...
Posted by service orderble, 2010/6/21 12:52 PM
-
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs
In this paper we describe techniques for compiling fine-grained SPMD-threaded programs, expressed in programming models such as ...
Posted by service orderble, 2010/6/21 12:51 PM
-
Parallel computing with CUDA
NVIDIA's CUDA architecture provides a powerful platform for writing highly parallel programs. By providing simple abstractions for hierarchical thread organization, memories, and synchronization, the CUDA ...
Posted by service orderble, 2010/6/21 12:49 PM
-
Optimization of linked list prefix computations on multithreaded GPUs using CUDA
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded ...
Posted by service orderble, 2010/6/21 12:43 PM
-
CUDA-based AES parallelization with fine-tuned GPU memory utilization
Current Graphics Processing Unit (GPU) presents large potentials in speeding up computationally intensive data parallel applications over traditional parallelization approaches ...
Posted by service orderble, 2010/6/21 12:41 PM
-
Implementing the Himeno benchmark with CUDA on GPU clusters
This paper describes the use of CUDA to accelerate the Himeno benchmark on clusters with GPUs. The implementation is designed to ...
Posted by service orderble, 2010/6/21 12:36 PM
-
A tile-based parallel Viterbi algorithm for biological sequence alignment on GPU with CUDA
The Viterbi algorithm is the compute-intensive kernel in Hidden Markov Model (HMM) based sequence alignment ...
Posted by service orderble, 2010/6/21 12:34 PM
-
Parallel external sorting for CUDA-enabled GPUs with load balancing and low transfer overhead
Sorting is a well-investigated topic in Computer Science in general and by now many efficient ...
Posted by service orderble, 2010/6/21 12:31 PM
-
Fast binding site mapping using GPUs and CUDA
Binding site mapping refers to the computational prediction of the regions on a protein surface that are likely to bind a small ...
Posted by service orderble, 2010/6/21 12:29 PM
-
Efficient parallel algorithms for maximum-density segment problem
One of the fundamental problems involving DNA sequences is to find high density segments of certain widths, for example, those regions with ...
Posted by service orderble, 2010/6/21 12:27 PM
-
Fast implementation of Wyner-Ziv Video codec using GPGPU
In this paper, we report a fast implementation of Wyner-Ziv video decoder using general-purpose computing on graphics processing units ...
Posted by service orderble, 2010/6/21 12:25 PM
-
The GPU Computing Era
GPU computing is at a tipping point, becoming more widely used in demanding consumer applications and high-performance computing. This article describes the rapid evolution of ...
Posted by service orderble, 2010/6/21 12:24 PM
-
Object-oriented stream programming using aspects
High-performance parallel programs that efficiently utilize heterogeneous CPU+GPU accelerator systems require tuned coordination among multiple program units. However, using current programming frameworks ...
Posted by service orderble, 2010/6/21 12:23 PM
-
Efficient Histogram Algorithms
This paper presents two efficient histogram algorithms designed for NVIDIA's CUDA compatible GPUs, which can be used for parallel computation of histograms on large data-sets ...
Posted by service orderble, 2010/6/21 12:22 PM
-
Speeding Up Mutual Information Computation Hardware
This paper presents an efficient method for mutual information computation between images (2D or 3D) for NVIDIA's CUDA compatible devices, overcoming limitations by ...
Posted by service orderble, 2010/6/21 12:20 PM
-
LINZIK: The compact optical CAD
A lens ray tracing program for calculating, in particular, astronomical optics. It includes an optimizer, which can choose parameters of surfaces to minimize the goal (merit ...
Posted by service orderble, 2010/6/21 12:19 PM
-
Canny Edge Detection
The Canny edge detector is a very popular edge feature detector used as a pre-processing step in many computer vision algorithms. By using the more programmer ...
Posted by service orderble, 2010/6/21 12:17 PM
-
SciFinance Speeds Financial Results with Parallel Computing
By harnessing the power of NVIDIA CUDA with GPU or multi-CPU workstations, SciFinance parallel codes for Monte Carlo pricing models run blazingly ...
Posted by service orderble, 2010/6/21 12:14 PM
-
Accelerate Large Graph Algorithms
This paper presents a few fundamental algorithms - including breadth first search, single source shortest path, and all-pairs shortest path - using CUDA on large graphs. We ...
Posted by service orderble, 2010/6/21 12:12 PM
Showing posts 1 - 92 of 92.