MSHR: Miss-Status Holding Registers
MSHRs track outstanding cache misses. Associative comparison logic merges simultaneous requests for the same cache block, and the number of outstanding misses that can be supported is typically small.
[Source: Analyzing CUDA Workloads Using a Detailed GPU Simulator, ISPASS 2009]
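The merging behavior described above can be sketched with a toy model. This is only an illustration, not how any real GPU or the simulator in the cited paper implements it: the class name `MSHRFile`, the 128-byte block size, and the return strings are all assumptions made for the example.

```python
class MSHRFile:
    """Toy model of a miss-status holding register file (illustrative only)."""

    def __init__(self, num_entries, block_size=128):
        self.num_entries = num_entries  # kept small, as noted in the text
        self.block_size = block_size    # cache block size in bytes (assumed value)
        self.entries = {}               # block number -> list of waiting requesters

    def on_miss(self, address, requester):
        """Record a cache miss; return 'merged', 'allocated', or 'stall'."""
        block = address // self.block_size
        if block in self.entries:
            # The block is already in flight: merge instead of sending a new request.
            self.entries[block].append(requester)
            return "merged"
        if len(self.entries) >= self.num_entries:
            # Every register is busy, so this miss cannot be tracked yet.
            return "stall"
        self.entries[block] = [requester]
        return "allocated"

    def on_fill(self, address):
        """The block arrived from memory: free the entry and return its waiters."""
        return self.entries.pop(address // self.block_size, [])
```

With `MSHRFile(num_entries=2)`, a miss to address 0 allocates an entry, a miss to address 64 merges into it because both fall in the same 128-byte block, and a third distinct block stalls once both entries are in use.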
CTA : Cooperative Thread Array
Types of GPU memory spaces in the CUDA programming model
-> global, local, constant, texture, shared
Counts (Architecture: GF100, Compute capability: 2.x)
Threads per warp: 32
Threads per block: 1024 (maximum)
Warps per block: 32 (maximum, 1024 / 32)
CUDA cores per SM: 32 (GF100, compute capability 2.0)
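The relationship between these counts is simple arithmetic. A minimal sketch, assuming the compute capability 2.x limits listed above; the function name `warps_per_block` is made up for this example.

```python
WARP_SIZE = 32                # threads per warp
MAX_THREADS_PER_BLOCK = 1024  # compute capability 2.x limit

def warps_per_block(threads_per_block):
    """Number of warps needed for one block (a partial warp still counts as one)."""
    if not 1 <= threads_per_block <= MAX_THREADS_PER_BLOCK:
        raise ValueError("threads_per_block must be between 1 and 1024")
    # Round up: 100 threads need 4 warps, and the last warp is only partly used.
    return (threads_per_block + WARP_SIZE - 1) // WARP_SIZE
```

A full block of 1024 threads gives 1024 / 32 = 32 warps, which matches the warps-per-block figure above.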
Basic structure