TY - GEN
T1 - Parallel-pipeline-based traversal unit for hardware-accelerated ray tracing
AU - Kim, Jin Woo
AU - Lee, Won Jong
AU - Lee, Min Woo
AU - Han, Tack Don
PY - 2012
Y1 - 2012
N2 - In this work, we propose a novel parallel-pipeline traversal unit for hardware-based ray tracing that reduces latency and increases cache locality. Owing to the high memory-bandwidth and computation requirements of ray-tracing operations such as traversal and intersection tests, recent studies have focused on developing hardware-based traversal and intersection-test units [Nah et al. 2011; Lee et al. 2012]. Existing hardware engines are based on a single deep pipeline that increases ray-processing throughput per unit time. However, traversal involves non-deterministic changes in a ray's state, so in some cases a ray may be transferred unnecessarily between pipeline stages, increasing overall latency. To solve this problem, we propose a parallel traversal unit with a dedicated pipeline per state. Our results show that the proposed system is up to 30% more efficient than a single-pipeline system because it decreases the average latency per ray and increases cache efficiency.
AB - In this work, we propose a novel parallel-pipeline traversal unit for hardware-based ray tracing that reduces latency and increases cache locality. Owing to the high memory-bandwidth and computation requirements of ray-tracing operations such as traversal and intersection tests, recent studies have focused on developing hardware-based traversal and intersection-test units [Nah et al. 2011; Lee et al. 2012]. Existing hardware engines are based on a single deep pipeline that increases ray-processing throughput per unit time. However, traversal involves non-deterministic changes in a ray's state, so in some cases a ray may be transferred unnecessarily between pipeline stages, increasing overall latency. To solve this problem, we propose a parallel traversal unit with a dedicated pipeline per state. Our results show that the proposed system is up to 30% more efficient than a single-pipeline system because it decreases the average latency per ray and increases cache efficiency.
UR - http://www.scopus.com/inward/record.url?scp=84871544123&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84871544123&partnerID=8YFLogxK
U2 - 10.1145/2407156.2407203
DO - 10.1145/2407156.2407203
M3 - Conference contribution
AN - SCOPUS:84871544123
SN - 9781450319119
T3 - SIGGRAPH Asia 2012 Posters, SA 2012
BT - SIGGRAPH Asia 2012 Posters, SA 2012
T2 - SIGGRAPH Asia 2012 Posters, SA 2012
Y2 - 28 November 2012 through 1 December 2012
ER -