Seminar notes | March 10, 2021 | Specialized Architectures for Machine Learning and Data Analytics
A seminar by Professor Jae W. Lee of Seoul National University.
Algorithms/models, software/frameworks, and hardware all matter, and none of them is independent of the others.
A3/ELSA - Attention Accelerator
'Attention' is the most important advancement in DNNs for NLP, vision, etc.
??? What is attention?
content-based similarity search
finds data relevant to the query, then returns the weighted sum of such data
(query) dot (key matrix) -> attention scores -> softmax computation (sharpening the values) -> weighted-sum computation
??
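A minimal NumPy sketch of that pipeline (my own illustration, not code from the talk; the shapes and names are assumptions, and the usual 1/sqrt(d) scaling is omitted for brevity):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(query, keys, values):
    """Single-query dot-product attention.
    query: (d,)   keys: (n, d)   values: (n, d_v)
    """
    scores = keys @ query        # (n,)   query . key matrix -> attention scores
    weights = softmax(scores)    # (n,)   sharpened into a probability distribution
    return weights @ values      # (d_v,) weighted sum of the values

# toy example
rng = np.random.default_rng(0)
d, n, d_v = 64, 128, 64
q = rng.standard_normal(d)
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d_v))
print(attention(q, K, V).shape)  # (64,)
```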
cost of attention: a substantial portion of the total runtime (>35% and often >60%) in many NN models
specialized accelerator for attention mechanism
opportunities for Approximation
--> after softmax, most of the values are near zero: effectively a sparse operation
-->> can avoid a large amount of computation
identify a few of the largest and a few of the smallest component-multiplication results
"Hopefully you get the basic idea" i don't
do column-wise sorting in increasing order, as a preprocessing step
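A rough sketch of how that candidate selection could work (my reconstruction of the idea in these notes, not the actual A3/ELSA algorithm; the per-column count m and taking both ends of every sorted column are assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def preprocess(keys):
    # column-wise argsort of the key matrix in increasing order (done once, offline)
    return np.argsort(keys, axis=0)               # (n, d) row indices per column

def approx_attention(query, keys, values, order, m=2):
    """Score only a candidate subset of keys instead of all n of them.

    For each dimension j, the largest component products q[j] * K[i, j] come
    from one end of the sorted column and the smallest (most negative) ones
    from the other end, so m rows from both ends are collected as candidates.
    """
    n, d = keys.shape
    candidates = set()
    for j in range(d):
        col = order[:, j]                          # rows sorted by K[:, j], ascending
        candidates.update(col[:m].tolist())        # a few smallest component products
        candidates.update(col[-m:].tolist())       # a few largest component products
    idx = np.array(sorted(candidates))

    scores = keys[idx] @ query                     # exact scores for candidates only
    weights = softmax(scores)
    return weights @ values[idx], idx.size         # the other n - |idx| rows are skipped

# toy comparison against exact attention
rng = np.random.default_rng(1)
n, d = 1024, 64
q = rng.standard_normal(d)
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))
order = preprocess(K)
approx, n_scored = approx_attention(q, K, V, order, m=2)
exact = softmax(K @ q) @ V
print(n_scored, "of", n, "keys scored; relative error:",
      np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

As I understand it, the point is that the sorting happens once as preprocessing, so each query only has to touch a small slice of the key/value matrices instead of all n rows.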