Weekly Seminar (Member) #2
Date: 2025-09-09 OT / 전경호
Date: 2025-09-16 Program
Question wrap-up: TIP) Try to frame the research results more in terms of computing efficiency (latency, throughput, ...); a measurement sketch follows below.
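As one way to act on this tip, here is a minimal sketch of measuring per-request latency and generated-token throughput for a Hugging Face-style causal LM; `model`, `input_ids`, and the iteration counts are placeholder assumptions, not part of the seminar material.

```python
import time
import torch

def measure_latency_throughput(model, input_ids, max_new_tokens=128, warmup=3, iters=10):
    """Rough wall-clock latency (s/request) and throughput (generated tokens/s)."""
    # Warm-up so one-time costs (kernel compilation, caches) do not skew timings.
    for _ in range(warmup):
        model.generate(input_ids, max_new_tokens=max_new_tokens)

    if torch.cuda.is_available():
        torch.cuda.synchronize()  # flush pending GPU work before timing
    start = time.perf_counter()
    for _ in range(iters):
        out = model.generate(input_ids, max_new_tokens=max_new_tokens)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    latency = elapsed / iters                          # seconds per request
    gen_tokens = out.shape[-1] - input_ids.shape[-1]   # new tokens per request
    throughput = gen_tokens * iters / elapsed          # generated tokens per second
    return latency, throughput
```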
Program: Data Center Optimization for AI / 박재욱
Date: 2025-09-30 Program
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention / 김형균
An introduction to the MInference paper, which proposes a Dynamic Sparse Attention algorithm.
Question wrap-up (a toy sketch of the block-sparse idea follows the list):
Q) In the evaluation, exactly which stages does the end-to-end latency measurement cover? Prefill only, or prefill plus decode?
Q) How were the per-head latency graphs in the evaluation measured?
Q) In the online stage, does the sparsity-index approximation use full attention?
Q) Is Dynamic Sparse Attention also applied to the decode stage?
Q) Would the batch-size setting also affect the algorithm's performance?
Q) Is MMInference aimed at VLMs? How does it differ from MInference?
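To make the block-sparse idea concrete, below is a minimal, illustrative sketch: cheaply estimate online which key/value blocks matter for each query block, then run exact attention only inside the selected blocks. The mean-pooling index estimation, block size, and keep ratio are assumptions for illustration, not MInference's actual pattern search or kernels, and causal masking is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block=64, keep_ratio=0.1):
    """Toy dynamic block-sparse attention for one head.

    q, k, v: [seq, dim]; assumes seq is a multiple of `block`.
    The index-estimation step is a simplified stand-in, not MInference's.
    """
    seq, dim = q.shape
    n_blocks = seq // block

    # 1) Online estimation of important K blocks: mean-pool Q and K per block
    #    and score block pairs with a low-resolution attention map.
    q_pool = q.view(n_blocks, block, dim).mean(1)   # [nb, dim]
    k_pool = k.view(n_blocks, block, dim).mean(1)   # [nb, dim]
    scores = q_pool @ k_pool.T / dim**0.5           # [nb, nb]

    # 2) Dynamic sparse index: keep only the top-k key blocks per query block.
    topk = max(1, int(keep_ratio * n_blocks))
    keep = scores.topk(topk, dim=-1).indices        # [nb, topk]

    # 3) Exact attention restricted to the selected blocks (no causal mask here).
    out = torch.zeros_like(q)
    for qi in range(n_blocks):
        q_blk = q[qi * block:(qi + 1) * block]
        k_sel = torch.cat([k[j * block:(j + 1) * block] for j in keep[qi].tolist()])
        v_sel = torch.cat([v[j * block:(j + 1) * block] for j in keep[qi].tolist()])
        attn = F.softmax(q_blk @ k_sel.T / dim**0.5, dim=-1)
        out[qi * block:(qi + 1) * block] = attn @ v_sel
    return out
```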
Date: 2025-09-30 Paper presentation slides
Date: 2025-10-14 Papers
Presentation slides on research that tackles the outlier and accuracy problems that arise when applying quantization to attention.
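For background on the outlier issue, the sketch below shows one widely used mitigation in activation quantization: migrating per-channel activation outliers into the weights with a smoothing scale before uniform fake quantization, in the spirit of SmoothQuant. The function name, the alpha parameter, and per-tensor scaling are illustrative assumptions, not the presented paper's method.

```python
import torch

def smooth_and_quantize_matmul(x, w, alpha=0.5, n_bits=8):
    """Outlier smoothing + symmetric fake quantization for y = x @ w.

    x: activations [tokens, channels]; w: weights [channels, out].
    Illustrative sketch only; not a specific paper's implementation.
    """
    # Per-channel activation outliers dominate the quantization range, so
    # migrate difficulty from activations to weights with a scale s
    # (the full-precision product x @ w is unchanged by this rescaling).
    s = (x.abs().amax(0).clamp(min=1e-5) ** alpha
         / w.abs().amax(1).clamp(min=1e-5) ** (1 - alpha))
    x_s, w_s = x / s, w * s[:, None]

    def fake_quant(t):
        # Symmetric per-tensor fake quantization to n_bits.
        qmax = 2 ** (n_bits - 1) - 1
        scale = t.abs().max().clamp(min=1e-8) / qmax
        return (t / scale).round().clamp(-qmax, qmax) * scale

    return fake_quant(x_s) @ fake_quant(w_s)
```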
EfficientLLM seminar notes