diff --git a/src/routes/(cams)/cams24/+page.svelte b/src/routes/(cams)/cams24/+page.svelte
index 2f0cfdc..93e6f55 100644
--- a/src/routes/(cams)/cams24/+page.svelte
+++ b/src/routes/(cams)/cams24/+page.svelte
@@ -298,1607 +298,7 @@ increase the efficiency of the hardware platform where the simulator is run, and obtain results sooner.
-All times are in Central Standard Time (UTC-6).
-
-Time          | Event
-8:00 - 8:10   | Opening Remarks
-8:10 - 9:00   | Keynote
-9:00 - 10:00  | Paper Talks
-9:00 - 9:15   | [Paper] Demystifying Platform Requirements for Diverse LLM Inference Use Cases
-    Abhimanyu Bambhaniya, Ritik Raj, Geonhwa Jeong (Georgia Institute of Technology),
-    Souvik Kundu (Intel Labs), Sudarshan Srinivasan, Midhilesh Elavazhagan,
-    Madhu Kumar (Intel) and Tushar Krishna (Georgia Institute of Technology)
-
-    Large language models (LLMs) have shown remarkable performance across a wide
-    range of applications, often outperforming human experts. However, deploying
-    these parameter-heavy models efficiently for diverse inference use cases
-    requires carefully designed hardware platforms with ample computing, memory,
-    and network resources. With LLM deployment scenarios and models evolving at
-    breakneck speed, the hardware requirements to meet Service Level Objectives
-    (SLOs) remain an open research question.
-
-    In this work, we present an analytical tool, GenZ, to study the relationship
-    between LLM inference performance and various platform design parameters. We
-    validate our tool against real hardware data running various LLM models,
-    achieving a geomean error of 2.73%. We present case studies that provide
-    insights into configuring platforms for different LLM workloads and use
-    cases. We quantify the platform requirements to support state-of-the-art
-    (SOTA) LLMs under diverse serving settings. Furthermore, we project the
-    hardware capabilities needed to enable future LLMs potentially exceeding
-    hundreds of trillions of parameters. The trends and insights derived from
-    GenZ can guide AI engineers deploying LLMs as well as computer architects
-    designing next-generation hardware accelerators and platforms. Ultimately,
-    this work sheds light on the platform design considerations for unlocking
-    the full potential of LLMs across a spectrum of applications.
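The 2.73% figure above aggregates per-workload relative errors with a geometric mean. For reference, here is a minimal C++ sketch of that metric under its conventional definition, exp(mean(log e_i)); the sample values are hypothetical, and the exact aggregation GenZ uses may differ:

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Geometric mean of per-workload relative errors, conventionally
// defined as exp(mean(log(error_i))). Sample values are hypothetical.
double geomeanError(const std::vector<double>& predicted,
                    const std::vector<double>& measured) {
    double logSum = 0.0;
    for (size_t i = 0; i < predicted.size(); ++i) {
        double relErr = std::fabs(predicted[i] - measured[i]) / measured[i];
        logSum += std::log(relErr);  // assumes every relErr > 0
    }
    return std::exp(logSum / predicted.size());
}

int main() {
    std::vector<double> predicted = {103.0, 48.5, 201.0};  // hypothetical latencies
    std::vector<double> measured  = {100.0, 50.0, 198.0};
    std::printf("geomean error: %.2f%%\n", 100.0 * geomeanError(predicted, measured));
}
```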
-
-9:15 - 9:30   | [Paper] BottleneckAI: Harnessing Machine Learning and Knowledge Transfer for Detecting Architectural Bottlenecks
-    Jihyun Ryoo, Gulsum Gudukbay Akbulut, Huaipan Jiang, Xulong Tang,
-    Suat Akbulut, Jack Sampson, Vijaykrishnan Narayanan and
-    Mahmut Taylan Kandemir (The Pennsylvania State University)
-
-    Existing architectural analysis tools that output bottleneck information do
-    not allow knowledge transfer to other applications or architectures. We
-    therefore propose a novel tool that can predict a known application's
-    bottlenecks on previously unseen architectures, or an unknown application's
-    bottlenecks on known architectures. We (i) identify the bottleneck
-    characteristics of 44 applications and use them as the dataset for our
-    ML/DL model; (ii) identify the correlations between metrics and bottlenecks
-    to create our tool's initial feature list; (iii) propose an architectural
-    bottleneck analysis model, BottleneckAI, that employs random forest
-    regression (RFR) and multi-layer perceptron (MLP) regression; (iv) present
-    results indicating that BottleneckAI achieves R^2 scores of 0.70 (RFR) and
-    0.72 (MLP) when predicting bottlenecks; and (v) present five versions of
-    BottleneckAI, four trained on single-architecture data and one trained on
-    multi-architecture data, to predict bottlenecks for new architectures.
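For context on the R^2 figures above: the coefficient of determination measures the fraction of variance in the target that a regressor explains. A minimal C++ sketch of the standard definition, using hypothetical bottleneck scores rather than data from the paper:

```cpp
#include <cstdio>
#include <vector>

// Coefficient of determination: R^2 = 1 - SS_res / SS_tot.
// Scores of 0.70-0.72, as reported for BottleneckAI, mean the model
// explains roughly 70% of the variance in the bottleneck metric.
double rSquared(const std::vector<double>& actual,
                const std::vector<double>& predicted) {
    double mean = 0.0;
    for (double y : actual) mean += y;
    mean /= actual.size();

    double ssRes = 0.0, ssTot = 0.0;
    for (size_t i = 0; i < actual.size(); ++i) {
        ssRes += (actual[i] - predicted[i]) * (actual[i] - predicted[i]);
        ssTot += (actual[i] - mean) * (actual[i] - mean);
    }
    return 1.0 - ssRes / ssTot;
}

int main() {
    // Hypothetical bottleneck scores and model predictions.
    std::vector<double> actual    = {0.9, 0.4, 0.7, 0.2, 0.6};
    std::vector<double> predicted = {0.8, 0.5, 0.6, 0.3, 0.6};
    std::printf("R^2 = %.2f\n", rSquared(actual, predicted));
}
```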
-
-9:30 - 9:45   | [Paper] How Accurate is Accurate Enough for Simulators? A Review of Simulation Validation
-    Shiyuan Li (Oregon State University) and Yifan Sun (The College of William and Mary)
-
-    Simulators are vital tools for evaluating the performance of innovative
-    architectural designs. To ensure accurate simulation results, researchers
-    must validate these simulators. However, even validated simulators can
-    exhibit unreliability when facing new workloads or modified architectural
-    designs. This paper seeks to enhance simulator trustworthiness by refining
-    the validation process. Through a comprehensive review of the existing
-    literature, we examine the nuances of simulator accuracy and reliability
-    from a broader perspective on simulation error that goes beyond simple
-    accuracy validation. Our proposals for improving simulator trustworthiness
-    include selecting a representative benchmark set and expanding the
-    configuration set during validation. Additionally, we aim to predict errors
-    associated with new workloads by leveraging the error profiles obtained
-    from the validation process. To further enhance overall simulator
-    trustworthiness, we suggest incorporating error tolerance in the simulator
-    calibration process. Ultimately, we propose additional validation with new
-    benchmarks and minimal calibration, as this approach closely mimics
-    real-world usage environments.
-
-9:45 - 10:00  | [Paper] Parallelizing a Modern GPU Simulator
-    Rodrigo Huerta and Antonio Gonzalez (Universitat Politècnica de Catalunya)
-
-    Simulators are a primary tool in computer architecture research but are
-    extremely computationally intensive. Simulating modern architectures with
-    increased core counts and recent workloads can be challenging, even on
-    modern hardware. This paper demonstrates that simulating some GPGPU
-    workloads in a single-threaded state-of-the-art simulator such as Accel-sim
-    can take more than five days. We present a simple approach to parallelizing
-    this simulator with minimal code changes using OpenMP. Moreover, our
-    parallelization technique is deterministic, so the simulator produces the
-    same results for single-threaded and multi-threaded simulations. Compared
-    to previous work, we achieve a higher speed-up, and, more importantly, the
-    parallel simulation does not incur any inaccuracies. When we run the
-    simulator with 16 threads, we achieve an average speed-up of 5.8x and reach
-    14x on some workloads. This allows researchers to simulate applications
-    that previously took five days in less than 12 hours. By speeding up
-    simulations, researchers can model larger systems, simulate bigger
-    workloads, add more detail to the model, increase the efficiency of the
-    hardware platform where the simulator is run, and obtain results sooner.
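The notable claim above is that the OpenMP parallelization is deterministic. One common way to achieve that is to have threads write only to disjoint, unit-indexed state and then merge results in a fixed sequential order. The sketch below illustrates that general pattern in C++; it is our assumption about the technique, not Accel-sim code:

```cpp
// Compile with -fopenmp; the pragma is ignored (and the code stays
// correct, just sequential) if OpenMP is disabled.
#include <cstdio>
#include <vector>

int main() {
    const int numUnits = 64;
    std::vector<long> events(numUnits);

    // Each simulated unit advances independently and writes only to its
    // own slot, so threads never race on shared state.
    #pragma omp parallel for schedule(static)
    for (int u = 0; u < numUnits; ++u) {
        events[u] = 0;
        for (int c = 0; c < 1000; ++c)            // stand-in for per-unit cycle work
            events[u] += (u * 1315423911L + c) % 7;
    }

    // Fixed-order sequential merge: the total is bit-identical whether
    // the loop above ran with 1 thread or 16.
    long total = 0;
    for (int u = 0; u < numUnits; ++u)
        total += events[u];
    std::printf("total events: %ld\n", total);
}
```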
-
-10:00 - 10:30 | Coffee break
-10:30 - 12:00 | Simulator Release Talks
-10:30 - 10:50 | gem5 (Jason Lowe-Power)
-10:50 - 11:10 | Sniper (Alen Sabu, Trevor E. Carlson)
-11:10 - 11:30 | User-Friendly Tools in Akita (Yifan Sun)
-
-    In this talk, we will present the real-time monitoring tool for Akita,
-    AkitaRTM, and the default trace visualization tool for Akita, Daisen.
-
-11:30 - 11:50 | SST (TBD)
-11:50 - 12:00 | Closing Remarks