Game-based LLM inference task offloading for edge computing system
Abstract
Large Language Models (LLMs), with their powerful capabilities, are fundamentally transforming society. Cloud-based LLM deployment has drawbacks, including latency, lack of offline functionality, and high long-term costs. Edge computing-based LLM solutions address these issues by offloading inference tasks to edge servers. This paper formulates and optimizes an LLM inference framework that jointly accounts for inference time and predictive quality. Specifically, to address selfish users' natural tendency to maximize their own utility, we develop a Game-theoretic Offloading Algorithm (GOALIT) to optimize LLM inference offloading. The approach enables distributed users to iteratively adjust their strategies and converge to a Nash equilibrium. Compared to the optimal tree-based search (OT-GAH), the proposed approach yields a 27% increase in token throughput and shortens inference latency by 20%, while maintaining lower perplexity under dynamic system loads. These findings confirm the effectiveness of our approach in resource-constrained edge environments.
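The iterative strategy adjustment described in the abstract can be sketched as best-response dynamics in a simple offloading congestion game. This is an illustrative sketch only, not the paper's GOALIT algorithm: the cost model, constants (`LOCAL_COST`, the linear `edge_cost`), and function names are assumptions for demonstration.

```python
# Hypothetical best-response dynamics for an LLM inference offloading game.
# Each user chooses: 0 = run inference locally, 1 = offload to the edge server.
# Edge latency grows with the number of offloaders (congestion effect).

def edge_cost(n_offloaders):
    # Illustrative linear congestion model for edge inference latency.
    return 2.0 + 1.5 * n_offloaders

LOCAL_COST = 8.0  # assumed fixed cost of local inference per user


def best_response_dynamics(n_users=6, max_rounds=100):
    strategy = [0] * n_users  # start with every user computing locally
    for _ in range(max_rounds):
        changed = False
        for i in range(n_users):
            others = sum(strategy) - strategy[i]
            # If user i offloads, congestion includes user i itself.
            offload_cost = edge_cost(others + 1)
            best = 1 if offload_cost < LOCAL_COST else 0
            if best != strategy[i]:
                strategy[i] = best
                changed = True
        if not changed:
            # No user can unilaterally improve: a Nash equilibrium.
            return strategy
    return strategy
```

With these assumed costs, the dynamics settle at three offloaders: each offloader pays 6.5 (below the local cost of 8), while any additional user offloading would raise edge latency to 8 and gain nothing, so no one deviates.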
Keywords
Edge computing, game theory, LLM inference task offloading, quantization
Document Type
Journal Article
Date of Publication
1-1-2026
Volume
10
Publication Title
IEEE Transactions on Green Communications and Networking
Publisher
IEEE
School
School of Engineering
Funders
National Natural Science Foundation of China (62202060) / Guangxi Natural Science Foundation of China (2023GXNSFAA026270) / Young Elite Scientists Sponsorship Program by Beijing Association for Science and Technology (BYESS2023311)
Copyright
Subscription content
First Page
2490
Last Page
2502
Comments
Hou, S., Gan, M., Ni, W., Zhai, Z., & Liu, X. (2026). Game-based LLM inference task offloading for edge computing system. IEEE Transactions on Green Communications and Networking, 10, 2490–2502. https://doi.org/10.1109/TGCN.2026.3676823