Jianyong Yuan


2024

EpiGEN: An Efficient Multi-Api Code GENeration Framework under Enterprise Scenario
Sijie Li | Sha Li | Hao Zhang | Shuyang Li | Kai Chen | Jianyong Yuan | Yi Cao | Lvqing Yang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In recent years, Large Language Models (LLMs) have demonstrated exceptional performance on code-generation tasks. However, in enterprise scenarios where private APIs are pre-built, general LLMs often fail to meet expectations. Existing approaches suffer from high resource consumption and handle multi-API tasks inadequately. To address these challenges, we propose EpiGEN, an Efficient multi-Api code GENeration framework for enterprise scenarios. It consists of three core modules: a Task Decomposition Module (TDM), an API Retrieval Module (ARM), and a Code Generation Module (CGM), in which LangChain plays an important role. A series of experiments shows that EpiGEN achieves good acceptability and readability compared to a fully fine-tuned LLM with a larger number of parameters. In particular, on medium- and hard-level tasks, EpiGEN running on a single-GPU machine even surpasses a fully fine-tuned LLM that requires a multi-GPU configuration. Overall, EpiGEN is model-size agnostic, facilitating a balance between code-generation performance and computational requirements.
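As a rough illustration only: the abstract does not give implementation details, but a pipeline of this shape (decompose the task, retrieve candidate private APIs per sub-task, then generate code conditioned on the retrieved APIs) might be sketched as below. All class and function names are hypothetical, not taken from the paper, and the real EpiGEN modules are built with LangChain and LLM calls rather than this plain-Python stub.

    # Hypothetical sketch of a TDM -> ARM -> CGM style pipeline;
    # names, interfaces, and heuristics are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class ApiDoc:
        name: str
        signature: str
        description: str

    class TaskDecompositionModule:
        def decompose(self, task: str) -> list[str]:
            # Split a complex enterprise task into sub-tasks
            # (a real system would prompt an LLM here).
            return [task]  # placeholder: no actual decomposition

    class ApiRetrievalModule:
        def __init__(self, api_docs: list[ApiDoc]):
            self.api_docs = api_docs

        def retrieve(self, sub_task: str, k: int = 3) -> list[ApiDoc]:
            # Rank private API docs by naive keyword overlap with the sub-task
            # (a real system would use embedding similarity instead).
            words = set(sub_task.lower().split())
            scored = sorted(
                self.api_docs,
                key=lambda d: len(words & set(d.description.lower().split())),
                reverse=True,
            )
            return scored[:k]

    class CodeGenerationModule:
        def generate(self, sub_task: str, apis: list[ApiDoc]) -> str:
            # Assemble a prompt from the sub-task and retrieved APIs;
            # here we return the prompt text instead of calling an LLM.
            api_block = "\n".join(
                f"{a.name}{a.signature}: {a.description}" for a in apis
            )
            return f"# Sub-task: {sub_task}\n# Available APIs:\n{api_block}\n"

    def run_pipeline(task: str, api_docs: list[ApiDoc]) -> str:
        tdm = TaskDecompositionModule()
        arm = ApiRetrievalModule(api_docs)
        cgm = CodeGenerationModule()
        snippets = [cgm.generate(st, arm.retrieve(st)) for st in tdm.decompose(task)]
        return "\n".join(snippets)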