- 📙Paper: InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with Inverse-Instruct
- 📚Publisher: arXiv
- 🏠Author Affiliation: ICT, CAS; Baidu; Autodesk Research
- Contributions:
- We introduce INVERSE-INSTRUCT, a simple yet effective instruction-tuning approach that exploits the mismatch between code generation and instruction generation.
- We conduct a thorough analysis of INVERSE-INSTRUCT, covering the composition of the generated dataset, the impact of data size, etc. We find that the self-consistency between code generation and summarization predicts the effectiveness of INVERSE-INSTRUCT prior to training.
- Based on INVERSE-INSTRUCT, we present a series of code LLMs named InverseCoder, which achieve SOTA or comparable results on a wide range of benchmarks, including Python code generation, multilingual code completion, and data science problems.
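The contributions above describe an inverse data-generation loop: instead of generating code from instructions, the model summarizes existing code into instructions and keeps pairs that pass a consistency check. A minimal sketch of that flow, where `summarize` and `score_self_consistency` are hypothetical stubs standing in for calls to the instruction-tuned code LLM (they are not the paper's API):

```python
def summarize(code: str) -> str:
    # Hypothetical stub: a real system would prompt the code LLM
    # to produce a natural-language instruction describing the snippet.
    return f"Write code matching: {code.splitlines()[0]}"

def score_self_consistency(instruction: str, code: str) -> float:
    # Hypothetical stub: the idea is to check whether code regenerated
    # from the instruction agrees with the original snippet.
    return 1.0

def inverse_instruct(code_corpus, threshold=0.5):
    """Turn a corpus of code snippets into (instruction, output) pairs
    suitable for instruction tuning, filtering by self-consistency."""
    dataset = []
    for code in code_corpus:
        instruction = summarize(code)  # code -> natural-language instruction
        if score_self_consistency(instruction, code) >= threshold:
            dataset.append({"instruction": instruction, "output": code})
    return dataset

corpus = ["def add(a, b):\n    return a + b"]
pairs = inverse_instruct(corpus)
```

This is only a schematic of the data flow implied by the post, under the assumption that filtering is done by a scalar consistency score; the actual prompting and filtering details are in the paper.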
- Models and Datasets: InverseCoder
This post is licensed under CC BY 4.0 by the author.