CodeRL

Posted Jun 5, 2022; updated Jan 29, 2023

By Anonymous

📙Paper: CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
📚Publisher: NeurIPS
🏠Author Affiliation: Salesforce Research
🔑Public: ✅
🌐Architecture
- Encoder-Decoder
- Decoder-Only
📏Model Size
- 770M
🗂️Data pre-processing
- Data Resource
  - APPS
- De-duplication: /
- Filter Strategies
  - /
🍉Tokenizer
- Technology
  - Byte-level Byte-Pair-Encoding (BBPE)
  - SentencePiece
- Details
  - We adopt the code-specific tokenizer from Wang et al. [2021].
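Byte-level BPE, one of the tokenizer technologies listed above, can be illustrated with a toy trainer. This is a minimal sketch of the general technique, not the tokenizer from Wang et al. [2021]: the function names (`train_bbpe`, `merge_pair`, `encode`) and the tiny corpus are hypothetical, and a production tokenizer adds pre-tokenization, a fixed vocabulary, and special tokens.

```python
from collections import Counter


def merge_pair(seq, left, right):
    """Replace each non-overlapping occurrence of (left, right) with the joined token."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] == left and seq[i + 1] == right:
            out.append(left + right)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def train_bbpe(text, num_merges):
    """Learn up to `num_merges` merges over the UTF-8 bytes of `text` (toy version)."""
    # Start from single bytes, so any input is representable without an unknown token.
    seq = [bytes([b]) for b in text.encode("utf-8")]
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so another merge would not compress anything
        merges.append((left, right))
        seq = merge_pair(seq, left, right)
    return merges, seq


def encode(text, merges):
    """Tokenize `text` by replaying the learned merges in training order."""
    seq = [bytes([b]) for b in text.encode("utf-8")]
    for left, right in merges:
        seq = merge_pair(seq, left, right)
    return seq


# Hypothetical two-line corpus, used only to show the mechanics.
corpus = "def f(): return 1\ndef g(): return 2\n"
merges, tokens = train_bbpe(corpus, num_merges=10)
```

Because the base alphabet is the 256 byte values, text never seen during training (including non-ASCII characters) still encodes; it simply falls back to shorter tokens.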
🧪Hyperparameters (CodeRL 770M)
- optimizer: AdamW
  - betas: /
  - eps: /
- batch size: 64
- context window: /
- gradient accumulation steps: /
- warmup steps: /
- learning rate: /
- weight decay: /
- decay schedule
  - Cosine
  - Linear
  - Polynomial
  - Inverse Square
- floating-point precision: /
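The four decay schedules named above differ only in how the learning rate falls after warmup. The sketch below writes each one as a plain function of the step so they can be compared side by side. It does not indicate which schedule CodeRL used: the post leaves the learning rate and warmup steps unreported (`/`), so `peak_lr`, `warmup`, `total`, and `power` are hypothetical values chosen for illustration.

```python
import math


def lr_at(step, schedule, peak_lr=1e-4, warmup=100, total=1000, power=2.0):
    """Learning rate at `step`: linear warmup, then the named decay schedule.

    All default values are illustrative assumptions, not CodeRL settings.
    """
    if step < warmup:
        return peak_lr * step / warmup
    if schedule == "inverse_sqrt":
        # Decays proportionally to 1/sqrt(step) and never reaches zero.
        return peak_lr * math.sqrt(max(warmup, 1) / max(step, 1))
    # Fraction of the post-warmup budget already consumed, clamped to [0, 1].
    progress = min(1.0, (step - warmup) / max(1, total - warmup))
    if schedule == "cosine":
        return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
    if schedule == "linear":
        return peak_lr * (1.0 - progress)
    if schedule == "polynomial":
        return peak_lr * (1.0 - progress) ** power
    raise ValueError(f"unknown schedule: {schedule}")


# Compare the schedules at the midpoint of the decay phase.
midpoint = {name: lr_at(550, name)
            for name in ("cosine", "linear", "polynomial", "inverse_sqrt")}
```

Cosine, linear, and polynomial decay all reach zero at `total`, whereas inverse square root keeps a small positive rate indefinitely, which is one reason it is often paired with open-ended training runs.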
🏃‍♀️Training
- model initialization: CodeT5-large-ntp-py
- training strategies
  - left-to-right
  - fill-in-the-middle
  - reinforcement learning
- trained tokens/steps: /
- hardware: 1 A100 GPU
- training time: 30 hours to fine-tune CodeT5-large
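The reinforcement learning strategy listed above can be sketched as a REINFORCE-style objective in which a sampled program is scored by running it against unit tests. This is a simplified illustration under stated assumptions, not the paper's training code: the function names are hypothetical, the reward values are placeholders for a "worse failure, lower reward" scheme, and the baseline (for example, the reward of a greedy-decoded program) is an assumption of this sketch.

```python
def unit_test_reward(compiled, ran, passed_all):
    """Map a program's test outcome to a scalar reward.

    The specific values are illustrative assumptions; only their ordering matters here.
    """
    if not compiled:
        return -1.0   # compile error
    if not ran:
        return -0.6   # runtime error
    if not passed_all:
        return -0.3   # ran, but failed at least one unit test
    return 1.0        # passed every unit test


def policy_gradient_loss(token_logprobs, reward, baseline_reward=0.0):
    """Loss for one sampled program: -(reward - baseline) * sum of token log-probs.

    Minimizing it raises the likelihood of programs that beat the baseline
    and lowers the likelihood of programs that fall below it.
    """
    advantage = reward - baseline_reward
    return -advantage * sum(token_logprobs)


# Hypothetical example: the sampled program passes all tests, while the
# baseline program only fails some tests.
sampled = unit_test_reward(compiled=True, ran=True, passed_all=True)
baseline = unit_test_reward(compiled=True, ran=True, passed_all=False)
loss = policy_gradient_loss([-0.5, -1.5], sampled, baseline)
```

In a real setup the log-probabilities would come from the fine-tuned model and the loss would be backpropagated through them; plain floats are used here to keep the sketch dependency-free.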

This post is licensed under CC BY 4.0 by the author.