- 📙Paper: GPT Code Clippy: The Open Source version of GitHub Copilot
- 📚Publisher: other
- 🏠Author Affiliation: CodedotAI
- 🔑Public: ✅
- 🌐Architecture
- Decoder-Only
- 📏Model Size
125M; 1.3B
- 🗂️Data pre-processing
- Data Resource
- The dataset used to train GPT-CC is obtained from SEART GitHub Search
- The GitHub repositories contained in the Pile
- De-duplication: ✅
- Filter Strategies (a filtering sketch follows this section)
- >10 GitHub stars
- >2 commits
- must have a licence
- exclude forks
- size < 70708 bytes
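A minimal sketch of how the filters above could be applied to repository records returned by a GitHub search. The field names (`stars`, `commits`, `license`, `fork`, `size_bytes`) are assumptions for illustration, not the fields used by the actual GPT-CC pipeline.

```python
# Sketch only: field names are illustrative, not from the GPT-CC scripts.
def keep_repository(repo: dict) -> bool:
    """Apply the filter strategies listed above to one repository record."""
    return (
        repo.get("stars", 0) > 10              # >10 GitHub stars
        and repo.get("commits", 0) > 2         # >2 commits
        and repo.get("license") is not None    # must have a licence
        and not repo.get("fork", False)        # exclude forks
        and repo.get("size_bytes", 0) < 70708  # size < 70708 bytes
    )


repos = [
    {"stars": 120, "commits": 45, "license": "mit", "fork": False, "size_bytes": 50_000},
    {"stars": 3, "commits": 1, "license": None, "fork": True, "size_bytes": 90_000},
]
print([keep_repository(r) for r in repos])  # [True, False]
```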
- 🍉Tokenizer
- Technology
- Byte-level Byte-Pair-Encoding (BBPE)
- Details
- GPT-Neo tokenizer (a loading example follows this section)
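GPT-CC reuses the GPT-Neo tokenizer (a GPT-2-style byte-level BPE tokenizer) rather than training a new one. A short example of loading it with Hugging Face transformers, using the public EleutherAI 125M checkpoint:

```python
from transformers import AutoTokenizer

# The GPT-Neo checkpoints ship the byte-level BPE tokenizer that GPT-CC reuses.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")

ids = tokenizer("def hello_world():\n    print('hi')")["input_ids"]
print(len(ids))
print(tokenizer.decode(ids))
```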
- 🧪Hyperparameters (GPT-CC 1.3B)
- optimizer: AdaFactor (a configuration sketch follows this list)
- betas: /
- eps: /
- batch size: 24
- context window: 1,024
- gradient accumulation steps: /
- warmup steps: 5,000
- learning rate: 2e-5
- weight decay: 0.1
- decay schedule
- Cosine
- Linear
- Polynomial
- Inverse Square
- floating-point precision: /
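A sketch of how the values above could be expressed with Hugging Face TrainingArguments. The Adafactor reading of the truncated optimizer name, the output directory, and leaving the values marked `/` at library defaults are assumptions; this is not the actual GPT-CC training configuration.

```python
from transformers import TrainingArguments

# Sketch only: maps the listed hyperparameters onto TrainingArguments fields.
args = TrainingArguments(
    output_dir="gpt-cc-1.3b-sketch",   # hypothetical path, not from the card
    per_device_train_batch_size=24,    # batch size: 24
    warmup_steps=5_000,                # warmup steps: 5,000
    learning_rate=2e-5,                # learning rate: 2e-5
    weight_decay=0.1,                  # weight decay: 0.1
    optim="adafactor",                 # assumed reading of the optimizer field
    # lr_scheduler_type is not set: the card lists several decay schedules
    # without selecting one.
)
# The 1,024-token context window is applied at tokenization time
# (e.g. via block_size / max_length), not through TrainingArguments.
```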
- 🏃♀️Training
- model initialization: GPT-Neo (a training-setup sketch follows this list)
- training strategies
- left-to-right
- trained tokens/steps: /
- hardware: /
- training time: /
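A minimal sketch of the setup described above, assuming the public EleutherAI GPT-Neo 125M checkpoint as the initialization and a standard left-to-right (causal) language-modeling loss; the actual GPT-CC runs also fine-tuned the 1.3B variant.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Initialize from the public GPT-Neo checkpoint, as the card states.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

batch = tokenizer("def add(a, b):\n    return a + b", return_tensors="pt")
# Left-to-right (causal) language modeling: labels are the input ids;
# the model shifts them internally by one position when computing the loss.
outputs = model(**batch, labels=batch["input_ids"])
print(float(outputs.loss))
```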