- 📙Paper: IntelliCode Compose: Code Generation Using Transformer Models
- 📚Publisher:
FSE
- 🏠Author Affiliation:
Microsoft
- 🔑Public: ❌
- 🌐Architecture
- Decoder-Only
- 📏Model Size
366M
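The note records only the headline size. As a rough back-of-envelope check (every dimension below is an assumption for illustration, not a figure taken from the paper), a GPT-2-style decoder with 24 layers, hidden size 1,024, a 1,024-token context, and a ~60k code vocabulary lands in the same ballpark:

```python
# Rough GPT-2-style decoder parameter count; all dimensions here are
# assumptions for illustration, not values reported in the paper.
n_layer, d_model, n_ctx, vocab = 24, 1024, 1024, 60_000

per_block = 12 * d_model**2      # ~4*d^2 for attention + ~8*d^2 for the MLP, per block
token_emb = vocab * d_model      # token embeddings (tied with the output projection)
pos_emb = n_ctx * d_model        # learned positional embeddings

total = n_layer * per_block + token_emb + pos_emb
print(f"{total / 1e6:.0f}M parameters")  # ~364M, close to the reported 366M
```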
- 🗂️Data pre-processing
- Data Resource
- We collect a large unsupervised source code dataset to train and evaluate the code sequence completion model. It comprises over 1.2 billion lines of source code in Python, C#, JavaScript, and TypeScript. A total of over 52,000 top-starred (non-fork) GitHub projects were selected, covering libraries from a diverse set of domains, with over 4.7 million source code files (see the collection sketch after this section).
- De-duplication: ❌
- Filter Strategies
- /
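The note does not describe the collection pipeline beyond the numbers above. Purely as an illustration (the local mirror path, file extensions, and absence of filtering are assumptions, not the paper's actual procedure), gathering source files and counting lines over cloned repositories might look like:

```python
from pathlib import Path

# Hypothetical local mirror of the selected GitHub repositories (path is an assumption).
REPO_ROOT = Path("data/github_repos")
# File extensions for the four languages named in the dataset description.
EXTENSIONS = {".py", ".cs", ".js", ".ts"}

files, lines = 0, 0
for path in REPO_ROOT.rglob("*"):
    if path.suffix in EXTENSIONS and path.is_file():
        files += 1
        lines += sum(1 for _ in path.open(encoding="utf-8", errors="ignore"))

print(f"{files} source files, {lines} lines of code")
```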
- 🍉Tokenizer
- Technology
- Byte-level Byte-Pair-Encoding (BBPE)
- SentencePiece
- Details
- /
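No tokenizer details are given beyond the two candidate technologies above. A minimal sketch, assuming a byte-level BPE vocabulary is trained on the collected source files with the Hugging Face `tokenizers` library (the vocabulary size, special token, and file glob are hypothetical):

```python
from pathlib import Path
from tokenizers import ByteLevelBPETokenizer

# Files gathered in the previous step; paths and vocabulary size are assumptions.
code_files = [str(p) for p in Path("data/github_repos").rglob("*.py")]

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=code_files,
    vocab_size=60_000,                 # hypothetical vocabulary size
    min_frequency=2,
    special_tokens=["<|endoftext|>"],  # document separator, GPT-2 style
)
tokenizer.save_model("tokenizer")      # writes vocab.json and merges.txt
```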
- 🧪Hyperparameters (GPT-C 366M)
- optimizer: Adam
- betas: /
- eps: /
- batch size: /
- context window:
1,024
- gradient accumulation steps: /
- warmup steps: /
- learning rate:
6.25e-5
- weight decay: /
- decay schedule
- Cosine
- Linear
- Polynomial
- Inverse Square
- precision floating point: /
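Only the optimizer (Adam), learning rate (6.25e-5), and context window (1,024) are pinned down above. A hedged PyTorch sketch that wires those known values into a GPT-2-style setup; the model dimensions, step count, and schedule choice are placeholders, with cosine picked arbitrarily from the candidate schedules listed:

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Decoder-only GPT-2-style model; n_positions matches the 1,024-token context window.
# vocab_size and the remaining dimensions are placeholders, not the paper's configuration.
config = GPT2Config(vocab_size=60_000, n_positions=1024)
model = GPT2LMHeadModel(config)

# Learning rate from the note; betas/eps are left at defaults since they are unspecified.
optimizer = torch.optim.Adam(model.parameters(), lr=6.25e-5)
# The decay schedule is not pinned down above; cosine is one of the listed candidates.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100_000)
```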
- 🏃‍♀️Training
- model initialization: from scratch
- training strategies
- left-to-right
- trained tokens/steps: /
- hardware: five Lambda V100 boxes, each with sixteen V100 GPUs (32 GB HBM2 memory per GPU)
- training time: /
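Assuming standard left-to-right (causal) language-model training, one next-token-prediction step could be sketched as below; the model and optimizer are the illustrative ones from the previous snippet, and batch shapes are arbitrary:

```python
import torch
import torch.nn.functional as F

def causal_lm_step(model, optimizer, input_ids):
    """One left-to-right training step: predict token t+1 from tokens <= t."""
    logits = model(input_ids).logits                  # (batch, seq, vocab)
    # Shift so each position is scored against the *next* token in the sequence.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = input_ids[:, 1:].contiguous()
    loss = F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```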