
GPT-C

  • 📙Paper: IntelliCode Compose: Code Generation Using Transformer
  • 📚Publisher: FSE
  • 🏠Author Affiliation: Microsoft
  • 🔑Public: ❌
  • 🌐Architecture
    • Decoder-Only (GPT-2-style; config sketch after this card)
  • 📏Model Size
    • 366M
  • 🗂️Data pre-processing
    • Data Resource
      • We collect a large unsupervised source code dataset to train and evaluate the code sequence completion model. It comprises over 1.2 billion lines of source code in Python, C#, JavaScript, and TypeScript. A total of over 52,000 top-starred (non-fork) GitHub projects have been selected, covering libraries from a diverse set of domains, with over 4.7 million source code files.
    • De-duplication: ❌
    • Filter Strategies
      • /
  • 🍉Tokenizer
    • Technology
      • Byte-level Byte-Pair-Encoding (BBPE; training sketch after this card)
      • SentencePiece
    • Details
      • /
  • 🧪Hyperparameters (GPT-C 366M)
    • optimizer: Adam (setup sketch after this card)
      • betas: /
      • eps: /
    • batch size: /
    • context window: 1,024
    • gradient accumulation steps: /
    • warmup steps: /
    • learning rate: 6.25e-5
    • weight decay: /
    • decay schedule
      • Cosine
      • Linear
      • Polynomial
      • Inverse Square Root
    • precision floating point: /
  • 🏃‍♀️Training
    • model initialization: from scratch
    • training strategies
      • left-to-right (next-token prediction; training-step sketch after this card)
    • trained tokens/steps: /
    • hardware: 5 Lambda V100 boxes, each with sixteen V100 GPUs (32 GB HBM2 memory)
    • training time: /
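
To make the Architecture and Model Size entries concrete, here is a minimal sketch of a decoder-only configuration at roughly this scale, using the Hugging Face `transformers` library. The paper's exact depth and width are not recorded in this card; the GPT-2-medium-like dimensions below (24 layers, 16 heads, hidden size 1,024) and the 50k vocabulary are assumptions that land near 366M parameters. Only the 1,024-token context window comes from the card.

```python
# Sketch of a ~366M-parameter decoder-only (GPT-2-style) model.
# Depth/width/vocabulary are assumptions, not values from the paper;
# only the 1,024-token context window is taken from the card above.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=50_000,  # assumed vocabulary size
    n_positions=1024,   # context window (from the card)
    n_embd=1024,        # hidden size (GPT-2-medium-like assumption)
    n_layer=24,         # decoder layers (assumption)
    n_head=16,          # attention heads (assumption)
)
model = GPT2LMHeadModel(config)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters")
```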
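
The tokenizer entry lists both byte-level BPE and SentencePiece as candidate technologies, and the card records no vocabulary size. As one illustration, here is a sketch of training a byte-level BPE vocabulary on source files with the Hugging Face `tokenizers` library; the file paths and the 50k vocabulary size are placeholders, not values from the paper.

```python
# Sketch: train a byte-level BPE (BBPE) vocabulary on source code.
# File paths and vocab_size are placeholders, not values from the paper.
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["corpus/a.py", "corpus/b.cs"],  # hypothetical training files
    vocab_size=50_000,
    min_frequency=2,
    special_tokens=["<|endoftext|>"],
)
tokenizer.save_model("tokenizer")  # writes vocab.json and merges.txt

# Byte-level BPE falls back to raw bytes, so any source string tokenizes:
print(tokenizer.encode("def add(a, b):\n    return a + b").tokens)
```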
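
The hyperparameter entry pins down Adam and a 6.25e-5 learning rate, but betas, eps, warmup, and the decay schedule are not recorded (four candidate schedules are listed above). A minimal PyTorch setup under those gaps might look as follows, continuing from the config sketch above; the betas/eps defaults and step counts are placeholders, and linear warmup-then-decay is picked arbitrarily from the card's candidates.

```python
# Sketch: Adam at the card's 6.25e-5 learning rate.
# betas/eps and the step counts are placeholders (not stated in the card);
# the linear schedule is one arbitrary pick among the candidates listed.
import torch
from transformers import get_linear_schedule_with_warmup

optimizer = torch.optim.Adam(
    model.parameters(),  # `model` from the config sketch above
    lr=6.25e-5,          # learning rate (from the card)
    betas=(0.9, 0.999),  # PyTorch defaults; the paper's values are unknown
    eps=1e-8,
)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=1_000,      # placeholder
    num_training_steps=100_000,  # placeholder
)
```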
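
Left-to-right training is standard next-token prediction: each token is predicted from its prefix, so the labels are simply the inputs (the model shifts them internally when computing the cross-entropy loss). Continuing from the sketches above, one training step might look like this, with a random placeholder batch standing in for real tokenized code.

```python
# Sketch of one left-to-right (next-token prediction) training step.
# GPT2LMHeadModel shifts labels internally, so labels == input_ids.
import torch

input_ids = torch.randint(0, config.vocab_size, (2, 1024))  # placeholder batch
outputs = model(input_ids=input_ids, labels=input_ids)
outputs.loss.backward()
optimizer.step()
scheduler.step()
optimizer.zero_grad()
```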
