TinyCodeLM 400M by upiter


Tags: arXiv:2410.02749, AutoTrain compatible, Dataset: bigcode/the-stack, Dataset: HuggingFaceFW/fineweb, Endpoints compatible, OLMo, PyTorch, Region: US
Model Card on HF: https://huggingface.co/upiter/TinyCodeLM-400M


TinyCodeLM 400M Parameters and Internals

Model Type 
generative code model, text generation, code synthesis
Use Cases 
Areas:
Python code synthesis
Primary Use Cases:
Python code synthesis
Limitations:
Potential for misuse in generating vulnerable/malicious code
Considerations:
Model-generated code must not be executed without precautions (see the static-check sketch after this list).
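
As a minimal illustration of the precaution above, the sketch below statically checks model-generated Python with the standard-library `ast` module before any decision to run it in a sandbox. The `generated_code` string is a hypothetical placeholder, not actual model output.

```python
import ast

# Hypothetical model output; in practice this would come from the model's generate() call.
generated_code = "def add(a, b):\n    return a + b\n"

def looks_syntactically_valid(source: str) -> bool:
    """Parse the source without executing it; reject anything that does not parse."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

if looks_syntactically_valid(generated_code):
    print("Syntax OK - still review and sandbox before executing.")
else:
    print("Rejected: generated code does not parse.")
```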
Additional Notes 
Pretrained on a mixture of open-source web text and Python code.
Training Details 
Data Sources:
bigcode/the-stack, HuggingFaceFW/fineweb, Magicoder, StarCoder2 OSS-Instruct
Data Volume:
72 billion tokens
Methodology:
Pretrained on open-source web text and Python code; instruction-tuned on synthetic edit-sequence data generated with the LintSeq algorithm (see the illustrative diff sketch after this section).
Training Time:
Pretraining took about two days for the 150M model and about six days for the 400M model; instruction tuning took several hours.
Hardware Used:
single H100 node (four GPUs) for pretraining, single H100 GPU for instruction tuning
Model Architecture:
Autoregressive language model following the GPT-2 architecture, with transformer modifications adopted from OLMo.
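
The LintSeq procedure itself is described in the linked paper (arXiv:2410.02749); the sketch below is only a rough illustration of the general idea of edit-sequence data, rendering the change between two hypothetical program states as a unified diff with Python's standard `difflib`. It is not the authors' implementation.

```python
import difflib

# Two hypothetical states of the same program; real training edit sequences are
# produced by the LintSeq procedure described in the paper, not by this sketch.
before = "def mean(xs):\n    return sum(xs)\n"
after = "def mean(xs):\n    return sum(xs) / len(xs)\n"

# Render the edit as a unified diff, the same textual "diff" format the card
# says the instruction-tuned models emit.
diff = difflib.unified_diff(
    before.splitlines(keepends=True),
    after.splitlines(keepends=True),
    fromfile="mean.py",
    tofile="mean.py",
)
print("".join(diff))
```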
Safety Evaluation 
Risk Categories:
potential misuse for vulnerabilities/malicious code generation
Ethical Considerations:
The importance of handling model-generated code with precautions.
Input Output 
Input Format:
Text only
Output Format:
Text and code outputs. Instruction-tuned models generate code via 'diffs' (see the loading and generation sketch below).
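
A minimal loading and generation sketch, assuming the standard Hugging Face `transformers` text-generation API works for this checkpoint as it does for other `OlmoForCausalLM` models. The prompt and generation settings are illustrative, not the authors' recommended usage.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "upiter/TinyCodeLM-400M"  # repository listed on this card

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Illustrative prompt: the instruction-tuned variants are reported to emit diffs,
# so the base checkpoint is simply prompted here for plain code completion.
prompt = "def fibonacci(n):\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```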
LLM Name: TinyCodeLM 400M
Repository: https://huggingface.co/upiter/TinyCodeLM-400M
Model Size: 400M
Required VRAM: 1.8 GB
Updated: 2025-02-22
Maintainer: upiter
Model Type: olmo
Model Files: 1.8 GB
Model Architecture: OlmoForCausalLM
License: apache-2.0
Context Length: 1024
Model Max Length: 1024
Transformers Version: 4.44.0
Tokenizer Class: GPTNeoXTokenizer
Padding Token: <|padding|>
Vocabulary Size: 50304
Torch Data Type: float32
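
The configuration values above (1024-token context, 50304-token vocabulary, float32 weights, GPT-NeoX tokenizer) can be checked programmatically, assuming the checkpoint exposes them through the standard `transformers` config and tokenizer objects:

```python
from transformers import AutoConfig, AutoTokenizer

repo_id = "upiter/TinyCodeLM-400M"

config = AutoConfig.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Values reported on this card: max length 1024, vocab size 50304, dtype float32.
print("max_position_embeddings:", getattr(config, "max_position_embeddings", None))
print("vocab_size:", config.vocab_size)
print("torch_dtype:", config.torch_dtype)
print("tokenizer class:", type(tokenizer).__name__)
print("padding token:", tokenizer.pad_token)
```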



Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227