The Hugging Face AI community announced the release of Falcon 180B, an open large language model (LLM) with 180 billion parameters trained on 3.5 trillion tokens. It surpasses prior open models, including the previously top-ranked LLaMA 2, in both scale and benchmark performance. Trained using Amazon SageMaker on up to 4,096 GPUs, Falcon 180B competes closely with commercial models such as Google's PaLM-2. The release underscores the rapid pace of LLM development, and Falcon 180B stands to benefit from techniques such as LoRA (Low-Rank Adaptation) fine-tuning and Nvidia's Perfusion, with further gains expected as the community fine-tunes it.
Approximate hardware requirements for Falcon 180B:

| Model | Task | Method / precision | Total GPU memory | Example hardware |
|---|---|---|---|---|
| Falcon 180B | Training | Full fine-tuning | 5120GB | 8x 8x A100 80GB |
| Falcon 180B | Training | LoRA with ZeRO-3 | 1280GB | 2x 8x A100 80GB |
| Falcon 180B | Training | QLoRA | 160GB | 2x A100 80GB |
| Falcon 180B | Inference | BF16/FP16 | 640GB | 8x A100 80GB |
| Falcon 180B | Inference | GPTQ/int4 | 320GB | 8x A100 40GB |
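A rough way to sanity-check these figures: the memory column equals the aggregate GPU memory of the listed hardware, and the raw weight footprint can be estimated as parameter count × bytes per parameter. The sketch below is a back-of-envelope estimate only; real runs also need headroom for activations, the KV cache, and (for training) gradients and optimizer state, which is why the table's totals exceed the weight footprint alone.

```python
# Back-of-envelope memory estimate for Falcon 180B weights alone.
# Not the full picture: activations, KV cache, and optimizer state
# are extra, so the hardware in the table provides more total memory.

PARAMS_BILLION = 180  # Falcon 180B: 180 billion parameters

def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Raw weight footprint in GB: billions of params x bytes each."""
    return params_billion * bytes_per_param

# BF16/FP16 uses 2 bytes per parameter -> 360 GB of weights,
# consistent with 8x A100 80GB (640 GB total) for inference.
print(weights_gb(PARAMS_BILLION, 2))    # 360.0

# GPTQ/int4 uses ~0.5 bytes per parameter -> ~90 GB of weights,
# fitting on 8x A100 40GB (320 GB total) with cache headroom.
print(weights_gb(PARAMS_BILLION, 0.5))  # 90.0
```

The same arithmetic explains why QLoRA training (4-bit base weights plus small adapter weights) fits in far less memory than full fine-tuning, which must hold full-precision weights, gradients, and optimizer state.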