I wanted to give it a crack and whaddayaknow the devs got it up before I could crack open my breakfast red bull. And...quite frankly it's amazing!
Here's a few facts:
1. There are two new models: Llama 4 Scout, a 17-billion-active-parameter model with 16 experts, and Llama 4 Maverick, a 17-billion-active-parameter model with 128 experts
2. Maverick (80.5) surpasses Gemini 2.0 Flash (77.6) and gets close to DeepSeek v3.1 (81.2) on the MMLU Pro reasoning and knowledge benchmark
3. They support 12 languages including Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese
4. You'll be able to get them to do more of what you want, thanks to the work Meta has been doing to drive down model refusals on safe prompts
5. There's still not a lot known about their big brother, Llama 4 Behemoth, as it's still in preview, but it's a 288B-active-parameter, 16-expert MoE beast that was used for distilling the smaller models.
And here's how you can use them to build your own agent with Hugging Face smolagents. In the vid you'll learn how to:
1. Install and set up smolagents for use with LiteLLM
2. Get access to Llama 4 via IBM watsonx.ai
3. Build your own ReAct agent in Python
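To give a feel for step 3, here's a toy sketch of the ReAct loop (Thought → Action → Observation) that smolagents implements for you under the hood. A scripted fake "model" function stands in for Llama 4 so the sketch runs offline; in the video, smolagents plus LiteLLM wire a real model in its place, so every name here (`fake_llama4`, `react_agent`, the `calculator` tool) is illustrative, not part of the smolagents API.

```python
def calculator(expression: str) -> str:
    """A single tool the agent can call: evaluate simple arithmetic."""
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

def fake_llama4(history: str) -> str:
    """Scripted replies standing in for a real LLM call via LiteLLM."""
    if "Observation:" not in history:
        return "Thought: I should compute this.\nAction: calculator(366 * 24 * 3600)"
    return "Thought: I have the result.\nFinal Answer: 31622400"

def react_agent(question: str, model=fake_llama4, max_steps: int = 5) -> str:
    """Run a minimal ReAct loop: reason, call a tool, observe, repeat."""
    history = f"Question: {question}"
    for _ in range(max_steps):
        reply = model(history)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[-1].strip()
        # Parse the Action line, run the named tool, feed back an Observation.
        action = reply.split("Action:")[-1].strip()
        name, arg = action.split("(", 1)
        result = TOOLS[name.strip()](arg.rstrip(")"))
        history += f"\n{reply}\nObservation: {result}"
    return "gave up"

print(react_agent("How many seconds are in a leap year?"))  # → 31622400
```

The real thing is a few lines with smolagents' `CodeAgent` and `LiteLLMModel`, but the control flow is exactly this loop.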
If you enjoyed it, give me a like and a follow. Cheers, coders!
This post was originally shared on LinkedIn.