Flux.2 Swift MLX

A native Swift implementation of Flux.2 image generation models, running locally on Apple Silicon Macs using MLX.

FluxForge Studio on the App Store · Website

Downloads

📦 Latest Release (v2.1.0) — Universal binaries for Apple Silicon

| Download | Description |
|---|---|
| Flux2App | Demo macOS app with T2I, I2I, chat (guide) |
| Flux2CLI | Image generation CLI (guide) |
| FluxEncodersCLI | Text encoders CLI (guide) |

Note: On first launch, macOS may block unsigned apps. Right-click → Open to bypass Gatekeeper.

Features

Image Generation (Flux2Core)

  • Native Swift: Pure Swift implementation, no Python dependencies at runtime
  • MLX Acceleration: Optimized for Apple Silicon (M1/M2/M3/M4) using MLX
  • Multiple Models: Dev (32B), Klein 4B, and Klein 9B variants
  • Quantized Models: On-the-fly quantization (qint8/int4) for all models — Dev fits in ~17GB at int4
  • Text-to-Image: Generate images from text prompts
  • Image-to-Image: Transform images with text prompts and configurable strength
  • Multi-Image Conditioning: Combine elements from up to 3 reference images
  • Prompt Upsampling: Enhance prompts with Mistral/Qwen3 before generation
  • LoRA Support: Load and apply LoRA adapters for style transfer
  • LoRA Training: Train your own LoRAs on Apple Silicon (guide)
  • LoRA Evaluation: Automated pipeline to evaluate training gap and recommend parameters (guide)
  • Image-to-Image Training: Train paired I2I LoRAs (e.g. style transfer, image restoration)
  • CLI Tool: Full-featured command-line interface (Flux2CLI)
  • macOS App: Demo SwiftUI application (Flux2App) with T2I, I2I, and chat

Text Encoders (FluxTextEncoders)

  • Mistral Small 3.2 (24B): Text encoder for FLUX.2 dev/pro
  • Qwen3 (4B/8B): Text encoder for FLUX.2 Klein
  • Qwen3.5-4B VLM: Native vision-language model for image analysis (~3GB, auto-downloaded)
  • FLUX.2 Image Description: VLM-powered image analysis optimized for FLUX.2 regeneration
  • Image Comparison: Score two images on scene and style fidelity (0-10)
  • Text Generation: Streaming text generation with configurable parameters
  • Interactive Chat: Multi-turn conversation with chat template support
  • Vision Analysis: Image understanding via Pixtral (Mistral) or Qwen3.5 vision encoders
  • FLUX.2 Embeddings: Extract embeddings compatible with FLUX.2 image generation
  • CLI Tool: Complete command-line interface (FluxEncodersCLI)

Requirements

  • macOS 15.0 (Sequoia) or later (built on macOS 26 Tahoe)
  • Apple Silicon Mac (M1/M2/M3/M4)
  • Xcode 16.0 or later

Memory requirements by model (with on-the-fly quantization):

| Model | int4 | qint8 | bf16 |
|---|---|---|---|
| Klein 4B | 16 GB | 16 GB | 24 GB |
| Klein 9B | 16 GB | 24 GB | 32 GB |
| Dev (32B) | 32 GB | 96 GB | 96 GB |

Installation

Download from the Releases page:

# CLI
unzip Flux2CLI-v2.1.0-macOS.zip
./Flux2CLI t2i "a cat" --model klein-4b
 
# App
unzip Flux2App-v2.1.0-macOS.zip
open Flux2App.app

Build from Source

git clone https://github.com/VincentGourbin/flux-2-swift-mlx.git
cd flux-2-swift-mlx

Build with Xcode (not swift build):

  1. Open the project in Xcode
  2. Select Flux2CLI or Flux2App scheme
  3. Build with Cmd+B (or Cmd+R to run)

Download Models

The models are downloaded automatically from HuggingFace on first run.

For Dev (32B):

  • Text Encoder: Mistral Small 3.2 (~25GB 8-bit)
  • Transformer: Flux.2 Dev (~33GB qint8, ~17GB int4)
  • VAE: Flux.2 VAE (~3GB)

For Klein 4B/9B:

  • Text Encoder: Qwen3-4B or Qwen3-8B (~4-8GB 8-bit)
  • Transformer: Klein 4B (~4-7GB) or Klein 9B (~5-17GB depending on quantization)
  • VAE: Flux.2 VAE (~3GB)

Models are cached in ~/Library/Caches/models/ by default (configurable via --models-dir or ModelRegistry.customModelsDirectory for sandboxed apps).
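For sandboxed apps, the cache location can be redirected before the first model load. A minimal sketch, assuming only the `ModelRegistry.customModelsDirectory` property named above (its exact type — a `URL` here — and the load-order requirement are assumptions):

```swift
import Foundation
import Flux2Core

// Redirect the model cache into the app's own container before any
// pipeline is created. `customModelsDirectory` is named in this README;
// that it accepts a URL is an assumption of this sketch.
let appSupport = FileManager.default.urls(
    for: .applicationSupportDirectory, in: .userDomainMask)[0]
ModelRegistry.customModelsDirectory = appSupport.appendingPathComponent("models")
```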

Usage

CLI

# Fast generation with Klein 4B (~26s, commercial OK)
flux2 t2i "a beaver building a dam" --model klein-4b
 
# Better quality with Klein 9B (~62s)
flux2 t2i "a beaver building a dam" --model klein-9b
 
# Maximum quality with Dev (~35min, requires 64GB+ RAM)
flux2 t2i "a beautiful sunset over mountains" --model dev
 
# With custom parameters
flux2 t2i "a red apple on a white table" \
  --width 512 \
  --height 512 \
  --steps 20 \
  --guidance 4.0 \
  --seed 42 \
  --output apple.png
 
# Image-to-Image with reference image
flux2 i2i "transform into a watercolor painting" \
  --images photo.jpg \
  --strength 0.7 \
  --steps 28 \
  --output watercolor.png
 
# Multi-image conditioning (combine elements)
flux2 i2i "a cat wearing this jacket" \
  --images cat.jpg \
  --images jacket.jpg \
  --steps 28 \
  --output cat_jacket.png

See CLI Documentation for all options.

As a Library

import Flux2Core
 
// Initialize pipeline
let pipeline = try await Flux2Pipeline()
 
// Generate image
let image = try await pipeline.generateTextToImage(
    prompt: "a beautiful sunset over mountains",
    height: 512,
    width: 512,
    steps: 20,
    guidance: 4.0
) { current, total in
    print("Step \(current)/\(total)")
}
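The progress callback above slots naturally into a SwiftUI view. A hedged sketch, assuming only the `Flux2Pipeline` API shown here plus standard SwiftUI (whether the callback is invoked off the main actor is an assumption, hence the explicit hop):

```swift
import SwiftUI
import Flux2Core

struct GeneratorView: View {
    @State private var progress: Double = 0

    var body: some View {
        ProgressView(value: progress)
            .task {
                do {
                    let pipeline = try await Flux2Pipeline()
                    _ = try await pipeline.generateTextToImage(
                        prompt: "a beautiful sunset over mountains",
                        height: 512, width: 512, steps: 20, guidance: 4.0
                    ) { current, total in
                        // Hop to the main actor before touching @State.
                        Task { @MainActor in
                            progress = Double(current) / Double(total)
                        }
                    }
                } catch {
                    print("Generation failed: \(error)")
                }
            }
    }
}
```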

Architecture

Flux.2 Dev is a ~32B parameter rectified flow transformer:

  • 8 Double-stream blocks: Joint attention between text and image
  • 48 Single-stream blocks: Combined text+image processing
  • 4D RoPE: Rotary position embeddings for T, H, W, L axes
  • SwiGLU FFN: Gated activation in feed-forward layers
  • AdaLN: Adaptive layer normalization with timestep conditioning

Text encoding uses Mistral Small 3.2 to generate 15360-dim embeddings.
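To illustrate the SwiGLU feed-forward block listed above, here is a minimal scalar sketch in plain Swift — not the library's MLX tensor code, and the weight names are generic placeholders:

```swift
import Foundation

// silu(x) = x * sigmoid(x), the gating activation in SwiGLU.
func silu(_ x: Double) -> Double { x / (1.0 + exp(-x)) }

// Dense matrix-vector product.
func matvec(_ m: [[Double]], _ v: [Double]) -> [Double] {
    m.map { row in zip(row, v).map(*).reduce(0, +) }
}

// SwiGLU FFN: out = W_down · (silu(W_gate·x) ⊙ (W_up·x)).
// Weight names are illustrative, not the library's parameter names.
func swigluFFN(_ x: [Double],
               wGate: [[Double]], wUp: [[Double]], wDown: [[Double]]) -> [Double] {
    let gated = zip(matvec(wGate, x).map(silu), matvec(wUp, x)).map(*)
    return matvec(wDown, gated)
}
```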

On-the-fly Quantization

All models support on-the-fly quantization to reduce transformer memory. No need to download separate variants — one bf16 model file serves all levels.

| Model | bf16 | qint8 (-47%) | int4 (-72%) |
|---|---|---|---|
| Klein 4B | 7.4 GB | 3.9 GB | 2.1 GB |
| Klein 9B | 17.3 GB | 9.2 GB | 4.9 GB |
| Dev (32B) | 61.5 GB | 32.7 GB | 17.3 GB |

# Klein 9B with qint8 (fits in 24 GB)
flux2 t2i "a cat" --model klein-9b --transformer-quant qint8
 
# Dev with int4 (fits in 32 GB)
flux2 t2i "a cat" --model dev --transformer-quant int4

See Quantization Benchmark for detailed measurements and visual comparison.
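The table's rough shape follows from bytes-per-parameter arithmetic; a back-of-envelope sketch (per-group quantization scales are why the measured files exceed the naive figures):

```swift
// Naive size ≈ parameters × bits per weight / 8, in GiB.
// Real checkpoints add per-group scale/zero-point overhead.
func naiveGiB(params: Double, bitsPerWeight: Double) -> Double {
    params * bitsPerWeight / 8.0 / 1_073_741_824.0
}
// e.g. a 32e9-parameter model at bf16 (16 bits) ≈ 59.6 GiB,
// the same ballpark as the 61.5 GB measured above.
```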

Documentation

Guides

| Guide | Description |
|---|---|
| CLI Documentation | Command-line interface — all commands and options |
| LoRA Guide | Loading and using LoRA adapters |
| LoRA Training Guide | Training parameters, DOP, gradient checkpointing, YAML config |
| LoRA Evaluation | Automated gap analysis and training parameter recommendations |
| VLM API | Qwen3.5 VLM — image analysis, comparison, LoRA training setup |
| Text Encoders | FluxTextEncoders library API and CLI |
| Custom Model Integration | Integrating custom MLX-compatible models into the framework |
| Flux2App Guide | Demo macOS application |

Examples and Benchmarks

| Example | Description |
|---|---|
| Examples Gallery | Overview of all examples with sample outputs |
| Model Comparison | Dev vs Klein 4B vs Klein 9B — performance, quality, when to use each |
| Quantization Benchmark | Measured memory, speed, and visual quality for bf16/qint8/int4 |
| Flux.2 Dev Examples | T2I, I2I, multi-image conditioning, VLM image interpretation |
| Flux.2 Klein 4B Examples | Fast T2I, multiple resolutions, quantization comparison |
| Flux.2 Klein 9B Examples | T2I, multiple resolutions, prompt upsampling |

LoRA Training

| Guide | Description |
|---|---|
| LoRA Evaluation Pipeline | New — Automated gap analysis: VLM describes reference, generates baseline, compares, recommends training params |
| Cat Toy (Subject LoRA) | Subject injection with DOP, trigger word sks (Klein 4B) |
| Tarot Style (Style LoRA) | Style transfer, trigger word rwaite, 32 training images (Klein 4B) |

Help Wanted — The LoRA evaluation parameter recommendations are based on initial heuristics and will be refined with user feedback. If you use evaluate-lora and train LoRAs, please share your results to help improve the recommendations!

Current Limitations

  • Dev Performance: Generation takes ~30 min for 1024x1024 images (use Klein for faster results)
  • Dev Memory: Requires 32GB+ with int4, 64GB+ with qint8 (Klein 4B works with 16GB)
  • LoRA Training: Supported on Klein 4B, Klein 9B, and Dev. Enable gradient_checkpointing: true for larger models to reduce memory by ~50%. Image-to-Image training doubles sequence length — gradient checkpointing is recommended.
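A hedged YAML sketch of the checkpointing switch; only `gradient_checkpointing: true` is named in this README — the other keys are hypothetical illustrations, so consult the LoRA Training Guide for the real schema:

```yaml
# Hypothetical training-config fragment; only `gradient_checkpointing`
# is taken from this README.
model: klein-9b
gradient_checkpointing: true   # trades recompute for ~50% activation memory
```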

Acknowledgments

License

MIT License - see LICENSE file.


Disclaimer: This is an independent implementation and is not affiliated with Black Forest Labs. Flux.2 model weights are subject to their own license terms.