Abstract
Black Forest Labs has announced the launch of the FLUX Pro Finetuning API, bringing unprecedented customization capabilities to the flagship FLUX Pro model. I'll test the Finetuning API, take notes, and share my insights in this blog.
The documentation for the API: https://docs.bfl.ml/finetuning/
How to Get Started?
- Prepare your images in supported formats (JPG, JPEG, PNG, or WebP). While 5 images is the recommended minimum, I tested with both 30 and 100 images.
High-quality datasets with clear, articulated subjects/objects/styles significantly improve training results. Higher resolution source images help but are capped at 1MP.
- Add text descriptions by creating text files that match your image filenames. For example, if your image is "sample.jpg", name its description file "sample.txt". (I opted to use the automatic caption function instead of writing descriptions manually in my test.)
- Compress your data folder into a ZIP file.
- Configure Training Parameters
- Submit Training Task
- Run Inference
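The data preparation steps above can be sketched with the standard library alone. This is a minimal illustration, not the official tooling: `package_dataset` is a hypothetical helper, and placing the files at the root of the archive is an assumption about what the API expects.

```python
import zipfile
from pathlib import Path

# Formats listed in the steps above.
SUPPORTED = {".jpg", ".jpeg", ".png", ".webp"}
MIN_RECOMMENDED = 5  # minimum recommendation mentioned above


def package_dataset(folder, zip_path):
    """Zip every supported image plus its same-named .txt caption, if present.

    Returns (image_names, images_without_caption).
    Assumption: files are stored at the root of the archive.
    """
    folder = Path(folder)
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
    if not images:
        raise ValueError(f"no supported images found in {folder}")
    if len(images) < MIN_RECOMMENDED:
        print(f"warning: only {len(images)} images, {MIN_RECOMMENDED}+ recommended")

    missing = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for img in images:
            zf.write(img, arcname=img.name)
            caption = img.with_suffix(".txt")  # sample.jpg -> sample.txt
            if caption.is_file():
                zf.write(caption, arcname=caption.name)
            else:
                missing.append(img.name)
    return [p.name for p in images], missing
```

If you rely on automatic captioning, as I did, the second return value will simply list every image; if you write captions by hand, it is a quick way to spot files you forgot.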
Training Parameters
- mode
Determines the finetuning approach based on your concept. Options: “character”, “product”, “style”, “general”. In “general” mode, the entire image is captioned when captioning is True, without specific focus areas, and no subject-specific improvements are made.
- finetune_comment
Purpose: A descriptive note to identify your fine-tune, since names are UUIDs. It is displayed in finetune_details.
- iterations
- Minimum: 100
- Default: 300
Purpose: Defines training duration
For quick exploration, 100-150 iterations are usually sufficient. However, more complex tasks, larger datasets, or cases requiring extreme precision may benefit from additional iterations.
- learning_rate
Default: 0.0001 for both "full" and "lora" finetune_type options. I kept this default value in my test using "full" finetune_type.
Lower values can improve the result but might need more iterations to learn a concept. Higher values let you train for fewer iterations at a potential loss in quality.
- priority
Two options: “speed” and “quality”. The default is “quality”.
- captioning
A Boolean that toggles automatic image captioning on or off. While I used automatic captioning in my test, I recommend writing captions manually instead.
- trigger_word
The default value is "TOK". While this parameter may seem less critical, you can customize it to reference your newly introduced concepts.
- lora_rank
The default is 32; the other option is 16. A lora_rank of 16 can increase training efficiency and decrease loading times.
- finetune_type
Default value is “full”. Choose between “full” for a full finetuning plus post hoc extraction of the trained weights into a LoRA, or “lora” for a raw LoRA training.
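To keep the options above straight, it can help to validate them locally before submitting anything. The sketch below is only an illustration: `training_params` is a hypothetical helper, the allowed values and defaults are the ones listed in these notes, and `learning_rate` is left out unless you set it so that the service applies its own default.

```python
MODES = {"character", "product", "style", "general"}
PRIORITIES = {"speed", "quality"}
FINETUNE_TYPES = {"full", "lora"}
LORA_RANKS = {16, 32}
MIN_ITERATIONS = 100


def training_params(mode, finetune_comment, *, iterations=300,
                    learning_rate=None, priority="quality", captioning=True,
                    trigger_word="TOK", lora_rank=32, finetune_type="full"):
    """Validate the values described above and return them as a dict."""
    checks = [
        (mode in MODES, f"mode must be one of {sorted(MODES)}"),
        (bool(finetune_comment), "finetune_comment must not be empty"),
        (isinstance(iterations, int) and iterations >= MIN_ITERATIONS,
         f"iterations must be an integer >= {MIN_ITERATIONS}"),
        (learning_rate is None or learning_rate > 0,
         "learning_rate must be positive"),
        (priority in PRIORITIES, f"priority must be one of {sorted(PRIORITIES)}"),
        (isinstance(captioning, bool), "captioning must be True or False"),
        (lora_rank in LORA_RANKS, f"lora_rank must be one of {sorted(LORA_RANKS)}"),
        (finetune_type in FINETUNE_TYPES,
         f"finetune_type must be one of {sorted(FINETUNE_TYPES)}"),
    ]
    for ok, message in checks:
        if not ok:
            raise ValueError(message)

    params = {
        "mode": mode,
        "finetune_comment": finetune_comment,
        "iterations": iterations,
        "priority": priority,
        "captioning": captioning,
        "trigger_word": trigger_word,
        "lora_rank": lora_rank,
        "finetune_type": finetune_type,
    }
    if learning_rate is not None:  # omitted so the API default applies
        params["learning_rate"] = learning_rate
    return params
```

For a quick exploration run, something like `training_params("style", "watercolor test", iterations=150)` matches the 100-150 iteration range suggested earlier.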
The following inference endpoints are available for your finetuned model:
- /flux-pro-1.1-ultra-finetuned
- /flux-pro-finetuned
- /flux-pro-1.0-depth-finetuned
- /flux-pro-1.0-canny-finetuned
- /flux-pro-1.0-fill-finetuned
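As a rough illustration of how one of these endpoints might be called, the sketch below only builds the HTTP request without sending it. Several things here are assumptions to check against the official documentation: the `https://api.us1.bfl.ai/v1` base URL (inferred from the API reference link in these notes), the `X-Key` authentication header, and the `finetune_id` and `prompt` field names. `build_inference_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumption: host and /v1 prefix inferred from the API reference link.
API_BASE = "https://api.us1.bfl.ai/v1"

# Endpoints listed above.
ENDPOINTS = {
    "flux-pro-1.1-ultra-finetuned",
    "flux-pro-finetuned",
    "flux-pro-1.0-depth-finetuned",
    "flux-pro-1.0-canny-finetuned",
    "flux-pro-1.0-fill-finetuned",
}


def build_inference_request(endpoint, finetune_id, prompt, api_key, **extra):
    """Build (but do not send) a POST request for a finetuned endpoint.

    Assumed field names: "finetune_id" and "prompt"; the header "X-Key"
    is likewise an assumption. Extra keyword arguments such as width or
    height are passed through unchanged.
    """
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    body = {"finetune_id": finetune_id, "prompt": prompt, **extra}
    return urllib.request.Request(
        f"{API_BASE}/{endpoint}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Key": api_key},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would actually submit the task. Remember to include your trigger_word in the prompt so the model references the concept you trained.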
Implementation Script
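What follows is a minimal sketch of submitting a training task, not the official script. It assumes the training endpoint lives at `https://api.us1.bfl.ai/v1/finetune`, that the ZIP archive is sent base64-encoded in a `file_data` field, and that authentication uses an `X-Key` header; verify all three against the documentation linked at the top. `post_json` and `submit_training` are hypothetical helpers.

```python
import base64
import json
import urllib.request

# Assumption: training endpoint path; check the official docs.
FINETUNE_URL = "https://api.us1.bfl.ai/v1/finetune"


def post_json(url, payload, api_key):
    """POST a JSON body and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))


def submit_training(zip_path, params, api_key, send=post_json):
    """Attach the base64-encoded ZIP to the training parameters and submit.

    `params` is a dict of the training parameters described above.
    `send` can be swapped out to test without touching the network.
    Returns whatever JSON the service sends back.
    """
    with open(zip_path, "rb") as fh:
        encoded = base64.b64encode(fh.read()).decode("ascii")
    payload = {**params, "file_data": encoded}  # "file_data" is an assumed name
    return send(FINETUNE_URL, payload, api_key)
```

The response should contain the identifier of the new fine-tune, which you then pass to one of the finetuned endpoints once training has finished.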
Additional Documentation
There are several additional parameters you should know for fine-tuned model inference, such as width and height. The detailed documentation is available here: https://api.us1.bfl.ai/scalar#tag/tasks/POST/v1/flux-pro-1.0-fill-finetuned
My tested images
The following are images I generated with my fine-tuned model using flux-pro-1.1-ultra-finetuned.
- Author: Chengsheng Deng
- URL: https://chengshengddeng.com/article/notes-on-fluxpro-finetuning-api
- Copyright: All articles in this blog, unless otherwise stated, are licensed under the BY-NC-SA agreement. Please credit the source!