
[![][release-shield]][release-link]
[![][dockerhub-shield]][dockerhub-link]
[![][github-stars-shield]][github-stars-link]
[![][github-issues-shield]][github-issues-link]
[![][github-contributors-shield]][github-contributors-link]
[![][github-forks-shield]][github-forks-link]
[![][license-shield]][license-link]
[![][wechat-shield]][wechat-link]
[![][spaces-shield]][spaces-link]
[![][swanhub-demo-shield]][swanhub-demo-link]
[![][modelscope-shield]][modelscope-link]
[![][modelers-shield]][modelers-link]
[![][trendshift-shield]][trendshift-link] [![][hellogithub-shield]][hellogithub-link]

Related Projects:
- SwanLab: Used throughout the training of the portrait matting model for analysis and monitoring, as well as collaboration with lab colleagues, significantly improving training efficiency.
Online Experience: [![][modelscope-shield]][modelscope-link]
2024.11.20: The Gradio Demo adds a Print Layout option that supports six-inch, five-inch, A4, 3R, and 4R layout sizes.
🚀 Thank you for your interest in our work. You may also want to check out our other achievements in the field of image processing. Feel free to reach out: zeyi.lin@swanhub.co.
HivisionIDPhoto aims to develop a practical and systematic intelligent algorithm for producing ID photos.
It utilizes a comprehensive AI model workflow to recognize various user photo-taking scenarios, perform matting, and generate ID photos.
HivisionIDPhoto can achieve:

If HivisionIDPhoto helps you, please star this repo or recommend it to friends who urgently need ID photos!
We have shared some interesting applications and extensions of HivisionIDPhotos built by the community:
Environment installation and dependencies:
- Python >= 3.7 (project primarily tested on Python 3.10)
- OS: Linux, Windows, macOS
git clone https://github.com/Zeyi-Lin/HivisionIDPhotos.git
cd HivisionIDPhotos
It is recommended to create a Python 3.10 virtual environment using conda, then execute the following commands:
pip install -r requirements.txt
pip install -r requirements-app.txt
Method 1: Script Download
python scripts/download_model.py --models all
Method 2: Direct Download
Store the downloaded weights in the project's hivision/creator/weights directory:
- modnet_photographic_portrait_matting.onnx (24.7MB): Official weights of MODNet, download
- hivision_modnet.onnx (24.7MB): Matting model with better adaptability for pure color background replacement, download
- rmbg-1.4.onnx (176.2MB): Open-source matting model from BRIA AI, download and rename to rmbg-1.4.onnx
- birefnet-v1-lite.onnx (224MB): Open-source matting model from ZhengPeng7, download and rename to birefnet-v1-lite.onnx
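Before running anything, it can help to confirm which weight files actually landed in the weights directory. The following is a minimal sketch, not part of the project: the file names and directory come from the list above, while `find_weights` is a hypothetical helper introduced here for illustration.

```python
from pathlib import Path

# Directory and file names are taken from the list above.
WEIGHTS_DIR = Path("hivision/creator/weights")
MATTING_WEIGHTS = [
    "modnet_photographic_portrait_matting.onnx",
    "hivision_modnet.onnx",
    "rmbg-1.4.onnx",
    "birefnet-v1-lite.onnx",
]


def find_weights(weights_dir=WEIGHTS_DIR):
    """Return (present, missing) lists of the known matting weight files."""
    weights_dir = Path(weights_dir)
    present = [name for name in MATTING_WEIGHTS if (weights_dir / name).is_file()]
    missing = [name for name in MATTING_WEIGHTS if name not in present]
    return present, missing


if __name__ == "__main__":
    present, missing = find_weights()
    print("present:", present)
    print("missing:", missing)
```

Run it from the project root. At least one matting weight file should show up under `present` before you start the demo or build a Docker image.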
| Extended Face Detection Model | Description | Documentation |
|---|---|---|
| MTCNN | Offline face detection model, high-performance CPU inference, default model, lower detection accuracy | Use it directly after cloning this project |
| RetinaFace | Offline face detection model, moderate CPU inference speed (on the order of seconds), and high accuracy | Download and place it in the hivision/creator/retinaface/weights directory |
| Face++ | Online face detection API launched by Megvii, higher detection accuracy, official documentation | Usage Documentation |
Test environment: Mac M1 Max 64GB, no GPU acceleration; test image resolutions: 512×715 (1) and 764×1146 (2).
| Model Combination | Memory Occupation | Inference Time (1) | Inference Time (2) |
|---|---|---|---|
| MODNet + mtcnn | 410MB | 0.207s | 0.246s |
| MODNet + retinaface | 405MB | 0.571s | 0.971s |
| birefnet-v1-lite + retinaface | 6.20GB | 7.063s | 7.128s |
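The README does not say how these timings were collected. If you want comparable numbers on your own machine, a simple wall-clock helper is usually enough. The sketch below is an assumption about how one might measure it: `time_call` is a hypothetical helper, and the workload in the demo is only a stand-in for a call into your own inference code.

```python
import time


def time_call(func, *args, repeats=3, **kwargs):
    """Call func `repeats` times and return (last_result, best_seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return result, best


if __name__ == "__main__":
    # Stand-in workload; replace it with your own inference call.
    _, seconds = time_call(sum, range(1_000_000))
    print(f"best of 3: {seconds:.3f}s")
```

Taking the best of several runs reduces noise from model loading and caching, which otherwise dominate the first call.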
In the current version, the model that supports NVIDIA GPU acceleration is birefnet-v1-lite; make sure about 16GB of VRAM is available.
To use NVIDIA GPU acceleration for inference, first make sure CUDA and cuDNN are installed. Then install the onnxruntime-gpu version that matches them according to the onnxruntime-gpu documentation, and the matching pytorch version according to the pytorch official website.
# If your computer has CUDA 12.x and cuDNN 8 installed
# Installing torch is optional. If you can't configure cuDNN, try installing torch
pip install onnxruntime-gpu==1.18.0
pip install torch --index-url https://download.pytorch.org/whl/cu121
After completing the installation, select the birefnet-v1-lite model to run inference with GPU acceleration.
TIP: CUDA installations are backward compatible. For example, if your CUDA version is 12.6 but the highest version torch currently supports is 12.4, you can still install the 12.4 build on your computer.
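To check whether the GPU build of onnxruntime is actually usable after installation, you can list the execution providers it reports. This is a minimal sketch rather than project code: `onnxruntime.get_available_providers()` is the library's own call, while `pick_providers` is a hypothetical helper introduced here.

```python
def pick_providers(available):
    """Return CUDA first when it is listed, followed by CPU when it is listed."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        available = ort.get_available_providers()
        print("available:", available)
        print("would use:", pick_providers(available))
```

If `CUDAExecutionProvider` is missing from the output, the installed onnxruntime-gpu version most likely does not match your CUDA or cuDNN version.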
python app.py
Running the program starts a local web page where you can create and adjust ID photos.

Core parameters:
- -i: Input image path
- -o: Output image path
- -t: Inference type, options are idphoto, human_matting, add_background, generate_layout_photos
- --matting_model: Portrait matting model weight selection
- --face_detect_model: Face detection model selection

More parameters can be viewed by running python inference.py --help
Input 1 photo to obtain 1 standard ID photo and 1 high-definition ID photo in 4-channel transparent PNG.
python inference.py -i demo/images/test0.jpg -o ./idphoto.png --height 413 --width 295
Input 1 photo to obtain 1 4-channel transparent PNG.
python inference.py -t human_matting -i demo/images/test0.jpg -o ./idphoto_matting.png --matting_model hivision_modnet
Input 1 4-channel transparent PNG to obtain 1 3-channel image with added background color.
python inference.py -t add_background -i ./idphoto.png -o ./idphoto_ab.jpg -c 4f83ce -k 30 -r 1
Input 1 3-channel photo to obtain 1 six-inch layout photo.
python inference.py -t generate_layout_photos -i ./idphoto_ab.jpg -o ./idphoto_layout.jpg --height 413 --width 295 -k 200
Input 1 4-channel photo (the image after matting) to obtain 1 standard ID photo and 1 high-definition ID photo in 4-channel transparent PNG.
python inference.py -t idphoto_crop -i ./idphoto_matting.png -o ./idphoto_crop.png --height 413 --width 295
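If you want to drive these commands from another Python script, for example to batch-process a folder, the command line can be assembled programmatically. The sketch below only uses flags that appear in the examples above; `build_inference_cmd` is a hypothetical helper and not part of the project.

```python
import subprocess
import sys


def build_inference_cmd(input_path, output_path, task=None, **options):
    """Build an argv list for inference.py from the flags shown above."""
    cmd = [sys.executable, "inference.py", "-i", input_path, "-o", output_path]
    if task is not None:
        cmd += ["-t", task]
    for name, value in options.items():
        # One-letter names become short flags (-k); longer names become long flags (--height).
        flag = f"-{name}" if len(name) == 1 else f"--{name}"
        cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_inference_cmd(
        "demo/images/test0.jpg", "./idphoto.png", height=413, width=295
    )
    print(" ".join(cmd))
    # Uncomment to actually run it from the project root:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.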
python deploy_api.py
For detailed request methods, please refer to the API Documentation, which includes the following request examples:
- cURL
- Python

Choose one of the following methods:
Method 1: Pull the latest image:
docker pull linzeyi/hivision_idphotos
Method 2: Directly build the image from Dockerfile:
After ensuring that at least one matting model weight file is placed in the hivision/creator/weights directory, execute the following in the project root directory:
docker build -t linzeyi/hivision_idphotos .
Method 3: Build using Docker Compose:
After ensuring that at least one matting model weight file is placed in the hivision/creator/weights directory, execute the following in the project root directory:
docker compose build
Start Gradio Demo Service
Run the following command, and you can access it locally at http://127.0.0.1:7860.
docker run -d -p 7860:7860 linzeyi/hivision_idphotos
Start API Backend Service
docker run -d -p 8080:8080 linzeyi/hivision_idphotos python3 deploy_api.py
Start Both Services Simultaneously
docker compose up -d
This project provides some additional configuration options, which can be set using environment variables:
| Environment Variable | Type | Description | Example |
|---|---|---|---|
| FACE_PLUS_API_KEY | Optional | This is your API key obtained from the Face++ console | 7-fZStDJ···· |
| FACE_PLUS_API_SECRET | Optional | Secret corresponding to the Face++ API key | VTee824E···· |
| RUN_MODE | Optional | Running mode; the available option is beast (beast mode). In beast mode, the face detection and matting models do not release memory, which makes subsequent inference faster. At least 16GB of memory is recommended. | beast |
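As an illustration of how the variables in the table above might be interpreted, here is a minimal sketch. It is not the project's actual implementation: `read_settings` is a hypothetical helper, and treating Face++ as enabled only when both the key and the secret are set is an assumption.

```python
import os


def read_settings(env=None):
    """Read the three optional variables from the table above."""
    env = os.environ if env is None else env
    key = env.get("FACE_PLUS_API_KEY")
    secret = env.get("FACE_PLUS_API_SECRET")
    return {
        # Assumption: Face++ is usable only when both key and secret are set.
        "face_plus_enabled": bool(key and secret),
        # Any value other than "beast" is treated as the default mode here.
        "beast_mode": env.get("RUN_MODE") == "beast",
    }


if __name__ == "__main__":
    print(read_settings())
```

Accepting a plain dictionary in place of `os.environ` keeps the helper easy to test without touching the real environment.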
Example of using environment variables in Docker:

```bash
docker run -d -p 7860:7860 \
    -e FACE_PLUS_API_KEY=7-fZStDJ···· \
    -e FACE_PLUS_API_SECRET=VTee824E···· \
    -e RUN_MODE=beast \
    linzeyi/hivision_
$ claude mcp add HivisionIDPhotos \
-- python -m otcore.mcp_server <graph>