github.com/continue-revolution/sd-webui-segment-anything @v1.6.2 sqlite

556 symbols · 1,396 edges · 39 files · 99 documented (18%)
README

Segment Anything for Stable Diffusion WebUI

This extension aims to connect the AUTOMATIC1111 Stable Diffusion WebUI and the Mikubill ControlNet Extension with Segment Anything and GroundingDINO to enhance Stable Diffusion/ControlNet inpainting, enhance ControlNet semantic segmentation, automate image matting, and create LoRA/LyCORIS training sets.

News

  • 2023/04/10: v1.0.0 SAM extension released! You can click on the image to generate segmentation masks.
  • 2023/04/12: v1.0.1 Mask expansion and API support released by @jordan-barrett-jm! You can expand masks to overcome edge problems of SAM.
  • 2023/04/15: v1.1.0 GroundingDINO support released! You can enter text prompts to generate bounding boxes and segmentation masks.
  • 2023/04/18: v1.2.0 ControlNet V1.1 inpainting support released! You can copy SAM-generated masks to ControlNet to do inpainting. Note that you must update the ControlNet extension to use it. ControlNet inpainting performs far better than general-purpose models, and you do not need to download inpainting-specific models anymore.
  • 2023/04/24: v1.3.0 Automatic segmentation support released! Functionalities with * require you to have ControlNet extension installed. This update includes support for
    • *ControlNet V1.1 semantic segmentation
    • EditAnything un-semantic segmentation
    • Image layout generation (single image + batch process)
    • *Image masking with categories (single image + batch process)
    • *Inpaint not masked for ControlNet inpainting on txt2img panel
  • 2023/04/29: v1.4.0 API has been completely refactored. You can access all features for single image process through API. API documentation has been moved to wiki.
  • 2023/05/22: v1.4.1 EditAnything is ready to use! You can generate random segmentation and copy the output to EditAnything ControlNet.
  • 2023/05/29: v1.4.2 You may now do SAM inference on CPU by checking "Use CPU for SAM". This is for Mac users who are not able to do SAM inference on GPU. I discourage other users from using this feature because it is significantly slower than CUDA.
  • 2023/06/01: v1.5.0 You may now choose to use the local GroundingDINO to bypass the C++ problem. See FAQ-1 for more detail.
  • 2023/06/04: v1.5.1 Upload Mask to ControlNet Inpainting is back in response to the ControlNet inpaint improvement. You should see a new tab beside AutoSAM after updating the extension. This feature will be removed again once the ControlNet extension has its own uploading feature.
  • 2023/06/13: v1.6.0 SAM-HQ supported by @SpenserCai and me. This is an "upgraded" SAM, created by researchers at ETH Zurich & HKUST. However, I cannot guarantee which one is better and you should make your own choice based on your own experiments. Go to Installation to get the link to the models.
  • 2023/06/29: v1.6.1 MobileSAM supported. This is a tiny version of SAM, created by researchers at Kyung Hee University. Visit here for more information.
  • 2023/08/31: v1.6.2 Support WebUI v1.6.0, Gradio v3.41.2

Note that support for some other variations of SAM, such as Matting-Anything and FastSAM, is still on the way. Supporting these models, unlike MobileSAM, is non-trivial, especially FastSAM, which uses a completely different pipeline, ultralytics/YOLO. Introducing these new works into the current codebase would make the already ugly codebase even uglier. They will be supported once I finish a major refactor of the current codebase.

FAQ

Thanks for the suggestions from GitHub issues, Reddit and bilibili that have made this extension better.

There are already at least two great tutorials on how to use this extension. Check out this video (Chinese) from @ThisisGameAIResearch and this video (Chinese) from @OedoSoldier. You can also check out my demo.

You should know the following before submitting an issue.

  1. Due to the overwhelming complaints about GroundingDINO installation and the lack of a comparable high-performance text-to-bounding-box library to substitute for it, I decided to modify the source code of GroundingDINO and push it to this repository. Starting from v1.5.0, you can choose to use the local GroundingDINO by checking "Use local groundingdino to bypass C++ problem" on Settings/Segment Anything. This change should solve all problems with ninja, pycocotools, _C and anything else related to C++/CUDA compilation.

    If you did not modify the setting described above, this script will first try to install GroundingDINO and check whether your environment has successfully built the C++ dynamic library (the annoying _C). If so, the script will use the official implementation of GroundingDINO, out of respect for its authors. If the script fails to install GroundingDINO, it will use the local GroundingDINO instead.

    If you'd still like to resolve the GroundingDINO install problem, these are the common problems I have observed for Windows users. DO NOT skip steps.

    • pycocotools: here.
    • _C: here.

    If you are still unable to install GroundingDINO on Windows AND you cannot resolve this problem AFTER searching the issues here, here and here, you may refer to #98 and watch the videos there. Note that I develop on Linux, so I cannot guarantee that any of the video tutorials will work.

  2. If you

    The problem is most likely caused by other extensions that also change their position in the extension list in order to control ControlNet. The easiest solution is here. This change places the SAM extension before ControlNet, bypassing the internal ordering code, and will not prevent you from receiving any updates from me. I am not planning to refactor my code to work around this problem. I did not expect to control ControlNet when I created this extension, but ControlNet has indeed grown much faster than I expected.

  3. This extension has almost moved into the maintenance phase. Although I don't think there will be huge updates in the foreseeable future, the Mikubill ControlNet Extension is still developing fast, and I'm looking forward to more opportunities to connect my extension to ControlNet. Regardless, I will continue to deal with issues and monitor new research to see whether it is worth supporting. I welcome community contributions and feature requests.

  4. You must use gradio>=3.23.0 and WebUI>=22bcc7be to use this extension. A1111 WebUI is stable, and some integrated package authors have also updated their packages (for example, if you are using the package from @Akegarasu, i.e. 秋叶整合包, it has already been updated according to this video). Also, supporting different versions of WebUI would be a huge time commitment, time I could instead spend creating many more features. Please update your WebUI; it is safe to use. I'm not planning to support old commits of WebUI, such as a9fed7c3.

  5. It is impossible to support the following features, at least at this moment, due to gradio/A1111 limitations. I will closely monitor gradio/A1111 updates to see whether it becomes possible to support them:

    • color inpainting, because gradio weirdly enlarges the input image, which slows down your browser or even freezes your page. I have already implemented this feature, but I made it invisible. Note that the ControlNet v1.1 inpainting model is very strong, and you no longer need to rely on traditional inpainting. ControlNet v1.1 does not support color inpainting.
    • edit mask/explicit copy, because the gradio Image component cannot accept image+mask as an output, which is the required way of explicitly copying a masked image to img2img inpaint/inpaint sketch/ControlNet (i.e. you can see the actual masked image on the panel, instead of a mysterious internal copying). Without this, you will not be able to edit the mask.

  6. Inpaint-Anything, EditAnything and A LOT of other popular SAM extensions are supported. For Inpaint-Anything, you may check this issue for how to use it. For EditAnything, please check how to use. I am always open to supporting other interesting applications. Submit a feature request if you find another interesting one.

  7. If you have a job opportunity and think I am a good fit, please feel free to send me an email.

  8. If you want to sponsor me, please go to sponsor section and scan the corresponding QR code.
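
The GroundingDINO fallback described in FAQ item 1 can be sketched roughly as follows. This is an illustration of the decision order only, not the extension's actual code: the function name `choose_groundingdino`, its parameters, and the module name `groundingdino._C` used for the compiled library are assumptions, and the installation attempt itself is left out.

```python
import importlib
from typing import Callable


def choose_groundingdino(
    use_local_setting: bool,
    try_import: Callable[[str], object] = importlib.import_module,
) -> str:
    """Return "local" or "official" following the order described in FAQ item 1.

    `use_local_setting` stands for the "Use local groundingdino" checkbox.
    `try_import` is injectable so the logic can be exercised without the package.
    """
    if use_local_setting:
        # The user explicitly asked to bypass C++/CUDA compilation.
        return "local"
    try:
        # The official package is only usable if its compiled extension loads.
        # The module name "groundingdino._C" is an assumption for this sketch.
        try_import("groundingdino")
        try_import("groundingdino._C")
    except (ImportError, OSError):
        # Installation or compilation failed: fall back to the bundled copy.
        return "local"
    return "official"
```

Calling `choose_groundingdino(False)` in an environment without a working GroundingDINO build returns `"local"`, which mirrors the behaviour the FAQ describes when the setting is left unchecked.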

Installation

Download this extension to ${sd-webui}/extensions in whatever way you like (git clone or install from the UI).

Choose one or more of the models below and put them in ${sd-webui}/models/sam or ${sd-webui-segment-anything}/models/sam (choose one, not both; remove the former folder if you choose the latter). Do not change the model names, otherwise this extension may fail due to a bug inside Segment Anything.
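
As a rough illustration of the two model locations, the sketch below picks the WebUI-level folder when it exists and otherwise falls back to the extension's own folder. That precedence is only inferred from the instruction to remove the former folder when using the latter. The function names and the `.pth`/`.pt` filter are assumptions made for this example, not the extension's actual code.

```python
from pathlib import Path
from typing import List


def resolve_sam_model_dir(webui_root: Path, extension_root: Path) -> Path:
    """Pick the folder to load SAM checkpoints from.

    Assumed precedence: ${sd-webui}/models/sam wins if it exists,
    otherwise ${sd-webui-segment-anything}/models/sam is used.
    """
    webui_dir = webui_root / "models" / "sam"
    if webui_dir.is_dir():
        return webui_dir
    return extension_root / "models" / "sam"


def list_sam_models(model_dir: Path) -> List[str]:
    """List checkpoint file names, unchanged, since renaming them may break loading."""
    # The suffix filter is an assumption for this sketch.
    return sorted(
        p.name for p in model_dir.iterdir()
        if p.is_file() and p.suffix in {".pth", ".pt"}
    )
```

With this layout, keeping both folders would silently hide any models placed in the extension folder, which is one plausible reason the instructions ask you to choose only one location.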

We support several variations of segmentation models:

  1. SAM from Meta AI.

    I myself tested vit_h on an NVIDIA 3090 Ti, which works well. If you run into VRAM problems, you should switch to smaller models.

  2. SAM-HQ from SysCV.

  3. MobileSAM from Kyung Hee University.

    • [39MB mobile_sam](https://github.com/ChaoningZhang/MobileSAM/blob/mas

Core symbols most depended-on inside this repo

print · called by 82 · local_groundingdino/util/misc.py
to · called by 38 · local_groundingdino/util/misc.py
update · called by 22 · local_groundingdino/util/utils.py
max · called by 19 · local_groundingdino/util/misc.py
samTabPrefix · called by 19 · javascript/sam.js
deepcopy · called by 11 · local_groundingdino/util/slconfig.py
get · called by 9 · local_groundingdino/models/registry.py
max_cn_num · called by 9 · scripts/process_params.py

Shape

Method 283
Function 174
Class 92
Route 7

Languages

Python 98%
TypeScript 2%

Modules by API surface

local_groundingdino/util/misc.py 54 symbols
local_groundingdino/util/utils.py 48 symbols
sam_hq/modeling/tiny_vit.py 45 symbols
local_groundingdino/datasets/transforms.py 42 symbols
scripts/sam.py 37 symbols
local_groundingdino/util/slconfig.py 33 symbols
local_groundingdino/models/GroundingDINO/backbone/swin_transformer.py 27 symbols
local_groundingdino/models/GroundingDINO/transformer.py 25 symbols
scripts/auto.py 24 symbols
scripts/api.py 23 symbols
local_groundingdino/util/slio.py 23 symbols
local_groundingdino/models/GroundingDINO/utils.py 16 symbols

For agents

$ claude mcp add sd-webui-segment-anything \
  -- python -m otcore.mcp_server <graph>
