hub / github.com/kohya-ss/sd-scripts / text2img

Method text2img

gen_img_diffusers.py:1302–1392

Function for text-to-image generation.

(
        self,
        prompt: Union[str, List[str]],
        negative_prompt: Optional[Union[str, List[str]]] = None,
        height: int = 512,
        width: int = 512,
        num_inference_steps: int = 50,
        guidance_scale: float = 7.5,
        num_images_per_prompt: Optional[int] = 1,
        eta: float = 0.0,
        generator: Optional[torch.Generator] = None,
        latents: Optional[torch.FloatTensor] = None,
        max_embeddings_multiples: Optional[int] = 3,
        output_type: Optional[str] = "pil",
        return_dict: bool = True,
        callback: Optional[Callable[[int, int, torch.FloatTensor], None]] = None,
        callback_steps: Optional[int] = 1,
        **kwargs,
    )
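
The signature types `callback` as `Callable[[int, int, torch.FloatTensor], None]` and pairs it with `callback_steps`. A minimal sketch of a compatible callback follows. It assumes the callback receives the step index, the timestep, and the current latents, and that it fires when the step index is a multiple of `callback_steps`; the excerpt on this page does not show that logic. `run_fake_loop` is a hypothetical stand-in for the denoising loop, and a plain list stands in for the latents tensor.

```python
from typing import Any, Callable, List, Optional, Tuple

# Records the (step, timestep) pairs seen by the callback.
seen: List[Tuple[int, int]] = []


def log_progress(step: int, timestep: int, latents: Any) -> None:
    """Matches the (int, int, tensor) -> None shape of `callback`."""
    seen.append((step, timestep))


def run_fake_loop(
    timesteps: List[int],
    callback: Optional[Callable[[int, int, Any], None]] = None,
    callback_steps: int = 1,
) -> None:
    """Hypothetical stand-in for the denoising loop (not the real pipeline)."""
    latents = [0.0]  # placeholder for a torch.FloatTensor
    for i, t in enumerate(timesteps):
        # Assumed firing rule: call back every `callback_steps` steps.
        if callback is not None and i % callback_steps == 0:
            callback(i, t, latents)


run_fake_loop([999, 749, 499, 249], callback=log_progress, callback_steps=2)
```

With `callback_steps=2`, this stand-in loop invokes the callback on steps 0 and 2 only.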

Source from the content-addressed store, hash-verified

        # return StableDiffusionPipelineOutput(images=image, nsfw_content_detected=has_nsfw_concept)

    def text2img(
        self,
        prompt: Union[str, List[str]],
        negative_prompt: Optional[Union[str, List[str]]] = None,
        height: int = 512,
        width: int = 512,
        num_inference_steps: int = 50,
        guidance_scale: float = 7.5,
        num_images_per_prompt: Optional[int] = 1,
        eta: float = 0.0,
        generator: Optional[torch.Generator] = None,
        latents: Optional[torch.FloatTensor] = None,
        max_embeddings_multiples: Optional[int] = 3,
        output_type: Optional[str] = "pil",
        return_dict: bool = True,
        callback: Optional[Callable[[int, int, torch.FloatTensor], None]] = None,
        callback_steps: Optional[int] = 1,
        **kwargs,
    ):
        r"""
        Function for text-to-image generation.
        Args:
            prompt (`str` or `List[str]`):
                The prompt or prompts to guide the image generation.
            negative_prompt (`str` or `List[str]`, *optional*):
                The prompt or prompts not to guide the image generation. Ignored when not using guidance (i.e., ignored
                if `guidance_scale` is less than `1`).
            height (`int`, *optional*, defaults to 512):
                The height in pixels of the generated image.
            width (`int`, *optional*, defaults to 512):
                The width in pixels of the generated image.
            num_inference_steps (`int`, *optional*, defaults to 50):
                The number of denoising steps. More denoising steps usually lead to a higher quality image at the
                expense of slower inference.
            guidance_scale (`float`, *optional*, defaults to 7.5):
                Guidance scale as defined in [Classifier-Free Diffusion Guidance](https://arxiv.org/abs/2207.12598).
                `guidance_scale` is defined as `w` of equation 2. of [Imagen
                Paper](https://arxiv.org/pdf/2205.11487.pdf). Guidance scale is enabled by setting `guidance_scale >
                1`. Higher guidance scale encourages to generate images that are closely linked to the text `prompt`,
                usually at the expense of lower image quality.
            num_images_per_prompt (`int`, *optional*, defaults to 1):
                The number of images to generate per prompt.
            eta (`float`, *optional*, defaults to 0.0):
                Corresponds to parameter eta (η) in the DDIM paper: https://arxiv.org/abs/2010.02502. Only applies to
                [`schedulers.DDIMScheduler`], will be ignored for others.
            generator (`torch.Generator`, *optional*):
                A [torch generator](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make generation
                deterministic.
            latents (`torch.FloatTensor`, *optional*):
                Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image
                generation. Can be used to tweak the same generation with different prompts. If not provided, a latents
                tensor will be generated by sampling using the supplied random `generator`.
            max_embeddings_multiples (`int`, *optional*, defaults to `3`):
                The max multiple length of prompt embeddings compared to the max output length of text encoder.
            output_type (`str`, *optional*, defaults to `"pil"`):
                The output format of the generated image. Choose between
                [PIL](https://pillow.readthedocs.io/en/stable/): `PIL.Image.Image` or `np.array`.
            return_dict (`bool`, *optional*, defaults to `True`):
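
The `guidance_scale` entry in the docstring refers to classifier-free guidance. As a rough illustration, a commonly used combination is `uncond + guidance_scale * (text - uncond)`. The excerpt does not show how this file applies guidance, so the formula and the helper name `apply_guidance` are assumptions for illustration, with plain Python lists standing in for tensors.

```python
from typing import List


def apply_guidance(
    uncond: List[float], text: List[float], guidance_scale: float
) -> List[float]:
    """Blend unconditional and text-conditioned noise predictions.

    Hypothetical helper: the usual classifier-free guidance form,
    not code taken from gen_img_diffusers.py.
    """
    return [u + guidance_scale * (t - u) for u, t in zip(uncond, text)]


uncond = [0.0, 1.0]  # prediction without the prompt
text = [1.0, 3.0]    # prediction with the prompt

# A scale of 1.0 reproduces the text-conditioned prediction unchanged.
no_guidance = apply_guidance(uncond, text, 1.0)

# The default of 7.5 pushes the result further along (text - uncond).
guided = apply_guidance(uncond, text, 7.5)
```

Under this form, a scale of 1 leaves the text-conditioned prediction untouched, which matches the docstring's note that guidance is enabled by setting `guidance_scale > 1`.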

Callers

nothing calls this directly

Calls 1

__call__ · Method · 0.95
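
Since `__call__` is the only callee listed, `text2img` may be a thin wrapper that forwards its arguments. A minimal sketch of that pattern follows, using a hypothetical `FakePipeline` class rather than the real pipeline. The forwarding body is an assumption, because the excerpt stops before the method body.

```python
from typing import Any, Dict, List, Optional, Union


class FakePipeline:
    """Hypothetical stand-in; it only echoes the arguments it receives."""

    def __call__(
        self,
        prompt: Union[str, List[str]],
        negative_prompt: Optional[Union[str, List[str]]] = None,
        **kwargs: Any,
    ) -> Dict[str, Any]:
        return {"prompt": prompt, "negative_prompt": negative_prompt, **kwargs}

    def text2img(
        self,
        prompt: Union[str, List[str]],
        negative_prompt: Optional[Union[str, List[str]]] = None,
        height: int = 512,
        width: int = 512,
        **kwargs: Any,
    ) -> Dict[str, Any]:
        # Assumed behaviour: forward everything to __call__ unchanged.
        return self.__call__(
            prompt=prompt,
            negative_prompt=negative_prompt,
            height=height,
            width=width,
            **kwargs,
        )


pipe = FakePipeline()
result = pipe.text2img("a photo of a cat", negative_prompt="blurry", width=768)
```

Here `result` echoes the forwarded arguments, with `height` falling back to its default of 512.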

Tested by

no test coverage detected