hub / github.com/MeiGen-AI/InfiniteTalk / generate

Method generate

wan/first_last_frame2video.py:133–377 · view source on GitHub ↗

r""" Generates video frames from input first-last frame and text prompt using diffusion process. Args: input_prompt (`str`): Text prompt for content generation. first_frame (PIL.Image.Image): Input image tensor. Shape: [3, H, W

(self,
                 input_prompt,
                 first_frame,
                 last_frame,
                 max_area=720 * 1280,
                 frame_num=81,
                 shift=16,
                 sample_solver='unipc',
                 sampling_steps=50,
                 guide_scale=5.5,
                 n_prompt="",
                 seed=-1,
                 offload_model=True)

Source from the content-addressed store, hash-verified

131	self.sample_neg_prompt = config.sample_neg_prompt
132
133	def generate(self,
134	input_prompt,
135	first_frame,
136	last_frame,
137	max_area=720 * 1280,
138	frame_num=81,
139	shift=16,
140	sample_solver='unipc',
141	sampling_steps=50,
142	guide_scale=5.5,
143	n_prompt="",
144	seed=-1,
145	offload_model=True):
146	r"""
147	Generates video frames from input first-last frame and text prompt using diffusion process.
148
149	Args:
150	input_prompt (`str`):
151	Text prompt for content generation.
152	first_frame (PIL.Image.Image):
153	Input image tensor. Shape: [3, H, W]
154	last_frame (PIL.Image.Image):
155	Input image tensor. Shape: [3, H, W]
156	[NOTE] If the sizes of first_frame and last_frame are mismatched, last_frame will be cropped & resized
157	to match first_frame.
158	max_area (`int`, optional, defaults to 720*1280):
159	Maximum pixel area for latent space calculation. Controls video resolution scaling
160	frame_num (`int`, optional, defaults to 81):
161	How many frames to sample from a video. The number should be 4n+1
162	shift (`float`, optional, defaults to 5.0):
163	Noise schedule shift parameter. Affects temporal dynamics
164	[NOTE]: If you want to generate a 480p video, it is recommended to set the shift value to 3.0.
165	sample_solver (`str`, optional, defaults to 'unipc'):
166	Solver used to sample the video.
167	sampling_steps (`int`, optional, defaults to 40):
168	Number of diffusion sampling steps. Higher values improve quality but slow generation
169	guide_scale (`float`, optional, defaults 5.0):
170	Classifier-free guidance scale. Controls prompt adherence vs. creativity
171	n_prompt (`str`, optional, defaults to ""):
172	Negative prompt for content exclusion. If not given, use `config.sample_neg_prompt`
173	seed (`int`, optional, defaults to -1):
174	Random seed for noise generation. If -1, use random seed
175	offload_model (`bool`, optional, defaults to True):
176	If True, offloads models to CPU during generation to save VRAM
177
178	Returns:
179	torch.Tensor:
180	Generated video frames tensor. Dimensions: (C, N H, W) where:
181	- C: Color channels (3 for RGB)
182	- N: Number of frames (81)
183	- H: Frame height (from max_area)
184	- W: Frame width from max_area)
185	"""
186	first_frame_size = first_frame.size
187	last_frame_size = last_frame.size
188	first_frame = TF.to_tensor(first_frame).sub_(0.5).div_(0.5).to(
189	self.device)
190	last_frame = TF.to_tensor(last_frame).sub_(0.5).div_(0.5).to(

Callers

nothing calls this directly

Calls 10

set_timestepsMethod · 0.95

stepMethod · 0.95

FlowUniPCMultistepSchedulerClass · 0.85

FlowDPMSolverMultistepSchedulerClass · 0.85

get_sampling_sigmasFunction · 0.85

retrieve_timestepsFunction · 0.85

deviceMethod · 0.80

visualMethod · 0.80

encodeMethod · 0.45

decodeMethod · 0.45

Tested by

no test coverage detected