hub / github.com/Tencent-Hunyuan/HunyuanVideo-I2V / encode

Method encode

hyvideo/text_encoder/__init__.py:292–517

Args: batch_encoding (dict): Batch encoding from tokenizer. use_attention_mask (bool): Whether to use attention mask. If None, use self.use_attention_mask. Defaults to None. output_hidden_states (bool): Whether to output hidden states. If False, return the value of self.output_key. If True, return the entire output.

(
        self,
        batch_encoding,
        use_attention_mask=None,
        output_hidden_states=False,
        do_sample=None,
        hidden_state_skip_layer=None,
        return_texts=False,
        data_type="image",
        semantic_images=None,
        device=None,
    )


    def encode(
        self,
        batch_encoding,
        use_attention_mask=None,
        output_hidden_states=False,
        do_sample=None,
        hidden_state_skip_layer=None,
        return_texts=False,
        data_type="image",
        semantic_images=None,
        device=None,
    ):
        """
        Args:
            batch_encoding (dict): Batch encoding from tokenizer.
            use_attention_mask (bool): Whether to use attention mask. If None, use self.use_attention_mask.
                Defaults to None.
            output_hidden_states (bool): Whether to output hidden states. If False, return the value of
                self.output_key. If True, return the entire output. If hidden_state_skip_layer is set,
                hidden states are requested regardless of this flag. Defaults to False.
            do_sample (bool): Whether to sample from the model. Used for Decoder-Only LLMs. Defaults to None.
                When None, do_sample falls back to `not self.reproduce`.
            hidden_state_skip_layer (int): Number of final layers to skip when selecting the hidden state.
                0 means the last layer. If None, self.hidden_state_skip_layer is used; if that is also
                None, self.output_key will be used. Defaults to None.
            semantic_images (PIL.Image): The reference images for i2v models.
            return_texts (bool): Whether to return the decoded texts. Defaults to False.
            device: Device to run on. If None, use self.model.device. Defaults to None.
        """
        device = self.model.device if device is None else device
        use_attention_mask = use_default(use_attention_mask, self.use_attention_mask)
        hidden_state_skip_layer = use_default(
            hidden_state_skip_layer, self.hidden_state_skip_layer
        )
        do_sample = use_default(do_sample, not self.reproduce)
        if not self.i2v_mode:
            attention_mask = (
                batch_encoding["attention_mask"].to(device)
                if use_attention_mask
                else None
            )
            outputs = self.model(
                input_ids=batch_encoding["input_ids"].to(device),
                attention_mask=attention_mask,
                output_hidden_states=output_hidden_states
                or hidden_state_skip_layer is not None,
            )
            if hidden_state_skip_layer is not None:
                last_hidden_state = outputs.hidden_states[
                    -(hidden_state_skip_layer + 1)
                ]
                # Real last hidden state already has layer norm applied. So here we only apply it
                # for intermediate layers.
                if hidden_state_skip_layer > 0 and self.apply_final_norm:
                    last_hidden_state = self.model.final_layer_norm(last_hidden_state)
            else:
                last_hidden_state = outputs[self.output_key]

            # Remove hidden states of instruction tokens, only keep prompt tokens.
            if self.use_template:
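
The layer selection in the listing can be illustrated without loading a model. This is a minimal sketch, assuming hidden_states is a sequence ordered from the embedding output to the last layer, as the -(hidden_state_skip_layer + 1) index in the source implies. The helper names select_hidden_state and needs_final_norm, and the string placeholders standing in for tensors, are hypothetical and not part of the repository.

```python
def select_hidden_state(hidden_states, hidden_state_skip_layer):
    """Mirror the index used in encode(): 0 -> last entry, 1 -> second to last."""
    return hidden_states[-(hidden_state_skip_layer + 1)]


def needs_final_norm(hidden_state_skip_layer, apply_final_norm):
    """The source re-applies the final norm only to intermediate layers (skip > 0)."""
    return hidden_state_skip_layer > 0 and apply_final_norm


# Placeholder stand-ins for per-layer tensors: embeddings plus four layers.
hidden_states = ("emb", "layer1", "layer2", "layer3", "layer4")

last = select_hidden_state(hidden_states, 0)
skip_two = select_hidden_state(hidden_states, 2)
print(last, skip_two)
print(needs_final_norm(0, True), needs_final_norm(2, True))
```

With a skip of 0 the last layer's output is taken and no extra norm is applied, which matches the comment in the listing that the real last hidden state is already normalized.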

Callers 6

forward (Method) · 0.95

predict (Method) · 0.45

encode_prompt (Method) · 0.45

get_cond_latents (Function) · 0.45

prepare_model_inputs (Function) · 0.45

extract (Function) · 0.45

Calls 2

use_default (Function) · 0.85

TextEncoderModelOutput (Class) · 0.85
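
encode resolves use_attention_mask, hidden_state_skip_layer and do_sample through use_default. The helper's body is not shown on this page, so the following is only a plausible sketch, assuming it returns the explicit argument unless that argument is None. The resolve_do_sample wrapper is a hypothetical name added for illustration.

```python
def use_default(value, default):
    # Assumed behavior (not confirmed by this page): fall back to `default`
    # only when the caller passed None, so explicit False or 0 are preserved.
    return value if value is not None else default


def resolve_do_sample(do_sample, reproduce):
    # Mirrors `do_sample = use_default(do_sample, not self.reproduce)` from encode().
    return use_default(do_sample, not reproduce)


print(resolve_do_sample(None, reproduce=True))    # falls back to `not reproduce`
print(resolve_do_sample(True, reproduce=True))    # explicit value wins
print(use_default(0, 2))                          # 0 is kept, it is not None
```

Testing against None rather than truthiness matters here: hidden_state_skip_layer=0 is a meaningful value (the last layer) and would be lost under a plain `value or default` fallback.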

Tested by

no test coverage detected