hub / github.com/Turing-Project/WriteGPT / embed

Function embed

LanguageNetwork/GPT2/train/modeling.py:254–320  ·  view source on GitHub ↗

Word and position embeddings. :param input_ids: int Tensor of shape [batch_size, seq_length]. :param vocab_size: number of words in vocab :param embedding_size: dimensionality of the embedding :param position_offset: aka number of cached tokens. :param initializer_range: float. Ra…

(input_ids,
          vocab_size,
          embedding_size,
          position_offset=0,
          initializer_range=0.02,
          max_position_embeddings=512,
          use_one_hot_embeddings=True)
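
A minimal sketch of how `position_offset` shifts the rows read from the position table, written in plain Python rather than TensorFlow. The helper name `position_ids` and its bounds check are illustrative assumptions, not part of the original function, which asserts only that `seq_length` does not exceed `max_position_embeddings`.

```python
def position_ids(seq_length, position_offset=0, max_position_embeddings=512):
    """Return the position-table rows used by the current chunk of tokens.

    Hypothetical helper: with no cached tokens the rows are 0..seq_length-1;
    with `position_offset` cached tokens every row index shifts by that amount.
    """
    ids = [i + position_offset for i in range(seq_length)]
    # Assumption: reject ids that fall past the end of the position table.
    # The original function does not include the offset in its assertion.
    if ids and ids[-1] >= max_position_embeddings:
        raise ValueError("position ids exceed max_position_embeddings")
    return ids


print(position_ids(4))                     # [0, 1, 2, 3]
print(position_ids(1, position_offset=4))  # [4]
```

The second call mirrors incremental decoding: four tokens are already cached, so the single new token reads row 4 of the position table.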

Source from the content-addressed store, hash-verified

def embed(input_ids,
          vocab_size,
          embedding_size,
          position_offset=0,
          initializer_range=0.02,
          max_position_embeddings=512,
          use_one_hot_embeddings=True):
    """Word and position embeddings
    :param input_ids: int Tensor of shape [batch_size, seq_length].
    :param vocab_size: number of words in vocab
    :param embedding_size: dimensionality of the embedding
    :param position_offset: aka number of cached tokens.
    :param initializer_range: float. Range of the weight initialization.
    :param max_position_embeddings: int. Maximum sequence length.
    :param use_one_hot_embeddings: probably want this to be true
    :return: [batch_size, seq_length, embedding_size] embedded tensor
    """
    (batch_size, seq_length) = get_shape_list(input_ids, expected_rank=2)

    embedding_table = tf.get_variable(
        name='word_embed',
        shape=[vocab_size, embedding_size],
        initializer=create_initializer(initializer_range),
    )

    assert_op = tf.assert_less_equal(tf.reduce_max(input_ids), vocab_size - 1)
    with tf.control_dependencies([assert_op]):
        if use_one_hot_embeddings:
            flat_input_ids = tf.reshape(input_ids, [-1])
            one_hot_input_ids = tf.one_hot(flat_input_ids, depth=vocab_size)
            output_flat = tf.matmul(one_hot_input_ids, embedding_table)
        else:
            output_flat = tf.nn.embedding_lookup(embedding_table, input_ids)

        embedded_input = tf.reshape(output_flat, [batch_size, seq_length, embedding_size])

    assert_op = tf.assert_less_equal(seq_length, max_position_embeddings)

    with tf.control_dependencies([assert_op]):
        full_position_embeddings = tf.get_variable(
            name='pos_embed',
            shape=[max_position_embeddings, embedding_size],
            initializer=create_initializer(initializer_range),
        )
        # Since the position embedding table is a learned variable, we create it
        # using a (long) sequence length `max_position_embeddings`. The actual
        # sequence length might be shorter than this, for faster training of
        # tasks that do not have long sequences.
        #
        # So `full_position_embeddings` is effectively an embedding table
        # for position [0, 1, 2, ..., max_position_embeddings-1], and the current
        # sequence has positions [0, 1, 2, ... seq_length-1], so we can just
        # perform a slice.
        if position_offset == 0:
            embedded_input += tf.slice(full_position_embeddings, [0, 0], [seq_length, -1])[None]
        else:
            # Tensorflow is too stupid to allow slicing
            flat_pos_ids = (tf.range(seq_length, dtype=tf.int32) + position_offset)
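
The two branches selected by `use_one_hot_embeddings` should pick out the same rows of the embedding table: multiplying a one-hot matrix by the table selects one row per token id, which is what a direct lookup does. The sketch below checks this on a toy table in plain Python. The helpers `one_hot`, `matmul`, and `lookup` are illustrative stand-ins written for this example, not the TensorFlow ops themselves.

```python
def one_hot(ids, depth):
    """One row per id, with a 1.0 in the column given by the id."""
    return [[1.0 if col == i else 0.0 for col in range(depth)] for i in ids]


def matmul(a, b):
    """Naive matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lookup(table, ids):
    """Direct row lookup, the counterpart of the non-one-hot branch."""
    return [list(table[i]) for i in ids]


# Toy embedding table: vocab_size = 3, embedding_size = 2.
table = [[0.1, 0.2],
         [0.3, 0.4],
         [0.5, 0.6]]
ids = [2, 0, 2]  # flattened token ids

via_matmul = matmul(one_hot(ids, depth=len(table)), table)
via_lookup = lookup(table, ids)

# Both paths return rows 2, 0, 2 of the table.
assert via_matmul == via_lookup
```

The one-hot path trades extra arithmetic for a plain dense matrix multiply. The lookup path avoids materialising a `[batch_size * seq_length, vocab_size]` matrix.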

Callers 1

__init__ · Method · 0.70

Calls 3

get_shape_list · Function · 0.90
layer_norm · Function · 0.90
create_initializer · Function · 0.70

Tested by

no test coverage detected