hub / github.com/togethercomputer/RedPajama-Data / form_ngrams

Function form_ngrams

app/src/utilities/text/ngrams.py:1–17
(sequence, n)

Source from the content-addressed store, hash-verified

def form_ngrams(sequence, n):
    history = []
    # build the first ngram, yielding only when we have a full ngram
    while n > 1:
        try:
            next_item = next(sequence)
        except StopIteration:
            # no more data, terminate the generator
            return
        history.append(next_item)
        n -= 1

    # yield each ngram we have, then add the next item and repeat
    for item in sequence:
        history.append(item)
        yield tuple(history)
        del history[0]
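
A short usage sketch may help show how the sliding window behaves. It assumes the function is copied locally (the import path to use inside the repository is not confirmed here), and the sample token lists are made up for illustration. Because the body calls `next(sequence)`, the argument apparently needs to be an iterator rather than a plain list whenever `n > 1`, so the sketch wraps its inputs in `iter(...)`.

```python
def form_ngrams(sequence, n):
    # Local copy of the function shown above, so the sketch runs on its own.
    history = []
    while n > 1:
        try:
            next_item = next(sequence)
        except StopIteration:
            return
        history.append(next_item)
        n -= 1
    for item in sequence:
        history.append(item)
        yield tuple(history)
        del history[0]


# Hypothetical sample input; any iterator of hashable or unhashable items works.
tokens = "the quick brown fox".split()

# Bigrams: a window of size 2 slides over the tokens one step at a time.
bigrams = list(form_ngrams(iter(tokens), 2))
print(bigrams)

# With n == 1 the priming loop is skipped and each item becomes a 1-tuple.
unigrams = list(form_ngrams(iter(tokens), 1))

# Inputs shorter than n yield nothing instead of raising.
too_short = list(form_ngrams(iter(["a", "b"]), 3))

# A plain list is not an iterator, so next() fails once the generator
# is first advanced (only when n > 1).
try:
    list(form_ngrams(tokens, 2))
    list_input_failed = False
except TypeError:
    list_input_failed = True
```

The function is a generator, so nothing runs until it is iterated. Callers that hold a list or string can pass `iter(value)` to get the behavior sketched here.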

Callers 6

_compute_ngrams · Function · 0.90
__call__ · Method · 0.90
__call__ · Method · 0.90
__call__ · Method · 0.90
__call__ · Method · 0.90
generate_signature · Function · 0.90

Calls

no outgoing calls

Tested by

no test coverage detected