MCPcopy Index your code
hub / github.com/lazyprogrammer/machine_learning_examples / train_wikipedia

Function train_wikipedia

rnn_class/batch_wiki.py:133–146  ·  view source on GitHub ↗
(we_file='word_embeddings.npy', w2i_file='wikipedia_word2idx.json', RecurrentUnit=GRU)

Source from the content-addressed store, hash-verified

131
132
133def train_wikipedia(we_file='word_embeddings.npy', w2i_file='wikipedia_word2idx.json', RecurrentUnit=GRU):
134 # there are 32 files
135 ### note: you can pick between Wikipedia data and Brown corpus
136 ### just comment one out, and uncomment the other!
137 # sentences, word2idx = get_wikipedia_data(n_files=100, n_vocab=2000)
138 sentences, word2idx = get_sentences_with_word2idx_limit_vocab()
139 print("finished retrieving data")
140 print("vocab size:", len(word2idx), "number of sentences:", len(sentences))
141 rnn = RNN(30, [30], len(word2idx))
142 rnn.fit(sentences, learning_rate=2*1e-4, epochs=10, show_fig=True, activation=T.nnet.relu)
143
144 np.save(we_file, rnn.We.get_value())
145 with open(w2i_file, 'w') as f:
146 json.dump(word2idx, f)
147
148
149

Callers 1

batch_wiki.pyFile · 0.70

Calls 4

fitMethod · 0.95
RNNClass · 0.70
saveMethod · 0.45

Tested by

no test coverage detected