
Method inner_product

gensim/similarities/termsim.py:518–629

Get the inner product(s) between real vectors / corpora X and Y, expressed in a non-orthogonal normalized basis where the dot product between the basis vectors is given by the sparse term similarity matrix.
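To make the "non-orthogonal basis" concrete: if S is the term similarity matrix, the inner product described above amounts to x^T S y rather than the ordinary x^T y. The following is a minimal, dependency-free sketch of that formula, not gensim's implementation. The function name `soft_inner_product`, the 3-term vocabulary, and the similarity values are made up for illustration, and plain nested lists stand in for the sparse matrix.

```python
def soft_inner_product(x, y, S):
    """Return x^T S y for dense vectors x, y and a square similarity matrix S."""
    return sum(
        x[i] * S[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


# Hypothetical vocabulary of three terms; terms 0 and 1 are related (similarity 0.5).
S = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
x = [1.0, 0.0, 0.0]  # contains only term 0
y = [0.0, 1.0, 0.0]  # contains only term 1

plain = sum(a * b for a, b in zip(x, y))  # ordinary dot product: no shared terms
soft = soft_inner_product(x, y, S)        # picks up the 0.5 similarity between terms 0 and 1
```

With an identity matrix for S the two quantities coincide, which is one way to sanity-check a term similarity matrix.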

(self, X, Y, normalized=(False, False))
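The `normalized` argument controls how each operand is scaled before the product is taken. As a hedged illustration of the `(True, True)` setting, which the docstring associates with the soft cosine measure, the sketch below divides x^T S y by the norms of x and y measured in the same basis. The helper names and toy data are hypothetical, plain lists stand in for the library's arrays, and the `'maintain'` mode is not covered.

```python
import math


def bilinear(x, S, y):
    """Return x^T S y for dense vectors and a square matrix given as nested lists."""
    return sum(
        x[i] * S[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


def soft_cosine(x, y, S):
    """Soft cosine: x^T S y scaled by the S-norms of x and y (assumes both are non-zero)."""
    norm_x = math.sqrt(bilinear(x, S, x))
    norm_y = math.sqrt(bilinear(y, S, y))
    return bilinear(x, S, y) / (norm_x * norm_y)


# Hypothetical similarity matrix: terms 0 and 1 are related, term 2 is unrelated.
S = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
x = [1.0, 1.0, 0.0]
y = [0.0, 1.0, 1.0]

similarity = soft_cosine(x, y, S)       # 1.5 / sqrt(3 * 2)
self_similarity = soft_cosine(x, x, S)  # a vector compared with itself
```

Normalizing both sides this way bounds the self-similarity at 1, whereas the default `(False, False)` leaves the raw magnitudes in the result.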


        self.matrix = source.tocsc()

    def inner_product(self, X, Y, normalized=(False, False)):
        """Get the inner product(s) between real vectors / corpora X and Y.

        Return the inner product(s) between real vectors / corpora X and Y expressed in a
        non-orthogonal normalized basis, where the dot product between the basis vectors is given by
        the sparse term similarity matrix.

        Parameters
        ----------
        X : list of (int, float) or iterable of list of (int, float)
            A query vector / corpus in the sparse bag-of-words format.
        Y : list of (int, float) or iterable of list of (int, float)
            A document vector / corpus in the sparse bag-of-words format.
        normalized : tuple of {True, False, 'maintain'}, optional
            First/second value specifies whether the query/document vectors in the inner product
            will be L2-normalized (True; corresponds to the soft cosine measure), maintain their
            L2-norm during change of basis ('maintain'; corresponds to query expansion with partial
            membership), or kept as-is (False; corresponds to query expansion; default).

        Returns
        -------
        `self.matrix.dtype`,  `scipy.sparse.csr_matrix`, or :class:`numpy.matrix`
            The inner product(s) between `X` and `Y`.

        References
        ----------
        The soft cosine measure was perhaps first described by [sidorovetal14]_.
        Further notes on the efficient implementation of the soft cosine measure are described by
        [novotny18]_.

        .. [sidorovetal14] Grigori Sidorov et al., "Soft Similarity and Soft Cosine Measure: Similarity
           of Features in Vector Space Model", 2014, http://www.cys.cic.ipn.mx/ojs/index.php/CyS/article/view/2043/1921.

        .. [novotny18] Vít Novotný, "Implementation Notes for the Soft Cosine Measure", 2018,
           http://dx.doi.org/10.1145/3269206.3269317.

        """
        # An empty operand yields a zero of the matrix's scalar type.
        if not X or not Y:
            return self.matrix.dtype.type(0.0)

        normalized_X, normalized_Y = normalized
        valid_normalized_values = (True, False, 'maintain')

        if normalized_X not in valid_normalized_values:
            raise ValueError('{} is not a valid value of normalize'.format(normalized_X))
        if normalized_Y not in valid_normalized_values:
            raise ValueError('{} is not a valid value of normalize'.format(normalized_Y))

        is_corpus_X, X = is_corpus(X)
        is_corpus_Y, Y = is_corpus(Y)

        # Both inputs are single vectors: align them over the union of their term ids.
        if not is_corpus_X and not is_corpus_Y:
            X = dict(X)
            Y = dict(Y)
            word_indices = np.array(sorted(set(chain(X, Y))))
            dtype = self.matrix.dtype
            X = np.array([X[i] if i in X else 0 for i in word_indices], dtype=dtype)
            Y = np.array([Y[i] if i in Y else 0 for i in word_indices], dtype=dtype)
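The single-vector branch at the end of the listing turns two sparse bag-of-words vectors into dense vectors that share one index space: the sorted union of the term ids present in either input. The sketch below mirrors that alignment step with plain Python lists in place of NumPy arrays. The function name `align_bow` and the sample vectors are hypothetical, and the sketch deliberately stops where the listing stops.

```python
from itertools import chain


def align_bow(x_bow, y_bow):
    """Align two sparse (term_id, weight) vectors over the union of their term ids."""
    x, y = dict(x_bow), dict(y_bow)
    # Iterating a dict yields its keys, so chain(x, y) walks every term id in either vector.
    word_indices = sorted(set(chain(x, y)))
    # Terms missing from a vector get weight 0.0 in its dense form.
    x_dense = [x.get(i, 0.0) for i in word_indices]
    y_dense = [y.get(i, 0.0) for i in word_indices]
    return word_indices, x_dense, y_dense


word_indices, x_dense, y_dense = align_bow(
    [(2, 1.0), (7, 3.0)],
    [(7, 2.0), (9, 0.5)],
)
# word_indices == [2, 7, 9]
# x_dense == [1.0, 3.0, 0.0]
# y_dense == [0.0, 2.0, 0.5]
```

Restricting the dense vectors to the terms that actually occur keeps them short even when the vocabulary is large, since only the matching rows and columns of the similarity matrix can contribute to the product.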

Calls 5

is_corpus · Function · 0.90
corpus2csc · Function · 0.90
_normalize_dense_vector · Function · 0.85
_normalize_dense_corpus · Function · 0.85
_normalize_sparse_corpus · Function · 0.85