hub / github.com/TheAlgorithms/Python / similarity_search

Function similarity_search

machine_learning/similarity_search.py:37–140  ·  view source on GitHub ↗

:param dataset: Set containing the vectors. Should be an ndarray.
:param value_array: Vector or vectors whose nearest vector in dataset we want to find.
:return: A list containing, for each query vector, (1) the nearest vector and (2) the distance to it.

(
    dataset: np.ndarray, value_array: np.ndarray
)

Source from the content-addressed store, hash-verified

def similarity_search(
    dataset: np.ndarray, value_array: np.ndarray
) -> list[list[list[float] | float]]:
    """
    :param dataset: Set containing the vectors. Should be ndarray.
    :param value_array: vector/vectors we want to know the nearest vector from dataset.
    :return: Result will be a list containing
        1. the nearest vector
        2. distance from the vector

    >>> dataset = np.array([[0], [1], [2]])
    >>> value_array = np.array([[0]])
    >>> similarity_search(dataset, value_array)
    [[[0], 0.0]]

    >>> dataset = np.array([[0, 0], [1, 1], [2, 2]])
    >>> value_array = np.array([[0, 1]])
    >>> similarity_search(dataset, value_array)
    [[[0, 0], 1.0]]

    >>> dataset = np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2]])
    >>> value_array = np.array([[0, 0, 1]])
    >>> similarity_search(dataset, value_array)
    [[[0, 0, 0], 1.0]]

    >>> dataset = np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2]])
    >>> value_array = np.array([[0, 0, 0], [0, 0, 1]])
    >>> similarity_search(dataset, value_array)
    [[[0, 0, 0], 0.0], [[0, 0, 0], 1.0]]

    These are the errors that might occur:

    1. If dimensions are different.
    For example, dataset has 2d array and value_array has 1d array:
    >>> dataset = np.array([[1]])
    >>> value_array = np.array([1])
    >>> similarity_search(dataset, value_array)
    Traceback (most recent call last):
    ...
    ValueError: Wrong input data's dimensions... dataset : 2, value_array : 1

    2. If data's shapes are different.
    For example, dataset has shape of (3, 2) and value_array has (2, 3).
    We are expecting same shapes of two arrays, so it is wrong.
    >>> dataset = np.array([[0, 0], [1, 1], [2, 2]])
    >>> value_array = np.array([[0, 0, 0], [0, 0, 1]])
    >>> similarity_search(dataset, value_array)
    Traceback (most recent call last):
    ...
    ValueError: Wrong input data's shape... dataset : 2, value_array : 3

    3. If data types are different.
    When trying to compare, we are expecting same types so they should be same.
    If not, it'll come up with errors.
    >>> dataset = np.array([[0, 0], [1, 1], [2, 2]], dtype=np.float32)
    >>> value_array = np.array([[0, 0], [0, 1]], dtype=np.int32)
    >>> similarity_search(dataset, value_array) # doctest: +NORMALIZE_WHITESPACE
    Traceback (most recent call last):
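
The listing is cut off before the function body, so the following is only a minimal sketch of the behaviour the docstring describes, not the repository's implementation. The name `nearest_vectors_sketch` is hypothetical, the restriction to 2-D inputs is an assumption, and so are the exception type and message used for the data type mismatch (the docstring's third example is truncated before the error line). The two `ValueError` messages follow the doctests shown above.

```python
import numpy as np


def nearest_vectors_sketch(dataset: np.ndarray, value_array: np.ndarray) -> list:
    """Hypothetical stand-in for similarity_search; assumes 2-D inputs."""
    # Check 1: both arrays must have the same number of dimensions.
    if dataset.ndim != value_array.ndim:
        raise ValueError(
            "Wrong input data's dimensions... "
            f"dataset : {dataset.ndim}, value_array : {value_array.ndim}"
        )
    # Assumption of this sketch: only 2-D arrays (rows are vectors) are handled.
    if dataset.ndim != 2:
        raise ValueError("this sketch only handles 2-D arrays")
    # Check 2: the vectors must have the same length.
    if dataset.shape[1] != value_array.shape[1]:
        raise ValueError(
            "Wrong input data's shape... "
            f"dataset : {dataset.shape[1]}, value_array : {value_array.shape[1]}"
        )
    # Check 3: dtypes must match. TypeError and this message are assumptions.
    if dataset.dtype != value_array.dtype:
        raise TypeError(
            f"dtype mismatch: dataset is {dataset.dtype}, "
            f"value_array is {value_array.dtype}"
        )

    # Pairwise Euclidean distances via broadcasting,
    # giving an array of shape (n_queries, n_dataset_vectors).
    diffs = value_array[:, np.newaxis, :] - dataset[np.newaxis, :, :]
    distances = np.sqrt((diffs**2).sum(axis=2))

    # For each query, pick the dataset row with the smallest distance.
    nearest = distances.argmin(axis=1)
    return [
        [dataset[i].tolist(), float(distances[q, i])]
        for q, i in enumerate(nearest)
    ]


dataset = np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2]])
value_array = np.array([[0, 0, 0], [0, 0, 1]])
print(nearest_vectors_sketch(dataset, value_array))
# [[[0, 0, 0], 0.0], [[0, 0, 0], 1.0]]
```

The sketch computes all distances at once with broadcasting rather than looping and appending, which is a design choice of this example and may differ from how the original function builds its result.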

Callers

nothing calls this directly

Calls 2

euclidean (Function) · 0.70
append (Method) · 0.45

Tested by

no test coverage detected