Cosine similarity measures how similar two vectors are by calculating the cosine of the angle between them. The cosine function equals 1 at 0 degrees and -1 at 180 degrees, so the similarity reaches its maximum of 1 for two vectors pointing in the same direction and its minimum of -1 for two vectors pointing in exactly opposite directions.

In this article, we will calculate the cosine similarity between two lists of equal size.
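As a library-free illustration of the definition above, the following sketch computes the cosine similarity directly from the dot product and the vector lengths. The helper name `cosine_similarity` and the sample vectors are assumptions made for this example and do not come from any of the modules discussed in this article.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length lists."""
    if len(u) != len(v):
        raise ValueError("both lists must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


same = cosine_similarity([1, 2, 3], [2, 4, 6])         # same direction
opposite = cosine_similarity([1, 2, 3], [-1, -2, -3])  # opposite direction
orthogonal = cosine_similarity([1, 0], [0, 1])         # right angle
print(same, opposite, orthogonal)
```

The three calls should give values close to 1, -1, and 0 respectively, up to floating-point rounding.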

Use the `scipy` Module to Calculate the Cosine Similarity Between Two Lists in Python

The `spatial.distance.cosine()` function from the `scipy` module calculates the cosine distance rather than the cosine similarity. Because the cosine distance is defined as 1 minus the cosine similarity, we can recover the similarity by subtracting the distance from 1.

For example,

```python
from scipy import spatial

List1 = [4, 47, 8, 3]
List2 = [3, 52, 12, 16]

# cosine() returns the distance, so subtract it from 1 to get the similarity
result = 1 - spatial.distance.cosine(List1, List2)
print(result)
```

Output:

```
0.9720951480078084
```

Use the `NumPy` Module to Calculate the Cosine Similarity Between Two Lists in Python

The `numpy.dot()` function calculates the dot product of the two vectors passed as parameters. The `numpy.linalg.norm()` function returns the vector norm.

We can combine these functions to calculate the cosine similarity: divide the dot product of the two vectors by the product of their norms.

For example,

```python
from numpy import dot
from numpy.linalg import norm

List1 = [4, 47, 8, 3]
List2 = [3, 52, 12, 16]

# dot product divided by the product of the two vector norms
result = dot(List1, List2) / (norm(List1) * norm(List2))
print(result)
```

Output:

```
0.9720951480078084
```

If we have several vectors and a single query vector, we can calculate all the cosine similarities at once with the following code.

```python
import numpy as np

List1 = np.array([[4, 45, 8, 4], [2, 23, 6, 4]])
List2 = np.array([2, 54, 13, 15])

# axis=1 computes one norm per row of List1
similarity_scores = List1.dot(List2) / (
    np.linalg.norm(List1, axis=1) * np.linalg.norm(List2)
)
print(similarity_scores)
```

Output:

```
[0.98143311 0.99398975]
```
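The same idea extends to several query vectors at once. The sketch below is only an illustration: the names `vectors` and `queries` and the second query row are assumptions added for this example. It scales every row to unit length and then uses a matrix product, so that each entry of the result is the cosine similarity of one pair.

```python
import numpy as np

vectors = np.array([[4, 45, 8, 4], [2, 23, 6, 4]], dtype=float)
# Second query row is hypothetical, added only for illustration.
queries = np.array([[2, 54, 13, 15], [4, 47, 8, 3]], dtype=float)

# Scale every row to unit length; this assumes no row is all zeros.
vectors_unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
queries_unit = queries / np.linalg.norm(queries, axis=1, keepdims=True)

# similarity_matrix[i, j] compares vectors[i] with queries[j]
similarity_matrix = vectors_unit @ queries_unit.T
print(similarity_matrix.shape)
print(similarity_matrix)
```

Normalizing first keeps the final step a single matrix product, which tends to be convenient when the number of vectors grows.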

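Use the `sklearn` Module to Calculate the Cosine Similarity Between Two Lists in Python

The `sklearn` module provides a built-in function called `cosine_similarity()` to calculate the cosine similarity. It expects two-dimensional inputs, so each one-dimensional array is reshaped into a single row before it is passed in.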

See the code below.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

A = np.array([10, 3])
B = np.array([8, 7])

# reshape(1, -1) turns each 1-D array into a single-row 2-D array
result = cosine_similarity(A.reshape(1, -1), B.reshape(1, -1))
print(result)
```

Output:

```
[[0.91005765]]
```
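Because `cosine_similarity()` works on rows, it can also compare several vectors with a query in a single call. The following sketch assumes the variable names `vectors` and `query`, which are chosen for illustration, and also calls `cosine_distances()` from the same module to show that the distance is 1 minus the similarity.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, cosine_distances

vectors = np.array([[4, 45, 8, 4], [2, 23, 6, 4]])
query = np.array([[2, 54, 13, 15]])  # already 2-D: one row, four features

# One row per vector, one column per query row
similarities = cosine_similarity(vectors, query)
distances = cosine_distances(vectors, query)

print(similarities)
print(distances)
```

Each entry of `distances` should equal 1 minus the matching entry of `similarities`, up to floating-point rounding.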

Use the `torch` Module to Calculate the Cosine Similarity Between Two Lists in Python

When we work with tensors, including batches of shape (m, n), we can use the `cosine_similarity()` function from the `torch` module, available as `torch.nn.functional.cosine_similarity()`, to find the cosine similarity.

For example,

```python
import torch
import torch.nn.functional as F

t1 = [3, 45, 6, 8]
a = torch.FloatTensor(t1)

t2 = [4, 54, 3, 7]
b = torch.FloatTensor(t2)

# dim=0 because both tensors are one-dimensional
result = F.cosine_similarity(a, b, dim=0)
print(result)
```

Output:

```
tensor(0.9960)
```

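The lists are converted into tensors using the `torch.FloatTensor()` constructor. The `dim=0` argument tells the function to compute the similarity along the only dimension of these one-dimensional tensors.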
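For a batch of shape (m, n), a similar call can compare every row with one query tensor. This is a minimal sketch under the assumption that the query is broadcast against the batch. The names `batch` and `query` are illustrative, and the values are reused from the NumPy example.

```python
import torch
import torch.nn.functional as F

batch = torch.tensor([[4.0, 45.0, 8.0, 4.0], [2.0, 23.0, 6.0, 4.0]])  # shape (2, 4)
query = torch.tensor([2.0, 54.0, 13.0, 15.0])                          # shape (4,)

# unsqueeze(0) makes the query shape (1, 4) so it broadcasts over the rows;
# dim=1 computes one similarity per row of the batch.
scores = F.cosine_similarity(batch, query.unsqueeze(0), dim=1)
print(scores)
```

The result holds one score per row of `batch`, so its shape is (2,).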