How To Split Sentence Into Words in Python

This tutorial will discuss the methods to split a sentence into a list of words in Python.

Split Sentence Into Words With the str.split() Function in Python

The str.split() function in Python takes a separator as an input parameter and splits the calling string into multiple strings based on that separator. If we don’t specify a separator, str.split() splits the string on runs of whitespace. The following code snippet shows us how to split a sentence into a list of words with the str.split() function.

sentence = "This is a sentence"
words = sentence.split()
print(words)

Output:

['This', 'is', 'a', 'sentence'] 

We declared a string variable sentence that holds the text to split. We then split it into a list of strings with sentence.split() and stored the result in the words list. The str.split() function is the easiest way to convert a sentence into a list of words in Python.
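To make the separator behavior concrete, here is a minimal sketch contrasting the default whitespace split with an explicit separator. The sample strings and variable names are illustrative assumptions, not part of the original example.

```python
# Default: split on runs of whitespace; leading and trailing whitespace is ignored.
spaced = "  This   is a\tsentence\n"
default_words = spaced.split()
print(default_words)  # ['This', 'is', 'a', 'sentence']

# Explicit separator: split on every comma.
comma_line = "This,is,a,sentence"
comma_words = comma_line.split(",")
print(comma_words)  # ['This', 'is', 'a', 'sentence']

# With an explicit separator, consecutive separators yield empty strings.
single_space_words = "a  b".split(" ")
print(single_space_words)  # ['a', '', 'b']
```

For ordinary sentences, calling split() with no argument is usually the safer choice, because it does not produce empty strings when words are separated by more than one space.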

Split Sentence Into Words With List Comprehensions in Python

We can also use list comprehensions to split a sentence into a list of words. This approach isn’t as straightforward as calling str.split() directly, but it lets us perform an operation on each word as the list is built. The operations could range from appending something to each word to removing something from each word. The following code snippet shows us how to split a sentence into words with a list comprehension and the str.split() function.

sentence = "This is a sentence"
words = [word for word in sentence.split()]
print(words)

Output:

['This', 'is', 'a', 'sentence'] 

We declared a string variable sentence that holds the text to split. We then split it into a list of strings with a list comprehension and stored the result in the words list. As written, the comprehension returns the same list as sentence.split(); the method becomes useful when we modify each word before storing it in the words list.
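As a sketch of the kind of per-word operation described above, the following example strips surrounding punctuation from each word and lowercases it. The sample sentence and the choice of operations are assumptions made for illustration.

```python
import string

sentence = "Hello, world! This is a sentence."

# Split on whitespace, then clean each word inside the comprehension:
# strip() removes punctuation only from the ends of a word, not from the middle.
words = [word.strip(string.punctuation).lower() for word in sentence.split()]
print(words)  # ['hello', 'world', 'this', 'is', 'a', 'sentence']
```

Any expression that takes a word and returns a value can replace the strip-and-lowercase step, for example word.upper() or word + "!".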

Split Sentence Into Words With the nltk Library in Python

The nltk library, or Natural Language Toolkit, is used for text processing in Python. It is an external library, so we have to install it before using it. The command to install the Natural Language Toolkit is given below.

pip install nltk 

Once installed, we have to download the punkt package using the nltk.download() function. This step is shown in the following code snippet.

import nltk
nltk.download('punkt')

Output:

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.

The word_tokenize() function inside the nltk library can be used to solve our specific problem. This function takes a string as an input parameter and splits it into multiple substrings. The following code snippet shows us how to split a sentence into a list of words with the nltk.word_tokenize() function.

import nltk

sentence = "This is a sentence"
words = nltk.word_tokenize(sentence)
print(words)

Output:

['This', 'is', 'a', 'sentence'] 

We split our sentence string into a list of words with the nltk.word_tokenize(sentence) function and stored the result in the words list. In the end, we displayed the elements of the words list.

The str.split() method is the simplest way to solve this specific problem, but it only splits on the separator and gives us no chance to modify the words as they are produced. The other two methods are useful when we want more: a list comprehension lets us transform each word while building the list, and nltk.word_tokenize() also separates punctuation marks into their own tokens.