Data Science Terminology

Natural Language Processing: Acquainting Machines with Human Language


 

Imagine you’ve woken up on a fine summer’s day in the middle of July. 

The sun is beaming through your window and whilst you stretch the morning cramps out, you call out: 

You: “Hey Siri” 

Siri: “Good Morning, [insert your name]. How may I help you?”

You: “What’s the time?” 

Siri: “The time is 6am GMT” 

You realize you’ve woken up 2 hours before your alarm so you pull the covers back over your head in an attempt to get more sleep. However, the thought dawns on you - “maybe I could get ahead”:

You: “Hey Siri” 

Siri: “Good Morning, [insert your name]. How may I help you?”

You: “What’s my schedule for today?” 

Siri: “You have a follow-up meeting with a potential client at 3pm, you have football with the kids at 5.30pm, and you are cooking today at 7.30pm.”

This form of communication with technology has become more prominent with the advent of tools such as Amazon’s Alexa, Google Home, Microsoft Cortana, and Apple’s Siri. Thanks to these intermediaries, we can communicate with a computer in what is known as natural language - essentially the language we humans communicate in. If the natural language we speak to computers were not translated, we’d have to talk to them in binary. Literally, 0’s and 1’s. Representing language as binary data is the easy part, though; Natural Language Processing (NLP) is the set of techniques that allows computers to understand what that language means.

Therefore, NLP may be described as an area of computer science that deals with techniques to analyze, model, and understand human language.  
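To make the earlier point about binary concrete, here is a minimal sketch of how a piece of text can be turned into 0’s and 1’s. The helper name to_bits is our own, and UTF-8 is assumed as the encoding. Note that nothing here "understands" the text; it only stores it, which is exactly the gap NLP sets out to fill.

```python
def to_bits(text):
    """Return the UTF-8 bytes of `text` as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


# "H" is byte 72 and "i" is byte 105 in UTF-8.
print(to_bits("Hi"))  # 01001000 01101001
```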


Real-World Examples of NLP

Since natural language is a prominent part of human existence, NLP plays a crucial role in a wide range of software applications that we use in our day-to-day lives. Some examples of real-world NLP use cases are: 

  • Email: Email platforms such as Gmail and Microsoft Outlook leverage NLP extensively to provide a valuable range of features, including spam classification, priority inboxing, auto-complete, and event extraction. 
  • Voice Assistants: Above we gave an extensive example of humans communicating with machines via voice assistants. These voice assistants depend on a variety of NLP techniques to interact with the user, understand their commands, and respond accordingly. 
  • Translation: As the world has become more interconnected, more people are taking the initiative to experience cultures they’ve never encountered before. The major hiccup is the language barrier between cultures, but with tools like Google Translate we can translate our native tongue into a foreign one with a few clicks. 
  • Search Engines: Search engines are a cornerstone of the internet, and they use NLP heavily for tasks such as query understanding, question answering, information retrieval, and the ranking and grouping of results. 
  • Autocorrect: NLP forms the foundation of various spelling and grammar correction tools, such as Grammarly and built-in spell-checkers. 

If you think this is impressive, you’d be surprised to know that we haven’t even scratched the surface of NLP’s use cases. They are spreading across more and more domains, and new applications are being built as you read this. 

The Different NLP Tasks

With a flurry of different use cases comes a collection of fundamental tasks that crop up frequently across NLP projects. These tasks include: 

Text Classification

The goal of text classification (also known as text tagging or text categorization) is to bucket a piece of text into one of a set of known categories based on its content. This is probably the most common task in NLP, as it’s used in many tools; the spam classification mentioned above is one example. 
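As a rough illustration of bucketing text into known categories, the sketch below trains a tiny Naive Bayes classifier on four made-up messages. Naive Bayes is only one of many possible techniques, and the training data, the "spam"/"ham" labels, and the function names are all assumptions for the example, not something a real email provider is known to use.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count words per label and documents per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest Naive Bayes log-probability."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the log prior: how common this label is overall.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            numerator = word_counts[label][word] + 1
            score += math.log(numerator / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data, purely for illustration.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting with the client at 3pm", "ham"),
    ("football with the kids tonight", "ham"),
]

word_counts, label_counts = train(TRAINING)
print(classify("claim your free prize", word_counts, label_counts))
print(classify("client meeting tonight", word_counts, label_counts))
```

With so little data the model is only a toy, but the shape is the same at scale: learn word statistics per category, then pick the category that best explains a new piece of text.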

Text Summarization 

The task of text summarization is to generate a short, concise and meaningful summary of longer documents such as books, blog posts, news articles, etc.  
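One simple way to approach this is extractive summarization: score each sentence and keep the highest-scoring ones. The sketch below scores sentences by the average frequency of their words. The stopword list, the scoring rule, and the function name summarize are assumptions chosen for brevity; production summarizers are considerably more sophisticated.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption for this example).
STOPWORDS = {"the", "to", "was", "a", "an", "of", "and", "is", "in"}


def content_words(text):
    """Return lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Keep the sentences whose words are, on average, most frequent in the text."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    frequency = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        return sum(frequency[w] for w in words) / (len(words) or 1)

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Emit the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in top)


TEXT = (
    "NLP helps computers understand human language. "
    "Voice assistants use NLP to understand spoken language. "
    "The weather was nice yesterday."
)
print(summarize(TEXT, max_sentences=1))
```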

Topic Modelling

This is an unsupervised machine learning technique that is used to uncover the topical structure of a large collection of documents. 

Information Extraction

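As the name implies, the task of information extraction is to pull the relevant pieces of information out of text. The event extraction that email platforms perform, such as picking a meeting and its time out of a message, is one example. 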
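To give a feel for it, the sketch below pulls (event, time) pairs out of the schedule sentence from the opening dialogue using a regular expression. The pattern and the function name extract_events are assumptions tailored to that one sentence; real systems rely on trained models rather than hand-written rules like this.

```python
import re

# Matches "have <event> at <time>" or "are <event> at <time>",
# where <time> looks like 3pm, 5.30pm or 7:30am.
EVENT_PATTERN = re.compile(
    r"\b(?:have|are)\s+(.+?)\s+at\s+(\d{1,2}(?:[.:]\d{2})?\s?(?:am|pm))\b",
    re.IGNORECASE,
)


def extract_events(text):
    """Return a list of (event, time) tuples found in `text`."""
    return EVENT_PATTERN.findall(text)


SCHEDULE = (
    "You have a follow-up meeting with a potential client at 3pm, "
    "you have football with the kids at 5.30pm, "
    "and you are cooking today at 7.30pm"
)

for event, time in extract_events(SCHEDULE):
    print(f"{time}: {event}")
```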

Information Retrieval 

Information retrieval is the process of obtaining information system resources that are relevant to an information need from a collection of those resources [Source: Wikipedia]. A well-known application of information retrieval is Google search. 
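A minimal sketch of the idea: given a query, score every document by how well its words match and return the best matches first. The version below uses a basic TF-IDF weighting, which is a common textbook choice and not a description of how Google search works. The documents and the function name rank are made up for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def rank(query, documents):
    """Return the documents that match the query, best match first."""
    tokenized = [tokenize(doc) for doc in documents]
    # Document frequency: in how many documents does each term appear?
    document_frequency = Counter()
    for tokens in tokenized:
        document_frequency.update(set(tokens))

    scored = []
    for index, tokens in enumerate(tokenized):
        term_frequency = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term in term_frequency:
                # Rarer terms get a higher weight (inverse document frequency).
                idf = math.log(len(documents) / document_frequency[term]) + 1.0
                score += (term_frequency[term] / len(tokens)) * idf
        scored.append((score, index))

    # Highest score first; ties keep the original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [documents[index] for score, index in scored if score > 0]


DOCUMENTS = [
    "spam filters classify unwanted email",
    "voice assistants answer spoken questions",
    "search engines rank web pages",
    "email search finds old messages",
]
print(rank("email spam", DOCUMENTS))
```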

Language Modelling

Language modelling leverages various statistical and probabilistic techniques to predict the next word, or sequence of words, from the words that came before. 
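The simplest version of this is a bigram model, which predicts the next word from the single word before it by counting which words followed it in the training text. The sketch below is a toy under that assumption: the three-sentence corpus and the function names are invented for illustration, and modern language models use far longer histories.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for every word, which words follow it and how often."""
    following = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for previous, current in zip(tokens, tokens[1:]):
            following[previous][current] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def next_word_probability(following, word, candidate):
    """Estimate P(candidate | word) from the bigram counts."""
    counts = following.get(word.lower(), Counter())
    total = sum(counts.values())
    return counts[candidate] / total if total else 0.0


CORPUS = [
    "what is the time",
    "what is my schedule",
    "what is the weather",
]
model = train_bigrams(CORPUS)
print(predict_next(model, "is"))
print(next_word_probability(model, "is", "the"))
```

Here "the" follows "is" in two of the three sentences, so the model picks it as the most likely next word.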

Conversational Agents 

This task involves building dialogue systems that use NLP to interpret a user’s query and then respond to it in natural language. 
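The exchange with Siri at the top of this post can be mimicked, very crudely, with keyword rules: work out what the user wants (the "intent"), then build a reply. The intent names, the keyword rules, and both function names below are hypothetical, and real assistants use far more robust language understanding than a substring check.

```python
from datetime import datetime


def detect_intent(utterance):
    """Map an utterance to a hypothetical intent name using keyword rules."""
    text = utterance.lower()
    if "time" in text:
        return "get_time"
    if "schedule" in text:
        return "get_schedule"
    return "unknown"


def respond(utterance, now, schedule):
    """Build a natural-language reply for the detected intent."""
    intent = detect_intent(utterance)
    if intent == "get_time":
        return f"The time is {now:%H:%M}."
    if intent == "get_schedule":
        if not schedule:
            return "You have nothing scheduled today."
        return "Today you have: " + "; ".join(schedule) + "."
    return "Sorry, I didn't understand that."


# A fixed clock and a made-up schedule keep the example repeatable.
now = datetime(2024, 7, 15, 6, 0)
schedule = ["client meeting at 3pm", "football with the kids at 5.30pm"]
print(respond("What's the time?", now, schedule))
print(respond("What's my schedule for today?", now, schedule))
```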

Question Answering 

This task involves building a system that can automatically answer a question posed in natural language. 

Machine Translation 

This task involves converting a piece of text from one language to another language. 

Wrap Up 

In summary, human language is staggeringly complex and very diverse. The rapid growth of Natural Language Processing is reflective of the need to help resolve some problems (such as the ambiguity of language) we face when using machines to break down and interpret human language.

 
