Natural language processing, or NLP, is a field of artificial intelligence that focuses on the interaction between computers and humans using natural language. NLP is a branch of AI but is really a mixture of disciplines such as linguistics, computer science, and engineering.
There are a number of approaches to NLP, ranging from rule-based modelling of human language to statistical methods. Common uses of NLP include speech recognition systems, the voice assistants available on smartphones, and chatbots.
AI is a broad field that encompasses many different areas, including robotics, computer vision, machine learning, and more. NLP specifically deals with how computers can understand, interpret, and generate human language.
It is possible to develop natural language processing systems with no machine learning. For example, a simple chatbot such as Joseph Weizenbaum’s ELIZA, which applies manually written rules to simulate a psychiatrist’s conversation, is an NLP system that contains no machine learning at all. Conversely, a supermarket chain’s machine learning system which learns from customers’ purchases and recommends future products contains no NLP at all. Both, however, belong under the umbrella of artificial intelligence.
We encounter a number of common applications of NLP every day, from the speech recognition and voice assistants on our smartphones to the chatbots on company websites.
There are a number of approaches to processing natural language, as no two NLP tasks are the same. However, we can divide NLP into two broad approaches: symbolic NLP and statistical NLP, although hybrid approaches are becoming increasingly popular.
Symbolic NLP was the dominant approach from the 1950s to the 1990s, and involved programmers coding grammar rules and ontologies into a system, cataloguing real-world and linguistic knowledge. Statistical NLP is currently the dominant approach, where machine learning algorithms such as neural networks are trained on vast corpora of data, and learn common patterns without being taught the grammar of a language.
A traditional NLP pipeline follows a series of steps to turn a sentence into something that a computer can handle. This is the approach taken by a number of widely used NLP libraries, such as spaCy and Natural Language Toolkit (NLTK), although not all steps are always present.
Tokenisation: Breaking down an input text into smaller chunks like words or sentences.
Stop-word removal: Eliminating commonly used words like “a”, “an”, “the”, and so on, which carry little information for many tasks.
Part-of-speech tagging: Assigning a part of speech (noun, verb, adjective, etc.) to each word in a text.
Named Entity Recognition: Identifying and classifying entities like people, places, and organizations in a text.
Sentiment Analysis: Identifying the overall emotion or sentiment behind a piece of text, such as positive, negative, or neutral.
Machine Learning: Using algorithms to analyse patterns in the text and learn from it.
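The first two steps above can be sketched in plain Python, with no NLP library at all. The stop-word list here is a tiny illustrative one; libraries such as spaCy and NLTK ship much larger lists and use trained models for the later steps.

```python
import re

# A tiny illustrative stop-word list; real libraries ship far larger ones.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "on"}

def tokenise(text):
    """Break the input text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def remove_stop_words(tokens):
    """Drop common words that carry little information for many tasks."""
    return [t for t in tokens if t not in STOP_WORDS]

tokens = tokenise("The cat sat on the mat")
print(tokens)                     # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(remove_stop_words(tokens))  # ['cat', 'sat', 'mat']
```

A real pipeline would then hand the remaining tokens to a part-of-speech tagger or named entity recogniser.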
However, with the advent of neural networks and deep learning techniques in NLP, these pipelines are becoming less relevant.
Convolutional neural networks (CNNs) were developed for computer vision problems, such as recognising handwritten digits on envelopes. However, in the 2010s they found widespread use in text processing. This is thanks to the invention of the Word2vec algorithm, which represents words as vectors in a high-dimensional space, so that a document can be converted into a matrix and handled as if it were an image.
The pipeline for a CNN is as follows:
Tokenisation: The input text is broken down, as in traditional NLP.
Word vectorisation: words are converted to vectors according to a lookup table, and the entire document becomes an n×m matrix.
The matrix is passed into a convolutional neural network, which can perform tasks such as document classification.
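The word vectorisation step can be sketched with NumPy as follows. The lookup table here is a made-up toy embedding with m = 3 dimensions, not trained Word2vec vectors, which typically have hundreds of dimensions.

```python
import numpy as np

# Toy lookup table: each word maps to a 3-dimensional vector (m = 3).
# Real Word2vec embeddings typically have 100-300 dimensions.
embedding = {
    "the": np.array([0.1, 0.0, 0.2]),
    "cat": np.array([0.9, 0.3, 0.1]),
    "sat": np.array([0.2, 0.8, 0.5]),
    "mat": np.array([0.8, 0.2, 0.3]),
}

def vectorise(tokens):
    """Stack one vector per token into an n x m document matrix."""
    return np.stack([embedding[t] for t in tokens])

matrix = vectorise(["the", "cat", "sat"])
print(matrix.shape)  # (3, 3): n = 3 tokens, m = 3 dimensions
```

The resulting n×m matrix is what gets fed into the convolutional network, exactly as a pixel grid would be.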
The state of the art is currently the transformer model. A transformer is a neural network which processes a sequence of tokens and calculates a vector representing each token which depends on the other tokens in the sequence. This is unlike the word vector method described in the previous section, as a word will have a different vector representation depending on its role in the sentence.
A transformer model such as BERT can transform a sentence into a single vector in high-dimensional space. Sentences with semantically similar content appear close together in the vector space.
NLP is used in a variety of business areas and industries.
| Business function | Application of NLP |
|---|---|
| Customer service | Chatbots on company websites. These reduce call centre costs and allow companies to run analytics on chat logs. |
| Customer service | Triaging incoming emails using a classifier. |
| Operations | Estimating the risk of a clinical trial protocol failing, or the cost of a repair. |
| Operations | Machine translation, e.g. Google Translate or Azure Translator. |
| Market research | Speech recognition models such as OpenAI’s Whisper, which can transcribe interviews with Key Opinion Leaders (KOLs) in pharma. |
| Operations | Document summarisation models. |
| Operations | Identifying key products or locations mentioned in a text and extracting their relationships. For example, if a doctor says “I would prescribe (DRUG) with (DRUG)”, a smart model may be able to identify the relationship between the entities. |
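The drug-relationship example in the last row can be sketched as a toy rule-based extractor. The drug lexicon and the pattern below are hypothetical; a production system would use a trained named entity recogniser rather than a regular expression.

```python
import re

# Hypothetical drug lexicon; a real system would use a trained NER model.
DRUGS = {"metformin", "insulin"}

def extract_coprescriptions(text):
    """Find 'prescribe X with Y' patterns and return (drug, drug) pairs."""
    pattern = r"prescribe\s+(\w+)\s+with\s+(\w+)"
    pairs = []
    for a, b in re.findall(pattern, text.lower()):
        if a in DRUGS and b in DRUGS:
            pairs.append((a, b))
    return pairs

print(extract_coprescriptions("I would prescribe metformin with insulin."))
# [('metformin', 'insulin')]
```

Even this crude sketch shows the two halves of the task: recognising the entities, then extracting the relationship between them.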
If you have a background in a different field and would like to get into natural language processing, there are a number of books and other resources available to help you make the move.
First of all, having an interest in languages and developing a career in NLP are different things: to get into NLP, you will need an interest in algorithms, problem solving, and linguistics.
Learn the basics of NLP: Start by acquiring a basic understanding of NLP by working through some of the resources above, such as the Stanford NLP course.
Develop strong computer science, coding, and software engineering skills: As NLP requires a lot of programming skills, proficiency in programming languages such as Python is crucial. You should also gain an understanding of the fundamentals of machine learning and deep learning.
Gain practical experience: Work on NLP projects to gain practical experience. Participate in online forums and contribute to open source projects.
Pursue advanced studies: You may consider pursuing a Master’s or Ph.D. in NLP or a related field to dive deeper into this area, if your finances and commitments permit.
Network: Attend conferences, meetups, and join online groups to network with other NLP professionals and keep up-to-date with the latest trends in the field.
Look for job opportunities: Look for NLP job openings on LinkedIn, company websites, and job boards. You can take on work as a freelancer on a platform such as Upwork to hone your skills. You could also join a company in an industry such as legal or pharmaceuticals, where there are large amounts of text data. Quite often, companies have nobody in-house who is prepared to deal with the headache of text data, so if that’s your cup of tea, you could find a very comfortable niche.
Some of the most popular NLP tools include libraries such as spaCy and NLTK, both mentioned above.
In addition, there are cloud-based LLMs such as OpenAI’s GPT-3 and Meta’s LLaMA, which are disrupting the field.
Companies doing NLP range from the big tech companies to smaller specialist firms.
There are also consultancies such as Fast Data Science.
There are a number of marketplaces to recruit freelance NLP specialists, such as Upwork or Fiverr. You can also contact me to arrange a consultation with my company Fast Data Science. My team and I have undertaken consulting projects for large private sector companies, startups, universities and non-profits. We are likely to be able to deliver value quickly and within your budget, and the initial conversation is free of charge.
The history of NLP can be traced back to the late 1940s, shortly after World War II, when scientists in the USA and Soviet Union were attempting to make machines which could translate between languages, such as English and Russian.
In 1950, the British computer scientist Alan Turing proposed a test to determine a machine’s ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human. He called his test the “Imitation Game”.
> I propose to consider the question, ‘Can machines think?’ This should begin with definitions of the meaning of the terms ‘machine’ and ‘think’.
>
> – Alan Turing, in Mind (1950).
In 1957, Noam Chomsky published Syntactic Structures, laying the foundations of his theory of Universal Grammar: the idea that the capacity for language is innate and hard-wired in the human brain, and that common rules underlie all human languages which cannot be explained simply by observing how children learn to speak.
From the 1950s to 1970s, researchers in NLP began to divide into two camps: those favouring a symbolic approach to modelling language, and those who preferred a stochastic approach. Symbolic NLP involves developing formal language rules, similar to what would be found in a Latin primer (“a sentence consists of a noun phrase followed by a verb phrase”). The polar opposite of this is stochastic NLP, where a program calculates statistics and probabilities, such as the frequency of words and pairs of words in a corpus.
In the 1970s, researchers developed formal logic-based languages such as Prolog, which could model legal questions or logical problems. Rule-based systems were developed for discourse modelling. Over the next few decades there was a gradual transition towards machine learning algorithms for NLP, due to the availability of computational power and a reduction in the importance of “purist” linguistics such as Chomsky’s theories.
By the 2000s, large amounts of text data were widely available and companies such as Google were able to build large-scale statistical translation models. In the 2010s, a further shift took place towards neural networks.
In 2013, a team at Google introduced the Word2vec algorithm, which represents words in a lexicon as points in vector space, where the distance between points is significant and corresponds to semantic similarity. Then in 2017, Vaswani et al. introduced the transformer architecture in a paper called “Attention is All You Need”, another major leap forward for the field. Transformers have fuelled the recent explosion in large language models (LLMs) such as ChatGPT.
Today, NLP has begun to be widely used in consumer electronics as well as in business. Insurance, pharma or legal firms which need to process large numbers of documents may well resort to NLP to extract structured information, cluster items, analyse customer support logs, or predict future events.
Turing, A. M. (1950). Computing Machinery and Intelligence. Mind, 59(236), 433–460.
Chomsky, N. (1986). Knowledge of Language: Its Nature, Origin, and Use. Greenwood Publishing Group.
Vaswani, A., et al. (2017). Attention is All You Need. Advances in Neural Information Processing Systems, 30.
Mikolov, T., et al. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781.