"Siri for industry is on its way"
We are no longer surprised when we see people talking to Alexa or Siri in plain English. Having a digital industry assistant is only a matter of time. But first, we need a team effort to build a corpus!
Let me start by saying that I wrote this article completely myself! Really?
Well, fooling us with AI-generated articles is one of the many ways GPT-3 has caught our attention over the past few months. A college student used GPT-3 to generate a blog post on what to do when feeling unproductive. It ended up at the top of Hacker News with more than 26,000 views, and only one person asked whether it was written by an AI. Earlier, another blog post on why GPT-3 may be the biggest thing since bitcoin went viral, mainly because the author surprised his readers in the last paragraph by revealing that all the text had been generated by GPT-3. A few months ago, The Guardian generated some publicity with a similar experiment.
There are plenty of other intriguing examples, which is why GPT-3 has created such a buzz in the AI community and even beyond. But why is it such a big deal for Natural Language Processing (NLP)?
NLP so far…
NLP can be considered a branch of AI and is all about making sense of human language. NLP originated in the 1950s, but the last decade brought a real revolution. We went from vectorizing words and analyzing word analogies (e.g. “man” is to “boy” as “woman” is to “girl”) with word2vec in 2013 to the Transformer model proposed in the “Attention Is All You Need” paper released in 2017. Transformers use attention mechanisms to gather information about the context of a given word and encode that information in the vector representing the word (e.g. I have a “date” tonight vs What “date” is it today?). Transformers are the basis for state-of-the-art language models such as BERT and GPT.
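To make that word2vec analogy tangible, here is a minimal sketch using the gensim library and its pre-trained Google News vectors; the library and model choice are my own assumptions for illustration, not something used in this article:

```python
# Reproducing the "man is to boy as woman is to girl" analogy with word2vec.
# The pre-trained Google News vectors are one possible (and large) choice.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")  # ~1.6 GB download on first run

# "man" is to "boy" as "woman" is to ...?
result = vectors.most_similar(positive=["woman", "boy"], negative=["man"], topn=1)
print(result)  # typically something like [('girl', 0.9...)]
```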
"Attention please"
Language models are the new kids on the block in NLP. They basically predict how likely one word will appear in a text, given all other words in that text (e.g. “It is [MASK] today. Let’s go to the beach”). Knowing the conditional probability of words is the basis for a variety of downstream NLP tasks such as content creation, language translation, auto-completion, question answering and text classification. Most language models are pre-trained on a large dataset (such as Wikipedia) and afterwards fine-tuned to perform a specific NLP task on a smaller dataset. This process is called transfer learning.
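To make that [MASK] example concrete, here is a small sketch using Hugging Face’s transformers library; the choice of bert-base-uncased is purely an illustrative assumption:

```python
# Masked-word prediction: the model fills in [MASK] given the surrounding words.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for prediction in fill_mask("It is [MASK] today. Let's go to the beach."):
    print(prediction["token_str"], round(prediction["score"], 3))
# likely candidates: "sunny", "hot", "warm", ...
```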
What is GPT-3?
GPT-3 is the third version of the Generative Pre-trained Transformer models developed by OpenAI, the AI specialist co-founded by Elon Musk, although he is no longer on board. Even though it brings no major breakthroughs in terms of architecture, it is considered the most powerful language model ever. Why? Because of its size!
The model has a stunning 175 billion parameters and was pre-trained on a corpus of nearly half a trillion words, mainly sourced from the internet. In fact, the model is so large that no fine-tuning is required at all. It knows so much about language that it can learn NLP tasks it has never encountered before just by being shown a few examples. This is called few-shot learning. The idea of such a general model is very tempting because it opens the path to democratizing AI and making NLP accessible to everyone.
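To give an idea of what few-shot learning looks like in practice, here is a rough sketch of a prompt sent through the (then beta) OpenAI Completion API; the engine name and the maintenance-log examples are illustrative assumptions on my part:

```python
# Few-shot prompting: the task is "taught" purely through examples in the prompt,
# without any fine-tuning. Engine name and examples are illustrative only.
import openai

openai.api_key = "sk-..."  # your beta API key

prompt = """Classify the maintenance log entry by failure mode.

Log: "oil seal leaking on drive end bearing" -> Failure mode: seal leakage
Log: "impeller vanes worn, flow dropped" -> Failure mode: impeller wear
Log: "motor trips on high temperature" -> Failure mode:"""

response = openai.Completion.create(
    engine="davinci",   # a GPT-3 engine available in the beta API
    prompt=prompt,
    max_tokens=5,
    temperature=0,
)
print(response.choices[0].text.strip())  # e.g. "overheating"
```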
Let’s get practical
The idea is tempting. Unfortunately, only a limited number of people have access to the beta API for GPT-3 for now, but subscribing to the premium version of AI Dungeon gives you a workaround that lets you catch a glimpse of what the model can really do.
Let us have a look and see how it could help maintenance professionals and operations managers in their day-to-day jobs. I gave the model a little context to warm up and asked a question about pump failure modes. The text in bold is fully generated by the AI system.
Centrifugal pumps are one of the world's most widely used type of pump, having an extensive range of applications, from food processing to water or sewage transportation. Problems that arise within these machines decrease the flow of the fluid within the pipelines, thus interrupting the production and transport of the fluid to its destination within the process.
What are the most common failure modes that are found in centrifugal pumps in the sewage industry?
A: The main cause for failures is a lack of lubrication. If there is no oil or grease on the gears, then it will not move smoothly and the machine may stop working altogether.
B: The water pressure inside the pipes may be too low. When this happens, the pump will start leaking and eventually explode.
C: There may be a problem with the bearings of the gearbox. If these are worn, then they will need to be replaced along with the oil and grease within them.
D: A plugged or dirty filter may be restricting the flow of the liquid. If the filter becomes clogged, then it needs to be cleaned or replaced.
E: If the operator does not have the correct maintenance methods, then the machine may begin to overheat and break down. This involves stopping the pump mechanism and giving it a thorough clean.
F: Sometimes, the pump will fail due to a manufacturing defect. This is extremely rare and only occurs due to poor quality control during manufacturing. If the pump is functioning correctly, but is old and showing signs of wear, then it should be replaced.
G: The highest risk of failure is down to operator error. Pumps are very heavy duty machines that can go through thousands of hours of operation and still take a long time to break down. The operator must be careful when working on and around the pump, keeping their footing firmly on the ground at all times.
You must admit the model can somehow pass the Turing test and fool you into believing it knows a thing or two about pumps. It can talk like someone who has been around the industry for years, but is this enough to become your digital buddy? Another round. I confronted the model with a slightly more specific case. Again, the bold text is generated by the AI.
A Guinean mine operator is using a Wirtgen 2200 SM surface miner with a C27 ATAAC V12 diesel engine producing 950 HP at 2100 RPM and a 2.2 m-long cutting drum for mining bauxite*. The surface miner was taken out of service and the turbocharger came into the repair shop with a broken exhaust valve. It was all red. What could be the reason of this failure?
*For the curious minds: bauxite is the main ore used for aluminum production.
The Wirtgen 2200 SM is a very reliable machine, but it has its limitations. Its main weakness is that it can only operate in one direction: up or down.
This time it makes you laugh. I had other conversations where the model initially gave a meaningful answer but ended up telling a story of two men getting out of a vehicle, wearing overalls with their faces covered by dust masks, pointing their guns at an operator dismantling a pump.
The funny thing is that GPT-3 will never tell you “I don’t know”. This reminds me of those guys at the bar who always have something to say and act as if they are experts on every topic. At first you are impressed, but are they really that smart?
"It is human if you don't know something"
I ended up with the same feeling after a couple of rounds of AI Dungeon. First you are blown away, but then you realize the model will not get the pump or turbocharger fixed. Any random person with some brains and a search engine can eventually formulate an answer to the first question without understanding how a pump works. For the second example, it is clear you need to step away from design failure modes and have a good understanding of the operating conditions before you can make a thorough fault diagnosis.
In fact, you cannot blame GPT-3 for being wrong, because it was trained on knowledge from books and the internet, not on any language that has meaning in a mining facility or on a production floor. Reading the internet is simply not enough to understand how physical assets behave in the real world.
What is in it for the industry?
"Reading the internet is not enough. We should start reading the factory"
General models are a great idea, but if we really want to democratize the use of language models for industrial companies, they should start reading the factory. Therefore, we should build a Common Crawl for industry: a large corpus with the domain-specific jargon, abbreviations, misspellings, synonyms and word associations that we typically find in maintenance records, operator and quality logs, warranty claims, asset datasheets, product manuals, etc. Company-level databases containing thousands or even millions of records are not large enough to learn the true semantics of industrial language. The effort to build such a corpus must be industry-wide.
A great example of such an effort can be found in healthcare, where libraries with medical lexicons are used in combination with NLP to extract relevant clinical information from the unstructured data found in electronic health records.
It is a small step to apply a similar methodology to maintenance logs, where the most valuable information is often stored in free-text fields, such as the symptoms and root causes associated with failure events or other problems, and the physical actions taken to repair components, machines or subsystems. This data is collected for every machine over its entire lifespan. Structuring this human knowledge into meaningful features and linking it with streaming process data opens a new door for building prescriptive maintenance applications where recommendations are based on what works well within the particular context of a factory, not on what works well in general (as we find in design failure mode libraries).
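As a purely illustrative sketch of what such structuring could look like, the snippet below tags components, symptoms and repair actions in a maintenance log line using spaCy’s PhraseMatcher; the miniature lexicon is hypothetical and would in reality have to come from the industry-wide corpus described above:

```python
# Turning a free-text maintenance log into structured features with a
# domain lexicon. The lexicon here is a toy example, not a real library.
import spacy
from spacy.matcher import PhraseMatcher

nlp = spacy.blank("en")
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")

lexicon = {
    "COMPONENT": ["bearing", "impeller", "mechanical seal", "gearbox"],
    "SYMPTOM": ["vibration", "leaking", "overheating", "low flow"],
    "ACTION": ["replaced", "realigned", "lubricated"],
}
for label, terms in lexicon.items():
    matcher.add(label, [nlp.make_doc(term) for term in terms])

log = "Pump P-101 leaking at mechanical seal, seal replaced and shaft realigned."
doc = nlp(log)
features = [(nlp.vocab.strings[match_id], doc[start:end].text)
            for match_id, start, end in matcher(doc)]
print(features)
# e.g. [('SYMPTOM', 'leaking'), ('COMPONENT', 'mechanical seal'),
#       ('ACTION', 'replaced'), ('ACTION', 'realigned')]
```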
The road ahead
If we can tap into the hidden potential of all this unstructured information and truly understand it, then we can link the ears and eyes of the production floor with the hard facts and measurements derived from historical and streaming sensor data.
Combining NLP with machine learning (ML) makes it possible to build real Human-In-The-Loop applications where actionable insights are gained from human language to support operators, technicians and engineers in their day-to-day jobs. In addition, the knowledge they have can be used to give feedback – in their own words – on those prescriptions, improving the ML system for the next challenges that arise.