It’s been more than a year since ChatGPT was launched, and in that time it has taken the enterprise Artificial Intelligence (AI) world by storm. The primary reason ChatGPT caught the attention of most enterprises is that CXOs have easy access to the tool, and it has been widely discussed in most CXO forums. Notably, OpenAI and other major companies like Meta and Google have also been consistently releasing improvements to their products over the past year. While these cumulative events are a significant breakthrough for the rapidly evolving Generative AI space, the journey for language models started several decades ago.
The roots of AI can be traced to a workshop at Dartmouth College in 1956, with many of the attendees going on to become pivotal leaders in the field of AI research. Contributions from earlier pioneers such as Alan Turing were carried forward by workshop attendees like John McCarthy and Marvin Minsky. While Turing’s transformative work on the Turing test laid the foundation for the concept of machine intelligence and creativity, the development of the first AI programs by McCarthy and Minsky set the stage for exploring the generation of human-like outputs by machines.
Early AI Systems
The early developments in AI focused on rule-based systems and expert systems, which aimed to mimic human reasoning and decision-making processes. One of the pioneering examples of AI during this era was the SHRDLU system, which was developed by Terry Winograd in the late 1960s. SHRDLU demonstrated the ability to understand and manipulate objects in a virtual block world, which was an early advance in language understanding and generation.
Another significant advance in this early period was the development of expert systems such as MYCIN, an AI system designed to diagnose bacterial infections and recommend antibiotic treatments. These early systems showcased the potential of machines to perform complex cognitive tasks and create human-like outputs in specific domains.
The Rise of Neural Networks
Neural networks sparked a new era of AI, enabling machines to learn and generate complex patterns and outputs. The development of deep learning algorithms and the availability of large-scale datasets paved the way for generative models that could produce realistic images, text, and audio. Word2Vec, published in 2013, leveraged the power of neural networks to learn word associations, which was a significant advance in language processing.
Another key advancement in this era was the development of Generative Adversarial Networks (GANs) by Ian Goodfellow and his colleagues in 2014. GANs introduced a novel framework for training generative models by pitting two neural networks against each other: a generator and a discriminator. This adversarial training process enabled GANs to produce high-quality synthetic data, leading to significant advances in generative image synthesis and manipulation.
Transformers
Transformers were introduced in the now-famous 2017 paper Attention Is All You Need (https://research.google/pubs/pub46201/) from Google researchers. The key innovation in Transformers is their reliance on the attention mechanism, which allows the model to focus on the most relevant parts of the input sequence when making predictions. This is done through the “self-attention” mechanism, where each element in the input sequence weighs the importance of every other element. This enables Transformers to capture long-range dependencies and preserve the relevant context across long text inputs. This was a major step forward for many natural language processing tasks, such as machine translation, text summarization, and language understanding, because it handled long sequences more effectively than earlier approaches.
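To make the self-attention idea more concrete, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not the implementation from the paper: the function names and the toy embeddings are made up, and the learned query, key and value projections used in real Transformers are omitted, so each element is compared directly with the others.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with queries = keys = values = tokens.

    Simplifying assumption: a real Transformer first multiplies the inputs by
    learned query, key and value matrices; those are left out here.
    """
    dim = len(tokens[0])
    outputs, all_weights = [], []
    for query in tokens:
        # Score this element against every element in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        # The output is the weighted average of all the vectors.
        mixed = [sum(w * value[i] for w, value in zip(weights, tokens))
                 for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Made-up 2-dimensional embeddings for a three-element sequence.
toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention_weights = self_attention(toy_tokens)
```

Each row of `attention_weights` shows how strongly one element attends to every element in the sequence, and each row of `attended` is the resulting context-aware mix of the inputs.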
The Transformer architecture is the fundamental building block of most modern LLMs. With it, models like GPT (Generative Pre-trained Transformer) can generate more accurate and contextually relevant output. Large Language Models like GPT are trained on massive amounts of text data, which results in internally learnt weights called parameters that are used to predict the next token in the sequence, i.e. the next word or word fragment. In general, the larger the number of parameters, the greater the perceived creativity and accuracy of the model.
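The idea of learning parameters from text and then using them to predict the next token can be illustrated with a deliberately tiny stand-in. The sketch below is a bigram count model, not a Transformer or GPT: its "parameters" are simply counts of which word follows which, tokens are assumed to be whole words, and the function names and training sentence are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows another; the counts act as the parameters."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if none was seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(counts, start, max_tokens=5):
    """Greedily extend `start` by repeatedly appending the predicted next token."""
    sequence = [start]
    for _ in range(max_tokens):
        next_word = predict_next(counts, sequence[-1])
        if next_word is None:
            break
        sequence.append(next_word)
    return sequence

# Made-up training text for illustration only.
corpus = "predict the next token then predict the next token again"
model = train_bigram(corpus)
print(generate(model, "predict", max_tokens=3))
```

An LLM follows the same predict-and-append loop, but replaces the count table with billions of learned weights and conditions on the whole preceding context rather than only the previous word.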
The Hype and What Is Next
In summary, it has been a continuum of research that has brought these models to the fore. What has kept ChatGPT continuously relevant to the enterprise community and the general public at large is the widespread availability and ease of use of the tool. A major area of interest is the perceived creativity and accuracy of the language generation, which has left many creative professionals feeling that their revenue streams, and potentially even their livelihoods, will eventually come under threat.
While there has been a lot of CXO-level excitement about Generative AI, the key question remains: what are the truly successful enterprise use cases of Generative AI that we’ve seen? What has clearly not worked is the rounding up and dropping of all the organizational initiatives around data engineering and analytics in favour of new investments in Generative AI.
On the other hand, what has proven to be successful, and what enterprises should focus on, is a careful approach that leverages the power of these Gen AI tools for the appropriate business context, that is, deciding which hammer to use for which nail. We have seen various companies buy into the hype and deploy Gen AI tools for business problems such as predicting revenues or understanding what is happening with key company performance metrics, which are not necessarily the right fit for these tools.
At Prescience Decision Solutions, we approach analytics techniques as a continuum of what delivers maximum value based on the business need and available data. Similarly, organizations need to prioritize their business use cases and then adopt the most appropriate techniques for data analysis which include:
- Deterministic BI
- Descriptive analytics
- Statistical analysis
- Machine learning
- Systemic intelligence / AI
- Generative AI
Cassie Kozyrkov, the former Chief Decision Scientist at Google, put this in perspective in her 2018 blog post (https://kozyrkov.medium.com/what-on-earth-is-data-science-eb1237d8cb37). In the years since, nothing has really changed from that line of thinking. Specifically on Gen AI tools, enterprises have significantly improved their productivity by tackling the right use cases, such as text processing and summarization, where relevant data from disparate enterprise systems has been brought into a single window for faster data-driven decision making.
Overall, the widely adopted enterprise Generative AI use cases can be grouped into the following areas:
- Knowledge management
- Productivity improvement
- Digital assistants for various jobs
- Code generation and testing
- Customer service and engagement
- Accelerating research and discovery
In Parts 2 and 3 of this series, we will explore some successful use cases in detail.
If you are looking for a starting point for your business, take advantage of our free, personalized consultation workshop. Sign up here.
Shivakumar is a keen follower of scientific trends and an Asimov fan. He believes solid execution is the key to the success of any strategy and is focused on building a world-class data science team at Prescience. He has a B.Tech from IIT Delhi and an MBA from IIM Lucknow, with 20+ years of experience in the technology space.