Two years after ChatGPT’s release, conversations about AI are inevitable as companies across industries look to leverage large language models (LLMs) to transform their business processes. But as powerful and promising as LLMs are, many business and IT leaders have become overly reliant on them, overlooking their limitations. For this reason, I anticipate a future in which specialized language models (SLMs) will play a larger, complementary role in enterprise IT.
SLMs are commonly referred to as “small language models” because they are leaner versions of LLMs that require less data and training time. But I prefer the word “specialized” because it better captures the ability of these purpose-built solutions to perform highly specialized work with greater accuracy, consistency, and transparency than LLMs. By complementing LLMs with SLMs, companies can create solutions that leverage the strengths of each model.
Trust and the LLM “black box” problem
LLMs are incredibly powerful, but they are also known to sometimes “lose track” or produce results that veer off course due to their generalist training on vast amounts of data. This tendency is made even more problematic by the fact that OpenAI’s ChatGPT and other LLMs are essentially “black boxes” that do not reveal how they arrive at an answer.
This black box problem will only grow in the future, especially for enterprises and mission-critical applications where accuracy, consistency, and compliance are of utmost importance. Think of healthcare, financial services, and law as prime examples of professions where inaccurate answers can have huge financial consequences and even life-and-death implications. Regulators are already taking note and will likely start demanding explainable AI solutions, especially in industries that depend on privacy and accuracy.
While companies often adopt a human-in-the-loop approach to mitigate these issues, an over-reliance on LLMs can lead to a false sense of security. Over time, complacency can set in and mistakes can slip through undetected.
SLMs = better explainability
Fortunately, SLMs are better suited to overcome many of the limitations of LLMs. Rather than being designed for general tasks, SLMs are developed with a narrower focus and trained on domain-specific data. This distinctive feature allows them to handle specialized language requirements in areas where precision is of utmost importance. Instead of relying on huge, heterogeneous data sets, SLMs are trained on targeted information, giving them the contextual intelligence to provide more consistent, predictable, and relevant answers.
This design offers several advantages. First, SLMs are more explainable, making it easier to understand the source and rationale of their results. This is critical in regulated industries where decisions must be traced back to a source.
Second, their smaller size means they can often operate faster than LLMs, which can be a critical factor for real-time applications. Third, SLMs offer companies more control over privacy and security, especially when deployed internally or designed specifically for the company.
Additionally, while SLMs may require specialized training up front, they reduce the risks associated with using third-party LLMs controlled by external providers. This control is invaluable in applications that require strict data processing and compliance.
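To make the traceability point concrete, here is a minimal sketch, with entirely hypothetical names and data, of how an SLM-backed service might return every answer alongside the source record it was grounded in, so that results can be audited in regulated settings:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Answer:
    text: str
    source_id: str  # identifier of the record the answer is grounded in


# Hypothetical domain-specific reference data an SLM might be grounded in.
KNOWLEDGE_BASE = {
    "record retention": ("Retain patient records for six years.", "policy-doc-17"),
    "breach notification": ("Notify affected parties within 60 days.", "policy-doc-09"),
}


def answer_with_provenance(query: str) -> Optional[Answer]:
    """Return an answer plus the source it traces back to, or None if unknown."""
    lowered = query.lower()
    for topic, (text, source_id) in KNOWLEDGE_BASE.items():
        if topic in lowered:
            return Answer(text=text, source_id=source_id)
    return None  # declining to answer beats an untraceable guess
```

Returning a source identifier with every answer is what lets a compliance team trace a decision back to a specific document, rather than trusting a black-box response.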
Focus on developing expertise (and be wary of vendors who overpromise)
I want to make clear that LLMs and SLMs are not mutually exclusive. In practice, SLMs can complement LLMs, creating hybrid solutions where LLMs provide broader context and SLMs ensure precise execution. Even though it’s still early days when it comes to LLMs, I always advise technology leaders to continue exploring the many possibilities and benefits of LLMs.
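One way to picture such a hybrid setup is a simple router that sends queries matching a specialized domain to an SLM and falls back to a general LLM for everything else. The sketch below uses hypothetical stand-in handlers rather than real model APIs:

```python
from typing import Callable, Dict


def make_router(slm_handlers: Dict[str, Callable[[str], str]],
                llm_handler: Callable[[str], str]) -> Callable[[str], str]:
    """Build a dispatcher: domain-matched queries go to an SLM, the rest to the LLM."""
    def route(query: str) -> str:
        lowered = query.lower()
        for keyword, handler in slm_handlers.items():
            if keyword in lowered:
                return handler(query)  # precise, domain-specific execution
        return llm_handler(query)      # broad, general-purpose fallback
    return route


# Stand-in handlers; in practice these would call real model endpoints.
router = make_router(
    slm_handlers={
        "contract": lambda q: "slm:legal",
        "invoice": lambda q: "slm:finance",
    },
    llm_handler=lambda q: "llm:general",
)
```

In a real deployment the routing step itself might be a classifier rather than keyword matching, but the division of labor is the same: the LLM handles breadth, the SLM handles precision.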
Additionally, while LLMs scale well across a wide variety of problems, SLMs may not transfer well beyond their specific use cases. Therefore, it is important to have a clear understanding in advance of which use cases should be addressed.
It is also important that business and IT leaders devote more time and attention to building the specific skills needed to train, tune, and test SLMs. Fortunately, there is plenty of free information and training available through popular sources such as Coursera, YouTube, and Huggingface.co. Executives should ensure their developers have ample time to learn and experiment with SLMs as the battle for AI expertise intensifies.
I also advise leaders to carefully vet their partners. I recently spoke with a company that asked for my opinion on the claims made by a particular technology provider. My take was that the vendor had either exaggerated its claims or simply did not understand its own technology’s capabilities.
The company wisely took a step back and ran a controlled proof of concept to test the vendor’s claims. As I suspected, the solution simply wasn’t ready for prime time, and the company walked away having spent relatively little time and money.
Whether a company is starting with a proof of concept or a live deployment, I advise them to start small, test often, and build on early successes. I have personally seen a model perform well with a small set of instructions and information, only to have the results veer off course once it was fed more information. For this reason, a slow and steady approach is a prudent one.
In summary, while LLMs will continue to provide increasingly valuable capabilities, their limitations are becoming more apparent as organizations become increasingly reliant on AI. The addition of SLMs provides a path forward, particularly in high-risk areas that require accuracy and explainability. By investing in SLMs, companies can future-proof their AI strategies and ensure that their tools not only drive innovation, but also meet the demands for trust, reliability and control.
AJ Sunder is co-founder, CIO and CPO at Responsive.
DataDecisionMakers
Welcome to the VentureBeat community!
At DataDecisionMakers, experts, including engineers who work with data, can share data-related insights and innovations.
If you want to learn more about innovative ideas and current information, best practices and the future of data and data technology, visit us at DataDecisionMakers.
You might even consider contributing an article of your own!