Standard LLMs rely on prompt engineering to fix problems (hallucinations, poor responses, missing information) that originate in the backend architecture. If the backend (corpus processing) is properly built from the ground up, a single meaningful prompt can return a full, comprehensive answer, without multiple prompts, rewording your query, going through a chat session, or prompt engineering. In this article, I explain how to do it, focusing on enterprise corpora. The strategy relies on four principles (the last one is illustrated with a short sketch after the list):
➡️ Exact and augmented retrieval
➡️ Showing full context in the response
➡️ Enhanced UI with option menu
➡️ Structured response as opposed to long text
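To make the last principle concrete, here is a minimal sketch of what a structured response carrying full retrieval context might look like. The field names (`summary`, `items`, `related_topics`, `suggested_queries`) and the `render` helper are my own illustration, not part of xLLM.

```python
from dataclasses import dataclass, field

@dataclass
class RetrievedItem:
    """One exact-match retrieval result, with its source shown in full."""
    source: str   # e.g. document title or URL within the corpus
    section: str  # sub-section the match came from
    text: str     # the retrieved passage, returned verbatim

@dataclass
class StructuredResponse:
    """Structured answer: labeled sections instead of one long block of text."""
    summary: str                                              # short direct answer
    items: list[RetrievedItem] = field(default_factory=list)  # full context shown
    related_topics: list[str] = field(default_factory=list)   # UI option menu
    suggested_queries: list[str] = field(default_factory=list)

def render(resp: StructuredResponse) -> str:
    """Render the structured response as labeled sections."""
    lines = [f"Answer: {resp.summary}", "", "Context:"]
    for it in resp.items:
        lines.append(f"- [{it.source} / {it.section}] {it.text}")
    if resp.related_topics:
        lines.append("")
        lines.append("Related topics: " + ", ".join(resp.related_topics))
    return "\n".join(lines)
```

The option-menu fields (related topics, suggested queries) let the UI surface follow-ups directly, so the user is not forced into a multi-turn chat session to refine the answer.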
Standard LLMs are trained to predict the next token or missing tokens. This requires deep neural networks (DNNs) trained on billions or even trillions of tokens, as highlighted by Jensen Huang, CEO of Nvidia, in his keynote talk at the GTC conference earlier this year. Yet a 10-trillion-token corpus covers all plausible string combinations many times over; the vast majority of it is noise. After all, most people have a vocabulary of about 30,000 words. But this massive training is necessary to prevent DNNs from getting stuck in sub-optimal configurations due to vanishing gradients and other issues.
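A quick back-of-envelope calculation shows the redundancy. The numbers below are my own illustration, under the crude assumption that every token is a word drawn from a 30,000-word vocabulary; they are not figures from this article.

```python
# Back-of-envelope: how redundant is a 10-trillion-token corpus
# relative to a 30,000-word vocabulary?
vocab_size = 30_000
corpus_tokens = 10 * 10**12           # 10 trillion tokens

distinct_bigrams = vocab_size ** 2    # 9e8 possible word pairs
avg_bigram_repeats = corpus_tokens / distinct_bigrams

print(f"Possible bigrams:       {distinct_bigrams:.1e}")
print(f"Avg repeats per bigram: {avg_bigram_repeats:,.0f}")
# ~11,000 occurrences per possible word pair on average:
# short-range patterns are seen over and over during training.
```

Even under this crude assumption, every possible word pair appears thousands of times on average: most of the corpus is repetition rather than new signal.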
What if you could do with a million times less? With mere millions of tokens rather than trillions? After all, predicting the next token is a task only remotely related to what modern LLMs are used for. Its history is tied to text auto-completion, guessing missing words, autocorrect, and so on, developed initially for tools such as BERT. It is like training a plane to operate efficiently on the runway, but never to fly. It also entices LLM vendors to charge clients by token usage, with little regard to ROI.
Our approach is radically different. We use neither DNNs nor GPUs. It is as different from standard AI as it is from classical NLP and machine learning. Its origins are similar to those of other tools we built, including NoGAN, our alternative to GAN for tabular data synthesis. NoGAN, a fast technology with no DNN, runs a lot faster than GAN-based synthesizers and delivers much better results, even in real time. The output quality is assessed with our ground-breaking evaluation metric, which captures important defects missed by all other benchmarking tools.
In this article, I highlight the unique components of xLLM, our new architecture for the enterprise.