With the release of Llama 2, capable open-source LLMs have become more and more a reality. Its accuracy approaches that of OpenAI's GPT-3.5, which makes it well suited to many use cases.
In this article, we will explore how we can use Llama 2 for topic modeling without needing to pass every document to the model. Instead, we will leverage BERTopic, a modular topic modeling technique that can use any LLM to fine-tune topic representations.
BERTopic works in a fairly straightforward way. It consists of five sequential steps:
- Embed the documents
- Reduce the dimensionality of the embeddings
- Cluster the reduced embeddings
- Tokenize the documents per cluster
- Extract the most representative words per cluster
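To make the last two steps concrete, here is a minimal, self-contained sketch of extracting representative words per cluster with a class-based TF-IDF, in the spirit of BERTopic's c-TF-IDF. This is an illustrative toy, not BERTopic's actual implementation; the function name and scoring details are simplified assumptions.

```python
from collections import Counter
import math

def top_words_per_cluster(docs, labels, top_n=3):
    """Toy class-based TF-IDF: treat each cluster as one big document,
    then score words by in-cluster frequency weighted by rarity
    across clusters. Illustrative only; BERTopic's c-TF-IDF differs
    in its exact weighting."""
    clusters = {}
    for doc, label in zip(docs, labels):
        clusters.setdefault(label, []).extend(doc.lower().split())

    # In how many clusters does each word appear?
    df = Counter()
    for words in clusters.values():
        df.update(set(words))

    n_clusters = len(clusters)
    result = {}
    for label, words in clusters.items():
        tf = Counter(words)
        scores = {
            w: (count / len(words)) * math.log(1 + n_clusters / df[w])
            for w, count in tf.items()
        }
        result[label] = [
            w for w, _ in sorted(scores.items(), key=lambda x: -x[1])[:top_n]
        ]
    return result
```

Given a handful of documents and cluster labels from the previous step, this returns a few keywords per cluster, which is exactly the kind of bag-of-words topic description BERTopic produces before any LLM is involved.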
However, with the rise of LLMs like Llama 2, we can do much better than a bag of independent words per topic. It is not feasible to pass all the documents to Llama 2 directly and have it analyze them. We can use vector databases for search …
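The key idea is that the LLM never sees the full corpus: for each topic, only the extracted keywords and a few representative documents are placed into a prompt asking for a short label. Here is a hedged sketch of what such a prompt-building step might look like; the function name and prompt wording are illustrative assumptions, not BERTopic's actual API.

```python
def build_topic_prompt(keywords, representative_docs):
    """Build a prompt asking an LLM for a short topic label.

    Only a handful of representative documents per topic are
    included, so the full corpus never has to be sent to the model.
    Illustrative sketch; the exact prompt format is an assumption.
    """
    docs_block = "\n".join(f"- {doc}" for doc in representative_docs)
    return (
        "I have a topic described by the following keywords: "
        + ", ".join(keywords)
        + "\nHere are a few representative documents for this topic:\n"
        + docs_block
        + "\nBased on the information above, give a short topic label."
    )

prompt = build_topic_prompt(
    ["dog", "bark", "leash"],
    ["My dog barks at the mailman.", "I bought a new leash today."],
)
```

The resulting string can then be sent to Llama 2 (or any other LLM) per topic, which keeps the number of LLM calls equal to the number of topics rather than the number of documents.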