knowledge-gpt — a Library for Creating Specialized ChatGPT Bots with 2 Lines of Code

Eren Akbulut
Geeks-of-Data
Published in
3 min readMar 19, 2023

--

Before starting the article, I want to mention our “Geeks of Data” Discord channel. You can join and say hello, and exchange ideas about data science, engineering, or analysis fields.🚀 Link

The Knowledge-gpt library is a valuable and user-friendly tool that can process diverse information sources and generate meaningful indexes to facilitate accurate and confident query responses about the source documents. It basically allows you to create your specialized GPT bot with a few lines of code.

This library employs multiple open-source models to support various languages and APIs to access OpenAI services. Currently, it can retrieve information by scraping websites, using YouTube transcripts as an information source, extracting text from raw audio to use as an information source, and querying documents such as pdf, ppt, and doc files.

Today we’ll walk you through its website scraping example.

The first thing to do is to install our library PyPI package provided here. Installing setups is as easy as running this command:

pip install knowledgegpt

Afterward, we need to import our OpenAI key and specify our secret key.

Import Secret Key

Then we can import the needed class for information extraction from a website. ( Currently doesn’t have wide coverage, mostly tested with Wikipedia. )

Import Lib

We can then create an instance from our class and start making API calls.

Above we see how to use our library, we just give a path/URL, select our extractor type, model language, and answering engine using the flag “is_turbo” (when is set to true uses chatgpt engine otherwise uses text-davinci-003). Then we make a call using the extract method, it takes in our query and a max_tokens limit, it first calculates embedding and creates indexes then sends a request to the selected engine.

API Call

The output from the cell above is shown below, it has some constant prompting parts inside as well as the Context we passed in. The length of the Context is dictated by the parameter “max_tokens”.

Prompt Context

Afterward, we should be able to run the extract method over and over without calculating indexes again, it’ll just use the precomputed ones and will take only seconds to answer questions.

When the messages variable is checked ( when using turbo also known as the chatgpt model ) we see our chat history.

More Messages

Alright, everyone, that’s pretty much how one can use knowledgegpt to create specialized bots for their use cases. You can find more examples in our GitHub repository in the examples section.

Thank you very much for reading and following along, friends. If you want to access content like this and spend time with curious, intelligent, and hardworking colleagues, we also welcome you to our Discord server. 🚀 Link

--

--