Introduction

This repository contains a practical example of how to build, in just a couple of hours, a GPT-4 Q&A app that answers questions about your private documents.

The app uses the following technologies:

Content

The repository contains the following applications:

A Jupyter Notebook that reads your private documents (this example uses the .NET microservices book) and stores the content in Pinecone.
A Streamlit app that lets you query the data stored in Pinecone using a GPT-4 model.

External Dependencies

Azure OpenAI
Pinecone

Prerequisites

You MUST have the following services running before trying to execute the app.

An Azure OpenAI instance with the following models deployed:
- text-embedding-ada-002.
- gpt-4 or gpt-4-32k.

The model deployments can be given any name you like.

A Pinecone database containing an index with 1536 dimensions and the cosine metric.

The index can be called whatever you like.
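To make the index settings concrete, here is a minimal sketch in plain Python of what the cosine metric computes and of a dimension check. The 1536 figure matches the output size of text-embedding-ada-002; the function names `cosine_similarity` and `check_dimensions` are illustrative and do not come from this repository. Pinecone performs this comparison for you, so the code is only meant to show the idea.

```python
import math

# text-embedding-ada-002 returns vectors with 1536 components,
# which is why the index must be created with 1536 dimensions.
EMBEDDING_DIMENSIONS = 1536


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def check_dimensions(vector, expected=EMBEDDING_DIMENSIONS):
    """Raise if a vector would not fit an index with `expected` dimensions."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(vector)}")
```

Vectors pointing in the same direction score 1.0 regardless of their length, which is why cosine is a common choice for comparing text embeddings.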

How to run the app

Before trying to run the app, read the Prerequisites section.

Step 1: Add your data into Pinecone

The repository contains a Jupyter Notebook that reads a PDF file from the docs folder, splits the content into multiple chunks, and stores them in Pinecone.
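The chunking step can be sketched as follows. This is not the notebook's code: it is a simplified character-based splitter with overlap, and the name `split_into_chunks` and the default sizes are assumptions chosen for illustration. The notebook may well use a library splitter with different rules.

```python
def split_into_chunks(text, chunk_size=1000, overlap=100):
    """Split text into fixed-size character chunks that overlap slightly.

    The overlap keeps a sentence that straddles a boundary visible in
    both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be converted into an embedding and stored in the Pinecone index together with its text.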

You must set the following environment variables before executing the Jupyter Notebook:

PINECONE_API_KEY: Pinecone API key.
PINECONE_ENVIRONMENT: Pinecone index environment.
PINECONE_INDEX_NAME: Pinecone index name.
AZURE_OPENAI_APIKEY: Azure OpenAI API key.
AZURE_OPENAI_BASE_URI: Azure OpenAI URI.
AZURE_OPENAI_EMBEDDINGS_MODEL_NAME: The text-embedding-ada-002 model deployment name.
AZURE_OPENAI_GPT4_MODEL_NAME: The gpt-4 model deployment name.
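A quick way to catch a missing variable before running anything is a small check like the one below. It is a sketch rather than part of the repository: the variable names are the ones listed above, while `REQUIRED_VARIABLES` and `missing_variables` are hypothetical helper names.

```python
import os

# Names taken from the list above.
REQUIRED_VARIABLES = [
    "PINECONE_API_KEY",
    "PINECONE_ENVIRONMENT",
    "PINECONE_INDEX_NAME",
    "AZURE_OPENAI_APIKEY",
    "AZURE_OPENAI_BASE_URI",
    "AZURE_OPENAI_EMBEDDINGS_MODEL_NAME",
    "AZURE_OPENAI_GPT4_MODEL_NAME",
]


def missing_variables(environ=None):
    """Return the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]
```

Calling `missing_variables()` in the first notebook cell and stopping if the list is not empty gives a clearer error than a failed API call later on.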

What's the model deployment name?

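When you deploy a model on an Azure OpenAI instance, you must give the deployment a name. The environment variables above expect that deployment name, which may differ from the name of the underlying model.
For this example to run properly, you need to deploy at least a text-embedding-ada-002 model and a gpt-4 model.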
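The sketch below shows where the deployment name ends up when Azure OpenAI is called over REST: it is part of the URL path, and the model name does not appear at all. Whether this repository calls the REST API directly or goes through an SDK is not stated here, so treat the helper `chat_completions_url` and the example `api-version` value as assumptions.

```python
from urllib.parse import quote, urlencode


def chat_completions_url(base_uri, deployment_name, api_version):
    """Build the Azure OpenAI chat completions URL for one deployment."""
    base = base_uri.rstrip("/")
    # The deployment name, not the model name, identifies the target.
    deployment = quote(deployment_name, safe="")
    path = f"/openai/deployments/{deployment}/chat/completions"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"


# Example with placeholder values (not real endpoints or deployments):
url = chat_completions_url(
    "https://example.openai.azure.com/",  # AZURE_OPENAI_BASE_URI
    "my-gpt4",                            # AZURE_OPENAI_GPT4_MODEL_NAME
    "2023-05-15",                         # assumed API version
)
```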

Step 2: Query your data

app.py is a Streamlit app that performs the following steps:

Converts your query into a vector.
Retrieves the information that is semantically related to your query from Pinecone.
Feeds the retrieved information into an LLM, which builds a response.

Run the app locally:
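The three steps above can be sketched as a small pipeline. This is not the code in app.py: the embedding, search, and completion calls are passed in as plain functions (`embed`, `search`, `complete`) so the flow can be read and tested without Azure OpenAI or Pinecone, and the prompt wording is an assumption.

```python
def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(question, embed, search, complete, top_k=3):
    """Run the query flow: embed, retrieve, then generate.

    embed(text) -> vector
    search(vector, top_k) -> list of passage strings
    complete(prompt) -> answer string
    """
    vector = embed(question)             # step 1: query to vector
    passages = search(vector, top_k)     # step 2: nearest passages
    prompt = build_prompt(question, passages)
    return complete(prompt)              # step 3: LLM builds the response
```

In a real app, `embed` would call the embeddings deployment, `search` would query the Pinecone index, and `complete` would call the gpt-4 deployment.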

Restore dependencies:

pip install -r requirements.txt

Installing Streamlit also installs a command-line (CLI) tool for running Streamlit apps. Use it to start the app:

streamlit run app.py

You MUST set the following environment variables on your local machine before executing the app:

PINECONE_API_KEY: Pinecone API key.
PINECONE_ENVIRONMENT: Pinecone index environment.
PINECONE_INDEX_NAME: Pinecone index name.
AZURE_OPENAI_APIKEY: Azure OpenAI API key.
AZURE_OPENAI_BASE_URI: Azure OpenAI URI.
AZURE_OPENAI_EMBEDDINGS_MODEL_NAME: The text-embedding-ada-002 model deployment name.
AZURE_OPENAI_GPT4_MODEL_NAME: The gpt-4 model deployment name.

Run the app in a container:

This repository includes a Dockerfile in case you prefer to run the app in a container.

Build the image:

docker build -t qa-app .

Run it:

docker run -p 5050:5050 \
        -e AZURE_OPENAI_APIKEY="<azure-openai-api-key>" \
        -e AZURE_OPENAI_BASE_URI="<azure-openai-api-uri>" \
        -e AZURE_OPENAI_EMBEDDINGS_MODEL_NAME="<azure-openai-embeddings-deployment-model-name>" \
        -e AZURE_OPENAI_GPT4_MODEL_NAME="<azure-openai-gpt4-deployment-model-name>" \
        -e PINECONE_INDEX_NAME="<pinecone-index-name>" \
        -e PINECONE_ENVIRONMENT="<pinecone-environment-name>" \
        -e PINECONE_API_KEY="<pinecone-api-key>" \
        qa-app
