irs-manual-demo

public
14 stars
7 forks
1 issue

Commits

List of commits on branch main.
Unverified
281247061aaf9d8714bb0c2876443aef3832f3c2

docs: add -r flag to install command

MMthwRobinson committed 2 years ago
Verified
86b1cdb7fdc1b6da63abd5aca509bae82ea23071

Update download_data.py

aamanda103 committed 2 years ago
Verified
0e87a259843bf23df0b6f97fcce837c692087d41

Update README.md

aamanda103 committed 2 years ago
Unverified
cf45bbea19452b079be5b8667fa6a8f188085a0f

whoops removing commented out logic

aamanda103 committed 2 years ago
Unverified
8d4543e22d5a9a8a120c69741f4af4892c8accb7

fixing content length error

aamanda103 committed 2 years ago
Unverified
ef18a18f8c4335d6a4670692a0d9fbfc422648d9

adding json files

committed 2 years ago

README

The README file for this repository.

Chat with IRS Manuals

This directory contains an application for chatting with IRS manuals. Once data is available, the chat application only uses self-hosted models and can be run in a disconnected environment. Here's how to get started with the chatbot:

Installation

pip install -r requirements.txt

Environment Variables

Note: there are other options for these connections, but this implementation references OpenAI and Pinecone. Set the following environment variables:

PINECONE_API_KEY
PINECONE_API_ENV
OPENAI_API_KEY
PINECONE_INDEX_NAME
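
Before running any of the scripts, it can help to confirm that all four variables are set. The following is a minimal sketch, not part of this repository: the helper name `missing_env_vars` is hypothetical, and only the variable names come from the list above.

```python
import os

# Variable names copied from the list above.
REQUIRED_ENV_VARS = (
    "PINECONE_API_KEY",
    "PINECONE_API_ENV",
    "OPENAI_API_KEY",
    "PINECONE_INDEX_NAME",
)


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    `environ` defaults to os.environ; passing a plain dict makes the
    check easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment; an empty list means everything is in place.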

Download PDFs from IRS website

python download_data.py <Base URL> <Page Start> <Page End> <Target Directory>
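
The source of `download_data.py` is not shown here, so the following is only a sketch of how the four positional arguments in the usage line could be parsed. The argument names, the integer page numbers, and the start/end check are all assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse <Base URL> <Page Start> <Page End> <Target Directory>.

    Hypothetical sketch: the real script may name, type, or validate
    these arguments differently.
    """
    parser = argparse.ArgumentParser(
        description="Download PDFs from the IRS website (illustrative sketch)."
    )
    parser.add_argument("base_url", help="base URL of the pages to fetch")
    parser.add_argument("page_start", type=int, help="first page number (assumed integer)")
    parser.add_argument("page_end", type=int, help="last page number (assumed integer)")
    parser.add_argument("target_directory", help="directory where PDFs are saved")
    args = parser.parse_args(argv)
    # Assumed sanity check; exits with a usage error if the range is reversed.
    if args.page_start > args.page_end:
        parser.error("Page Start must not be greater than Page End")
    return args
```

For example, `parse_args(["https://example.com/manuals", "1", "3", "pdfs"])` yields a namespace with `page_start == 1` and `page_end == 3`; the URL here is a placeholder.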

Run PDFs against unstructured-ingest

# Optional: --partition-by-api hits the *NEW* API instead of processing locally.
# Drop that flag (and the trailing backslash on the line before it) to process locally.
PYTHONPATH=. ./unstructured/ingest/main.py \
  --local-input-path <ingest-input-dir> \
  --structured-output-dir <ingest-output-dir> \
  --partition-by-api

Here's an example of the structured JSON output:

[Image: structured JSON output]
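
To inspect the structured output programmatically, something like the sketch below may be useful. It assumes each output file is a JSON array of element objects carrying `"type"` and `"text"` keys; check that against your own files, since the exact schema is not shown here. The function name `load_element_texts` is hypothetical.

```python
import json
from pathlib import Path


def load_element_texts(json_dir):
    """Collect non-empty text elements from every *.json file in json_dir.

    Assumes each file holds a JSON array of objects with "type" and
    "text" keys (an assumption about the structured output format).
    """
    records = []
    for path in sorted(Path(json_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            elements = json.load(handle)
        for element in elements:
            text = (element.get("text") or "").strip()
            if text:
                records.append(
                    {"source": path.name, "type": element.get("type"), "text": text}
                )
    return records
```

Elements with missing or whitespace-only text are skipped, so the result contains only content worth indexing.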

Seed and utilize vector db

python ingest_data.py <path-to-structured-json-file-directory>
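
Seeding a vector database generally means uploading records in groups rather than one at a time. The sketch below shows only that grouping step; it does not call OpenAI or Pinecone, and it is not taken from `ingest_data.py`, whose source is not shown here. The helper name `batched` and any batch size you choose are assumptions.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield consecutive lists of at most batch_size items.

    Illustrative helper: each yielded list could be embedded and
    upserted in a single request by whatever client the script uses.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

The final batch may be shorter than the others, and an empty input yields nothing.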

Run the chat CLI

python cli_app.py
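
For orientation, a chat CLI of this kind usually boils down to a read, answer, print loop. The sketch below is a hypothetical outline, not the contents of `cli_app.py`: `chat_loop`, the `answer` callback, the prompts, and the `exit`/`quit` commands are all assumptions.

```python
def chat_loop(answer, read=input, write=print):
    """Run a simple question/answer loop.

    `answer` is any callable mapping a question string to a reply
    string (for example, a retrieval step followed by a model call).
    `read` and `write` are injectable so the loop can be driven
    without a terminal.
    """
    while True:
        try:
            question = read("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break  # Ctrl-D or Ctrl-C ends the session
        if question.lower() in {"exit", "quit"}:
            break
        if not question:
            continue  # ignore blank lines
        write("Bot: " + answer(question))
```

Keeping the answering logic behind a single callable means the retrieval and model code can change without touching the loop itself.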

Chat with our hosted instance here