smol-blueprint

Commits

Commits on branch main:

b93c2b7987df36941eb86724f64c2d42b76101a4 (verified)
Merge pull request #2 from huggingface/rag/building
davidberenstein1957 committed 2 days ago

b41e0f77aa038ece5b1175f7af41cdb8f224c616 (unverified)
Update dependencies, enhance README, and refine Jupyter notebooks
davidberenstein1957 committed 3 days ago

4f7537079fe3e7aab8ab70d74a81e72675fb7978 (unverified)
Merge branch 'main' into rag/building
davidberenstein1957 committed 3 days ago

9f8f9ad6f726ac77e42d580ed6214fcff0995370 (verified)
Merge pull request #1 from huggingface/rag/index-change-dataset
davidberenstein1957 committed 3 days ago

ac4831c54854e68a5d9201edefc2e47a8f952e40 (unverified)
Fix minor formatting issue in `rag/indexing.ipynb` by removing an unnecessary newline in the DataFrame creation line for improved code clarity.
davidberenstein1957 committed 3 days ago

7181b389a7d1a43fde9dfde97c5b4ef689d0c827 (unverified)
Update dependencies and refine Jupyter notebook for improved functionality
davidberenstein1957 committed 3 days ago

A smol blueprint

A smol blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs. It takes a practical approach, applying some of the theory from the smol-course to an end-to-end, real-world problem.

🚀 Web apps and microservices included!

Each notebook shows how to deploy your AI as a web app on Hugging Face Spaces with Gradio, which you can then call directly as a microservice through the Gradio Python Client. All the code and demos can be used in a private or public setting. Deployed on the Hub!
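As a rough sketch of what calling a deployed app as a microservice might look like, the snippet below wraps the Gradio Python Client in a small helper. The helper name `query_space`, the Space ID `your-username/your-space`, the `/predict` endpoint name, and the `client_factory` parameter are all assumptions made for illustration; the actual Spaces and endpoints are the ones defined in the notebooks.

```python
from typing import Any, Callable, Optional


def query_space(
    space_id: str,
    text: str,
    api_name: str = "/predict",  # assumed endpoint name; check your Space's API page
    client_factory: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Send `text` to a Gradio app hosted on a Space and return its reply."""
    if client_factory is None:
        # Imported lazily so the helper can be defined without gradio_client installed.
        from gradio_client import Client

        client_factory = Client
    client = client_factory(space_id)
    # Positional arguments are passed to the app's inputs in order.
    return client.predict(text, api_name=api_name)


# Example call (hypothetical Space ID, requires network access):
# answer = query_space("your-username/your-space", "What is RAG?")
```

The `client_factory` hook is only there so the helper can be exercised without network access; in normal use you would leave it unset. A private Space additionally needs your Hugging Face token to be configured.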

The blueprint

We want to build a tool that helps us use AI on company documents. In our case, we will work with the smol-blueprint/hf-blogs dataset, which contains the blog posts from the Hugging Face website.
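For orientation, a minimal sketch of loading that dataset with the `datasets` library could look like this. The helper name `load_blogs`, the `train` split, and the `loader` parameter are assumptions for illustration; the notebooks may load the data differently.

```python
from typing import Any, Callable, Optional

DATASET_ID = "smol-blueprint/hf-blogs"  # dataset named in this README


def load_blogs(
    split: str = "train",  # assumed split name; check the dataset card
    loader: Optional[Callable[..., Any]] = None,
) -> Any:
    """Load the Hugging Face blogs dataset from the Hub."""
    if loader is None:
        # Imported lazily so the helper can be defined without datasets installed.
        from datasets import load_dataset

        loader = load_dataset
    return loader(DATASET_ID, split=split)


# Example call (requires network access and the datasets package):
# blogs = load_blogs()
# print(blogs)
```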

Installation and configuration

Python environment

We will use uv to manage the project. First create a virtual environment:

uv venv --python 3.11
source .venv/bin/activate

Then you can install all the required dependencies:

uv sync --all-groups

Or you can install a single dependency group at a time:

uv sync --group scraping
uv sync --group rag
uv sync --group information-extraction

Hugging Face Account

You will need a Hugging Face account to use the Hub API. You can create one at huggingface.co. Then follow the huggingface-cli instructions and log in to configure your Hugging Face token:

huggingface-cli login