Lexo - LLM Toolkit for RAG, Fine-tuning, and AI Agents

🎯 Project Overview

Lexo is a comprehensive collection of Jupyter notebooks designed for learning and applying Large Language Models (LLMs) in real-world scenarios. Master RAG systems, fine-tuning, AI agents, multimodal processing, and ML benchmarks through hands-on projects.

📚 Each notebook includes:

Clear problem statement and real-world context
Step-by-step implementation with explanations
Required API keys and setup instructions
Customizable parameters for experimentation
Performance evaluation and insights

📚 View All Notebooks

🎯 Key Features

🧠 Core Skills Covered

✨ Retrieval-Augmented Generation (RAG)
✨ Prompt engineering and optimization
✨ AI agents and tool integration
✨ Fine-tuning and QLoRA techniques

🤖 Models & Technologies

✨ Frontier LLMs (GPT, Claude, DeepSeek)
✨ Open-source models (LLaMA, Mistral)
✨ Speech-to-Text processing
✨ Vector databases and embeddings

🎓

Certification Project

This comprehensive toolkit is the result of my certification from Ed Donner's LLM Engineering Master Course

📜 View Certification

🔧 Core AI Applications

🌐 WebPage Summarizer

Beginner

Summarize any URL using OpenAI + LLaMA with Selenium for handling both static and JavaScript-rendered websites.

Handles JavaScript websites
Markdown-formatted summaries
Real-time processing

Selenium BeautifulSoup LLms Ollama

🧾 Brochure Generator

Intermediate

Transform websites into AI-crafted brochures for clients, investors, and recruits using intelligent content extraction.

Smart content filtering
Real-time streaming output
Multi-model support

BeautifulSoup LLMs Ollama IPython

💡 Tech Assistant

Beginner

AI-driven tool that provides concise, structured explanations for technical questions and code snippets.

Interactive Q&A
Real-time streaming
Code explanation

LLMs Ollama IPython

🤖 TriBot Debate

Intermediate

Three-bot chat system with GPT (polite & humorous), Claude (argumentative & snarky), and DeepSeek (logical & analytical).

Distinct personalities
Multi-model integration
Customizable prompts

OpenAI Anthropic DeepSeek IPython

🌤️ WeatherMate AI Agent

Intermediate

Conversational AI agent that analyzes real-time weather conditions and suggests activities and events based on location.

Real-time weather data
Event recommendations
External API integration

LLMs Tools REST APIs Gradio

📝 Advanced Workflows

📝 Meeting Minutes Assistant

Intermediate

Generate structured meeting minutes from audio recordings using Speech-to-Text (Whisper) and Large Language Models.

Audio transcription
Structured output
Real-time streaming

Whisper LLaMA 3.1 Gradio HuggingFace GPU

🧪 Synthetic Data Generator

Intermediate

Generate realistic synthetic datasets for tabular, text, and time-series data using multiple LLM providers.

Multiple data types
JSON and CSV output
Multi-model support

OpenAI Anthropic Google Cloud Platform Gradio

🧠 RAG QA Assistant

Intermediate

Internal expert knowledge assistant using Retrieval-Augmented Generation (RAG) for fast, accurate answers to internal queries.

Document loading (PDF, text, markdown)
ChromaDB vector store
Conversation history
Source attribution

LangChain ChromaDB LLMs Gradio

🔬 ML & Fine-tuning Pipeline

Complete ML pipeline from data to deployment with comprehensive benchmarking and evaluation. GPU required for optimal performance.

📊 Data Curation

Aggregate, clean, analyze, and balance datasets for price prediction tasks.

⚔️ Traditional ML vs LLMs

Compare traditional ML models against frontier LLMs for performance benchmarking.

🧠 E5 Embeddings & RAG

Test contextual embeddings and retrieval-augmented generation approaches.

🔧 Fine-tuning GPT-4o Mini

Fine-tune frontier models and compare before/after performance.

🦙 LLaMA 3.1 Evaluation

Evaluate quantized LLaMA 3.1 8B model performance.

⚙️ QLoRA Fine-tuning

Fine-tune LLaMA 3.1 using QLoRA with hyperparameter optimization.

🧪 Model Evaluation

Comprehensive evaluation and performance comparison across all models.

🏆 Leaderboard

Final rankings and insights across ML, embeddings, RAG, and fine-tuned models.

🏆 Capstone Project

🏷️ Snapr - AI Deal Finder

Advanced

Capstone Project: A comprehensive AI system that scans online product listings, predicts their value using an ensemble of models, and alerts users to great deals. This project integrates the ensemble model (fine-tuned LLaMA, XGBoost, and GPT-4o Mini + RAG) with cloud deployment (Modal/GCP) for production use.

Multi-model ensemble integration
Real-time price prediction
Cloud deployment with Modal
Scalable production infrastructure
Deal alert system

LLaMA 3.1 Fine tuned with QLoRA XGBoost RAG ChromaDB/AWS S3 Modal Docker HuggingFace Modal GPU Google Cloud Platform

🌟 Explore More Projects

Inspired by these AI solutions? Explore more machine learning and artificial intelligence projects showcasing advanced techniques and real-world applications.

🚀 View Full Portfolio

🧠 Lexo

🎯 Project Overview

📚 Each notebook includes:

🎯 Key Features

🧠 Core Skills Covered

🤖 Models & Technologies

Certification Project

🔧 Core AI Applications

🌐 WebPage Summarizer

🧾 Brochure Generator

💡 Tech Assistant

🤖 TriBot Debate

🌤️ WeatherMate AI Agent

📝 Advanced Workflows

📝 Meeting Minutes Assistant

🧪 Synthetic Data Generator

🧠 RAG QA Assistant

🔬 ML & Fine-tuning Pipeline

📊 Data Curation

⚔️ Traditional ML vs LLMs

🧠 E5 Embeddings & RAG

🔧 Fine-tuning GPT-4o Mini

🦙 LLaMA 3.1 Evaluation

⚙️ QLoRA Fine-tuning

🧪 Model Evaluation

🏆 Leaderboard

🏆 Capstone Project

🏷️ Snapr - AI Deal Finder

🔗 Related Projects

🌟 Explore More Projects

Connect

About