☎️ (+57) 3132593396 | 📧 jj.ovalle@uniandes.edu.co | https://www.linkedin.com/in/jjmov/
M.S., Data Analytics Intelligence | Universidad de los Andes (Aug 2024)
B.S., Economics | Universidad de los Andes (Nov 2020)
Relevant Technologies: Qdrant Vector Database, Cohere Reranker, LangChain, LangGraph, Streamlit, Modal
The Colombian Law Agent is an agentic tool designed to simplify the way we interact with legal documents and laws in Colombia. At its core, this project is about making law accessible. Using web scraping techniques, I’ve gathered up-to-date information straight from official sources, because sometimes the latest laws aren’t available in a convenient format.
Using a vector database and AI, the Colombian Law Agent quickly finds and retrieves the legal information you need: ask a question about a specific law and get a precise answer in seconds. It’s built for everyone, from legal professionals to everyday citizens, so that understanding Colombian law is as easy as asking a question. The app is also user-friendly, with a straightforward UI that requires no technical background. Whether you’re doing in-depth legal research or are just curious about a law, the Colombian Law Agent can be a go-to resource for legal inquiries.
Relevant Technologies: AWS SageMaker, LlamaIndex, LangChain, Streamlit
Deployed Mistral 7B as 7BSQL Master, a demo app that turns natural language questions into SQL queries. Built with AWS SageMaker, LlamaIndex, and Streamlit and hosted on AWS, the app gives users an intuitive interface to Mistral 7B’s NL2SQL capabilities and demonstrates the practical deployment of fine-tuned AI models.
Relevant Technologies: HuggingFace, LangChain, LangSmith, Weights & Biases
Fine-tuned four large language models (Gemma, Mistral, DeciLM, and Llama 2), each in its 7-billion-parameter version, to generate SQL queries from natural language. The goal was for the models to interpret user intent accurately and output the corresponding SQL query. Fine-tuning used LoRA (Low-Rank Adaptation) for parameter-efficient training, with Weights & Biases and LangSmith for performance monitoring and evaluation.
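As a rough illustration of why LoRA is parameter-efficient, the sketch below applies a low-rank update to a frozen weight matrix in plain Python. It is not the project's training code: the matrix sizes, the rank `r`, the scaling factor `alpha`, and the helper names are all assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, with the rank r inferred from A."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Count adapter parameters: B is d_out x r and A is r x d_in."""
    return r * (d_out + d_in)


# Illustrative (hypothetical) sizes; real 7B models use far larger matrices.
d_out, d_in, r, alpha = 6, 4, 2, 4
rng = random.Random(0)

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen base weight
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable, small random init
b = [[0.0] * r for _ in range(d_out)]                               # trainable, zero init

# With B initialised to zero, the adapted weight starts identical to the base weight.
w_eff = lora_effective_weight(w, a, b, alpha)
```

For a hypothetical 4096 x 4096 layer with rank 8, `trainable_params(4096, 4096, 8)` gives 65,536 adapter parameters against 16,777,216 in the full matrix, which is where the efficiency comes from.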
Relevant Technologies: LangChain, Pinecone, Weights & Biases, Chainlit
Developed a Retrieval-Augmented Generation (RAG) application that lets users chat with their PDFs and text files. The application uses Pinecone for vector storage and retrieval, and LangChain orchestrates the workflow across the AI components. Weights & Biases tracks model interactions and performance metrics to support continuous improvement. The user interface, built with Chainlit, gives users an intuitive environment for navigating and conversing with their documents.
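The retrieval step of a RAG app like this one can be sketched in a few lines. The example below is a simplified stand-in, not the project's code: it uses an in-memory list and cosine similarity instead of Pinecone, and the chunk texts, three-dimensional vectors, and function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the texts of the k chunks most similar to the query vector."""
    scored = [(cosine(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index of (chunk text, embedding) pairs; real embeddings come from a model.
index = [
    ("Chunk about refund policy", [0.9, 0.1, 0.0]),
    ("Chunk about shipping times", [0.1, 0.9, 0.0]),
    ("Chunk about warranty terms", [0.7, 0.2, 0.1]),
]

query_vec = [1.0, 0.0, 0.0]  # hypothetical embedding of the user's question
context = top_k(query_vec, index, k=2)

# The retrieved chunks are placed in the prompt that is sent to the language model.
prompt = "Answer using only this context:\n" + "\n".join(context)
```

In a deployed setup, a managed vector store would take the place of the list scan, but the flow stays the same: embed the question, fetch the nearest chunks, and pass them to the model as context.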
Relevant Technologies: HuggingFace, BigQuery, HDBSCAN, UMAP
Developed an NLP project to extract actionable business insights from Amazon reviews of video games. First, a BERT-based model analyzed the sentiment of each review to identify negative feedback. Next, an embedding model transformed the reviews into embeddings, capturing customer opinions beyond positive or negative sentiment. Using UMAP for dimensionality reduction and HDBSCAN for clustering, the project grouped reviews into distinct clusters, enabling focused analysis of specific aspects of customer dissatisfaction.
Senior Data Scientist @ Escala24x7 (October 2023 - Present)
Mid Data Scientist @ Habi (July 2023 - October 2023)
Data Scientist @ Interpublic Group (IPG) – Kinesso (March 2021 - April 2022)
Data Analyst @ AXA Colpatria (August 2020 - March 2021)