Personal AI Law Advisor for India with Context Memory

In the rapidly evolving field of artificial intelligence, the true test of an assistant is not just its knowledge, but its ability to hold a meaningful, context-aware conversation. For my final year academic project, I decided to tackle this challenge within the complex domain of Indian law. The result is Nyay AI, a personal AI law advisor that can remember the user's conversation history to provide intelligent, relevant, and in-depth legal analysis.

This project goes beyond a simple question-and-answer bot. It is a stateful conversational agent designed to make Indian legal information more accessible and intuitive than ever before.

Core Features & Capabilities

Context-Aware Conversational Memory: This is the cornerstone of the project. The AI remembers previous turns in the conversation, allowing for follow-up questions, clarifications, and a natural dialogue flow. It can even remember the user's name.

Deep Legal Knowledge Base: Fine-tuned on a large corpus of Indian legal texts, the AI can discuss the Indian Constitution, explain intricate sections of the Indian Penal Code (IPC), and analyze various other statutes.

On-Demand Case Summarization: Nyay AI can provide concise summaries of landmark Supreme Court cases, breaking down complex judgments into understandable key points.

GPU-Accelerated Performance: The system leverages NVIDIA CUDA via the llama-cpp-python library to run the 8-billion parameter model efficiently on consumer hardware, ensuring responses are generated in a timely manner.

Sleek and Modern UI: A fully custom-built user interface using modern CSS and JavaScript provides a polished and professional user experience.

Technical Architecture & The Memory Mechanism

The power of Nyay AI comes from a carefully selected stack of open-source technologies.

Foundation Model: An 8B parameter LLaMA 3.1 model fine-tuned for Indian Law (varma007ut/Indian_Legal_Assitant) in the efficient GGUF format.

Inference Engine: llama-cpp-python handles the model loading and inference, offloading the majority of the model's layers to a GPU for high-speed processing.

Backend: A Python server using the lightweight Flask framework manages requests and orchestrates the AI's responses.

The conversational memory is achieved by managing the state on the frontend. With every new user message, the entire chat history is sent to the backend. The Flask server then dynamically constructs a detailed "meta-prompt" that includes the AI's core instructions followed by the full conversation transcript. By processing this entire context, the model can generate a response that is logically consistent with everything that has been discussed before.
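The meta-prompt construction described above can be sketched in a few lines of Python. The function name, prompt wording, and sample history below are illustrative assumptions, not the project's actual code; in the real app, the Flask route would build this string from the history posted by the frontend and pass it to the model.

```python
# Minimal sketch of the "meta-prompt" memory mechanism (names are assumed).
SYSTEM_PROMPT = (
    "You are Nyay AI, a helpful assistant for Indian law. "
    "Use the conversation transcript below to stay consistent with it."
)

def build_meta_prompt(history):
    """Flatten the full chat history into a single prompt string."""
    lines = [SYSTEM_PROMPT, ""]
    for turn in history:
        role = "User" if turn["role"] == "user" else "Assistant"
        lines.append(f"{role}: {turn['content']}")
    lines.append("Assistant:")  # cue the model to produce the next reply
    return "\n".join(lines)

history = [
    {"role": "user", "content": "My name is Priya. What is Article 21?"},
    {"role": "assistant", "content": "Article 21 guarantees the right to life and personal liberty."},
    {"role": "user", "content": "Does it apply to non-citizens too?"},
]
print(build_meta_prompt(history))
```

Because the whole transcript travels with every request, the backend itself stays stateless, which keeps the Flask server simple at the cost of a prompt that grows with the conversation.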

Challenges & Solutions

Hardware Constraints: Running an 8B model on a consumer GPU with limited VRAM was the biggest challenge. This was solved by using a 4-bit quantized GGUF model and offloading ~30 of the 32 model layers to the GPU, allowing the system to run smoothly on a 4GB graphics card.
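The partial-offload configuration above might look something like the following llama-cpp-python call. The file name is an assumption, and the right `n_gpu_layers` value depends on your VRAM; this is a sketch, not the project's exact loading code (no test output is shown because it requires the multi-gigabyte model file).

```python
# Hedged sketch: loading a 4-bit quantized GGUF model with partial GPU offload.
from llama_cpp import Llama

llm = Llama(
    model_path="models/indian_legal_assistant.Q4_K_M.gguf",  # assumed file name
    n_gpu_layers=30,  # offload ~30 of 32 layers; lower this if VRAM runs out
    n_ctx=4096,       # context window must fit the growing conversation transcript
)

out = llm("Summarize Section 302 of the IPC in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])
```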

Environment Hell: Finding the correct combination of a 64-bit Python interpreter, CUDA-compatible PyTorch, and the right version of the llama-cpp-python library was a major hurdle that required a clean environment setup.
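A clean setup on Windows might look like the commands below. These are assumed, representative steps (exact package versions vary); the `CMAKE_ARGS` flag tells pip to compile llama-cpp-python with CUDA support instead of installing a CPU-only wheel.

```shell
:: Create an isolated 64-bit Python 3.10 environment
py -3.10 -m venv .venv
.venv\Scripts\activate

:: Build llama-cpp-python with CUDA support (requires the NVIDIA CUDA Toolkit)
set CMAKE_ARGS=-DGGML_CUDA=on
pip install llama-cpp-python --no-cache-dir

pip install flask
```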

Achieving Coherent Memory: Initial tests showed the AI getting confused about its own identity. This was solved through meticulous prompt engineering, where the AI's instructions were refined to be crystal clear about its persona and its task of remembering the conversation.
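The persona fix can be illustrated with a system-prompt block like the one below. The wording is an invented example of the approach, not the project's actual instructions; the point is that the model's identity and memory duties are stated as explicit, unambiguous rules.

```python
# Illustrative persona-pinning instructions (wording is assumed).
PERSONA_INSTRUCTIONS = """\
You are Nyay AI, an assistant specialising in Indian law.
Rules:
1. You are always the Assistant; the human is always the User.
2. Remember facts the User shares (such as their name) and reuse them.
3. Cite the relevant Act or Article when stating a legal rule.
4. If a question needs a lawyer's judgement, say so plainly.
"""

def make_system_prompt(extra_context=""):
    """Prepend the fixed persona block to any per-session context."""
    return PERSONA_INSTRUCTIONS + ("\n" + extra_context if extra_context else "")

print(make_system_prompt("The User's name is Priya."))
```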

Conclusion and Future Scope

Nyay AI demonstrates how modern AI techniques can create genuinely intelligent and useful tools. By combining a capable foundation model with a robust memory mechanism, we can build assistants that are not just knowledgeable, but are also genuinely helpful conversational partners.

Future work could involve integrating a real-time legal case database, expanding to regional Indian languages, or adding document upload and analysis capabilities.

🛠️ Tech Stack


  • Foundation Model: varma007ut/Indian_Legal_Assitant (LLaMA 3.1 8B GGUF)
  • Backend: Python, Flask
  • AI Engine: llama-cpp-python
  • GPU Acceleration: NVIDIA CUDA
  • Frontend: HTML, CSS, JavaScript

🖥️ System Requirements (Strict)


This is a resource-intensive project. The following are the minimum requirements to run it successfully.

Component       Minimum Required Specification
OS              Windows 10/11 (64-bit)
Python          Python 3.10 (64-bit)
GPU (NVIDIA)    4 GB VRAM (e.g., RTX 3050) & CUDA Compute Capability 8.0+
RAM             16 GB
CPU             Modern 6-core processor (e.g., Ryzen 5 5600X, Intel i5-12400)
Storage         20 GB free space on an SSD

🚀 Installation & Setup Guide

Follow these steps carefully to set up and run the project locally.

1. Prerequisites

  • NVIDIA GPU: You must have a compatible NVIDIA graphics card.
  • Git: Ensure Git is installed.
  • Python 3.10 (64-bit): Crucially, you must have a 64-bit build of Python 3.10. You can download it here. Remember to check "Add Python 3.10 to PATH" during installation.

2. Clone the Repository

Download Full Project Source Code

The complete, working source code for this project, including the final Python scripts and the HTML/CSS for the UI, is available on my GitHub repository.

➡️ [GitHub Repository Link ]

Demo Video

