Tax Law GPT
Overview
This project fine-tunes an OpenAI model using the Internal Revenue Code (IRC) to assist accountants and tax specialists in understanding and applying tax regulations. By leveraging the fine-tuned model, professionals can ask complex questions related to the IRC and receive accurate, context-specific responses that cater to their needs.
Project Structure
- train.jsonl: The training dataset containing examples of IRC-related text and corresponding expected model outputs.
- test.jsonl: The validation dataset used to evaluate the model's performance during fine-tuning.
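For orientation, here is a minimal sketch of how a line of train.jsonl or test.jsonl could be produced and checked. It assumes the chat-style layout that OpenAI fine-tuning expects for chat models such as gpt-4o-mini: one JSON object per line, each with a `messages` list. The helper names (`make_example`, `write_jsonl`, `check_jsonl`), the default system prompt, and the basic checks are illustrative assumptions, not part of this project.

```python
import json


def make_example(question, answer,
                 system_prompt="You answer questions about the Internal Revenue Code."):
    """Build one chat-format training record (the system prompt is a placeholder)."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }


def write_jsonl(path, records):
    """Write one JSON object per line, as the JSON Lines format requires."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def check_jsonl(path):
    """Return the number of records; raise ValueError on a malformed line."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)  # JSONDecodeError is a subclass of ValueError
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list) or not messages:
                raise ValueError(f"line {line_no}: missing 'messages' list")
            if messages[-1].get("role") != "assistant":
                raise ValueError(f"line {line_no}: last message should come from the assistant")
            count += 1
    return count
```

Running `check_jsonl("train.jsonl")` before uploading is a cheap way to catch a truncated or hand-edited line early. It does not replace the validation OpenAI performs on upload.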
Prerequisites
- Python 3.8 or higher
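- OpenAI Python Client Library (openai package, version 1.0 or later, since the code below uses the openai.files and openai.fine_tuning interfaces)
- OpenAI API key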
Installation
- Install the necessary Python packages:
pip install openai
- Set your OpenAI API key as an environment variable, or replace 'your_api_key' in the code with your actual API key.
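If you take the environment-variable route, a minimal sketch could look like the following. It assumes the variable is named OPENAI_API_KEY, which is the name the openai package looks up by default. `load_api_key` is a hypothetical helper, not something this project ships.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Read the API key from the environment and fail early if it is missing."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before running the scripts.")
    return api_key
```

With this in place, `openai.api_key = load_api_key()` can stand in for the hard-coded string, which keeps the key out of version control.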
Fine-Tuning the Model
- Upload Training and Validation Files: The first step is to upload the training and validation files (train.jsonl and test.jsonl) to OpenAI. These files must be in JSON Lines format.
    import openai

    # Set your API key
    openai.api_key = 'your_api_key'

    # Upload training file
    with open("train.jsonl", "rb") as train_file:
        training_file_response = openai.files.create(
            file=train_file,
            purpose='fine-tune'
        )
    training_file_id = training_file_response.id
    print(f"Training file uploaded with ID: {training_file_id}")

    # Upload validation file
    with open("test.jsonl", "rb") as validation_file:
        validation_file_response = openai.files.create(
            file=validation_file,
            purpose='fine-tune'
        )
    validation_file_id = validation_file_response.id
    print(f"Validation file uploaded with ID: {validation_file_id}")
- Create the Fine-Tuning Job: After uploading the files, create the fine-tuning job using the file IDs obtained from the upload process. This job configures the model with the desired hyperparameters.
    # Create the fine-tuning job with the file IDs
    fine_tune_response = openai.fine_tuning.jobs.create(
        training_file=training_file_id,
        validation_file=validation_file_id,
        model="gpt-4o-mini-2024-07-18",
        hyperparameters={
            "n_epochs": 3,
            "batch_size": 64,
            "learning_rate_multiplier": 0.1
        }
    )

    # Get job ID
    print(f"Fine-tuning job created with ID: {fine_tune_response.id}")
- Monitor the Fine-Tuning Process: After the job is created, you can monitor its progress through the OpenAI API or dashboard. The process may take some time depending on the dataset size and chosen parameters.
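Monitoring through the API can be sketched as a simple polling loop. The example below assumes the job object returned by `openai.fine_tuning.jobs.retrieve` exposes a `status` attribute that ends in "succeeded", "failed", or "cancelled". `wait_for_job` and its parameters are hypothetical names, and the retrieve function is passed in as an argument so the loop can be tried without network access.

```python
import time

# Statuses after which the job will not change any further.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(retrieve_job, job_id, poll_seconds=30, sleep=time.sleep):
    """Poll until the job reaches a terminal status and return the final job object.

    retrieve_job is any callable that takes a job ID and returns an object
    with a .status attribute, for example openai.fine_tuning.jobs.retrieve.
    """
    while True:
        job = retrieve_job(job_id)
        print(f"Job {job_id} status: {job.status}")
        if job.status in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
```

A call such as `job = wait_for_job(openai.fine_tuning.jobs.retrieve, fine_tune_response.id)` would block until the job finishes. On success, `job.fine_tuned_model` holds the model name to pass to the chat completion call.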
Usage
Once fine-tuning is complete, pass the fine-tuned model's ID to a chat completion request to ask tax-related questions. Here's a basic example of how to interact with the model:
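    import openai

    openai.api_key = 'your_api_key'

    response = openai.chat.completions.create(
        # Replace with the ID of your own fine-tuned model
        model="ft:gpt-4o-mini-2024-07-18:personal::9vWCoRNV",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "your question here"
                    }
                ]
            }
        ],
        temperature=1,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0,
        response_format={
            "type": "text"
        }
    )

    # Print only the model's answer rather than the whole response object
    print(response.choices[0].message.content)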
Contributing
Contributions are welcome!
