Runpod provides an easier method to fine-tune an LLM. For more information, see Fine tune a model.
Axolotl is a tool that simplifies the process of training large language models (LLMs). It provides a streamlined workflow that makes it easier to fine-tune models across a variety of configurations and architectures. Combined with Runpod's GPU resources, Axolotl gives you the compute needed to train LLMs efficiently.
In addition to its user-friendly interface, Axolotl offers a comprehensive set of YAML examples covering a wide range of LLM families, such as Llama 2, Llama 3, Gemma, and Jamba. These examples serve as valuable references, helping you understand the role of each parameter and guiding you in making appropriate adjustments for your specific use case. We recommend exploring these examples to gain a deeper understanding of the fine-tuning process and to optimize the model's performance for your requirements.
In this tutorial, we’ll walk through the steps of training an LLM using Axolotl on Runpod and uploading your model to Hugging Face.
Fine-tuning a large language model (LLM) requires significant compute. For this reason, we recommend fine-tuning on Runpod's GPUs.
To do this, you'll need to create a Pod, specify a container, and then begin training. A Pod is an instance running on one or more GPUs that you can use to run your training job. You also specify the Docker image, such as axolotlai/axolotl-cloud:main-latest, that you want installed on your Pod.
1. Log in to Runpod and deploy your Pod.
2. Select the axolotlai/axolotl-cloud:main-latest image as your Template image.

For optimal compatibility, we recommend using A100, H100, V100, or RTX 3090 Pods for Axolotl fine-tuning.
Now that your Pod is set up, wait for it to finish starting up, then connect to it over secure SSH.
Follow the on-screen prompts to SSH into your Pod.
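For reference, a basic connection through Runpod's SSH proxy typically looks like the sketch below; use the exact command shown in your Pod's connection details, as the user, host, and key path vary.

```bash
# Connect via Runpod's SSH proxy; <pod-id> is a placeholder for your Pod's ID,
# and the key path should match the public key registered with your account.
ssh <pod-id>@ssh.runpod.io -i ~/.ssh/id_ed25519
```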
Use the SSH connection for long-running processes, as it's persistent. Don't rely on the web UI terminal, which disconnects after a period of inactivity.
With the Pod deployed and connected via SSH, we’re ready to move on to preparing our dataset.
The dataset you provide to your LLM is crucial, as it's the data your model learns from during fine-tuning. You can build your own dataset or use a pre-made one.
To continue, use either a local dataset or one stored on Hugging Face.
To use a local dataset, you'll need to transfer it to your Runpod instance. You can do this with the Runpod CLI (runpodctl), which securely transfers files from your local machine to the one hosted by Runpod. All Pods come with runpodctl preinstalled, along with a Pod-scoped API key.
To send a file, run the following command on the computer that has the file:
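For example, a minimal sketch (data.jsonl is a placeholder filename):

```bash
# Send a file from your local machine; runpodctl prints a one-time code for the receiver.
runpodctl send data.jsonl
```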
To receive the file, run the following command on your Runpod machine:
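A sketch of the receiving side, using the one-time code printed by the send command:

```bash
# Receive the file on the Pod; <one-time-code> is the code printed by `runpodctl send`.
runpodctl receive <one-time-code>
```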
Once the local dataset is transferred to your Runpod machine, we can proceed to updating requirements and preprocessing the data.
If your dataset is stored on Hugging Face, you can specify its path in the lora.yml configuration file under the datasets key. Axolotl will automatically download the dataset during the preprocessing step.
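For example, a minimal sketch of that entry (the dataset path and prompt type below are placeholders; match them to your data):

```yaml
datasets:
  - path: your-username/your-dataset # Hugging Face dataset repo ID (placeholder)
    type: alpaca # prompt format; must match how your data is structured
```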
Review the configuration file in detail and make any adjustments to your file as needed.
Before you can start training, you'll need to install the necessary dependencies on your Runpod machine and preprocess your dataset.
In some cases, your Pod will not contain the Axolotl repository. To add the required repository, run the following commands and then continue with the tutorial:
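A sketch of those commands, assuming the current upstream repository location (the project has moved between organizations before, so verify the URL):

```bash
# Clone the Axolotl repository and install it with commonly used extras.
git clone https://github.com/axolotl-ai-cloud/axolotl.git
cd axolotl
pip3 install packaging ninja
pip3 install -e '.[flash-attn,deepspeed]'
```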
Next, update the lora.yml configuration file with your dataset path and other training settings. You can use any of the examples in the examples folder as a starting point. Then run the preprocessing step, which converts your dataset into a format that Axolotl can use for training.
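A minimal sketch of that step, assuming your config is saved as lora.yml in the current directory:

```bash
# Tokenize and cache the dataset defined in lora.yml.
python -m axolotl.cli.preprocess lora.yml
```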
With your environment set up and your data preprocessed, you're ready to start fine-tuning the LLM.
Run the following command to fine-tune the base model.
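For example, a sketch using the same lora.yml config:

```bash
# Launch training with Accelerate using the settings in lora.yml.
accelerate launch -m axolotl.cli.train lora.yml
```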
This will start the training process using the settings specified in your lora.yml file. The training time will depend on factors like your model size, dataset size, and GPU type. Be prepared to wait a while, especially for larger models and datasets.
Once training is complete, you can test your fine-tuned model using the inference script:
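A sketch, assuming the LoRA adapter was written to ./lora-out (a common output_dir in the example configs; check yours):

```bash
# Chat with the model by loading the LoRA adapter on top of the base model.
accelerate launch -m axolotl.cli.inference lora.yml --lora_model_dir="./lora-out"
```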
This allows you to interact with your model and see how it performs on new prompts. If you're satisfied with its performance, merge the LoRA weights into the base model using the merge_lora script.
Run the following command:
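A sketch, again assuming the adapter directory from the previous step:

```bash
# Merge the LoRA adapter into the base model's weights.
python -m axolotl.cli.merge_lora lora.yml --lora_model_dir="./lora-out"
```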
This creates a standalone model that doesn’t require LoRA layers for inference.
Finally, you can share your fine-tuned model with others by uploading it to Hugging Face.
Log in with huggingface-cli, then use the huggingface-cli upload command to upload your merged model to a repository on your account, as in the sketch below.
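A sketch, assuming a hypothetical repository name and that the merged weights were written to a merged/ subfolder of your output directory (check the merge script's output for the actual path):

```bash
# Authenticate with a Hugging Face access token that has write permission.
huggingface-cli login

# Upload the merged model; the repo ID and local path are placeholders.
huggingface-cli upload your-username/your-model ./lora-out/merged
```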
With your model uploaded to Hugging Face, you've successfully completed the fine-tuning process and made your work available for others to use and build upon.
By following these steps and leveraging the power of Axolotl and Runpod, you can efficiently fine-tune LLMs to suit your specific use cases. The combination of Axolotl's streamlined workflow and Runpod's GPU resources makes the process more accessible. Remember to explore the provided YAML examples to gain a deeper understanding of the various parameters and make appropriate adjustments for your own projects. With practice and experimentation, you can unlock the full potential of fine-tuned LLMs and create powerful, customized AI models.