Ollama on Google Colab

Running Large Language Models on Google Colab with Ollama

—— WRITTEN BY CHATGPT ——— except the code, of course, guys!!

Introduction

Google Colab is a cloud-based Jupyter notebook environment that allows users to run Python code for free with access to GPUs and TPUs. Ollama, an open-source framework, enables developers to run and experiment with large language models (LLMs) efficiently. This guide will walk you through setting up Ollama on Google Colab to host and run LLMs.

Understanding Google Colab

Google Colab offers features like free GPU access, easy collaboration, and pre-installed libraries. However, it has limitations such as session timeouts and restricted storage. It is best suited for prototyping and testing models rather than long-running applications.

Introduction to Ollama

Ollama is a framework for running LLMs efficiently, designed with simplicity and performance in mind. It supports various open-source models and is optimized for smooth deployment and testing. Being open-source, it allows extensive customization and integration into different projects.

Setting Up the Environment

Before using Google Colab with Ollama, ensure you have a Google account and access to Colab. You will need to install required dependencies and configure Colab’s runtime for optimal performance.

Steps:

  1. Log in to Google Colab and create a new notebook.

  2. Change the runtime type to a T4 GPU.

  3. Create a new code cell, paste the command below, and run it:

     !curl -fsSL https://ollama.com/install.sh | sh
    
  4. Add a new cell, paste the code below, and run it:

     !pip install aiohttp pyngrok

     import os
     import asyncio

     # Set LD_LIBRARY_PATH so Ollama finds the system NVIDIA libraries
     os.environ.update({'LD_LIBRARY_PATH': '/usr/lib64-nvidia'})

     async def run_process(cmd):
       # Launch a subprocess and stream its stdout/stderr into the cell output
       print('>>> starting', *cmd)
       p = await asyncio.create_subprocess_exec(
           *cmd,
           stdout=asyncio.subprocess.PIPE,
           stderr=asyncio.subprocess.PIPE,
       )

       async def pipe(lines):
         async for line in lines:
           print(line.strip().decode('utf-8'))

       await asyncio.gather(
           pipe(p.stdout),
           pipe(p.stderr),
       )

     # Register an account at ngrok.com, create an authtoken, and paste it here
     await asyncio.gather(
         run_process(['ngrok', 'config', 'add-authtoken', 'NGROK_TOKEN_HERE'])
     )

     # Start the Ollama server, pull a model, and tunnel port 11434 through ngrok
     await asyncio.gather(
         run_process(['ollama', 'serve']),
         # Pull every model you want, with a new run_process for each one
         run_process(['ollama', 'pull', 'mistral']),
         run_process(['ngrok', 'http', '--log', 'stderr', '11434', '--host-header', 'localhost:11434'])
     )
    
  5. Once it runs, you should see an https URL from ngrok in the logs. You can use this URL to reach the Ollama server running on the Colab machine, either from the CLI or over the HTTP API (see the sketch after these steps).

  6. Open a new terminal on your local machine and point the Ollama CLI at the tunnel: export OLLAMA_HOST="<URL>", then hit Enter.

  7. Type ollama run mistral.

  8. ENJOY!!!
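
Once the tunnel is up, you can also call the Ollama HTTP API directly instead of going through the CLI. Below is a minimal sketch that assumes the mistral model has finished pulling; the ngrok URL is a placeholder you must replace with the one printed in your cell output.

     import requests

     # Replace with the https URL printed by ngrok in the cell output
     OLLAMA_URL = 'https://YOUR-TUNNEL.ngrok-free.app'

     # The root endpoint simply reports that the server is up
     print(requests.get(OLLAMA_URL).text)  # -> "Ollama is running"

     # Ask the model for a completion (non-streaming)
     resp = requests.post(
         f'{OLLAMA_URL}/api/generate',
         json={'model': 'mistral', 'prompt': 'Why is the sky blue?', 'stream': False},
     )
     print(resp.json()['response'])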

Since Colab has limited storage, you may need external storage such as Google Drive to hold large models. Also, you cannot run the model on the GPU indefinitely due to Colab's usage restrictions; however, you can still fall back to the CPU.

Steps:

  1. Download LLMs to Colab's local disk, or mount Google Drive so they persist across sessions (see the sketch after this list).

  2. Configure model settings (e.g., quantization level, context size) for efficiency.

  3. Adjust runtime settings (e.g., GPU type, high-RAM mode) for better performance.
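
Here is a minimal sketch of persisting models on Google Drive: mount Drive, create a folder for models, and point Ollama at it via the OLLAMA_MODELS environment variable. The folder name is hypothetical, and the variable must be set before ollama serve is launched, or the server will keep using its default model directory.

     from google.colab import drive
     import os

     # Mount Google Drive into the Colab filesystem
     drive.mount('/content/drive')

     # Hypothetical folder on Drive to hold pulled models across sessions
     models_dir = '/content/drive/MyDrive/ollama_models'
     os.makedirs(models_dir, exist_ok=True)

     # Ollama stores and looks up models in OLLAMA_MODELS; set it before
     # starting 'ollama serve' so pulls land on Drive instead of local disk
     os.environ['OLLAMA_MODELS'] = models_dir

Keep in mind that loading multi-gigabyte model files over Drive can be noticeably slower than Colab's local disk, so this trades startup speed for persistence.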

Using Ollama for Development

Ollama can be used for various development purposes, including:

  • Experimenting with different LLMs.

  • Integrating models into applications.

  • Testing model responses and fine-tuning parameters (see the sketch after this list).
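
As a sketch of that last point, the ollama Python client (pip install ollama) can point at the ngrok tunnel and makes it easy to compare responses under different sampling parameters. The host URL is again a placeholder, the option values are only examples, and mistral is assumed to be pulled already.

     # !pip install ollama
     from ollama import Client

     # Point the client at the ngrok tunnel (placeholder URL)
     client = Client(host='https://YOUR-TUNNEL.ngrok-free.app')

     # Compare responses under two different temperatures
     for temperature in (0.0, 0.8):
         reply = client.generate(
             model='mistral',
             prompt='Name three creative uses for a paperclip.',
             options={'temperature': temperature, 'num_predict': 128},
         )
         print(f'--- temperature={temperature} ---')
         print(reply['response'])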

Advantages and Challenges

Advantages:

  • Free access to GPUs.

  • Quick experimentation without local hardware constraints.

  • Easy integration into existing projects.

Challenges:

  • Limited runtime duration.

  • Storage constraints.

  • Potential latency issues.

Solutions:

  • Use Google Drive for model storage.

  • Optimize model performance by adjusting runtime settings.

  • Periodically refresh Colab sessions to avoid disconnections.

Conclusion

Using Ollama on Google Colab is a powerful way to experiment with LLMs without requiring high-end local hardware. While there are limitations, strategic setup and optimization can help overcome these challenges. Explore further, test different models, and enhance your AI-driven applications!
