You may have already tried generative AI tools like ChatGPT and Google Bard. While cloud access to these tools is popular, you can also run them locally on your own computer. There are some real benefits to this: it’s obviously more private, and you won’t run into usage limits or unavailability warnings. Besides, it’s just cool.
To get started, you’ll need a program to run the AI, and you’ll need a large language model (LLM) to generate the responses. These LLMs form the basis of AI text generators: GPT-4 is the latest version powering ChatGPT, and Google has released Gemini as a newer, improved LLM intended to run behind Google Bard.
If you’ve never heard the term LLM before, you clearly haven’t read our definitive AI glossary. A certain level of scientific and mathematical knowledge is required to fully understand them, but broadly speaking, LLMs are trained on vast amounts of sample data and learn to recognize relationships between words and sentences (i.e. which words typically follow which).
There are several AI models that can be installed locally.
Source: Lifehacker
Simply put, LLMs are supercharged autocomplete engines. They don’t really “know” anything, but they recognize how words should fit together to sound natural and make sense. At a high enough level, it starts to feel like you’re talking to a real person. There’s a lot more to it than that, but you get the idea.
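To make the “supercharged autocomplete” idea concrete, here’s a toy sketch of next-word prediction using simple word-pair counts. Real LLMs use neural networks trained over tokens, not counting tables like this, but the core intuition (learning which words tend to follow which) is the same:

```python
from collections import Counter, defaultdict

# Count which word follows which in a tiny sample "training" corpus.
corpus = "the cat sat on the mat and the cat slept".split()

followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def predict_next(word):
    # Return the most frequent word seen after this one in the corpus.
    counts = followers[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # -> "cat" ("cat" follows "the" twice, "mat" once)
```

An LLM does something analogous at a vastly larger scale, scoring millions of possible continuations instead of looking up one counter.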
When it comes to running your own LLM, you don’t have to be a huge company or research organization to get access: several are publicly available, including one published by Meta called Llama; others have been developed by researchers and volunteers. The general idea is that making LLMs publicly available supports innovation and improves transparency.
For the purposes of this guide, we’ll use LM Studio to show you how to install an LLM locally. It’s one of the best options for the job (though there are quite a few out there), it’s free to use, and it runs on Windows, macOS, and Linux.
How to Install a Local LLM
The first step is to download LM Studio from the official website, keeping the minimum system requirements in mind: LLMs are demanding to run, so you need a fairly powerful computer. Windows or Linux PCs that support AVX2 (usually newer machines) and Apple Silicon Macs running macOS 13.6 or later will work, and at least 16GB of RAM is recommended. On PCs, a minimum of 6GB of VRAM is also recommended.
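If you’re not sure whether your machine qualifies, a quick script can check. This is just a rough pre-flight sketch: it assumes you have Python and the third-party psutil package installed (`pip install psutil`), and the AVX2 check only works on Linux, where CPU flags are exposed in /proc/cpuinfo:

```python
import platform
import psutil  # third-party package; install with: pip install psutil

# Compare installed RAM against the recommended 16GB.
ram_gb = psutil.virtual_memory().total / 1024**3
print(f"RAM: {ram_gb:.1f} GB "
      f"({'OK' if ram_gb >= 16 else 'below the recommended 16 GB'})")

# AVX2 check (Linux only): look for the flag in /proc/cpuinfo.
if platform.system() == "Linux":
    with open("/proc/cpuinfo") as f:
        print("AVX2:", "yes" if "avx2" in f.read() else "not detected")
```

On Windows, a free tool like CPU-Z can show the same CPU instruction-set information.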
Once you have the software up and running, you need to find an LLM to download and use – without one you won’t be able to do much. Part of the appeal of LM Studio is that it recommends “new and noteworthy” LLMs on the app’s front screen, so if you have no idea which LLM you want, you can pick one of those.
You’ll find that LLMs vary in size, complexity, data sources, purpose, and speed: there’s no right or wrong answer on which one to use, but there’s plenty of information on sites like Reddit and Hugging Face if you want to do some research. As you might expect, LLM files can be several gigabytes in size, so you may have time to catch up on some reading while you wait for one to download.
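You can roughly estimate a download before committing to it: a model’s file size is driven by its parameter count and its quantization level (how many bits are used to store each weight). The sketch below is back-of-the-envelope arithmetic only; real files add some overhead on top:

```python
def approx_size_gb(params_billions, bits_per_weight):
    # Total bytes = parameters * bits per parameter / 8 bits per byte.
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

print(f"7B model at 4-bit:  ~{approx_size_gb(7, 4):.1f} GB")   # ~3.5 GB
print(f"13B model at 8-bit: ~{approx_size_gb(13, 8):.1f} GB")  # ~13.0 GB
```

This is also why heavily quantized versions of the same model are so much smaller and easier to run on modest hardware, at some cost in response quality.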
LM Studio can help you find an LLM to use.
Source: Lifehacker
If you see an LLM you like on the home screen, just click Download. Otherwise, you can run a search or paste a URL into the box at the top. You can see the size of each LLM, so you can estimate the download time, as well as the date it was last updated. You can also filter the results to see the most-downloaded models.
You can install as many LLMs as you like (as long as you have the space), and once you have at least one on your system, it will appear in the My Models panel. (To access it, click the folder icon on the left.) Here you can view information about each installed model, check for updates, and remove models.
To start prompting the AI, open the AI Chat panel via the speech-bubble icon on the left. At the top, select the model you want to use, then type your prompt in the user message box at the bottom and press Enter. The kind of results you get will be familiar if you’ve used an LLM such as ChatGPT before.
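If you’d rather talk to your model from a script instead of the chat window, recent versions of LM Studio also include a local server mode that mimics the OpenAI chat API. The sketch below assumes that server is running on its default port (1234 at the time of writing – check your version’s Local Server screen) and that you have Python’s requests package installed:

```python
import requests  # pip install requests

# Send a chat request to LM Studio's OpenAI-compatible local server.
# Endpoint and port are the defaults; adjust if your setup differs.
resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "messages": [
            {"role": "user",
             "content": "Explain what an LLM is in one sentence."}
        ],
        "temperature": 0.7,
    },
    timeout=120,  # local models can be slow to respond
)
print(resp.json()["choices"][0]["message"]["content"])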
The prompt-and-response system is similar to ChatGPT or Bard.
Source: Lifehacker
On the right side, you can control various LLM settings, including how long responses are handled and how much of the processing work is offloaded to your system’s GPU. There’s also a field for a “pre-prompt”: for example, you can tell the LLM to always respond in a certain tone or writing style.
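If you’re scripting against the local server sketched above, the pre-prompt most likely corresponds to a “system” message sent ahead of the user’s prompt (an assumption based on the OpenAI-style API, not something the LM Studio interface spells out):

```python
# Hypothetical pre-prompt as a "system" message in the request payload.
messages = [
    {"role": "system", "content": "Always answer in a friendly, casual tone."},
    {"role": "user", "content": "How do I install a local LLM?"},
]
```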
Click the New Chat button on the left if you want to start a fresh conversation; your previous conversations are saved underneath it in case you need to go back to them. Once a response has finished generating, you have the option to take a screenshot, copy the text, or regenerate a different response from the same prompt.
That’s it! You’re now working with a local LLM. There’s plenty more you can explore in terms of models and prompts if you want to dig deeper, but the basics aren’t hard to grasp, and LM Studio makes the setup process very straightforward, even if you’re a complete beginner.