Don't Trust DeepSeek R1? Me neither! Let's make it safe.


The new DeepSeek R1 Large Language Model (LLM) is quite a technological marvel, and it also proves that sometimes finding slack in the rope is a better option than just buying a longer rope. DeepSeek R1 reaches better benchmark results than OpenAI's o1 with a fraction of the compute. How? Instead of just throwing more compute at the problem, DeepSeek used new techniques to do more with the same hardware. Those techniques are outside the scope of this blog, as I am by no means an AI expert, but feel free to read the paper yourself! You can find it here.

The best part is that this LLM is Open Source under the MIT license. This is a big deal: it means we can run the model on our own hardware, and we can also make and redistribute other models derived from it. Great, but what's the catch? The catch is that DeepSeek is a China-based company that, in the short time it has existed, has already had some really bad data privacy issues. Not to mention the IT security laws in China make this a difficult piece of software to trust.

Personally, I am not very skeptical of the Open Source model itself: from my limited understanding, an LLM model file is basically some header data followed by a large blob of floating point weights. That doesn't mean a model file could never be infected, so let's just assume the worst. Well, I still want to use R1 on my desktop, so how are we going to do that? Ollama is a great tool for this, as you can run it in an isolated Docker or Podman container. Network Chuck has a great video on how to harden Ollama for this use case that I will link below.
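To make that concrete, here is a rough sketch of what running Ollama in a rootless, locked-down Podman container can look like. The exact flags and the volume name are my own assumptions, not necessarily the steps from the video, so treat it as a starting point rather than a recipe:

```bash
# A hardening sketch: rootless Podman, no extra capabilities, no privilege
# escalation, API bound to localhost only. The volume name "ollama-models"
# is an illustrative assumption.
podman run -d --rm --name ollama \
  --cap-drop=ALL \
  --security-opt=no-new-privileges \
  -p 127.0.0.1:11434:11434 \
  -v ollama-models:/root/.ollama \
  docker.io/ollama/ollama

# Then pull and chat with the model inside that container
podman exec -it ollama ollama run deepseek-r1
```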

Now, I love Ollama and it was my go-to LLM tool for over a year, but I always wished for a tool that was a bit more modular in how it "deploys" models.

Enter RamaLama. I found out about RamaLama from Christian Schaller's blog post about the future of AI in Fedora Workstation. At first it just looked like an Ollama clone with a few extra features and support for a few more models, but after reading through their GitHub I came across their Security section.

"Because RamaLama defaults to running AI models inside of rootless containers using Podman on Docker. These containers isolate the AI models from information on the underlying host. With RamaLama containers, the AI model is mounted as a volume into the container in read/only mode. This results in the process running the model, llama.cpp or LLM, being isolated from the host. In addition, since ramalama run uses the --network=none option, the container can not reach the network and leak any information out of the system. Finally, containers are run with --rm options which means that any content written during the running of the container is wiped out when the application exits."

Well ain't that just exactly what we were looking for? Each model gets its own container, no internet access, no file system access, no special or elevated privileges, private container registry support, and the container is deleted when it exits. I'm not sure about you, but this makes me much more comfortable with running R1 on my machine.
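For reference, day-to-day use looks roughly like the sketch below. The `ollama://deepseek-r1` model reference is my assumption about how the model is named in the default registry, so double-check what RamaLama resolves on your install:

```bash
# A minimal usage sketch (model reference is an assumption, see above)
ramalama pull ollama://deepseek-r1    # fetch the model into the local store
ramalama run ollama://deepseek-r1     # chat with it inside a rootless, network-less container
ramalama serve ollama://deepseek-r1   # or expose it as a local API endpoint instead
```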

Now, it is important to keep our attack surface in mind even with RamaLama. The R1 model file still exists on our host system; it is just never loaded or executed there, since nothing outside the container ever uses it. So the risk is lower than running Ollama in a container, and significantly lower than installing Ollama directly on the host. It is also important to note that RamaLama is still under development and has not reached a 1.0.0 release.
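If you want to see exactly what is sitting on your host, RamaLama can show you its local store. The path below is what I understand to be the default for a rootless install (it can be overridden with the --store option), so treat it as an assumption:

```bash
# Inspect what RamaLama has downloaded to the host
ramalama list                   # models in the local store
ls ~/.local/share/ramalama      # default rootless store location (assumption; see --store)
```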

Overall, RamaLama appears to be the most secure way to run LLMs on a modern system! A big thanks to the Fedora team and Christian Schaller for raising awareness and including it in Fedora!
