apt-mirror and Other Caching for Debian/Ubuntu Repositories

Wednesday 16 January 2008 by Bradley M. Kuhn

At a small non-profit, everyone has to wear lots of hats, and one that I have to wear from time to time (since no one else here can) is “sysadmin”. One of the perennial rules of system administration is: you can never give users enough bandwidth. The problem is, they eventually learn how fast your connection to the outside is, and then complain any time a download doesn't run at that speed. Of course, if you have a T1 or better, it's usually the other side that's the problem. So, I look to use our extra bandwidth during off hours to cache large pools of data that are often downloaded. With an organization full of Ubuntu machines, the Ubuntu repositories are an important target for caching.

apt-mirror is a program that mirrors large Debian-based repositories, including the Ubuntu ones. There are already tutorials available on how to set it up. What I'm writing about here is a way to “force” users to use that repository.

The obvious way, of course, is to make everyone's /etc/apt/sources.list point at the mirrored repository. This often isn't a good option. Apart from the servers, the user base here is all laptops, which means they will often be on networks that are actually closer to another package repository, and I'd rather not interfere with that. (Although, given that I can usually serve almost any IP address in the world faster than the 30kbs/sec that ubuntu.com's servers seem to quickly throttle to, that probably doesn't matter so much.)

The bigger problem is that I don't want to be married to the idea that the apt-mirror is part of our essential 24/7 infrastructure. I don't want an angry late-night call from a user because they can't install a package, and I want the complete freedom to discontinue the server at any time, if I find it to be unreliable. I can't do this easily if sources.list files on traveling machines are hard-coded with the apt-mirror server's name or address, especially when I don't know when exactly they'll connect back to our VPN.

The easier solution is to fake out the DNS lookups via the DNS server used by the VPN and the internal network. This way, users only get the mirror when they are connected to the VPN or in the office; otherwise, they get the normal Ubuntu servers. I had actually forgotten you could fake out DNS on a per-host basis, but asking my friend Paul reminded me quickly. In /etc/bind/named.conf.local (on Debian/Ubuntu), I just add:

            zone "archive.ubuntu.com"      {
                    type master;
                    file "/etc/bind/db.archive.ubuntu-fake";
            };

And in /etc/bind/db.archive.ubuntu-fake:

            $TTL    604800
            @ IN SOA archive.ubuntu.com.  root.vpn. (
                   2008011001  ; serial number
                   10800 3600 604800 3600)  ; refresh, retry, expire, negative-cache TTL
                 IN NS my-dns-server.vpn.
            ;  Begin name records
            archive.ubuntu.com.  IN A            MY.EXTERNAL.FACING.IP

And there I have it; I just do one of those for each hostname I want to replace (e.g., security.ubuntu.com). Now, when client machines look up archive.ubuntu.com (et al), they'll get MY.EXTERNAL.FACING.IP, but only when my-dns-server.vpn is first in their resolv.conf.
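Since the stanza and zone file are nearly identical for every hostname, they can be generated rather than typed by hand. What follows is only a rough sketch, not what I actually run: the function names are made up, 192.0.2.10 is a documentation address standing in for MY.EXTERNAL.FACING.IP, and the rule for deriving the zone file name from the hostname is just my guess at a convention that matches the example above.

```python
# Sketch: emit a named.conf.local stanza and a zone file for each
# hostname to be overridden. All names and values are illustrative.

MIRROR_IP = "192.0.2.10"            # placeholder for MY.EXTERNAL.FACING.IP
NAMESERVER = "my-dns-server.vpn."   # NS name from the example zone file
HOSTNAMES = ["archive.ubuntu.com", "security.ubuntu.com"]


def zone_file_path(hostname):
    """Assumed naming rule: archive.ubuntu.com -> db.archive.ubuntu-fake."""
    stem = hostname[:-len(".com")] if hostname.endswith(".com") else hostname
    return f"/etc/bind/db.{stem}-fake"


def zone_stanza(hostname):
    """Return the named.conf.local stanza that overrides one hostname."""
    return (
        f'zone "{hostname}" {{\n'
        f"        type master;\n"
        f'        file "{zone_file_path(hostname)}";\n'
        f"}};\n"
    )


def zone_file(hostname, mirror_ip, nameserver, serial):
    """Return zone file text with a single A record pointing at the mirror."""
    return (
        "$TTL    604800\n"
        f"@ IN SOA {hostname}.  root.vpn. (\n"
        f"       {serial}  ; serial number\n"
        "       10800 3600 604800 3600)\n"
        f"     IN NS {nameserver}\n"
        f"{hostname}.  IN A  {mirror_ip}\n"
    )


if __name__ == "__main__":
    for host in HOSTNAMES:
        print(zone_stanza(host))
        print(f"; --- contents for {zone_file_path(host)} ---")
        print(zone_file(host, MIRROR_IP, NAMESERVER, 2008011001))
```

The script only prints text; copying the output into place and reloading BIND is left as a manual step, and running named-checkzone against each generated file first seems a sensible precaution.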

Next time, I'll talk about some other ideas on how I make the apt-mirror even better.

Posted on Wednesday 16 January 2008 at 15:22 by Bradley M. Kuhn.

Submit comments on this post to <bkuhn@ebb.org>.

This website and all documents on it are licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.
