By Kurt Seifried [email protected]
There are a variety of proxy software packages for Linux. Some operate at the application level (such as Squid) and others at the session level (such as SOCKS).
Application proxy server software
SQUID is a powerful and fast object cache server. It proxies FTP and WWW sessions, basically giving it many of the properties of an FTP and a WWW server, but it only reads and writes files within its cache directory (or so we hope), making it relatively safe. Squid would be very hard to use to actually compromise the system, and it runs as a non-root user (typically 'nobody'), so generally it's not much to worry about. Your main worry with Squid should be improper configuration. For example, if Squid is hooked up to your internal network (as is usually the case) and to the Internet (again, very common), it could actually be used to reach internal hosts (even if they are using non-routed IP addresses). Hence proper configuration of Squid is very important.
The simplest way to make sure this doesn't happen is to use Squid's internal configuration and only bind it to the internal interface(s), not letting the outside world use it as a proxy to get at your internal LAN. In addition to this, firewalling it is a good idea. Squid can also be used as an HTTP accelerator (also known as a reverse proxy); perhaps you have an NT WWW server on the internal network that you want to share with the world. In this case things get a bit harder to configure, but it is possible to do relatively securely. Fortunately Squid has very good ACLs (Access Control Lists) built into the squid.conf file, allowing you to lock down access by name, IP, network, time of day, or actual day (perhaps you allow unlimited browsing on the weekends for people that actually come in to the office). Remember, however, that the more complicated an ACL is, the slower Squid will be to respond to requests.
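As a sketch of such a day-of-week policy (the acl names and the weekend schedule here are made up for illustration; in Squid's time syntax S is Sunday, A is Saturday, and MTWHF are the weekdays):

```
# hypothetical policy: unrestricted browsing on weekends,
# working-hours-only access during the week
acl weekend time AS
acl workhours time MTWHF 09:00-17:00
http_access allow weekend
http_access allow workhours
http_access deny all
```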
Most network administrators will want to configure Squid so that an internal network can access www sites on the Internet. In this example 10.0.0.0/255.255.255.0 is the internal network, 1.2.3.4 is the external IP address of the Squid server, and 5.6.7.8 is a www server we want to see.
Squid should be configured so that it only listens for requests on its internal interface; if it were listening on all interfaces, anyone could go to 1.2.3.4 port 3128 and request http://10.0.0.2/ (or any internal machine for that matter) and view www content on your internal network. You want something like this in your squid.conf file:
tcp_incoming_address 10.0.0.1
tcp_outgoing_address 1.2.3.4
udp_incoming_address 10.0.0.1
udp_outgoing_address 1.2.3.4
This will prevent anyone from using Squid to probe your internal network.
On the opposite side of the coin we have people that use Squid to make internal www servers accessible to the Internet in a controlled manner. For example you may have an IIS 4.0 www server you want to put on the Internet, but are afraid to connect it directly. Using Squid you can grant access to it in a very controlled manner. In this example 5.6.7.8 is a random machine on the Internet, 1.2.3.4 is the external IP address of the Squid server, 10.0.0.1 is its internal IP address, and 10.0.0.2 is a www server on the internal network running IIS 4.0.
To set Squid up to run as an accelerator simply set the http_port to 80 in squid.conf:

http_port 80
And then set the IP addresses differently:
tcp_incoming_address 1.2.3.4
tcp_outgoing_address 10.0.0.2
udp_incoming_address 1.2.3.4
udp_outgoing_address 10.0.0.2
And finally you have to define the machine you are accelerating for:
httpd_accel_host 10.0.0.2
httpd_accel_port 80
This is covered extensively in the Squid FAQ at: http://www.squid-cache.org/Doc/FAQ/FAQ.html (section 20).
The ACLs work by defining rules, and then applying those rules; for example:
acl internalnet src 10.0.0.0/255.255.255.0
http_access allow internalnet
http_access deny all
This defines "internalnet" as anything with a source address in 10.0.0.0/255.255.255.0, allows it access to the http caching port, and denies everything else. Remember that rules are read in the order given, just like ipfwadm, allowing you to get very complex (and to make mistakes if you are not careful). Always start with the specific rules followed by more general rules, and remember to put blanket denials after specific allowances, otherwise requests might make it through. It's better to accidentally deny something than to let it through, as you'll find out about denials (usually from annoyed users) faster than about things that get through (when annoyed users notice accounting files from the internal www server appearing on the Internet). The Squid configuration file (squid.conf) is well commented (to the point of overkill) and also has a decent man page.
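As a sketch of that ordering (the badhost address is hypothetical), a specific denial must come before the more general allowance that would otherwise match it:

```
# deny one misbehaving internal machine, allow the rest of
# the internal network, then deny everything else
acl badhost src 10.0.0.13/255.255.255.255
acl internalnet src 10.0.0.0/255.255.255.0
http_access deny badhost
http_access allow internalnet
http_access deny all
```

If the deny badhost line came after allow internalnet it would never be reached, since 10.0.0.13 also matches internalnet.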
Another useful example is blocking ads; to block them you can add the following to squid.conf:
acl ads dstdomain ads.blah.com
http_access deny ads
The acl declaration is simply a pattern, be it a destination domain name, source domain name, regex and so on; the http_access directive actually specifies what to do with it (deny, allow, etc.). Properly set up, this is an extremely powerful tool to restrict access to the WWW. Unfortunately it does have one Achilles heel: it doesn't support user-based authentication and control (not that many UNIX-based proxy servers do). Remember that, like any set of rules, they are read from top to bottom, so put your specific denials and allowances first, and then the more general rules. The squid.conf file should be well commented and self-explanatory; the Squid FAQ is at: http://www.squid-cache.org/Doc/FAQ/FAQ.html
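Since the acl pattern can be a regex, ad blocking can cast a wider net than a single dstdomain entry; a sketch (the patterns here are made up and will need tuning for your site):

```
# hypothetical patterns: match ad server host names and
# common banner directories in URLs (case insensitive)
acl adhosts dstdom_regex -i ^ads\.
acl adpaths url_regex -i /banners?/
http_access deny adhosts
http_access deny adpaths
```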
One important security issue most people overlook with Squid is the log files it keeps. By default Squid may or may not log each request it handles (it depends on the config file), from http://www.nsa.gov/ to http://www.example.org/cgi-bin/access&member=john&password=bob. You definitely want to disable the access logs unless you want to keep a close eye on what people view on the Internet (legally this is questionable; check with your lawyers). The directive is cache_access_log, and to disable it set it to /dev/null; it logs ALL accesses, as well as ICP queries (inter-cache communications). The next big one is cache_store_log, which is actually semi-useful for generating statistics on how effective your www cache is. It doesn't log who made the request, simply what the status of objects in the cache is, so in this case you would see the pictures on a pornographic site being repeatedly served; to disable it set it to none. The cache_log should probably be left on; it contains basic debugging info such as when the server was started and when it was stopped. To disable it set it to /dev/null. Another, not very well documented log file is cache_swap_log, which keeps a record of what is going on with the cache, and will also show you the URLs people are visiting (but not who requested them, etc.). Setting this to /dev/null doesn't work (in fact Squid fails rather badly), and setting it to none simply changes the filename from log to none. The only way to stop it is to link the file (by default log in the root of the www cache directory) to /dev/null, and also to link log-last-clean to /dev/null (although in my quick tests it doesn't appear to store anything, you can't be sure otherwise). So to summarize:
cache_access_log /dev/null
cache_store_log none
cache_log /dev/null
ln -sf /dev/null /var/spool/squid/log
ln -sf /dev/null /var/spool/squid/log-last-clean
or whichever directory holds the root of your www cache (the 00 through 0F directories).
Another important issue that gets forgotten is the ICP (Internet Cache Protocol) component of Squid. The only time you will use ICP is if you create arrays or chains of proxy servers. If you're like me, you have only the one proxy server, and you should definitely disable ICP. This is easily done by setting the ICP port in squid.conf from the default 3130 to 0. You should also firewall port 3128 (the default Squid port that clients connect to) from the Internet:
ipfwadm -I -a accept -P tcp -S 10.0.0.0/8 -D 0.0.0.0/0 3128
ipfwadm -I -a accept -P tcp -S some.trusted.host -D 0.0.0.0/0 3128
ipfwadm -I -a deny -P tcp -S 0.0.0.0/0 -D 0.0.0.0/0 3128
or in ipchains:
ipchains -A input -p tcp -j ACCEPT -s 10.0.0.0/8 -d 0.0.0.0/0 3128
ipchains -A input -p tcp -j ACCEPT -s some.trusted.host -d 0.0.0.0/0 3128
ipchains -A input -p tcp -j DENY -s 0.0.0.0/0 -d 0.0.0.0/0 3128
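The ICP change itself is a one-line edit in squid.conf:

```
icp_port 0
```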
squidGuard allows you to put in access control lists, filter lists, and redirected requests, easily and efficiently. It is ideal for controlling access to the WWW, and for more specific tasks such as blocking pornographic content (a valid concern for many people). It cannot make decisions based upon content, however; it simply looks at the URLs being processed, so it cannot be used to block active content and so on. squidGuard is available from: http://www.squidguard.org
LDAP auth module for SQUID
This allows you to authenticate users via an LDAP server; however, passwords and so on are transmitted in the clear, so use some form of VPN to secure it. You can get it from: http://www.stroeder.com/proxy_auth_ldap/
Cut the crap (CTC) is aimed at blocking banner ads and reducing bandwidth usage while surfing. You can get it from: http://www.softlab.ece.ntua.gr/~ckotso/CTC/.
WWWOFFLE is a rather nice looking proxy for UNIX systems that handles HTTP and FTP. You can get it at: http://www.gedanken.demon.co.uk/wwwoffle/.
SOCKS is a circuit-level proxy, typically loaded on firewalls because it has good access controls. Applications must be SOCKSified; most popular web browsers, FTP clients, and so on have support by default. You can get it from: http://www.socks.nec.com/.
Dante is a free implementation of the popular SOCKS server. It is available from: http://www.inet.no/dante/.
Last updated on 1/9/2001
Copyright Kurt Seifried 2001 [email protected]