Dave Hall Consulting: Solr Replication, Load Balancing, haproxy and Drupal

I use Apache Solr for search on several projects, including a few using Drupal. Solr has built in support for replication and load balancing, unfortunately the load balancing is done on the client side and works best when using a persistent connection, which doesn't make a lot of sense for php based webapps. In the case of Drupal, there has been a long discussion on a patch in the issue queue to enable Solr's native load balancing, but things seem to have stalled.
In one instance I have Solr replicating from the master to a slave, with the plan to add additional slaves if the load justifies it. In order to get Drupal to write to the master and read from either node I needed a proxy or load balancer. In my case the best lightweight http load balancer that would easily run on the web heads was haproxy. I could have run varnish in front of solr and had it do the load balancing but that seemed like overkill at this stage.
Now when an update request hits haproxy it directs it to the master, but for reads it balances the requests between the 2 nodes. To get this setup running on ubuntu 9.10 with haproxy 1.3.18, I used the following /etc/haproxy/haproxy.cfg on each of the web heads:read more