pea53: software & such (http://pea53.com)

Pre-Forking Workers in Ruby
http://pea53.com/pre-forking-workers-in-ruby
Sun, 02 Jan 2011 17:55:00 -0800

On the same day, two separate Rubyists asked me the very same question: “How do you communicate between the parent and a forked child worker?” This question needs a little background information to be properly understood.

Pre-forking is a UNIX idiom. When a process is expected to handle many tasks simultaneously, child processes can be created to offload the work from the parent process. Generally this makes the application more responsive; the child processes can use multiple CPUs and handle IO streams without blocking the parent. Eric Wong’s Unicorn web server uses child processes in this fashion. Ryan Tomayko has a fantastic blog post describing Unicorn and pre-forking child processes.

Servolux provides a Prefork class to manage a pool of forked child processes. Internally the Prefork class uses a pipe to send messages and status between the parent and child. This pipe provides all kinds of niceties like heartbeat, child timeout, and rolling restarts. However, this pipe cannot be used for general communication to the child by the end user.
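To make the idea of a parent/child pipe concrete, here is a minimal sketch using only plain `fork` and `IO.pipe` from the Ruby standard library. It is not how Servolux implements its pipe; the "heartbeat" message format and the two-second timeout are assumptions chosen for illustration.

```ruby
# A child reports "heartbeats" to its parent over a pipe.
# Illustrative only: the message text and timeout are made up for this sketch.
reader, writer = IO.pipe

pid = fork do
  reader.close                          # the child only writes
  3.times { |i| writer.puts "heartbeat #{i}" }
  writer.close
  exit!(0)                              # leave without running at_exit handlers
end

writer.close                            # the parent only reads

beats = []
loop do
  # Wait up to 2 seconds for the child; nil means the child went silent.
  ready = IO.select([reader], nil, nil, 2.0)
  break if ready.nil?
  line = reader.gets
  break if line.nil?                    # EOF: the child closed its end
  beats << line.chomp
end

reader.close
Process.wait(pid)
puts beats.inspect
```

Note that the pipe ties the parent to one particular child, which is why it suits supervision rather than distributing work.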

In fact, communication with a specific child process is usually not the desired behavior.

Two of the major reasons for using child processes are (1) to take advantage of multiple CPUs and (2) to handle IO-intensive tasks. In both of these situations each child process should be interchangeable with any other child process. That is, we don’t care which child is handling a calculation or some IO process; one child is as good as another.

So, the communication pipe used by the Prefork class is not really what we’re after. It is used to manage a specific child. Instead we need a way to send a message to the entire pool of child workers. Any currently available child can handle the message.

Beanstalkd

The simplest method of communication with the child processes is via a message queue. The Servolux gem provides some example documentation showing how to use a Beanstalkd message queue to send jobs to the child processes. Each child establishes a connection to the queue and waits for messages to process. The user pushes messages onto the queue to be handled by some child.

Sockets

A harder (but more educational) method is to use sockets for communication to a child process. The following example is lifted mostly from Ryan Tomayko’s blog post mentioned above, but with a smattering of Servolux thrown in for good measure.

The key thing to take away here is that we create a UNIX server socket in the parent process, and the forked children “accept” on this socket to receive messages. As odd as it seems, the parent then creates a UNIX socket in order to send messages to the children; the parent sends messages to the children who are accepting.

Because each child has a copy of the UNIX server socket, each child also needs to close this socket. This is done in the after_executing method. This method is called just before the child process exits. Resource cleanup happens here.

The parent process also needs to close the UNIX server socket and remove the socket file created in the tmp folder. These final steps are performed in the ensure block so that they run even if an error is raised.
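The steps described above can be sketched with the Ruby standard library alone. This sketch uses plain `fork` in place of the Servolux Prefork class, so the child's cleanup sits in an `ensure` block rather than in an after_executing method. The socket file name, the worker count, the "quit" message, and the reply format are all assumptions made for illustration.

```ruby
require 'socket'
require 'tmpdir'

# Hypothetical values chosen for this sketch.
socket_path  = File.join(Dir.tmpdir, "prefork-sketch-#{Process.pid}.sock")
worker_count = 3
replies      = []

server = UNIXServer.new(socket_path)    # created in the parent, before forking

begin
  pids = Array.new(worker_count) do
    fork do
      begin
        loop do
          client  = server.accept       # whichever child is free takes the message
          message = client.gets.to_s.chomp
          client.puts "#{Process.pid} handled #{message}"
          client.close
          break if message == 'quit'
        end
      ensure
        server.close                    # each child closes its inherited copy
      end
      exit!(0)
    end
  end

  # The parent connects as a client to the socket its children are accepting on.
  %w[alpha beta gamma].each do |job|
    UNIXSocket.open(socket_path) do |sock|
      sock.puts job
      replies << sock.gets.chomp
    end
  end

  # One "quit" per worker; a worker that has quit no longer accepts.
  worker_count.times do
    UNIXSocket.open(socket_path) do |sock|
      sock.puts 'quit'
      sock.gets
    end
  end

  pids.each { |pid| Process.wait(pid) }
ensure
  server.close
  File.delete(socket_path) if File.exist?(socket_path)
end

puts replies
```

The parent never chooses a recipient. The kernel hands each connection to one of the children blocked in `accept`, which is exactly the "any available worker" behavior described earlier.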

Conclusion

The main concept to take away here is that pre-forked workers are indistinguishable from one another. One child process is as good as another. The vast majority of the time you will need to pass messages to the first available child worker. If you find yourself needing to communicate with a specific child then Prefork is most likely not the solution you are looking for.

Tim Pease (pea53)
Rolling Rails Log Files
http://pea53.com/rolling-rails-log-files
Wed, 01 Dec 2010 19:23:00 -0800

There is a small issue with the default Rails logging setup. If left unchecked, the production log file can grow to fill all available space on the disk and cause the server to crash. The end result is a spectacular failure brought on by a minor oversight: Rails provides no mechanism to limit log file sizes.

Periodically the Rails log files must be cleaned up to prevent this from happening. One solution available to Linux users is the built-in logrotate program. Kevin Skoglund has written a blog post describing how to use logrotate for rotating Rails log files. The advantage of logrotate is that nothing needs to change in the Rails application in order to use it. The disadvantage is that the Rails app should be halted to safely rotate the logs.

Another solution is to replace the default Rails logger with the Ruby logging framework. The logging framework is an implementation of the Apache Log4j framework in Ruby. Although more complex than using logrotate, the logging framework allows Rails to roll its own log files without needing to halt the application. Another advantage is that the logging destination (a file in this case) can be redirected to a syslog daemon via a configuration file.

Configuring Rails

Rails provides for application configuration via the config/environment.rb file; setting the Rails.logger in this file is the normal way of specifying an alternative logger to use. However, the logging framework needs to be interposed much earlier in the Rails initialization process. This is accomplished via the much loved/dreaded monkey-patch.

The following Ruby code should be saved in your Rails app as lib/logging-rails.rb:

Include this file at the top of your config/environment.rb file just after the Rails bootstrap require line.

As a bonus, at the very end of the environment file is a line that will dump the current logging configuration to STDERR when the Rails logger is set to debug mode. This visual display of the configuration is very useful for understanding where log messages will be sent.

Configuring Logging

Now that Rails has been cowed into submission, it is time to configure the logging framework and enable rolling log files. A new configuration file has been introduced by the lib/logging-rails.rb file. The config/logging.rb file contains the base settings for the logging framework; these settings can be overridden and refined in the environment specific Rails configuration files.

Copy the following code to your config folder:

This file is Ruby code that calls the configuration hooks provided by the logging framework. There are many more examples demonstrating the various appenders and techniques to achieve the desired logging output. A brief overview of what is happening is warranted, though.

The first line describes how objects will be formatted when passed as the log message. The allowed values for format_as are :string, :inspect, and :yaml.

The second line defines how log messages will be formatted before being sent to the appenders (appenders do the actual writing of the log message to the logging destination). The PatternLayout class is well documented. This layout will be assigned to the rolling file appender.

Next comes the rolling file appender definition. Each appender is given a name that can be used to refer to the appender later in the configuration. The rolling file appender is configured to roll daily and keep the last 7 days of log files. Setting the :truncate flag to true will cause the current log file to be truncated when it is opened; usually this is not the desired behavior when Rails starts. The :auto_flushing flag can either be true or it can be a number; when a number is used, that many log messages are buffered and then written to disk in bulk. The configuration shown flushes each log message to disk as soon as it is sent to the appender.

Finally comes the configuration of the root logger. Since the logging framework is written in the spirit of Log4j, there is a hierarchy of logger instances, each with its own log level and possibly its own appender. The rolling file appender is assigned to the root logger, and the log level is set according to the Rails configuration.
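The four steps walked through above can be sketched as a standalone configuration. This is not the original config/logging.rb listing; it is a minimal sketch assuming the logging gem is installed and run outside of Rails. The appender name, the log directory, the pattern string, and the fixed :info level are assumptions made for illustration, whereas in a Rails app the file name and level would come from the Rails configuration.

```ruby
require 'logging'
require 'fileutils'

# Hypothetical log location for this sketch.
log_dir = File.expand_path('log')
FileUtils.mkdir_p(log_dir)

# 1. How objects passed as the log message are formatted.
Logging.format_as :inspect

# 2. How each log message is laid out before it reaches an appender.
layout = Logging.layouts.pattern(:pattern => "[%d] %-5l %c : %m\n")

# 3. A rolling file appender: roll daily, keep 7 old files,
#    do not truncate on open, flush every message immediately.
appender = Logging.appenders.rolling_file(
  'sketch_log',
  :filename      => File.join(log_dir, 'sketch.log'),
  :age           => 'daily',
  :keep          => 7,
  :truncate      => false,
  :auto_flushing => true,
  :layout        => layout
)

# 4. Attach the appender to the root logger and set the level.
Logging.logger.root.level     = :info
Logging.logger.root.appenders = [appender]

Logging.logger['Sketch'].info 'rolling file appender configured'
```

Because the appender sits on the root logger, every logger in the hierarchy writes to the same rolling file unless it is given its own appender.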

Congratulations! Rails is now rolling its log files.

Bonus!

Now that the logging framework has been integrated into the Rails app, other appenders can be used. Multiple appenders can be used simultaneously, too.

At work, we are using the syslog appender to send all our log messages to a central logging server. We then use Splunk to extract information from the aggregated log files. That is another entire post in itself. Please respond on Twitter if you are interested in learning more.

Postscript

This approach to rolling Rails log files only works with Rails 2 projects. Rails 3 has drastically changed the initialization process. A post is forthcoming on how to integrate the logging framework with Rails 3; the first step is digging into the new initialization process.
