Kiel and I had a fun time tracking down a client's networking problem the other day. Their scp transfers from their application servers behind a Cisco PIX firewall failed after a few seconds, consistently, with a connection reset.
The problem was easily reproducible with packet sizes of 993 bytes or more, not just with TCP but also ICMP (bloated ping packets, generated with ping -s 993 $host). That raised the question of how this problem could go undetected for their heavy web traffic. We determined that their HTTP load balancer avoided the problem as it rewrote the packets for HTTP traffic on each side.
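A loop along these lines is one way to pin down the size at which pings start failing. It is only a sketch, not the commands we actually ran: the function name find_failing_size, the default size range, and the PING override variable are made up for illustration, and it assumes the Linux ping options -c, -W, and -s.

```shell
# Hypothetical helper: print the smallest ICMP payload size in [start, end]
# for which a single ping to the host gets no reply.
# PING can be overridden (e.g. for a dry run); it defaults to the system ping.
find_failing_size() {
    host=$1
    size=${2:-900}
    end=${3:-1100}
    while [ "$size" -le "$end" ]; do
        # -c 1: send one packet; -W 2: wait up to 2 seconds; -s: payload bytes
        if ! ${PING:-ping} -c 1 -W 2 -s "$size" "$host" >/dev/null 2>&1; then
            echo "$size"
            return 0
        fi
        size=$((size + 1))
    done
    echo "no failure between the given sizes" >&2
    return 1
}
```

Something like find_failing_size $host 900 1100 would then print the first size that got no reply, or return nonzero if every size in the range went through.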
Kiel narrowed the connection resets down to iptables' state tracking considering packets INVALID, not ESTABLISHED or RELATED as they should be.
Then he found via tcpdump that the problem was easily visible in scp connections when TCP window scaling adjustments were made by either side of the connection. We tried disabling window scaling but that didn't help.
We tried having iptables allow packets in state INVALID when they were also ESTABLISHED or RELATED, and that reduced the frequency of terminated connections, but still didn't eliminate them entirely. (And it was a kludge we weren't eager to keep in place anyway.)
We wanted to avoid some unpleasant possibilities: (1) turning off stateful firewalling or (2) performing risky updates or configuration changes on the Cisco PIX, which may or may not fix the problem, in the middle of the busy holiday ecommerce season.
Finally, Kiel found this netfilter mailing list post which describes how to enable a Linux kernel workaround for the mangled packets the Cisco generates:
echo 1 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_be_liberal
Of course, we also saved that setting in /etc/sysctl.conf so it persists after a reboot.
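For reference, the /proc path above corresponds to the sysctl key net.ipv4.netfilter.ip_conntrack_tcp_be_liberal (the path components after /proc/sys/ joined with dots). Here is a minimal sketch of persisting it. The helper name persist_sysctl and its optional file argument are hypothetical, added so it can be tried against a scratch file before touching /etc/sysctl.conf.

```shell
# Hypothetical helper: append "key = value" to a sysctl config file
# unless a line for that key is already present.
persist_sysctl() {
    key=$1
    value=$2
    conf=${3:-/etc/sysctl.conf}
    # Compare the text before "=" (whitespace stripped) with the key exactly.
    if [ -f "$conf" ] && awk -F= -v k="$key" '
            { name = $1; gsub(/[[:space:]]/, "", name) }
            name == k { found = 1 }
            END { exit !found }' "$conf"; then
        echo "$key already set in $conf; edit it by hand if the value differs" >&2
    else
        printf '%s = %s\n' "$key" "$value" >> "$conf"
    fi
}

# Intended use, as root:
#   persist_sysctl net.ipv4.netfilter.ip_conntrack_tcp_be_liberal 1
#   sysctl -p    # reload /etc/sysctl.conf to confirm the setting applies
```

The existence check keeps repeated runs from piling up duplicate lines. It deliberately does not rewrite an existing entry, since silently changing a value someone else set seems worse than asking for a manual edit.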
So we have reliable long-running scp connections with TCP window scaling working and iptables doing its job. I love it when a plan comes together.
Comments
Sometimes software is buggy, and even with the malleability of open source software, upgrading to fix a problem may not be an immediate option due to lack of time, risks to production stability, or problems caused by other incompatible changes in a newer version of the software.
ImageMagick is a widely used open source library and set of programs for manipulating images in many ways. It's very useful and I'm grateful it exists and has become so powerful. However, many longtime ImageMagick users like me can attest that it has had a fair number of bugs, and upgrades sometimes don't go very smoothly as APIs change, or new bugs creep in.
Recently my co-worker, Jeff Boes, had the misfortune, or opportunity, of encountering just such a scenario. Our friends at CityPass have several site features that use ImageMagick for resizing, rotating, and otherwise manipulating or gathering data about images.
The environment specifics (skip if you're not troubleshooting an ImageMagick problem of your own!): RHEL 5 with its standard RPM of ImageMagick-6.2.8.0-4.el5_1.1.x86_64. The application server is Interchange, running on our local-perl-5.10.0 nonthreaded Perl build, using the local-ImageMagick-perl-6.2.8.0-4.1 library. Those custom builds are available in the End Point Yum repository at packages.endpoint.com.
CityPass reported that ImageMagick was failing to process some EPS (Encapsulated PostScript) images correctly. In fact, the bug prevented any subsequent image processing jobs from completing in the same OS process. Upgrading ImageMagick would fix the bug, but we can't currently do that on the production server due to other compatibility problems.
After some trial and error, Jeff determined that the ImageMagick bug only kicks in when the first image processed is an EPS file. If it's any other image type, it works fine. This explained why code that had been unchanged in a year or so suddenly stopped working: Before now, no EPS file had happened to come first.
At first Jeff hacked the system to process the non-EPS files first, then sorted the results as originally desired. Then we realized there may be some rare scenarios where no non-EPS files at all were in the batch, which would trigger the bug. Jeff then had ImageMagick always first process a trivial small JPEG file which was known to work.
That worked, but Jeff then came across the idea of processing an empty image file so we didn't have a dependency on an image that might later be deleted. He tinkered a bit and came up with something surprising but even better. This is his Perl code:
my $first_im = Image::Magick->new;
$first_im->read('');
# (then process all images in any order as originally intended)
I wouldn't have expected an initial read of an empty string filename to solve the problem, but it did. Accompanied by a suitable comment noting the history of the kludge for future software archaeologists, that closed the case.
Software's funny, but it's nice when there's a simple -- if counterintuitive -- solution to work around a bug. And I think Jeff has mostly recovered his sanity in the meantime!
Comments
There are a few common ways to start processes at boot time in Red Hat Enterprise Linux 5 (and thus also CentOS 5):
- Standard init scripts in /etc/init.d, which are used by all standard RPM-packaged software.
- Custom commands added to the /etc/rc.local script.
- @reboot cron jobs (for vixie-cron, see `man 5 crontab` -- it is not supported in some other cron implementations).
Custom standalone /etc/init.d init scripts become hard to differentiate from RPM-managed scripts (not having the separation of e.g. /usr/local vs. /usr), so in most of our hosting we've avoided those unless we're packaging software as RPMs.
rc.local and @reboot cron jobs seemed fairly equivalent, with crond starting at #90 in the boot order, and local at #99. Both of those come after other system services such as Postgres & MySQL have already started.
To start up processes as various users we've typically used su - $user -c "$command" in the desired order in /etc/rc.local. This was mostly for convenience in easily seeing in one place what all would be started at boot time. However, when running under SELinux this runs processes in the init_t context which usually prevents them from working properly.
The cron @reboot jobs don't have that SELinux context problem and work fine, just as if run from a login shell, so now we're using those. Of course they have the added advantage that regular users can edit the cron jobs without system administrator intervention.
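As a rough sketch of what that looks like in practice, the filter below adds an @reboot line to a user's crontab. The function name add_reboot_job and the script path in the usage comment are made up for illustration, and it assumes a cron that supports @reboot, such as vixie-cron.

```shell
# Hypothetical filter: read an existing crontab on stdin and write it back out
# with an "@reboot <command>" line appended, dropping any identical line
# first so that repeated runs do not create duplicates.
add_reboot_job() {
    entry="@reboot $1"
    grep -Fxv -- "$entry" || true
    printf '%s\n' "$entry"
}

# Intended use, run as the user who owns the job (the script path is made up):
#   crontab -l 2>/dev/null | add_reboot_job "$HOME/bin/start-app" | crontab -
```

Because the job lives in the user's own crontab, it runs as that user with no need for su, and the user can change or remove it later with crontab -e.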
Comments
RPM spec files offer a way to define and test build variables with a directive like this:
%define <variable> <value>
Sometimes it's useful to override such variables temporarily for a single build, without modifying the spec file, which would make the changed variable appear in the output source RPM. For some reason, how to do this has been hard for me to find in the docs and hard for me to remember, despite its simplicity.
Here's how. For example, to override the standard _prefix variable with value /usr/local:
rpmbuild -ba SPECS/$package.spec --define '_prefix /usr/local'
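If you override several variables often, a small wrapper can save some quoting. This is just a sketch: the function name rpmbuild_with, the NAME=VALUE argument convention, and the RPMBUILD override variable are my own inventions, not part of RPM. It relies only on the fact that rpmbuild accepts --define more than once.

```shell
# Hypothetical wrapper: rpmbuild_with SPEC NAME=VALUE...
# Turns each NAME=VALUE pair into a separate --define 'NAME VALUE' option.
# RPMBUILD can be overridden (e.g. RPMBUILD=echo for a dry run).
rpmbuild_with() {
    spec=$1
    shift
    for pair in "$@"; do
        # Rotate each pair out of "$@" and its --define form back in.
        shift
        set -- "$@" --define "${pair%%=*} ${pair#*=}"
    done
    ${RPMBUILD:-rpmbuild} -ba "$spec" "$@"
}
```

With that, rpmbuild_with SPECS/$package.spec _prefix=/usr/local runs the same command as above. To check how an override affects dependent macros before building, something like rpm --define '_prefix /usr/local' --eval '%{_bindir}' should print the expanded value; note that --define has to come before --eval on the command line.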
Comments
It's unfortunate that past versions of Ruby have gained a reputation of performing poorly, consuming too much memory, or otherwise being "unfit for the enterprise." According to the fine folks at Phusion, this is partly due to the way Ruby does memory management. And they've created an alternative branch of Ruby 1.8 called "Ruby Enterprise Edition." This code base includes many significant patches to the stock Ruby code which dramatically improve performance.
Phusion advertises an average memory savings of 33% when combined with Passenger, their Apache module for serving Rails apps. We did some testing of our own, using virtualized Xen servers from our Spreecamps.com offering. These servers use the DevCamps system to run several separate instances of httpd for each developer, so reducing the memory usage of Passenger was crucial to fitting into less than a gigabyte of memory. Our findings were dramatic: one instance dropped from 100MB down to 40MB. (The status tools included with Passenger were very helpful in confirming this.)
There has been some discussion on the Phusion Passenger and other mailing lists about packaging Ruby Enterprise Edition for Red Hat Enterprise Linux and its derivatives (CentOS and Fedora). Packages are available from Phusion for Ubuntu Linux, but many of our clients prefer RHEL's reputation as a stable platform for e-commerce hosting. So we've packaged ruby-enterprise into RPMs and made them available to give back to the Rails community.
We want our SpreeCamps systems to be easy to maintain, following the "Principle of Least Astonishment." By default, Phusion's script installs ruby-enterprise into /opt, so invocation must include the full path to the executable. This would be unsettling to a developer who mistakenly installed gems to Red Hat's rubygems path while intending to install gems usable by REE and Passenger. It is important to install the ruby and gem executables into all users' $PATH.
We took a cue from our customized local-perl packages. These packages install themselves into /usr/local. This means that all executables reside in /usr/local/bin; no $PATH modifications are necessary to utilize them via the command-line. Our ruby-enterprise packages are configured the same way. (If another /usr/local/bin/ruby exists, package installation will fail before clobbering another ruby installation.) Applications which specify #!/usr/bin/ruby will continue to use Red Hat's packaged ruby.
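This arrangement depends on /usr/local/bin appearing before /usr/bin in $PATH, which is worth checking on a given system. The helper below is a sketch for doing so; the name list_in_path is hypothetical.

```shell
# Hypothetical helper: print every executable named $1 found on $PATH,
# in lookup order; the first line is the one the shell would run.
list_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH component means the current directory.
        candidate="${dir:-.}/$name"
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            echo "$candidate"
        fi
    done
    IFS=$old_ifs
}
```

Running list_in_path ruby and list_in_path gem after installation shows at a glance whether the /usr/local/bin copies take precedence for interactive use. Scripts with a #!/usr/bin/ruby line are unaffected either way.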
Similar to a source-based installation, once these packages are installed you may do gem install passenger and any other gems your application needs. Phusion's REE installer also installs several "useful gems". However, we elected not to include these in the main ruby-enterprise RPM package. Several smaller packages, each limited to a particular module or piece of software, are better than one or two big fat RPMs with a bunch of stuff you may or may not need. We will likely package individual gems in the near future.
These packages are publicly available from our repository. We've just begun using these but are finding them reliable and very helpful so far. Any of you who would like to are welcome to try them out via direct download, or much easier, adding our Yum repository to your system as described here:
https://packages.endpoint.com/
Once you've done that, a simple command should get you most of the way there:
yum install ruby-enterprise ruby-enterprise-rubygems
If you prefer to download them directly, the .rpm packages are available on that site as well; just browse through the repo.
The .spec file is available for review and forking on GitHub:
http://gist.github.com/108940
Many thanks to list member Tim Charper for providing an example .spec, and my colleagues at End Point for reviewing this work.
We appreciate any comments or questions you may have. This package repo is for us and our clients primarily, but if there's a package you need that isn't in there, let us know and maybe we'll add it.
Comments