All opinions expressed are those of the authors and not necessarily those of OSNews.com, our sponsors, or our affiliates.

published by noreply@blogger.com (Brian Buchalter) on 2012-07-17 16:11:00 in the "hosting" category

It may frighten you to know that there are applications which take longer than Passenger's default timeout of 10 minutes. Well, it's true. And yes, those application owners know they have bigger fish to fry. But when a customer needs that report run *today*, being able to lengthen a timeout is a welcome stopgap.

Tracing the timeout

There are many different layers at which a timeout can occur, although these may not be immediately obvious to your users. Typically they receive a 504 and an ugly "Gateway Time-out" message from Nginx. Review the Nginx error logs on both the reverse proxy and the application server; you might see a message like this:

upstream timed out (110: Connection timed out) while reading response header from upstream

If you're seeing this message on the reverse proxy, the solution is fairly straightforward: update the proxy_read_timeout setting in your nginx.conf and restart. However, it's more likely you've already tried that and found it ineffective. If you read further along the Nginx error line, you might notice another clue.

upstream timed out (110: Connection timed out) while reading response header from upstream, 
upstream: "passenger://unix:/tmp/passenger.3940/master/helper_server.sock:"

This is the kind of error message you'd see on the Nginx application server when a Passenger process takes longer than the default timeout of 10 minutes. If you're seeing this message, it'd be wise to review the Rails logs to get a sense for how long this process actually takes to complete so you can make a sane adjustment to the timeout. Additionally, it's good to see what task is actually taking so long so you can offload the job into the background eventually.
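As a rough sketch of that log review, the following pulls request durations out of a Rails log and suggests a timeout with some headroom. It assumes Rails 3 style "Completed 200 OK in 1234ms" lines; the function names and the 1.5x headroom factor are illustrative choices, not anything Passenger or Rails prescribes.

```python
import math
import re

# Assumed log shape (Rails 3 style): "Completed 200 OK in 612345ms (Views: ...)".
COMPLETED_RE = re.compile(r"Completed\b.*?\bin (\d+(?:\.\d+)?)ms")


def request_durations_ms(lines):
    """Yield the total duration in milliseconds of each completed request."""
    for line in lines:
        match = COMPLETED_RE.search(line)
        if match:
            yield float(match.group(1))


def suggested_timeout_ms(lines, headroom=1.5):
    """Slowest observed request times a safety factor, rounded up to a whole second."""
    slowest = max(request_durations_ms(lines), default=0.0)
    return math.ceil(slowest * headroom / 1000) * 1000
```

Feeding it an open log file, for example `suggested_timeout_ms(open("log/production.log"))`, gives a starting value in milliseconds; the log path here is only an example.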

Changing nginx-passenger module's timeout

If you're unable to address the slow Rails process and must extend the timeout, you'll need to modify the Passenger gem's Nginx configuration. Start by locating the Passenger gem's Nginx config with locate nginx/Configuration.c and edit the following lines:

ngx_conf_merge_msec_value(conf->upstream.read_timeout,
                          prev->upstream.read_timeout, 60000);

Replace the 60000 value with your desired timeout in milliseconds. Then run sudo passenger-install-nginx-module to recompile nginx and restart.
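If you'd rather script the edit than make it by hand, here is a minimal sketch. The helper name set_read_timeout is hypothetical, and the pattern assumes the file contains exactly one call shaped like the two lines quoted above, which may not hold for every Passenger version. The recompile step is still required afterwards.

```python
import re

# Matches the read_timeout merge call quoted above and isolates its default (ms).
READ_TIMEOUT_RE = re.compile(
    r"(ngx_conf_merge_msec_value\(\s*conf->upstream\.read_timeout,"
    r"\s*prev->upstream\.read_timeout,\s*)\d+(\s*\);)"
)


def set_read_timeout(source, minutes):
    """Return the C source text with the read_timeout default set to `minutes`."""
    milliseconds = int(minutes * 60 * 1000)
    patched, count = READ_TIMEOUT_RE.subn(
        lambda m: f"{m.group(1)}{milliseconds}{m.group(2)}", source
    )
    # Refuse to guess if the file does not look the way this sketch assumes.
    if count != 1:
        raise ValueError(f"expected exactly one read_timeout default, found {count}")
    return patched
```

Raising an error when the pattern is missing or repeated is deliberate: a silent no-op would leave you recompiling Nginx with the old value.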

Improving Error Pages

Another lesson worth addressing here is that Nginx error pages are ugly and unhelpful. Even if you have a Rails plugin like exception_notification installed, these kinds of Nginx errors will be missed unless you use the error_page directive. In other applications I've set up explicit routes to test that exception_notification properly sends an email, by creating a controller action that simply raises an error. Using Nginx's error_page directive, you can call an exception controller action and pass useful information along to yourself as well as present the user with a consistent error experience.



published by noreply@blogger.com (Jon Jensen) on 2011-07-01 16:06:00 in the "hosting" category

In our work the occasional mysterious problem surfaces which makes me appreciate how tractable and sane the majority of the challenges are. Here I'll tell the story of one of the mysterious problems.

In Internet routing of IPv4 addresses, there's nothing inherently special about an IP address that ends in .0, .255, or anything else. It all depends on the subnet. In the days before CIDR (Classless Inter-Domain Routing) brought us arbitrary subnet masks, there were classes of routing, most commonly A, B, and C. And the .0 and .255 addresses were special.
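To make "it all depends on the subnet" concrete, here is a small sketch using Python's standard ipaddress module. The function name is_usable_host and the documentation-range addresses used to exercise it are illustrative assumptions.

```python
import ipaddress


def is_usable_host(address, prefix_length):
    """True if `address` is an ordinary host address within its subnet."""
    interface = ipaddress.ip_interface(f"{address}/{prefix_length}")
    network = interface.network
    # Only the subnet's own network and broadcast addresses are reserved.
    return interface.ip not in (network.network_address, network.broadcast_address)
```

With a /24 mask, 198.51.101.0 is the network address and the function returns False. With a /23 mask, the same address sits in the middle of 198.51.100.0/23 and is a perfectly ordinary host.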

That was a long time ago, but it can still cause occasional trouble today. One of our hosting providers assigned us an IP address ending in .0, which we used for hosting a website. It worked fine, and was in service for many months before we heard any reports of trouble.

Then we heard a report from one of our clients that they could not access that website from their home, but they could from their office. We couldn't ever figure out why.

Next one of our own employees found that he could not access the website from his home, but he could from other locations.

Finally we had enough evidence when a friend from the open source community also could not access that website from his home.

The commonality was in the router they were using:

  • Belkin G Wireless Router Model F5D7234-4 v4
  • Belkin F5D9231-4 v1
  • a third router that its owner believed was a Belkin, though they could not provide the exact model

We moved the website to a different IP address on the same server, and they had no problem accessing it.

The routers are obviously broken, but there's little sense arguing about that. For now we avoid using any .0 IP address because there will always be a few people who can't reach it.
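That policy is easy to apply mechanically when choosing among assigned addresses. A minimal sketch, in which the function name and the avoid_last_octets parameter are hypothetical:

```python
import ipaddress


def pick_service_addresses(candidates, avoid_last_octets=(0,)):
    """Keep IPv4 addresses whose final octet is not in `avoid_last_octets`."""
    kept = []
    for candidate in candidates:
        address = ipaddress.IPv4Address(candidate)
        # packed is the 4-byte form of the address; the last byte is the final octet.
        if address.packed[-1] not in avoid_last_octets:
            kept.append(str(address))
    return kept
```

Passing avoid_last_octets=(0, 255) would also skip .255 addresses, if you want to be equally cautious about those.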



published by noreply@blogger.com (Jon Jensen) on 2011-02-08 16:21:00 in the "hosting" category

In recent years it's become increasingly common for hosting providers to advertise their compliance with the SAS 70 Type II audit. Interest in that audit often comes from hosting customers' need to meet Sarbanes-Oxley (aka Sarbox) or other legal requirements in their own businesses. But what is SAS 70?

It was not clear to me at first glance that SAS 70 is actually a financial accounting audit, not one that deals primarily with privacy, information technology security, or other areas.

SAS 70 was created by the American Institute of Certified Public Accountants (AICPA) and contains guidelines for assessing organizations' service delivery processes and controls. The audit is performed by an independent Certified Public Accountant.

Practically speaking, what does passing a SAS 70 audit tell us about an organization? Most importantly that it is financially reliable, and thus hopefully a safe partner for providing critical Internet hosting and data storage services.

On June 15, 2011, the SAS 70 audit will be effectively replaced by the new SSAE 16 attestation standard (Statement on Standards for Attestation Engagements no. 16, Reporting on Controls at a Service Organization). Thus the focus appears to shift from an external auditor investigating an organization, to the organization making claims about itself under the guidance of an auditor.

SSAE 16 was created by the AICPA to make the United States service organization reporting standard compatible with the new international service organization reporting standard, ISAE 3402, which is freely available in PDF format. The SSAE 16 document is available only for a fee.

The AICPA's FAQ on the SAS 70 to SSAE 16 transition makes an interesting point:

Q. Will entities now become "SSAE 16 certified"?

A. No! A popular misconception about SAS 70 is that a service organization becomes "certified" as SAS 70 compliant after undergoing a type 1 or type 2 service auditor's engagement. There is no such thing as being SAS 70 certified and there will be no such thing as being SSAE 16 certified. An SSAE 16 report (as with a SAS 70 report) is primarily an auditor to auditor communication, the purpose of which is to provide user auditors with information about controls at a service organization that are relevant to the user entities' financial statements.

This is interesting because many in the industry informally state that they are "SAS 70 Type II certified". But practically speaking for those of us involved in Internet hosting, is "certification" very different from "passing an audit"? It serves primarily as a requirement checklist item about hosting providers in either case.

Many major hosting providers have completed a SAS 70 Type II audit, including Rackspace (and Rackspace Cloud), Amazon Web Services, SoftLayer (and The Planet, which SoftLayer recently acquired), Verio, Terremark, and ServePath, to mention a few that we have worked with. Presumably these will make an SSAE 16 attestation later this year.

Note that many VPS and cloud hosting providers do not report having been SAS 70 audited. If this is a requirement for your hosting, it's important to look for it early before settling on a provider.

More details about the SAS 70 to SSAE 16 transition are available on the AICPA Service Organization Controls Reporting website.



published by noreply@blogger.com (Jon Jensen) on 2009-10-22 16:27:00 in the "hosting" category

I have a testing server that was running RHEL 5.2 (x86_64) but its RHN entitlement ran out and I wanted to upgrade it to CentOS 5.4. I found a few tips online about how to do that, but they were a little dated so here are updated instructions showing the steps I took:

# Clear cached yum metadata left over from the RHN repositories
yum clean all
mkdir ~/centos
cd ~/centos
# Fetch the CentOS signing key plus the release and yum packages
wget http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-5
wget http://mirror.centos.org/centos/5/os/x86_64/CentOS/centos-release-5-4.el5.centos.1.x86_64.rpm
wget http://mirror.centos.org/centos/5/os/x86_64/CentOS/centos-release-notes-5.4-4.x86_64.rpm
wget http://mirror.centos.org/centos/5/os/x86_64/CentOS/yum-3.2.22-20.el5.centos.noarch.rpm
wget http://mirror.centos.org/centos/5/os/x86_64/CentOS/yum-updatesd-0.9-2.el5.noarch.rpm
wget http://mirror.centos.org/centos/5/os/x86_64/CentOS/yum-fastestmirror-1.1.16-13.el5.centos.noarch.rpm
rpm --import RPM-GPG-KEY-CentOS-5
# Remove the Red Hat release package, the RHN plugin, and the old yum-updatesd
rpm -e --nodeps redhat-release
rpm -e yum-rhn-plugin yum-updatesd
# Install the CentOS replacements, then upgrade everything else
rpm -Uvh *.rpm
yum -y upgrade
# edit /etc/grub.conf to point to correct new kernel (with Xen, in my case)
shutdown -r now

It has worked well so far.



published by Adam S (firsttubedotcom) on 2007-07-11 16:47:40 in the "Web Hosting" category
I left for vacation on June 28, and before doing so, I took a quick glance over firsttube.com and jotted a quick blog post about it. firsttube.com was fully functional and officially dormant for 10 days as of June 28.

Imagine my surprise when on Monday, my wife said, "Hey, your site isn't working!" The index page worked, but none of the other pages.

In short, my webhost, Hostgator, decided to implement PHPsuexec. Here's the gist of this awesome program: typically, your web server runs as the "nobody" user on a server, but you log in as yourself; say your username is "jdough." You need to use certain tricks, like .htaccess files and chmodding, to get around certain limitations. PHPsuexec makes PHP run *as you,* removing the need for world-writable directories and creating a need for custom php.ini files to replace certain PHP directives in your .htaccess files.
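To find the .htaccess directives that would need to move into a php.ini, a scan along these lines can help. This is a sketch that assumes the directives in question are mod_php's php_value and php_flag; the function name is hypothetical.

```python
import os

# mod_php directives that have to move into php.ini once PHP no longer
# runs inside Apache (assumed to be the "certain php directives" above).
PHP_DIRECTIVES = ("php_value", "php_flag")


def find_php_directives(root):
    """Return (path, line_number, line) for each php_* directive in .htaccess files."""
    hits = []
    for directory, _subdirs, filenames in os.walk(root):
        if ".htaccess" not in filenames:
            continue
        path = os.path.join(directory, ".htaccess")
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                stripped = line.strip()
                if stripped.lower().startswith(PHP_DIRECTIVES):
                    hits.append((path, number, stripped))
    return hits
```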

Since my site doesn't use file extensions on most files, I used a directive called DefaultType to make everything PHP. This stopped functioning when Hostgator made the changes on Monday. Instead, every one of the pages that relied upon that value for parsing stopped working and started displaying HTTP error 500.

When I returned to town on Sunday, I opened a high-priority ticket with Hostgator. An hour later, I called the support line and was told an admin would reply presently. An hour later, I replied to my confirmation email to their email support line. Another hour later, I called again. After 35 minutes on the phone, they finally helped me get the pages running. But images across the site were broken. They were generating parsing errors! They were being interpreted by PHP. Yikes! Another 25 minutes on the phone today resulted in new .htaccess files everywhere. I should tell you that today's phone calls were with two "gators" who were both very friendly and helped me very enthusiastically.

Hostgator did not email me about these changes, even though they have my email address. They did not call me, even though they have my phone number. They did not post anything in my control panel, even though they can. Instead, they posted it in their own support forums and expected me to check it. A major change to the very core of the server behavior and they simply didn't tell me. And as a result, my sites were down for a week plus. So if you tried visiting firsttube.com in that time, I'm sorry for the interruption: the view page, the print page, the comments page, and nearly every other meaningful page failed to parse.

If I were a business and monetized my site in any way, I would immediately cancel. But to be fair, Hostgator has unparalleled uptime, unmatched availability, awesome tools (cpanel based), a competitive rate, and a friendly support staff. So I decided to give them one more chance. They have burned all the trust they gained with me, and I will not be recommending them to anyone right now, but I am not taking my business elsewhere just yet.

PHPsuexec is a great tool that provides a nice security boost, but do some serious testing before you implement it. It can dramatically alter the way your websites work.
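One simple check to include in that testing is listing the world-writable directories left over from the old setup, since PHPsuexec removes the need for them. A minimal sketch for a POSIX system, with a hypothetical function name:

```python
import os
import stat


def world_writable_dirs(root):
    """Return directories under `root` (inclusive) that any user may write to."""
    found = []
    for directory, _subdirs, _files in os.walk(root):
        mode = os.stat(directory).st_mode
        # S_IWOTH is the "other users may write" permission bit.
        if mode & stat.S_IWOTH:
            found.append(directory)
    return found
```

Anything it reports is a candidate for tightening (for example to 755) once PHP runs as your own user.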

Tags: Web Hosting, Hostgator, Rant, PHPsuexec, PHP, Nerd, Meta