Tips on building a highload webserver

Lulu's picture

Of course there are other ways and other software for building web servers, but the most popular in the free world is the xAMP platform (LAMP, sometimes WAMP, et cetera). The first letter stands for the operating system: 'L'inux (x means any, because thanks to its openness the suite also runs on FreeBSD, Solaris and even Windows). Then come 'A'pache (the de facto most popular web server in the world), 'M'ySQL (there are alternatives, notably PostgreSQL, but MySQL and equivalents such as MariaDB are still at the peak) and 'P'HP (there are alternatives too, but it is extremely popular). So Apache is the server, PHP is the scripting language, and MySQL or MariaDB is the database backend.

That's how the classic scheme looks:
Classic xAMP scheme: Apache with mod_php and SQL server

Apache is built with a PHP interpreter inside; it processes rules and config, serves static files, and executes scripts and CGI. This scheme is not very efficient: it spawns a process (or thread) bloated with PHP and its extensions for every connection, even just to serve a JPEG picture, so it eats a lot of RAM and keeps processes in memory while they are stalled by slow network I/O. Privileges for virtual hosts can be separated with suPHP or mpm_itk (experimental); otherwise the web server runs under one account and can cross-access the data of all virtual hosts, which may belong to different users.

There is a solution that has become popular recently: installing a reverse proxy. It is like a proxy on the user side, but instead of connecting one user to multiple servers it connects one server to multiple users, hence the name 'reverse proxy'. You can use Apache with mod_accel, or you can use Nginx (it's 'engine X', but it's more popular to call it n-JINX; don't be scared, no bad magic!).
Additionally, you can configure Nginx to serve a set of static files itself, so Apache will not spawn a memory-expensive process for a simple image request; those files will be served by Nginx directly.

Reverse proxying xAMP with NGINX; static content is served by NGINX in this example.
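
To make the split concrete, here is a minimal Python sketch of the routing decision the proxy makes for each request. It is only a model of the idea, not Nginx itself: the extension list and the handler names `nginx-static` and `apache-backend` are made up for illustration.

```python
import posixpath

# Hypothetical list of extensions the proxy serves straight from disk.
STATIC_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".css", ".js", ".ico"}


def choose_handler(request_path):
    """Decide who answers a request: the proxy itself or the Apache backend."""
    path = request_path.split("?", 1)[0]          # drop the query string
    extension = posixpath.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        return "nginx-static"                     # cheap: no PHP process involved
    return "apache-backend"                       # everything else is proxied


if __name__ == "__main__":
    for path in ("/img/logo.JPG", "/style.css?v=2", "/index.php?id=1"):
        print(path, "->", choose_handler(path))
```

In a real setup the same decision lives in the Nginx config as location rules rather than in application code.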

Nginx is an event-driven server with single-threaded workers: it runs one control process and one worker (or one per CPU core if you like). Each worker consumes about 4 MB of RAM and is capable of handling thousands of connections. It was basically written to solve the '10K problem' (more than 10 thousand simultaneous user connections, also known as C10K) and it performs this task very well. You can dramatically increase performance without increasing MaxClients for Apache. You also keep full compatibility with the non-proxied scheme, because the main server is still Apache; the only thing you will need is mod_rpaf, so that Apache is aware of the real client IP address and not the proxy's. If you use SSL, nginx can handle it!
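
For the curious, the event-driven idea can be sketched in a few lines of Python. This is not how Nginx is implemented, just a toy under stated assumptions: one thread watches many sockets with the standard `selectors` module and only touches the ones that have data ready. The names `serve_once` and `demo`, and the upper-casing "application", are hypothetical.

```python
import selectors
import socket


def serve_once(connections):
    """Answer one message on each connection from a single thread."""
    sel = selectors.DefaultSelector()
    for conn in connections:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)

    handled = 0
    while handled < len(connections):
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became readable in time; stop waiting
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(4096)
            conn.sendall(data.upper())  # toy "application": reply in upper case
            sel.unregister(conn)
            handled += 1
    sel.close()
    return handled


def demo(n_clients=100):
    """Simulate n_clients connections with local socket pairs."""
    pairs = [socket.socketpair() for _ in range(n_clients)]
    for i, (client, _server) in enumerate(pairs):
        client.sendall(f"hello {i}".encode())
    handled = serve_once([server for _client, server in pairs])
    replies = [client.recv(4096).decode() for client, _server in pairs]
    for client, server in pairs:
        client.close()
        server.close()
    return handled, replies


if __name__ == "__main__":
    handled, replies = demo()
    print(handled, "connections served by one thread; first reply:", replies[0])
```

The point of the design is that a slow client costs only an idle socket in the selector, not a whole process waiting in memory.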

==

Finally, over time nginx has developed extra features that allow it to replace Apache almost fully. They are not very widely used yet, but this is a very promising, scalable solution for building real high-load servers. One such feature is FastCGI.

Pure NGINX serving static files and executing scripts via FastCGI

FastCGI can be done in 2 ways:
1) The server spawns a CGI interpreter when needed and grabs its output to serve the user request (that's the traditional way of executing FastCGI PHP, Ruby, etc.)
Benefits: The interpreter is called only when needed.
Disadvantages: The interpreter needs to be launched, which is slower when you have a lot of scripts to execute. You will also need a controller to limit the maximum number of spawned interpreters, and a UID controller per virtual server if you care about privilege separation.

2) The server connects to a port (or even better, a socket), asks for the work to be done, then reads the output and serves it to the user.
Advantages:
* You already have an interpreter running (no launch delay)
* Interpreters can be run in pools under different user IDs and configurations, which makes it an excellent solution for virtual hosting
* Interpreters will not go over certain configured limits (a manager process takes care of it)
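
The second model can be sketched roughly like this in Python. Everything here is a stand-in chosen for illustration: threads play the role of long-running interpreter processes, a queue plays the role of the socket, and `user` is only a label (a real pool manager would actually switch the process UID). The names `WorkerPool`, `run_script` and `demo_pools` are hypothetical.

```python
import queue
import threading


class WorkerPool:
    """Pre-started workers for one virtual host, capped at max_children."""

    def __init__(self, user, max_children, handler):
        self.user = user
        self.handler = handler
        self.requests = queue.Queue()  # stands in for the listening socket
        self.workers = [
            threading.Thread(target=self._work, daemon=True)
            for _ in range(max_children)
        ]
        for worker in self.workers:
            worker.start()  # workers are running before any request arrives

    def _work(self):
        while True:
            script, reply = self.requests.get()
            reply.put(self.handler(self.user, script))

    def execute(self, script, timeout=5.0):
        """What the web server does: hand over the job and read the output."""
        reply = queue.Queue(maxsize=1)
        self.requests.put((script, reply))
        return reply.get(timeout=timeout)


def run_script(user, script):
    # Toy "interpreter": just report who ran what.
    return f"{user} ran {script}"


def demo_pools():
    # One pool per virtual host, each with its own user and its own limit.
    return {
        "site-a.example": WorkerPool("alice", max_children=2, handler=run_script),
        "site-b.example": WorkerPool("bob", max_children=1, handler=run_script),
    }


if __name__ == "__main__":
    pools = demo_pools()
    print(pools["site-a.example"].execute("index.php"))
```

Because the number of workers is fixed when the pool starts, extra requests simply wait in the queue instead of spawning more interpreters.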

Does the second way look sweet? Yes, you can do PHP that way; it is called PHP-FPM. Patches are available for the 5.2 tree and it has been ported to 5.3 (enabled at build time). Caveat: the 5.3 port is not the same as the 5.2 one. It uses a different config file format and not (yet?) all options are supported; for example, you cannot load a different set of extensions yet.

If you have sCGI to run (Perl, C), you can get a Perl script (FPM-like) or execute them on a backend server (such as lighttpd, mini_httpd, thttpd or even a naked Apache without modules); choose the server depending on your needs. Alternatively, you can run lighttpd without nginx: it is an event-driven, 10K-aware server too and can do FastCGI and cCGI, so you may prefer to stick with it. Still, nginx runs on 6.5% of all the world's servers (it is hard to distinguish between xAMP+nginx and pure nginx, so most of those servers are actually running Apache with nginx as a reverse proxy), while lighttpd runs on 0.7% of all servers.

Disadvantages of using pure nginx and pure lighttpd:
1) No .htaccess; you have to migrate the needed rules to nginx.conf or lighttpd.conf. Nginx may look better here because of Perl language support in its config.
2) Some applications choke when they don't detect Apache and may disable some features. E.g. you'll need the 'Nginx compatibility plugin' for WordPress to make it do nice permalinks and not push redirect codes to search engines.
3) It is a less tested environment: most apps are developed with Apache in mind and have manuals for Apache, so sometimes you are on your own with the issues you encounter.

A security caveat for the default nginx and PHP config: make sure you set cgi.fix_pathinfo to 0 in php.ini, and ensure that nginx checks that the script really exists before passing it to PHP. Otherwise, the combination of PHP's pathinfo rewriting and nginx passing any request ending in .php to fastcgi_pass makes it possible to execute code embedded in user-uploadable files, such as avatars. The issue is currently covered almost nowhere in the official documentation; you have been warned.
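
Here is a small Python model of why that combination is dangerous. It is a simplified sketch of the behaviour, not the real nginx or PHP code: the function names, the `existing_files` set and the file names are all hypothetical. The nginx side passes anything ending in `.php`; the PHP side, with pathinfo fixing on, walks back up the path until it finds a file that exists.

```python
def nginx_passes_to_php(uri, existing_files, check_exists):
    """Model of a location that hands every *.php request to fastcgi_pass."""
    if not uri.endswith(".php"):
        return False
    if check_exists and uri not in existing_files:
        return False  # the recommended guard: no such script, do not pass it on
    return True


def php_resolve_script(uri, existing_files, fix_pathinfo):
    """Model of how PHP picks the file to execute."""
    if uri in existing_files:
        return uri
    if not fix_pathinfo:
        return None  # cgi.fix_pathinfo=0: no guessing, nothing is executed
    parts = uri.split("/")
    while len(parts) > 1:
        parts.pop()  # treat the trailing part as PATH_INFO and try again
        candidate = "/".join(parts)
        if candidate in existing_files:
            return candidate
    return None


def executed_file(uri, existing_files, fix_pathinfo, check_exists):
    """Return the file that ends up being run as PHP, or None."""
    if not nginx_passes_to_php(uri, existing_files, check_exists):
        return None
    return php_resolve_script(uri, existing_files, fix_pathinfo)


if __name__ == "__main__":
    files = {"/index.php", "/uploads/avatar.jpg"}
    attack = "/uploads/avatar.jpg/x.php"
    print("default config runs:", executed_file(attack, files, True, False))
    print("fix_pathinfo=0 runs:", executed_file(attack, files, False, False))
    print("existence check runs:", executed_file(attack, files, True, True))
```

In this model either fix alone stops the uploaded avatar from being executed, and a normal request for `/index.php` still works with both fixes applied.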

==

And last (but not least): no matter what you use, you should use more caching wherever possible.

1) Application cache: most applications have their own caches. There is no need to do the same work twice, so once a result is computed it can be reused. Sometimes you have to use plugins, such as WP Super Cache for WordPress; sometimes the caching is built in, as in Drupal or Joomla. Just use it!
Disadvantage: Various issues depending on the application and module used.

2) PHP opcode cache: PHP is an interpreted language, so on each request a script has to be loaded from disk, lexically parsed and then compiled. With an opcode cache you remove those stages: the compiled code is stored in RAM (and sometimes on disk), and if the script has not changed it can be executed without parsing and compiling it again. That is roughly a 40% speed increase; don't leave it on the road. You can use APC, XCache or eAccelerator; in my experience eAccelerator performed best with PHP-FPM, just some advice ;) Memcached is a separate server (efficient when you have lots of RAM for it) that caches data rather than opcodes, so it complements an opcode cache instead of replacing one.
Disadvantage: Consumes RAM

3) Optimize your SQL server and use the query cache. If you do not need InnoDB's features (transactions, row-level locking), you can throw away InnoDB, stick with the MyISAM engine and add the saved RAM to the query cache pool, which can be more efficient.

4) Look into using Nginx caches, such as the FastCGI cache; it can be very fast if you pay attention to setting it up.
Disadvantage: You may need an application-side controller for cache invalidation, e.g. when a user authenticates.
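
The invalidation problem from point 4 can be sketched in Python like this. It is a toy model under assumptions, not the Nginx cache: the class `PageCache`, the cookie name `session` and the 60 second lifetime are all made up for illustration. Anonymous requests are cached by URL, logged-in users always bypass the cache, and the application can drop an entry when the content changes.

```python
import time


class PageCache:
    """Cache rendered pages by URL, but never for logged-in users."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # url -> (expires_at, body)

    def fetch(self, url, cookies, render):
        # "session" is a hypothetical cookie name marking an authenticated user.
        if "session" in cookies:
            return render(url), "BYPASS"
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1], "HIT"
        body = render(url)  # the expensive part: run the script
        self._store[url] = (now + self.ttl, body)
        return body, "MISS"

    def invalidate(self, url):
        """Called by the application when the page content changes."""
        self._store.pop(url, None)


if __name__ == "__main__":
    cache = PageCache()
    render = lambda url: "page " + url
    print(cache.fetch("/news", {}, render))
    print(cache.fetch("/news", {}, render))
    print(cache.fetch("/news", {"session": "abc"}, render))
```

Forgetting the bypass is the classic mistake here: one user's personalised page ends up cached and served to everybody else.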

You choose the set of optimizations and their aggressiveness; it is better to start with some reasonable values and tune them as needed.

~ GL and cheers.

PS: do not steal this article without permission ;)

Awesome :)

The Jinx is not a bad thing after all LOL

Lulu's picture

Yes, not bad. It was written with performance and security in mind, so 6.5% of the world's web servers running nginx is a well-expected result ) I wonder how many of that 6.5% are pure nginx, without Apache as a backend.

netcraft stats for december 2010
http://news.netcraft.com/archives/2010/12/01/december-2010-web-server-su...

most popular - Apache 59% (fallin')

MS IIS - 22.7% (fallin')
nginx - 6.04% ( + 0.6% mo)
GWS - 5.94% (fallin')
lighttpd - 0.83 (fallin')

There is also another tool for reverse proxying called Varnish, for those who don't like nginx's origin :) As far as I know, it is only for static content and proxying.

You're right.

Netster's picture

LOL

:p

Routers's picture

now i open my eyes bigger

nichie thanks

Netster's picture

You better open big big :D
