What is the Apache web server and how it works
A web server is software that accepts HTTP requests from clients (browsers) and returns HTTP responses to them, most often in the form of web pages containing text, images, files for download and more.
Apache HTTP Server
The Apache web server is an open source HTTP server available for a large number of modern operating systems.
The goal of the Apache developers is to provide a modern, secure and multi-functional web server that delivers HTTP services in compliance with the HTTP standards. Apache has long been one of the most widely used web servers on the web.
The process of loading a web page into a browser goes through the following steps:
- Resolving the domain name in the URL to an IP address and finding the route between the client and server IP addresses
- Building a TCP/IP connection
- Sending HTTP/HTTPS request to the server
- Resolving the path to the requested resource
- Splitting the resource into packets that contain both a segment of data to be transferred and the address where the data is to be sent (done by TCP).
- Receiving the packets on the client and reassembling them in the correct order
- Loading the web page or downloading the file.
The tasks performed by the web server are as follows:
- Accepting a TCP connection and receiving the request
- Resolving the file path to the requested resource in the URL
- Finding the virtual host in the server configuration from which to execute the request
- Finding the resource file
- Returning the static resource to the client
- Forwarding the dynamic code to an interpreter (PHP, for example)
- Returning the dynamic output to the client
- Closing the connection
If the server does not find the file specified in the request, or if the information needed to generate dynamic content is missing from the database, the server returns an HTTP status code that describes the problem instead of the requested content.
Some of the most common HTTP status codes returned in these cases are:
- If the file does not exist, it returns 404 Not Found
- If the user has no right to view the file, it returns 403 Forbidden
- If the file is moved, it returns 301 Moved Permanently
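As an illustration, the three status codes above can each be triggered from the Apache configuration. This is a minimal sketch, assuming Apache 2.4 with mod_alias available; every path and URL in it is a hypothetical placeholder.

```apache
# 404 Not Found: serve a custom page when the requested file does not exist
# (the /errors/not-found.html path is a placeholder)
ErrorDocument 404 /errors/not-found.html

# 403 Forbidden: deny all access to a directory (Apache 2.4 syntax)
<Directory "/var/www/html/private">
    Require all denied
</Directory>

# 301 Moved Permanently: point clients to the new location of a resource
Redirect permanent "/old-page.html" "https://www.example.com/new-page.html"
```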
Modern web servers perform many more tasks than accepting requests and returning files. Serving a request can be a fairly complicated process that includes a subset of the following steps, depending on the specific request:
- Retrieving the name of the requested page
- Checking the identity of the client
- Checking client's permissions
- Checking page permissions
- Checking local cache
- Retrieving the requested static web page, or constructing it if it contains dynamic content (CGI, PHP, JSP, ASP, etc.)
- Specifying the MIME type to include in the response
- Performing various tasks such as creating a user profile or collecting certain statistics
- Returning the response to the client's request
- Writing the corresponding records to the log files
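Several of these steps map directly to Apache directives. The fragment below is a sketch only, assuming the standard authentication, MIME and logging modules are loaded; the directory, password file and log file names are hypothetical.

```apache
# Identity and permission checks: require a valid login for one directory
<Directory "/var/www/html/members">
    AuthType Basic
    AuthName "Members Area"
    AuthUserFile "/etc/apache2/.htpasswd"
    Require valid-user
</Directory>

# MIME type: map a file extension to the Content-Type sent in the response
AddType image/svg+xml .svg

# Logging: define a log format and write one record per request
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog "logs/access_log" common
ErrorLog "logs/error_log"
```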
Domain Name vs. Uniform Resource Locator (URL)
A domain name is just a name like example.com, while the Uniform Resource Locator (URL) is the web address of a specific web resource, which can be a static file or dynamically generated content:
- Static URL:
- Dynamic URL:
Static and Dynamic Content
Web site content can take many different forms, but may be broadly divided into static and dynamic content.
Static content consists of HTML files, image files, CSS files, and other files that reside in the filesystem. The DocumentRoot directive specifies where in your filesystem you should place these files.
Typically, a document called index.html will be served when a directory is requested without a file name being specified. For example, if DocumentRoot is set to /var/www/html and a request is made for http://www.example.com/work/, the file /var/www/html/work/index.html will be served to the client.
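The mapping described above could be configured roughly as follows. This is a minimal sketch, assuming Apache 2.4 access-control syntax; only the /var/www/html path and the index.html name come from the example in the text.

```apache
# Files below this directory are served as static content
DocumentRoot "/var/www/html"

# Allow clients to read files under the document root
<Directory "/var/www/html">
    Require all granted
</Directory>

# File served when a request names a directory, e.g. /work/
DirectoryIndex index.html
```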
Dynamic content is anything that is generated at request time, and may change from one request to another. There are numerous ways that dynamic content may be generated.
For example, the same WordPress template loads different articles from different categories in response to different requests, because the content is generated dynamically by PHP functions from information stored in a MySQL database.
The IP address behind a domain name is looked up by a DNS resolver, which either returns the address from its cache or obtains it by querying the authoritative DNS name servers for the domain.
Establishing TCP/IP Connection
In order for a computer to use the Internet, it needs software that implements the rules for transmitting and controlling information - the so-called TCP/IP protocols.
Transmission Control Protocol (TCP) establishes the connection between the sender and the recipient. It breaks the data into small packets, sends them over the network, and on the other side of the connection reassembles the received packets into the original document. It also takes care of resending lost packets and reordering those that arrive out of sequence.
Internet Protocol (IP) handles addressing and routing. It provides the route between the client and the server using IP datagrams (the data format that IP recognizes).
The division of work in a TCP/IP connection can be summarized as follows: IP defines the endpoints and the communication routes, while TCP ensures reliable transport of the data packets.
HTTP & HTTPS
HTTP (Hypertext Transfer Protocol) functions as a request–response protocol in the client–server computing model. The client submits an HTTP request message to the server.
The server, which provides resources such as HTML files and other content, or performs other functions on behalf of the client, returns a response message to the client.
HTTPS (also called HTTP over Transport Layer Security (TLS), HTTP over SSL and HTTP Secure) is a communications protocol for secure communication over a computer network which is widely used on the Internet.
HTTPS URLs begin with "https://" and use port 443 by default, whereas HTTP URLs begin with "http://" and use port 80 by default.
HTTP operates at the highest layer of the TCP/IP model, the Application layer; as does the TLS security protocol (operating as a lower sublayer of the same layer), which encrypts an HTTP message prior to transmission and decrypts a message upon arrival. Strictly speaking, HTTPS is not a separate protocol, but refers to use of ordinary HTTP over an encrypted SSL/TLS connection.
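In Apache, serving a site over HTTPS usually means enabling TLS on a virtual host that listens on port 443. The following is a minimal sketch, assuming mod_ssl is loaded; the host name and the certificate and key paths are placeholders.

```apache
# Accept connections on the default HTTPS port
Listen 443

<VirtualHost *:443>
    ServerName www.example.com
    DocumentRoot "/var/www/html"

    # Turn on TLS for this virtual host (requires mod_ssl)
    SSLEngine on
    SSLCertificateFile "/path/to/www.example.com.cert"
    SSLCertificateKeyFile "/path/to/www.example.com.key"
</VirtualHost>
```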
Web Server Listener
The Web listener is used to indicate the IP address and port to which a client makes a connection.
When the Apache process httpd starts, it binds to a port and address on the local machine and waits for incoming requests. By default, it listens on all addresses on the machine.
However, it may need to be told to listen on specific ports, or only on selected addresses, or a combination of both. This is often combined with the Virtual Host feature, which determines how httpd responds to different IP addresses, hostnames and ports.
For example, to make the server accept connections on both port 80 and port 8000, on all interfaces, use:
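A minimal sketch of that configuration uses one Listen directive per port:

```apache
# Accept connections on ports 80 and 8000 on all interfaces
Listen 80
Listen 8000
```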
To make the server accept connections on port 80 for one interface, and port 8000 on another, use:
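A sketch of this setup follows. The two addresses are taken from the range reserved for documentation and stand in for the real interface addresses of the machine.

```apache
# Port 80 only on the first interface, port 8000 only on the second
Listen 192.0.2.1:80
Listen 192.0.2.5:8000
```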
IPv6 addresses must be enclosed in square brackets, as in the following example:
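A sketch of such a directive, using an address from the IPv6 documentation range (2001:db8::/32) as a stand-in for a real one:

```apache
# The brackets separate the IPv6 address from the port number
Listen [2001:db8::a00:20ff:fea7:ccea]:80
```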
You only need to set the protocol if you are running on non-standard ports. For example, running an https site on port 8443:
Listen 188.8.131.52:8443 https
The term virtual host refers to the practice of running more than one web site on a single machine. Virtual hosts can be IP-based, meaning you have a different IP address for each website, or name-based, which means you have multiple names running on one IP address (also known as a shared IP address).
IP-based virtual hosting enables any hosting user to purchase a dedicated IP address instead of using the shared IP address of the server, and thus be independent of the shared IP. This is especially important in these two cases:
- mail services - if the shared IP address gets blacklisted by one of the many email reputation services, mail sent from your dedicated IP address will continue to be delivered correctly;
- SSL certificates - without Server Name Indication (SNI) support on both the server and the client, an SSL certificate signed by a certificate authority (CA) cannot be installed for a domain name that uses the shared IP address;
Name-based virtual hosting is usually simpler, because all you have to do is configure your DNS server to resolve each host name to the correct IP address and then configure Apache HTTP Server to recognize the different host names and deliver the requested URL resource from the correct virtual host.
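As an illustration, two name-based virtual hosts sharing one IP address might be configured as shown below. This is a sketch only; the host names and directories are hypothetical, and Apache picks the matching block by comparing the Host header of the request with ServerName and ServerAlias.

```apache
# First site: also the default for requests that match no other host name
<VirtualHost *:80>
    ServerName www.example.com
    ServerAlias example.com
    DocumentRoot "/var/www/example.com"
</VirtualHost>

# Second site on the same IP address and port
<VirtualHost *:80>
    ServerName www.example.org
    DocumentRoot "/var/www/example.org"
</VirtualHost>
```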
The Apache web server is one of the most popular and powerful web servers in the world, thanks to its ease of administration and the flexibility that results from its modular design.
Modules are extensions that enhance the basic functionality of the Web server.
The modules reflect the growth of the Web and the inclusion of dynamic content into the web pages.
Easy module management allows administrators to modify Apache according to their needs - to add only needed modules and to remove ones that are not.
There are two types of modules:
Built-in modules are compiled into Apache and will load with the server any time it is started. Their functionality cannot be removed without recompiling the package. These modules are also known as static.
Loadable modules can be loaded and unloaded as required. These are the shared modules.
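Shared modules are typically enabled with the LoadModule directive. The lines below are a sketch; the modules chosen and the modules/ path are assumptions that depend on how Apache was installed.

```apache
# Load mod_rewrite as a shared module when the server starts
LoadModule rewrite_module modules/mod_rewrite.so

# Commenting the line out disables the module after a restart
# LoadModule status_module modules/mod_status.so
```

Running httpd -l lists the modules compiled into the server, and httpd -M lists all loaded modules, both static and shared.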
Apache HTTP Server configuration is done by directives in plain text configuration files. The main configuration file is usually called httpd.conf.
Hosting users do not have access to this file or to the other server configuration files, but the Apache server provides a way to make configuration changes on a per-directory basis using a .htaccess file.
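A .htaccess file uses the same directive syntax as the main configuration. The example below is a minimal sketch, assuming the server administrator has allowed these directives through the AllowOverride setting; the file names and paths are hypothetical.

```apache
# .htaccess placed in a directory of the site, e.g. /var/www/html/work/

# Do not show a file listing when no index file exists
Options -Indexes

# Files to try, in order, when the directory itself is requested
DirectoryIndex index.html index.php

# Custom page for missing files
ErrorDocument 404 /errors/not-found.html
```

The settings apply to the directory that contains the file and to all of its subdirectories.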