4

What I know about sockets is that a socket is an endpoint of a connection for a process, so one socket on a host binds to an IP address and a unique port number for each connection.

But a web server listens on a single port (80 by default) for connections coming in from multiple clients.

My question is: Does that mean a single socket on the server is listening to multiple clients simultaneously? This would conflict with my understanding of sockets.

Could someone please shed some light on this topic?

td16
  • 177

2 Answers

4

Sockets are file descriptors with special abilities. While every TCP socket is associated with a port, a socket and a port are not the same thing.

A connected TCP socket is identified by a local address+port and a remote address+port. That means the same local port can be part of multiple sockets as long as the remote address+port is different.

A TCP server (such as a web server process) listens on a local port. Here, the local address only controls who can connect to this port: everyone, or only connections from localhost. The remote address of a listening socket is a wildcard (shown as 0.0.0.0:* below), which means there is no connection yet. Here I've started a python3 -m http.server on localhost port 7001:

tcp  127.0.0.1:7001   0.0.0.0:*       LISTEN        32143/python3

When I connect to that web server via my web browser, we see two additional sockets:

tcp  127.0.0.1:7001   0.0.0.0:*        LISTEN       32143/python3
tcp  127.0.0.1:50204  127.0.0.1:7001   ESTABLISHED  1658/firefox
tcp  127.0.0.1:7001   127.0.0.1:50204  ESTABLISHED  32143/python3

(data obtained via netstat, and edited for clarity)

The Firefox browser created a socket to connect() to the server. Firefox uses port 50204 in this case, so its socket is identified as local 127.0.0.1:50204 remote 127.0.0.1:7001. When the server accept()ed the connection, this connection got its own socket, which is basically the reverse of the client socket: local 127.0.0.1:7001 remote 127.0.0.1:50204. The local port is the same port the server is listening on.

In principle, the client socket and the server's connection socket mirror each other. In practice, the server often sees a different client IP+port than the client itself uses, because of network address translation (NAT) along the way.
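The netstat picture above can be reproduced inside a single Python process. This is only a sketch under a few assumptions of mine: the function name demo_connection_sockets is made up, and port 0 is used so the OS picks a free port instead of 7001. It opens one listening socket, connects two clients, and collects the local/remote address pair of every socket involved.

```python
import socket


def demo_connection_sockets():
    # Listening socket: bound to a local address+port, with no remote peer.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 = let the OS choose (demo assumption)
    listener.listen()
    server_addr = listener.getsockname()

    # Two clients connect to the same server address+port.
    client_a = socket.create_connection(server_addr)
    client_b = socket.create_connection(server_addr)

    # Each accept() hands back a new socket for exactly one connection.
    conn_1, _ = listener.accept()
    conn_2, _ = listener.accept()

    info = {
        "listening": server_addr,
        # (local, remote) pairs as seen by the server's connection sockets
        "server_side": [
            (conn_1.getsockname(), conn_1.getpeername()),
            (conn_2.getsockname(), conn_2.getpeername()),
        ],
        # (local, remote) pairs as seen by the client sockets
        "client_side": [
            (client_a.getsockname(), client_a.getpeername()),
            (client_b.getsockname(), client_b.getpeername()),
        ],
        # The listener and both connections are separate file descriptors.
        "distinct_fds": len(
            {listener.fileno(), conn_1.fileno(), conn_2.fileno()}
        ) == 3,
    }
    for s in (client_a, client_b, conn_1, conn_2, listener):
        s.close()
    return info
```

Printing the result of demo_connection_sockets() should show both server-side sockets sharing the listener's local port while differing in their remote port. Since everything runs over loopback, there is no NAT in between and the two sides mirror each other exactly.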

Why can the server use the same port for all connections? Well, every TCP/IP packet contains the IP+port of the sender and receiver. When the server operating system gets a connection request from a client for some port, the connection will usually be refused unless a server process is listening on that port. If a process is listening, it may accept the connection, and we get a socket representing that connection.

For all subsequently received TCP packets, the OS will look at the addresses and see whether they match an established socket connection. If so, the packet content is stored in a buffer that can be read from the socket file descriptor by the server process. When the server writes to the socket connection file descriptor, the OS knows the local and remote address, and can therefore create a TCP packet with the appropriate metadata.

So the sockets are entries in a lookup table used by the OS to translate between file descriptors and network addresses/ports.
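To make the lookup-table idea concrete, here is a toy model of it. It is not how any real kernel is implemented: the class ToySocketTable, its method names, and the second client port 50205 used when trying it out are all hypothetical, and the handshake and accept() are folded into a single step for brevity.

```python
from collections import defaultdict


class ToySocketTable:
    """Toy model of the OS lookup described above (illustration only)."""

    def __init__(self):
        self.listening = set()                 # local (ip, port) pairs with a listener
        self.established = {}                  # (local, remote) -> file descriptor
        self.buffers = defaultdict(bytearray)  # file descriptor -> received bytes
        self._next_fd = 3                      # 0-2 are conventionally stdin/stdout/stderr

    def listen(self, local):
        self.listening.add(local)

    def deliver(self, local, remote, payload=b"", syn=False):
        """Route one incoming packet; return the fd it went to, or None if refused."""
        key = (local, remote)
        if key in self.established:
            # Known connection: append the payload to that socket's buffer.
            fd = self.established[key]
            self.buffers[fd] += payload
            return fd
        if syn and local in self.listening:
            # Connection request for a listening port: create a new table entry.
            fd = self._next_fd
            self._next_fd += 1
            self.established[key] = fd
            return fd
        # No listener and no matching connection: refuse.
        return None
```

With a listener registered on ("127.0.0.1", 7001), connection requests from ("127.0.0.1", 50204) and ("127.0.0.1", 50205) get two different descriptors, and later data from either client lands only in its own buffer, because the full local+remote pair is the key.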

amon
  • 135,795
2

In a nutshell, when a POSIX-compliant networking stack is in play, each connection to the socket on port 80 will result in its own socket being created when accept() is called.

So under the hood multiple sockets are created. It doesn't conflict with your understanding of a socket.

See here for more info: http://www.gnu.org/software/libc/manual/html_node/Accepting-Connections.html#Accepting-Connections
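As a rough illustration of the accept() behaviour described above, here is a minimal sketch using Python's socket module, which wraps the same BSD-style calls. The names handle and start_server, the upper-casing reply, and the thread-per-connection design are my own assumptions for the demo, not something a real web server has to do.

```python
import socket
import threading


def handle(conn):
    # Each connection has its own socket, so handlers never share a stream.
    with conn:
        while data := conn.recv(1024):
            conn.sendall(data.upper())


def start_server(host="127.0.0.1", port=0):
    # One listening socket; port 0 lets the OS pick a free port (demo assumption).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))
    listener.listen()

    def accept_loop():
        while True:
            try:
                # accept() returns a NEW socket for every incoming connection.
                conn, _peer = listener.accept()
            except OSError:
                break  # raised if the listener has been closed
            threading.Thread(target=handle, args=(conn,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

Calling start_server() and then opening two client connections to listener.getsockname() shows both clients being served at the same time: the listening socket only hands out connections, while the actual data flows over the per-connection sockets.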

RibaldEddie
  • 3,303