TCP and UDP Fundamentals
The Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are the two most widely used TCP/IP transport layer protocols. The TCP/IP protocol suite was developed in the 1970s by pioneering network engineers Vinton Cerf and Bob Kahn. The networks that use TCP/IP came to be known collectively as the Internet, and the standardization of TCP/IP allowed the number of Internet sites and users to grow exponentially. The TCP/IP protocols were initially developed as part of the research network built by the United States Defense Advanced Research Projects Agency (DARPA or ARPA). Initially, TCP was a single protocol designed to work at both the transport layer and the network layer. Because this combined the functions of two layers in one protocol, TCP was later split into TCP at the transport layer and IP at the network layer; hence the name “TCP/IP”. The process of dividing TCP into two portions began in version 3 of TCP, written in 1978. The first formal standard for the versions of IP and TCP used in modern networks (version 4) was created in 1980. This is why the first “real” version of IP is version 4 and not version 1. TCP/IP quickly became the standard protocol set for running the ARPAnet. In the 1980s, more and more machines and networks were connected to the evolving ARPAnet using TCP/IP protocols, and the TCP/IP Internet was born.
These TCP/IP protocols define a variety of functions considered to be OSI transport layer, or Layer 4, features. Some of these functions relate to things you see every day. For instance, when you open multiple web browsers on your PC, how does your PC know which browser should display the next web page? When a web server sends you 500 IP packets containing the various parts of a web page, and 1 packet has errors, how does your PC recover the lost data? This chapter covers how TCP and UDP perform these two functions, along with the other functions performed by the transport layer.
The OSI transport layer provides several features. A transport layer protocol can provide either connection-oriented or connectionless communication. TCP provides connection-oriented communication. Connection-oriented means that the hosts involved in a transmission first exchange certain information between themselves before any actual data is transmitted. If data gets lost, TCP resends it. Connectionless communication means that no messages are exchanged before the actual data transmission. A transport layer protocol like UDP just sends the data packet without exchanging any parameters and does not wait for any acknowledgment. It sends the data and assumes that it has been received.
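The difference between the two styles shows up directly in the socket API. The following is a minimal sketch using Python's standard socket module over the loopback address; the function names are made up for this illustration. With UDP the sender simply calls sendto(), while with TCP the client must call connect() first, which is where the operating system performs the connection setup.

```python
import socket


def udp_send_without_handshake(payload: bytes) -> bytes:
    """Send one UDP datagram over loopback; no connection is set up first."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)
        # Connectionless: just send, with no prior exchange and no ACK awaited.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


def tcp_send_after_connect(payload: bytes) -> bytes:
    """Send bytes over TCP on loopback; connect() must succeed before any data."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        # Connection-oriented: the connection is established here,
        # before a single byte of application data is sent.
        client.connect(listener.getsockname())
        conn, _addr = listener.accept()
        with conn:
            conn.settimeout(2.0)
            client.sendall(payload)
            return conn.recv(4096)
    finally:
        client.close()
        listener.close()
```

Both functions deliver the payload on a healthy loopback interface, but only the TCP version would detect and repair a loss on a real network.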
TCP provides a variety of useful features, including error recovery. In fact, TCP is best known for its error-recovery feature—but it does more. TCP, defined in RFC 793, performs several functions.
Multiplexing Using TCP Port Numbers:
TCP provides many features to applications, at the expense of slightly more processing and overhead compared to UDP. However, TCP and UDP both use a concept called multiplexing, so this section begins with an explanation of multiplexing with TCP and UDP. Afterward, the unique features of TCP and UDP are explored. Multiplexing by TCP and UDP concerns how a computer decides what to do with the data it receives.
The computer might be running many applications, such as a web browser, an e-mail package, or an FTP client. TCP and UDP multiplexing enables the receiving computer to know which application to give the data to. Suppose a client computer is currently using three applications to communicate with a server. When the server receives data packets, it must determine which server application each packet should be handed to. This problem is solved by port numbers, which are used by both TCP and UDP.
Well-known services use port numbers called well-known port numbers. When a client sends a web request to a server, it uses destination port number 80. When a client sends an FTP request, it uses destination port number 21, and for TFTP it uses destination port number 69. When the transport layer on the server receives the requests from the client, it looks at the destination port numbers and hands the packets to the corresponding services running on the server. A request containing destination port number 80 is handed to the web server, a request containing destination port number 21 is handed to the FTP server, and a request containing destination port number 69 is handed to the TFTP server.
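The hand-off by destination port can be pictured as a simple table lookup. This is only an illustrative sketch: the table and the function name are hypothetical, and a real operating system keeps this mapping inside its network stack. The protocol is part of the key because TCP and UDP keep separate port number spaces (web and FTP run over TCP, TFTP over UDP).

```python
# Hypothetical service table for illustration: (protocol, destination port) -> service.
SERVICES = {
    ("tcp", 80): "web server",
    ("tcp", 21): "FTP server",
    ("udp", 69): "TFTP server",
}


def deliver(protocol: str, dest_port: int) -> str:
    """Return the service that should receive data sent to this port."""
    return SERVICES.get((protocol, dest_port), "no service listening")


for proto, port in [("tcp", 80), ("tcp", 21), ("udp", 69), ("tcp", 8080)]:
    print(f"{proto.upper()} destination port {port} -> {deliver(proto, port)}")
```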
These are some of the popular services and the well-known port numbers they use. Client applications use any available port number that is not already in use. Port numbers range from 0 to 65535. The well-known ports range from 0 to 1023 and are assigned by IANA, the Internet Assigned Numbers Authority. It is an organization that oversees IP address, top-level domain, and Internet protocol code point allocations.
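As a small sketch of these ranges, the snippet below classifies a port number and asks the operating system for a free client port. The helper names are invented for this example; binding to port 0 is the standard way to let the OS choose an unused port.

```python
import socket

PORT_MAX = 65535        # highest valid port number
WELL_KNOWN_MAX = 1023   # well-known ports are 0 through 1023


def is_well_known(port: int) -> bool:
    """True if the port falls in the IANA well-known range."""
    if not 0 <= port <= PORT_MAX:
        raise ValueError(f"port {port} is outside 0..{PORT_MAX}")
    return port <= WELL_KNOWN_MAX


def pick_free_client_port() -> int:
    """Ask the OS for any port that is not currently in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 means "choose one for me"
        return s.getsockname()[1]


print(is_well_known(80))      # a web server port
print(is_well_known(1027))    # a typical client port from the examples
print(pick_free_client_port())
```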
Error Recovery or Reliability:
Another important function provided by TCP is error recovery, or reliability. To accomplish reliability, TCP numbers data bytes using the Sequence and Acknowledgment fields in the TCP header. Here you can see a web server sending 1000 bytes of data to a web browser, with sequence number 1000 in the TCP header. The web server sends another 1000 bytes of data with sequence number 2000 and yet another 1000 bytes of data with sequence number 3000. Next, the web browser sends an acknowledgment to the server for successfully receiving 3000 bytes. The 4000 in the acknowledgment field indicates the next byte the browser expects to receive.
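The acknowledgment arithmetic in this example can be written out in a few lines. This is a simplified sketch, not a real TCP implementation; the function name and the (sequence number, length) representation are assumptions made for the illustration.

```python
def next_expected_byte(first_seq, segments):
    """Compute the acknowledgment number for the segments received so far.

    segments is a list of (sequence_number, length) pairs. The result is the
    number of the next byte the receiver expects, counting only data that
    arrived without a gap.
    """
    expected = first_seq
    for seq, length in sorted(segments):
        if seq > expected:  # a gap: an earlier segment is missing
            break
        expected = max(expected, seq + length)
    return expected


received = [(1000, 1000), (2000, 1000), (3000, 1000)]
print(next_expected_byte(1000, received))  # 4000
```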
Here you can see that the web server continues sending data, each segment carrying 1000 bytes. After sending three segments, the web server waits for an acknowledgment from the web client. The web client sends an acknowledgment telling the server that the segment with sequence number 2000 was lost. TCP at the web server resends the second segment. Finally, the web client sends an acknowledgment informing the server that it can start transmitting the next segments. You may be wondering why the server sends only three segments of 1000 bytes each and then waits for an acknowledgment from the client.
That is what we discuss next.
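The loss and retransmission described above can be simulated with a short sketch. It assumes contiguous segments and a segment that is lost only on its first transmission; the function names and the dictionary representation are hypothetical, and real TCP also uses timers and other mechanisms not shown here.

```python
def cumulative_ack(first_seq, received):
    """Next byte expected, given a dict of {sequence_number: data} received."""
    expected = first_seq
    for seq in sorted(received):
        if seq > expected:  # gap: a segment before this one is missing
            break
        expected = max(expected, seq + len(received[seq]))
    return expected


def transfer(segments, lost_once):
    """Deliver contiguous segments, dropping those in lost_once on first try.

    Returns the list of acknowledgments sent by the receiver and the
    reassembled data.
    """
    first_seq = min(segments)
    end = max(seq + len(data) for seq, data in segments.items())
    still_to_drop = set(lost_once)
    received, acks = {}, []
    to_send = sorted(segments)
    while True:
        for seq in to_send:
            if seq in still_to_drop:
                still_to_drop.discard(seq)  # lost this one time only
                continue
            received[seq] = segments[seq]
        ack = cumulative_ack(first_seq, received)
        acks.append(ack)
        if ack >= end:
            break
        to_send = [ack]  # resend the segment the receiver is waiting for
    data = b"".join(received[seq] for seq in sorted(received))
    return acks, data


segments = {1000: b"a" * 1000, 2000: b"b" * 1000, 3000: b"c" * 1000}
acks, data = transfer(segments, lost_once={2000})
print(acks)  # [2000, 4000]
```

The first acknowledgment of 2000 tells the sender that the second segment is missing; after the resend, the acknowledgment jumps to 4000 because the third segment had already arrived.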
Flow Control Using Windowing:
TCP implements flow control by taking advantage of the Sequence and Acknowledgment fields in the TCP header, along with another field called the Window field. Here in the exhibit, you can see that the client is sending an acknowledgment to the server with a window size of 3000.
What does the window size mean? Using the Window field in the TCP header, the client tells the server how many bytes of data it is ready to receive before it sends an acknowledgment. So the server starts sending TCP segments of 1000 bytes each. Once the amount of unacknowledged data reaches the window size of 3000, the server has to wait for an acknowledgment from the client. The client sends the acknowledgment along with the next window size. But here you see that the next window size is 4000 instead of 3000. Why? The client advertised a window size of 3000 in the first acknowledgment and, after successfully receiving 3000 bytes, it increases the window size. It does so because there were no errors during the previous transmission, and the client judges that it can handle more data before sending an acknowledgment. So it is the client that decides how much data it is ready to handle at any time without an acknowledgment. Since the window size keeps increasing or decreasing depending on network and hardware performance, this flow control mechanism is also called a sliding window.
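The windowing behavior in this example can be sketched as follows. The function names are hypothetical, the segment size of 1000 bytes is taken from the example, and the sketch assumes that every segment arrives so that each round ends with a full acknowledgment.

```python
def segments_allowed(next_seq, window, segment_size=1000):
    """Sequence numbers the sender may transmit before it must wait for an ACK."""
    count = window // segment_size  # only whole segments that fit in the window
    return [next_seq + i * segment_size for i in range(count)]


def send_rounds(first_seq, advertised_windows, segment_size=1000):
    """Simulate loss-free rounds; each round uses the window the client advertised.

    Returns a list of (segments_sent, acknowledgment_number) pairs.
    """
    seq = first_seq
    rounds = []
    for window in advertised_windows:
        sent = segments_allowed(seq, window, segment_size)
        seq += len(sent) * segment_size  # receiver acknowledges everything sent
        rounds.append((sent, seq))
    return rounds


for sent, ack in send_rounds(1000, [3000, 4000]):
    print(f"sent {sent}, then waited for ACK {ack}")
```

With a window of 3000 the server sends segments 1000, 2000, and 3000 and then waits; with the larger window of 4000 it can send four segments before waiting again.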
Now you may be wondering why the client sends an acknowledgment of 1000 without any previous data transmission from the server. Let us find out.
Connection Establishment and Termination:
Before any TCP data transmission between two stations, a TCP connection must first be established. A TCP connection is established by a three-way handshake between the two stations. Here in the exhibit you see that the web browser is sending a request to the web server. It is the network application that selects which Layer 4 protocol to use; the web browser in this case uses TCP. When the browser hands the application data to the TCP stack, TCP initiates the three-way handshake by sending a SYN request to the web server’s TCP stack. Here SYN means “synchronize the sequence numbers.” The web client sends sequence number 200, destination port (DPORT) 80, and source port (SPORT) 1027. The web server’s TCP stack sends back a SYN, ACK response as the second step of the handshake.
Here you can see that the sequence number is 1450; that is, the web server is initializing its own TCP sequencing and acknowledging the client’s sequence number 200 by sending an ACK of 201 in response. The web server uses destination port (DPORT) 1027 and source port (SPORT) 80.
In the third step of the handshake, the client machine sends an ACK in response to the server’s SYN from the second step. Here the sequence number is 201 and the ACK is 1451. The destination port (DPORT) is 80, the port on which the web server listens for incoming connections. The source port (SPORT) is 1027, the port on which the web client receives replies. At this stage, the TCP three-way handshake is complete and the two stations are ready to exchange application data. This three-way handshake is referred to as TCP connection establishment. The sequence numbers are used to reassemble the data at the receiving end, because user data is often broken into smaller chunks that are carried in packets at the network layer, and the packets may be sent via different paths to the destination. Because the packets take different paths, they may arrive out of order, so the sequence number is used to put the data back in the proper order at the receiving end.
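The numbers exchanged in the three steps follow a simple rule: each side acknowledges the other side's initial sequence number plus one. The sketch below only models the header fields mentioned in the example; the Segment class and function name are invented for this illustration and are not part of any real TCP library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    flags: str           # "SYN", "SYN,ACK" or "ACK"
    seq: int             # sequence number
    ack: Optional[int]   # acknowledgment number (None on the first SYN)
    sport: int           # source port
    dport: int           # destination port


def three_way_handshake(client_seq, server_seq, client_port, server_port):
    """Return the three segments of a TCP connection establishment."""
    # Step 1: the client proposes its starting sequence number.
    syn = Segment("SYN", client_seq, None, client_port, server_port)
    # Step 2: the server picks its own sequence number and acknowledges the SYN.
    syn_ack = Segment("SYN,ACK", server_seq, syn.seq + 1, server_port, client_port)
    # Step 3: the client acknowledges the server's SYN.
    ack = Segment("ACK", syn.seq + 1, syn_ack.seq + 1, client_port, server_port)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(200, 1450, 1027, 80):
    print(segment)
```

Called with the values from the example, the second segment carries ACK 201 and the third carries sequence number 201 and ACK 1451.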
TCP Connection Termination:
Now we should also have some understanding of TCP connection termination. Here the client PC initiated a connection and now wants to terminate it after the data transmission. Connection termination occurs in four steps. First, the client sends an ACK for the last received data together with a FIN request and a sequence number of 1000. FIN here simply stands for finish. Upon receiving the FIN request from the client, the server acknowledges it by sending an ACK of 1001. Next, the server sends its own FIN with a sequence number of 1470 to the client to signal that it has no more data to send. In the fourth step, the client acknowledges the server by sending an ACK of 1471, and the TCP connection is terminated.
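The four steps can be laid out in the same way as the handshake. This is a minimal sketch with a hypothetical function name; it lists only the fields mentioned in the example. Each FIN is acknowledged with its sequence number plus one, because a FIN consumes one sequence number just as a SYN does.

```python
def four_step_close(client_seq, server_seq):
    """Return the four segments of a TCP connection termination.

    Each entry is (sender, flags, fields), where fields holds the
    sequence or acknowledgment number mentioned for that step.
    """
    return [
        ("client", "ACK,FIN", {"seq": client_seq}),   # step 1: client is finished
        ("server", "ACK", {"ack": client_seq + 1}),   # step 2: server acknowledges
        ("server", "FIN", {"seq": server_seq}),       # step 3: server is finished
        ("client", "ACK", {"ack": server_seq + 1}),   # step 4: client acknowledges
    ]


for sender, flags, fields in four_step_close(1000, 1470):
    print(sender, flags, fields)
```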