Interprocess Communication: TCP sockets
Goals
- Learn to work with TCP sockets. In this lab you will work with a pair of programs which implement the client/server paradigm. The server will be a type of program known as a daemon, which performs the following tasks in an infinite loop: wait for a request to arrive, process the request, and send back a response. The client will be a program that crafts a request, sends it to the server, receives and processes the response, and then terminates.
Credits
The material for this lab was developed by Prof. L. Felipe Perrone. Permission to reuse this material in parts or in its entirety is granted provided that this “credits” note is not removed. Additional student files associated with this lab, as well as any existing solutions, can be provided upon request by e-mail to: perrone[at]bucknell[dot]edu
TCP Sockets
You have learned that the Unix pipe is a construct for interconnecting two processes that execute on the same machine. Unix pipes follow the byte stream service model, meaning that you work with them by pushing bytes in on the write end and pulling bytes out from the read end. Since access to pipes is provided via Unix file descriptors, the programmer can use the same “file” read and write system calls to operate on them.
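As a reminder of what the byte stream model means in practice, here is a minimal sketch in C. The helper names write_all and read_upto are hypothetical (they are not part of the lab files). The point is that read(2) and write(2) may transfer fewer bytes than requested, so code that needs a fixed number of bytes typically loops.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes to fd, retrying on
 * short writes and on EINTR. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: read until len bytes have arrived or the
 * writer has closed its end. Returns bytes read, or -1 on error. */
ssize_t read_upto(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;              /* end of stream */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Both helpers take a plain file descriptor, so they should work unchanged on either end of a pipe and, later in this lab, on a connected socket.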
The concept of a TCP socket is very similar to that of a pipe. The most fundamental difference is that TCP sockets serve to interconnect two processes that execute on arbitrary machines. Whether the two processes execute on the same host or on networked hosts across the world from each other, the set up and operations on the sockets are the same.
You should think of a socket as a communication endpoint. If a socket interconnects processes on arbitrary hosts on the Internet, the first thing that should occur to you is that sockets must be related to Internet addresses. When we say Internet address, it might occur to you that we’re somehow referring to IP addresses, which we use to pinpoint hosts on the Internet. An IP address, however, can only identify a host, not an application process within that particular host. If you need to pinpoint a specific application process within a host, you need to extend this concept of address to the pair <IP address, port number>, where port number serves to identify an application within a host.
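In the sockets API, the pair <IP address, port number> for IPv4 is stored in a struct sockaddr_in. The sketch below shows one way to fill it in; the function name make_endpoint is hypothetical, and it assumes the address is given as a dotted-decimal string rather than a host name.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *out with the endpoint <ip, port>.
 * ip is a dotted-decimal IPv4 string such as "127.0.0.1".
 * Returns 0 on success, -1 if ip is not a valid IPv4 address. */
int make_endpoint(const char *ip, unsigned short port,
                  struct sockaddr_in *out)
{
    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;          /* IPv4 */
    out->sin_port = htons(port);        /* port in network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;                      /* not a valid IPv4 string */
    return 0;
}
```

Note the call to htons: the port number is stored in network byte order, which may differ from the byte order of your machine.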
This mapping of application to port number doesn’t happen by magic, of course. An application must bind to a port number within a given host, and it must choose a port number that is not used by the system for a standard service. Take a look at /etc/services to find a large number of well-known ports that are used by standard applications. The port numbers you use should never conflict with these. In fact, you should be using port numbers in “user space”; pick something at random in the range 8,000 to 10,000 to experiment with your programs.
In this lab you will work with a pair of programs which implement the client/server paradigm. The server (echod) is a type of program known as a daemon, which performs the following tasks in an infinite loop: wait for a request to arrive, process the request, and send back a response. The client (echoreq) is a program that crafts a request, sends it to the server, receives and processes the response, and then terminates.
The basic design pattern for client/server applications based on TCP sockets is illustrated in the figure below.
The figure shows the sequence of calls to functions in the socket library that are appropriate for the client process and for the server process. TCP sockets implement a high-level abstraction that gives the programmer a byte-stream communication channel across networked hosts that is reliable and order-preserving. Once the connection is set up, you read from and write to a socket much as you would a pipe, except that a connected socket carries data in both directions.
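The usual call sequence can be sketched in C as follows. This is only an outline under stated assumptions: the function names open_listener and connect_to are hypothetical, the backlog of 5 and the use of SO_REUSEADDR are illustrative choices, and errors are reported by returning -1 rather than through the wrappers you will write for this lab.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Server side: socket -> bind -> listen.
 * Returns a listening descriptor, or -1 on error. The server would
 * then call accept() on it in a loop to obtain one connected
 * descriptor per client. */
int open_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow quick restarts of the server on the same port. */
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local interface */
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side: socket -> connect.
 * ip is a dotted-decimal IPv4 string. Returns a connected
 * descriptor, or -1 on error. */
int connect_to(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After accept() on the server and connect() on the client have both returned, each side holds a descriptor that can be used with read(2), write(2), and close(2).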
Note: You will need to turn in a Makefile that generates all the objects and the executables for this lab assignment. This includes a rule for building wrappers.o as described below. Failing to include this Makefile will lead to additional issues in grading your lab.
Problem 3 (10 points)
Go through the code you have written for previous labs and find all the wrapper functions you wrote to substitute for system and library calls. Create two files with all the wrappers you have written:
- wrappers.h – this file will contain the function prototypes (and only the prototypes) of your wrapper functions. It will be included by any programs you write in this and in future labs, which use the corresponding system and library calls.
- wrappers.c – this file will contain the complete implementations of your wrapper functions. It will be separately compiled into object code by a rule in your Makefile. The resulting object file will be linked with the programs in this and in future lab assignments.
Your files should include wrappers for the following functions: fork(2), pipe(2), wait(2), waitpid(2), open(2), close(2), write(2), read(2), connect(2), bind(2), listen(2), accept(2), and possibly socket(2), plus any others you use in this lab which set the “errno” variable when encountering an error condition. There are two versions of open(2) with different numbers of parameters. We ask you to implement the version with three arguments, i.e., open(const char *pathname, int flags, mode_t mode). While C doesn’t support function overloading, it does support variable argument lists. If you are interested in the topic, we suggest you do a search to find out how this may be done.
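For the curious, here is a minimal sketch of how a single wrapper could accept either two or three arguments using <stdarg.h>. This is optional and is not the three-argument version the lab asks for. The name Open and the policy of exiting on error are assumptions made for illustration.

```c
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical variadic wrapper around open(2).
 * The third argument (mode) is read only when O_CREAT is set,
 * which is the case in which open(2) itself looks at it. */
int Open(const char *pathname, int flags, ...)
{
    mode_t mode = 0;

    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        /* Variadic arguments undergo default promotion, so the mode
         * is fetched as an int and converted back to mode_t. */
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    int fd = open(pathname, flags, mode);
    if (fd < 0) {
        perror("open");
        exit(EXIT_FAILURE);
    }
    return fd;
}
```

With this sketch, both Open(path, O_RDONLY) and Open(path, O_CREAT | O_WRONLY, 0600) compile and behave like the corresponding calls to open(2).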
You must use the proper include guard for the wrappers.h file. That is, have the pair of guards to surround the prototypes (a.k.a. headers) for these functions. We use the symbol WRAPPERS_H here. You may choose the one you’d like to use. If you aren’t sure how to use the include guards, you may go back to your CSCI 206 exercises to see the details.
#ifndef WRAPPERS_H
#define WRAPPERS_H
/* Your function headers and others are defined in this block */
...
#endif
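To make the split between the two files concrete, here is a small sketch showing three wrappers in a single listing. The capitalized names (Pipe, Read, Write) and the choice to print a message with perror and exit on failure are assumptions; use whatever convention you adopted in previous labs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Prototypes: in the lab these would go in wrappers.h,
 * between the #ifndef/#define and #endif of the include guard. */
int     Pipe(int pipefd[2]);
ssize_t Read(int fd, void *buf, size_t count);
ssize_t Write(int fd, const void *buf, size_t count);

/* Implementations: in the lab these would go in wrappers.c. */
int Pipe(int pipefd[2])
{
    int rc = pipe(pipefd);
    if (rc < 0) {
        perror("pipe");         /* prints a message based on errno */
        exit(EXIT_FAILURE);
    }
    return rc;
}

ssize_t Read(int fd, void *buf, size_t count)
{
    ssize_t n = read(fd, buf, count);
    if (n < 0) {
        perror("read");
        exit(EXIT_FAILURE);
    }
    return n;
}

ssize_t Write(int fd, const void *buf, size_t count)
{
    ssize_t n = write(fd, buf, count);
    if (n < 0) {
        perror("write");
        exit(EXIT_FAILURE);
    }
    return n;
}
```

Because each wrapper has the same parameters and return type as the call it replaces, switching a program over is a matter of including wrappers.h and capitalizing the call.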
Problem 4 (25 points)
First of all, you should implement the communication between the client and the server, that is, you will augment the two files given to you so that echoreq sends a string to echod, which receives the string and sends it back to echoreq without any changes.
Once that functionality works, you will add new functionality to echod. It’s a very familiar problem and you can reuse the code for tokenizing a string and eliminating extraneous spaces, which you wrote for Lab 2.
When the server receives a string with an arbitrary number of spaces between words, it will “clean it up” before returning it to echoreq. By “clean up” you should understand that the string returned will have exactly one space between any pair of consecutive words.
For example, if the server receives a string such as:
this    is a   test of     the emergency   broadcast      system
It returns to the client the following string:
this is a test of the emergency broadcast system
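One possible way to do the clean-up is an in-place pass with a read index and a write index, sketched below. This is not necessarily how your Lab 2 code works (a tokenizing approach is equally valid). The function name squeeze_spaces is hypothetical, and the sketch assumes that leading and trailing whitespace should be dropped and that tabs and newlines count as spaces.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: collapse every run of whitespace in s into a
 * single space, dropping leading and trailing whitespace.
 * Works in place and returns the new length of s. */
size_t squeeze_spaces(char *s)
{
    size_t w = 0;               /* next position to write */
    int pending_space = 0;      /* a separator is owed before next word */

    for (size_t r = 0; s[r] != '\0'; r++) {
        if (isspace((unsigned char)s[r])) {
            if (w > 0)
                pending_space = 1;  /* ignore whitespace before first word */
        } else {
            if (pending_space) {
                s[w++] = ' ';       /* exactly one space between words */
                pending_space = 0;
            }
            s[w++] = s[r];
        }
    }
    s[w] = '\0';
    return w;
}
```

The write index never overtakes the read index, so no extra buffer is needed; the server can clean the received buffer and send it straight back.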
For the sake of debugging your code, you can put both your client and your server on the same host: your own workstation. However, to verify that everything is working to specifications, make sure to put your client and your server each on a different host. For example, you can run your server on linuxremote-csci315 and run your client programs on a lab Linux machine.
Also note: in order to send a command line argument that contains spaces, you can enclose the argument in double quotes ("). Thus, on the server side:
[<you>@linuxremote-csci315 Lab04]$ ./echod <a port number of your choice>
while on the client side:
[<you>@<your machine> Lab04]$ ./echoreq linuxremote-csci315 <that same port number> "Hello You!"
ECHOREQ: from server = Hello You!
Problem 5 (20 points)
Create a file called lab04.txt to write answers to the following questions. Please write as concisely and clearly as possible. Make sure to label each answer with its proper number, e.g., Problem 5.1.
- If no calls were made to fork in either of the programs you have, why is it that we’re claiming that TCP sockets are a mechanism for interprocess communication?
- Is the socket functionality provided by the kernel or by an external library? Present an argument to justify your answer from the perspectives of the actual implementation on Linux and from the perspective of the design decisions made for the construction of the operating system (think about the implications of these decisions on the performance of a modern networked computer). You should think about what makes sense, do a little research to verify whether you are on the right track, reason about your findings, and only then write your conclusions.
- Only when you consider a program’s communication needs and the operational constraints around a program can you choose which of the two IPC mechanisms is most appropriate. Describe what drives your decision to use either pipes or sockets to interconnect two processes.
- echoreq makes a call to gethostbyname(3). Explain what this library call does for your program and how you use its API.
- gethostbyname(3) is not the most up-to-date function for its kind of task (it is deprecated). Discover what function might eventually replace gethostbyname(3) and explain how differently it might work.
Problem 6 (15 points)
Copy the echoreq.c file to a new file called echoreq2.c. Once you have discovered a modern alternative to gethostbyname(3), modify echoreq2.c so that it uses this new function. Otherwise, the behavior of echoreq2.c should be identical to what you see in your solution to Problem 4. Note that in this case, you may need to use the compiler switch -std=gnu99 in your Makefile: under -std=c99 the system headers hide many POSIX declarations, including some of the newer networking functions, whereas gnu99 exposes them.
Submission
When you are done with everything, you need to:
- cd ~/csci315/Labs/Lab4
- git pull
- git add Makefile
- git add lab04.txt
- git add echod.c
- git add echoreq.c
- git add echoreq2.c
- git add wrappers.h
- git add wrappers.c
- git commit -m "lab 4 completed"
- git push
Before turning in your work for grading, create a text file in your Lab 4 directory called submission.txt. In this file, provide a list to indicate to the grader, problem by problem, if you completed the problem and whether it works to specification. Wrap everything up by turning in this file:
- git add ~/csci315/Labs/Lab4/submission.txt
- git commit -m "Lab 4 completed"
- git push
Grading Rubric
Problem 3 [10 points total]
- [6 points] Implement wrappers correctly for the “new” system calls, connect, bind, listen, and accept;
- [4 points] Implement wrappers correctly for old functions.
Problem 4 [25 points total]
- [5 points] Use the wrapper functions you implemented in echoreq.c and echod.c;
- [15 points] Implement the communications between echoreq and echod correctly;
- [5 points] Correctly remove extra blank spaces in the message.
Problem 5 [20 points total]
- 4 points for each of the answers in lab04.txt.
Problem 6 [15 points total]
- [10 points] Use the correct name resolution function other than gethostbyname();
- [5 points] Correctly implement echoreq2.c using the name resolution function found above.
Note: [up to -10 points] An incorrect or incomplete Makefile to build all programs in the lab assignment.
A note about your use of git for source version control:
Yes, we have been using git primarily as a means for you to place your work in a remote repository that the graders can access. HOWEVER, we can’t forget that the primary benefit of a source version control system is to help out the developer, that is: YOU!
Consider committing your code (even if only to your local repository) as soon as you have determined that you got something working. Heck, you can commit also partially debugged versions of your code, if you want to create a checkpoint. This can help you immensely if you turn out to screw something up, by accident, and want to recover your files from a previous checkpoint. To recover the previous state recorded in your repositories, the commands below are useful:
- git log -> will let you see the history of commits to your repository and find the name of a revision, which is needed if you want to recover that revision. Note that unlike svn, revision names in git are long, unwieldy hashes (written as hexadecimal numbers). To know what you are looking for, you will have to rely on the comment that you entered when you committed the revision. Do you see now that it is important to use a meaningful message with your commits?
- git reset -> read about this in your favorite git tutorial or man page.
- git checkout -> read about this in your favorite git tutorial or man page.