System level I/O with Unix/Linux


A Linux file is a sequence of m bytes: \[b_0, b_1, \ldots, b_{m-1}\]

All I/O devices, such as terminals, networks, and printers, are modeled as files.

Opening a file

An application opens a file by asking the kernel for access to it. The kernel returns a small non-negative integer called a file descriptor, which identifies the file in all later operations. The kernel keeps all the information about the open file; the application only keeps track of the descriptor.

Change file position

The kernel keeps a current position k for each open file, initially 0. An application can change it explicitly with lseek.

Read/write file

Reading a file copies n bytes from the file into memory, starting at the current position k, and then increments k by n.

Given a file of size m bytes, a read performed when k >= m triggers a condition called end-of-file (EOF), which the application can detect. There is no explicit EOF character at the end of the file, though.

Writing is similar. It copies n bytes from memory to the file starting at k and then increments k by n.
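
The way reads, writes, and lseek move the file position can be sketched as follows. This is a minimal illustration, not a definitive implementation: position_demo and its path argument (a scratch file the caller is willing to overwrite) are hypothetical names.

```c
#include <fcntl.h>
#include <unistd.h>

/* Writes "hello world" to a scratch file, then reads the first five bytes
 * twice, rewinding in between. Returns 0 on success, -1 on any failure.
 * `path` is a hypothetical scratch file chosen by the caller. */
int position_demo(const char *path, char first[6], char again[6])
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int ok = write(fd, "hello world", 11) == 11   /* k: 0 -> 11 */
          && lseek(fd, 0, SEEK_SET) == 0          /* k: 11 -> 0 */
          && read(fd, first, 5) == 5              /* k: 0 -> 5 */
          && lseek(fd, 0, SEEK_CUR) == 5          /* query k without moving it */
          && lseek(fd, 0, SEEK_SET) == 0          /* k: 5 -> 0 */
          && read(fd, again, 5) == 5;             /* same five bytes again */
    first[5] = again[5] = '\0';

    if (close(fd) < 0)
        ok = 0;
    unlink(path);                                 /* remove the scratch file */
    return ok ? 0 : -1;
}
```

Both buffers end up holding the same five bytes, because the second read starts from position 0 again.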

File types

Regular file

Contains arbitrary data.

Directory file

A file that contains an array of links, each mapping a filename to a file, which may itself be a directory. Every directory contains at least two entries: “.”, a link to the directory itself, and “..”, a link to its parent directory.


Socket

A file used to communicate with another process across a network.
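
The “.” and “..” directory links described above can be observed from user code by comparing device and inode numbers. The sketch below is one way to do that; same_file is a hypothetical helper name, and it relies only on the standard stat call.

```c
#include <sys/stat.h>

/* Returns 1 if both paths name the same file (same device and inode),
 * 0 if they name different files, and -1 if either path cannot be examined. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;
    if (stat(a, &sa) < 0 || stat(b, &sb) < 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For example, same_file("/", "/.") compares the root directory with its own “.” entry, and same_file("/", "/..") checks that the parent of the root directory is the root itself.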

Open and close a file

Opening a file

open takes a filename, flags, and a mode, and returns a file descriptor: the smallest integer not currently in use by the process. For example, in a process where only stdin (0), stdout (1), and stderr (2) are open, the first file opened gets fd == 3. On error, open returns -1.

int open(char *filename, int flags, mode_t mode);

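flags indicates how the process intends to access the file: O_RDONLY, O_WRONLY, or O_RDWR. It can also be ORed with one or more bit masks that give additional instructions, such as O_CREAT, O_TRUNC, or O_APPEND. mode gives the permission bits to use if a new file is created. For example, to open an existing file for writing and truncate it:

int fd = open("temp.txt", O_WRONLY | O_TRUNC, 0);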

Close file

int close(int fd);

close closes the file associated with the given fd. Closing a descriptor that is already closed is an error.

Always check return codes, even for seemingly benign functions such as close().
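
The descriptor numbering and the double-close error can be checked with a short experiment. This is a sketch under two assumptions that are not in the notes: the caller is single-threaded, and /dev/null exists. open_close_demo is a hypothetical name.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if every observation below holds, 0 otherwise.
 * Assumes a single-threaded caller and that /dev/null exists. */
int open_close_demo(void)
{
    int fd1 = open("/dev/null", O_RDONLY);
    if (fd1 < 0)
        return 0;
    int fd2 = open("/dev/null", O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return 0;
    }
    int ok = fd2 > fd1;               /* fd1 is taken, so fd2 must be larger */

    if (close(fd1) < 0)
        ok = 0;
    int fd3 = open("/dev/null", O_RDONLY);
    if (fd3 != fd1)                   /* the freed number is the smallest unused one */
        ok = 0;

    if (close(fd2) < 0)
        ok = 0;
    if (fd3 >= 0 && close(fd3) < 0)
        ok = 0;

    errno = 0;
    if (fd3 >= 0 && (close(fd3) != -1 || errno != EBADF))
        ok = 0;                       /* closing a closed descriptor fails with EBADF */
    return ok;
}
```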

Read/Write file

ssize_t read(int fd, void* buf, size_t n);

read copies at most n bytes from the current position of fd into buf. A return value of -1 indicates an error, 0 indicates EOF, and any other value is the number of bytes actually read.

ssize_t write(int fd, const void* buf, size_t n);

write copies at most n bytes from buf to the current position of fd. It returns the number of bytes actually written, or -1 on error.
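
The three kinds of return value from read can be handled in a simple loop. The sketch below counts the bytes remaining on a descriptor; count_bytes is a hypothetical helper name and the 4096-byte buffer size is an arbitrary choice.

```c
#include <unistd.h>

/* Counts the bytes remaining on fd by reading until EOF.
 * Returns the count, or -1 if read reports an error. */
long count_bytes(int fd)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)   /* n bytes were read */
        total += n;
    return n == 0 ? total : -1;                   /* 0 means EOF, -1 an error */
}
```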

Short Count

For both read and write, the return value might be smaller than n. This is called a short count. It can happen for three reasons:

  1. Encountering EOF during a read.
  2. Reading text lines from a terminal. No matter how big n is, each read returns one text line, and the return value is the length of that line.
  3. Reading from a socket, where network delays and buffering limits can split the data across several reads.
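
A common way to cope with short counts is to call read or write in a loop until the requested amount has been transferred. The following is a minimal sketch of that idea; read_fully and write_fully are hypothetical names, and retrying on EINTR is a design choice of this sketch rather than something stated in the notes.

```c
#include <errno.h>
#include <unistd.h>

/* Reads exactly n bytes unless EOF comes first.
 * Returns the number of bytes read (short only at EOF), or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal: retry */
            return -1;
        }
        if (r == 0)
            break;                    /* EOF */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}

/* Writes all n bytes. Returns n on success, or -1 on error. */
ssize_t write_fully(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal: retry */
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```

With these wrappers, a caller sees a short count from read_fully only when the input really ended.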

Standard I/O

The C standard library (libc) provides higher-level file functions, declared in <stdio.h>:

  1. Open and close files: fopen, fclose
  2. Read and write bytes: fread, fwrite
  3. Read and write text lines: fgets, fputs
  4. Formatted read and write: fscanf, fprintf

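Standard I/O models an open file as a stream, an abstraction for a file descriptor plus a buffer in memory.

Applications often read or write one byte at a time, and calling read or write for each byte is expensive because every call traps into the kernel. Buffered I/O collects the data in a user-space buffer first and transfers it in larger chunks.

EG: stdout is buffered. The buffer is flushed when we call fflush(stdout), when the program exits normally (exit or return from main), or, if stdout is a terminal, when a newline is written.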
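
The effect of the stream buffer can be made visible by asking the kernel for the file size before and after a flush. This is a sketch that assumes a typical libc where a fully buffered stream holds small writes in memory; buffer_demo, kernel_size, and the scratch path are hypothetical names.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file as the kernel currently sees it, or -1 on error. */
static long kernel_size(FILE *fp)
{
    struct stat st;
    return fstat(fileno(fp), &st) == 0 ? (long)st.st_size : -1;
}

/* Writes five bytes through a stream and records the kernel-visible file
 * size before and after fflush. Returns 0 on success, -1 on failure.
 * `path` is a hypothetical scratch file chosen by the caller. */
int buffer_demo(const char *path, long *before, long *after)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);   /* ask for full buffering explicitly */

    int ok = fputs("hello", fp) >= 0;    /* lands in the user-space buffer */
    *before = kernel_size(fp);           /* kernel has not seen the bytes yet */
    if (fflush(fp) != 0)                 /* pushes the buffer to the kernel */
        ok = 0;
    *after = kernel_size(fp);

    if (fclose(fp) != 0)
        ok = 0;
    remove(path);
    return ok ? 0 : -1;
}
```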

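Most of the time we would prefer standard I/O, but it has some problems:

  1. It provides no way to access file metadata.
  2. Its functions are not async-signal-safe, so they must not be used inside signal handlers.
  3. It is not well suited to network sockets.
  4. It places restrictions on mixing input and output on the same stream.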

Standard I/O streams are full duplex, so the same stream can be used for both reading and writing, but the buffer causes problems. More on the standard I/O restrictions:

  1. An input function cannot directly follow an output function. In between, we must call fflush, fseek, fsetpos, or rewind.
  2. An output function cannot directly follow an input function without a call to fseek, fsetpos, or rewind in between, unless the input function hit EOF.

The second restriction is a problem for network code, since we cannot seek on a socket.
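
On a regular file, both restrictions are easy to satisfy by repositioning the stream between the two kinds of operation. The sketch below writes a line, reads it back, and then appends another line on a single stream; update_stream_demo and the scratch path are hypothetical names.

```c
#include <stdio.h>

/* Writes a line, reads it back, then appends a second line, all on one
 * stream. Returns the final file length in bytes, or -1 on failure.
 * `path` is a hypothetical scratch file; `line` receives the first line. */
long update_stream_demo(const char *path, char *line, int size)
{
    FILE *fp = fopen(path, "w+");
    if (fp == NULL)
        return -1;

    long len = -1;
    if (fputs("line one\n", fp) >= 0) {      /* output */
        rewind(fp);                          /* reposition, so input may follow */
        if (fgets(line, size, fp) != NULL    /* input */
            && fseek(fp, 0, SEEK_END) == 0   /* reposition, so output may follow */
            && fputs("line two\n", fp) >= 0  /* output again */
            && fflush(fp) == 0)
            len = ftell(fp);                 /* position at end of file */
    }
    fclose(fp);
    remove(path);
    return len;
}
```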

One workaround is to open two streams on the same socket descriptor, one for reading and one for writing:

FILE *fpin, *fpout;
fpin = fdopen(sockfd, "r");   /* sockfd is an already-open socket descriptor */
fpout = fdopen(sockfd, "w");

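We need to call fclose on both streams to release their resources. But since both streams share the same underlying socket descriptor, the second fclose tries to close a descriptor that is already closed, which is an error (and, in a multithreaded program, may close a descriptor that another thread has just opened).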

When to use Unix IO

  1. Inside signal handlers, because read and write are async-signal-safe
  2. When performance matters a lot

What not to use for binary files

Text-oriented I/O

Such as fgets, which treats every newline byte in the data as the end of a line.

String functions

Such as strlen, because a ‘\0’ byte in the data will be interpreted as the string terminator.
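
The strlen problem is easy to reproduce with a few bytes of binary data that contain a zero byte. In this small sketch, record, length_via_strlen, and length_via_sizeof are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Five bytes of binary data with a zero byte in the middle. */
static const unsigned char record[5] = { 0x41, 0x42, 0x00, 0x43, 0x44 };

/* Treats the data as a C string: counting stops at the zero byte. */
size_t length_via_strlen(void)
{
    return strlen((const char *)record);
}

/* Uses the real size of the data, independent of its contents. */
size_t length_via_sizeof(void)
{
    return sizeof record;
}
```

For binary data, the length has to be carried separately, as read and fread do through their return values.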

How the kernel represents open files

The kernel represents open files with three related tables:

Descriptor table: Each process has its own descriptor table, indexed by the process's open file descriptors. Each open entry points to an entry in the file table.

File table: Shared by all processes. Each entry represents one open file and holds the current file position, a reference count of the descriptor entries that point to it, and a pointer to an entry in the v-node table. Closing a descriptor decrements the reference count of the associated file table entry, and the kernel deletes the entry when the count reaches zero.

v-node table: Also shared by all processes. Each entry holds the stat information about the file.

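In the simplest case, two file descriptors refer to two different files: each points to its own file table entry, and each entry points to a different v-node.

Two file descriptors can also refer to the same file through different file table entries, EG, after calling open twice on the same filename. Each descriptor then has its own file position.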
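
The case of two opens on the same filename can be checked by reading one byte through each descriptor. This is a minimal sketch; two_opens_demo is a hypothetical name and path must name an existing, non-empty file.

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens `path` twice and reads one byte through each descriptor.
 * Returns 0 on success, -1 on failure. */
int two_opens_demo(const char *path, char *c1, char *c2)
{
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);

    int ok = fd1 >= 0 && fd2 >= 0
          && read(fd1, c1, 1) == 1    /* advances fd1's position only */
          && read(fd2, c2, 1) == 1;   /* fd2 still starts at position 0 */

    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return ok ? 0 : -1;
}
```

Both reads return the first byte of the file, since each open call created a separate file table entry with its own position.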

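Forking lets the child inherit the files opened by the parent.

Before forking, the parent's descriptors point to their file table entries as usual. After forking, the child has its own copy of the parent's descriptor table, so parent and child descriptors point to the same file table entries and share the same file positions.

The reference count of each shared file table entry is also increased by 1.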
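
Because parent and child share a file table entry after fork, a read in the child moves the position seen by the parent. The sketch below illustrates this; fork_shared_offset is a hypothetical name and path must name an existing file with at least two bytes.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Opens `path`, lets a child read one byte, then reads one byte in the
 * parent. Returns 0 on success, -1 on failure. */
int fork_shared_offset(const char *path, char *parent_byte)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        char c;
        /* Child: the inherited descriptor shares the parent's position. */
        _exit(read(fd, &c, 1) == 1 ? 0 : 1);
    }

    int status;
    int ok = waitpid(pid, &status, 0) == pid
          && WIFEXITED(status) && WEXITSTATUS(status) == 0
          && read(fd, parent_byte, 1) == 1;   /* continues after the child's read */
    close(fd);
    return ok ? 0 : -1;
}
```

The parent receives the second byte of the file, not the first, because the child's read already advanced the shared position.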

IO redirection

Linux shells let the user redirect standard output to a file on disk. For example,

$ ls > foo.txt

redirects the standard output of ls to foo.txt.

The shell can achieve this with dup2(int oldfd, int newfd). Within the calling process, dup2 copies the descriptor table entry of oldfd into the entry of newfd, overwriting whatever was there.

EG: dup2(4,1) copies the file table pointer stored in descriptor table entry 4 into entry 1, so descriptors 1 and 4 refer to the same open file.

If newfd is already open, dup2 closes it first (decrementing the reference count of its file table entry) before copying.
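
A redirection in the style of `ls > foo.txt` can be sketched with open, dup, and dup2. This is an illustration only, not how any particular shell is implemented; write_redirected is a hypothetical helper that also restores the original standard output afterwards.

```c
#include <fcntl.h>
#include <unistd.h>

/* Writes msg to descriptor 1 while descriptor 1 is redirected to `path`,
 * then restores the original standard output.
 * Returns 0 on success, -1 on failure. */
int write_redirected(const char *path, const char *msg, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int saved = dup(STDOUT_FILENO);            /* remember the original stdout */
    if (saved < 0) {
        close(fd);
        return -1;
    }

    int ok = dup2(fd, STDOUT_FILENO) == STDOUT_FILENO;  /* entry 1 now refers to the file */
    close(fd);                                 /* entry 1 keeps the file open */
    if (ok)
        ok = write(STDOUT_FILENO, msg, len) == (ssize_t)len;

    if (dup2(saved, STDOUT_FILENO) < 0)        /* restore the original stdout */
        ok = 0;
    close(saved);
    return ok ? 0 : -1;
}
```

A shell would typically do the open and dup2 steps in the child process between fork and exec, so the launched program writes to descriptor 1 without knowing it was redirected.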
