mirror of git://sourceware.org/git/glibc.git
				
				
				
			
		
			
				
	
	
		
			396 lines
		
	
	
		
			17 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
			
		
		
	
	
			396 lines
		
	
	
		
			17 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
| @node I/O Overview, I/O on Streams, Pattern Matching, Top
 | |
| @c %MENU% Introduction to the I/O facilities
 | |
| @chapter Input/Output Overview
 | |
| 
 | |
| Most programs need to do either input (reading data) or output (writing
 | |
| data), or most frequently both, in order to do anything useful.  The GNU
 | |
| C library provides such a large selection of input and output functions
 | |
| that the hardest part is often deciding which function is most
 | |
| appropriate!
 | |
| 
 | |
| This chapter introduces concepts and terminology relating to input
 | |
| and output.  Other chapters relating to the GNU I/O facilities are:
 | |
| 
 | |
| @itemize @bullet
 | |
| @item
 | |
| @ref{I/O on Streams}, which covers the high-level functions
 | |
| that operate on streams, including formatted input and output.
 | |
| 
 | |
| @item
 | |
| @ref{Low-Level I/O}, which covers the basic I/O and control
 | |
| functions on file descriptors.
 | |
| 
 | |
| @item
 | |
| @ref{File System Interface}, which covers functions for operating on
 | |
| directories and for manipulating file attributes such as access modes
 | |
| and ownership.
 | |
| 
 | |
| @item
 | |
| @ref{Pipes and FIFOs}, which includes information on the basic interprocess
 | |
| communication facilities.
 | |
| 
 | |
| @item
 | |
| @ref{Sockets}, which covers a more complicated interprocess communication
 | |
| facility with support for networking.
 | |
| 
 | |
| @item
 | |
| @ref{Low-Level Terminal Interface}, which covers functions for changing
 | |
| how input and output to terminals or other serial devices are processed.
 | |
| @end itemize
 | |
| 
 | |
| 
 | |
| @menu
 | |
| * I/O Concepts::       Some basic information and terminology.
 | |
| * File Names::         How to refer to a file.
 | |
| @end menu
 | |
| 
 | |
| @node I/O Concepts, File Names,  , I/O Overview
 | |
| @section Input/Output Concepts
 | |
| 
 | |
| Before you can read or write the contents of a file, you must establish
 | |
| a connection or communications channel to the file.  This process is
 | |
| called @dfn{opening} the file.  You can open a file for reading, writing,
 | |
| or both.
 | |
| @cindex opening a file
 | |
| 
 | |
| The connection to an open file is represented either as a stream or as a
 | |
| file descriptor.  You pass this as an argument to the functions that do
 | |
| the actual read or write operations, to tell them which file to operate
 | |
| on.  Certain functions expect streams, and others are designed to
 | |
| operate on file descriptors.
 | |
| 
 | |
| When you have finished reading to or writing from the file, you can
 | |
| terminate the connection by @dfn{closing} the file.  Once you have
 | |
| closed a stream or file descriptor, you cannot do any more input or
 | |
| output operations on it.
 | |
| 
 | |
| @menu
 | |
| * Streams and File Descriptors::    The GNU Library provides two ways
 | |
| 			             to access the contents of files.
 | |
| * File Position::                   The number of bytes from the
 | |
|                                      beginning of the file.
 | |
| @end menu
 | |
| 
 | |
| @node Streams and File Descriptors, File Position,  , I/O Concepts
 | |
| @subsection Streams and File Descriptors
 | |
| 
 | |
| When you want to do input or output to a file, you have a choice of two
 | |
| basic mechanisms for representing the connection between your program
 | |
| and the file: file descriptors and streams.  File descriptors are
 | |
| represented as objects of type @code{int}, while streams are represented
 | |
| as @code{FILE *} objects.
 | |
| 
 | |
| File descriptors provide a primitive, low-level interface to input and
 | |
| output operations.  Both file descriptors and streams can represent a
 | |
| connection to a device (such as a terminal), or a pipe or socket for
 | |
| communicating with another process, as well as a normal file.  But, if
 | |
| you want to do control operations that are specific to a particular kind
 | |
| of device, you must use a file descriptor; there are no facilities to
 | |
| use streams in this way.  You must also use file descriptors if your
 | |
| program needs to do input or output in special modes, such as
 | |
| nonblocking (or polled) input (@pxref{File Status Flags}).
 | |
| 
 | |
| Streams provide a higher-level interface, layered on top of the
 | |
| primitive file descriptor facilities.  The stream interface treats all
 | |
| kinds of files pretty much alike---the sole exception being the three
 | |
| styles of buffering that you can choose (@pxref{Stream Buffering}).
 | |
| 
 | |
| The main advantage of using the stream interface is that the set of
 | |
| functions for performing actual input and output operations (as opposed
 | |
| to control operations) on streams is much richer and more powerful than
 | |
| the corresponding facilities for file descriptors.  The file descriptor
 | |
| interface provides only simple functions for transferring blocks of
 | |
| characters, but the stream interface also provides powerful formatted
 | |
| input and output functions (@code{printf} and @code{scanf}) as well as
 | |
| functions for character- and line-oriented input and output.
 | |
| @c !!! glibc has dprintf, which lets you do printf on an fd.
 | |
| 
 | |
| Since streams are implemented in terms of file descriptors, you can
 | |
| extract the file descriptor from a stream and perform low-level
 | |
| operations directly on the file descriptor.  You can also initially open
 | |
| a connection as a file descriptor and then make a stream associated with
 | |
| that file descriptor.
 | |
| 
 | |
| In general, you should stick with using streams rather than file
 | |
| descriptors, unless there is some specific operation you want to do that
 | |
| can only be done on a file descriptor.  If you are a beginning
 | |
| programmer and aren't sure what functions to use, we suggest that you
 | |
| concentrate on the formatted input functions (@pxref{Formatted Input})
 | |
| and formatted output functions (@pxref{Formatted Output}).
 | |
| 
 | |
| If you are concerned about portability of your programs to systems other
 | |
| than GNU, you should also be aware that file descriptors are not as
 | |
| portable as streams.  You can expect any system running @w{ISO C} to
 | |
| support streams, but non-GNU systems may not support file descriptors at
 | |
| all, or may only implement a subset of the GNU functions that operate on
 | |
| file descriptors.  Most of the file descriptor functions in the GNU
 | |
| library are included in the POSIX.1 standard, however.
 | |
| 
 | |
| @node File Position,  , Streams and File Descriptors, I/O Concepts
 | |
| @subsection File Position
 | |
| 
 | |
| One of the attributes of an open file is its @dfn{file position} that
 | |
| keeps track of where in the file the next character is to be read or
 | |
| written.  In the GNU system, and all POSIX.1 systems, the file position
 | |
| is simply an integer representing the number of bytes from the beginning
 | |
| of the file.
 | |
| 
 | |
| The file position is normally set to the beginning of the file when it
 | |
| is opened, and each time a character is read or written, the file
 | |
| position is incremented.  In other words, access to the file is normally
 | |
| @dfn{sequential}.
 | |
| @cindex file position
 | |
| @cindex sequential-access files
 | |
| 
 | |
| Ordinary files permit read or write operations at any position within
 | |
| the file.  Some other kinds of files may also permit this.  Files which
 | |
| do permit this are sometimes referred to as @dfn{random-access} files.
 | |
| You can change the file position using the @code{fseek} function on a
 | |
| stream (@pxref{File Positioning}) or the @code{lseek} function on a file
 | |
| descriptor (@pxref{I/O Primitives}).  If you try to change the file
 | |
| position on a file that doesn't support random access, you get the
 | |
| @code{ESPIPE} error.
 | |
| @cindex random-access files
 | |
| 
 | |
| Streams and descriptors that are opened for @dfn{append access} are
 | |
| treated specially for output: output to such files is @emph{always}
 | |
| appended sequentially to the @emph{end} of the file, regardless of the
 | |
| file position.  However, the file position is still used to control where in
 | |
| the file reading is done.
 | |
| @cindex append-access files
 | |
| 
 | |
| If you think about it, you'll realize that several programs can read a
 | |
| given file at the same time.  In order for each program to be able to
 | |
| read the file at its own pace, each program must have its own file
 | |
| pointer, which is not affected by anything the other programs do.
 | |
| 
 | |
| In fact, each opening of a file creates a separate file position.
 | |
| Thus, if you open a file twice even in the same program, you get two
 | |
| streams or descriptors with independent file positions.
 | |
| 
 | |
| By contrast, if you open a descriptor and then duplicate it to get
 | |
| another descriptor, these two descriptors share the same file position:
 | |
| changing the file position of one descriptor will affect the other.
 | |
| 
 | |
| @node File Names,  , I/O Concepts, I/O Overview
 | |
| @section File Names
 | |
| 
 | |
| In order to open a connection to a file, or to perform other operations
 | |
| such as deleting a file, you need some way to refer to the file.  Nearly
 | |
| all files have names that are strings---even files which are actually
 | |
| devices such as tape drives or terminals.  These strings are called
 | |
| @dfn{file names}.  You specify the file name to say which file you want
 | |
| to open or operate on.
 | |
| 
 | |
| This section describes the conventions for file names and how the
 | |
| operating system works with them.
 | |
| @cindex file name
 | |
| 
 | |
| @menu
 | |
| * Directories::                 Directories contain entries for files.
 | |
| * File Name Resolution::        A file name specifies how to look up a file.
 | |
| * File Name Errors::            Error conditions relating to file names.
 | |
| * File Name Portability::       File name portability and syntax issues.
 | |
| @end menu
 | |
| 
 | |
| 
 | |
| @node Directories, File Name Resolution,  , File Names
 | |
| @subsection Directories
 | |
| 
 | |
| In order to understand the syntax of file names, you need to understand
 | |
| how the file system is organized into a hierarchy of directories.
 | |
| 
 | |
| @cindex directory
 | |
| @cindex link
 | |
| @cindex directory entry
 | |
| A @dfn{directory} is a file that contains information to associate other
 | |
| files with names; these associations are called @dfn{links} or
 | |
| @dfn{directory entries}.  Sometimes, people speak of ``files in a
 | |
| directory'', but in reality, a directory only contains pointers to
 | |
| files, not the files themselves.
 | |
| 
 | |
| @cindex file name component
 | |
| The name of a file contained in a directory entry is called a @dfn{file
 | |
| name component}.  In general, a file name consists of a sequence of one
 | |
| or more such components, separated by the slash character (@samp{/}).  A
 | |
| file name which is just one component names a file with respect to its
 | |
| directory.  A file name with multiple components names a directory, and
 | |
| then a file in that directory, and so on.
 | |
| 
 | |
| Some other documents, such as the POSIX standard, use the term
 | |
| @dfn{pathname} for what we call a file name, and either @dfn{filename}
 | |
| or @dfn{pathname component} for what this manual calls a file name
 | |
| component.  We don't use this terminology because a ``path'' is
 | |
| something completely different (a list of directories to search), and we
 | |
| think that ``pathname'' used for something else will confuse users.  We
 | |
| always use ``file name'' and ``file name component'' (or sometimes just
 | |
| ``component'', where the context is obvious) in GNU documentation.  Some
 | |
| macros use the POSIX terminology in their names, such as
 | |
| @code{PATH_MAX}.  These macros are defined by the POSIX standard, so we
 | |
| cannot change their names.
 | |
| 
 | |
| You can find more detailed information about operations on directories
 | |
| in @ref{File System Interface}.
 | |
| 
 | |
| @node File Name Resolution, File Name Errors, Directories, File Names
 | |
| @subsection File Name Resolution
 | |
| 
 | |
| A file name consists of file name components separated by slash
 | |
| (@samp{/}) characters.  On the systems that the GNU C library supports,
 | |
| multiple successive @samp{/} characters are equivalent to a single
 | |
| @samp{/} character.
 | |
| 
 | |
| @cindex file name resolution
 | |
| The process of determining what file a file name refers to is called
 | |
| @dfn{file name resolution}.  This is performed by examining the
 | |
| components that make up a file name in left-to-right order, and locating
 | |
| each successive component in the directory named by the previous
 | |
| component.  Of course, each of the files that are referenced as
 | |
| directories must actually exist, be directories instead of regular
 | |
| files, and have the appropriate permissions to be accessible by the
 | |
| process; otherwise the file name resolution fails.
 | |
| 
 | |
| @cindex root directory
 | |
| @cindex absolute file name
 | |
| If a file name begins with a @samp{/}, the first component in the file
 | |
| name is located in the @dfn{root directory} of the process (usually all
 | |
| processes on the system have the same root directory).  Such a file name
 | |
| is called an @dfn{absolute file name}.
 | |
| @c !!! xref here to chroot, if we ever document chroot. -rm
 | |
| 
 | |
| @cindex relative file name
 | |
| Otherwise, the first component in the file name is located in the
 | |
| current working directory (@pxref{Working Directory}).  This kind of
 | |
| file name is called a @dfn{relative file name}.
 | |
| 
 | |
| @cindex parent directory
 | |
| The file name components @file{.} (``dot'') and @file{..} (``dot-dot'')
 | |
| have special meanings.  Every directory has entries for these file name
 | |
| components.  The file name component @file{.} refers to the directory
 | |
| itself, while the file name component @file{..} refers to its
 | |
| @dfn{parent directory} (the directory that contains the link for the
 | |
| directory in question).  As a special case, @file{..} in the root
 | |
| directory refers to the root directory itself, since it has no parent;
 | |
| thus @file{/..} is the same as @file{/}.
 | |
| 
 | |
| Here are some examples of file names:
 | |
| 
 | |
| @table @file
 | |
| @item /a
 | |
| The file named @file{a}, in the root directory.
 | |
| 
 | |
| @item /a/b
 | |
| The file named @file{b}, in the directory named @file{a} in the root directory.
 | |
| 
 | |
| @item a
 | |
| The file named @file{a}, in the current working directory.
 | |
| 
 | |
| @item /a/./b
 | |
| This is the same as @file{/a/b}.
 | |
| 
 | |
| @item ./a
 | |
| The file named @file{a}, in the current working directory.
 | |
| 
 | |
| @item ../a
 | |
| The file named @file{a}, in the parent directory of the current working
 | |
| directory.
 | |
| @end table
 | |
| 
 | |
| @c An empty string may ``work'', but I think it's confusing to
 | |
| @c try to describe it.  It's not a useful thing for users to use--rms.
 | |
| A file name that names a directory may optionally end in a @samp{/}.
 | |
| You can specify a file name of @file{/} to refer to the root directory,
 | |
| but the empty string is not a meaningful file name.  If you want to
 | |
| refer to the current working directory, use a file name of @file{.} or
 | |
| @file{./}.
 | |
| 
 | |
| Unlike some other operating systems, the GNU system doesn't have any
 | |
| built-in support for file types (or extensions) or file versions as part
 | |
| of its file name syntax.  Many programs and utilities use conventions
 | |
| for file names---for example, files containing C source code usually
 | |
| have names suffixed with @samp{.c}---but there is nothing in the file
 | |
| system itself that enforces this kind of convention.
 | |
| 
 | |
| @node File Name Errors, File Name Portability, File Name Resolution, File Names
 | |
| @subsection File Name Errors
 | |
| 
 | |
| @cindex file name errors
 | |
| @cindex usual file name errors
 | |
| 
 | |
| Functions that accept file name arguments usually detect these
 | |
| @code{errno} error conditions relating to the file name syntax or
 | |
| trouble finding the named file.  These errors are referred to throughout
 | |
| this manual as the @dfn{usual file name errors}.
 | |
| 
 | |
| @table @code
 | |
| @item EACCES
 | |
| The process does not have search permission for a directory component
 | |
| of the file name.
 | |
| 
 | |
| @item ENAMETOOLONG
 | |
| This error is used when either the total length of a file name is
 | |
| greater than @code{PATH_MAX}, or when an individual file name component
 | |
| has a length greater than @code{NAME_MAX}.  @xref{Limits for Files}.
 | |
| 
 | |
| In the GNU system, there is no imposed limit on overall file name
 | |
| length, but some file systems may place limits on the length of a
 | |
| component.
 | |
| 
 | |
| @item ENOENT
 | |
| This error is reported when a file referenced as a directory component
 | |
| in the file name doesn't exist, or when a component is a symbolic link
 | |
| whose target file does not exist.  @xref{Symbolic Links}.
 | |
| 
 | |
| @item ENOTDIR
 | |
| A file that is referenced as a directory component in the file name
 | |
| exists, but it isn't a directory.
 | |
| 
 | |
| @item ELOOP
 | |
| Too many symbolic links were resolved while trying to look up the file
 | |
| name.  The system has an arbitrary limit on the number of symbolic links
 | |
| that may be resolved in looking up a single file name, as a primitive
 | |
| way to detect loops.  @xref{Symbolic Links}.
 | |
| @end table
 | |
| 
 | |
| 
 | |
| @node File Name Portability,  , File Name Errors, File Names
 | |
| @subsection Portability of File Names
 | |
| 
 | |
| The rules for the syntax of file names discussed in @ref{File Names},
 | |
| are the rules normally used by the GNU system and by other POSIX
 | |
| systems.  However, other operating systems may use other conventions.
 | |
| 
 | |
| There are two reasons why it can be important for you to be aware of
 | |
| file name portability issues:
 | |
| 
 | |
| @itemize @bullet
 | |
| @item
 | |
| If your program makes assumptions about file name syntax, or contains
 | |
| embedded literal file name strings, it is more difficult to get it to
 | |
| run under other operating systems that use different syntax conventions.
 | |
| 
 | |
| @item
 | |
| Even if you are not concerned about running your program on machines
 | |
| that run other operating systems, it may still be possible to access
 | |
| files that use different naming conventions.  For example, you may be
 | |
| able to access file systems on another computer running a different
 | |
| operating system over a network, or read and write disks in formats used
 | |
| by other operating systems.
 | |
| @end itemize
 | |
| 
 | |
| The @w{ISO C} standard says very little about file name syntax, only that
 | |
| file names are strings.  In addition to varying restrictions on the
 | |
| length of file names and what characters can validly appear in a file
 | |
| name, different operating systems use different conventions and syntax
 | |
| for concepts such as structured directories and file types or
 | |
| extensions.  Some concepts such as file versions might be supported in
 | |
| some operating systems and not by others.
 | |
| 
 | |
| The POSIX.1 standard allows implementations to put additional
 | |
| restrictions on file name syntax, concerning what characters are
 | |
| permitted in file names and on the length of file name and file name
 | |
| component strings.  However, in the GNU system, you do not need to worry
 | |
| about these restrictions; any character except the null character is
 | |
| permitted in a file name string, and there are no limits on the length
 | |
| of file name strings.
 |