Ar (Unix)

ar
Original author(s): Ken Thompson, Dennis Ritchie (AT&T Bell Laboratories)
Developer(s): Various open-source and commercial developers
Initial release: November 3, 1971
Written in: C
Operating system: Unix, Unix-like, V, Plan 9, Inferno
Platform: Cross-platform
Type: Command
License: Plan 9: MIT License

archiver format
Filename extension: .a, .lib, .ar[1]
Internet media type: application/x-archive[1]
Magic number: !<arch>
Type of format: archive format
Container for: usually object files (.o)
Standard: Not standardized, several variants exist
Open format?: Yes[2]

ar, short for archiver, is a shell command that maintains multiple files as a single archive file; in other words, it is a file archiver. It is often used to create and update the static library files that the link editor (linker) uses, and to generate deb format packages for the Debian Linux distribution. It can be used to create archives for any purpose, but has been largely replaced by tar for purposes other than static libraries.[3]

Originally developed for Unix, the command is widely available on Unix-based systems, and similar commands are available on other platforms. An implementation is included in GNU Binutils.[2] In the Linux Standard Base (LSB), the command has been deprecated and is expected to disappear in a future release of that standard. The rationale provided was that "the LSB does not include software development utilities nor does it specify .o and .a file formats."[4]

File format

Diagram showing an example file structure of a .deb file

The format of a file that results from using ar has never been standardized. Modern archives are based on a common format with two main variants, BSD and System V (initially known as COFF, and also used by GNU, ELF, and Windows). Historically there have been other variants[5] including V6, V7, AIX (small[6] and big[7]), and Coherent, which all vary significantly from the common format.[8]

Structure

An archive file begins with a header that identifies the file type, followed by a section for each contained file. Each contained file section consists of a header followed by the file content. The headers consist solely of printable ASCII characters and line feeds. As a result, an archive containing only text files is itself a text file.

The content of a contained file begins on an even byte boundary. A newline is inserted between files as padding, if necessary. The size stored in the header nevertheless excludes this padding.[9]
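The padding rule can be sketched in a few lines of Python. The function names here are hypothetical and chosen only for illustration; the sketch assumes the stored size is already known from the member header.

```python
def padded_size(size: int) -> int:
    """Bytes occupied by a member's data, including the optional pad byte.

    An odd-sized member is followed by one newline so that the next
    header starts on an even offset; the stored size excludes that byte.
    """
    return size + (size & 1)


def next_header_offset(data_offset: int, size: int) -> int:
    """Offset of the next member header, given where this member's data starts."""
    return data_offset + padded_size(size)
```

For instance, a 5-byte member occupies 6 bytes in the archive, while a 4-byte member needs no padding.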

Archive header

The first header, a.k.a. file signature, is a magic number that encodes the ASCII string !<arch> followed by a single line feed character (0x0A).
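A reader would typically check this signature before anything else. The following is a minimal sketch; the name is_ar_archive is an assumption made for this example and does not come from any particular library.

```python
import io

# The 8-byte signature: the ASCII string "!<arch>" plus one line feed (0x0A).
AR_MAGIC = b"!<arch>\n"


def is_ar_archive(stream) -> bool:
    """Return True if the binary stream starts with the common ar signature."""
    return stream.read(len(AR_MAGIC)) == AR_MAGIC


# Usage with an in-memory stream; a real caller would pass open(path, "rb").
looks_like_ar = is_ar_archive(io.BytesIO(b"!<arch>\n"))
```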

Contained file header

Each file is preceded by a header that contains information about the file. The common format is as follows. Numeric values are encoded in ASCII and all values are right-padded with spaces (0x20).

Offset  Length  Content                                   Format
0       16      File identifier                           ASCII
16      12      File modification timestamp (in seconds)  Decimal
28      6       Owner ID                                  Decimal
34      6       Group ID                                  Decimal
40      8       File mode (type and permission)           Octal
48      10      File size in bytes                        Decimal
58      2       Ending characters                         0x60 0x0A
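The field layout above maps directly onto byte slices. The sketch below decodes one 60-byte header of the common format; MemberHeader and parse_member_header are hypothetical names, and treating a blank numeric field as 0 is an assumption of this example rather than something the format requires.

```python
from typing import NamedTuple

HEADER_SIZE = 60  # 16 + 12 + 6 + 6 + 8 + 10 + 2


class MemberHeader(NamedTuple):
    identifier: str
    mtime: int
    uid: int
    gid: int
    mode: int
    size: int


def _num(field: bytes, base: int = 10) -> int:
    # Numeric fields are ASCII digits right-padded with spaces.
    # Assumption: a blank field is read as 0.
    text = field.decode("ascii").strip()
    return int(text, base) if text else 0


def parse_member_header(raw: bytes) -> MemberHeader:
    """Decode a contained-file header laid out as in the table above."""
    if len(raw) != HEADER_SIZE:
        raise ValueError("a member header is exactly 60 bytes")
    if raw[58:60] != b"`\n":  # ending characters 0x60 0x0A
        raise ValueError("missing header ending characters")
    return MemberHeader(
        identifier=raw[0:16].decode("ascii").rstrip(" "),
        mtime=_num(raw[16:28]),
        uid=_num(raw[28:34]),
        gid=_num(raw[34:40]),
        mode=_num(raw[40:48], 8),  # the mode is stored in octal
        size=_num(raw[48:58]),
    )
```

The identifier is returned with only the padding removed; interpreting it (for example a trailing slash or a "#1/" prefix) depends on the variant.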

Variants

Variants of the command were developed to address issues including:

File name length limitation
The GNU and BSD variants devised different methods of storing long file names.
Global symbol table
Many implementations include a global symbol table (a.k.a. armap, directory or index) so that the linker can find a symbol quickly without scanning the whole archive. POSIX recognizes this feature and requires implementations to have an -s option for updating it. Most implementations store it as the first file entry.[10]
Year 2038 problem
Although the common format is not at risk of this problem, many implementations are vulnerable to failure in that year.

BSD

The BSD implementation stores file names right-padded with ASCII spaces. This causes issues with spaces inside file names. The 4.4BSD implementation stores extended file names by placing the string "#1/" followed by the file name length in the file name field, and storing the real file name in front of the data section.[8]
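As an illustration of the 4.4BSD scheme, the sketch below splits a member's data into its real name and its content. The function name read_bsd_name is hypothetical. The sketch assumes that the data passed in starts with the embedded name, and that the name may be padded with NUL bytes, which some implementations add.

```python
from typing import Tuple


def read_bsd_name(identifier: str, data: bytes) -> Tuple[str, bytes]:
    """Return (file name, file content) for a BSD-style archive member.

    identifier is the 16-byte name field with padding already decoded;
    data is the member's data section as stored in the archive.
    """
    if identifier.startswith("#1/"):
        # "#1/<length>": the real name occupies the first <length> bytes of data.
        name_len = int(identifier[3:])
        name = data[:name_len].rstrip(b"\0").decode("utf-8", "replace")
        return name, data[name_len:]
    # Short names live directly in the name field, right-padded with spaces.
    return identifier.rstrip(" "), data
```

For example, read_bsd_name("#1/20", b"a long file name.txt" + b"hello") separates the 20-byte name from the 5 bytes of content.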

The BSD implementation traditionally does not handle the building of a global symbol lookup table, and delegates this task to a separate utility, ranlib,[11] which inserts an architecture-specific file named __.SYMDEF as the first archive member.[12] Some descendants put a space and "SORTED" after the name to indicate a sorted version.[13] A 64-bit variant called __.SYMDEF_64 exists on Darwin.

To conform to POSIX, newer BSD implementations support the -s option instead of ranlib. FreeBSD in particular abandoned the SYMDEF table format and adopted the System V style table.[14]

System V (or GNU)

The System V implementation uses a slash ('/') to mark the end of the file name, which allows spaces in names without resorting to an extended file name. Extended file names are stored together in the data section of a special entry named "//", and the headers of later entries refer to this record. A header references an extended file name by storing a "/" followed by a decimal offset to the start of the file name in the extended file name data section.[15] The "//" entry itself is simply a list of the long file names, each separated by one or more LF characters. It is usually the second entry of the archive, after the symbol table, which is always the first.
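The lookup described above can be sketched as follows. The function resolve_sysv_name and the SPECIAL_NAMES tuple are hypothetical names introduced for this example; the sketch assumes the caller has already read the data of the "//" entry into name_table.

```python
# Identifiers with a reserved meaning, which must not be treated as file names.
SPECIAL_NAMES = ("/", "//", "/SYM64/")


def resolve_sysv_name(identifier: str, name_table: bytes = b"") -> str:
    """Turn a System V/GNU name field into the member's file name."""
    ident = identifier.rstrip(" ")
    if ident in SPECIAL_NAMES:
        return ident
    if ident.startswith("/") and ident[1:].isdigit():
        # "/<offset>": a byte offset into the "//" data, not an item index.
        start = int(ident[1:])
        end = name_table.find(b"\n", start)
        if end == -1:
            end = len(name_table)
        name = name_table[start:end].decode("utf-8", "replace")
        # Stored long names may carry the same trailing slash as short ones.
        return name[:-1] if name.endswith("/") else name
    # Short name: the trailing slash marks the end, so inner spaces survive.
    return ident[:-1] if ident.endswith("/") else ident
```

Because the slash terminates the name, a field such as "my file.o/" resolves to "my file.o" even though it contains a space.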

The System V implementation uses the special file name "/" to denote that the following data entry contains a symbol lookup table, which is used in ar libraries to speed up access. This symbol table is built in three parts which are recorded together as contiguous data.

  1. A 32-bit big-endian integer, giving the number of entries in the table.
  2. A set of 32-bit big-endian integers, one for each symbol, recording the position within the archive of the header for the file containing this symbol.
  3. A set of zero-terminated strings. Each is a symbol name, and they occur in the same order as the list of positions in part 2.

Some System V systems do not use this format. For operating systems such as HP-UX 11.0, this information is stored in a data structure based on the SOM file format.

The special file "/" is not terminated with a specific sequence; the end is assumed once the last symbol name has been read.
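The three-part layout of the "/" entry can be decoded with Python's struct module. This is a sketch under the assumption that data holds the complete data section of the "/" entry in the 32-bit format; parse_sysv_symbol_table is a hypothetical name.

```python
import struct
from typing import List, Tuple


def parse_sysv_symbol_table(data: bytes) -> List[Tuple[str, int]]:
    """Return (symbol name, member header offset) pairs from a "/" entry."""
    # Part 1: the number of entries, as a 32-bit big-endian integer.
    (count,) = struct.unpack_from(">I", data, 0)
    # Part 2: one 32-bit big-endian archive offset per symbol.
    offsets = struct.unpack_from(f">{count}I", data, 4)
    # Part 3: zero-terminated names, in the same order as the offsets.
    # There is no end marker, so reading stops after `count` names.
    names = data[4 + 4 * count:].split(b"\0")[:count]
    return [(name.decode("ascii", "replace"), off)
            for name, off in zip(names, offsets)]
```

Each returned offset points at the header of the archive member that defines the symbol, so a linker can seek straight to that member.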

To overcome the 4 GiB file size limit, some systems, such as Solaris 11.2 and GNU, use a variant lookup table. Instead of 32-bit integers, 64-bit integers are used in the symbol lookup tables. The string "/SYM64/" is used instead of "/" as the identifier for this table.[16]

Windows

The Windows (PE/COFF) variant is based on the SysV/GNU variant. The first entry "/" has the same layout as the SysV/GNU symbol table. The second entry is another "/", a Microsoft extension that stores an extended symbol cross-reference table. This one is sorted and uses little-endian integers.[5][17] The third entry is the optional "//" long name data as in SysV/GNU.[18]

Thin archive

The GNU binutils and Elfutils implementations have an additional "thin archive" format with the magic number !<thin>. A thin archive contains only a symbol table and references to the member files. The file format is essentially a System V format archive where every file is stored without its data section. Every file name is stored as a "long" file name, and the names are resolved as if they were symbolic links.[19]

Examples

The following command creates an archive libclass.a with object files class1.o, class2.o, class3.o:

ar rcs libclass.a class1.o class2.o class3.o

The linker ld can read object code from an archive file. The following example shows how the archive libclass.a (specified as -lclass) is linked with the object code of main.o.

ld main.o -lclass

References

  1. ^ a b "application/x-archive". Archived from the original on 2019-12-08. Retrieved 2019-03-11.
  2. ^ a b "ar(1) – Linux man page". Retrieved 3 October 2013.
  3. ^ "Static Libraries". TLDP. Retrieved 3 October 2013.
  4. ^ Linux Standard Base Core Specification, version 4.1, Chapter 15. Commands and Utilities > ar
  5. ^ a b Levine, John R. (2000) [October 1999]. "Chapter 6: Libraries". Linkers and Loaders. The Morgan Kaufmann Series in Software Engineering and Programming (1 ed.). San Francisco, USA: Morgan Kaufmann. ISBN 1-55860-496-0. OCLC 42413382. Archived from the original on 2012-12-05. Retrieved 2020-01-12.
  6. ^ "ar File Format (Small)". IBM.
  7. ^ "ar File Format (Big)". IBM.
  8. ^ a b Manual page for NET/2 ar file format
  9. ^ "ar.h". www.unix.com. The UNIX and Linux Forums.
  10. ^ ar – Shell and Utilities Reference, The Single UNIX Specification, Version 5 from The Open Group
  11. ^ Manual page for NET/2 ranlib utility
  12. ^ Manual page for NET/2 ranlib file format
  13. ^ "ranlib.h". opensource.apple.com.
  14. ^ ar(5) – FreeBSD File Formats Manual
  15. ^ An offset is a number of characters; not a line or item index.
  16. ^ "ar.h(3HEAD)". docs.oracle.com. Oracle Corporation. 11 November 2014. Retrieved 14 November 2018.
  17. ^ Pietrek, Matt (April 1998), "Under The Hood", Microsoft Systems Journal, archived from the original on 2007-06-24, retrieved 2014-08-23
  18. ^ "llvm-mirror/llvm: archive.cpp (format detection)". GitHub. Retrieved 10 February 2020.
  19. ^ "ar". GNU Binary Utilities.