Reposted from: http://www.faqs.org/docs/Linux-HOWTO/GCC-HOWTO.html
The Linux GCC HOWTO
Daniel Barlow
May 1999
This document covers how to set up the GNU C compiler and development libraries under Linux, and gives an overview of compiling, linking, running and debugging programs under it. Most of the material in it has been taken from Mitch D'Souza's GCC-FAQ or the ELF-HOWTO - it replaces both documents.
This is the first version to be written in DocBook instead of the old Linuxdoc format, and may contain markup errors. Please let me know if you find anything wrong.
As can be determined from the long times between updates of this document, I don't actually have the time or inclination to maintain it much. If you have, can, and want to, drop me some email describing what you'd do with it and why you think you'd be good at it.
Preliminaries
ELF vs. a.out, libc 5 vs 6
Three years ago when this document was first created, I opened this section by saying "Linux development is in a state of flux right now" and going on to describe how ELF was replacing the older a.out binary format.
It still is in a state of flux. It always will be. Though that particular change is long since past, development of the Linux kernel and the surrounding system continues to happen, and things change for developers as a result. So it's a good idea to know upfront what kind of system you have in front of you.
The possible candidates, in order of age, are
libc 4, a.out: very old systems
libc 5, ELF: Red Hat 4.2, Debian 2.0
libc 6 (a.k.a glibc 2), ELF: Red Hat 5 - 5.2, Debian 2.1
libc 6.1 (a.k.a glibc 2.1), ELF: Red Hat 6
$ ldd /bin/ls
libc.so.6 => /lib/libc.so.6 (0x4000e000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
This document was created on a Debian 2.1 system, so no surprise there.
It's entirely possible that the system you're using may have a mix of different versions on it. What you probably want to know in that case is the version that its C development environment is set up for, so you're best off compiling "hello world" and running ldd on the output thus created. Note that for historical reasons, gcc defaults to an output file called a.out even on ELF systems, so don't assume anything from that.
Administrata
The copyright information and like legalese can be found at the end of this document, together with the statutory warnings about asking dumb questions on Usenet, revealing your ignorance of the C language by reporting bugs which aren't, and picking your nose while chewing gum.
Typography
If you're reading this in PostScript, dvi, or html format, you get to see a little more font variation than people with the plain text version. In particular, filenames, commands, command output and source code excerpts are set in some form of typewriter font, whereas `variables' and random things that need emphasizing are emphasized.
You also get a usable index. In dvi or PostScript, the numbers in the index are section numbers. In HTML they're just sequentially assigned numbers that you can click on. In the plain text version, they really are just numbers. Get an upgrade!
The Bourne (rather than C) shell syntax is used in examples. C shell users will want to use
% setenv FOO bar
where the examples show
$ FOO=bar; export FOO
If the prompt shown is # rather than $, the command shown will probably only work as root. Of course, I accept no responsibility for anything that happens to your system as a result of trying these examples. Have a nice day :-)
Where to get things
In the three years since the first `HOWTO' version of this, useful Linux distributions have become prevalent. So, where once I'd have spent pages listing FTP sites and hours updating (failing to update) version numbers and directory names, now I will simply say - your distribution maintainer should be taking care of this for you. If you don't have, say, gcc installed, find the RPM or the deb packages that contain it, and install it. If that isn't an option because you don't have a friendly distribution, you've almost certainly been using Linux long enough that you don't need me to tell you where to find things anyway.
This document
You're reading it. You probably have it already.
This document is one of the Linux HOWTO series, so is probably already installed somewhere in /usr/doc if you're reading this on a Linux box. Failing that, it is available from all Linux HOWTO repositories (try Metalab) and (possibly in a slightly newer version) at my personal web site www.telent.net.
Other documentation
The official documentation for gcc is in the source distribution (see below) as texinfo files, and as .info files. If you have a fast network connection, a cdrom, or a reasonable amount of patience, you can just untar it and copy the relevant bits into /usr/info. If not, you may find them at tsx-11, but not necessarily always the latest version.
There are two sources of documentation for libc. GNU libc comes with info files which describe Linux libc fairly accurately except for stdio. Also, the manpages archive is written for Linux and describes a lot of system calls (section 2) and libc functions (section 3).
GCC
There are two answers.
(a) The official Linux GCC distribution can always be found in binary (ready-compiled) form at . At the time of writing, 2.7.2 (gcc-2.7.2.bin.tar.gz) is the latest version.
(b) The latest source distribution of GCC from the Free Software Foundation can be had from GNU archives. This is not necessarily always the same version as above, though it is just now. The Linux GCC maintainer(s) have made it easy for you to compile the latest version available yourself --- the configure script should set it all up for you. Check tsx-11 as well, for patches which you may want to apply.
To compile anything non-trivial (and quite a few trivial things also) you will also need the
C library and header files
What you want here depends on (i) whether your system is ELF or a.out, and (ii) which you want it to be. If you're upgrading from libc 4 to libc 5, you are recommended to look at the ELF-HOWTO from approximately the same place as you found this document.
These are available from tsx-11 as above:
libc-5.2.18.bin.tar.gz --- ELF shared library images, static libraries and include files for the C and maths libraries.
libc-5.2.18.tar.gz --- Source for the above. You will also need the .bin. package for the header files. If you are deliberating whether to compile the C library yourself or use the binaries, the right answer in nearly all cases is to use the binaries. You will however need to roll your own if you want NYS or shadow password support.
libc-4.7.5.bin.tar.gz --- a.out shared library images and static libraries for version 4.7.5 of the C library and friends. This is designed to coexist with the libc 5 package above, but is only really necessary if you wish to keep using/developing a.out format programs.
Associated tools (as, ld, ar, strings etc)
From tsx-11, just like everything else so far. The current version is binutils-2.6.0.2.bin.tar.gz.
Note that the binutils are only available in ELF, the current libc version is in ELF and the a.out libc is happiest when used in conjunction with an ELF libc. C library development is moving emphatically ELFwards, and unless you have really good reasons for needing a.out things you're encouraged to follow suit.
GCC installation and setup
GCC versions
You can find out what GCC version you're running by typing gcc -v at the shell prompt. This is also a fairly reliable way to find out whether you are set up for ELF or a.out. On my system it does
$ gcc -v
Reading specs from /usr/lib/gcc-lib/i486-box-linux/2.7.2/specs
gcc version 2.7.2
The key things to note here are
i486. This indicates that the gcc you are using was built for a 486 processor --- you might have 386 or 586 instead. All of these chips can run code compiled for each of the others; the difference is that the 486 code has added padding in some places so runs faster on a 486. This has no detrimental performance effect on a 386, but does make the binaries slightly larger.
box. This is not at all important, and may say something else (such as slackware or debian) or nothing at all (so that the complete directory name is i486-linux). If you build your own gcc, you can set this at build time for cosmetic effect. Just like I did :-)
linux. This may instead say linuxelf or linuxaout, and, confusingly, the meaning of each varies according to the version that you are using.
linux means ELF if the version is 2.7.0 or newer, a.out otherwise.
linuxaout means a.out. It was introduced as a target when the definition of linux was changed from a.out to ELF, so you won't see any linuxaout gcc older than 2.7.0.
linuxelf is obsolete. It is generally a version of gcc 2.6.3 set to produce ELF executables. Note that gcc 2.6.3 has known bugs when producing code for ELF --- an upgrade is advisable.
2.7.2 is the version number.
So, in summary, I have gcc 2.7.2 producing ELF code. Quelle surprise.
Where did it go?
If you installed gcc without watching, or if you got it as part of a distribution, you may like to find out where it lives in the filesystem. The key bits are
/usr/lib/gcc-lib/target/version/ (and subdirectories) is where most of the compiler lives. This includes the executable programs that do actual compiling, and some version-specific libraries and include files.
/usr/bin/gcc is the compiler driver --- the bit that you can actually run from the command line. This can be used with multiple versions of gcc provided that you have multiple compiler directories (as above) installed. To find out the default version it will use, type gcc -v. To force it to another version, type gcc -V version. For example
# gcc -v
Reading specs from /usr/lib/gcc-lib/i486-box-linux/2.7.2/specs
gcc version 2.7.2
# gcc -V 2.6.3 -v
Reading specs from /usr/lib/gcc-lib/i486-box-linux/2.6.3/specs
gcc driver version 2.7.2 executing gcc version 2.6.3
/usr/target/(bin|lib|include)/. If you have multiple targets installed (for example, a.out and elf, or a cross-compiler of some sort), the libraries, binutils (as, ld and so on) and header files for the non-native target(s) can be found here. Even if you only have one kind of gcc installed you might find anyway that various bits for it are kept here. If not, they're in /usr/(bin|lib|include).
/lib/, /usr/lib and others are library directories for the native system. You will also need /lib/cpp for many applications (X makes quite a lot of use of it) --- either copy it from /usr/lib/gcc-lib/target/version/ or make a symlink pointing there.
Where are the header files?
Apart from whatever you install yourself under /usr/local/include, there are three main sources of header files in Linux:
Most of /usr/include/ and its subdirectories are supplied with the libc binary package from H J Lu. I say `most' because you may also have files from other sources (curses and dbm libraries, for example) in here, especially if you are using the newest libc distribution (which doesn't come with curses or dbm, unlike the older ones).
/usr/include/linux and /usr/include/asm (for the files <linux ... #endif
Use __linux__ for this purpose, not linux. Although the latter is defined, it is not POSIX compliant.
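As a minimal sketch of the advice above, the fragment below picks a code path at compile time by testing __linux__. The function name platform_name is made up for illustration and does not come from any system header.

```c
/* Hypothetical helper: report which branch the preprocessor selected. */
const char *platform_name(void)
{
#ifdef __linux__
    return "linux";   /* compiled only when the target is Linux */
#else
    return "other";   /* fallback for every other target */
#endif
}
```

Calling platform_name() from a test program shows which branch the compiler actually saw.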
Compiler invocation
The documentation for compiler switches is the gcc info page (in Emacs, use C-h i then select the `gcc' option). Your distributor may not have packed this with your system, or you may have an old version; the best thing to do in this case is to download the gcc source archive from a GNU archive site or one of its mirrors, and copy the info files out of it.
The gcc manual page (gcc.1) is, generally speaking, out of date. It will warn you of this when you try to look at it.
Compiler flags
gcc can be made to optimize its output code by adding -On to its command line, where n is an optional small integer. Meaningful values of n, and their exact effect, vary according to the exact version, but typically it ranges from 0 (no optimization) to 2 (lots) or 3 (lots and lots).
Internally, gcc translates these to a series of -f and -m options. You can see exactly which -O levels map to which options by running gcc with the -v flag and the (undocumented) -Q flag. For example, for -O2, mine says
enabled: -fdefer-pop -fcse-follow-jumps -fcse-skip-blocks
-fexpensive-optimizations
-fthread-jumps -fpeephole -fforce-mem -ffunction-cse -finline
-fcaller-saves -fpcc-struct-return -frerun-cse-after-loop
-fcommon -fgnu-linker -m80387 -mhard-float -mno-soft-float
-mno-386 -m486 -mieee-fp -mfp-ret-in-387
Using an optimization level higher than your compiler supports (e.g. -O6) will have exactly the same effect as using the highest level that it does support. Distributing code which is set to compile this way is a poor idea though --- if further optimisations are incorporated into future versions, you (or your users) may find that they break your code.
Users of gcc 2.7.0 thru 2.7.2 should note that there is a bug in -O2 on these. Specifically, strength reduction doesn't work. A patch can be had to fix this if you feel like recompiling gcc, otherwise make sure that you always compile with -fno-strength-reduce.
Processor-specific
There are other -m flags which aren't turned on by any variety of -O but are nevertheless useful. Chief among these are -m386 and -m486, which tell gcc to favour the 386 or 486 respectively. Code compiled with one of these will still work on the other; 486 code is bigger, but otherwise not slower on the 386.
There is currently no -mpentium or -m586. Linus suggests using -m486 -malign-loops=2 -malign-jumps=2 -malign-functions=2, to get 486 code optimisations but without the big gaps for alignment (which the pentium doesn't need). Michael Meissner (of Cygnus) says
"My hunch is that -mno-strength-reduce also results in faster code on the x86 (note, I'm not talking about the strength reduction bug, which is another issue). This is because the x86 is rather register starved (and GCC's method of grouping registers into spill registers vs. other registers doesn't help either). Strength reduction typically results in using additional registers to replace multiplications with addition. I also suspect -fcaller-saves may also be a loss."
"Another hunch is that -fomit-frame-pointer might or might not be a win. On the one hand, it can mean that another register is available for allocation. On the other hand, the way the x86 encodes its instruction set, means that stack relative addresses take more space instead of frame relative addresses, which means slightly less Icache available to the program. Also, -fomit-frame-pointer, means that the compiler has to constantly adjust the stack pointer after calls, while with a frame, it can let the stack accumulate for a few calls."
The final word on this subject is from Linus again:
"Note that if you want to get optimal performance, don't believe me: test. There are lots of gcc compiler switches, and it may be that a particular set gives the best optimizations for you."
Internal compiler error: cc1 got fatal signal 11
Signal 11 is SIGSEGV, or `segmentation violation'. Usually it means that the program got its pointers confused and tried to write to memory it didn't own. So, it could be a gcc bug.
gcc is, however, a well tested and reliable piece of software, for the most part. It also uses a large number of complex data structures, and an awful lot of pointers. In short, it's the pickiest RAM tester commonly available. If you can't duplicate the bug --- if it doesn't stop in the same place when you restart the compilation --- it's almost certainly a problem with your hardware (CPU, memory, motherboard or cache). Don't claim it as a bug because your computer passes the power-on checks or runs Windows ok or whatever; these `tests' are commonly and rightly held to be worthless. And don't claim it's a bug because a kernel compile always stops during `make zImage' --- of course it will! `make zImage' is probably compiling over 200 files; we're looking for a slightly smaller place than that.
If you can duplicate the bug, and (better) can produce a short program that exhibits it, you can submit it as a bug report to the FSF, or to the linux-gcc mailing list. See the gcc documentation for details of exactly what information they need.
Portability
It has been said that, these days, if something hasn't been ported to Linux then it is not worth having :-)
Seriously though, in general only minor changes are needed to the sources to get over Linux's 100% POSIX compliance. It is also worthwhile passing back any changes to authors of the code such that in the future only `make' need be called to provide a working executable.
BSDisms (including bsd_ioctl, daemon and <sgtty.h>)
You can compile your program with -I/usr/include/bsd and link it with -lbsd (i.e. add -I/usr/include/bsd to CFLAGS and -lbsd to the LDFLAGS line in your Makefile). There is no need to add -D__USE_BSD_SIGNAL any more if you want BSD type signal behavior, as you get this automatically when you have -I/usr/include/bsd and include <signal.h>.
`Missing' signals (SIGBUS, SIGEMT, SIGIOT, SIGTRAP, SIGSYS etc)
Linux is POSIX compliant. These are not POSIX-defined signals --- ISO/IEC 9945-1:1990 (IEEE Std 1003.1-1990), paragraph B.3.3.1.1 sez:
"The signals SIGBUS, SIGEMT, SIGIOT, SIGTRAP, and SIGSYS were omitted from POSIX.1 because their behavior is implementation dependent and could not be adequately categorized. Conforming implementations may deliver these signals, but must document the circumstances under which they are delivered and note any restrictions concerning their delivery."
The cheap and cheesy way to fix this is to redefine these signals to SIGUNUSED. The correct way is to bracket the code that handles them with appropriate #ifdefs:
#ifdef SIGSYS
/* ... code that handles SIGSYS ... */
#endif
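To make the bracketing concrete, here is a small sketch that installs a handler only for those optional signals the system headers actually define. The helper name install_optional_handlers is hypothetical, and the choice of signal() over sigaction() is only for brevity.

```c
#include <signal.h>

/* Hypothetical helper: install `handler` for whichever of the optional,
   non-POSIX.1-1990 signals this system defines. Returns the number of
   handlers installed, or -1 if signal() reports an error. */
int install_optional_handlers(void (*handler)(int))
{
    int installed = 0;

#ifdef SIGSYS
    if (signal(SIGSYS, handler) == SIG_ERR) return -1;
    installed++;
#endif
#ifdef SIGBUS
    if (signal(SIGBUS, handler) == SIG_ERR) return -1;
    installed++;
#endif
#ifdef SIGEMT   /* skipped silently on systems that do not define it */
    if (signal(SIGEMT, handler) == SIG_ERR) return -1;
    installed++;
#endif
#ifdef SIGTRAP
    if (signal(SIGTRAP, handler) == SIG_ERR) return -1;
    installed++;
#endif
    return installed;
}
```

Each signal that the headers do not define simply drops out at compile time, so the same source builds on systems with different signal sets.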
K&R Code
GCC is an ANSI compiler; much existing code is not ANSI. There's really not much that can be done about this, except to add -traditional to the compiler flags. There is a certain amount of finer-grained control over which varieties of brain damage to emulate; consult the gcc info page.
Note that -traditional has effects beyond just changing the language that gcc accepts. For example, it turns on -fwritable-strings, which moves string constants into data space (from text space, where they cannot be written to). This increases the memory footprint of the program.
Preprocessor symbols conflict with prototypes in the code
One of the most frequent problems is that some common functions are defined as macros in Linux's header files and the preprocessor will refuse to parse similar prototype definitions in the code. Common ones are atoi() and atol().
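One possible workaround, sketched here on the assumption that the ported code has to keep its own declaration, is to remove any macro of that name before the prototype appears. Standard C permits #undef on a library function's macro form; the wrapper parse_count is a made-up name for illustration.

```c
#include <stdlib.h>

/* If the header supplied atoi as a macro, drop the macro so that the
   declaration below is parsed as an ordinary prototype. */
#ifdef atoi
#undef atoi
#endif

int atoi(const char *text);   /* prototype carried over from the ported code */

/* Hypothetical caller that uses the function form of atoi(). */
int parse_count(const char *text)
{
    return atoi(text);
}
```

Deleting the redundant prototype from the ported source is usually simpler still, where that is an option.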
sprintf()
Something to be aware of, especially when porting from SunOS, is that sprintf(string, fmt, ...) returns a pointer to string on many unices, whereas Linux (following ANSI) returns the number of characters which were put into the string.
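A short sketch of the ANSI behaviour: the return value is a character count, so code that used the result as a pointer has to refer to the buffer itself instead. The helper format_items is hypothetical, and it assumes the caller's buffer is large enough for the formatted text.

```c
#include <stdio.h>

/* Hypothetical helper: format "<count> items" into buf and return what
   sprintf() reported, i.e. the number of characters written, not
   counting the terminating '\0'. buf must be big enough for the text. */
int format_items(char *buf, int count)
{
    int written = sprintf(buf, "%d items", count);
    return written;
}
```

With count set to 42 the buffer holds "42 items" and the function returns 8.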
fcntl and friends. Where are the definitions of FD_* stuff?
In <sys/time.h>. If you are using fcntl you probably want to include <unistd.h> too, for the actual prototype.
Generally speaking, the manual page for a function lists the necessary #includes in its SYNOPSIS section.
The select() timeout. Programs start busy-waiting.
The BSD manual page for select(2) used to say "select() should probably return the time remaining from the original timeout, if any, by modifying the time value in place. This may be implemented in future versions of the system. Thus, it is unwise to assume that the timeout pointer will be unmodified by the select() call."
Some versions of Linux do perform this modification. Some don't. It is incredibly unwise to assume one behaviour or the other.
To fix, put the timeout value into that structure every time you call select(). Change code like
struct timeval timeout;
timeout.tv_sec = 1; timeout.tv_usec = 0;
while (some_condition)
  select(n,readfds,writefds,exceptfds,&timeout);
to, say,
struct timeval timeout;
while (some_condition) {
  timeout.tv_sec = 1; timeout.tv_usec = 0;
  select(n,readfds,writefds,exceptfds,&timeout);
}
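For a fuller picture, the sketch below wraps the same idea in a function that can be compiled as it stands. The name wait_readable and the one-second interval are assumptions made for the example; the point is only that both the fd_set and the timeout are rebuilt before every select() call.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait for fd to become readable, trying at most
   max_tries one-second waits. Returns 1 if readable, 0 if every wait
   timed out, -1 if select() failed (including EINTR). */
int wait_readable(int fd, int max_tries)
{
    int i;

    for (i = 0; i < max_tries; i++) {
        fd_set readfds;
        struct timeval timeout;
        int n;

        /* select() may overwrite both of these, so set them each time. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        timeout.tv_sec = 1;
        timeout.tv_usec = 0;

        n = select(fd + 1, &readfds, NULL, NULL, &timeout);
        if (n > 0)
            return 1;
        if (n < 0)
            return -1;
    }
    return 0;
}
```

If the timeout were set once outside the loop, a system that counts it down in place would leave it at zero after the first expiry and every later call would return immediately.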
Some versions of Mosaic were at one time notable for this problem. The speed of the spinning globe animation was inversely related to the speed that the data was coming in from the network at!
Interrupted system calls.
Symptom:
When a program is stopped using Ctrl-Z and then restarted - or in other situations that generate signals: Ctrl-C interruption, termination of a child process etc. - it complains about "interrupted system call" or "write: unknown error" or things like that.
Problem:
POSIX systems check for signals a bit more often than some older unices. Linux may execute signal handlers ---
asynchronously (at a timer tick)
on return from any system call
during the execution of the following system calls: select(), pause(), connect(), accept(), read() on terminals, sockets, pipes or files in /proc, write() on terminals, sockets, pipes or the line printer, open() on FIFOs, PTYs or serial lines, ioctl() on terminals, fcntl() with command F_SETLKW, wait4(), syslog(), any TCP or NFS operations.
For other operating systems you may have to include the system calls creat(), close(), getmsg(), putmsg(), msgrcv(), msgsnd(), recv(), send(), wait(), waitpid(), wait3(), tcdrain(), sigpause(), semop() in this list.
If a signal (that the program has installed a handler for) occurs during a system call, the handler is called. When the handler returns (to the system call) it detects that it was interrupted, and immediately returns with -1 and errno = EINTR. The program is not expecting that to happen, so bottles out.
You may choose between two fixes.
(1) For every signal handler that you install, add SA_RESTART to the sigaction flags. For example, change
signal (sig_nr, my_signal_handler);
to
signal (sig_nr, my_signal_handler);
{ struct sigaction sa;
  sigaction (sig_nr, (struct sigaction *)0, &sa);
#ifdef SA_RESTART
  sa.sa_flags |= SA_RESTART;
#endif
#ifdef SA_INTERRUPT
  sa.sa_flags &= ~ SA_INTERRUPT;
#endif
  sigaction (sig_nr, &sa, (struct sigaction *)0);
}
Note that while this applies to most system calls, you must still check for EINTR yourself on read(), write(), ioctl(), select(), pause() and connect(). See below.
(2) Check for EINTR explicitly, yourself:
Here are two examples, for read() and ioctl().
Original piece of code using read()
int result;
while (len > 0) {
  result = read(fd,buffer,len);
  if (result < 0) break;
  buffer += result; len -= result;
}
becomes
int result;
while (len > 0) {
  result = read(fd,buffer,len);
  if (result < 0) { if (errno != EINTR) break; }
  else { buffer += result; len -= result; }
}
and a piece of code using ioctl()
int result;
result = ioctl(fd,cmd,addr);
becomes
int result;
do { result = ioctl(fd,cmd,addr); }
while ((result == -1) && (errno == EINTR));
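Packaged as a function, the read() retry loop might look like the sketch below. The name read_fully is an assumption for the example. Unlike the fragments above, it also stops when read() returns 0, since at end of file the loop would otherwise never finish.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes from fd, retrying reads that
   were interrupted by a signal. Returns the number of bytes read (less
   than len only at end of file), or -1 on a real error. */
ssize_t read_fully(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: just try again */
            return -1;          /* any other error is reported */
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```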
Note that in some versions of BSD Unix the default behaviour is to restart system calls. To get system calls interrupted you have to use the SV_INTERRUPT or SA_INTERRUPT flag.
Writable strings (program seg faults randomly)
GCC has an optimistic view of its users, believing that they intend string constants to be exactly that --- constant. Thus, it stores them in the text (code) area of the program, where they can be paged in and out from the program's disk image (instead of taking up swapspace), and any attempt to rewrite them will cause a segmentation fault. This is a feature!
It may cause a problem for old programs that, for example, call mktemp() with a string constant as argument. mktemp() attempts to rewrite its argument in place.
To fix, either (a) compile with -fwritable-strings, to get gcc to put constants in data space, or (b) rewrite the offending parts to allocate a non-constant string and strcpy the data into it before calling.
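Option (b) can be sketched as follows. Note the substitution: the example calls mkstemp() instead of mktemp(), because mktemp() is widely regarded as unsafe, but mkstemp() rewrites its argument in place in the same way. The function name open_temp_file and the /tmp/exampleXXXXXX pattern are made up for the example.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: copy the constant pattern into the caller's
   writable buffer, then let mkstemp() rewrite that copy. Returns an open
   file descriptor, or -1 on failure; `name` receives the file name. */
int open_temp_file(char *name, size_t size)
{
    static const char pattern[] = "/tmp/exampleXXXXXX";

    if (size < sizeof pattern)
        return -1;              /* caller's buffer is too small */
    strcpy(name, pattern);      /* writable copy of the constant */
    return mkstemp(name);       /* modifies name, never the constant */
}
```

The caller is expected to close the descriptor and unlink the file when it has finished with them.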
Why does the execl() call fail?
Because you're calling it wrong. The first argument to execl is the program that you want to run. The second and subsequent arguments become the argv array of the program you're calling. Remember: argv[0] is traditionally set even when a program is run with `no' arguments. So, you should be writing
execl("/bin/ls","ls",NULL);
and not just
execl("/bin/ls", NULL);
Executing the program with no arguments at all is construed as an invitation to print out its dynamic library dependencies, at least using a.out. ELF does things differently.
(If you want this library information, there are simpler interfaces; see the section on dynamic loading, or the manual page for ldd).
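Here is a slightly larger sketch of a correct execl() call inside the usual fork-and-wait pattern. The helper run_shell_command is hypothetical and it assumes /bin/sh exists. The cast on the final NULL is there because execl() takes a variable argument list.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `command` through /bin/sh and return its exit
   status, or -1 if it could not be run or did not exit normally. */
int run_shell_command(const char *command)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* First the program to run, then argv[0], argv[1], ... */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);             /* reached only if execl() failed */
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Here "sh" becomes argv[0] of the new program, and "-c" and the command string become argv[1] and argv[2].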
Debugging and Profiling
Preventative maintenance (lint)
There is no widely-used lint for Linux, as most people are satisfied with the warnings that gcc can generate. Probably the most useful is the -Wall switch --- this stands for `Warnings, all' but probably has more mnemonic value if thought of as the thing you bang your head against.
There is a public domain lint available from . I don't know how good it is.
Debugging
How do I get debugging information into a program?
You need to compile and link all its bits with the -g switch, and without the -fomit-frame-pointer switch. Actually, you don't need to recompile all of it, just the bits you're interested in debugging.
On a.out configurations the shared libraries are compiled with -fomit-frame-pointer, which gdb won't get on with. Giving the -g option when you link should imply static linking; this is why.
If the linker fails with a message about not finding libg.a, you don't have /usr/lib/libg.a, which is the special debugging-enabled C library. It may be supplied in the libc binary package, or (in newer C library versions) you may need to get the libc source code and build it yourself. You don't actually need it though; you can get enough information for most purposes simply by symlinking it to /usr/lib/libc.a.
How do I get it out again?
A lot of GNU software comes set up to compile and link with -g, causing it to make very big (and often static) executables. This is not really such a hot idea.
If the program has an autoconf generated configure script, you can usually turn off debugging information by doing ./configure CFLAGS= or ./configure CFLAGS=-O2. Otherwise, check the Makefile. Of course, if you're using ELF, the program is dynamically linked regardless of the -g setting, so you can just strip it.
Available software
Most people use gdb, which you can get in source form from GNU archive sites, or as a binary from tsx-11 or sunsite. xxgdb is an X debugger based on this (i.e. you need gdb installed first). The source may be found at
Also, the UPS debugger has been ported by Rick Sladkey. It runs under X as well, but unlike xxgdb, it is not merely an X front end for a text based debugger. It has quite a number of nice features, and if you spend any time debugging stuff, you probably should check it out. The Linux precompiled version and patches for the stock UPS sources can be found in , and the original source at .
Another tool you might find useful for debugging is `strace', which displays the system calls that a process makes. It has a multiplicity of other uses too, including figuring out what pathnames were compiled into binaries that you don't have the source for, exacerbating race conditions in programs that you suspect contain them, and generally learning how things work. The latest version of strace (currently 3.0.8) can be found at .
Background (daemon) programs
Daemon programs typically execute fork() early, and terminate the parent. This makes for a short debugging session.
The simplest way to get around this is to set a breakpoint for fork, and when the program stops, force it to return 0.
(gdb) list
1 #include <stdio.h>
2
3 main()
4 {
5 if(fork()==0) printf("child\n");
6 else printf("parent\n");
7 }
(gdb) break fork
Breakpoint 1 at 0x80003b8
(gdb) run
Starting program: /home/dan/src/hello/./fork
Breakpoint 1 at 0x400177c4
Breakpoint 1, 0x400177c4 in fork ()
(gdb) return 0
Make selected stack frame return now? (y or n) y
#0 0x80004a8 in main ()
at fork.c:5
5 if(fork()==0) printf("child\n");
(gdb) next
Single stepping until exit from function fork,
which has no line number information.
child
7 }
Core files
When Linux boots it is usually configured not to produce core files. If you like them, use your shell's builtin command to re-enable them: for C-shell compatibles (e.g. tcsh) this is
% limit core unlimited
and for Bourne-like shells it is
$ ulimit -c unlimited
If you want a bit more versatility in your core file naming (for example, if you're trying to conduct a post-mortem using a debugger that's buggy itself) you can make a simple mod to your kernel. Look for the code in fs/binfmt_aout.c and fs/binfmt_elf.c (in newer kernels; you'll have to grep around a little in older ones) that says
memcpy(corefile,"core.",5);
#if 0
memcpy(corefile+5,current->comm,sizeof(current->comm));
#else
corefile[4] = '\0';
#endif
and change the 0s to 1s.
Profiling
Profiling is a way to examine which bits of a program are called most often or run for longest. It is a good way to optimize code and look at where time is being wasted. You must compile all object files that you require timing information for with -p, and to make sense of the output file you will also need gprof (from the binutils package). See the gprof manual page for details.
Linking
Between the two incompatible binary formats, the static vs shared library distinction, and the overloading of the verb `link' to mean both `what happens after compilation' and `what happens when a compiled program is invoked' (and, actually, the overloading of the word `load' in a comparable but opposite sense), this section is complicated. Little of it is much more complicated than that sentence, though, so don't worry too much about it.
To alleviate the confusion somewhat, we refer to what happens at runtime as `dynamic loading' and cover it in the next section. You will also see it described as `dynamic linking', but not here. This section, then, is exclusively concerned with the kind of linking that happens at the end of a compilation.
Shared vs static libraries
The last stage of building a program is to `link' it; to join all the pieces of it together and see what is missing. Obviously there are some things that many programs will want to do --- open files, for example, and the pieces that do these things are provided for you in the form of libraries. On the average Linux system these can be found in /lib and /usr/lib/, among other places.
When using a static library, the linker finds the bits that the program modules need, and physically copies them into the executable output file that it generates. For shared libraries, it doesn't --- instead it leaves a note in the output saying `when this program is run, it will first have to load this library'. Obviously shared libraries tend to make for smaller executables; they also use less memory and mean that less disk space is used. The default behaviour of Linux is to link shared if it can find the shared libraries, static otherwise. If you're getting static binaries when you want shared, check that the shared library files (*.sa for a.out, *.so for ELF) are where they should be, and are readable.
On Linux, static libraries have names like libname.a, while shared libraries are called libname.so.x.y.z where x.y.z is some form of version number. Shared libraries often also have links pointing to them, which are important, and (on a.out configurations) associated .sa files. The standard libraries come in both shared and static formats.
You can find out what shared libraries a program requires by using ldd (List Dynamic Dependencies)
$ ldd /usr/bin/lynx
libncurses.so.1 => /usr/lib/libncurses.so.1.9.6
libc.so.5 => /lib/libc.so.5.2.18
This shows that on my system the WWW browser `lynx' depends on the presence of libc.so.5 (the C library) and libncurses.so.1 (used for terminal control). If a program has no dependencies, ldd will say `statically linked' or `statically linked (ELF)'.
Interrogating libraries (`which library is sin() in?')
nm libraryname should list all the symbols that libraryname has references to. It works on both static and shared libraries. Suppose that you want to know where tcgetattr() is defined: you might do
$ nm libncurses.so.1 |grep tcget
U tcgetattr
The U stands for `undefined' --- it shows that the ncurses library uses but does not define it. You could also do
$ nm libc.so.5 | grep tcget
00010fe8 T __tcgetattr
00010fe8 W tcgetattr
00068718 T tcgetpgrp
The `W' stands for `weak', which means that the symbol is defined, but in such a way that it can be overridden by another definition in a different library. A straightforward `normal' definition (such as the one for tcgetpgrp) is marked by a `T'.
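To see where a `W' entry can come from, the sketch below defines a weak function using GCC's weak attribute, which is a compiler extension and not standard C. The function names are invented for the example. On an ELF system, running nm on the resulting object file should list default_greeting as W and greeting as T.

```c
/* GCC extension: mark this definition weak, so that a normal (strong)
   definition of default_greeting elsewhere in the link replaces it. */
__attribute__((weak)) const char *default_greeting(void)
{
    return "hello from the weak definition";
}

/* An ordinary (strong) definition that calls the weak one. */
const char *greeting(void)
{
    return default_greeting();
}
```

Linked on its own, the weak definition is the one that gets called.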
The short answer to the question in the title, by the way, is libm.(so|a). All the functions defined in <math.h> are kept in the maths library; thus you need to link with -lm when using any of them.
Finding files
ld: Output file requires shared library `libfoo.so.1`
The file search strategy of ld and friends varies according to version, but the only default you can reasonably assume is /usr/lib. If you want libraries elsewhere to be searched, specify their directories with the -L option to gcc or ld.
If that doesn't help, check that you have the right file in that place. For a.out, linking with -lfoo makes ld look for libfoo.sa (shared stubs), and if unsuccessful then for libfoo.a (static). For ELF, it looks for libfoo.so then libfoo.a. libfoo.so is usually a symbolic link to libfoo.so.x.
Building your own libraries
Version control
Like any other program, libraries tend to have bugs which get fixed over time. They also may introduce new features, change the effect of existing ones, or remove old ones. This could be a problem for programs using them; what if a program was depending on that old feature?
So, we introduce library versioning. We categorise the changes that might be made to a library as `minor' or `major', and we rule that a `minor' change is not allowed to break old programs that are using the library. You can tell the version of a library by looking at its filename (actually, this is, strictly speaking, a lie for ELF; keep reading to find out why): libfoo.so.1.2 has major version 1, minor version 2. The minor version number can be more or less anything: libc puts a `patchlevel' in it, giving library names like libc.so.5.2.18, and it's also reasonable to put letters, underscores, or more or less any printable ASCII in it.
One of the major differences between ELF and a.out format is in building shared libraries. We look at ELF first, because it's simpler.
ELF? What is it then, anyway?
ELF (Executable and Linking Format) is a binary format originally developed by USL (UNIX System Laboratories) and currently used in Solaris and System V Release 4. Because of its increased flexibility over the older a.out format that Linux was using, the GCC and C library developers decided last year to move to using ELF as the Linux standard binary format also.
Come again?
This section is from the document '/news-archives/comp.sys.sun.misc'.
"ELF ("Executable Linking Format") is the "new, improved" object file format introduced in SVR4. ELF is much more powerful than straight COFF, in that it *is* user-extensible. ELF views an object-file as an arbitrarily long list of sections (rather than an array of fixed size entities), these sections, unlike in COFF, do not HAVE to be in a certain place and do not HAVE to come in any specific order etc. Users can add new sections to object-files if they wish to capture new data. ELF also has a far more powerful debugging format called DWARF (Debugging With Attribute Record Format) - not currently fully supported on linux (but work is underway). A linked list of DWARF DIEs (or Debugging Information Entries) forms the .debug section in ELF. Instead of being a collection of small, fixed-size information records, DWARF DIEs each contain an arbitrarily long list of complex attributes and are written out as a scope-based tree of program data. DIEs can capture a large amount of information that the COFF .debug section simply couldn't (like C++ inheritance graphs etc.)."
"ELF files are accessed via the SVR4 (Solaris 2.0 ?) ELF access library, which provides an easy and fast interface to the more gory parts of ELF. One of the major boons in using the ELF access library is that you will never need to look at an ELF file qua. UNIX file, it is accessed as an Elf *, after an elf_open() call and from then on, you perform elf_foobar() calls on its components instead of messing about with its actual on-disk image (something many COFFers did with impunity)."
The case for/against ELF, and the necessary contortions to upgrade an a.out system to support it, are covered in the ELF-HOWTO and I don't propose to cut/paste them here. The HOWTO should be available in the same place as you found this one.
ELF shared libraries
To build libfoo.so as a shared library, the basic steps look like this:
$ gcc -fPIC -c *.c
$ gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0 *.o
$ ln -s libfoo.so.1.0 libfoo.so.1
$ ln -s libfoo.so.1 libfoo.so
$ LD_LIBRARY_PATH=`pwd`:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH
This will generate a shared library called libfoo.so.1.0, and the appropriate links for ld (libfoo.so) and the dynamic loader (libfoo.so.1) to find it. To test, we add the current directory to LD_LIBRARY_PATH.
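If you have no library source to hand, a minimal sketch like the following can be saved as foo.c and fed to the steps above. The function foo_add is invented for the example; only the libfoo name comes from this document.

```c
/* foo.c: hypothetical source for the example libfoo. */

/* The library's single exported function. */
int foo_add(int a, int b)
{
    return a + b;
}
```

A test program would declare int foo_add(int, int);, call it, and be linked with something like gcc -o test test.c -L. -lfoo. Running ldd ./test should then list libfoo.so.1 (the soname) rather than libfoo.so.1.0.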
When you're happy that the library works, you'll have to move it to, say, /usr/local/lib, and recreate the appropriate links. The link from libfoo.so.1 to libfoo.so.1.0 is kept up to date by ldconfig, which on most systems is run as part of the boot process. The libfoo.so link must be updated manually. If you are scrupulous about upgrading all the parts of a library (e.g. the header files) at the same time, the simplest thing to do is make libfoo.so -> libfoo.so.1, so that ldconfig will keep both links current for you. If you aren't, you're setting yourself up to have all kinds of weird things happen at a later date. Don't say you weren't warned.
$ su
# cp libfoo.so.1.0 /usr/local/lib
# /sbin/ldconfig
# ( cd /usr/local/lib ; ln -s libfoo.so.1 libfoo.so )
Version numbering, sonames and symlinks
Each library has a soname. When the linker finds one of these in a library it is searching, it embeds the soname into the binary instead of the actual filename it is looking at. At runtime, the dynamic loader will then search for a file with the name of the soname, not the library filename. Thus a library called libfoo.so could have a soname libbar.so, and all programs linked to it would look for libbar.so instead when they started.
This sounds like a pointless feature, but it is key to understanding how multiple versions of the same library can coexist on a system. The de facto naming standard for libraries in Linux is to call the library, say, libfoo.so.1.2, and give it a soname of libfoo.so.1. If it's added to a `standard' library directory (e.g. /usr/lib), ldconfig will create a symlink libfoo.so.1 -> libfoo.so.1.2 so that the appropriate image is found at runtime. You also need a link libfoo.so -> libfoo.so.1 so that ld will find the right soname to use at link time.
So, when you fix bugs in the library, or add new functions (any changes that won't adversely affect existing programs), you rebuild it, keeping the soname as it was, and changing the filename. When you make changes to the library that would break existing binaries, you simply increment the number in the soname; in this case, call the new version libfoo.so.2.0, and give it a soname of libfoo.so.2. Now switch the libfoo.so link to point to the new version and all's well with the world again.
Note that you don't have to name libraries this way, but it's a good convention. ELF gives you the flexibility to name libraries in ways that will confuse the pants off people, but that doesn't mean you have to use it.
Executive summary: supposing that you observe the tradition that major upgrades may break compatibility, minor upgrades may not, then link with
gcc -shared -Wl,-soname,libfoo.so.major -o libfoo.so.major.minor
and everything will be all right.
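The naming rule can also be written down as code. This is only a sketch of the convention, not anything that ld or ldconfig actually runs; conventional_soname and same_soname are hypothetical helper names.

```c
/* soname.c: hypothetical helpers illustrating the naming convention. */
#include <string.h>

/*
 * Derive the conventional soname ("libfoo.so.1") from a versioned
 * file name ("libfoo.so.1.2"). Returns 0 on success, -1 if the name
 * has no ".so.<major>" part or the output buffer is too small.
 */
int conventional_soname(const char *filename, char *out, size_t outlen)
{
    const char *so = strstr(filename, ".so.");
    size_t len;

    if (so == NULL)
        return -1;
    /* Keep everything up to and including ".so." ... */
    len = (size_t)(so - filename) + 4;
    if (filename[len] == '\0' || filename[len] == '.')
        return -1;                      /* no major version present */
    /* ... plus the major number, stopping at the next dot. */
    len += strcspn(filename + len, ".");
    if (len + 1 > outlen)
        return -1;
    memcpy(out, filename, len);
    out[len] = '\0';
    return 0;
}

/* Under the convention, two files with the same soname are meant
   to be usable by the same programs. Returns 1 if they match. */
int same_soname(const char *a, const char *b)
{
    char sa[256], sb[256];

    if (conventional_soname(a, sa, sizeof sa) != 0 ||
        conventional_soname(b, sb, sizeof sb) != 0)
        return 0;
    return strcmp(sa, sb) == 0;
}
```

With these, libfoo.so.1.2 and libfoo.so.1.3 share the soname libfoo.so.1, while libfoo.so.2.0 does not.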
a.out. Ye olde traditional format
The ease of building shared libraries is a major reason for upgrading to ELF. That said, it's still possible in a.out. Get and read the 20 page document that you will find after unpacking it. I hate to be so transparently partisan, but it should be clear from context that I never bothered myself :-)
ZMAGIC vs QMAGIC
QMAGIC is an executable format just like the old a.out (also known as ZMAGIC) binaries, but which leaves the first page unmapped. This allows for easier NULL dereference trapping as no mapping exists in the range 0-4096. As a side effect your binaries are nominally smaller as well (by about 1K).
Obsolescent linkers support ZMAGIC only, semi-obsolescent ones support both formats, and current versions support QMAGIC only. This doesn't actually matter, though, as the kernel can still run both formats.
Your `file' command should be able to identify whether a program is QMAGIC.
File Placement
An a.out (DLL) shared library consists of two real files and a symlink. For the `foo' library used throughout this document as an example, these files would be libfoo.sa and libfoo.so.1.2; the symlink would be libfoo.so.1 and would point at the latter of the files. What are these for?
At compile time, ld looks for libfoo.sa. This is the `stub' file for the library, and contains all exported data and pointers to the functions required for run time linking.
At run time, the dynamic loader looks for libfoo.so.1. This is a symlink rather than a real file so that libraries can be updated with newer, bugfixed versions without crashing any application that was using the library at the time. After the new version, say libfoo.so.1.3, is completely there, running ldconfig will switch the link to point to it in one atomic operation, leaving any program which had the old version still perfectly happy.
DLL libraries (I know that's a tautology; so sue me) often appear bigger than their static counterparts. They reserve space for future expansion in the form of `holes' which can be made to take no disk space. A simple cp call or using the program makehole will achieve this. You can also strip them after building, as the addresses are in fixed locations. Do not attempt to strip ELF libraries.
``libc-lite''?
A libc-lite is a light-weight version of the libc library built such that it will fit on a floppy and suffice for all of the most menial of UNIX tasks. It does not include curses, dbm, termcap etc code. If your /lib/libc.so.4 is linked to a lite lib, you are advised to replace it with a full version.
Linking: common problems
Send me your linking problems! I probably won't do anything about them, but I will write them up if I get enough ...
- Programs link static when you wanted them shared
Check that you have the right links for ld to find each shared library. For ELF this means a libfoo.so symlink to the image, for a.out a libfoo.sa file. A lot of people had this problem after moving from ELF binutils 2.5 to 2.6; the earlier version searched more `intelligently' for shared libraries, so they hadn't created all the links. The intelligent behaviour was removed for compatibility with other architectures, and because quite often it got its assumptions wrong and caused more trouble than it solved.
- The DLL tool `mkimage' fails to find libgcc
As of libc.so.4.5.x and above, libgcc is no longer shared. Hence you must replace occurrences of `-lgcc' on the offending line with `gcc -print-libgcc-file-name` (complete with the backquotes).
Also, delete all /usr/lib/libgcc* files. This is important.
- __NEEDS_SHRLIB_libc_4 multiply defined messages
These are another consequence of the same problem.
- ``Assertion failure'' message when rebuilding a DLL
This cryptic message most probably means that one of your jump table slots has overflowed because too little space has been reserved in the original jump.vars file. You can locate the culprit(s) by running the `getsize' command provided in the tools-2.17.tar.gz package. Probably the only solution, though, is to bump the major version number of the library, forcing it to be backward incompatible.
- ld: output file needs shared library libc.so.4
This usually happens when you are linking with libraries other than libc (e.g. X libraries), and use the -g switch on the link line without also using -static.
The .sa stubs for the shared libraries usually have an undefined symbol _NEEDS_SHRLIB_libc_4 which gets resolved from the libc.sa stub. However with -g you end up linking with libg.a or libc.a and thus this symbol never gets resolved, leading to the above error message.
In conclusion, add -static when compiling with the -g flag, or don't link with -g. Quite often you can get enough debugging information by compiling the individual files with -g, and linking without it.
Dynamic Loading
This section is a tad short right now; it will be expanded over time as I gut the ELF howto
Concepts
Linux has shared libraries, as you will by now be sick of hearing if you read the whole of the last section at a sitting. Some of the matching-names-to-places work which was traditionally done at link time must be deferred to load time.
Error messages
Send me your link errors! I won't do anything about them, but I might write them up ...
- can't load library: /lib/libxxx.so, Incompatible version
(a.out only) This means that you don't have the correct major version of the xxx library. No, you can't just make a symlink to another version that you do have; if you are lucky this will cause your program to segfault. Get the new version. A similar situation with ELF will result in a message like
ftp: can't load library 'libreadline.so.2'
- warning using incompatible library version xxx
(a.out only) You have an older minor version of the library than the person who compiled the program used. The program will still run. Probably. An upgrade wouldn't hurt, though.
Controlling the operation of the dynamic loader
There are a range of environment variables that the dynamic loader will respond to. Most of these are more use to ldd than they are to the average user, and can most conveniently be set by running ldd with various switches. They include
LD_BIND_NOW: normally, functions are not `looked up' in libraries until they are called. Setting this flag causes all the lookups to happen when the library is loaded, giving a slower startup time. It's useful when you want to test a program to make sure that everything is linked.
LD_PRELOAD can be set to a file containing `overriding' function definitions. For example, if you were testing memory allocation strategies, and wanted to replace `malloc', you could write your replacement routine, compile it into malloc.o and then
$ LD_PRELOAD=malloc.o; export LD_PRELOAD
$ some_test_program
LD_ELF_PRELOAD and LD_AOUT_PRELOAD are similar, but only apply to the appropriate type of binary. If LD_something_PRELOAD and LD_PRELOAD are both set, the more specific one is used.
LD_LIBRARY_PATH is a colon-separated list of directories in which to look for shared libraries. It does not affect ld; it only has effect at runtime. Also, it is disabled for programs that run setuid or setgid. Again, LD_ELF_LIBRARY_PATH and LD_AOUT_LIBRARY_PATH can also be used to direct the search differently for different flavours of binary. LD_LIBRARY_PATH shouldn't be necessary in normal operation; add the directories to /etc/ld.so.conf and rerun ldconfig instead.
LD_NOWARN applies to a.out only. When set (e.g. with LD_NOWARN=true; export LD_NOWARN) it stops the loader from issuing non-fatal warnings (such as minor version incompatibility messages).
LD_WARN applies to ELF only. When set, it turns the usually fatal ``Can't find library'' messages into warnings. It's not much use in normal operation, but important for ldd.
LD_TRACE_LOADED_OBJECTS applies to ELF only, and causes programs to think they're being run under ldd:
$ LD_TRACE_LOADED_OBJECTS=true /usr/bin/lynx
libncurses.so.1 => /usr/lib/libncurses.so.1.9.6
libc.so.5 => /lib/libc.so.5.2.18
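The LD_PRELOAD mechanism described above is easier to try with something smaller than malloc. The sketch below overrides time() so that a program sees a fixed clock; the file name fakeclock.c and the constant are arbitrary choices for the example. On current ELF systems the preloaded file is normally a shared object rather than a plain .o, built with something like gcc -shared -fPIC -o fakeclock.so fakeclock.c.

```c
/* fakeclock.c: hypothetical LD_PRELOAD override of time(). */
#include <time.h>

/* Arbitrary fixed timestamp: 2000-01-01 00:00:00 UTC. */
#define FAKE_NOW ((time_t)946684800)

/* Same signature as the C library's time(); because the preloaded
   object is searched first, callers get this definition instead. */
time_t time(time_t *t)
{
    if (t != NULL)
        *t = FAKE_NOW;
    return FAKE_NOW;
}
```

A dynamically linked program that calls time() and is started with LD_PRELOAD=./fakeclock.so set should then see the fixed value. Programs that read the clock through other calls, such as gettimeofday(), are not affected by this sketch.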
Writing programs with dynamic loading
This is very close to the way that Solaris 2.x dynamic loading support works, if you're familiar with that. It is covered extensively in H J Lu's ELF programming document, and the dlopen(3) manual page, which can be found in the ld.so package. Here's a nice simple example though: link it with -ldl
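#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    void *libc;
    int (*printf_call)(const char *, ...);

    /* Path from a libc 5 system; adjust it for yours. */
    libc = dlopen("/lib/libc.so.5", RTLD_LAZY);
    if (libc == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    /* Look up printf by name and call it through the pointer. */
    printf_call = (int (*)(const char *, ...)) dlsym(libc, "printf");
    if (printf_call != NULL)
        (*printf_call)("hello, world\n");
    dlclose(libc);
    return 0;
}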
Contacting the developers
Bug reports
Start by narrowing the problem down. Is it specific to Linux, or does it happen with gcc on other systems? Is it specific to the kernel version? Library version? Does it go away if you link static? Can you trim the program down to something short that demonstrates the bug?
Having done that, you'll know what program(s) the bug is in. For GCC, the bug reporting procedure is explained in the info file. For ld.so or the C or maths libraries, send mail to linux-gcc@vger.rutgers.edu. If possible, include a short and self-contained program that exhibits the bug, and a description both of what you want it to do, and what it actually does.
Helping with development
If you want to help with the development effort for GCC or the C library, the first thing to do is join the linux-gcc@vger.rutgers.edu mailing list. If you just want to see what the discussion is about, there are list archives. The second and subsequent things depend on what you want to do!
The Remains
The Credits
"Only presidents, editors, and people with tapeworms have the right to use the editorial ``we''." (Mark Twain)
This HOWTO is based very closely on Mitchum DSouza's GCC-FAQ; most of the information (not to mention a reasonable amount of the text) in it comes directly from that document. Instances of the first person pronoun in this HOWTO could refer to either of us; generally the ones that say ``I have not tested this; don't blame me if it toasts your hard disk/system/spouse'' apply to both of us.
Contributors to this document have included (in ASCII ordering by first name) Andrew Tefft, Axel Boldt, Bill Metzenthen, Bruce Evans, Bruno Haible, Daniel Barlow, Daniel Quinlan, David Engel, Dirk Hohndel, Eric Youngdale, Fergus Henderson, H.J. Lu, Jens Schweikhardt, Kai Petzke, Michael Meissner, Mitchum DSouza, Olaf Flebbe, Paul Gortmaker, Rik Faith, Steven S. Dick, Tuomas J Lukka, and of course Linus Torvalds, without whom the whole exercise would have been pointless, let alone impossible :-)
Please do not feel offended if your name has not appeared here and you have contributed to this document (either as HOWTO or as FAQ). Email me and I will rectify it.
Translations
French, Eric Dumas
http://www.freenix.fr/unix/linux/HOWTO/GCC-HOWTO.html
Italian, Andrea Girotto
http://www.pluto.linux.it/ildp/HOWTO/GCC-HOWTO.html
Japanese,
Feedback
is welcomed. Mail me at daniel.barlow@linux.org. My PGP public key (ID 5F263625) is available from my web pages, if you feel the need to be secretive about things.
Legalese
All trademarks used in this document are acknowledged as being owned by their respective owners.
This document is copyright (C) 1996,1999 Daniel Barlow <dan@detached.demon.co.uk>. It may be reproduced and distributed in whole or in part, in any medium physical or electronic, as long as this copyright notice is retained on all copies. Commercial redistribution is allowed and encouraged; however, the author would like to be notified of any such distributions.
All translations, derivative works, or aggregate works incorporating any Linux HOWTO documents must be covered under this copyright notice. That is, you may not produce a derivative work from a HOWTO and impose additional restrictions on its distribution. Exceptions to these rules may be granted under certain conditions; please contact the Linux HOWTO coordinator at the address given below.
In short, we wish to promote dissemination of this information through as many channels as possible. However, we do wish to retain copyright on the HOWTO documents, and would like to be notified of any plans to redistribute the HOWTOs.
If you have questions, please contact Tim Bynum, the Linux HOWTO coordinator, at linux-howto@sunsite.unc.edu via email.