UNIX FOR BEGINNERS SECOND EDITION



                     Brian W. Kernighan





                     Bell Laboratories



               Murray Hill, New Jersey 07974







                          ABSTRACT

       This paper is meant to help new users get started on the
     UNIX* operating system.  It includes:

      o basics needed for day-to-day use of the system - typing
        commands, correcting typing mistakes, logging in and out,
        mail, inter-terminal communication, the file system,
        printing files, redirecting I/O, pipes, and the shell.

      o document preparation - a brief discussion of the major
        formatting programs and macro packages, hints on
        preparing documents, and capsule descriptions of some
        supporting software.

      o UNIX programming - using the editor, programming the
        shell, programming in C, other languages and tools.

      o An annotated UNIX bibliography.







INTRODUCTION





  From the user's point of view, the UNIX operating system is easy
to learn and use, and presents few of the usual impediments to
getting the job done.  It is hard, however, for the beginner to
know where to start, and how to make the best use of the
facilities available.  The purpose of this introduction is to help
new users get used to the main ideas of the UNIX system and start
making effective use of it quickly.

__________________________
* UNIX is a Trademark of Bell Laboratories.

                     November 16, 1985





  You should have a couple of other documents with you for easy
reference as you read this one.  The most important is The UNIX
Programmer's Manual; it's often easier to tell you to read about
something in the manual than to repeat its contents here.  The
other useful document is A Tutorial Introduction to the UNIX Text
Editor, which will tell you how to use the editor to get text -
programs, data, documents - into the computer.





  A word of warning: the UNIX system has become quite  popu-



lar, and there are several major variants in widespread use.



Of course details also change with time.   So  although  the



basic  structure  of UNIX and how to use it is common to all



versions, there will certainly be a  few  things  which  are



different  on  your  system from what is described here.  We



have tried to minimize the problem, but be aware of it.   In



cases of doubt, this paper describes Version 7 UNIX.





  This paper has five sections:





  1. Getting Started: How to log in, how to type, what to do about
  mistakes in typing, how to log out.  Some of this is dependent on
  which system you log into (phone numbers, for example) and what
  terminal you use, so this section must necessarily be
  supplemented by local information.

  2. Day-to-day Use: Things you need every day to use the system
  effectively: generally useful commands; the file system.

  3. Document Preparation: Preparing manuscripts is one of the most
  common uses for UNIX systems.  This section contains advice, but
  not extensive instructions on any of the formatting tools.

  4. Writing Programs: UNIX is an excellent system for developing
  programs.  This section talks about some of the tools, but again
  is not a tutorial in any of the programming languages provided by
  the system.

  5. A UNIX Reading List.  An annotated bibliography of documents
  that new users should be aware of.





I.  GETTING STARTED





Logging In





  You must have a UNIX login name, which you  can  get  from



whoever  administers your system.  You also need to know the



phone number, unless your system uses permanently  connected



terminals.   The  UNIX  system  is capable of dealing with a



wide variety of terminals: Terminet 300's; Execuport, TI and



similar  portables;  video  (CRT) terminals like the HP2640,



etc.; high-priced  graphics  terminals  like  the  Tektronix










4014;  plotting  terminals like those from GSI and DASI; and



even the venerable Teletype in its various forms.  But note:



UNIX is strongly oriented towards devices with lower case.



If your terminal produces only upper case  (e.g.,  model  33



Teletype,  some  video and portable terminals), life will be



so difficult that you should look for another terminal.





  Be sure to set the switches appropriately on your  device.



Switches  that  might need to be adjusted include the speed,



upper/lower case mode, full duplex,  even  parity,  and  any



others  that  local  wisdom advises.  Establish a connection



using whatever magic is needed for your terminal;  this  may



involve  dialing  a  telephone  call  or  merely  flipping a



switch.  In either case, UNIX should type ``login:'' at you.



If  it  types  garbage, you may be at the wrong speed; check



the switches.  If that fails, push the ``break'' or ``inter-



rupt''  key a few times, slowly.  If that fails to produce a



login message, consult a guru.





  When you get a login: message, type your login name in



lower case.  Follow it by a RETURN; the system will not do



anything  until  you  type  a  RETURN.   If  a  password  is



required, you will be asked for it, and (if possible) print-



ing will be turned off while  you  type  it.   Don't  forget



RETURN.





  The culmination of your login efforts is a ``prompt  char-



acter,''  a  single character that indicates that the system



is ready to accept commands from you.  The prompt  character










is usually a dollar sign $ or a percent sign %.  (You may



also get a message of the day just before the prompt charac-



ter, or a notification that you have mail.)





Typing Commands





  Once you've seen the prompt character, you can  type  com-



mands, which are requests that the system do something.  Try



typing





  date



followed by RETURN.  You should get back something like





  Mon Jan 16 14:17:10 EST 1978



Don't forget the RETURN after the command, or  nothing  will



happen.   If  you think you're being ignored, type a RETURN;



something should happen.  RETURN won't be  mentioned  again,



but  don't forget it - it has to be there at the end of each



line.





  Another command you might try is who, which tells you



everyone who is currently logged in:





  who



gives something like





  mb     tty01   Jan 16    09:11
  ski    tty05   Jan 16    09:33
  gam    tty11   Jan 16    13:07



The time is when  the  user  logged  in;  ``ttyxx''  is  the



system's idea of what terminal the user is on.














  If you make a mistake typing the command name,  and  refer



to  a  non-existent command, you will be told.  For example,



if you type





  whom



you will be told





  whom: not found



Of course, if you inadvertently type the name of some  other



command, it will run, with more or less mysterious results.





Strange Terminal Behavior





  Sometimes you can get into a  state  where  your  terminal



acts  strangely.   For  example,  each  letter  may be typed



twice, or the RETURN may not cause a line feed or  a  return



to  the  left margin.  You can often fix this by logging out



and logging back in.  Or you can read the description of the



command stty in section I of the manual.  To get intelligent



treatment of tab characters (which are much used in UNIX) if



your terminal doesn't have tabs, type the command





  stty -tabs



and the system will convert each tab into the  right  number



of blanks for you.  If your terminal does have computer-settable



tabs, the command tabs will set the stops correctly



for you.





Mistakes in Typing





  If you make a typing mistake, and see it before RETURN has been
typed, there are two ways to recover.  The sharp-character #
erases the last character typed; in fact successive uses of #
erase characters back to the beginning of the line (but not
beyond).  So if you type badly, you can correct as you go:





  dd#atte##e



is the same as date.
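
The erase and kill rules can be sketched as a small shell
function.  This is only an illustration, written for a modern
POSIX shell rather than the Version 7 system this paper describes;
the name erase_kill is made up here, the real work is done by the
system's terminal handling, and the backslash escape is left out.

```shell
# erase_kill: hypothetical helper that applies '#' (erase one
# character) and '@' (kill the whole line) to a typed string.
erase_kill() {
    typed=$1
    line=
    while [ -n "$typed" ]; do
        rest=${typed#?}          # everything after the first character
        c=${typed%"$rest"}       # the first character itself
        typed=$rest
        case $c in
            '#') line=${line%?} ;;   # drop the last character, if any
            '@') line= ;;            # throw away the line so far
            *)   line=$line$c ;;     # ordinary character: keep it
        esac
    done
    printf '%s\n' "$line"
}

erase_kill 'dd#atte##e'     # prints: date
erase_kill 'whom@who'       # prints: who
```

Extra sharps at the start of a line do nothing, which matches the
``but not beyond'' rule.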





  The at-sign @ erases all of the characters typed so far on the
current input line, so if the line is irretrievably fouled up,
type an @ and start the line over.





  What if you must enter a sharp or at-sign as part of the text?
If you precede either # or @ by a backslash \, it loses its erase
meaning.  So to enter a sharp or at-sign in something, type \# or
\@.  The system will always echo a newline at you after your
at-sign, even if preceded by a backslash.  Don't worry - the
at-sign has been recorded.





  To erase a backslash, you have to type two sharps  or  two



at-signs, as in \##.  The backslash is used extensively in



UNIX to indicate that the following character is in some way



special.





Read-ahead





  UNIX has full read-ahead, which means that you can type as



fast  as you want, whenever you want, even when some command



is typing at you.  If you type  during  output,  your  input












characters  will  appear  intermixed with the output charac-



ters, but they will be stored away and  interpreted  in  the



correct  order.   So you can type several commands one after



another without waiting for the  first  to  finish  or  even



begin.





Stopping a Program





  You can stop most programs by typing the character ``DEL''



(perhaps  called ``delete'' or ``rubout'' on your terminal).



The ``interrupt'' or ``break'' key found on  most  terminals



can  also be used.  In a few programs, like the text editor,



DEL stops whatever the program is doing but  leaves  you  in



that program.  Hanging up the phone will stop most programs.





Logging Out





  The easiest way to log out is to hang up the  phone.   You



can also type





  login



and let someone else use the terminal you were  on.   It  is



usually  not sufficient just to turn off the terminal.  Most



UNIX systems do not use a time-out mechanism, so  you'll  be



there forever unless you hang up.





Mail





  When you log in, you may sometimes get the message





  You have mail.












UNIX provides a postal system so you  can  communicate  with



other users of the system.  To read your mail, type the com-



mand





  mail



Your mail will be printed, one message at a time, most recent
message first.  After each message, mail waits for you to say
what to do with it.  The two basic responses are d, which deletes
the message, and RETURN, which does not (so it will still be there
the next time you read your mailbox).  Other responses are
described in the manual.  (Earlier versions of mail do not process
one message at a time, but are otherwise similar.)





  How do you send mail to someone else?  Suppose it is to go



to  ``joe'' (assuming ``joe'' is someone's login name).  The



easiest way is this:





  mail joe
  now type in the text of the letter
  on as many lines as you like ...
  After the last line of the letter
  type the character ``control-d'',
  that is, hold down ``control'' and type
  a letter ``d''.



And that's it.  The  ``control-d''  sequence,  often  called



``EOF''  for  end-of-file,  is used throughout the system to



mark the end of input from a terminal, so you might as  well



get used to it.





  For practice, send  mail  to  yourself.   (This  isn't  as



strange  as it might sound - mail to oneself is a handy rem-












inder mechanism.)





  There are other ways to send mail - you can send a previously
prepared letter, and you can mail to a number of people all at
once.  For more details see mail(1).  (The notation mail(1) means
the command mail in section 1 of the UNIX Programmer's Manual.)





Writing to other users





  At some point, out of the blue will come a message like





  Message from joe tty07...



accompanied by a startling beep.  It means that Joe wants to



talk  to  you, but unless you take explicit action you won't



be able to talk back.  To respond, type the command





  write joe



This establishes a two-way communication path.  Now whatever



Joe  types  on  his  terminal  will appear on yours and vice



versa.  The path is slow, rather like talking to  the  moon.



(If you are in the middle of something, you have to get to a



state where you can type a command.  Normally, whatever pro-



gram  you are running has to terminate or be terminated.  If



you're editing, you can escape temporarily from the editor -



read the editor tutorial.)





  A protocol is needed to keep what you  type  from  getting



garbled up with what Joe types. Typically it's like this:


















  Joe types write smith and waits.

  Smith types write joe and waits.

  Joe now types his message (as many lines as he likes).  When
  he's ready for a reply, he signals it by typing (o), which
  stands for ``over''.

  Now Smith types a reply, also terminated by (o).

  This cycle repeats until someone gets tired; he then signals
  his intent to quit with (oo), for ``over and out''.

  To terminate the conversation, each side must type a
  ``control-d'' character alone on a line.  (``Delete'' also
  works.)  When the other person types his ``control-d'', you
  will get the message EOF on your terminal.







  If you write to  someone  who  isn't  logged  in,  or  who



doesn't want to be disturbed, you'll be told.  If the target



is logged in but doesn't answer  after  a  decent  interval,



simply type ``control-d''.





On-line Manual





  The UNIX Programmer's Manual is typically kept on-line.  If you
get stuck on something, and can't find an expert to assist you,
you can print on your terminal some manual section that might
help.  This is also useful for getting the most up-to-date
information on a command.  To print a manual section, type ``man
command-name''.  Thus to read up on the who command, type

  man who

and, of course,

  man man

tells all about the man command.














Computer Aided Instruction





  Your UNIX system may have available a program called learn,
which provides computer aided instruction on the file system and
basic commands, the editor, document preparation, and even C
programming.  Try typing the command

  learn

If learn exists on your system, it will tell you what to do from
there.





II.  DAY-TO-DAY USE





Creating Files - The Editor





  If you have to type a paper or a letter or a program, how do you
get the information stored in the machine?  Most of these tasks
are done with the UNIX ``text editor'' ed.  Since ed is thoroughly
documented in ed(1) and explained in A Tutorial Introduction to
the UNIX Text Editor, we won't spend any time here describing how
to use it.  All we want it for right now is to make some files.
(A file is just a collection of information stored in the machine,
a simplistic but adequate definition.)





  To create a file called junk with some text in it, do the
following:

  ed junk    (invokes the text editor)
  a          (command to ``ed'', to add text)
  now type in
  whatever text you want ...
  .          (signals the end of adding text)










The ``.'' that signals the end of adding text must be at the
beginning of a line by itself.  Don't forget it, for until it is
typed, no other ed commands will be recognized - everything you
type will be treated as text to be added.





  At this point you can do various editing operations on the



text  you  typed  in,  such as correcting spelling mistakes,



rearranging paragraphs and  the  like.   Finally,  you  must



write  the  information  you have typed into a file with the



editor command w:

  w

ed will respond with the number of characters it wrote into
the file junk.





  Until the w command, nothing is stored permanently, so if you
hang up and go home the information is lost.|-  But after w the
information is there permanently; you can re-access it any time by
typing

  ed junk

Type a q command to quit the editor.  (If you try to quit without
writing, ed will print a ? to remind you.  A second q gets you out
regardless.)





  Now create a second file called temp in the same manner.  You
should now have two files, junk and temp.
__________________________
|- This is not strictly true - if you hang up while editing, the
data you were working on is saved in a file called ed.hup, which
you can continue with at your next session.












What files are out there?





  The ls (for ``list'') command lists the names (not contents) of
any of the files that UNIX knows about.  If you type

  ls

the response will be

  junk
  temp



which are indeed the two files just created.  The names are sorted
into alphabetical order automatically, but other variations are
possible.  For example, the command

  ls -t

causes the files to be listed in the order in which they were last
changed, most recent first.  The -l option gives a ``long''
listing:

  ls -l

will produce something like

  -rw-rw-rw- 1 bwk 41 Jul 22 2:56 junk
  -rw-rw-rw- 1 bwk 78 Jul 22 2:57 temp



The date and time are of the last change to the file.  The 41 and
78 are the number of characters (which should agree with the
numbers you got from ed).  bwk is the owner of the file, that is,
the person who created it.  The -rw-rw-rw- tells who has
permission to read and write the file, in this case everyone.












  Options can be combined: ls -lt gives the same thing as ls -l,
but sorted into time order.  You can also name the files you're
interested in, and ls will list the information about them only.
More details can be found in ls(1).





  The use of optional arguments that begin with a minus sign, like
-t and -lt, is a common convention for UNIX programs.  In general,
if a program accepts such optional arguments, they precede any
filename arguments.  It is also vital that you separate the
various arguments with spaces: ls-l is not the same as ls -l.
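
The ls variations above can be tried safely in a scratch
directory.  What follows is a sketch for a present-day POSIX
system, not Version 7: mktemp and touch -t are assumptions about
your system, and the file contents and dates are invented so that
the time ordering is predictable.

```shell
# Work in a scratch directory so none of your own files are touched.
cd "$(mktemp -d)" || exit 1

# Two small files with explicit modification times:
# junk is the older one, temp the newer one.
echo 'some text' > junk
echo 'some more text' > temp
touch -t 202001010900 junk
touch -t 202001020900 temp

by_name=$(ls)        # junk, then temp (alphabetical order)
by_time=$(ls -t)     # temp, then junk (most recent first)

printf '%s\n' "$by_name"
printf '%s\n' "$by_time"
ls -l junk temp      # long listing of just the files named
ls -lt               # options combined: long listing in time order
```

The exact layout of the long listing differs from system to
system, so do not expect it to match the sample shown earlier
column for column.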





Printing Files





  Now that you've got a file of text, how do you print it so



people can look at it?  There are a host of programs that do



that, probably more than are needed.





  One simple thing is to use the editor, since  printing  is



often done just before making changes anyway.  You can say





  ed junk
  1,$p

ed will reply with the count of the characters in junk and



then  print  all the lines in the file.  After you learn how



to use the editor, you can be selective about the parts  you



print.





  There are times when it's not feasible to use  the  editor



for  printing.   For  example, there is a limit on how big a



file ed can handle (several thousand lines).  Secondly, it










will  only  print one file at a time, and sometimes you want



to print several, one after another.  So here are  a  couple



of alternatives.





  First is cat, the simplest of all the printing programs.  cat
simply prints on the terminal the contents of all the files named
in a list.  Thus

  cat junk

prints one file, and

  cat junk temp

prints two.  The files are simply concatenated (hence the name
``cat'') onto the terminal.
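
A short session makes the concatenation visible.  This is a
sketch for a modern POSIX shell; the scratch directory made with
mktemp and the one-line file contents are invented for the
example.

```shell
cd "$(mktemp -d)" || exit 1
echo 'first file' > junk
echo 'second file' > temp

one=$(cat junk)          # contents of junk alone
both=$(cat junk temp)    # junk followed directly by temp

printf '%s\n' "$one"     # prints: first file
printf '%s\n' "$both"    # prints: first file, then second file
```

Nothing marks where one file ends and the next begins, which is
exactly why pr, described next, is often the better choice for
printing several files.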





  pr produces formatted printouts of files.  As with cat, pr
prints all the files named in a list.  The difference is that it
produces headings with date, time, page number and file name at
the top of each page, and extra lines to skip over the fold in the
paper.  Thus,

  pr junk temp

will print junk neatly, then skip to the top of a new page and
print temp neatly.





  pr can also produce multi-column output:

  pr -3 junk

prints junk in 3-column format.  You can use any reasonable number
in place of ``3'' and pr will do its best.  pr has other
capabilities as well; see pr(1).










  It should be noted that pr is not a formatting program in the
sense of shuffling lines around and justifying margins.  The true
formatters are nroff and troff, which we will get to in the
section on document preparation.





  There are also programs that print files on  a  high-speed



printer.  Look in your manual under opr and lpr.  Which to



use depends on what equipment is attached to your machine.





Shuffling Files About





  Now that you have some files in the file system  and  some



experience in printing them, you can try bigger things.  For



example, you can move a  file  from  one  place  to  another



(which amounts to giving it a new name), like this:





  mv junk precious

This means that what used to be ``junk'' is now ``precious''.  If
you do an ls command now, you will get

  precious
  temp



Beware that if you move a file to another one  that  already



exists, the already existing contents are lost forever.





  If you want to make a copy of a file (that is, to have two
versions of something), you can use the cp command:

  cp precious temp1

makes a duplicate copy of precious in temp1.





  Finally, when you get tired of creating and moving files, there
is a command to remove files from the file system, called rm.

  rm temp temp1

will remove both of the files named.
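
The whole mv, cp, rm sequence can be replayed in a scratch
directory.  This is a sketch for a modern POSIX shell; mktemp and
the file contents are assumptions made for the example.

```shell
cd "$(mktemp -d)" || exit 1
echo 'some text' > junk
echo 'other text' > temp

mv junk precious          # junk is now called precious
cp precious temp1         # temp1 is a second copy of precious
after_copy=$(ls)          # precious, temp, temp1
rm temp temp1             # both files are gone, no questions asked
after_rm=$(ls)            # precious

printf '%s\n' "$after_copy"
printf '%s\n' "$after_rm"
```

Capturing the ls output before and after each step is a cheap way
to confirm that a command did what you meant before going on.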





  You will get a warning message if one of the  named  files



wasn't there, but otherwise rm, like most UNIX commands,



does its work silently.  There is no prompting  or  chatter,



and error messages are occasionally curt.  This terseness is



sometimes disconcerting to newcomers, but experienced  users



find it desirable.





What's in a Filename





  So far we have used filenames without ever saying what's a



legal  name,  so  it's  time  for a couple of rules.  First,



filenames are limited to 14 characters, which is  enough  to



be  descriptive.   Second,  although  you can use almost any



character in a filename, common sense says you should  stick



to ones that are visible, and that you should probably avoid



characters that might be used with other meanings.  We  have



already seen, for example, that in the ls command, ls -t



means to list in time order.  So if you had a file whose



name was -t, you would have a tough time listing it by name.



Besides the minus sign, there  are  other  characters  which



have  special meaning.  To avoid pitfalls, you would do well



to use only letters, numbers and  the  period  until  you're



familiar with the situation.
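
If a file named -t does turn up anyway, one common way out is to
write its name as ./-t so that it no longer begins with a minus
sign.  This trick is not part of this paper; it is offered as an
assumption about present-day systems, where ./ refers to the
directory you are currently in.

```shell
cd "$(mktemp -d)" || exit 1
echo 'awkward' > ./-t      # create a file whose name is -t

listing=$(ls)              # plain ls still shows the name: -t
named=$(ls ./-t)           # ./-t is taken as a filename, not an option
printf '%s\n' "$listing" "$named"

rm ./-t                    # the same spelling works for rm
```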












  On to some more positive suggestions.  Suppose you're typ-



ing  a  large  document like a book.  Logically this divides



into many small pieces, like chapters and perhaps  sections.



Physically it must be divided too, for ed will not handle



really big files.  Thus you should type the  document  as  a



number  of  files.   You might have a separate file for each



chapter, called





  chap1

  chap2

  etc...



Or, if each chapter were  broken  into  several  files,  you



might have





  chap1.1
  chap1.2
  chap1.3
  ...
  chap2.1
  chap2.2
  ...



You can now tell at a glance where a  particular  file  fits



into the whole.





  There are advantages to  a  systematic  naming  convention



which  are not obvious to the novice UNIX user.  What if you



wanted to print the whole book?  You could say





  pr chap1.1 chap1.2 chap1.3 ......



but you would get tired pretty fast, and would probably even



make  mistakes.   Fortunately, there is a shortcut.  You can



say





  pr chap*










The * means ``anything at all,'' so this translates into ``print
all files whose names begin with chap'', listed in alphabetical
order.





  This shorthand notation is not a property of the pr command, by
the way.  It is system-wide, a service of the program that
interprets commands (the ``shell,'' sh(1)).  Using that fact, you
can see how to list the names of the files in the book:





  ls chap*

produces

  chap1.1
  chap1.2
  chap1.3
  ...



The * is not limited to the last position in a filename - it can
be anywhere and can occur several times.  Thus

  rm *junk* *temp*

removes all files that contain junk or temp as any part of their
name.  As a special case, * by itself matches every filename, so

  pr *

prints all your files (alphabetical order), and

  rm *

removes all files.  (You had better be very sure that's what you
wanted to say!)
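
Because the shell does the expansion before the command ever runs,
you can preview what a pattern will match by handing it to echo
first.  The sketch below assumes a modern POSIX shell and a
scratch directory made with mktemp; the file names are invented.

```shell
cd "$(mktemp -d)" || exit 1
touch chap1.1 chap1.2 chap2.1 junk oldjunk.bak temp1

chapters=$(echo chap*)         # chap1.1 chap1.2 chap2.1
doomed=$(echo *junk* *temp*)   # junk oldjunk.bak temp1
everything=$(echo *)           # every name in the directory

printf '%s\n' "$chapters" "$doomed" "$everything"
```

Running echo with the pattern before running rm with it is a
sensible habit, since rm will not ask for confirmation.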














  The * is not the only pattern-matching feature available.
Suppose you want to print only chapters 1 through 4 and 9.  Then
you can say

  pr chap[12349]*

The [...] means to match any of the characters inside the
brackets.  A range of consecutive letters or digits can be
abbreviated, so you can also do this with

  pr chap[1-49]*

Letters can also be used within brackets: [a-z] matches any
character in the range a through z.
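
The two bracket forms can be checked against each other with echo.
This is a sketch for a modern POSIX shell; the scratch directory
and the set of chapter files are invented, and chapter 3 is left
out on purpose to show that a pattern only produces names that
actually exist.

```shell
cd "$(mktemp -d)" || exit 1
touch chap1.1 chap2.1 chap4.1 chap5.1 chap9.1

listed=$(echo chap[12349]*)   # chap1.1 chap2.1 chap4.1 chap9.1
ranged=$(echo chap[1-49]*)    # the same four names

printf '%s\n' "$listed" "$ranged"
```

Note that [1-49] means ``1 through 4, or 9'', not ``1 through
49''; each bracket expression matches a single character.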





  The ? pattern matches any single character, so

  ls ?

lists all files which have single-character names, and

  ls -l chap?.1

lists information about the first file of each chapter (chap1.1,
chap2.1, etc.).
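
The same preview with echo works for the ? pattern.  As before,
this is a sketch for a modern POSIX shell with invented file names
in a scratch directory.

```shell
cd "$(mktemp -d)" || exit 1
touch a b chap1.1 chap1.2 chap2.1 chap2.2

single=$(echo ?)            # a b
firsts=$(echo chap?.1)      # chap1.1 chap2.1

printf '%s\n' "$single" "$firsts"
```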





  Of these niceties, * is certainly the most useful, and you



should  get  used  to  it.  The others are frills, but worth



knowing.





  If you should ever have to turn off the special meaning of



*, ?, etc., enclose the entire argument in single quotes, as



in



  ls '?'







We'll see some more examples of this shortly.
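
  In the meantime, here is a small sketch of ? and of quoting
side by side.  The filenames are made up, and echo stands in
for ls so that the expansion itself is visible.

```shell
# Scratch directory with two one-letter names and three chapter files.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a b chap1.1 chap2.1 chap10.1

# ? matches exactly one character, so only a and b qualify.
single=$(echo ?)

# chap10.1 has two characters where the ? stands, so it is skipped.
firsts=$(echo chap?.1)

# Inside single quotes the ? is passed along untouched.
literal=$(echo '?')

echo "$single"
echo "$firsts"
echo "$literal"

cd / && rm -rf "$dir"
```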





What's in a Filename, Continued

  When you first made that file called junk, how did the
system know that there wasn't another junk somewhere else,
especially since the person in the next office is also
reading this tutorial?  The answer is that generally each user
has a private _directory_, which contains only the files that
belong to him.  When you log in, you are ``in'' your
directory.  Unless you take special action, when you create a new
file, it is made in the directory that you are currently in;
this is most often your own directory, and thus the file is
unrelated to any other file of the same name that might
exist in someone else's directory.

  The set of all files is organized into a (usually big)
tree, with your files located several branches into the
tree.  It is possible for you to ``walk'' around this tree,
and to find any file in the system, by starting at the root
of the tree and walking along the proper set of branches.
Conversely, you can start where you are and walk toward the
root.

  Let's try the latter first.  The basic tool is the
command pwd (``print working directory''), which prints the
name of the directory you are currently in.





  Although the details will vary according to the system you
are on, if you give the command pwd, it will print something





like





  /usr/your-name

This says that you are currently in the directory your-name,
which is in turn in the directory /usr, which is in turn in
the root directory called by convention just /.  (Even if
it's not called /usr on your system, you will get something
analogous.  Make the corresponding changes and read on.)

  If you now type

  ls /usr/your-name

you should get exactly the same list of file names as you
get from a plain ls: with no arguments, ls lists the
contents of the current directory; given the name of a
directory, it lists the contents of that directory.

  Next, try

  ls /usr

This should print a long series of names, among which is
your own login name your-name.  On many systems, usr is a
directory that contains the directories of all the normal
users of the system, like you.

  The next step is to try

  ls /

You should get a response something like this (although
again the details may be different):











  bin
  dev
  etc
  lib
  tmp
  usr

This is a collection of the basic directories of files that
the system knows about; we are at the root of the tree.

  Now try

  cat /usr/your-name/junk

(if junk is still around in your directory).  The name

  /usr/your-name/junk

is called the pathname of the file that you normally think
of as ``junk''.  ``Pathname'' has an obvious meaning: it
represents the full name of the path you have to follow from
the root through the tree of directories to get to a
particular file.  It is a universal rule in the UNIX system that
anywhere you can use an ordinary filename, you can use a
pathname.
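
  The rule can be checked directly.  The sketch below assumes
a scratch directory in place of /usr/your-name and creates
its own junk file; it then reads the file once by its short
name and once by the full pathname built from pwd.

```shell
# A scratch directory stands in for your own directory (assumption).
dir=$(mktemp -d)
cd "$dir" || exit 1
echo "some text" >junk

# pwd prints the pathname of the directory we are in.
here=$(pwd)

# The same file, named two ways.
by_name=$(cat junk)
by_path=$(cat "$here/junk")

echo "$here/junk contains: $by_path"

cd / && rm -rf "$dir"
```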





  Here is a picture which may make this clearer:





                      (root)
                     /  |  \
                    /   |   \
                   /    |    \
          bin    etc   usr   dev   tmp
         / | \  / | \ / | \ / | \ / | \
                     /  |  \
                    /   |   \
                 adam  eve  mary
                 /    /   \     \
                     /     \    junk
                   junk   temp









Notice that Mary's junk is unrelated to Eve's.

  This isn't too exciting if all the files of interest are
in your own directory, but if you work with someone else or
on several projects concurrently, it becomes handy indeed.
For example, your friends can print your book by saying

  pr /usr/your-name/chap*

Similarly, you can find out what files your neighbor has by
saying

  ls /usr/neighbor-name

or make your own copy of one of his files by

  cp /usr/your-neighbor/his-file yourfile

  If your neighbor doesn't want you poking around in his
files, or vice versa, privacy can be arranged.  Each file
and directory has read-write-execute permissions for the
owner, a group, and everyone else, which can be set to
control access.  See ls(1) and chmod(1) for details.  As a
matter of observed fact, most users most of the time find
openness of more benefit than privacy.
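
  As a rough illustration of how privacy might be arranged,
the sketch below makes a file readable and writable by its
owner only and then shows the permissions with ls -l.  The
file name diary and the particular chmod argument are
choices made for this example; chmod(1) describes the full
set of possibilities.

```shell
# Scratch directory and a file worth protecting (assumed names).
dir=$(mktemp -d)
cd "$dir" || exit 1
echo "private notes" >diary

# Owner may read and write; group and everyone else get nothing.
chmod u=rw,go= diary

# The first ten characters of ls -l show the file type and permissions.
mode=$(ls -l diary | cut -c1-10)
echo "$mode"

cd / && rm -rf "$dir"
```

The three groups of three letters after the leading - are
the owner's, the group's, and everyone else's permissions, in
that order.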





  As a final experiment with pathnames, try

  ls /bin /usr/bin

Do some of the names look familiar?  When you run a program,
by typing its name after the prompt character, the system
simply looks for a file of that name.  It normally looks
first in your directory (where it typically doesn't find





it), then in /bin and finally in /usr/bin.  There is nothing
magic about commands like cat or ls, except that they have
been collected into a couple of places to be easy to find
and administer.
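
  You can ask where a command's file actually lives.  The
sketch below relies on command -v, a shell built-in on
current systems that is not described in this paper, so
treat it as an assumption about your shell; the directory it
reports will also vary from system to system.

```shell
# Ask the shell which file it would run for the name "cat".
where=$(command -v cat)
echo "cat is the file $where"

# It is an ordinary file, and ls can describe it like any other.
ls -l "$where"
```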





  What if you work regularly with someone else on common
information in his directory?  You could just log in as your
friend each time you want to, but you can also say ``I want
to work on his files instead of my own''.  This is done by
changing the directory that you are currently in:

  cd /usr/your-friend

(On some systems, cd is spelled chdir.) Now when you use a
filename in something like cat or pr, it refers to the file
in your friend's directory.  Changing directories doesn't
affect any permissions associated with a file - if you
couldn't access a file from your own directory, changing to
another directory won't alter that fact.  Of course, if you
forget what directory you're in, type

  pwd

to find out.
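
  Here is a small sketch of the idea, with a scratch directory
named friend playing the part of /usr/your-friend.  The
directory and the file notes are invented for the example.

```shell
# Build a stand-in for the friend's directory (assumed layout).
top=$(mktemp -d)
mkdir "$top/friend"
echo "shared data" >"$top/friend/notes"

# Change to it; plain filenames now refer to files found there.
cd "$top/friend" || exit 1
now=$(pwd)
contents=$(cat notes)

echo "in $now, notes says: $contents"

cd / && rm -rf "$top"
```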





  It is usually convenient to arrange your own files so that
all the files related to one thing are in a directory
separate from other projects.  For example, when you write
your book, you might want to keep all the text in a
directory called book.  So make one with

  mkdir book





then go to it with

  cd book

then start typing chapters.  The book is now found in
(presumably)

  /usr/your-name/book

To remove the directory book, type

  rm book/*
  rmdir book

The first command removes all files from the directory; the
second removes the empty directory.

  You can go up one level in the tree of files by saying

  cd ..

``..'' is the name of the parent of whatever directory you
are currently in.  For completeness, ``.'' is an alternate
name for the directory you are in.
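
  The whole life of a directory can be run through in a few
lines.  This sketch assumes a scratch directory as the
starting point and two invented chapter files; basename,
used only to shorten what pwd prints, is an extra not covered
in this paper.

```shell
# Start somewhere harmless (assumed scratch directory).
top=$(mktemp -d)
cd "$top" || exit 1

mkdir book                       # make the directory
cd book || exit 1                # go into it
echo "Chapter one" >chap1
echo "Chapter two" >chap2
inside=$(basename "$(pwd)")      # last component of the pathname

cd ..                            # back up one level to the parent
rm book/*                        # remove the files ...
rmdir book                       # ... then the now-empty directory

if [ -d book ]; then gone=no; else gone=yes; fi
echo "was in: $inside, removed: $gone"

cd / && rm -rf "$top"
```

Note that rmdir refuses to remove a directory that still
has files in it, which is why rm comes first.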





Using Files instead of the Terminal

  Most of the commands we have seen so far produce output on
the terminal; some, like the editor, also take their input
from the terminal.  It is universal in UNIX systems that the
terminal can be replaced by a file for either or both of
input and output.  As one example,

  ls

makes a list of files on your terminal.  But if you say











  ls >filelist

a list of your files will be placed in the file filelist
(which will be created if it doesn't already exist, or
overwritten if it does).  The symbol > means ``put the
output on the following file, rather than on the terminal.''
Nothing is produced on the terminal.  As another example,
you could combine several files into one by capturing the
output of cat in a file:

  cat f1 f2 f3 >temp

  The symbol >> operates very much like > does, except that
it means ``add to the end of.'' That is,

  cat f1 f2 f3 >>temp

means to concatenate f1, f2 and f3 to the end of whatever is
already in temp, instead of overwriting the existing
contents.  As with >, if temp doesn't exist, it will be created
for you.
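
  The difference between > and >> is easiest to see by
trying both on the same file.  In this sketch the three
one-line input files and the scratch directory are made up
for the purpose.

```shell
# Scratch directory with three one-line files (assumed contents).
dir=$(mktemp -d)
cd "$dir" || exit 1
echo one >f1
echo two >f2
echo three >f3

# > creates temp (or overwrites it) with the combined output.
cat f1 f2 f3 >temp
once=$(cat temp)

# >> adds f1 to the end of what is already there.
cat f1 >>temp
appended=$(cat temp)

# > again: the earlier contents are thrown away.
cat f3 >temp
overwritten=$(cat temp)

echo "$appended"

cd / && rm -rf "$dir"
```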





  In a similar way, the symbol < means to take the input for
a program from the following file, instead of from the
terminal.  Thus, you could make up a script of commonly used
editing commands and put them into a file called script.
Then you can run the script on a file by saying

  ed file <script

[...]

<, >, and | into changes of input and output streams.





  The shell has other capabilities too.  For example, you
can run two programs with one command line by separating the
commands with a semicolon; the shell recognizes the
semicolon and breaks the line into two commands.  Thus

  date; who

does both commands before returning with a prompt character.





  You can also have more than one program running
_simultaneously_ if you wish.  For example, if you are doing something
time-consuming, like the editor script of an earlier
section, and you don't want to wait around for the results
before starting something else, you can say

  ed file <script >script.out &





which saves the output lines in a file called script.out.

  When you initiate a command with &, the system replies
with a number called the process number, which identifies
the command in case you later want to stop it.  If you do,
you can say

  kill process-number

If you forget the process number, the command ps will tell
you about everything you have running.  (If you are
desperate, kill 0 will kill all your processes.)  And if
you're curious about other people, ps a will tell you about
_all_ programs that are currently running.
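
  When commands are read from a file rather than typed, the
process number is not printed, but current shells keep it in
the variable $!.  That variable, the sleep command used as a
stand-in for a long job, and the wait built-in are all
assumptions of this sketch; none is described in this paper.

```shell
# Start a long-running stand-in job in the background.
sleep 30 &

# $! holds the process number of the most recent background command.
pid=$!
echo "started process $pid"

# Stop it, just as you would with a number typed by hand.
kill "$pid"

# wait collects the job; a status above 128 means a signal ended it.
wait "$pid" 2>/dev/null
status=$?
echo "exit status: $status"
```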





  You can say

  (command-1; command-2; command-3) &

to start three commands in the background, or you can start
a background pipeline with

  command-1 | command-2 &
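
  As a concrete sketch of the parenthesized form, three echo
commands stand in for command-1 through command-3, and their
combined output is sent to a log file in a scratch
directory.  The wait built-in, which pauses until background
work is finished, is an addition not covered in this paper.

```shell
# Somewhere to keep the output (assumed scratch directory).
dir=$(mktemp -d)

# The parentheses group the three commands; & runs the group in the background.
(echo first; echo second; echo third) >"$dir/log" &

# Pause until the background group has finished.
wait

lines=$(cat "$dir/log")
echo "$lines"

rm -rf "$dir"
```

Because the three commands are separated by semicolons, they
still run one after another; it is the group as a whole that
runs in the background.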





  Just as you can tell the editor or some similar program to
take its input from a file instead of from the terminal, you
can tell the shell to read a file to get commands.  (Why
not?  The shell, after all, is just a program, albeit a
clever one.) For instance, suppose you want to set tabs on
your terminal, and find out the date and who's on the system
every time you log in.  Then you can put the three necessary
commands (tabs, date, who) into a file, let's call it







startup, and then run it with

  sh startup

This says to run the shell with the file startup as input.
The effect is as if you had typed the contents of startup on
the terminal.

  If this is to be a regular thing, you can eliminate the
need to type sh: simply type, once only, the command

  chmod +x startup

and thereafter you need only say

  startup

to run the sequence of commands.  The chmod(1) command marks
the file executable; the shell recognizes this and runs it
as a sequence of commands.
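
  The two ways of running a command file can be tried in a
scratch directory.  This sketch makes some assumptions: the
file holds date and a closing echo rather than tabs and who,
and it is run as ./startup, since many current systems do not
search the current directory for commands.

```shell
# Scratch directory for the experiment (assumed setup).
dir=$(mktemp -d)
cd "$dir" || exit 1

# Build the command file one line at a time with > and >>.
echo 'date' >startup
echo 'echo "startup finished"' >>startup

# First way: name the shell explicitly.
via_sh=$(sh startup)

# Second way: mark the file executable once, then run it by name.
chmod +x startup
direct=$(./startup)

echo "$direct"

cd / && rm -rf "$dir"
```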





  If you want startup to run automatically every time you
log in, create a file in your login directory called
.profile, and place in it the line startup.  When the shell
first gains control when you log in, it looks for the
.profile file and does whatever commands it finds in it.
We'll get back to the shell in the section on programming.







III. DOCUMENT PREPARATION





  UNIX systems are used extensively  for  document  prepara-



tion.   There  are  two  major formatting programs, that is,



programs that produce a text with justified  right  margins,







automatic page numbering and titling, automatic hyphenation,
and the like.  nroff is designed to produce output on
terminals and line-printers.  troff (pronounced ``tee-roff'')
instead drives a phototypesetter, which produces very high
quality output on photographic paper.  This paper was
formatted with troff.





Formatting Packages

  The basic idea of nroff and troff is that the text to be
formatted contains within it ``formatting commands'' that
indicate in detail how the formatted text is to look.  For
example, there might be commands that specify how long lines
are, whether to use single or double spacing, and what
running titles to use on each page.

  Because nroff and troff are relatively hard to learn to
use effectively, several ``packages'' of canned formatting
requests are available to let you specify paragraphs,
running titles, footnotes, multi-column output, and so on, with
little effort and without having to learn nroff and troff.
These packages take a modest effort to learn, but the
rewards for using them are so great that it is time well
spent.

  In this section, we will provide a hasty look at the
``manuscript'' package known as -ms.  Formatting requests
typically consist of a period and two upper-case letters,
such as .TL, which is used to introduce a title, or .PP to







begin a new paragraph.





  A document is typed so it looks something like this:





  .TL
  title of document
  .AU
  author name
  .SH
  section heading
  .PP
  paragraph ...
  .PP
  another paragraph ...
  .SH
  another section heading
  .PP
  etc.

The lines that begin with a period are the formatting
requests.  For example, .PP calls for starting a new
paragraph.  The precise meaning of .PP depends on what output
device is being used (typesetter or terminal, for instance),
and on what publication the document will appear in.  For
example, -ms normally assumes that a paragraph is preceded
by a space (one line in nroff, 1/2 line in troff), and the
first word is indented.  These rules can be changed if you
like, but they are changed by changing the interpretation of
.PP, not by re-typing the document.





  To actually produce a document in standard format using
-ms, use the command

  troff -ms files ...

for the typesetter, and

  nroff -ms files ...







for a terminal.  The -ms argument tells troff and nroff to
use the manuscript package of formatting requests.





  There are several similar packages;  check  with  a  local



expert  to  determine  which  ones are in common use on your



machine.





Supporting Tools





  In addition to the basic formatters, there is  a  host  of



supporting  programs  that  help  with document preparation.



The list in the next few paragraphs is far from complete, so



browse  through  the manual and check with people around you



for other possibilities.





  eqn and neqn let you integrate mathematics into the text
of a document, in an easy-to-learn language that closely
resembles the way you would speak it aloud.  For example,
the eqn input

  sum from i=0 to n x sub i ~=~ pi over 2

produces the output

  $$\sum_{i=0}^{n} x_i = \frac{\pi}{2}$$

  The program tbl provides an analogous service for
preparing tabular material; it does all the computations necessary
to align complicated columns with elements of varying
widths.














  refer prepares bibliographic citations from a data base,
in whatever style is defined by the formatting package.  It
looks after all the details of numbering references in
sequence, filling in page and volume numbers, getting the
author's initials and the journal name right, and so on.

  spell and typo detect possible spelling mistakes in a
document.  spell works by comparing the words in your
document to a dictionary, printing those that are not in the
dictionary.  It knows enough about English spelling to
detect plurals and the like, so it does a very good job.
typo looks for words which are ``unusual'', and prints
those.  Spelling mistakes tend to be more unusual, and thus
show up early when the most unusual words are printed first.

  grep looks through a set of files for lines that contain a
particular text pattern (rather like the editor's context
search does, but on a bunch of files).  For example,

  grep 'ing$' chap*

will find all lines that end with the letters ing in the
files chap*.  (It is almost always a good practice to put
single quotes around the pattern you're searching for, in
case it contains characters like * or $ that have a special
meaning to the shell.) grep is often useful for finding out
in which of a set of files the misspelled words detected by
spell are actually located.
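
  To see what grep prints, here is a sketch with two invented
chapter files in a scratch directory.  When grep is given
more than one file, it labels each matching line with the
name of the file it came from.

```shell
# Two small chapter files with made-up text.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'I was reading\nreading is fun\nstop writing\n' >chap1
printf 'nothing here\nkeep going\n' >chap2

# The quotes keep the shell from touching the $ in the pattern.
found=$(grep 'ing$' chap*)
echo "$found"

cd / && rm -rf "$dir"
```

Only lines whose last three letters are ing are reported;
``reading is fun'' contains ing but does not end with it.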





  diff prints a list of the differences between two files,







so  you  can compare two versions of something automatically



(which certainly beats proofreading by hand).
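
  A small sketch of the comparison, using two invented
three-line files called old and new that differ only in
their second line:

```shell
# Two versions of the same short file (assumed contents).
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'one\ntwo\nthree\n' >old
printf 'one\n2\nthree\n' >new

# Lines marked < come from the first file, lines marked > from the second.
changes=$(diff old new)
echo "$changes"

cd / && rm -rf "$dir"
```

The 2c2 at the top of the report says that line 2 of the
first file must be changed to produce line 2 of the second.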





  wc counts the words, lines and characters in a set of
files.  tr translates characters into other characters; for
example it will convert upper to lower case and vice versa.
This translates upper into lower:

  tr A-Z a-z <input >output
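
  Both programs can be tried on a two-line file.  In this
sketch the file contents are invented, and set -- is used
only to split what wc prints into its three numbers; it is a
shell feature not covered in this paper.

```shell
# A small input file with mixed case (assumed contents).
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'Hello World\nUNIX for Beginners\n' >input

# Translate upper case into lower case, reading input and writing output.
tr 'A-Z' 'a-z' <input >output
lowered=$(cat output)
echo "$lowered"

# wc prints lines, words and characters, in that order.
set -- $(wc output)
lines=$1
words=$2
chars=$3
echo "$lines lines, $words words, $chars characters"

cd / && rm -rf "$dir"
```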





  sort sorts files in a variety of ways; cref makes
cross-references; ptx makes a permuted index (keyword-in-context
listing).  sed provides many of the editing facilities of
ed, but can apply them to arbitrarily long inputs.  awk
provides the ability to do both pattern matching and numeric
computations, and to conveniently process fields within
lines.  These programs are for more advanced users, and they
are not limited to document preparation.  Put them on your
list of things to learn about.
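
  For a first taste of three of these, the sketch below runs
sort, sed and awk over a tiny invented file of names and
numbers.  The particular sed and awk commands are examples
chosen here, not taken from this paper; the manual pages
describe what else each program can do.

```shell
# A three-line file: a name and a count on each line (assumed data).
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'pear 3\napple 10\nfig 7\n' >fruit

# sort puts the lines into alphabetical order.
sorted=$(sort fruit)

# sed applies an ed-style substitution to every line as it passes through.
renamed=$(sed 's/fig/date/' fruit)

# awk treats each line as fields; here it adds up the second field.
total=$(awk '{ sum += $2 } END { print sum }' fruit)

echo "$sorted"
echo "total: $total"

cd / && rm -rf "$dir"
```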





  Most of these programs are either independently documented
(like eqn and tbl), or are sufficiently simple that the
description in the _UNIX Programmer's Manual_ is adequate
explanation.





Hints for Preparing Documents





  Most documents go through several  versions  (always  more



than   you  expected)  before  they  are  finally  finished.



Accordingly, you should do whatever possible to make the job







of changing them easy.





  First, when you do the  purely  mechanical  operations  of



typing, type so that subsequent editing will be easy.  Start



each sentence on a new line.  Make lines  short,  and  break



lines  at  natural  places,  such  as after commas and semi-



colons, rather than  randomly.   Since  most  people  change



documents  by  rewriting  phrases  and  adding, deleting and



rearranging sentences, these precautions simplify any  edit-



ing you have to do later.





  Keep the individual files of a  document  down  to  modest



size,  perhaps  ten  to fifteen thousand characters.  Larger



files edit more slowly, and of course if  you  make  a  dumb



mistake  it's  better  to have clobbered a small file than a



big one.  Split into files  at  natural  boundaries  in  the



document,  for the same reasons that you start each sentence



on a new line.





  The second aspect of making change easy is to not commit
yourself to formatting details too early.  One of the
advantages of formatting packages like -ms is that they permit
you to delay decisions to the last possible moment.  Indeed,
until a document is printed, it is not even decided whether
it will be typeset or put on a line printer.





  As a rule of thumb, for all but the most trivial jobs, you
should type a document in terms of a set of requests like
.PP, and then define them appropriately, either by using one







of the canned packages (the better way) or by defining your
own nroff and troff commands.  As long as you have entered
the text in some systematic way, it can always be cleaned up
and re-formatted by a judicious combination of editing
commands and request definitions.





IV.  PROGRAMMING





  There will be no attempt made to teach any of the program-



ming  languages  available  but a few words of advice are in



order.  One of the reasons why the UNIX system is a  produc-



tive programming environment is that there is already a rich



set of tools  available,  and  facilities  like  pipes,  I/O



redirection, and the capabilities of the shell often make it



possible to do a  job  by  pasting  together  programs  that



already exist instead of writing from scratch.





The Shell

  The pipe mechanism lets you fabricate quite complicated
operations out of spare parts that already exist.  For
example, the first draft of the spell program was (roughly)

  cat ...     collect the files
  | tr ...    put each word on a new line
  | tr ...    delete punctuation, etc.
  | sort      into dictionary order
  | uniq      discard duplicates
  | comm      print words in text
              but not in dictionary



More pieces have been added subsequently, but  this  goes  a



long way for such a small effort.
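
  The outline above leaves the arguments out.  The sketch
below is one way the pieces might be filled in on a current
system; it is not the original spell.  The tr and comm
arguments, the two-line chapter file, and the seven-word
dictionary are all assumptions made for the illustration.

```shell
# Use plain byte ordering so that sort and comm agree on the order.
LC_ALL=C
export LC_ALL

dir=$(mktemp -d)
cd "$dir" || exit 1

# A short text with one deliberate mistake ("dgo"), and a tiny sorted dictionary.
printf 'The quick brown fox.\nThe lazy dgo, the end.\n' >chap1
printf 'brown\ndog\nend\nfox\nlazy\nquick\nthe\n' >dict

misspelled=$(cat chap1 |
  tr -cs 'A-Za-z' '\n' |   # one word per line; punctuation and spaces dropped
  tr 'A-Z' 'a-z' |         # fold everything to lower case
  sort |                   # into dictionary order
  uniq |                   # discard duplicates
  comm -23 - dict)         # keep words in the text but not in the dictionary

echo "$misspelled"

cd / && rm -rf "$dir"
```

Each stage does one small job and passes its output along,
which is what makes it practical to assemble the whole thing
from existing programs.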














  The editor can be made to do things  that  would  normally



require  special programs on other systems.  For example, to



list the first and last lines of each of  a  set  of  files,



such as a book, you could laboriously type





  ed
  e chap1.1
  1p
  $p
  e chap1.2
  1p
  $p
  etc.

But you can do the job much more easily.  One way is to type

  ls chap* >temp

to get the list of filenames into a file.  Then edit this
file to make the necessary series of editing commands (using
the global commands of ed), and write it into script.  Now
the command





  ed <script