├── ref
    ├── awk1978.pdf
    ├── mcilroy.htm
    ├── awk1line.txt
    └── hist.html
├── present.awk
├── README.md
├── LICENSE
└── slides.txt

/ref/awk1978.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mikepea/awk_tawk/HEAD/ref/awk1978.pdf

--------------------------------------------------------------------------------
/present.awk:
--------------------------------------------------------------------------------
1 | #!/usr/bin/awk -f
2 |
3 | BEGIN { FS="\n"; RS=""; } # multiline mode
4 |
5 | function get_key() { RS="\n"; getline key < "-"; RS="" }
6 | function refresh() { system("clear"); print "=== Slide " NR " ===" }
7 | function alen(a) { c=0; for (i in a) c++; return c }
8 | function empty_array(a) { split("", a, ":") }
9 | function print_slide(slide) {
10 |   l = alen(slide)
11 |   for (i=1;i<=l;i++) {
12 |     if ( slide[i] ~ /^@/ ) { continue }
13 |     print ( slide[i] == "." ) ? "" : slide[i]
14 |   }
15 | }
16 |
17 | {
18 |   refresh()
19 |   if ($1 ~ /^!/) {
20 |     system(substr($1, 2)); empty_array(slide_cache)
21 |   } else if ($1 ~ /^#/) next
22 |   else {
23 |     if ( $1 != "@last") empty_array(slide_cache)
24 |     orig_len = alen(slide_cache)
25 |     for (i=1; i<=NF; i++) {
26 |       slide_cache[orig_len + i] = $i
27 |     }
28 |   }
29 | }
30 | NR >= ENVIRON["SS"] { print_slide(slide_cache); get_key() }
31 |

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # awk_tawk
2 |
3 | Presentation on how great AWK is, including a review of The AWK Programming Language
4 |
5 | Also includes a presentation tool, written in AWK. Eating own dog food or what!
6 |
7 |     SS=1 awk -f ./present.awk slides.txt
8 |
9 | ... where SS optionally gives the starting slide number.
10 |
11 | Hit enter to advance slides. PRs welcome.
12 |
13 | ### Notes and References
14 |
15 | Dennis Ritchie on early Unix history: https://www.bell-labs.com/usr/dmr/www/hist.html
16 |
17 | Doug McIlroy Interview: https://www.princeton.edu/~hos/frs122/precis/mcilroy.htm
18 |
19 | 1978 Awk Paper - 'Awk -- A Pattern Scanning and Processing Language (Second
20 | Edition) (1978)': http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.1299
21 |
22 | Handy Awk cheat sheet: http://www.catonmat.net/download/awk.cheat.sheet.pdf
23 |
24 | Archive.org link to The Awk Programming Language PDF:
25 | https://ia802309.us.archive.org/25/items/pdfy-MgN0H1joIoDVoIC7/The_AWK_Programming_Language.pdf
26 |
27 | And a nice list of one-liners: http://www.catonmat.net/blog/wp-content/uploads/2008/09/awk1line.txt
28 |
29 | Karabiner Elements - makes it possible to control this presentation with
30 | a clicker (that emulates a keyboard): https://pqrs.org/osx/karabiner/
31 |
32 |
33 |

--------------------------------------------------------------------------------
/ref/mcilroy.htm:
--------------------------------------------------------------------------------
1 |
2 |
3 | With the experience gained from
18 | Multics, "Ken Thompson began to build his own operating system for the
19 | giant 645, starting from scratch." In the wee hours of the
20 | night, Thompson would take the system down when nobody was using it.
21 |
22 | At the same time, the development
23 | group was still working. The Computer Center still owned the machines
24 | and was separate from Research. So Computing Research had no computers.
25 | "Visual and acoustics research had had computers for some time."
26 | They were interested in listening to signals in real time and making digital
27 | filters, but this ate up all the cycles of the machine. As more and
28 | more minicomputers came available, Visual and Acoustics Research kept getting
29 | them. "They had nice hardware, and we would comment on how inefficiently
30 | they were using their cycles. Because they really didn't like making
31 | software, when things got tough, they would just buy another machine.
32 | And if the machines got a little faster, they would just throw out the
33 | old one. That was the origin of the PDP7."
34 |
35 | The PDP7 had an improved graphics
36 | engine that had been sitting idle. "That's what Thompson grabbed
37 | and finally used to build early versions of Unix on." As Thompson
38 | brought along his operating system, Ritchie joined in. McIlroy also
39 | saw its potential, and, being head of the department, he muscled in.
40 |
41 | One place that McIlroy exerted
42 | managerial control of Unix was in pushing for pipes. The idea of
43 | pipes goes way back. McIlroy began doing macros in the CACM back
44 | in 1959 or 1960. Macros involve switching among many data streams.
45 | "You're taking in your input, you suddenly come to a macro call, and that
46 | says, stop taking input from here, go take it from the definition.
47 | In the middle of the definition, you'll find another macro call.
48 | Somewhere I talked of a macro processor as a switchyard for data streams.
49 | ...In 1964, [according to a] paper hanging on Brian's wall, I talked about
50 | screwing together streams like garden hoses."
51 |
52 | "On MULTICS, Joe Ossanna, ... was
53 | actually beginning to build a way to do input-output plumbing. Input-output
54 | was interpreted over this idea of the segmented address space in the file
55 | system: files were really just segments of the same old address space.
56 | Nevertheless, you had to do I/O because all the programming languages did
57 | it. And he was making ways of connecting programs together."
58 |
59 | While Thompson and Ritchie were
60 | laying out their file system, McIlroy was "sketching out how to do data
61 | processing by connecting together cascades of processes and looking for
62 | a kind of prefix-notation language for connecting processes together."
63 |
64 | Over a period from 1970 to 1972,
65 | McIlroy suggested proposal after proposal. He recalls the breakthrough
66 | day: "Then one day, I came up with a syntax for the shell that went
67 | along with the piping, and Ken said, I'm gonna do it. He was tired
68 | of hearing all this stuff." Thompson didn't do exactly what McIlroy
69 | had proposed for the pipe system call, but "invented a slightly better
70 | one. That finally got changed once more to what we have today. He
71 | put pipes into Unix." Thompson also had to change most of the programs,
72 | because up until that time, they couldn't take standard input. There
73 | wasn't really a need; they all had file arguments. "GREP had a file argument,
74 | CAT had a file argument."
75 |
76 | The next morning, "we had this
77 | orgy of `one liners.' Everybody had a one liner. Look
78 | at this, look at that. ...Everybody started putting forth the UNIX
79 | philosophy. Write programs that do one thing and do it well.
80 | Write programs to work together. Write programs that handle text
81 | streams, because that is a universal interface." Those ideas,
82 | which add up to the tool approach, were there in some unformed way before
83 | pipes, but they really came together afterwards. Pipes became the
84 | catalyst for this UNIX philosophy. "The tool thing has turned out
85 | to be actually successful. With pipes, many programs could work together,
86 | and they could work together at a distance."
87 |
88 | APL influenced the development
89 | of pipes. APL did not allow the use of operators with variants, which
90 | many utilities had at the time. It only took a willingness to throw
91 | in a new separator, the vertical bar. About four years passed from
92 | the time they started talking about developing a new separator to the
93 | time it happened.
94 |
95 | To most of the Research group at
96 | Bell Labs, the computing theory was there on the side, while they had functionality
97 | to deal with. Most of the group members were more computer types
98 | than mathematicians, even though they wrote papers occasionally with mathematical
99 | notation. McIlroy went to Oxford for a year, solely to "imbibe the
100 | notion of semantics from the source." The Research group included
101 | system builders like Thompson, and theoretical scientists like Aho.
102 | "Aho handed out paper after paper of slightly different models of parsing
103 | and automata, and that was supported with the overt idea that one day [it]
104 | would feed computing practice."
105 |
106 | "There is a case where there's absolutely no doubt
107 | that, overtly, theory fed into what we did. When the sound theory
108 | of parsing went into a compiler-writing system, it became available to
109 | the masses. There are lots of other places where theory is an inspiration,
110 | or it's in the back of your mind." Thompson wrote one famous recognizer,
111 | which is still used in GREP. And Aho decided that he was going
112 | to take that part of automata theory, and so he built EGREP. "So,
113 | you have the deterministic [parser] in EGREP, and the nondeterministic one in
114 | GREP."
115 |
116 | "I think really that YACC and GREP are what people
117 | hold up as the `real tools' and they are the ones where we find a strong
118 | theoretical underpinning. TROFF has none. [While]
119 | it's used, and indispensable, nobody holds it up as a programming
120 | gem."
121 |
122 | This concludes what is contained in the interview,
123 | as it relates to Unix.
124 |
40 | During the past few years, the Unix operating system
41 | has come into wide use,
42 | so wide that its very name has become a trademark of Bell Laboratories.
43 | Its important characteristics have become known to many people.
44 | It has suffered much rewriting and tinkering
45 | since the first publication describing it in 1974 [1],
46 | but few fundamental changes.
47 | However, Unix was born in 1969 not 1974,
48 | and the account of its development
49 | makes a little-known and perhaps instructive story.
50 | This paper presents a technical and social history
51 | of the evolution of the system.
52 |
56 | For computer science at Bell Laboratories, the period 1968-1969 was
57 | somewhat unsettled.
58 | The main reason for this was the slow, though clearly
59 | inevitable, withdrawal of the Labs from the Multics project.
60 | To the Labs computing community as a whole,
61 | the problem was the increasing obviousness of the failure of Multics to deliver
62 | promptly any sort of usable system,
63 | let alone the panacea envisioned earlier.
64 | For much of this time,
65 | the Murray Hill Computer Center was also running a costly
66 | GE 645 machine that inadequately simulated
67 | the GE 635.
68 | Another shake-up that occurred during this period
69 | was the organizational separation of computing services
70 | and computing research.
71 |
73 | From the point of view of the group that was to
74 | be most involved in the beginnings of Unix
75 | (K. Thompson, Ritchie, M. D. McIlroy, J. F. Ossanna),
76 | the decline and fall of Multics had a directly felt effect.
77 | We were among the last Bell Laboratories holdouts
78 | actually working on Multics,
79 | so we still felt some sort of stake in its success.
80 | More important, the convenient interactive computing service
81 | that Multics had promised to the entire community
82 | was in fact available to our limited group,
83 | at first under the CTSS system used to develop Multics,
84 | and later under Multics itself.
85 | Even though Multics could not then
86 | support many users, it could support us,
87 | albeit at exorbitant cost.
88 | We didn't want to lose the pleasant niche we occupied,
89 | because no similar ones were available;
90 | even the time-sharing service that would later be offered
91 | under GE's operating system
92 | did not exist.
93 | What we wanted to preserve was not just a good environment in which to
94 | do programming, but a system around which a fellowship could form.
95 | We knew from experience
96 | that
97 | the essence of communal computing, as supplied by remote-access, time-shared machines,
98 | is not just to type programs
99 | into a terminal instead of a keypunch,
100 | but to encourage close communication.
101 |
103 | Thus, during 1969,
104 | we began trying to find an alternative to Multics.
105 | The search took several forms.
106 | Throughout 1969 we (mainly Ossanna, Thompson, Ritchie)
107 | lobbied intensively for the purchase of
108 | a medium-scale machine
109 | for which we promised to write an operating system;
110 | the machines we suggested were the DEC PDP-10
111 | and the SDS (later Xerox) Sigma 7.
112 | The effort was frustrating, because our proposals were never
113 | clearly and finally turned down,
114 | but yet were certainly never accepted.
115 | Several times it seemed we were very near success.
116 | The final blow to this effort came when we
117 | presented an exquisitely complicated proposal,
118 | designed to minimize financial outlay,
119 | that involved some outright purchase, some third-party lease,
120 | and a plan to turn in a DEC KA-10 processor on the soon-to-be-announced
121 | and more capable KI-10.
122 | The proposal was rejected, and rumor soon had it that
123 | W. O. Baker (then vice-president of Research)
124 | had reacted to it with the comment `Bell Laboratories
125 | just doesn't do business this way!'
126 |
128 | Actually, it is perfectly obvious in retrospect
129 | (and should have been at the time)
130 | that we were asking the Labs to spend too much money
131 | on too few people with too vague a plan.
132 | Moreover, I am quite sure that at that time operating systems
133 | were not, for our management, an attractive area in which to support work.
134 | They were in the process of extricating themselves
135 | not only from an operating system development effort that
136 | had failed,
137 | but from running the local Computation Center.
138 | Thus it may have seemed that buying a machine such as we
139 | suggested might lead on the one hand to yet another Multics,
140 | or on the other, if we produced something useful,
141 | to yet another Comp Center for them to be responsible for.
142 |
144 | Besides the financial agitations that took place in 1969,
145 | there was technical work also.
146 | Thompson, R. H. Canaday, and Ritchie
147 | developed, on blackboards and scribbled notes,
148 | the basic design of a file system
149 | that was later to become the heart of Unix.
150 | Most of the design was Thompson's,
151 | as was the impulse to think about file systems at all,
152 | but I believe I contributed the idea of device files.
153 | Thompson's itch for creation of an operating system took several forms during
154 | this period;
155 | he also wrote (on Multics)
156 | a fairly detailed simulation of the performance of the proposed file
157 | system design
158 | and of paging behavior of programs.
159 | In addition, he started work on a new operating system
160 | for the GE-645, going as far as writing an assembler
161 | for the machine and a rudimentary operating system kernel
162 | whose greatest achievement, so far as I remember,
163 | was to type a greeting message.
164 | The complexity of the machine was such that a mere message was already
165 | a fairly notable accomplishment, but when it became clear that the lifetime
166 | of the 645 at the Labs was measured in months,
167 | the work was dropped.
168 |
170 | Also during 1969, Thompson developed the game of
171 | `Space Travel.'
172 | First written on Multics, then transliterated into Fortran
173 | for GECOS
174 | (the operating system for the GE, later Honeywell, 635),
175 | it was nothing less than a simulation of the movement of the major bodies
176 | of the Solar System, with the player guiding a ship
177 | here and there, observing the scenery, and attempting to
178 | land on the various planets and moons.
179 | The GECOS version was unsatisfactory in two important respects:
180 | first, the display of the state of the game was jerky and hard to control
181 | because one had to type commands at it,
182 | and second, a game cost about $75 for CPU time on the big computer.
183 | It did not take long, therefore,
184 | for Thompson to find a little-used PDP-7 computer with
185 | an excellent display processor;
186 | the whole system was used as a Graphic-II terminal.
187 | He and I rewrote Space Travel
188 | to run on this machine.
189 | The undertaking was more ambitious than it might seem;
190 | because we disdained all existing software,
191 | we had to write a floating-point arithmetic package,
192 | the pointwise specification of the graphic characters
193 | for the display,
194 | and a debugging subsystem that continuously
195 | displayed the contents of typed-in locations in a corner
196 | of the screen.
197 | All this was written in assembly language for a cross-assembler
198 | that ran under GECOS and produced paper tapes
199 | to be carried to the PDP-7.
200 |
202 | Space Travel, though it made a very attractive game,
203 | served mainly as an introduction to the clumsy
204 | technology of preparing programs for the PDP-7.
205 | Soon Thompson began implementing the paper file system
206 | (perhaps `chalk file system' would be more accurate)
207 | that had been designed earlier.
208 | A file system without a way to exercise it
209 | is a sterile proposition,
210 | so he
211 | proceeded to flesh it out with
212 | the other requirements for a working operating system,
213 | in particular the notion of processes.
214 | Then came a small set of user-level utilities:
215 | the means to copy, print, delete, and edit files,
216 | and of course a simple command interpreter (shell).
217 | Up to this time all the programs were written using GECOS
218 | and files were transferred
219 | to the PDP-7 on paper tape;
220 | but once an assembler was completed the
221 | system was able to support itself.
222 | Although it was not until well into 1970 that
223 | Brian Kernighan
224 | suggested the name `Unix,' in a somewhat treacherous
225 | pun on `Multics,'
226 | the operating system we know today was born.
227 |
231 | Structurally,
232 | the file system of PDP-7 Unix
233 | was nearly identical to today's.
234 | It had
235 |
259 | The important file system calls were also present
260 | from the start.
261 | Read, write, open, creat (sic), close:
262 | with one very important exception, discussed below,
263 | they were similar to what one finds now.
264 | A minor difference was that the unit of I/O was the
265 | word, not the byte, because the PDP-7 was a word-addressed machine.
266 | In practice this meant merely that all programs dealing
267 | with character streams ignored null characters,
268 | because null was used to pad a file to
269 | an even number of characters.
270 | Another minor, occasionally annoying difference
271 | was the lack of erase and kill processing
272 | for terminals.
273 | Terminals, in effect, were always in raw mode.
274 | Only a few programs (notably the shell and the editor)
275 | bothered to implement erase-kill processing.
276 |
278 | In spite of its considerable similarity
279 | to the current file system,
280 | the PDP-7 file system was in one way remarkably different:
281 | there were no path names, and each file-name argument
282 | to the system was a simple name (without `/') taken
283 | relative to the current directory.
284 | Links, in the usual Unix sense, did exist.
285 | Together with an elaborate set of conventions,
286 | they were the
287 | principal means by which the lack of path names became
288 | acceptable.
289 |
291 | The
292 | link
293 | call took the form
294 |
125 |
126 |
127 |
--------------------------------------------------------------------------------
/ref/awk1line.txt:
--------------------------------------------------------------------------------
1 | HANDY ONE-LINERS FOR AWK 22 July 2003
2 | compiled by Eric Pement The Evolution of the Unix Time-sharing System*
8 |
9 |
12 |
10 | Bell Laboratories, Murray Hill, NJ, 07974
11 |
21 | ABSTRACT
13 | This paper presents a brief history of the early development of the Unix operating
14 | system.
15 | It concentrates on the evolution of the file system,
16 | the process-control mechanism,
17 | and the idea of pipelined commands.
18 | Some attention is paid to social conditions
19 | during the development of the system.
20 |
22 |
23 |
36 |
37 | Introduction
38 |
39 | Origins
54 |
55 | The PDP-7 Unix file system
229 |
230 |
237 |
258 |
295 | link(dir, file, newname)
296 |
297 | where
298 | dir
299 | was a directory file in the current directory,
300 | file
301 | an existing entry in that directory,
302 | and
303 | newname
304 | the name of the link, which was added to the
305 | current directory.
306 | Because
307 | dir
308 | needed to be in the current directory,
309 | it is evident that today's prohibition against
310 | links to directories was not enforced;
311 | the PDP-7 Unix file system had the shape of a general
312 | directed graph.
313 |
315 | So that every user did not need 316 | to maintain a link to all directories of interest, 317 | there existed a directory 318 | called 319 | dd 320 | that contained entries for the directory of each 321 | user. 322 | Thus, to make a link to file 323 | x 324 | in directory 325 | ken, 326 | I might do 327 |
328 | ln dd ken ken
329 | ln ken x x
330 | rm ken
331 |
341 | The 342 | dd 343 | convention made the 344 | chdir 345 | command relatively convenient. 346 | It took multiple arguments, and switched the current directory to each named directory in turn. 347 | Thus 348 |
349 | chdir dd ken 350 |
362 | The most serious inconvenience of the implementation of the file system, 363 | aside from the lack of path names, 364 | was the difficulty of changing its configuration; 365 | as mentioned, directories and special files were both made 366 | only when the disk was recreated. 367 | Installation of a new device was very painful, because the code 368 | for devices was spread widely throughout the system; 369 | for example there were several loops that visited each device in turn. 370 | Not surprisingly, there was no notion of mounting a removable 371 | disk pack, because the machine had only a single fixed-head disk. 372 |
373 |374 | The operating system code that implemented this file system 375 | was a drastically simplified version of the present 376 | scheme. 377 | One important simplification followed from the fact that 378 | the system was not multi-programmed; 379 | only one program was in memory at a time, 380 | and control was passed between processes 381 | only when an explicit swap took place. 382 | So, for example, 383 | there was an 384 | iget 385 | routine that made a named i-node available, 386 | but it left the i-node in a constant, static location 387 | rather than returning a pointer into a large table 388 | of active i-nodes. 389 | A precursor of the current buffering mechanism was present 390 | (with about 4 buffers) 391 | but there was essentially no overlap of disk I/O with computation. 392 | This was avoided not merely for simplicity. 393 | The disk attached to the PDP-7 was fast for its time; 394 | it transferred one 18-bit word every 2 microseconds. 395 | On the other hand, the PDP-7 itself had a memory cycle time 396 | of 1 microsecond, 397 | and most instructions took 2 cycles (one for the instruction itself, 398 | one for the operand). 399 | However, indirectly addressed instructions required 3 cycles, 400 | and indirection was quite common, because the machine had no index registers. 401 | Finally, the DMA controller was unable to access memory during an instruction. 402 | The upshot was that the disk would incur overrun errors if any 403 | indirectly-addressed instructions 404 | were executed while it was transferring. 405 | Thus control could not be returned to the user, 406 | nor in fact could general system code be executed, 407 | with the disk running. 408 | The interrupt routines for the clock and terminals, 409 | which needed to be runnable at all times, 410 | had to be coded in very strange fashion to avoid indirection. 411 |
412 |415 | By `process control,' I mean 416 | the mechanisms by which processes are created and used; 417 | today the system calls 418 | fork, 419 | exec, 420 | wait, 421 | and 422 | exit 423 | implement these mechanisms. 424 | Unlike the file system, which existed in nearly its 425 | present form from the earliest days, the process control 426 | scheme underwent considerable mutation after PDP-7 427 | Unix was already in use. 428 | (The introduction of path names in the PDP-11 system 429 | was certainly a considerable notational advance, 430 | but not a change in fundamental structure.) 431 |
432 |433 | Today, the way in which commands are executed by the shell can 434 | be summarized as follows: 435 |
436 |457 | Processes (independently executing entities) 458 | existed very early in PDP-7 Unix. 459 | There were in fact precisely two of them, 460 | one for each of the two terminals attached to the machine. 461 | There was no 462 | fork, 463 | wait, 464 | or 465 | exec. 466 | There was an 467 | exit, 468 | but its meaning was rather different, as will be seen. 469 | The main loop of the shell went as follows. 470 |
471 |497 | The most interesting thing about this primitive implementation 498 | is the degree to which it anticipated themes 499 | developed more fully later. 500 | True, it could support neither background processes 501 | nor shell command files (let alone pipes and filters); 502 | but IO redirection (via `<' and `>') was soon there; 503 | it is discussed below. 504 | The implementation of redirection was quite straightforward; 505 | in step 3) above the shell just replaced its standard input 506 | or output with the appropriate file. 507 | Crucial to subsequent development 508 | was the implementation of the shell as a user-level 509 | program stored in a file, 510 | rather than a part of the operating system. 511 |
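The redirection idea in the paragraph above can be tried on a current system. The following is a minimal sketch in Python, not the PDP-7 shell's code: it swaps file descriptor 1 for a file, runs a stand-in "command", and then restores the descriptor. The names `with_stdout_redirected` and `hello` are hypothetical and exist only for this illustration.

```python
import os
import sys
import tempfile


def with_stdout_redirected(path, action):
    """Run action() with file descriptor 1 pointing at path, then restore it."""
    sys.stdout.flush()      # keep earlier buffered text out of the target file
    saved = os.dup(1)       # remember the current standard output
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.dup2(fd, 1)      # descriptor 1 now refers to the file
        action()            # the "command" simply writes to descriptor 1
    finally:
        os.dup2(saved, 1)   # put the original standard output back
        os.close(saved)
        os.close(fd)


def hello():
    # Hypothetical stand-in for a command; it knows nothing about the file.
    os.write(1, b"hello\n")


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "out.txt")
    with_stdout_redirected(target, hello)
    with open(target, "rb") as f:
        print(f.read())
```

The point of the sketch is that the command never learns where its output goes; only the code that sets up descriptor 1 does, which is why redirection could live entirely in a user-level shell.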
512 |513 | The structure of this process control scheme, 514 | with one process per terminal, 515 | is similar to that of many interactive systems, 516 | for example CTSS, Multics, Honeywell TSS, and IBM TSS and TSO. 517 | In general such systems require special mechanisms 518 | to implement useful facilities such as detached computations 519 | and command files; 520 | Unix at that stage didn't bother to supply the special mechanisms. 521 | It also exhibited some irritating, idiosyncratic problems. 522 | For example, a newly recreated shell had to close all its open files 523 | both to get rid of any open files 524 | left by the command just executed and to rescind previous IO 525 | redirection. 526 | Then it had to reopen the special file corresponding to 527 | its terminal, in order to read a new command line. 528 | There was no 529 | /dev 530 | directory (because no path names); 531 | moreover, the shell could retain no memory 532 | across commands, because it was reexecuted afresh 533 | after each command. 534 | Thus a further file system convention was required: 535 | each directory had to contain an entry 536 | tty 537 | for a special file that referred to the terminal 538 | of the process that opened it. 539 | If by accident one changed into some directory that lacked this 540 | entry, the shell would loop hopelessly; 541 | about the only remedy was to reboot. 542 | (Sometimes the missing link could be made from the other terminal.) 543 |
544 |545 | Process control in its modern form was designed and implemented 546 | within a couple of days. 547 | It is astonishing how easily it fitted into the existing system; 548 | at the same time it is easy to see how some of the slightly 549 | unusual features of the design are present precisely because 550 | they represented small, easily-coded changes to what existed. 551 | A good example is the separation of the 552 | fork 553 | and 554 | exec 555 | functions. 556 | The most common model for the creation of new 557 | processes involves specifying a program for the process 558 | to execute; 559 | in Unix, 560 | a forked process continues to run the same program 561 | as its parent until it performs an explicit 562 | exec. 563 | The separation of the functions is certainly not unique to 564 | Unix, 565 | and in fact it was present in the Berkeley time-sharing 566 | system [2], 567 | which was well-known to Thompson. 568 | Still, it seems reasonable to suppose that it exists in Unix 569 | mainly because of the ease with which 570 | fork 571 | could be implemented without changing much else. 572 | The system already handled multiple 573 | (i.e. two) processes; 574 | there was a process table, and the processes were swapped between 575 | main memory and the disk. 576 | The initial implementation of 577 | fork 578 | required only 579 |
580 |590 | In fact, the PDP-7's 591 | fork 592 | call required precisely 27 lines of assembly code. 593 | Of course, other changes in the operating system 594 | and user programs were required, and some of them were 595 | rather interesting and unexpected. 596 | But a combined 597 | fork-exec 598 | would have been considerably more complicated, if only because 599 | exec 600 | as such did not exist; 601 | its function was already performed, 602 | using explicit IO, by the shell. 603 |
604 |605 | The 606 | exit 607 | system call, which previously read in a new copy of the shell 608 | (actually a sort of automatic 609 | exec 610 | but without arguments), 611 | simplified considerably; 612 | in the new version a process only had to clean out its 613 | process table entry, 614 | and give up control. 615 |
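The separation of `fork` and `exec` discussed above can be sketched with the same calls as they exist today. This is an illustrative Python fragment under the assumption of a POSIX system, not a reconstruction of the PDP-7 code; `run_command` is a hypothetical name, and the exit status 127 for a failed `exec` is a convention chosen for the sketch.

```python
import os
import shlex
import sys


def run_command(line):
    """Execute one command line the way a minimal shell might."""
    argv = shlex.split(line)
    pid = os.fork()                   # the child keeps running this same program...
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # ...until it explicitly execs the command
        finally:
            os._exit(127)             # reached only if exec failed
    _, status = os.waitpid(pid, 0)    # the parent waits for the command to exit
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    cmd = shlex.quote(sys.executable) + " -c " + shlex.quote("print('hi')")
    print("exit status:", run_command(cmd))
```

Because the child runs ordinary code between `fork` and `exec`, anything the shell wants to arrange for the command (such as redirection) can be done there without special kernel support.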
616 |617 | Curiously, 618 | the primitives that became 619 | wait 620 | were considerably more general 621 | than the present scheme. 622 | A pair of primitives sent one-word messages between named processes: 623 |
624 | smes(pid, message)
625 | (pid, message) = rmes()
626 |
641 | The message facility was used as follows: 642 | the parent shell, 643 | after creating a process to execute a command, 644 | sent a message to the new process by 645 | smes; 646 | when the command terminated 647 | (assuming it did not try to read any messages) 648 | the shell's blocked 649 | smes 650 | call returned an error indication that the target 651 | process did not exist. 652 | Thus the shell's 653 | smes 654 | became, in effect, 655 | the equivalent of 656 | wait. 657 |
658 |659 | A different protocol, 660 | which took advantage of more of the generality offered by messages, 661 | was used between the initialization program and the shells 662 | for each terminal. 663 | The initialization process, 664 | whose ID was understood to be 1, 665 | created a shell for each of the terminals, 666 | and then issued 667 | rmes; 668 | each shell, when it read the end of its input file, 669 | used 670 | smes 671 | to send a conventional `I am terminating' 672 | message to the initialization process, 673 | which recreated a new shell process 674 | for that terminal. 675 |
676 |677 | I can recall no other use of messages. 678 | This explains why the facility 679 | was replaced by the 680 | wait 681 | call of the present system, which is less general, 682 | but more directly applicable 683 | to the desired purpose. 684 | Possibly relevant also is the evident bug in the mechanism: 685 | if a command process attempted to use messages to 686 | communicate with other processes, 687 | it would disrupt the shell's synchronization. 688 | The shell depended on sending a message that 689 | was never received; 690 | if a command executed 691 | rmes, 692 | it would receive the shell's phony message, 693 | and cause the shell to read another input line just as if 694 | the command had terminated. 695 | If a need for general 696 | messages had manifested itself, 697 | the bug would have been repaired. 698 |
699 |700 | At any rate, the new process control scheme 701 | instantly rendered some very valuable features 702 | trivial to implement; 703 | for example detached processes (with `&') 704 | and recursive use of the shell as a command. 705 | Most systems have to supply some 706 | sort of special `batch job submission' facility 707 | and 708 | a special command interpreter for files distinct 709 | from the one used interactively. 710 |
711 |712 | Although the multiple-process idea slipped in very easily indeed, 713 | there were some aftereffects that weren't anticipated. 714 | The most memorable of these became evident 715 | soon after the new system 716 | came up and apparently worked. 717 | In the midst of our jubilation, it was discovered 718 | that the 719 | chdir 720 | (change current directory) 721 | command had stopped working. 722 | There was much reading of code and anxious introspection about 723 | how the addition of 724 | fork 725 | could have broken the 726 | chdir 727 | call. 728 | Finally the truth dawned: 729 | in the old system 730 | chdir 731 | was an ordinary command; 732 | it adjusted the current directory of the (unique) 733 | process attached to the terminal. 734 | Under the new system, the 735 | chdir 736 | command correctly changed the current directory of the process 737 | created to execute it, 738 | but this process promptly terminated 739 | and had no effect whatsoever on its parent shell! 740 | It was necessary to make 741 | chdir 742 | a special command, executed internally within the shell. 743 | It turns out that several command-like functions have the same 744 | property, 745 | for example 746 | login. 747 |
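The chdir surprise can still be reproduced on a modern Unix-like system. The following is a minimal Python sketch, not code from the system described here; the helper name `chdir_in_child` is hypothetical, and it assumes `os.fork` is available. It forks a child that changes its own current directory and exits, then reports the parent's directory before and after.

```python
import os
import tempfile


def chdir_in_child(target):
    """Fork a child that changes directory to `target`, then exits.

    Returns (parent_cwd_before, parent_cwd_after, child_exit_code).
    Hypothetical helper for illustration; assumes a Unix-like system.
    """
    before = os.getcwd()
    pid = os.fork()
    if pid == 0:
        # Child: the directory change affects only this process.
        try:
            os.chdir(target)
            os._exit(0)
        except OSError:
            os._exit(1)
    # Parent: wait for the child, much as a shell waits for a command.
    _, status = os.waitpid(pid, 0)
    after = os.getcwd()
    return before, after, os.WEXITSTATUS(status)
```

Calling `chdir_in_child(tempfile.gettempdir())` should return the same directory twice: the child's change dies with the child, which is why a shell has to execute chdir internally.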
748 |749 | Another mismatch between the system as it had been 750 | and the new process control scheme took longer to become 751 | evident. 752 | Originally, the read/write pointer associated with 753 | each open file was stored within the process that opened 754 | the file. 755 | (This pointer indicates where in the file the next 756 | read or write will take place.) 757 | The problem with this organization became evident only 758 | when we tried to use command files. 759 | Suppose a simple command file contains 760 |
761 | ls 762 | who 763 |
766 | sh comfile >output 767 |
795 | Solution of this problem required creation of a new 796 | system table to contain the IO pointers 797 | of open files independently of the process in which they 798 | were opened. 799 |
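The effect of keeping IO pointers outside the process can be observed on a modern Unix-like system. The following is a minimal Python sketch under that assumption, not a reconstruction of the original table; the function name `shared_offset_demo` and the strings written are hypothetical. A parent opens a file and forks; the child writes through the inherited descriptor, and the parent's later write lands after the child's output instead of overwriting it.

```python
import os
import tempfile


def shared_offset_demo():
    """Return the bytes left in a file written by a child, then its parent.

    Hypothetical helper for illustration; assumes os.fork is available.
    """
    fd, path = tempfile.mkstemp()
    try:
        pid = os.fork()
        if pid == 0:
            # Child: writes at the current offset of the shared open file.
            os.write(fd, b"first\n")
            os._exit(0)
        os.waitpid(pid, 0)
        # Parent: the offset was advanced by the child's write, because
        # the offset is not stored privately in either process.
        os.write(fd, b"second\n")
        os.close(fd)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

With a pointer private to each process, as in the earlier design, the parent's write would have started at the beginning of the file and clobbered the child's output.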
800 |803 | The very convenient notation for IO redirection, using the `>' and `<' 804 | characters, 805 | was not present from the very beginning of the PDP-7 Unix system, 806 | but it did appear quite early. 807 | Like much else in Unix, 808 | it was inspired by an idea from Multics. 809 | Multics has a rather general IO redirection mechanism [3] 810 | embodying named IO streams 811 | that can be dynamically redirected 812 | to various devices, files, and even through special 813 | stream-processing modules. 814 | Even in the version of Multics we were familiar with a decade ago, 815 | there existed a command that switched subsequent output 816 | normally destined for the terminal to a file, and another command 817 | to reattach output to the terminal. 818 | Where under Unix one might say 819 |
820 | ls >xx 821 |
826 | iocall attach user_output file xx 827 | list 828 | iocall attach user_output syn user_i/o 829 |
860 | By the beginning of 1970, 861 | PDP-7 Unix was a going concern. 862 | Primitive by today's standards, 863 | it was still capable of providing 864 | a more congenial programming environment than its alternatives. 865 | Nevertheless, it was clear that the PDP-7, a machine we didn't even own, 866 | was already obsolete, 867 | and its successors in the same line offered little of interest. 868 | In early 1970 we proposed 869 | acquisition of a PDP-11, which had just been introduced by 870 | Digital. 871 | In some sense, this proposal was merely the latest 872 | in the series of attempts that had been made throughout the preceding year. 873 | It differed in two important ways. 874 | First, the amount of money (about $65,000) 875 | was an order of magnitude less than what we had previously asked; 876 | second, 877 | the charter sought was not merely to 878 | write some (unspecified) operating system, 879 | but instead to 880 | create a system specifically designed for editing and formatting 881 | text, 882 | what might today be called a `word-processing system.' 883 | The impetus for the proposal came mainly from J. F. Ossanna, 884 | who was then and until the end of his life interested 885 | in text processing. 886 | If our early proposals were too vague, 887 | this one was perhaps too specific; at first it too 888 | met with disfavor. 889 | Before long, however, 890 | funds were obtained through the efforts of L. E. McMahon 891 | and an order for a PDP-11 was placed in May. 892 |
893 |894 | The processor arrived at the end of the summer, but 895 | the PDP-11 was so new a product that no disk was available until 896 | December. 897 | In the meantime, a rudimentary, core-only version of Unix was written 898 | using a cross-assembler on the PDP-7. 899 | Most of the time, 900 | the machine sat in a corner, enumerating all the closed Knight's tours 901 | on a 6×8 chess board, a three-month job. 902 |
903 |906 | Once the disk arrived, 907 | the system was quickly completed. 908 | In internal structure, the first version of Unix for the PDP-11 represented a relatively 909 | minor advance over the PDP-7 system; 910 | writing it was largely a matter of transliteration. 911 | For example, 912 | there was no multi-programming; only one user program 913 | was present in core at any moment. 914 | On the other hand, 915 | there were important changes in the interface to the user: 916 | the present directory structure, 917 | with full path names, 918 | was in place, 919 | along with the modern form of 920 | exec 921 | and 922 | wait, 923 | and conveniences like 924 | character-erase and line-kill 925 | processing for terminals. 926 | Perhaps the most interesting thing about the 927 | enterprise was its small size: 928 | there were 24K bytes of core memory 929 | (16K for the system, 8K for user programs), 930 | and a disk with 1K blocks (512K bytes). 931 | Files were limited to 64K bytes. 932 |
933 |934 | At the time of the placement of the order for the PDP-11, 935 | it had seemed natural, or perhaps expedient, 936 | to promise a system dedicated to word processing. 937 | During the protracted arrival of the hardware, 938 | the increasing usefulness of PDP-7 Unix 939 | made it appropriate to justify creating PDP-11 Unix 940 | as a development tool, to be used in writing the 941 | more special-purpose system. 942 | By the spring of 1971, 943 | it was generally agreed that 944 | no one had the slightest interest in scrapping Unix. 945 | Therefore, we transliterated the 946 | roff 947 | text formatter 948 | into PDP-11 assembler language, 949 | starting from the PDP-7 version that 950 | had been transliterated 951 | from McIlroy's BCPL version on Multics, 952 | which had in turn been inspired 953 | by J. Saltzer's 954 | runoff 955 | program on CTSS. 956 | In early summer, editor and formatter in hand, 957 | we felt prepared to fulfill our charter by offering 958 | to supply a text-processing service to the 959 | Patent department for preparing patent applications. 960 | At the time, they were evaluating a commercial system 961 | for this purpose; the main advantages 962 | we offered 963 | (besides the dubious one of taking part in 964 | an in-house experiment) 965 | were two in number: 966 | first, 967 | we supported Teletype's model 37 terminals, 968 | which, with an extended type-box, 969 | could print most of the math symbols 970 | they required; 971 | second, we quickly endowed 972 | roff 973 | with the ability to produce line-numbered pages, 974 | which the Patent Office required and which the other 975 | system could not handle. 976 |
977 |978 | During the last half of 1971, we supported three typists from the Patent 979 | department, who spent the day busily typing, editing, and formatting 980 | patent applications, and meanwhile tried to carry on our own work. 981 | Unix has a reputation for supplying interesting services on modest hardware, 982 | and this period may mark a high point in the benefit/equipment ratio; 983 | on a machine with no memory protection and a single .5 MB disk, 984 | every test of a new program required care and boldness, because it could 985 | easily crash the system, and every few hours' work by the typists 986 | meant pushing out more information onto DECtape, because of the 987 | very small disk. 988 |
989 |990 | The experiment was trying but successful. 991 | Not only did the Patent department adopt Unix, 992 | and thus become the first of many groups at the Laboratories 993 | to ratify our work, 994 | but we achieved sufficient credibility to convince our own management 995 | to acquire 996 | one of the first PDP 11/45 systems made. 997 | We have accumulated much hardware since then, 998 | and labored continuously on the software, 999 | but because most of the interesting work has already been published, 1000 | (e.g. on the system itself [1, 5, 6, 7, 8, 9]) it seems unnecessary to repeat it here. 1001 |
1002 | 1003 |1006 | One of the most widely admired contributions of Unix 1007 | to the culture of operating systems and command languages 1008 | is the 1009 | pipe, 1010 | as used in a pipeline of commands. 1011 | Of course, the fundamental idea was by no means new; 1012 | the pipeline is merely a specific form of coroutine. 1013 | Even the implementation was not unprecedented, 1014 | although we didn't know it at the time; 1015 | the `communication files' of the Dartmouth 1016 | Time-Sharing System [10] 1017 | did very nearly what Unix pipes do, 1018 | though they seem not to have been exploited so fully. 1019 |
1020 |1021 | Pipes appeared in Unix in 1972, 1022 | well after the PDP-11 version of the system was in operation, 1023 | at the suggestion (or perhaps insistence) of M. D. McIlroy, 1024 | a long-time advocate of the non-hierarchical control flow 1025 | that characterizes coroutines. 1026 | Some years before pipes were implemented, he suggested 1027 | that commands should be thought of as binary operators, 1028 | whose left and right operand specified the input and output files. 1029 | Thus a `copy' utility would be commanded by 1030 |
1031 | inputfile copy outputfile 1032 |
1039 | input sort paginate offprint 1040 |
1043 | sort input | pr | opr 1044 |
1060 | Some time later, thanks to McIlroy's persistence, 1061 | pipes were finally installed in the operating system 1062 | (a relatively simple job), 1063 | and a new notation was introduced. 1064 | It used the same characters as for I/O redirection. 1065 | For example, the pipeline above might have been written 1066 |
1067 | sort input >pr>opr> 1068 |
1085 | The new facility was enthusiastically received, and 1086 | the term `filter' was soon coined. 1087 | Many commands were changed to make them usable in pipelines. 1088 | For example, no one had imagined that anyone would want the 1089 | sort 1090 | or 1091 | pr 1092 | utility to sort or print its standard input if given no explicit arguments. 1093 |
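The convention described above, that a command processes its standard input when given no explicit arguments, can be sketched as follows. This is an illustrative Python sketch, not the code of the actual sort utility; the function name `sort_lines` and its parameters are hypothetical.

```python
import sys


def sort_lines(filenames, stdin=None):
    """Return sorted lines from the named files, or from stdin if none.

    Hypothetical helper for illustration: `filenames` is a list of paths
    and `stdin` is any iterable of lines (defaults to sys.stdin).
    """
    lines = []
    if filenames:
        # Explicit arguments: read each named file in turn.
        for name in filenames:
            with open(name) as f:
                lines.extend(f.readlines())
    else:
        # No arguments: behave as a filter and read standard input.
        source = sys.stdin if stdin is None else stdin
        lines.extend(source)
    return sorted(lines)
```

A command-line wrapper would pass `sys.argv[1:]` as `filenames` and write the result to standard output, so the same program works both on named files and in the middle of a pipeline.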
1094 |1095 | Soon some problems with the notation became evident. 1096 | Most annoying was a silly lexical problem: 1097 | the string after `>' was delimited by blanks, so, 1098 | to give a parameter to 1099 | pr 1100 | in the example, one had to quote: 1101 |
1102 | sort input >"pr -2">opr> 1103 |
1110 | opr <pr<"sort input"< 1111 |
1114 | pr <"sort input"< >opr> 1115 |
1134 | I mentioned above in the section on IO redirection that Multics 1135 | provided a mechanism by which IO streams could be directed 1136 | through processing modules on the way to (or from) the device 1137 | or file serving as source or sink. 1138 | Thus it might seem that stream-splicing in Multics 1139 | was the direct precursor of Unix pipes, as Multics 1140 | IO redirection certainly was for its Unix version. 1141 | In fact I do not think this is true, or is true only in a weak sense. 1142 | Not only were coroutines well-known already, 1143 | but their embodiment as Multics spliceable IO modules 1144 | required that the modules be specially coded in such a way 1145 | that they could be used for no other purpose. 1146 | The genius of the Unix pipeline is precisely that it 1147 | is constructed from the very same commands used constantly 1148 | in simplex fashion. 1149 | The mental leap needed to see this possibility 1150 | and to invent the notation is large indeed. 1151 |
1152 |1155 | Every program for the original PDP-7 Unix system was written in 1156 | assembly language, and bare assembly language it was; for example, 1157 | there were no macros. 1158 | Moreover, there was no loader or link-editor, so every program had to be complete in itself. 1159 | The first interesting language to appear was a version 1160 | of McClure's TMG [11] 1161 | that was implemented by McIlroy. 1162 | Soon after TMG became available, 1163 | Thompson decided 1164 | that we could not pretend to offer a real computing service 1165 | without Fortran, 1166 | so he sat down to write a Fortran in TMG. 1167 | As I recall, 1168 | the intent to handle Fortran lasted about a week. 1169 | What he produced instead was a definition of and a compiler for 1170 | the new language B [12]. 1171 | B was much influenced by the BCPL language [13]; 1172 | other influences were Thompson's taste for spartan syntax, 1173 | and the very small space into which the compiler had to fit. 1174 | The compiler produced simple interpretive code; 1175 | although it and the programs it produced were rather slow, 1176 | it made life much more pleasant. 1177 | Once interfaces to the regular system calls were made available, 1178 | we began once again to enjoy the benefits of using a reasonable 1179 | language to write what are usually called 1180 | `systems programs:' 1181 | compilers, assemblers, and the like. 1182 | (Although some might consider the PL/I we used under 1183 | Multics unreasonable, 1184 | it was much better than assembly language.) 1185 | Among other programs, the PDP-7 B cross-compiler for the PDP-11 1186 | was written in B, and in the course of time, 1187 | the B compiler for the PDP-7 itself was transliterated 1188 | from TMG into B. 1189 |
1190 |1191 | When the PDP-11 arrived, 1192 | B was moved to it almost immediately. 1193 | In fact, a version of the multi-precision `desk calculator' 1194 | program 1195 | dc 1196 | was one of the earliest programs to run on the PDP-11, 1197 | well before the disk arrived. 1198 | However, B did not take over instantly. 1199 | Only passing thought was given to rewriting the operating system 1200 | in B rather than assembler, 1201 | and the same was true of most of the utilities. 1202 | Even the assembler was rewritten in assembler. 1203 | This approach was taken mainly because of the slowness of the interpretive 1204 | code. 1205 | Of smaller but still real importance was the mismatch 1206 | of the word-oriented B language with the byte-addressed 1207 | PDP-11. 1208 |
1209 |1210 | Thus, in 1971, work began on what was to become the C language [14]. 1211 | The story of the language developments from BCPL 1212 | through B to C is told elsewhere [15], 1213 | and need not be repeated here. 1214 | Perhaps the most important watershed occurred during 1973, 1215 | when the operating system kernel was rewritten in C. 1216 | It was at this point that the system assumed its modern form; 1217 | the most far-reaching change was the introduction of 1218 | multi-programming. 1219 | There were few externally-visible changes, but the internal structure of the 1220 | system became much more rational and general. 1221 | The success of this effort convinced us that C was useful 1222 | as a nearly universal tool for systems programming, 1223 | instead of just a toy for simple applications. 1224 |
1225 |1226 | Today, the only important Unix program still written in assembler 1227 | is the assembler itself; 1228 | virtually all the utility programs are in C, 1229 | and so are most of the applications programs, although there are 1230 | sites with many in Fortran, Pascal, and Algol 68 as well. 1231 | It seems certain that much of the success of Unix follows 1232 | from the readability, modifiability, and portability 1233 | of its software that in turn follows 1234 | from its expression in high-level languages. 1235 |
1236 |1239 | One of the comforting things about old memories is their tendency 1240 | to take on a rosy glow. 1241 | The programming environment provided by the early versions of Unix seems, 1242 | when described here, to be extremely harsh and primitive. 1243 | I am sure that if forced back to the PDP-7 I would find it intolerably limiting and 1244 | lacking in conveniences. 1245 | Nevertheless, it did not seem so at the time; 1246 | the memory fixes on what was good and what lasted, and on the joy of helping 1247 | to create the improvements that made life better. 1248 | In ten years, I hope we can look back with the same mixed impression 1249 | of progress combined with continuity. 1250 |
1251 |1254 | I am grateful to S. P. Morgan, K. Thompson, and M. D. McIlroy 1255 | for providing early documents and digging up recollections. 1256 |
1257 |1258 | Because I am most interested in describing the evolution 1259 | of ideas, this paper attributes ideas and work to individuals only where 1260 | it seems most important. 1261 | The reader will not, on the average, 1262 | go far wrong if he reads each occurrence of `we' 1263 | with unclear antecedent 1264 | as `Thompson, with some assistance from me.' 1265 |
1266 |