├── img
│   ├── forest.jpg
│   ├── lua-logo.pdf
│   ├── theend.jpg
│   ├── theend2.jpg
│   ├── theend3.jpg
│   ├── nanovg-demo.png
│   ├── nanovg-noise.png
│   ├── trello-board.png
│   ├── lua-logo-nolabel.pdf
│   ├── lua-logo-nolabel.ps
│   └── lua-logo.svg
├── fonts
│   ├── Symbola.ttf
│   ├── Andada-Bold.ttf
│   ├── Andada-Italic.ttf
│   ├── AndadaSC-Bold.ttf
│   ├── Raleway-Black.ttf
│   ├── Raleway-Bold.ttf
│   ├── Raleway-Light.ttf
│   ├── Raleway-Thin.ttf
│   ├── iosevka-bold.ttf
│   ├── Andada-Regular.ttf
│   ├── AndadaSC-Italic.ttf
│   ├── AndadaSC-Regular.ttf
│   ├── Raleway-Medium.ttf
│   ├── Raleway-Regular.ttf
│   ├── Raleway-SemiBold.ttf
│   ├── iosevka-italic.ttf
│   ├── iosevka-regular.ttf
│   ├── Andada-BoldItalic.ttf
│   ├── Raleway-ExtraBold.ttf
│   ├── Raleway-ExtraLight.ttf
│   ├── iosevka-bolditalic.ttf
│   ├── AndadaSC-BoldItalic.ttf
│   ├── InputMonoNarrow-Light.ttf
│   ├── Raleway-Black-Italic.ttf
│   ├── Raleway-Bold-Italic.ttf
│   ├── Raleway-Light-Italic.ttf
│   ├── Raleway-Medium-Italic.ttf
│   ├── Raleway-Thin-Italic.ttf
│   ├── InputMonoNarrow-Italic.ttf
│   ├── InputMonoNarrow-Regular.ttf
│   ├── Raleway-Regular-Italic.ttf
│   ├── Raleway-SemiBold-Italic.ttf
│   ├── Raleway-ExtraBold-Italic.ttf
│   ├── Raleway-ExtraLight-Italic.ttf
│   └── InputMonoNarrow-LightItalic.ttf
├── prebuilt
│   ├── eris-report.pdf
│   ├── lua-eol-report.pdf
│   ├── eris-report-diff-20150629.pdf
│   ├── lua-eol-report-diff-20150817.pdf
│   ├── lua-eol-report-diff-20150823.pdf
│   └── lua-eol-report-diff-20150831.pdf
├── Makefile
├── hyphenation.tex
├── .gitignore
├── summary.tex
├── appendix-installation.tex
├── bibliography.tex
├── introduction.tex
├── glossary.tex
├── lua-eol-report.tex
├── conclusions.tex
├── slides.tex
├── design.tex
└── implementation.tex
/img/forest.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/forest.jpg
--------------------------------------------------------------------------------
/img/lua-logo.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/lua-logo.pdf
--------------------------------------------------------------------------------
/img/theend.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/theend.jpg
--------------------------------------------------------------------------------
/img/theend2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/theend2.jpg
--------------------------------------------------------------------------------
/img/theend3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/theend3.jpg
--------------------------------------------------------------------------------
/fonts/Symbola.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Symbola.ttf
--------------------------------------------------------------------------------
/fonts/Andada-Bold.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Andada-Bold.ttf
--------------------------------------------------------------------------------
/img/nanovg-demo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/nanovg-demo.png
--------------------------------------------------------------------------------
/img/nanovg-noise.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/nanovg-noise.png
--------------------------------------------------------------------------------
/img/trello-board.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/trello-board.png
--------------------------------------------------------------------------------
/fonts/Andada-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Andada-Italic.ttf
--------------------------------------------------------------------------------
/fonts/AndadaSC-Bold.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/AndadaSC-Bold.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Black.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Black.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Bold.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Bold.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Light.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Light.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Thin.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Thin.ttf
--------------------------------------------------------------------------------
/fonts/iosevka-bold.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/iosevka-bold.ttf
--------------------------------------------------------------------------------
/fonts/Andada-Regular.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Andada-Regular.ttf
--------------------------------------------------------------------------------
/fonts/AndadaSC-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/AndadaSC-Italic.ttf
--------------------------------------------------------------------------------
/fonts/AndadaSC-Regular.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/AndadaSC-Regular.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Medium.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Medium.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Regular.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Regular.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-SemiBold.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-SemiBold.ttf
--------------------------------------------------------------------------------
/fonts/iosevka-italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/iosevka-italic.ttf
--------------------------------------------------------------------------------
/fonts/iosevka-regular.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/iosevka-regular.ttf
--------------------------------------------------------------------------------
/img/lua-logo-nolabel.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/lua-logo-nolabel.pdf
--------------------------------------------------------------------------------
/prebuilt/eris-report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/eris-report.pdf
--------------------------------------------------------------------------------
/fonts/Andada-BoldItalic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Andada-BoldItalic.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-ExtraBold.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-ExtraBold.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-ExtraLight.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-ExtraLight.ttf
--------------------------------------------------------------------------------
/fonts/iosevka-bolditalic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/iosevka-bolditalic.ttf
--------------------------------------------------------------------------------
/prebuilt/lua-eol-report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/lua-eol-report.pdf
--------------------------------------------------------------------------------
/fonts/AndadaSC-BoldItalic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/AndadaSC-BoldItalic.ttf
--------------------------------------------------------------------------------
/fonts/InputMonoNarrow-Light.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/InputMonoNarrow-Light.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Black-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Black-Italic.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Bold-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Bold-Italic.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Light-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Light-Italic.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Medium-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Medium-Italic.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Thin-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Thin-Italic.ttf
--------------------------------------------------------------------------------
/fonts/InputMonoNarrow-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/InputMonoNarrow-Italic.ttf
--------------------------------------------------------------------------------
/fonts/InputMonoNarrow-Regular.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/InputMonoNarrow-Regular.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-Regular-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Regular-Italic.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-SemiBold-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-SemiBold-Italic.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-ExtraBold-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-ExtraBold-Italic.ttf
--------------------------------------------------------------------------------
/fonts/Raleway-ExtraLight-Italic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-ExtraLight-Italic.ttf
--------------------------------------------------------------------------------
/fonts/InputMonoNarrow-LightItalic.ttf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/InputMonoNarrow-LightItalic.ttf
--------------------------------------------------------------------------------
/prebuilt/eris-report-diff-20150629.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/eris-report-diff-20150629.pdf
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | #
2 | # Makefile
3 | # Adrian Perez, 2015-06-30 05:38
4 | #
5 |
6 | all:
7 | ninja
8 |
9 |
10 | # vim:ft=make
11 | #
12 |
--------------------------------------------------------------------------------
/prebuilt/lua-eol-report-diff-20150817.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/lua-eol-report-diff-20150817.pdf
--------------------------------------------------------------------------------
/prebuilt/lua-eol-report-diff-20150823.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/lua-eol-report-diff-20150823.pdf
--------------------------------------------------------------------------------
/prebuilt/lua-eol-report-diff-20150831.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/lua-eol-report-diff-20150831.pdf
--------------------------------------------------------------------------------
/hyphenation.tex:
--------------------------------------------------------------------------------
1 | % vim:ft=tex:
2 | %
3 | \hyphenation{
4 | com-pati-ble
5 | array
6 | re-fer-ence
7 | func-ti-ons
8 | DynASM
9 | }
10 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | *.lo[gft]
2 | *.mtc*
3 | *.aux
4 | *.toc
5 | *.out
6 | .*.sw[op]
7 | .ninja_log
8 | .ninja_deps
9 | slides.pdf
10 | lua-eol-report.pdf
11 | *.ac[rn]
12 | _minted-lua-eol-report/
13 | _minted-slides/
14 | *.nav
15 | *.snm
16 | *.vrb
17 | *.maf
18 | *.ist
19 | *.auxlock
20 | *.glsdefs
21 | *.alg
22 | *.md5
23 | *.xdy
24 | *.gl[gos]
25 | *.lol
26 | *.tdo
27 |
--------------------------------------------------------------------------------
/summary.tex:
--------------------------------------------------------------------------------
1 | % vim: ft=tex ts=2 sw=2 spell spelllang=en
2 | \chapter*{Summary}
3 |
4 | The objective of this project is to implement an automated mechanism that,
5 | using the DWARF debugging information from ELF shared objects, allows the Lua
6 | virtual machine to call native functions from shared objects implemented in
7 | the C programming language. The process is automatic, in the sense that the
8 | user does not need to write code to convert values passed between Lua and the
9 | invoked C functions, and the C functions behave essentially like Lua functions
10 | from the user's point of view. The ultimate goal is to allow transparent usage of
11 | existing C libraries from Lua.
12 |
13 | Lua has been chosen because it provides a clean C interface to its \gls{VM},
14 | which has been designed from the ground up to be embedded in larger projects.
15 | The implementation is also compact (under 16,000 lines of code), which makes
16 | it feasible to gain in-depth knowledge of its inner workings in a relatively
17 | short time. Lua has also grown in popularity in recent years as its adoption
18 | has skyrocketed in the game industry.
19 |
20 | The reason to focus on the combination of debugging information in DWARF
21 | format contained in ELF shared objects is that they are a widespread, standard
22 | configuration used by the majority of contemporary Unix-like operating
23 | systems. The target system during development has been a GNU/Linux system
24 | running on the Intel x86\_64 architecture, which also uses the aforementioned
25 | configuration, though provisions are to be included in the design to ease
26 | future porting efforts for other platforms.
27 |
28 | In order to validate the correctness of the implementation, an automated test
29 | suite was also developed. Unit tests were also used as regression tests, to
30 | ensure that modifications to the system did not introduce programming errors
31 | in the implementation.
32 |
--------------------------------------------------------------------------------
/img/lua-logo-nolabel.ps:
--------------------------------------------------------------------------------
1 | %!PS-Adobe-2.0 EPSF-2.0
2 | %%Title: Lua logo
3 | %%Creator: lua@tecgraf.puc-rio.br
4 | %%CreationDate: Wed Nov 29 19:02:41 EDT 2000
5 | %%BoundingBox: -45 0 1035 1080
6 | %%Pages: 1
7 | %%EndComments
8 | %%EndProlog
9 |
10 | %------------------------------------------------------------------------------
11 | %
12 | % Graphic design by Alexandre Nakonechnyj.
13 | % PostScript programming by the Lua team.
14 | % This code is hereby placed in the public domain.
15 | %
16 | % Permission is hereby granted, without written agreement and without license
17 | % or royalty fees, to use, copy, and distribute this logo for any purpose,
18 | % including commercial applications, subject to the following conditions:
19 | %
20 | % * The origin of this logo must not be misrepresented; you must not
21 | % claim that you drew the original logo. We recommend that you give credit
22 | % to the graphics designer in all printed matter that includes the logo.
23 | %
24 | % * The only modification you can make is to adapt the orbiting text to
25 | % your product name.
26 | %
27 | % * The logo can be used in any scale as long as the relative proportions
28 | % of its elements are maintained.
29 | %
30 | %------------------------------------------------------------------------------
31 |
32 | /PLANETCOLOR {0 0 0.5 setrgbcolor} bind def
33 | /HOLECOLOR {1.0 setgray} bind def
34 | /ORBITCOLOR {0.5 setgray} bind def
35 | /LOGOFONT {/Helvetica 0.90} def
36 | /LABELFONT {/Helvetica 0.36} def
37 |
38 | %------------------------------------------------------------------------------
39 |
40 | /MOONCOLOR {PLANETCOLOR} bind def
41 | /LOGOCOLOR {HOLECOLOR} bind def
42 | /LABELCOLOR {ORBITCOLOR} bind def
43 |
44 | /LABELANGLE 125 def
45 | /LOGO (Lua) def
46 |
47 | /DASHANGLE 10 def
48 | /HALFDASHANGLE DASHANGLE 2 div def
49 |
50 | % moon radius. planet radius is 1.
51 | /r 1 2 sqrt 2 div sub def
52 |
53 | /D {0 360 arc fill} bind def
54 | /F {exch findfont exch scalefont setfont} bind def
55 |
56 | % place it nicely on the paper
57 | /RESOLUTION 1024 def
58 | RESOLUTION 2 div dup translate
59 | RESOLUTION 2 div 2 sqrt div dup scale
60 |
61 | %-------------------------------------------------------------------- planet --
62 | PLANETCOLOR
63 | 0 0 1 D
64 |
65 | %---------------------------------------------------------------------- hole --
66 | HOLECOLOR
67 | 1 2 r mul sub dup r D
68 |
69 | %---------------------------------------------------------------------- moon --
70 | MOONCOLOR
71 | 1 1 r D
72 |
73 | %---------------------------------------------------------------------- logo --
74 | LOGOCOLOR
75 | LOGOFONT
76 | F
77 | LOGO stringwidth pop 2 div neg
78 | -0.5 moveto
79 | LOGO show
80 |
81 | %--------------------------------------------------------------------- orbit --
82 | ORBITCOLOR
83 | 0.03 setlinewidth
84 | [1 r add 3.1415926535 180 div HALFDASHANGLE mul mul] 0 setdash
85 | newpath
86 | 0 0
87 | 1 r add
88 | 3 copy
89 | 27 57
90 | arcn
91 | stroke
92 |
93 | %------------------------------------------------------------------ copyright --
94 | /COPYRIGHT
95 | (Graphic design by A. Nakonechnyj. Copyright (c) 1998, All rights reserved.)
96 | def
97 |
98 | LABELCOLOR
99 | LOGOFONT
100 | 32 div
101 | F
102 | 2 sqrt 0.99 mul
103 | dup
104 | neg
105 | moveto
106 | COPYRIGHT
107 | 90 rotate
108 | %show
109 |
110 | %---------------------------------------------------------------------- done --
111 | showpage
112 |
113 | %%Trailer
114 | %%EOF
115 |
--------------------------------------------------------------------------------
/appendix-installation.tex:
--------------------------------------------------------------------------------
1 | % vim: set ft=tex spelllang=en ts=2 sw=2 et
2 |
3 | \chapter{Installation}
4 |
5 | \section{Prerequisites}
6 | \label{sec:eol-prereqs}
7 |
8 | Instead of providing its own implementation for certain functionality, \Eol*
9 | uses existing, proven software components.
10 |
11 | \begin{table}[h]
12 | \centering
13 | \begin{tabular}{lrccc}
14 | \toprule
15 | Component & Version & Required & Optional & Bundled \\
16 | \midrule
17 | Lua & 5.3 & \Tick & & \Tick \\
18 | LuaBitOp & 1.0.2 & & \Tick & \Tick \\
19 | \verb|libdwarf| & 20150507 & \Tick & & \Tick \\
20 | \verb|libelf| & 0.152 & \Tick & & \\
21 | \verb|readline| & 5.0 & & \Tick & \\
22 | \verb|libffi| & 3.1 & & \Tick & \\
23 | \bottomrule
24 | \end{tabular}
25 | \caption{Dependencies}
26 | \label{tab:eol-dependencies}
27 | \end{table}
28 |
29 | \autoref{tab:eol-dependencies} shows the dependencies expected to be
30 | installed in the system. The items marked (\inlinesymbol\Tick) as
31 | \emph{bundled} are not included in the source repository, but the build system
32 | includes support for downloading tarballs with the source code and doing
33 | a local build. When enabled, bundled dependencies will be automatically
34 | downloaded, built, and used instead of the versions provided by the
35 | system. When the bundled \verb|libdwarf| is used, it will be statically
36 | linked. See~\autoref{sec:running-configure} for instructions to enable
37 | bundled libraries. This is particularly useful for systems which do not
38 | provide Lua 5.3 packages (as is the case for Debian and Ubuntu at the time
39 | of writing).
40 |
41 | \begin{table}
42 | \begin{tabular}{ccp{0.4\textwidth}}
43 | \toprule
44 | Distribution & Installation Command & Packages \\
45 | \midrule
46 | Debian, Ubuntu &
47 | \verb|apt-get install| &
48 | \verb|libdwarf-dev| \verb|ninja-build| \\
49 | Arch Linux & \verb|pacman -S| & \verb|libdwarf| \verb|ninja| \verb|lua| \\
50 | \bottomrule
51 | \end{tabular}
52 | \caption{Dependency packages in popular GNU/Linux distributions.}
53 | \label{tab:distro-dependency-packages}
54 | \end{table}
55 |
56 | \autoref{tab:distro-dependency-packages} shows required packages as provided
57 | by popular GNU/Linux distributions. Some versions of Debian (and derivatives
58 | like Ubuntu) include only a static version of \verb|libdwarf| in the packages,
59 | most likely not built as \gls{PIC}, which is a requirement.
60 |
61 |
62 | \section{Building}
63 |
64 | The build process follows the convention pioneered by GNU
65 | Autotools~\cite{autotools-history}, in which an autoconfiguration script
66 | (\verb|configure|) is run first to inspect the system, determine which
67 | optional components are to be enabled at build time, and generate the needed
68 | build files. In short, building \Eol* is done by executing the following
69 | commands from the top-level source directory:
70 |
71 | \begin{minted}{sh}
72 | ./configure
73 | make
74 | \end{minted}
75 |
76 | or, using Ninja~\cite{ninja-manual}:
77 |
78 | \begin{minted}{sh}
79 | ./configure
80 | ninja
81 | \end{minted}
82 |
83 |
84 | \subsection{Autoconfiguration}
85 | \label{sec:running-configure}
86 |
87 | The \verb|configure| script accepts a number of command line parameters, which
88 | determine how the system is built. In most cases the script will figure out
89 | automatically whether the required prerequisites (\autoref{sec:eol-prereqs})
90 | are available, and whether the bundled versions should be used. Passing
91 | parameters to the script is useful in case the detection fails, or to force
92 | certain build options. The following are the parameters most commonly used
93 | with the \verb|configure| script:
94 |
95 | \begin{description}
96 |
97 | \item [\texttt{--enable-bundled-libdwarf}] \hfill\\
98 | Uses the bundled \verb|libdwarf| instead of trying to use the one
99 | provided by the system.
100 |
101 | \item [\texttt{--enable-bundled-lua}] \hfill\\
102 | Uses the bundled Lua distribution instead of trying to use the one
103 | provided by the system.
104 |
105 | \item [\texttt{--enable-ffi}] \hfill\\
106 | Always use \verb|libffi| to perform function calls, and do not build
107 | support for their JIT compilation.
108 |
109 | \item [\texttt{--jit-arch=ARCH}] \hfill\\
110 | Skip detection of the operating system and processor, and use JIT
111 | compilation for the supplied architecture (\verb|ARCH|). It is
112 | possible to obtain a list of the supported architectures by running
113 | \texttt{./configure --jit-arch=help}.
114 |
115 | \end{description}
116 |
117 | In order to obtain a complete list of all the command line options that the
118 | script accepts, use:
119 |
120 | \begin{minted}{sh}
121 | ./configure --help
122 | \end{minted}
123 |
124 |
125 | \section{Testing the Build}
126 |
127 | Once \Eol* has been built, it is recommended to run the test suite to
128 | ensure that the binaries work as expected. The test suite is included in the
129 | source tree, and it does not require any additional dependencies. In order
130 | to run the test suite, use the \verb|tools/run-tests| script from the
131 | top-level directory of the source tree:
132 |
133 | \begin{minted}{sh}
134 | ./tools/run-tests
135 | \end{minted}
136 |
137 | \beforeintro
138 |
--------------------------------------------------------------------------------
/bibliography.tex:
--------------------------------------------------------------------------------
1 | % vim: ft=tex ts=2 sw=2 foldlevel=2
2 |
3 | \begin{thebibliography}{99}
4 |
5 | \bibitem{elfspec-sysv}
6 | \emph{Chapter 4: Object Files}, in
7 | \emph{System V Application Binary Interface Edition 4.1} (pages 44-72) \\
8 | The Santa Cruz Operation, AT\&T, The 88open Consortium \\
9 | \url{http://www.sco.com/developers/devspecs/gabi41.pdf} \\
10 | Accessed: May 3rd, 2015.
11 |
12 | % \bibitem{tis-elf}
13 | % \emph{Tool Interface Standard (TIS) Executable and Linking Format (ELF)
14 | % Specification, version 1.2} \\
15 | % Tool Interface Standard Committee \\
16 | % Edited May 1995. \\
17 | % \url{http://refspecs.linuxbase.org/elf/elf.pdf} \\
18 | % Accessed: May 4th, 2015.
19 |
20 | \bibitem{dwarfspecv4}
21 | \emph{DWARF Debugging Information Format Version 4} \\
22 | DWARF Debugging Information Format Committee \\
23 | \url{http://dwarfstd.org/doc/DWARF4.pdf} \\
24 | Accessed: May 5th, 2015.
25 |
26 | \bibitem{debugdwarf}
27 | \emph{Introduction to the DWARF Debugging Format} \\
28 | Michael J. Eager \\
29 | \url{http://dwarfstd.org/doc/Debugging\%20using\%20DWARF-2012.pdf} \\
30 | Accessed: May 5th, 2015.
31 |
32 | \bibitem{howdebugworks}
33 | \emph{Part 3 - Debugging Information}, in \emph{How Debuggers Work} \\
34 | Eli Bendersky \\
35 | \url{http://eli.thegreenplace.net/2011/02/07/how-debuggers-work-part-3-debugging-information/} \\
36 | Accessed: May 5th, 2015.
37 |
38 | \bibitem{tratt-dynamic-langs}
39 | \emph{Dynamically Typed Languages},
40 | in \emph{Advances in Computers} (volume 77, pages 149-184) \\
41 | Laurence Tratt \\
42 | Edited by Marvin V. Zelkowitz (July 2009) \\
43 | \url{http://tratt.net/laurie/research/pubs/html/tratt__dynamically_typed_languages/} \\
44 | Accessed: August 22nd, 2015.
45 |
46 | \bibitem{lua-pil}
47 | \emph{Programming in Lua} \\
48 | Roberto Ierusalimschy \\
49 | Lua.org, Third Edition (January 3rd, 2013), ISBN 859037985X.
50 |
51 | \bibitem{lua-pil-online}
52 | \emph{Programming in Lua} \\
53 | Roberto Ierusalimschy \\
54 | Lua.org, First Edition (December 2003), ISBN 8590379817. \\
55 | \url{http://www.lua.org/pil/contents.html}
56 |
57 | \bibitem{lua-about}
58 | \emph{About}, in \emph{Lua website} \\
59 | \url{http://www.lua.org/about.html} \\
60 | Accessed: April 20th, 2015.
61 |
62 | \bibitem{lua-manual}
63 | \emph{Lua 5.3 Reference Manual} \\
64 | Roberto Ierusalimschy, Luiz Henrique de Figueiredo,
65 | Waldemar Celes \\
66 | \url{http://www.lua.org/manual/5.3/manual.html} \\
67 | Accessed: April 29th, 2015.
68 |
69 | \bibitem{lua50-impl}
70 | \emph{The Implementation of Lua 5.0} \\
71 | Roberto Ierusalimschy, Luiz Henrique de Figueiredo,
72 | Waldemar Celes \\
73 | Journal of Universal Computer Science 11 \#7 (2005) \\
74 | \url{http://www.jucs.org/jucs_11_7/the_implementation_of_lua}
75 |
76 | \bibitem{lj-ffi-api}
77 | \emph{ffi.* API Functions} \\
78 | Mike Pall \\
79 | \url{http://luajit.org/ext_ffi_api.html} \\
80 | Accessed: May 11th, 2015.
81 |
82 | \bibitem{luaffi}
83 | \emph{luaffi: Standalone FFI library for calling C functions from Lua} \\
84 | James McKaskill \\
85 | \url{https://github.com/jmckaskill/luaffi} \\
86 | Accessed: May 11th, 2015.
87 |
88 | \bibitem{lj-ffi-semantic}
89 | \emph{FFI Semantics} \\
90 | Mike Pall \\
91 | \url{http://luajit.org/ext_ffi_semantics.html} \\
92 | Accessed: May 11th, 2015.
93 |
94 | \bibitem{kb-sunit}
95 | \emph{Simple Smalltalk Testing}, in \emph{Kent Beck's Guide to Better
96 | Smalltalk} \\
97 | Kent Beck, Donald G. Firesmith \\
98 | Cambridge University Press (December 1998), ISBN 978-0-521-64437-2.
99 |
100 | \bibitem{opengroup-dlopen}
101 | \emph{dlopen - open a symbol table handle}, in \emph{The Open Group Base
102 | Specifications Issue 7, IEEE Standard 1003.1 2013 Edition} \\
103 | The Open Group, IEEE \\
104 | \url{http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlopen.html} \\
105 | Accessed: August 30th, 2015.
106 |
107 | \bibitem{scrumban-getting-started}
108 | \emph{Getting Started with Scrumban} \\
109 | \url{http://www.aboutscrumban.com/how-to-start-using-scrumban/} \\
110 | Accessed: May 5th, 2015.
111 |
112 | \bibitem{tap-spec}
113 | \emph{TAP Specification} \\
114 | Michael G. Schwern, Andy Lester \\
115 | \url{http://testanything.org/tap-specification.html} \\
116 | Accessed: May 4th, 2015.
117 |
118 | \bibitem{nanovg}
119 | \emph{NanoVG: Antialiased 2D vector drawing library on top of OpenGL for UI
120 | and visualizations} \\
121 | Various Authors \\
122 | \url{https://github.com/memononen/nanovg} \\
123 | Accessed: May 4th, 2015.
124 |
125 | \bibitem{libdwarf-doc}
126 | \emph{A Consumer Library Interface to DWARF} \\
127 | David Anderson \\
128 | \url{https://github.com/Distrotech/libdwarf/blob/distrotech-libdwarf/libdwarf/libdwarf2.1.pdf} \\
129 | Accessed: May 11th, 2015.
130 |
131 | \bibitem{libdwarfp-doc}
132 | \emph{A Producer Library Interface to DWARF} \\
133 | David Anderson \\
134 | \url{https://github.com/Distrotech/libdwarf/blob/distrotech-libdwarf/libdwarf/libdwarf2p.1.pdf} \\
135 | Accessed: May 11th, 2015.
136 |
137 | \bibitem{ninja-manual}
138 | \emph{Ninja documentation} \\
139 | Evan Martin \\
140 | \url{http://martine.github.io/ninja/manual.html} \\
141 | Accessed: May 11th, 2015.
142 |
143 | \bibitem{gnumake-manual}
144 | \emph{GNU Make: A Program for Directing Recompilation} \\
145 | Free Software Foundation, ISBN 1-882114-83-3. \\
146 | \url{https://www.gnu.org/software/make/manual/make.pdf} \\
147 | Accessed: May 13th, 2015.
148 |
149 | \bibitem{uthash-guide}
150 | \emph{uthash User Guide} \\
151 | Troy D. Hanson \\
152 | \url{https://troydhanson.github.io/uthash/userguide.html} \\
153 | Accessed: May 15th, 2015.
154 |
155 | \bibitem{swig3doc}
156 | \emph{SWIG 3.0 Documentation} \\
157 | \url{http://swig.org/Doc3.0/SWIGDocumentation.html} \\
158 | Accessed: June 25th, 2015.
159 |
160 | \bibitem{lusers-BindingCodeToLua}
161 | \emph{Binding Code To Lua}, in \emph{Lua-Users Wiki} \\
162 | \url{http://lua-users.org/wiki/BindingCodeToLua} \\
163 | Accessed: June 25th, 2015.
164 |
165 | \bibitem{js-raceforspeed}
166 | \emph{The JavaScript engine family tree},
167 | in \emph{The race for speed, part 1} \\
168 | John Dalziel, CreativeJS \\
169 | \url{http://creativejs.com/2013/06/the-race-for-speed-part-1-the-javascript-engine-family-tree/} \\
170 | Accessed: August 16th, 2015.
171 |
172 | \bibitem{gobject-introspection}
173 | \emph{GObject Introspection}, in \emph{GNOME Wiki} \\
174 | \url{https://wiki.gnome.org/Projects/GObjectIntrospection} \\
175 | Accessed: August 15th, 2015.
176 |
177 | \bibitem{unofficial-dasm-doc}
178 | \emph{The Unofficial DynASM Documentation} \\
179 | Peter Cawley \\
180 | \url{http://corsix.github.io/dynasm-doc/} \\
181 | Accessed: August 8th, 2015.
182 |
183 | \bibitem{lj-perf1}
184 | \emph{LuaJIT performance}, in \emph{lua-l mailing list, August 2009} \\
185 | Mike Pall \\
186 | \url{http://lua-users.org/lists/lua-l/2009-08/msg00151.html} \\
187 | Accessed: August 22nd, 2015.
188 |
189 | \bibitem{autotools-history}
190 | \emph{The First Configure Programs},
191 | in \emph{Autoconf, Automake, and Libtool} \\
192 | Gary V. Vaughan, Ben Elliston, Tom Tromey and Ian Lance Taylor \\
193 | New Riders Publishing (October 2000; updated February 2006). \\
194 | \url{https://www.sourceware.org/autobook/autobook/autobook_8.html} \\
195 | Accessed: August 27th, 2015.
196 |
197 | \bibitem{mit-license}
198 | \emph{The MIT License} \\
199 | \url{http://opensource.org/licenses/mit} \\
200 | Accessed: September 6th, 2015.
201 |
202 | \bibitem{bsd-licenses}
203 | \emph{BSD licenses}, in \emph{Wikipedia} \\
204 | Various authors. \\
205 | \url{https://en.wikipedia.org/wiki/BSD_licenses} \\
206 | Accessed: September 6th, 2015.
207 |
208 | \bibitem{eol-github}
209 | \emph{Eöl: Fully automatic Lua↔C bridge using DWARF debug information} \\
210 | Adrián Pérez de Castro \\
211 | \url{https://github.com/aperezdc/lua-eol/}
212 |
213 | \end{thebibliography}
214 |
215 |
--------------------------------------------------------------------------------
/img/lua-logo.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
156 |
--------------------------------------------------------------------------------
/introduction.tex:
--------------------------------------------------------------------------------
1 | % vim: ft=tex spell spelllang=en ts=2 sw=2
2 |
3 | \cleardoublepage
4 | \setchaptertoc
5 | \chapter{Introduction}
6 |
7 | This chapter explains the reasons which motivate the development of this
8 | project, and provides an outline of the goals and planning for its
9 | realization.
10 |
11 | \afterintro
12 |
13 | \section{Description \& Motivation}
14 |
15 | Most programming languages provide some mechanism to use libraries (sometimes
16 | called \emph{modules}) implemented in some other language. Most of the time,
17 | this other language belongs to the family of the C language, which can be
18 | compiled into \emph{native object code}. The reasons are twofold: on the one
19 | hand, it allows reusing functionality provided by the system that would
20 | otherwise not be available; on the other hand, it opens the door to
21 | implementing performance--critical pieces of a system using native code.
22 |
23 | Despite the advantages, using native code from a different host programming
24 | language requires creating a layer of software often called a \emph{bridge}
25 | (or \emph{binding}, the term used from now on), which wraps the native library
26 | to provide an interface compatible with the run-time environment of the dynamic
27 | programming language. Those bindings, created either manually or with the help
28 | of code generation tools, need to be compiled before they can be used.
29 |
30 | When building native code, compilers are capable of adding
31 | \emph{debugging information} to their output, which can be used to gain
32 | additional insight into a program using a \emph{symbolic debugger}. As
33 | a matter of fact, any other tool capable of understanding the format in which
34 | the compiler writes the debugging information can make use of it for its own
35 | purposes. Among many other details about the source program, debugging
36 | information includes descriptions of the functions compiled as part of each
37 | compilation unit, their parameters and corresponding data types, their return
38 | types, and the memory layout of the user-defined types involved, which is a superset
39 | of the information needed to invoke those functions. In other words, the
40 | debugging information contains all the details needed to make library bindings
41 | automatically, potentially allowing dynamic programming languages to invoke
42 | native code directly without any kind of human intervention.
43 |
44 | % The goal of this project is to implement such an automatic invocation method
45 | % for the Lua programming language, using the debugging information in \Dwarf*
46 | % format as generated by the compiler to allow calling into native code from
47 | % arbitrary libraries at run-time, without needing the presence of previously
48 | % created bindings.
49 |
50 |
51 | \section{Project Goals}
52 | \label{sec:project-goals}
53 |
54 | The main goal of this project is to develop an automatic binding system for the
55 | Lua programming language which allows seamless usage of libraries written in
56 | C at runtime. To achieve this, it will use the debugging information generated
57 | by the C compiler. Additionally:
58 |
59 | \begin{itemize}
60 |
61 | \item Modifications to the Lua virtual machine or its core libraries are to
62 | be avoided, if possible. The fewer the changes, the lower the maintenance
63 | cost of the system when Lua is updated. An implementation which does not
64 | modify Lua itself would be usable with Lua packages provided by the
65 | operating system, thus easing the setup process.
66 |
67 | \item The implementation will load \gls{ELF} shared objects into the Lua
68 | virtual machine, and use the debugging information in \gls{DWARF} format
69 | present in them.
70 |
71 | \item Values of C types, including user-defined ones, will be readable and
72 | modifiable from Lua. It will also be possible to create new values of
73 | C types from Lua.
74 |
75 | \item Invocation of functions from loaded shared objects will be supported
76 | for functions of arbitrary return types, and any number of parameters of any
77 | supported type. Lua values passed to functions will be automatically
78 | converted to C types whenever possible. Values of C types created from Lua
79 | will also be accepted as valid function parameters.
80 |
81 | \item The implementation will target the GNU/Linux operating system running
82 | on the x86\_64 architecture.
83 |
84 | \item The design of the system will be extensible, so that support can be
85 | added for more shared object formats, debugging information formats,
86 | operating systems, and architectures.
87 |
88 | \end{itemize}
89 |
90 |
91 | \section{Planning \& Methodologies}
92 | \label{sec:plan-method}
93 |
94 | During the planning phase, the following tasks and subtasks have been
95 | identified:
96 |
97 | \begin{enumerate}
98 | \item Initial study, including:
99 | \begin{enumerate}
100 | \item Understanding how different kinds of data are stored in \gls{ELF}
101 | object files.
102 | \item Identifying the parts of the \gls{DWARF} specification which apply
103 | to the scope of the project.
104 | \item Investigating existing tools which share similar goals.
105 | \end{enumerate}
106 |
107 | \item Analysis, including:
108 | \begin{enumerate}
109 | \item Understanding the relevant parts of the \gls{DWARF}
110 | debugging information format.
111 | \item Getting acquainted with Lua and the implementation
112 | of its \gls{VM}.
113 | \end{enumerate}
114 |
115 | \item Development, including:
116 | \begin{enumerate}
117 | \item Designing the automatic binding system.
118 | \item Implementing the automatic binding mechanism.
119 | \item Testing the system, including:
120 | \begin{itemize}
121 | \item Designing a set of unit and regression tests.
122 | \item Implementing unit and regression tests.
123 | \end{itemize}
124 | \end{enumerate}
125 |
126 | \item Validation, including:
127 | \begin{enumerate}
128 | \item Developing example Lua programs which demonstrate the
129 | capabilities of the system.
130 | \item Rewriting at least one previously existing program to
131 | validate usage of the system in a real--world scenario.
132 | \end{enumerate}
133 |
134 | \item Documentation, including writing of the final report.
135 | % \item Determine whether to use an existing JIT code generator or to
136 | % implement our own.
137 | % \item Design the JIT code generator.
138 | % \item Implement the JIT code generator.
139 | \end{enumerate}
140 |
141 | For each one of the top-level tasks in the list above,
142 | \autoref{tab:effort-estimate} provides an estimate of the time needed for
143 | its completion, assuming an effort of eight hours per person per day (8h/p/d).
144 | For the 115 days estimated, the cost of the project would be 59.800€, at
145 | a rate of 65€ per hour.
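
As a sanity check on the total shown in \autoref{tab:effort-estimate}, the
figures combine as follows, assuming that the same hourly rate applies to every
estimated day, so that each day costs $8 \times 65 = 520$€:

\[
  115\ \mbox{days} \times 8\ \mbox{h/day} \times 65\ \mbox{€/h} = 59.800\ \mbox{€}
\]

The same daily figure reproduces each row of the table; for example, the
50 days of development amount to $50 \times 520 = 26.000$€.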
146 |
147 | \begin{table}
148 | \centering
149 | \begin{tabular}{rlrr}
150 | \toprule
151 | \# & Task & Estimation (days) & Cost (€) \\
152 | \midrule
153 | 1. & Initial study & 10 & 5.200 \\
154 | 2. & Analysis & 15 & 7.800 \\
155 | 3. & Development & 50 & 26.000 \\
156 | 4. & Validation & 10 & 5.200 \\
157 | 5. & Documentation & 30 & 15.600 \\
158 | \midrule
159 | & \emph{Total} & 115& 59.800 \\
160 | \bottomrule
161 | \end{tabular}
162 | \caption{Effort estimation}
163 | \label{tab:effort-estimate}
164 | \end{table}
165 |
166 | Even though there is only one resource executing the tasks, some techniques
167 | from agile development methodologies are used. Namely:
168 |
169 | \begin{itemize}
170 | \item From Scrum, the concepts of \emph{iteration} and \emph{sprint}, with
171 | their respective planning and review sessions. Daily stand-up meetings are
172 | not used, and there is no \emph{scrum master}: neither would make sense
173 | given that there is only one person in the team.
174 | \item The \emph{Kanban} methodology is used in order to keep an always
175 | up-to-date dashboard with the status of the tasks.
176 | \end{itemize}
177 |
178 | \begin{figure}[htH]
179 | \centering
180 | \includegraphics[width=0.8\textwidth]{img/trello-board.png}
181 | \caption{Kanban board, showing some tasks of this very project}
182 | \label{fig:kanban-board}
183 | \end{figure}
184 |
185 | The Kanban method was invented by \gls{Toyota} to track the status of
186 | production lines. This methodology keeps a board (physical, in the original
187 | incarnation of the method; nowadays there are even web-based applications like
188 | the one shown in \autoref{fig:kanban-board}) where each element is a task, and
189 | elements are distributed in columns depending on their status. For example,
190 | applied to software development, the columns could be “Pending”, “In
191 | Progress”, “Testing”, and “Finished”. All the tasks are always visible on the
192 | board, which makes it possible to know the overall status of a project
193 | intuitively by glancing at it.
194 |
195 |
196 | \beforeintro
197 |
--------------------------------------------------------------------------------
/glossary.tex:
--------------------------------------------------------------------------------
1 | % vim: ft=tex spell spelllang=en
2 | %
3 | % Glossary/Acronyms
4 | %
5 |
6 | \newglossaryentry{LuaJIT}{
7 | name=LuaJIT,
8 | description={
9 | Just-In-Time compiler (\gls{JIT}) for the Lua programming language.
10 | It is a third-party, independent implementation of the Lua VM created
11 | and maintained by Mike Pall, who allegedly was born on planet Krypton.
12 | Available at \url{http://luajit.org}
13 | },
14 | }
15 |
16 | \newglossaryentry{transpiler}{
17 | name=transpiler,
18 | description={Type of compiler that takes source code of a programming
19 | language as its input, and produces a different source code as
20 | output, usually in a different programming language. Also known
21 | as \emph{source-to-source compiler}, or \emph{transcompiler}
22 | },
23 | }
24 |
25 | \newglossaryentry{lua-users-wiki}{
26 | name={Lua-Users Wiki},
27 | description={Community operated \emph{wiki} site which contains resources
28 | for Lua development, written by users of the programming language
29 | themselves. URL address: \url{http://lua-users.org/wiki}
30 | },
31 | }
32 |
33 | \newglossaryentry{pascal}{
34 | name={Pascal},
35 | description={
36 | Procedural programming language designed in the late 60s by Niklaus Wirth
37 | to encourage good programming practices
38 | },
39 | }
40 |
41 | \newglossaryentry{name-mangling}{
42 | name={name mangling},
43 | description={
44 | Technique used to generate unique names for programming entities, usually by
45 | encoding additional information about the entity in its name
46 | },
47 | }
48 |
49 | \newglossaryentry{object-oriented}{
50 | name={object oriented},
51 | description={
52 | Programming paradigm based on the concept of \emph{objects}, which are
53 | data structures that encapsulate both data and behavior
54 | },
55 | }
56 |
57 | \newglossaryentry{data-deduplication}{
58 | name={data deduplication},
59 | description={Any technique which eliminates duplicate copies of repeating
60 | data in order to improve storage utilization
61 | },
62 | }
63 |
64 | \newglossaryentry{flexible-array-member}{
65 | name={flexible array member},
66 | description={
67 | Feature introduced in the C99 standard of the C programming language
68 | which allows the last member of a \Mc:struct: to be an array of an
69 | unspecified dimension. Space needed by the array does not contribute
70 | to the size of the \Mc:struct: type, and must be manually accounted
71 | for when allocating the \Mc:struct: from the heap
72 | },
73 | }
74 |
75 | \newglossaryentry{emulation}{
76 | name={emulation},
77 | description={
78 | Use of a piece of hardware or software that enables one computer system
79 | (called the \emph{host}) to behave like another computer system (called
80 | the \emph{guest}), so that the host system can run software or use
81 | peripheral devices designed for the guest system
82 | },
83 | }
84 |
85 | \newglossaryentry{metaprogramming}{
86 | name={metaprogramming},
87 | description={
88 | Writing of computer programs which are able to read, generate, analyze
89 | or transform other programs, or even modify themselves while running
90 | },
91 | }
92 |
93 | \newglossaryentry{memoization}{
94 | name={memoization},
95 | description={
96 | Optimization technique which stores the results of expensive function
97 | calls, and returns the previously calculated value when the same inputs
98 | occur again
99 | },
100 | }
101 |
102 | \newglossaryentry{dynamic-programming}{
103 | name={dynamic programming},
104 | description={
105 | Problem solving method —and programming technique— which solves a
106 | complicated problem by breaking it up in smaller problems in a
107 | recursive manner
108 | }
109 | }
110 |
111 | \newglossaryentry{dynamic-dispatch}{
112 | name={dynamic dispatch},
113 | description={
114 | Process of selecting a concrete implementation of a polymorphic
115 | method (or function) at runtime. It is typically used in object
116 | oriented languages when different classes contain different
117 | implementations of the same method due to inheritance
118 | }
119 | }
120 |
121 | \newglossaryentry{fibonacci-number}{
122 | name={Fibonacci number},
123 | description={
124 | Number from the sequence $1, 1, 2, 3, 5, 8, 13, ...$, given by the
125 | recurrence relation $F_n = F_{n-1} + F_{n-2}$, with $F_1 = 1$, and
126 | $F_2 = 1$, as defined by the Italian mathematician Leonardo Fibonacci
127 | },
128 | }
129 |
130 | \newglossaryentry{first-class-value}{
131 | name={first--class value},
132 | description={
133 | In programming language design, an entity which supports all the
134 | operations generally available to other entities of a language,
135 | typically: being passed as a parameter, returned from a function,
136 | and assigned to a variable
137 | },
138 | }
139 |
140 | \newglossaryentry{closure}{
141 | name={closure},
142 | description={
143 | Technique for implementing lexically scoped name binding in languages
144 | with first--class functions
145 | },
146 | }
147 |
148 | \newglossaryentry{constructor}{
149 | name={constructor},
150 | description={
151 | Special type of subroutine in a program which is called to create an
152 | object and perform any initialization needed before the object can
153 | be used
154 | },
155 | }
156 |
157 | \newglossaryentry{refcounting}{
158 | name={reference counting},
159 | description={
160 | Technique of storing the number of references to an object, block of
161 | memory, disk space, or any other resource, which allows tracking
162 | whether the resource is in use by others
163 | },
164 | }
165 |
166 | \newglossaryentry{backronym}{
167 | name={backronym},
168 | description={
169 | A \emph{backward acronym} is an acronym constructed in reverse, by
170 | creating a new phrase to fit an existing word, name, or acronym
171 | },
172 | }
173 |
174 | \newglossaryentry{Toyota}{
175 | name={Toyota},
176 | description={Japanese car manufacturer},
177 | }
178 |
179 | \newglossaryentry{gls-ABI}{
180 | name={Application Binary Interface},
181 | description={
182 | Interface between two program modules, one of which is
183 | usually a library or the operating system, at the level of machine
184 | code. An ABI determines such details as how functions are called,
185 | and how parameters are passed to them
186 | },
187 | }
188 | \newacronym[see={[Glossary:]{gls-ABI}}]{ABI}{ABI}
189 | {Application Binary Interface\glsadd{gls-ABI}}
190 |
191 | \newglossaryentry{gls-TUE}{
192 | name={Type Unit Entry},
193 | description={A particular kind of DWARF DIE that contains information
194 | about a data type},
195 | }
196 | \newacronym[see={[Glossary:]{gls-TUE}}]{TUE}{TUE}
197 | {Type Unit Entry\glsadd{gls-TUE}}
198 |
199 | \newglossaryentry{gls-ISA}{
200 | name={Instruction Set Architecture},
201 | description={
202 | Part of the computer architecture related to programming, including
203 | the native data types, instructions, registers, addressing modes,
204 | memory architecture, interrupt and exception handling, and external
205 | I/O. An ISA includes a specification of the machine language, and
206 | the native commands implemented by a particular processor.
207 | },
208 | }
209 | \newacronym[see={[Glossary:]{gls-ISA}}]{ISA}{ISA}
210 | {Instruction Set Architecture\glsadd{gls-ISA}}
211 |
212 | \newglossaryentry{gls-GC}{
213 | name={Garbage Collection},
214 | description={
215 | Method of automatic memory management, in which a \emph{garbage
216 | collector} tries to reclaim “garbage” (memory occupied by data no
217 | longer in use by the program) with a certain periodicity
218 | },
219 | }
220 | \newacronym[see={[Glossary:]{gls-GC}}]{GC}{GC}
221 | {Garbage Collection\glsadd{gls-GC}}
222 |
223 | \newglossaryentry{gls-TAP}{
224 | name={Test Anything Protocol},
225 | description={Simple text-based protocol used by test programs to report their results to a test harness
226 | },
227 | }
228 | \newacronym[see={[Glossary:]{gls-TAP}}]{TAP}{TAP}
229 | {Test Anything Protocol\glsadd{gls-TAP}}
230 |
231 | \newacronym{ELF}{ELF}{Executable and Linkable Format}
232 | \newacronym{DWARF}{DWARF}{Debugging With Attributed Record Formats}
233 | \newacronym{JIT}{JIT}{Just-In-Time}
234 | \newacronym{FFI}{FFI}{Foreign Function Interface}
235 | \newacronym{DIE}{DIE}{Debugging Information Entry}
236 | \newacronym{CU}{CU}{Compilation Unit}
237 | \newacronym{FDL}{FDL}{Free Documentation License}
238 | \newacronym{API}{API}{Application Programming Interface}
239 | \newacronym{PIC}{PIC}{Position-Independent Code}
240 | \newacronym{VLA}{VLA}{Variable-Length Array}
241 | \newacronym{PUC-Rio}{PUC-Rio}{Pontifícia Universidade Católica do Rio de Janeiro}
242 | \newacronym{VM}{VM}{Virtual Machine}
243 | \newacronym{IRC}{IRC}{Internet Relay Chat}
244 | \newacronym{OSI}{OSI}{Open Source Initiative}
245 | \newacronym{PNG}{PNG}{Portable Network Graphics}
246 | \newacronym{TDD}{TDD}{Test-Driven Development}
247 |
248 | %
249 | % Those are just simple command-abbreviations to format pieces of text which
250 | % should be always displayed with the same formatting. Using a macro ensures
251 | % that, and makes it easier to come back here and change the formatting for
252 | % all occurrences, if needed.
253 | %
254 | \def\Eol*{\textsc{\sffamily Eöl}}
255 |
--------------------------------------------------------------------------------
/lua-eol-report.tex:
--------------------------------------------------------------------------------
1 | % vim:ft=tex:
2 | %
3 | \documentclass[a4paper,
4 | fontsize=12pt,final,
5 | titlepage=firstiscover,
6 | chapterprefix=true,
7 | appendixprefix=true,
8 | headings=big,
9 | headsepline,
10 | toc=bibliographynumbered,
11 | twoside]{scrbook}
12 |
13 | \usepackage[english]{babel}
14 | \usepackage{hyphenat}
15 | \usepackage{varioref}
16 |
17 | % Graphics support
18 | \usepackage{xcolor}
19 | \usepackage{dirtree}
20 | \usepackage{calc}
21 | \usepackage{tikz}
22 | \usetikzlibrary{arrows,shapes,fit,shadows,positioning,chains,%
23 | decorations.pathreplacing,decorations.pathmorphing,calc,%
24 | matrix}
25 | \usepackage{pgfplots}
26 |
27 | \definecolor{grey}{rgb}{0.5, 0.5, 0.5}
28 | \definecolor{lightgrey}{rgb}{0.95, 0.95, 0.95}
29 | \definecolor{grassgreen}{rgb}{0.1, 0.85, 0.2}
30 | \definecolor{fadedbrown}{rgb}{0.85, 0.1, 0.1}
31 | \definecolor{darkmagenta}{rgb}{0.65, 0.0, 0.65}
32 | \definecolor{lightblue}{rgb}{0.5, 0.75, 1.0}
33 | \definecolor{linkboxcolor}{rgb}{0.8, 0.8, 0.85}
34 |
35 | % \def\checkmark{\tikz\fill (0,.35) -- (.25,0) -- (1,.7) -- (.25,.15) -- cycle;}
36 | \def\checkmark{\tikz\fill
37 | (0, 1) -- (1, 0) -- (2.5, 1.5) -- (1, 0.5) -- cycle;}
38 |
39 | \def\foldersymbol{\tikz\fill[scale=0.25]
40 | (0, 0) -- (1.7, 0) -- (1.7, 1) -- (1, 1) --
41 | (0.85, 1.25) -- (0.15, 1.25) -- (0, 1) -- cycle;}
42 | \newcommand\inlinesymbol[1]{\resizebox{\widthof{#1}*\ratio{\widthof{x}}{\widthof{\normalsize x}}}{!}{#1}}
43 |
44 | \newcommand\DtFolder[1]{{\inlinesymbol{\color{grey}\foldersymbol}} #1}
45 | % \newcommand\Tick{\inlinesymbol\checkmark}
46 | \newcommand\Tick{ $\star$ }
47 |
48 | % Make a TOC in the generated PDF
49 | \usepackage[pdfstartview=FitH,
50 | linkbordercolor=linkboxcolor,
51 | urlbordercolor=linkboxcolor,
52 | linkcolor={blue!80},
53 | citecolor={blue!80},
54 | urlcolor={blue!80},
55 | colorlinks=true,
56 | hidelinks=true,
57 | unicode=true,
58 | linktoc=all]{hyperref}
59 | \providecommand*{\listingautorefname}{Listing}
60 | \providecommand*{\sectionname}{Section}
61 | \providecommand*{\subsectionautorefname}{Subsection}
62 | \usepackage[open]{bookmark}
63 | \bookmarksetup{color=blue}
64 |
65 | \usepackage{placeins}
66 | \usepackage[nohints,tight]{minitoc}
67 | \mtcsetrules{minitoc}{off}
68 | \setlength{\mtcindent}{1ex}
69 | \renewcommand{\mtifont}{\sf\bf\normalsize}
70 | % \renewcommand{\mtcfont}{\footnotesize\bf}
71 | % \renewcommand{\mtcSfont}{\footnotesize\rm}
72 | % \renewcommand{\mtcSSfont}{\footnotesize\rm}
73 |
74 | \newcommand{\setchaptertoc}{%
75 | \setchapterpreamble{\minitoc}}
76 |
77 | \newcommand\beforeintro{%
78 | \begin{center}%
79 | \Large\Symbol{🙠}%
80 | \end{center}%
81 | \FloatBarrier%
82 | }
83 | \newcommand\afterintro{%
84 | \begin{center}%
85 | \Large\Symbol{🙣}%
86 | \end{center}%
87 | }
88 |
89 |
90 | \usepackage[acronym,xindy,toc]{glossaries}
91 | \makeglossaries
92 | \glossarystyle{altlistgroup}
93 | \input{glossary}
94 |
95 | % Fonts
96 | \usepackage[OT1]{fontenc}
97 | \usepackage{fontspec}
98 | \defaultfontfeatures{Ligatures=TeX}
99 |
100 | \setmainfont{Andada}[
101 | Path = fonts/,
102 | Extension = .ttf,
103 | UprightFont = *-Regular,
104 | ItalicFont = *-Italic,
105 | BoldFont = *-Bold,
106 | BoldItalicFont = *-BoldItalic,
107 | SmallCapsFont = *SC-Regular,
108 | ]
109 |
110 | \newfontfamily\RlwLight{Raleway}[
111 | Path = fonts/,
112 | Extension = .ttf,
113 | UprightFont = *-ExtraLight,
114 | ItalicFont = *-ExtraLight-Italic,
115 | BoldFont = *-Light,
116 | BoldItalicFont = *-Light-Italic,
117 | ]
118 |
119 | \setsansfont{Raleway}[
120 | Path = fonts/,
121 | Extension = .ttf,
122 | UprightFont = *-Regular,
123 | ItalicFont = *-Regular-Italic,
124 | BoldFont = *-Bold,
125 | BoldItalicFont = *-Bold-Italic,
126 | ]
127 |
128 | \setmonofont{InputMonoNarrow}[
129 | Scale = MatchLowercase,
130 | Path = fonts/,
131 | Extension = .ttf,
132 | UprightFont = *-Light,
133 | ItalicFont = *-LightItalic,
134 | BoldFont = *-Regular,
135 | BoldItalicFont = *-Italic,
136 | ]
137 |
138 | \newfontfamily\SymbolaFont{Symbola}[
139 | Path = fonts/,
140 | Extension = .ttf,
141 | ]
142 | \newcommand\Symbol[1]{{\SymbolaFont#1}}
143 |
144 |
145 | \usepackage{setspace}
146 | \onehalfspacing
147 | % \doublespacing
148 | \parskip=6pt
149 | \parindent=10pt
150 |
151 | \usepackage{scrlayer-scrpage}
152 | % Header:
153 | % Inner: section numbering and title
154 | % Outer: chapter number
155 | \ohead{Chapter \thechapter}
156 | \chead{}
157 | \ihead{\rightmark}
158 | % Footer:
159 | % Center: page number
160 | \cfoot{\thepage}
161 | \ifoot{}
162 | \ofoot{}
163 |
164 | \setkomafont{chapterprefix}{\RlwLight\Large}
165 | \setkomafont{chapter}{\RlwLight\bfseries\Huge}
166 |
167 | \usepackage{booktabs}
168 |
169 | % Pretty code listings
170 | \usepackage{scrhack} % Needed to use Minted w/KOMA-Script
171 | \usepackage{minted}
172 | \setminted{
173 | autogobble = true,
174 | breaklines = true,
175 | codetagify = true,
176 | encoding = utf-8,
177 | outencoding = utf-8,
178 | frame = leftline,
179 | framerule = 5pt,
180 | framesep = 0.65em,
181 | xleftmargin = 1em,
182 | xrightmargin = 1em,
183 | rulecolor = \color{lightgrey},
184 | }
185 | \newmintinline[Mc]{c}{}
186 | \newminted{c}{
187 | fontsize = \small,
188 | baselinestretch = 1.0,
189 | }
190 | \newmintinline[Mlua]{lua}{}
191 | \newminted{lua}{
192 | fontsize = \small,
193 | baselinestretch = 1.0,
194 | }
195 |
196 |
197 | \usepackage[shadow,obeyFinal,textsize=footnotesize]{todonotes}
198 |
199 | \include{hyphenation}
200 |
201 |
202 | \newcommand\PfcTitle[0]{%
203 | Automatic bridging of native code to Lua
204 | using existing debugging information\relax}
205 | \newcommand\PfcAuthor[0]{%
206 | Adrián Pérez de Castro\relax}
207 | \newcommand\PfcDirector[0]{%
208 | Laura Milagros Castro Souto\relax}
209 |
210 |
211 | \title{\PfcTitle}
212 | \author{\PfcAuthor}
213 | \hypersetup{%
214 | pdftitle={\PfcTitle},%
215 | pdfauthor={\PfcAuthor},%
216 | pdfkeywords={ (╯°□°)╯︵ ┻━┻), ┬─┬ノ( º _ ºノ)},%
217 | }
218 |
219 | \setlength{\parskip}{2ex plus 1ex minus 1ex}
220 |
221 | \begin{document}
222 | \dominitoc
223 | \pagestyle{empty}
224 |
225 | \begin{titlepage}
226 | \begin{center}
227 | \vspace{7cm}
228 | % Logo
229 | \begin{tikzpicture}[y=0.80pt, x=0.8pt,yscale=-1, inner sep=0pt, outer sep=0pt, scale=0.2]
230 | \path[fill=magenta,nonzero rule] (220.7188,106.4062) -- (382.3633,33.4609) ..
231 | controls (341.6836,12.8594) and (284.3281,-0.0039) .. (220.7227,-0.0039) ..
232 | controls (157.1016,-0.0039) and (99.7461,12.8594) .. (59.0742,33.4609) --
233 | (220.7188,106.4062);
234 | \path[fill=magenta,nonzero rule] (440.9648,89.9531) .. controls
235 | (436.4648,76.9375) and (427.1289,64.7109) .. (413.8828,53.7188) --
236 | (233.1914,105.5312) -- (440.9648,89.9531);
237 | \path[fill=magenta,nonzero rule] (414.9375,161.0898) .. controls
238 | (428.0547,149.9570) and (437.2305,137.6055) .. (441.4414,124.4531) --
239 | (232.9805,109.8984) -- (414.9375,161.0898);
240 | \path[fill=magenta,nonzero rule] (220.7305,109.2188) -- (57.9609,181.6680) ..
241 | controls (98.6992,202.6055) and (156.5195,215.6992) .. (220.7227,215.6992) ..
242 | controls (284.9102,215.6992) and (342.7344,202.6055) .. (383.4805,181.6680) --
243 | (220.7305,109.2188);
244 | \path[fill=magenta,nonzero rule] (0.0000,124.4531) .. controls (4.2109,137.6055)
245 | and (13.3867,149.9570) .. (26.4961,161.0898) -- (208.4492,109.8984) --
246 | (0.0000,124.4531);
247 | \path[fill=magenta,nonzero rule] (208.2422,105.5312) -- (27.5625,53.7109) ..
248 | controls (14.3164,64.7070) and (4.9766,76.9336) .. (0.4727,89.9531) --
249 | (208.2422,105.5312);
250 | \end{tikzpicture}
251 |
252 | {\Large\textbf{Facultade de Informática \\
253 | Universidade da Coruña}} \\
254 | {\large\textit{Departamento de Computación}}
255 | \vspace{1cm}
256 |
257 | {\large\textsc{Proyecto de Fin de Carrera \\
258 | Ingeniería Informática}}
259 | \vspace{1cm}
260 |
261 | {\Large\textbf{\PfcTitle}}
262 | \end{center}
263 |
264 | \vfill
265 |
266 | \begin{flushright}
267 | \begin{tabular}{ll}
268 | {\large\textbf{Student:}} & {\large\PfcAuthor} \\
269 | {\large\textbf{Director:}} & {\large\PfcDirector} \\
270 | {\large\textbf{Date:}} & {\large\today}
271 | \end{tabular}
272 | \end{flushright}
273 | \end{titlepage}
274 |
275 | \frontmatter
276 |
277 | \clearpage
278 | \listoftodos
279 |
280 | % Dedication
281 | \cleardoublepage
282 | \begin{minipage}[t][6cm][l]{\textwidth}
283 | \vspace{10cm}
284 | \begin{flushright}
285 | \textit{Do it, or don't, but don't try.}
286 | \end{flushright}
287 | \end{minipage}
288 |
289 | % Acknowledgements
290 | \cleardoublepage
291 | \chapter*{Acknowledgements}
292 |
293 |
294 | \begin{minipage}{0.6\textwidth}
295 | \begin{raggedleft} \itshape
296 |
297 | To my wife, who unconditionally supported me during the long hours I have
298 | devoted to this project.
299 | % , and helped to proof-read the final iterations of the
300 | % present document.
301 |
302 | \vspace{2cm}
303 |
304 | To my parents, who did not think that I would ever get this piece of work
305 | done.
306 |
307 | \vspace{2cm}
308 |
309 | Also, I would like to thank my Finnish “adoptive” family, who have kindly
310 | accepted me as one of their own, and who have been an invaluable support.
311 | Their appreciation of knowledge is something I am willing to pass down to
312 | upcoming generations.
313 |
314 | \end{raggedleft}
315 | \end{minipage}
316 |
317 |
318 | % Summary
319 | \cleardoublepage
320 | \include{summary}
321 |
322 | % Keywords
323 | \cleardoublepage
324 | \chapter*{Keywords}
325 | \begin{itemize}
326 | \item Automatic binding generation.
327 | \item ELF.
328 | \item DWARF.
329 | \item Debugging information.
330 | \item Lua programming language.
331 | \item Virtual machines.
332 | \item FFI.
333 | % \item JIT code generation.
334 | \end{itemize}
335 |
336 | % Indexes
337 | \cleardoublepage
338 | \tableofcontents
339 | \listoffigures
340 | \listoftables
341 | \listoflistings
342 |
343 | \mainmatter
344 | \pagestyle{scrheadings}
345 | \include{introduction}
346 | \include{contextualization}
347 | \include{design}
348 | \include{implementation}
349 | \include{conclusions}
350 |
351 | \backmatter
352 |
353 | \include{appendix-installation}
354 |
355 | \cleardoublepage
356 | \printglossaries
357 |
358 | \include{bibliography}
359 |
360 | \end{document}
361 |
--------------------------------------------------------------------------------
/conclusions.tex:
--------------------------------------------------------------------------------
1 | % vim: set ft=tex foldlevel=2 spelllang=en spell:
2 | \cleardoublepage
3 | \setchaptertoc
4 | % TODO: Shouldn't this be "Final remarks" or so?
5 | \chapter{Final Remarks}
6 |
7 | This last chapter provides a \emph{post-factum} evaluation of the development
8 | process of the project: it describes the level of completion achieved,
9 | whether the planned schedule was followed, and the future of \Eol*.
10 | \afterintro
11 |
12 | \section{Achieved Goals}
13 |
14 | We consider that the main goals set at the beginning of this project
15 | (\autoref{sec:project-goals}) have been fulfilled. In particular:
16 |
17 | \begin{itemize}
18 |
19 | \item Development of an automatic binding system for the Lua programming
20 | language, allowing seamless usage of C libraries.
21 |
22 | \item Use of the DWARF debugging information contained in ELF shared object
23 | files to pinpoint the details about invoked functions, and involved data
24 | types.
25 |
26 | \item Conversion of data values transparently between Lua and C.
27 |
28 | \item Reference implementation for GNU/Linux running on the
29 | x86\_64 architecture.
30 |
31 | \item Full compliance with the Lua C API, avoiding modifications of the
32 | internals of the Lua VM.
33 |
34 | \end{itemize}
35 |
36 | The aforementioned achievements take the shape of the \Eol* system, developed
37 | in the C programming language. \Eol* is a Lua module which implements an FFI
38 | for Lua using the DWARF debugging information to gain knowledge about the
39 | types and functions available in ELF shared object files.
40 |
41 | The source code has been publicly available~\cite{eol-github} under the terms
42 | of the \gls{OSI}-approved MIT license~\cite{mit-license} since the very moment the development
43 | effort was started, and so the code repository contains the complete history
44 | of changes it has received so far. The MIT license was chosen because most
45 | Lua-related projects use either the BSD or MIT license, and using a compatible
46 | license allows developers to confidently mix \Eol* with other Lua modules
47 | right away. Also, referring to the MIT license is clearer because it avoids
48 | the potential confusion derived from the existence of different variants of
49 | the BSD license~\cite{bsd-licenses}.
50 |
51 | Separating the code which deals with native function invocation allows us to
52 | select which method is used to invoke native functions from Lua at
53 | build-time. Two backends were implemented: one using \verb|libffi|, and
54 | a second one using JIT compilation (by means of DynASM) for the Intel x86\_64
55 | architecture to generate the needed glue code.
56 |
57 | Additionally, a test harness for Lua has been developed, because the existing
58 | Lua unit testing frameworks abort their execution in the event of a crash
59 | of the process. The harness has been implemented mostly in Lua, plus a small
60 | helper module written in C to access Unix system calls which are not covered
61 | by the Lua standard library.
62 |
63 | The implementation of \Eol* consists of approximately 7.000 \todo{Update
64 | numbers later if needed} lines of code (standard LOC, ignoring comments, see
65 | \autoref{fig:eol-loc}), of which the biggest part is the implementation of the
66 | \Eol* Lua module, as expected.
67 |
68 | \begin{figure}[ht]
69 | \centering
70 | \begin{tikzpicture}
71 | \begin{axis}[
72 | style={/pgf/number format/assume math mode=true},
73 | width=0.7\textwidth,
74 | ybar, axis on top,
75 | ymajorgrids, tick align=inside,
76 | major grid style={draw=white},
77 | enlarge y limits={value=.1,upper},
78 | axis x line*=bottom,
79 | axis y line*=right,
80 | y axis line style={opacity=0},
81 | enlarge x limits=0.25,
82 | tickwidth=0pt,
83 | xtick=data,
84 | bar width=1cm,
85 | symbolic x coords={C, Lua, Shell, Make, Ninja},
86 | nodes near coords,
87 | ymin=0,
88 | ]
89 | \addplot[draw=none, fill=blue!30] coordinates {
90 | (Lua,1110) (C,4600) (Shell,550) (Make,200) (Ninja,290)
91 | };
92 | \end{axis}
93 | \end{tikzpicture}
94 | \caption{Lines of code in \Eol*, per language.}
95 | \label{fig:eol-loc}
96 | \end{figure}
97 |
98 | Last but not least, the \Eol* FFI module has been tested thoroughly, in two
99 | ways:
100 |
101 | \begin{itemize}
102 |
103 | \item With an automated test suite, which can be used for regression
104 | testing, and has been used as such to ensure that the code generated by
105 | the JIT compilation of function invocations behaves the same way as the
106 | \verb|libffi|-based method.
107 |
108 | \item With example programs which exercise the module. These use
109 | third-party libraries found in real-world projects, so the example programs
110 | stress the FFI in the way it is intended to be used.
111 |
112 | \end{itemize}
113 |
114 | \section{Lessons Learned}
115 |
116 | During the development of the project, the specification of the ELF and DWARF
117 | standards has been analyzed, and the relevant parts which are useful for
118 | implementing a \gls{FFI} have been identified. A comprehensive understanding
119 | of the specification was acquired thanks to the following documentation:
120 |
121 | \begin{itemize}
122 |
123 | \item \emph{How Debugging Works}~\cite{howdebugworks}: Tutorial-style
124 | series of articles which give a good overview of how debugging
125 | information is embedded into compiled object code.
126 |
127 | \item \emph{System V Application Binary Interface Edition
128 | 4.1}~\cite{elfspec-sysv}: Contains the original (non-normative)
129 | specification of the ELF object file format. Though newer, normative
130 | versions of the specification exist, this version explains the concepts
131 | needed to understand the DWARF debugging information in a more
132 | approachable way.
133 |
134 | \item \emph{DWARF Debugging Information Format Version
135 | 4}~\cite{dwarfspecv4}: Normative specification of the DWARF debugging
136 | information format.
137 |
138 | \end{itemize}
139 |
140 | Existing FFI implementations for Lua have been analyzed to understand how they
141 | work, which was valuable knowledge to keep in mind while designing how \Eol*
142 | bridges native code to Lua. In particular, the LuaJIT \verb|ffi| module was
143 | taken as a prime example of a proven solution which is popular in the Lua
144 | community. The following documents were instrumental for the design of the
145 | developed solution:
146 |
147 | \begin{itemize}
148 |
149 | \item \emph{FFI semantics}~\cite{lj-ffi-semantic}: Describes how the
150 | module interacts both with Lua, and the compiled C code.
151 |
152 | \item \emph{ffi.* API functions}~\cite{lj-ffi-api}: Describes the API
153 | of the \verb|ffi| module.
154 |
155 | \end{itemize}
156 |
157 | Implementing the JIT code generation required learning how to use LuaJIT's
158 | DynASM, which lacks official documentation. It was also necessary to learn
159 | assembly programming for the Intel x86 platform, and its \gls{ABI} calling
160 | conventions, during the development of the JIT code generator.
161 |
162 |
163 | \section{Planning Results}
164 |
165 | \begin{table}[ht]
166 | \centering
167 | \begin{tabular}{rlrrrr}
168 | \toprule
169 | & & \multicolumn{2}{c}{Time (days)} & \multicolumn{2}{c}{Cost (€)} \\
170 | \cmidrule(r){3-6}
171 | \# & Task & Estimated & Actual & Estimated & Actual \\
172 | \midrule
173 | 1. & Initial study & 10 & 5 & 5.200 & 2.600 \\
174 | 2. & Analysis & 15 & 23 & 7.800 & 11.960 \\
175 | 3. & Development & 50 & 80 & 26.000 & 41.600 \\
176 | 4. & Validation & 10 & 8 & 5.200 & 4.160 \\
177 | 5. & Documentation & 30 & 78 & 15.600 & 40.560 \\
178 | \midrule
179 | & \emph{Total} & 115 & 194& 59.800 & 100.880 \\
180 | \bottomrule
181 | \end{tabular}
182 | \caption{Estimated vs.\ actual schedule and cost}
183 | \label{tab:sched-postmortem}
184 | \end{table}
185 |
186 | The planning done \emph{a priori} (\autoref{sec:plan-method}) resulted in a
187 | tight schedule which did not include enough clearance for anything other than
188 | the smallest of unexpected delays~(cf.\ \autoref{tab:sched-postmortem}).
189 | In hindsight, it would have been good to schedule additional time to cope
190 | with the multiple causes of delays:
191 |
192 | \begin{itemize}
193 |
194 | \item The main deviation, filed under \emph{Documentation}, arose
195 | because the effort of writing the final report was underestimated by
196 | a wide margin. The estimation was overly optimistic, and while for someone
197 | used to doing documentation work on a regular basis it might have been an
198 | adequate estimation, that was not the case for the author. Plus, there was
199 | the added difficulty of writing the documentation in English: while
200 | capable of its fluent use on a daily basis, the author is not a native
201 | speaker and lacked experience in writing long-form technical documentation
202 | in it.
203 |
204 | \item The additional time needed for the \emph{Analysis} was due to
205 | the need to read complex normative documentation, mainly the
206 | specifications of the \gls{ELF} and \gls{DWARF} standards, which weigh in
207 | at over 320 and 100 pages, respectively. Even though not the whole text of
208 | the specifications was relevant for the present project, it was necessary to
209 | wade through them to acquire the concepts needed to understand the
210 | rest.
211 |
212 | \item As for the \emph{Development} phase, it took a good amount of
213 | unplanned time to understand how to use DynASM for JIT code generation due to
214 | the utter lack of documentation, which required frequent detours to read
215 | parts of its source code. The most complete resource on
216 | DynASM~\cite{unofficial-dasm-doc} was written by a third party, is not
217 | part of the official documentation, and was found when most of the
218 | deviation had already taken place.
219 |
220 | \end{itemize}
221 |
222 |
223 | On the other side of the spectrum, the \emph{Initial study} was carried out
224 | faster than planned: the preexisting knowledge about Lua, LuaJIT, and the
225 | existing solutions to use native libraries with them was a valuable asset.
226 |
227 | The final cost of the project has increased in line with the additional time
228 | needed for its completion, being now 100.880€ instead of the planned 59.800€.
229 | This figure uses a cost per hour of 65€, which is a current average value
230 | used by the author's company, and an 8-hour work day (194 days $\times$ 8 hours
231 | $\times$ 65€). It does not take into account the time devoted by the tutor of the project.
232 |
233 |
234 | \section{Future Directions}
235 |
236 | Every software project has room for improvement and continued refinement, and
237 | \Eol* is no exception. There are a number of features which have been
238 | knowingly left out of the present project, in order to keep its scope under
239 | control. It is the intention of the author to keep developing \Eol* as a Free
240 | Software project, and there are a number of ideas for future development which
241 | have surfaced during the realization of its current version.
242 |
243 | The following are ideas which are complex to realize, and even though working
244 | on them would require a big development effort, they open the path to exciting
245 | new possibilities:
246 |
247 | \begin{itemize}
248 |
249 | \item Defining an on-disk format for type and function information. The
250 | idea would be to obtain the information from the DWARF debugging
251 | information, and store it in a format which is optimized for faster
252 | reading. The files in this new format would be used for on-disk caching.
253 |
254 | \item Reading type information directly from the file format defined in
255 | the previous item, without requiring that
256 | the ELF shared objects include DWARF debugging information.
257 |
258 | \item Saving the generated code to disk in ELF object files, when \Eol* is
259 | built with JIT code generation enabled. The generated code could be loaded
260 | reusing the Lua module loader.
261 |
262 | \item Implementing support for reading debugging information in formats
263 | other than DWARF. Ideally, there would be an interface that a “type
264 | information provider” component could implement, and the DWARF provider
265 | would be just one of many.
266 |
267 | \end{itemize}
268 |
269 | The following fall into the category of improvements which can be done with
270 | a moderate effort, and would certainly provide added value to the project:
271 |
272 | \begin{itemize}
273 |
274 | \item Building a community. Although \Eol* is already Free Software, it
275 | lacks a community, and it would be interesting to foster a healthy one around
276 | the project. That would require writing more documentation (e.g. a quick
277 | start tutorial, a walkthrough of the features), and having public
278 | communication channels (e.g. a mailing list, an \gls{IRC} chat room,
279 | participating in the \verb|lua-l| list...).
280 |
281 | \item Ensuring compatibility. Making sure that \Eol* works with Lua 5.2
282 | and 5.1 would favour maximum adoption, since those are the versions most
283 | widely deployed.\todo{See if it's possible to add a citation}
284 |
285 | \item Enabling use of \Eol* with LuaJIT. Most modules using the Lua C API
286 | can be built for LuaJIT as well. LuaJIT is designed to be compatible with
287 | Lua 5.1, while the current implementation targets version 5.3.
288 |
289 | \item Implementing JIT compilation of function invocations for
290 | architectures other than x86 and x86\_64. This can be done with ease for
291 | the other architectures supported by DynASM: ARM, MIPS, and PowerPC.
292 |
293 | \end{itemize}
294 |
295 | \beforeintro
296 |
--------------------------------------------------------------------------------
/slides.tex:
--------------------------------------------------------------------------------
1 | % vim:ft=tex:
2 | %
3 | \documentclass[luatex]{beamer}
4 |
5 | \setbeamercolor{background canvas}{bg=white}
6 | \setbeamercolor{normal text}{fg=black!90}
7 | \setbeamerfont{text}{size*={14}{1.4em}}
8 |
9 | \setbeamertemplate{frametitle}{%
10 | \begin{centering}%
11 | \bigskip\Huge\insertframetitle\par\smallskip%
12 | \end{centering}%
13 | }
14 | \setbeamertemplate{navigation symbols}{}
15 | \setbeamertemplate{footline}[text line]{}
16 |
17 | \usepackage[english]{babel}
18 | \usepackage{booktabs}
19 | \usepackage{calc}
20 | \usepackage{tikz}
21 | \usetikzlibrary{arrows,shapes,fit,shadows,positioning,chains,%
22 | decorations.pathreplacing,decorations.pathmorphing,calc,%
23 | matrix}
24 | \usepackage{pgfplots}
25 |
26 | \definecolor{grey}{rgb}{0.5, 0.5, 0.5}
27 | \definecolor{lightgrey}{rgb}{0.95, 0.95, 0.95}
28 | \definecolor{grassgreen}{rgb}{0.1, 0.85, 0.2}
29 | \definecolor{fadedbrown}{rgb}{0.85, 0.1, 0.1}
30 | \definecolor{darkmagenta}{rgb}{0.65, 0.0, 0.65}
31 | \definecolor{lightblue}{rgb}{0.5, 0.75, 1.0}
32 | \definecolor{linkboxcolor}{rgb}{0.8, 0.8, 0.85}
33 |
34 | % Minted
35 | \usepackage{minted}
36 | \setminted{
37 | autogobble = true,
38 | breaklines = true,
39 | codetagify = true,
40 | encoding = utf-8,
41 | outencoding = utf-8,
42 | frame = leftline,
43 | framerule = 5pt,
44 | framesep = 0.65em,
45 | xleftmargin = 1em,
46 | xrightmargin = 1em,
47 | rulecolor = \color{lightgrey},
48 | }
49 | \newmintinline[Mc]{c}{}
50 | \newminted{c}{
51 | fontsize = \small,
52 | baselinestretch = 1.0,
53 | }
54 | \newmintinline[Mlua]{lua}{}
55 | \newminted{lua}{
56 | fontsize = \small,
57 | baselinestretch = 1.0,
58 | }
59 |
60 | % Fonts
61 | \usepackage[OT1]{fontenc}
62 | \usepackage{fontspec}
63 | \defaultfontfeatures{Ligatures=TeX}
64 |
65 | \setmainfont{Andada}[
66 | Path = fonts/,
67 | Extension = .ttf,
68 | UprightFont = *-Regular,
69 | ItalicFont = *-Italic,
70 | BoldFont = *-Bold,
71 | BoldItalicFont = *-BoldItalic,
72 | SmallCapsFont = *SC-Regular,
73 | ]
74 |
75 | \newfontfamily\RlwLight{Raleway}[
76 | Path = fonts/,
77 | Extension = .ttf,
78 | UprightFont = *-ExtraLight,
79 | ItalicFont = *-ExtraLight-Italic,
80 | BoldFont = *-Light,
81 | BoldItalicFont = *-Light-Italic,
82 | ]
83 |
84 | \setsansfont{Raleway}[
85 | Path = fonts/,
86 | Extension = .ttf,
87 | UprightFont = *-Regular,
88 | ItalicFont = *-Regular-Italic,
89 | BoldFont = *-Bold,
90 | BoldItalicFont = *-Bold-Italic,
91 | ]
92 |
93 | \setmonofont{InputMonoNarrow}[
94 | Scale = MatchLowercase,
95 | Path = fonts/,
96 | Extension = .ttf,
97 | UprightFont = *-Light,
98 | ItalicFont = *-LightItalic,
99 | BoldFont = *-Regular,
100 | BoldItalicFont = *-Italic,
101 | ]
102 |
103 | \newfontfamily\SymbolaFont{Symbola}[
104 | Path = fonts/,
105 | Extension = .ttf,
106 | ]
107 | \newcommand\Symbol[1]{{\SymbolaFont#1}}
108 |
109 | % Symbols
110 | \newcommand\LeafOpen{\Symbol{🙠}}
111 | \newcommand\LeafClose{\Symbol{🙣}}
112 |
113 | \title{\textsc{Eöl}}
114 | \subtitle{Automatic bridging of native code to Lua using existing debugging information}
115 | \author{Adrián Pérez de Castro}
116 | \institute[UDC]{
117 | \begin{tikzpicture}[y=0.80pt, x=0.8pt,yscale=-1, inner sep=0pt, outer sep=0pt, scale=0.2]
118 | \path[fill=magenta,nonzero rule] (220.7188,106.4062) -- (382.3633,33.4609) ..
119 | controls (341.6836,12.8594) and (284.3281,-0.0039) .. (220.7227,-0.0039) ..
120 | controls (157.1016,-0.0039) and (99.7461,12.8594) .. (59.0742,33.4609) --
121 | (220.7188,106.4062);
122 | \path[fill=magenta,nonzero rule] (440.9648,89.9531) .. controls
123 | (436.4648,76.9375) and (427.1289,64.7109) .. (413.8828,53.7188) --
124 | (233.1914,105.5312) -- (440.9648,89.9531);
125 | \path[fill=magenta,nonzero rule] (414.9375,161.0898) .. controls
126 | (428.0547,149.9570) and (437.2305,137.6055) .. (441.4414,124.4531) --
127 | (232.9805,109.8984) -- (414.9375,161.0898);
128 | \path[fill=magenta,nonzero rule] (220.7305,109.2188) -- (57.9609,181.6680) ..
129 | controls (98.6992,202.6055) and (156.5195,215.6992) .. (220.7227,215.6992) ..
130 | controls (284.9102,215.6992) and (342.7344,202.6055) .. (383.4805,181.6680) --
131 | (220.7305,109.2188);
132 | \path[fill=magenta,nonzero rule] (0.0000,124.4531) .. controls (4.2109,137.6055)
133 | and (13.3867,149.9570) .. (26.4961,161.0898) -- (208.4492,109.8984) --
134 | (0.0000,124.4531);
135 | \path[fill=magenta,nonzero rule] (208.2422,105.5312) -- (27.5625,53.7109) ..
136 | controls (14.3164,64.7070) and (4.9766,76.9336) .. (0.4727,89.9531) --
137 | (208.2422,105.5312);
138 | \end{tikzpicture}
139 | \vspace{0.5em}
140 |
141 | Universidade da Coruña}
142 | \date[Sep 2015]{September, 2015}
143 |
144 | \begin{document}
145 |
146 | \setbeamertemplate{background canvas}{%
147 | \includegraphics[height=\paperheight]{img/forest.jpg}}
148 | % \setbeamercolor{title}{fg=black}
149 | % \setbeamercolor{block body}{fg=white}
150 |
151 | \maketitle
152 |
153 | \setbeamertemplate{background canvas}{}
154 | % \setbeamercolor{normal text}{fg=black!90}
155 |
156 | \begin{frame}{Outline}
157 | \tableofcontents
158 | \end{frame}
159 |
160 |
161 | \section{Lua}
162 |
163 | \subsection{Quick Introduction}
164 |
165 | \begin{frame}
166 |
167 | \centering
168 | \includegraphics[height=0.15\textheight]{img/lua-logo.pdf}
169 | \vspace{3em}
170 | \begin{quote}
171 | Lua is a powerful, fast, lightweight, embeddable
172 | scripting language.
173 | \begin{flushright}
174 | \begin{scriptsize}
175 | — \url{http://www.lua.org/about.html}
176 | \end{scriptsize}
177 | \end{flushright}
178 | \end{quote}
179 | \vfill
180 |
181 | \begin{itemize}
182 | \item Single data structure \visible<2->{$\rightarrow$ \emph{tables}}
183 | \item Extensible semantics \visible<3->{$\rightarrow$ \emph{metatables}}
184 | \item Automatic memory management \visible<4->{$\rightarrow$ \emph{GC}}
185 | \item Bytecode, register-based VM
186 | \end{itemize}
187 |
188 | \note[itemize]{
189 | \item Initially created as a data description language at Tecgraf
190 | PUC-Rio, for in-house software development: trade barriers were in
191 | effect for software and computer hardware.
192 | \item Petrobras was one of the first users of SOL and DEL, the
193 | predecessors of Lua.
194 | \item Tables can be used as hash tables, arrays, and objects.
195 | Particularly well suited for data description.
196 | \item Influenced by Modula (control structures), AWK (tables), and
197 | LISP (everything is a list\^W table)
198 | \item Widely used in the games industry (WoW).
199 | }
200 | \end{frame}
201 |
202 |
203 | \begin{frame}[fragile]{Lua by Example}
204 | \begin{luacode}
205 | animal = {
206 | name = "Unnamed",
207 | kind = "living creature",
208 | describe = function (self)
209 | print(self.name .. " is a " .. self.kind)
210 | end,
211 | }
212 |
213 | f = setmetatable({ kind="cat", name="Fifi" },
214 | { __index=animal })
215 | t = setmetatable({ name="Tom", },
216 | { __index=animal })
217 | f:describe() --> Fifi is a cat
218 | t:describe() --> Tom is a living creature
219 | \end{luacode}
220 |
221 | \note[itemize]{
222 | \item First a base object (which is a table) is defined
223 | \item Functions are first-class values, and as such can
224 | be values in a table
225 | \item Metatables allow defining the runtime behaviour of
226 | certain operations. Here we set \texttt{\_\_index}
227 | so the fields not found in \texttt{f} or \texttt{t}
228 | are looked up in the \texttt{animal} table instead. This
229 | effectively creates a prototype-style chain of objects.
230 | }
231 | \end{frame}
232 |
233 |
234 | \subsection{Accessing Native Code}
235 |
236 |
237 | \begin{frame}[fragile]{Going Native: Try I}
238 |
239 | Fact: Lua has a very minimal and small standard library.
240 |
241 | \pause
242 | \textbf{How is additional functionality provided?}
243 |
244 | \pause
245 | \vspace{2em}
246 |
247 | \begin{luacode}
248 | function isdir(path)
249 | local fd = io.popen("test -d " .. path)
250 | fd:read("*a") -- Discard output
251 | local ok, reason, code = fd:close()
252 | return ok and reason == "exit" and code == 0
253 | end
254 | \end{luacode}
255 | \vspace{2em}
256 |
257 | \pause
258 | \hfill …but this is \emph{cheating}
259 |
260 | \pause
261 | \hfill …and horrible in many ways
262 |
263 | \note[itemize]{
264 | \item Fatality 1: Spawning a child process just to check something that is
265 | provided as a system call.
266 | \item Fatality 2: The \texttt{path} variable needs to be quoted properly.
267 | \item Fatality 3: \texttt{io.popen} uses \texttt{popen()}, which runs
268 | the command through the shell, and is a security hole.
269 | \item What we really want is to be able to call native functions.
270 | }
271 |
272 | \end{frame}
273 |
274 |
275 | \begin{frame}{Going Native: Try II}
276 |
277 | Civilized ways:
278 |
279 | \begin{enumerate}
280 | \item Lua C API
281 | \item Wrappers over the C API
282 | \item Binding generators
283 | \item Foreign Function Interfaces
284 | \item \textsc{Eöl} (this project)
285 | \end{enumerate}
286 |
287 | \end{frame}
288 |
289 |
290 | \begin{frame}[fragile]{\texttt{isatty}}
291 | \begin{luacode}
292 | function isatty(fileno)
293 | -- ???
294 | end
295 | \end{luacode}
296 | \end{frame}
297 |
298 |
299 | \begin{frame}[fragile]{\texttt{isatty}: Lua C API}
300 |
301 | C:
302 |
303 | \begin{ccode}
304 | static int f_isatty(lua_State *L) {
305 | lua_Integer fileno = luaL_checkinteger (L, 1);
306 | lua_pushboolean (L, isatty ((int) fileno));
307 | return 1;
308 | }
309 |
310 | int luaopen_isatty (lua_State *L) {
311 | lua_pushcfunction (L, f_isatty);
312 | return 1;
313 | }
314 | \end{ccode}
315 |
316 | Lua:
317 |
318 | \begin{luacode}
319 | local isatty = require("isatty")
320 | print(isatty(0), type(isatty(0)))
321 | -- Output: true boolean
322 | \end{luacode}
323 | \end{frame}
324 |
325 |
326 | \begin{frame}{Binding Generators}
327 | \begin{itemize}
328 | \item Tools that create bindings in an automated way.
329 | \item<2-> They require an extra compilation step.
330 | \item<2-> Cleanup of the C definition may be needed.
331 | \end{itemize}
332 |
333 | \note[itemize]{
334 | \item Some Lua-specific binding generators exist, but they are either
335 | outdated or not very complete.
336 | \item SWIG is the standard to beat... but it's still a binding
337 | generator, requires an extra step, and so on.
338 | }
339 | \end{frame}
340 |
341 |
342 | \begin{frame}[fragile]{\texttt{isatty}: LuaJIT FFI}
343 | \begin{luacode}
344 | local ffi = require("ffi")
345 | ffi.cdef("int isatty(int)")
346 | local isatty = ffi.C.isatty
347 | print(isatty(0), type(isatty(0)))
348 | -- Output: 1 number
349 | \end{luacode}
350 |
351 | \visible<2->{
352 | \begin{itemize}
353 | \item No need to manually write glue code
354 | \item Precise type information
355 | \end{itemize}
356 | }
357 |
358 | \end{frame}
359 |
360 |
361 | \begin{frame}[fragile]{\texttt{isatty}: Ideal FFI}
362 | \begin{luacode}
363 | local eol = require("eol")
364 | local isatty = eol.C.isatty
365 | print(isatty(0), type(isatty(0)))
366 | -- Output: 1 number
367 | \end{luacode}
368 |
369 | \hspace{4em}
370 | \visible<2->{
371 | \begin{itemize}
372 | \item No need to manually write glue code
373 | \item Precise type information
374 | \item \textbf{No manual function declaration} (\Mlua|ffi.cdef|)
375 | \end{itemize}
376 | }
377 |
378 | \end{frame}
379 |
380 |
381 | \begin{frame}{Ideal FFI = \textsc{Eöl}}
382 | \begin{quote}
383 | In order to implement the “ideal” FFI we need information
384 | about functions, their parameters, and involved data types.
385 | \end{quote}
386 | \pause
387 | \begin{itemize}
388 | \item The compiler already knows that information.
389 | \pause
390 | \item The compiler \emph{can} write it as debugging information.
391 | \end{itemize}
392 | \end{frame}
393 |
394 |
395 | \section{\textsc{Eöl}}
396 |
397 | \pgfdeclarelayer{background}
398 | \pgfdeclarelayer{foreground}
399 | \pgfsetlayers{background,main,foreground}
400 |
401 | \tikzstyle{bdBox} = [
402 | rectangle, drop shadow, draw=black, thick, fill=white,
403 | text centered, minimum height=2em, minimum width=3em,
404 | ]
405 | \tikzstyle{bdProcBox} = [
406 | rounded corners, fill=blue!10, text centered
407 | ]
408 | \tikzstyle{bdProcLine} = [
409 | draw, thick, color=blue!20
410 | ]
411 | \tikzstyle{bdCircle} = [
412 | circle, fill=blue!20, draw=black,
413 | ]
414 | \tikzstyle{bdLine} = [draw, thick]
415 | \tikzstyle{bdArrow} = [bdLine, >=triangle 45, ->]
416 |
417 |
418 | \tikzstyle{datablob} = [
419 | rectangle, rounded corners, drop shadow, draw=black, thick,
420 | text centered, minimum height=2em, minimum width=3em, fill=blue!20,
421 | ]
422 | \tikzstyle{die} = [start chain=going below, node distance=1mm]
423 | \tikzstyle{dielabel} = [on chain]
424 | \tikzstyle{dieitems} = [
425 | rectangle split, rectangle split parts=#1, rectangle split part align=left,
426 | thick, draw, fill=blue!10, on chain,
427 | ]
428 | \tikzstyle{enumitems} = [
429 | rectangle split, rectangle split parts=#1, rectangle split part align=left,
430 | thick, rounded corners,
431 | color=black!50, fill=black!5, draw=black!50,
432 | minimum height=3em,
433 | ]
434 | \tikzstyle{valuefrom} = [draw=black!50, thick, dashed]
435 | \tikzstyle{arrow} = [draw, thick, >=triangle 45, ->]
436 | \tikzstyle{datain} = [
437 | draw=black!80, thick, fill=blue!10, rectangle,
438 | text centered, minimum height=1.3em, text width=10em,
439 | ]
440 | \tikzstyle{component} = [
441 | draw=black,
442 | thick,
443 | fill=green!10,
444 | rectangle,
445 | text centered,
446 | minimum height=2em,
447 | text width=6em,
448 | rounded corners,
449 | drop shadow,
450 | ]
451 | \tikzstyle{uses} = [
452 | draw,
453 | very thick,
454 | >=triangle 45,
455 | ->,
456 | dashed,
457 | ]
458 | \tikzstyle{contains} = [
459 | draw,
460 | thick,
461 | >=triangle 45,
462 | -*,
463 | ]
464 |
465 |
466 | \begin{frame}{\textsc{Eöl}}
467 | \begin{center}
468 | \Huge
469 | FFI + DWARF/ELF
470 | \end{center}
471 |
472 | \pause
473 |
474 | \begin{center}
475 | Automatic binding system for the Lua programming language which allows
476 | seamless usage, at runtime, of libraries written in C.
477 | \end{center}
478 | \end{frame}
479 |
480 |
481 | \subsection{Architecture}
482 |
483 | \begin{frame}{Design}
484 |
485 | \resizebox{\textwidth}{!}{
486 | \begin{tikzpicture}[node distance=2cm]
487 | \node[component] (library) {Library};
488 | \node[component] (typecache) [above of=library] {Type Cache};
489 | \node[component] (ctype) [right=1cm of library] {CType};
490 | \node[component] (function) [right=1cm of ctype] {Function};
491 | \node[component] (variable) [right=1cm of function] {Variable};
492 | \node[component] (typeinfo) [above of=function] {Type Information};
493 | \node[datain] (dwarf) [above=1cm of typecache]
494 | {DWARF debugging information};
495 |
496 | \node (luadata) [below right of=ctype] {Visible in Lua as userdata};
497 | \node (elf) [above=0cm of dwarf] {ELF shared object};
498 |
499 | \path[uses] (function) -- (typeinfo);
500 | \path[uses] (variable) -- (typeinfo);
501 | \path[uses] (ctype) -- (typeinfo);
502 | \path[uses] (typecache) -- (dwarf);
503 | \path[contains] (library) -- (typecache);
504 | \path[contains] (typecache) -- (typeinfo);
505 |
506 | \begin{pgfonlayer}{background}
507 | \node[datablob] (elfbox) [fit=(dwarf) (elf), drop shadow] {};
508 | \node[fill=yellow!20, rectangle, rounded corners] (wrappers)
509 | [fit=(library) (variable) (function) (luadata)] { };
510 | \end{pgfonlayer}
511 | \end{tikzpicture}
512 | }
513 |
514 | \end{frame}
515 |
516 |
517 | \subsection{DWARF + ELF}
518 |
519 | \begin{frame}{ELF}
520 | \begin{columns}
521 | \begin{column}{0.55\textwidth}
522 | \resizebox{\textwidth}{!}{
523 | \begin{tikzpicture}[node distance=1.5mm, bend angle=0]
524 | \node[bdBox] (elfheader) [minimum width=10em, start chain=going below, on chain] {ELF header};
525 | \node[bdBox] (prgheader) [minimum width=10em, on chain] {Program header};
526 | \node[bdBox] (sect-text) [minimum width=10em, on chain] {\texttt{.text}};
527 | \node[bdBox] (sect-rodata) [minimum width=10em, on chain] {\texttt{.rodata}};
528 | \node (ellipsis) [minimum width=10em, on chain] {...};
529 | % \node[bdBox] (sect-data) [minimum width=10em, on chain, yshift=-1em] {\textt{.data}};
530 | \node[bdBox] (sect-data) [minimum width=10em, on chain] {\texttt{.data}};
531 | \node[bdBox] (secthdrtable) [minimum width=10em, on chain] {Section header table};
532 | \draw[decorate, decoration={brace}] let \p1=(sect-text.north),
533 | \p2=(sect-rodata.south) in ($(2.2, \y1)$) -- ($(2.2, \y2)$)
534 | node[midway] (g1) {};
535 | \draw[decorate, decoration={brace}] let \p1=(ellipsis.north),
536 | \p2=(sect-data.south) in ($(2.2, \y1)$) -- ($(2.2, \y2)$)
537 | node[midway] (g2) {};
538 | \draw[->, bend right, >=latex, bend right, thick]
539 | (elfheader.east) to [out=90,in=90] (g1.east);
540 | \draw[->, bend right, >=latex, bend right, thick]
541 | (elfheader.east) to [out=90,in=90] (g2.east);
542 | \draw[->, bend left, >=latex, bend right, thick]
543 | (secthdrtable.west) to [out=90,in=90] (sect-text.west);
544 | \draw[->, bend left, >=latex, bend right, thick]
545 | (secthdrtable.west) to [out=90,in=90] (sect-rodata.west);
546 | \draw[->, bend left, >=latex, bend right, thick]
547 | (secthdrtable.west) to [out=90,in=90] (sect-data.west);
548 | \end{tikzpicture}
549 | }
550 | \end{column}
551 | % \begin{column}{0.1\textwidth}
552 | % \end{column}
553 | \begin{column}{0.45\textwidth}
554 | \begin{itemize}
555 | \item Headers
556 | \begin{itemize}
557 | \item Fixed part
558 | \item Variable tables
559 | \end{itemize}
560 | \item Segments
561 | \begin{itemize}
562 | \item Runtime
563 | \item Executable shape
564 | \end{itemize}
565 | \item Sections
566 | \begin{itemize}
567 | \item Offline
568 | \item Arbitrary data
569 | \end{itemize}
570 | \end{itemize}
571 | \end{column}
572 | \end{columns}
573 | \end{frame}
574 |
575 |
576 | \begin{frame}{DWARF}
577 | \centering
578 | \resizebox{0.75\textwidth}{!}{
579 | \begin{tikzpicture}[die]
580 | \node[dielabel] (taglabel) {Tag};
581 | \node[dieitems=1] (tag) {\texttt{DW\_TAG\_pointer}};
582 | \node[dielabel] (attrlabel) {Attributes};
583 | \node[dieitems=1] (attributes) {\texttt{DW\_AT\_type}};
584 | \node[dielabel] (rtaglabel) [right=3cm of tag, yshift=-3.5mm] {Tag};
585 | \node[dieitems=1] (rtag) {\texttt{DW\_TAG\_pointer}};
586 | \node[dielabel] (rattrlabel) {Attributes};
587 | \node[dieitems=1] (rattributes) {\texttt{DW\_AT\_type}};
588 | \node[datablob] (basedie) [right=2cm of rattributes] {Type DIE};
589 | \path[arrow] (rattributes.text east) -- (basedie);
590 | \begin{pgfonlayer}{background}
591 | \node[datablob] (die) [fit=(taglabel) (attributes) (tag)] {};
592 | \node[datablob] (rdie) [fit=(rtaglabel) (rattributes) (rtag)] {};
593 | \end{pgfonlayer}
594 | \path[arrow] (attributes.text east) -- (rdie.west);
595 | \end{tikzpicture}
596 | }
597 |
598 | \vspace{2em}
599 |
600 | \begin{columns}
601 | \begin{column}{0.5\textwidth}
602 | ELF sections:
603 | \begin{itemize}
604 | \item \texttt{.debug\_types}
605 | \item \texttt{.debug\_info}
606 | \item \texttt{.debug\_}…
607 | \end{itemize}
608 | \end{column}
609 | \begin{column}{0.5\textwidth}
610 | DWARF information:
611 | \begin{itemize}
612 | \item Tree-like structure
613 | \item Nodes: DIEs \& TUEs
614 | \item Tagged attributes
615 | \end{itemize}
616 | \end{column}
617 | \end{columns}
618 | \end{frame}
619 |
620 |
621 | \section{Demos}
622 |
623 | \begin{frame}
624 | \centering
625 | \Symbol{\Huge 🖮}
626 | \Large
627 |
628 | Demo Time!
629 | \end{frame}
630 |
631 |
632 | \setbeamertemplate{background canvas}{%
633 | \includegraphics[height=\paperheight]{img/theend3.jpg}}
634 | \begin{frame}
635 | \end{frame}
636 |
637 | \end{document}
638 |
--------------------------------------------------------------------------------
/design.tex:
--------------------------------------------------------------------------------
1 | % vim: ft=tex spell spelllang=en ts=2 sw=2
2 |
3 | \cleardoublepage
4 | \setchaptertoc
5 | \chapter{Analysis \& Design}
6 |
7 | This chapter is a tour through the architecture of the developed software
8 | solution, analyzing relevant decisions taken that gave it its final shape.
9 | \afterintro
10 |
11 | \section{Naming}
12 |
13 | In the Lua community there is a certain tradition of naming projects after
14 | celestial bodies, or terms related to them (after all, Lua means \emph{moon}
15 | in Portuguese). Unfortunately, the name initially chosen for the project,
16 | Eris (a dwarf planet, neither a planet nor a moon), was already being used
17 | by another Lua-related project\footnote{The Eris persistence system,
18 | \url{https://github.com/fnuecke/eris}}. A closer inspection showed that other
19 | dwarf planet names were already in use for software projects, so in the end
20 | inspiration had to be drawn from a different area.
21 |
22 | Eöl, also known as “The Dark Elf”, is a fictional character in
23 | J. R. R. Tolkien's Middle-earth legendarium, who is said to be the elf with
24 | the closest relationship with the dwarves, and one of the first able to speak
25 | their language. \Eol* can also be a \gls{backronym} for “ELF Object Loader”,
26 | which describes the purpose of the developed solution well.
27 |
28 |
29 | \section{Overview}
30 | \label{sec:design-overview}
31 |
32 | The main components of \Eol* are shown in \autoref{fig:eol-architecture}.
33 |
34 | The design of the system revolves around \textsf{Type Information}: it
35 | describes native types in detail, and it is used by all the other components
36 | in different ways to provide their functionality. Its importance should not be
37 | surprising, because the ultimate goal of \Eol* is to allow seamless invocation
38 | of native functions which, being close to the bare metal, always conform to
39 | strict \gls{ABI} specifications. While starting execution of a native function
40 | is as simple as generating a jump machine instruction to its start address in
41 | memory, the function will only behave as expected if the data it uses —its
42 | parameters, space for return values, etc.— is laid out in memory exactly in
43 | the way its machine code expects it to be. In turn, this layout depends on the
44 | types of the values passed to and from the function.
45 |
46 | \tikzstyle{component} = [
47 | draw=black,
48 | thick,
49 | fill=green!10,
50 | rectangle,
51 | text centered,
52 | minimum height=2em,
53 | text width=6em,
54 | rounded corners,
55 | drop shadow,
56 | ]
57 | \tikzstyle{uses} = [
58 | draw,
59 | very thick,
60 | >=triangle 45,
61 | ->,
62 | dashed,
63 | ]
64 | \tikzstyle{contains} = [
65 | draw,
66 | thick,
67 | >=triangle 45,
68 | -*,
69 | ]
70 |
71 | \begin{figure}
72 | \centering
73 | \begin{tikzpicture}[node distance=2cm]
74 |
75 | \node[component] (library) {Library};
76 | \node[component] (typecache) [above of=library] {Type Cache};
77 | \node[component] (ctype) [right=1cm of library] {CType};
78 | \node[component] (function) [right=1cm of ctype] {Function};
79 | \node[component] (variable) [right=1cm of function] {Variable};
80 | \node[component] (typeinfo) [above of=function] {Type Information};
81 | \node[datain] (dwarf) [above=1cm of typecache]
82 | {DWARF debugging information};
83 |
84 | \node (luadata) [below right of=ctype] {Visible in Lua as userdata};
85 | \node (elf) [above=0cm of dwarf] {ELF shared object};
86 |
87 | \path[uses] (function) -- (typeinfo);
88 | \path[uses] (variable) -- (typeinfo);
89 | \path[uses] (ctype) -- (typeinfo);
90 | \path[uses] (typecache) -- (dwarf);
91 | \path[contains] (library) -- (typecache);
92 | \path[contains] (typecache) -- (typeinfo);
93 |
94 | \begin{pgfonlayer}{background}
95 | \node[datablob] (elfbox) [fit=(dwarf) (elf), drop shadow] {};
96 | \node[fill=yellow!20, rectangle, rounded corners] (wrappers)
97 | [fit=(library) (variable) (function) (luadata)] { };
98 | \end{pgfonlayer}
99 | \end{tikzpicture}
100 | \caption{Architecture of \Eol*.}
101 | \label{fig:eol-architecture}
102 | \end{figure}
103 |
104 | There are four components which form part of the interface to Lua (as
105 | specified in \autoref{sec:design-lua-api}):
106 |
107 | \begin{itemize}
108 |
109 | \item \textsf{Library} (\autoref{sec:eol-api-library-t}) represents a loaded
110 | ELF library. It is responsible for accessing the DWARF debugging information,
111 | and looking up values of the other types.
112 |
113 | \item \textsf{CType} (\autoref{sec:eol-api-ctype-t}) represents a native
114 | data type. It is responsible for providing information about the represented
115 | native type, and for creating native values of the represented native type
116 | from Lua. The types available to C programs are supported, hence the name.
117 |
118 | \item \textsf{Function} (\autoref{sec:eol-api-function-t}) represents a
119 | fragment of native code contained in a library, which can be invoked as
120 | a function. It is responsible for performing calls into native code
121 | from Lua.
122 |
123 | \item \textsf{Variable} (\autoref{sec:eol-api-variable-t}) represents a
124 | variable from a library. It is responsible for allowing reading and
125 | writing its value from Lua, performing conversions as needed.
126 |
127 | \end{itemize}
128 |
129 | Looking up type information involves reading the DWARF debugging information
130 | from disk and decoding it appropriately. In order to avoid repeatedly reading
131 | the debugging information from disk to construct new \textsf{Type Information}
132 | values, each \textsf{Library} makes use of a \textsf{Type Cache} which keeps
133 | the information in memory. An additional benefit of the cache is that it
134 | allows reusing the \textsf{Type Information}: many DWARF \gls{DIE}s contain
135 | references to others\todo{Got time? Add a diagram with an example}, and the
136 | cache can be queried to determine whether a referenced DIE has already been
137 | turned into \textsf{Type Information}, in which case the data from the cache
138 | is used instead.
139 |
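The effect of the cache can be illustrated with a small, self-contained Lua
model. This is only a sketch under stated assumptions: the table-based DIEs,
their offsets, and the names \texttt{dies}, \texttt{typecache}, and
\texttt{lookup} are hypothetical and are not taken from the \Eol* sources, in
which this logic would be written in C.

\begin{luacode}
-- Hypothetical stand-in for decoded DWARF DIEs, keyed by offset.
local dies = {
  [0x10] = { tag = "base_type", name = "int" },
  [0x20] = { tag = "pointer_type", type = 0x10 },
  [0x30] = { tag = "pointer_type", type = 0x20 },
}

local decoded = 0      -- Counts how many DIEs had to be decoded.
local typecache = {}   -- Maps DIE offsets to type information.

local function lookup(offset)
  local info = typecache[offset]
  if info then
    return info        -- Cache hit: reuse the existing value.
  end
  local die = assert(dies[offset], "unknown DIE offset")
  decoded = decoded + 1
  info = { kind = die.tag, name = die.name }
  typecache[offset] = info
  if die.type then
    info.base = lookup(die.type)  -- Referenced DIEs go through the cache.
  end
  return info
end

local pp = lookup(0x30)  -- Decodes the DIEs at 0x30, 0x20, and 0x10.
local p = lookup(0x20)   -- Served from the cache, nothing is decoded.
assert(decoded == 3 and pp.base == p)
\end{luacode}

Because referenced DIEs are resolved through the same cache, the value built
for the pointer type at offset \texttt{0x20} is shared instead of being
decoded a second time.
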
140 |
141 | \section{Interaction With the Lua GC}
142 | \label{sec:design-gc-interaction}
143 |
144 | Userdata values are subject to Lua's \gls{GC} (cf.
145 | \autoref{sec:userdata-lua-custom-allocator}), which poses a problem for the
146 | \textsf{Library} userdata: if the Lua VM does not keep an active reference to
147 | a \textsf{Library} value, the GC will consider it to be garbage, and will
148 | deallocate it while it may still be referenced by other resources. In
149 | particular, a \textsf{Library} cannot be unloaded while any
150 | \textsf{CType}, \textsf{Function}, or \textsf{Variable} userdata which belongs
151 | to the library is being used from Lua. This kind of situation can be triggered by
152 | the following simple sequence of events, illustrated by
153 | \autoref{lst:library-gc-issue}:
154 |
155 | \begin{listing}[ht]
156 | \begin{luacode}
157 | function loadfunction(libname, funcname)
158 | local eol = require("eol")
159 | local lib = eol.load(libname)
160 | return lib[funcname]
161 | end
162 | -- Obtain a userdata for the add() function from libtest.so
163 | add = loadfunction("libtest", "add")
164 | -- Crashes if the GC has already collected the library.
165 | print(add(6, 5))
166 | \end{luacode}
167 | \caption{Lua example which makes a \textsf{Library} subject to GC}
168 | \label{lst:library-gc-issue}
169 | \end{listing}
170 |
171 | \begin{enumerate}
172 |
173 | \item A library is loaded and returned to Lua as a \textsf{Library}
174 | userdata. The userdata is assigned to a temporary variable (e.g.
175 | a \Mlua|local| variable inside a \Mlua|function|) which eventually
176 | will go out of scope.
177 | \item A \textsf{Function} userdata is obtained for a native function
178 | contained by the library.
179 | \item The GC determines that the \textsf{Library} userdata is garbage,
180 | and frees the resources used by it. This unloads the library.
181 | \item At this point, invoking the function crashes the program because
182 | its machine code, contained in the library, is no longer loaded in
183 | memory.
184 |
185 | \end{enumerate}
186 |
187 | The solution for this problem is to use \gls{refcounting}, to ensure that the
188 | libraries are kept loaded while needed: each active userdata value of type
189 | \textsf{Function}, \textsf{CType}, or \textsf{Variable} contributes to the
190 | reference count. This way, a library is unloaded only when its reference count
191 | reaches zero.
192 |
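The following self-contained Lua model sketches the reference counting
scheme. It is an illustration only: the helper names (\texttt{newlibrary},
\texttt{ref}, \texttt{unref}) are hypothetical, and it additionally assumes
that the \textsf{Library} userdata itself holds one reference. In the actual
module the counter would be updated from C, presumably when userdata values
are created and finalized.

\begin{luacode}
-- Hypothetical model of a loaded library and its reference count.
local function newlibrary(name)
  -- Assumption: the Library userdata itself accounts for one reference.
  return { name = name, refs = 1, loaded = true }
end

local function ref(lib)
  lib.refs = lib.refs + 1
  return lib
end

local function unref(lib)
  lib.refs = lib.refs - 1
  if lib.refs == 0 then
    lib.loaded = false  -- Only now would the shared object be unloaded.
  end
end

local lib = newlibrary("libtest")
local add = { name = "add", library = ref(lib) }  -- A Function keeps a reference.

unref(lib)              -- The Library userdata is collected...
assert(lib.loaded)      -- ...but the library stays loaded for add().
unref(add.library)      -- The Function userdata is collected as well.
assert(not lib.loaded)  -- No references remain: the library is unloaded.
\end{luacode}
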
193 |
194 | \section{Module API}
195 | \label{sec:design-lua-api}
196 |
197 | The \gls{API} exposed by the \Eol* module to the Lua world is loosely modelled
198 | after the one provided by the LuaJIT FFI module~\cite{lj-ffi-api} —also
199 | implemented by the standalone \verb|luaffi| module~\cite{luaffi}—, with some
200 | functions even having the same names and semantics, and others differing where
201 | appropriate. For example, \Eol* does not need to provide a function to parse
202 | C-like declarations because the type information is obtained from the
203 | \gls{DWARF} debugging information instead. The goal is to provide an API which
204 | is proven to be suitable for Lua FFIs, and at the same time not force
205 | programmers who have used the LuaJIT FFI —or the standalone \verb|luaffi|— to
206 | learn how to use a completely different API.
207 |
208 | \subsection{The \texttt{eol} Namespace}
209 |
210 | Where the LuaJIT FFI and \verb|luaffi| modules provide their functions in the
211 | \verb|ffi| namespace, \Eol* provides its functionality in the \verb|eol|
212 | namespace. Also, loading the \verb|eol| module should be possible using the
213 | standard Lua module loader, via the \Mlua|require()| function:
214 |
215 | \begin{luacode}
216 | eol = require("eol")
217 | \end{luacode}
218 |
219 | Unless stated otherwise, in the specification of the functions summarized in
220 | \autoref{tab:eol-api-functions-summary}, parameters named \texttt{typevalue}
221 | accept either a \textsf{CType} value or a \textsf{Variable} value; in the
222 | latter case, the type associated with the variable is used. Also, functions can
223 | generate Lua errors when the types of values passed to them are incorrect.
224 |
225 |
226 | \begin{table}[ht]
227 | \centering
228 | \begin{tabular}{lcc}
229 | \toprule
230 | Function & LJ & Err \\
231 | \midrule
232 | \Mlua|library = eol.load(name, global)| & \Tick & \Tick \\
233 | \Mlua|typeinfo = eol.type(library, name)| & & \Tick \\
234 | \Mlua|typeinfo = eol.typeof(typevalue)| & \Tick & \Tick \\
235 | \Mlua|size = eol.sizeof(typevalue)| & \Tick & \\
236 | \Mlua|alignment = eol.alignof(typevalue)| & \Tick & \\
237 | \Mlua|offset = eol.offsetof(typevalue, field)| & \Tick &\\
238 | \Mlua|value = eol.cast(typeinfo, value)| & \Tick & \\
239 | \Mlua|flag = eol.abi(parameter)| & \Tick & \\
240 | \bottomrule
241 | \end{tabular}
242 |
243 | \vspace{2pt}
244 |
245 | \begin{small}
246 | \begin{tabular}{lp{0.65\textwidth}}
247 | \emph{LJ} & \emph{Function available in the LuaJIT FFI module} \\
248 | \emph{Err}& \emph{May raise a Lua error while reading debugging information} \\
249 | \end{tabular}
250 | \end{small}
251 |
252 | \caption{API functions in the \texttt{eol} namespace}
253 | \label{tab:eol-api-functions-summary}
254 | \end{table}
255 |
256 |
257 | % eol.cdef(text)
258 | % UNAVAILABLE / UNNEEDED
259 |
260 | % eol.C
261 | % libc access (UNIMPLEMENTED)
262 |
263 | \subsubsection{Function \texttt{eol.load}}
264 | \label{sec:eol-api-load}
265 |
266 | \begin{luacode}
267 | library = eol.load(name, global)
268 | \end{luacode}
269 |
270 | Loads a library given its \texttt{name}, and returns it as a \textsf{Library}
271 | userdata value. The library \texttt{name} is specified without the file
272 | extension, because the module will add the appropriate extension for the
273 | operating system being used (e.g. \texttt{.so} for GNU/Linux). The library is
274 | then searched for in the following order of preference:
275 |
276 | \begin{enumerate}
277 |
278 | \item \texttt{name} is an absolute path, and points to an existing file
279 |
280 | \item \texttt{name} is a relative path which, using the working directory as
281 | starting point, can be resolved to an existing file
282 |
283 | \item \texttt{name} does not contain path separators, and a library with
284 | a matching name exists in one of the standard locations for shared libraries
285 | of the operating system being used (e.g. \texttt{/lib}, \texttt{/usr/lib},
286 | and \texttt{/usr/local/lib} for most Unix-like systems, including GNU/Linux)
287 |
288 | \end{enumerate}
289 |
290 | If a suitable library could not be found using the method outlined above, or
291 | it could not be loaded, the function raises a Lua error.
292 |
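One possible reading of the search order above is sketched below in plain
Lua. The function \texttt{candidates} is hypothetical and only builds the list
of paths to try, without touching the file system; the extension and the
directory list are passed as parameters, using the GNU/Linux examples from
the text as values. How the module actually resolves names is not covered
by this sketch.

\begin{luacode}
-- Builds the list of paths to try for a library name, in order of preference.
local function candidates(name, extension, libdirs)
  local filename = name .. extension
  if filename:sub(1, 1) == "/" then
    return { filename }               -- 1. Absolute path.
  end
  local paths = { "./" .. filename }  -- 2. Relative to the working directory.
  if not filename:find("/", 1, true) then
    for _, dir in ipairs(libdirs) do  -- 3. Standard library locations.
      paths[#paths + 1] = dir .. "/" .. filename
    end
  end
  return paths
end

local libdirs = { "/lib", "/usr/lib", "/usr/local/lib" }
for _, path in ipairs(candidates("libtest", ".so", libdirs)) do
  print(path)
end
\end{luacode}
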
293 | The \texttt{global} parameter is a boolean value which determines how the
294 | symbols from the loaded library interact with the ones from other libraries.
295 | When \Mlua|true|, the symbols defined by the library are made available
296 | for symbol resolution of subsequently loaded libraries. The parameter is
297 | optional; if not supplied, the option is disabled, as if \Mlua|false| had been
298 | passed as the second parameter. On a Unix-like system, where
299 | \texttt{dlopen()}~\cite{opengroup-dlopen} is used to load a shared object
300 | file, \Mlua|true| and \Mlua|false| are equivalent to using
301 | \texttt{RTLD\_GLOBAL} and \texttt{RTLD\_LOCAL}, respectively.
302 |
303 | \subsubsection{Function \texttt{eol.type}}
304 | \label{sec:eol-api-type}
305 |
306 | \begin{luacode}
307 | typeinfo = eol.type(library, name)
308 | \end{luacode}
309 |
310 | Obtains the information for a type of a given \texttt{name},
311 | contained in a \texttt{library}. The result is returned as a \textsf{CType}
312 | userdata. If the type is not found in the library, \Mlua|nil| is returned
313 | instead.
314 |
315 |
316 | \subsubsection{Function \texttt{eol.typeof}}
317 | \label{sec:eol-api-typeof}
318 |
319 | \begin{luacode}
320 | typeinfo = eol.typeof(typevalue)
321 | \end{luacode}
322 |
323 | Obtains the type of \texttt{typevalue}, and returns it as a \textsf{CType}
324 | userdata.
325 |
326 | \begin{luacode}
327 | typeinfo = eol.typeof(name)
328 | \end{luacode}
329 |
330 | Alternatively, it is possible to pass a string with the \texttt{name} of
331 | a type, and it will be searched for in all the currently loaded libraries.
332 | This second invocation method is an \Eol* extension which is not available in
333 | the LuaJIT FFI module.
334 |
335 | \subsubsection{Function \texttt{eol.sizeof}}
336 |
337 | \begin{luacode}
338 | size = eol.sizeof(typevalue)
339 | \end{luacode}
340 |
341 | Obtains the size of \texttt{typevalue}, in bytes. If \texttt{typevalue} is
342 | a \textsf{CType} userdata, the size returned corresponds to the size of values
343 | of the type. If the size is not known (e.g. for \Mc|void|, or functions),
344 | \Mlua|nil| is returned instead.
345 |
346 | For any other values, an error is raised.
347 |
348 | \subsubsection{Function \texttt{eol.alignof}}
349 | \label{sec:eol-api-alignof}
350 |
351 | \begin{luacode}
352 | alignment = eol.alignof(typevalue)
353 | \end{luacode}
354 |
355 | Obtains the minimum required alignment for \texttt{typevalue}, in bytes.
356 |
357 |
358 | \subsubsection{Function \texttt{eol.offsetof}}
359 | \label{sec:eol-api-offsetof}
360 |
361 | \begin{luacode}
362 | offset = eol.offsetof(typevalue, field)
363 | \end{luacode}
364 |
365 | Obtains the offset in bytes of \texttt{field} inside \texttt{typevalue}, which
366 | must be a record data type (a \Mc|struct| in C). The \texttt{field} can be
367 | specified as a positive integer, or as a string. In the latter case, if there is
368 | no field with the given name, a Lua error is raised.
369 |
370 | \subsubsection{Function \texttt{eol.cast}}
371 | \label{sec:eol-api-cast}
372 |
373 | \begin{luacode}
374 | value = eol.cast(typeinfo, value)
375 | \end{luacode}
376 |
377 | Creates and returns a new \textsf{Variable} userdata which describes the same
378 | memory area as the passed \texttt{value}, but associates a new
379 | \texttt{typeinfo} to it.
380 |
381 | This function can be used to change the type associated with a \textsf{Variable}
382 | userdata, without changing the value itself. It is useful to manually override
383 | the pointer compatibility checks, or to convert between pointer values and
384 | addresses represented as integers.
385 |
386 | % ctype = eol.metatype(ct, metatable)
387 | %
388 | % cdata = eol.gc(cdata, finalizer)
389 | %
390 | %
391 | %
392 | % status = eol.istype(ct, obj)
393 | %
394 | % eol.copy(dst, src, len)
395 | % eol.copy(dst, str)
396 | % eol.fill(dst, len, [, c])
397 |
398 |
399 | \subsubsection{Function \texttt{eol.abi}}
400 | \label{sec:eol-api-abi}
401 |
402 | \begin{luacode}
403 | flag = eol.abi(param)
404 | \end{luacode}
405 |
406 | Returns \Mlua|true| if \texttt{param} (a Lua string) applies for the target
407 | \gls{ABI}. Returns \Mlua|false| otherwise. The defined parameters are detailed
408 | in \autoref{tab:eol-abi-params}. This function is provided for compatibility
409 | with the LuaJIT FFI module.
410 |
411 | \begin{table}[htH]
412 | \centering
413 | \begin{tabular}{ll}
414 | \toprule
415 | Parameter & Description \\
416 | \midrule
417 | \texttt{"32bit"} & The architecture uses 32-bit wide words. \\
418 | \texttt{"64bit"} & The architecture uses 64-bit wide words. \\
419 | \midrule
420 | \texttt{"le"} & Little-endian architecture. \\
421 | \texttt{"be"} & Big-endian architecture. \\
422 | \bottomrule
423 | \end{tabular}
424 | \caption{Defined parameters for \texttt{eol.abi()}}
425 | \label{tab:eol-abi-params}
426 | \end{table}
427 |
428 |
429 | % \subsubsection{Variable \texttt{eol.os}}
430 | %
431 | % \begin{luacode}
432 | % operatingsystem = eol.os
433 | % \end{luacode}
434 | %
435 | %
436 | % \subsubsection{Variable \texttt{eol.arch}}
437 | %
438 | % \begin{luacode}
439 | % architecture = eol.arch
440 | % \end{luacode}
441 |
442 |
443 | \subsection{Library userdata}
444 | \label{sec:eol-api-library-t}
445 |
446 | \begin{luacode}
447 | libc = eol.load("libc")
448 | stdout = libc.stdout -- Obtain a variable
449 | libc.fputs(stdout, "Hello, libc\n") -- Obtain a function
450 | \end{luacode}
451 |
452 | Userdata values of type \textsf{Library} represent a loaded library. The only
453 | way of obtaining them is using the \texttt{eol.load()} function
454 | (\autoref{sec:eol-api-load}). Indexing a library with a string key looks up
455 | the symbol of the same name, with one of the following results:
456 |
457 | \begin{itemize}
458 |
459 | \item If the symbol refers to executable code, a \textsf{Function} userdata
460 | (\autoref{sec:eol-api-function-t}) is returned.
461 |
462 | \item If the symbol refers to data, a \textsf{Variable} userdata
463 | (\autoref{sec:eol-api-variable-t}) is returned.
464 |
465 | \item Otherwise, \Mlua|nil| is returned.
466 |
467 | \end{itemize}
468 |
469 | Note that it is not possible to obtain a \textsf{CType} userdata directly from
470 | a library. The \texttt{eol.type()} function (\autoref{sec:eol-api-type}) must
471 | be used for that purpose.
472 |
473 |
474 | \subsection{CType Userdata}
475 | \label{sec:eol-api-ctype-t}
476 |
477 | Userdata values of type \textsf{CType} represent information about types used
478 | by the native code of libraries. There are three ways in which values can be
479 | obtained:
480 |
481 | \begin{itemize}
482 |
483 | \item Using the \texttt{\_\_type} key to index a \textsf{Variable}
484 | userdata (\autoref{sec:eol-api-variable-t}).
485 |
486 | \item Using the \texttt{eol.typeof()} function
487 | (\autoref{sec:eol-api-typeof}).
488 |
489 | \item Using the \texttt{eol.type()} function (\autoref{sec:eol-api-type}).
490 |
491 | \end{itemize}
492 |
493 |
494 | \subsubsection{Value Construction}
495 |
496 | \begin{luacode}
497 | new_value = typeinfo(n)
498 | \end{luacode}
499 |
500 | A \textsf{CType} userdata is also a \gls{constructor} for values of the type
501 | it represents, by means of its \texttt{\_\_call} metamethod. Invoking
502 | a \textsf{CType} as a constructor accepts an optional parameter: if supplied,
503 | an array of \texttt{n} elements is created; otherwise a single element is
504 | created. The memory used to store the value is initialized by filling it with
505 | zeroes (\texttt{0x00}).
506 |
507 | All the values created this way are subject to \gls{GC}, as specified in
508 | \autoref{sec:design-gc-interaction}.
509 |
510 |
511 | \subsubsection{Type information}
512 |
513 | Information about the represented data type can be obtained by indexing
514 | \textsf{CType} values (by means of an \texttt{\_\_index} metamethod) with the
515 | keys detailed in \autoref{tab:eol-api-userdata-keys}.
516 |
517 | \begin{table}[ht]
518 | \centering
519 | \begin{tabular}{lp{0.7\textwidth}}
520 | \toprule
521 | Key & Description \\
522 | \midrule
523 | \texttt{name} & Name of the type, as a string. \\
524 | \texttt{sizeof} & Size of values of the type, in bytes. Equivalent to
525 | calling \texttt{eol.sizeof()} passing the \textsf{CType} userdata as
526 | a parameter. \\
527 | \texttt{readonly} & Boolean value; indicates whether the type is
528 | declared as readonly (e.g. using \Mc|const| in C). \\
529 | \texttt{kind} & String which represents the kind of type, e.g.
530 | \texttt{"struct"}, \texttt{"union"}... \\
531 | \texttt{type} & For types which are defined in terms of another
532 | \emph{base type}, the \textsf{CType} userdata for the base type.
533 | Otherwise \Mlua|nil|. \\
534 | \bottomrule
535 | \end{tabular}
536 | \caption{Keys available in \textsf{CType} userdata}
537 | \label{tab:eol-api-userdata-keys}
538 | \end{table}
539 |
540 | For compound data types (in C, \Mc|struct|s and \Mc|union|s), two additional
541 | operations are supported on their \textsf{CType} userdata. The Lua length
542 | operator (\texttt\#, by means of a \texttt{\_\_len} metamethod) returns the
543 | number of members in the compound type, and indexing the userdata as an array
544 | —using numeric indexes— returns information about its \emph{nth} member, as
545 | a Lua table which contains the fields specified in
546 | \autoref{tab:eol-api-ctype-compound-member-fields}.
547 |
548 | \begin{table}[ht]
549 | \centering
550 | \begin{tabular}{lccp{0.6\textwidth}}
551 | \toprule
552 | Key & Enum & Struct & Description\\
553 | \midrule
554 | \texttt{name} & \Tick & \Tick & Name of the member, as a string. \\
555 | \texttt{value} & \Tick & & Value, as an integer. \\
556 | \texttt{type} & & \Tick & Type of the member, as a \textsf{CType} userdata. \\
557 | \texttt{offset}& & \Tick & Offset of the member, in bytes.
558 | Equivalent to calling \texttt{eol.offsetof()} passing the
559 | \textsf{CType} userdata and the member name as parameters. \\
560 | \bottomrule
561 | \end{tabular}
562 | \caption{Keys available in compound \textsf{CType} member information.}
563 | \label{tab:eol-api-ctype-compound-member-fields}
564 | \end{table}
565 |
566 |
567 | \subsubsection{Method \texttt{:pointerto()}}
568 |
569 | \begin{luacode}
570 | pointer_typeinfo = typeinfo:pointerto()
571 | \end{luacode}
572 |
573 | Uses \texttt{typeinfo} as base type to construct a new \textsf{CType} userdata
574 | value which represents a pointer to a value of the base type.
575 |
576 |
577 | \subsubsection{Method \texttt{:arrayof(n)}}
578 |
579 | \begin{luacode}
580 | array_typeinfo = typeinfo:arrayof(n)
581 | \end{luacode}
582 |
583 | Uses \texttt{typeinfo} as base type to construct a new \textsf{CType} userdata
584 | value which represents an array of \texttt{n} elements of the base type.
585 |
586 |
587 | \subsection{Function userdata}
588 | \label{sec:eol-api-function-t}
589 |
590 | Userdata values of type \textsf{Function} represent any piece of native code
591 | from a \textsf{Library} which can be invoked transparently from Lua.
592 |
593 | Information about a \textsf{Function} can be obtained by indexing the userdata
594 | (by means of an \texttt{\_\_index} metamethod) with the keys detailed in
595 | \autoref{tab:eol-api-function-keys}.
596 |
597 | \begin{table}[ht]
598 | \centering
599 | \begin{tabular}{lp{0.7\textwidth}}
600 | \toprule
601 | Key & Description \\
602 | \midrule
603 | \texttt{\_\_name} & Name of the function, as a string. \\
604 | \texttt{\_\_type} & Type of the return value, as a \textsf{CType}
605 | userdata, or \Mlua|nil| if the function does not return a value. \\
606 | \texttt{\_\_library} & Library which contains the function, as a
607 | \textsf{Library} userdata. \\
608 | \bottomrule
609 | \end{tabular}
610 | \caption{Keys available in \textsf{Function} userdata}
611 | \label{tab:eol-api-function-keys}
612 | \end{table}
613 |
614 |
615 | \subsection{Variable userdata}
616 | \label{sec:eol-api-variable-t}
617 |
618 | Userdata values of type \textsf{Variable} represent native data values. Each
619 | value has a pointer to the region of memory occupied by the actual data, and
620 | an associated \textsf{CType} which determines how the pointer to the data is
621 | used.
622 |
623 | The actual value represented by a \textsf{Variable} userdata, and information
624 | about it, can be obtained by indexing the userdata (by means of an
625 | \texttt{\_\_index} metamethod) with the keys detailed in
626 | \autoref{tab:eol-api-variable-keys}.
627 |
628 | For \textsf{Variable}s with an associated array \textsf{CType}, it is possible
629 | to manipulate the variable directly as if it were a Lua array, using the
630 | \Mlua|variable[index]| syntax, and the length operator (\texttt\#) returns
631 | the number of elements in the array.
632 |
633 | % \texttt{\_\_index} and \texttt{\_\_newindex} metamethods allow
634 | %
635 | % contents using normal array
636 | % to use the Lua length operator (\texttt\#, by means of a \texttt{\_\_len}
637 | % metamethod) to obtain the number of elements in the array, and manipulating
638 | % the values of individual elements using numeric indexes (both reading and
639 | % writing values of the elements are possible, by means of the
640 | % \texttt{\_\_index} and \texttt{\_\_newindex} metamethods, respectively).
641 |
642 | \begin{table}[ht]
643 | \centering
644 | \begin{tabular}{lcp{0.65\textwidth}}
645 | \toprule
646 | Key & Writable & Description \\
647 | \midrule
648 | \texttt{\_\_value} & \Tick & Value of the variable. \\
649 | \texttt{\_\_name} & & Name of the variable, as a string. \\
650 | \texttt{\_\_type} & & Type of the variable, as a \textsf{CType} userdata. \\
651 | \texttt{\_\_library} & & Library which contains the variable, as
652 | a \textsf{Library} userdata. It may be \Mlua|nil| for variables created from Lua. \\
653 | \bottomrule
654 | \end{tabular}
655 | \caption{Keys available in \textsf{Variable} userdata}
656 | \label{tab:eol-api-variable-keys}
657 | \end{table}
658 |
659 |
660 |
661 | \subsubsection{Type information}
662 |
663 | \textsf{Function} userdata values provide information about their return type
664 | when indexing them with the \texttt{\_\_type} key (see
665 | \autoref{tab:eol-api-function-keys}). Type information for function parameters
666 | is also provided: applying the Lua length operator (\texttt\#, by means of
667 | a \texttt{\_\_len} metamethod) returns the number of parameters accepted, and
668 | indexing the userdata as an array, using numeric indexes, returns the type
669 | information for the parameters as \textsf{CType} userdata.
670 |
671 |
672 | \subsubsection{Invocation}
673 |
674 | The \texttt{\_\_call} metamethod is implemented for \textsf{Function}
675 | userdata values, effectively making them directly callable from Lua. Invoking
676 | native code involves:
677 |
678 | \begin{enumerate}
679 |
680 | \item Checking that the number of parameters passed to the function from Lua
681 | matches the number accepted by the native function.
682 |
683 | \item Allocating as much space as needed to pass the parameters to
684 | the native function, plus the space needed for the return value (if any).
685 | The amount of space needed must be calculated using the sizes of the native
686 | types, as used by the native function.
687 |
688 | \item For each value passed as a parameter in Lua:
689 |
690 | \begin{enumerate}
691 |
692 | \item Checking that the type of the Lua value is compatible and can be
693 | converted to a value of the type expected by the native function.
694 |
695 | \item Converting the Lua value to the corresponding native type, and
696 | storing the result in the allocated space.
697 |
698 | \end{enumerate}
699 |
700 | \item Re-arranging the data as needed, if the in-memory layout of the
701 | allocated data does not match the layout defined by the \gls{ABI} of the
702 | target architecture and operating system.
703 |
704 | \item Invoking the native function by jumping to its start address.
705 |
706 | \item Converting the return value, if any, to a Lua value, and pushing
707 | it onto the Lua stack.
708 |
709 | \end{enumerate}
710 |
711 |
712 | \section{Testing}
713 | \label{sec:design-testing}
714 |
715 | We decided to use a test harness for \Eol*. The test harness should be usable
716 | not only for unit testing, but also for regression testing, so it should
717 | exercise the implementation using the \Eol* module API
718 | (\autoref{sec:design-lua-api}) and not depend on knowledge about the internals
719 | of the implementation.
720 |
721 | One challenge for the test harness is that programming errors in the system
722 | may cause the entire process to hang, or crash: the \Eol* Lua module is
723 | implemented in C, and therefore all the caveats of running and testing native
724 | code apply.
725 |
726 | A number of third party unit testing frameworks exist for Lua, but evaluating
727 | them showed that none of them satisfies our requisites:
728 |
729 | \begin{itemize}
730 |
731 | \item Compatibility with Lua 5.3, which is the Lua version used as target.
732 | Many of the testing frameworks support older versions only, and
733 | \texttt{lunit}\footnote{\url{http://www.mroth.net/lunit/}},
734 | Lunity\footnote{\url{https://github.com/Phrogz/Lunity}},
735 | Lunatest\footnote{\url{https://github.com/silentbicycle/lunatest}},
736 | LuaUnit\footnote{\url{https://github.com/bluebird75/luaunit}},
737 | Shake\footnote{\url{http://shake.luaforge.net/}},
738 | BTDLua\footnote{\url{http://users.skynet.be/adrias/Lua/BTDLua/}},
739 | Luaspec\footnote{\url{https://github.com/mirven/luaspec/}},
740 | Telescope\footnote{\url{https://github.com/norman/telescope/}}, and
741 | Gambiarra\footnote{\url{https://bitbucket.org/zserge/gambiarra}} were
742 | discarded because of that.
743 |
744 | \item Ability to gracefully handle crashes of the process running the Lua
745 | VM. Many testing frameworks for Lua focus on testing Lua code, and do not
746 | handle crashes in native code gracefully. Because of this,
747 | \texttt{lunitx}\footnote{\url{https://github.com/dcurrie/lunit}},
748 | Testy\footnote{\url{https://github.com/siffiejoe/lua-testy}},
749 | Busted\footnote{\url{http://olivinelabs.com/busted/}}, and
750 | TestMore\footnote{\url{http://fperrad.github.io/lua-TestMore/}}
751 | were discarded.
752 |
753 | \end{itemize}
754 |
755 | Because of the impossibility of reusing an existing testing framework, we
756 | needed to implement our own test harness, which revolves around the
757 | requirement of gracefully handling crashes in native code.
758 |
759 | \minisec{Unit Test Isolation}
760 |
761 | The best way of ensuring that the test harness can continue running in the
762 | event of a crash is to run each unit test in a new process, with a fresh Lua
763 | VM. This motivates storing each unit test in its own Lua script. This
764 | way it is possible for the test harness to run as a separate process, which in
765 | turn executes a new process for each unit test, which can crash safely
766 | without affecting the test harness or the rest of the tests.
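
The isolation scheme can be sketched in C using the POSIX process API. This
is only an illustration and not the actual harness code: the
\verb|TestOutcome| type and the \verb|run_isolated()| function are
hypothetical names. The sketch assumes that a unit test reports failure with
a non-zero exit status, and that a crash shows up as termination by a signal.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical outcome codes; the names are illustrative only. */
typedef enum { TEST_PASSED, TEST_FAILED, TEST_CRASHED } TestOutcome;

/*
 * Runs the program named by argv[0] in a child process, passing it the
 * NULL-terminated argument vector, and classifies the wait status.
 */
TestOutcome
run_isolated (char *const argv[])
{
  assert (argv != NULL && argv[0] != NULL);

  pid_t pid = fork ();
  if (pid < 0)
    return TEST_CRASHED;        /* The process could not be created. */

  if (pid == 0) {
    execvp (argv[0], argv);     /* Only returns on failure. */
    _exit (127);                /* Report "cannot execute" as a failure. */
  }

  int status = 0;
  if (waitpid (pid, &status, 0) != pid)
    return TEST_CRASHED;

  if (WIFSIGNALED (status))     /* Killed by SIGSEGV, SIGABRT, etc. */
    return TEST_CRASHED;
  if (WIFEXITED (status) && WEXITSTATUS (status) == 0)
    return TEST_PASSED;
  return TEST_FAILED;
}
```

In a harness built this way, the argument vector would name the Lua
interpreter followed by the path of the unit test script. A crash in the
child only changes the wait status seen by the parent, which keeps running.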
767 |
768 | \minisec{Test Assertions}
769 |
770 | On top of the standard \Mlua|assert()| function provided by Lua, the test
771 | harness additionally provides the assertions listed in
772 | \autoref{tab:design-test-asserts}, to be used in unit tests.
773 |
774 | \begin{table}[htH]
775 | \centering
776 | \begin{tabular}{lp{0.6\textwidth}}
777 | \toprule
778 | Function & Description \\
779 | \midrule
780 | \verb|assert.False(value)| &
781 | Checks that \texttt{value} is \verb|false| \\
782 | \verb|assert.True(value)| &
783 | Checks that \texttt{value} is \verb|true| \\
784 | \verb|assert.Falsey(value)| &
785 | Checks that \texttt{value} is \verb|false| or \Mlua|nil| \\
786 | \verb|assert.Truthy(value)| &
787 | Checks that \texttt{value} evaluates to a non-falsey value (any value
788 | except \verb|false| or \Mlua|nil|) \\
789 | \verb|assert.Error(f)| &
790 | Checks whether invoking function \texttt{f} raises a Lua error \\
791 | \verb|assert.Callable(value)| &
792 | Checks whether \texttt{value} is a function or has a \texttt{\_\_call}
793 | metamethod which allows treating it as a function \\
794 | \verb|assert.Field(obj, name)| &
795 | Checks whether an \texttt{obj}ect is indexable and contains a field with
796 | the given \texttt{name} \\
797 | \verb|assert.Userdata(value, T)| &
798 | Checks whether a \texttt{value} is a userdata of type \texttt{T} \\
799 | \verb|assert.Equal(a, b)| &
800 | Checks whether two values \texttt{a} and \texttt{b} are equal \\
801 | \verb|assert.Match(re, str)| &
802 | Checks whether a \texttt{str}ing matches a particular \texttt{re}gular
803 | expression. \\
804 | \bottomrule
805 | \end{tabular}
806 | \caption{Additional assertions provided by the test harness}
807 | \label{tab:design-test-asserts}
808 | \end{table}
809 |
810 | The checks performed by the assertions can be reversed by indexing the
811 | \verb|assert| object with the \verb|Not| key, and using the result to invoke
812 | the negated assertion. As complex as it may sound, this is easily exemplified:
813 |
814 | \begin{luacode}
815 | assert.Equal(2, 1+1) -- Normal assertion
816 | assert.Not.Equal(nil, 2) -- Negated assertion
817 | \end{luacode}
818 |
819 | \minisec{Test Names}
820 |
821 | In order to allow specifying which test (or tests) from the corpus of unit tests
822 | are to be run, we need a way to refer to them by name. Provided that each unit
823 | test is contained in a file, the file name without the \verb|.lua| suffix is
824 | used as the name of the unit test.
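
As a minimal sketch of this naming rule, the helper below derives a test name
from a script file name. The function name \verb|test_name_from_path()| is
hypothetical. The sketch only strips the suffix and makes no assumption about
how directory components are handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated copy of `path` without its trailing ".lua",
 * or NULL if the path has no such suffix, consists of the suffix alone,
 * or memory is exhausted. The caller must free() the result.
 */
char*
test_name_from_path (const char *path)
{
  static const char suffix[] = ".lua";
  const size_t suffix_len = sizeof (suffix) - 1;

  assert (path != NULL);
  const size_t len = strlen (path);
  if (len <= suffix_len || strcmp (path + len - suffix_len, suffix) != 0)
    return NULL;

  char *name = malloc (len - suffix_len + 1);
  if (name) {
    memcpy (name, path, len - suffix_len);
    name[len - suffix_len] = '\0';
  }
  return name;
}
```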
825 |
826 | \minisec{Test harness architecture}
827 |
828 | The components of the test harness are shown in \autoref{fig:design-harness}.
829 |
830 | \begin{figure}[th]
831 | \centering
832 | \begin{tikzpicture}[node distance=2cm]
833 | \node[component] (test) {Test};
834 | \node[component] (runner) [right=1cm of test] {Runner};
835 | \node[component] (baseout) [right=1cm of runner] {Output};
836 | \node[component, below] (tapout) [below of=baseout, xshift=-19mm] {TAP Output};
837 | \node[component, below] (conout) [right=1cm of tapout] {Console Output};
838 |
839 | \node[datain] (t1) [below of=test] {\texttt{test1.lua}};
840 | \node[datain] (t2) [below of=t1, node distance=1.1\baselineskip]
841 | {\texttt{test2.lua}};
842 | \node[datain] (tt) [below of=t2, node distance=1.1\baselineskip] {...};
843 | \node (ttlabel) [below of=tt, node distance=1.5\baselineskip]
844 | {Unit Test Scripts};
845 |
846 | \path[uses] (runner.east) -- (baseout.west);
847 |
848 | \path[uses, solid] (tapout.east) -| (baseout.south);
849 | \path[uses, solid] (conout.west) -| (baseout.south);
850 |
851 | \path[contains] (runner.west) -- (test.east);
852 |
853 | \begin{pgfonlayer}{background}
854 | \node[datablob] (tests) [fit=(t1) (t2) (tt) (ttlabel), drop shadow] {};
855 | \end{pgfonlayer}
856 |
857 | \path[datain, dashed, thick] (tests.north) -- (test.south);
858 | \end{tikzpicture} \caption{Architecture of the test harness}
859 | \label{fig:design-harness} \end{figure}
860 |
861 | Each unit test, which is ultimately a Lua script in the file system, is
862 | modelled by a \textsf{Test}, which is responsible for executing its
863 | corresponding Lua script in a new process and determining whether the
864 | execution of the unit test succeeded, failed, or crashed.
865 |
866 | The \textsf{Runner} is the main component of the harness. It manages
867 | a collection of \textsf{Test}, and is responsible for triggering their
868 | execution, keeping statistics about the test process (total number of tests,
869 | amount of failed tests, and so on), and reporting status and results to an
870 | \textsf{Output}.
871 |
872 | An \textsf{Output} is responsible for reporting the status and results of the
873 | execution of the unit tests to the user. Its interface is abstract, and two
874 | concrete implementations are to be initially provided: \textsf{Console
875 | Output}, to produce textual output suitable for display in a Unix color
876 | terminal (DEC VT420 or compatible, e.g. XTerm), and \textsf{TAP Output}, to
877 | write the output in the \gls{TAP} format~\cite{tap-spec}. The latter, being
878 | a de facto standard, allows integration with third party tools.
879 |
880 | \beforeintro
881 |
--------------------------------------------------------------------------------
/implementation.tex:
--------------------------------------------------------------------------------
1 | % vim: ft=tex spell spelllang=en ts=2 sw=2 et
2 |
3 | \setchaptertoc
4 | \chapter{Implementation}
5 | \clearpage
6 | % \enlargethispage{2\baselineskip}
7 |
8 | This chapter provides both guidance to browse the \Eol* source
9 | code~\cite{eol-github}, and insight into the details worth
10 | mentioning of the implementation of the system.
11 |
12 | % \enlargethispage{2\baselineskip}
13 | \afterintro
14 |
15 |
16 | \section{Project Source Structure}
17 |
18 | The \Eol* source code is organized in the following directory structure, which
19 | follows usual conventions for C projects:
20 |
21 | \begin{figure}[h]
22 | \centering
23 | \noindent\begin{minipage}{0.75\textwidth}
24 | \dirtree{%
25 | .1 \DtFolder{lua-eol/}.
26 | .2 \DtFolder{doc/} \DTcomment{Documentation and API reference}.
27 | .2 \DtFolder{examples/}.
28 | .3 *.lua \DTcomment{Module usage examples}.
29 | .2 \DtFolder{tools/} \DTcomment{Build \& test utilities}.
30 | .3 \DtFolder{ninja/} \DTcomment{Ninja build support files}.
31 | .3 \DtFolder{make/} \DTcomment{GNU Make build support files}.
32 | .2 \DtFolder{test/}.
33 | .3 *.lua \DTcomment{Unit tests}.
34 | .2 uthash.h \DTcomment{Copy of UT-hash}.
35 | .2 eol-*.c \DTcomment{Module sources}.
36 | .2 eol-*.h \DTcomment{Module sources}.
37 | }
38 | \end{minipage}
39 | \caption{Source tree structure.}
40 | \end{figure}
41 |
42 | Module source files (\verb|eol-*.h|, \verb|eol-*.c|) are named after the
43 | components identified during the design phase. In particular:
44 |
45 | \begin{itemize}
46 |
47 | \item \verb|eol-module.c| \hfill\\
48 | Main part of the code, including the interfacing with Lua.
49 |
50 | \item \verb|eol-fcall.h|,
51 | \verb|eol-fcall-.h|,
52 | \verb|eol-fcall-.c|... \hfill\\
53 | Different implementations of the native function invocation mechanism.
54 |
55 | \item \verb|eol-typing.h|, \verb|eol-typing.c| \hfill\\
56 | Type representation module.
57 |
58 | \item \verb|eol-typecache.h|, \verb|eol-typecache.c|, \verb|uthash.h| \hfill\\
59 | Type representation cache module.
60 |
61 | \item \verb|eol-libdwarf.h|, \verb|eol-libdwarf.c| \hfill\\
62 | Utility functions to simplify working with \verb|libdwarf|.
63 |
64 | \item \verb|eol-lua.h| \hfill\\
65 | Utility functions to simplify working with the Lua C API.
66 |
67 | \item \verb|eol-trace.h|, \verb|eol-trace.c| \hfill\\
68 | Tracing support module.
69 |
70 | \item \verb|eol-util.h|, \verb|eol-util.c| \hfill\\
71 | Miscellaneous utility code, including support code for the runtime checks.
72 |
73 | \end{itemize}
74 |
75 |
76 | \section{Type Representation}
77 |
78 | Converting values from C to Lua, and vice versa, is one of the most important
79 | tasks performed by \Eol*: C values need to be made accessible from Lua.
80 | Therefore, type information needs to be read from the DWARF debugging
81 | information (see \nameref{sec:debuginfo-structure}), and kept around in
82 | a suitable data structure. This structure must be:
83 |
84 | \begin{itemize}
85 | \item Exhaustive, to hold all the needed information.
86 | \item Compact, to minimize memory usage.
87 | \end{itemize}
88 |
89 | Describing base types is possible using just an enumerated type: there is
90 | a fixed number of them, and their characteristics (size, name, etc.) are well
91 | known. The challenging part is representing user defined types (\Mc|struct|,
92 | \Mc|enum|, \Mc|union|), and derived types (pointers, arrays).
93 |
94 | The data structure for describing types is \verb|EolTypeInfo|
95 | (\autoref{lst:EolTypeInfo}). It is a tagged \Mc|struct|, with the tag
96 | indicating the type kind (\verb|EOL_TYPE_S32| for 32-bit signed integers,
97 | \verb|EOL_TYPE_STRUCT| for a \Mc|struct|, etc; the complete list of values
98 | can be seen in \autoref{lst:EolType}). The contained data will vary
99 | depending on the value of the \emph{kind} tag. The members for all possible
100 | values are grouped in a \Mc|union| in order to make them share the
101 | same memory space.
102 |
103 | \begin{listing}[H]
104 | \begin{ccode}
105 | struct _EolTypeInfo {
106 | EolType type;
107 | union {
108 | struct TI_base ti_base;
109 | struct TI_pointer ti_pointer;
110 | struct TI_typedef ti_typedef;
111 | struct TI_const ti_const;
112 | struct TI_array ti_array;
113 | struct TI_compound ti_compound;
114 | };
115 | };
116 | typedef struct _EolTypeInfo EolTypeInfo;
117 | \end{ccode}
118 | \caption{\texttt{EolTypeInfo}.}
119 | \label{lst:EolTypeInfo}
120 | \end{listing}
121 |
122 | \begin{listing}[tH]
123 | \centering
124 | \begin{ccode}
125 | typedef enum {
126 | EOL_TYPE_VOID, /* void */
127 | EOL_TYPE_BOOL, /* _Bool */
128 | EOL_TYPE_S8, /* int8_t */
129 | EOL_TYPE_U8, /* uint8_t */
130 | EOL_TYPE_S16, /* int16_t */
131 | EOL_TYPE_U16, /* uint16_t */
132 | EOL_TYPE_S32, /* int32_t */
133 | EOL_TYPE_U32, /* uint32_t */
134 | EOL_TYPE_S64, /* int64_t */
135 | EOL_TYPE_U64, /* uint64_t */
136 | EOL_TYPE_FLOAT, /* float */
137 | EOL_TYPE_DOUBLE, /* double */
138 | EOL_TYPE_TYPEDEF, /* typedef … T */
139 | EOL_TYPE_CONST, /* const T */
140 | EOL_TYPE_POINTER, /* T* */
141 | EOL_TYPE_ARRAY, /* T …[n] */
142 | EOL_TYPE_STRUCT, /* struct … */
143 | EOL_TYPE_UNION, /* union … */
144 | EOL_TYPE_ENUM, /* enum … */
145 | } EolType;
146 | \end{ccode}
147 | \caption{\texttt{EolType} enumeration.}
148 | \label{lst:EolType}
149 | \end{listing}
150 |
151 |
152 | \begin{table}[f]
153 | \centering
154 | \begin{tabular}{lll}
155 | \toprule
156 | C Construct & DWARF DIE & \Eol* Type \\
157 | \midrule
158 | \Mc|void| & ø & \Mc|EOL_TYPE_VOID| \\
159 | \Mc|bool| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_BOOL| \\
160 | \Mc|int8_t| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_S8| \\
161 | \Mc|uint8_t| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_U8| \\
162 | \Mc|int16_t| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_S16| \\
163 | \Mc|uint16_t| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_U16| \\
164 | \Mc|int32_t| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_S32| \\
165 | \Mc|uint32_t| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_U32| \\
166 | \Mc|int64_t| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_S64| \\
167 | \Mc|uint64_t| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_U64| \\
168 | \Mc|float| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_FLOAT| \\
169 | \Mc|double| & \verb|DW_TAG_base_type| & \Mc|EOL_TYPE_DOUBLE| \\
170 | \Mc|typedef|... & \verb|DW_TAG_typedef| & \Mc|EOL_TYPE_TYPEDEF| \\
171 | \Mc|const|... & \verb|DW_TAG_const_type| & \Mc|EOL_TYPE_CONST| \\
172 | ...\Mc|*| & \verb|DW_TAG_pointer_type| & \Mc|EOL_TYPE_POINTER| \\
173 | ...\Mc|[n]| & \verb|DW_TAG_array_type| & \Mc|EOL_TYPE_ARRAY| \\
174 | \Mc|struct|... & \verb|DW_TAG_structure_type| & \Mc|EOL_TYPE_STRUCT| \\
175 | \Mc|union|... & \verb|DW_TAG_union_type| & \Mc|EOL_TYPE_UNION| \\
176 | \Mc|enum|... & \verb|DW_TAG_enumeration_type| & \Mc|EOL_TYPE_ENUM| \\
177 | \bottomrule
178 | \end{tabular}
179 | \caption{Mapping of C types, DWARF DIEs and \Mc|EolType|.}
180 | \end{table}
181 |
182 | \noindent
183 | The following sections describe in detail the members of \verb|EolTypeInfo|.
184 |
185 |
186 | \subsection{Base Type Representation}
187 |
188 | \begin{ccode*}{samepage=true}
189 | struct TI_base {
190 | char *name;
191 | uint32_t size;
192 | };
193 | \end{ccode*}
194 |
195 | \noindent
196 | Even though it is sufficient to provide type kind codes for all the base types
197 | as discussed before, providing the possibility of querying their \verb|name|
198 | and \verb|size| is a convenient feature, at a very small cost: the
199 | \verb|EolTypeInfo| value for each one of the base types is a singleton,
200 | defined as follows:
201 |
202 | \begin{ccode*}{samepage=true}
203 | /* File: eol-typing.h */
204 | extern const EolTypeInfo* eol_typeinfo_u32;
205 |
206 | /* File: eol-typing.c */
207 | const EolTypeInfo* eol_typeinfo_u32 = &((EolTypeInfo) {
208 | .type = EOL_TYPE_U32,
209 | .ti_base.name = "uint32_t",
210 | .ti_base.size = sizeof (uint32_t),
211 | });
212 | \end{ccode*}
213 |
214 | \noindent In practice, to avoid writing the definitions of all the base types,
215 | the C preprocessor and a couple of generator macros are used (see
216 | \autoref{sec:cpp-abuse-genmacros}).
217 |
218 |
219 | \subsection{Pointer Representation}
220 | \label{sec:pointer-typeinfo}
221 |
222 | \begin{ccode*}{samepage=true}
223 | struct TI_pointer {
224 | const EolTypeInfo *typeinfo;
225 | };
226 | \end{ccode*}
227 |
228 | \noindent
229 | Pointers are represented by referencing the \verb|EolTypeInfo| of the
230 | pointed-to type. Thus, it is the only member in \Mc|struct TI_pointer|. The
231 | size of a pointer value is platform dependent, but well known and constant for
232 | each platform, and is the value of the C expression \Mc|sizeof(void*)|.
233 |
234 |
235 | \subsection{Array Representation}
236 |
237 | \begin{ccode*}{samepage=true}
238 | struct TI_array {
239 | const EolTypeInfo *typeinfo;
240 | uint64_t n_items;
241 | };
242 | \end{ccode*}
243 |
244 | \noindent
245 | Arrays are represented by referencing the \verb|EolTypeInfo| of the array
246 | items, plus the number of items (\verb|n_items|) present in the array. The
247 | size of an array value can be calculated by multiplying the size of the item type
248 | by the number of items in the array.
249 |
250 |
251 | \subsection{User Defined Type Representation}
252 |
253 | \begin{ccode*}{samepage=true}
254 | struct TI_compound {
255 | char *name;
256 | uint32_t size;
257 | uint32_t n_members;
258 | EolTypeInfoMember members[];
259 | };
260 | \end{ccode*}
261 |
262 | \noindent This record type represents all user defined types: enumerated types
263 | (\Mc:enum:), record types (\Mc:struct:), and union types (\Mc:union:):
264 |
265 | \begin{description}
266 | \item [\Mc|name|] \hfill \\
267 | User defined types are usually given a name, but the name is optional:
268 | for anonymous types the value is \Mc|NULL|.
269 | \item [\Mc|size|] \hfill \\
270 | Contains the size of the type, in bytes.
271 | \item [\Mc|n_members| / \Mc|members|] \hfill \\
272 | Count of members (or enumerators, for \verb|EOL_TYPE_ENUM|) in the type,
273 | and an array containing their descriptions. Using
274 | a \gls{flexible-array-member} allows usage of a single chunk of memory
275 | for the \verb|EolTypeInfo| itself and the items in the array.
276 | \end{description}
277 |
278 | \noindent
279 | The auxiliary \verb|EolTypeInfoMember| type is defined as follows:
280 |
281 | \begin{ccode*}{samepage=true}
282 | typedef struct {
283 | const char *name;
284 | union {
285 | int64_t value; /* enum */
286 | struct { /* union, struct */
287 | uint32_t offset;
288 | const EolTypeInfo *typeinfo;
289 | };
290 | };
291 | } EolTypeInfoMember;
292 | \end{ccode*}
293 |
294 | \noindent
295 | This always contains the (optional) \Mc|name| of the member, and usage of the
296 | remaining fields varies with the type being described:
297 |
298 | \begin{itemize}
299 | \item For \Mc|EOL_TYPE_STRUCT|, the \Mc|offset| of the member (in bytes,
300 | from the beginning of the record) and a pointer to its type information
301 | (\Mc|typeinfo|) are used.
302 | \item For \Mc|EOL_TYPE_UNION|, the \Mc|offset| is ignored, and only the
303 | type information of the member (\Mc|typeinfo|) is used.
304 | \item For \Mc|EOL_TYPE_ENUM|, only the \Mc|value| associated with the
305 | enumerator is used.
306 | \end{itemize}
307 |
308 | \noindent
309 | A \Mc|union| is used to make fields share the same memory space.
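
The single-allocation layout made possible by the flexible array member can
be sketched as follows. The \verb|Member| and \verb|Compound| types are
simplified stand-ins for the structures shown above, and
\verb|compound_new()| is a hypothetical helper, not a function of the \Eol*
sources.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for the types shown above; not the real ones. */
typedef struct {
  const char *name;
  uint32_t    offset;
} Member;

typedef struct {
  const char *name;
  uint32_t    size;
  uint32_t    n_members;
  Member      members[];   /* Flexible array member: must come last. */
} Compound;

/*
 * Allocates the header plus room for n_members descriptions in a single
 * zero-filled chunk. Returns NULL when out of memory. The whole value is
 * released with one call to free().
 */
Compound*
compound_new (const char *name, uint32_t size, uint32_t n_members)
{
  Compound *c = calloc (1, sizeof (Compound) + n_members * sizeof (Member));
  if (!c)
    return NULL;
  c->name = name;
  c->size = size;
  c->n_members = n_members;
  return c;
}
```

Because the header and the member descriptions live in the same chunk, there
is one allocation to make and one to release per user defined type.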
310 |
311 |
312 | \subsection{Type Alias Representation}
313 |
314 | \begin{ccode*}{samepage=true}
315 | struct TI_typedef {
316 | char *name;
317 | const EolTypeInfo *typeinfo;
318 | };
319 | \end{ccode*}
320 |
321 | Type aliases assign a name to an arbitrary type. They are represented by the
322 | \Mc|name| and a pointer to the \Mc|EolTypeInfo| of the type.
323 |
324 |
325 | \subsection{Read-only Type Representation}
326 |
327 | \begin{ccode*}{samepage=true}
328 | struct TI_const {
329 | const EolTypeInfo *typeinfo;
330 | };
331 | \end{ccode*}
332 |
333 | \noindent
334 | Flagging a type as read-only (i.e. using the \Mc|const| type qualifier in C)
335 | is represented in the same way as pointers (\autoref{sec:pointer-typeinfo}):
336 | by keeping a pointer to the \Mc|EolTypeInfo| of the type that is made read-only.
337 |
338 |
339 |
340 |
341 | \section{Type cache}
342 | \label{sec:impl-type-cache}
343 |
344 | The type cache is implemented as an opaque data structure which can only be
345 | used by means of its public API (\autoref{lst:eol-typecache-api}). Internally
346 | it is implemented as a hash table which reuses uthash~\cite{uthash-guide}, and
347 | it maps integer keys (\Mc|uint32_t|) to \Mc|EolTypeInfo| structures. Cache
348 | keys can be any integer which uniquely identifies a particular type.
349 | For a \Mc|EolTypeInfo| created from its DWARF representation, the offset of
350 | the top-level \gls{DIE} is used as the key. This works because information for
351 | a particular type is never duplicated inside the same ELF file, so there is a
352 | unique offset in the file for it.
353 |
354 | \begin{listing}[tH]
355 | \centering
356 | \begin{ccode}
357 | typedef struct _EolTypeCacheEntry* EolTypeCache;
358 |
359 | typedef bool (*EolTypeCacheIter) (EolTypeCache*,
360 | const EolTypeInfo*,
361 | void *userdata);
362 |
363 | void eol_type_cache_init (EolTypeCache *cache);
364 | void eol_type_cache_free (EolTypeCache *cache);
365 |
366 | void eol_type_cache_add (EolTypeCache *cache,
367 | uint32_t offset,
368 | const EolTypeInfo *typeinfo);
369 |
370 | const EolTypeInfo* eol_type_cache_lookup (EolTypeCache *cache,
371 | uint32_t offset);
372 |
373 | void eol_type_cache_foreach (EolTypeCache *cache,
374 | EolTypeCacheIter callback,
375 | void *userdata);
376 | \end{ccode}
377 | \caption{Public API of \Mc|EolTypeCache|.}
378 | \label{lst:eol-typecache-api}
379 | \end{listing}
380 |
381 | The type cache only manages its own dynamically allocated memory, used for the
382 | nodes of the hash table. The \Mc|EolTypeInfo| structures referenced by the
383 | cache are considered opaque by the cache, and the memory used by them is
384 | never freed by the cache: freeing the cache leaks memory if the cached entries
385 | are not freed by other means. In practice, this is not a problem because the
386 | type information is constructed as-needed, and cached during the whole
387 | lifetime of each loaded ELF object. This approach makes it possible to simply
388 | iterate over the elements and free each one of them before unloading the object file.
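
The ownership rule can be illustrated with a reduced cache. This sketch is
not the \Eol* implementation: it replaces the uthash table with a singly
linked list to stay self-contained, and the names \verb|Cache|,
\verb|cache_add()|, \verb|cache_lookup()| and \verb|cache_free()| are
hypothetical. As in the real cache, only the nodes are owned by the cache,
and the type descriptions are borrowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int kind; } TypeInfo;  /* Stand-in for EolTypeInfo. */

typedef struct CacheEntry {
  uint32_t           offset;    /* Key: offset of the top-level DIE. */
  const TypeInfo    *typeinfo;  /* Borrowed: never freed by the cache. */
  struct CacheEntry *next;
} CacheEntry;

typedef CacheEntry* Cache;

void
cache_init (Cache *cache)
{
  *cache = NULL;
}

/* Returns false if the node could not be allocated. */
bool
cache_add (Cache *cache, uint32_t offset, const TypeInfo *typeinfo)
{
  CacheEntry *entry = malloc (sizeof (CacheEntry));
  if (!entry)
    return false;
  entry->offset = offset;
  entry->typeinfo = typeinfo;
  entry->next = *cache;
  *cache = entry;
  return true;
}

/* Returns NULL when no type is registered for the given offset. */
const TypeInfo*
cache_lookup (const Cache *cache, uint32_t offset)
{
  for (const CacheEntry *e = *cache; e; e = e->next)
    if (e->offset == offset)
      return e->typeinfo;
  return NULL;
}

/* Frees the nodes only; the TypeInfo values must be freed by their owner. */
void
cache_free (Cache *cache)
{
  while (*cache) {
    CacheEntry *next = (*cache)->next;
    free (*cache);
    *cache = next;
  }
}
```

A linear list makes lookups slower than a hash table would, which is
acceptable for an illustration but is the reason a real cache would prefer
the latter.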
389 |
390 |
391 | \section{Memory Ownership and Life Cycle}
392 |
393 | Native code uses a different approach for memory management compared to Lua:
394 | while Lua uses \gls{GC}, which handles freeing chunks of memory automatically,
395 | native code frees memory explicitly. Take for example a function which creates
396 | a new \Mc|struct| and returns it:
397 |
398 | \begin{ccode*}{samepage=true}
399 | struct point { int x; int y; };
400 |
401 | struct point* point_new (int x, int y) {
402 | struct point *p = malloc (sizeof (struct point));
403 | p->x = x;
404 | p->y = y;
405 | return p;
406 | }
407 | \end{ccode*}
408 |
409 | Then, that code is built into an ELF shared object file (\verb|point.so|),
410 | which is loaded using \Eol*, and used normally:
411 |
412 | \begin{luacode}
413 | local Geometry = eol.load("point.so")
414 | local point = Geometry.point_new(1, -1)
415 | -- Use the point normally.
416 | \end{luacode}
417 |
418 | The Lua VM only knows the userdata that \Eol* has created to represent the
419 | returned value, but it is not aware of the memory that has been allocated to
420 | hold the \Mc|struct point| value. Once the value is not used anymore by the
421 | Lua program, the garbage collector reclaims the space used by the userdata,
422 | but \verb|free()| is not called to free the memory allocated by
423 | \verb|malloc()|. Lua just does not know about memory that it has not allocated
424 | itself. One solution is to manually call a function to free the memory from
425 | Lua:
426 |
427 | \begin{luacode}
428 | local libc = eol.load("libc.so")
429 | libc.free(point)
430 | point = nil -- Make sure it won't be used
431 | \end{luacode}
432 |
433 | The main problem with this is that the automatic memory management done by the
434 | Lua VM is lost, and programmers are forced to write additional code to free
435 | memory regions. This puts a burden on the developer, which is better
436 | avoided.
437 |
438 |
439 | \subsection{Lua as a Custom Allocator}
440 | \label{sec:userdata-lua-custom-allocator}
441 |
442 | The Lua VM exposes in its C API the ability to create \emph{userdata} objects.
443 | For the VM, userdata is seen as an opaque value which, by default, cannot be
444 | manipulated from Lua; for the client code using the Lua API, a userdata value
445 | is a region of memory allocated by the Lua VM, which can contain any data.
446 | Like every other value managed by the VM, userdata is subject to \gls{GC},
447 | which means that userdata values which are no longer referenced by a Lua
448 | program are garbage collected. This effectively allows C programmers to reuse
449 | the Lua GC for their data.
450 |
451 | \begin{listing}[tH]
452 | \centering
453 | \begin{ccode}
454 | struct data { /* ... */ };
455 |
456 | void initialize_data (struct data *d);
457 |
458 | struct data* push_new_data (lua_State *L) {
459 | struct data *d = lua_newuserdata (L, sizeof (struct data));
460 | initialize_data (d);
461 | return d;
462 | }
463 | \end{ccode}
464 | \caption{Using Lua userdata to store values}
465 | \label{lst:values-in-userdata}
466 | \end{listing}
467 |
468 | By default, userdata values have no predefined behaviour in Lua, except for
469 | assignment (which also covers passing userdata values as function parameters),
470 | and testing for identity. Assigning a metatable to a userdata value allows
471 | the programmer to define operations on userdata values.
472 |
473 |
474 | \subsubsection{GC Finalization}
475 |
476 | The only thing known by the Lua VM about userdata values is that they are
477 | a region of memory. That means that Lua will only free the memory region when
478 | the userdata is picked by the \gls{GC}. If the userdata contains resources
479 | other than raw memory (a file handle, for example), it must be ensured that
480 | those are released appropriately. In order to coordinate the GC with the
481 | management of resources unknown to the VM, Lua supports defining
482 | \emph{finalizer} functions.
483 |
484 | Lua requires the programmer to explicitly mark userdata to be finalized. This
485 | is done by setting its metatable: if the metatable contains a function (the
486 | finalizer) associated to the \Mlua|__gc| key, it is called after the userdata
487 | is marked by the GC as garbage, with the userdata itself being passed as the
488 | only argument. Once the userdata is finalized, the memory used by it will be
489 | normally released by Lua. The finalizer can be a C function, allowing native
490 | code to release any resources used by userdata.
491 |
492 | Using finalizers is needed when dealing with opaque types which are handled
493 | via pointers. The standard C library includes such a type: open files in C are
494 | handled using pointers to \Mc|FILE| values (for an example, see
495 | \autoref{lst:c-fileptr}), which are created when opening a file with
496 | \Mc|fopen()|. Files cannot be deallocated directly, and instead the
497 | \Mc|fclose()| function must be used. \autoref{lst:lua-gc-example-module}
498 | contains a complete example of a Lua module implemented in C which uses a
499 | finalizer to ensure that opened files are properly closed by calling
500 | \Mc|fclose()| from the finalizer.
501 |
502 | \begin{listing}[tH]
503 | \begin{ccode}
504 | int main (int argc, char **argv) {
505 | FILE *fd = fopen ("hello.txt", "w");
506 | fprintf (fd, "Hello, C file!\n");
507 | fclose (fd);
508 | return 0;
509 | }
510 | \end{ccode}
511 | \caption{Using an opaque \Mc|FILE*| in C.}
512 | \label{lst:c-fileptr}
513 | \end{listing}
514 |
515 | Finalizers are used extensively in the implementation of the \Eol* Lua module.
516 | The module's own userdata types need finalization (details on those are
517 | provided in \autoref{sec:eol-mod-typemeta}), and the module also allows attaching
518 | arbitrary Lua functions to values created by native code wrapped in an
519 | \Mc|EolVariable| userdata.
520 |
521 |
522 | \begin{listing}[tH]
523 | \small
524 | \begin{center}
525 | \emph{C code, to be built with \texttt{cc -shared -o openlog.so
526 | openlog.c}}
527 | \end{center}
528 | \begin{ccode}
529 | struct logger {
530 | FILE *output;
531 | int verbosity;
532 | };
533 | static int logger_call (lua_State *L) { /* ... */ }
534 |
535 | static int logger_gc (lua_State *L) {
536 | struct logger *l = luaL_checkudata (L, 1, "LOGGER");
537 | if (l->output) fclose (l->output); /* Close the file */
538 | return 0;
539 | }
540 |
541 | static int logger_new (lua_State *L) {
542 | const char *path = luaL_checkstring (L, 1);
543 | lua_Integer verbosity = luaL_checkinteger (L, 2);
544 | struct logger *l = lua_newuserdata (L, sizeof (struct logger));
545 | if (!(l->output = fopen (path, "a")))
546 | return luaL_error (L, "cannot open (\%s)", strerror (errno));
547 | l->verbosity = (int) verbosity;
548 | luaL_setmetatable (L, "LOGGER");
549 | return 1;
550 | }
551 |
552 | int luaopen_openlog (lua_State *L) {
553 | static const luaL_Reg metamethods[] = {
554 | { "__call", logger_call },
555 | { "__gc", logger_gc },
556 | { NULL, NULL },
557 | };
558 | luaL_newmetatable (L, "LOGGER");
559 | luaL_setfuncs (L, metamethods, 0);
560 | lua_pushcfunction (L, logger_new);
561 | return 1;
562 | }
563 | \end{ccode}
564 |
565 | \begin{center}
566 | \emph{Using the module from Lua}
567 | \end{center}
568 |
569 | \begin{luacode}
570 | local openlog = require("openlog")
571 | local log = openlog("/var/log/example.log", 1)
572 | log("Log line")
573 | \end{luacode}
574 |
575 | \caption{Small C module which demonstrates using a \texttt{\_\_gc} metamethod}
576 | \label{lst:lua-gc-example-module}
577 | \end{listing}
578 |
579 |
580 | \section{\Eol* Lua module}
581 |
582 | The top-level API exposed to Lua is the \verb|eol| module, which has to be
583 | returned by the C function that the Lua VM calls after loading an extension
584 | module. This function is always called \verb|luaopen_<name>|, where
585 | \verb|<name>| is the name of the module being loaded:
586 |
587 | \begin{ccode}
588 | LUAMOD_API int
589 | luaopen_eol (lua_State *L)
590 | {
591 | eol_trace_setup ();
592 |
593 | (void) elf_version (EV_NONE);
594 | if (elf_version (EV_CURRENT) == EV_NONE)
595 | return luaL_error (L, "outdated libelf version");
596 |
597 | luaL_newlib (L, eollib);
598 | create_meta (L);
599 | return 1;
600 | }
601 | \end{ccode}
602 |
603 | Notice how the function uses \verb|luaL_newlib()| instead of manually creating
604 | a table in the Lua stack, and setting a field for each one of the \verb|eol.*|
605 | module level functions. The \verb|eollib| variable is defined as follows,
606 | using the supplied \verb|luaL_Reg| type:
607 |
608 | \begin{ccode}
609 | static const luaL_Reg eollib[] = {
610 | { "load", eol_load },
611 | { "type", eol_type },
612 | { "sizeof", eol_sizeof },
613 | { "typeof", eol_typeof },
614 | { "offsetof", eol_offsetof },
615 | { "cast", eol_cast },
616 | { NULL, NULL },
617 | };
618 | \end{ccode}
619 |
620 | As a last step, \verb|create_meta()| is called to register the metatables for
621 | the C types which \Eol* exposes to the Lua VM (\vref{sec:eol-mod-typemeta}).
622 |
623 |
624 | \subsection{Userdata Metatables}
625 | \label{sec:eol-mod-typemeta}
626 |
627 | Metatables for the userdata types (\textsf{Library}, \textsf{Function},
628 | \textsf{CType}, and \textsf{Variable}) are created when the \Eol*
629 | module is loaded by the Lua VM, in the \verb|create_meta()| C function.
630 | Exactly one metatable for each type is created, and all the userdata values of the same
631 | type share the same metatable. All metatables have some common entries:
632 |
633 | \begin{itemize}
634 |
635 | \item A \verb|__gc| metamethod, responsible for decreasing the reference
636 | count of the associated \textsf{Library}.
637 |
638 | \item A \verb|__tostring| metamethod, in order to provide a string
639 | representation of the values of the userdata. This is used by the Lua
640 | \Mlua|tostring()| function.
641 |
642 | \item A \verb|__index| metamethod, which provides support for indexing the
643 | userdata. The concrete behavior depends on the type of the userdata.
644 |
645 | \end{itemize}
646 |
647 | For each metatable, an array of \Mc|struct luaL_Reg| values is created. It
648 | contains a list of metamethods and pointers to the C functions which implement
649 | them. The example below is for \textsf{Library}; the rest share a strong
650 | similarity:
651 |
652 | \begin{ccode}
653 | /* Methods for Library userdata. */
654 | static const luaL_Reg library_methods[] = {
655 | { "__gc", library_gc },
656 | { "__tostring", library_tostring },
657 | { "__index", library_index },
658 | { "__eq", library_eq },
659 | { NULL, NULL }
660 | };
661 | \end{ccode}
662 |
663 | Then, in the \verb|create_meta()| function, the metatable is created with the
664 | aid of the \verb|luaL_newmetatable()| utility function, which arranges for the
665 | metatable to be available for checking the types of a userdata value later on
666 | with the companion functions \verb|luaL_checkudata()| and
667 | \verb|luaL_testudata()|:
668 |
669 | \begin{ccode}
670 | static void
671 | create_meta (lua_State *L) {
672 | /* EolLibrary */
673 | luaL_newmetatable (L, EOL_LIBRARY);
674 | luaL_setfuncs (L, library_methods, 0);
675 | lua_pop (L, 1);
676 |
677 | /* ... */
678 | }
679 | \end{ccode}
680 |
681 | \todo[inline]{Got time? Describe how lookups are done, it is relatively interesting}
682 |
683 | \subsection{Library loading}
684 |
685 | The implementation of \verb|eol.load()| tries to avoid loading the same
686 | library more than once. If a library is already loaded, its reference count is
687 | incremented, and the returned \textsf{Library} userdata contains a reference
688 | to the previously loaded library. To achieve this behavior while keeping
689 | reference bookkeeping simple, a linked list of loaded libraries is maintained.
690 | Each \verb|EolLibrary| \Mc|struct| contains a pointer to the \verb|next|
691 | library (which can be \Mc|NULL|):
692 |
693 | \begin{ccode}
694 | struct EolLibrary {
695 | /* Members used for bookkeeping */
696 | unsigned int ref_counter;
697 | const char *path;
698 | struct EolLibrary *next;
699 |
700 | /* Other members */
701 | /* ... */
702 | };
703 | \end{ccode}
704 |
705 | The \verb|path| of libraries is used to determine if two libraries are the
706 | same. For this to work, the \verb|path| of a library, and the paths of
707 | libraries which are candidates to be loaded, must be in canonical form
708 | as returned by the \verb|realpath()|\footnote{\texttt{realpath()}
709 | canonicalizes a file path; it is part of the POSIX standard and included in
710 | most Unix-like systems.} function before comparing them.
711 |
712 | It would have been possible to use a hash table to map library paths
713 | (canonicalized) to their corresponding \verb|EolLibrary| \Mc|struct|, to
714 | determine in constant time whether a library is loaded, instead of the linear
715 | time required to check a linked list. However, most programs use, on average,
716 | a number of libraries on the order of tens, and reading the DWARF sections
717 | which contain the index of available types and symbols is much more costly for
718 | any non-trivial library. Therefore, it was determined that the linked list
719 | solution would suffice, while avoiding the additional complexity of the hash
720 | table.
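
The bookkeeping described above can be sketched in a few lines of C. This is
a minimal illustration under stated assumptions, not the actual \Eol* code:
the \verb|Library| struct, the \verb|loaded_libraries| list head, and the
\verb|library_ref()| and \verb|library_unref()| functions are hypothetical
names, and the step which would open the library and read its DWARF data is
left out.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for realpath() */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the EolLibrary struct. */
typedef struct Library {
    unsigned int    ref_counter;
    char           *path;      /* canonical path, owned by the node */
    struct Library *next;
} Library;

static Library *loaded_libraries = NULL;   /* head of the linked list */

/* Returns the node for "path", creating it only if it is not yet listed. */
Library *library_ref (const char *path)
{
    /* Passing NULL makes realpath() allocate the result with malloc(). */
    char *canonical = realpath (path, NULL);
    if (!canonical)
        return NULL;

    /* Linear scan: compare canonical paths of the loaded libraries. */
    for (Library *lib = loaded_libraries; lib; lib = lib->next) {
        if (strcmp (lib->path, canonical) == 0) {
            free (canonical);
            lib->ref_counter++;
            return lib;
        }
    }

    Library *lib = calloc (1, sizeof (Library));
    if (!lib) {
        free (canonical);
        return NULL;
    }
    lib->ref_counter = 1;
    lib->path = canonical;
    lib->next = loaded_libraries;
    loaded_libraries = lib;
    return lib;
}

/* Drops one reference; unlinks and frees the node when none remain. */
void library_unref (Library *lib)
{
    if (--lib->ref_counter > 0)
        return;
    for (Library **p = &loaded_libraries; *p; p = &(*p)->next) {
        if (*p == lib) {
            *p = lib->next;
            break;
        }
    }
    free (lib->path);
    free (lib);
}
```

In this sketch two different spellings of the same path resolve to the same
node, because both are canonicalized before the comparison.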
721 |
722 |
723 | \subsection{Type Information Lookup}
724 |
725 | Looking up type information is one of the most common operations performed by
726 | \Eol*. As per the design (\autoref{sec:design-overview}), type information is
727 | to be stored in the \textsf{Type Cache} (for details of its implementation,
728 | see \autoref{sec:impl-type-cache}). It seemed convenient to provide a single
729 | entry point for all the type information lookups which always checks the
730 | \textsf{Type Cache}. This function is \verb|library_lookup_type()|:
731 |
732 | \begin{ccode}
733 | static const EolTypeInfo*
734 | library_lookup_type (EolLibrary *library,
735 | Dwarf_Off d_offset,
736 | Dwarf_Error *d_error) {
737 | const EolTypeInfo *typeinfo =
738 | eol_type_cache_lookup (&library->type_cache, d_offset);
739 | if (!typeinfo) {
740 | typeinfo = library_build_typeinfo (library, d_offset, d_error);
741 | eol_type_cache_add (&library->type_cache, d_offset, typeinfo);
742 | }
743 | return typeinfo;
744 | }
745 | \end{ccode}
746 |
747 | It is important to note that the cache is always checked first: on a cache
748 | hit, the cached value is returned immediately, while on a cache miss
749 | \verb|library_build_typeinfo()| is called to create a new
750 | \verb|EolTypeInfo| from the DWARF debugging information, which is always added
751 | to the cache \emph{right away}. This is especially important because
752 | \verb|EolTypeInfo| values can contain references to others, which in turn
753 | cause additional type information lookups. Adding the intermediate values
754 | to the cache as they are built greatly increases the ratio of cache hits. To
755 | better understand this, consider the following example lookups for functions
756 | of the C standard library, starting from an empty cache:
757 |
758 | \begin{figure}[thH]
759 | \centering
760 | \begin{tikzpicture}
761 | \begin{axis}[
762 | width=0.85\textwidth,
763 | height=0.5\textwidth,
764 | style={/pgf/number format/assume math mode=true},
765 | xlabel={\emph{time}},
766 | % ylabel={\emph{hits / misses}},
767 | enlarge x limits=0.05,
768 | axis on top,
769 | tick align=inside,
770 | axis x line*=bottom,
771 | axis y line*=left,
772 | legend style={legend pos=north west},
773 | x tick label style={opacity=0},
774 | ]
775 | \addplot plot coordinates {
776 | (1,0) (2,7) (3,8) (4,9) (5,9) (6,11) (7,13) (8,13) (9,154) (10,155)
777 | (11,158) (12,165) (13,168) (14,170) (15,171)
778 | };
779 | \addplot plot coordinates {
780 | (1,0) (2,7) (3,14) (4,14) (5,18) (6,18) (7,18) (8,19) (9,107)
781 | (10,107) (11,107) (12,107) (13,107) (14,107) (15,107)
782 | };
783 | \legend{hits \\ misses \\}
784 | \end{axis}
785 | \end{tikzpicture}
786 | \caption{Typical progression of type cache hits/misses over time}
787 | \label{fig:plot-type-cache-hitmiss}
788 | \end{figure}
789 |
790 | \begin{enumerate}
791 |
792 | \item Look up type information for the \Mc|int atoi(const char*)| function.
793 | This generates one lookup for the \Mc|int| return type, plus another lookup
794 | for the \Mc|const char*| parameter. The latter itself causes more lookups:
795 | one for the \Mc|char*| type, which in turn causes yet another lookup for the
796 | \Mc|char| type. The type information for \Mc|char| gets added to the
797 | cache at this point, then the information for \Mc|char*|, and finally for
798 | \Mc|const char*|.
799 |
800 | \item Look up type information for the \Mc|char* strchr(const char*, int)|
801 | function. The type information for the return type, \Mc|char*|, is already
802 | in the cache, because it was added as an intermediate result for the
803 | \Mc|atoi()| function. The parameter types, \Mc|const char*| and
804 | \Mc|int|, were looked up for \Mc|atoi()| as well, so they are also
805 | served from the cache, and no new type information has to be built for
806 | them.
807 |
808 | \end{enumerate}
809 |
810 | The implemented policy for type information lookup fills up the cache
811 | quickly while the program starts, up to the point where most of the
812 | types in use are present in the cache
813 | (\autoref{fig:plot-type-cache-hitmiss}), and from that moment onwards the
814 | number of cache misses is very small.
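
The effect of caching intermediate results can be reproduced with a small,
self-contained model. The sketch below is an assumption-laden illustration
and does not use the real \verb|EolTypeInfo| or DWARF data: the
\verb|type_base| table, the \verb|TypeInfo| struct, and the
\verb|lookup_type()| function are hypothetical, and types refer to each other
in the same order as in the walk-through above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type table standing in for the DWARF data: each type may
 * refer to another one (its "base"), or to -1 when it refers to none. */
enum { T_CHAR, T_CHAR_PTR, T_CONST_CHAR_PTR, T_INT, N_TYPES };

static const int type_base[N_TYPES] = {
    [T_CHAR]           = -1,
    [T_CHAR_PTR]       = T_CHAR,
    [T_CONST_CHAR_PTR] = T_CHAR_PTR,
    [T_INT]            = -1,
};

typedef struct TypeInfo {
    int                    id;
    const struct TypeInfo *base;
} TypeInfo;

static TypeInfo storage[N_TYPES];   /* backing store for built values */
static bool     cached[N_TYPES];    /* which entries are in the cache */
static unsigned hits, misses;       /* counters, for illustration only */

/* Single entry point: always checks the cache before building. */
const TypeInfo *lookup_type (int id)
{
    if (cached[id]) {
        hits++;
        return &storage[id];
    }
    misses++;

    /* Building a type looks up the types it refers to through this same
     * function, so intermediate results end up in the cache as well. */
    storage[id].id = id;
    storage[id].base = (type_base[id] >= 0)
                     ? lookup_type (type_base[id])
                     : NULL;
    cached[id] = true;
    return &storage[id];
}
```

Looking up \Mc|int| and \Mc|const char*| on an empty cache produces four
misses in this model; the three lookups needed afterwards for
\Mc|strchr()| are all hits.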
815 |
816 |
817 | \subsection{Querying Types}
818 |
819 | The \verb|eol.typeof()| function accepts arguments of different types.
820 | The \verb|luaL_check*()| family of functions from the Lua C API raises an error
821 | if the argument is not of the expected type, and therefore some additional
822 | work is needed to ensure that \verb|eol.typeof()| works as specified:
823 |
824 | \begin{ccode}
825 | static int
826 | eol_typeof (lua_State *L) {
827 | if (luaL_testudata (L, 1, EOL_TYPEINFO)) {
828 | lua_settop (L, 1);
829 | } else {
830 | EolVariable *ev = luaL_testudata (L, 1, EOL_VARIABLE);
831 | if (ev) {
832 | typeinfo_push_userdata (L, ev->typeinfo);
833 | } else {
834 | luaL_checktype (L, 1, LUA_TSTRING);
835 | const char *name = lua_tostring (L, 1);
836 | /* Omitted: Lookup type by name in all loaded libraries */
837 | typeinfo_push_userdata (L, typeinfo);
838 | }
839 | }
840 | return 1;
841 | }
842 | \end{ccode}
843 |
844 | Functions \verb|eol.sizeof()|, \verb|eol.alignof()|, and \verb|eol.offsetof()|
845 | are implemented similarly, with the exception that they omit the code to
846 | look up the type information when passing a string argument. Once the
847 | corresponding \verb|EolTypeInfo| is found, it can be queried for the requested
848 | information: \verb|eol_typeinfo_sizeof()| to obtain the size,
849 | \verb|eol_typeinfo_alignment()| for the alignment, and, for the
850 | offset of a \Mc|struct| member, the information is available in the
851 | \verb|EolTypeInfoMember| value returned by
852 | \verb|eol_typeinfo_compound_named_member()|.
853 |
854 | \subsection{Casting}
855 |
856 | Using the \verb|eol.cast()| function on a \textsf{Variable} userdata changes
857 | the associated type for it, effectively treating the same data as if it were
858 | of another type. In order to achieve this, we just return to Lua a new
859 | \textsf{Variable} userdata with the given type information which points to
860 | the same memory area:
861 |
862 | \begin{ccode}
863 | static int
864 | eol_cast (lua_State *L) {
865 | const EolTypeInfo *typeinfo = to_eol_typeinfo (L, 1);
866 | EolVariable *ev = to_eol_variable (L, 2);
867 |
868 | /* Use typeinfo from 1st argument, same data address. */
869 | variable_push_userdata (L, ev->library, typeinfo,
870 | ev->address, ev->name, VARIABLE_NOCOPY);
871 | return 1;
872 | }
873 | \end{ccode}
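
The idea of a cast as ``new type information, same address'' can be shown
without the Lua C API. The following sketch is only an illustration: the
\verb|TypeInfo| and \verb|Variable| structs, and the \verb|variable_cast()|
and \verb|variable_read()| functions, are simplified hypothetical stand-ins
for the corresponding \Eol* types and helpers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced type descriptor: only a name and a size. */
typedef struct {
    const char *name;
    size_t      size;
} TypeInfo;

/* Hypothetical variable: type information plus the address of the data. */
typedef struct {
    const TypeInfo *typeinfo;
    void           *address;
} Variable;

static const TypeInfo type_u8  = { "uint8_t",  sizeof (uint8_t)  };
static const TypeInfo type_u32 = { "uint32_t", sizeof (uint32_t) };

/* A cast copies no data: the result shares the address of the original. */
Variable variable_cast (const TypeInfo *typeinfo, Variable v)
{
    Variable view = { typeinfo, v.address };
    return view;
}

/* Reads the value as described by the attached type information. */
uint64_t variable_read (Variable v)
{
    switch (v.typeinfo->size) {
        case sizeof (uint8_t): {
            uint8_t value;
            memcpy (&value, v.address, sizeof value);
            return value;
        }
        case sizeof (uint32_t): {
            uint32_t value;
            memcpy (&value, v.address, sizeof value);
            return value;
        }
        default:
            return 0;   /* sizes not covered by this sketch */
    }
}
```

Reading through the original and through the cast variable touches the same
memory, but interprets a different number of bytes from it.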
874 |
875 |
876 | \subsection{Preprocessor “Generator Macros”}
877 | \label{sec:cpp-abuse-genmacros}
878 |
879 | This is a programming pattern used throughout the code of \Eol*: the
880 | C preprocessor is used in a convoluted way as a rudimentary code generator
881 | using lists of related elements. First, a macro listing related elements is defined
882 | (\emph{enumerator macro}, from now on), and it must accept the identifier for
883 | another macro (the \emph{generator macro}) as a parameter. Each element in the
884 | enumerator macro is an expansion of the generator, passing the parameters
885 | needed by the generator.
886 |
887 | In order to better understand how generator macros work, let us walk through
888 | a complete example adapted from the \Eol* source code. The following macro
889 | expands into a function which checks the type of an \Mc|EolTypeInfo| — it is
890 | the \emph{generator}:
891 |
892 | \begin{ccode*}{samepage=true}
893 | #define MAKE_TYPEINFO_IS_TYPE(suffix, name, ctype) \
894 | bool eol_typeinfo_is_ ## name (const EolTypeInfo *info) \
895 | { return info->type == EOL_TYPE_ ## suffix; }
896 | \end{ccode*}
897 |
898 | \noindent In generator macros like this, the concatenation operator
899 | (\verb|##|) of the preprocessor is used extensively to build pieces of valid
900 | C code. The example shows how the \verb|name| parameter is concatenated to
901 | create the name of the generated function, and the \verb|suffix| parameter is
902 | concatenated to create a valid \verb|EolType| (\autoref{lst:EolType})
903 | value. A valid expansion of the above macro is:
904 |
905 | \begin{ccode*}{samepage=true}
906 | MAKE_TYPEINFO_IS_TYPE (S32, s32, int32_t)
907 | \end{ccode*}
908 |
909 | \noindent
910 | which generates the following valid C function:
911 |
912 | \begin{ccode*}{samepage=true}
913 | bool eol_typeinfo_is_s32 (const EolTypeInfo *info)
914 | { return info->type == EOL_TYPE_S32; }
915 | \end{ccode*}
916 |
917 | \noindent The \emph{enumerator macro} is made by grouping a set of macro
918 | expansions like the one above. The key is using a generic name for the
919 | generator macro, which will be passed as a parameter. The next listing defines
920 | an enumerator which expands a generator \verb|F| for each signed integer type:
921 |
922 | \begin{ccode*}{samepage=true}
923 | #define INTEGER_S_TYPES(F) \
924 | F (S8, s8, int8_t ) \
925 | F (S16, s16, int16_t ) \
926 | F (S32, s32, int32_t ) \
927 | F (S64, s64, int64_t )
928 | \end{ccode*}
929 |
930 | \noindent Using the above definition, an expansion of the \emph{enumerator
931 | macro} causes multiple expansions at once of the \emph{generator macro} passed
932 | as \verb|F|, which in turn creates as many functions as elements in the
933 | enumerator macro. In this example, using generator macros reduces the amount
934 | of code that the programmer must write manually to roughly one fourth of the
935 | original.
936 |
937 | Another use case for generator macros is creating the code for cases in
938 | a \Mc|switch| statement. Instead of constructing the code for an entire
939 | function at a time, only a single \Mc|case| label and its associated
940 | statements are generated. This is done in the following example:
941 |
942 | \begin{ccode*}{samepage=true}
943 | #define MAKE_SIGNED_TYPE_CASE(suffix, name, ctype) \
944 | case EOL_TYPE_ ## suffix: return true;
945 |
946 | bool eol_type_is_signed (EolType type) {
947 | switch (type) {
948 | INTEGER_S_TYPES (MAKE_SIGNED_TYPE_CASE)
949 | default: return false;
950 | }
951 | }
952 | \end{ccode*}
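
Putting the pieces together, the following self-contained sketch expands one
enumerator macro with three different generators. It is a reduced
illustration, not the code from the \Eol* sources: the \verb|EolType| and
\verb|EolTypeInfo| definitions are simplified stand-ins, and
\verb|MAKE_ENUM_MEMBER|, \verb|MAKE_SIZE_CASE|, and \verb|eol_type_size()|
are hypothetical names introduced for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Enumerator macro: one F (...) expansion per signed integer type. */
#define INTEGER_S_TYPES(F) \
    F (S8,  s8,  int8_t ) \
    F (S16, s16, int16_t) \
    F (S32, s32, int32_t) \
    F (S64, s64, int64_t)

/* Generator 1: one enum member per element (simplified EolType). */
#define MAKE_ENUM_MEMBER(suffix, name, ctype) EOL_TYPE_ ## suffix,
typedef enum {
    INTEGER_S_TYPES (MAKE_ENUM_MEMBER)
    EOL_TYPE_COUNT
} EolType;

/* Simplified stand-in for the real EolTypeInfo. */
typedef struct { EolType type; } EolTypeInfo;

/* Generator 2: one type-checking function per element. */
#define MAKE_TYPEINFO_IS_TYPE(suffix, name, ctype) \
    bool eol_typeinfo_is_ ## name (const EolTypeInfo *info) \
    { return info->type == EOL_TYPE_ ## suffix; }
INTEGER_S_TYPES (MAKE_TYPEINFO_IS_TYPE)

/* Generator 3: one case label per element, using the ctype parameter. */
#define MAKE_SIZE_CASE(suffix, name, ctype) \
    case EOL_TYPE_ ## suffix: return sizeof (ctype);
size_t eol_type_size (EolType type)
{
    switch (type) {
        INTEGER_S_TYPES (MAKE_SIZE_CASE)
        default: return 0;
    }
}
```

Adding a new element to \verb|INTEGER_S_TYPES| in this sketch would extend
the enumeration, the set of generated functions, and the \Mc|switch|
statement at once, which is the main appeal of the pattern.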
953 |
954 |
955 | \section{Test Harness}
956 |
957 | The implementation of the test harness is not particularly complex, and its
958 | main piece of code is the test runner, contained in a single Lua script
959 | (\texttt{tools/harness.lua} in the source tree) which is mostly
960 | self-explanatory. In broad terms, it works in the following way:
961 |
962 | \begin{enumerate}
963 |
964 | \item The directory containing the unit tests, which are Lua scripts, is
965 | scanned, and the file names are used to populate the list of unit tests to run.
966 |
967 | \item If the names of any tests have been given on the command line, the
968 | list of tests to run is replaced with the names given as command line
969 | arguments.
970 |
971 | \item A \textsf{Test} object is created for each unit test in the list of
972 | tests to run.
973 |
974 | \item An \textsf{Output} is chosen, depending on the execution environment
975 | and command line arguments. The \Mlua|Output:setup()| method is invoked.
976 |
977 | \item For each \textsf{Test}, \Mlua|Output:start()| is called, the
978 | \textsf{Test} executed, and \Mlua|Output:finish()| is called to report its
979 | execution status.
980 |
981 | \item After running the unit tests, before exiting, the \textsf{Output} is
982 | given the chance to generate a summary of the test results (or any other
983 | content it may consider necessary) by calling \Mlua|Output:report()|.
984 |
985 | \end{enumerate}
986 |
987 | Running each test case in a new process is achieved using the
988 | \Mlua|io.popen()| function from the Lua standard library, which allows
989 | capturing the output of the process as well. A normal Lua interpreter is used
990 | to run each unit test.
991 |
992 | \begin{figure}[h]
993 | \begin{luacode}
994 | local eol = require("eol")
995 | assert.Field(eol, "alignof")
996 | assert.Callable(eol.alignof)
997 |
998 | local libtest = eol.load("libtest")
999 | local u8type = eol.type(libtest, "uint8_t")
1000 |
1001 | assert.Equal(1, eol.alignof(u8type))
1002 | assert.Equal(1, eol.alignof(libtest.var_u8))
1003 |
1004 | -- Values other than variables or typeinfos raise an error
1005 | for _, value in ipairs { 1, 3.14, false, true, "str", { a=1 } } do
1006 | assert.Error(function ()
1007 | local _ = eol.alignof(value)
1008 | end)
1009 | end
1010 | \end{luacode}
1011 | \caption{Unit test for the \texttt{eol.alignof()} function}
1012 | \label{lst:unittest-example}
1013 | \end{figure}
1014 |
1015 | \subsection{\texttt{tools/run-tests}}
1016 |
1017 | The \verb|run-tests| helper script simplifies running the test harness.
1018 | In order to preload the implementation of the assertions
1019 | (\texttt{tools/harness-assertions.lua} in the source tree, specification in
1020 | \autoref{tab:design-test-asserts}) the script defines the \verb|LUA_INIT|
1021 | environment variable. It also makes sure that the correct \verb|lua| executable is used.
1022 |
1023 |
1024 | \subsection{Helper C Module}
1025 |
1026 | The test runner needs access to a series of functions from the C standard
1027 | library which are not available in Lua. Access to these functions is
1028 | implemented as a loadable C module named \texttt{testutil} which contains
1029 | trivial code for the following functions:
1030 |
1031 | \begin{itemize}
1032 |
1033 | \item \verb|testutil.isatty()| is used to determine whether a console is
1034 | connected to the standard output of the test harness process. In that case,
1035 | the \textsf{Console Output} is used for output formatting; otherwise the
1036 | output is redirected to a pipe or a file, and the \textsf{TAP Output} is
1037 | selected instead.
1038 |
1039 | \item \verb|testutil.listdir()| reads the directory at a given path, and
1040 | returns an array with the names of the files contained in it.
1041 |
1042 | \item \verb|testutil.isfile()| and \verb|testutil.isdir()| check whether the
1043 | path passed to them is a regular file or a directory, respectively.
1044 | Internally they use the \verb|stat()| system call to obtain information
1045 | about the path.
1046 |
1047 | \item \verb|testutil.realpath()| allows calling the POSIX \verb|realpath()|
1048 | function to canonicalize file system paths.
1049 |
1050 | \item \verb|testutil.getcwd()| allows calling the \verb|getcwd()| function
1051 | from the C library to obtain the working directory of the test runner.
1052 |
1053 | \end{itemize}
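
The checks behind some of these helpers map directly onto POSIX calls. The
sketch below shows plain C functions only, without the Lua binding code; the
names \verb|path_is_file()|, \verb|path_is_dir()|, and
\verb|choose_output()| are hypothetical and do not come from the
\texttt{testutil} sources, and the returned strings are placeholders for the
two output implementations.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* True when "path" exists and refers to a regular file. */
bool path_is_file (const char *path)
{
    struct stat sb;
    return stat (path, &sb) == 0 && S_ISREG (sb.st_mode);
}

/* True when "path" exists and refers to a directory. */
bool path_is_dir (const char *path)
{
    struct stat sb;
    return stat (path, &sb) == 0 && S_ISDIR (sb.st_mode);
}

/* Picks an output style for the descriptor "fd": console formatting
 * when it is attached to a terminal, TAP otherwise. */
const char *choose_output (int fd)
{
    return isatty (fd) ? "console" : "tap";
}
```

A Lua binding for each of these would only need to convert the arguments and
push the result, which is why the module code is described as trivial.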
1054 |
1055 |
1056 | \section{Complete Test Programs}
1057 |
1058 | In order to stress the implementation of the \Eol* module, a few programs
1059 | which feature code similar to that found in real world systems have been
1060 | developed. The following libraries were chosen because they are small
1061 | and easy to build with full DWARF debugging information, yet contain
1062 | complex functions and types which would pose a challenge to \Eol*:
1063 |
1064 | \begin{itemize}
1065 |
1066 | \item µPNG\footnote{\url{https://github.com/elanthis/upng}}: small library which
1067 | implements a \gls{PNG} image decoder. It is used in embedded devices with
1068 | constrained memory resources, like the Pebble Time smartwatch.
1069 |
1070 | \item NanoVG\footnote{\url{https://github.com/memononen/nanovg}}: embedded
1071 | implementation of a subset of the OpenVG graphics API, which uses OpenGL for
1072 | rendering.
1073 |
1074 | % \item cImgIU\footnote{\url{https://github.com/Extrawurst/cimgui/}}:
1075 | % graphical user interface library
1076 |
1077 | \item GLFW\footnote{\url{http://www.glfw.org}}: utility library for OpenGL
1078 | application development which simplifies the creation of windows with OpenGL
1079 | contexts, and handling input from the user.
1080 |
1081 | \end{itemize}
1082 |
1083 | The programs are included in the \texttt{examples/} directory of the \Eol*
1084 | source code:
1085 |
1086 | \begin{description}
1087 |
1088 | \item [type-pp.lua] \hfill\\
1089 | Pretty-prints information about C types present in an arbitrary library.
1090 | This was implemented to exercise the ability of the \Eol* module to
1091 | provide precise type information to Lua programs.
1092 |
1093 | \item [upng-info.lua] \hfill\\ Uses the µPNG library to show information
1094 | about \gls{PNG} images.
1095 |
1096 | \item [nanovg-demo.lua, nanovg-noise.lua] \hfill\\
1097 | These two programs use the GLFW library to create a window with an OpenGL
1098 | context, and the NanoVG graphics library for rendering.
1099 |
1100 | The first uses functions involving complex native
1101 | types being passed between Lua and C to paint a series of translucent
1102 | animated waves (\autoref{fig:nanovg-demo}). The second displays animated
1103 | random noise which is generated from Lua into a native memory buffer
1104 | that has been allocated by \Eol* and passed as a texture to the graphics
1105 | card using NanoVG.
1106 |
1107 | \end{description}
1108 |
1109 | \begin{figure}
1110 | \centering
1111 | \includegraphics[width=0.75\textwidth]{img/nanovg-demo.png}
1112 | % \includegraphics[width=0.45\textwidth]{img/nanovg-noise.png}
1113 | \caption{Demo implemented in Lua of the NanoVG graphics library}
1114 | \label{fig:nanovg-demo}
1115 | \end{figure}
1116 |
1117 |
1118 | The development of these pilot programs using \Eol* did not uncover issues or
1119 | bugs that had not been detected by the unit tests. Therefore, we can conclude
1120 | that the unit tests and harness were most effective for the development
1121 | process. Validating the system using realistic code examples has helped
1122 | increase the confidence in the usefulness, stability and quality of this
1123 | project.
1124 |
1125 | \beforeintro
1126 |
--------------------------------------------------------------------------------