├── img ├── forest.jpg ├── lua-logo.pdf ├── theend.jpg ├── theend2.jpg ├── theend3.jpg ├── nanovg-demo.png ├── nanovg-noise.png ├── trello-board.png ├── lua-logo-nolabel.pdf ├── lua-logo-nolabel.ps └── lua-logo.svg ├── fonts ├── Symbola.ttf ├── Andada-Bold.ttf ├── Andada-Italic.ttf ├── AndadaSC-Bold.ttf ├── Raleway-Black.ttf ├── Raleway-Bold.ttf ├── Raleway-Light.ttf ├── Raleway-Thin.ttf ├── iosevka-bold.ttf ├── Andada-Regular.ttf ├── AndadaSC-Italic.ttf ├── AndadaSC-Regular.ttf ├── Raleway-Medium.ttf ├── Raleway-Regular.ttf ├── Raleway-SemiBold.ttf ├── iosevka-italic.ttf ├── iosevka-regular.ttf ├── Andada-BoldItalic.ttf ├── Raleway-ExtraBold.ttf ├── Raleway-ExtraLight.ttf ├── iosevka-bolditalic.ttf ├── AndadaSC-BoldItalic.ttf ├── InputMonoNarrow-Light.ttf ├── Raleway-Black-Italic.ttf ├── Raleway-Bold-Italic.ttf ├── Raleway-Light-Italic.ttf ├── Raleway-Medium-Italic.ttf ├── Raleway-Thin-Italic.ttf ├── InputMonoNarrow-Italic.ttf ├── InputMonoNarrow-Regular.ttf ├── Raleway-Regular-Italic.ttf ├── Raleway-SemiBold-Italic.ttf ├── Raleway-ExtraBold-Italic.ttf ├── Raleway-ExtraLight-Italic.ttf └── InputMonoNarrow-LightItalic.ttf ├── prebuilt ├── eris-report.pdf ├── lua-eol-report.pdf ├── eris-report-diff-20150629.pdf ├── lua-eol-report-diff-20150817.pdf ├── lua-eol-report-diff-20150823.pdf └── lua-eol-report-diff-20150831.pdf ├── Makefile ├── hyphenation.tex ├── .gitignore ├── summary.tex ├── appendix-installation.tex ├── bibliography.tex ├── introduction.tex ├── glossary.tex ├── lua-eol-report.tex ├── conclusions.tex ├── slides.tex ├── design.tex └── implementation.tex /img/forest.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/forest.jpg -------------------------------------------------------------------------------- /img/lua-logo.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/lua-logo.pdf -------------------------------------------------------------------------------- /img/theend.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/theend.jpg -------------------------------------------------------------------------------- /img/theend2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/theend2.jpg -------------------------------------------------------------------------------- /img/theend3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/theend3.jpg -------------------------------------------------------------------------------- /fonts/Symbola.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Symbola.ttf -------------------------------------------------------------------------------- /fonts/Andada-Bold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Andada-Bold.ttf -------------------------------------------------------------------------------- /img/nanovg-demo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/nanovg-demo.png -------------------------------------------------------------------------------- /img/nanovg-noise.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/nanovg-noise.png 
-------------------------------------------------------------------------------- /img/trello-board.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/trello-board.png -------------------------------------------------------------------------------- /fonts/Andada-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Andada-Italic.ttf -------------------------------------------------------------------------------- /fonts/AndadaSC-Bold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/AndadaSC-Bold.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Black.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Black.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Bold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Bold.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Light.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Light.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Thin.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Thin.ttf 
-------------------------------------------------------------------------------- /fonts/iosevka-bold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/iosevka-bold.ttf -------------------------------------------------------------------------------- /fonts/Andada-Regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Andada-Regular.ttf -------------------------------------------------------------------------------- /fonts/AndadaSC-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/AndadaSC-Italic.ttf -------------------------------------------------------------------------------- /fonts/AndadaSC-Regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/AndadaSC-Regular.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Medium.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Medium.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Regular.ttf -------------------------------------------------------------------------------- /fonts/Raleway-SemiBold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-SemiBold.ttf 
-------------------------------------------------------------------------------- /fonts/iosevka-italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/iosevka-italic.ttf -------------------------------------------------------------------------------- /fonts/iosevka-regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/iosevka-regular.ttf -------------------------------------------------------------------------------- /img/lua-logo-nolabel.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/img/lua-logo-nolabel.pdf -------------------------------------------------------------------------------- /prebuilt/eris-report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/eris-report.pdf -------------------------------------------------------------------------------- /fonts/Andada-BoldItalic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Andada-BoldItalic.ttf -------------------------------------------------------------------------------- /fonts/Raleway-ExtraBold.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-ExtraBold.ttf -------------------------------------------------------------------------------- /fonts/Raleway-ExtraLight.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-ExtraLight.ttf 
-------------------------------------------------------------------------------- /fonts/iosevka-bolditalic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/iosevka-bolditalic.ttf -------------------------------------------------------------------------------- /prebuilt/lua-eol-report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/lua-eol-report.pdf -------------------------------------------------------------------------------- /fonts/AndadaSC-BoldItalic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/AndadaSC-BoldItalic.ttf -------------------------------------------------------------------------------- /fonts/InputMonoNarrow-Light.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/InputMonoNarrow-Light.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Black-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Black-Italic.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Bold-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Bold-Italic.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Light-Italic.ttf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Light-Italic.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Medium-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Medium-Italic.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Thin-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Thin-Italic.ttf -------------------------------------------------------------------------------- /fonts/InputMonoNarrow-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/InputMonoNarrow-Italic.ttf -------------------------------------------------------------------------------- /fonts/InputMonoNarrow-Regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/InputMonoNarrow-Regular.ttf -------------------------------------------------------------------------------- /fonts/Raleway-Regular-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-Regular-Italic.ttf -------------------------------------------------------------------------------- /fonts/Raleway-SemiBold-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-SemiBold-Italic.ttf -------------------------------------------------------------------------------- /fonts/Raleway-ExtraBold-Italic.ttf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-ExtraBold-Italic.ttf -------------------------------------------------------------------------------- /fonts/Raleway-ExtraLight-Italic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/Raleway-ExtraLight-Italic.ttf -------------------------------------------------------------------------------- /fonts/InputMonoNarrow-LightItalic.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/fonts/InputMonoNarrow-LightItalic.ttf -------------------------------------------------------------------------------- /prebuilt/eris-report-diff-20150629.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/eris-report-diff-20150629.pdf -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | # 2 | # Makefile 3 | # Adrian Perez, 2015-06-30 05:38 4 | # 5 | 6 | all: 7 | ninja 8 | 9 | 10 | # vim:ft=make 11 | # 12 | -------------------------------------------------------------------------------- /prebuilt/lua-eol-report-diff-20150817.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/lua-eol-report-diff-20150817.pdf -------------------------------------------------------------------------------- /prebuilt/lua-eol-report-diff-20150823.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/lua-eol-report-diff-20150823.pdf -------------------------------------------------------------------------------- /prebuilt/lua-eol-report-diff-20150831.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aperezdc/lua-eol-report/master/prebuilt/lua-eol-report-diff-20150831.pdf -------------------------------------------------------------------------------- /hyphenation.tex: -------------------------------------------------------------------------------- 1 | % vim:ft=tex: 2 | % 3 | \hyphenation{ 4 | com-pati-ble 5 | array 6 | re-fer-ence 7 | func-ti-ons 8 | DynASM 9 | } 10 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.lo[gft] 2 | *.mtc* 3 | *.aux 4 | *.toc 5 | *.out 6 | .*.sw[op] 7 | .ninja_log 8 | .ninja_deps 9 | slides.pdf 10 | lua-eol-report.pdf 11 | *.ac[rn] 12 | _minted-lua-eol-report/ 13 | _minted-slides/ 14 | *.nav 15 | *.snm 16 | *.vrb 17 | *.maf 18 | *.ist 19 | *.auxlock 20 | *.glsdefs 21 | *.alg 22 | *.md5 23 | *.xdy 24 | *.gl[gos] 25 | *.lol 26 | *.tdo 27 | -------------------------------------------------------------------------------- /summary.tex: -------------------------------------------------------------------------------- 1 | % vim: ft=tex ts=2 sw=2 spell spelllang=en 2 | \chapter*{Summary} 3 | 4 | The objective of this project is to implement an automated mechanism that, 5 | using the DWARF debugging information from ELF shared objects, allows the Lua 6 | virtual machine to call native functions from shared objects implemented in 7 | the C programming language. 
The process is automatic, in the sense that the 8 | user does not need to write code to convert values passed between Lua and the 9 | invoked C functions, and the C functions behave essentially like Lua functions from 10 | the user's point of view. The ultimate goal is to allow transparent usage of 11 | existing C libraries from Lua. 12 | 13 | Lua has been chosen because it provides a clean C interface to its \gls{VM}, 14 | which has been designed from the ground up to be embedded in larger projects. 15 | The implementation is also compact (under 16,000 lines of code), which makes 16 | it feasible to gain in-depth knowledge of its inner workings in a relatively 17 | short time. Lua has also grown in popularity in recent years as its adoption 18 | has skyrocketed in the game industry. 19 | 20 | The reason to focus on the combination of debugging information in DWARF 21 | format contained in ELF shared objects is that they are a widespread, standard 22 | configuration used by the majority of contemporary Unix-like operating 23 | systems. The target system during development has been a GNU/Linux system 24 | running on the Intel x86\_64 architecture, which also uses the aforementioned 25 | configuration, though provisions are to be included in the design to ease 26 | future porting efforts for other platforms. 27 | 28 | In order to validate the correctness of the implementation, an automated test 29 | suite was also developed. Unit tests were also used as regression tests, to 30 | ensure that modifications to the system did not introduce programming errors 31 | in the implementation. 
32 | -------------------------------------------------------------------------------- /img/lua-logo-nolabel.ps: -------------------------------------------------------------------------------- 1 | %!PS-Adobe-2.0 EPSF-2.0 2 | %%Title: Lua logo 3 | %%Creator: lua@tecgraf.puc-rio.br 4 | %%CreationDate: Wed Nov 29 19:02:41 EDT 2000 5 | %%BoundingBox: -45 0 1035 1080 6 | %%Pages: 1 7 | %%EndComments 8 | %%EndProlog 9 | 10 | %------------------------------------------------------------------------------ 11 | % 12 | % Graphic design by Alexandre Nakonechnyj. 13 | % PostScript programming by the Lua team. 14 | % This code is hereby placed in the public domain. 15 | % 16 | % Permission is hereby granted, without written agreement and without license 17 | % or royalty fees, to use, copy, and distribute this logo for any purpose, 18 | % including commercial applications, subject to the following conditions: 19 | % 20 | % * The origin of this logo must not be misrepresented; you must not 21 | % claim that you drew the original logo. We recommend that you give credit 22 | % to the graphics designer in all printed matter that includes the logo. 23 | % 24 | % * The only modification you can make is to adapt the orbiting text to 25 | % your product name. 26 | % 27 | % * The logo can be used in any scale as long as the relative proportions 28 | % of its elements are maintained. 
29 | % 30 | %------------------------------------------------------------------------------ 31 | 32 | /PLANETCOLOR {0 0 0.5 setrgbcolor} bind def 33 | /HOLECOLOR {1.0 setgray} bind def 34 | /ORBITCOLOR {0.5 setgray} bind def 35 | /LOGOFONT {/Helvetica 0.90} def 36 | /LABELFONT {/Helvetica 0.36} def 37 | 38 | %------------------------------------------------------------------------------ 39 | 40 | /MOONCOLOR {PLANETCOLOR} bind def 41 | /LOGOCOLOR {HOLECOLOR} bind def 42 | /LABELCOLOR {ORBITCOLOR} bind def 43 | 44 | /LABELANGLE 125 def 45 | /LOGO (Lua) def 46 | 47 | /DASHANGLE 10 def 48 | /HALFDASHANGLE DASHANGLE 2 div def 49 | 50 | % moon radius. planet radius is 1. 51 | /r 1 2 sqrt 2 div sub def 52 | 53 | /D {0 360 arc fill} bind def 54 | /F {exch findfont exch scalefont setfont} bind def 55 | 56 | % place it nicely on the paper 57 | /RESOLUTION 1024 def 58 | RESOLUTION 2 div dup translate 59 | RESOLUTION 2 div 2 sqrt div dup scale 60 | 61 | %-------------------------------------------------------------------- planet -- 62 | PLANETCOLOR 63 | 0 0 1 D 64 | 65 | %---------------------------------------------------------------------- hole -- 66 | HOLECOLOR 67 | 1 2 r mul sub dup r D 68 | 69 | %---------------------------------------------------------------------- moon -- 70 | MOONCOLOR 71 | 1 1 r D 72 | 73 | %---------------------------------------------------------------------- logo -- 74 | LOGOCOLOR 75 | LOGOFONT 76 | F 77 | LOGO stringwidth pop 2 div neg 78 | -0.5 moveto 79 | LOGO show 80 | 81 | %--------------------------------------------------------------------- orbit -- 82 | ORBITCOLOR 83 | 0.03 setlinewidth 84 | [1 r add 3.1415926535 180 div HALFDASHANGLE mul mul] 0 setdash 85 | newpath 86 | 0 0 87 | 1 r add 88 | 3 copy 89 | 27 57 90 | arcn 91 | stroke 92 | 93 | %------------------------------------------------------------------ copyright -- 94 | /COPYRIGHT 95 | (Graphic design by A. Nakonechnyj. Copyright (c) 1998, All rights reserved.) 
96 | def 97 | 98 | LABELCOLOR 99 | LOGOFONT 100 | 32 div 101 | F 102 | 2 sqrt 0.99 mul 103 | dup 104 | neg 105 | moveto 106 | COPYRIGHT 107 | 90 rotate 108 | %show 109 | 110 | %----------------------------------------------------------------------- done -- 111 | showpage 112 | 113 | %%Trailer 114 | %%EOF 115 | -------------------------------------------------------------------------------- /appendix-installation.tex: -------------------------------------------------------------------------------- 1 | % vim: set ft=tex spelllang=en ts=2 sw=2 et 2 | 3 | \chapter{Installation} 4 | 5 | \section{Prerequisites} 6 | \label{sec:eol-prereqs} 7 | 8 | Instead of providing its own implementation for certain functionality, \Eol* 9 | uses existing, proven software components. 10 | 11 | \begin{table}[h] 12 | \centering 13 | \begin{tabular}{lrccc} 14 | \toprule 15 | Component & Version & Required & Optional & Bundled \\ 16 | \midrule 17 | Lua & 5.3 & \Tick & & \Tick \\ 18 | LuaBitOp & 1.0.2 & & \Tick & \Tick \\ 19 | \verb|libdwarf| & 20150507 & \Tick & & \Tick \\ 20 | \verb|libelf| & 0.152 & \Tick & & \\ 21 | \verb|readline| & 5.0 & & \Tick & \\ 22 | \verb|libffi| & 3.1 & & \Tick & \\ 23 | \bottomrule 24 | \end{tabular} 25 | \caption{Dependencies} 26 | \label{tab:eol-dependencies} 27 | \end{table} 28 | 29 | \autoref{tab:eol-dependencies} shows the dependencies expected to be 30 | installed in the system. The items marked (\inlinesymbol\Tick) as 31 | \emph{bundled} are not included in the source repository, but the build system 32 | includes support for downloading tarballs with the source code and doing 33 | a local build. When enabled, bundled dependencies will be automatically 34 | downloaded, built, and used instead of the versions provided by the 35 | system. When the bundled \verb|libdwarf| is used, it will be statically 36 | linked. See~\autoref{sec:running-configure} for instructions to enable 37 | bundled libraries. 
This is particularly useful for systems which do not 38 | provide Lua 5.3 packages (as is the case, for example, with Debian and Ubuntu at the time 39 | of writing). 40 | 41 | \begin{table} 42 | \begin{tabular}{ccp{0.4\textwidth}} 43 | \toprule 44 | Distribution & Installation Command & Packages \\ 45 | \midrule 46 | Debian, Ubuntu & 47 | \verb|apt-get install| & 48 | \verb|libdwarf-dev| \verb|ninja-build| \\ 49 | Arch Linux & \verb|pacman -S| & \verb|libdwarf| \verb|ninja| \verb|lua| \\ 50 | \bottomrule 51 | \end{tabular} 52 | \caption{Dependency packages in popular GNU/Linux distributions.} 53 | \label{tab:distro-dependency-packages} 54 | \end{table} 55 | 56 | \autoref{tab:distro-dependency-packages} shows required packages as provided 57 | by popular GNU/Linux distributions. Some versions of Debian (and derivatives 58 | like Ubuntu) include only a static version of \verb|libdwarf| in the packages, 59 | most likely not built as \gls{PIC}, which is a requirement. 60 | 61 | 62 | \section{Building} 63 | 64 | The build process follows the convention pioneered by GNU 65 | Autotools~\cite{autotools-history}, in which an autoconfiguration script 66 | (\verb|configure|) is run first to inspect the system, determine which 67 | optional components are to be enabled at build time, and generate the needed 68 | build files. In short, building \Eol* is done by executing the following 69 | commands from the top-level source directory: 70 | 71 | \begin{minted}{sh} 72 | ./configure 73 | make 74 | \end{minted} 75 | 76 | or, using Ninja~\cite{ninja-manual}: 77 | 78 | \begin{minted}{sh} 79 | ./configure 80 | ninja 81 | \end{minted} 82 | 83 | 84 | \subsection{Autoconfiguration} 85 | \label{sec:running-configure} 86 | 87 | The \verb|configure| script accepts a number of command line parameters, which 88 | determine how the system is built. 
In most cases the script will figure out 89 | automatically whether the required prerequisites (\autoref{sec:eol-prereqs}) 90 | are available, and whether the bundled versions should be used. Passing 91 | parameters to the script is useful in case the detection fails, or to force 92 | certain build options. The following are the parameters most commonly used 93 | with the \verb|configure| script: 94 | 95 | \begin{description} 96 | 97 | \item [\texttt{--enable-bundled-libdwarf}] \hfill\\ 98 | Uses the bundled \verb|libdwarf| instead of trying to use the one 99 | provided by the system. 100 | 101 | \item [\texttt{--enable-bundled-lua}] \hfill\\ 102 | Uses the bundled Lua distribution instead of trying to use the one 103 | provided by the system. 104 | 105 | \item [\texttt{--enable-ffi}] \hfill\\ 106 | Always use \verb|libffi| to perform function calls, and do not build 107 | support for their JIT compilation. 108 | 109 | \item [\texttt{--jit-arch=ARCH}] \hfill\\ 110 | Skip detection of the operating system and processor, and use JIT 111 | compilation for the supplied architecture (\verb|ARCH|). It is 112 | possible to obtain a list of the supported architectures by running 113 | \texttt{./configure --jit-arch=help}. 114 | 115 | \end{description} 116 | 117 | In order to obtain a complete list of all the command line options that the 118 | script accepts, use: 119 | 120 | \begin{minted}{sh} 121 | ./configure --help 122 | \end{minted} 123 | 124 | 125 | \section{Testing the Build} 126 | 127 | Once \Eol* has been built, it is recommended to run the test suite to 128 | ensure that the binaries work as expected. The test suite is included in the 129 | source tree, and it does not require any additional dependencies. 
In order 130 | to run the test suite, use the \verb|tools/run-tests| script from the 131 | top-level directory of the source tree: 132 | 133 | \begin{minted}{sh} 134 | ./tools/run-tests 135 | \end{minted} 136 | 137 | \beforeintro 138 | -------------------------------------------------------------------------------- /bibliography.tex: -------------------------------------------------------------------------------- 1 | % vim: ft=tex ts=2 sw=2 foldlevel=2 2 | 3 | \begin{thebibliography}{99} 4 | 5 | \bibitem{elfspec-sysv} 6 | \emph{Chapter 4: Object Files}, in 7 | \emph{System V Application Binary Interface Edition 4.1} (pages 44-72) \\ 8 | The Santa Cruz Operation, AT\&T, The 88open Consortium \\ 9 | \url{http://www.sco.com/developers/devspecs/gabi41.pdf} \\ 10 | Accessed: May 3rd, 2015. 11 | 12 | % \bibitem{tis-elf} 13 | % \emph{Tool Interface Standard (TIS) Executable and Linking Format (ELF) 14 | % Specification, version 1.2} \\ 15 | % Tool Interface Standard Committee \\ 16 | % Edited May 1995. \\ 17 | % \url{http://refspecs.linuxbase.org/elf/elf.pdf} \\ 18 | % Accessed: May 4th, 2015. 19 | 20 | \bibitem{dwarfspecv4} 21 | \emph{DWARF Debugging Information Format Version 4} \\ 22 | DWARF Debugging Information Format Committee \\ 23 | \url{http://dwarfstd.org/doc/DWARF4.pdf} \\ 24 | Accessed: May 5th, 2015. 25 | 26 | \bibitem{debugdwarf} 27 | \emph{Introduction to the DWARF Debugging Format} \\ 28 | Michael J. Eager \\ 29 | \url{http://dwarfstd.org/doc/Debugging\%20using\%20DWARF-2012.pdf} \\ 30 | Accessed: May 5th, 2015. 31 | 32 | \bibitem{howdebugworks} 33 | \emph{Part 3 - Debugging Information}, in \emph{How Debuggers Work} \\ 34 | Eli Bendersky \\ 35 | \url{http://eli.thegreenplace.net/2011/02/07/how-debuggers-work-part-3-debugging-information/} \\ 36 | Accessed: May 5th, 2015. 
37 | 38 | \bibitem{tratt-dynamic-langs} 39 | \emph{Dynamically Typed Languages}, 40 | in \emph{Advances in Computers} (volume 77, pages 149-184) \\ 41 | Laurence Tratt \\ 42 | Edited by Marvin V. Zelkowitz (July 2009) \\ 43 | \url{http://tratt.net/laurie/research/pubs/html/tratt__dynamically_typed_languages/} \\ 44 | Accessed: August 22nd, 2015. 45 | 46 | \bibitem{lua-pil} 47 | \emph{Programming in Lua} \\ 48 | Roberto Ierusalimschy \\ 49 | Lua.org, Third Edition (January 3rd, 2013), ISBN 859037985X. 50 | 51 | \bibitem{lua-pil-online} 52 | \emph{Programming in Lua} \\ 53 | Roberto Ierusalimschy \\ 54 | Lua.org, First Edition (December 2003), ISBN 8590379817. \\ 55 | \url{http://www.lua.org/pil/contents.html} 56 | 57 | \bibitem{lua-about} 58 | \emph{About}, in \emph{Lua website} \\ 59 | \url{http://www.lua.org/about.html} \\ 60 | Checked April 20th, 2015. 61 | 62 | \bibitem{lua-manual} 63 | \emph{Lua 5.3 Reference Manual} \\ 64 | Roberto Ierusalimschy, Luiz Henrique de Figueiredo, 65 | Waldemar Celes \\ 66 | \url{http://www.lua.org/manual/5.3/manual.html} \\ 67 | Accessed: April 29th, 2015. 68 | 69 | \bibitem{lua50-impl} 70 | \emph{The Implementation of Lua 5.0} \\ 71 | Roberto Ierusalimschy, Luiz Henrique de Figueiredo, 72 | Waldemar Celes \\ 73 | Journal of Universal Computer Science 11 \#7 (2005) \\ 74 | \url{http://www.jucs.org/jucs_11_7/the_implementation_of_lua} 75 | 76 | \bibitem{lj-ffi-api} 77 | \emph{ffi.* API Functions} \\ 78 | Mike Pall \\ 79 | \url{http://luajit.org/ext_ffi_api.html} \\ 80 | Accessed: May 11th, 2015. 81 | 82 | \bibitem{luaffi} 83 | \emph{luaffi: Standalone FFI library for calling C functions from Lua} \\ 84 | James McKaskill \\ 85 | \url{https://github.com/jmckaskill/luaffi} \\ 86 | Accessed: May 11th, 2015. 87 | 88 | \bibitem{lj-ffi-semantic} 89 | \emph{FFI Semantics} \\ 90 | Mike Pall \\ 91 | \url{http://luajit.org/ext_ffi_semantics.html} \\ 92 | Accessed: May 11th, 2015. 
93 | 94 | \bibitem{kb-sunit} 95 | \emph{Simple Smalltalk Testing}, in \emph{Kent Beck's Guide to Better 96 | Smalltalk} \\ 97 | Kent Beck, Donald G. Firesmith \\ 98 | Cambridge University (December 1998), ISBN 978-0-521-64437-2. 99 | 100 | \bibitem{opengroup-dlopen} 101 | \emph{dlopen - open a symbol table handle}, in \emph{The Open Group Base 102 | Specifications Issue 7, IEEE Standard 1003.1 2013 Edition} \\ 103 | The Open Group, IEEE \\ 104 | \url{http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlopen.html} \\ 105 | Accessed: August 30th, 2015. 106 | 107 | \bibitem{scrumban-getting-started} 108 | \emph{Getting Started with Scrumban} \\ 109 | \url{http://www.aboutscrumban.com/how-to-start-using-scrumban/} \\ 110 | Accessed: May 5th, 2015. 111 | 112 | \bibitem{tap-spec} 113 | \emph{TAP Specification} \\ 114 | Michael G. Schwern, Andy Lester \\ 115 | \url{http://testanything.org/tap-specification.html} \\ 116 | Accessed: May 4th, 2015. 117 | 118 | \bibitem{nanovg} 119 | \emph{NanoVG: Antialiased 2D vector drawing library on top of OpenGL for UI 120 | and visualizations} \\ 121 | Various Authors \\ 122 | \url{https://github.com/memononen/nanovg} \\ 123 | Accessed: May 4th, 2015. 124 | 125 | \bibitem{libdwarf-doc} 126 | \emph{A Consumer Library Interface to DWARF} \\ 127 | David Anderson \\ 128 | \url{https://github.com/Distrotech/libdwarf/blob/distrotech-libdwarf/libdwarf/libdwarf2.1.pdf} \\ 129 | Accessed: May 11th, 2015. 130 | 131 | \bibitem{libdwarfp-doc} 132 | \emph{A Producer Library Interface to DWARF} \\ 133 | David Anderson \\ 134 | \url{https://github.com/Distrotech/libdwarf/blob/distrotech-libdwarf/libdwarf/libdwarf2p.1.pdf} \\ 135 | Accessed: May 11th, 2015. 136 | 137 | \bibitem{ninja-manual} 138 | \emph{Ninja documentation} \\ 139 | Evan Martin \\ 140 | \url{http://martine.github.io/ninja/manual.html} \\ 141 | Accessed: May 11th, 2015. 
142 | 143 | \bibitem{gnumake-manual} 144 | \emph{GNU Make: A Program for Directing Recompilation} \\ 145 | Free Software Foundation, ISBN 1-882114-83-3. \\ 146 | \url{https://www.gnu.org/software/make/manual/make.pdf} \\ 147 | Accessed: May 13th, 2015. 148 | 149 | \bibitem{uthash-guide} 150 | \emph{uthash User Guide} \\ 151 | Troy D. Hanson \\ 152 | \url{https://troydhanson.github.io/uthash/userguide.html} \\ 153 | Accessed: May 15th, 2015. 154 | 155 | \bibitem{swig3doc} 156 | \emph{SWIG 3.0 Documentation} \\ 157 | \url{http://swig.org/Doc3.0/SWIGDocumentation.html} \\ 158 | Accessed: June 25th, 2015. 159 | 160 | \bibitem{lusers-BindingCodeToLua} 161 | \emph{Binding Code To Lua}, in \emph{Lua-Users Wiki} \\ 162 | \url{http://lua-users.org/wiki/BindingCodeToLua} \\ 163 | Accessed: June 25th, 2015. 164 | 165 | \bibitem{js-raceforspeed} 166 | \emph{The JavaScript engine family tree}, 167 | in \emph{The race for speed, part 1} \\ 168 | John Dalziel, CreativeJS \\ 169 | \url{http://creativejs.com/2013/06/the-race-for-speed-part-1-the-javascript-engine-family-tree/} \\ 170 | Accessed: August 16th, 2015. 171 | 172 | \bibitem{gobject-introspection} 173 | \emph{GObject Introspection}, in \emph{GNOME Wiki} \\ 174 | \url{https://wiki.gnome.org/Projects/GObjectIntrospection} \\ 175 | Accessed: August 15th, 2015. 176 | 177 | \bibitem{unofficial-dasm-doc} 178 | \emph{The Unofficial DynASM Documentation} \\ 179 | Peter Cawley \\ 180 | \url{http://corsix.github.io/dynasm-doc/} \\ 181 | Accessed: August 8th, 2015 182 | 183 | \bibitem{lj-perf1} 184 | \emph{LuaJIT performance}, in \emph{lua-l mailing list, August 2009} \\ 185 | Mike Pall \\ 186 | \url{http://lua-users.org/lists/lua-l/2009-08/msg00151.html} \\ 187 | Accessed: August 22nd, 2015 188 | 189 | \bibitem{autotools-history} 190 | \emph{The First Configure Programs}, 191 | in \emph{Autoconf, Automake, and Libtool} \\ 192 | Gary V. 
Vaughan, Ben Elliston, Tom Tromey and Ian Lance Taylor \\ 193 | New Riders Publishing (October 2000; updated February 2006). \\ 194 | \url{https://www.sourceware.org/autobook/autobook/autobook_8.html} \\ 195 | Accessed: August 27th, 2015. 196 | 197 | \bibitem{mit-license} 198 | \emph{The MIT License} \\ 199 | \url{http://opensource.org/licenses/mit} \\ 200 | Accessed: September 6th, 2015. 201 | 202 | \bibitem{bsd-licenses} 203 | \emph{BSD licenses}, in \emph{Wikipedia} \\ 204 | Various authors. \\ 205 | \url{https://en.wikipedia.org/wiki/BSD_licenses} \\ 206 | Accessed: September 6th, 2015. 207 | 208 | \bibitem{eol-github} 209 | \emph{Eöl: Fully automatic Lua↔C bridge using DWARF debug information} \\ 210 | Adrián Pérez de Castro \\ 211 | \url{https://github.com/aperezdc/lua-eol/} 212 | 213 | \end{thebibliography} 214 | 215 | -------------------------------------------------------------------------------- /img/lua-logo.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 20 | 22 | 24 | 28 | 33 | 34 | 38 | 43 | 44 | 48 | 53 | 54 | 58 | 63 | 64 | 65 | 66 | 88 | 90 | 91 | 93 | image/svg+xml 94 | 96 | 97 | 98 | 99 | 100 | 105 | 107 | 112 | 117 | 121 | 128 | 135 | 142 | 143 | 148 | 153 | 154 | 155 | 156 | -------------------------------------------------------------------------------- /introduction.tex: -------------------------------------------------------------------------------- 1 | % vim: ft=tex spell spelllang=en ts=2 sw=2 2 | 3 | \cleardoublepage 4 | \setchaptertoc 5 | \chapter{Introduction} 6 | 7 | This chapter explains the reasons which motivate the development of this 8 | project, and provides an outline of the goals and planning for its 9 | realization. 10 | 11 | \afterintro 12 | 13 | \section{Description \& Motivation} 14 | 15 | Most programming languages provide some mechanism to use libraries —sometimes 16 | called \emph{modules}— implemented in some other language. 
Most of the time, 17 | this other language belongs to the family of the C language, which can be 18 | compiled into \emph{native object code}. The reasons are twofold: on the one hand 19 | it allows reusing functionality provided by the system that otherwise would 20 | not be available, and on the other hand it opens the door to implementing 21 | performance-critical pieces of a system using native code. 22 | 23 | Despite the advantages, using native code from a different host programming 24 | language requires creating a layer of software often called a \emph{bridge} (a 25 | \emph{binding} from now on), which wraps the native library to provide an 26 | interface compatible with the run-time environment of the dynamic programming 27 | language. Those bindings, created either manually or with the help of code 28 | generation tools, need to be compiled before they can be used. 29 | 30 | When building native code, compilers are capable of adding 31 | \emph{debugging information} to their output, which can be used to gain 32 | additional insight into a program using a \emph{symbolic debugger}. As 33 | a matter of fact, any other tool capable of understanding the format in which 34 | the compiler writes the debugging information can make use of it for its own 35 | purposes. Among many other details about the source program, debugging 36 | information includes descriptions of the functions compiled as part of each 37 | compilation unit, parameters and their corresponding data types, return types, 38 | and the memory layout of the involved user-defined types, which is a superset 39 | of the information needed to invoke those functions. In other words, the 40 | debugging information contains all the details needed to make library bindings 41 | automatically, potentially allowing dynamic programming languages to invoke 42 | native code directly without any kind of human intervention.
43 | 44 | % The goal of this project is to implement such an automatic invocation method 45 | % for the Lua programming language, using the debugging information in \Dwarf* 46 | % format as generated by the compiler to allow calling into native code from 47 | % arbitrary libraries at run-time, without needing the presence of previously 48 | % created bindings. 49 | 50 | 51 | \section{Project Goals} 52 | \label{sec:project-goals} 53 | 54 | The main goal of this project is to develop an automatic binding system for the 55 | Lua programming language which allows seamless usage of libraries written in 56 | C at runtime. To achieve this, it will use the debugging information generated 57 | by the C compiler. Additionally: 58 | 59 | \begin{itemize} 60 | 61 | \item Modifications to the Lua virtual machine or its core libraries are to 62 | be avoided if possible. The fewer the changes, the lower the maintenance 63 | cost of the system when Lua is updated. An implementation which does not 64 | modify Lua itself would be usable with Lua packages provided by the 65 | operating system, thus easing the setup process. 66 | 67 | \item The implementation will load \gls{ELF} shared objects into the Lua 68 | virtual machine, and use the debugging information in \gls{DWARF} format 69 | present in them. 70 | 71 | \item Values of C types, including user-defined ones, will be readable and 72 | modifiable from Lua. It will also be possible to create new values of 73 | C types from Lua. 74 | 75 | \item Invocation of functions from loaded shared objects will be supported 76 | for functions of arbitrary return types, and any number of parameters of any 77 | supported type. Lua values passed to functions will be automatically 78 | converted to C types whenever possible. Values of C types created from Lua 79 | will also be accepted as valid function parameters. 80 | 81 | \item The implementation will target the GNU/Linux operating system running 82 | on the x86\_64 architecture.
83 | 84 | \item The design of the system will be extensible, allowing support to be added 85 | for more shared object formats, debugging information formats, operating 86 | systems, and architectures. 87 | 88 | \end{itemize} 89 | 90 | 91 | \section{Planning \& Methodologies} 92 | \label{sec:plan-method} 93 | 94 | During the planning phase, the following tasks and subtasks have been 95 | identified: 96 | 97 | \begin{enumerate} 98 | \item Initial study, including: 99 | \begin{enumerate} 100 | \item Understanding how different kinds of data are stored in \gls{ELF} 101 | object files. 102 | \item Identifying the parts of the \gls{DWARF} specification which apply 103 | to the scope of the project. 104 | \item Investigating existing tools which share similar goals. 105 | \end{enumerate} 106 | 107 | \item Analysis, including: 108 | \begin{enumerate} 109 | \item Understanding the relevant parts of the \gls{DWARF} 110 | debugging information format. 111 | \item Getting acquainted with Lua and the implementation 112 | of its \gls{VM}. 113 | \end{enumerate} 114 | 115 | \item Development, including: 116 | \begin{enumerate} 117 | \item Designing the automatic binding system. 118 | \item Implementing the automatic binding mechanism. 119 | \item Testing the system, including: 120 | \begin{itemize} 121 | \item Designing a set of unit and regression tests. 122 | \item Implementing unit and regression tests. 123 | \end{itemize} 124 | \end{enumerate} 125 | 126 | \item Validation, including: 127 | \begin{enumerate} 128 | \item Developing example Lua programs which demonstrate the 129 | capabilities of the system. 130 | \item Rewriting at least one previously existing program to 131 | validate usage of the system in a real-world scenario. 132 | \end{enumerate} 133 | 134 | \item Documentation, including writing of the final report. 135 | % \item Determine whether to use an existing JIT code generator or to 136 | % implement our own. 137 | % \item Design the JIT code generator.
138 | % \item Implement the JIT code generator. 139 | \end{enumerate} 140 | 141 | For each one of the top-level tasks in the list above, 142 | \autoref{tab:effort-estimate} provides an estimation of the time needed for 143 | its completion, using an effort of eight hours per person, per day (8h/p/d). 144 | For the 115 days estimated, the cost of the project would be 59.800€ ($115 \times 8 \times 65$), using 145 | a price of 65€ per hour. 146 | 147 | \begin{table} 148 | \centering 149 | \begin{tabular}{rlrr} 150 | \toprule 151 | \# & Task & Estimation (days) & Cost (€) \\ 152 | \midrule 153 | 1. & Initial study & 10 & 5.200 \\ 154 | 2. & Analysis & 15 & 7.800 \\ 155 | 3. & Development & 50 & 26.000 \\ 156 | 4. & Validation & 10 & 5.200 \\ 157 | 5. & Documentation & 30 & 15.600 \\ 158 | \midrule 159 | & \emph{Total} & 115 & 59.800 \\ 160 | \bottomrule 161 | \end{tabular} 162 | \caption{Effort estimation} 163 | \label{tab:effort-estimate} 164 | \end{table} 165 | 166 | Even though there is only one resource executing the tasks, some techniques 167 | from agile development methodologies are used. Namely: 168 | 169 | \begin{itemize} 170 | \item From Scrum, the concepts of \emph{iteration} and \emph{sprints}, with 171 | their respective planning and review sessions. Daily stand-up meetings are 172 | not used, and there is no \emph{scrum master}: none of those would 173 | make sense given that there is only one person in the team. 174 | \item The \emph{Kanban} methodology is used in order to keep an always 175 | up-to-date dashboard with the status of the tasks. 176 | \end{itemize} 177 | 178 | \begin{figure}[htH] 179 | \centering 180 | \includegraphics[width=0.8\textwidth]{img/trello-board.png} 181 | \caption{Kanban board, showing some tasks of this very project} 182 | \label{fig:kanban-board} 183 | \end{figure} 184 | 185 | The Kanban method was invented by \gls{Toyota} to keep track of the status of 186 | production lines.
This methodology keeps a board (physical, in the original 187 | incarnation of the method; nowadays there are even web-based applications like 188 | the one shown in \autoref{fig:kanban-board}) where each element is a task, and 189 | elements are distributed in columns depending on their status. For example, 190 | applied to software development, the columns could be “Pending”, “In 191 | Progress”, “Testing”, and “Finished”. All the tasks are always visible on the 192 | board, so the overall status of a project can be grasped intuitively by 193 | glancing at it. 194 | 195 | 196 | \beforeintro 197 | -------------------------------------------------------------------------------- /glossary.tex: -------------------------------------------------------------------------------- 1 | % vim: ft=tex spell spelllang=en 2 | % 3 | % Glossary/Acronyms 4 | % 5 | 6 | \newglossaryentry{LuaJIT}{ 7 | name=LuaJIT, 8 | description={ 9 | Just-In-Time compiler (\gls{JIT}) for the Lua programming language. 10 | It is a third-party, independent implementation of the Lua VM created 11 | and maintained by Mike Pall, who allegedly was born on planet Krypton. 12 | Available at \url{http://luajit.org} 13 | }, 14 | } 15 | 16 | \newglossaryentry{transpiler}{ 17 | name=transpiler, 18 | description={Type of compiler that takes source code of a programming 19 | language as its input, and produces a different source code as 20 | output, usually in a different programming language. Also known 21 | as \emph{source-to-source compiler}, or \emph{transcompiler} 22 | }, 23 | } 24 | 25 | \newglossaryentry{lua-users-wiki}{ 26 | name={Lua-Users Wiki}, 27 | description={Community-operated \emph{wiki} site which contains resources 28 | for Lua development, written by users of the programming language 29 | themselves.
URL address: \url{http://lua-users.org/wiki} 30 | }, 31 | } 32 | 33 | \newglossaryentry{pascal}{ 34 | name={Pascal}, 35 | description={ 36 | Procedural programming language designed in the late 60s by Niklaus Wirth 37 | to encourage good programming practices 38 | }, 39 | } 40 | 41 | \newglossaryentry{name-mangling}{ 42 | name={name mangling}, 43 | description={ 44 | Technique used to generate unique names for programming entities, usually by 45 | encoding additional information about the entity in its name 46 | }, 47 | } 48 | 49 | \newglossaryentry{object-oriented}{ 50 | name={object oriented}, 51 | description={ 52 | Programming paradigm based on the concept of \emph{objects}, which are 53 | data structures that encapsulate both data and its behavior 54 | }, 55 | } 56 | 57 | \newglossaryentry{data-deduplication}{ 58 | name={data deduplication}, 59 | description={Any technique which eliminates duplicate copies of repeating 60 | data in order to improve storage utilization 61 | }, 62 | } 63 | 64 | \newglossaryentry{flexible-array-member}{ 65 | name={flexible array member}, 66 | description={ 67 | Feature introduced in the C99 standard of the C programming language 68 | which allows the last member of a \Mc:struct: to be an array of an 69 | unspecified dimension.
Space needed by the array does not contribute 70 | to the size of the \Mc:struct: type, and must be manually accounted 71 | for when allocating the \Mc:struct: from the heap 72 | }, 73 | } 74 | 75 | \newglossaryentry{emulation}{ 76 | name={emulation}, 77 | description={ 78 | Piece of hardware or software that enables one computer system (called 79 | the \emph{host}) to behave like another computer system (called the 80 | \emph{guest}) which enables the host system to run software or use 81 | peripheral devices designed for the guest system 82 | }, 83 | } 84 | 85 | \newglossaryentry{metaprogramming}{ 86 | name={metaprogramming}, 87 | description={ 88 | Writing of computer programs which are able to read, generate, analyze 89 | or transform other programs, or even modify themselves while running 90 | }, 91 | } 92 | 93 | \newglossaryentry{memoization}{ 94 | name={memoization}, 95 | description={ 96 | Optimization technique which stores the results of expensive function 97 | calls, and returns the previously calculated value when the same inputs 98 | occur again 99 | }, 100 | } 101 | 102 | \newglossaryentry{dynamic-programming}{ 103 | name={dynamic programming}, 104 | description={ 105 | Problem solving method —and programming technique— which solves a 106 | complicated problem by breaking it up in smaller problems in a 107 | recursive manner 108 | } 109 | } 110 | 111 | \newglossaryentry{dynamic-dispatch}{ 112 | name={dynamic dispatch}, 113 | description={ 114 | Process of selecting a concrete implementation of a polymorphic 115 | method (or function) at runtime. 
It is typically used in object-oriented 116 | languages when different classes contain different 117 | implementations of the same method due to inheritance 118 | } 119 | } 120 | 121 | \newglossaryentry{fibonacci-number}{ 122 | name={Fibonacci number}, 123 | description={ 124 | Number from the sequence $1, 1, 2, 3, 5, 8, 13, \ldots$, given by the 125 | recurrence relation $F_n = F_{n-1} + F_{n-2}$, with $F_1 = 1$, and 126 | $F_2 = 1$, as defined by the Italian mathematician Leonardo Fibonacci 127 | }, 128 | } 129 | 130 | \newglossaryentry{first-class-value}{ 131 | name={first--class value}, 132 | description={ 133 | In programming language design, an entity which supports all the 134 | operations generally available to other entities of a language, 135 | typically: being passed as a parameter, returned from a function, 136 | and assigned to a variable 137 | }, 138 | } 139 | 140 | \newglossaryentry{closure}{ 141 | name={closure}, 142 | description={ 143 | Technique for implementing lexically scoped name binding in languages 144 | with first--class functions 145 | }, 146 | } 147 | 148 | \newglossaryentry{constructor}{ 149 | name={constructor}, 150 | description={ 151 | Special type of subroutine in a program which is called to create an 152 | object, and performs any initialization needed before the object can 153 | be used 154 | }, 155 | } 156 | 157 | \newglossaryentry{refcounting}{ 158 | name={reference counting}, 159 | description={ 160 | Technique of storing the number of references to an object, block of 161 | memory, disk space, or any other resource, which allows tracking 162 | whether the resource is in use by others 163 | }, 164 | } 165 | 166 | \newglossaryentry{backronym}{ 167 | name={backronym}, 168 | description={ 169 | A \emph{backward acronym} is an acronym constructed in reverse, by 170 | creating a new phrase to fit an existing word, name, or acronym 171 | }, 172 | } 173 | 174 | \newglossaryentry{Toyota}{ 175 | name={Toyota}, 176 | description={Japanese
car manufacturer}, 177 | } 178 | 179 | \newglossaryentry{gls-ABI}{ 180 | name={Application Binary Interface}, 181 | description={ 182 | Interface between two program modules, one of which is 183 | usually a library or the operating system, at the level of machine 184 | code. An ABI determines such details as how functions are called, 185 | and how parameters are passed to them 186 | }, 187 | } 188 | \newacronym[see={[Glossary:]{gls-ABI}}]{ABI}{ABI} 189 | {Application Binary Interface\glsadd{gls-ABI}} 190 | 191 | \newglossaryentry{gls-TUE}{ 192 | name={Type Unit Entry}, 193 | description={A particular kind of DWARF DIE that contains information 194 | about a data type}, 195 | } 196 | \newacronym[see={[Glossary:]{gls-TUE}}]{TUE}{TUE} 197 | {Type Unit Entry\glsadd{gls-TUE}} 198 | 199 | \newglossaryentry{gls-ISA}{ 200 | name={Instruction Set Architecture}, 201 | description={ 202 | Part of the computer architecture related to programming, including 203 | the native data types, instructions, registers, addressing modes, 204 | memory architecture, interrupt and exception handling, and external 205 | I/O. An ISA includes a specification of the machine language, and 206 | the native commands implemented by a particular processor. 
207 | }, 208 | } 209 | \newacronym[see={[Glossary:]{gls-ISA}}]{ISA}{ISA} 210 | {Instruction Set Architecture\glsadd{gls-ISA}} 211 | 212 | \newglossaryentry{gls-GC}{ 213 | name={Garbage Collection}, 214 | description={ 215 | Method of automatic memory management, in which a \emph{garbage 216 | collector} tries to reclaim “garbage” (memory occupied by data no 217 | longer in use by the program) with a certain periodicity 218 | }, 219 | } 220 | \newacronym[see={[Glossary:]{gls-GC}}]{GC}{GC} 221 | {Garbage Collection\glsadd{gls-GC}} 222 | 223 | \newglossaryentry{gls-TAP}{ 224 | name={Test Anything Protocol}, 225 | description={ 226 | }, 227 | } 228 | \newacronym[see={[Glossary:]{gls-TAP}}]{TAP}{TAP} 229 | {Test Anything Protocol\glsadd{gls-TAP}} 230 | 231 | \newacronym{ELF}{ELF}{Executable and Linkable Format} 232 | \newacronym{DWARF}{DWARF}{Debugging With Attributed Record Formats} 233 | \newacronym{JIT}{JIT}{Just-In-Time} 234 | \newacronym{FFI}{FFI}{Foreign Function Interface} 235 | \newacronym{DIE}{DIE}{Debugging Information Entry} 236 | \newacronym{CU}{CU}{Compilation Unit} 237 | \newacronym{FDL}{FDL}{Free Documentation License} 238 | \newacronym{API}{API}{Application Programming Interface} 239 | \newacronym{PIC}{PIC}{Position-Independent Code} 240 | \newacronym{VLA}{VLA}{Variable-Length Array} 241 | \newacronym{PUC-Rio}{PUC-Rio}{Pontifícia Universidade Católica do Rio de Janeiro} 242 | \newacronym{VM}{VM}{Virtual Machine} 243 | \newacronym{IRC}{IRC}{Internet Relay Chat} 244 | \newacronym{OSI}{OSI}{Open Source Initiative} 245 | \newacronym{PNG}{PNG}{Portable Network Graphics} 246 | \newacronym{TDD}{TDD}{Test-Driven Development} 247 | 248 | % 249 | % Those are just simple command-abbreviations to format pieces of text which 250 | % should be always displayed with the same formatting. Using a macro ensures 251 | % that, and makes it easier to come back here and change the formatting for 252 | % all occurrences, if needed. 
253 | % 254 | \def\Eol*{\textsc{\sffamily Eöl}} 255 | -------------------------------------------------------------------------------- /lua-eol-report.tex: -------------------------------------------------------------------------------- 1 | % vim:ft=tex: 2 | % 3 | \documentclass[a4paper, 4 | fontsize=12pt,final, 5 | titlepage=firstiscover, 6 | chapterprefix=true, 7 | appendixprefix=true, 8 | headings=big, 9 | headsepline, 10 | toc=bibliographynumbered, 11 | twoside]{scrbook} 12 | 13 | \usepackage[english]{babel} 14 | \usepackage{hyphenat} 15 | \usepackage{varioref} 16 | 17 | % Graphics support 18 | \usepackage{xcolor} 19 | \usepackage{dirtree} 20 | \usepackage{calc} 21 | \usepackage{tikz} 22 | \usetikzlibrary{arrows,shapes,fit,shadows,positioning,chains,% 23 | decorations.pathreplacing,decorations.pathmorphing,calc,% 24 | matrix} 25 | \usepackage{pgfplots} 26 | 27 | \definecolor{grey}{rgb}{0.5, 0.5, 0.5} 28 | \definecolor{lightgrey}{rgb}{0.95, 0.95, 0.95} 29 | \definecolor{grassgreen}{rgb}{0.1, 0.85, 0.2} 30 | \definecolor{fadedbrown}{rgb}{0.85, 0.1, 0.1} 31 | \definecolor{darkmagenta}{rgb}{0.65, 0.0, 0.65} 32 | \definecolor{lightblue}{rgb}{0.5, 0.75, 1.0} 33 | \definecolor{linkboxcolor}{rgb}{0.8, 0.8, 0.85} 34 | 35 | % \def\checkmark{\tikz\fill (0,.35) -- (.25,0) -- (1,.7) -- (.25,.15) -- cycle;} 36 | \def\checkmark{\tikz\fill 37 | (0, 1) -- (1, 0) -- (2.5, 1.5) -- (1, 0.5) -- cycle;} 38 | 39 | \def\foldersymbol{\tikz\fill[scale=0.25] 40 | (0, 0) -- (1.7, 0) -- (1.7, 1) -- (1, 1) -- 41 | (0.85, 1.25) -- (0.15, 1.25) -- (0, 1) -- cycle;} 42 | \newcommand\inlinesymbol[1]{\resizebox{\widthof{#1}*\ratio{\widthof{x}}{\widthof{\normalsize x}}}{!}{#1}} 43 | 44 | \newcommand\DtFolder[1]{{\inlinesymbol{\color{grey}\foldersymbol}} #1} 45 | % \newcommand\Tick{\inlinesymbol\checkmark} 46 | \newcommand\Tick{ $\star$ } 47 | 48 | % Make a TOC in the generated PDF 49 | \usepackage[pdfstartview=FitH, 50 | linkbordercolor=linkboxcolor, 51 | urlbordercolor=linkboxcolor, 52 | 
linkcolor={blue!80}, 53 | citecolor={blue!80}, 54 | urlcolor={blue!80}, 55 | colorlinks=true, 56 | hidelinks=true, 57 | unicode=true, 58 | linktoc=all]{hyperref} 59 | \providecommand*{\listingautorefname}{Listing} 60 | \providecommand*{\sectionname}{Section} 61 | \providecommand*{\subsectionautorefname}{Subsection} 62 | \usepackage[open]{bookmark} 63 | \bookmarksetup{color=blue} 64 | 65 | \usepackage{placeins} 66 | \usepackage[nohints,tight]{minitoc} 67 | \mtcsetrules{minitoc}{off} 68 | \setlength{\mtcindent}{1ex} 69 | \renewcommand{\mtifont}{\sf\bf\normalsize} 70 | % \renewcommand{\mtcfont}{\footnotesize\bf} 71 | % \renewcommand{\mtcSfont}{\footnotesize\rm} 72 | % \renewcommand{\mtcSSfont}{\footnotesize\rm} 73 | 74 | \newcommand{\setchaptertoc}{% 75 | \setchapterpreamble{\minitoc}} 76 | 77 | \newcommand\beforeintro{% 78 | \begin{center}% 79 | \Large\Symbol{🙠}% 80 | \end{center}% 81 | \FloatBarrier% 82 | } 83 | \newcommand\afterintro{% 84 | \begin{center}% 85 | \Large\Symbol{🙣}% 86 | \end{center}% 87 | } 88 | 89 | 90 | \usepackage[acronym,xindy,toc]{glossaries} 91 | \makeglossaries 92 | \glossarystyle{altlistgroup} 93 | \input{glossary} 94 | 95 | % Fonts 96 | \usepackage[OT1]{fontenc} 97 | \usepackage{fontspec} 98 | \defaultfontfeatures{Ligatures=TeX} 99 | 100 | \setmainfont{Andada}[ 101 | Path = fonts/, 102 | Extension = .ttf, 103 | UprightFont = *-Regular, 104 | ItalicFont = *-Italic, 105 | BoldFont = *-Bold, 106 | BoldItalicFont = *-BoldItalic, 107 | SmallCapsFont = *SC-Regular, 108 | ] 109 | 110 | \newfontfamily\RlwLight{Raleway}[ 111 | Path = fonts/, 112 | Extension = .ttf, 113 | UprightFont = *-ExtraLight, 114 | ItalicFont = *-ExtraLight-Italic, 115 | BoldFont = *-Light, 116 | BoldItalicFont = *-Light-Italic, 117 | ] 118 | 119 | \setsansfont{Raleway}[ 120 | Path = fonts/, 121 | Extension = .ttf, 122 | UprightFont = *-Regular, 123 | ItalicFont = *-Regular-Italic, 124 | BoldFont = *-Bold, 125 | BoldItalicFont = *-Bold-Italic, 126 | ] 127 | 128 | 
\setmonofont{InputMonoNarrow}[ 129 | Scale = MatchLowercase, 130 | Path = fonts/, 131 | Extension = .ttf, 132 | UprightFont = *-Light, 133 | ItalicFont = *-LightItalic, 134 | BoldFont = *-Regular, 135 | BoldItalicFont = *-Italic, 136 | ] 137 | 138 | \newfontfamily\SymbolaFont{Symbola}[ 139 | Path = fonts/, 140 | Extension = .ttf, 141 | ] 142 | \newcommand\Symbol[1]{{\SymbolaFont#1}} 143 | 144 | 145 | \usepackage{setspace} 146 | \onehalfspacing 147 | % \doublespacing 148 | \parskip=6pt 149 | \parindent=10pt 150 | 151 | \usepackage{scrlayer-scrpage} 152 | % Header: 153 | % Inner: section numbering and title 154 | % Outer: chapter number 155 | \ohead{Chapter \thechapter} 156 | \chead{} 157 | \ihead{\rightmark} 158 | % Footer: 159 | % Center: page number 160 | \cfoot{\thepage} 161 | \ifoot{} 162 | \ofoot{} 163 | 164 | \setkomafont{chapterprefix}{\RlwLight\Large} 165 | \setkomafont{chapter}{\RlwLight\bfseries\Huge} 166 | 167 | \usepackage{booktabs} 168 | 169 | % Pretty code listings 170 | \usepackage{scrhack} % Needed to use Minted w/KOMA-Script 171 | \usepackage{minted} 172 | \setminted{ 173 | autogobble = true, 174 | breaklines = true, 175 | codetagify = true, 176 | encoding = utf-8, 177 | outencoding = utf-8, 178 | frame = leftline, 179 | framerule = 5pt, 180 | framesep = 0.65em, 181 | xleftmargin = 1em, 182 | xrightmargin = 1em, 183 | rulecolor = \color{lightgrey}, 184 | } 185 | \newmintinline[Mc]{c}{} 186 | \newminted{c}{ 187 | fontsize = \small, 188 | baselinestretch = 1.0, 189 | } 190 | \newmintinline[Mlua]{lua}{} 191 | \newminted{lua}{ 192 | fontsize = \small, 193 | baselinestretch = 1.0, 194 | } 195 | 196 | 197 | \usepackage[shadow,obeyFinal,textsize=footnotesize]{todonotes} 198 | 199 | \include{hyphenation} 200 | 201 | 202 | \newcommand\PfcTitle[0]{% 203 | Automatic bridging of native code to Lua 204 | using existing debugging information\relax} 205 | \newcommand\PfcAuthor[0]{% 206 | Adrián Pérez de Castro\relax} 207 | \newcommand\PfcDirector[0]{% 208 | Laura 
Milagros Castro Souto\relax} 209 | 210 | 211 | \title{\PfcTitle} 212 | \author{\PfcAuthor} 213 | \hypersetup{% 214 | pdftitle={\PfcTitle},% 215 | pdfauthor={\PfcAuthor},% 216 | pdfkeywords={ (╯°□°)╯︵ ┻━┻), ┬─┬ノ( º _ ºノ)},% 217 | } 218 | 219 | \setlength{\parskip}{2ex plus 1ex minus 1ex} 220 | 221 | \begin{document} 222 | \dominitoc 223 | \pagestyle{empty} 224 | 225 | \begin{titlepage} 226 | \begin{center} 227 | \vspace{7cm} 228 | % Logo 229 | \begin{tikzpicture}[y=0.80pt, x=0.8pt,yscale=-1, inner sep=0pt, outer sep=0pt, scale=0.2] 230 | \path[fill=magenta,nonzero rule] (220.7188,106.4062) -- (382.3633,33.4609) .. 231 | controls (341.6836,12.8594) and (284.3281,-0.0039) .. (220.7227,-0.0039) .. 232 | controls (157.1016,-0.0039) and (99.7461,12.8594) .. (59.0742,33.4609) -- 233 | (220.7188,106.4062); 234 | \path[fill=magenta,nonzero rule] (440.9648,89.9531) .. controls 235 | (436.4648,76.9375) and (427.1289,64.7109) .. (413.8828,53.7188) -- 236 | (233.1914,105.5312) -- (440.9648,89.9531); 237 | \path[fill=magenta,nonzero rule] (414.9375,161.0898) .. controls 238 | (428.0547,149.9570) and (437.2305,137.6055) .. (441.4414,124.4531) -- 239 | (232.9805,109.8984) -- (414.9375,161.0898); 240 | \path[fill=magenta,nonzero rule] (220.7305,109.2188) -- (57.9609,181.6680) .. 241 | controls (98.6992,202.6055) and (156.5195,215.6992) .. (220.7227,215.6992) .. 242 | controls (284.9102,215.6992) and (342.7344,202.6055) .. (383.4805,181.6680) -- 243 | (220.7305,109.2188); 244 | \path[fill=magenta,nonzero rule] (0.0000,124.4531) .. controls (4.2109,137.6055) 245 | and (13.3867,149.9570) .. (26.4961,161.0898) -- (208.4492,109.8984) -- 246 | (0.0000,124.4531); 247 | \path[fill=magenta,nonzero rule] (208.2422,105.5312) -- (27.5625,53.7109) .. 248 | controls (14.3164,64.7070) and (4.9766,76.9336) .. 
(0.4727,89.9531) -- 249 | (208.2422,105.5312); 250 | \end{tikzpicture} 251 | 252 | {\Large\textbf{Facultade de Informática \\ 253 | Universidade da Coruña}} \\ 254 | {\large\textit{Departamento de Computación}} 255 | \vspace{1cm} 256 | 257 | {\large\textsc{Proyecto de Fin de Carrera \\ 258 | Ingeniería Informática}} 259 | \vspace{1cm} 260 | 261 | {\Large\textbf{\PfcTitle}} 262 | \end{center} 263 | 264 | \vfill 265 | 266 | \begin{flushright} 267 | \begin{tabular}{ll} 268 | {\large\textbf{Student:}} & {\large\PfcAuthor} \\ 269 | {\large\textbf{Director:}} & {\large\PfcDirector} \\ 270 | {\large\textbf{Date:}} & {\large\today} 271 | \end{tabular} 272 | \end{flushright} 273 | \end{titlepage} 274 | 275 | \frontmatter 276 | 277 | \clearpage 278 | \listoftodos 279 | 280 | % Dedication 281 | \cleardoublepage 282 | \begin{minipage}[t][6cm][l]{\textwidth} 283 | \vspace{10cm} 284 | \begin{flushright} 285 | \textit{Do it, or don't, but don't try.} 286 | \end{flushright} 287 | \end{minipage} 288 | 289 | % Acknowledgements 290 | \cleardoublepage 291 | \chapter*{Acknowledgements} 292 | 293 | 294 | \begin{minipage}{0.6\textwidth} 295 | \begin{raggedleft} \itshape 296 | 297 | To my wife, who unconditionally supported me during the long hours I have 298 | devoted to this project. 299 | % , and helped to proof-read the final iterations of the 300 | % present document. 301 | 302 | \vspace{2cm} 303 | 304 | To my parents, who did not think that I would ever get this piece of work 305 | done. 306 | 307 | \vspace{2cm} 308 | 309 | Also, I would like to thank my Finnish “adoptive” family, who have kindly 310 | accepted me as one more of them, and who have been of invaluable support. 311 | Their appreciation of knowledge is something I am willing to pass down to 312 | upcoming generations.
313 | 314 | \end{raggedleft} 315 | \end{minipage} 316 | 317 | 318 | % Summary 319 | \cleardoublepage 320 | \include{summary} 321 | 322 | % Keywords 323 | \cleardoublepage 324 | \chapter*{Keywords} 325 | \begin{itemize} 326 | \item Automatic binding generation. 327 | \item ELF. 328 | \item DWARF. 329 | \item Debugging information. 330 | \item Lua programming language. 331 | \item Virtual machines. 332 | \item FFI. 333 | % \item JIT code generation. 334 | \end{itemize} 335 | 336 | % Indexes 337 | \cleardoublepage 338 | \tableofcontents 339 | \listoffigures 340 | \listoftables 341 | \listoflistings 342 | 343 | \mainmatter 344 | \pagestyle{scrheadings} 345 | \include{introduction} 346 | \include{contextualization} 347 | \include{design} 348 | \include{implementation} 349 | \include{conclusions} 350 | 351 | \backmatter 352 | 353 | \include{appendix-installation} 354 | 355 | \cleardoublepage 356 | \printglossaries 357 | 358 | \include{bibliography} 359 | 360 | \end{document} 361 | -------------------------------------------------------------------------------- /conclusions.tex: -------------------------------------------------------------------------------- 1 | % vim: set ft=tex foldlevel=2 spelllang=en spell: 2 | \cleardoublepage 3 | \setchaptertoc 4 | % TODO: Shouldn't this be "Final remarks" or so? 5 | \chapter{Final Remarks} 6 | 7 | This last chapter provides a \emph{post-factum} evaluation of the development 8 | process of the project, which describes the level of completion achieved, 9 | whether the planning schedule was followed through, and the future of \Eol*. 10 | \afterintro 11 | 12 | \section{Achieved Goals} 13 | 14 | We consider that the main goals set at the beginning of this project 15 | (\autoref{sec:project-goals}) have been fulfilled. In particular: 16 | 17 | \begin{itemize} 18 | 19 | \item Development of an automatic binding system for the Lua programming 20 | language, allowing seamless usage of C libraries. 
21 | 22 | \item Use of the DWARF debugging information contained in ELF shared object 23 | files to pinpoint the details about invoked functions, and involved data 24 | types. 25 | 26 | \item Conversion of data values transparently between Lua and C. 27 | 28 | \item Reference implementation for GNU/Linux running on the 29 | x86\_64 architecture. 30 | 31 | \item Full compliance with the Lua C API, avoiding modifications of the 32 | internals of the Lua VM. 33 | 34 | \end{itemize} 35 | 36 | The aforementioned achievements take the shape of the \Eol* system, developed 37 | in the C programming language. \Eol* is a Lua module which implements an FFI 38 | for Lua using the DWARF debugging information to gain knowledge about the 39 | types and functions available in ELF shared object files. 40 | 41 | The source code has been publicly available~\cite{eol-github} under the terms 42 | of the \gls{OSI}-approved MIT license~\cite{mit-license} since the very moment the development 43 | effort was started, and so the code repository contains the complete history 44 | of changes it has received so far. The MIT license was chosen because most 45 | Lua-related projects use either the BSD or MIT license, and using a compatible 46 | license allows developers to confidently mix \Eol* with other Lua modules 47 | right away. Also, referring to the MIT license is clearer because it avoids 48 | the potential confusion derived from the existence of different variants of 49 | the BSD license~\cite{bsd-licenses}. 50 | 51 | Separating the code which deals with native function invocation allows us to 52 | select which method is used to invoke native functions from Lua at 53 | build time. Two backends were implemented: one using \verb|libffi|, and 54 | a second one using JIT compilation (by means of DynASM) for the Intel x86\_64 55 | architecture to generate the needed glue code. 
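
The build-time selection of the invocation backend can be illustrated with a minimal C sketch. It is not taken from the \Eol* sources: the \verb|EOL_FCALL_JIT| macro and the \verb|fcall_*| names are hypothetical, and the common entry point simply calls through the function pointer, whereas a real backend would prepare the call with \verb|libffi| or emit glue code with DynASM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names: EOL_FCALL_JIT, fcall_backend_name() and
 * fcall_invoke_int() are illustrative, not taken from the Eol sources. */

typedef int (*int_unary_fn) (int);

#if defined(EOL_FCALL_JIT)
/* Selected with -DEOL_FCALL_JIT: stands in for the DynASM-based backend. */
const char *fcall_backend_name (void) { return "jit"; }
#else
/* Default: stands in for the libffi-based backend. */
const char *fcall_backend_name (void) { return "libffi"; }
#endif

/* Common entry point; the caller does not know which backend was built.
 * In this sketch the call is a plain indirect call in both cases. */
int fcall_invoke_int (int_unary_fn fn, int arg)
{
    assert (fn != NULL);
    return fn (arg);
}

/* Example native function to be invoked through the entry point. */
int twice (int x) { return 2 * x; }
```

Under these assumptions, compiling with \verb|-DEOL_FCALL_JIT| changes which backend is built, while callers of \verb|fcall_invoke_int| remain unchanged; this is the property that the separation of the invocation code is meant to provide.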
56 | 57 | Additionally, a test harness for Lua has been developed, due to the existing 58 | Lua unit testing frameworks aborting their execution in the event of a crash 59 | of the process. The harness has been implemented mostly in Lua, plus a small 60 | helper module written in C to access Unix system calls which are not covered 61 | by the Lua standard library. 62 | 63 | The implementation of \Eol* consists of approximately 7.000 \todo{Update 64 | numbers later if needed} lines of code (standard LOC, ignoring comments, see 65 | \autoref{fig:eol-loc}), of which the biggest part is the implementation of the 66 | \Eol* Lua module, as expected. 67 | 68 | \begin{figure}[ht] 69 | \centering 70 | \begin{tikzpicture} 71 | \begin{axis}[ 72 | style={/pgf/number format/assume math mode=true}, 73 | width=0.7\textwidth, 74 | ybar, axis on top, 75 | ymajorgrids, tick align=inside, 76 | major grid style={draw=white}, 77 | enlarge y limits={value=.1,upper}, 78 | axis x line*=bottom, 79 | axis y line*=right, 80 | y axis line style={opacity=0}, 81 | enlarge x limits=0.25, 82 | tickwidth=0pt, 83 | xtick=data, 84 | bar width=1cm, 85 | symbolic x coords={C, Lua, Shell, Make, Ninja}, 86 | nodes near coords, 87 | ymin=0, 88 | ] 89 | \addplot[draw=none, fill=blue!30] coordinates { 90 | (Lua,1110) (C,4600) (Shell,550) (Make,200) (Ninja,290) 91 | }; 92 | \end{axis} 93 | \end{tikzpicture} 94 | \caption{Lines of code in \Eol*, per language.} 95 | \label{fig:eol-loc} 96 | \end{figure} 97 | 98 | Last but not least, the \Eol* FFI module has been tested thoroughly, in two 99 | ways: 100 | 101 | \begin{itemize} 102 | 103 | \item With an automated test suite, which can be used for regression 104 | testing—and has been used as such to ensure that the behaviour of code 105 | generated by the JIT compilation of function invocations works the same 106 | way as the \verb|libffi|-based method. 107 | 108 | \item Writing example programs which exercise the module. 
These use third 109 | party libraries used in real-world projects, so the example programs 110 | stress the FFI in the way it is intended to be used. 111 | 112 | \end{itemize} 113 | 114 | \section{Lessons Learned} 115 | 116 | During the development of the project, the specification of the ELF and DWARF 117 | standards has been analyzed, and the relevant parts which are useful for 118 | implementing a \gls{FFI} have been identified. A comprehensive understanding 119 | of the specification was acquired thanks to the following documentation: 120 | 121 | \begin{itemize} 122 | 123 | \item \emph{How Debugging Works}~\cite{howdebugworks}: Tutorial-style 124 | series of articles which give a good overview of how debugging 125 | information is embedded into compiled object code. 126 | 127 | \item \emph{System V Application Binary Interface Edition 128 | 4.1}~\cite{elfspec-sysv}: Contains the original (non-normative) 129 | specification of the ELF object file format. Though newer, normative 130 | versions of the specification exist, this version explains the concepts 131 | needed to understand the DWARF debugging information in a more 132 | approachable way. 133 | 134 | \item \emph{DWARF Debugging Information Format Version 135 | 4}~\cite{dwarfspecv4}: Normative specification of the DWARF debugging 136 | information format. 137 | 138 | \end{itemize} 139 | 140 | Existing FFI implementations for Lua have been analyzed to understand how they 141 | work, which was valuable knowledge to keep in mind while designing how \Eol* 142 | bridges native code to Lua. In particular, the LuaJIT \verb|ffi| module was 143 | taken as a prime example of a proven solution which is popular in the Lua 144 | community. The following documents were instrumental for the design of the 145 | developed solution: 146 | 147 | \begin{itemize} 148 | 149 | \item \emph{FFI semantics}~\cite{lj-ffi-semantic}: Describes how the 150 | module interacts both with Lua, and the compiled C code. 
151 | 152 | \item \emph{ffi.* API functions}~\cite{lj-ffi-api}: Describes the API 153 | of the \verb|ffi| module. 154 | 155 | \end{itemize} 156 | 157 | Implementing the JIT code generation required learning how to use LuaJIT's 158 | DynASM, which lacks official documentation. It was also necessary to learn 159 | assembly programming for the Intel x86 platform, and its \gls{ABI} calling 160 | conventions, during the development of the JIT code generator. 161 | 162 | 163 | \section{Planning Results} 164 | 165 | \begin{table}[ht] 166 | \centering 167 | \begin{tabular}{rlrrrr} 168 | \toprule 169 | & & \multicolumn{2}{c}{Time (days)} & \multicolumn{2}{c}{Cost (€)} \\ 170 | \cmidrule(r){3-6} 171 | \# & Task & Estimated & Actual & Estimated & Actual \\ 172 | \midrule 173 | 1. & Initial study & 10 & 5 & 5.200 & 2.600 \\ 174 | 2. & Analysis & 15 & 23 & 7.800 & 11.960 \\ 175 | 3. & Development & 50 & 80 & 26.000 & 41.600 \\ 176 | 4. & Validation & 10 & 8 & 5.200 & 4.160 \\ 177 | 5. & Documentation & 30 & 78 & 15.600 & 40.560 \\ 178 | \midrule 179 | & \emph{Total} & 115 & 194 & 59.800 & 100.880 \\ 180 | \bottomrule 181 | \end{tabular} 182 | \caption{Estimated vs.\ actual schedule and cost} 183 | \label{tab:sched-postmortem} 184 | \end{table} 185 | 186 | The planning done \emph{a priori} (\autoref{sec:plan-method}) resulted in a 187 | tight schedule which did not include enough clearance for anything other than 188 | the smallest of the unexpected delays~(cf.\ \autoref{tab:sched-postmortem}). 189 | In hindsight, it would have been good to schedule additional time to cope 190 | with the multiple causes of delays: 191 | 192 | \begin{itemize} 193 | 194 | \item The main deviation, filed under \emph{Documentation}, arose 195 | because the effort of writing the final report was underestimated by 196 | a wide margin. 
The estimation was overly optimistic, and while for someone 197 | used to doing documentation work on a regular basis it might have been an 198 | adequate estimation, that was not the case for the author. Plus, there was 199 | the added difficulty of writing the documentation in English: while 200 | capable of its fluent use on a daily basis, the author is not a native 201 | speaker and lacked experience in writing long-form technical documentation 202 | in it. 203 | 204 | \item The additional time needed for the \emph{Analysis} was motivated by 205 | the need to read complex normative documentation, mainly the 206 | specifications of the \gls{ELF} and \gls{DWARF} standards, which run to 207 | over 320 and 100 pages, respectively. Even though not the whole text of 208 | the specifications was relevant for the present project, it was necessary to 209 | wade through them to acquire the concepts needed to understand the 210 | rest. 211 | 212 | \item As for the \emph{Development} phase, it took a good amount of 213 | unplanned time to understand how to use DynASM for JIT code generation due to 214 | the utter lack of documentation, which required frequent detours to read 215 | parts of its source code. The most complete resource on 216 | DynASM~\cite{unofficial-dasm-doc} was written by a third party, is not 217 | part of the official documentation, and was found when most of the 218 | deviation had already taken place. 219 | 220 | \end{itemize} 221 | 222 | 223 | On the other side of the spectrum, the \emph{Initial study} was carried out 224 | faster than planned: the preexisting knowledge about Lua, LuaJIT, and the 225 | existing solutions to use native libraries with them was a valuable asset. 226 | 227 | The final cost of the project has increased in line with the additional time 228 | needed for its completion, being now 100.880€ instead of the planned 59.800€. 
229 | This figure assumes a cost of 65€ per hour, which is a current average value 230 | used by the author's company, and an 8-hour work day (520€ per day); it does not take into 231 | account the time devoted by the tutor of the project. 232 | 233 | 234 | \section{Future Directions} 235 | 236 | Every software project has room for improvement and continued refinement, and 237 | \Eol* is no exception. There are a number of features which have been 238 | knowingly left out of the present project, in order to keep its scope under 239 | control. It is the intention of the author to keep developing \Eol* as a Free 240 | Software project, and there are a number of ideas for future development which 241 | have surfaced during the realization of its current version. 242 | 243 | The following are ideas which are complex to realize, and even though working 244 | on them would require a big development effort, they open the path to exciting 245 | new possibilities: 246 | 247 | \begin{itemize} 248 | 249 | \item Defining an on-disk format for type and function information. The 250 | idea would be to obtain the information from the DWARF debugging 251 | information, and store it in a format which is optimized for faster 252 | reading. The files in this new format would be used for on-disk caching. 253 | 254 | \item Using the file format from the previous item to 255 | read type information directly, without requiring that 256 | the ELF shared objects include DWARF debugging information. 257 | 258 | \item Saving the generated code to disk in ELF object files, when \Eol* is 259 | built with JIT code generation enabled. The generated code could be loaded 260 | reusing the Lua module loader. 261 | 262 | \item Implementing support for reading debugging information in formats 263 | other than DWARF. Ideally, there would be an interface that a “type 264 | information provider” component could implement, and the DWARF provider 265 | would be just one of many. 
266 | 267 | \end{itemize} 268 | 269 | The following fall into the category of improvements which can be done with 270 | a moderate effort, and would certainly provide added value to the project: 271 | 272 | \begin{itemize} 273 | 274 | \item Building a community. Although \Eol* is already Free Software, it lacks 275 | a community, and it would be interesting to foster a healthy one around 276 | the project. That would require writing more documentation (e.g. a quick 277 | start tutorial, a walkthrough of the features), and having public 278 | communication channels (e.g. a mailing list, an \gls{IRC} chat room, 279 | participating in the \verb|lua-l| list...). 280 | 281 | \item Ensuring compatibility. Making sure that \Eol* works with Lua 5.2, 282 | and 5.1 would favour maximum adoption, since those are the versions most 283 | widely deployed.\todo{See if it's possible to add a citation} 284 | 285 | \item Enabling use of \Eol* with LuaJIT. Most modules using the Lua C API 286 | can be built for LuaJIT as well. LuaJIT is designed to be compatible with 287 | Lua 5.1, while the current implementation targets version 5.3. 288 | 289 | \item Implementing JIT compilation of function invocations for 290 | architectures other than x86, and x86\_64. This can be done with ease for 291 | the other architectures supported by DynASM: ARM, MIPS, and PowerPC. 
292 | 293 | \end{itemize} 294 | 295 | \beforeintro 296 | -------------------------------------------------------------------------------- /slides.tex: -------------------------------------------------------------------------------- 1 | % vim:ft=tex: 2 | % 3 | \documentclass[luatex]{beamer} 4 | 5 | \setbeamercolor{background canvas}{bg=white} 6 | \setbeamercolor{normal text}{fg=black!90} 7 | \setbeamerfont{text}{size*={14}{1.4em}} 8 | 9 | \setbeamertemplate{frametitle}{% 10 | \begin{centering}% 11 | \bigskip\Huge\insertframetitle\par\smallskip% 12 | \end{centering}% 13 | } 14 | \setbeamertemplate{navigation symbols}{} 15 | \setbeamertemplate{footline}[text line]{} 16 | 17 | \usepackage[english]{babel} 18 | \usepackage{booktabs} 19 | \usepackage{calc} 20 | \usepackage{tikz} 21 | \usetikzlibrary{arrows,shapes,fit,shadows,positioning,chains,% 22 | decorations.pathreplacing,decorations.pathmorphing,calc,% 23 | matrix} 24 | \usepackage{pgfplots} 25 | 26 | \definecolor{grey}{rgb}{0.5, 0.5, 0.5} 27 | \definecolor{lightgrey}{rgb}{0.95, 0.95, 0.95} 28 | \definecolor{grassgreen}{rgb}{0.1, 0.85, 0.2} 29 | \definecolor{fadedbrown}{rgb}{0.85, 0.1, 0.1} 30 | \definecolor{darkmagenta}{rgb}{0.65, 0.0, 0.65} 31 | \definecolor{lightblue}{rgb}{0.5, 0.75, 1.0} 32 | \definecolor{linkboxcolor}{rgb}{0.8, 0.8, 0.85} 33 | 34 | % Minted 35 | \usepackage{minted} 36 | \setminted{ 37 | autogobble = true, 38 | breaklines = true, 39 | codetagify = true, 40 | encoding = utf-8, 41 | outencoding = utf-8, 42 | frame = leftline, 43 | framerule = 5pt, 44 | framesep = 0.65em, 45 | xleftmargin = 1em, 46 | xrightmargin = 1em, 47 | rulecolor = \color{lightgrey}, 48 | } 49 | \newmintinline[Mc]{c}{} 50 | \newminted{c}{ 51 | fontsize = \small, 52 | baselinestretch = 1.0, 53 | } 54 | \newmintinline[Mlua]{lua}{} 55 | \newminted{lua}{ 56 | fontsize = \small, 57 | baselinestretch = 1.0, 58 | } 59 | 60 | % Fonts 61 | \usepackage[OT1]{fontenc} 62 | \usepackage{fontspec} 63 | \defaultfontfeatures{Ligatures=TeX} 64 | 
65 | \setmainfont{Andada}[ 66 | Path = fonts/, 67 | Extension = .ttf, 68 | UprightFont = *-Regular, 69 | ItalicFont = *-Italic, 70 | BoldFont = *-Bold, 71 | BoldItalicFont = *-BoldItalic, 72 | SmallCapsFont = *SC-Regular, 73 | ] 74 | 75 | \newfontfamily\RlwLight{Raleway}[ 76 | Path = fonts/, 77 | Extension = .ttf, 78 | UprightFont = *-ExtraLight, 79 | ItalicFont = *-ExtraLight-Italic, 80 | BoldFont = *-Light, 81 | BoldItalicFont = *-Light-Italic, 82 | ] 83 | 84 | \setsansfont{Raleway}[ 85 | Path = fonts/, 86 | Extension = .ttf, 87 | UprightFont = *-Regular, 88 | ItalicFont = *-Regular-Italic, 89 | BoldFont = *-Bold, 90 | BoldItalicFont = *-Bold-Italic, 91 | ] 92 | 93 | \setmonofont{InputMonoNarrow}[ 94 | Scale = MatchLowercase, 95 | Path = fonts/, 96 | Extension = .ttf, 97 | UprightFont = *-Light, 98 | ItalicFont = *-LightItalic, 99 | BoldFont = *-Regular, 100 | BoldItalicFont = *-Italic, 101 | ] 102 | 103 | \newfontfamily\SymbolaFont{Symbola}[ 104 | Path = fonts/, 105 | Extension = .ttf, 106 | ] 107 | \newcommand\Symbol[1]{{\SymbolaFont#1}} 108 | 109 | % Symbols 110 | \newcommand\LeafOpen{\Symbol{🙠}} 111 | \newcommand\LeafClose{\Symbol{🙣}} 112 | 113 | \title{\textsc{Eöl}} 114 | \subtitle{Automatic bridging of native code to Lua using existing debugging information} 115 | \author{Adrián Pérez de Castro} 116 | \institute[UDC]{ 117 | \begin{tikzpicture}[y=0.80pt, x=0.8pt,yscale=-1, inner sep=0pt, outer sep=0pt, scale=0.2] 118 | \path[fill=magenta,nonzero rule] (220.7188,106.4062) -- (382.3633,33.4609) .. 119 | controls (341.6836,12.8594) and (284.3281,-0.0039) .. (220.7227,-0.0039) .. 120 | controls (157.1016,-0.0039) and (99.7461,12.8594) .. (59.0742,33.4609) -- 121 | (220.7188,106.4062); 122 | \path[fill=magenta,nonzero rule] (440.9648,89.9531) .. controls 123 | (436.4648,76.9375) and (427.1289,64.7109) .. (413.8828,53.7188) -- 124 | (233.1914,105.5312) -- (440.9648,89.9531); 125 | \path[fill=magenta,nonzero rule] (414.9375,161.0898) .. 
controls 126 | (428.0547,149.9570) and (437.2305,137.6055) .. (441.4414,124.4531) -- 127 | (232.9805,109.8984) -- (414.9375,161.0898); 128 | \path[fill=magenta,nonzero rule] (220.7305,109.2188) -- (57.9609,181.6680) .. 129 | controls (98.6992,202.6055) and (156.5195,215.6992) .. (220.7227,215.6992) .. 130 | controls (284.9102,215.6992) and (342.7344,202.6055) .. (383.4805,181.6680) -- 131 | (220.7305,109.2188); 132 | \path[fill=magenta,nonzero rule] (0.0000,124.4531) .. controls (4.2109,137.6055) 133 | and (13.3867,149.9570) .. (26.4961,161.0898) -- (208.4492,109.8984) -- 134 | (0.0000,124.4531); 135 | \path[fill=magenta,nonzero rule] (208.2422,105.5312) -- (27.5625,53.7109) .. 136 | controls (14.3164,64.7070) and (4.9766,76.9336) .. (0.4727,89.9531) -- 137 | (208.2422,105.5312); 138 | \end{tikzpicture} 139 | \vspace{0.5em} 140 | 141 | Universidade da Coruña} 142 | \date[Sep 2015]{September, 2015} 143 | 144 | \begin{document} 145 | 146 | \setbeamertemplate{background canvas}{% 147 | \includegraphics[height=\paperheight]{img/forest.jpg}} 148 | % \setbeamercolor{title}{fg=black} 149 | % \setbeamercolor{block body}{fg=white} 150 | 151 | \maketitle 152 | 153 | \setbeamertemplate{background canvas}{} 154 | % \setbeamercolor{normal text}{fg=black!90} 155 | 156 | \begin{frame}{Outline} 157 | \tableofcontents 158 | \end{frame} 159 | 160 | 161 | \section{Lua} 162 | 163 | \subsection{Quick Introduction} 164 | 165 | \begin{frame} 166 | 167 | \centering 168 | \includegraphics[height=0.15\textheight]{img/lua-logo.pdf} 169 | \vspace{3em} 170 | \begin{quote} 171 | Lua is a powerful, fast, lightweight, embeddable 172 | scripting language. 
173 | \begin{flushright} 174 | \begin{scriptsize} 175 | --- \url{http://www.lua.org/about.html} 176 | \end{scriptsize} 177 | \end{flushright} 178 | \end{quote} 179 | \vfill 180 | 181 | \begin{itemize} 182 | \item Single data structure \visible<2->{$\rightarrow$ \emph{tables}} 183 | \item Extensible semantics \visible<3->{$\rightarrow$ \emph{metatables}} 184 | \item Automatic memory management \visible<4->{$\rightarrow$ \emph{GC}} 185 | \item Bytecode, register-based VM 186 | \end{itemize} 187 | 188 | \note[itemize]{ 189 | \item Initially created as a data description language at Tecgraf 190 | PUC-Rio, for in-house software development: trade barriers were in 191 | effect for software and computer hardware. 192 | \item Petrobras was one of the first users of SOL and DEL, the 193 | predecessors of Lua. 194 | \item Tables can be used as hash tables, arrays, and objects. 195 | Particularly well suited for data description. 196 | \item Influenced by Modula (control structures), AWK (tables), and 197 | LISP (everything is a list\^W table) 198 | \item Widely used in the games industry (WoW). 199 | } 200 | \end{frame} 201 | 202 | 203 | \begin{frame}[fragile]{Lua by Example} 204 | \begin{luacode} 205 | animal = { 206 | name = "Unnamed", 207 | kind = "living creature", 208 | describe = function (self) 209 | print(self.name .. " is a " .. self.kind) 210 | end, 211 | } 212 | 213 | f = setmetatable({ kind="cat", name="Fifi" }, 214 | { __index=animal }) 215 | t = setmetatable({ name="Tom", }, 216 | { __index=animal }) 217 | f:describe() --> Fifi is a cat 218 | t:describe() --> Tom is a living creature 219 | \end{luacode} 220 | 221 | \note[itemize]{ 222 | \item First a base object (which is a table) is defined 223 | \item Functions are first-class values, and as such can 224 | be values in a table 225 | \item Metatables allow defining the runtime behaviour of 226 | certain operations. 
Here we set \texttt{\_\_index} 227 | so the fields not found in \texttt{f} or \texttt{t} 228 | are looked up in the \texttt{animal} table instead. This 229 | effectively creates a prototype-style chain of objects. 230 | } 231 | \end{frame} 232 | 233 | 234 | \subsection{Accessing Native Code} 235 | 236 | 237 | \begin{frame}[fragile]{Going Native: Try I} 238 | 239 | Fact: Lua has a very small standard library. 240 | 241 | \pause 242 | \textbf{How is additional functionality provided?} 243 | 244 | \pause 245 | \vspace{2em} 246 | 247 | \begin{luacode} 248 | function isdir(path) 249 | local fd = io.popen("test -d " .. path) 250 | fd:read("*a") -- Discard output 251 | local ok, reason, code = fd:close() 252 | return ok and reason == "exit" and code == 0 253 | end 254 | \end{luacode} 255 | \vspace{2em} 256 | 257 | \pause 258 | \hfill …but this is \emph{cheating} 259 | 260 | \pause 261 | \hfill …and horrible in many ways 262 | 263 | \note[itemize]{ 264 | \item Fatality 1: Spawning a child process just to check something that is 265 | provided as a system call. 266 | \item Fatality 2: The \texttt{path} variable needs to be quoted properly. 267 | \item Fatality 3: \texttt{io.popen} uses \texttt{popen()}, which does 268 | shell expansion, and is a security hole. 269 | \item What we really want is to be able to call native functions. 270 | } 271 | 272 | \end{frame} 273 | 274 | 275 | \begin{frame}{Going Native: Try II} 276 | 277 | Civilized ways: 278 | 279 | \begin{enumerate} 280 | \item Lua C API 281 | \item Wrappers over the C API 282 | \item Binding generators 283 | \item Foreign Function Interfaces 284 | \item \textsc{Eöl} (this project) 285 | \end{enumerate} 286 | 287 | \end{frame} 288 | 289 | 290 | \begin{frame}[fragile]{\texttt{isatty}} 291 | \begin{luacode} 292 | function isatty(fileno) 293 | -- ??? 
294 | end 295 | \end{luacode} 296 | \end{frame} 297 | 298 | 299 | \begin{frame}[fragile]{\texttt{isatty}: Lua C API} 300 | 301 | C: 302 | 303 | \begin{ccode} 304 | static int f_isatty(lua_State *L) { 305 | lua_Integer fileno = luaL_checkinteger (L, 1); 306 | lua_pushboolean (L, isatty ((int) fileno)); 307 | return 1; 308 | } 309 | 310 | int luaopen_isatty (lua_State *L) { 311 | lua_pushcfunction (L, f_isatty); 312 | return 1; 313 | } 314 | \end{ccode} 315 | 316 | Lua: 317 | 318 | \begin{luacode} 319 | local isatty = require("isatty") 320 | print(isatty(0), type(isatty(0))) 321 | -- Output: true boolean 322 | \end{luacode} 323 | \end{frame} 324 | 325 | 326 | \begin{frame}{Binding Generators} 327 | \begin{itemize} 328 | \item Tools that create bindings in an automated way. 329 | \item<2-> They require an extra compilation step. 330 | \item<2-> Cleanup of the C definition may be needed. 331 | \end{itemize} 332 | 333 | \note[itemize]{ 334 | \item Some Lua-specific binding generators exist, but they are either 335 | outdated or not very complete. 336 | \item SWIG is the standard to beat... but it's still a binding 337 | generator, requires an extra step, and so on. 
338 | } 339 | \end{frame} 340 | 341 | 342 | \begin{frame}[fragile]{\texttt{isatty}: LuaJIT FFI} 343 | \begin{luacode} 344 | local ffi = require("ffi") 345 | ffi.cdef("int isatty(int)") 346 | local isatty = ffi.C.isatty 347 | print(isatty(0), type(isatty(0))) 348 | -- Output: 1 number 349 | \end{luacode} 350 | 351 | \visible<2->{ 352 | \begin{itemize} 353 | \item No need to manually write glue code 354 | \item Precise type information 355 | \end{itemize} 356 | } 357 | 358 | \end{frame} 359 | 360 | 361 | \begin{frame}[fragile]{\texttt{isatty}: Ideal FFI} 362 | \begin{luacode} 363 | local eol = require("eol") 364 | local isatty = eol.C.isatty 365 | print(isatty(0), type(isatty(0))) 366 | -- Output: 1 number 367 | \end{luacode} 368 | 369 | \hspace{4em} 370 | \visible<2->{ 371 | \begin{itemize} 372 | \item No need to manually write glue code 373 | \item Precise type information 374 | \item \textbf{No manual function declaration} (\Mlua|ffi.cdef|) 375 | \end{itemize} 376 | } 377 | 378 | \end{frame} 379 | 380 | 381 | \begin{frame}{Ideal FFI = \textsc{Eöl}} 382 | \begin{quote} 383 | In order to implement the “ideal” FFI we need information 384 | about functions, their parameters, and involved data types. 385 | \end{quote} 386 | \pause 387 | \begin{itemize} 388 | \item The compiler already knows that information. 389 | \pause 390 | \item The compiler \emph{can} write it as debugging information. 
391 | \end{itemize} 392 | \end{frame} 393 | 394 | 395 | \section{\textsc{Eöl}} 396 | 397 | \pgfdeclarelayer{background} 398 | \pgfdeclarelayer{foreground} 399 | \pgfsetlayers{background,main,foreground} 400 | 401 | \tikzstyle{bdBox} = [ 402 | rectangle, drop shadow, draw=black, thick, fill=white, 403 | text centered, minimum height=2em, minimum width=3em, 404 | ] 405 | \tikzstyle{bdProcBox} = [ 406 | rounded corners, fill=blue!10, text centered 407 | ] 408 | \tikzstyle{bdProcLine} = [ 409 | draw, thick, color=blue!20 410 | ] 411 | \tikzstyle{bdCircle} = [ 412 | circle, fill=blue!20, draw=black, 413 | ] 414 | \tikzstyle{bdLine} = [draw, thick] 415 | \tikzstyle{bdArrow} = [bdLine, >=triangle 45, ->] 416 | 417 | 418 | \tikzstyle{datablob} = [ 419 | rectangle, rounded corners, drop shadow, draw=black, thick, 420 | text centered, minimum height=2em, minimum width=3em, fill=blue!20, 421 | ] 422 | \tikzstyle{die} = [start chain=going below, node distance=1mm] 423 | \tikzstyle{dielabel} = [on chain] 424 | \tikzstyle{dieitems} = [ 425 | rectangle split, rectangle split parts=#1, rectangle split part align=left, 426 | thick, draw, fill=blue!10, on chain, 427 | ] 428 | \tikzstyle{enumitems} = [ 429 | rectangle split, rectangle split parts=#1, rectangle split part align=left, 430 | thick, rounded corners, 431 | color=black!50, fill=black!5, draw=black!50, 432 | minimum height=3em, 433 | ] 434 | \tikzstyle{valuefrom} = [draw=black!50, thick, dashed] 435 | \tikzstyle{arrow} = [draw, thick, >=triangle 45, ->] 436 | \tikzstyle{datain} = [ 437 | draw=black!80, thick, fill=blue!10, rectangle, 438 | text centered, minimum height=1.3em, text width=10em, 439 | ] 440 | \tikzstyle{component} = [ 441 | draw=black, 442 | thick, 443 | fill=green!10, 444 | rectangle, 445 | text centered, 446 | minimum height=2em, 447 | text width=6em, 448 | rounded corners, 449 | drop shadow, 450 | ] 451 | \tikzstyle{uses} = [ 452 | draw, 453 | very thick, 454 | >=triangle 45, 455 | ->, 456 | dashed, 457 | ] 
458 | \tikzstyle{contains} = [ 459 | draw, 460 | thick, 461 | >=triangle 45, 462 | -*, 463 | ] 464 | 465 | 466 | \begin{frame}{\textsc{Eöl}} 467 | \begin{center} 468 | \Huge 469 | FFI + DWARF/ELF 470 | \end{center} 471 | 472 | \pause 473 | 474 | \begin{center} 475 | Automatic binding system for the Lua programming language which allows 476 | seamless usage, at runtime, of libraries written in C. 477 | \end{center} 478 | \end{frame} 479 | 480 | 481 | \subsection{Architecture} 482 | 483 | \begin{frame}{Design} 484 | 485 | \resizebox{\textwidth}{!}{ 486 | \begin{tikzpicture}[node distance=2cm] 487 | \node[component] (library) {Library}; 488 | \node[component] (typecache) [above of=library] {Type Cache}; 489 | \node[component] (ctype) [right=1cm of library] {CType}; 490 | \node[component] (function) [right=1cm of ctype] {Function}; 491 | \node[component] (variable) [right=1cm of function] {Variable}; 492 | \node[component] (typeinfo) [above of=function] {Type Information}; 493 | \node[datain] (dwarf) [above=1cm of typecache] 494 | {DWARF debugging information}; 495 | 496 | \node (luadata) [below right of=ctype] {Visible in Lua as userdata}; 497 | \node (elf) [above=0cm of dwarf] {ELF shared object}; 498 | 499 | \path[uses] (function) -- (typeinfo); 500 | \path[uses] (variable) -- (typeinfo); 501 | \path[uses] (ctype) -- (typeinfo); 502 | \path[uses] (typecache) -- (dwarf); 503 | \path[contains] (library) -- (typecache); 504 | \path[contains] (typecache) -- (typeinfo); 505 | 506 | \begin{pgfonlayer}{background} 507 | \node[datablob] (elfbox) [fit=(dwarf) (elf), drop shadow] {}; 508 | \node[fill=yellow!20, rectangle, rounded corners] (wrappers) 509 | [fit=(library) (variable) (function) (luadata)] { }; 510 | \end{pgfonlayer} 511 | \end{tikzpicture} 512 | } 513 | 514 | \end{frame} 515 | 516 | 517 | \subsection{DWARF + ELF} 518 | 519 | \begin{frame}{ELF} 520 | \begin{columns} 521 | \begin{column}{0.55\textwidth} 522 | \resizebox{\textwidth}{!}{ 523 | 
\begin{tikzpicture}[node distance=1.5mm, bend angle=0] 524 | \node[bdBox] (elfheader) [minimum width=10em, start chain=going below, on chain] {ELF header}; 525 | \node[bdBox] (prgheader) [minimum width=10em, on chain] {Program header}; 526 | \node[bdBox] (sect-text) [minimum width=10em, on chain] {\texttt{.text}}; 527 | \node[bdBox] (sect-rodata) [minimum width=10em, on chain] {\texttt{.rodata}}; 528 | \node (ellipsis) [minimum width=10em, on chain] {...}; 529 | % \node[bdBox] (sect-data) [minimum width=10em, on chain, yshift=-1em] {\textt{.data}}; 530 | \node[bdBox] (sect-data) [minimum width=10em, on chain] {\texttt{.data}}; 531 | \node[bdBox] (secthdrtable) [minimum width=10em, on chain] {Section header table}; 532 | \draw[decorate, decoration={brace}] let \p1=(sect-text.north), 533 | \p2=(sect-rodata.south) in ($(2.2, \y1)$) -- ($(2.2, \y2)$) 534 | node[midway] (g1) {}; 535 | \draw[decorate, decoration={brace}] let \p1=(ellipsis.north), 536 | \p2=(sect-data.south) in ($(2.2, \y1)$) -- ($(2.2, \y2)$) 537 | node[midway] (g2) {}; 538 | \draw[->, bend right, >=latex, bend right, thick] 539 | (elfheader.east) to [out=90,in=90] (g1.east); 540 | \draw[->, bend right, >=latex, bend right, thick] 541 | (elfheader.east) to [out=90,in=90] (g2.east); 542 | \draw[->, bend left, >=latex, bend right, thick] 543 | (secthdrtable.west) to [out=90,in=90] (sect-text.west); 544 | \draw[->, bend left, >=latex, bend right, thick] 545 | (secthdrtable.west) to [out=90,in=90] (sect-rodata.west); 546 | \draw[->, bend left, >=latex, bend right, thick] 547 | (secthdrtable.west) to [out=90,in=90] (sect-data.west); 548 | \end{tikzpicture} 549 | } 550 | \end{column} 551 | % \begin{column}{0.1\textwidth} 552 | % \end{column} 553 | \begin{column}{0.45\textwidth} 554 | \begin{itemize} 555 | \item Headers 556 | \begin{itemize} 557 | \item Fixed part 558 | \item Variable tables 559 | \end{itemize} 560 | \item Segments 561 | \begin{itemize} 562 | \item Runtime 563 | \item Executable shape 564 | 
\end{itemize} 565 | \item Sections 566 | \begin{itemize} 567 | \item Offline 568 | \item Arbitrary data 569 | \end{itemize} 570 | \end{itemize} 571 | \end{column} 572 | \end{columns} 573 | \end{frame} 574 | 575 | 576 | \begin{frame}{DWARF} 577 | \centering 578 | \resizebox{0.75\textwidth}{!}{ 579 | \begin{tikzpicture}[die] 580 | \node[dielabel] (taglabel) {Tag}; 581 | \node[dieitems=1] (tag) {\texttt{DW\_TAG\_pointer}}; 582 | \node[dielabel] (attrlabel) {Attributes}; 583 | \node[dieitems=1] (attributes) {\texttt{DW\_AT\_type}}; 584 | \node[dielabel] (rtaglabel) [right=3cm of tag, yshift=-3.5mm] {Tag}; 585 | \node[dieitems=1] (rtag) {\texttt{DW\_TAG\_pointer}}; 586 | \node[dielabel] (rattrlabel) {Attributes}; 587 | \node[dieitems=1] (rattributes) {\texttt{DW\_AT\_type}}; 588 | \node[datablob] (basedie) [right=2cm of rattributes] {Type DIE}; 589 | \path[arrow] (rattributes.text east) -- (basedie); 590 | \begin{pgfonlayer}{background} 591 | \node[datablob] (die) [fit=(taglabel) (attributes) (tag)] {}; 592 | \node[datablob] (rdie) [fit=(rtaglabel) (rattributes) (rtag)] {}; 593 | \end{pgfonlayer} 594 | \path[arrow] (attributes.text east) -- (rdie.west); 595 | \end{tikzpicture} 596 | } 597 | 598 | \vspace{2em} 599 | 600 | \begin{columns} 601 | \begin{column}{0.5\textwidth} 602 | ELF sections: 603 | \begin{itemize} 604 | \item \texttt{.debug\_types} 605 | \item \texttt{.debug\_info} 606 | \item \texttt{.debug\_}… 607 | \end{itemize} 608 | \end{column} 609 | \begin{column}{0.5\textwidth} 610 | DWARF information: 611 | \begin{itemize} 612 | \item Tree-like structure 613 | \item Nodes: DIEs \& TUEs 614 | \item Tagged attributes 615 | \end{itemize} 616 | \end{column} 617 | \end{columns} 618 | \end{frame} 619 | 620 | 621 | \section{Demos} 622 | 623 | \begin{frame} 624 | \centering 625 | \Symbol{\Huge 🖮} 626 | \Large 627 | 628 | Demo Time! 
629 | \end{frame} 630 | 631 | 632 | \setbeamertemplate{background canvas}{% 633 | \includegraphics[height=\paperheight]{img/theend3.jpg}} 634 | \begin{frame} 635 | \end{frame} 636 | 637 | \end{document} 638 | -------------------------------------------------------------------------------- /design.tex: -------------------------------------------------------------------------------- 1 | % vim: ft=tex spell spelllang=en ts=2 sw=2 2 | 3 | \cleardoublepage 4 | \setchaptertoc 5 | \chapter{Analysis \& Design} 6 | 7 | This chapter is a tour through the architecture of the developed software 8 | solution, analyzing the relevant decisions that gave it its final shape. 9 | \afterintro 10 | 11 | \section{Naming} 12 | 13 | In the Lua community there is a certain tradition of naming projects after 14 | celestial bodies, or terms related to them (after all, Lua means \emph{moon} 15 | in Portuguese). Unfortunately, the name initially chosen for the project, 16 | Eris (a dwarf planet, neither a planet nor a moon), was already being used 17 | by another Lua-related project\footnote{The Eris persistence system, 18 | \url{https://github.com/fnuecke/eris}}. A closer inspection showed that other 19 | dwarf planet names were already in use for software projects, so in the end 20 | inspiration had to be drawn from a different area. 21 | 22 | Eöl, also known as “The Dark Elf”, is a fictional character in 23 | J. R. R. Tolkien's Middle-earth legendarium, who is said to be the elf with 24 | the closest relationship with dwarves, and one of the first able to speak their 25 | language. \Eol* can also be a \gls{backronym} for “ELF Object Loader”, 26 | which describes well the purpose of the developed solution. 27 | 28 | 29 | \section{Overview} 30 | \label{sec:design-overview} 31 | 32 | The main components of \Eol* are shown in \autoref{fig:eol-architecture}.
33 | 34 | The design of the system revolves around \textsf{Type Information}: it 35 | describes native types in detail, and it is used by all the other components 36 | in different ways to provide their functionality. Its importance should not be 37 | surprising, because the ultimate goal of \Eol* is to allow seamless invocation 38 | of native functions which, being close to the bare metal, always conform to 39 | strict \gls{ABI} specifications. While starting execution of a native function 40 | is as simple as generating a jump machine instruction to its start address in 41 | memory, the function will only behave as expected if the data it uses —its 42 | parameters, space for return values, etc.— is laid out in memory exactly in 43 | the way its machine code expects it to be. In turn, this layout depends on the 44 | types of the values passed to and from the function. 45 | 46 | \tikzstyle{component} = [ 47 | draw=black, 48 | thick, 49 | fill=green!10, 50 | rectangle, 51 | text centered, 52 | minimum height=2em, 53 | text width=6em, 54 | rounded corners, 55 | drop shadow, 56 | ] 57 | \tikzstyle{uses} = [ 58 | draw, 59 | very thick, 60 | >=triangle 45, 61 | ->, 62 | dashed, 63 | ] 64 | \tikzstyle{contains} = [ 65 | draw, 66 | thick, 67 | >=triangle 45, 68 | -*, 69 | ] 70 | 71 | \begin{figure} 72 | \centering 73 | \begin{tikzpicture}[node distance=2cm] 74 | 75 | \node[component] (library) {Library}; 76 | \node[component] (typecache) [above of=library] {Type Cache}; 77 | \node[component] (ctype) [right=1cm of library] {CType}; 78 | \node[component] (function) [right=1cm of ctype] {Function}; 79 | \node[component] (variable) [right=1cm of function] {Variable}; 80 | \node[component] (typeinfo) [above of=function] {Type Information}; 81 | \node[datain] (dwarf) [above=1cm of typecache] 82 | {DWARF debugging information}; 83 | 84 | \node (luadata) [below right of=ctype] {Visible in Lua as userdata}; 85 | \node (elf) [above=0cm of dwarf] {ELF shared object}; 86 | 87 | 
\path[uses] (function) -- (typeinfo); 88 | \path[uses] (variable) -- (typeinfo); 89 | \path[uses] (ctype) -- (typeinfo); 90 | \path[uses] (typecache) -- (dwarf); 91 | \path[contains] (library) -- (typecache); 92 | \path[contains] (typecache) -- (typeinfo); 93 | 94 | \begin{pgfonlayer}{background} 95 | \node[datablob] (elfbox) [fit=(dwarf) (elf), drop shadow] {}; 96 | \node[fill=yellow!20, rectangle, rounded corners] (wrappers) 97 | [fit=(library) (variable) (function) (luadata)] { }; 98 | \end{pgfonlayer} 99 | \end{tikzpicture} 100 | \caption{Architecture of \Eol*.} 101 | \label{fig:eol-architecture} 102 | \end{figure} 103 | 104 | There are four components which form part of the interface to Lua (as 105 | specified in \autoref{sec:design-lua-api}): 106 | 107 | \begin{itemize} 108 | 109 | \item \textsf{Library} (\autoref{sec:eol-api-library-t}) represents a loaded 110 | ELF library. It is responsible for accessing the DWARF debugging information, 111 | and for looking up values of the other types. 112 | 113 | \item \textsf{CType} (\autoref{sec:eol-api-ctype-t}) represents a native 114 | data type. It is responsible for providing information about the represented 115 | native type, and for creating native values of the represented native type 116 | from Lua. The types available to C programs are supported, hence the name. 117 | 118 | \item \textsf{Function} (\autoref{sec:eol-api-function-t}) represents a 119 | fragment of native code contained in a library, which can be invoked as 120 | a function. It is responsible for performing calls into native code 121 | from Lua. 122 | 123 | \item \textsf{Variable} (\autoref{sec:eol-api-variable-t}) represents a 124 | variable from a library. It is responsible for allowing reading and 125 | writing its value from Lua, performing conversions as needed. 126 | 127 | \end{itemize} 128 | 129 | Looking up type information involves reading the DWARF debugging information 130 | from disk and decoding it appropriately.
In order to avoid repeatedly reading 131 | the debugging information from disk to construct new \textsf{Type Information} 132 | values, each \textsf{Library} makes use of a \textsf{Type Cache} which keeps 133 | the information in memory. An additional benefit of the cache is that it 134 | allows reusing the \textsf{Type Information}: many DWARF \gls{DIE}s contain 135 | references to others\todo{Got time? Add a diagram with an example}, and the 136 | cache can be queried to determine whether a referenced DIE has already been 137 | turned into \textsf{Type Information}, so that the data from the cache is used 138 | instead. 139 | 140 | 141 | \section{Interaction With the Lua GC} 142 | \label{sec:design-gc-interaction} 143 | 144 | Userdata values are subject to Lua's \gls{GC} (cf. 145 | \autoref{sec:userdata-lua-custom-allocator}), which poses a problem for the 146 | \textsf{Library} userdata: if the Lua VM does not keep an active reference to 147 | a \textsf{Library} value, the GC will consider it to be garbage, and will 148 | deallocate it while it may still be referenced by other resources. In 149 | particular, a \textsf{Library} must not be unloaded while any 150 | \textsf{CType}, \textsf{Function}, or \textsf{Variable} userdata which belongs 151 | to the library is still being used from Lua. This kind of situation can be triggered by 152 | the following simple sequence of events, illustrated by 153 | \autoref{lst:library-gc-issue}: 154 | 155 | \begin{listing}[ht] 156 | \begin{luacode} 157 | function loadfunction(libname, funcname) 158 | local eol = require("eol") 159 | local lib = eol.load(libname) 160 | return lib[funcname] 161 | end 162 | -- Obtain a userdata for the add() function from libtest.so 163 | add = loadfunction("libtest", "add") 164 | -- Crashes if the GC has already collected the library.
165 | print(add(6, 5)) 166 | \end{luacode} 167 | \caption{Lua example which makes a \textsf{Library} subject to GC} 168 | \label{lst:library-gc-issue} 169 | \end{listing} 170 | 171 | \begin{enumerate} 172 | 173 | \item A library is loaded and returned to Lua as a \textsf{Library} 174 | userdata. The userdata is assigned to a temporary variable (e.g. 175 | a \Mlua|local| variable inside a \Mlua|function|) which eventually 176 | will go out of scope. 177 | \item A \textsf{Function} userdata is obtained for a native function 178 | contained by the library. 179 | \item The GC determines that the \textsf{Library} userdata is garbage, 180 | and frees the resources used by it. This unloads the library. 181 | \item At this point, invoking the function crashes the program because 182 | its machine code, contained in the library, is no longer loaded in 183 | memory. 184 | 185 | \end{enumerate} 186 | 187 | The solution for this problem is to use \gls{refcounting}, to ensure that the 188 | libraries are kept loaded while needed: each active userdata value of type 189 | \textsf{Function}, \textsf{CType}, or \textsf{Variable} contributes to the 190 | reference count. This way, a library is unloaded only when its reference count 191 | reaches zero. 192 | 193 | 194 | \section{Module API} 195 | \label{sec:design-lua-api} 196 | 197 | The \gls{API} exposed by the \Eol* module to the Lua world is loosely modelled 198 | after the one provided by the LuaJIT FFI module~\cite{lj-ffi-api} —also 199 | implemented by the standalone \verb|luaffi| module~\cite{luaffi}—, with some 200 | functions even having the same names and semantics, and others differing where 201 | appropriate. For example, \Eol* does not need to provide a function to parse 202 | C-like declarations because the type information is obtained from the 203 | \gls{DWARF} debugging information instead. 
The goal is to provide an API which 204 | is proven to be suitable for Lua FFIs, and at the same time not force 205 | programmers who have used the LuaJIT FFI (or the standalone \verb|luaffi|) to 206 | learn how to use a completely different API. 207 | 208 | \subsection{The \texttt{eol} Namespace} 209 | 210 | Where the LuaJIT FFI and \verb|luaffi| modules provide their functions in the 211 | \verb|ffi| namespace, \Eol* provides its functionality in the \verb|eol| 212 | namespace. Also, loading the \verb|eol| module should be possible using the 213 | standard Lua module loader, via the \Mlua|require()| function: 214 | 215 | \begin{luacode} 216 | eol = require("eol") 217 | \end{luacode} 218 | 219 | Unless stated otherwise, in the specification of the functions summarized in 220 | \autoref{tab:eol-api-functions-summary}, parameters named \texttt{typevalue} 221 | accept either a \textsf{CType} value or a \textsf{Variable} value; in the latter 222 | case the type associated with the variable is used. Also, functions can 223 | raise Lua errors when the types of values passed to them are incorrect.
224 | 225 | 226 | \begin{table}[ht] 227 | \centering 228 | \begin{tabular}{lcc} 229 | \toprule 230 | Function & LJ & Err \\ 231 | \midrule 232 | \Mlua|library = eol.load(name, global)| & \Tick & \Tick \\ 233 | \Mlua|typeinfo = eol.type(library, name)| & & \Tick \\ 234 | \Mlua|typeinfo = eol.typeof(typevalue)| & \Tick & \Tick \\ 235 | \Mlua|size = eol.sizeof(typevalue)| & \Tick & \\ 236 | \Mlua|alignment = eol.alignof(typevalue)| & \Tick & \\ 237 | \Mlua|offset = eol.offsetof(typevalue, field)| & \Tick &\\ 238 | \Mlua|value = eol.cast(typeinfo, value)| & \Tick & \\ 239 | \Mlua|flag = eol.abi(parameter)| & \Tick & \\ 240 | \bottomrule 241 | \end{tabular} 242 | 243 | \vspace{2pt} 244 | 245 | \begin{small} 246 | \begin{tabular}{lp{0.65\textwidth}} 247 | \emph{LJ} & \emph{Function available in the LuaJIT FFI module} \\ 248 | \emph{Err}& \emph{May raise a Lua error while reading debugging information} \\ 249 | \end{tabular} 250 | \end{small} 251 | 252 | \caption{API functions in the \texttt{eol} namespace} 253 | \label{tab:eol-api-functions-summary} 254 | \end{table} 255 | 256 | 257 | % eol.cdef(text) 258 | % UNAVAILABLE / UNNEEDED 259 | 260 | % eol.C 261 | % libc access (UNIMPLEMENTED) 262 | 263 | \subsubsection{Function \texttt{eol.load}} 264 | \label{sec:eol-api-load} 265 | 266 | \begin{luacode} 267 | library = eol.load(name, global) 268 | \end{luacode} 269 | 270 | Loads a library given its \texttt{name}, and returns it as a \textsf{Library} 271 | userdata value. The library \texttt{name} is specified without the file 272 | extension, because the module will add the appropriate extension for the 273 | operating system being used (e.g. \texttt{.so} for GNU/Linux). 
The library is 274 | then searched for in the following order of preference: 275 | 276 | \begin{enumerate} 277 | 278 | \item \texttt{name} is an absolute path, and points to an existing file 279 | 280 | \item \texttt{name} is a relative path which, using the working directory as 281 | starting point, can be resolved to an existing file 282 | 283 | \item \texttt{name} does not contain path separators, and a library with 284 | a matching name exists in one of the standard locations for shared libraries 285 | of the operating system being used (e.g. \texttt{/lib}, \texttt{/usr/lib}, 286 | and \texttt{/usr/local/lib} for most Unix-like systems, including GNU/Linux) 287 | 288 | \end{enumerate} 289 | 290 | If a suitable library could not be found using the method outlined above, or 291 | it could not be loaded, the function raises a Lua error. 292 | 293 | The \texttt{global} parameter is a boolean value which determines how the 294 | symbols from the loaded library interact with the ones from other libraries. 295 | When \Mlua|true|, the symbols defined by the library will be made available 296 | for symbol resolution of subsequently loaded libraries. The parameter is 297 | optional, and if not supplied the option is disabled, as if \Mlua|false| had been 298 | supplied as the second parameter. On a Unix-like system, \Mlua|true| and \Mlua|false| are equivalent to 299 | passing \texttt{RTLD\_GLOBAL} and \texttt{RTLD\_LOCAL}, respectively, to 300 | \texttt{dlopen()}~\cite{opengroup-dlopen} when loading a shared object 301 | file. 302 | 303 | \subsubsection{Function \texttt{eol.type}} 304 | \label{sec:eol-api-type} 305 | 306 | \begin{luacode} 307 | typeinfo = eol.type(library, name) 308 | \end{luacode} 309 | 310 | Obtains the information for a type of a given \texttt{name}, 311 | contained in a \texttt{library}. The result is returned as a \textsf{CType} 312 | userdata. If the type is not found in the library, \Mlua|nil| is returned 313 | instead.
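The two failure modes specified so far differ: \texttt{eol.load()} raises a Lua
error, whereas \texttt{eol.type()} returns \Mlua|nil|. The following sketch
shows one way calling code might handle both. It is only an illustration: the
helper \texttt{lookup\_type} and the stand-in module \texttt{fake\_eol} are
hypothetical names introduced here so that the example runs without \Eol*
installed, and they are not part of the module.

\begin{luacode}
-- Hypothetical helper (not part of the eol module): load a library and
-- look up a type, turning both failure modes into "nil, message".
local function lookup_type(eol, libname, typename)
  -- eol.load() raises a Lua error on failure, so call it via pcall().
  local ok, lib = pcall(eol.load, libname)
  if not ok then
    return nil, "cannot load " .. libname .. ": " .. tostring(lib)
  end
  -- eol.type() returns nil when the type is not found in the library.
  local ctype = eol.type(lib, typename)
  if ctype == nil then
    return nil, "type " .. typename .. " not found in " .. libname
  end
  return ctype
end

-- Stand-in for the real module, used only to exercise the helper.
local fake_eol = {
  load = function(name)
    if name ~= "libtest" then error("library not found", 0) end
    return { types = { int = { name = "int" } } }
  end,
  type = function(lib, name) return lib.types[name] end,
}

print(lookup_type(fake_eol, "libtest", "int").name)  -- int
print(lookup_type(fake_eol, "libtest", "nope"))      -- nil plus a message
print(lookup_type(fake_eol, "libmissing", "int"))    -- nil plus a message
\end{luacode}

With the real module, \texttt{fake\_eol} would be replaced by the value
returned from \Mlua|require("eol")|.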
314 | 315 | 316 | \subsubsection{Function \texttt{eol.typeof}} 317 | \label{sec:eol-api-typeof} 318 | 319 | \begin{luacode} 320 | typeinfo = eol.typeof(typevalue) 321 | \end{luacode} 322 | 323 | Obtains the type of \texttt{typevalue}, and returns it as a \textsf{CType} 324 | userdata. 325 | 326 | \begin{luacode} 327 | typeinfo = eol.typeof(name) 328 | \end{luacode} 329 | 330 | Alternatively, it is possible to pass a string with the \texttt{name} of 331 | a type, and it will be searched for in all the currently loaded libraries. 332 | This second invocation method is an \Eol* extension which is not available in 333 | the LuaJIT FFI module. 334 | 335 | \subsubsection{Function \texttt{eol.sizeof}} 336 | 337 | \begin{luacode} 338 | size = eol.sizeof(typevalue) 339 | \end{luacode} 340 | 341 | Obtains the size of \texttt{typevalue}, in bytes. If \texttt{typevalue} is 342 | a \textsf{CType} userdata, the size returned corresponds to the size of values 343 | of the type. If the size is not known (e.g. for \Mc|void|, or functions), 344 | \Mlua|nil| is returned instead. 345 | 346 | For any other values, an error is raised. 347 | 348 | \subsubsection{Function \texttt{eol.alignof}} 349 | \label{sec:eol-api-alignof} 350 | 351 | \begin{luacode} 352 | alignment = eol.alignof(typevalue) 353 | \end{luacode} 354 | 355 | Obtains the minimum required alignment for \texttt{typevalue}, in bytes. 356 | 357 | 358 | \subsubsection{Function \texttt{eol.offsetof}} 359 | \label{sec:eol-api-offsetof} 360 | 361 | \begin{luacode} 362 | offset = eol.offsetof(typevalue, field) 363 | \end{luacode} 364 | 365 | Obtains the offset in bytes of \texttt{field} inside \texttt{typevalue}, which 366 | must be a record data type (a \Mc|struct| in C). The \texttt{field} can be 367 | specified as a positive integer, or as a string. In the latter case, if there is 368 | no field with the given name, a Lua error is raised.
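The size, alignment, and offset queries are related: the offset of a member is
constrained by its alignment, and the size of a record is padded to the
alignment of the record itself. The sketch below works through that arithmetic
by hand for a hypothetical \Mc|struct| with a \Mc|char|, an \Mc|int|, and
a \Mc|double| member. It assumes the natural alignment rule common to many
C \gls{ABI}s and member sizes typical of 64-bit GNU/Linux; \Eol* is specified
to obtain these values from the debugging information rather than computing
them, and the function names used here are illustrative only.

\begin{luacode}
-- Round n up to the next multiple of a (integer division needs Lua 5.3).
local function align_up(n, a)
  return (n + a - 1) // a * a
end

-- members: list of { name = ..., size = ..., align = ... } in declaration order
local function layout(members)
  local offsets, offset, max_align = {}, 0, 1
  for _, m in ipairs(members) do
    offset = align_up(offset, m.align)  -- insert padding before the member
    offsets[m.name] = offset
    offset = offset + m.size
    if m.align > max_align then max_align = m.align end
  end
  -- The record is as aligned as its most aligned member, and its size is
  -- padded so that consecutive array elements stay aligned.
  return {
    offsets = offsets,
    align = max_align,
    size = align_up(offset, max_align),
  }
end

local s = layout {
  { name = "c", size = 1, align = 1 },
  { name = "i", size = 4, align = 4 },
  { name = "d", size = 8, align = 8 },
}
print(s.offsets.c, s.offsets.i, s.offsets.d)  -- offsets 0, 4 and 8
print(s.size, s.align)                        -- size 16, alignment 8
\end{luacode}

Under these assumptions, three bytes of padding follow the \Mc|char| member,
which is why the \Mc|int| member starts at offset 4 rather than 1.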
369 | 370 | \subsubsection{Function \texttt{eol.cast}} 371 | \label{sec:eol-api-cast} 372 | 373 | \begin{luacode} 374 | value = eol.cast(typeinfo, value) 375 | \end{luacode} 376 | 377 | Creates and returns a new \textsf{Variable} userdata which describes the same 378 | memory area as the passed \texttt{value}, but associates a new 379 | \texttt{typeinfo} to it. 380 | 381 | This function can be used to change the type associated with a \textsf{Variable} 382 | userdata, without changing the value itself. It is useful to manually override 383 | the pointer compatibility checks, or to convert between pointer values and 384 | addresses represented as integers. 385 | 386 | % ctype = eol.metatype(ct, metatable) 387 | % 388 | % cdata = eol.gc(cdata, finalizer) 389 | % 390 | % 391 | % 392 | % status = eol.istype(ct, obj) 393 | % 394 | % eol.copy(dst, src, len) 395 | % eol.copy(dst, str) 396 | % eol.fill(dst, len, [, c]) 397 | 398 | 399 | \subsubsection{Function \texttt{eol.abi}} 400 | \label{sec:eol-api-abi} 401 | 402 | \begin{luacode} 403 | flag = eol.abi(param) 404 | \end{luacode} 405 | 406 | Returns \Mlua|true| if \texttt{param} (a Lua string) applies for the target 407 | \gls{ABI}. Returns \Mlua|false| otherwise. The defined parameters are detailed 408 | in \autoref{tab:eol-abi-params}. This function is provided for compatibility 409 | with the LuaJIT FFI module. 410 | 411 | \begin{table}[htH] 412 | \centering 413 | \begin{tabular}{ll} 414 | \toprule 415 | Parameter & Description \\ 416 | \midrule 417 | \texttt{"32bit"} & The architecture uses 32-bit wide words. \\ 418 | \texttt{"64bit"} & The architecture uses 64-bit wide words. \\ 419 | \midrule 420 | \texttt{"le"} & Little-endian architecture. \\ 421 | \texttt{"be"} & Big-endian architecture. 
\\ 422 | \bottomrule 423 | \end{tabular} 424 | \caption{Defined parameters for \texttt{eol.abi()}} 425 | \label{tab:eol-abi-params} 426 | \end{table} 427 | 428 | 429 | % \subsubsection{Variable \texttt{eol.os}} 430 | % 431 | % \begin{luacode} 432 | % operatingsystem = eol.os 433 | % \end{luacode} 434 | % 435 | % 436 | % \subsubsection{Variable \texttt{eol.arch}} 437 | % 438 | % \begin{luacode} 439 | % architecture = eol.arch 440 | % \end{luacode} 441 | 442 | 443 | \subsection{Library userdata} 444 | \label{sec:eol-api-library-t} 445 | 446 | \begin{luacode} 447 | libc = eol.load("libc") 448 | stdout = libc.stdout -- Obtain a variable 449 | libc.fputs(stdout, "Hello, libc\n") -- Obtain a function 450 | \end{luacode} 451 | 452 | Userdata values of type \textsf{Library} represent a loaded library. The only 453 | way of obtaining them is using the \texttt{eol.load()} function 454 | (\autoref{sec:eol-api-load}). Indexing a library with a string key looks up 455 | the symbol of the same name, with one of the following results: 456 | 457 | \begin{itemize} 458 | 459 | \item If the symbol refers to executable code, a \textsf{Function} userdata 460 | (\autoref{sec:eol-api-function-t}) is returned. 461 | 462 | \item If the symbol refers to data, a \textsf{Variable} userdata 463 | (\autoref{sec:eol-api-variable-t}) is returned. 464 | 465 | \item Otherwise, \Mlua|nil| is returned. 466 | 467 | \end{itemize} 468 | 469 | Note that it is not possible to obtain a \textsf{CType} userdata directly from 470 | a library. The \texttt{eol.type()} function (\autoref{sec:eol-api-type}) must 471 | be used to that effect. 472 | 473 | 474 | \subsection{CType Userdata} 475 | \label{sec:eol-api-ctype-t} 476 | 477 | Userdata values of type \textsf{CType} represent information about types used 478 | by the native code of libraries. 
There are three ways in which values can be 479 | obtained: 480 | 481 | \begin{itemize} 482 | 483 | \item Using the \texttt{\_\_type} key to index a \textsf{Variable} 484 | userdata (\autoref{sec:eol-api-variable-t}). 485 | 486 | \item Using the \texttt{eol.typeof()} function 487 | (\autoref{sec:eol-api-typeof}). 488 | 489 | \item Using the \texttt{eol.type()} function (\autoref{sec:eol-api-type}). 490 | 491 | \end{itemize} 492 | 493 | 494 | \subsubsection{Value Construction} 495 | 496 | \begin{luacode} 497 | new_value = typeinfo(n) 498 | \end{luacode} 499 | 500 | A \textsf{CType} userdata is also a \gls{constructor} for values of the type 501 | it represents, by means of its \texttt{\_\_call} metamethod. When 502 | a \textsf{CType} is invoked as a constructor, it accepts an optional parameter: if supplied, 503 | an array of \texttt{n} elements is created; otherwise a single element is 504 | created. The memory used to store the value is initialized by filling it with 505 | zeroes (\texttt{0x00}). 506 | 507 | All the values created this way are subject to \gls{GC}, as specified in 508 | \autoref{sec:design-gc-interaction}. 509 | 510 | 511 | \subsubsection{Type information} 512 | 513 | Information about the represented data type can be obtained by indexing 514 | \textsf{CType} values (by means of an \texttt{\_\_index} metamethod) with the 515 | keys detailed in \autoref{tab:eol-api-userdata-keys}. 516 | 517 | \begin{table}[ht] 518 | \centering 519 | \begin{tabular}{lp{0.7\textwidth}} 520 | \toprule 521 | Key & Description \\ 522 | \midrule 523 | \texttt{name} & Name of the type, as a string. \\ 524 | \texttt{sizeof} & Size of values of the type, in bytes. Equivalent to 525 | calling \texttt{eol.sizeof()} passing the \textsf{CType} userdata as 526 | a parameter. \\ 527 | \texttt{readonly} & Boolean value; indicates whether the type is 528 | declared as readonly (e.g. using \Mc|const| in C). \\ 529 | \texttt{kind} & String which represents the kind of type, e.g.
530 | \texttt{"struct"}, \texttt{"union"}... \\ 531 | \texttt{type} & For types which are defined in terms of another 532 | \emph{base type}, the \textsf{CType} userdata for the base type. 533 | Otherwise \Mlua|nil|. \\ 534 | \bottomrule 535 | \end{tabular} 536 | \caption{Keys available in \textsf{CType} userdata} 537 | \label{tab:eol-api-userdata-keys} 538 | \end{table} 539 | 540 | For compound data types (in C, \Mc|struct|s and \Mc|union|s), two additional 541 | operations are supported on their \textsf{CType} userdata. The Lua length 542 | operator (\texttt\#, by means of a \texttt{\_\_len} metamethod) returns the 543 | number of members in the compound type, and indexing the userdata as an array 544 | (using numeric indexes) returns information about its \emph{n}-th member, as 545 | a Lua table which contains the fields specified in 546 | \autoref{tab:eol-api-ctype-compound-member-fields}. 547 | 548 | \begin{table}[ht] 549 | \centering 550 | \begin{tabular}{lccp{0.6\textwidth}} 551 | \toprule 552 | Key & Enum & Struct & Description\\ 553 | \midrule 554 | \texttt{name} & \Tick & \Tick & Name of the member, as a string. \\ 555 | \texttt{value} & \Tick & & Value, as an integer. \\ 556 | \texttt{type} & & \Tick & Type of the member, as a \textsf{CType} userdata. \\ 557 | \texttt{offset}& & \Tick & Offset of the member, in bytes. 558 | Equivalent to calling \texttt{eol.offsetof()} passing the 559 | \textsf{CType} userdata and the member name as parameters. \\ 560 | \bottomrule 561 | \end{tabular} 562 | \caption{Keys available in compound \textsf{CType} member information.} 563 | \label{tab:eol-api-ctype-compound-member-fields} 564 | \end{table} 565 | 566 | 567 | \subsubsection{Method \texttt{:pointerto()}} 568 | 569 | \begin{luacode} 570 | pointer_typeinfo = typeinfo:pointerto() 571 | \end{luacode} 572 | 573 | Uses \texttt{typeinfo} as base type to construct a new \textsf{CType} userdata 574 | value which represents a pointer to a value of the base type.
575 | 576 | 577 | \subsubsection{Method \texttt{:arrayof(n)}} 578 | 579 | \begin{luacode} 580 | array_typeinfo = typeinfo:arrayof(n) 581 | \end{luacode} 582 | 583 | Uses \texttt{typeinfo} as base type to construct a new \textsf{CType} userdata 584 | value which represents an array of \texttt{n} elements of the base type. 585 | 586 | 587 | \subsection{Function userdata} 588 | \label{sec:eol-api-function-t} 589 | 590 | Userdata values of type \textsf{Function} represent any piece of native code 591 | from a \textsf{Library} which can be invoked transparently from Lua. 592 | 593 | Information about a \textsf{Function} can be obtained by indexing the userdata 594 | (by means of an \texttt{\_\_index} metamethod) with the keys detailed in 595 | \autoref{tab:eol-api-function-keys}. 596 | 597 | \begin{table}[ht] 598 | \centering 599 | \begin{tabular}{lp{0.7\textwidth}} 600 | \toprule 601 | Key & Description \\ 602 | \midrule 603 | \texttt{\_\_name} & Name of the function, as a string. \\ 604 | \texttt{\_\_type} & Type of the return value, as a \textsf{CType} 605 | userdata, or \Mlua|nil| if the function does not return a value. \\ 606 | \texttt{\_\_library} & Library which contains the function, as a 607 | \textsf{Library} userdata. \\ 608 | \bottomrule 609 | \end{tabular} 610 | \caption{Keys available in \textsf{Function} userdata} 611 | \label{tab:eol-api-function-keys} 612 | \end{table} 613 | 614 | 615 | \subsection{Variable userdata} 616 | \label{sec:eol-api-variable-t} 617 | 618 | Userdata values of type \textsf{Variable} represent native data values. Each 619 | value has a pointer to the region of memory occupied by the actual data, and 620 | an associated \textsf{CType} which determines how the pointer to the data is 621 | used. 
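The role of the associated \textsf{CType} can be illustrated without \Eol* at
all, using the \texttt{string.pack()} and \texttt{string.unpack()} functions of
the Lua 5.3 standard library: the same four bytes yield different Lua values
depending on the type used to read them. This is only an analogy for how
a \textsf{Variable} interprets the memory it points to, not a description of
the implementation.

\begin{luacode}
-- Four bytes holding the 32-bit two's complement encoding of -1.
local bytes = string.pack("<i4", -1)

-- Read the same bytes as a signed, then as an unsigned, 32-bit integer.
local as_signed   = string.unpack("<i4", bytes)
local as_unsigned = string.unpack("<I4", bytes)

print(#bytes)        -- 4
print(as_signed)     -- -1
print(as_unsigned)   -- 4294967295
\end{luacode}

Associating a different type with unchanged memory is the operation that
\texttt{eol.cast()} (\autoref{sec:eol-api-cast}) is specified to provide.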
622 | 623 | The actual value represented by a \textsf{Variable} userdata, and information 624 | about it, can be obtained by indexing the userdata (by means of an 625 | \texttt{\_\_index} metamethod) with the keys detailed in 626 | \autoref{tab:eol-api-variable-keys}. 627 | 628 | For \textsf{Variable}s with an associated array \textsf{CType}, it is possible 629 | to manipulate the variable directly as if it were a Lua array, using the 630 | \Mlua|variable[index]| syntax, and the length operator (\texttt\#) returns 631 | the number of elements in the array. 632 | 633 | % \texttt{\_\_index} and \texttt{\_\_newindex} metamethods allow 634 | % 635 | % contents using normal array 636 | % to use the Lua length operator (\texttt\#, by means of a \texttt{\_\_len} 637 | % metamethod) to obtain the number of elements in the array, and manipulating 638 | % the values of individual elements using numeric indexes (both reading and 639 | % writing values of the elements are possible, by means of the 640 | % \texttt{\_\_index} and \texttt{\_\_newindex} metamethods, respectively). 641 | 642 | \begin{table}[ht] 643 | \centering 644 | \begin{tabular}{lcp{0.65\textwidth}} 645 | \toprule 646 | Key & Writable & Description \\ 647 | \midrule 648 | \texttt{\_\_value} & \Tick & Value of the variable. \\ 649 | \texttt{\_\_name} & & Name of the variable, as a string. \\ 650 | \texttt{\_\_type} & & Type of the variable, as a \textsf{CType} userdata. \\ 651 | \texttt{\_\_library} & & Library which contains the variable, as 652 | a \textsf{Library} userdata. It may be \Mlua|nil| for variables created from Lua.
\\ 653 | \bottomrule 654 | \end{tabular} 655 | \caption{Keys available in \textsf{Variable} userdata} 656 | \label{tab:eol-api-variable-keys} 657 | \end{table} 658 | 659 | 660 | 661 | \subsubsection{Type information} 662 | 663 | \textsf{Function} userdata values provide information about their return type 664 | when indexing them with the \texttt{\_\_type} key, as described in 665 | \autoref{sec:eol-api-function-t}. Type information for function parameters is also provided: applying 666 | the Lua length operator (\texttt\#, by means of a \texttt{\_\_len} metamethod) 667 | returns the number of parameters accepted, and indexing the userdata as an 668 | array (using numeric indexes) returns the type information for the parameters 669 | as \textsf{CType} userdata. 670 | 671 | 672 | \subsubsection{Invocation} 673 | 674 | The \texttt{\_\_call} metamethod is implemented for \textsf{Function} 675 | userdata values, effectively making them directly callable from Lua. Invoking 676 | native code involves: 677 | 678 | \begin{enumerate} 679 | 680 | \item Checking that the number of parameters passed to the function from Lua 681 | matches the number accepted by the native function. 682 | 683 | \item Allocating as much space as needed to pass the parameters to 684 | the native function, plus the space needed for the return value (if any). 685 | The amount of space needed must be calculated using the sizes of the native 686 | types, as used by the native function. 687 | 688 | \item For each value passed as a parameter in Lua: 689 | 690 | \begin{enumerate} 691 | 692 | \item Checking that the type of the Lua value is compatible and can be 693 | converted to a value of the type expected by the native function. 694 | 695 | \item Converting the Lua value to the corresponding native type, and 696 | storing the result in the allocated space.
697 | 698 | \end{enumerate} 699 | 700 | \item Re-arranging the data as needed, if the in-memory layout of the 701 | allocated data does not match the layout defined by the \gls{ABI} of the 702 | target architecture and operating system. 703 | 704 | \item Invoking the native function by jumping to its start address. 705 | 706 | \item Converting the return value, if any, to a Lua value, and 707 | pushing it onto the Lua stack. 708 | 709 | \end{enumerate} 710 | 711 | 712 | \section{Testing} 713 | \label{sec:design-testing} 714 | 715 | We decided to use a test harness for \Eol*. The test harness should be usable 716 | not only for unit testing, but also for regression testing, so it should 717 | exercise the implementation using the \Eol* module API 718 | (\autoref{sec:design-lua-api}) and not depend on knowledge about the internals 719 | of the implementation. 720 | 721 | One challenge for the test harness is that programming errors in the system 722 | may cause the entire process to hang, or crash: the \Eol* Lua module is 723 | implemented in C, and therefore all the caveats of running and testing native 724 | code apply. 725 | 726 | A number of third-party unit testing frameworks exist for Lua, but evaluating 727 | them showed that none of them satisfies our requirements: 728 | 729 | \begin{itemize} 730 | 731 | \item Compatibility with Lua 5.3, which is the Lua version used as target.
732 | Many of the testing frameworks support only older versions, and 733 | \texttt{lunit}\footnote{\url{http://www.mroth.net/lunit/}}, 734 | Lunity\footnote{\url{https://github.com/Phrogz/Lunity}}, 735 | Lunatest\footnote{\url{https://github.com/silentbicycle/lunatest}}, 736 | LuaUnit\footnote{\url{https://github.com/bluebird75/luaunit}}, 737 | Shake\footnote{\url{http://shake.luaforge.net/}}, 738 | BTDLua\footnote{\url{http://users.skynet.be/adrias/Lua/BTDLua/}}, 739 | Luaspec\footnote{\url{https://github.com/mirven/luaspec/}}, 740 | Telescope\footnote{\url{https://github.com/norman/telescope/}}, and 741 | Gambiarra\footnote{\url{https://bitbucket.org/zserge/gambiarra}} were 742 | discarded because of that. 743 | 744 | \item Ability to gracefully handle crashes of the process running the Lua 745 | VM. Many testing frameworks for Lua focus on testing Lua code, and do not 746 | handle crashes in native code gracefully. Because of this, 747 | \texttt{lunitx}\footnote{\url{https://github.com/dcurrie/lunit}}, 748 | Testy\footnote{\url{https://github.com/siffiejoe/lua-testy}}, 749 | Busted\footnote{\url{http://olivinelabs.com/busted/}}, and 750 | TestMore\footnote{\url{http://fperrad.github.io/lua-TestMore/}} 751 | were discarded. 752 | 753 | \end{itemize} 754 | 755 | Because of the impossibility of reusing an existing testing framework, we 756 | needed to implement our own test harness, which revolves around the 757 | requirement of gracefully handling crashes in native code. 758 | 759 | \minisec{Unit Test Isolation} 760 | 761 | The best way of ensuring that the test harness can continue running in the 762 | event of a crash is to run each unit test in a new process, with a fresh Lua 763 | VM. This motivates storing each unit test in its own Lua script.
This way it is possible for the test harness to run as a separate process,
which in turn executes a new process for each unit test, which can crash
safely without affecting the test harness or the rest of the tests.

\minisec{Test Assertions}

On top of the standard \Mlua|assert()| function provided by Lua, the test
harness additionally provides the assertions listed in
\autoref{tab:design-test-asserts}, to be used in unit tests.

\begin{table}[htH]
  \centering
  \begin{tabular}{lp{0.6\textwidth}}
    \toprule
    Function & Description \\
    \midrule
    \verb|assert.False(value)| &
    Checks that \texttt{value} is \verb|false| \\
    \verb|assert.True(value)| &
    Checks that \texttt{value} is \verb|true| \\
    \verb|assert.Falsey(value)| &
    Checks that \texttt{value} is \verb|false| or \Mlua|nil| \\
    \verb|assert.Truthy(value)| &
    Checks that \texttt{value} evaluates to a non-falsey value (any value
    except \verb|false| or \Mlua|nil|) \\
    \verb|assert.Error(f)| &
    Checks whether invoking function \texttt{f} raises a Lua error \\
    \verb|assert.Callable(value)| &
    Checks whether \texttt{value} is a function or has a \texttt{\_\_call}
    metamethod which allows treating it as a function \\
    \verb|assert.Field(obj, name)| &
    Checks whether an \texttt{obj}ect is indexable and contains a field with
    the given \texttt{name} \\
    \verb|assert.Userdata(value, T)| &
    Checks whether a \texttt{value} is a userdata of type \texttt{T} \\
    \verb|assert.Equal(a, b)| &
    Checks whether two values \texttt{a} and \texttt{b} are equal \\
    \verb|assert.Match(re, str)| &
    Checks whether a \texttt{str}ing matches a particular \texttt{re}gular
    expression.
    \\
    \bottomrule
  \end{tabular}
  \caption{Additional assertions provided by the test harness}
  \label{tab:design-test-asserts}
\end{table}

The checks performed by the assertions can be reversed by indexing the
\verb|assert| object with the \verb|Not| key, and using the result to invoke
the negated assertion. As complex as it may sound, this is easily exemplified:

\begin{luacode}
assert.Equal(2, 1+1)      -- Normal assertion
assert.Not.Equal(nil, 2)  -- Negated assertion
\end{luacode}

\minisec{Test Names}

In order to allow specifying which test (or tests) from the corpus of unit tests
are to be run, we need a way to refer to them by name. Given that each unit
test is contained in a file, the file name without the \verb|.lua| suffix is
used as the name of the unit test.

\minisec{Test Harness Architecture}

The components of the test harness are shown in \autoref{fig:design-harness}.

\begin{figure}[th]
  \centering
  \begin{tikzpicture}[node distance=2cm]
    \node[component] (test) {Test};
    \node[component] (runner) [right=1cm of test] {Runner};
    \node[component] (baseout) [right=1cm of runner] {Output};
    \node[component, below] (tapout) [below of=baseout, xshift=-19mm] {TAP Output};
    \node[component, below] (conout) [right=1cm of tapout] {Console Output};

    \node[datain] (t1) [below of=test] {\texttt{test1.lua}};
    \node[datain] (t2) [below of=t1, node distance=1.1\baselineskip]
      {\texttt{test2.lua}};
    \node[datain] (tt) [below of=t2, node distance=1.1\baselineskip] {...};
    \node (ttlabel) [below of=tt, node distance=1.5\baselineskip]
      {Unit Test Scripts};

    \path[uses] (runner.east) -- (baseout.west);

    \path[uses, solid] (tapout.east) -| (baseout.south);
    \path[uses, solid] (conout.west) -| (baseout.south);

    \path[contains] (runner.west) -- (test.east);

    \begin{pgfonlayer}{background}
      \node[datablob] (tests) [fit=(t1) (t2) (tt) (ttlabel), drop shadow] {};
    \end{pgfonlayer}

    \path[datain, dashed, thick] (tests.north) -- (test.south);
  \end{tikzpicture}
  \caption{Architecture of the test harness}
  \label{fig:design-harness}
\end{figure}

Each unit test, which is ultimately a Lua script in the file system, is
modelled by a \textsf{Test}, which is responsible for executing its
corresponding Lua script in a new process and determining whether the
execution of the unit test succeeded, failed, or crashed.

The \textsf{Runner} is the main component of the harness. It manages
a collection of \textsf{Test} objects, and is responsible for triggering their
execution, keeping statistics about the test process (total number of tests,
number of failed tests, and so on), and reporting status and results to an
\textsf{Output}.
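
The process isolation performed by a \textsf{Test} can be sketched in C as
follows. This is only a minimal illustration under stated assumptions, not the
actual harness: the names \verb|run_isolated()| and \verb|TestStatus| are
hypothetical, a POSIX system is assumed, and a C function pointer stands in for
launching the Lua interpreter on a unit test script. The point of the sketch is
that a crash in the child process reaches the parent as a terminating signal,
which can be told apart from an ordinary failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical outcome codes, mirroring "succeeded, failed, or crashed". */
typedef enum { TEST_PASSED, TEST_FAILED, TEST_CRASHED } TestStatus;

/*
 * Runs "body" in a forked child process and classifies the outcome.
 * A real harness would exec() the Lua interpreter on a test script here;
 * a function pointer stands in for it to keep the sketch self-contained.
 */
static TestStatus
run_isolated (int (*body) (void))
{
    assert (body != NULL);

    pid_t pid = fork ();
    if (pid < 0)
        return TEST_CRASHED;        /* The test could not even be started. */
    if (pid == 0)
        _exit (body ());            /* Child: the exit code is the result. */

    int status;
    while (waitpid (pid, &status, 0) < 0) {
        if (errno != EINTR)         /* Retry only when interrupted. */
            return TEST_CRASHED;
    }

    if (WIFSIGNALED (status))
        return TEST_CRASHED;        /* Killed by SIGSEGV, SIGABRT, etc. */
    return (WIFEXITED (status) && WEXITSTATUS (status) == 0)
        ? TEST_PASSED : TEST_FAILED;
}

/* Hypothetical test bodies used to exercise the three outcomes. */
static int passing_test (void)  { return 0; }
static int failing_test (void)  { return 1; }
static int crashing_test (void) { abort (); }
```

With these definitions, \verb|run_isolated (passing_test)| reports a pass,
\verb|run_isolated (failing_test)| a failure, and
\verb|run_isolated (crashing_test)| a crash, while the calling process keeps
running in all three cases.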

An \textsf{Output} is responsible for reporting the status and results of the
execution of the unit tests to the user. Its interface is abstract, and two
concrete implementations are to be initially provided: \textsf{Console
Output}, to produce textual output suitable for display in a Unix color
terminal (DEC VT420 or compatible, e.g. XTerm), and \textsf{TAP Output}, to
write the output in the \gls{TAP} format~\cite{tap-spec}. The latter, being
a de facto standard, allows integration with third party tools.

\beforeintro

--------------------------------------------------------------------------------
/implementation.tex:
--------------------------------------------------------------------------------
% vim: ft=tex spell spelllang=en ts=2 sw=2 et

\setchaptertoc
\chapter{Implementation}
\clearpage
% \enlargethispage{2\baselineskip}

This chapter provides both guidance for browsing the \Eol* source
code~\cite{eol-github}, and insight into the noteworthy details of the
implementation of the system.

% \enlargethispage{2\baselineskip}
\afterintro


\section{Project Source Structure}

The \Eol* source code is organized in the following directory structure, which
follows usual conventions for C projects:

\begin{figure}[h]
  \centering
  \noindent\begin{minipage}{0.75\textwidth}
  \dirtree{%
    .1 \DtFolder{lua-eol/}.
    .2 \DtFolder{doc/} \DTcomment{Documentation and API reference}.
    .2 \DtFolder{examples/}.
    .3 *.lua \DTcomment{Module usage examples}.
    .2 \DtFolder{tools/} \DTcomment{Build \& test utilities}.
    .3 \DtFolder{ninja/} \DTcomment{Ninja build support files}.
    .3 \DtFolder{make/} \DTcomment{GNU Make build support files}.
    .2 \DtFolder{test/}.
    .3 *.lua \DTcomment{Unit tests}.
    .2 uthash.h \DTcomment{Copy of UT-hash}.
    .2 eol-*.c \DTcomment{Module sources}.
    .2 eol-*.h \DTcomment{Module sources}.
  }
  \end{minipage}
  \caption{Source tree structure.}
\end{figure}

Module source files (\verb|eol-*.h|, \verb|eol-*.c|) are named after the
components identified during the design phase. In particular:

\begin{itemize}

  \item \verb|eol-module.c| \hfill\\
    Main part of the code, including the interfacing with Lua.

  \item \verb|eol-fcall.h|,
    \verb|eol-fcall-.h|,
    \verb|eol-fcall-.c|... \hfill\\
    Different implementations of the native function invocation mechanism.

  \item \verb|eol-typing.h|, \verb|eol-typing.c| \hfill\\
    Type representation module.

  \item \verb|eol-typecache.h|, \verb|eol-typecache.c|, \verb|uthash.h| \hfill\\
    Type representation cache module.

  \item \verb|eol-libdwarf.h|, \verb|eol-libdwarf.c| \hfill\\
    Utility functions to simplify working with \verb|libdwarf|.

  \item \verb|eol-lua.h| \hfill\\
    Utility functions to simplify working with the Lua C API.

  \item \verb|eol-trace.h|, \verb|eol-trace.c| \hfill\\
    Tracing support module.

  \item \verb|eol-util.h|, \verb|eol-util.c| \hfill\\
    Miscellaneous utility code, including support code for the runtime checks.

\end{itemize}


\section{Type Representation}

Converting values from C to Lua, and vice versa, is one of the most important
tasks performed by \Eol*: C values need to be made accessible from Lua.
Therefore, type information needs to be read from the DWARF debugging
information (see \nameref{sec:debuginfo-structure}), and kept around in
a suitable data structure. This structure must be:

\begin{itemize}
  \item Exhaustive, to hold all the needed information.
  \item Compact, to minimize memory usage.
\end{itemize}

Describing base types is possible using just an enumerated type: there is
a fixed number of them, and their characteristics (size, name, etc.) are well
known. The challenging part is representing user defined types (\Mc|struct|,
\Mc|enum|, \Mc|union|), and derived types (pointers, arrays).

The data structure for describing types is \verb|EolTypeInfo|
(\autoref{lst:EolTypeInfo}). It is a tagged \Mc|struct|, with the tag
indicating the type kind (\verb|EOL_TYPE_S32| for 32-bit signed integers,
\verb|EOL_TYPE_STRUCT| for a \Mc|struct|, etc.; the complete list of values
can be seen in \autoref{lst:EolType}). The contained data varies
depending on the value of the \emph{kind} tag. The members for all possible
values are grouped in a \Mc|union| in order to make them share the
same memory space.

\begin{listing}[H]
\begin{ccode}
struct _EolTypeInfo {
  EolType type;
  union {
    struct TI_base     ti_base;
    struct TI_pointer  ti_pointer;
    struct TI_typedef  ti_typedef;
    struct TI_const    ti_const;
    struct TI_array    ti_array;
    struct TI_compound ti_compound;
  };
};
typedef struct _EolTypeInfo EolTypeInfo;
\end{ccode}
\caption{\texttt{EolTypeInfo}.}
\label{lst:EolTypeInfo}
\end{listing}

\begin{listing}[tH]
\centering
\begin{ccode}
typedef enum {
  EOL_TYPE_VOID,     /* void        */
  EOL_TYPE_BOOL,     /* _Bool       */
  EOL_TYPE_S8,       /* int8_t      */
  EOL_TYPE_U8,       /* uint8_t     */
  EOL_TYPE_S16,      /* int16_t     */
  EOL_TYPE_U16,      /* uint16_t    */
  EOL_TYPE_S32,      /* int32_t     */
  EOL_TYPE_U32,      /* uint32_t    */
  EOL_TYPE_S64,      /* int64_t     */
  EOL_TYPE_U64,      /* uint64_t    */
  EOL_TYPE_FLOAT,    /* float       */
  EOL_TYPE_DOUBLE,   /* double      */
  EOL_TYPE_TYPEDEF,  /* typedef … T */
  EOL_TYPE_CONST,    /* const T     */
  EOL_TYPE_POINTER,  /* T*          */
  EOL_TYPE_ARRAY,    /* T …[n]      */
  EOL_TYPE_STRUCT,   /* struct …    */
  EOL_TYPE_UNION,    /* union …     */
  EOL_TYPE_ENUM,     /* enum …      */
} EolType;
\end{ccode}
\caption{\texttt{EolType} enumeration.}
\label{lst:EolType}
\end{listing}


\begin{table}[f]
  \centering
  \begin{tabular}{lll}
    \toprule
    C Construct     & DWARF DIE                      & \Eol* Type \\
    \midrule
    \Mc|void|       & ø                              & \Mc|EOL_TYPE_VOID| \\
    \Mc|bool|       & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_BOOL| \\
    \Mc|int8_t|     & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_S8| \\
    \Mc|uint8_t|    & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_U8| \\
    \Mc|int16_t|    & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_S16| \\
    \Mc|uint16_t|   & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_U16| \\
    \Mc|int32_t|    & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_S32| \\
    \Mc|uint32_t|   & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_U32| \\
    \Mc|int64_t|    & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_S64| \\
    \Mc|uint64_t|   & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_U64| \\
    \Mc|float|      & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_FLOAT| \\
    \Mc|double|     & \verb|DW_TAG_base_type|        & \Mc|EOL_TYPE_DOUBLE| \\
    \Mc|typedef|... & \verb|DW_TAG_typedef|          & \Mc|EOL_TYPE_TYPEDEF| \\
    \Mc|const|...   & \verb|DW_TAG_const_type|       & \Mc|EOL_TYPE_CONST| \\
    ...\Mc|*|       & \verb|DW_TAG_pointer_type|     & \Mc|EOL_TYPE_POINTER| \\
    ...\Mc|[n]|     & \verb|DW_TAG_array_type|       & \Mc|EOL_TYPE_ARRAY| \\
    \Mc|struct|...  & \verb|DW_TAG_structure_type|   & \Mc|EOL_TYPE_STRUCT| \\
    \Mc|union|...   & \verb|DW_TAG_union_type|       & \Mc|EOL_TYPE_UNION| \\
    \Mc|enum|...    & \verb|DW_TAG_enumeration_type| & \Mc|EOL_TYPE_ENUM| \\
    \bottomrule
  \end{tabular}
  \caption{Mapping of C types, DWARF DIEs and \Mc|EolType|.}
\end{table}

\noindent
The following sections describe in detail the members of \verb|EolTypeInfo|.
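
Before going into the members, the mapping in the table above could be
expressed as a function from DIE tags to type kinds. The following is a hedged
sketch rather than the code used by \Eol*: \verb|kind_for_die()| and
\verb|integer_kind()| are hypothetical names, the \verb|DW_TAG_*| and
\verb|DW_ATE_*| constants are redefined locally with the values assigned by
the DWARF standard (normally they come from \verb|dwarf.h|), and it is assumed
that base types are told apart by their \verb|DW_AT_encoding| and
\verb|DW_AT_byte_size| attributes, with \Mc|float| and \Mc|double| being
4 and 8 bytes wide.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    EOL_TYPE_VOID, EOL_TYPE_BOOL,
    EOL_TYPE_S8, EOL_TYPE_U8, EOL_TYPE_S16, EOL_TYPE_U16,
    EOL_TYPE_S32, EOL_TYPE_U32, EOL_TYPE_S64, EOL_TYPE_U64,
    EOL_TYPE_FLOAT, EOL_TYPE_DOUBLE,
    EOL_TYPE_TYPEDEF, EOL_TYPE_CONST, EOL_TYPE_POINTER, EOL_TYPE_ARRAY,
    EOL_TYPE_STRUCT, EOL_TYPE_UNION, EOL_TYPE_ENUM,
} EolType;

/* DIE tags and base type encodings, with the DWARF standard values. */
enum {
    DW_TAG_array_type     = 0x01, DW_TAG_enumeration_type = 0x04,
    DW_TAG_pointer_type   = 0x0f, DW_TAG_structure_type   = 0x13,
    DW_TAG_typedef        = 0x16, DW_TAG_union_type       = 0x17,
    DW_TAG_base_type      = 0x24, DW_TAG_const_type       = 0x26,
};
enum {
    DW_ATE_boolean     = 0x02, DW_ATE_float    = 0x04,
    DW_ATE_signed      = 0x05, DW_ATE_signed_char   = 0x06,
    DW_ATE_unsigned    = 0x07, DW_ATE_unsigned_char = 0x08,
};

/* Picks the integer kind for a given signedness and size in bytes. */
static int
integer_kind (int is_signed, unsigned byte_size, EolType *kind)
{
    switch (byte_size) {
    case 1: *kind = is_signed ? EOL_TYPE_S8  : EOL_TYPE_U8;  return 1;
    case 2: *kind = is_signed ? EOL_TYPE_S16 : EOL_TYPE_U16; return 1;
    case 4: *kind = is_signed ? EOL_TYPE_S32 : EOL_TYPE_U32; return 1;
    case 8: *kind = is_signed ? EOL_TYPE_S64 : EOL_TYPE_U64; return 1;
    default: return 0;
    }
}

/*
 * Stores the kind for a DIE in *kind and returns 1, or returns 0 for an
 * unsupported DIE. "encoding" and "byte_size" only matter for base types.
 * EOL_TYPE_VOID never results: void has no DIE of its own.
 */
static int
kind_for_die (unsigned tag, unsigned encoding, unsigned byte_size,
              EolType *kind)
{
    assert (kind != NULL);
    switch (tag) {
    case DW_TAG_typedef:          *kind = EOL_TYPE_TYPEDEF; return 1;
    case DW_TAG_const_type:       *kind = EOL_TYPE_CONST;   return 1;
    case DW_TAG_pointer_type:     *kind = EOL_TYPE_POINTER; return 1;
    case DW_TAG_array_type:       *kind = EOL_TYPE_ARRAY;   return 1;
    case DW_TAG_structure_type:   *kind = EOL_TYPE_STRUCT;  return 1;
    case DW_TAG_union_type:       *kind = EOL_TYPE_UNION;   return 1;
    case DW_TAG_enumeration_type: *kind = EOL_TYPE_ENUM;    return 1;
    case DW_TAG_base_type:        break;   /* Decided by the encoding. */
    default:                      return 0;
    }
    switch (encoding) {
    case DW_ATE_boolean:
        *kind = EOL_TYPE_BOOL;
        return 1;
    case DW_ATE_float:
        if (byte_size == 4) { *kind = EOL_TYPE_FLOAT;  return 1; }
        if (byte_size == 8) { *kind = EOL_TYPE_DOUBLE; return 1; }
        return 0;
    case DW_ATE_signed:
    case DW_ATE_signed_char:
        return integer_kind (1, byte_size, kind);
    case DW_ATE_unsigned:
    case DW_ATE_unsigned_char:
        return integer_kind (0, byte_size, kind);
    default:
        return 0;
    }
}
```

Only the base types need the extra attributes; every other row of the table is
decided by the tag alone.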


\subsection{Base Type Representation}

\begin{ccode*}{samepage=true}
struct TI_base {
  char     *name;
  uint32_t  size;
};
\end{ccode*}

\noindent
Even though it is sufficient to provide type kind codes for all the base types
as discussed before, providing the possibility of querying their \verb|name|
and \verb|size| is a convenient feature, at a very small cost: the
\verb|EolTypeInfo| value for each one of the base types is a singleton,
defined as follows:

\begin{ccode*}{samepage=true}
/* File: eol-typing.h */
extern const EolTypeInfo* eol_typeinfo_u32;

/* File: eol-typing.c */
const EolTypeInfo* eol_typeinfo_u32 = &((EolTypeInfo) {
  .kind         = EOL_TYPE_U32,
  .ti_base.name = "uint32_t",
  .ti_base.size = sizeof (uint32_t),
});
\end{ccode*}

\noindent In practice, to avoid writing the definitions of all the base types,
the C preprocessor and a couple of generator macros are used (see
\autoref{sec:cpp-abuse-genmacros}).


\subsection{Pointer Representation}
\label{sec:pointer-typeinfo}

\begin{ccode*}{samepage=true}
struct TI_pointer {
  const EolTypeInfo *typeinfo;
};
\end{ccode*}

\noindent
Pointers are represented by referencing the \verb|EolTypeInfo| of the
pointed-to type. Thus, it is the only member in \Mc|struct TI_pointer|. The
size of a pointer value is platform dependent, but well known and constant for
each platform, and is the value of the C expression \Mc|sizeof(void*)|.
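
The two representations described so far can be tried out together in a small
sketch. The assumptions are as follows: \verb|BASE_TYPES| and
\verb|DEFINE_BASE_TYPEINFO| are hypothetical generator macros written for this
example (the macros actually used by \Eol* may look different), the
\verb|EolTypeInfo| shown here is cut down to the base and pointer members, and
\verb|pointee()| and \verb|u32_pointer| are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    EOL_TYPE_S32, EOL_TYPE_U32, EOL_TYPE_DOUBLE, EOL_TYPE_POINTER
} EolType;

/* Cut-down version of EolTypeInfo: base and pointer members only. */
typedef struct EolTypeInfo EolTypeInfo;
struct EolTypeInfo {
    EolType kind;
    union {
        struct { const char *name; uint32_t size; } ti_base;
        struct { const EolTypeInfo *typeinfo; }     ti_pointer;
    };
};

/* One line per base type: F (suffix, kind, C type). */
#define BASE_TYPES(F)                      \
    F (s32,    EOL_TYPE_S32,    int32_t)   \
    F (u32,    EOL_TYPE_U32,    uint32_t)  \
    F (double, EOL_TYPE_DOUBLE, double)

/* Expands to one singleton value plus the public pointer to it. */
#define DEFINE_BASE_TYPEINFO(suffix, kind_, ctype)              \
    static const EolTypeInfo typeinfo_##suffix##_value = {      \
        .kind    = kind_,                                       \
        .ti_base = { .name = #ctype, .size = sizeof (ctype) },  \
    };                                                          \
    const EolTypeInfo *const eol_typeinfo_##suffix =            \
        &typeinfo_##suffix##_value;

BASE_TYPES (DEFINE_BASE_TYPEINFO)

/* Describes "uint32_t *": only the pointed-to type is recorded. */
const EolTypeInfo u32_pointer = {
    .kind       = EOL_TYPE_POINTER,
    .ti_pointer = { .typeinfo = &typeinfo_u32_value },
};

/* Returns the description of the type a pointer type refers to. */
static const EolTypeInfo*
pointee (const EolTypeInfo *ti)
{
    assert (ti->kind == EOL_TYPE_POINTER);
    return ti->ti_pointer.typeinfo;
}
```

Adding a base type then takes a single new line in \verb|BASE_TYPES|, and,
because each base type is a singleton, \verb|pointee (&u32_pointer)| can be
compared against \verb|eol_typeinfo_u32| by address.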


\subsection{Array Representation}

\begin{ccode*}{samepage=true}
struct TI_array {
  const EolTypeInfo *typeinfo;
  uint64_t           n_items;
};
\end{ccode*}

\noindent
Arrays are represented by referencing the \verb|EolTypeInfo| of the array
items, plus the number of items (\verb|n_items|) present in the array. The
size of an array value can be calculated by multiplying the size of the item
type by the number of items in the array.


\subsection{User Defined Type Representation}

\begin{ccode*}{samepage=true}
struct TI_compound {
  char              *name;
  uint32_t           size;
  uint32_t           n_members;
  EolTypeInfoMember  members[];
};
\end{ccode*}

\noindent This record type represents all user defined types: enumerated types
(\Mc:enum:), record types (\Mc:struct:), and union types (\Mc:union:):

\begin{description}
  \item [\Mc|name|] \hfill \\
    User defined types are usually given a name, but the name is optional;
    for anonymous types the value is \Mc|NULL|.
  \item [\Mc|size|] \hfill \\
    Contains the size of the type, in bytes.
  \item [\Mc|n_members| / \Mc|members|] \hfill \\
    Count of members (or enumerators, for \verb|EOL_TYPE_ENUM|) in the type,
    and an array containing their descriptions. Using
    a \gls{flexible-array-member} allows a single chunk of memory to hold
    both the \verb|EolTypeInfo| itself and the items in the array.
\end{description}

\noindent
The auxiliary \verb|EolTypeInfoMember| type is defined as follows:

\begin{ccode*}{samepage=true}
typedef struct {
  const char *name;
  union {
    int64_t value;                  /* enum          */
    struct {                        /* union, struct */
      uint32_t           offset;
      const EolTypeInfo *typeinfo;
    };
  };
} EolTypeInfoMember;
\end{ccode*}

\noindent
This always contains the (optional) \Mc|name| of the member, and usage of the
remaining fields varies with the type being described:

\begin{itemize}
  \item For \Mc|EOL_TYPE_STRUCT|, the \Mc|offset| of the member (in bytes,
    from the beginning of the record) and a pointer to its type information
    (\Mc|typeinfo|) are used.
  \item For \Mc|EOL_TYPE_UNION|, the \Mc|offset| is ignored, and only the
    type information of the member (\Mc|typeinfo|) is used.
  \item For \Mc|EOL_TYPE_ENUM|, only the \Mc|value| associated with the
    enumerator is used.
\end{itemize}

\noindent
A \Mc|union| is used to make fields share the same memory space.


\subsection{Type Alias Representation}

\begin{ccode*}{samepage=true}
struct TI_typedef {
  char              *name;
  const EolTypeInfo *typeinfo;
};
\end{ccode*}

Type aliases assign a name to an arbitrary type. They are represented by the
\Mc|name| and a pointer to the \Mc|EolTypeInfo| of the aliased type.


\subsection{Read-only Type Representation}

\begin{ccode*}{samepage=true}
struct TI_const {
  const EolTypeInfo *typeinfo;
};
\end{ccode*}

\noindent
Flagging a type as read-only (i.e. using the \Mc|const| type qualifier in C)
is represented in the same way as pointers (\autoref{sec:pointer-typeinfo}):
by keeping a pointer to the \Mc|EolTypeInfo| that is read-only.




\section{Type Cache}
\label{sec:impl-type-cache}

The type cache is implemented as an opaque data structure which can only be
used by means of its public API (\autoref{lst:eol-typecache-api}). Internally
it is implemented as a hash table which reuses uthash~\cite{uthash-guide}, and
it maps integer keys (\Mc|uint32_t|) to \Mc|EolTypeInfo| structures. Cache
keys can be any integer which uniquely identifies a particular type.
For an \Mc|EolTypeInfo| created from its DWARF representation, the offset of
the top-level \gls{DIE} is used as the key. This works because information for
a particular type is never duplicated inside the same ELF file, so there is
a unique offset in the file for it.

\begin{listing}[tH]
\centering
\begin{ccode}
typedef struct _EolTypeCacheEntry* EolTypeCache;

typedef bool (*EolTypeCacheIter) (EolTypeCache*,
                                  const EolTypeInfo*,
                                  void *userdata);

void eol_type_cache_init (EolTypeCache *cache);
void eol_type_cache_free (EolTypeCache *cache);

void eol_type_cache_add (EolTypeCache      *cache,
                         uint32_t           offset,
                         const EolTypeInfo *typeinfo);

const EolTypeInfo* eol_type_cache_lookup (EolTypeCache *cache,
                                          uint32_t      offset);

void eol_type_cache_foreach (EolTypeCache     *cache,
                             EolTypeCacheIter  callback,
                             void             *userdata);
\end{ccode}
\caption{Public API of \Mc|EolTypeCache|.}
\label{lst:eol-typecache-api}
\end{listing}

The type cache only manages its own dynamically allocated memory, used for the
nodes of the hash table. The \Mc|EolTypeInfo| structures referenced by the
cache are considered opaque by the cache, and the memory used by them is
never freed by the cache: freeing the cache leaks memory if the cached entries
are not freed by other means.
In practice, this is not a problem because the
type information is constructed as needed, and cached during the whole
lifetime of each loaded ELF object. This approach makes it possible to simply
iterate over the elements to free each one of them before unloading the object
file.


\section{Memory Ownership and Life Cycle}

Native code uses a different approach for memory management compared to Lua:
while Lua uses \gls{GC}, which handles freeing chunks of memory automatically,
native code frees memory explicitly. Take for example a function which creates
a new \Mc|struct| and returns it:

\begin{ccode*}{samepage=true}
struct point { int x; int y; };

struct point* point_new (int x, int y) {
  struct point *p = malloc (sizeof (struct point));
  if (!p) return NULL;  /* Allocation failed */
  p->x = x;
  p->y = y;
  return p;
}
\end{ccode*}

Then, that code is built into an ELF shared object file (\verb|point.so|),
which is loaded using \Eol*, and used normally:

\begin{luacode}
local Geometry = eol.load("point.so")
local point = Geometry.point_new(1, -1)
-- Use the point normally.
\end{luacode}

The Lua VM only knows the userdata that \Eol* has created to represent the
returned value, but it is not aware of the memory that has been allocated to
hold the \Mc|struct point| value. Once the value is no longer used by the
Lua program, the garbage collector reclaims the space used by the userdata,
but \verb|free()| is not called to free the memory allocated by
\verb|malloc()|. Lua just does not know about memory that it has not allocated
itself.
One solution is to manually call a function to free the memory from
Lua:

\begin{luacode}
local libc = eol.load("libc.so")
libc.free(point)
point = nil  -- Make sure it won't be used
\end{luacode}

The main problem with this is that the automatic memory management done by the
Lua VM is lost, and programmers are forced to write additional code to free
memory regions. This puts a burden on the developer, which is better
avoided.


\subsection{Lua as a Custom Allocator}
\label{sec:userdata-lua-custom-allocator}

The Lua VM exposes in its C API the ability to create \emph{userdata} objects.
For the VM, userdata is seen as an opaque value which, by default, cannot be
manipulated from Lua; for the client code using the Lua API, a userdata value
is a region of memory allocated by the Lua VM, which can contain any data.
Like every other value managed by the VM, userdata is subject to \gls{GC},
which means that userdata values which are no longer referenced by a Lua
program are garbage collected. This effectively allows C programmers to reuse
the Lua GC for their data.

\begin{listing}[tH]
\centering
\begin{ccode}
struct data { /* ... */ };

void initialize_data (struct data *d);

struct data* push_new_data (lua_State *L) {
  struct data *d = lua_newuserdata (L, sizeof (struct data));
  initialize_data (d);
  return d;
}
\end{ccode}
\caption{Using Lua userdata to store values}
\label{lst:values-in-userdata}
\end{listing}

By default, userdata values have no predefined behaviour in Lua, except for
assignment (which also covers passing userdata values as function parameters),
and testing for identity. Assigning a metatable to a userdata value allows
the programmer to define operations on userdata values.


\subsubsection{GC Finalization}

The only thing known by the Lua VM about userdata values is that they are
a region of memory. That means that Lua will only free the memory region when
the userdata is picked by the \gls{GC}. If the userdata contains resources
other than raw memory (a file handle, for example), it must be ensured that
those are released appropriately. In order to coordinate the GC with the
management of resources unknown to the VM, Lua supports defining
\emph{finalizer} functions.

Lua requires the programmer to explicitly mark userdata to be finalized. This
is done by setting its metatable: if the metatable contains a function (the
finalizer) associated to the \Mlua|__gc| key, it is called after the userdata
is marked by the GC as garbage, with the userdata itself being passed as the
only argument. Once the userdata is finalized, the memory used by it is
released by Lua as usual. The finalizer can be a C function, allowing native
code to release any resources used by userdata.

Using finalizers is needed when dealing with opaque types which are handled
via pointers. The standard C library includes such a type: open files in C are
handled using pointers to \Mc|FILE| values (for an example, see
\autoref{lst:c-fileptr}), which are created when opening a file with
\Mc|fopen()|. Files cannot be deallocated directly, and instead the
\Mc|fclose()| function must be used. \autoref{lst:lua-gc-example-module}
contains a complete example of a Lua module implemented in C which uses a
finalizer to ensure that opened files are properly closed by calling
\Mc|fclose()| from the finalizer.

\begin{listing}[tH]
\begin{ccode}
int main (int argc, char **argv) {
  FILE *fd = fopen ("hello.txt", "w");
  if (!fd) return 1;  /* The file could not be opened */
  fprintf (fd, "Hello, C file!\n");
  fclose (fd);
  return 0;
}
\end{ccode}
\caption{Using an opaque \Mc|FILE*| in C.}
\label{lst:c-fileptr}
\end{listing}

Finalizers are used extensively in the implementation of the \Eol* Lua module.
The module's own userdata types need finalization (details on those are
provided in \autoref{sec:eol-mod-typemeta}), and the module also allows
attaching arbitrary Lua functions to values created by native code wrapped in
an \Mc|EolVariable| userdata.


\begin{listing}[tH]
\small
\begin{center}
  \emph{C code, to be built with \texttt{cc -shared -o openlog.so
  openlog.c}}
\end{center}
\begin{ccode}
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <lua.h>
#include <lauxlib.h>

struct logger {
  FILE *output;
  int   verbosity;
};

static int logger_call (lua_State *L) { /* ...
*/ }

static int logger_gc (lua_State *L) {
  struct logger *l = luaL_checkudata (L, 1, "LOGGER");
  if (l->output) fclose (l->output);  /* Close the file */
  return 0;
}

static int logger_new (lua_State *L) {
  const char *path = luaL_checkstring (L, 1);
  lua_Integer verbosity = luaL_checkinteger (L, 2);
  struct logger *l = lua_newuserdata (L, sizeof (struct logger));
  if (!(l->output = fopen (path, "a")))
    return luaL_error (L, "cannot open (%s)", strerror (errno));
  l->verbosity = (int) verbosity;
  luaL_setmetatable (L, "LOGGER");
  return 1;
}

int luaopen_openlog (lua_State *L) {
  static const luaL_Reg metamethods[] = {
    { "__call", logger_call },
    { "__gc",   logger_gc   },
    { NULL,     NULL        },
  };
  luaL_newmetatable (L, "LOGGER");
  luaL_setfuncs (L, metamethods, 0);
  lua_pushcfunction (L, logger_new);
  return 1;
}
\end{ccode}

\begin{center}
  \emph{Using the module from Lua}
\end{center}

\begin{luacode}
local openlog = require("openlog")
local log = openlog("/var/log/example.log", 1)
log("Log line")
\end{luacode}

\caption{Small C module which demonstrates using a \texttt{\_\_gc} metamethod}
\label{lst:lua-gc-example-module}
\end{listing}


\section{\Eol* Lua Module}

The top-level API exposed to Lua is the \verb|eol| module, which has to be
returned by the C function that the Lua VM calls after loading an extension
module.
The name of this function is always \verb|luaopen_| followed by
the name of the module being loaded:

\begin{ccode}
LUAMOD_API int
luaopen_eol (lua_State *L)
{
  eol_trace_setup ();

  (void) elf_version (EV_NONE);
  if (elf_version (EV_CURRENT) == EV_NONE)
    return luaL_error (L, "outdated libelf version");

  luaL_newlib (L, eollib);
  create_meta (L);
  return 1;
}
\end{ccode}

Notice how the function uses \verb|luaL_newlib()| instead of manually creating
a table in the Lua stack, and setting a field for each one of the \verb|eol.*|
module level functions. The \verb|eollib| variable is defined as follows,
using the supplied \verb|luaL_Reg| type:

\begin{ccode}
static const luaL_Reg eollib[] = {
  { "load",     eol_load     },
  { "type",     eol_type     },
  { "sizeof",   eol_sizeof   },
  { "typeof",   eol_typeof   },
  { "offsetof", eol_offsetof },
  { "cast",     eol_cast     },
  { NULL,       NULL         },
};
\end{ccode}

As a last step, \verb|create_meta()| is called to register the metatables for
the C types which \Eol* exposes to the Lua VM (\vref{sec:eol-mod-typemeta}).


\subsection{Userdata Metatables}
\label{sec:eol-mod-typemeta}

Metatables for the userdata types (\textsf{Library}, \textsf{Function},
\textsf{CType}, and \textsf{Variable}) are created when the \Eol*
module is loaded by the Lua VM, in the \verb|create_meta()| C function.
Exactly one metatable for each type is created, and all the userdata values of
the same type share the same metatable. All metatables have some common
entries:

\begin{itemize}

  \item A \verb|__gc| metamethod, responsible for decreasing the reference
    count of the associated \textsf{Library}.

  \item A \verb|__tostring| metamethod, in order to provide a string
    representation of the values of the userdata. This is used by the Lua
    \Mlua|tostring()| function.

  \item A \verb|__index| metamethod, which provides support for indexing the
    userdata. The concrete behavior depends on the type of the userdata.

\end{itemize}

For each metatable, an array of \Mc|struct luaL_Reg| values is created. It
contains a list of metamethods and pointers to the C functions which implement
them. The example below is for \textsf{Library}; the rest share a strong
similarity:

\begin{ccode}
/* Methods for Library userdata. */
static const luaL_Reg library_methods[] = {
  { "__gc",       library_gc       },
  { "__tostring", library_tostring },
  { "__index",    library_index    },
  { "__eq",       library_eq       },
  { NULL,         NULL             }
};
\end{ccode}

Then, in the \verb|create_meta()| function, the metatable is created with the
aid of the \verb|luaL_newmetatable()| utility function, which arranges for the
metatable to be available for checking the type of a userdata value later on
with the companion functions \verb|luaL_checkudata()| and
\verb|luaL_testudata()|:

\begin{ccode}
static void
create_meta (lua_State *L) {
  /* EolLibrary */
  luaL_newmetatable (L, EOL_LIBRARY);
  luaL_setfuncs (L, library_methods, 0);
  lua_pop (L, 1);

  /* ... */
}
\end{ccode}

\todo[inline]{Got time? Describe how lookups are done, it is relatively interesting}

\subsection{Library Loading}

The implementation of \verb|eol.load()| tries to avoid loading the same
library more than once. If a library is already loaded, its reference count is
incremented, and the returned \textsf{Library} userdata contains a reference
to the previously loaded library.
To achieve this behavior while keeping
reference bookkeeping simple, a linked list of loaded libraries is maintained.
Each \verb|EolLibrary| \Mc|struct| contains a pointer to the \verb|next|
library (which can be \Mc|NULL|):

\begin{ccode}
struct EolLibrary {
  /* Members used for bookkeeping */
  unsigned int       ref_counter;
  const char        *path;
  struct EolLibrary *next;

  /* Other members */
  /* ... */
};
\end{ccode}

The \verb|path| of libraries is used to determine if two libraries are the
same. For this to work, the \verb|path| of a library, and the paths of
libraries which are candidates to be loaded, must be in canonical form
as returned by the \verb|realpath()|\footnote{\texttt{realpath()}
canonicalizes a file path; it is part of the POSIX standard and included in
most Unix-like systems.} function before comparing them.

It would have been possible to use a hash table to map library paths
(canonicalized) to their corresponding \verb|EolLibrary| \Mc|struct|, to
determine in constant time whether a library is loaded, instead of the linear
time required to check a linked list. However, most programs use, on average,
a number of libraries on the order of tens, and reading the DWARF sections
which contain the index of available types and symbols is much more costly for
any non-trivial library. Therefore, it was determined that the linked list
solution would suffice, while avoiding the additional complexity of the hash
table.


\subsection{Type Information Lookup}

Looking up type information is one of the most common operations performed by
\Eol*. As per the design (\autoref{sec:design-overview}), type information is
to be stored in the \textsf{Type Cache} (for details of its implementation,
see \autoref{sec:impl-type-cache}).
It seemed convenient to provide a single 729 | entry point for all the type information lookups which always checks the 730 | \textsf{Type Cache}. This function is \verb|library_lookup_type()|: 731 | 732 | \begin{ccode} 733 | static const EolTypeInfo* 734 | library_lookup_type (EolLibrary *library, 735 | Dwarf_Off d_offset, 736 | Dwarf_Error *d_error) { 737 | const EolTypeInfo *typeinfo = 738 | eol_type_cache_lookup (&library->type_cache, d_offset); 739 | if (!typeinfo) { 740 | typeinfo = library_build_typeinfo (library, d_offset, d_error); 741 | eol_type_cache_add (&library->type_cache, d_offset, typeinfo); 742 | } 743 | return typeinfo; 744 | } 745 | \end{ccode} 746 | 747 | It is important to note that the cache is always checked first: on a cache 748 | hit, the cached value is returned immediately, while on a cache 749 | miss \verb|library_build_typeinfo()| is called to create a new 750 | \verb|EolTypeInfo| from the DWARF debugging information, which is always added 751 | to the cache \emph{right away}. This is especially important because 752 | \verb|EolTypeInfo| values can contain references to others, which in turn 753 | cause additional type information lookups. Adding the intermediate values 754 | to the cache as they are built greatly increases the ratio of cache hits.
To 755 | better understand this, consider the following example lookups for functions 756 | of the C standard library, starting from an empty cache: 757 | 758 | \begin{figure}[thH] 759 | \centering 760 | \begin{tikzpicture} 761 | \begin{axis}[ 762 | width=0.85\textwidth, 763 | height=0.5\textwidth, 764 | style={/pgf/number format/assume math mode=true}, 765 | xlabel={\emph{time}}, 766 | % ylabel={\emph{hits / misses}}, 767 | enlarge x limits=0.05, 768 | axis on top, 769 | tick align=inside, 770 | axis x line*=bottom, 771 | axis y line*=left, 772 | legend style={legend pos=north west}, 773 | x tick label style={opacity=0}, 774 | ] 775 | \addplot plot coordinates { 776 | (1,0) (2,7) (3,8) (4,9) (5,9) (6,11) (7,13) (8,13) (9,154) (10,155) 777 | (11,158) (12,165) (13,168) (14,170) (15,171) 778 | }; 779 | \addplot plot coordinates { 780 | (1,0) (2,7) (3,14) (4,14) (5,18) (6,18) (7,18) (8,19) (9,107) 781 | (10,107) (11,107) (12,107) (13,107) (14,107) (15,107) 782 | }; 783 | \legend{hits \\ misses \\} 784 | \end{axis} 785 | \end{tikzpicture} 786 | \caption{Typical progression of type cache hits/misses over time} 787 | \label{fig:plot-type-cache-hitmiss} 788 | \end{figure} 789 | 790 | \begin{enumerate} 791 | 792 | \item Look up type information for the \Mc|int atoi(const char*)| function. 793 | This generates one lookup for the \Mc|int| return type, plus another lookup 794 | for the \Mc|const char*| parameter. The latter itself causes more lookups: 795 | one for the \Mc|char*| type, which in turn causes yet another lookup for the 796 | \Mc|char| type. The type information for \Mc|char| gets added to the 797 | cache at this point, then the information for \Mc|char*|, and finally for 798 | \Mc|const char*|. 799 | 800 | \item Look up type information for the \Mc|char* strchr(const char*, int)| 801 | function. The cache already contains the type information for the function 802 | parameters, which were looked up for the \Mc|atoi()| function.
The type 803 | information for the return type, \Mc|char*|, is already in the cache, 804 | because it was added as an intermediate result for the \Mc|atoi()| 805 | function. As for the parameter types, both \Mc|const char*| and \Mc|int| are 806 | in the cache as well. 807 | 808 | \end{enumerate} 809 | 810 | The implemented policy for type information lookup fills up the cache 811 | quickly while the program starts, up to a point where most of the 812 | types used are present in the cache 813 | (\autoref{fig:plot-type-cache-hitmiss}), and from that moment onwards the 814 | number of cache misses is very small. 815 | 816 | 817 | \subsection{Querying Types} 818 | 819 | The \verb|eol.typeof()| function accepts arguments of different types. 820 | The \verb|luaL_check*()| family of functions from the Lua C API raises an error 821 | if the argument is not of the expected type, and therefore some additional 822 | work is needed to ensure that \verb|eol.typeof()| works as specified: 823 | 824 | \begin{ccode} 825 | static int 826 | eol_typeof (lua_State *L) { 827 | if (luaL_testudata (L, 1, EOL_TYPEINFO)) { 828 | lua_settop (L, 1); 829 | } else { 830 | EolVariable *ev = luaL_testudata (L, 1, EOL_VARIABLE); 831 | if (ev) { 832 | typeinfo_push_userdata (L, ev->typeinfo); 833 | } else { 834 | luaL_checktype (L, 1, LUA_TSTRING); 835 | const char *name = lua_tostring (L, 1); 836 | /* Omitted: Lookup type by name in all loaded libraries */ 837 | typeinfo_push_userdata (L, typeinfo); 838 | } 839 | } 840 | return 1; 841 | } 842 | \end{ccode} 843 | 844 | Functions \verb|eol.sizeof()|, \verb|eol.alignof()|, and \verb|eol.offsetof()| 845 | are implemented similarly, with the exception that they omit the code to 846 | look up the type information when passed a string argument.
Once the 847 | corresponding \verb|EolTypeInfo| is found, it can be queried for the requested 848 | information: \verb|eol_typeinfo_sizeof()| to obtain the size, 849 | \verb|eol_typeinfo_alignment()| for the alignment, and for obtaining the 850 | offset of a \Mc|struct| member, the information is available in the 851 | \verb|EolTypeInfoMember| value returned by 852 | \verb|eol_typeinfo_compound_named_member()|. 853 | 854 | \subsection{Casting} 855 | 856 | Using the \verb|eol.cast()| function on a \textsf{Variable} userdata changes 857 | the associated type for it, effectively treating the same data as if it were 858 | of another type. In order to achieve this, we just return to Lua a new 859 | \textsf{Variable} userdata with the given type information which points to 860 | the same memory area: 861 | 862 | \begin{ccode} 863 | static int 864 | eol_cast (lua_State *L) { 865 | const EolTypeInfo *typeinfo = to_eol_typeinfo (L, 1); 866 | EolVariable *ev = to_eol_variable (L, 2); 867 | 868 | /* Typeinfo from the 1st argument, data address from the 2nd. */ 869 | variable_push_userdata (L, ev->library, typeinfo, 870 | ev->address, ev->name, VARIABLE_NOCOPY); 871 | return 1; 872 | } 873 | \end{ccode} 874 | 875 | 876 | \subsection{Preprocessor “Generator Macros”} 877 | \label{sec:cpp-abuse-genmacros} 878 | 879 | This is a programming pattern used throughout the code of \Eol*: the 880 | C preprocessor is used in a convoluted way as a rudimentary code generator 881 | using lists of related elements. First, a macro listing related elements is defined 882 | (\emph{enumerator macro}, from now on), and it must accept the identifier for 883 | another macro (the \emph{generator macro}) as a parameter. Each element in the 884 | enumerator macro is an expansion of the generator, passing the parameters 885 | needed by the generator. 886 | 887 | In order to better understand how generator macros work, let us walk through 888 | a complete example adapted from the \Eol* source code.
The following macro 889 | expands into a function which checks the type of an \Mc|EolTypeInfo|; it is 890 | the \emph{generator}: 891 | 892 | \begin{ccode*}{samepage=true} 893 | #define MAKE_TYPEINFO_IS_TYPE(suffix, name, ctype) \ 894 | bool eol_typeinfo_is_ ## name (const EolTypeInfo *info) \ 895 | { return info->type == EOL_TYPE_ ## suffix; } 896 | \end{ccode*} 897 | 898 | \noindent In generator macros like this, the concatenation operator 899 | (\verb|##|) of the preprocessor is used extensively to build pieces of valid 900 | C code. The example shows how the \verb|name| parameter is concatenated to 901 | create the name of the generated function, and the \verb|suffix| parameter is 902 | concatenated to create a valid \verb|EolType| (\autoref{lst:EolType}) 903 | value. A valid expansion of the above macro is: 904 | 905 | \begin{ccode*}{samepage=true} 906 | MAKE_TYPEINFO_IS_TYPE (S32, s32, int32_t) 907 | \end{ccode*} 908 | 909 | \noindent 910 | which generates the following valid C function: 911 | 912 | \begin{ccode*}{samepage=true} 913 | bool eol_typeinfo_is_s32 (const EolTypeInfo *info) 914 | { return info->type == EOL_TYPE_S32; } 915 | \end{ccode*} 916 | 917 | \noindent The \emph{enumerator macro} is made by grouping a set of macro 918 | expansions like the one above. The key is using a generic name for the 919 | generator macro, which will be passed as a parameter. The next listing defines 920 | an enumerator which expands a generator \verb|F| for each signed integer type: 921 | 922 | \begin{ccode*}{samepage=true} 923 | #define INTEGER_S_TYPES(F) \ 924 | F (S8, s8, int8_t ) \ 925 | F (S16, s16, int16_t ) \ 926 | F (S32, s32, int32_t ) \ 927 | F (S64, s64, int64_t ) 928 | \end{ccode*} 929 | 930 | \noindent Using the above definition, an expansion of the \emph{enumerator 931 | macro} causes multiple expansions at once of the \emph{generator macro} passed 932 | as \verb|F|, which in turn creates as many functions as elements in the 933 | enumerator macro.
In this example, using generator macros reduces the amount 934 | of code that the programmer must write manually to close to one fourth of the 935 | original. 936 | 937 | Another use case for generator macros is creating the code for cases in 938 | a \Mc|switch| statement. Instead of constructing the code for an entire 939 | function at a time, only a single \Mc|case| label and its associated 940 | statements are generated. This is done in the following example: 941 | 942 | \begin{ccode*}{samepage=true} 943 | #define MAKE_SIGNED_TYPE_CASE(suffix, name, ctype) \ 944 | case EOL_TYPE_ ## suffix: return true; 945 | 946 | bool eol_type_is_signed (EolType type) { 947 | switch (type) { 948 | INTEGER_S_TYPES (MAKE_SIGNED_TYPE_CASE) 949 | default: return false; 950 | } 951 | } 952 | \end{ccode*} 953 | 954 | 955 | \section{Test Harness} 956 | 957 | The implementation of the test harness is not particularly complex, and its 958 | main piece of code is the test runner, contained in a single Lua script 959 | (\texttt{tools/harness.lua} in the source tree) which is mostly 960 | self-explanatory. In broad terms, it works in the following way: 961 | 962 | \begin{enumerate} 963 | 964 | \item The directory containing the unit tests, which are Lua scripts, is 965 | scanned, and the file names are used to populate the list of unit tests to run. 966 | 967 | \item If the names of any tests have been given on the command line, the 968 | list of tests to run is replaced with the names given as command line 969 | arguments. 970 | 971 | \item A \textsf{Test} object is created for each unit test in the list of 972 | tests to run. 973 | 974 | \item An \textsf{Output} is chosen, depending on the execution environment 975 | and command line arguments. The \Mlua|Output:setup()| method is invoked. 976 | 977 | \item For each \textsf{Test}, \Mlua|Output:start()| is called, the 978 | \textsf{Test} is executed, and \Mlua|Output:finish()| is called to report its 979 | execution status.
980 | 981 | \item After running the unit tests, before exiting, the \textsf{Output} is 982 | given the chance to generate a summary of the test results (or any other 983 | content it may consider necessary) by calling \Mlua|Output:report()|. 984 | 985 | \end{enumerate} 986 | 987 | Running each test case in a new process is achieved using the 988 | \Mlua|io.popen()| function from the Lua standard library, which allows 989 | capturing the output of the process as well. A normal Lua interpreter is used 990 | to run each unit test. 991 | 992 | \begin{figure}[h] 993 | \begin{luacode} 994 | local eol = require("eol") 995 | assert.Field(eol, "alignof") 996 | assert.Callable(eol.alignof) 997 | 998 | local libtest = eol.load("libtest") 999 | local u8type = eol.type(libtest, "uint8_t") 1000 | 1001 | assert.Equal(1, eol.alignof(u8type)) 1002 | assert.Equal(1, eol.alignof(libtest.var_u8)) 1003 | 1004 | -- Values other than variables or typeinfos raise an error 1005 | for _, value in ipairs { 1, 3.14, false, true, "str", { a=1 } } do 1006 | assert.Error(function () 1007 | local _ = eol.alignof(value) 1008 | end) 1009 | end 1010 | \end{luacode} 1011 | \caption{Unit test for the \texttt{eol.alignof()} function} 1012 | \label{lst:unittest-example} 1013 | \end{figure} 1014 | 1015 | \subsection{\texttt{tools/run-tests}} 1016 | 1017 | The \verb|run-tests| helper script simplifies running the test harness. 1018 | In order to preload the implementation of the assertions 1019 | (\texttt{tools/harness-assertions.lua} in the source tree, specification in 1020 | \autoref{tab:design-test-asserts}) the script defines the \verb|LUA_INIT| environment variable. 1021 | It also makes sure that the correct \verb|lua| executable is used. 1022 | 1023 | 1024 | \subsection{Helper C Module} 1025 | 1026 | The test runner needs access to a series of functions from the C standard 1027 | library which are not available in Lua.
Access to these functions is 1028 | implemented as a loadable C module named \texttt{testutil} which contains 1029 | trivial code for the following functions: 1030 | 1031 | \begin{itemize} 1032 | 1033 | \item \verb|testutil.isatty()| is used to determine whether a console is 1034 | connected to the standard output of the test harness process. In that case, 1035 | the \textsf{Console Output} is used for output formatting, otherwise the 1036 | output is redirected to a pipe or a file, and the \textsf{TAP Output} is 1037 | selected instead. 1038 | 1039 | \item \verb|testutil.listdir()| reads the directory at a given path, and 1040 | returns an array with the names of the files contained in it. 1041 | 1042 | \item \verb|testutil.isfile()| and \verb|testutil.isdir()| check whether the 1043 | path passed to them is a regular file or a directory, respectively. 1044 | Internally they use the \verb|stat()| system call to obtain information 1045 | about the path. 1046 | 1047 | \item \verb|testutil.realpath()| allows calling the POSIX \verb|realpath()| 1048 | function to canonicalize file system paths. 1049 | 1050 | \item \verb|testutil.getcwd()| allows calling the \verb|getcwd()| function 1051 | from the C library to obtain the working directory of the test runner. 1052 | 1053 | \end{itemize} 1054 | 1055 | 1056 | \section{Complete Test Programs} 1057 | 1058 | In order to stress the implementation of the \Eol* module, a few programs 1059 | which feature code similar to that found in real world systems have been 1060 | developed. The following libraries were chosen due to their size being small 1061 | and easy to build with full DWARF debugging information, yet containing 1062 | complex functions and types which would pose a challenge to \Eol*: 1063 | 1064 | \begin{itemize} 1065 | 1066 | \item µPNG\footnote{\url{https://github.com/elanthis/upng}}: small library which 1067 | implements a \gls{PNG} image decoder. 
It is used in embedded devices with 1068 | constrained memory resources, like the Pebble Time smartwatch. 1069 | 1070 | \item NanoVG\footnote{\url{https://github.com/memononen/nanovg}}: embedded 1071 | implementation of a subset of the OpenVG graphics API, which uses OpenGL for 1072 | rendering. 1073 | 1074 | % \item cImgIU\footnote{\url{https://github.com/Extrawurst/cimgui/}}: 1075 | % graphical user interface library 1076 | 1077 | \item GLFW\footnote{\url{http://www.glfw.org}}: utility library for OpenGL 1078 | application development which simplifies the creation of windows with OpenGL 1079 | contexts, and handling input from the user. 1080 | 1081 | \end{itemize} 1082 | 1083 | The programs are included in the \texttt{examples/} directory of the \Eol* 1084 | source code: 1085 | 1086 | \begin{description} 1087 | 1088 | \item [type-pp.lua] \hfill\\ 1089 | Pretty-prints information about C types present in an arbitrary library. 1090 | This was implemented to exercise the ability of the \Eol* module to 1091 | provide precise type information to Lua programs. 1092 | 1093 | \item [upng-info.lua] \hfill\\ Uses the µPNG library to show information 1094 | about \gls{PNG} images. 1095 | 1096 | \item [nanovg-demo.lua, nanovg-noise.lua] \hfill\\ 1097 | These two programs use the GLFW library to create a window with an OpenGL 1098 | context, and the NanoVG graphics library for rendering. 1099 | 1100 | The first uses functions involving complex native 1101 | types being passed between Lua and C to paint a series of translucent 1102 | animated waves (\autoref{fig:nanovg-demo}). The second displays animated 1103 | random noise which is generated from Lua into a native memory buffer 1104 | that has been allocated by \Eol* and passed as a texture to the graphics 1105 | card using NanoVG.
1106 | 1107 | \end{description} 1108 | 1109 | \begin{figure} 1110 | \centering 1111 | \includegraphics[width=0.75\textwidth]{img/nanovg-demo.png} 1112 | % \includegraphics[width=0.45\textwidth]{img/nanovg-noise.png} 1113 | \caption{Demo implemented in Lua of the NanoVG graphics library} 1114 | \label{fig:nanovg-demo} 1115 | \end{figure} 1116 | 1117 | 1118 | The development of these pilot programs using \Eol* did not uncover issues or 1119 | bugs that had not been detected by the unit tests. Therefore, we can conclude 1120 | that the unit tests and harness were most effective for the development 1121 | process. Validating the system using realistic code examples has helped 1122 | increase the confidence in the usefulness, stability and quality of this 1123 | project. 1124 | 1125 | \beforeintro 1126 | --------------------------------------------------------------------------------