├── .gitignore
├── LICENSE
├── README.md
├── docs
    ├── CHROME_TRACING_RENDERING_TOGGLE.png
    ├── CHROME_TRACING_RENDERING_TRACE.png
    ├── CPU_LIVE_USAGE_PROFILER.png
    ├── CPU_TRACE_PROFILER.png
    ├── GOOGLE_THREE_SNAPSHOT_TECHNIQUE.jpg
    ├── HEAP_SNAPSHOT_BUTTON.png
    ├── HEAP_SNAPSHOT_TRACE_DETACHED.png
    ├── RENDERDOC_DRAWCALL.png
    ├── SPECTORJS_SHADER.png
    ├── SPECTORJS_STATE.png
    ├── V8_AVAILABLE_FLAGS.md
    └── V8_COMPILER_PIPELINE.jpg
└── scripts
    ├── run_macos.sh
    └── setup_macos.sh


/.gitignore:
--------------------------------------------------------------------------------
1 | # OS
2 | .DS_Store
3 | 
4 | # Log files
5 | *.log
6 | 
7 | # Directories
8 | logs
9 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2018 Tim van Scherpenzeel
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # Profiling research
  2 | 
  3 | Research on profiling of high-performance web applications (primarily WebGL applications).
  4 | 
  5 | ## Table of contents
  6 | 
  7 | - [Introduction](#introduction)
  8 | 
  9 | - [Compiler pipeline](#compiler-pipeline)
 10 | 
 11 |   - [Source code](#source-code)
 12 |   - [Parser](#parser)
 13 |   - [AST](#ast)
 14 |   - [Baseline compiler](#baseline-compiler)
 15 |   - [Optimising compiler](#optimising-compiler)
 16 |   - [Conclusion](#conclusion)
 17 | 
 18 | - [Profiling](#profiling)
 19 | 
 20 |   - [Memory profiling and garbage collection](#memory-profiling-and-garbage-collection)
 21 |     - [Heap snapshot](#heap-snapshot)
 22 |     - [Three snapshot technique](#three-snapshot-technique)
 23 |   - [CPU profiling](#cpu-profiling)
 24 |   - [GPU profiling](#gpu-profiling)
 25 | 
 26 | - [Installation](#installation)
 27 | 
 28 | - [Usage](#usage)
 29 | 
 30 | - [Resources and references](#resources-and-references)
 31 | 
 32 | ## Introduction
 33 | 
 34 | To profile the performance of a web application one would usually use the browser's built-in developer tools. Every once in a while, however, a developer needs a deeper understanding of a performance issue in order to solve it. To get that understanding the developer needs access to, and an understanding of, the low-level optimisations, de-optimisations and caching techniques in modern browser engines. Due to security restrictions in the browser, this low-level information is only really available when the browser is launched locally with various flags enabled.
 35 | 
 36 | `Chrome` and `V8` ship with various built-in tools that help their developers during development of the browser and engine. Luckily, as web developers we can leverage these same tools to get a better understanding of what is happening under the hood.
 37 | 
 38 | To understand which parts of the application are useful to profile one must have a general understanding of the architecture of the compiler pipeline in modern browser engines like `V8`. The compiler pipelines behind each browser are similar in design but differ in their technical details. By looking at the `V8` pipeline in general terms we can understand what the core parts of a browser engine are without getting lost in the implementation details.
 39 | 
 40 | It is not necessary to understand the internals of each browser engine but it is beneficial as a starting point in understanding what is harming the performance of your application.
 41 | 
 42 | ## Compiler pipeline
 43 | 
 44 | ![V8 compiler pipeline](/docs/V8_COMPILER_PIPELINE.jpg?raw=true)
 45 | 
 46 | _Image source: Franziska Hinkelmann - https://medium.com/dailyjs/understanding-v8s-bytecode-317d46c94775_
 47 | 
 48 | ### Source code
 49 | 
 50 | JavaScript source code is `JIT (Just In Time)` compiled, meaning it is compiled to machine code while the program is running. Source code is initially just plain text with a MIME type that identifies it as JavaScript code. It must be parsed by a `parser` in order to be understood as JavaScript code by the browser engine.
 51 | 
 52 | ### Parser
 53 | 
 54 | The parser generally consists of a `pre-parser` and a `full-parser`. The `pre-parser` rapidly checks for syntactical and early errors in the program and will throw if it finds any. The `full-parser` evaluates the scope of variables throughout the program and collects basic type information.
 55 | 
 56 | ### AST
 57 | 
 58 | An `Abstract Syntax Tree`, or `AST` for short, is created from the parsed source code.
 59 | `ASTs` are data structures widely used in compilers because they represent the structure of program code. An `AST` is usually the result of the syntax analysis phase of a compiler: a tree representation of the abstract syntactic structure of the source code. Each node of the tree denotes a construct occurring in the source code. It is beneficial to get a good understanding of what `ASTs` are as they are very often used in pre-processors, code generators, minifiers, transpilers, linters and codemods.
 60 | 
 61 | ### Baseline compiler
 62 | 
 63 | The goal of the baseline compiler (`Ignition` in `V8`) is to generate relatively unoptimised code as quickly as possible and to collect general type information for use in potential further compilation steps. Strictly speaking `Ignition` is an interpreter: rather than CPU-specific `machine code` it generates and executes architecture-independent `bytecode`. Whilst running, functions that are called often are marked as `hot` and become candidates for further optimisation by the optimising compiler(s).
 64 | 
 65 | ### Optimising compiler
 66 | 
 67 | The optimising compiler (`TurboFan` in `V8`) recompiles `hot` functions using previously collected type information to optimise the generated `machine code` further. To produce faster `machine code`, however, the optimising compiler has to make assumptions about the shape of the objects it sees, namely that they always have the same property names in the same order. If an object's shape has been the same throughout the lifetime of the program so far, it is assumed to remain that way during future execution. Unfortunately JavaScript gives no such guarantee: object shapes can change at any time. Because of this the compiler's assumptions have to be validated every time the optimised code runs. If an assumption turns out to be false, the engine discards the optimised code and falls back to a de-optimised version in which the assumptions still hold. It is therefore very important to limit the number of type and shape changes of an object throughout the lifetime of the program in order to keep the highly optimised code alive. In the worst case the code ends up in `de-optimised hell` and will never be picked up for optimisation again. Any code that `V8` refuses to optimise can also end up in `de-optimised hell`.
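
As a rough illustration of the advice above, the sketch below keeps every object passed to a `hot` function in one consistent layout. The names `createPoint` and `lengthSquared` are made up for this example, and whether an engine really treats the two "avoid" objects as different shapes is an implementation detail that can change between versions.

```javascript
// Hypothetical helper: always initialises the same properties in the same
// order, so every point it returns has an identical layout.
function createPoint(x, y) {
  return { x: x, y: y };
}

// A `hot` function that is only ever called with objects from createPoint.
function lengthSquared(point) {
  return point.x * point.x + point.y * point.y;
}

// Patterns to avoid: the same data, but built with a different property
// order, or extended with a new property after creation.
const reordered = { y: 4, x: 3 };
const extended = createPoint(3, 4);
extended.label = "added later";

let total = 0;
for (let i = 0; i < 1000; i++) {
  total += lengthSquared(createPoint(3, 4));
}
```

The results are identical either way; the point is only that the loop feeds `lengthSquared` a single, predictable object layout.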
 68 | 
 69 | Browser engines are continuously improving their optimisation techniques, especially around new browser features. Browser vendors generally recommend **against** implementing browser specific hacks to work around de-optimisations. There are, however, specific keywords and patterns you should avoid using:
 70 | 
 71 | - Avoid using `eval`, `arguments` and `with`. They cause what is known as [aliasing](<https://en.wikipedia.org/wiki/Aliasing_(computing)#Conflicts_with_optimization>), which prevents the browser engine from optimising the code that uses them.
 72 | - Avoid using arrays with many different types in them.
 73 | - Avoid swapping out values in an array with a value of another type.
 74 | - Avoid creating holes in arrays by deleting array entries or setting entries to `undefined`.
 75 | - Avoid using `for in` as it will include properties that are inherited through the prototype chain. This behavior can lead to unexpected items in your loop and browsers are likely to de-optimise the code inside the loop.
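
Two of the points above can be shown in a few lines. This is a small sketch with made-up objects (`defaults`, `settings`, `samples`), not a benchmark: it only demonstrates the language behavior that the list refers to.

```javascript
// Hypothetical base object whose property is inherited by `settings`.
const defaults = { antialias: false };
const settings = Object.create(defaults);
settings.width = 1280;
settings.height = 720;

// `for in` also visits enumerable properties found on the prototype chain.
const visitedByForIn = [];
for (const key in settings) {
  visitedByForIn.push(key);
}

// Object.keys only returns the object's own enumerable properties.
const ownKeys = Object.keys(settings);

// Arrays: append values of a single type instead of leaving holes or mixing
// numbers with strings and objects.
const samples = [];
for (let i = 0; i < 4; i++) {
  samples.push(i + 0.5);
}
```

Here `visitedByForIn` contains the inherited `antialias` key while `ownKeys` does not.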
 76 | 
 77 | ### Conclusion
 78 | 
 79 | When profiling and optimising your JavaScript code, part of your effort should go towards finding out which parts of the application are being optimised by the optimising compiler (meaning that these functions are `hot`) and, more importantly, which parts are being de-optimised. De-optimisation most likely happens because types are changing in `hot` parts of the code or because certain optimisations are not yet implemented by the compiler (such as `try catch` a few years ago, which has since been fixed). Whilst you should pay attention when using features that are not yet optimised, you should still use them and report to the browser vendors that you do. If a certain de-optimisation shows up a lot in heuristics and performance bug reports it is likely to be picked up by the engine maintainers as a priority. Other things to take into account are optimising object property access, maintaining object shapes and understanding the power of inline caches (monomorphic, polymorphic, megamorphic). Inline caches remember where to find properties on objects in order to reduce the number of expensive lookups.
 80 | 
 81 | ## Profiling
 82 | 
 83 | Besides the browser's built-in sampling profilers available in `Chrome developer tools` and the structural profiler available in `chrome://tracing` one can start the browser from the command line with flags to enable the tracing of various parts of the web application.
 84 | 
 85 | Please note that any trace recorded with the tool will contain all resources currently open in the browser (tabs, extensions, subresources). Make sure that `Chrome` starts without any other resources (other tabs, extensions, profile) active in order to get a relatively clean trace. To record a clean trace you should keep the recording to a maximum of 10 seconds, focus on a single activity per recording and leave the computer completely idle for 2 seconds before and after each recording. This helps the slow process stand out amongst the other recorded data.
 86 | 
 87 | ### Memory profiling and garbage collection
 88 | 
 89 | The essential point of garbage collection is the ability to manage memory usage by an application. All management of the memory is done by the browser engine, no API is exposed to web developers to control it explicitly. Web developers can however learn how to structure their programs in order to use the garbage collector to their advantage and avoid the generation of garbage.
 90 | 
 91 | All variables in a program are part of the object graph and object variables can reference other variables. Allocating variables is done from the `young memory pool` and is very cheap until the memory pool runs out of memory. Whenever that happens a garbage collection is forced which causes higher latency, dropped frames and thus a major impact on the user experience.
 92 | 
 93 | All variables that cannot be reached from the root node are considered garbage. The job of the garbage collector is to `mark-and-sweep`, in other words to go through the objects that are allocated in memory and determine whether they are `dead` or `alive`. If an object is unreachable it is removed from memory and the previously allocated memory is released back to the heap. Generally, e.g. in `V8`, the object heap is segmented into two parts: the `young generation` and the `old generation`. The `young generation` consists of `new space` in which new objects are allocated. It allocates fast, frequently collects and collects fast. The `old space` stores objects that survived enough garbage collector cycles to be promoted to the `old generation`. It allocates fast, infrequently collects and does slower collection.
 94 | 
 95 | The cost of a garbage collection is proportional to the number of live objects. This is due to a copying mechanism that copies objects that are still alive over into a new space. Most newly allocated objects do not survive long enough to become old. It is important to understand that each allocation moves you closer to a garbage collection and that every collection pauses the execution of your application. In performance critical applications it is therefore important to strive for a static number of live objects and to avoid allocating new ones whilst running.
 96 | 
 97 | In order to limit the number of objects that have to be garbage collected a developer should take the following aspects into account:
 98 | 
 99 | - Avoid allocating new objects or changing the types of outer scoped (or even global) variables inside of a `hot` function.
100 | - Be careful with circular references. A circular reference is formed when two objects reference each other. A mark-and-sweep collector reclaims such a cycle once it is unreachable from the root, but engines that relied on reference counting (such as older versions of Internet Explorer) could not, and a cycle that is still reachable through a single forgotten reference keeps every object in it alive, so memory keeps growing over time.
101 | - A possible solution for the object allocation problem in the `young generation` is the use of an `object pool`, which pre-allocates a fixed number of objects ahead of time and keeps them alive by recycling them. This is a relatively common technique that gives you more explicit control over your objects' lifetime. It does however come with an upfront cost when initializing and filling the pool, and it occupies a constant chunk of memory throughout your application's lifetime. An example of an object pool implementation can be found [here](https://github.com/timvanscherpenzeel/object-pool).
102 | - Make use of [WeakMaps](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WeakMap) where possible as they hold "weak" references to key objects, which means that they do not prevent garbage collection in case there would be no other reference to the key object.
103 | - Avoid associating the `delete` keyword in JavaScript with manual memory management. The `delete` keyword is used to remove properties from objects, not objects or variables as a whole, and is therefore **not** useful to mark objects ready to be garbage collected.
104 | - When profiling make sure to run it in an incognito window in a fresh browser instance **without** any browser extensions as they share the same heap as the JavaScript program that you are trying to profile.
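
The object pool idea from the list above can be sketched in a few lines. This is a minimal illustration under my own assumptions: `createObjectPool`, `acquire`, `release` and `available` are hypothetical names and not the API of the linked library.

```javascript
// Minimal object pool sketch: pre-allocate `size` objects and recycle them.
function createObjectPool(factory, reset, size) {
  const free = [];
  for (let i = 0; i < size; i++) {
    free.push(factory());
  }

  return {
    // Hand out a pre-allocated object; only allocate when the pool is empty.
    acquire() {
      return free.length > 0 ? free.pop() : factory();
    },
    // Reset the object and make it available for reuse.
    release(object) {
      reset(object);
      free.push(object);
    },
    // Number of objects currently waiting in the pool.
    available() {
      return free.length;
    },
  };
}

const vectorPool = createObjectPool(
  () => ({ x: 0, y: 0, z: 0 }),
  (v) => {
    v.x = 0;
    v.y = 0;
    v.z = 0;
  },
  2
);

const a = vectorPool.acquire();
a.x = 5;
vectorPool.release(a);
const b = vectorPool.acquire(); // the same object as `a`, reset and recycled
```

Falling back to `factory()` when the pool is empty keeps the sketch safe to use, at the cost of an allocation; a stricter pool could throw instead so that an undersized pool is noticed during development.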
105 | 
106 | #### Heap snapshot
107 | 
108 | In the `Chrome developer tools` panel, in the memory tab, you can find the option to take a `heap snapshot`, which shows the memory distribution among your application's JavaScript objects and related DOM nodes. It is important to note that a major garbage collection is done right **before** the snapshot is taken. Because of this you can assume that everything `V8` considers eligible for garbage collection has already been cleaned up, which gives you an idea of what `V8` was unable to clean up at the time.
109 | 
110 | ![Heap snapshot button](/docs/HEAP_SNAPSHOT_BUTTON.png?raw=true)
111 | 
112 | _Image source: Google Developers - https://developers.google.com/web/tools/chrome-devtools/memory-problems/_
113 | 
114 | Once you have taken your snapshot you can start inspecting it.
115 | 
116 | You should ignore everything in parentheses and everything that is dimmed in the `heap snapshot`. These are various constructors that you do not have explicit control over in your application (such as native methods and global browser methods). The snapshot is ordered by `constructor` name and you can filter the heap to find your constructor using the `class filter` at the top. If you record multiple snapshots it is beneficial to compare them to each other. You can do this by opening the dropdown menu to the left of the `class filter` and setting it to `comparison`. You can now see the difference between two snapshots. The list will be much shorter and you can see more easily what has changed in memory.
117 | 
118 | ![Heap snapshot trace](/docs/HEAP_SNAPSHOT_TRACE_DETACHED.png?raw=true)
119 | 
120 | _Image source: Google Developers - https://developers.google.com/web/tools/chrome-devtools/memory-problems/_
121 | 
122 | Objects in the `heap snapshot` with a **yellow background** are directly referenced from JavaScript code. As long as that reference exists they cannot be cleaned up, even when they are no longer part of the DOM tree.
123 | 
124 | Objects with a **red background** in the `heap snapshot` have been detached from the DOM tree and have no direct reference from JavaScript code. A DOM node can only be garbage collected when there are no references to it from either the page's DOM tree or JavaScript code. A node is said to be `detached` when it is removed from the DOM tree but some JavaScript still references it. Detached DOM nodes are a common cause of memory leaks. Red nodes are only alive because they are part of a yellow node's tree.
125 | 
126 | In general, you want to focus on the yellow nodes in the `heap snapshot`. Fix your code so that the yellow node isn't alive for longer than it needs to be, that way you also get rid of the red nodes that are part of the yellow node's tree.
127 | 
128 | For more information there are excellent entries on the `Chrome developer tools` blog on memory profiling:
129 | 
130 | - [Fix memory problems](https://developers.google.com/web/tools/chrome-devtools/memory-problems/)
131 | - [Understand memory terminology](https://developers.google.com/web/tools/chrome-devtools/memory-problems/memory-101)
132 | - [Record Heap Snapshots](https://developers.google.com/web/tools/chrome-devtools/memory-problems/heap-snapshots)
133 | - [Use the Allocation Profiler](https://developers.google.com/web/tools/chrome-devtools/memory-problems/allocation-profiler)
134 | 
135 | #### Three snapshot technique
136 | 
137 | A recommended technique for capturing and analyzing snapshots is to make three captures and do comparisons between them as shown in the following graphic:
138 | 
139 | ![Three snapshot technique](/docs/GOOGLE_THREE_SNAPSHOT_TECHNIQUE.jpg?raw=true)
140 | 
141 | _Image source: Google Developers Live - https://www.youtube.com/watch?v=L3ugr9BJqIs_
142 | 
143 | ### CPU profiling
144 | 
145 | In order to know if you are CPU bound you must profile the CPU. Most of the time it makes sense to keep an eye on real-time CPU usage and only when in doubt capture a CPU trace.
146 | 
147 | In `Chrome` there is a useful live CPU usage and runtime performance visualizer available in the `performance monitor` tab.
148 | 
149 | ![Performance monitor](/docs/CPU_LIVE_USAGE_PROFILER.png?raw=true)
150 | 
151 | More advanced captures over a period of time can be done using the performance capture feature available in the `performance` tab in `Chrome`. A good tutorial for understanding the runtime performance trace can be found [here](https://developers.google.com/web/tools/chrome-devtools/evaluate-performance/).
152 | 
153 | ![Performance tracer](/docs/CPU_TRACE_PROFILER.png?raw=true)
154 | 
155 | If you are CPU bound when rendering it is likely because of too many draw calls. This is a common problem and the solution is often to combine draw calls to reduce the cost.
156 | 
157 | There is however a lot more going on than just draw calls. The renderer needs to process and update each object (culling, material, lighting, collision) on every frame tick. The more complex your materials (math, textures) the higher the cost at creation time and the more expensive it is to run at runtime. In WebGL there is a small but significant overhead due to strict validation of the shader code. The underlying graphics driver validates the commands further and creates a command buffer for the hardware. In a browser a lot more is going on than just the WebGL rendering context.
158 | 
159 | In order to reduce the mesh draw calls one can use the following techniques:
160 | 
161 | - Combine meshes into a single mesh to reduce the amount of necessary draw calls.
162 | - Reduce the object count (e.g. static meshes, dynamic meshes and mesh particles).
163 | - Reduce the far view distance on your cameras.
164 | - Adjust the field of view of your cameras to be smaller in order to have fewer objects in the view frustum.
165 | - Reduce the number of elements per draw call (e.g. combine textures into texture atlases, use LOD models).
166 | - Disable features on a mesh like custom depth, shadow casting and shadow receiving.
167 | - Change light sources to not shadow cast or have a tighter bounding volume (view cone, attenuation radius).
168 | - Use hardware instancing where possible as it reduces the driver overhead per draw call (e.g. mesh particles).
169 | - Reduce the depth of your scene graph.
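
The first technique in the list, combining meshes, can be sketched as follows. This assumes each mesh is a plain object with a `positions` `Float32Array` (x, y, z per vertex) and a `Uint16Array` of `indices`, that all meshes share one material, and that their transforms are already baked into the positions. `mergeMeshes` is a hypothetical helper, not part of any particular engine.

```javascript
// Merge several indexed meshes into one vertex buffer and one index buffer
// so that they can be submitted with a single draw call.
function mergeMeshes(meshes) {
  let floatCount = 0;
  let indexCount = 0;
  for (const mesh of meshes) {
    floatCount += mesh.positions.length;
    indexCount += mesh.indices.length;
  }
  if (floatCount / 3 > 65536) {
    throw new Error("Too many vertices for 16-bit indices");
  }

  const positions = new Float32Array(floatCount);
  const indices = new Uint16Array(indexCount);

  let positionOffset = 0;
  let indexOffset = 0;
  for (const mesh of meshes) {
    // Indices of this mesh must point past the vertices already copied.
    const vertexOffset = positionOffset / 3;
    positions.set(mesh.positions, positionOffset);
    for (let i = 0; i < mesh.indices.length; i++) {
      indices[indexOffset + i] = mesh.indices[i] + vertexOffset;
    }
    positionOffset += mesh.positions.length;
    indexOffset += mesh.indices.length;
  }

  return { positions, indices };
}

const triangle = {
  positions: new Float32Array([0, 0, 0, 1, 0, 0, 0, 1, 0]),
  indices: new Uint16Array([0, 1, 2]),
};
const shifted = {
  positions: new Float32Array([2, 0, 0, 3, 0, 0, 2, 1, 0]),
  indices: new Uint16Array([0, 1, 2]),
};
const merged = mergeMeshes([triangle, shifted]);
```

Merging trades flexibility for fewer draw calls: the merged mesh can no longer be culled or transformed per part, so it suits static geometry best.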
170 | 
171 | If you are CPU bound by other parts of your application there is likely some other issue in your codebase.
172 | 
173 | ### GPU profiling
174 | 
175 | In order to know if you are GPU bound you must profile the GPU. Most of the time it makes sense to keep an eye on real-time GPU timing queries and when in doubt capture a GPU trace.
176 | 
177 | The GPU has many processing units working in parallel and it is common to be bound by different units for different parts of the frame. Because of this, it makes sense to find out where the GPU cost is going when looking for the GPU bottleneck. Common reasons for being GPU bound are a draw call heavy application, complex materials, dense triangle meshes and a large view frustum.
178 | 
179 | In order to know if you are pixel bound one can try varying the viewport resolution. If you see a measurable performance change it likely means that you are bound by something pixel related. Usually it is either texture memory bandwidth (reading and writing) bound or math bound ([ALU](https://en.wikipedia.org/wiki/Arithmetic_logic_unit)), but in rare cases, some specific units are saturated (e.g. `MRT`). If you can lower the memory, or math, on the relevant passes and see a performance difference you know it was bound by the memory bandwidth (or the ALU units).
180 | 
181 | In general you should look at using the following optimisation techniques:
182 | 
183 | - Do as much as you can in the vertex shader rather than in the fragment shader. Per rendering pass, fragment shaders usually run many more times than vertex shaders, so any calculation that can be done per vertex and then interpolated across fragments is a performance benefit (this interpolation is done "automagically" for you by the fixed-function rasterization phase of the WebGL pipeline).
184 | - Reduce the number of WebGL state changes by caching and mirroring the state in JavaScript. By diffing the state in JavaScript you can drastically reduce the number of expensive WebGL state changes.
185 | - Avoid anything that requires synchronising the CPU and GPU as it is potentially very slow. Cache the results of WebGL getter calls such as `getParameter` and `getUniformLocation` in JavaScript variables, and only issue a state-setting call after checking the mirrored WebGL state in JavaScript to make sure the value actually needs to change.
186 | - Cull any geometry that won't be visible (octree, BVH, kd-tree, quadtree) through occlusion culling, viewport culling or backface culling.
187 | - Group mesh submissions with the same state in order to prevent unnecessary WebGL state switches.
188 | - Limit the size of the canvas and do not directly use the device's pixel ratio but rather artificially limit it to a point where visual fidelity is acceptable yet performant.
189 | - Turn off the native anti-aliasing option on the canvas element, instead anti-alias once during postprocessing using FXAA, SMAA or similar in the fragment shader. The native implementation is unreliable and very naive.
190 | - Avoid using the native screen resolution derived from `window.devicePixelRatio` for your fullscreen canvas as some phones can have a very high density display. Some effects and scenes can often get away with rendering at a lower resolution.
191 | - Disable the alpha channel and the preserving of the drawing buffer (`alpha: false` and `preserveDrawingBuffer: false`) when creating the WebGL context.
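
The state mirroring mentioned in the list above could look roughly like this. It is a sketch that covers only capabilities and the active program; `createStateCache` is a hypothetical helper, and `gl` is assumed to be a `WebGLRenderingContext` (or any object exposing `enable`, `disable` and `useProgram`).

```javascript
// Mirror a small part of the WebGL state in JavaScript and skip calls that
// would not change anything. Each method returns true when a real WebGL
// call was issued and false when the call was skipped.
function createStateCache(gl) {
  const capabilities = new Map(); // capability enum -> boolean
  let currentProgram = null;

  return {
    setCapability(capability, enabled) {
      // The first call for a capability always goes through because the
      // mirrored value is still undefined.
      if (capabilities.get(capability) === enabled) return false;
      capabilities.set(capability, enabled);
      if (enabled) {
        gl.enable(capability);
      } else {
        gl.disable(capability);
      }
      return true;
    },
    useProgram(program) {
      if (currentProgram === program) return false;
      currentProgram = program;
      gl.useProgram(program);
      return true;
    },
  };
}
```

A cache like this only stays correct if every state change goes through it, so code that calls `gl.enable` or `gl.useProgram` directly (including third-party libraries) would make the mirror stale.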
192 | 
193 | If you are **fragment shader** bound you can look at the following optimisation techniques:
194 | 
195 | - Avoid having to resize textures to be a power of two during runtime. This is unnecessary in WebGL2 but it is still highly recommended to use power of two textures for a more efficient memory layout. NPOT textures may be handled noticeably slower and can cause black edging artifacts through mipmap interpolation.
196 | - Avoid using too many uniforms, use `Uniform Buffer Objects` and `Uniform Blocks` where possible (WebGL2).
197 | - Reduce the amount of stationary and dynamic lights in your scene. Pre-bake where possible.
198 | - Try to combine lights that have a similar origin.
199 | - Limit the attenuation radius and light cone angle to the minimum needed.
200 | - Use an early Z-pass in order to determine what parts of the scene are actually visible. It allows you to avoid expensive shading operations on pixels that do not contribute to the final image.
201 | - Limit the amount of post-processing steps.
202 | - Disable shadow casting where possible, either per object or per light.
203 | - Reduce the shadow map resolution.
204 | - Make use of the multi-render target extension `WEBGL_draw_buffers` when using deferred rendering. Be aware that this extension is not available everywhere where `WebGL` is available. It fortunately is a part of the `WebGL2` core spec making it available everywhere where the `WebGL2` spec is implemented correctly.
205 | - Materials with fewer shader instructions and texture lookups run faster.
206 | - Never disable mipmaps if the texture can be seen in a smaller scale to avoid slowdowns due to texture cache misses.
207 | - Make use of GPU compressed textures and lower bitrate texture formats in order to reduce the in-memory GPU footprint.
208 | 
209 | Often the shadow map rendering is bound by the vertex shader, except if you have very large areas affected by shadows or use translucent materials. Shadow map rendering cost scales with the number of dynamic lights in the scene, number of shadow casting objects in the light frustum and the number of cascades. This is a very common bottleneck.
210 | 
211 | Highly tessellated meshes, where the wireframe appears as a solid color, can suffer from poor quad utilization. This is because GPUs process triangles in 2x2 pixel blocks and reject pixels outside of the triangle a bit later. This is needed for mip-map computations. For larger triangles, this is not a problem, but if triangles are small or very lengthy the performance can suffer as many pixels are processed but few actually contribute to the image.
212 | 
213 | If you are **vertex shader** bound you can look at the following optimisation techniques:
214 | 
215 | - Verify that the vertex count of your models is reasonable for real-time usage.
216 | - Avoid using too many vertices (use LOD meshes).
217 | - Verify your LOD is set up with aggressive transition ranges. Each LOD level should reduce the vertex count by at least 2x. To optimise this, check the wireframe: solid colors indicate a problem.
218 | - Avoid using complex world position offsets (morph targets, vertex displacement using textures with poor mip-mapping).
219 | - Avoid tessellation if possible (if necessary be sure to limit the tessellation factor to a reasonable amount). Pre-tessellated meshes are usually faster.
220 | - Very large meshes can be split up into multiple parts for better view culling.
221 | - Avoid using too many vertex attributes, use `Vertex Array Objects` where possible (almost always available in WebGL, always available in WebGL2).
222 | - Billboards, imposter meshes or skybox textures can be used to efficiently fake detailed geometry when a mesh is far in the distance.
223 | 
224 | In `Chrome` there are various ways to profile the GPU:
225 | 
226 | For tracing an individual WebGL frame in depth without setting up an external debugger I highly recommend using the `Chrome` extension [Spector.js](https://spector.babylonjs.com/) made by the Babylon team at Microsoft. It allows for exporting and importing stack traces generated by Spector, captures the full WebGL state at each step and allows for easy exploration of the vertex and fragment shader. On top of that the project is free, open source and maintained by a professional team instead of an individual.
227 | 
228 | ![Spector.js state](/docs/SPECTORJS_STATE.png?raw=true)
229 | 
230 | ![Spector.js shader](/docs/SPECTORJS_SHADER.png?raw=true)
231 | 
232 | The main advantage of this approach is that it does **not** require the disabling of the GPU sandbox, like some external debuggers do, and avoids the need of having to install and learn a complex debugger. I would highly recommend this method over using an external debugger if you use Mac OS or if you are not familiar with an alternative external debugger like [RenderDoc (Windows, Linux)](https://renderdoc.org/docs/index.html) or [APITrace (Windows, Linux, Mac (limited support))](https://github.com/apitrace/apitrace). Instructions on how to debug WebGL using APITrace can be found [here](https://github.com/apitrace/apitrace/wiki/Google-Chrome-Browser).
233 | 
234 | ![Renderdoc drawcall](/docs/RENDERDOC_DRAWCALL.png?raw=true)
235 | 
236 | For capturing traces over time one can use the advanced tracing capabilities like [MemoryInfra](https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/README.md) available in `chrome://tracing`.
237 | A good example for how to understand and work with the captures of it can be found [here](https://www.html5rocks.com/en/tutorials/games/abouttracing/).
238 | 
239 | For capturing GPU traces using `chrome://tracing` I recommend using the `rendering` preset.
240 | 
241 | ![Chrome tracing rendering toggle](/docs/CHROME_TRACING_RENDERING_TOGGLE.png?raw=true)
242 | 
243 | ![Chrome tracing rendering trace](/docs/CHROME_TRACING_RENDERING_TRACE.png?raw=true)
244 | 
245 | There are also various ways you can integrate profiling into your application.
246 | 
247 | One can use the WebGL extension `EXT_disjoint_timer_query` to measure the duration of OpenGL commands submitted to the graphics processor without stalling the rendering pipeline. It makes most sense if this extension is integrated into the WebGL engine that you are using. A good example of a WebGL framework with an integrated profiler is [Luma.gl](https://github.com/uber/luma.gl).
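
A minimal sketch of such a timer, using the WebGL1 variant of the extension, is shown below. The helper names (`beginGpuTimer`, `endGpuTimer`, `readGpuTimer`) are hypothetical; `ext` is assumed to be the object returned by `gl.getExtension("EXT_disjoint_timer_query")`, which may be `null` when the extension is unavailable. WebGL2 exposes a different variant (`EXT_disjoint_timer_query_webgl2`) with query methods on the context itself, which this sketch does not cover.

```javascript
// Start measuring the GPU time of the commands issued after this call.
function beginGpuTimer(ext) {
  const query = ext.createQueryEXT();
  ext.beginQueryEXT(ext.TIME_ELAPSED_EXT, query);
  return query;
}

// Stop measuring. The result only becomes available some frames later.
function endGpuTimer(ext) {
  ext.endQueryEXT(ext.TIME_ELAPSED_EXT);
}

// Poll on a later frame without stalling the pipeline.
// Returns { ready: false, ms: null } while the result is pending and
// { ready: true, ms } once finished, where `ms` is null if the GPU reported
// a disjoint event (timings unreliable) and milliseconds otherwise.
function readGpuTimer(gl, ext, query) {
  const available = ext.getQueryObjectEXT(query, ext.QUERY_RESULT_AVAILABLE_EXT);
  if (!available) {
    return { ready: false, ms: null };
  }
  const disjoint = gl.getParameter(ext.GPU_DISJOINT_EXT);
  const ms = disjoint
    ? null
    : ext.getQueryObjectEXT(query, ext.QUERY_RESULT_EXT) / 1e6; // ns -> ms
  ext.deleteQueryEXT(query);
  return { ready: true, ms: ms };
}
```

In a render loop one would call `beginGpuTimer` and `endGpuTimer` around the draw calls of interest and call `readGpuTimer` once per frame until it reports `ready`.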
248 | 
249 | One can also wrap the `WebGLRenderingContext` with a debugging wrapper like the [one provided by the Khronos Group](https://www.npmjs.com/package/webgl-debug) to catch invalid WebGL operations and give the errors a bit more context. This comes with a large overhead as every single instruction is traced (and optionally logged to the console), so make sure to include the dependency only in development. I have rarely found this method to be useful as it does not capture a single frame clearly and logs everything with the same priority to the console.
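
A minimal sketch of keeping the wrapper out of production builds might look as follows. The helper `maybeWrapContext` and its `isDevelopment` parameter are hypothetical, and `debugUtils` is assumed to be the Khronos `WebGLDebugUtils` object that the package provides; how you import it depends on your bundler setup.

```javascript
// Sketch: wrap a WebGL context with the Khronos debug helper only in development.
// `maybeWrapContext` is a hypothetical helper; `debugUtils` is assumed to be the
// `WebGLDebugUtils` object provided by the webgl-debug package.
function maybeWrapContext(gl, debugUtils, isDevelopment) {
  // Return the raw context untouched so production pays no tracing overhead.
  if (!isDevelopment || !debugUtils) return gl;

  // Invoked by the wrapper when a WebGL call leaves an error behind.
  function onError(err, funcName, args) {
    const argList = debugUtils.glFunctionArgsToString(funcName, args);
    throw new Error(`${debugUtils.glEnumToString(err)} in gl.${funcName}(${argList})`);
  }

  return debugUtils.makeDebugContext(gl, onError);
}
```

Throwing from the error callback turns a silent `gl.getError()` flag into a stack trace pointing at the offending call, which is usually more useful than logging every call to the console.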
250 | 
251 | ## Installation
252 | 
253 | To automatically download and set up Chrome Canary on macOS using Homebrew, you can use:
254 | 
255 | ```sh
256 | $ ./scripts/setup_macos.sh
257 | ```
258 | 
259 | To install `V8` and the `D8` shell, I recommend following the excellent guide by [Kevin Ennis](https://gist.github.com/kevincennis/0cd2138c78a07412ef21).
260 | 
261 | ## Usage
262 | 
263 | In order to [properly profile your application](#three-snapshot-technique), the browser needs to expose more of its internals than usual. One can do this by launching the browser with a set of command line flags.
264 | 
265 | Included in this repo one can find a [script](scripts/run_macos.sh) that launches the latest version of Chrome Canary with a temporary user profile in an incognito window without any extensions installed.
266 | 
267 | The script launches Chrome with VSync disabled for an unlocked framerate (useful for seeing how much of your frame budget is left), more precise memory tracking (necessary for properly tracing your memory usage), a remote debugging port, and optimisation and de-optimisation tracing (both logged to files).
268 | 
269 | It is currently only configured for macOS, but I welcome PRs to add support for [Windows](https://github.com/TimvanScherpenzeel/profiling-research/issues/1) and [Linux](https://github.com/TimvanScherpenzeel/profiling-research/issues/2).
270 | 
271 | You can launch the script as follows:
272 | 
273 | ```sh
274 | $ ./scripts/run_macos.sh <URL>
275 | ```
276 | 
277 | ## Resources and references
278 | 
279 | - [Optimising WebGL Applications with Don Olmstead](https://www.youtube.com/watch?v=QVvHtWePQdA)
280 | - [CPU profiling in Unreal Engine](https://docs.unrealengine.com/en-us/Engine/Performance/CPU)
281 | - [GPU profiling in Unreal Engine](https://docs.unrealengine.com/en-us/Engine/Performance/GPU)
282 | - [Performance Guidelines for Artists and Designers](https://docs.unrealengine.com/en-us/Engine/Performance/Guidelines)
283 | - [The Breakpoint Ep. 8: Memory Profiling with Chrome DevTools](https://www.youtube.com/watch?v=L3ugr9BJqIs)
284 | - [V8 Garbage Collector](https://github.com/thlorenz/v8-perf/blob/master/gc.md#heap-organization-in-detail)
285 | - [Google I/O 2013 - Accelerating Oz with V8: Follow the Yellow Brick Road to JavaScript Performance](https://www.youtube.com/watch?v=VhpdsjBUS3g)
286 | - [Franziska Hinkelmann - Performance Profiling for V8 / Script17 (Slides)](https://fhinkel.rocks/PerformanceProfiling/assets/player/KeynoteDHTMLPlayer.html#3)
287 | - [Franziska Hinkelmann - Performance Profiling for V8 / Script17 (Video)](https://www.youtube.com/watch?v=j6LfSlg8Fig)
288 | - [Franziska Hinkelmann - Performance Profiling for V8 / Script17 (Files)](https://github.com/fhinkel/PerformanceProfiling)
289 | - [V8 profile documentation](https://v8.dev/docs/profile)
290 | - [V8 performance notes and resources](https://github.com/thlorenz/v8-perf)
291 | - [Turbolizer](https://github.com/thlorenz/turbolizer)
292 | - [The Trace Event Profiling Tool (chrome://tracing)](https://www.chromium.org/developers/how-tos/trace-event-profiling-tool)
293 | - [Ignition - an interpreter for V8](https://www.youtube.com/watch?v=r5OWCtuKiAk)
294 | - [A crash course in Just In Time (JIT) compilers](https://hacks.mozilla.org/2017/02/a-crash-course-in-just-in-time-jit-compilers/)
295 | - [JavaScript engine fundamentals: Shapes and Inline Caches](https://mathiasbynens.be/notes/shapes-ics)
296 | - [JavaScript Engines: The Good Parts™ - Mathias Bynens & Benedikt Meurer - JSConf EU 2018](https://www.youtube.com/watch?v=5nmpokoRaZI)
297 | - [Understanding V8’s Bytecode](https://medium.com/dailyjs/understanding-v8s-bytecode-317d46c94775)
298 | - [Visualize JavaScript AST's](https://resources.jointjs.com/demos/javascript-ast)
299 | - [Garbage collection in V8, an illustrated guide](https://medium.com/@_lrlna/garbage-collection-in-v8-an-illustrated-guide-d24a952ee3b8)
300 | - [Bailout reasons in V8 (Crankshaft)](https://github.com/vhf/v8-bailout-reasons)
301 | - [Bailout reasons in V8 (Turbofan)](https://chromium.googlesource.com/v8/v8/+/d3f074b23195a2426d14298dca30c4cf9183f203/src/bailout-reason.h)
302 | - [Scott Meyers: Cpu Caches and Why You Care](https://www.youtube.com/watch?v=WDIkqP4JbkE)
303 | 


--------------------------------------------------------------------------------
/docs/CHROME_TRACING_RENDERING_TOGGLE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/CHROME_TRACING_RENDERING_TOGGLE.png


--------------------------------------------------------------------------------
/docs/CHROME_TRACING_RENDERING_TRACE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/CHROME_TRACING_RENDERING_TRACE.png


--------------------------------------------------------------------------------
/docs/CPU_LIVE_USAGE_PROFILER.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/CPU_LIVE_USAGE_PROFILER.png


--------------------------------------------------------------------------------
/docs/CPU_TRACE_PROFILER.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/CPU_TRACE_PROFILER.png


--------------------------------------------------------------------------------
/docs/GOOGLE_THREE_SNAPSHOT_TECHNIQUE.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/GOOGLE_THREE_SNAPSHOT_TECHNIQUE.jpg


--------------------------------------------------------------------------------
/docs/HEAP_SNAPSHOT_BUTTON.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/HEAP_SNAPSHOT_BUTTON.png


--------------------------------------------------------------------------------
/docs/HEAP_SNAPSHOT_TRACE_DETACHED.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/HEAP_SNAPSHOT_TRACE_DETACHED.png


--------------------------------------------------------------------------------
/docs/RENDERDOC_DRAWCALL.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/RENDERDOC_DRAWCALL.png


--------------------------------------------------------------------------------
/docs/SPECTORJS_SHADER.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/SPECTORJS_SHADER.png


--------------------------------------------------------------------------------
/docs/SPECTORJS_STATE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/SPECTORJS_STATE.png


--------------------------------------------------------------------------------
/docs/V8_AVAILABLE_FLAGS.md:
--------------------------------------------------------------------------------
  1 | # Available flags in V8
  2 | 
  3 | ```sh
  4 | # https://github.com/v8/v8/blob/master/src/flag-definitions.h
  5 | 
  6 | $ /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --js-flags="--help" > flags.txt
  7 | ```
  8 | 
  9 | ```txt
 10 | SSE3=1 SSSE3=1 SSE4_1=1 SAHF=1 AVX=1 FMA3=1 BMI1=1 BMI2=1 LZCNT=1 POPCNT=1 ATOM=0
 11 | Synopsis:
 12 |   shell [options] [--shell] [<file>...]
 13 |   d8 [options] [-e <string>] [--shell] [[--module] <file>...]
 14 | 
 15 |   -e        execute a string in V8
 16 |   --shell   run an interactive JavaScript shell
 17 |   --module  execute a file as a JavaScript module
 18 | 
 19 | Note: the --module option is implicitly enabled for *.mjs files.
 20 | 
 21 | Options:
 22 |   --experimental-extras (enable code compiled in via v8_experimental_extra_library_files)
 23 |         type: bool  default: false
 24 |   --use-strict (enforce strict mode)
 25 |         type: bool  default: false
 26 |   --es-staging (enable test-worthy harmony features (for internal use only))
 27 |         type: bool  default: false
 28 |   --harmony (enable all completed harmony features)
 29 |         type: bool  default: false
 30 |   --harmony-shipping (enable all shipped harmony features)
 31 |         type: bool  default: true
 32 |   --harmony-do-expressions (enable "harmony do-expressions" (in progress))
 33 |         type: bool  default: false
 34 |   --harmony-class-fields (enable "harmony fields in class literals" (in progress))
 35 |         type: bool  default: false
 36 |   --harmony-await-optimization (enable "harmony await taking 1 tick" (in progress))
 37 |         type: bool  default: true
 38 |   --harmony-regexp-sequence (enable "RegExp Unicode sequence properties" (in progress))
 39 |         type: bool  default: false
 40 |   --harmony-weak-refs (enable "harmony weak references" (in progress))
 41 |         type: bool  default: false
 42 |   --harmony-locale (enable "Intl.Locale" (in progress))
 43 |         type: bool  default: false
 44 |   --harmony-intl-list-format (enable "Intl.ListFormat" (in progress))
 45 |         type: bool  default: false
 46 |   --harmony-intl-segmenter (enable "Intl.Segmenter" (in progress))
 47 |         type: bool  default: false
 48 |   --harmony-private-fields (enable "harmony private fields in class literals")
 49 |         type: bool  default: false
 50 |   --harmony-numeric-separator (enable "harmony numeric separator between digits")
 51 |         type: bool  default: false
 52 |   --harmony-string-matchall (enable "harmony String.prototype.matchAll")
 53 |         type: bool  default: false
 54 |   --harmony-namespace-exports (enable "harmony namespace exports (export * as foo from 'bar')")
 55 |         type: bool  default: true
 56 |   --harmony-sharedarraybuffer (enable "harmony sharedarraybuffer")
 57 |         type: bool  default: true
 58 |   --harmony-import-meta (enable "harmony import.meta property")
 59 |         type: bool  default: true
 60 |   --harmony-dynamic-import (enable "harmony dynamic import")
 61 |         type: bool  default: true
 62 |   --harmony-array-prototype-values (enable "harmony Array.prototype.values")
 63 |         type: bool  default: true
 64 |   --harmony-array-flat (enable "harmony Array.prototype.{flat,flatMap}")
 65 |         type: bool  default: true
 66 |   --harmony-symbol-description (enable "harmony Symbol.prototype.description")
 67 |         type: bool  default: true
 68 |   --harmony-global (enable "harmony global")
 69 |         type: bool  default: true
 70 |   --harmony-json-stringify (enable "well-formed JSON.stringify")
 71 |         type: bool  default: true
 72 |   --harmony-public-fields (enable "harmony public instance fields in class literals")
 73 |         type: bool  default: true
 74 |   --harmony-static-fields (enable "harmony static fields in class literals")
 75 |         type: bool  default: true
 76 |   --harmony-intl-relative-time-format (enable "Intl.RelativeTimeFormat")
 77 |         type: bool  default: true
 78 |   --icu-timezone-data (get information about timezones from ICU)
 79 |         type: bool  default: true
 80 |   --future (Implies all staged features that we want to ship in the not-too-far future)
 81 |         type: bool  default: false
 82 |   --allocation-site-pretenuring (pretenure with allocation sites)
 83 |         type: bool  default: true
 84 |   --page-promotion (promote pages based on utilization)
 85 |         type: bool  default: true
 86 |   --page-promotion-threshold (min percentage of live bytes on a page to enable fast evacuation)
 87 |         type: int  default: 70
 88 |   --trace-pretenuring (trace pretenuring decisions of HAllocate instructions)
 89 |         type: bool  default: false
 90 |   --trace-pretenuring-statistics (trace allocation site pretenuring statistics)
 91 |         type: bool  default: false
 92 |   --track-fields (track fields with only smi values)
 93 |         type: bool  default: true
 94 |   --track-double-fields (track fields with double values)
 95 |         type: bool  default: true
 96 |   --track-heap-object-fields (track fields with heap values)
 97 |         type: bool  default: true
 98 |   --track-computed-fields (track computed boilerplate fields)
 99 |         type: bool  default: true
100 |   --track-field-types (track field types)
101 |         type: bool  default: true
102 |   --trace-block-coverage (trace collected block coverage information)
103 |         type: bool  default: false
104 |   --feedback-normalization (feed back normalization to constructors)
105 |         type: bool  default: false
106 |   --enable-one-shot-optimization (Enable size optimizations for the code that will only be executed once)
107 |         type: bool  default: true
108 |   --unbox-double-arrays (automatically unbox arrays of doubles)
109 |         type: bool  default: true
110 |   --stress-delay-tasks (delay execution of tasks by 0-100ms randomly (based on --random-seed))
111 |         type: bool  default: false
112 |   --ignition-elide-noneffectful-bytecodes (elide bytecodes which won't have any external effect)
113 |         type: bool  default: true
114 |   --ignition-reo (use ignition register equivalence optimizer)
115 |         type: bool  default: true
116 |   --ignition-filter-expression-positions (filter expression positions before the bytecode pipeline)
117 |         type: bool  default: true
118 |   --ignition-share-named-property-feedback (share feedback slots when loading the same named property from the same object)
119 |         type: bool  default: true
120 |   --print-bytecode (print bytecode generated by ignition interpreter)
121 |         type: bool  default: false
122 |   --print-bytecode-filter (filter for selecting which functions to print bytecode)
123 |         type: string  default: *
124 |   --trace-ignition-codegen (trace the codegen of ignition interpreter bytecode handlers)
125 |         type: bool  default: false
126 |   --trace-ignition-dispatches (traces the dispatches to bytecode handlers by the ignition interpreter)
127 |         type: bool  default: false
128 |   --trace-ignition-dispatches-output-file (the file to which the bytecode handler dispatch table is written (by default, the table is not written to a file))
129 |         type: string  default: nullptr
130 |   --fast-math (faster (but maybe less accurate) math functions)
131 |         type: bool  default: true
132 |   --trace-track-allocation-sites (trace the tracking of allocation sites)
133 |         type: bool  default: false
134 |   --trace-migration (trace object migration)
135 |         type: bool  default: false
136 |   --trace-generalization (trace map generalization)
137 |         type: bool  default: false
138 |   --concurrent-recompilation (optimizing hot functions asynchronously on a separate thread)
139 |         type: bool  default: true
140 |   --trace-concurrent-recompilation (track concurrent recompilation)
141 |         type: bool  default: false
142 |   --concurrent-recompilation-queue-length (the length of the concurrent compilation queue)
143 |         type: int  default: 8
144 |   --concurrent-recompilation-delay (artificial compilation delay in ms)
145 |         type: int  default: 0
146 |   --block-concurrent-recompilation (block queued jobs until released)
147 |         type: bool  default: false
148 |   --concurrent-inlining (run optimizing compiler's inlining phase on a separate thread)
149 |         type: bool  default: false
150 |   --strict-heap-broker (fail on incomplete serialization)
151 |         type: bool  default: true
152 |   --trace-heap-broker (trace the heap broker)
153 |         type: bool  default: false
154 |   --stress-runs (number of stress runs)
155 |         type: int  default: 0
156 |   --deopt-every-n-times (deoptimize every n times a deopt point is passed)
157 |         type: int  default: 0
158 |   --print-deopt-stress (print number of possible deopt points)
159 |         type: bool  default: false
160 |   --turbo-sp-frame-access (use stack pointer-relative access to frame wherever possible)
161 |         type: bool  default: false
162 |   --turbo-preprocess-ranges (run pre-register allocation heuristics)
163 |         type: bool  default: true
164 |   --turbo-filter (optimization filter for TurboFan compiler)
165 |         type: string  default: *
166 |   --trace-turbo (trace generated TurboFan IR)
167 |         type: bool  default: false
168 |   --trace-turbo-path (directory to dump generated TurboFan IR to)
169 |         type: string  default: nullptr
170 |   --trace-turbo-filter (filter for tracing turbofan compilation)
171 |         type: string  default: *
172 |   --trace-turbo-graph (trace generated TurboFan graphs)
173 |         type: bool  default: false
174 |   --trace-turbo-scheduled (trace TurboFan IR with schedule)
175 |         type: bool  default: false
176 |   --trace-turbo-cfg-file (trace turbo cfg graph (for C1 visualizer) to a given file name)
177 |         type: string  default: nullptr
178 |   --trace-turbo-types (trace TurboFan's types)
179 |         type: bool  default: true
180 |   --trace-turbo-scheduler (trace TurboFan's scheduler)
181 |         type: bool  default: false
182 |   --trace-turbo-reduction (trace TurboFan's various reducers)
183 |         type: bool  default: false
184 |   --trace-turbo-trimming (trace TurboFan's graph trimmer)
185 |         type: bool  default: false
186 |   --trace-turbo-jt (trace TurboFan's jump threading)
187 |         type: bool  default: false
188 |   --trace-turbo-ceq (trace TurboFan's control equivalence)
189 |         type: bool  default: false
190 |   --trace-turbo-loop (trace TurboFan's loop optimizations)
191 |         type: bool  default: false
192 |   --trace-alloc (trace register allocator)
193 |         type: bool  default: false
194 |   --trace-all-uses (trace all use positions)
195 |         type: bool  default: false
196 |   --trace-representation (trace representation types)
197 |         type: bool  default: false
198 |   --turbo-verify (verify TurboFan graphs at each phase)
199 |         type: bool  default: false
200 |   --turbo-verify-machine-graph (verify TurboFan machine graph before instruction selection)
201 |         type: string  default: nullptr
202 |   --trace-verify-csa (trace code stubs verification)
203 |         type: bool  default: false
204 |   --csa-trap-on-node (trigger break point when a node with given id is created in given stub. The format is: StubName,NodeId)
205 |         type: string  default: nullptr
206 |   --turbo-stats (print TurboFan statistics)
207 |         type: bool  default: false
208 |   --turbo-stats-nvp (print TurboFan statistics in machine-readable format)
209 |         type: bool  default: false
210 |   --turbo-stats-wasm (print TurboFan statistics of wasm compilations)
211 |         type: bool  default: false
212 |   --turbo-splitting (split nodes during scheduling in TurboFan)
213 |         type: bool  default: true
214 |   --function-context-specialization (enable function context specialization in TurboFan)
215 |         type: bool  default: false
216 |   --turbo-inlining (enable inlining in TurboFan)
217 |         type: bool  default: true
218 |   --max-inlined-bytecode-size (maximum size of bytecode for a single inlining)
219 |         type: int  default: 500
220 |   --max-inlined-bytecode-size-cumulative (maximum cumulative size of bytecode considered for inlining)
221 |         type: int  default: 1000
222 |   --max-inlined-bytecode-size-absolute (maximum cumulative size of bytecode considered for inlining)
223 |         type: int  default: 5000
224 |   --reserve-inline-budget-scale-factor (maximum cumulative size of bytecode considered for inlining)
225 |         type: float  default: 1.2
226 |   --max-inlined-bytecode-size-small (maximum size of bytecode considered for small function inlining)
227 |         type: int  default: 30
228 |   --min-inlining-frequency (minimum frequency for inlining)
229 |         type: float  default: 0.15
230 |   --polymorphic-inlining (polymorphic inlining)
231 |         type: bool  default: true
232 |   --stress-inline (set high thresholds for inlining to inline as much as possible)
233 |         type: bool  default: false
234 |   --trace-turbo-inlining (trace TurboFan inlining)
235 |         type: bool  default: false
236 |   --inline-accessors (inline JavaScript accessors)
237 |         type: bool  default: true
238 |   --inline-into-try (inline into try blocks)
239 |         type: bool  default: true
240 |   --turbo-inline-array-builtins (inline array builtins in TurboFan code)
241 |         type: bool  default: true
242 |   --use-osr (use on-stack replacement)
243 |         type: bool  default: true
244 |   --trace-osr (trace on-stack replacement)
245 |         type: bool  default: false
246 |   --analyze-environment-liveness (analyze liveness of environment slots and zap dead values)
247 |         type: bool  default: true
248 |   --trace-environment-liveness (trace liveness of local variable slots)
249 |         type: bool  default: false
250 |   --turbo-load-elimination (enable load elimination in TurboFan)
251 |         type: bool  default: true
252 |   --trace-turbo-load-elimination (trace TurboFan load elimination)
253 |         type: bool  default: false
254 |   --turbo-profiling (enable profiling in TurboFan)
255 |         type: bool  default: false
256 |   --turbo-verify-allocation (verify register allocation in TurboFan)
257 |         type: bool  default: false
258 |   --turbo-move-optimization (optimize gap moves in TurboFan)
259 |         type: bool  default: true
260 |   --turbo-jt (enable jump threading in TurboFan)
261 |         type: bool  default: true
262 |   --turbo-loop-peeling (Turbofan loop peeling)
263 |         type: bool  default: true
264 |   --turbo-loop-variable (Turbofan loop variable optimization)
265 |         type: bool  default: true
266 |   --turbo-cf-optimization (optimize control flow in TurboFan)
267 |         type: bool  default: true
268 |   --turbo-escape (enable escape analysis)
269 |         type: bool  default: true
270 |   --turbo-allocation-folding (Turbofan allocation folding)
271 |         type: bool  default: true
272 |   --turbo-instruction-scheduling (enable instruction scheduling in TurboFan)
273 |         type: bool  default: false
274 |   --turbo-stress-instruction-scheduling (randomly schedule instructions to stress dependency tracking)
275 |         type: bool  default: false
276 |   --turbo-store-elimination (enable store-store elimination in TurboFan)
277 |         type: bool  default: true
278 |   --trace-store-elimination (trace store elimination)
279 |         type: bool  default: false
280 |   --turbo-rewrite-far-jumps (rewrite far to near jumps (ia32,x64))
281 |         type: bool  default: true
282 |   --experimental-inline-promise-constructor (inline the Promise constructor in TurboFan)
283 |         type: bool  default: true
284 |   --untrusted-code-mitigations (Enable mitigations for executing untrusted code)
285 |         type: bool  default: false
286 |   --expose-wasm (expose wasm interface to JavaScript)
287 |         type: bool  default: true
288 |   --assume-asmjs-origin (force wasm decoder to assume input is internal asm-wasm format)
289 |         type: bool  default: false
290 |   --wasm-disable-structured-cloning (disable wasm structured cloning)
291 |         type: bool  default: false
292 |   --wasm-num-compilation-tasks (number of parallel compilation tasks for wasm)
293 |         type: int  default: 10
294 |   --wasm-write-protect-code-memory (write protect code memory on the wasm native heap)
295 |         type: bool  default: false
296 |   --trace-wasm-serialization (trace serialization/deserialization)
297 |         type: bool  default: false
298 |   --wasm-async-compilation (enable actual asynchronous compilation for WebAssembly.compile)
299 |         type: bool  default: true
300 |   --wasm-test-streaming (use streaming compilation instead of async compilation for tests)
301 |         type: bool  default: false
302 |   --wasm-max-mem-pages (maximum number of 64KiB memory pages of a wasm instance)
303 |         type: uint  default: 32767
304 |   --wasm-max-table-size (maximum table size of a wasm instance)
305 |         type: uint  default: 10000000
306 |   --wasm-tier-up (enable wasm baseline compilation and tier up to the optimizing compiler)
307 |         type: bool  default: true
308 |   --trace-wasm-ast-start (start function for wasm AST trace (inclusive))
309 |         type: int  default: 0
310 |   --trace-wasm-ast-end (end function for wasm AST trace (exclusive))
311 |         type: int  default: 0
312 |   --liftoff (enable Liftoff, the baseline compiler for WebAssembly)
313 |         type: bool  default: true
314 |   --trace-wasm-memory (print all memory updates performed in wasm code)
315 |         type: bool  default: false
316 |   --wasm-tier-mask-for-testing (bitmask of functions to compile with TurboFan instead of Liftoff)
317 |         type: int  default: 0
318 |   --validate-asm (validate asm.js modules before compiling)
319 |         type: bool  default: true
320 |   --suppress-asm-messages (don't emit asm.js related messages (for golden file testing))
321 |         type: bool  default: false
322 |   --trace-asm-time (log asm.js timing info to the console)
323 |         type: bool  default: false
324 |   --trace-asm-scanner (log tokens encountered by asm.js scanner)
325 |         type: bool  default: false
326 |   --trace-asm-parser (verbose logging of asm.js parse failures)
327 |         type: bool  default: false
328 |   --stress-validate-asm (try to validate everything as asm.js)
329 |         type: bool  default: false
330 |   --dump-wasm-module-path (directory to dump wasm modules to)
331 |         type: string  default: nullptr
332 |   --experimental-wasm-mv (enable prototype multi-value support for wasm)
333 |         type: bool  default: false
334 |   --experimental-wasm-eh (enable prototype exception handling opcodes for wasm)
335 |         type: bool  default: false
336 |   --experimental-wasm-se (enable prototype sign extension opcodes for wasm)
337 |         type: bool  default: true
338 |   --experimental-wasm-sat-f2i-conversions (enable prototype saturating float conversion opcodes for wasm)
339 |         type: bool  default: false
340 |   --experimental-wasm-threads (enable prototype thread opcodes for wasm)
341 |         type: bool  default: false
342 |   --experimental-wasm-simd (enable prototype SIMD opcodes for wasm)
343 |         type: bool  default: false
344 |   --experimental-wasm-anyref (enable prototype anyref opcodes for wasm)
345 |         type: bool  default: false
346 |   --experimental-wasm-mut-global (enable prototype import/export mutable global support for wasm)
347 |         type: bool  default: true
348 |   --wasm-opt (enable wasm optimization)
349 |         type: bool  default: false
350 |   --wasm-no-bounds-checks (disable bounds checks (performance testing only))
351 |         type: bool  default: false
352 |   --wasm-no-stack-checks (disable stack checks (performance testing only))
353 |         type: bool  default: false
354 |   --wasm-shared-engine (shares one wasm engine between all isolates within a process)
355 |         type: bool  default: true
356 |   --wasm-shared-code (shares code underlying a wasm module when it is transferred)
357 |         type: bool  default: true
358 |   --wasm-trap-handler (use signal handlers to catch out of bounds memory access in wasm (currently Linux x86_64 only))
359 |         type: bool  default: false
360 |   --wasm-trap-handler-fallback (Use bounds checks if guarded memory is not available)
361 |         type: bool  default: false
362 |   --wasm-fuzzer-gen-test (Generate a test case when running a wasm fuzzer)
363 |         type: bool  default: false
364 |   --print-wasm-code (Print WebAssembly code)
365 |         type: bool  default: false
366 |   --wasm-interpret-all (Execute all wasm code in the wasm interpreter)
367 |         type: bool  default: false
368 |   --asm-wasm-lazy-compilation (enable lazy compilation for asm-wasm modules)
369 |         type: bool  default: true
370 |   --wasm-lazy-compilation (enable lazy compilation for all wasm modules)
371 |         type: bool  default: false
372 |   --frame-count (number of stack frames inspected by the profiler)
373 |         type: int  default: 1
374 |   --type-info-threshold (percentage of ICs that must have type info to allow optimization)
375 |         type: int  default: 25
376 |   --stress-sampling-allocation-profiler (Enables sampling allocation profiler with X as a sample interval)
377 |         type: int  default: 0
378 |   --min-semi-space-size (min size of a semi-space (in MBytes), the new space consists of two semi-spaces)
379 |         type: size_t  default: 0
380 |   --max-semi-space-size (max size of a semi-space (in MBytes), the new space consists of two semi-spaces)
381 |         type: size_t  default: 0
382 |   --semi-space-growth-factor (factor by which to grow the new space)
383 |         type: int  default: 2
384 |   --experimental-new-space-growth-heuristic (Grow the new space based on the percentage of survivors instead of their absolute value.)
385 |         type: bool  default: false
386 |   --max-old-space-size (max size of the old space (in Mbytes))
387 |         type: size_t  default: 0
388 |   --initial-old-space-size (initial old space size (in Mbytes))
389 |         type: size_t  default: 0
390 |   --gc-global (always perform global GCs)
391 |         type: bool  default: false
392 |   --random-gc-interval (Collect garbage after random(0, X) allocations. It overrides gc_interval.)
393 |         type: int  default: 0
394 |   --gc-interval (garbage collect after <n> allocations)
395 |         type: int  default: -1
396 |   --retain-maps-for-n-gc (keeps maps alive for <n> old space garbage collections)
397 |         type: int  default: 2
398 |   --trace-gc (print one trace line following each garbage collection)
399 |         type: bool  default: false
400 |   --trace-gc-nvp (print one detailed trace line in name=value format after each garbage collection)
401 |         type: bool  default: false
402 |   --trace-gc-ignore-scavenger (do not print trace line after scavenger collection)
403 |         type: bool  default: false
404 |   --trace-idle-notification (print one trace line following each idle notification)
405 |         type: bool  default: false
406 |   --trace-idle-notification-verbose (prints the heap state used by the idle notification)
407 |         type: bool  default: false
408 |   --trace-gc-verbose (print more details following each garbage collection)
409 |         type: bool  default: false
410 |   --trace-allocation-stack-interval (print stack trace after <n> free-list allocations)
411 |         type: int  default: -1
412 |   --trace-duplicate-threshold-kb (print duplicate objects in the heap if their size is more than given threshold)
413 |         type: int  default: 0
414 |   --trace-fragmentation (report fragmentation for old space)
415 |         type: bool  default: false
416 |   --trace-fragmentation-verbose (report fragmentation for old space (detailed))
417 |         type: bool  default: false
418 |   --trace-evacuation (report evacuation statistics)
419 |         type: bool  default: false
420 |   --trace-mutator-utilization (print mutator utilization, allocation speed, gc speed)
421 |         type: bool  default: false
422 |   --incremental-marking (use incremental marking)
423 |         type: bool  default: true
424 |   --incremental-marking-wrappers (use incremental marking for marking wrappers)
425 |         type: bool  default: true
426 |   --trace-unmapper (Trace the unmapping)
427 |         type: bool  default: false
428 |   --parallel-scavenge (parallel scavenge)
429 |         type: bool  default: true
430 |   --trace-parallel-scavenge (trace parallel scavenge)
431 |         type: bool  default: false
432 |   --write-protect-code-memory (write protect code memory)
433 |         type: bool  default: true
434 |   --concurrent-marking (use concurrent marking)
435 |         type: bool  default: true
436 |   --parallel-marking (use parallel marking in atomic pause)
437 |         type: bool  default: true
438 |   --ephemeron-fixpoint-iterations (number of fixpoint iterations it takes to switch to linear ephemeron algorithm)
439 |         type: int  default: 10
440 |   --trace-concurrent-marking (trace concurrent marking)
441 |         type: bool  default: false
442 |   --black-allocation (use black allocation)
443 |         type: bool  default: true
444 |   --concurrent-store-buffer (use concurrent store buffer processing)
445 |         type: bool  default: true
446 |   --concurrent-sweeping (use concurrent sweeping)
447 |         type: bool  default: true
448 |   --parallel-compaction (use parallel compaction)
449 |         type: bool  default: true
450 |   --parallel-pointer-update (use parallel pointer update during compaction)
451 |         type: bool  default: true
452 |   --detect-ineffective-gcs-near-heap-limit (trigger out-of-memory failure to avoid GC storm near heap limit)
453 |         type: bool  default: true
454 |   --trace-incremental-marking (trace progress of the incremental marking)
455 |         type: bool  default: false
456 |   --trace-stress-marking (trace stress marking progress)
457 |         type: bool  default: false
458 |   --trace-stress-scavenge (trace stress scavenge progress)
459 |         type: bool  default: false
460 |   --track-gc-object-stats (track object counts and memory usage)
461 |         type: bool  default: false
462 |   --trace-gc-object-stats (trace object counts and memory usage)
463 |         type: bool  default: false
464 |   --trace-zone-stats (trace zone memory usage)
465 |         type: bool  default: false
466 |   --track-retaining-path (enable support for tracking retaining path)
467 |         type: bool  default: false
468 |   --concurrent-array-buffer-freeing (free array buffer allocations on a background thread)
469 |         type: bool  default: true
470 |   --gc-stats (Used by tracing internally to enable gc statistics)
471 |         type: int  default: 0
472 |   --track-detached-contexts (track native contexts that are expected to be garbage collected)
473 |         type: bool  default: true
474 |   --trace-detached-contexts (trace native contexts that are expected to be garbage collected)
475 |         type: bool  default: false
476 |   --move-object-start (enable moving of object starts)
477 |         type: bool  default: true
478 |   --memory-reducer (use memory reducer)
479 |         type: bool  default: true
480 |   --heap-growing-percent (specifies heap growing factor as (1 + heap_growing_percent/100))
481 |         type: int  default: 0
482 |   --v8-os-page-size (override OS page size (in KBytes))
483 |         type: int  default: 0
484 |   --always-compact (Perform compaction on every full GC)
485 |         type: bool  default: false
486 |   --never-compact (Never perform compaction on full GC - testing only)
487 |         type: bool  default: false
488 |   --compact-code-space (Compact code space on full collections)
489 |         type: bool  default: true
490 |   --use-marking-progress-bar (Use a progress bar to scan large objects in increments when incremental marking is active.)
491 |         type: bool  default: true
492 |   --force-marking-deque-overflows (force overflows of marking deque by reducing its size to 64 words)
493 |         type: bool  default: false
494 |   --stress-compaction (stress the GC compactor to flush out bugs (implies --force_marking_deque_overflows))
495 |         type: bool  default: false
496 |   --stress-compaction-random (Stress GC compaction by selecting random percent of pages as evacuation candidates. It overrides stress_compaction.)
497 |         type: bool  default: false
498 |   --stress-incremental-marking (force incremental marking for small heaps and run it more often)
499 |         type: bool  default: false
500 |   --fuzzer-gc-analysis (prints number of allocations and enables analysis mode for gc fuzz testing, e.g. --stress-marking, --stress-scavenge)
501 |         type: bool  default: false
502 |   --stress-marking (force marking at random points between 0 and X (inclusive) percent of the regular marking start limit)
503 |         type: int  default: 0
504 |   --stress-scavenge (force scavenge at random points between 0 and X (inclusive) percent of the new space capacity)
505 |         type: int  default: 0
506 |   --disable-abortjs (disables AbortJS runtime function)
507 |         type: bool  default: false
508 |   --manual-evacuation-candidates-selection (Test mode only flag. It allows a unit test to select evacuation candidates pages (requires --stress_compaction).)
509 |         type: bool  default: false
510 |   --fast-promotion-new-space (fast promote new space on high survival rates)
511 |         type: bool  default: false
512 |   --clear-free-memory (initialize free memory with 0)
513 |         type: bool  default: false
514 |   --young-generation-large-objects (allocates large objects by default in the young generation large object space)
515 |         type: bool  default: false
516 |   --idle-time-scavenge (Perform scavenges in idle time.)
517 |         type: bool  default: true
518 |   --debug-code (generate extra code (assertions) for debugging)
519 |         type: bool  default: false
520 |   --code-comments (emit comments in code disassembly; for more readable source positions you should add --no-concurrent_recompilation)
521 |         type: bool  default: false
522 |   --enable-sse3 (enable use of SSE3 instructions if available)
523 |         type: bool  default: true
524 |   --enable-ssse3 (enable use of SSSE3 instructions if available)
525 |         type: bool  default: true
526 |   --enable-sse4-1 (enable use of SSE4.1 instructions if available)
527 |         type: bool  default: true
528 |   --enable-sahf (enable use of SAHF instruction if available (X64 only))
529 |         type: bool  default: true
530 |   --enable-avx (enable use of AVX instructions if available)
531 |         type: bool  default: true
532 |   --enable-fma3 (enable use of FMA3 instructions if available)
533 |         type: bool  default: true
534 |   --enable-bmi1 (enable use of BMI1 instructions if available)
535 |         type: bool  default: true
536 |   --enable-bmi2 (enable use of BMI2 instructions if available)
537 |         type: bool  default: true
538 |   --enable-lzcnt (enable use of LZCNT instruction if available)
539 |         type: bool  default: true
540 |   --enable-popcnt (enable use of POPCNT instruction if available)
541 |         type: bool  default: true
542 |   --arm-arch (generate instructions for the selected ARM architecture if available: armv6, armv7, armv7+sudiv or armv8)
543 |         type: string  default: armv8
544 |   --force-long-branches (force all emitted branches to be in long mode (MIPS/PPC only))
545 |         type: bool  default: false
546 |   --mcpu (enable optimization for specific cpu)
547 |         type: string  default: auto
548 |   --partial-constant-pool (enable use of partial constant pools (X64 only))
549 |         type: bool  default: true
550 |   --enable-armv7 (deprecated (use --arm_arch instead))
551 |         type: maybe_bool  default: unset
552 |   --enable-vfp3 (deprecated (use --arm_arch instead))
553 |         type: maybe_bool  default: unset
554 |   --enable-32dregs (deprecated (use --arm_arch instead))
555 |         type: maybe_bool  default: unset
556 |   --enable-neon (deprecated (use --arm_arch instead))
557 |         type: maybe_bool  default: unset
558 |   --enable-sudiv (deprecated (use --arm_arch instead))
559 |         type: maybe_bool  default: unset
560 |   --enable-armv8 (deprecated (use --arm_arch instead))
561 |         type: maybe_bool  default: unset
562 |   --enable-regexp-unaligned-accesses (enable unaligned accesses for the regexp engine)
563 |         type: bool  default: true
564 |   --script-streaming (enable parsing on background)
565 |         type: bool  default: true
566 |   --disable-old-api-accessors (Disable old-style API accessors whose setters trigger through the prototype chain)
567 |         type: bool  default: false
568 |   --expose-natives-as (expose natives in global object)
569 |         type: string  default: nullptr
570 |   --expose-free-buffer (expose freeBuffer extension)
571 |         type: bool  default: false
572 |   --expose-gc (expose gc extension)
573 |         type: bool  default: false
574 |   --expose-gc-as (expose gc extension under the specified name)
575 |         type: string  default: nullptr
576 |   --expose-externalize-string (expose externalize string extension)
577 |         type: bool  default: false
578 |   --expose-trigger-failure (expose trigger-failure extension)
579 |         type: bool  default: false
580 |   --stack-trace-limit (number of stack frames to capture)
581 |         type: int  default: 10
582 |   --builtins-in-stack-traces (show built-in functions in stack traces)
583 |         type: bool  default: false
584 |   --disallow-code-generation-from-strings (disallow eval and friends)
585 |         type: bool  default: false
586 |   --expose-async-hooks (expose async_hooks object)
587 |         type: bool  default: false
588 |   --allow-unsafe-function-constructor (allow invoking the function constructor without security checks)
589 |         type: bool  default: false
590 |   --force-slow-path (always take the slow path for builtins)
591 |         type: bool  default: false
592 |   --inline-new (use fast inline allocation)
593 |         type: bool  default: true
594 |   --trace (trace function calls)
595 |         type: bool  default: false
596 |   --lazy (use lazy compilation)
597 |         type: bool  default: true
598 |   --trace-opt (trace lazy optimization)
599 |         type: bool  default: false
600 |   --trace-opt-verbose (extra verbose compilation tracing)
601 |         type: bool  default: false
602 |   --trace-opt-stats (trace lazy optimization statistics)
603 |         type: bool  default: false
604 |   --trace-deopt (trace optimize function deoptimization)
605 |         type: bool  default: false
606 |   --trace-file-names (include file names in trace-opt/trace-deopt output)
607 |         type: bool  default: false
608 |   --trace-interrupts (trace interrupts when they are handled)
609 |         type: bool  default: false
610 |   --always-opt (always try to optimize functions)
611 |         type: bool  default: false
612 |   --always-osr (always try to OSR functions)
613 |         type: bool  default: false
614 |   --prepare-always-opt (prepare for turning on always opt)
615 |         type: bool  default: false
616 |   --trace-serializer (print code serializer trace)
617 |         type: bool  default: false
618 |   --compilation-cache (enable compilation cache)
619 |         type: bool  default: true
620 |   --cache-prototype-transitions (cache prototype transitions)
621 |         type: bool  default: true
622 |   --parallel-compile-tasks (enable parallel compile tasks)
623 |         type: bool  default: false
624 |   --compiler-dispatcher (enable compiler dispatcher)
625 |         type: bool  default: false
626 |   --trace-compiler-dispatcher (trace compiler dispatcher activity)
627 |         type: bool  default: false
628 |   --cpu-profiler-sampling-interval (CPU profiler sampling interval in microseconds)
629 |         type: int  default: 1000
630 |   --trace-js-array-abuse (trace out-of-bounds accesses to JS arrays)
631 |         type: bool  default: false
632 |   --trace-external-array-abuse (trace out-of-bounds-accesses to external arrays)
633 |         type: bool  default: false
634 |   --trace-array-abuse (trace out-of-bounds accesses to all arrays)
635 |         type: bool  default: false
636 |   --trace-side-effect-free-debug-evaluate (print debug messages for side-effect-free debug-evaluate for testing)
637 |         type: bool  default: false
638 |   --hard-abort (abort by crashing)
639 |         type: bool  default: true
640 |   --expose-inspector-scripts (expose injected-script-source.js for debugging)
641 |         type: bool  default: false
642 |   --stack-size (default size of stack region v8 is allowed to use (in kBytes))
643 |         type: int  default: 984
644 |   --max-stack-trace-source-length (maximum length of function source code printed in a stack trace.)
645 |         type: int  default: 300
646 |   --clear-exceptions-on-js-entry (clear pending exceptions when entering JavaScript)
647 |         type: bool  default: false
648 |   --histogram-interval (time interval in ms for aggregating memory histograms)
649 |         type: int  default: 600000
650 |   --heap-profiler-trace-objects (Dump heap object allocations/movements/size_updates)
651 |         type: bool  default: false
652 |   --heap-profiler-use-embedder-graph (Use the new EmbedderGraph API to get embedder nodes)
653 |         type: bool  default: true
654 |   --heap-snapshot-string-limit (truncate strings to this length in the heap snapshot)
655 |         type: int  default: 1024
656 |   --sampling-heap-profiler-suppress-randomness (Use constant sample intervals to eliminate test flakiness)
657 |         type: bool  default: false
658 |   --use-idle-notification (Use idle notification to reduce memory footprint.)
659 |         type: bool  default: true
660 |   --trace-ic (trace inline cache state transitions for tools/ic-processor)
661 |         type: bool  default: false
662 |   --ic-stats (inline cache state transitions statistics)
663 |         type: int  default: 0
664 |   --native-code-counters (generate extra code for manipulating stats counters)
665 |         type: bool  default: false
666 |   --thin-strings (Enable ThinString support)
667 |         type: bool  default: true
668 |   --trace-prototype-users (Trace updates to prototype user tracking)
669 |         type: bool  default: false
670 |   --use-verbose-printer (allows verbose printing)
671 |         type: bool  default: true
672 |   --trace-for-in-enumerate (Trace for-in enumerate slow-paths)
673 |         type: bool  default: false
674 |   --trace-maps (trace map creation)
675 |         type: bool  default: false
676 |   --trace-maps-details (also log map details)
677 |         type: bool  default: true
678 |   --allow-natives-syntax (allow natives syntax)
679 |         type: bool  default: false
680 |   --trace-sim (Trace simulator execution)
681 |         type: bool  default: false
682 |   --debug-sim (Enable debugging the simulator)
683 |         type: bool  default: false
684 |   --check-icache (Check icache flushes in ARM and MIPS simulator)
685 |         type: bool  default: false
686 |   --stop-sim-at (Simulator stop after x number of instructions)
687 |         type: int  default: 0
688 |   --sim-stack-alignment (Stack alignment in bytes in simulator (4 or 8, 8 is default))
689 |         type: int  default: 8
690 |   --sim-stack-size (Stack size of the ARM64, MIPS64 and PPC64 simulator in kBytes (default is 2 MB))
691 |         type: int  default: 2048
692 |   --log-colour (When logging, try to use coloured output.)
693 |         type: bool  default: true
694 |   --ignore-asm-unimplemented-break (Don't break for ASM_UNIMPLEMENTED_BREAK macros.)
695 |         type: bool  default: false
696 |   --trace-sim-messages (Trace simulator debug messages. Implied by --trace-sim.)
697 |         type: bool  default: false
698 |   --async-stack-traces (include async stack traces in Error.stack)
699 |         type: bool  default: false
700 |   --stack-trace-on-illegal (print stack trace when an illegal exception is thrown)
701 |         type: bool  default: false
702 |   --abort-on-uncaught-exception (abort program (dump core) when an uncaught exception is thrown)
703 |         type: bool  default: false
704 |   --abort-on-stack-or-string-length-overflow (Abort program when the stack overflows or a string exceeds maximum length (as opposed to throwing RangeError). This is useful for fuzzing where the spec behaviour would introduce nondeterminism.)
705 |         type: bool  default: false
706 |   --randomize-hashes (randomize hashes to avoid predictable hash collisions (with snapshots this option cannot override the baked-in seed))
707 |         type: bool  default: true
708 |   --rehash-snapshot (rehash strings from the snapshot to override the baked-in seed)
709 |         type: bool  default: true
710 |   --hash-seed (Fixed seed to use to hash property keys (0 means random)(with snapshots this option cannot override the baked-in seed))
711 |         type: uint64  default: 0
712 |   --random-seed (Default seed for initializing random generator (0, the default, means to use system random).)
713 |         type: int  default: 0
714 |   --fuzzer-random-seed (Default seed for initializing fuzzer random generator (0, the default, means to use v8's random number generator seed).)
715 |         type: int  default: 0
716 |   --trace-rail (trace RAIL mode)
717 |         type: bool  default: false
718 |   --print-all-exceptions (print exception object and stack trace on each thrown exception)
719 |         type: bool  default: false
720 |   --runtime-call-stats (report runtime call counts and times)
721 |         type: bool  default: false
722 |   --runtime-stats (internal usage only for controlling runtime statistics)
723 |         type: int  default: 0
724 |   --profile-deserialization (Print the time it takes to deserialize the snapshot.)
725 |         type: bool  default: false
726 |   --serialization-statistics (Collect statistics on serialized objects.)
727 |         type: bool  default: false
728 |   --serialization-chunk-size (Custom size for serialization chunks)
729 |         type: uint  default: 4096
730 |   --regexp-optimization (generate optimized regexp code)
731 |         type: bool  default: true
732 |   --regexp-mode-modifiers (enable inline flags in regexp.)
733 |         type: bool  default: false
734 |   --testing-bool-flag (testing_bool_flag)
735 |         type: bool  default: true
736 |   --testing-maybe-bool-flag (testing_maybe_bool_flag)
737 |         type: maybe_bool  default: unset
738 |   --testing-int-flag (testing_int_flag)
739 |         type: int  default: 13
740 |   --testing-float-flag (float-flag)
741 |         type: float  default: 2.5
742 |   --testing-string-flag (string-flag)
743 |         type: string  default: Hello, world!
744 |   --testing-prng-seed (Seed used for threading test randomness)
745 |         type: int  default: 42
746 |   --embedded-src (Path for the generated embedded data file. (mksnapshot only))
747 |         type: string  default: nullptr
748 |   --embedded-variant (Label to disambiguate symbols in embedded data file. (mksnapshot only))
749 |         type: string  default: nullptr
750 |   --startup-src (Write V8 startup as C++ src. (mksnapshot only))
751 |         type: string  default: nullptr
752 |   --startup-blob (Write V8 startup blob file. (mksnapshot only))
753 |         type: string  default: nullptr
754 |   --minor-mc-parallel-marking (use parallel marking for the young generation)
755 |         type: bool  default: true
756 |   --trace-minor-mc-parallel-marking (trace parallel marking for the young generation)
757 |         type: bool  default: false
758 |   --minor-mc (perform young generation mark compact GCs)
759 |         type: bool  default: false
760 |   --help (Print usage message, including flags, on console)
761 |         type: bool  default: true
762 |   --dump-counters (Dump counters on exit)
763 |         type: bool  default: false
764 |   --dump-counters-nvp (Dump counters as name-value pairs on exit)
765 |         type: bool  default: false
766 |   --use-external-strings (Use external strings for source code)
767 |         type: bool  default: false
768 |   --map-counters (Map counters to a file)
769 |         type: string  default: 
770 |   --mock-arraybuffer-allocator (Use a mock ArrayBuffer allocator for testing.)
771 |         type: bool  default: false
772 |   --mock-arraybuffer-allocator-limit (Memory limit for mock ArrayBuffer allocator used to simulate OOM for testing.)
773 |         type: size_t  default: 0
774 |   --opt (use adaptive optimizations)
775 |         type: bool  default: true
776 |   --use-ic (use inline caching)
777 |         type: bool  default: true
778 |   --optimize-for-size (Enables optimizations which favor memory size over execution speed)
779 |         type: bool  default: false
780 |   --log (Minimal logging (no API, code, GC, suspect, or handles samples).)
781 |         type: bool  default: false
782 |   --log-all (Log all events to the log file.)
783 |         type: bool  default: false
784 |   --log-api (Log API events to the log file.)
785 |         type: bool  default: false
786 |   --log-code (Log code events to the log file without profiling.)
787 |         type: bool  default: false
788 |   --log-handles (Log global handle events.)
789 |         type: bool  default: false
790 |   --log-suspect (Log suspect operations.)
791 |         type: bool  default: false
792 |   --log-source-code (Log source code.)
793 |         type: bool  default: false
794 |   --log-function-events (Log function events (parse, compile, execute) separately.)
795 |         type: bool  default: false
796 |   --prof (Log statistical profiling information (implies --log-code).)
797 |         type: bool  default: false
798 |   --detailed-line-info (Always generate detailed line information for CPU profiling.)
799 |         type: bool  default: false
800 |   --prof-sampling-interval (Interval for --prof samples (in microseconds).)
801 |         type: int  default: 1000
802 |   --prof-cpp (Like --prof, but ignore generated code.)
803 |         type: bool  default: false
804 |   --prof-browser-mode (Used with --prof, turns on browser-compatible mode for profiling.)
805 |         type: bool  default: true
806 |   --logfile (Specify the name of the log file.)
807 |         type: string  default: v8.log
808 |   --logfile-per-isolate (Separate log files for each isolate.)
809 |         type: bool  default: true
810 |   --ll-prof (Enable low-level linux profiler.)
811 |         type: bool  default: false
812 |   --interpreted-frames-native-stack (Show interpreted frames on the native stack (useful for external profilers).)
813 |         type: bool  default: false
814 |   --perf-basic-prof (Enable perf linux profiler (basic support).)
815 |         type: bool  default: false
816 |   --perf-basic-prof-only-functions (Only report function code ranges to perf (i.e. no stubs).)
817 |         type: bool  default: false
818 |   --perf-prof (Enable perf linux profiler (experimental annotate support).)
819 |         type: bool  default: false
820 |   --perf-prof-unwinding-info (Enable unwinding info for perf linux profiler (experimental).)
821 |         type: bool  default: false
822 |   --gc-fake-mmap (Specify the name of the file for fake gc mmap used in ll_prof)
823 |         type: string  default: /tmp/__v8_gc__
824 |   --log-internal-timer-events (Time internal events.)
825 |         type: bool  default: false
826 |   --log-timer-events (Time events including external callbacks.)
827 |         type: bool  default: false
828 |   --log-instruction-stats (Log AArch64 instruction statistics.)
829 |         type: bool  default: false
830 |   --log-instruction-file (AArch64 instruction statistics log file.)
831 |         type: string  default: arm64_inst.csv
832 |   --log-instruction-period (AArch64 instruction statistics logging period.)
833 |         type: int  default: 4194304
834 |   --redirect-code-traces (output deopt information and disassembly into file code-<pid>-<isolate id>.asm)
835 |         type: bool  default: false
836 |   --redirect-code-traces-to (output deopt information and disassembly into the given file)
837 |         type: string  default: nullptr
838 |   --print-opt-source (print source code of optimized and inlined functions)
839 |         type: bool  default: false
840 |   --predictable (enable predictable mode)
841 |         type: bool  default: false
842 |   --predictable-gc-schedule (Predictable garbage collection schedule. Fixes heap growing, idle, and memory reducing behavior.)
843 |         type: bool  default: false
844 |   --single-threaded (disable the use of background tasks)
845 |         type: bool  default: false
846 |   --single-threaded-gc (disable the use of background gc tasks)
847 |         type: bool  default: false
848 | ```
849 | 
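
Any flag from the listing above can be forwarded to V8 through Chrome's `--js-flags` switch, which is how `scripts/run_macos.sh` passes its tracing flags. The following is a minimal sketch, assuming a hypothetical helper `build_js_flags` (not part of V8 or Chrome) that joins individual flags into a single switch value. The heap size of 512 is an arbitrary example.

```shell
#!/bin/sh

# build_js_flags is a hypothetical helper, not part of V8 or Chrome.
# It joins its arguments into one space-separated --js-flags value.
build_js_flags() {
  # "$*" expands to all arguments joined by the first character of IFS (a space by default)
  printf '%s=%s\n' '--js-flags' "$*"
}

# GC tracing flags taken from the listing above; 512 is an arbitrary example size in MB
build_js_flags --trace-gc --trace-gc-verbose --max-old-space-size=512
```

Because the resulting value contains spaces, it has to reach Chrome as one quoted argument, otherwise the shell splits it and only the first flag is applied.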


--------------------------------------------------------------------------------
/docs/V8_COMPILER_PIPELINE.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/V8_COMPILER_PIPELINE.jpg


--------------------------------------------------------------------------------
/scripts/run_macos.sh:
--------------------------------------------------------------------------------
 1 | #!/bin/bash
 2 | 
 3 | # Script to start Chrome Canary with V8 profiler flags
 4 | 
 5 | # Configuration
 6 | LOCATION="${1:-http://localhost:8080}"
 7 | LOG_DIRECTORY="${2:-logs}"
 8 | LOG_OUTPUT="${3:-logs/chrome_canary_output.log}"
 9 | LOG_ERROR="${4:-logs/chrome_canary_error.log}"
10 | BASE_TEMP_PROFILE_DIR="${5:-/tmp}"
11 | REMOTE_DEBUGGING_PORT="${6:-9222}"
12 | 
13 | # Temporary profile
14 | TEMP_PROFILE_DIR=$(mktemp -d "$BASE_TEMP_PROFILE_DIR/google-chrome.XXXXXXX")
15 | 
16 | # Helper functions
17 | 
18 | function log () {
19 |   echo -e "\033[36m"
20 |   echo "#########################################################"
21 |   echo "#### $1 "
22 |   echo "#########################################################"
23 |   echo -e "\033[m"
24 | }
25 | 
26 | run() {
27 |     log "Starting Chrome Canary with V8 profiler flags"
28 | 
29 |     mkdir -p "$LOG_DIRECTORY"
30 |     touch "$LOG_OUTPUT"
31 |     touch "$LOG_ERROR"
32 | 
33 |     echo -e "Starting Chrome Canary with custom profiling flags\n"
34 | 
35 |     echo -e "Created temporary profile folder in $TEMP_PROFILE_DIR"
36 | 
37 |     # Opening chrome://tracing is not allowed from the command line
38 |     echo -e "Please open \"chrome://tracing\" in a new browser tab to start structural profiling\n"
39 | 
40 |     # Chrome flags
41 | 
42 |     # --incognito | Launches Chrome in incognito mode
43 |     # --disable-gpu-vsync + --disable-frame-rate-limit | Disables VSync and lifts the 60 frames per second cap imposed by Chrome
44 |     # --no-default-browser-check | Suppresses the pop-up asking whether Chrome should be the default browser
45 |     # --enable-precise-memory-info | Enables precise memory info (otherwise the results from performance.memory are bucketed and less useful)
46 |     # --remote-debugging-port | Enables remote debugging through DevTools on the given port
47 |     # --user-data-dir + --no-first-run | Stores the user profile in the temporary directory created above and suppresses the first-run pop-up shown for new profiles
48 | 
49 |     # V8 flags
50 | 
51 |     # --trace-file-names | Include file names in the --trace-opt and --trace-deopt output
52 |     # --trace-opt | Trace code optimisations of hot functions
53 |     # --trace-deopt | Trace code de-optimisations of hot functions
54 |     # --print-opt-source | Print the source code of optimized and inlined functions
55 |     # --code-comments | Emit comments in the code disassembly (useful for understanding the optimized and deoptimized code)
56 | 
57 |     /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary "$LOCATION" --incognito --disable-gpu-vsync --disable-frame-rate-limit --no-default-browser-check --enable-precise-memory-info --remote-debugging-port="$REMOTE_DEBUGGING_PORT" --user-data-dir="$TEMP_PROFILE_DIR" --no-first-run --js-flags="--trace-file-names --trace-opt --trace-deopt --print-opt-source --code-comments" 1> "$LOG_OUTPUT" 2> "$LOG_ERROR"
58 | 
59 |     echo -e "Cleaning up temporary profile folder in $TEMP_PROFILE_DIR"
60 | 
61 |     rm -rf "$TEMP_PROFILE_DIR"
62 | }
63 | 
64 | # Main script
65 | 
66 | run
67 | 
68 | log "Done!"
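
The Configuration block at the top of this script relies on `${N:-default}` expansion, so every positional argument is optional. The sketch below illustrates that pattern with a hypothetical helper, `resolve_config`, which is not part of the script. As a variation, it derives the two log file defaults from the log directory rather than from a fixed `logs/` prefix. The URL and directory name in the example call are made up.

```shell
#!/bin/sh

# resolve_config is a hypothetical helper that mirrors the Configuration block above,
# except that the log file defaults are derived from the log directory.
resolve_config() {
  LOCATION="${1:-http://localhost:8080}"
  LOG_DIRECTORY="${2:-logs}"
  LOG_OUTPUT="${3:-$LOG_DIRECTORY/chrome_canary_output.log}"
  LOG_ERROR="${4:-$LOG_DIRECTORY/chrome_canary_error.log}"
  BASE_TEMP_PROFILE_DIR="${5:-/tmp}"
  REMOTE_DEBUGGING_PORT="${6:-9222}"
}

# Override only the URL and the log directory; everything else falls back to its default
resolve_config "http://localhost:3000" "profiling-logs"
echo "$LOCATION -> $LOG_OUTPUT (port $REMOTE_DEBUGGING_PORT)"
```

Arguments are positional, so overriding a later value such as the debugging port requires passing all earlier ones explicitly.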


--------------------------------------------------------------------------------
/scripts/setup_macos.sh:
--------------------------------------------------------------------------------
 1 | #!/bin/bash
 2 | 
 3 | # Script to install and set up the required toolchain
 4 | 
 5 | # Helper functions
 6 | 
 7 | function log () {
 8 |   echo -e "\033[36m"
 9 |   echo "#########################################################"
10 |   echo "#### $1 "
11 |   echo "#########################################################"
12 |   echo -e "\033[m"
13 | }
14 | 
15 | setup() {
16 |   UNAME=$(uname)
17 | 
18 |   if [ "$UNAME" != "Darwin" ]; then
19 |       echo "Currently only macOS is supported by this automatic setup script"
20 |       exit 1
21 |   fi
22 | 
23 |   # Install `homebrew` dependencies
24 |   log "Installing Homebrew packages"
25 | 
26 |   if ! type "brew" > /dev/null; then
27 |       echo "Please install Homebrew (https://brew.sh/)"
28 |       exit 1
29 |   else
30 |       for pkg in google-chrome-canary; do
31 |           if brew list --cask -1 | grep -q "^${pkg}\$"; then
32 |               echo "Package '$pkg' is already installed"
33 |           else
34 |               echo "Package '$pkg' is not installed"
35 | 
36 |               # Install from the cask-versions tap
37 |               log "Installing Google Chrome Canary"
38 |               brew tap homebrew/cask-versions && brew install --cask google-chrome-canary
39 |           fi
40 |       done
41 |   fi
42 | }
43 | 
44 | # Main script
45 | 
46 | setup
47 | 
48 | log "Done!"
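
The `type "brew"` check in `setup()` can be generalized to any prerequisite. The sketch below assumes a hypothetical helper, `require_command`, which is not part of the script. It uses the POSIX `command -v` builtin and sends the hint to stderr so that it does not mix with regular output.

```shell
#!/bin/sh

# require_command is a hypothetical helper; it returns 0 when the named
# command is on PATH and prints a hint to stderr otherwise.
require_command() {
  if command -v "$1" > /dev/null 2>&1; then
    return 0
  fi
  echo "Please install '$1' before running this script" >&2
  return 1
}

# Example: check for a tool that exists on practically every system
require_command sh && echo "sh is available"
```

In a setup script the call would typically read `require_command brew || exit 1`, which also gives the failure path a non-zero exit status.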


--------------------------------------------------------------------------------