├── .gitignore ├── LICENSE ├── README.md ├── docs ├── CHROME_TRACING_RENDERING_TOGGLE.png ├── CHROME_TRACING_RENDERING_TRACE.png ├── CPU_LIVE_USAGE_PROFILER.png ├── CPU_TRACE_PROFILER.png ├── GOOGLE_THREE_SNAPSHOT_TECHNIQUE.jpg ├── HEAP_SNAPSHOT_BUTTON.png ├── HEAP_SNAPSHOT_TRACE_DETACHED.png ├── RENDERDOC_DRAWCALL.png ├── SPECTORJS_SHADER.png ├── SPECTORJS_STATE.png ├── V8_AVAILABLE_FLAGS.md └── V8_COMPILER_PIPELINE.jpg └── scripts ├── run_macos.sh └── setup_macos.sh /.gitignore: -------------------------------------------------------------------------------- 1 | # OS 2 | .DS_Store 3 | 4 | # Log files 5 | *.log 6 | 7 | # Directories 8 | logs 9 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Tim van Scherpenzeel 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Profiling research 2 | 3 | Research on profiling of high-performance web applications (primarily WebGL applications). 4 | 5 | ## Table of contents 6 | 7 | - [Introduction](#introduction) 8 | 9 | - [Compiler pipeline](#compiler-pipeline) 10 | 11 | - [Source code](#source-code) 12 | - [Parser](#parser) 13 | - [AST](#ast) 14 | - [Baseline compiler](#baseline-compiler) 15 | - [Optimising compiler](#optimising-compiler) 16 | - [Conclusion](#conclusion) 17 | 18 | - [Profiling](#profiling) 19 | 20 | - [Memory profiling and garbage collection](#memory-profiling-and-garbage-collection) 21 | - [Heap snapshot](#heap-snapshot) 22 | - [Three snapshot technique](#three-snapshot-technique) 23 | - [CPU profiling](#cpu-profiling) 24 | - [GPU profiling](#gpu-profiling) 25 | 26 | - [Installation](#installation) 27 | 28 | - [Usage](#usage) 29 | 30 | - [Resources and references](#resources-and-references) 31 | 32 | ## Introduction 33 | 34 | In order to profile the performance of a web application one would usually use the browser's built-in developer tools. Every once in a while, however, a developer needs a deeper understanding of a performance issue in order to solve it. To get that understanding the developer needs access to, and an understanding of, the low-level optimisations, de-optimisations and caching techniques in modern browser engines. Due to security restrictions in the browser it is only really possible to get this low-level information by enabling various flags when launching the browser locally. 35 | 36 | `Chrome` and `V8` ship with various built-in tools that help their developers during development of the browser and engine.
Luckily we can, as web developers, leverage these same tools to get a better understanding of what is happening under the hood. 37 | 38 | To understand what parts of the application are useful to profile one must have a general understanding of the architecture of the compiler pipeline in modern browser engines like `V8`. The compiler pipelines behind each browser are similar but not at all the same on a technical level. By looking at the `V8` pipeline in general terms we can understand what the core parts of a browser engine are without getting lost in the implementation details. 39 | 40 | It is not necessary to understand the internals of each browser engine but it is beneficial as a starting point in understanding what is harming the performance of your application. 41 | 42 | ## Compiler pipeline 43 | 44 | ![V8 compiler pipeline](/docs/V8_COMPILER_PIPELINE.jpg?raw=true) 45 | 46 | _Image source: Franziska Hinkelmann - https://medium.com/dailyjs/understanding-v8s-bytecode-317d46c94775_ 47 | 48 | ### Source code 49 | 50 | JavaScript source code is `JIT (Just In Time)` compiled, meaning it is compiled to machine code as the program is running. Source code is initially just plain text with a mime-type that identifies it as JavaScript code. It must be parsed by a `parser` in order to be understood as JavaScript code by the browser engine. 51 | 52 | ### Parser 53 | 54 | The parser generally consists of a `pre-parser` and a `full-parser`. The `pre-parser` rapidly checks for syntactical and early errors in the program and will throw if it finds any. The `full-parser` evaluates the scope of variables throughout the program and collects basic type information. 55 | 56 | ### AST 57 | 58 | An `Abstract Syntax Tree`, or `AST` for short, is created from the parsed source code. 59 | `AST`s are data structures widely used in compilers because they represent the structure of program code.
An `AST` is usually the result of the syntax analysis phase of a compiler, a tree representation of the abstract syntactic structure of source code. Each node of the tree denotes a construct occurring in the source code. It is beneficial to get a good understanding of what `AST`s are as they are very often used in pre-processors, code generators, minifiers, transpilers, linters and codemods. 60 | 61 | ### Baseline compiler 62 | 63 | The goal of the baseline compiler (`Ignition` in `V8`) is to generate relatively unoptimised code as fast as possible (in the case of `Ignition`, which is an interpreter, this is architecture-independent `bytecode` rather than `machine code`) and to collect general type information to be used in potential further compilation steps. Whilst running, functions that are called often are marked as `hot` and become candidates for further optimisation by the optimising compiler(s). 64 | 65 | ### Optimising compiler 66 | 67 | The optimising compiler (`Turbofan` in `V8`) recompiles `hot` functions using previously collected type information to optimise the generated `machine code` further. However, in order to make a faster version of the `machine code`, the optimising compiler has to make some assumptions regarding the shape of objects, namely that they always have the same property names in the same order. Based on that the compiler can then make further optimisations. If the object shape has been the same throughout the lifetime of the program it is assumed that it will remain that way during future execution. Unfortunately in JavaScript there are no guarantees that this is actually the case, meaning that object shapes can change at any point in time. Due to this lack of guarantees the assumptions of the compiler need to be validated every single time before the optimised code runs.
If the assumptions turn out to be false, the optimising compiler discards the last version of the optimised code and steps back to a de-optimised version where the assumptions are still valid. It is therefore very important that you limit the number of type changes of an object throughout the lifetime of the program in order to keep the highly optimised code produced by the optimising compiler alive. In the worst case scenario the object ends up in `de-optimised hell` and will never be picked up again to be optimised. Any code that V8 refuses to optimise can also end up in `de-optimised hell`. 68 | 69 | Browser engines are continuously improving their optimisation techniques, especially around new browser features. Browser vendors generally recommend **against** implementing browser specific hacks to work around de-optimisations. There are, however, specific keywords and patterns you should avoid using: 70 | 71 | - Avoid using `eval`, `arguments` and `with`. They cause what is known as [aliasing]() preventing the browser engine from optimising them. 72 | - Avoid using arrays with many different types in them. 73 | - Avoid swapping out values in an array with a value of another type. 74 | - Avoid creating holes in arrays by deleting array entries or setting entries to `undefined`. 75 | - Avoid using `for in` as it will include properties that are inherited through the prototype chain. This behavior can lead to unexpected items in your loop and browsers likely de-optimise anything within the loop. 76 | 77 | ### Conclusion 78 | 79 | When profiling and optimising your JavaScript code part of your effort should go to identifying which parts of the application are being optimised by the optimising compiler, meaning that these functions are `hot`, and more importantly which parts of the application are being de-optimised.
De-optimisation likely happens because types are changing in `hot` parts of the code or certain optimisations are not yet implemented by the compiler (such as `try catch` a few years ago, which has since been fixed). It is important to note that whilst you should pay attention when using these unoptimised implementations you should still use them and report to the browser engines that you are using these features. If a certain de-optimisation shows up a lot in heuristics and performance bug reports it is likely to be picked up by the engine maintainers as a priority. Other things to take into account are optimising object property access, maintaining object shapes and understanding the power of inline caches (monomorphic, polymorphic, megamorphic). Inline caches are used to memorize information on where to find properties on objects to reduce the number of expensive lookups. 80 | 81 | ## Profiling 82 | 83 | Besides the browser's built-in sampling profilers available in `Chrome developer tools` and the structural profiler available in `chrome://tracing` one can start the browser from the command line with flags to enable the tracing of various parts of the web application. 84 | 85 | Please note that any trace recorded with the tool will contain all resources currently opened in the browser (tabs, extensions, subresources). Make sure that `Chrome` starts without any other resources (other tabs, extensions, profile) active in order to be able to get a relatively clean trace. In order to record a clean trace you should keep the recording to a maximum of 10 seconds, focus on a single activity per recording and leave the computer completely idle for 2 seconds before and after each recording. This will help make the slow process stand out amongst the other recorded data. 86 | 87 | ### Memory profiling and garbage collection 88 | 89 | The essential point of garbage collection is the ability to manage memory usage by an application.
All management of the memory is done by the browser engine, no API is exposed to web developers to control it explicitly. Web developers can however learn how to structure their programs in order to use the garbage collector to their advantage and avoid the generation of garbage. 90 | 91 | All variables in a program are part of the object graph and object variables can reference other variables. Allocating variables is done from the `young memory pool` and is very cheap until the memory pool runs out of memory. Whenever that happens a garbage collection is forced which causes higher latency, dropped frames and thus a major impact on the user experience. 92 | 93 | All variables that cannot be reached from the root node are considered as garbage. The job of the garbage collector is to `mark-and-sweep` or in other words: go through objects that are allocated in memory and determine whether they are `dead` or `alive`. If an object is unreachable it is removed from memory and previously allocated memory gets released back to the heap. Generally, e.g. in `V8`, the object heap is segmented into two parts: the `young generation` and the `old generation`. The `young generation` consists of `new space` in which new objects are allocated. It allocates fast, frequently collects and collects fast. The `old space` stores objects that survived enough garbage collector cycles to be promoted to the `old generation`. It allocates fast, infrequently collects and does slower collection. 94 | 95 | The cost of the garbage collection is proportional to the number of live objects. This is due to a copying mechanism that copies over objects that are still alive into a new space. Most of the time newly allocated objects do not survive long enough in order to become old. It is important to understand that each allocation moves you closer to a garbage collection and every collection pauses the execution of your application. 
It is therefore important in performance critical applications to strive for a static number of live objects and to avoid allocating new ones whilst running. 96 | 97 | In order to limit the number of objects that have to be garbage collected a developer should take the following aspects into account: 98 | 99 | - Avoid allocating new objects or changing the types of outer scoped (or even global) variables inside of a `hot` function. 100 | - Avoid unnecessary circular references. A circular reference is formed when two objects reference each other. A mark-and-sweep collector can reclaim a cycle once it is unreachable from the root, but as long as anything still references one object in the cycle every object in it stays alive, so a single forgotten reference can keep memory growing over time. 101 | - A possible solution for the object allocation problem in the `young generation` is the use of an `object pool` that basically pre-allocates a fixed number of objects ahead of time and keeps them alive by recycling them. This is a relatively common technique that allows you to have more explicit control over your objects' lifetime. This however does come with an upfront cost when initializing and filling the pool and a consistent chunk of memory throughout your application's lifetime. An example of an object pool implementation can be found [here](https://github.com/timvanscherpenzeel/object-pool). 102 | - Make use of [WeakMaps](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WeakMap) where possible as they hold "weak" references to key objects, which means that they do not prevent garbage collection in case there would be no other reference to the key object. 103 | - Avoid associating the `delete` keyword in JavaScript with manual memory management. The `delete` keyword is used to remove properties from objects, not objects or variables as a whole, and is therefore **not** useful to mark objects ready to be garbage collected.
104 | - When profiling make sure to do so in an incognito window in a fresh browser instance **without** any browser extensions as they share the same heap as the JavaScript program that you are trying to profile. 105 | 106 | #### Heap snapshot 107 | 108 | In the `Chrome developer tools` panel, in the memory tab, you can find the option to take a `heap snapshot` which shows the memory distribution among your application's JavaScript objects and related DOM nodes. It is important to note that right **before** the heap snapshot is taken a major garbage collection is done. Because of this you can assume that everything that `V8` is able to garbage collect has already been cleaned up, allowing you to get an idea of what `V8` was unable to clean up at the time. 109 | 110 | ![Heap snapshot button](/docs/HEAP_SNAPSHOT_BUTTON.png?raw=true) 111 | 112 | _Image source: Google Developers - https://developers.google.com/web/tools/chrome-devtools/memory-problems/_ 113 | 114 | Once you have taken your snapshot you can start inspecting it. 115 | 116 | You should ignore everything in parentheses and everything that is dimmed in the `heap snapshot`. These are various constructors that you do not have explicit control over in your application (such as native methods and global browser methods). The snapshot is ordered by the `constructor` name and you can filter the heap to find your constructor using the `class filter` up top. If you record multiple snapshots it is beneficial to compare them to each other. You can do this by opening the dropdown menu left of the `class filter` and setting it to `comparison`. You can now see the difference between two snapshots. The list will be much shorter and you can see more easily what has changed in memory.
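
A minimal sketch of the object pool idea from the memory list above, assuming a hypothetical `Particle` class and `ObjectPool` helper (neither is taken from the linked repository). Because `Particle` is a named class it can be found with the `class filter`, and in the `comparison` view the number of `Particle` instances should stay constant between snapshots as long as the pool is not exhausted.

```javascript
// Hypothetical pooled type; the class name is what shows up in the heap snapshot.
class Particle {
  constructor() {
    this.x = 0;
    this.y = 0;
    this.alive = false;
  }

  // Restore the initial state so a recycled instance behaves like a new one.
  reset() {
    this.x = 0;
    this.y = 0;
    this.alive = false;
  }
}

// Hypothetical helper: pre-allocates `size` objects and recycles them.
class ObjectPool {
  constructor(factory, size) {
    this.factory = factory;
    this.free = [];
    for (let i = 0; i < size; i++) {
      this.free.push(factory());
    }
  }

  // Hands out a pooled object; only allocates when the pool is exhausted.
  acquire() {
    return this.free.length > 0 ? this.free.pop() : this.factory();
  }

  // Resets the object and makes it available for reuse.
  release(object) {
    object.reset();
    this.free.push(object);
  }
}

// Usage: allocate once up front, then recycle inside the hot path.
const pool = new ObjectPool(() => new Particle(), 64);
const particle = pool.acquire();
particle.alive = true;
pool.release(particle);
```

The fallback allocation in `acquire` is a design choice for this sketch; a stricter pool could throw instead so that exhaustion is noticed during development.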
117 | 118 | ![Heap snapshot trace](/docs/HEAP_SNAPSHOT_TRACE_DETACHED.png?raw=true) 119 | 120 | _Image source: Google Developers - https://developers.google.com/web/tools/chrome-devtools/memory-problems/_ 121 | 122 | Objects in the `heap snapshot` with a **yellow background** have a direct reference from JavaScript code, and that reference is what keeps them alive. 123 | 124 | Objects with a **red background** in the `heap snapshot` have been detached from the DOM tree and do not have a direct reference from JavaScript code themselves. A DOM node can only be garbage collected when there are no references to it from either the page's DOM tree or JavaScript code. A node is said to be `detached` when it's removed from the DOM tree but some JavaScript still references it. Detached DOM nodes are a common cause of memory leaks. Red nodes are only alive because they are part of a yellow node's tree. 125 | 126 | In general, you want to focus on the yellow nodes in the `heap snapshot`. Fix your code so that the yellow node isn't alive for longer than it needs to be; that way you also get rid of the red nodes that are part of the yellow node's tree.
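
The detached node situation described above can be sketched as follows. This is an illustrative example only: `retainedNodes`, `addRow`, `removeRowLeaky` and `removeRow` are hypothetical names, and the `doc` parameter is assumed to be the page's `document`.

```javascript
// Hypothetical bookkeeping that holds JavaScript references to DOM nodes.
const retainedNodes = new Set();

// Creates a list item, attaches it to `parent` and remembers it in JavaScript.
function addRow(doc, parent, text) {
  const row = doc.createElement('li');
  row.textContent = text;
  parent.appendChild(row);
  retainedNodes.add(row);
  return row;
}

// Leaky: the node leaves the DOM tree but is still referenced by the Set,
// so it becomes a detached node that cannot be garbage collected.
function removeRowLeaky(row) {
  row.remove();
}

// Fixed: the JavaScript reference is dropped together with the DOM node,
// leaving nothing that keeps the node alive.
function removeRow(row) {
  row.remove();
  retainedNodes.delete(row);
}
```

When the bookkeeping only exists to attach extra data to a node, a `WeakMap` keyed by the node avoids the problem without any manual cleanup.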
127 | 128 | For more information there are excellent entries on the `Chrome developer tools` blog on memory profiling: 129 | 130 | - [Fix memory problems](https://developers.google.com/web/tools/chrome-devtools/memory-problems/) 131 | - [Understand memory terminology](https://developers.google.com/web/tools/chrome-devtools/memory-problems/memory-101) 132 | - [Record Heap Snapshots](https://developers.google.com/web/tools/chrome-devtools/memory-problems/heap-snapshots) 133 | - [Use the Allocation Profiler](https://developers.google.com/web/tools/chrome-devtools/memory-problems/allocation-profiler) 134 | 135 | #### Three snapshot technique 136 | 137 | A recommended technique for capturing and analyzing snapshots is to make three captures and do comparisons between them as shown in the following graphic: 138 | 139 | ![Three snapshot technique](/docs/GOOGLE_THREE_SNAPSHOT_TECHNIQUE.jpg?raw=true) 140 | 141 | _Image source: Google Developers Live - https://www.youtube.com/watch?v=L3ugr9BJqIs_ 142 | 143 | ### CPU profiling 144 | 145 | In order to know if you are CPU bound you must profile the CPU. Most of the time it makes sense to keep an eye on real-time CPU usage and only when in doubt capture a CPU trace. 146 | 147 | In `Chrome` there is a useful live CPU usage and runtime performance visualizer available in the `performance monitor` tab. 148 | 149 | ![Performance monitor](/docs/CPU_LIVE_USAGE_PROFILER.png?raw=true) 150 | 151 | More advanced captures over a period of time can be done using the performance capture feature available in the `performance` tab in `Chrome`. A good tutorial for understanding the runtime performance trace can be found [here](https://developers.google.com/web/tools/chrome-devtools/evaluate-performance/). 152 | 153 | ![Performance tracer](/docs/CPU_TRACE_PROFILER.png?raw=true) 154 | 155 | If you are CPU bound when rendering it is likely because of too many draw calls.
This is a common problem and the solution is often to combine draw calls to reduce the cost. 156 | 157 | There is however a lot more going on than just draw calls. The renderer needs to process and update each object (culling, material, lighting, collision) on every frame tick. The more complex your materials (math, textures) the higher the cost at creation time and the more expensive they are at runtime. In WebGL there is a small but significant overhead due to strict validation of the shader code. The underlying graphics driver validates the commands further and creates a command buffer for the hardware. In a browser a lot more is going on than just the WebGL rendering context. 158 | 159 | In order to reduce the mesh draw calls one can use the following techniques: 160 | 161 | - Combine meshes into a single mesh to reduce the number of necessary draw calls. 162 | - Reduce the object count (e.g. static meshes, dynamic meshes and mesh particles). 163 | - Reduce the far view distance on your cameras. 164 | - Adjust the field of view of your cameras to be smaller in order to have fewer objects in the view frustum. 165 | - Reduce the number of elements per draw call (e.g. combine textures into texture atlases, use LOD models). 166 | - Disable features on a mesh like custom depth, shadow casting and shadow receiving. 167 | - Change light sources to not shadow cast or have a tighter bounding volume (view cone, attenuation radius). 168 | - Use hardware instancing where possible as it reduces the driver overhead per draw call (e.g. mesh particles). 169 | - Reduce the depth of your scene graph. 170 | 171 | If you are CPU bound by other parts of your application there is likely some other issue in your codebase. 172 | 173 | ### GPU profiling 174 | 175 | In order to know if you are GPU bound you must profile the GPU. Most of the time it makes sense to keep an eye on real-time GPU timing queries and when in doubt capture a GPU trace.
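
Real-time GPU timing can be gathered with the WebGL extension `EXT_disjoint_timer_query`. The following is a minimal sketch, assuming `gl` is a WebGL1 rendering context; `createGpuTimer` is a hypothetical helper name, and it measures one query at a time to keep the example short.

```javascript
// Hypothetical helper wrapping EXT_disjoint_timer_query (WebGL1 variant).
function createGpuTimer(gl) {
  const ext = gl.getExtension('EXT_disjoint_timer_query');
  if (!ext) {
    return null; // The extension is optional and not available everywhere.
  }

  let query = null; // Query whose result has not been read yet.
  let recording = false;

  return {
    // Call before submitting the draw calls you want to measure.
    begin() {
      if (query) return; // Still waiting on the previous result.
      query = ext.createQueryEXT();
      ext.beginQueryEXT(ext.TIME_ELAPSED_EXT, query);
      recording = true;
    },

    // Call after submitting the draw calls.
    end() {
      if (!recording) return;
      ext.endQueryEXT(ext.TIME_ELAPSED_EXT);
      recording = false;
    },

    // Call once per frame; returns milliseconds or null when nothing is ready.
    poll() {
      if (!query || recording) return null;
      const available = ext.getQueryObjectEXT(query, ext.QUERY_RESULT_AVAILABLE_EXT);
      const disjoint = gl.getParameter(ext.GPU_DISJOINT_EXT);
      if (!available && !disjoint) return null;

      let milliseconds = null;
      if (available && !disjoint) {
        // The result is reported in nanoseconds.
        milliseconds = ext.getQueryObjectEXT(query, ext.QUERY_RESULT_EXT) / 1e6;
      }
      // A disjoint result is unreliable, so it is discarded.
      ext.deleteQueryEXT(query);
      query = null;
      return milliseconds;
    },
  };
}
```

Polling instead of waiting for the result avoids stalling the pipeline; the measured time usually becomes available a few frames after `end` was called.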
176 | 177 | The GPU has many processing units working in parallel and it is common to be bound by different units for different parts of the frame. Because of this, it makes sense to find out where the GPU cost is going when looking for the GPU bottleneck. Common reasons for being GPU bound are a draw call heavy application, complex materials, dense triangle meshes and a large view frustum. 178 | 179 | In order to know if you are pixel bound one can try varying the viewport resolution. If you see a measurable performance change it likely means that you are bound by something pixel related. Usually it is either texture memory bandwidth (reading and writing) bound or math bound ([ALU](https://en.wikipedia.org/wiki/Arithmetic_logic_unit)), but in rare cases, some specific units are saturated (e.g. `MRT`). If you can lower the memory, or math, on the relevant passes and see a performance difference you know it was bound by the memory bandwidth (or the ALU units). 180 | 181 | In general you should look at using the following optimisation techniques: 182 | 183 | - Do as much as you can in the vertex shader rather than in the fragment shader. Per rendering pass, fragment shaders run many more times than vertex shaders, so any calculation that can be done on the vertices and then just interpolated among fragments is a performance benefit (this interpolation is done "automagically" for you, through the fixed functionality rasterization phase of the WebGL pipeline). 184 | - Reduce the number of WebGL state changes by caching and mirroring the state in JavaScript. By diffing the state in JavaScript you can drastically reduce the number of expensive WebGL state changes. 185 | - Avoid anything that requires synchronising the CPU and GPU as it is potentially very slow.
Cache WebGL getter calls such as `getParameter` and `getUniformLocation` in JavaScript variables and only call a WebGL state setter after checking the mirrored WebGL state in JavaScript to make sure the value actually needs to change. 186 | - Cull any geometry that won't be visible (octree, BVH, kd-tree, quadtree) through occlusion culling, viewport culling or backface culling. 187 | - Group mesh submissions with the same state in order to prevent unnecessary WebGL state switches. 188 | - Limit the size of the canvas and do not directly use the device's pixel ratio but rather artificially limit it to a point where visual fidelity is acceptable yet performant. 189 | - Turn off the native anti-aliasing option on the canvas element and instead anti-alias once during postprocessing using FXAA, SMAA or similar in the fragment shader. The native implementation is unreliable and very naive. 190 | - Avoid using the native screen resolution derived from `window.devicePixelRatio` for your fullscreen canvas as some phones have a very high density display. Some effects and scenes can often get away with rendering at a lower resolution. 191 | - Disable alpha blending and disable the preserving of the drawing buffer when creating the WebGL canvas. 192 | 193 | If you are **fragment shader** bound you can look at the following optimisation techniques: 194 | 195 | - Avoid having to resize textures to be a power of two during runtime. This is unnecessary in WebGL2 but it is still highly recommended to use power of two textures for a more efficient memory layout. NPOT textures may be handled noticeably slower and can show black edging artifacts caused by mipmap interpolation. 196 | - Avoid using too many uniforms, use `Uniform Buffer Objects` and `Uniform Block`s where possible (WebGL2). 197 | - Reduce the number of stationary and dynamic lights in your scene. Pre-bake where possible. 198 | - Try to combine lights that have a similar origin.
199 | - Limit the attenuation radius and light cone angle to the minimum needed. 200 | - Use an early Z-pass in order to determine what parts of the scene are actually visible. It allows you to avoid expensive shading operations on pixels that do not contribute to the final image. 201 | - Limit the amount of post-processing steps. 202 | - Disable shadow casting where possible, either per object or per light. 203 | - Reduce the shadow map resolution. 204 | - Make use of the multi-render target extension `WEBGL_draw_buffers` when using deferred rendering. Be aware that this extension is not available everywhere where `WebGL` is available. It fortunately is a part of the `WebGL2` core spec making it available everywhere where the `WebGL2` spec is implemented correctly. 205 | - Materials with fewer shader instructions and texture lookups run faster. 206 | - Never disable mipmaps if the texture can be seen in a smaller scale to avoid slowdowns due to texture cache misses. 207 | - Make use of GPU compressed textures and lower bitrate texture formats in order to reduce the in-memory GPU footprint. 208 | 209 | Often the shadow map rendering is bound by the vertex shader, except if you have very large areas affected by shadows or use translucent materials. Shadow map rendering cost scales with the number of dynamic lights in the scene, number of shadow casting objects in the light frustum and the number of cascades. This is a very common bottleneck. 210 | 211 | Highly tessellated meshes, where the wireframe appears as a solid color, can suffer from poor quad utilization. This is because GPUs process triangles in 2x2 pixel blocks and reject pixels outside of the triangle a bit later. This is needed for mip-map computations. For larger triangles, this is not a problem, but if triangles are small or very lengthy the performance can suffer as many pixels are processed but few actually contribute to the image. 
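
The state mirroring recommended in the general optimisation list above can be sketched as a thin cache in front of the rendering context. This is a minimal illustration, assuming `gl` is a WebGL rendering context; `createStateCache` is a hypothetical helper and only covers capabilities and the active program.

```javascript
// Hypothetical helper that mirrors a small part of the WebGL state in JavaScript.
function createStateCache(gl) {
  const capabilities = new Map(); // capability enum -> last value set through the cache
  let currentProgram = null;

  return {
    // Enables or disables a capability only when the mirrored state differs.
    // Returns true when a WebGL call was actually issued.
    setCapability(capability, enabled) {
      if (capabilities.get(capability) === enabled) return false;
      if (enabled) {
        gl.enable(capability);
      } else {
        gl.disable(capability);
      }
      capabilities.set(capability, enabled);
      return true;
    },

    // Binds a program only when it is not already the active one.
    useProgram(program) {
      if (currentProgram === program) return false;
      gl.useProgram(program);
      currentProgram = program;
      return true;
    },
  };
}
```

The cache is only correct when every state change goes through it; any direct call on `gl` elsewhere in the codebase leaves the mirrored state out of date.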
212 | 213 | If you are **vertex shader** bound you can look at the following optimisation techniques: 214 | 215 | - Verify that the vertex count on your models is reasonable for real-time usage. 216 | - Avoid using too many vertices (use LOD meshes). 217 | - Verify your LOD is set up with aggressive transition ranges. Each LOD should reduce the vertex count by at least 2x. To optimise this, check the wireframe, solid colors indicate a problem. 218 | - Avoid using complex world position offsets (morph targets, vertex displacement using textures with poor mip-mapping). 219 | - Avoid tessellation if possible (if necessary be sure to limit the tessellation factor to a reasonable amount). Pre-tessellated meshes are usually faster. 220 | - Very large meshes can be split up into multiple parts for better view culling. 221 | - Avoid using too many vertex attributes, use `Vertex Array Objects` where possible (almost always available in WebGL, always available in WebGL2). 222 | - Billboards, imposter meshes or skybox textures can be used to efficiently fake detailed geometry when a mesh is far in the distance. 223 | 224 | In `Chrome` there are various ways to profile the GPU: 225 | 226 | For tracing an individual WebGL frame in depth without setting up an external debugger I highly recommend using the `Chrome` extension [Spector.js](https://spector.babylonjs.com/) made by the Babylon team at Microsoft. It allows for exporting and importing stack traces generated by Spector, captures the full WebGL state at each step and allows for easy exploration of the vertex and fragment shader. On top of that the project is free, open source and maintained by a professional team instead of an individual.
227 | 228 | ![Spector.js state](/docs/SPECTORJS_STATE.png?raw=true) 229 | 230 | ![Spector.js shader](/docs/SPECTORJS_SHADER.png?raw=true) 231 | 232 | The main advantage of this approach is that it does **not** require the disabling of the GPU sandbox, like some external debuggers do, and avoids the need of having to install and learn a complex debugger. I would highly recommend this method over using an external debugger if you use Mac OS or if you are not familiar with an alternative external debugger like [RenderDoc (Windows, Linux)](https://renderdoc.org/docs/index.html) or [APITrace (Windows, Linux, Mac (limited support))](https://github.com/apitrace/apitrace). Instructions on how to debug WebGL using APITrace can be found [here](https://github.com/apitrace/apitrace/wiki/Google-Chrome-Browser). 233 | 234 | ![Renderdoc drawcall](/docs/RENDERDOC_DRAWCALL.png?raw=true) 235 | 236 | For capturing traces over time one can use the advanced tracing capabilities like [MemoryInfra](https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/README.md) available in `chrome://tracing`. 237 | A good example for how to understand and work with the captures of it can be found [here](https://www.html5rocks.com/en/tutorials/games/abouttracing/). 238 | 239 | For capturing GPU traces using `chrome://tracing` I recommend using the `rendering` preset. 240 | 241 | ![Chrome tracing rendering toggle](/docs/CHROME_TRACING_RENDERING_TOGGLE.png?raw=true) 242 | 243 | ![Chrome tracing rendering trace](/docs/CHROME_TRACING_RENDERING_TRACE.png?raw=true) 244 | 245 | There are also various ways you can integrate profiling into your application. 246 | 247 | One can use the WebGL extension `EXT_disjoint_timer_query` to measure the duration of OpenGL commands submitted to the graphics processor without stalling the rendering pipeline. It makes most sense if this extension is integrated into the WebGL engine that you are using. 
A good example of a WebGL framework with an integrated profiler is [Luma.gl](https://github.com/uber/luma.gl). 248 | 249 | One can also wrap the `WebGLRenderingContext` with a debugging wrapper like the [one provided by the Khronos Group](https://www.npmjs.com/package/webgl-debug) to catch invalid WebGL operations and give the errors a bit more context. This comes with a large overhead as every single instruction is traced (and optionally logged to the console), so make sure to only include the dependency in development. I have rarely found this method to be useful as it does not capture a single frame clearly and logs everything with the same priority to the console. 250 | 251 | ## Installation 252 | 253 | To automatically download and set up Chrome Canary on Mac OS using Homebrew you can use: 254 | 255 | ```sh 256 | $ ./scripts/setup_macos.sh 257 | ``` 258 | 259 | In order to install `V8` and the `D8` shell I recommend following the excellent guide by [Kevin Cennis](https://gist.github.com/kevincennis/0cd2138c78a07412ef21). 260 | 261 | ## Usage 262 | 263 | In order to be able to [properly profile your application](#three-snapshot-technique) the browser needs to expose more of its internals than usual. One can do this by launching the browser with a set of command line flags. 264 | 265 | Included in this repo one can find a [script](scripts/run_macos.sh) that launches the latest version of Chrome Canary with a temporary user profile in an incognito window without any extensions installed. 266 | 267 | Chrome is launched with VSync disabled for unlocked framerates (useful for knowing how much of your frame budget you still have left), more precise memory tracking (necessary for properly tracing your memory usage), remote port debugging, optimisation tracing and de-optimisation tracing (both logged to files).
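
The more precise memory tracking mentioned above can be read from JavaScript through Chrome's non-standard `performance.memory` object. The following is a minimal sketch; `readHeapUsage` is a hypothetical helper name, and it is an assumption that the launch script enables precise values (in Chrome this is typically done with the `--enable-precise-memory-info` flag, but the script contents are not shown here).

```javascript
// Hypothetical helper: reads Chrome's non-standard performance.memory values.
// Returns null in environments that do not expose performance.memory.
function readHeapUsage(perf = globalThis.performance) {
  const memory = perf && perf.memory;
  if (!memory) {
    return null;
  }

  const bytesPerMegabyte = 1024 * 1024;
  return {
    usedMB: memory.usedJSHeapSize / bytesPerMegabyte,
    totalMB: memory.totalJSHeapSize / bytesPerMegabyte,
    limitMB: memory.jsHeapSizeLimit / bytesPerMegabyte,
  };
}
```

Sampling this once per second and logging `usedMB` is usually enough to see whether memory keeps growing before reaching for a heap snapshot.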
268 | 269 | The script is currently only configured for macOS but I welcome PRs to add support for [Windows](https://github.com/TimvanScherpenzeel/profiling-research/issues/1) and [Linux](https://github.com/TimvanScherpenzeel/profiling-research/issues/2). 270 | 271 | You can launch the script as follows: 272 | 273 | ```sh 274 | $ ./scripts/run_macos.sh 275 | ``` 276 | 277 | ## Resources and references 278 | 279 | - [Optimising WebGL Applications with Don Olmstead](https://www.youtube.com/watch?v=QVvHtWePQdA) 280 | - [CPU profiling in Unreal Engine](https://docs.unrealengine.com/en-us/Engine/Performance/CPU) 281 | - [GPU profiling in Unreal Engine](https://docs.unrealengine.com/en-us/Engine/Performance/GPU) 282 | - [Performance Guidelines for Artists and Designers](https://docs.unrealengine.com/en-us/Engine/Performance/Guidelines) 283 | - [The Breakpoint Ep. 8: Memory Profiling with Chrome DevTools](https://www.youtube.com/watch?v=L3ugr9BJqIs) 284 | - [V8 Garbage Collector](https://github.com/thlorenz/v8-perf/blob/master/gc.md#heap-organization-in-detail) 285 | - [Google I/O 2013 - Accelerating Oz with V8: Follow the Yellow Brick Road to JavaScript Performance](https://www.youtube.com/watch?v=VhpdsjBUS3g) 286 | - [Franziska Hinkelmann - Performance Profiling for V8 / Script17 (Slides)](https://fhinkel.rocks/PerformanceProfiling/assets/player/KeynoteDHTMLPlayer.html#3) 287 | - [Franziska Hinkelmann - Performance Profiling for V8 / Script17 (Video)](https://www.youtube.com/watch?v=j6LfSlg8Fig) 288 | - [Franziska Hinkelmann - Performance Profiling for V8 / Script17 (Files)](https://github.com/fhinkel/PerformanceProfiling) 289 | - [V8 profile documentation](https://v8.dev/docs/profile) 290 | - [V8 performance notes and resources](https://github.com/thlorenz/v8-perf) 291 | - [Turbolizer](https://github.com/thlorenz/turbolizer) 292 | - [The Trace Event Profiling Tool (chrome://tracing)](https://www.chromium.org/developers/how-tos/trace-event-profiling-tool) 293 | - [Ignition - 
an interpreter for V8](https://www.youtube.com/watch?v=r5OWCtuKiAk) 294 | - [A crash course in Just In Time (JIT) compilers](https://hacks.mozilla.org/2017/02/a-crash-course-in-just-in-time-jit-compilers/) 295 | - [JavaScript engine fundamentals: Shapes and Inline Caches](https://mathiasbynens.be/notes/shapes-ics) 296 | - [JavaScript Engines: The Good Parts™ - Mathias Bynens & Benedikt Meurer - JSConf EU 2018](https://www.youtube.com/watch?v=5nmpokoRaZI) 297 | - [Understanding V8’s Bytecode](https://medium.com/dailyjs/understanding-v8s-bytecode-317d46c94775) 298 | - [Visualize JavaScript AST's](https://resources.jointjs.com/demos/javascript-ast) 299 | - [Garbage collection in V8, an illustrated guide](https://medium.com/@_lrlna/garbage-collection-in-v8-an-illustrated-guide-d24a952ee3b8) 300 | - [Bailout reasons in V8 (Crankshaft)](https://github.com/vhf/v8-bailout-reasons) 301 | - [Bailout reasons in V8 (Turbofan)](https://chromium.googlesource.com/v8/v8/+/d3f074b23195a2426d14298dca30c4cf9183f203/src/bailout-reason.h) 302 | - [Scott Meyers: Cpu Caches and Why You Care](https://www.youtube.com/watch?v=WDIkqP4JbkE) 303 | -------------------------------------------------------------------------------- /docs/CHROME_TRACING_RENDERING_TOGGLE.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/CHROME_TRACING_RENDERING_TOGGLE.png -------------------------------------------------------------------------------- /docs/CHROME_TRACING_RENDERING_TRACE.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/CHROME_TRACING_RENDERING_TRACE.png -------------------------------------------------------------------------------- /docs/CPU_LIVE_USAGE_PROFILER.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/CPU_LIVE_USAGE_PROFILER.png -------------------------------------------------------------------------------- /docs/CPU_TRACE_PROFILER.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/CPU_TRACE_PROFILER.png -------------------------------------------------------------------------------- /docs/GOOGLE_THREE_SNAPSHOT_TECHNIQUE.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/GOOGLE_THREE_SNAPSHOT_TECHNIQUE.jpg -------------------------------------------------------------------------------- /docs/HEAP_SNAPSHOT_BUTTON.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/HEAP_SNAPSHOT_BUTTON.png -------------------------------------------------------------------------------- /docs/HEAP_SNAPSHOT_TRACE_DETACHED.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/HEAP_SNAPSHOT_TRACE_DETACHED.png -------------------------------------------------------------------------------- /docs/RENDERDOC_DRAWCALL.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/RENDERDOC_DRAWCALL.png 
-------------------------------------------------------------------------------- /docs/SPECTORJS_SHADER.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/SPECTORJS_SHADER.png -------------------------------------------------------------------------------- /docs/SPECTORJS_STATE.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/SPECTORJS_STATE.png -------------------------------------------------------------------------------- /docs/V8_AVAILABLE_FLAGS.md: -------------------------------------------------------------------------------- 1 | # Available flags in V8 2 | 3 | ```sh 4 | # https://github.com/v8/v8/blob/master/src/flag-definitions.h 5 | 6 | $ /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --js-flags="--help" > flags.txt 7 | ``` 8 | 9 | ```txt 10 | SSE3=1 SSSE3=1 SSE4_1=1 SAHF=1 AVX=1 FMA3=1 BMI1=1 BMI2=1 LZCNT=1 POPCNT=1 ATOM=0 11 | Synopsis: 12 | shell [options] [--shell] [...] 13 | d8 [options] [-e ] [--shell] [[--module] ...] 14 | 15 | -e execute a string in V8 16 | --shell run an interactive JavaScript shell 17 | --module execute a file as a JavaScript module 18 | 19 | Note: the --module option is implicitly enabled for *.mjs files. 
20 | 21 | Options: 22 | --experimental-extras (enable code compiled in via v8_experimental_extra_library_files) 23 | type: bool default: false 24 | --use-strict (enforce strict mode) 25 | type: bool default: false 26 | --es-staging (enable test-worthy harmony features (for internal use only)) 27 | type: bool default: false 28 | --harmony (enable all completed harmony features) 29 | type: bool default: false 30 | --harmony-shipping (enable all shipped harmony features) 31 | type: bool default: true 32 | --harmony-do-expressions (enable "harmony do-expressions" (in progress)) 33 | type: bool default: false 34 | --harmony-class-fields (enable "harmony fields in class literals" (in progress)) 35 | type: bool default: false 36 | --harmony-await-optimization (enable "harmony await taking 1 tick" (in progress)) 37 | type: bool default: true 38 | --harmony-regexp-sequence (enable "RegExp Unicode sequence properties" (in progress)) 39 | type: bool default: false 40 | --harmony-weak-refs (enable "harmony weak references" (in progress)) 41 | type: bool default: false 42 | --harmony-locale (enable "Intl.Locale" (in progress)) 43 | type: bool default: false 44 | --harmony-intl-list-format (enable "Intl.ListFormat" (in progress)) 45 | type: bool default: false 46 | --harmony-intl-segmenter (enable "Intl.Segmenter" (in progress)) 47 | type: bool default: false 48 | --harmony-private-fields (enable "harmony private fields in class literals") 49 | type: bool default: false 50 | --harmony-numeric-separator (enable "harmony numeric separator between digits") 51 | type: bool default: false 52 | --harmony-string-matchall (enable "harmony String.prototype.matchAll") 53 | type: bool default: false 54 | --harmony-namespace-exports (enable "harmony namespace exports (export * as foo from 'bar')") 55 | type: bool default: true 56 | --harmony-sharedarraybuffer (enable "harmony sharedarraybuffer") 57 | type: bool default: true 58 | --harmony-import-meta (enable "harmony import.meta property") 
59 | type: bool default: true 60 | --harmony-dynamic-import (enable "harmony dynamic import") 61 | type: bool default: true 62 | --harmony-array-prototype-values (enable "harmony Array.prototype.values") 63 | type: bool default: true 64 | --harmony-array-flat (enable "harmony Array.prototype.{flat,flatMap}") 65 | type: bool default: true 66 | --harmony-symbol-description (enable "harmony Symbol.prototype.description") 67 | type: bool default: true 68 | --harmony-global (enable "harmony global") 69 | type: bool default: true 70 | --harmony-json-stringify (enable "well-formed JSON.stringify") 71 | type: bool default: true 72 | --harmony-public-fields (enable "harmony public instance fields in class literals") 73 | type: bool default: true 74 | --harmony-static-fields (enable "harmony static fields in class literals") 75 | type: bool default: true 76 | --harmony-intl-relative-time-format (enable "Intl.RelativeTimeFormat") 77 | type: bool default: true 78 | --icu-timezone-data (get information about timezones from ICU) 79 | type: bool default: true 80 | --future (Implies all staged features that we want to ship in the not-too-far future) 81 | type: bool default: false 82 | --allocation-site-pretenuring (pretenure with allocation sites) 83 | type: bool default: true 84 | --page-promotion (promote pages based on utilization) 85 | type: bool default: true 86 | --page-promotion-threshold (min percentage of live bytes on a page to enable fast evacuation) 87 | type: int default: 70 88 | --trace-pretenuring (trace pretenuring decisions of HAllocate instructions) 89 | type: bool default: false 90 | --trace-pretenuring-statistics (trace allocation site pretenuring statistics) 91 | type: bool default: false 92 | --track-fields (track fields with only smi values) 93 | type: bool default: true 94 | --track-double-fields (track fields with double values) 95 | type: bool default: true 96 | --track-heap-object-fields (track fields with heap values) 97 | type: bool default: true 98 | 
--track-computed-fields (track computed boilerplate fields) 99 | type: bool default: true 100 | --track-field-types (track field types) 101 | type: bool default: true 102 | --trace-block-coverage (trace collected block coverage information) 103 | type: bool default: false 104 | --feedback-normalization (feed back normalization to constructors) 105 | type: bool default: false 106 | --enable-one-shot-optimization (Enable size optimizations for the code that will only be executed once) 107 | type: bool default: true 108 | --unbox-double-arrays (automatically unbox arrays of doubles) 109 | type: bool default: true 110 | --stress-delay-tasks (delay execution of tasks by 0-100ms randomly (based on --random-seed)) 111 | type: bool default: false 112 | --ignition-elide-noneffectful-bytecodes (elide bytecodes which won't have any external effect) 113 | type: bool default: true 114 | --ignition-reo (use ignition register equivalence optimizer) 115 | type: bool default: true 116 | --ignition-filter-expression-positions (filter expression positions before the bytecode pipeline) 117 | type: bool default: true 118 | --ignition-share-named-property-feedback (share feedback slots when loading the same named property from the same object) 119 | type: bool default: true 120 | --print-bytecode (print bytecode generated by ignition interpreter) 121 | type: bool default: false 122 | --print-bytecode-filter (filter for selecting which functions to print bytecode) 123 | type: string default: * 124 | --trace-ignition-codegen (trace the codegen of ignition interpreter bytecode handlers) 125 | type: bool default: false 126 | --trace-ignition-dispatches (traces the dispatches to bytecode handlers by the ignition interpreter) 127 | type: bool default: false 128 | --trace-ignition-dispatches-output-file (the file to which the bytecode handler dispatch table is written (by default, the table is not written to a file)) 129 | type: string default: nullptr 130 | --fast-math (faster (but maybe less 
accurate) math functions) 131 | type: bool default: true 132 | --trace-track-allocation-sites (trace the tracking of allocation sites) 133 | type: bool default: false 134 | --trace-migration (trace object migration) 135 | type: bool default: false 136 | --trace-generalization (trace map generalization) 137 | type: bool default: false 138 | --concurrent-recompilation (optimizing hot functions asynchronously on a separate thread) 139 | type: bool default: true 140 | --trace-concurrent-recompilation (track concurrent recompilation) 141 | type: bool default: false 142 | --concurrent-recompilation-queue-length (the length of the concurrent compilation queue) 143 | type: int default: 8 144 | --concurrent-recompilation-delay (artificial compilation delay in ms) 145 | type: int default: 0 146 | --block-concurrent-recompilation (block queued jobs until released) 147 | type: bool default: false 148 | --concurrent-inlining (run optimizing compiler's inlining phase on a separate thread) 149 | type: bool default: false 150 | --strict-heap-broker (fail on incomplete serialization) 151 | type: bool default: true 152 | --trace-heap-broker (trace the heap broker) 153 | type: bool default: false 154 | --stress-runs (number of stress runs) 155 | type: int default: 0 156 | --deopt-every-n-times (deoptimize every n times a deopt point is passed) 157 | type: int default: 0 158 | --print-deopt-stress (print number of possible deopt points) 159 | type: bool default: false 160 | --turbo-sp-frame-access (use stack pointer-relative access to frame wherever possible) 161 | type: bool default: false 162 | --turbo-preprocess-ranges (run pre-register allocation heuristics) 163 | type: bool default: true 164 | --turbo-filter (optimization filter for TurboFan compiler) 165 | type: string default: * 166 | --trace-turbo (trace generated TurboFan IR) 167 | type: bool default: false 168 | --trace-turbo-path (directory to dump generated TurboFan IR to) 169 | type: string default: nullptr 170 | 
--trace-turbo-filter (filter for tracing turbofan compilation) 171 | type: string default: * 172 | --trace-turbo-graph (trace generated TurboFan graphs) 173 | type: bool default: false 174 | --trace-turbo-scheduled (trace TurboFan IR with schedule) 175 | type: bool default: false 176 | --trace-turbo-cfg-file (trace turbo cfg graph (for C1 visualizer) to a given file name) 177 | type: string default: nullptr 178 | --trace-turbo-types (trace TurboFan's types) 179 | type: bool default: true 180 | --trace-turbo-scheduler (trace TurboFan's scheduler) 181 | type: bool default: false 182 | --trace-turbo-reduction (trace TurboFan's various reducers) 183 | type: bool default: false 184 | --trace-turbo-trimming (trace TurboFan's graph trimmer) 185 | type: bool default: false 186 | --trace-turbo-jt (trace TurboFan's jump threading) 187 | type: bool default: false 188 | --trace-turbo-ceq (trace TurboFan's control equivalence) 189 | type: bool default: false 190 | --trace-turbo-loop (trace TurboFan's loop optimizations) 191 | type: bool default: false 192 | --trace-alloc (trace register allocator) 193 | type: bool default: false 194 | --trace-all-uses (trace all use positions) 195 | type: bool default: false 196 | --trace-representation (trace representation types) 197 | type: bool default: false 198 | --turbo-verify (verify TurboFan graphs at each phase) 199 | type: bool default: false 200 | --turbo-verify-machine-graph (verify TurboFan machine graph before instruction selection) 201 | type: string default: nullptr 202 | --trace-verify-csa (trace code stubs verification) 203 | type: bool default: false 204 | --csa-trap-on-node (trigger break point when a node with given id is created in given stub. 
The format is: StubName,NodeId) 205 | type: string default: nullptr 206 | --turbo-stats (print TurboFan statistics) 207 | type: bool default: false 208 | --turbo-stats-nvp (print TurboFan statistics in machine-readable format) 209 | type: bool default: false 210 | --turbo-stats-wasm (print TurboFan statistics of wasm compilations) 211 | type: bool default: false 212 | --turbo-splitting (split nodes during scheduling in TurboFan) 213 | type: bool default: true 214 | --function-context-specialization (enable function context specialization in TurboFan) 215 | type: bool default: false 216 | --turbo-inlining (enable inlining in TurboFan) 217 | type: bool default: true 218 | --max-inlined-bytecode-size (maximum size of bytecode for a single inlining) 219 | type: int default: 500 220 | --max-inlined-bytecode-size-cumulative (maximum cumulative size of bytecode considered for inlining) 221 | type: int default: 1000 222 | --max-inlined-bytecode-size-absolute (maximum cumulative size of bytecode considered for inlining) 223 | type: int default: 5000 224 | --reserve-inline-budget-scale-factor (maximum cumulative size of bytecode considered for inlining) 225 | type: float default: 1.2 226 | --max-inlined-bytecode-size-small (maximum size of bytecode considered for small function inlining) 227 | type: int default: 30 228 | --min-inlining-frequency (minimum frequency for inlining) 229 | type: float default: 0.15 230 | --polymorphic-inlining (polymorphic inlining) 231 | type: bool default: true 232 | --stress-inline (set high thresholds for inlining to inline as much as possible) 233 | type: bool default: false 234 | --trace-turbo-inlining (trace TurboFan inlining) 235 | type: bool default: false 236 | --inline-accessors (inline JavaScript accessors) 237 | type: bool default: true 238 | --inline-into-try (inline into try blocks) 239 | type: bool default: true 240 | --turbo-inline-array-builtins (inline array builtins in TurboFan code) 241 | type: bool default: true 242 | 
--use-osr (use on-stack replacement) 243 | type: bool default: true 244 | --trace-osr (trace on-stack replacement) 245 | type: bool default: false 246 | --analyze-environment-liveness (analyze liveness of environment slots and zap dead values) 247 | type: bool default: true 248 | --trace-environment-liveness (trace liveness of local variable slots) 249 | type: bool default: false 250 | --turbo-load-elimination (enable load elimination in TurboFan) 251 | type: bool default: true 252 | --trace-turbo-load-elimination (trace TurboFan load elimination) 253 | type: bool default: false 254 | --turbo-profiling (enable profiling in TurboFan) 255 | type: bool default: false 256 | --turbo-verify-allocation (verify register allocation in TurboFan) 257 | type: bool default: false 258 | --turbo-move-optimization (optimize gap moves in TurboFan) 259 | type: bool default: true 260 | --turbo-jt (enable jump threading in TurboFan) 261 | type: bool default: true 262 | --turbo-loop-peeling (Turbofan loop peeling) 263 | type: bool default: true 264 | --turbo-loop-variable (Turbofan loop variable optimization) 265 | type: bool default: true 266 | --turbo-cf-optimization (optimize control flow in TurboFan) 267 | type: bool default: true 268 | --turbo-escape (enable escape analysis) 269 | type: bool default: true 270 | --turbo-allocation-folding (Turbofan allocation folding) 271 | type: bool default: true 272 | --turbo-instruction-scheduling (enable instruction scheduling in TurboFan) 273 | type: bool default: false 274 | --turbo-stress-instruction-scheduling (randomly schedule instructions to stress dependency tracking) 275 | type: bool default: false 276 | --turbo-store-elimination (enable store-store elimination in TurboFan) 277 | type: bool default: true 278 | --trace-store-elimination (trace store elimination) 279 | type: bool default: false 280 | --turbo-rewrite-far-jumps (rewrite far to near jumps (ia32,x64)) 281 | type: bool default: true 282 | 
--experimental-inline-promise-constructor (inline the Promise constructor in TurboFan) 283 | type: bool default: true 284 | --untrusted-code-mitigations (Enable mitigations for executing untrusted code) 285 | type: bool default: false 286 | --expose-wasm (expose wasm interface to JavaScript) 287 | type: bool default: true 288 | --assume-asmjs-origin (force wasm decoder to assume input is internal asm-wasm format) 289 | type: bool default: false 290 | --wasm-disable-structured-cloning (disable wasm structured cloning) 291 | type: bool default: false 292 | --wasm-num-compilation-tasks (number of parallel compilation tasks for wasm) 293 | type: int default: 10 294 | --wasm-write-protect-code-memory (write protect code memory on the wasm native heap) 295 | type: bool default: false 296 | --trace-wasm-serialization (trace serialization/deserialization) 297 | type: bool default: false 298 | --wasm-async-compilation (enable actual asynchronous compilation for WebAssembly.compile) 299 | type: bool default: true 300 | --wasm-test-streaming (use streaming compilation instead of async compilation for tests) 301 | type: bool default: false 302 | --wasm-max-mem-pages (maximum number of 64KiB memory pages of a wasm instance) 303 | type: uint default: 32767 304 | --wasm-max-table-size (maximum table size of a wasm instance) 305 | type: uint default: 10000000 306 | --wasm-tier-up (enable wasm baseline compilation and tier up to the optimizing compiler) 307 | type: bool default: true 308 | --trace-wasm-ast-start (start function for wasm AST trace (inclusive)) 309 | type: int default: 0 310 | --trace-wasm-ast-end (end function for wasm AST trace (exclusive)) 311 | type: int default: 0 312 | --liftoff (enable Liftoff, the baseline compiler for WebAssembly) 313 | type: bool default: true 314 | --trace-wasm-memory (print all memory updates performed in wasm code) 315 | type: bool default: false 316 | --wasm-tier-mask-for-testing (bitmask of functions to compile with TurboFan instead of 
Liftoff) 317 | type: int default: 0 318 | --validate-asm (validate asm.js modules before compiling) 319 | type: bool default: true 320 | --suppress-asm-messages (don't emit asm.js related messages (for golden file testing)) 321 | type: bool default: false 322 | --trace-asm-time (log asm.js timing info to the console) 323 | type: bool default: false 324 | --trace-asm-scanner (log tokens encountered by asm.js scanner) 325 | type: bool default: false 326 | --trace-asm-parser (verbose logging of asm.js parse failures) 327 | type: bool default: false 328 | --stress-validate-asm (try to validate everything as asm.js) 329 | type: bool default: false 330 | --dump-wasm-module-path (directory to dump wasm modules to) 331 | type: string default: nullptr 332 | --experimental-wasm-mv (enable prototype multi-value support for wasm) 333 | type: bool default: false 334 | --experimental-wasm-eh (enable prototype exception handling opcodes for wasm) 335 | type: bool default: false 336 | --experimental-wasm-se (enable prototype sign extension opcodes for wasm) 337 | type: bool default: true 338 | --experimental-wasm-sat-f2i-conversions (enable prototype saturating float conversion opcodes for wasm) 339 | type: bool default: false 340 | --experimental-wasm-threads (enable prototype thread opcodes for wasm) 341 | type: bool default: false 342 | --experimental-wasm-simd (enable prototype SIMD opcodes for wasm) 343 | type: bool default: false 344 | --experimental-wasm-anyref (enable prototype anyref opcodes for wasm) 345 | type: bool default: false 346 | --experimental-wasm-mut-global (enable prototype import/export mutable global support for wasm) 347 | type: bool default: true 348 | --wasm-opt (enable wasm optimization) 349 | type: bool default: false 350 | --wasm-no-bounds-checks (disable bounds checks (performance testing only)) 351 | type: bool default: false 352 | --wasm-no-stack-checks (disable stack checks (performance testing only)) 353 | type: bool default: false 354 | 
--wasm-shared-engine (shares one wasm engine between all isolates within a process) 355 | type: bool default: true 356 | --wasm-shared-code (shares code underlying a wasm module when it is transferred) 357 | type: bool default: true 358 | --wasm-trap-handler (use signal handlers to catch out of bounds memory access in wasm (currently Linux x86_64 only)) 359 | type: bool default: false 360 | --wasm-trap-handler-fallback (Use bounds checks if guarded memory is not available) 361 | type: bool default: false 362 | --wasm-fuzzer-gen-test (Generate a test case when running a wasm fuzzer) 363 | type: bool default: false 364 | --print-wasm-code (Print WebAssembly code) 365 | type: bool default: false 366 | --wasm-interpret-all (Execute all wasm code in the wasm interpreter) 367 | type: bool default: false 368 | --asm-wasm-lazy-compilation (enable lazy compilation for asm-wasm modules) 369 | type: bool default: true 370 | --wasm-lazy-compilation (enable lazy compilation for all wasm modules) 371 | type: bool default: false 372 | --frame-count (number of stack frames inspected by the profiler) 373 | type: int default: 1 374 | --type-info-threshold (percentage of ICs that must have type info to allow optimization) 375 | type: int default: 25 376 | --stress-sampling-allocation-profiler (Enables sampling allocation profiler with X as a sample interval) 377 | type: int default: 0 378 | --min-semi-space-size (min size of a semi-space (in MBytes), the new space consists of two semi-spaces) 379 | type: size_t default: 0 380 | --max-semi-space-size (max size of a semi-space (in MBytes), the new space consists of two semi-spaces) 381 | type: size_t default: 0 382 | --semi-space-growth-factor (factor by which to grow the new space) 383 | type: int default: 2 384 | --experimental-new-space-growth-heuristic (Grow the new space based on the percentage of survivors instead of their absolute value.) 
385 | type: bool default: false 386 | --max-old-space-size (max size of the old space (in Mbytes)) 387 | type: size_t default: 0 388 | --initial-old-space-size (initial old space size (in Mbytes)) 389 | type: size_t default: 0 390 | --gc-global (always perform global GCs) 391 | type: bool default: false 392 | --random-gc-interval (Collect garbage after random(0, X) allocations. It overrides gc_interval.) 393 | type: int default: 0 394 | --gc-interval (garbage collect after allocations) 395 | type: int default: -1 396 | --retain-maps-for-n-gc (keeps maps alive for old space garbage collections) 397 | type: int default: 2 398 | --trace-gc (print one trace line following each garbage collection) 399 | type: bool default: false 400 | --trace-gc-nvp (print one detailed trace line in name=value format after each garbage collection) 401 | type: bool default: false 402 | --trace-gc-ignore-scavenger (do not print trace line after scavenger collection) 403 | type: bool default: false 404 | --trace-idle-notification (print one trace line following each idle notification) 405 | type: bool default: false 406 | --trace-idle-notification-verbose (prints the heap state used by the idle notification) 407 | type: bool default: false 408 | --trace-gc-verbose (print more details following each garbage collection) 409 | type: bool default: false 410 | --trace-allocation-stack-interval (print stack trace after free-list allocations) 411 | type: int default: -1 412 | --trace-duplicate-threshold-kb (print duplicate objects in the heap if their size is more than given threshold) 413 | type: int default: 0 414 | --trace-fragmentation (report fragmentation for old space) 415 | type: bool default: false 416 | --trace-fragmentation-verbose (report fragmentation for old space (detailed)) 417 | type: bool default: false 418 | --trace-evacuation (report evacuation statistics) 419 | type: bool default: false 420 | --trace-mutator-utilization (print mutator utilization, allocation speed, gc speed) 
421 | type: bool default: false 422 | --incremental-marking (use incremental marking) 423 | type: bool default: true 424 | --incremental-marking-wrappers (use incremental marking for marking wrappers) 425 | type: bool default: true 426 | --trace-unmapper (Trace the unmapping) 427 | type: bool default: false 428 | --parallel-scavenge (parallel scavenge) 429 | type: bool default: true 430 | --trace-parallel-scavenge (trace parallel scavenge) 431 | type: bool default: false 432 | --write-protect-code-memory (write protect code memory) 433 | type: bool default: true 434 | --concurrent-marking (use concurrent marking) 435 | type: bool default: true 436 | --parallel-marking (use parallel marking in atomic pause) 437 | type: bool default: true 438 | --ephemeron-fixpoint-iterations (number of fixpoint iterations it takes to switch to linear ephemeron algorithm) 439 | type: int default: 10 440 | --trace-concurrent-marking (trace concurrent marking) 441 | type: bool default: false 442 | --black-allocation (use black allocation) 443 | type: bool default: true 444 | --concurrent-store-buffer (use concurrent store buffer processing) 445 | type: bool default: true 446 | --concurrent-sweeping (use concurrent sweeping) 447 | type: bool default: true 448 | --parallel-compaction (use parallel compaction) 449 | type: bool default: true 450 | --parallel-pointer-update (use parallel pointer update during compaction) 451 | type: bool default: true 452 | --detect-ineffective-gcs-near-heap-limit (trigger out-of-memory failure to avoid GC storm near heap limit) 453 | type: bool default: true 454 | --trace-incremental-marking (trace progress of the incremental marking) 455 | type: bool default: false 456 | --trace-stress-marking (trace stress marking progress) 457 | type: bool default: false 458 | --trace-stress-scavenge (trace stress scavenge progress) 459 | type: bool default: false 460 | --track-gc-object-stats (track object counts and memory usage) 461 | type: bool default: false 462 | 
--trace-gc-object-stats (trace object counts and memory usage) 463 | type: bool default: false 464 | --trace-zone-stats (trace zone memory usage) 465 | type: bool default: false 466 | --track-retaining-path (enable support for tracking retaining path) 467 | type: bool default: false 468 | --concurrent-array-buffer-freeing (free array buffer allocations on a background thread) 469 | type: bool default: true 470 | --gc-stats (Used by tracing internally to enable gc statistics) 471 | type: int default: 0 472 | --track-detached-contexts (track native contexts that are expected to be garbage collected) 473 | type: bool default: true 474 | --trace-detached-contexts (trace native contexts that are expected to be garbage collected) 475 | type: bool default: false 476 | --move-object-start (enable moving of object starts) 477 | type: bool default: true 478 | --memory-reducer (use memory reducer) 479 | type: bool default: true 480 | --heap-growing-percent (specifies heap growing factor as (1 + heap_growing_percent/100)) 481 | type: int default: 0 482 | --v8-os-page-size (override OS page size (in KBytes)) 483 | type: int default: 0 484 | --always-compact (Perform compaction on every full GC) 485 | type: bool default: false 486 | --never-compact (Never perform compaction on full GC - testing only) 487 | type: bool default: false 488 | --compact-code-space (Compact code space on full collections) 489 | type: bool default: true 490 | --use-marking-progress-bar (Use a progress bar to scan large objects in increments when incremental marking is active.) 491 | type: bool default: true 492 | --force-marking-deque-overflows (force overflows of marking deque by reducing it's size to 64 words) 493 | type: bool default: false 494 | --stress-compaction (stress the GC compactor to flush out bugs (implies --force_marking_deque_overflows)) 495 | type: bool default: false 496 | --stress-compaction-random (Stress GC compaction by selecting random percent of pages as evacuation candidates. 
It overrides stress_compaction.) 497 | type: bool default: false 498 | --stress-incremental-marking (force incremental marking for small heaps and run it more often) 499 | type: bool default: false 500 | --fuzzer-gc-analysis (prints number of allocations and enables analysis mode for gc fuzz testing, e.g. --stress-marking, --stress-scavenge) 501 | type: bool default: false 502 | --stress-marking (force marking at random points between 0 and X (inclusive) percent of the regular marking start limit) 503 | type: int default: 0 504 | --stress-scavenge (force scavenge at random points between 0 and X (inclusive) percent of the new space capacity) 505 | type: int default: 0 506 | --disable-abortjs (disables AbortJS runtime function) 507 | type: bool default: false 508 | --manual-evacuation-candidates-selection (Test mode only flag. It allows an unit test to select evacuation candidates pages (requires --stress_compaction).) 509 | type: bool default: false 510 | --fast-promotion-new-space (fast promote new space on high survival rates) 511 | type: bool default: false 512 | --clear-free-memory (initialize free memory with 0) 513 | type: bool default: false 514 | --young-generation-large-objects (allocates large objects by default in the young generation large object space) 515 | type: bool default: false 516 | --idle-time-scavenge (Perform scavenges in idle time.) 
517 | type: bool default: true 518 | --debug-code (generate extra code (assertions) for debugging) 519 | type: bool default: false 520 | --code-comments (emit comments in code disassembly; for more readable source positions you should add --no-concurrent_recompilation) 521 | type: bool default: false 522 | --enable-sse3 (enable use of SSE3 instructions if available) 523 | type: bool default: true 524 | --enable-ssse3 (enable use of SSSE3 instructions if available) 525 | type: bool default: true 526 | --enable-sse4-1 (enable use of SSE4.1 instructions if available) 527 | type: bool default: true 528 | --enable-sahf (enable use of SAHF instruction if available (X64 only)) 529 | type: bool default: true 530 | --enable-avx (enable use of AVX instructions if available) 531 | type: bool default: true 532 | --enable-fma3 (enable use of FMA3 instructions if available) 533 | type: bool default: true 534 | --enable-bmi1 (enable use of BMI1 instructions if available) 535 | type: bool default: true 536 | --enable-bmi2 (enable use of BMI2 instructions if available) 537 | type: bool default: true 538 | --enable-lzcnt (enable use of LZCNT instruction if available) 539 | type: bool default: true 540 | --enable-popcnt (enable use of POPCNT instruction if available) 541 | type: bool default: true 542 | --arm-arch (generate instructions for the selected ARM architecture if available: armv6, armv7, armv7+sudiv or armv8) 543 | type: string default: armv8 544 | --force-long-branches (force all emitted branches to be in long mode (MIPS/PPC only)) 545 | type: bool default: false 546 | --mcpu (enable optimization for specific cpu) 547 | type: string default: auto 548 | --partial-constant-pool (enable use of partial constant pools (X64 only)) 549 | type: bool default: true 550 | --enable-armv7 (deprecated (use --arm_arch instead)) 551 | type: maybe_bool default: unset 552 | --enable-vfp3 (deprecated (use --arm_arch instead)) 553 | type: maybe_bool default: unset 554 | --enable-32dregs 
(deprecated (use --arm_arch instead)) 555 | type: maybe_bool default: unset 556 | --enable-neon (deprecated (use --arm_arch instead)) 557 | type: maybe_bool default: unset 558 | --enable-sudiv (deprecated (use --arm_arch instead)) 559 | type: maybe_bool default: unset 560 | --enable-armv8 (deprecated (use --arm_arch instead)) 561 | type: maybe_bool default: unset 562 | --enable-regexp-unaligned-accesses (enable unaligned accesses for the regexp engine) 563 | type: bool default: true 564 | --script-streaming (enable parsing on background) 565 | type: bool default: true 566 | --disable-old-api-accessors (Disable old-style API accessors whose setters trigger through the prototype chain) 567 | type: bool default: false 568 | --expose-natives-as (expose natives in global object) 569 | type: string default: nullptr 570 | --expose-free-buffer (expose freeBuffer extension) 571 | type: bool default: false 572 | --expose-gc (expose gc extension) 573 | type: bool default: false 574 | --expose-gc-as (expose gc extension under the specified name) 575 | type: string default: nullptr 576 | --expose-externalize-string (expose externalize string extension) 577 | type: bool default: false 578 | --expose-trigger-failure (expose trigger-failure extension) 579 | type: bool default: false 580 | --stack-trace-limit (number of stack frames to capture) 581 | type: int default: 10 582 | --builtins-in-stack-traces (show built-in functions in stack traces) 583 | type: bool default: false 584 | --disallow-code-generation-from-strings (disallow eval and friends) 585 | type: bool default: false 586 | --expose-async-hooks (expose async_hooks object) 587 | type: bool default: false 588 | --allow-unsafe-function-constructor (allow invoking the function constructor without security checks) 589 | type: bool default: false 590 | --force-slow-path (always take the slow path for builtins) 591 | type: bool default: false 592 | --inline-new (use fast inline allocation) 593 | type: bool default: true 594 | 
--trace (trace function calls) 595 | type: bool default: false 596 | --lazy (use lazy compilation) 597 | type: bool default: true 598 | --trace-opt (trace lazy optimization) 599 | type: bool default: false 600 | --trace-opt-verbose (extra verbose compilation tracing) 601 | type: bool default: false 602 | --trace-opt-stats (trace lazy optimization statistics) 603 | type: bool default: false 604 | --trace-deopt (trace optimize function deoptimization) 605 | type: bool default: false 606 | --trace-file-names (include file names in trace-opt/trace-deopt output) 607 | type: bool default: false 608 | --trace-interrupts (trace interrupts when they are handled) 609 | type: bool default: false 610 | --always-opt (always try to optimize functions) 611 | type: bool default: false 612 | --always-osr (always try to OSR functions) 613 | type: bool default: false 614 | --prepare-always-opt (prepare for turning on always opt) 615 | type: bool default: false 616 | --trace-serializer (print code serializer trace) 617 | type: bool default: false 618 | --compilation-cache (enable compilation cache) 619 | type: bool default: true 620 | --cache-prototype-transitions (cache prototype transitions) 621 | type: bool default: true 622 | --parallel-compile-tasks (enable parallel compile tasks) 623 | type: bool default: false 624 | --compiler-dispatcher (enable compiler dispatcher) 625 | type: bool default: false 626 | --trace-compiler-dispatcher (trace compiler dispatcher activity) 627 | type: bool default: false 628 | --cpu-profiler-sampling-interval (CPU profiler sampling interval in microseconds) 629 | type: int default: 1000 630 | --trace-js-array-abuse (trace out-of-bounds accesses to JS arrays) 631 | type: bool default: false 632 | --trace-external-array-abuse (trace out-of-bounds-accesses to external arrays) 633 | type: bool default: false 634 | --trace-array-abuse (trace out-of-bounds accesses to all arrays) 635 | type: bool default: false 636 | --trace-side-effect-free-debug-evaluate 
(print debug messages for side-effect-free debug-evaluate for testing) 637 | type: bool default: false 638 | --hard-abort (abort by crashing) 639 | type: bool default: true 640 | --expose-inspector-scripts (expose injected-script-source.js for debugging) 641 | type: bool default: false 642 | --stack-size (default size of stack region v8 is allowed to use (in kBytes)) 643 | type: int default: 984 644 | --max-stack-trace-source-length (maximum length of function source code printed in a stack trace.) 645 | type: int default: 300 646 | --clear-exceptions-on-js-entry (clear pending exceptions when entering JavaScript) 647 | type: bool default: false 648 | --histogram-interval (time interval in ms for aggregating memory histograms) 649 | type: int default: 600000 650 | --heap-profiler-trace-objects (Dump heap object allocations/movements/size_updates) 651 | type: bool default: false 652 | --heap-profiler-use-embedder-graph (Use the new EmbedderGraph API to get embedder nodes) 653 | type: bool default: true 654 | --heap-snapshot-string-limit (truncate strings to this length in the heap snapshot) 655 | type: int default: 1024 656 | --sampling-heap-profiler-suppress-randomness (Use constant sample intervals to eliminate test flakiness) 657 | type: bool default: false 658 | --use-idle-notification (Use idle notification to reduce memory footprint.) 
659 | type: bool default: true 660 | --trace-ic (trace inline cache state transitions for tools/ic-processor) 661 | type: bool default: false 662 | --ic-stats (inline cache state transitions statistics) 663 | type: int default: 0 664 | --native-code-counters (generate extra code for manipulating stats counters) 665 | type: bool default: false 666 | --thin-strings (Enable ThinString support) 667 | type: bool default: true 668 | --trace-prototype-users (Trace updates to prototype user tracking) 669 | type: bool default: false 670 | --use-verbose-printer (allows verbose printing) 671 | type: bool default: true 672 | --trace-for-in-enumerate (Trace for-in enumerate slow-paths) 673 | type: bool default: false 674 | --trace-maps (trace map creation) 675 | type: bool default: false 676 | --trace-maps-details (also log map details) 677 | type: bool default: true 678 | --allow-natives-syntax (allow natives syntax) 679 | type: bool default: false 680 | --trace-sim (Trace simulator execution) 681 | type: bool default: false 682 | --debug-sim (Enable debugging the simulator) 683 | type: bool default: false 684 | --check-icache (Check icache flushes in ARM and MIPS simulator) 685 | type: bool default: false 686 | --stop-sim-at (Simulator stop after x number of instructions) 687 | type: int default: 0 688 | --sim-stack-alignment (Stack alingment in bytes in simulator (4 or 8, 8 is default)) 689 | type: int default: 8 690 | --sim-stack-size (Stack size of the ARM64, MIPS64 and PPC64 simulator in kBytes (default is 2 MB)) 691 | type: int default: 2048 692 | --log-colour (When logging, try to use coloured output.) 693 | type: bool default: true 694 | --ignore-asm-unimplemented-break (Don't break for ASM_UNIMPLEMENTED_BREAK macros.) 695 | type: bool default: false 696 | --trace-sim-messages (Trace simulator debug messages. Implied by --trace-sim.) 
697 | type: bool default: false 698 | --async-stack-traces (include async stack traces in Error.stack) 699 | type: bool default: false 700 | --stack-trace-on-illegal (print stack trace when an illegal exception is thrown) 701 | type: bool default: false 702 | --abort-on-uncaught-exception (abort program (dump core) when an uncaught exception is thrown) 703 | type: bool default: false 704 | --abort-on-stack-or-string-length-overflow (Abort program when the stack overflows or a string exceeds maximum length (as opposed to throwing RangeError). This is useful for fuzzing where the spec behaviour would introduce nondeterminism.) 705 | type: bool default: false 706 | --randomize-hashes (randomize hashes to avoid predictable hash collisions (with snapshots this option cannot override the baked-in seed)) 707 | type: bool default: true 708 | --rehash-snapshot (rehash strings from the snapshot to override the baked-in seed) 709 | type: bool default: true 710 | --hash-seed (Fixed seed to use to hash property keys (0 means random)(with snapshots this option cannot override the baked-in seed)) 711 | type: uint64 default: 0 712 | --random-seed (Default seed for initializing random generator (0, the default, means to use system random).) 713 | type: int default: 0 714 | --fuzzer-random-seed (Default seed for initializing fuzzer random generator (0, the default, means to use v8's random number generator seed).) 715 | type: int default: 0 716 | --trace-rail (trace RAIL mode) 717 | type: bool default: false 718 | --print-all-exceptions (print exception object and stack trace on each thrown exception) 719 | type: bool default: false 720 | --runtime-call-stats (report runtime call counts and times) 721 | type: bool default: false 722 | --runtime-stats (internal usage only for controlling runtime statistics) 723 | type: int default: 0 724 | --profile-deserialization (Print the time it takes to deserialize the snapshot.) 
725 | type: bool default: false 726 | --serialization-statistics (Collect statistics on serialized objects.) 727 | type: bool default: false 728 | --serialization-chunk-size (Custom size for serialization chunks) 729 | type: uint default: 4096 730 | --regexp-optimization (generate optimized regexp code) 731 | type: bool default: true 732 | --regexp-mode-modifiers (enable inline flags in regexp.) 733 | type: bool default: false 734 | --testing-bool-flag (testing_bool_flag) 735 | type: bool default: true 736 | --testing-maybe-bool-flag (testing_maybe_bool_flag) 737 | type: maybe_bool default: unset 738 | --testing-int-flag (testing_int_flag) 739 | type: int default: 13 740 | --testing-float-flag (float-flag) 741 | type: float default: 2.5 742 | --testing-string-flag (string-flag) 743 | type: string default: Hello, world! 744 | --testing-prng-seed (Seed used for threading test randomness) 745 | type: int default: 42 746 | --embedded-src (Path for the generated embedded data file. (mksnapshot only)) 747 | type: string default: nullptr 748 | --embedded-variant (Label to disambiguate symbols in embedded data file. (mksnapshot only)) 749 | type: string default: nullptr 750 | --startup-src (Write V8 startup as C++ src. (mksnapshot only)) 751 | type: string default: nullptr 752 | --startup-blob (Write V8 startup blob file. 
(mksnapshot only)) 753 | type: string default: nullptr 754 | --minor-mc-parallel-marking (use parallel marking for the young generation) 755 | type: bool default: true 756 | --trace-minor-mc-parallel-marking (trace parallel marking for the young generation) 757 | type: bool default: false 758 | --minor-mc (perform young generation mark compact GCs) 759 | type: bool default: false 760 | --help (Print usage message, including flags, on console) 761 | type: bool default: true 762 | --dump-counters (Dump counters on exit) 763 | type: bool default: false 764 | --dump-counters-nvp (Dump counters as name-value pairs on exit) 765 | type: bool default: false 766 | --use-external-strings (Use external strings for source code) 767 | type: bool default: false 768 | --map-counters (Map counters to a file) 769 | type: string default: 770 | --mock-arraybuffer-allocator (Use a mock ArrayBuffer allocator for testing.) 771 | type: bool default: false 772 | --mock-arraybuffer-allocator-limit (Memory limit for mock ArrayBuffer allocator used to simulate OOM for testing.) 773 | type: size_t default: 0 774 | --opt (use adaptive optimizations) 775 | type: bool default: true 776 | --use-ic (use inline caching) 777 | type: bool default: true 778 | --optimize-for-size (Enables optimizations which favor memory size over execution speed) 779 | type: bool default: false 780 | --log (Minimal logging (no API, code, GC, suspect, or handles samples).) 781 | type: bool default: false 782 | --log-all (Log all events to the log file.) 783 | type: bool default: false 784 | --log-api (Log API events to the log file.) 785 | type: bool default: false 786 | --log-code (Log code events to the log file without profiling.) 787 | type: bool default: false 788 | --log-handles (Log global handle events.) 789 | type: bool default: false 790 | --log-suspect (Log suspect operations.) 791 | type: bool default: false 792 | --log-source-code (Log source code.) 
793 | type: bool default: false 794 | --log-function-events (Log function events (parse, compile, execute) separately.) 795 | type: bool default: false 796 | --prof (Log statistical profiling information (implies --log-code).) 797 | type: bool default: false 798 | --detailed-line-info (Always generate detailed line information for CPU profiling.) 799 | type: bool default: false 800 | --prof-sampling-interval (Interval for --prof samples (in microseconds).) 801 | type: int default: 1000 802 | --prof-cpp (Like --prof, but ignore generated code.) 803 | type: bool default: false 804 | --prof-browser-mode (Used with --prof, turns on browser-compatible mode for profiling.) 805 | type: bool default: true 806 | --logfile (Specify the name of the log file.) 807 | type: string default: v8.log 808 | --logfile-per-isolate (Separate log files for each isolate.) 809 | type: bool default: true 810 | --ll-prof (Enable low-level linux profiler.) 811 | type: bool default: false 812 | --interpreted-frames-native-stack (Show interpreted frames on the native stack (useful for external profilers).) 813 | type: bool default: false 814 | --perf-basic-prof (Enable perf linux profiler (basic support).) 815 | type: bool default: false 816 | --perf-basic-prof-only-functions (Only report function code ranges to perf (i.e. no stubs).) 817 | type: bool default: false 818 | --perf-prof (Enable perf linux profiler (experimental annotate support).) 819 | type: bool default: false 820 | --perf-prof-unwinding-info (Enable unwinding info for perf linux profiler (experimental).) 821 | type: bool default: false 822 | --gc-fake-mmap (Specify the name of the file for fake gc mmap used in ll_prof) 823 | type: string default: /tmp/__v8_gc__ 824 | --log-internal-timer-events (Time internal events.) 825 | type: bool default: false 826 | --log-timer-events (Time events including external callbacks.) 827 | type: bool default: false 828 | --log-instruction-stats (Log AArch64 instruction statistics.) 
829 | type: bool default: false 830 | --log-instruction-file (AArch64 instruction statistics log file.) 831 | type: string default: arm64_inst.csv 832 | --log-instruction-period (AArch64 instruction statistics logging period.) 833 | type: int default: 4194304 834 | --redirect-code-traces (output deopt information and disassembly into file code--.asm) 835 | type: bool default: false 836 | --redirect-code-traces-to (output deopt information and disassembly into the given file) 837 | type: string default: nullptr 838 | --print-opt-source (print source code of optimized and inlined functions) 839 | type: bool default: false 840 | --predictable (enable predictable mode) 841 | type: bool default: false 842 | --predictable-gc-schedule (Predictable garbage collection schedule. Fixes heap growing, idle, and memory reducing behavior.) 843 | type: bool default: false 844 | --single-threaded (disable the use of background tasks) 845 | type: bool default: false 846 | --single-threaded-gc (disable the use of background gc tasks) 847 | type: bool default: false 848 | ``` 849 | -------------------------------------------------------------------------------- /docs/V8_COMPILER_PIPELINE.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimvanScherpenzeel/profiling-research/9fde9685f3f9741768a8226229d0534065bc57fd/docs/V8_COMPILER_PIPELINE.jpg -------------------------------------------------------------------------------- /scripts/run_macos.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Script to start the Chrome Canary with V8 profiler flags 4 | 5 | # Configuration 6 | LOCATION="${1:-http://localhost:8080}" 7 | LOG_DIRECTORY="${2:-logs}" 8 | LOG_OUTPUT="${3:-logs/chrome_canary_output.log}" 9 | LOG_ERROR="${4:-logs/chrome_canary_error.log}" 10 | BASE_TEMP_PROFILE_DIR="${5:-/tmp}" 11 | REMOTE_DEBUGGING_PORT="${6:-9222}" 12 | 13 | # Temporary profile 14 
| TEMP_PROFILE_DIR=$(mktemp -d "$BASE_TEMP_PROFILE_DIR/google-chrome.XXXXXXX") 15 | 16 | # Helper functions 17 | 18 | function log () { 19 | echo -e "\033[36m" 20 | echo "#########################################################" 21 | echo "#### $1 " 22 | echo "#########################################################" 23 | echo -e "\033[m" 24 | } 25 | 26 | run() { 27 | log "Starting Chrome Canary with V8 profiler flags" 28 | 29 | mkdir -p "$LOG_DIRECTORY" 30 | touch "$LOG_OUTPUT" 31 | touch "$LOG_ERROR" 32 | 33 | echo -e "Starting Chrome Canary with custom profiling flags\n" 34 | 35 | echo -e "Created temporary profile folder in $TEMP_PROFILE_DIR" 36 | 37 | # Opening chrome://tracing is not allowed from the command line 38 | echo -e "Please open \"chrome://tracing\" in a new browser tab to start structural profiling\n" 39 | 40 | # Chrome flags 41 | 42 | # --incognito | Launches Chrome in incognito mode 43 | # --disable-gpu-vsync + --disable-frame-rate-limit | Disables VSync and lifts the 60 frames per second rate limit imposed by Chrome 44 | # --no-default-browser-check | Disables the pop up window checking if Chrome is the default browser 45 | # --enable-precise-memory-info | Enables precise memory info (otherwise the results from performance.memory are bucketed and less useful) 46 | # --remote-debugging-port | Enables remote debugging using the DevTools API 47 | # --user-data-dir + --no-first-run | Stores the user profile in the given temporary directory and disables the first-run pop up window shown for a new profile 48 | 49 | # V8 flags 50 | 51 | # --trace-file-names | When tracing, show the name of the file where the optimized or de-optimized code is located 52 | # --trace-opt | Trace code optimisations of hot functions 53 | # --trace-deopt | Trace code de-optimisations of hot functions 54 | # --print-opt-source | Print the source code of optimized and inlined functions 55 | # --code-comments | Comment the code where possible (useful for
understanding the optimized and deoptimized source code) 56 | 57 | /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary "$LOCATION" --incognito --disable-gpu-vsync --disable-frame-rate-limit --no-default-browser-check --enable-precise-memory-info --remote-debugging-port="$REMOTE_DEBUGGING_PORT" --user-data-dir="$TEMP_PROFILE_DIR" --no-first-run --js-flags="--trace-file-names --trace-opt --trace-deopt --print-opt-source --code-comments" 1> "$LOG_OUTPUT" 2> "$LOG_ERROR" 58 | 59 | echo -e "Cleaning up temporary profile folder in $TEMP_PROFILE_DIR" 60 | 61 | rm -rf "$TEMP_PROFILE_DIR" 62 | } 63 | 64 | # Main script 65 | 66 | run 67 | 68 | log "Done!" -------------------------------------------------------------------------------- /scripts/setup_macos.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Script to install and set up the required toolchain 4 | 5 | # Helper functions 6 | 7 | function log () { 8 | echo -e "\033[36m" 9 | echo "#########################################################" 10 | echo "#### $1 " 11 | echo "#########################################################" 12 | echo -e "\033[m" 13 | } 14 | 15 | setup() { 16 | UNAME=$(uname) 17 | 18 | if [ "$UNAME" != "Darwin" ]; then 19 | echo "Currently only macOS is supported by this automatic setup script" 20 | exit 1 21 | fi 22 | 23 | # Install `homebrew` dependencies 24 | log "Installing Homebrew packages" 25 | 26 | if !
type "brew" > /dev/null 2>&1; then 27 | echo "Please install Homebrew (https://brew.sh/)" 28 | exit 1 29 | else 30 | for pkg in google-chrome-canary; do 31 | if brew cask list -1 | grep -q "^${pkg}\$"; then 32 | echo "Package '$pkg' is already installed" 33 | else 34 | echo "Package '$pkg' is not installed" 35 | 36 | # Install 37 | log "Installing Google Chrome Canary" 38 | brew tap homebrew/cask-versions && brew cask install google-chrome-canary 39 | fi 40 | done 41 | fi 42 | } 43 | 44 | # Main script 45 | 46 | setup 47 | 48 | log "Done!" --------------------------------------------------------------------------------
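`scripts/run_macos.sh` removes the temporary profile folder only after Chrome exits and the script reaches its `rm -rf` line, so an interrupted run can leave the folder behind. The following is a minimal sketch, assuming bash, of one way to make that cleanup unconditional with an `EXIT` trap. `with_temp_profile` is a hypothetical helper and is not part of the repository; it reuses the script's `BASE_TEMP_PROFILE_DIR` variable name and appends `--user-data-dir=<dir>` to whatever command it is given.

```bash
#!/bin/bash

# Hypothetical helper (not in the repository): run a command with a throwaway
# profile directory that is removed even when the command fails or is interrupted.
with_temp_profile() {
  local base_dir="${BASE_TEMP_PROFILE_DIR:-/tmp}"
  local profile_dir

  # Strip a trailing slash from the base directory before building the template
  profile_dir=$(mktemp -d "${base_dir%/}/google-chrome.XXXXXXX") || return 1

  # The subshell scopes the trap: it runs when the subshell exits,
  # and the subshell's exit status is still that of the wrapped command
  (
    trap 'rm -rf "$profile_dir"' EXIT
    "$@" "--user-data-dir=$profile_dir"
  )
}
```

For a dry run, `with_temp_profile echo "would launch with"` prints the generated `--user-data-dir` argument and leaves no folder behind. In the run script the wrapped command would be the Chrome Canary binary followed by its other flags.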