├── .gitignore ├── LICENSE ├── README.md ├── build.gradle ├── settings.gradle └── src ├── main └── java │ └── msplit │ ├── SplitMethod.java │ ├── Splitter.java │ └── Util.java └── test └── java └── msplit ├── RuntimeCompiler.java ├── SplitMethodTest.java └── TestUtil.java /.gitignore: -------------------------------------------------------------------------------- 1 | /gradlew.bat 2 | /.gradle 3 | /.idea 4 | /gradle 5 | /gradlew 6 | /build 7 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Chad Retz 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # MSplit 2 | 3 | MSplit splits JVM methods that are too large. Often when generating JVM bytecode with [ASM](https://asm.ow2.io/), a method 4 | will be too large, causing the "Method code too large!" exception. The JVM limits method code size to 64k. This project 5 | helps get around that. 6 | 7 | While the goal is similar to https://bitbucket.org/sperber/asm-method-size, this project is much simpler in practice. Although it 8 | should work for most use cases in theory, few of the edge cases have been tested. Please report any issues and 9 | hopefully a test case can be written to reproduce them. 10 | 11 | ## Usage 12 | 13 | Since the code in this repository is only a few files, it was decided not to put it on Maven Central but instead 14 | encourage developers to shade/vendor/embed the code in their own project by just copying the files. 15 | 16 | The common way to split a method is to use `msplit.SplitMethod#split`, which accepts the internal name of the class that owns the 17 | method, an ASM `MethodNode` to split, and `minSize`, `maxSize`, and `firstAtLeast` parameters. `minSize` is the minimum 18 | number of instructions that must be split off, `maxSize` is the maximum number of instructions that can be split off, 19 | and `firstAtLeast` is the number of instructions that, when first reached, causes that split point to be used 20 | immediately. If `firstAtLeast` is <= 0, the entire set of split points is checked to find the largest within min/max. 21 | An overload of `split` exists that defaults `minSize` to 20% + 1 of the original instruction count, `maxSize` to 70% + 1, and `firstAtLeast` to 22 | `maxSize`. 23 | 24 | This method returns a `Result` which contains the `splitOffMethod`, which is the new method split off from the original, 25 | and `trimmedMethod`, which is the original changed to call the split off method. 
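The default bounds and the early-exit rule described above can be sketched with plain integers. This is only an illustration: the class `SplitSizeSketch`, its method names, and the idea of representing each candidate split point by its instruction count are assumptions made for this example and are not part of MSplit. The arithmetic and the selection loop follow what `SplitMethod#split` does in this repository.

```java
// Hypothetical helper, not part of MSplit: models split points as instruction counts.
class SplitSizeSketch {

    // Default lower bound used by the two-argument overload: 20% of the instruction count, plus 1.
    static int defaultMinSize(int insnCount) {
        return (int) (insnCount * 0.2) + 1;
    }

    // Default upper bound used by the two-argument overload: 70% of the instruction count, plus 1.
    static int defaultMaxSize(int insnCount) {
        return (int) (insnCount * 0.7) + 1;
    }

    // Picks a candidate the way the split loop does: track the largest seen so far and
    // stop as soon as it reaches firstAtLeast (when firstAtLeast > 0).
    // Returns -1 when there are no candidates, standing in for the null result.
    static int choose(int[] candidateLengths, int firstAtLeast) {
        int largest = -1;
        for (int length : candidateLengths) {
            if (largest < 0 || length > largest) {
                largest = length;
                // Early exit once a large enough candidate is seen
                if (firstAtLeast > 0 && largest >= firstAtLeast) break;
            }
        }
        return largest;
    }

    public static void main(String[] args) {
        int insnCount = 1000;
        System.out.println("minSize = " + defaultMinSize(insnCount));
        System.out.println("maxSize = " + defaultMaxSize(insnCount));
        int[] candidates = {30, 50, 40, 65};
        System.out.println("eager pick = " + choose(candidates, 45));
        System.out.println("exhaustive pick = " + choose(candidates, 0));
    }
}
```

With a threshold of 45 the loop stops at the first candidate of at least that size, even though a larger one appears later; with a threshold of 0 every candidate is examined and the largest wins.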
The original method is left untouched. 26 | The method uses the `msplit.Splitter` class, which iterates over `msplit.Splitter.SplitPoint` instances, 27 | continually returning split point possibilities. 28 | 29 | The two created methods have all their frames removed and maxs invalid, so when writing with ASM, make sure the class 30 | writer is set to compute frames and maxs. 31 | 32 | ## How it Works 33 | 34 | The algorithm takes two steps: the first finds valid "split points" where a section of code can be taken out of the 35 | original and put into another method (in `msplit.Splitter`), and the second uses a split point to do the actual 36 | splitting (in `msplit.SplitMethod`). 37 | 38 | The `msplit.Splitter` algorithm is an iterator that iterates over potential split points constrained by a user-supplied 39 | min and max instruction count. The algorithm goes one instruction at a time and: 40 | 41 | 1. Creates a split point from the current instruction to the max size 42 | 1. Changes the end index based on try-catch blocks: 43 | 1. If the try block is completely within the split point, everything is ok unless the catch handler is not, at 44 | which point the end is changed to before the try block to completely exclude it 45 | 1. If the try block starts before the split point but ends inside, the end is reduced to the block's end 46 | 1. If the try block starts inside the split point but ends outside, the end is reduced to before the start 47 | 1. In all cases except the first (i.e. the try block completely inside), if the catch handler jumps inside the block 48 | then the end is reduced to before the catch handler 49 | 1. Reduces the end to just before any jump instruction that jumps out of the split point 50 | 1. Reduces the end to just before any target in the split point jumped to by a non-split-point instruction 51 | 52 | Then, for that split point, more information is added to it. Specifically: 53 | 54 | 1. 
Record the locals that are read 55 | 1. Record the locals that are written 56 | 1. Record the lowest depth the stack reaches 57 | 58 | Finally, build the split point with that information. 59 | 60 | The `msplit.SplitMethod` algorithm takes a split point and applies it to the method. It has overloads to find the best 61 | split point based on min/max instruction limits and optionally stopping eagerly when it finds one that reaches a certain 62 | size. Then it creates two methods: the split off method, which is the new one with instructions inside the split point, 63 | and the trimmed method, which is the original one but with the split point instructions removed and replaced with a call 64 | to the split off method. 65 | 66 | To create the split off method, a new method is created that accepts the needed start stack types and the read local 67 | types as parameters. It returns an object array which contains the resulting stack items and the resulting written local 68 | types. It is created as a private static synthetic method. When called, the method: 69 | 70 | 1. Writes all read local parameters to locals 71 | 1. Pushes all stack items from parameters on to the stack 72 | 1. Uses the split off instructions 73 | 1. Creates a return object array 74 | 1. Puts the required stack items in the object array 75 | 1. Puts the written locals in the object array 76 | 1. Adds all try-catch blocks from the original that are fully contained within the split point 77 | 78 | All object array work is built to box and unbox as necessary when primitives are encountered. 79 | 80 | To create the trimmed method, the method sans instructions and try/catch blocks is copied. When called, the method: 81 | 82 | 1. Uses all normal instructions up to the split point, keeping track of written locals 83 | 1. Pushes all needed read locals on the stack for split off invocation 84 | 1. NOTE: This uses the written local knowledge from the first step to determine if the local is uninitialized. 
If it 85 | is uninitialized, it uses the "zero val" of the local instead of loading it. Not yet sure if this is an acceptable 86 | approach to determine uninitialized locals. 87 | 1. Invokes the split off method, which pops/uses the stack then the pushed locals as parameters 88 | 1. Takes the result of the split-off method (the object array) and writes the locals back that were changed 89 | 1. Pushes back on the stack the stack portion of the object array 90 | 1. Uses all normal instructions after the split point 91 | 1. Adds back all try-catch blocks not fully contained within the split point 92 | 93 | There is more complication than this, but the general idea is here. -------------------------------------------------------------------------------- /build.gradle: -------------------------------------------------------------------------------- 1 | plugins { 2 | id 'java' 3 | } 4 | 5 | group 'com.github.cretz.msplit' 6 | version '0.1.0-SNAPSHOT' 7 | 8 | sourceCompatibility = 1.8 9 | 10 | repositories { 11 | mavenCentral() 12 | } 13 | 14 | dependencies { 15 | compileOnly 'org.ow2.asm:asm-tree:6.2.1' 16 | compileOnly 'org.ow2.asm:asm-commons:6.2.1' 17 | testImplementation 'org.ow2.asm:asm-tree:6.2.1' 18 | testImplementation 'org.ow2.asm:asm-commons:6.2.1' 19 | testImplementation 'org.ow2.asm:asm-util:6.2.1' 20 | testImplementation 'junit:junit:4.12' 21 | } 22 | -------------------------------------------------------------------------------- /settings.gradle: -------------------------------------------------------------------------------- 1 | rootProject.name = 'msplit' 2 | 3 | -------------------------------------------------------------------------------- /src/main/java/msplit/SplitMethod.java: -------------------------------------------------------------------------------- 1 | package msplit; 2 | 3 | 4 | import org.objectweb.asm.Label; 5 | import org.objectweb.asm.Opcodes; 6 | import org.objectweb.asm.Type; 7 | import org.objectweb.asm.tree.*; 8 | 9 | import 
java.util.*; 10 | 11 | import static msplit.Util.*; 12 | 13 | /** Splits a method into two */ 14 | public class SplitMethod { 15 | 16 | protected final int api; 17 | 18 | /** @param api Same as for {@link org.objectweb.asm.MethodVisitor#MethodVisitor(int)} or any other ASM class */ 19 | public SplitMethod(int api) { this.api = api; } 20 | 21 | /** 22 | * Calls {@link #split(String, MethodNode, int, int, int)} with minSize as 20% + 1 of the original, maxSize as 23 | * 70% + 1 of the original, and firstAtLeast as maxSize. The original method is never modified and the result can 24 | * be null if no split points are found. 25 | */ 26 | public Result split(String owner, MethodNode method) { 27 | // Between 20% + 1 and 70% + 1 of size 28 | int insnCount = method.instructions.size(); 29 | int minSize = (int) (insnCount * 0.2) + 1; 30 | int maxSize = (int) (insnCount * 0.7) + 1; 31 | return split(owner, method, minSize, maxSize, maxSize); 32 | } 33 | 34 | /** 35 | * Splits the given method into two. This uses a {@link Splitter} to consistently create 36 | * {@link msplit.Splitter.SplitPoint}s until one reaches firstAtLeast or the largest otherwise, and then calls 37 | * {@link #fromSplitPoint(String, MethodNode, Splitter.SplitPoint)}. 38 | * 39 | * @param owner The internal name of the owning class. Needed when splitting to call the split off method. 40 | * @param method The method to split, never modified 41 | * @param minSize The minimum number of instructions the split off method must have 42 | * @param maxSize The maximum number of instructions the split off method can have 43 | * @param firstAtLeast The number of instructions that, when first reached, will immediately be used without 44 | * continuing. Since split points are streamed, this allows splitting without waiting to 45 | * find the largest overall. If this is <= 0, it will not apply and all split points will be 46 | * checked to find the largest before doing the split. 
47 | * @return The resulting split method or null if there were no split points found 48 | */ 49 | public Result split(String owner, MethodNode method, int minSize, int maxSize, int firstAtLeast) { 50 | // Get the largest split point 51 | Splitter.SplitPoint largest = null; 52 | for (Splitter.SplitPoint point : new Splitter(api, owner, method, minSize, maxSize)) { 53 | if (largest == null || point.length > largest.length) { 54 | largest = point; 55 | // Early exit? 56 | if (firstAtLeast > 0 && largest.length >= firstAtLeast) break; 57 | } 58 | } 59 | if (largest == null) return null; 60 | return fromSplitPoint(owner, method, largest); 61 | } 62 | 63 | /** 64 | * Split the given method at the given split point. Called by {@link #split(String, MethodNode, int, int, int)}. The 65 | * original method is never modified. 66 | */ 67 | public Result fromSplitPoint(String owner, MethodNode orig, Splitter.SplitPoint splitPoint) { 68 | MethodNode splitOff = createSplitOffMethod(orig, splitPoint); 69 | MethodNode trimmed = createTrimmedMethod(owner, orig, splitOff, splitPoint); 70 | return new Result(trimmed, splitOff); 71 | } 72 | 73 | protected MethodNode createSplitOffMethod(MethodNode orig, Splitter.SplitPoint splitPoint) { 74 | // The new method is a static synthetic method named method.name + "$split" that returns an object array 75 | // Key is previous local index, value is new local index 76 | Map<Integer, Integer> localsMap = new HashMap<>(); 77 | // The new method's parameters are all stack items + all read locals 78 | List<Type> args = new ArrayList<>(splitPoint.neededFromStackAtStart); 79 | splitPoint.localsRead.forEach((index, type) -> { 80 | args.add(type); 81 | localsMap.put(index, args.size() - 1); 82 | }); 83 | // Create the new method 84 | String name = orig.name.replace("<", "__").replace(">", "__") + "$split"; 85 | MethodNode newMethod = new MethodNode(api, 86 | Opcodes.ACC_STATIC + Opcodes.ACC_PRIVATE + Opcodes.ACC_SYNTHETIC, name, 87 |
Type.getMethodDescriptor(Type.getType(Object[].class), args.toArray(new Type[0])), null, null); 88 | // Add the written locals to the map that are not already there 89 | int newLocalIndex = args.size(); 90 | for (Integer key : splitPoint.localsWritten.keySet()) { 91 | if (!localsMap.containsKey(key)) { 92 | localsMap.put(key, newLocalIndex); 93 | newLocalIndex++; 94 | } 95 | } 96 | // First set of instructions is pushing the new stack from the params 97 | for (int i = 0; i < splitPoint.neededFromStackAtStart.size(); i++) { 98 | Type item = splitPoint.neededFromStackAtStart.get(i); 99 | newMethod.visitVarInsn(loadOpFromType(item), i); 100 | } 101 | // Next set of instructions comes verbatim from the original, but we have to change the local indexes 102 | Set