ElementwiseOps are UnaryOps, BinaryOps, and TernaryOps.
They operate on 1-3 tensors and run elementwise.
example: SQRT, LOG2, ADD, MUL, WHERE, etc...
ReduceOps operate on one tensor and return a smaller tensor.
example: SUM, MAX
MovementOps are virtual ops that operate on one tensor and move the data around.
They are copy-free, thanks to ShapeTracker.
example: RESHAPE, PERMUTE, EXPAND, etc...
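
A minimal sketch of how these three op families surface in the Tensor frontend (assuming a current tinygrad install; each call below lowers to the op families named above):

```python
from tinygrad import Tensor

x = Tensor([[1.0, 2.0], [3.0, 4.0]])
y = (x * 2).sqrt()   # ElementwiseOps: MUL, then SQRT
s = x.sum(axis=1)    # ReduceOp: SUM, returns a smaller tensor of shape (2,)
p = x.permute(1, 0)  # MovementOp: PERMUTE, no data copy thanks to ShapeTracker
print(y.numpy(), s.numpy(), p.numpy(), sep="\n")
```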

But how...where are your CONVs and MATMULs? Read the code to solve this mystery.
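
A hint, sketched under the assumption that the three op families above are all you get (this mirrors the idea, not tinygrad's actual lowering code): a matmul is just MovementOps to line up the axes, an elementwise MUL, and a SUM reduce.

```python
from tinygrad import Tensor

a, b = Tensor.rand(3, 4), Tensor.rand(4, 5)
# RESHAPE (plus implicit EXPAND via broadcasting) aligns the shapes,
# MUL runs elementwise, SUM reduces over the shared axis k
out = (a.reshape(3, 4, 1) * b.reshape(1, 4, 5)).sum(axis=1)
print((out - a.matmul(b)).abs().max().numpy())  # ~0
```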

Work at tiny corp

We are now funded and hiring full-time software engineers. Very talented interns okay.
See our bounty page to judge if you might be a good fit. Bounties pay you while we judge that fit.
We are also hiring for operations and hardware, but if you haven't contributed to tinygrad, your application won't be considered.

tinybox (now shipping)

We sell a computer called the tinybox. It comes in two colors.
It is a very powerful computer for deep learning, and likely the best performance/$. It was benchmarked in MLPerf Training 4.0 vs computers that cost 10x as much. And of course, anything that can train can do inference.

How do I get a tinybox?

Place an order through the links above. The factory is up and running, and it will ship within one week of us receiving the payment. Currently offering pickup in San Diego + shipping worldwide.

Is tinygrad used anywhere?

tinygrad is used in openpilot to run the driving model on the Snapdragon 845 GPU. It replaces SNPE, is faster, supports loading onnx files, supports training, and allows for attention (SNPE only allows fixed weights).

Is tinygrad inference only?

No! It supports full forward and backward passes with autodiff. This is implemented at a level of abstraction higher than the accelerator-specific code, so a tinygrad port gets you this for free.
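
A quick sketch of the backward pass (assuming a current tinygrad install):

```python
from tinygrad import Tensor

x = Tensor([2.0, 3.0], requires_grad=True)
loss = (x * x).sum()   # forward pass builds the graph
loss.backward()        # autodiff fills in x.grad
print(x.grad.numpy())  # d(sum(x^2))/dx = 2x -> [4. 6.]
```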

How can I use tinygrad for my next ML project?

Follow the installation instructions on the tinygrad repo. The API is similar to PyTorch's, but simpler and more refined. Be warned that it is less stable while tinygrad is in alpha, though in practice it has been fairly stable for a while.
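
A minimal training-step sketch to show the PyTorch-like feel; the network, layer sizes, and hyperparameters here are arbitrary placeholders, and APIs may shift while tinygrad is in alpha:

```python
from tinygrad import Tensor
from tinygrad.nn import Linear
from tinygrad.nn.optim import Adam
from tinygrad.nn.state import get_parameters

class TinyNet:
  def __init__(self):
    self.l1, self.l2 = Linear(784, 128), Linear(128, 10)
  def __call__(self, x: Tensor) -> Tensor:
    return self.l2(self.l1(x).relu())

model = TinyNet()
opt = Adam(get_parameters(model), lr=1e-3)
x, y = Tensor.rand(64, 784), Tensor.randint(64, high=10)  # stand-in batch

with Tensor.train():  # enables training mode
  opt.zero_grad()
  loss = model(x).sparse_categorical_crossentropy(y)
  loss.backward()
  opt.step()
print(loss.numpy())
```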

When will tinygrad leave alpha?

When we can reproduce a common set of papers on 1 NVIDIA GPU 2x faster than PyTorch. We also want the speed to be good on the M1. ETA: Q2 next year.

How is tinygrad faster than PyTorch?

For most use cases it isn't yet, but it will be. It has three advantages (see the sketch after this list):

1. It compiles a custom kernel for every operation, allowing extreme shape specialization.
2. All tensors are lazy, so it can aggressively fuse operations.
3. The backend is 10x+ simpler, meaning optimizing one kernel makes everything fast.
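
A small sketch of the laziness (assuming a current tinygrad install; running with DEBUG=2 set in the environment prints the kernels as they execute):

```python
from tinygrad import Tensor

x = Tensor.rand(1024, 1024)
y = ((x * 2).relu() + 1).sum()  # builds a lazy graph; nothing has run yet
print(y.numpy())                # scheduling, fusion, and codegen happen here
```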