├── .github
    └── FUNDING.yml
├── CHANGELOG.md
├── LICENSE
├── README.md
├── composer.json
└── src
    ├── AssocDecoder.php
    ├── Decoder.php
    └── Encoder.php


/.github/FUNDING.yml:
--------------------------------------------------------------------------------
github: clue
custom: https://clue.engineering/support

--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
# Changelog

## 1.2.0 (2022-05-13)

*   Feature: Support custom EOL character when encoding CSV.
    (#26 by @clue)

    ```php
    $csv = new Clue\React\Csv\Encoder($stdout, ',', '"', '\\', "\r\n");
    ```

*   Feature: Add `headers` event to `AssocDecoder` class.
    (#29 by @SimonFrings)

    ```php
    $csv->on('headers', function (array $headers) {
        var_dump($headers);  // e.g. $headers = ['name','age'];
    });
    ```

*   Feature: Check type of incoming `data` before trying to decode CSV.
    (#27 by @clue)

*   Feature: Support parsing multiline values starting with quoted newline.
    (#25 by @KamilBalwierz)

*   Improve documentation and examples.
    (#30 and #28 by @SimonFrings, #22 and #24 by @clue and #23 by @PaulRotmann)

## 1.1.0 (2020-12-10)

*   Feature: Add decoding benchmark plus benchmark for GZIP-compressed CSV files.
    (#15 by @clue)

*   Improve test suite and add `.gitattributes` to exclude dev files from exports.
    Add PHP 8 support, update to PHPUnit 9 and simplify test setup.
    (#13 and #14 by @clue and #16, #18, #19 and #20 by @SimonFrings)

*   Improve documentation wording/typos and examples.
    (#9 by @loilo and #10 by @clue)

## 1.0.0 (2018-08-14)

*   First stable release, following SemVer

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright (c) 2018 Christian Lück

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is furnished
to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# clue/reactphp-csv

[![CI status](https://github.com/clue/reactphp-csv/actions/workflows/ci.yml/badge.svg)](https://github.com/clue/reactphp-csv/actions)
[![installs on Packagist](https://img.shields.io/packagist/dt/clue/reactphp-csv?color=blue&label=installs%20on%20Packagist)](https://packagist.org/packages/clue/reactphp-csv)

Streaming CSV (Comma-Separated Values or Character-Separated Values) parser and encoder for [ReactPHP](https://reactphp.org/).

CSV (Comma-Separated Values or less commonly Character-Separated Values) can be
used to store a large number of (uniform) records in simple text-based files,
such as a list of user records or log entries. CSV is not exactly a new format
and has been used in a large number of systems for decades. In particular, CSV
is often used for historical reasons and despite its shortcomings, it is still a
very common export format for a large number of tools to interface with
spreadsheet processors (such as Excel, Calc etc.). This library provides a simple
streaming API to process very large CSV files with thousands or even millions of
rows efficiently without having to load the whole file into memory at once.

* **Standard interfaces** -
  Allows easy integration with existing higher-level components by implementing
  ReactPHP's standard streaming interfaces.
* **Lightweight, SOLID design** -
  Provides a thin abstraction that is [*just good enough*](https://en.wikipedia.org/wiki/Principle_of_good_enough)
  and does not get in your way.
  Builds on top of well-tested components and well-established concepts instead of reinventing the wheel.
* **Good test coverage** -
  Comes with an [automated test suite](#tests) and is regularly tested in the *real world*.

**Table of contents**

* [Support us](#support-us)
* [CSV format](#csv-format)
* [Usage](#usage)
    * [Decoder](#decoder)
    * [AssocDecoder](#assocdecoder)
    * [Encoder](#encoder)
* [Install](#install)
* [Tests](#tests)
* [License](#license)
* [More](#more)

## Support us

We invest a lot of time developing, maintaining, and updating our awesome
open-source projects. You can help us sustain the high quality of our work by
[becoming a sponsor on GitHub](https://github.com/sponsors/clue). Sponsors get
numerous benefits in return, see our [sponsoring page](https://github.com/sponsors/clue)
for details.

Let's take these projects to the next level together! 🚀

## CSV format

CSV (Comma-Separated Values or less commonly Character-Separated Values) is a
very simple text-based format for storing a large number of (uniform) records,
such as a list of user records or log entries.

```
Alice,30
Bob,50
Carol,40
Dave,30
```

While this may look somewhat trivial, this simplicity comes at a price. CSV is
limited to untyped, two-dimensional data, so there's no standard way of storing
any nested structures or to differentiate a boolean value from a string or
integer.

CSV allows for optional field names. Whether field names are used is
application-dependent, so this library makes no attempt at *guessing* whether
the first line contains field names or field values.
For many common use cases
it's a good idea to include them like this:

```
name,age
Alice,30
Bob,50
Carol,40
Dave,30
```

CSV allows handling field values that contain spaces, the delimiting comma or
even newline characters (think of URLs or user-provided descriptions) by
enclosing them with quotes like this:

```
name,comment
Alice,"Yes, I like cheese"
Bob,"Hello
World!"
```

> Note that these more advanced parsing rules are often handled inconsistently
  by other applications. Nowadays, these parsing rules are defined as part of
  [RFC 4180](https://tools.ietf.org/html/rfc4180), however many applications
  started using some CSV-variant long before this standard was defined.

Some applications refer to CSV as Character-Separated Values, simply because
using another delimiter (such as a semicolon or tab) is a rather common approach
to avoid the need to enclose common values in quotes. This is particularly
common for systems in Europe (and elsewhere) that use a comma as a decimal separator.

```
name;comment
Alice;Yes, I like cheese
Bob;Turn 22,5 degree clockwise
```

CSV files are often limited to only ASCII characters for best interoperability.
However, many legacy CSV files use ISO 8859-1 encoding or some other
variant. Newer CSV files are usually best saved as UTF-8 and may thus also
contain special characters from the Unicode range. The text encoding is usually
application-dependent, so your best bet would be to convert to (or assume) UTF-8
consistently.

Despite its shortcomings, CSV is widely used and this is unlikely to change any
time soon. In particular, CSV is a very common export format for a lot of tools
to interface with spreadsheet processors (such as Excel, Calc etc.).
This means
that CSV is often used for historical reasons and using CSV to store structured
application data is usually not a good idea nowadays – but exporting to CSV for
known applications continues to be a very reasonable approach.

As an alternative, if you want to process structured data in a more modern
JSON-based format, you may want to use [clue/reactphp-ndjson](https://github.com/clue/reactphp-ndjson)
to process newline-delimited JSON (NDJSON) files (`.ndjson` file extension).

```json
{"name":"Alice","age":30,"comment":"Yes, I like cheese"}
{"name":"Bob","age":50,"comment":"Hello\nWorld!"}
```

As another alternative, if you want to use a CSV-variant that avoids some of its
shortcomings (and is somewhat faster!), you may want to use [clue/reactphp-tsv](https://github.com/clue/reactphp-tsv)
to process Tab-Separated-Values (TSV) files (`.tsv` file extension).

```tsv
name	age	comment
Alice	30	Yes, I like cheese
Bob	50	Hello world!
```

## Usage

### Decoder

The `Decoder` (parser) class can be used to make sure you only get back
complete, valid CSV elements when reading from a stream.
It wraps a given
[`ReadableStreamInterface`](https://github.com/reactphp/stream#readablestreaminterface)
and exposes its data through the same interface, but emits the CSV elements
as parsed values instead of just chunks of strings:

```
test,1,24
"hello world",2,48
```
```php
$stdin = new React\Stream\ReadableResourceStream(STDIN);

$csv = new Clue\React\Csv\Decoder($stdin);

$csv->on('data', function (array $data) {
    // $data is a parsed element from the CSV stream
    // line 1: $data = array('test', '1', '24');
    // line 2: $data = array('hello world', '2', '48');
    var_dump($data);
});
```

ReactPHP's streams emit chunks of data strings and make no assumption about their lengths.
These chunks do not necessarily represent complete CSV elements, as an
element may be broken up into multiple chunks.
This class reassembles these elements by buffering incomplete ones.

The `Decoder` supports the same optional parameters as the underlying
[`str_getcsv()`](https://www.php.net/manual/en/function.str-getcsv.php) function.
This means that, by default, CSV fields will be delimited by a comma (`,`), will
use a quote enclosure character (`"`) and a backslash escape character (`\`).
This behavior can be controlled through the optional constructor parameters:

```php
$csv = new Clue\React\Csv\Decoder($stdin, ';');

$csv->on('data', function (array $data) {
    // CSV fields will now be delimited by semicolon
});
```

Additionally, the `Decoder` limits the maximum buffer size (maximum line
length) to avoid buffer overflows due to malformed user input. Usually, there
should be no need to change this value, unless you know you're dealing with some
unreasonably long lines.
It accepts an additional argument if you want to change
this from the default of 64 KiB:

```php
$csv = new Clue\React\Csv\Decoder($stdin, ',', '"', '\\', 64 * 1024);
```

If the underlying stream emits an `error` event or the plain stream contains
any data that does not represent a valid CSV stream,
it will emit an `error` event and then `close` the input stream:

```php
$csv->on('error', function (Exception $error) {
    // an error occurred, stream will close next
});
```

If the underlying stream emits an `end` event, it will flush any incomplete
data from the buffer, thus either possibly emitting a final `data` event
followed by an `end` event on success or an `error` event for
incomplete/invalid CSV data as above:

```php
$csv->on('end', function () {
    // stream successfully ended, stream will close next
});
```

If either the underlying stream or the `Decoder` is closed, it will forward
the `close` event:

```php
$csv->on('close', function () {
    // stream closed
    // possibly after an "end" event or due to an "error" event
});
```

The `close(): void` method can be used to explicitly close the `Decoder` and
its underlying stream:

```php
$csv->close();
```

The `pipe(WritableStreamInterface $dest, array $options = array()): WritableStreamInterface`
method can be used to forward all data to the given destination stream.
Please note that the `Decoder` emits decoded/parsed data events, while many
(most?) writable streams expect only data chunks:

```php
$csv->pipe($logger);
```

For more details, see ReactPHP's
[`ReadableStreamInterface`](https://github.com/reactphp/stream#readablestreaminterface).
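
The chunk reassembly described in this section can be illustrated with a minimal plain-PHP sketch. This is not the library's implementation: `feedChunk()` is a hypothetical helper introduced only for illustration, and it assumes comma-delimited input without newlines inside quoted fields (a case the real `Decoder` handles, as well as the maximum buffer size).

```php
<?php

// Hypothetical helper (not part of this library): appends a chunk to the
// buffer and returns every complete CSV record found so far.
// Simplifying assumption: no quoted field contains a newline.
function feedChunk(&$buffer, $chunk)
{
    $records = array();
    $buffer .= $chunk;

    // keep parsing while the buffer contains a complete line
    while (($newline = strpos($buffer, "\n")) !== false) {
        $line = rtrim(substr($buffer, 0, $newline), "\r");
        $buffer = (string) substr($buffer, $newline + 1);
        $records[] = str_getcsv($line, ',', '"', '\\');
    }

    return $records;
}

// the chunk boundaries deliberately do not match the record boundaries
$buffer = '';
$records = array();
foreach (array("test,1", ",24\n\"hello ", "world\",2,48\n") as $chunk) {
    $records = array_merge($records, feedChunk($buffer, $chunk));
}

var_dump($records);
```

Each call only returns records once their terminating newline has arrived, so the incomplete remainder stays in `$buffer` until a later chunk completes it.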

### AssocDecoder

The `AssocDecoder` (parser) class can be used to make sure you only get back
complete, valid CSV elements when reading from a stream.
It wraps a given
[`ReadableStreamInterface`](https://github.com/reactphp/stream#readablestreaminterface)
and exposes its data through the same interface, but emits the CSV elements
as parsed assoc arrays instead of just chunks of strings:

```
name,id
test,1
"hello world",2
```
```php
$stdin = new React\Stream\ReadableResourceStream(STDIN);

$csv = new Clue\React\Csv\AssocDecoder($stdin);

$csv->on('data', function (array $data) {
    // $data is a parsed element from the CSV stream
    // line 1: $data = array('name' => 'test', 'id' => '1');
    // line 2: $data = array('name' => 'hello world', 'id' => '2');
    var_dump($data);
});
```

Whether field names are used is application-dependent, so this library makes no
attempt at *guessing* whether the first line contains field names or field
values. For many common use cases it's a good idea to include them and
explicitly use this class instead of the underlying [`Decoder`](#decoder).

In fact, it uses the [`Decoder`](#decoder) class internally. The only difference
is that this class requires the first line to contain the header names and
will use these as keys for all following row data, which will be emitted as
assoc arrays. After receiving the header names, this class will always emit
a `headers` event with a list of header names.

```php
$csv->on('headers', function (array $headers) {
    // header line: $headers = array('name', 'id');
    var_dump($headers);
});
```

This implies that the input stream MUST start with one row of header names and
MUST use the same number of columns for all records.
If the input stream does
not emit any data, if any row does not contain the same number of columns,
if the input stream does not represent a valid CSV stream or if the input stream
emits an `error` event, this decoder will emit an appropriate `error` event and
close the input stream.

This class otherwise accepts the same arguments and follows the exact same
behavior as the underlying [`Decoder`](#decoder) class. For more details, see
the [`Decoder`](#decoder) class.

### Encoder

The `Encoder` (serializer) class can be used to make sure anything you write to
a stream ends up as valid CSV elements in the resulting CSV stream.
It wraps a given
[`WritableStreamInterface`](https://github.com/reactphp/stream#writablestreaminterface)
and accepts its data through the same interface, but handles any data as complete
CSV elements instead of just chunks of strings:

```php
$stdout = new React\Stream\WritableResourceStream(STDOUT);

$csv = new Clue\React\Csv\Encoder($stdout);

$csv->write(array('test', true, 24));
$csv->write(array('hello world', 2, 48));
```
```
test,1,24
"hello world",2,48
```

The `Encoder` supports the same optional parameters as the underlying
[`fputcsv()`](https://www.php.net/manual/en/function.fputcsv.php) function.
This means that, by default, CSV fields will be delimited by a comma (`,`), will
use a quote enclosure character (`"`), a backslash escape character (`\`), and
a Unix-style EOL (`\n` or `LF`).
This behavior can be controlled through the optional constructor parameters:

```php
$csv = new Clue\React\Csv\Encoder($stdout, ';');

$csv->write(array('hello', 'world'));
```
```
hello;world
```

If the underlying stream emits an `error` event or the given data contains
any data that cannot be represented as a valid CSV stream,
it will emit an `error` event and then `close` the output stream:

```php
$csv->on('error', function (Exception $error) {
    // an error occurred, stream will close next
});
```

If either the underlying stream or the `Encoder` is closed, it will forward
the `close` event:

```php
$csv->on('close', function () {
    // stream closed
    // possibly after an "end" event or due to an "error" event
});
```

The `end(mixed $data = null): void` method can be used to optionally emit
any final data and then soft-close the `Encoder` and its underlying stream:

```php
$csv->end();
```

The `close(): void` method can be used to explicitly close the `Encoder` and
its underlying stream:

```php
$csv->close();
```

For more details, see ReactPHP's
[`WritableStreamInterface`](https://github.com/reactphp/stream#writablestreaminterface).

## Install

The recommended way to install this library is [through Composer](https://getcomposer.org/).
[New to Composer?](https://getcomposer.org/doc/00-intro.md)

This project follows [SemVer](https://semver.org/).
This will install the latest supported version:

```bash
composer require clue/reactphp-csv:^1.2
```

See also the [CHANGELOG](CHANGELOG.md) for details about version upgrades.

This project aims to run on any platform and thus does not require any PHP
extensions and supports running on legacy PHP 5.3 through current PHP 8+ and
HHVM.
It's *highly recommended to use the latest supported PHP version* for this project.

## Tests

To run the test suite, you first need to clone this repo and then install all
dependencies [through Composer](https://getcomposer.org/):

```bash
composer install
```

To run the test suite, go to the project root and run:

```bash
vendor/bin/phpunit
```

## License

This project is released under the permissive [MIT license](LICENSE).

> Did you know that I offer custom development services and issuing invoices for
  sponsorships of releases and for contributions? Contact me (@clue) for details.

## More

* If you want to learn more about processing streams of data, refer to the documentation of
  the underlying [react/stream](https://github.com/reactphp/stream) component.

* If you want to process structured data in a more modern JSON-based format,
  you may want to use [clue/reactphp-ndjson](https://github.com/clue/reactphp-ndjson)
  to process newline-delimited JSON (NDJSON) files (`.ndjson` file extension).

* If you want to process a slightly simpler text-based tabular data format,
  you may want to use [clue/reactphp-tsv](https://github.com/clue/reactphp-tsv)
  to process Tab-Separated-Values (TSV) files (`.tsv` file extension).

* If you want to process compressed CSV files (`.csv.gz` file extension)
  you may want to use [clue/reactphp-zlib](https://github.com/clue/reactphp-zlib)
  on the compressed input stream before passing the decompressed stream to the CSV decoder.

* If you want to create compressed CSV files (`.csv.gz` file extension)
  you may want to use [clue/reactphp-zlib](https://github.com/clue/reactphp-zlib)
  on the resulting CSV encoder output stream before passing the compressed
  stream to the file output stream.

* If you want to concurrently process the records from your CSV stream,
  you may want to use [clue/reactphp-flux](https://github.com/clue/reactphp-flux)
  to concurrently process many (but not too many) records at once.

--------------------------------------------------------------------------------
/composer.json:
--------------------------------------------------------------------------------
{
    "name": "clue/reactphp-csv",
    "description": "Streaming CSV (Comma-Separated Values or Character-Separated Values) parser and encoder for ReactPHP.",
    "keywords": ["CSV", "comma-separated values", "character-separated values", "streaming", "ReactPHP"],
    "homepage": "https://github.com/clue/reactphp-csv",
    "license": "MIT",
    "authors": [
        {
            "name": "Christian Lück",
            "email": "christian@clue.engineering"
        }
    ],
    "require": {
        "php": ">=5.3",
        "react/stream": "^1.2"
    },
    "require-dev": {
        "phpunit/phpunit": "^9.6 || ^5.7 || ^4.8.36",
        "react/child-process": "^0.6.3",
        "react/event-loop": "^1.2"
    },
    "autoload": {
        "psr-4": {
            "Clue\\React\\Csv\\": "src/"
        }
    },
    "autoload-dev": {
        "psr-4": {
            "Clue\\Tests\\React\\Csv\\": "tests/"
        }
    }
}

--------------------------------------------------------------------------------
/src/AssocDecoder.php:
--------------------------------------------------------------------------------
        $this->input = new Decoder($input, $delimiter, $enclosure, $escapeChar, $maxlength);

        if (!$input->isReadable()) {
            $this->close();
            return;
        }

        $this->input->on('data', array($this,
            'handleData'));
        $this->input->on('end', array($this, 'handleEnd'));
        $this->input->on('error', array($this, 'handleError'));
        $this->input->on('close', array($this, 'close'));
    }

    public function isReadable()
    {
        return !$this->closed;
    }

    public function close()
    {
        if ($this->closed) {
            return;
        }

        $this->closed = true;
        $this->input->close();

        $this->emit('close');
        $this->removeAllListeners();
    }

    public function pause()
    {
        $this->input->pause();
    }

    public function resume()
    {
        $this->input->resume();
    }

    public function pipe(WritableStreamInterface $dest, array $options = array())
    {
        Util::pipe($this, $dest, $options);

        return $dest;
    }

    /** @internal */
    public function handleData($data)
    {
        if ($this->expected === null) {
            $this->headers = $data;
            $this->expected = \count($data);
            $this->emit('headers', array($data));
        } else {
            if (\count($data) !== $this->expected) {
                $this->handleError(new \UnexpectedValueException(
                    'Expected record with ' . $this->expected . ' columns, but got ' . \count($data) .
                    ' instead')
                );
                return;
            }

            $this->emit('data', array(
                \array_combine($this->headers, $data)
            ));
        }
    }

    /** @internal */
    public function handleEnd()
    {
        if ($this->headers === array()) {
            $this->handleError(new \UnderflowException('Stream ended without headers'));
        }

        if (!$this->closed) {
            $this->emit('end');
            $this->close();
        }
    }

    /** @internal */
    public function handleError(\Exception $error)
    {
        $this->emit('error', array($error));
        $this->close();
    }
}

--------------------------------------------------------------------------------
/src/Decoder.php:
--------------------------------------------------------------------------------
        $this->input = $input;
        $this->delimiter = $delimiter;
        $this->enclosure = $enclosure;
        $this->escapeChar = $escapeChar;
        $this->maxlength = $maxlength;

        if (!$input->isReadable()) {
            $this->close();
            return;
        }

        $this->input->on('data', array($this, 'handleData'));
        $this->input->on('end', array($this, 'handleEnd'));
        $this->input->on('error', array($this, 'handleError'));
        $this->input->on('close', array($this, 'close'));
    }

    public function isReadable()
    {
        return !$this->closed;
    }

    public function close()
    {
        if ($this->closed) {
            return;
        }

        $this->closed = true;
        $this->buffer = '';
        $this->input->close();

        $this->emit('close');
        $this->removeAllListeners();
    }

    public function pause()
    {
        $this->input->pause();
    }

    public function resume()
    {
        $this->input->resume();
    }

    public function pipe(WritableStreamInterface $dest, array $options = array())
    {
        Util::pipe($this, $dest, $options);

        return $dest;
    }

    /** @internal */
    public function handleData($data)
    {
        if (!\is_string($data)) {
            $this->handleError(new \UnexpectedValueException('Expected stream to emit string, but got ' . \gettype($data)));
            return;
        }

        $this->buffer .= $data;

        // keep parsing while a newline has been found
        while (($newline = \strpos($this->buffer, "\n", $this->offset)) !== false && $newline <= $this->maxlength) {
            // read data up until newline and try to parse
            $data = \str_getcsv(
                \substr($this->buffer, 0, $newline + 1),
                $this->delimiter,
                $this->enclosure,
                $this->escapeChar
            );

            // unable to decode? abort
            if ($data === false || \end($data) === null) {
                $this->handleError(new \RuntimeException('Unable to decode CSV'));
                return;
            }

            // the last parsed cell value ends with a newline and the buffer does not end with end quote?
            // this looks like a multiline value, so only remember offset and wait for next newline
            $last = \substr(\end($data), -1);
            \reset($data);
            $edgeCase = \substr($this->buffer, $newline - 2, 3);
            if ($last === "\n" && ($newline === 1 || $this->buffer[$newline - 1] !== $this->enclosure || $edgeCase === $this->delimiter . $this->enclosure .
                "\n")) {
                $this->offset = $newline + 1;
                continue;
            }

            // parsing successful => remove from buffer and emit
            $this->buffer = (string)\substr($this->buffer, $newline + 1);
            $this->offset = 0;

            $this->emit('data', array($data));
        }

        if (isset($this->buffer[$this->maxlength])) {
            $this->handleError(new \OverflowException('Buffer size exceeded'));
        }
    }

    /** @internal */
    public function handleEnd()
    {
        if ($this->buffer !== '') {
            $this->handleData("\n");
        }

        if ($this->buffer !== '') {
            $this->handleError(new \RuntimeException('Unable to decode CSV'));
        }

        if (!$this->closed) {
            $this->emit('end');
            $this->close();
        }
    }

    /** @internal */
    public function handleError(\Exception $error)
    {
        $this->emit('error', array($error));
        $this->close();
    }
}

--------------------------------------------------------------------------------
/src/Encoder.php:
--------------------------------------------------------------------------------
        $this->output = $output;
        $this->delimiter = $delimiter;
        $this->enclosure = $enclosure;
        $this->escapeChar = $escapeChar;
        $this->eol = $eol;

        if (!$output->isWritable()) {
            $this->close();
            return;
        }

        $this->temp = fopen('php://memory', 'r+');

        $this->output->on('drain', array($this, 'handleDrain'));
        $this->output->on('error', array($this, 'handleError'));
        $this->output->on('close', array($this, 'close'));
    }

    public function write($data)
    {
        if ($this->closed) {
            return false;
        }

        $written = false;
        if (is_array($data)) {
            // custom EOL requires PHP 8.1+, custom escape character requires PHP 5.5.4+ (see constructor check)
            // @codeCoverageIgnoreStart
            if (\PHP_VERSION_ID >= 80100) {
                $written = fputcsv($this->temp,
                    $data, $this->delimiter, $this->enclosure, $this->escapeChar, $this->eol);
            } elseif (\PHP_VERSION_ID >= 50504) {
                $written = fputcsv($this->temp, $data, $this->delimiter, $this->enclosure, $this->escapeChar);
            } else {
                $written = fputcsv($this->temp, $data, $this->delimiter, $this->enclosure);
            }
            // @codeCoverageIgnoreEnd
        }

        if ($written === false) {
            $this->handleError(new \RuntimeException('Unable to encode CSV'));
            return false;
        }

        rewind($this->temp);
        $data = stream_get_contents($this->temp);
        ftruncate($this->temp, 0);

        // manually replace custom EOL on PHP < 8.1
        if (\PHP_VERSION_ID < 80100 && $this->eol !== "\n") {
            $data = \substr($data, 0, -1) . $this->eol;
        }

        return $this->output->write($data);
    }

    public function end($data = null)
    {
        if ($data !== null) {
            $this->write($data);
        }

        $this->output->end();
    }

    public function isWritable()
    {
        return !$this->closed;
    }

    public function close()
    {
        if ($this->closed) {
            return;
        }

        $this->closed = true;
        $this->output->close();

        if ($this->temp !== false) {
            fclose($this->temp);
            $this->temp = false;
        }

        $this->emit('close');
        $this->removeAllListeners();
    }

    /** @internal */
    public function handleDrain()
    {
        $this->emit('drain');
    }

    /** @internal */
    public function handleError(\Exception $error)
    {
        $this->emit('error', array($error));
        $this->close();
    }
}

--------------------------------------------------------------------------------