├── .gitignore ├── DEVELOP.md ├── LICENSE ├── README.md ├── images └── pg_analytica.jpg ├── ingestor ├── Makefile ├── constants.h ├── export_entry.h ├── file_utils.h ├── generate_test_data.sql ├── ingestor--0.0.1.sql ├── ingestor.c ├── ingestor.control ├── parquet.sql ├── registry.c └── test.sql └── notes.md /.gitignore: -------------------------------------------------------------------------------- 1 | */*.o 2 | */*.so 3 | .vscode/* 4 | .DS_Store -------------------------------------------------------------------------------- /DEVELOP.md: -------------------------------------------------------------------------------- 1 | 2 | ## Postgres Installation using Homebrew: 3 | 4 | ``` 5 | brew install postgresql 6 | ``` 7 | After you install PostgreSQL on your Mac, you will be able to start it via Terminal using the following command: 8 | ``` 9 | brew services start postgresql 10 | ``` 11 | ``` 12 | Note: 13 | You can stop PostgreSQL using the `brew services stop postgresql` command. 14 | ``` 15 | Enter into Postgres using below command: 16 | 17 | ``` 18 | psql postgres 19 | ``` 20 | ## Check PostgreSQL version and PATH 21 | 22 | Check the version and PATH with below command: 23 | 24 | ``` 25 | pg_config --pkglibdir 26 | ``` 27 | The output should be: 28 | /opt/homebrew/lib/postgresql@14 29 | 30 | 31 | ## Running a sample extension 32 | 33 | Create a new directory and place four files there named pg_analytica.c, pg_analytica.control, Makefile and pg_analytica--0.0.1.sql. 34 | 35 | Run make command in the above directory: 36 | 37 | ```python 38 | make CC='/usr/bin/gcc' 39 | make install 40 | ``` 41 | 42 | ## Generate sample data for testing 43 | 44 | Run the below command. 45 | 46 | ``` 47 | psql postgres -f test.sql 48 | ``` 49 | 50 | This creates a table and populates sample rows that can used in testing. 51 | 52 | ## Testing the extension 53 | 54 | Add tests to the test.sql file. 55 | 56 | Tests can be executed using the command below. 57 | ``` 58 | psql postgres -f test.sql 59 | ``` 60 | 61 | ## Viewing Logs 62 | 63 | Logs can be viewed at. 64 | 65 | ``` 66 | tail -100 /opt/homebrew/var/log/postgresql@14.log 67 | ``` 68 | 69 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 2, June 1991 3 | 4 | Copyright (C) 1989, 1991 Free Software Foundation, Inc., 5 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA 6 | Everyone is permitted to copy and distribute verbatim copies 7 | of this license document, but changing it is not allowed. 8 | 9 | Preamble 10 | 11 | The licenses for most software are designed to take away your 12 | freedom to share and change it. By contrast, the GNU General Public 13 | License is intended to guarantee your freedom to share and change free 14 | software--to make sure the software is free for all its users. This 15 | General Public License applies to most of the Free Software 16 | Foundation's software and to any other program whose authors commit to 17 | using it. (Some other Free Software Foundation software is covered by 18 | the GNU Lesser General Public License instead.) You can apply it to 19 | your programs, too. 20 | 21 | When we speak of free software, we are referring to freedom, not 22 | price. 
Our General Public Licenses are designed to make sure that you 23 | have the freedom to distribute copies of free software (and charge for 24 | this service if you wish), that you receive source code or can get it 25 | if you want it, that you can change the software or use pieces of it 26 | in new free programs; and that you know you can do these things. 27 | 28 | To protect your rights, we need to make restrictions that forbid 29 | anyone to deny you these rights or to ask you to surrender the rights. 30 | These restrictions translate to certain responsibilities for you if you 31 | distribute copies of the software, or if you modify it. 32 | 33 | For example, if you distribute copies of such a program, whether 34 | gratis or for a fee, you must give the recipients all the rights that 35 | you have. You must make sure that they, too, receive or can get the 36 | source code. And you must show them these terms so they know their 37 | rights. 38 | 39 | We protect your rights with two steps: (1) copyright the software, and 40 | (2) offer you this license which gives you legal permission to copy, 41 | distribute and/or modify the software. 42 | 43 | Also, for each author's protection and ours, we want to make certain 44 | that everyone understands that there is no warranty for this free 45 | software. If the software is modified by someone else and passed on, we 46 | want its recipients to know that what they have is not the original, so 47 | that any problems introduced by others will not reflect on the original 48 | authors' reputations. 49 | 50 | Finally, any free program is threatened constantly by software 51 | patents. We wish to avoid the danger that redistributors of a free 52 | program will individually obtain patent licenses, in effect making the 53 | program proprietary. To prevent this, we have made it clear that any 54 | patent must be licensed for everyone's free use or not licensed at all. 55 | 56 | The precise terms and conditions for copying, distribution and 57 | modification follow. 58 | 59 | GNU GENERAL PUBLIC LICENSE 60 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 61 | 62 | 0. This License applies to any program or other work which contains 63 | a notice placed by the copyright holder saying it may be distributed 64 | under the terms of this General Public License. The "Program", below, 65 | refers to any such program or work, and a "work based on the Program" 66 | means either the Program or any derivative work under copyright law: 67 | that is to say, a work containing the Program or a portion of it, 68 | either verbatim or with modifications and/or translated into another 69 | language. (Hereinafter, translation is included without limitation in 70 | the term "modification".) Each licensee is addressed as "you". 71 | 72 | Activities other than copying, distribution and modification are not 73 | covered by this License; they are outside its scope. The act of 74 | running the Program is not restricted, and the output from the Program 75 | is covered only if its contents constitute a work based on the 76 | Program (independent of having been made by running the Program). 77 | Whether that is true depends on what the Program does. 78 | 79 | 1. 
You may copy and distribute verbatim copies of the Program's 80 | source code as you receive it, in any medium, provided that you 81 | conspicuously and appropriately publish on each copy an appropriate 82 | copyright notice and disclaimer of warranty; keep intact all the 83 | notices that refer to this License and to the absence of any warranty; 84 | and give any other recipients of the Program a copy of this License 85 | along with the Program. 86 | 87 | You may charge a fee for the physical act of transferring a copy, and 88 | you may at your option offer warranty protection in exchange for a fee. 89 | 90 | 2. You may modify your copy or copies of the Program or any portion 91 | of it, thus forming a work based on the Program, and copy and 92 | distribute such modifications or work under the terms of Section 1 93 | above, provided that you also meet all of these conditions: 94 | 95 | a) You must cause the modified files to carry prominent notices 96 | stating that you changed the files and the date of any change. 97 | 98 | b) You must cause any work that you distribute or publish, that in 99 | whole or in part contains or is derived from the Program or any 100 | part thereof, to be licensed as a whole at no charge to all third 101 | parties under the terms of this License. 102 | 103 | c) If the modified program normally reads commands interactively 104 | when run, you must cause it, when started running for such 105 | interactive use in the most ordinary way, to print or display an 106 | announcement including an appropriate copyright notice and a 107 | notice that there is no warranty (or else, saying that you provide 108 | a warranty) and that users may redistribute the program under 109 | these conditions, and telling the user how to view a copy of this 110 | License. (Exception: if the Program itself is interactive but 111 | does not normally print such an announcement, your work based on 112 | the Program is not required to print an announcement.) 113 | 114 | These requirements apply to the modified work as a whole. If 115 | identifiable sections of that work are not derived from the Program, 116 | and can be reasonably considered independent and separate works in 117 | themselves, then this License, and its terms, do not apply to those 118 | sections when you distribute them as separate works. But when you 119 | distribute the same sections as part of a whole which is a work based 120 | on the Program, the distribution of the whole must be on the terms of 121 | this License, whose permissions for other licensees extend to the 122 | entire whole, and thus to each and every part regardless of who wrote it. 123 | 124 | Thus, it is not the intent of this section to claim rights or contest 125 | your rights to work written entirely by you; rather, the intent is to 126 | exercise the right to control the distribution of derivative or 127 | collective works based on the Program. 128 | 129 | In addition, mere aggregation of another work not based on the Program 130 | with the Program (or with a work based on the Program) on a volume of 131 | a storage or distribution medium does not bring the other work under 132 | the scope of this License. 133 | 134 | 3. 
You may copy and distribute the Program (or a work based on it, 135 | under Section 2) in object code or executable form under the terms of 136 | Sections 1 and 2 above provided that you also do one of the following: 137 | 138 | a) Accompany it with the complete corresponding machine-readable 139 | source code, which must be distributed under the terms of Sections 140 | 1 and 2 above on a medium customarily used for software interchange; or, 141 | 142 | b) Accompany it with a written offer, valid for at least three 143 | years, to give any third party, for a charge no more than your 144 | cost of physically performing source distribution, a complete 145 | machine-readable copy of the corresponding source code, to be 146 | distributed under the terms of Sections 1 and 2 above on a medium 147 | customarily used for software interchange; or, 148 | 149 | c) Accompany it with the information you received as to the offer 150 | to distribute corresponding source code. (This alternative is 151 | allowed only for noncommercial distribution and only if you 152 | received the program in object code or executable form with such 153 | an offer, in accord with Subsection b above.) 154 | 155 | The source code for a work means the preferred form of the work for 156 | making modifications to it. For an executable work, complete source 157 | code means all the source code for all modules it contains, plus any 158 | associated interface definition files, plus the scripts used to 159 | control compilation and installation of the executable. However, as a 160 | special exception, the source code distributed need not include 161 | anything that is normally distributed (in either source or binary 162 | form) with the major components (compiler, kernel, and so on) of the 163 | operating system on which the executable runs, unless that component 164 | itself accompanies the executable. 165 | 166 | If distribution of executable or object code is made by offering 167 | access to copy from a designated place, then offering equivalent 168 | access to copy the source code from the same place counts as 169 | distribution of the source code, even though third parties are not 170 | compelled to copy the source along with the object code. 171 | 172 | 4. You may not copy, modify, sublicense, or distribute the Program 173 | except as expressly provided under this License. Any attempt 174 | otherwise to copy, modify, sublicense or distribute the Program is 175 | void, and will automatically terminate your rights under this License. 176 | However, parties who have received copies, or rights, from you under 177 | this License will not have their licenses terminated so long as such 178 | parties remain in full compliance. 179 | 180 | 5. You are not required to accept this License, since you have not 181 | signed it. However, nothing else grants you permission to modify or 182 | distribute the Program or its derivative works. These actions are 183 | prohibited by law if you do not accept this License. Therefore, by 184 | modifying or distributing the Program (or any work based on the 185 | Program), you indicate your acceptance of this License to do so, and 186 | all its terms and conditions for copying, distributing or modifying 187 | the Program or works based on it. 188 | 189 | 6. Each time you redistribute the Program (or any work based on the 190 | Program), the recipient automatically receives a license from the 191 | original licensor to copy, distribute or modify the Program subject to 192 | these terms and conditions. 
You may not impose any further 193 | restrictions on the recipients' exercise of the rights granted herein. 194 | You are not responsible for enforcing compliance by third parties to 195 | this License. 196 | 197 | 7. If, as a consequence of a court judgment or allegation of patent 198 | infringement or for any other reason (not limited to patent issues), 199 | conditions are imposed on you (whether by court order, agreement or 200 | otherwise) that contradict the conditions of this License, they do not 201 | excuse you from the conditions of this License. If you cannot 202 | distribute so as to satisfy simultaneously your obligations under this 203 | License and any other pertinent obligations, then as a consequence you 204 | may not distribute the Program at all. For example, if a patent 205 | license would not permit royalty-free redistribution of the Program by 206 | all those who receive copies directly or indirectly through you, then 207 | the only way you could satisfy both it and this License would be to 208 | refrain entirely from distribution of the Program. 209 | 210 | If any portion of this section is held invalid or unenforceable under 211 | any particular circumstance, the balance of the section is intended to 212 | apply and the section as a whole is intended to apply in other 213 | circumstances. 214 | 215 | It is not the purpose of this section to induce you to infringe any 216 | patents or other property right claims or to contest validity of any 217 | such claims; this section has the sole purpose of protecting the 218 | integrity of the free software distribution system, which is 219 | implemented by public license practices. Many people have made 220 | generous contributions to the wide range of software distributed 221 | through that system in reliance on consistent application of that 222 | system; it is up to the author/donor to decide if he or she is willing 223 | to distribute software through any other system and a licensee cannot 224 | impose that choice. 225 | 226 | This section is intended to make thoroughly clear what is believed to 227 | be a consequence of the rest of this License. 228 | 229 | 8. If the distribution and/or use of the Program is restricted in 230 | certain countries either by patents or by copyrighted interfaces, the 231 | original copyright holder who places the Program under this License 232 | may add an explicit geographical distribution limitation excluding 233 | those countries, so that distribution is permitted only in or among 234 | countries not thus excluded. In such case, this License incorporates 235 | the limitation as if written in the body of this License. 236 | 237 | 9. The Free Software Foundation may publish revised and/or new versions 238 | of the General Public License from time to time. Such new versions will 239 | be similar in spirit to the present version, but may differ in detail to 240 | address new problems or concerns. 241 | 242 | Each version is given a distinguishing version number. If the Program 243 | specifies a version number of this License which applies to it and "any 244 | later version", you have the option of following the terms and conditions 245 | either of that version or of any later version published by the Free 246 | Software Foundation. If the Program does not specify a version number of 247 | this License, you may choose any version ever published by the Free Software 248 | Foundation. 249 | 250 | 10. 
If you wish to incorporate parts of the Program into other free 251 | programs whose distribution conditions are different, write to the author 252 | to ask for permission. For software which is copyrighted by the Free 253 | Software Foundation, write to the Free Software Foundation; we sometimes 254 | make exceptions for this. Our decision will be guided by the two goals 255 | of preserving the free status of all derivatives of our free software and 256 | of promoting the sharing and reuse of software generally. 257 | 258 | NO WARRANTY 259 | 260 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY 261 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN 262 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES 263 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED 264 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 265 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS 266 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE 267 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, 268 | REPAIR OR CORRECTION. 269 | 270 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 271 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR 272 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, 273 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING 274 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED 275 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY 276 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER 277 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE 278 | POSSIBILITY OF SUCH DAMAGES. 279 | 280 | END OF TERMS AND CONDITIONS 281 | 282 | How to Apply These Terms to Your New Programs 283 | 284 | If you develop a new program, and you want it to be of the greatest 285 | possible use to the public, the best way to achieve this is to make it 286 | free software which everyone can redistribute and change under these terms. 287 | 288 | To do so, attach the following notices to the program. It is safest 289 | to attach them to the start of each source file to most effectively 290 | convey the exclusion of warranty; and each file should have at least 291 | the "copyright" line and a pointer to where the full notice is found. 292 | 293 | 294 | Copyright (C) 2024 Sushrut Shivaswamy 295 | 296 | This program is free software; you can redistribute it and/or modify 297 | it under the terms of the GNU General Public License as published by 298 | the Free Software Foundation; either version 2 of the License, or 299 | (at your option) any later version. 300 | 301 | This program is distributed in the hope that it will be useful, 302 | but WITHOUT ANY WARRANTY; without even the implied warranty of 303 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 304 | GNU General Public License for more details. 305 | 306 | You should have received a copy of the GNU General Public License along 307 | with this program; if not, write to the Free Software Foundation, Inc., 308 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 309 | 310 | Also add information on how to contact you by electronic and paper mail. 
311 | 312 | If the program is interactive, make it output a short notice like this 313 | when it starts in an interactive mode: 314 | 315 | Gnomovision version 69, Copyright (C) year name of author 316 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 317 | This is free software, and you are welcome to redistribute it 318 | under certain conditions; type `show c' for details. 319 | 320 | The hypothetical commands `show w' and `show c' should show the appropriate 321 | parts of the General Public License. Of course, the commands you use may 322 | be called something other than `show w' and `show c'; they could even be 323 | mouse-clicks or menu items--whatever suits your program. 324 | 325 | You should also get your employer (if you work as a programmer) or your 326 | school, if any, to sign a "copyright disclaimer" for the program, if 327 | necessary. Here is a sample; alter the names: 328 | 329 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program 330 | `Gnomovision' (which makes passes at compilers) written by James Hacker. 331 | 332 | , 1 April 1989 333 | Ty Coon, President of Vice 334 | 335 | This General Public License does not permit incorporating your program into 336 | proprietary programs. If your program is a subroutine library, you may 337 | consider it more useful to permit linking proprietary applications with the 338 | library. If this is what you want to do, use the GNU Lesser General 339 | Public License instead of this License. 340 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ![pg_analytica Banner](images/pg_analytica.jpg) 2 | 3 | ## What is pg_analytica 4 | 5 | *pg_analytica* is a Postgres extension that speeds up analytics queries by upto 90%. 6 | 7 | ### How it works 8 | 9 | *pg_analytica* automatically exports specified columns from your table into [columnar format](https://parquet.apache.org) at regular intervals. The columnar storage allows for blazing-fast querying compared to traditional row-based storage. 10 | 11 | ### Getting Started 12 | 13 | ### Download extension dependencies 14 | 15 | The extension relies on the following dependencies. 16 | - parquet-glib 17 | - arrow-glib 18 | - gio-2.0 19 | 20 | Installation instructions for various platforms can be found on the [Apache Arrow docs](https://arrow.apache.org/install/). 21 | 22 | The extension relies on the [parquet_fdw](https://github.com/adjust/parquet_fdw) extension to query the exported columnar data. Follow the [installation](https://github.com/adjust/parquet_fdw?tab=readme-ov-file#installation) instructions on the repo. 23 | 24 | Ensure that the extension is loaded before proceeding with next steps. 25 | Running the query below within a psql session should show that the extension is installed. 26 | 27 | ``` 28 | postgres=# SELECT * FROM pg_extension WHERE extname = 'parquet_fdw'; 29 | ``` 30 | 31 | 32 | ### Installing pg_analytica 33 | 34 | Clone the extension to your workstation and navigate to the ingestor directory. 35 | 36 | ``` 37 | $ cd ingestor 38 | ``` 39 | 40 | Build and install the extension to your Postgres install path. 41 | 42 | ``` 43 | $ make 44 | $ make install 45 | ``` 46 | 47 | ### Load the extension 48 | 49 | Start a psql session and load the extension. 50 | ``` 51 | $ psql postgres 52 | ``` 53 | 54 | Within the session, load the extension. 
55 | ``` 56 | postgres=# CREATE EXTENSION ingestor; 57 | ``` 58 | 59 | ### Register a table for export 60 | 61 | The extension periodically exports data from registered tables in columnar format to 62 | provide speed ups in querying. To register a table for export use the command below 63 | in the psql session where the extension was loaded. 64 | 65 | ``` 66 | postgres=# SELECT register_table_export( 67 | -- table name 68 | 'your_table_name', 69 | -- list of columns to export 70 | '{column_1,column_2,column_3}', 71 | -- Hours after which table data will be re-exported 72 | 10 73 | ); 74 | ``` 75 | 76 | *pg_analytica* supports the following column types currently. 77 | 78 | | PG Type | Support | 79 | | -------------- | ---------------------- | 80 | | smallint | :white_check_mark: | 81 | | integer | :white_check_mark: | 82 | | bigint | :white_check_mark: | 83 | | float | :white_check_mark: | 84 | | double | :white_check_mark: | 85 | | varchar | :white_check_mark: | 86 | | text | :white_check_mark: | 87 | | boolean | :white_check_mark: | 88 | | timestamp | :white_check_mark: | 89 | 90 | ### Start the export background worker 91 | 92 | The ingestion background worker periodically finds registered tables eligible 93 | for export and exports the data. The worker only needs to be launched once. 94 | 95 | ``` 96 | postgres=# SELECT ingestor_launch(); 97 | ``` 98 | 99 | Once the export process is complete, you will be able to query the table. 100 | To check if your table is ready for export see if your table is listed in the result 101 | for the following query. 102 | 103 | ``` 104 | postgres=# SELECT * FROM analytica_exports WHERE last_run_completed IS NOT NULL; 105 | ``` 106 | 107 | During testing exporting a table with 60M rows and 5 columns took 40min but numbers may vary depending on your machine and size of the table. 108 | 109 | ### Querying the exported table 110 | 111 | Exported tables are queryable using the relation name `analytica_{table_name}`. 112 | If the table name you registered was `test_data` then the exported table will be queried as follows. 113 | ``` 114 | postgres=# SELECT * FROM analytica_test_data; 115 | ``` 116 | 117 | ### Benchmarks 118 | 119 | The extension was tested on a Postgres instance running on an M1 Air Macbook. To see the table used for testing see [here](./ingestor/generate_test_data.sql). 120 | Speed ups in querying columnar store vary based on the query but tests indicate upto 90% decease in query latency! 121 | 122 | 123 | ### Test 1 124 | 125 | Measured using the timing feature in postgres. 126 | - Table Size little over 2GiB 127 | - 30M rows with 2 int64 columns and a varchar column exported 128 | 129 | ``` 130 | \timing on 131 | 132 | 133 | -- Query on row based data 134 | postgres=# SELECT 135 | age, 136 | COUNT(*) 137 | FROM test_data 138 | WHERE age > 50 AND age < 90 AND id > 1000 139 | GROUP BY age; 140 | 141 | Time: 5153.393 ms (00:05.153) 142 | 143 | 144 | -- Query on columnar data 145 | postgres=# SELECT 146 | age, 147 | COUNT(*) 148 | FROM analytica_test_data 149 | WHERE age > 50 AND age < 90 AND id > 1000 150 | GROUP BY age; 151 | 152 | Time: 1633.900 ms (00:01.634) 153 | ``` 154 | 155 | ### Test 2 156 | 157 | Measured using the timing feature in postgres. 
158 | - Table Size little over 5GiB 159 | - 60M rows with 6 columns of different types 160 | 161 | ``` 162 | \timing on 163 | 164 | 165 | -- Query on row based data 166 | SELECT 167 | age, 168 | COUNT(*) 169 | FROM test_data 170 | WHERE age > 50 AND age < 90 AND is_random 171 | GROUP BY age; 172 | 173 | Time: 9446.380 ms (00:09.446) 174 | 175 | 176 | -- Query on columnar data 177 | SELECT 178 | age, 179 | COUNT(*) 180 | FROM analytica_test_data 181 | WHERE age > 50 AND age < 90 AND is_random 182 | GROUP BY age; 183 | 184 | Time: 2312.845 ms (00:02.313) 185 | 186 | postgres=# \timing on 187 | Timing is on. 188 | 189 | -- Query on row based data 190 | postgres=# SELECT COUNT(*) from test_data WHERE age > 50 AND age < 55; 191 | count 192 | --------- 193 | 2399709 194 | (1 row) 195 | 196 | Time: 9904.160 ms (00:09.904) 197 | 198 | 199 | -- Query on columnar data 200 | postgres=# SELECT COUNT(*) from analytica_test_data WHERE age > 50 AND age < 55; 201 | count 202 | --------- 203 | 2398998 204 | (1 row) 205 | 206 | Time: 1399.673 ms (00:01.400) 207 | ``` -------------------------------------------------------------------------------- /images/pg_analytica.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sushrut141/pg_analytica/52d62405a2bbcf175977291ee6bf4734c5889104/images/pg_analytica.jpg -------------------------------------------------------------------------------- /ingestor/Makefile: -------------------------------------------------------------------------------- 1 | # Refer src/makefiles/pgxs.mk in postgres source for details about flags 2 | MODULE_big = ingestor 3 | OBJS = ingestor.o registry.o 4 | EXTENSION = ingestor # the extersion's name 5 | DATA = ingestor--0.0.1.sql # script file to install 6 | #REGRESS = get_sum_test # the test script file 7 | 8 | # `pkg-config --list-all | grep -e parquet` will find if installed 9 | PG_CFLAGS += `pkg-config --cflags parquet-glib` 10 | PG_CFLAGS += `pkg-config --cflags arrow-glib` 11 | PG_CFLAGS += `pkg-config --cflags gio-2.0` 12 | # Run pkg-config commands below to understand how outputs differ 13 | # TODO - PG_LIBS libraries are not passed with `-l`. Using PD_LFAGS 14 | # for both library include path(-L) and library name(-l) 15 | PG_LDFLAGS += `pkg-config --libs-only-L parquet-glib` 16 | PG_LDFLAGS += `pkg-config --libs-only-l parquet-glib` 17 | PG_LDFLAGS += `pkg-config --libs-only-L arrow-glib` 18 | PG_LDFLAGS += `pkg-config --libs-only-l arrow-glib` 19 | PG_LDFLAGS += `pkg-config --libs-only-L gio-2.0` 20 | PG_LDFLAGS += `pkg-config --libs-only-l gio-2.0` 21 | 22 | # for posgres build 23 | PG_CONFIG = pg_config 24 | PGXS := $(shell $(PG_CONFIG) --pgxs) 25 | include $(PGXS) 26 | -------------------------------------------------------------------------------- /ingestor/constants.h: -------------------------------------------------------------------------------- 1 | #ifndef _ANALYTICA_CONSTANTS_H 2 | #define _ANALYTICA_CONSTANTS_H 3 | 4 | enum ExportStatus { PENDING = 0, ACTIVE = 1, INACTIVE = -1 }; 5 | 6 | #endif -------------------------------------------------------------------------------- /ingestor/export_entry.h: -------------------------------------------------------------------------------- 1 | #ifndef _EXPORT_ENTRY_H 2 | #define _EXPORT_ENTRY_H 3 | 4 | #include 5 | 6 | #include "utils/elog.h" 7 | #include "utils/palloc.h" 8 | 9 | /** 10 | * Detail about table to be processed for export. 11 | * Callers are responsible for freeing memory of individual fields. 
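 * (free_export_entry below releases table_name, every string in columns_to_export, and the columns_to_export array itself.)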
12 | */ 13 | typedef struct _ExportEntry { 14 | char *table_name; 15 | char **columns_to_export; 16 | int num_of_columns; 17 | int export_status; 18 | int64 chunk_size; 19 | } ExportEntry; 20 | 21 | void initialize_export_entry(const char *table_name, int num_of_columns, 22 | ExportEntry *entry) { 23 | // Initialize memory for column names. 24 | entry->num_of_columns = num_of_columns; 25 | entry->columns_to_export = (char **)palloc(num_of_columns * sizeof(char *)); 26 | // Initialize memory and set table name. 27 | entry->table_name = (char *)palloc(strlen(table_name) * sizeof(char)); 28 | strcpy(entry->table_name, table_name); 29 | } 30 | 31 | void export_entry_add_column(ExportEntry *entry, char *column_name, 32 | int column_num) { 33 | size_t column_name_size = strlen(column_name); 34 | entry->columns_to_export[column_num] = 35 | (char *)palloc(column_name_size * sizeof(char)); 36 | strcpy(entry->columns_to_export[column_num], column_name); 37 | } 38 | 39 | void free_export_entry(ExportEntry *entry) { 40 | pfree(entry->table_name); 41 | for (int i = 0; i < entry->num_of_columns; i += 1) { 42 | pfree(entry->columns_to_export[i]); 43 | } 44 | pfree(entry->columns_to_export); 45 | } 46 | 47 | #endif -------------------------------------------------------------------------------- /ingestor/file_utils.h: -------------------------------------------------------------------------------- 1 | #ifndef _FILE_UTILS_H 2 | #define _FILE_UTILS_H 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | #include 11 | 12 | #include "postgres.h" 13 | 14 | /* 15 | * Populates the root path for the extension data. 16 | * Assumes that buffer has PATH_MAX space available. 17 | */ 18 | void populate_root_path(char *out, bool relative) { 19 | if (!relative) { 20 | if (getcwd(out, PATH_MAX) == NULL) { 21 | perror("getcwd"); 22 | } 23 | } else { 24 | strcat(out, "."); 25 | } 26 | strcat(out, "/pg_analytica/"); 27 | } 28 | 29 | /* 30 | * Populates the data path for columnar files for the supplied table in out. 31 | * Assumes that buffer has PATH_MAX space available. 32 | */ 33 | void populate_data_path_for_table(const char *table, char *out, bool relative) { 34 | populate_root_path(out, relative); 35 | strcat(out, table); 36 | } 37 | 38 | /* 39 | * Populates the temp path for columnar files for the supplied table in out. 40 | * Assumes that buffer has PATH_MAX space available. 41 | */ 42 | void populate_temp_path_for_table(const char *table, char *out, bool relative) { 43 | // TODO - getcwd does not work here. 
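// Resulting layout, per the helpers above: <cwd>/pg_analytica/<table>/temp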
44 | populate_data_path_for_table(table, out, relative); 45 | strcat(out, "/temp"); 46 | } 47 | 48 | static void delete_files_in_directory(const char *path) { 49 | DIR *dir = opendir(path); 50 | struct dirent *entry; 51 | while ((entry = readdir(dir)) != NULL) { 52 | char filepath[PATH_MAX]; 53 | snprintf(filepath, sizeof(filepath), "%s/%s", path, entry->d_name); 54 | 55 | // Skip special entries (".", "..") 56 | if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) { 57 | continue; 58 | } 59 | if (unlink(filepath) == -1) { 60 | elog(LOG, "Failed to delete file %s", filepath); 61 | perror("unlink"); 62 | } 63 | } 64 | closedir(dir); 65 | } 66 | 67 | int cleanup_table_data(const char *table_name) { 68 | char temp_path[PATH_MAX]; 69 | char data_path[PATH_MAX]; 70 | populate_temp_path_for_table(table_name, temp_path, /*relative=*/false); 71 | populate_data_path_for_table(table_name, data_path, /*relative=*/false); 72 | 73 | // delete temp files if they exist 74 | delete_files_in_directory(temp_path); 75 | // delete temp directory 76 | if (rmdir(temp_path) != 0) { 77 | elog(LOG, "Failed to delete temp data directory %s", temp_path); 78 | perror("rmdir"); 79 | return 1; 80 | } 81 | // delete main data files 82 | delete_files_in_directory(data_path); 83 | // delete data directory 84 | if (rmdir(data_path) != 0) { 85 | elog(LOG, "Failed to delete temp data file %s", data_path); 86 | perror("rmdir"); 87 | return 1; 88 | } 89 | return 0; 90 | } 91 | 92 | #endif -------------------------------------------------------------------------------- /ingestor/generate_test_data.sql: -------------------------------------------------------------------------------- 1 | DROP TABLE test_data; 2 | 3 | CREATE TABLE test_data ( 4 | id SERIAL PRIMARY KEY, 5 | name VARCHAR(50) NOT NULL, 6 | age INTEGER NOT NULL, 7 | is_random boolean, 8 | rating float4, 9 | created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP 10 | ); 11 | 12 | -- ~5 GiB table 13 | INSERT INTO test_data (name, age, is_random, rating) 14 | SELECT 15 | floor(random() * 1000) || ' Name', -- Generate random names 16 | floor(random() * 100) + 18 AS age, -- Generate random ages between 18-117 17 | random() < 0.5 as is_random, 18 | random() as rating 19 | FROM generate_series(1, 10000000); -------------------------------------------------------------------------------- /ingestor/ingestor--0.0.1.sql: -------------------------------------------------------------------------------- 1 | --complain if script is sourced in psql, rather than via CREATE EXTENSION 2 | \echo Use "CREATE EXTENSION ingestor;" to load this file. \quit 3 | 4 | -- Table to store export metadata 5 | CREATE TABLE analytica_exports ( 6 | table_name text PRIMARY KEY, 7 | last_run_completed TIMESTAMP WITH TIME ZONE, 8 | columns_to_export text[], 9 | export_frequency_hours int, 10 | export_status int, 11 | chunk_size int 12 | ); 13 | 14 | -- Register a postgres table for export 15 | CREATE OR REPLACE FUNCTION register_table_export( 16 | table_name text, 17 | columns_to_export text[], 18 | export_frequency_hours int, 19 | chunk_size int DEFAULT 100000) 20 | RETURNS bigint 21 | AS 'MODULE_PATHNAME' 22 | LANGUAGE C STRICT VOLATILE; 23 | 24 | -- Un-register a table for export and delete columnar data directory. 25 | CREATE OR REPLACE FUNCTION unregister_table_export(table_name text) 26 | RETURNS bigint 27 | AS 'MODULE_PATHNAME' 28 | LANGUAGE C VOLATILE; 29 | 30 | 31 | -- Launch an ingestion worker to export columnar data for tables registered for export. 
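-- A typical flow (illustrative, using the test_data table from generate_test_data.sql):
--   SELECT register_table_export('test_data', '{id,name,age}', 10);
--   SELECT ingestor_launch();
-- The worker only needs to be launched once; see the README for details.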
32 | CREATE OR REPLACE FUNCTION ingestor_launch() 33 | RETURNS pg_catalog.int4 STRICT 34 | AS 'MODULE_PATHNAME' 35 | LANGUAGE C; 36 | 37 | -- function to list all parquet iles in data directory for table 38 | CREATE OR REPLACE FUNCTION list_parquet_files(args jsonb) 39 | returns text[] as 40 | $$ 41 | begin 42 | return array_agg(args->>'dir' || '/' || filename) 43 | from pg_ls_dir(args->>'dir') as files(filename) 44 | where filename ~~ '%.parquet'; 45 | end 46 | $$ 47 | language plpgsql; -------------------------------------------------------------------------------- /ingestor/ingestor.c: -------------------------------------------------------------------------------- 1 | #include "postgres.h" 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | 9 | /* Header for arrow parquet */ 10 | #include 11 | #include 12 | #include 13 | 14 | /* These are always necessary for a bgworker */ 15 | #include "miscadmin.h" 16 | #include "postmaster/bgworker.h" 17 | #include "postmaster/interrupt.h" 18 | #include "storage/ipc.h" 19 | #include "storage/latch.h" 20 | #include "storage/lwlock.h" 21 | #include "storage/proc.h" 22 | #include "storage/shmem.h" 23 | 24 | /* these headers are used by this particular worker's code */ 25 | #include "access/xact.h" 26 | #include "commands/dbcommands.h" 27 | #include "constants.h" 28 | #include "executor/spi.h" 29 | #include "export_entry.h" 30 | #include "file_utils.h" 31 | #include "fmgr.h" 32 | #include "lib/stringinfo.h" 33 | #include "pgstat.h" 34 | #include "tcop/utility.h" 35 | #include "utils/acl.h" 36 | #include "utils/builtins.h" 37 | #include "utils/snapmgr.h" 38 | #include "utils/wait_event.h" 39 | 40 | PG_MODULE_MAGIC; 41 | 42 | PG_FUNCTION_INFO_V1(ingestor_launch); 43 | 44 | PGDLLEXPORT void ingestor_main(void) pg_attribute_noreturn(); 45 | 46 | #define MAX_COLUMN_NAME_CHARS 100 47 | #define MAX_SUPPORTED_COLUMNS 100 48 | #define MAX_EXPORT_ENTRIES 10 49 | #define PARQUET_ROW_GROUP_CHUNK_SIZE 10000 50 | 51 | /** Arrow functionality */ 52 | #define ASSIGN_IF_NOT_NULL(check_ptr, dest_ptr) \ 53 | { \ 54 | if (check_ptr != NULL) { \ 55 | *dest_ptr = *check_ptr; \ 56 | } \ 57 | } 58 | 59 | #define LOG_ARROW_ERROR(error) \ 60 | { \ 61 | if (error != NULL) { \ 62 | elog(LOG, error->message); \ 63 | } \ 64 | } 65 | 66 | static char source_database[] = "postgres"; 67 | static char *source_database_role = NULL; 68 | // Wait 5min after a successfull export run before starting again. 69 | // This param can be tuned using GUC variable. 70 | static int ingestor_naptime_sec = 300; 71 | 72 | static void list_current_directories() { 73 | DIR *dir; 74 | struct dirent *entry; 75 | 76 | dir = opendir("."); 77 | if (dir == NULL) { 78 | perror("opendir"); 79 | return; 80 | } 81 | 82 | while ((entry = readdir(dir)) != NULL) { 83 | // Check for "." and ".." 
entries 84 | if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) { 85 | continue; 86 | } 87 | elog(LOG, "Found directory with path %s", entry->d_name); 88 | } 89 | closedir(dir); 90 | } 91 | 92 | GArrowArray *create_int64_array(const int64 *data, int num_values, 93 | GError *error) { 94 | GError *inner_error = NULL; 95 | GArrowInt64ArrayBuilder *int_array_builder; 96 | int_array_builder = garrow_int64_array_builder_new(); 97 | 98 | for (int i = 0; i < num_values; i++) { 99 | int64 value = data[i]; 100 | elog(LOG, "Adding value %d to arrow array", value); 101 | garrow_int64_array_builder_append_value(int_array_builder, value, 102 | &inner_error); 103 | LOG_ARROW_ERROR(inner_error); 104 | ASSIGN_IF_NOT_NULL(inner_error, error); 105 | } 106 | GArrowArray *int_array = garrow_array_builder_finish( 107 | GARROW_ARRAY_BUILDER(int_array_builder), &inner_error); 108 | LOG_ARROW_ERROR(inner_error); 109 | ASSIGN_IF_NOT_NULL(inner_error, error); 110 | g_object_unref(int_array_builder); 111 | 112 | return int_array; 113 | } 114 | 115 | GArrowArray *create_timestamp_array(const int64 *data, int num_values, 116 | GError *error) { 117 | GError *inner_error = NULL; 118 | GArrowTimestampArrayBuilder *timestamp_array_builder; 119 | GTimeZone *time_zone = g_time_zone_new_utc(); 120 | GArrowTimestampDataType *timestamp_data_type = 121 | garrow_timestamp_data_type_new(GARROW_TIME_UNIT_SECOND, time_zone); 122 | timestamp_array_builder = 123 | garrow_timestamp_array_builder_new(timestamp_data_type); 124 | 125 | for (int i = 0; i < num_values; i++) { 126 | int64 value = data[i]; 127 | elog(LOG, "Adding value %ld to arrow array", value); 128 | garrow_timestamp_array_builder_append(timestamp_array_builder, value, 129 | &inner_error); 130 | LOG_ARROW_ERROR(inner_error); 131 | ASSIGN_IF_NOT_NULL(inner_error, error); 132 | } 133 | GArrowArray *int_array = garrow_array_builder_finish( 134 | GARROW_ARRAY_BUILDER(timestamp_array_builder), &inner_error); 135 | LOG_ARROW_ERROR(inner_error); 136 | ASSIGN_IF_NOT_NULL(inner_error, error); 137 | 138 | g_object_unref(timestamp_array_builder); 139 | g_object_unref(timestamp_data_type); 140 | 141 | return int_array; 142 | } 143 | 144 | GArrowArray *create_double_array(const double *data, int num_values, 145 | GError *error) { 146 | GError *inner_error = NULL; 147 | GArrowDoubleArrayBuilder *double_array_builder; 148 | double_array_builder = garrow_double_array_builder_new(); 149 | 150 | for (int i = 0; i < num_values; i++) { 151 | double value = data[i]; 152 | elog(LOG, "Adding value %f to arrow array", value); 153 | garrow_double_array_builder_append_value(double_array_builder, value, 154 | &inner_error); 155 | LOG_ARROW_ERROR(inner_error); 156 | ASSIGN_IF_NOT_NULL(inner_error, error); 157 | } 158 | GArrowArray *double_array = garrow_array_builder_finish( 159 | GARROW_ARRAY_BUILDER(double_array_builder), &inner_error); 160 | LOG_ARROW_ERROR(inner_error); 161 | ASSIGN_IF_NOT_NULL(inner_error, error); 162 | g_object_unref(double_array_builder); 163 | 164 | return double_array; 165 | } 166 | 167 | GArrowArray *create_bool_array(const int16 *data, int num_values, 168 | GError *error) { 169 | GError *inner_error = NULL; 170 | GArrowBooleanArrayBuilder *bool_array_builder; 171 | bool_array_builder = garrow_boolean_array_builder_new(); 172 | 173 | for (int i = 0; i < num_values; i++) { 174 | bool value = data[i] == 1 ? 
true : false; 175 | elog(LOG, "Adding value %d to arrow array", value); 176 | garrow_boolean_array_builder_append_value(bool_array_builder, value, 177 | &inner_error); 178 | LOG_ARROW_ERROR(inner_error); 179 | ASSIGN_IF_NOT_NULL(inner_error, error); 180 | } 181 | GArrowArray *bool_array = garrow_array_builder_finish( 182 | GARROW_ARRAY_BUILDER(bool_array_builder), &inner_error); 183 | LOG_ARROW_ERROR(inner_error); 184 | ASSIGN_IF_NOT_NULL(inner_error, error); 185 | g_object_unref(bool_array_builder); 186 | 187 | return bool_array; 188 | } 189 | 190 | GArrowArray *create_string_array(const char **data, int num_values, 191 | GError *error) { 192 | GError *inner_error = NULL; 193 | GArrowStringArrayBuilder *string_array_builder; 194 | string_array_builder = garrow_string_array_builder_new(); 195 | 196 | for (int i = 0; i < num_values; i++) { 197 | char *value = data[i]; 198 | if (value == 0) { 199 | value = ""; 200 | } 201 | elog(LOG, "Adding value %s to arrow array", value); 202 | garrow_string_array_builder_append_value(string_array_builder, value, 203 | &inner_error); 204 | ASSIGN_IF_NOT_NULL(inner_error, error); 205 | } 206 | 207 | GArrowArray *string_array = garrow_array_builder_finish( 208 | GARROW_ARRAY_BUILDER(string_array_builder), &inner_error); 209 | LOG_ARROW_ERROR(inner_error); 210 | ASSIGN_IF_NOT_NULL(inner_error, error); 211 | g_object_unref(string_array_builder); 212 | return string_array; 213 | } 214 | 215 | /* Stores information about column names and column types. */ 216 | typedef struct _ColumnInfo { 217 | char column_name[MAX_COLUMN_NAME_CHARS]; 218 | Oid column_type; 219 | } ColumnInfo; 220 | 221 | /** 222 | * Returns list of columns and type of each column. 223 | * num_of_columns is populated with columns in the table. 224 | * Caller should free the returned pointer. 
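 * The returned array has *num_of_columns entries, one per pg_attribute row with attnum > 0.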
225 | */ 226 | static ColumnInfo *get_column_types(const char *table_name, 227 | int *num_of_columns) { 228 | StringInfoData buf; 229 | initStringInfo(&buf); 230 | // TODO - ensure column schema is deterministic even after new columns are 231 | // added 232 | appendStringInfo(&buf, 233 | "SELECT attname, atttypid FROM pg_attribute WHERE attrelid " 234 | "= '%s'::regclass AND attnum > 0;", 235 | table_name); 236 | int status = SPI_execute(buf.data, true, 0); 237 | elog(LOG, "Executed SPI_execute query %s with status %d", buf.data, status); 238 | if (status != SPI_OK_SELECT) { 239 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 240 | errmsg("Failed to deduce column types"))); 241 | } 242 | elog(LOG, "Found %d processed rows", SPI_processed); 243 | bool isnull; 244 | ColumnInfo *columns = palloc_array(ColumnInfo, SPI_processed); 245 | for (int i = 0; i < SPI_processed; i += 1) { 246 | ColumnInfo column_info; 247 | Datum column_name_data = 248 | SPI_getbinval(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 1, &isnull); 249 | Name column_name = DatumGetName(column_name_data); 250 | char *column_name_value = column_name->data; 251 | elog(LOG, "Acquired column name as %s", column_name_value); 252 | 253 | Datum column_type_data = 254 | SPI_getbinval(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 2, &isnull); 255 | Oid column_type_value = DatumGetObjectId(column_type_data); 256 | elog(LOG, "Acquired column type as %d", column_type_value); 257 | 258 | strcpy(column_info.column_name, column_name_value); 259 | column_info.column_type = column_type_value; 260 | 261 | columns[i] = column_info; 262 | } 263 | *num_of_columns = SPI_processed; 264 | elog(LOG, "Finished extracting column types"); 265 | return columns; 266 | } 267 | 268 | /* 269 | * Creates Arrow schema for the table for basic types. 270 | * Caller is reponsible for freeing schema memory. 
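 * Type mapping used below: INT2/INT4/INT8 -> int64, FLOAT4/FLOAT8 -> double, TEXT/VARCHAR -> utf8, BOOL -> boolean, TIMESTAMP -> timestamp[s, UTC].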
271 | */ 272 | static GArrowSchema *create_table_schema(const ColumnInfo *column_info, 273 | const ExportEntry *entry, 274 | int overall_num_columns) { 275 | GError *error = NULL; 276 | GList *fields = NULL; 277 | GArrowSchema *temp = NULL; 278 | 279 | // Define data schema 280 | elog(LOG, "Creating arrow schema"); 281 | temp = garrow_schema_new(fields); 282 | for (int i = 0; i < entry->num_of_columns; i += 1) { 283 | const char *column_name = entry->columns_to_export[i]; 284 | // TODO - move column type deduction to util 285 | Oid column_type; 286 | bool column_type_found = false; 287 | for (int j = 0; j < overall_num_columns; j += 1) { 288 | if (strcmp(column_info[j].column_name, column_name) == 0) { 289 | column_type = column_info[j].column_type; 290 | column_type_found = true; 291 | break; 292 | } 293 | } 294 | Assert(column_type_found); 295 | elog(LOG, "Adding column %s to schema with type %d", column_name, 296 | column_type); 297 | switch (column_type) { 298 | case INT2OID: 299 | case INT4OID: 300 | case INT8OID: { 301 | GArrowDataType *int_type = garrow_int64_data_type_new(); 302 | GArrowField *int_field = garrow_field_new(column_name, int_type); 303 | 304 | temp = garrow_schema_add_field(temp, i, int_field, &error); 305 | 306 | g_object_unref(int_field); 307 | g_object_unref(int_type); 308 | break; 309 | } 310 | case FLOAT4OID: 311 | case FLOAT8OID: { 312 | GArrowDataType *precision_type = garrow_double_data_type_new(); 313 | GArrowField *precision_field = 314 | garrow_field_new(column_name, precision_type); 315 | 316 | temp = garrow_schema_add_field(temp, i, precision_field, &error); 317 | 318 | g_object_unref(precision_field); 319 | g_object_unref(precision_type); 320 | break; 321 | } 322 | case TEXTOID: 323 | case VARCHAROID: { 324 | GArrowDataType *string_type = garrow_string_data_type_new(); 325 | GArrowField *string_field = garrow_field_new(column_name, string_type); 326 | 327 | temp = garrow_schema_add_field(temp, i, string_field, &error); 328 | 329 | g_object_unref(string_field); 330 | g_object_unref(string_type); 331 | break; 332 | } 333 | case BOOLOID: { 334 | GArrowDataType *bool_type = garrow_boolean_data_type_new(); 335 | GArrowField *bool_field = garrow_field_new(column_name, bool_type); 336 | 337 | temp = garrow_schema_add_field(temp, i, bool_field, &error); 338 | 339 | g_object_unref(bool_field); 340 | g_object_unref(bool_type); 341 | break; 342 | } 343 | case TIMESTAMPOID: { 344 | GTimeZone *time_zone = g_time_zone_new_utc(); 345 | GArrowDataType *timestamp_type = 346 | garrow_timestamp_data_type_new(GARROW_TIME_UNIT_SECOND, time_zone); 347 | GArrowField *timestamp_field = 348 | garrow_field_new(column_name, timestamp_type); 349 | 350 | temp = garrow_schema_add_field(temp, i, timestamp_field, &error); 351 | 352 | g_object_unref(timestamp_field); 353 | g_object_unref(timestamp_type); 354 | break; 355 | } 356 | default: 357 | elog(LOG, "Column type %s not yet supported in columnar schema", 358 | column_type); 359 | break; 360 | } 361 | } 362 | const char *schema_str = garrow_schema_to_string(temp); 363 | elog(LOG, "Created schema with string %s", schema_str); 364 | return temp; 365 | } 366 | 367 | static int64 *create_int64_data_array(Datum *data, int num_values) { 368 | int64 *int_data = (int64 *)palloc(num_values * sizeof(int64)); 369 | for (int i = 0; i < num_values; i += 1) { 370 | if (data[i] == ULONG_MAX) { 371 | int_data[i] = 0; 372 | } else { 373 | int_data[i] = DatumGetInt64(data[i]); 374 | } 375 | } 376 | return int_data; 377 | } 378 | 379 | static double 
*create_double_data_array(Datum *data, int num_values) { 380 | double *double_data = (double *)palloc(num_values * sizeof(double)); 381 | for (int i = 0; i < num_values; i += 1) { 382 | if (data[i] == ULONG_MAX) { 383 | double_data[i] = 0.0f; 384 | } else { 385 | double_data[i] = DatumGetFloat8(data[i]); 386 | } 387 | } 388 | return double_data; 389 | } 390 | 391 | static int16 *create_bool_data_array(Datum *data, int num_values) { 392 | int16 *bool_data = (int16 *)palloc(num_values * sizeof(int16)); 393 | for (int i = 0; i < num_values; i += 1) { 394 | if (data[i] == ULONG_MAX) { 395 | bool_data[i] = 0; 396 | } else { 397 | bool_data[i] = DatumGetBool(data[i]) ? 1 : 0; 398 | } 399 | } 400 | return bool_data; 401 | } 402 | 403 | static int64 *create_timestamp_data_array(Datum *data, int num_values) { 404 | int64 *timestamp_data = (int64 *)palloc(num_values * sizeof(int64)); 405 | for (int i = 0; i < num_values; i += 1) { 406 | if (data[i] == ULONG_MAX) { 407 | timestamp_data[i] = 0; 408 | } else { 409 | Timestamp pg_timestanp = DatumGetTimestamp(data[i]); 410 | timestamp_data[i] = timestamptz_to_time_t(pg_timestanp); 411 | } 412 | } 413 | return timestamp_data; 414 | } 415 | 416 | static char **create_string_data_array(Datum *data, int num_values) { 417 | char **string_data = (char **)palloc(num_values * sizeof(char *)); 418 | for (int i = 0; i < num_values; i += 1) { 419 | if (data[i] == ULONG_MAX) { 420 | char *cstring = ""; 421 | string_data[i] = (char *)palloc(sizeof(char)); 422 | strcpy(string_data[i], cstring); 423 | } else { 424 | char *cstring = TextDatumGetCString(data[i]); 425 | string_data[i] = (char *)palloc(sizeof(char) * strlen(cstring) + 1); 426 | strcpy(string_data[i], cstring); 427 | } 428 | } 429 | return string_data; 430 | } 431 | 432 | /** 433 | * Returns Arrow table pointer with data populated from SPI query. 434 | * Expects SPI query to read desired table rows has been executed. 435 | * Caller is reponsible for freeing table memory. 436 | */ 437 | static GArrowTable *create_arrow_table(GArrowSchema *schema, 438 | const ColumnInfo *column_info, 439 | const ExportEntry *entry, 440 | int overall_num_columns) { 441 | Assert(SPI_processed > 0); 442 | elog(LOG, "Creating arrow table"); 443 | int num_export_columns = entry->num_of_columns; 444 | GArrowArray **arrow_arrays = 445 | (GArrowArray **)palloc(num_export_columns * sizeof(GArrowArray)); 446 | Datum **table_data = (Datum **)palloc(num_export_columns * sizeof(Datum *)); 447 | // Iterate over all rows for each column and create Datum Arays for each 448 | // column Datums are just pointers to the actual data so the additional memory 449 | // of creating temp datums array is very low. 
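// NULL values are recorded with the ULONG_MAX sentinel and later mapped to a type-specific default (0, 0.0, empty string) when the arrow arrays are built.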
450 | for (int i = 0; i < num_export_columns; i += 1) { 451 | // Copy datums to temp variable to access all datums for a column 452 | table_data[i] = (Datum *)palloc(SPI_processed * sizeof(Datum)); 453 | for (int j = 0; j < SPI_processed; j += 1) { 454 | bool isnull; 455 | Datum datum = SPI_getbinval(SPI_tuptable->vals[j], SPI_tuptable->tupdesc, 456 | i + 1, &isnull); 457 | if (!isnull) { 458 | table_data[i][j] = datum; 459 | } else { 460 | table_data[i][j] = ULONG_MAX; 461 | } 462 | } 463 | } 464 | GError *error = NULL; 465 | // Populate arrow arrays for each column 466 | for (int i = 0; i < num_export_columns; i += 1) { 467 | // TODO - move column type deduction to util 468 | const char *column_name = entry->columns_to_export[i]; 469 | Oid column_type; 470 | bool column_type_found = false; 471 | for (int j = 0; j < overall_num_columns; j += 1) { 472 | if (strcmp(column_info[j].column_name, column_name) == 0) { 473 | column_type = column_info[j].column_type; 474 | column_type_found = true; 475 | break; 476 | } 477 | } 478 | Assert(column_type_found); 479 | elog(LOG, "Adding values to arrow array for column %s", column_name); 480 | switch (column_type) { 481 | case INT2OID: 482 | case INT4OID: 483 | case INT8OID: { 484 | const int64 *int_data = 485 | create_int64_data_array(table_data[i], SPI_processed); 486 | arrow_arrays[i] = create_int64_array(int_data, SPI_processed, error); 487 | pfree(int_data); 488 | elog(LOG, "Created int data array"); 489 | LOG_ARROW_ERROR(error); 490 | break; 491 | } 492 | case FLOAT4OID: 493 | case FLOAT8OID: { 494 | const double *double_data = 495 | create_double_data_array(table_data[i], SPI_processed); 496 | arrow_arrays[i] = create_double_array(double_data, SPI_processed, error); 497 | pfree(double_data); 498 | elog(LOG, "Created int data array"); 499 | LOG_ARROW_ERROR(error); 500 | break; 501 | } 502 | case TEXTOID: 503 | case VARCHAROID: { 504 | const char **string_data = 505 | create_string_data_array(table_data[i], SPI_processed); 506 | arrow_arrays[i] = create_string_array(string_data, SPI_processed, error); 507 | for (int k = 0; k < SPI_processed; k += 1) { 508 | pfree(string_data[k]); 509 | } 510 | pfree(string_data); 511 | elog(LOG, "Created string data array"); 512 | LOG_ARROW_ERROR(error); 513 | break; 514 | } 515 | case BOOLOID: { 516 | const int16 *bool_data = 517 | create_bool_data_array(table_data[i], SPI_processed); 518 | arrow_arrays[i] = create_bool_array(bool_data, SPI_processed, error); 519 | pfree(bool_data); 520 | elog(LOG, "Created bool data array"); 521 | LOG_ARROW_ERROR(error); 522 | break; 523 | } 524 | case TIMESTAMPOID: { 525 | const int64 *timestamp_data = 526 | create_timestamp_data_array(table_data[i], SPI_processed); 527 | arrow_arrays[i] = 528 | create_timestamp_array(timestamp_data, SPI_processed, error); 529 | pfree(timestamp_data); 530 | elog(LOG, "Created timestamp data array"); 531 | LOG_ARROW_ERROR(error); 532 | break; 533 | } 534 | default: 535 | elog(LOG, "Column type %s not yet supported in columnar schema", 536 | column_type); 537 | break; 538 | } 539 | } 540 | GArrowTable *table = 541 | garrow_table_new_arrays(schema, arrow_arrays, num_export_columns, &error); 542 | LOG_ARROW_ERROR(error); 543 | 544 | for (int i = 0; i < num_export_columns; i += 1) { 545 | g_object_unref(arrow_arrays[i]); 546 | pfree(table_data[i]); 547 | } 548 | pfree(arrow_arrays); 549 | pfree(table_data); 550 | 551 | const char *table_definition = garrow_table_to_string(table, &error); 552 | elog(LOG, "Created arrow table with definition %s", 
table_definition); 553 | return table; 554 | } 555 | 556 | static void write_arrow_table(const char *table_name, int chunk_num, 557 | GArrowSchema *schema, GArrowTable *table) { 558 | char path[PATH_MAX]; 559 | char file_name[PATH_MAX]; 560 | populate_temp_path_for_table(table_name, path, /*relative=*/true); 561 | sprintf(file_name, "/%d.parquet", chunk_num); 562 | strcat(path, file_name); 563 | elog(LOG, "Attempting to write file %s", path); 564 | 565 | GError *error = NULL; 566 | GParquetWriterProperties *writer_properties = 567 | gparquet_writer_properties_new(); 568 | GParquetArrowFileWriter *writer = gparquet_arrow_file_writer_new_path( 569 | schema, path, writer_properties, &error); 570 | LOG_ARROW_ERROR(error); 571 | 572 | gboolean is_write_successfull = gparquet_arrow_file_writer_write_table( 573 | writer, table, PARQUET_ROW_GROUP_CHUNK_SIZE, &error); 574 | LOG_ARROW_ERROR(error); 575 | 576 | gboolean file_closed = gparquet_arrow_file_writer_close(writer, &error); 577 | LOG_ARROW_ERROR(error); 578 | 579 | Assert(file_closed); 580 | 581 | g_object_unref(table); 582 | g_object_unref(writer); 583 | g_object_unref(writer_properties); 584 | 585 | elog(LOG, "Sucessfully wrote columnar file to disk."); 586 | } 587 | 588 | void delete_export_entry(const char *table_name) { 589 | StringInfoData buf; 590 | initStringInfo(&buf); 591 | appendStringInfo(&buf, 592 | "DELETE FROM analytica_exports WHERE table_name = '%s';", 593 | table_name); 594 | 595 | SetCurrentStatementStartTimestamp(); 596 | StartTransactionCommand(); 597 | PushActiveSnapshot(GetTransactionSnapshot()); 598 | int connection = SPI_connect(); 599 | if (connection < 0) { 600 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 601 | errmsg("Failed to connect to database"))); 602 | } 603 | elog(LOG, "Executing SPI_execute query %s", buf.data); 604 | int status = SPI_execute(buf.data, /*read_only=*/false, /*count=*/0); 605 | elog(LOG, "Executed SPI_execute command with status %d", status); 606 | if (status != SPI_OK_DELETE) { 607 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 608 | errmsg("Failed to delete export entry."))); 609 | } 610 | SPI_finish(); 611 | PopActiveSnapshot(); 612 | CommitTransactionCommand(); 613 | } 614 | 615 | static void get_tables_to_process_in_order(ExportEntry *entries, 616 | int *num_of_tables) { 617 | StringInfoData buf; 618 | initStringInfo(&buf); 619 | appendStringInfo(&buf, "SELECT \ 620 | table_name, \ 621 | columns_to_export, \ 622 | last_run_completed, \ 623 | export_frequency_hours, \ 624 | export_status, \ 625 | chunk_size, \ 626 | now() \ 627 | FROM analytica_exports \ 628 | ORDER BY last_run_completed \ 629 | LIMIT %d;", 630 | MAX_EXPORT_ENTRIES); 631 | 632 | SetCurrentStatementStartTimestamp(); 633 | StartTransactionCommand(); 634 | PushActiveSnapshot(GetTransactionSnapshot()); 635 | int connection = SPI_connect(); 636 | if (connection < 0) { 637 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 638 | errmsg("Failed to connect to database"))); 639 | } 640 | elog(LOG, "Created connection for query"); 641 | elog(LOG, "Executing SPI_execute query %s", buf.data); 642 | int status = SPI_execute(buf.data, /*read_only=*/true, /*count=*/0); 643 | elog(LOG, "Executed SPI_execute command with status %d", status); 644 | if (status < 0) { 645 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 646 | errmsg("Failed to fetch tables to export."))); 647 | } 648 | 649 | elog(LOG, "Beginning processing of %d rows", SPI_processed); 650 | int valid_entries = 0; 651 | 652 | // Calaculate number 
of valid entries to export. 653 |   for (int i = 0; i < SPI_processed; i += 1) { 654 |     bool isnull; 655 |     bool is_valid_entry = false; 656 |     Datum last_completed_datum = 657 |         SPI_getbinval(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 3, &isnull); 658 |     elog(LOG, "Read last completed datum"); 659 |     if (isnull) { 660 |       // Table is newly scheduled for export so last run completed is null 661 |       is_valid_entry = true; 662 |       elog(LOG, "last completed datum is null"); 663 |     } else { 664 |       elog(LOG, "Evaluating if older export entry is past threshold"); 665 |       Datum export_frequency_datum = SPI_getbinval( 666 |           SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 4, &isnull); 667 |       int export_frequency = DatumGetInt16(export_frequency_datum); 668 | 669 |       int64 last_completed = DatumGetTimestampTz(last_completed_datum); 670 |       int64 last_run_pg_time = timestamptz_to_time_t(last_completed); 671 | 672 |       Datum current_time_datum = SPI_getbinval( 673 |           SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 7, &isnull); 674 |       TimestampTz current_time = DatumGetTimestampTz(current_time_datum); 675 |       int64 current_time_pg_time = timestamptz_to_time_t(current_time); 676 | 677 |       elog(LOG, "last run time %lld", last_run_pg_time); 678 |       elog(LOG, "current time %lld", current_time_pg_time); 679 | 680 |       int64 seconds_elapsed = (current_time_pg_time - last_run_pg_time); 681 |       elog(LOG, "seconds elapsed are %lld", seconds_elapsed); 682 |       uint64 hours_elapsed = (seconds_elapsed) / 3600; 683 |       elog(LOG, "%llu hours elapsed since last export", hours_elapsed); 684 |       if (hours_elapsed > export_frequency) { 685 |         is_valid_entry = true; 686 |         elog(LOG, "Marking older entry for export again"); 687 |       } else { 688 |         elog(LOG, "Older export entry threshold hasn't passed"); 689 |       } 690 |     } 691 | 692 |     /** 693 |      * Surface inactive entries for cleanup later.
694 |      */ 695 |     Datum export_status_datum = 696 |         SPI_getbinval(SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 5, &isnull); 697 |     int export_status = DatumGetInt32(export_status_datum); 698 |     if (export_status == INACTIVE) { 699 |       is_valid_entry = true; 700 |     } 701 | 702 |     if (is_valid_entry) { 703 |       valid_entries += 1; 704 | 705 |       Datum name_datum = SPI_getbinval(SPI_tuptable->vals[i], 706 |                                        SPI_tuptable->tupdesc, 1, &isnull); 707 |       char *table_name = TextDatumGetCString(name_datum); 708 | 709 |       Datum chunk_size_datum = SPI_getbinval(SPI_tuptable->vals[i], 710 |                                              SPI_tuptable->tupdesc, 6, &isnull); 711 |       int64 chunk_size = DatumGetInt64(chunk_size_datum); 712 | 713 |       ArrayType *arr = DatumGetArrayTypeP(SPI_getbinval( 714 |           SPI_tuptable->vals[i], SPI_tuptable->tupdesc, 2, &isnull)); 715 |       Datum *column_datums; 716 |       int num_of_columns; 717 |       deconstruct_array(arr, TEXTOID, -1, false, TYPALIGN_INT, &column_datums, 718 |                         NULL, &num_of_columns); 719 | 720 |       ExportEntry entry; 721 |       initialize_export_entry(table_name, num_of_columns, &entry); 722 |       entry.export_status = export_status; 723 |       entry.chunk_size = chunk_size; 724 | 725 |       for (int j = 0; j < num_of_columns; j += 1) { 726 |         char *column_name = TextDatumGetCString(column_datums[j]); 727 |         export_entry_add_column(&entry, column_name, j); 728 |       } 729 |       entries[valid_entries - 1] = entry; // pack valid entries densely so callers can iterate [0, num_of_tables) 730 |     } 731 |   } 732 |   *num_of_tables = valid_entries; 733 |   SPI_finish(); 734 |   PopActiveSnapshot(); 735 |   CommitTransactionCommand(); 736 | } 737 | 738 | static void setup_data_directories(const char *table_name) { 739 | 740 |   char root_path[PATH_MAX]; 741 |   char base_path[PATH_MAX]; 742 |   char temp_path[PATH_MAX]; 743 | 744 |   populate_root_path(root_path, /*relative=*/false); 745 |   populate_data_path_for_table(table_name, base_path, /*relative=*/false); 746 |   populate_temp_path_for_table(table_name, temp_path, /*relative=*/false); 747 | 748 |   elog(LOG, "Setting up root directory %s", root_path); 749 |   elog(LOG, "Setting up base directory %s", base_path); 750 |   elog(LOG, "Setting up temp directory %s", temp_path); 751 | 752 |   // Create root data directory if not exists. 753 |   if (mkdir(root_path, 0755) == -1) { 754 |     perror("mkdir"); 755 |     elog(LOG, "Failed to create root directory %s", root_path); 756 |   } 757 | 758 |   // Create base data directory if not exists. 759 |   if (mkdir(base_path, 0755) == -1) { 760 |     perror("mkdir"); 761 |     elog(LOG, "Failed to create data directory %s", base_path); 762 |   } 763 | 764 |   // Create temp directory if not exists. 765 |   if (mkdir(temp_path, 0755) == -1) { 766 |     perror("mkdir"); 767 |     elog(LOG, "Failed to create temp directory %s", temp_path); 768 |   } 769 | 770 |   // Delete content in temp directory. 771 |   DIR *dir = opendir(temp_path); 772 |   struct dirent *entry; 773 |   while (dir != NULL && (entry = readdir(dir)) != NULL) { 774 |     char filepath[PATH_MAX]; 775 |     snprintf(filepath, sizeof(filepath), "%s/%s", temp_path, entry->d_name); 776 | 777 |     // Skip special entries (".", "..") 778 |     if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) { 779 |       continue; 780 |     } 781 |     if (unlink(filepath) == -1) { 782 |       elog(LOG, "Failed to delete temp file %s", filepath); 783 |     } 784 |   } 785 |   if (dir != NULL) { closedir(dir); } 786 |   elog(LOG, "Data directory setup complete for %s", table_name); 787 | } 788 | 789 | /** 790 |  * Moves columnar files from temp directory to data directory for table. 791 |  * Existing files in data directory are deleted. 792 |  * Currently this moves files one by one, so it isn't atomic.
793 | */ 794 | void move_temp_files(const char *table_name) { 795 | DIR *dir; 796 | struct dirent *entry; 797 | 798 | char temp_path[PATH_MAX]; 799 | char data_path[PATH_MAX]; 800 | 801 | populate_data_path_for_table(table_name, data_path, false); 802 | populate_temp_path_for_table(table_name, temp_path, false); 803 | 804 | // delete older parquet files in data directory 805 | // This might cause issues if an existing query is reading them. 806 | // There might be a way to acquire lock for file in unix. 807 | dir = opendir(data_path); 808 | while ((entry = readdir(dir)) != NULL) { 809 | char filepath[PATH_MAX]; 810 | snprintf(filepath, sizeof(filepath), "%s/%s", data_path, entry->d_name); 811 | 812 | // Skip special entries (".", "..") 813 | if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) { 814 | continue; 815 | } 816 | if (unlink(filepath) == -1) { 817 | elog(LOG, "Failed to delete data file %s", filepath); 818 | } 819 | } 820 | closedir(dir); 821 | 822 | dir = opendir(temp_path); 823 | if (dir == NULL) { 824 | perror("opendir"); 825 | return; 826 | } 827 | 828 | while ((entry = readdir(dir)) != NULL) { 829 | if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) { 830 | continue; 831 | } 832 | elog(LOG, "Found entry with path %s", entry->d_name); 833 | char src_path[PATH_MAX]; 834 | char dest_path[PATH_MAX]; 835 | 836 | // temp columnar file 837 | strcpy(src_path, temp_path); 838 | strcat(src_path, "/"); 839 | strcat(src_path, entry->d_name); 840 | 841 | // Final path 842 | strcpy(dest_path, data_path); 843 | strcat(dest_path, "/"); 844 | strcat(dest_path, entry->d_name); 845 | 846 | elog(LOG, "Temp path %s", src_path); 847 | elog(LOG, "Data path %s", dest_path); 848 | 849 | if (rename(src_path, dest_path) == 0) { 850 | elog(LOG, "File '%s' renamed to '%s' successfully.\n", src_path, 851 | dest_path); 852 | } else { 853 | perror("rename"); // Print error message if rename fails 854 | } 855 | } 856 | closedir(dir); 857 | } 858 | 859 | char *get_columns_string(char **columns, int num_of_columns) { 860 | int total_size = 0; 861 | for (int i = 0; i < num_of_columns; i++) { 862 | char *column_name = columns[i]; 863 | total_size += strlen(column_name) + 1; 864 | elog(LOG, "Size of column %s is %d", column_name, strlen(column_name)); 865 | } 866 | char *combined_string = (char *)palloc(total_size * sizeof(char)); 867 | combined_string[0] = '\0'; 868 | for (int i = 0; i < num_of_columns; i++) { 869 | strcat(combined_string, columns[i]); 870 | if (i < num_of_columns - 1) { 871 | strcat(combined_string, ","); 872 | } 873 | } 874 | return combined_string; 875 | } 876 | 877 | void export_table_data(ExportEntry entry) { 878 | int offset = 0; 879 | int64 processed_count; 880 | int select; 881 | 882 | SetCurrentStatementStartTimestamp(); 883 | StartTransactionCommand(); 884 | PushActiveSnapshot(GetTransactionSnapshot()); 885 | int connection = SPI_connect(); 886 | if (connection == SPI_ERROR_CONNECT) { 887 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 888 | errmsg("Failed to connect to database"))); 889 | } 890 | elog(LOG, "Created connection for query"); 891 | 892 | int total_columns; 893 | elog(LOG, "Trying to extract column types"); 894 | ColumnInfo *column_info = get_column_types(entry.table_name, &total_columns); 895 | elog(LOG, "Extracted %d column types", total_columns); 896 | 897 | GArrowSchema *arrow_schema = 898 | create_table_schema(column_info, &entry, total_columns); 899 | 900 | int chunk = 0; 901 | char *column_str = 902 | 
get_columns_string(entry.columns_to_export, entry.num_of_columns); 903 | 904 |   do { 905 |     StringInfoData buf; 906 |     initStringInfo(&buf); 907 |     // TODO - order of columns in result should match schema columns 908 |     appendStringInfo(&buf, "SELECT %s FROM %s LIMIT " INT64_FORMAT " OFFSET %d;", column_str, 909 |                      entry.table_name, entry.chunk_size, offset); 910 | 911 |     // Execute the chunked query using SPI_exec 912 |     elog(LOG, "Executing SPI_execute query %s", buf.data); 913 |     SetCurrentStatementStartTimestamp(); 914 |     select = SPI_execute(buf.data, true, 0); 915 |     elog(LOG, "Executed SPI_execute command with status %d", select); 916 | 917 |     if (select != SPI_OK_SELECT) { 918 |       ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 919 |                       errmsg("SELECT Query execution failed"))); 920 |     } 921 |     processed_count = SPI_processed; 922 |     elog(LOG, "Processed %lld rows", processed_count); 923 |     elog(LOG, "Number of rows is %lu", SPI_tuptable->numvals); 924 |     if (processed_count > 0) { 925 |       GArrowTable *arrow_table = 926 |           create_arrow_table(arrow_schema, column_info, &entry, total_columns); 927 |       // write to disk 928 |       write_arrow_table(entry.table_name, chunk, arrow_schema, arrow_table); 929 |       chunk += 1; 930 |     } 931 |     // Increment offset by number of processed rows 932 |     offset += processed_count; 933 |   } while (processed_count > 0); 934 |   elog(LOG, "Finished processing %d rows", offset); 935 |   pfree(column_str); 936 |   // Move files from temp directory to data directory. 937 |   move_temp_files(entry.table_name); 938 | 939 |   g_object_unref(arrow_schema); 940 |   pfree(column_info); 941 | 942 |   SPI_finish(); 943 |   PopActiveSnapshot(); 944 |   CommitTransactionCommand(); 945 | } 946 | 947 | /** 948 |  * Update table export status after successful export. 949 |  */ 950 | void update_table_export_metadata(const char *table_name) { 951 |   StringInfoData buf; 952 |   initStringInfo(&buf); 953 |   appendStringInfo(&buf, "UPDATE analytica_exports \ 954 |     SET last_run_completed = CURRENT_TIMESTAMP, export_status = %d \ 955 |     WHERE table_name = '%s';", 956 |                    ACTIVE, table_name); 957 | 958 |   SetCurrentStatementStartTimestamp(); 959 |   StartTransactionCommand(); 960 |   PushActiveSnapshot(GetTransactionSnapshot()); 961 |   int connection = SPI_connect(); 962 |   if (connection < 0) { 963 |     ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 964 |                     errmsg("Failed to connect to database"))); 965 |   } 966 |   elog(LOG, "Created connection for query"); 967 |   elog(LOG, "Executing SPI_execute query %s", buf.data); 968 |   int status = SPI_execute(buf.data, /*read_only=*/false, /*count=*/0); 969 |   elog(LOG, "Executed SPI_execute command with status %d", status); 970 | 971 |   SPI_finish(); 972 |   PopActiveSnapshot(); 973 |   CommitTransactionCommand(); 974 | } 975 | 976 | void cleanup_inactive_export(const char *table_name) { 977 |   elog(LOG, "Cleaning up data for table %s", table_name); 978 |   // delete data directories 979 |   int status = cleanup_table_data(table_name); 980 |   if (status != 0) { 981 |     elog(LOG, "Failed to cleanup data for table %s", table_name); 982 |     return; 983 |   } 984 |   // delete metadata entry 985 |   delete_export_entry(table_name); 986 |   elog(LOG, "Cleaned up data for table %s", table_name); 987 | } 988 | 989 | void register_table_with_parquet_server(const ExportEntry *entry) { 990 |   StringInfoData buf; 991 |   initStringInfo(&buf); 992 |   appendStringInfo(&buf, "DROP FOREIGN TABLE IF EXISTS analytica_%s;\n", 993 |                    entry->table_name); 994 |   appendStringInfo(&buf, "select import_parquet( \ 995 |     'analytica_%s', \ 996 |     'public', \ 997 |
'parquet_srv', \ 998 | 'list_parquet_files', \ 999 | '{\"dir\": \"./pg_analytica/%s\"}', \ 1000 | '{\"use_mmap\": \"true\", \"use_threads\": \"true\"}' \ 1001 | );", 1002 | entry->table_name, entry->table_name); 1003 | 1004 | SetCurrentStatementStartTimestamp(); 1005 | StartTransactionCommand(); 1006 | PushActiveSnapshot(GetTransactionSnapshot()); 1007 | int connection = SPI_connect(); 1008 | if (connection < 0) { 1009 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 1010 | errmsg("Failed to connect to database"))); 1011 | } 1012 | elog(LOG, "Created connection for query"); 1013 | elog(LOG, "Executing SPI_execute query %s", buf.data); 1014 | int status = SPI_execute(buf.data, /*read_only=*/false, /*count=*/0); 1015 | elog(LOG, "Executed SPI_execute command with status %d", status); 1016 | if (status != SPI_OK_SELECT) { 1017 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 1018 | errmsg("Failed to register new table entry."))); 1019 | } 1020 | SPI_finish(); 1021 | PopActiveSnapshot(); 1022 | CommitTransactionCommand(); 1023 | } 1024 | 1025 | /** 1026 | * Use SPI to read rows from table. 1027 | * https://www.postgresql.org/docs/current/spi.html 1028 | */ 1029 | void ingestor_main() { 1030 | elog(LOG, "Started export in background worker"); 1031 | bits32 flags = 0; 1032 | BackgroundWorkerUnblockSignals(); 1033 | elog(LOG, "Establishing connection to database %s with role %s", 1034 | source_database, source_database_role); 1035 | BackgroundWorkerInitializeConnection(source_database, source_database_role, 1036 | flags); 1037 | 1038 | for (;;) { 1039 | /* 1040 | * Background workers mustn't call usleep() or any direct equivalent: 1041 | * instead, they may wait on their process latch, which sleeps as 1042 | * necessary, but is awakened if postmaster dies. That way the 1043 | * background process goes away immediately in an emergency. 
1044 |      */ 1045 |     (void)WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, 1046 |                     ingestor_naptime_sec * 1000L, PG_WAIT_EXTENSION); 1047 |     ResetLatch(MyLatch); 1048 | 1049 |     int num_of_tables; 1050 |     ExportEntry entries[MAX_EXPORT_ENTRIES]; 1051 |     elog(LOG, "Fetching tables to export"); 1052 |     get_tables_to_process_in_order(entries, &num_of_tables); 1053 | 1054 |     elog(LOG, "Beginning export for %d tables", num_of_tables); 1055 | 1056 |     for (int i = 0; i < num_of_tables; i += 1) { 1057 |       char *table_name = entries[i].table_name; 1058 | 1059 |       if (entries[i].export_status == INACTIVE) { 1060 |         elog(LOG, "Cleaning up inactive entry %s", entries[i].table_name); 1061 |         cleanup_inactive_export(table_name); 1062 |         free_export_entry(&entries[i]); 1063 |         continue; 1064 |       } 1065 | 1066 |       elog(LOG, "Initializing data directory for %s with %d columns", 1067 |            table_name, entries[i].num_of_columns); 1068 |       setup_data_directories(table_name); 1069 | 1070 |       elog(LOG, "Starting export for %s", table_name); 1071 |       export_table_data(entries[i]); 1072 | 1073 |       elog(LOG, "Updating export status for %s", table_name); 1074 |       update_table_export_metadata(entries[i].table_name); 1075 | 1076 |       elog(LOG, "Registering table %s with parquet fdw", table_name); 1077 |       register_table_with_parquet_server(&entries[i]); 1078 | 1079 |       elog(LOG, "Freeing export entry"); 1080 |       free_export_entry(&entries[i]); 1081 | 1082 |       elog(LOG, "Export completed for %s", table_name); 1083 | 1084 |       CHECK_FOR_INTERRUPTS(); 1085 |     } 1086 |   } 1087 |   exit(0); 1088 | } 1089 | 1090 | Datum ingestor_launch(PG_FUNCTION_ARGS) { 1091 |   pid_t pid; 1092 |   BackgroundWorker worker; 1093 |   BackgroundWorkerHandle *handle; 1094 |   BgwHandleStatus status; 1095 | 1096 |   DefineCustomIntVariable( 1097 |       "pg_analytica.naptime", "Duration between each check (in seconds).", NULL, 1098 |       &ingestor_naptime_sec, 10, 1, INT_MAX, PGC_SIGHUP, 0, NULL, NULL, NULL); 1099 |   DefineCustomStringVariable("pg_analytica.database", "Database to connect to.", 1100 |                              NULL, &source_database, "postgres", PGC_SIGHUP, 0, 1101 |                              NULL, NULL, NULL); 1102 |   DefineCustomStringVariable("pg_analytica.role", "Role to connect with.", NULL, 1103 |                              &source_database_role, NULL, PGC_SIGHUP, 0, NULL, 1104 |                              NULL, NULL); 1105 | 1106 |   memset(&worker, 0, sizeof(worker)); 1107 |   worker.bgw_flags = 1108 |       BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION; 1109 |   worker.bgw_start_time = BgWorkerStart_RecoveryFinished; 1110 |   worker.bgw_restart_time = BGW_DEFAULT_RESTART_INTERVAL; 1111 |   sprintf(worker.bgw_library_name, "ingestor"); 1112 |   sprintf(worker.bgw_function_name, "ingestor_main"); 1113 |   snprintf(worker.bgw_name, BGW_MAXLEN, "ingestor dynamic worker"); 1114 |   snprintf(worker.bgw_type, BGW_MAXLEN, "ingestor dynamic"); 1115 |   /* set bgw_notify_pid so that we can use WaitForBackgroundWorkerStartup */ 1116 |   worker.bgw_notify_pid = MyProcPid; 1117 | 1118 |   if (!RegisterDynamicBackgroundWorker(&worker, &handle)) 1119 |     PG_RETURN_NULL(); 1120 | 1121 |   status = WaitForBackgroundWorkerStartup(handle, &pid); 1122 |   if (status == BGWH_STOPPED) 1123 |     ereport(ERROR, 1124 |             (errcode(ERRCODE_INSUFFICIENT_RESOURCES), 1125 |              errmsg("could not start background process"), 1126 |              errhint("More details may be available in the server log."))); 1127 |   if (status == BGWH_POSTMASTER_DIED) 1128 |     ereport(ERROR, 1129 |             (errcode(ERRCODE_INSUFFICIENT_RESOURCES), 1130 |              errmsg("cannot start background processes without postmaster"), 1131 |              errhint("Kill all remaining database processes and
restart the " 1132 | "database."))); 1133 | 1134 | Assert(status == BGWH_STARTED); 1135 | 1136 | elog(LOG, "Background worker started"); 1137 | PG_RETURN_INT32(pid); 1138 | } 1139 | -------------------------------------------------------------------------------- /ingestor/ingestor.control: -------------------------------------------------------------------------------- 1 | # ingestor postgresql extension 2 | comment = 'Extension to create columnar store from table in background worker.' 3 | default_version = '0.0.1' 4 | module_pathname = '$libdir/ingestor' 5 | relocatable = true 6 | -------------------------------------------------------------------------------- /ingestor/parquet.sql: -------------------------------------------------------------------------------- 1 | -- This file contains methods to test parquet_fdw functionality. 2 | 3 | -- function to list all parquet iles in data directory for table 4 | create function list_parquet_files(args jsonb) 5 | returns text[] as 6 | $$ 7 | begin 8 | return array_agg(args->>'dir' || '/' || filename) 9 | from pg_ls_dir(args->>'dir') as files(filename) 10 | where filename ~~ '%.parquet'; 11 | end 12 | $$ 13 | language plpgsql; 14 | 15 | -- Register parquet files as foreign table 16 | create or replace foreign table test_data_columnar ( 17 | id int, 18 | name varchar(40), 19 | age int 20 | ) 21 | server parquet_srv 22 | options ( 23 | files_func 'list_parquet_files', 24 | files_func_arg '{"dir": "./pg_analytica/test_data"}', 25 | use_mmap 'true', 26 | use_threads 'true' 27 | ); 28 | 29 | select import_parquet( 30 | 'analytica.distributors', 31 | 'public', 32 | 'parquet_srv', 33 | 'list_parquet_files', 34 | '{"dir": "./pg_analytica/distributors"}', 35 | '{"use_mmap": "true", "use_threads": "true"}' 36 | ); 37 | -------------------------------------------------------------------------------- /ingestor/registry.c: -------------------------------------------------------------------------------- 1 | #include "postgres.h" 2 | #include "access/xact.h" 3 | #include "catalog/pg_type_d.h" 4 | #include "constants.h" 5 | #include "executor/spi.h" 6 | #include "postgres.h" 7 | #include "utils/array.h" 8 | #include "utils/builtins.h" 9 | #include "utils/snapmgr.h" 10 | 11 | PG_FUNCTION_INFO_V1(register_table_export); 12 | PG_FUNCTION_INFO_V1(unregister_table_export); 13 | 14 | static int execute_query(StringInfoData buf) { 15 | int connection = SPI_connect(); 16 | if (connection < 0) { 17 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 18 | errmsg("Failed to connect to database"))); 19 | } 20 | elog(LOG, "Created connection for query"); 21 | // Execute the chunked query using SPI_exec 22 | elog(LOG, "Executing SPI_execute query %s", buf.data); 23 | int status = SPI_execute(buf.data, false, 0); 24 | elog(LOG, "Executed SPI_execute command with status %d", status); 25 | 26 | SPI_finish(); 27 | return status; 28 | } 29 | 30 | static char *get_columns_string(Datum *columns, int num_of_columns) { 31 | int total_size = 0; 32 | elog(LOG, "Creating stringified columns"); 33 | for (int i = 0; i < num_of_columns; i++) { 34 | char *column_name = TextDatumGetCString(columns[i]); 35 | total_size += strlen(column_name) + 1; 36 | elog(LOG, "Size of column %s is %d", column_name, strlen(column_name)); 37 | } 38 | char *combined_string = (char *)palloc(total_size * sizeof(char)); 39 | elog(LOG, "Allocated memory"); 40 | combined_string[0] = '\0'; 41 | for (int i = 0; i < num_of_columns; i++) { 42 | strcat(combined_string, TextDatumGetCString(columns[i])); 43 | elog(LOG, 
"Creating combined string %s", combined_string); 44 | if (i < num_of_columns - 1) { 45 | strcat(combined_string, ","); 46 | } 47 | } 48 | return combined_string; 49 | } 50 | 51 | Datum register_table_export(PG_FUNCTION_ARGS) { 52 | // TODO - add validation to ensure table exists 53 | // and column types are supported for export 54 | int num_of_args = PG_NARGS(); 55 | if (num_of_args != 4) { 56 | ereport(ERROR, (errcode(ERRCODE_RAISE_EXCEPTION), 57 | errmsg("Invalid number of arguments. Expected format is " 58 | "register_export(table_name text, columns_to_export " 59 | "text[], export_frequency_hours int)"))); 60 | } 61 | // Extract table name 62 | char *table_name = text_to_cstring(PG_GETARG_TEXT_PP(0)); 63 | // Extract columns to export 64 | ArrayType *arr = PG_GETARG_ARRAYTYPE_P(1); 65 | Assert(ARR_NDIM(arr) == 1); 66 | Assert(ARR_ELEMTYPE(arr) == TEXTOID); 67 | Datum *column_datums; 68 | int num_of_columns; 69 | deconstruct_array(arr, TEXTOID, -1, false, TYPALIGN_INT, &column_datums, NULL, 70 | &num_of_columns); 71 | char *column_str; 72 | column_str = get_columns_string(column_datums, num_of_columns); 73 | elog(LOG, "Created column string as %s", column_str); 74 | // Extract export frequency 75 | int32 export_frequency_hours = PG_GETARG_INT32(2); 76 | int64 chunk_size = PG_GETARG_INT32(3); 77 | 78 | StringInfoData buf; 79 | initStringInfo(&buf); 80 | appendStringInfo(&buf, 81 | "INSERT INTO analytica_exports (table_name, " 82 | "columns_to_export, export_frequency_hours, export_status, " 83 | "chunk_size) VALUES ('%s', '{%s}', %d, %d, %ld);", 84 | table_name, column_str, export_frequency_hours, PENDING, 85 | chunk_size); 86 | 87 | int status = execute_query(buf); 88 | if (status < 0) { 89 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 90 | errmsg("Query execution failed"))); 91 | } 92 | pfree(column_str); 93 | elog(LOG, "Scheduled export for table %s with frequency of %d hours", 94 | table_name, export_frequency_hours); 95 | PG_RETURN_INT32(1); 96 | } 97 | 98 | Datum unregister_table_export(PG_FUNCTION_ARGS) { 99 | int num_of_args = PG_NARGS(); 100 | if (num_of_args != 1) { 101 | ereport(ERROR, (errcode(ERRCODE_RAISE_EXCEPTION), 102 | errmsg("Invalid number of arguments. Expected format is " 103 | "unregister_export(table_name text)"))); 104 | } 105 | // Extract table name 106 | char *table_name = text_to_cstring(PG_GETARG_TEXT_PP(0)); 107 | 108 | StringInfoData buf; 109 | initStringInfo(&buf); 110 | appendStringInfo(&buf, "UPDATE analytica_exports \ 111 | SET export_status = %d \ 112 | WHERE table_name = '%s';", 113 | INACTIVE, table_name); 114 | 115 | int status = execute_query(buf); 116 | if (status < 0) { 117 | ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE), 118 | errmsg("Failed to unregister table."))); 119 | } 120 | 121 | elog(LOG, 122 | "Marked %s table export as inactive. 
Table data will be deleted at next " 123 |        "available window.", 124 |        table_name); 125 |   PG_RETURN_INT32(1); 126 | } -------------------------------------------------------------------------------- /ingestor/test.sql: -------------------------------------------------------------------------------- 1 | DROP EXTENSION ingestor; 2 | CREATE EXTENSION ingestor; 3 | 4 | SELECT register_table_export( 5 |   -- table name 6 |   'test_data', 7 |   -- columns to export 8 |   '{id,name,age,is_random,rating,created_at}', 9 |   -- export_frequency_hours 10 |   7, 11 |   -- chunk_size (rows read per exported chunk) 12 |   100000 13 | ); 14 | 15 | -- To Launch ingestor background worker 16 | SELECT ingestor_launch(); -------------------------------------------------------------------------------- /notes.md: -------------------------------------------------------------------------------- 1 | ## Rough Notes on pg_analytica extension 2 | 3 | 4 | ## Approach 1 5 | 6 | Try creating a columnar table by running a `CREATE TABLE AS` query that reads data from 7 | an existing table and uses a custom table access method to write the data. 8 | 9 | To keep the table updated as new rows are added, we would need to explore other techniques. 10 | We may create a hook that periodically writes new tuples from the source table. 11 | Maybe we can keep track of the oldest txn id ingested and query rows with txn id greater than that and write to columnar store. 12 | 13 | - pg_analytica extension defines a custom table access method 14 | - When used, the following methods are called 15 | - `memam_relation_set_new_filelocator` 16 | - `memam_relation_needs_toast_table` 17 | - `memam_tuple_insert` called for each row in source table 18 | - Data in tuples is empty, number of valid attributes `tts_nvalid` is 0. 19 | - Calling `ExecFetchSlotHeapTuple` fetches tuple with actual data in it. 20 | 21 | - `memam_relation_set_new_filelocator` called to create storage for new table 22 | - `RelationCreateStorage` can be called to create the storage manager relation 23 | `SMgrRelation` used by postgres to access the table storage 24 | - Citus maps columnar storage concepts to postgres concepts like pages 25 | - `CatalogTupleInsert` is called at the end to insert table information into postgres catalog 26 | 27 | ## Approach 2 28 | - Table Access Method definition only supports row-by-row insert for `CREATE TABLE AS` queries 29 | - Multi-insert is only called if data is dumped to disk and read as a file 30 | - `CTAS` approach to create columnar store may not work 31 | 32 | - Creating columnar tables in background process 33 | - Extension creates a metadata table to store metadata about columnar tables 34 | - Calling a method exposed by extension adds table entry to metadata table 35 | - A new table is also created in catalog that can use custom table AM handler ?? 36 | - Can we override query behavior here? 37 | - Background worker periodically scans registered tables and reads a subset of rows and writes it as a columnar file 38 | - After writing, the latest transaction id is written to metadata table 39 | - instead of transaction id we can read using OFFSET and chunk as well 40 | ``` 41 | SELECT * FROM your_table 42 | OFFSET offset 43 | LIMIT chunk_size; 44 | ``` 45 | - For deterministic order, we can ask users to supply a list of columns by which 46 | rows should be ordered. Columnar files will be sorted by these columns. 47 | - In subsequent runs, only rows after latest transaction id / offset are scanned 48 | - Alternatively, we can scan the entire table each time.
49 | - Another background worker periodically combines multiple columnar files for the same table into a single file 50 | - Multiple ways of initializing a file defined here 51 | `postgres/src/include/storage/fd.h` 52 | - Initializing a subdirectory in postgres data directory here 53 | `MakePGDirectory(const char *directoryName)` 54 | - Details about creating background worker here 55 | `postgres/src/include/postmaster/bgworker.h` 56 | - Example usage of background worker 57 | `postgres/src/test/modules/worker_spi/worker_spi.c` 58 | 59 | ## Reading columnar file created by ingestor 60 | 61 | The columnar file is created in the postgres data directory (PGDATA). 62 | PGDATA directory can be found using this sql query. 63 | ``` 64 | SELECT current_setting('data_directory') AS pgdata_path; 65 | ``` 66 | 67 | To read the parquet file, start a duck db instance in memory and 68 | execute the select query to read the parquet file. 69 | 70 | ``` 71 | $ duckdb 72 | 73 | SELECT * from read_parquet('/opt/homebrew/var/postgresql@14/sample.parquet'); 74 | ``` 75 | 76 | # Query using Parquet Foreign Data Wrapper 77 | - https://github.com/adjust/parquet_fdw/blob/master/README.md 78 | 79 | # Query timings 80 | 81 | 82 | ### Test 1 83 | 84 | Measured using the timing feature in postgres. 85 | - Table Size little over 2GiB 86 | - 30M rows with 2 int64 columns and a varchar column exported 87 | 88 | ``` 89 | \timing on 90 | ``` 91 | 92 | - select with group by and filter 93 | ``` 94 | SELECT 95 | age, 96 | COUNT(*) 97 | FROM test_data_columnar 98 | WHERE age > 50 AND age < 90 AND id > 1000 99 | GROUP BY age; 100 | 101 | Time: 1633.900 ms (00:01.634) 102 | 103 | SELECT 104 | age, 105 | COUNT(*) 106 | FROM test_data 107 | WHERE age > 50 AND age < 90 AND id > 1000 108 | GROUP BY age; 109 | 110 | Time: 5153.393 ms (00:05.153) 111 | ``` 112 | 113 | ### Test 2 114 | 115 | Measured using the timing feature in postgres. 116 | - Table Size little over 5GiB 117 | - 60M rows with 6 columns of different types 118 | 119 | 120 | ``` 121 | \timing on 122 | 123 | 124 | -- Query on row based data 125 | SELECT 126 | age, 127 | COUNT(*) 128 | FROM test_data 129 | WHERE age > 50 AND age < 90 AND is_random 130 | GROUP BY age; 131 | 132 | Time: 9446.380 ms (00:09.446) 133 | 134 | -- Query on columnar data 135 | SELECT 136 | age, 137 | COUNT(*) 138 | FROM analytica_test_data 139 | WHERE age > 50 AND age < 90 AND is_random 140 | GROUP BY age; 141 | 142 | Time: 2312.845 ms (00:02.313) 143 | 144 | postgres=# \timing on 145 | Timing is on. 146 | postgres=# SELECT COUNT(*) from analytica_test_data WHERE age > 50 AND age < 55; 147 | count 148 | --------- 149 | 2398998 150 | (1 row) 151 | 152 | Time: 1399.673 ms (00:01.400) 153 | 154 | 155 | postgres=# SELECT COUNT(*) from test_data WHERE age > 50 AND age < 55; 156 | count 157 | --------- 158 | 2399709 159 | (1 row) 160 | 161 | Time: 9904.160 ms (00:09.904) 162 | 163 | ``` 164 | 165 | ## Export timings 166 | 167 | ### Test 1 168 | 169 | Time to export can be reduced with variable exort chunk sizes. 170 | Users can choose larger chunk size depending on size of row and system memory. 
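For example, a wide table on a machine with limited memory can be registered with a smaller chunk size, while a narrow table can use a much larger one. A minimal sketch of the registration call (the column list and the 500k value are only illustrative; `register_table_export` takes the table name, the columns to export, the export frequency in hours, and the chunk size):

```
SELECT register_table_export(
  'test_data',
  '{id,name,age}',
  -- export_frequency_hours
  24,
  -- chunk_size: rows read per SELECT ... LIMIT/OFFSET chunk
  500000
);
```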
171 | 172 | - Table size a little over 2GiB 173 | - 30M rows with 2 int64 columns and a varchar column exported 174 | - export chunk size 10k 175 | - start time 12:21 PM 176 | - 10M / 30M rows processed at 12:33 PM 177 | - Finished processing at 1:34 PM 178 | - Overall export time - 1hr 13m 179 | - Max memory usage - 3.2GiB 180 | - 3k columnar files exported 181 | 182 | ### Test 2 183 | 184 | - Table size a little over 5GiB 185 | - export chunk size 100k 186 | - Started at 8:47 PM 187 | - Peak memory usage steadily growing, currently at 9GiB 188 | - possible memory leak 189 | - memory usage at 9.82GiB 190 | - Completed at 9:34 PM 191 | - Peak memory usage 10.35GiB 192 | - 47min to export 5GiB of data 193 | - 600 columnar files exported 194 | 195 | ## Tasks 196 | 197 | - Look into memory usage increase 198 | - see if using cursors to read chunks reduces memory usage (see the cursor sketch below) 199 | - check for memory leaks 200 | - Create section in readme about perf improvements 201 | 202 | - Add export timing metrics to metadata 203 | - Support exporting columnar files sorted by keys --------------------------------------------------------------------------------
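One option for the cursor task above, sketched only as an idea rather than the current implementation: replace the repeated `LIMIT ... OFFSET` queries in `export_table_data` with a single SPI cursor and fetch `chunk_size` rows per iteration. The helper name `export_with_cursor` and its wiring are hypothetical; it assumes it runs inside the same SPI connection and transaction block the existing export loop already uses.

```
// Hypothetical sketch, not the current implementation.
static void export_with_cursor(const ExportEntry *entry, const char *column_str) {
  StringInfoData buf;
  initStringInfo(&buf);
  appendStringInfo(&buf, "SELECT %s FROM %s;", column_str, entry->table_name);

  // One open portal instead of a new LIMIT/OFFSET query per chunk.
  Portal portal = SPI_cursor_open_with_args("export_cursor", buf.data,
                                            0, NULL, NULL, NULL,
                                            /*read_only=*/true, /*cursorOptions=*/0);
  int chunk = 0;
  for (;;) {
    // Fills SPI_tuptable / SPI_processed with the next chunk_size rows.
    SPI_cursor_fetch(portal, /*forward=*/true, entry->chunk_size);
    if (SPI_processed == 0)
      break;
    // Build and write one Arrow table per fetched chunk, as the existing loop does:
    // create_arrow_table(...) followed by write_arrow_table(entry->table_name, chunk, ...).
    chunk += 1;
    // Free the per-chunk tuple table so memory stays bounded across chunks.
    SPI_freetuptable(SPI_tuptable);
  }
  SPI_cursor_close(portal);
}
```

Each fetch avoids re-scanning the rows already exported (which `OFFSET` forces), and freeing the tuple table after every chunk is the part that should help with the steadily growing memory usage observed in Test 2.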