├── .gitignore
├── LICENSE
├── README.md
├── TROUBLESHOOTING
├── analyze.sh
├── docs
│   └── quack_20131127_8bit.png
├── dragonscale.html
├── greyscale_report.sh
├── greyscale_stats.sh
├── internal_image_creator.sh
├── presentation.sh
├── quack.sh
├── quack_helper_common.sh
├── quack_helper_imagepage.sh
├── samples
│   ├── ACE-17950616-0018B.alto.xml
│   ├── ACE-17950616-0018B.jpg
│   ├── ACE-17950616-0018B.snippet
│   ├── ACE-17950616-0019A.alto.xml
│   ├── ACE-17950616-0019A.jpg
│   ├── ACE-17950616-0019B.alto.xml
│   ├── ACE-17950616-0019B.jpg
│   ├── README
│   └── folder.snippet
└── web
    ├── folder_template.html
    ├── histogram_template.html
    ├── image_template.html
    ├── imagelink_template.html
    ├── quack.css
    ├── quack.js
    ├── quack_folder.js
    ├── sorttable.js
    ├── sorttable.licence.html
    └── thumb_template.html
/.gitignore:
--------------------------------------------------------------------------------
1 | openseadragon.min.js
2 | quack.settings
3 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction, and
10 | distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by the copyright
13 | owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all other entities
16 | that control, are controlled by, or are under common control with that entity.
17 | For the purposes of this definition, "control" means (i) the power, direct or
18 | indirect, to cause the direction or management of such entity, whether by
19 | contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the
20 | outstanding shares, or (iii) beneficial ownership of such entity.
21 |
22 | "You" (or "Your") shall mean an individual or Legal Entity exercising
23 | permissions granted by this License.
24 |
25 | "Source" form shall mean the preferred form for making modifications, including
26 | but not limited to software source code, documentation source, and configuration
27 | files.
28 |
29 | "Object" form shall mean any form resulting from mechanical transformation or
30 | translation of a Source form, including but not limited to compiled object code,
31 | generated documentation, and conversions to other media types.
32 |
33 | "Work" shall mean the work of authorship, whether in Source or Object form, made
34 | available under the License, as indicated by a copyright notice that is included
35 | in or attached to the work (an example is provided in the Appendix below).
36 |
37 | "Derivative Works" shall mean any work, whether in Source or Object form, that
38 | is based on (or derived from) the Work and for which the editorial revisions,
39 | annotations, elaborations, or other modifications represent, as a whole, an
40 | original work of authorship. For the purposes of this License, Derivative Works
41 | shall not include works that remain separable from, or merely link (or bind by
42 | name) to the interfaces of, the Work and Derivative Works thereof.
43 |
44 | "Contribution" shall mean any work of authorship, including the original version
45 | of the Work and any modifications or additions to that Work or Derivative Works
46 | thereof, that is intentionally submitted to Licensor for inclusion in the Work
47 | by the copyright owner or by an individual or Legal Entity authorized to submit
48 | on behalf of the copyright owner. For the purposes of this definition,
49 | "submitted" means any form of electronic, verbal, or written communication sent
50 | to the Licensor or its representatives, including but not limited to
51 | communication on electronic mailing lists, source code control systems, and
52 | issue tracking systems that are managed by, or on behalf of, the Licensor for
53 | the purpose of discussing and improving the Work, but excluding communication
54 | that is conspicuously marked or otherwise designated in writing by the copyright
55 | owner as "Not a Contribution."
56 |
57 | "Contributor" shall mean Licensor and any individual or Legal Entity on behalf
58 | of whom a Contribution has been received by Licensor and subsequently
59 | incorporated within the Work.
60 |
61 | 2. Grant of Copyright License.
62 |
63 | Subject to the terms and conditions of this License, each Contributor hereby
64 | grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free,
65 | irrevocable copyright license to reproduce, prepare Derivative Works of,
66 | publicly display, publicly perform, sublicense, and distribute the Work and such
67 | Derivative Works in Source or Object form.
68 |
69 | 3. Grant of Patent License.
70 |
71 | Subject to the terms and conditions of this License, each Contributor hereby
72 | grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free,
73 | irrevocable (except as stated in this section) patent license to make, have
74 | made, use, offer to sell, sell, import, and otherwise transfer the Work, where
75 | such license applies only to those patent claims licensable by such Contributor
76 | that are necessarily infringed by their Contribution(s) alone or by combination
77 | of their Contribution(s) with the Work to which such Contribution(s) was
78 | submitted. If You institute patent litigation against any entity (including a
79 | cross-claim or counterclaim in a lawsuit) alleging that the Work or a
80 | Contribution incorporated within the Work constitutes direct or contributory
81 | patent infringement, then any patent licenses granted to You under this License
82 | for that Work shall terminate as of the date such litigation is filed.
83 |
84 | 4. Redistribution.
85 |
86 | You may reproduce and distribute copies of the Work or Derivative Works thereof
87 | in any medium, with or without modifications, and in Source or Object form,
88 | provided that You meet the following conditions:
89 |
90 | You must give any other recipients of the Work or Derivative Works a copy of
91 | this License; and
92 | You must cause any modified files to carry prominent notices stating that You
93 | changed the files; and
94 | You must retain, in the Source form of any Derivative Works that You distribute,
95 | all copyright, patent, trademark, and attribution notices from the Source form
96 | of the Work, excluding those notices that do not pertain to any part of the
97 | Derivative Works; and
98 | If the Work includes a "NOTICE" text file as part of its distribution, then any
99 | Derivative Works that You distribute must include a readable copy of the
100 | attribution notices contained within such NOTICE file, excluding those notices
101 | that do not pertain to any part of the Derivative Works, in at least one of the
102 | following places: within a NOTICE text file distributed as part of the
103 | Derivative Works; within the Source form or documentation, if provided along
104 | with the Derivative Works; or, within a display generated by the Derivative
105 | Works, if and wherever such third-party notices normally appear. The contents of
106 | the NOTICE file are for informational purposes only and do not modify the
107 | License. You may add Your own attribution notices within Derivative Works that
108 | You distribute, alongside or as an addendum to the NOTICE text from the Work,
109 | provided that such additional attribution notices cannot be construed as
110 | modifying the License.
111 | You may add Your own copyright statement to Your modifications and may provide
112 | additional or different license terms and conditions for use, reproduction, or
113 | distribution of Your modifications, or for any such Derivative Works as a whole,
114 | provided Your use, reproduction, and distribution of the Work otherwise complies
115 | with the conditions stated in this License.
116 |
117 | 5. Submission of Contributions.
118 |
119 | Unless You explicitly state otherwise, any Contribution intentionally submitted
120 | for inclusion in the Work by You to the Licensor shall be under the terms and
121 | conditions of this License, without any additional terms or conditions.
122 | Notwithstanding the above, nothing herein shall supersede or modify the terms of
123 | any separate license agreement you may have executed with Licensor regarding
124 | such Contributions.
125 |
126 | 6. Trademarks.
127 |
128 | This License does not grant permission to use the trade names, trademarks,
129 | service marks, or product names of the Licensor, except as required for
130 | reasonable and customary use in describing the origin of the Work and
131 | reproducing the content of the NOTICE file.
132 |
133 | 7. Disclaimer of Warranty.
134 |
135 | Unless required by applicable law or agreed to in writing, Licensor provides the
136 | Work (and each Contributor provides its Contributions) on an "AS IS" BASIS,
137 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied,
138 | including, without limitation, any warranties or conditions of TITLE,
139 | NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are
140 | solely responsible for determining the appropriateness of using or
141 | redistributing the Work and assume any risks associated with Your exercise of
142 | permissions under this License.
143 |
144 | 8. Limitation of Liability.
145 |
146 | In no event and under no legal theory, whether in tort (including negligence),
147 | contract, or otherwise, unless required by applicable law (such as deliberate
148 | and grossly negligent acts) or agreed to in writing, shall any Contributor be
149 | liable to You for damages, including any direct, indirect, special, incidental,
150 | or consequential damages of any character arising as a result of this License or
151 | out of the use or inability to use the Work (including but not limited to
152 | damages for loss of goodwill, work stoppage, computer failure or malfunction, or
153 | any and all other commercial damages or losses), even if such Contributor has
154 | been advised of the possibility of such damages.
155 |
156 | 9. Accepting Warranty or Additional Liability.
157 |
158 | While redistributing the Work or Derivative Works thereof, You may choose to
159 | offer, and charge a fee for, acceptance of support, warranty, indemnity, or
160 | other liability obligations and/or rights consistent with this License. However,
161 | in accepting such obligations, You may act only on Your own behalf and on Your
162 | sole responsibility, not on behalf of any other Contributor, and only if You
163 | agree to indemnify, defend, and hold each Contributor harmless for any liability
164 | incurred by, or claims asserted against, such Contributor by reason of your
165 | accepting any such warranty or additional liability.
166 |
167 | END OF TERMS AND CONDITIONS
168 |
169 | APPENDIX: How to apply the Apache License to your work
170 |
171 | To apply the Apache License to your work, attach the following boilerplate
172 | notice, with the fields enclosed by brackets "[]" replaced with your own
173 | identifying information. (Don't include the brackets!) The text should be
174 | enclosed in the appropriate comment syntax for the file format. We also
175 | recommend that a file or class name and description of purpose be included on
176 | the same "printed page" as the copyright notice for easier identification within
177 | third-party archives.
178 |
179 | Copyright [yyyy] [name of copyright owner]
180 |
181 | Licensed under the Apache License, Version 2.0 (the "License");
182 | you may not use this file except in compliance with the License.
183 | You may obtain a copy of the License at
184 |
185 | http://www.apache.org/licenses/LICENSE-2.0
186 |
187 | Unless required by applicable law or agreed to in writing, software
188 | distributed under the License is distributed on an "AS IS" BASIS,
189 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
190 | See the License for the specific language governing permissions and
191 | limitations under the License.
192 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Quack
2 |
3 | An enhanced [ALTO](http://www.loc.gov/standards/alto/)-viewer for Quality Assurance oriented display of collections of scans, typically from books or newspapers.
4 |
5 | Please visit [http://tokee.github.io/quack/](http://tokee.github.io/quack/) for the project homepage, featuring a live demo.
6 |
7 |
8 | 
9 |
10 | ## Requirements
11 |
12 | * A minimum of 2GB free RAM for processing of 30MP scans
13 | * bash
14 | * [GraphicsMagick](http://www.graphicsmagick.org/)
15 | * [ImageMagick](http://www.imagemagick.org) (as GraphicsMagick cannot create histograms)
16 | * [openseadragon.min.js](http://openseadragon.github.io/)
17 | * A suitably beefy, browser-equipped machine for display, depending on image sizes and ALTO complexity
18 | * [deepzoom](http://search.cpan.org/~drrho/Graphics-DZI-0.05/script/deepzoom) (only if tile based display is enabled. Install with 'sudo cpan -f install Graphics::DZI')
19 | * Perl (required by deepzoom)
20 |
21 | ## Usage
22 |
23 | To get started, execute the following in a terminal (tested under Linux, should work under OS X and Cygwin):
24 |
25 | ./quack.sh samples samples_out
26 |
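Before the first run it can help to confirm that the command line tools from the Requirements list are on the PATH. The following is a minimal sketch, not part of Quack: `missing_tools` is a hypothetical helper, and the tool names assume that GraphicsMagick is installed as `gm` and ImageMagick as `convert` and `identify`, which is how the scripts in this repository call them.

```shell
#!/bin/bash
# Hypothetical helper (not part of Quack): print every given command
# that cannot be found on the PATH, one per line.
missing_tools() {
    local TOOL
    for TOOL in "$@"; do
        if ! command -v "$TOOL" > /dev/null 2>&1; then
            echo "$TOOL"
        fi
    done
}

# gm = GraphicsMagick; convert and identify = ImageMagick
MISSING=$(missing_tools bash gm convert identify)
if [ -n "$MISSING" ]; then
    echo "Missing tools:" $MISSING 1>&2
else
    echo "All required command line tools were found"
fi
```

The check covers executables only; `openseadragon.min.js` and the optional `deepzoom` script still have to be verified separately.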
--------------------------------------------------------------------------------
/TROUBLESHOOTING:
--------------------------------------------------------------------------------
1 | Some issues encountered with Quack and how to solve them.
2 |
3 | *** deepzoom does not work ***
4 |
5 | There seems to be a problem with Perl 5.18+ and deepzoom 0.05.
6 | http://search.cpan.org/~drrho/Graphics-DZI-0.05/script/deepzoom
7 |
8 | It complains about an enum being wrongly initialized.
9 | https://metacpan.org/pod/release/ETHER/Moose-2.1106-TRIAL/lib/Moose/Manual/Delta.pod#pod2.1200
10 |
11 | The problem can be solved by replacing
12 | enum 'packing' => qw( exponential linear );
13 | with
14 | enum 'packing' => [qw( exponential linear )];
15 | in the file
16 | /usr/local/share/perl/5.18.2/Graphics/DZI/A4.pm
17 | (or similar folder). There will still be warnings with "given is
18 | experimental" and "when is experimental", but they are not critical
19 | in Perl 5.18 (although this probably means that Graphics::DZI will
20 | need further tweaks to work with later Perl versions).
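The replacement described above can also be scripted. This is a sketch under assumptions: `patch_dzi_enum` is a hypothetical helper, and it assumes the `enum` line in A4.pm matches the text quoted above apart from spacing. It keeps a backup of the file as `A4.pm.bak`.

```shell
# Hypothetical helper: apply the enum fix to the given A4.pm.
# The original file is kept as <file>.bak.
patch_dzi_enum() {
    local FILE="$1"
    if [ ! -f "$FILE" ]; then
        echo "patch_dzi_enum: $FILE does not exist" 1>&2
        return 1
    fi
    # Wrap the qw(...) list in [ ] so the enum receives an array reference.
    # An already patched line does not match, so re-running is harmless.
    sed -i.bak "s/\(enum 'packing' *=> *\)qw( *exponential  *linear *);/\1[qw( exponential linear )];/" "$FILE"
}

# Example call (root privileges may be needed for system-wide Perl modules):
# patch_dzi_enum /usr/local/share/perl/5.18.2/Graphics/DZI/A4.pm
```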
21 |
22 |
23 | *** The zoomable images are too bright ***
24 |
25 | This issue has not been investigated fully. The text below is based on
26 | speculation, but seems to produce the expected result.
27 |
28 | It seems that conversion to PNG with newer versions of Graphics Magick
29 | produces an image with Gamma (gAMA chunk) set to 1.0 if no gamma is
30 | specified in the source image.
31 |
32 | The common implicit assumption with no Gamma specified is Gamma 2.2
33 | so the displayed tiles will appear very bright (1.0 -> 2.2).
34 |
35 | To counter this effect, set Gamma explicitly to 0.4 (1.0 / 2.2) in the
36 | QA image generation by adding the line
37 |
38 | export QA_EXTRA="+gamma 0.4"
39 |
40 | to quack.settings. This specifies the Gamma chunk without altering the
41 | pixel values.
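Adding the line by hand is enough, but a small script avoids duplicate entries when the step is repeated. This is a sketch only: `add_gamma_override` is a hypothetical helper, and it assumes that quack.settings does not already define `QA_EXTRA` with other options (it does not merge values).

```shell
# Hypothetical helper: append the gamma override to a settings file,
# unless exactly that line is already present.
add_gamma_override() {
    local SETTINGS="${1:-quack.settings}"
    local LINE='export QA_EXTRA="+gamma 0.4"'
    # -x matches whole lines only, -F treats the pattern as a fixed string
    if [ -f "$SETTINGS" ] && grep -qxF -- "$LINE" "$SETTINGS"; then
        return 0
    fi
    echo "$LINE" >> "$SETTINGS"
}

# Example call from the Quack folder:
# add_gamma_override quack.settings
```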
42 |
--------------------------------------------------------------------------------
/analyze.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | #
4 | # Statistical helper functions for greyscale images.
5 | #
6 | # Requirements: ImageMagick's identify and convert
7 | #
8 |
9 | # If true, input files are assumed to be greyscale.
10 | # If false, they are converted to greyscale before extracting statistics.
11 | if [ "." == ".$ASSUME_GREY" ]; then
12 | ASSUME_GREY=true
13 | fi
14 |
15 | # Prints width and height of the given image, separated by space
16 | # Input: Image
17 | function isize() {
18 | identify -format "%w %h" "$1"
19 | }
20 | export -f isize
21 |
22 | # TODO: Accept destination for identify-file as input
23 | # TODO: If FORCE_HISTOGRAM is true, cached identify-files should be deleted
24 | # to ensure CROP_PERCENT is used
25 |
26 | # Input: image [destination]
27 | # Sample: foo.png
28 | # Produces foo.identify if not already existing.
29 | # Output: The name of the identify file
30 | function im_identify() {
31 | local SRC="$1"
32 | if [ -n "$2" ]; then
33 | local DEST_FOLDER="$2"
34 | else
35 | local DEST_FOLDER=$(dirname "$SRC")
36 | fi
37 |
38 | local BASE=${SRC##*/}
39 | local IDENTIFY=${DEST_FOLDER}/${BASE%%.*}.identify
40 |
41 | if [ -f "$IDENTIFY" ]; then
42 | echo "$IDENTIFY"
43 | return
44 | fi
45 | if [ "false" == "$ASSUME_GREY" ]; then
46 | # We do the TIFF-conversion to force greyscale
47 | local TMP=`mktemp --suffix .tif`
48 | if [ "." == ".$CROP_PERCENT" ]; then
49 | gm convert "$SRC" -colorspace gray "$TMP"
50 | else
51 | gm convert "$SRC" -gravity Center -crop $CROP_PERCENT%x+0+0 -colorspace gray "$TMP"
52 | fi
53 | identify -verbose "$TMP" > "$IDENTIFY"
54 | rm "$TMP"
55 | else
56 | if [ "." == ".$CROP_PERCENT" ]; then
57 | identify -verbose "$SRC" > "$IDENTIFY"
58 | else
59 | local TMP=`mktemp --suffix .tif`
60 | gm convert "$SRC" -gravity Center -crop $CROP_PERCENT%x+0+0 "$TMP"
61 | identify -verbose "$TMP" > "$IDENTIFY"
62 | rm "$TMP"
63 | fi
64 | fi
65 | echo "$IDENTIFY"
66 | }
67 | export -f im_identify
68 |
69 | # Outputs all the greyscale values and their counts
70 | # Input: Image [destination]
71 | function greys() {
72 | local IFILE=`im_identify "$1" "$2"`
73 |
74 | local VALUES=`cat "$IFILE" | grep -A 256 Histogram`
75 | if [ ! "." == ".`grep Colormap "$IFILE"`" ]; then
76 | cat "$IFILE" | grep -A 257 Histogram | grep -B 256 Colormap | grep "[0-9]\\+: ("
77 | else
78 | cat "$IFILE" | grep -A 256 Histogram | grep "[0-9]\\+: ("
79 | fi
80 | }
81 | export -f greys
82 |
83 | # TODO: Accept destination for grey-stats-file as input
84 |
85 | # Input: image
86 | # Sample: foo.png
87 | # Produces foo.grey with $PIXELS $UNIQUE $FIRST_COUNT $PERCENT_FIRST $FIRST_GREY $LAST_COUNT $PERCENT_LAST $LAST_GREY
88 | # Output: $PIXELS $UNIQUE $FIRST_COUNT $PERCENT_FIRST $FIRST_GREY $LAST_COUNT $PERCENT_LAST $LAST_GREY $SPIKE_COUNT $SPIKE_PERCENT $SPIKE_GREY $ZEROES $HOLES
89 | function grey_stats() {
90 | local SRC="$1"
91 | if [ -n "$2" ]; then
92 | local DEST_FOLDER="$2"
93 | else
94 | local DEST_FOLDER=$(dirname "$SRC")
95 | fi
96 |
97 | if [ ! -f "$SRC" ]; then
98 | echo "grey_stats: The file $SRC does not exist in `pwd`" 1>&2
99 | return
100 | fi
101 |
102 | local IDENTIFY=$(im_identify "$SRC" "$DEST_FOLDER")
103 |
104 | local BASE=${SRC##*/}
105 | local GREY=${DEST_FOLDER}/${BASE%%.*}.grey
106 |
107 | local INFO=`cat "$IDENTIFY"`
108 | local RAW_VALUES=`greys "$SRC" "$DEST_FOLDER"`
109 | # TODO: No good as the histogram data might be much less than 256
110 | # local VALUES=`cat "$IDENTIFY" | grep -A 256 Histogram`
111 | # if [ ! "." == ".`echo "$VALUES" | grep Colormap`" ]; then
112 | # local VALUES=`echo "$VALUES" | grep -B 256 Colormap`
113 | # fi
114 | # local RAW_VALUES=`echo "$VALUES" | grep "[0-9]\\+: ("`
115 | # local VALUES="$INFO"
116 | # ***
117 | local SAVEIFS=$IFS
118 | IFS=$(echo -en $"\n")
119 |
120 | local UNIQUE=`echo "$RAW_VALUES" | wc -l`
121 |
122 | local FIRST_REAL_GREY=`echo "$RAW_VALUES" | head -n 1 | sed 's/.* ( *\([0-9]\+\),.*/\1/'`
123 |
124 | local UNIQUE_DARKS=0
125 | if [ ! "1,1,1" == "$BLOWN_BLACK_BT" ]; then
126 | # TODO: Add skipping based on BLOWN_BLACK_WT
127 | local FIRST_COUNT=0
128 | local MAXG=`echo "$BLOWN_BLACK_BT" | grep -o "^[^,]\+"`
129 | # echo "$RAW_VALUES" | head -n $MAXG
130 | IFS=$(echo -en $"\n\b")
131 | for E in `echo "$RAW_VALUES" | head -n $MAXG`; do
132 | # echo "e:$E"
133 | # 81422: ( 0, 0, 0) #000000 black
134 | local C=`echo "$E" | grep -o " [0-9]\\+:" | grep -o "[0-9]\\+"`
135 | local G=`echo "$E" | grep -o ": \\+([^0-9]*[0-9]\\+," | grep -o "[0-9]\\+"`
136 | # echo "c:$C g:$G t:$MAXG"
137 | if [ "$G" -lt "$MAXG" ]; then
138 | local UNIQUE_DARKS=$((UNIQUE_DARKS+1))
139 | local FIRST_COUNT=$((FIRST_COUNT+$C))
140 | local LAST_VALID=$G
141 | fi
142 | done
143 | local FIRST_GREY="0-$LAST_VALID"
144 | #local FIRST_GREY=`echo "$E" | head -n 1 | grep -o " ([0-9 ,]*)" | sed 's/ //g'`
145 | else
146 | local UNIQUE_DARKS=1
147 | local FIRST_GREY=`echo "$RAW_VALUES" | head -n 1 | sed 's/.* ( *\([0-9]\+\),.*/\1/'`
148 | local FIRST_COUNT=`echo "$RAW_VALUES" | head -n 1 | grep -o " [0-9]\\+:" | grep -o "[0-9]\\+"`
149 | fi
150 | if [ 0 -eq "$FIRST_COUNT" ]; then
151 | # No pixels from 0-fuzzy_factor
152 | local UNIQUE_DARKS=1
153 | local FIRST_GREY=`echo "$RAW_VALUES" | head -n 1 | sed 's/.* ( *\([0-9]\+\),.*/\1/'`
154 | local FIRST_COUNT=`echo "$RAW_VALUES" | head -n 1 | grep -o " [0-9]\\+:" | grep -o "[0-9]\\+"`
155 | fi
156 | IFS=$(echo -en $"\n")
157 |
158 | local LAST_COUNT=`echo "$RAW_VALUES" | tail -n 1 | grep -o " [0-9]\\+:" | grep -o "[0-9]\\+"`
159 | local LAST_GREY=`echo "$RAW_VALUES" | tail -n 1 | sed 's/.* ( *\([0-9]\+\),.*/\1/'`
160 |
161 | local ZEROES=$((256-UNIQUE))
162 | local SPAN=$((LAST_GREY-FIRST_REAL_GREY+1))
163 | local EDGE=$((256-SPAN))
164 | local HOLES=$((ZEROES-EDGE))
165 |
166 | # TODO: Also remove lightest
167 | local REDUCED=`skipLines "$RAW_VALUES" $UNIQUE_DARKS`
168 | local REDUCED=`skipLines "$REDUCED" -1`
169 | local SPIKE_LINE=`echo "$REDUCED" | sort -n | tail -n 1`
170 | local SPIKE_COUNT=`echo "$SPIKE_LINE" | grep -o " [0-9]\\+:" | grep -o "[0-9]\\+"`
171 | local SPIKE_GREY=`echo "$SPIKE_LINE" | sed 's/.* ( *\([0-9]\+\),.*/\1/'`
172 |
173 | local GEOMETRY=`echo $INFO | grep "Geometry: [0-9]\\+x[0-9]\\+" | grep -o "[0-9]\\+x[0-9]\\+"`
174 | local X=`echo $GEOMETRY | grep -o "[0-9]\\+x" | grep -o "[0-9]\\+"`
175 | local Y=`echo $GEOMETRY | grep -o "x[0-9]\\+" | grep -o "[0-9]\\+"`
176 | local PIXELS=`echo "$X*$Y" | bc`
177 |
178 | # http://stackoverflow.com/questions/8402181/how-do-i-get-bc1-to-print-the-leading-zero
179 | local PERCENT_FIRST=`echo "scale=2;x=$FIRST_COUNT*100/$PIXELS; if(x<1) print 0; x" | bc`
180 | local PERCENT_LAST=`echo "scale=2;x=$LAST_COUNT*100/$PIXELS; if(x<1) print 0; x" | bc`
181 | local SPIKE_PERCENT=`echo "scale=2;x=$SPIKE_COUNT*100/$PIXELS; if(x<1) print 0; x" | bc`
182 |
183 | echo "$PIXELS $UNIQUE $FIRST_COUNT $PERCENT_FIRST $FIRST_GREY $LAST_COUNT $PERCENT_LAST $LAST_GREY" > "$GREY"
184 |
185 | IFS=$SAVEIFS
186 |
187 | echo "$PIXELS $UNIQUE $FIRST_COUNT $PERCENT_FIRST $FIRST_GREY $LAST_COUNT $PERCENT_LAST $LAST_GREY $SPIKE_COUNT $SPIKE_PERCENT $SPIKE_GREY $ZEROES $HOLES"
188 | }
189 |
190 | #http://stackoverflow.com/questions/5799303/print-a-character-repeatedly-in-bash
191 | # Input: char num
192 | printChar() {
193 | str=$1
194 | num=$2
195 | v=$(printf "%-${num}s" "$str")
196 | echo "${v// /*}"
197 | }
198 | export -f printChar
199 |
200 | # Produces a histogram over greyscale intensities in the given image
201 | # Input: image height log [destination]
202 | # Sample: foo.jpg 200 true foo.hist.png
203 | # Output: The destination image (default foo.histogram.png, 256 x height pixels) with the histogram
204 | function histogramScript() {
205 | local SRC="$1"
206 | local HEIGHT=$2
207 | local LOG=$3
208 | local DEST="$4"
209 | if [ ! -n "$DEST" ]; then
210 | local DEST=${SRC%%.*}.histogram.png
211 | fi
212 | local IDENTIFY_DEST=$(dirname "$DEST")
213 |
214 | local IDENTIFY=`im_identify "$SRC" "$IDENTIFY_DEST"`
215 | # Convert
216 | # 78085: ( 0, 0, 0) #000000 black
217 | # 3410: ( 1, 1, 1) #010101 rgb(1,1,1)
218 | # into
219 | # 0 78085
220 | # 1 3410
221 | GREYS=`greys "$SRC" "$IDENTIFY_DEST" | sed 's/ \\+\\([0-9]\\+\\): ( *\\([0-9]\\+\\).\\+/\\2 \\1/g'`
222 | # Find lowest and highest for both intensity and count
223 | local MIN_GREY=255
224 | local MAX_GREY=0
225 | local MIN_COUNT=9999999
226 | local MAX_COUNT=0
227 | local TOTAL_COUNT=0
228 |
229 | # Speedup-trick: Read one line at a time instead of splitting up front with for-loop
230 | while IFS= read -r L
231 | do
232 | set -- junk $L
233 | shift
234 | # local GREY=`echo "$L" | cut -d\ -f1`
235 | # local COUNT=`echo "$L" | cut -d\ -f2`
236 | local GREY=$1
237 | local COUNT=$2
238 | local TOTAL_COUNT=$((TOTAL_COUNT+COUNT))
239 | if [ $MIN_GREY -gt $GREY ]; then
240 | local MIN_GREY=$GREY
241 | fi
242 | if [ $MAX_GREY -lt $GREY ]; then
243 | local MAX_GREY=$GREY
244 | fi
245 | if [ $MIN_COUNT -gt $COUNT ]; then
246 | local MIN_COUNT=$COUNT
247 | fi
248 | if [ $MAX_COUNT -lt $COUNT ]; then
249 | local MAX_COUNT=$COUNT
250 | fi
251 | done <<< "$GREYS"
252 |
253 | # echo "Grey: $MIN_GREY $MAX_GREY count: $MIN_COUNT $MAX_COUNT $TOTAL_COUNT"
254 |
255 | if [ -n "$HISTOGRAM_PHEIGHT" ]; then
256 | if [ ! "auto" == "$HISTOGRAM_PHEIGHT" ]; then
257 | if [ ! "script_auto" == "$HISTOGRAM_PHEIGHT" ]; then
258 | local HH=`echo "$HISTOGRAM_PHEIGHT" | grep -o "[0-9]\+"`
259 | local MAX_COUNT=$((HH*TOTAL_COUNT/100))
260 | fi
261 | fi
262 | fi
263 |
264 | # Let SCALE map all counts from 0 to 1000000000 (giga)
265 | if [ ".true" == ".$LOG" ]; then
266 | local SCALE=`echo "1000000000/l($MAX_COUNT)" | bc -l`
267 | # local SCALE=`echo "scale=10;1/l($MAX_COUNT)" | bc -l`
268 | else
269 | local SCALE=$((1000000000/MAX_COUNT))
270 | # local SCALE=`echo "scale=10;1/$MAX_COUNT" | bc -l`
271 | fi
272 |
273 | # We create a PGM-file with the extracted greyscale statistics
274 | # as a histogram. The PGM is sideways because it is easier
275 | # http://netpbm.sourceforge.net/doc/pgm.html
276 | local HTMP=`mktemp --suffix .pgm`
277 | if [ "true" == "$LOG" ]; then
278 | local NONE=1
279 | else
280 | local NONE=0
281 | fi
282 |
283 | echo "P5 $HEIGHT 256 255" > $HTMP
284 |
285 | # Speedup-tricks: Avoid forking as much as possible by doing arithmetic
286 | # with the built-in $(()). Avoid floating point by scaling up.
287 | # Output 0 and ff with printf instead of loop.
288 | for G in `seq 0 255`; do
289 | local LINE=`echo "$GREYS" | grep "^$G "`
290 | # http://stackoverflow.com/questions/1469849/how-to-split-one-string-into-multiple-strings-in-bash-shell
291 | set -- junk $LINE
292 | shift
293 | COUNT=$2
294 | # local COUNT=`echo "$GREYS" | grep "^$G " | sed 's/[0-9]\\+ \\([0-9]\\+\\)/\\1/g'`
295 | if [ "." == ".$COUNT" ]; then
296 | local COUNT=$NONE
297 | fi
298 | if [ $COUNT -gt $MAX_COUNT ]; then
299 | local COUNT=$MAX_COUNT
300 | fi
301 | if [ ".true" == ".$LOG" ]; then
302 | local PIXELS=`echo "scale=10;l($COUNT)*$SCALE*$HEIGHT/1000000000" | bc -l`
303 | local PIXELS=`echo "scale=0;$PIXELS/1" | bc -l`
304 | # local PIXELS=`echo "scale=10;l($COUNT)/l(10)*$SCALE*$HEIGHT/1000000000" | bc -l`
305 | else
306 | local PIXELS=$(($COUNT*$SCALE*$HEIGHT/1000000000))
307 | # local PIXELS=`echo "scale=10;$COUNT*$SCALE*$HEIGHT" | bc -l`
308 | fi
309 | # /1 due to funky bc scale not being applied if nothing is done
310 | # local PIXELS=`echo "scale=0;$PIXELS/1" | bc -l`
311 |
312 | printf %$((PIXELS))s |tr " " '\0' >> $HTMP
313 | # 377 octal = ff hex
314 | printf %$((HEIGHT-PIXELS))s |tr " " '\377' >> $HTMP
315 | # echo "$G $COUNT $PIXELS"
316 | done
317 | # echo "convert $HTMP -rotate 270 $DEST"
318 | convert $HTMP -rotate 270 "$DEST"
319 | # ls -l $HTMP
320 | rm $HTMP
321 | }
322 | export -f histogramScript
323 |
324 | #export HISTOGRAM_PHEIGHT="10%"
325 | #time histogramScript $1 200 false
326 | #time histogramScript $1 200 false
327 | #time histogramScript $1 200 false
328 | # grey_stats $1
329 |
--------------------------------------------------------------------------------
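The ZEROES and HOLES values reported by `grey_stats` in analyze.sh come from simple arithmetic on the histogram: ZEROES counts the intensities with no pixels at all, and HOLES counts only those unused intensities that lie between the darkest and the brightest grey actually present. The sketch below isolates that arithmetic. `grey_gaps` is a hypothetical helper written for illustration and assumes 8-bit greyscale (256 possible values), as the script does.

```shell
# Hypothetical helper mirroring the ZEROES/HOLES arithmetic in grey_stats.
# Input:  unique_count first_grey last_grey
# Output: zeroes holes
grey_gaps() {
    local UNIQUE=$1
    local FIRST_GREY=$2
    local LAST_GREY=$3
    local ZEROES=$((256-UNIQUE))            # intensities without any pixels
    local SPAN=$((LAST_GREY-FIRST_GREY+1))  # width of the used range
    local EDGE=$((256-SPAN))                # unused intensities outside the range
    local HOLES=$((ZEROES-EDGE))            # unused intensities inside the range
    echo "$ZEROES $HOLES"
}

# 200 distinct greys spanning 10..229: 56 values are unused in total,
# 36 of them lie outside the range, leaving 20 holes inside it.
grey_gaps 200 10 229
```

A high HOLES value on an otherwise wide range typically points to a comb-shaped histogram, which is why the value is reported separately from ZEROES.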
/docs/quack_20131127_8bit.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tokee/quack/5cbda6327fff58c409faf3d92e3b90b4439498a4/docs/quack_20131127_8bit.png
--------------------------------------------------------------------------------
/dragonscale.html:
--------------------------------------------------------------------------------
1 |
3 |
4 |
10 |
"
128 | fi
129 |
130 | if [ "html" == "$TYPE" ]; then
131 | echo ""
132 | echo ""
133 | fi
134 |
--------------------------------------------------------------------------------
/greyscale_stats.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | #
4 | # Simple statistical analysis of greyscale images.
5 | # Extracts key stats for darkest and brightest intensity, intended
6 | # for use with greyscale_report.sh
7 | #
8 | # Requirements
9 | # * ImageMagick
10 | #
11 |
12 | # Input: A greyscale image
13 | # Output: Name pixel_count unique_greyscales #darkest_pixels darkest_pixels_percent darkest_greyscale #brightest_pixels brightest_pixels_percent lightest_greyscale
14 |
15 | TMP="`mktemp --suffix .bmp`"
16 |
17 | if [ "." == ".$1" ]; then
18 | echo "Usage: filename [croppercent]"
19 | exit 2
20 | fi
21 |
22 | if [ "." != ".$2" ]; then
23 | CROP=$2
24 | convert "$1" -gravity Center -crop $CROP%x+0+0 "$TMP" 2> /dev/null
25 | INFO=`identify -verbose "$TMP" 2> /dev/null`
26 | rm "$TMP"
27 | else
28 | INFO=`identify -verbose "$1" 2> /dev/null`
29 | fi
30 |
31 | #INFO=`cat t`
32 |
33 | SAVEIFS=$IFS
34 | IFS=$(echo -en "\n")
35 |
36 | UNIQUE=`echo $INFO | grep "[0-9]\\+: (" | wc -l`
37 |
38 | FIRST_COUNT=`echo $INFO | grep "[0-9]\\+: (" | head -n 1 | grep -o " [0-9]\\+:" | grep -o "[0-9]\\+"`
39 | FIRST_GREY=`echo $INFO | grep "[0-9]\\+: (" | head -n 1 | grep -o " ([0-9 ,]*)" | sed 's/ //g'`
40 |
41 | LAST_COUNT=`echo $INFO | grep "[0-9]\\+: (" | tail -n 1 | grep -o " [0-9]\\+:" | grep -o "[0-9]\\+"`
42 | LAST_GREY=`echo $INFO | grep "[0-9]\\+: (" | tail -n 1 | grep -o " ([0-9 ,]*)" | sed 's/ //g'`
43 |
44 | GEOMETRY=`echo $INFO | grep "Geometry: [0-9]\\+x[0-9]\\+" | grep -o "[0-9]\\+x[0-9]\\+"`
45 | X=`echo $GEOMETRY | grep -o "[0-9]\\+x" | grep -o "[0-9]\\+"`
46 | Y=`echo $GEOMETRY | grep -o "x[0-9]\\+" | grep -o "[0-9]\\+"`
47 | PIXELS=`echo "$X*$Y" | bc`
48 |
49 | PERCENT_FIRST=`echo "scale=2;$FIRST_COUNT*100/$PIXELS" | bc`
50 | PERCENT_LAST=`echo "scale=2;$LAST_COUNT*100/$PIXELS" | bc`
51 |
52 | echo "$1 $PIXELS $UNIQUE $FIRST_COUNT $PERCENT_FIRST $FIRST_GREY $LAST_COUNT $PERCENT_LAST $LAST_GREY"
53 | #echo "$1 $UNIQUE $LAST"
54 |
55 | IFS=$SAVEIFS
56 |
--------------------------------------------------------------------------------
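Both greyscale_stats.sh and analyze.sh extract counts and grey values from the histogram lines printed by `identify -verbose`, using chains of `grep -o`. As an illustration, the same extraction can be done with one `sed` call. This is a sketch, not the scripts' own method: `histogram_pairs` is a hypothetical helper, and it assumes histogram lines in the `count: ( r, g, b) #hex name` layout shown in the comments of analyze.sh. Other ImageMagick versions may print a different layout.

```shell
# Hypothetical helper: read identify -verbose output on stdin and print
# one "grey count" pair per histogram line. All other lines are dropped.
histogram_pairs() {
    sed -n 's/^ *\([0-9][0-9]*\): *( *\([0-9][0-9]*\),.*/\2 \1/p'
}

# Sample input in the layout assumed above
printf '%s\n' \
    '     78085: (  0,  0,  0) #000000 black' \
    '      3410: (  1,  1,  1) #010101 rgb(1,1,1)' \
    '  Colormap: 256' | histogram_pairs
```

For the sample input this prints `0 78085` and `1 3410`; the Colormap line is ignored because it does not start with a count.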
/internal_image_creator.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | #
4 | # Creates the sample images for Quack from originals.
5 | # This script is only relevant if the caller has access
6 | # to the original scanned files. These are not stored
7 | # with the Quack code as their cumulative size is 34MB.
8 | # Contact Toke Eskildsen - te@statsbiblioteket.dk if
9 | # the originals are of interest.
10 | #
11 | # Requires bash & GraphicsMagick
12 | #
13 |
14 | ORIGINALS="/mnt/bulk/data/quack_samples"
15 | COMMAND="-geometry 40%x -level 0,1.0,220 -quality 55 -type Grayscale"
16 | FROM_TO="AdresseContoirsEfterretninger-1795-06-16-02-0018B,ACE-17950616-0018B AdresseContoirsEfterretninger-1795-06-16-02-0019A,ACE-17950616-0019A AdresseContoirsEfterretninger-1795-06-16-02-0019B,ACE-17950616-0019B"
17 |
18 | # We would like samples from other time periods, but only the 200+ years
19 | # old ones are currently cleared for distribution.
20 | #FROM_TO="AdressecomptoirsEfterretninger-1846-01-20-01-0029A,ACE-18460120-0029A AdressecomptoirsEfterretninger-1846-01-20-01-0031B,ACE-18460120-0031A AdresseContoirsEfterretninger-1795-06-16-02-0018B,ACE-17950616-0018B"
21 | SAMPLES="samples"
22 |
23 | if [ ! -d $ORIGINALS ]; then
24 | echo "The originals folder $ORIGINALS does not exist."
25 | if [ ! "te" == `whoami` ]; then
26 | echo "The user name `whoami` indicates you are not Toke Eskildsen."
27 | echo "This implies that you do not have the originals used to regenerate the samples."
28 | echo "Please ensure that the originals are available at ${ORIGINALS}."
29 | fi
30 | exit 2
31 | fi
32 |
33 | if [ ! -d $SAMPLES ]; then
34 | mkdir $SAMPLES
35 | fi
36 |
37 | for FT in $FROM_TO; do
38 | SRC=`echo "$FT" | cut -d, -f1`
39 | DEST=`echo "$FT" | cut -d, -f2`
40 | echo "Generating sample from ${SRC} to ${DEST}"
41 | gm convert ${ORIGINALS}/${SRC}.png $COMMAND ${SAMPLES}/${DEST}.jpg
42 | cp ${ORIGINALS}/${SRC}.alto.xml ${SAMPLES}/${DEST}.alto.xml
43 | done
44 | echo "Done"
45 |
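Note on the FROM_TO loop in the script above: each "source,destination" pair is split with two calls to cut. As a sketch only, the same split can be done with bash parameter expansion, which avoids spawning subshells. The helper name split_pair is hypothetical, and the sketch assumes each pair contains exactly one comma, as the pairs in FROM_TO do.

```bash
#!/bin/bash
# Hypothetical helper, not part of Quack: splits "source,destination"
# into the variables SRC and DEST.
split_pair() {
    local pair="$1"
    SRC="${pair%%,*}"   # everything before the first comma
    DEST="${pair#*,}"   # everything after the first comma
}

FROM_TO="AdresseContoirsEfterretninger-1795-06-16-02-0018B,ACE-17950616-0018B"
for FT in $FROM_TO; do
    split_pair "$FT"
    echo "Generating sample from ${SRC} to ${DEST}"
done
```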
--------------------------------------------------------------------------------
/presentation.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | #
4 | # Generates presentational copies for scanned pages.
5 | # Intended to be called from quack.sh to generate images to be shown on the QA page.
6 | #
7 | # Override settings by creating "presentation.settings" and specifying DEFAULT_COMMANDS
8 | # and/or get_commands
9 | #
10 |
11 | SETTINGS="presentation.settings"
12 |
13 | # geometry resizes to 50% (the > is redundant with percents <= 100, but we keep it
14 | # as it is a fine default as we never want to enlarge).
15 | # unsharp is for a high quality 300 DPI scan with no previous sharpen applied.
16 | # level is highly source-specific. The default is a conservative starker contrast.
17 | # no intensities > 240.
18 | # Quality is for JPEG output. This needs to be quite high as JPEG artifacts are
19 | # very visible with tiny text.
20 | DEFAULT_COMMANDS="-geometry 50%x> -unsharp 0.8x0.1+0.8+2.0 -level 10,1.0,245 -quality 95"
21 |
22 | # Input: source
23 | # Output: COMMANDS (GraphicsMagick options)
24 | function get_commands() {
25 | local SOURCE="$1"
26 |
27 | if [ "." != ".`echo \"$SOURCE\" | grep -o inesta`" ]; then
28 | # This provider has very dark scans with no intensities > 240
29 | COMMANDS="-geometry 50%x> -unsharp 0.8x0.1+0.8+2.0 -level 0,1.0,220 -quality 95"
30 | return
31 | fi
32 | if [ "." != ".`echo \"$SOURCE\" | grep -o pex`" ]; then
33 | # This provider has scans practically without any blown high- or low-lights
34 | # Input is 400 DPI so we need to scale a bit more to reach ~150DPI
35 | COMMANDS="-geometry 38%x> -unsharp 0.8x0.1+0.8+2.0 -level 30,0.8,240 -quality 95"
36 | return
37 | fi
38 | COMMANDS="$DEFAULT_COMMANDS"
39 | }
40 |
41 | pushd `dirname $0` > /dev/null
42 | ROOT=`pwd`
43 | if [ -e "$SETTINGS" ]; then
44 | echo "Sourcing settings from $SETTINGS"
45 | source "$SETTINGS"
46 | fi
47 | popd > /dev/null
48 |
49 | SOURCE="$1"
50 | DESTINATION="$2"
51 |
52 | if [ ! -f "$SOURCE" ]; then
53 | echo "The image '$SOURCE' does not exist"
54 | exit 2
55 | fi
56 |
57 | if [ "." == ".$DESTINATION" ]; then
58 | echo "Usage: ./presentation.sh source destination"
59 | exit 2
60 | fi
61 |
62 |
63 | get_commands "$SOURCE"
64 | gm convert "$SOURCE" $COMMANDS "$DESTINATION"
65 |
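Note on overriding the defaults in the script above: presentation.sh sources presentation.settings after get_commands has been defined, so a function of the same name in the settings file replaces the built-in one. The following is a sketch of what such a settings file might look like. The provider name "darkscanner" and the chosen option values are hypothetical; only GraphicsMagick options already used in presentation.sh appear here.

```bash
# presentation.settings (hypothetical example, not shipped with Quack)

# Replaces the default options used when no provider-specific rule matches
DEFAULT_COMMANDS="-geometry 50%x> -level 10,1.0,245 -quality 90"

# Input: source
# Output: COMMANDS (GraphicsMagick options)
function get_commands() {
    local SOURCE="$1"
    case "$SOURCE" in
        *darkscanner*)
            # Hypothetical provider with dark scans: lower the white point
            COMMANDS="-geometry 50%x> -level 0,1.0,220 -quality 90"
            ;;
        *)
            COMMANDS="$DEFAULT_COMMANDS"
            ;;
    esac
}
```

The case pattern does the same substring match as the grep -o checks in the built-in get_commands, without starting extra processes.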
--------------------------------------------------------------------------------
/quack.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # http://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 | #
16 | # 2013-2014 Toke Eskildsen, State and University Library, Denmark
17 | # 2022 Toke Eskildsen, Denmark
18 | #
19 |
20 | #
21 | # Quack 1.6 beta - Quality assurance tool for text scanning projects.
22 | #
23 | # Generates zoomable (OpenSeadragon) views of scanned text pages with overlays
24 | # containing OCR-text from ALTO-files. The views are static HTML pages that
25 | # can be viewed directly from the file system or through a webserver.
26 | #
27 | # Note that the images used for OpenSeadragon are PNG.
28 | # The focus is fully on QA, where pixel-perfect reproduction is required.
29 | #
30 | # The script supports iterative updates by re-using existing structures when
31 | # source files are added and the script is executed again. The destination
32 | # folder is fully self-contained and suitable for mounting under a webserver
33 | # with no access to the original files.
34 | #
35 | # Requirements:
36 | # Some unix-flavor with bash (only tested under Ubuntu)
37 | # GraphicsMagick (JPEG2000 -> PNG conversion is twice as fast in GraphicsMagick as in ImageMagick)
38 | # opj_decompress if GraphicsMagick does not support JPEG2000
39 | # ImageMagick (to create histograms)
40 | # openseadragon.min.js (download at http://openseadragon.github.io/#download)
41 | # a fairly current browser with JavaScript enabled
42 | #
43 |
44 | # Settings below. Instead of changing this file, it is recommended to
45 | # create a new file "quack.settings" with the wanted setup as it will
46 | # override the defaults below.
47 |
48 | # The types of images to pull from source
49 | export IMAGE_GLOB="*.tiff *.tif *.jp2 *.jpeg2000 *.j2k *.jpg *.jpeg"
50 | # The extension of the ALTO files corresponding to the image files
51 | # ALTO files are expected to be located next to the image files:
52 | # OurScanProject_batch_2013-09-18_page_007.tif
53 | # OurScanProject_batch_2013-09-18_page_007.alto.xml
54 | export ALTO_EXT=".alto.xml"
55 |
56 | # Sometimes the image corresponding to the ALTO has been scaled after ALTO
57 | # generation. This factor will be multiplied to all ALTO elements. If the
58 | # image has been scaled to half width & half height, set this to 0.5.
59 | export ALTO_SCALE_FACTOR="1.0"
60 |
61 | # The image format for the QA image. Possible values are png and jpg.
62 | # png is recommended if QA should check image quality in detail.
63 | export IMAGE_DISP_EXT="png"
64 | # If jpg is chosen for IMAGE_DISP_EXT, this quality setting (1-100)
65 | # will be used when generating the images.
66 | # Note: This does (unfortunately) not set the quality when tiles and
67 | # jpg have been chosen.
68 | export IMAGE_DISP_QUALITY="95"
69 | # When generating the QA image, these arguments will be added to the
70 | # gm convert command
71 | export QA_EXTRA=""
72 | # Later GraphicsMagick and ImageMagick distributions do not seem to be
73 | # compiled with JPEG 2000 support. This setting controls how to decode
74 | # JPEG 2000. Valid values (default is "auto"):
75 | # auto: Check if GraphicsMagick has JPEG 2000 support and choose between
76 | # gm and opj_decompress accordingly
77 | # gm: Try using built-in JPEG 2000 support in GraphicsMagick
78 | # opj_decompress: Use opj_decompress for decoding of JPEG 2000
79 | : ${J2K_DECOMPRESS:="auto"}
80 |
81 | # The size of thumbnails in folder view.
82 | export THUMB_IMAGE_SIZE="300x200"
83 |
84 | # These elements will be grepped from the ALTO-files and shown on the image pages
85 | export ALTO_ELEMENTS="processingDateTime softwareName softwareVersion"
86 |
87 | # Number of threads used for image processing. CPU and memory bound.
88 | export THREADS=4
89 |
90 | # Number of threads used for histograms. Note that histogram generation
91 | # is very memory hungry (~2GB for a 30MP image), unless HISTOGRAM_PHEIGHT
92 | # is set to a percentage.
93 | export HISTOGRAM_THREADS=8
94 |
95 | # Number of threads used for pages. Page generation uses very little memory and
96 | # is almost exclusively CPU bound.
97 | export PAGE_THREADS=8
98 |
99 | # For production it is recommended that all FORCE_ options are set to "false" as
100 | # it makes iterative updates fast. If quack settings are tweaked, the relevant
101 | # FORCE_ options should be temporarily "true" until quack has been run once.
102 |
103 | # If true, image-pages will be generated even if they already exist.
104 | export FORCE_PAGES=false
105 | # If true, the main QA-images will be generated even if they already exist.
106 | export FORCE_QAIMAGE=false
107 | # If true, thumbnails will be generated even if they already exist.
108 | export FORCE_THUMBNAILS=false
109 | # If true, blown high- and low-light overlays will be generated even if they already exist.
110 | # Setting this to true will also set FORCE_BLOWN_THUMBS to true
111 | export FORCE_BLOWN=false
112 | # If true, blown high- and low-light overlays for thumbs will be generated even if they already exist.
113 | export FORCE_BLOWN_THUMBS=false
114 | # If true, presentation images will be generated even if they already exist.
115 | export FORCE_PRESENTATION=false
116 | # If true, histogram images will be generated even if they already exist.
117 | export FORCE_HISTOGRAM=false
118 | # If true, tile images will be generated even if they already exist.
119 | # This is only relevant if TILE="true"
120 | export FORCE_TILES=false
121 |
122 | # If true, the script attempts to find all alternative versions of the current image
123 | # in other folders under source. Suitable for easy switching between alternate scans
124 | # of the same material.
125 | export RESOLVE_ALTERNATIVES=false
126 |
127 | # If the IDNEXT attribute starts with 'ART' it is ignored
128 | # Used to avoid visually linking everything on the page
129 | export SKIP_NEXT_ART=false
130 |
131 | # How much of the image to retain, cropping from center, when calculating
132 | # histograms. Empty value = no crop. Valid values: 1-100
133 | # This is usable for generating proper histograms for scans where the border
134 | # is different from the rest of the image. Artifacts from rotations are an example.
135 | # Suggested values are 85-95%.
136 | export CROP_PERCENT=""
137 |
138 | # If defined, all histograms will have a fixed height of this percentage.
139 | # If auto, histograms will scale individually to the highest value.
140 | # If script_auto, histograms will scale as with auto, but will be generated by
141 | # script (slow, low mem) instead of ImageMagick (fast, high mem).
142 | # If auto is specified, it is highly recommended to decrease HISTOGRAM_THREADS
143 | # to 2-4 on a 4-8GB machine
144 | # Suggested values are 10-20%
145 | # percentage or script_auto: scripted (low mem, slower)
146 | # auto: ImageMagick (high mem, faster)
147 | export HISTOGRAM_PHEIGHT="script_auto"
148 |
149 | # If true, tiles are generated for OpenSeadragon. This requires Robert Barta's
150 | # deepzoom (see link in README.md) and will generate a lot of 260x260 pixel tiles.
151 | # If false, a single image will be used with OpenSeadragon. This is a lot heavier
152 | # on the browser but avoids the size and file-count overhead of the tiles.
153 | export TILE="false"
154 |
155 | # If defined, TILE is ignored and OpenSeadragon is set up to get tiles from
156 | # the image server.
157 | # Sample: http://myimserver.example.com/iipsrv/?DeepZoom=/net/zone1.isilon.sblokalnet/ifs/archive/avis-show/
158 | export IIPSRV=""
159 | # If IIPSRV is defined, this extension will be used instead of the image extension
160 | # for resolving the dzi
161 | export IIPSRV_DZI_EXT=".jp2.dzi"
162 | # If IIPSRV is defined, symlinked images will be resolved to their source
163 | # before being used as paths for the image server
164 | export IIPSRV_FOLLOW_SYMLINKS="true"
165 | # Hack for resolving the source image
166 | # TODO: Avoid this by passing the real source image
167 | export IIPSRV_FOLLOW_SYMLINKS_EXTHACK=".jp2"
168 | # If a symlink is followed and the symlinks root is defined, this is used instead
169 | # of SOURCE_FULL for extracting the relative link.
170 | export IIPSRV_FOLLOW_SYMLINKS_ROOT=""
171 |
172 | # If true, a secondary view of the scans will be inserted into the page.
173 | # The view represents an end-user version of the scan. This will often be
174 | # downscaled, levelled, sharpened and JPEG'ed.
175 | export PRESENTATION="true"
176 | # The image format for the presentation image. Possible values are png and jpg.
177 | # jpg is recommended as this would normally be the choice for end-user presentation.
178 | export PRESENTATION_IMAGE_DISP_EXT="jpg"
179 |
180 | # Overlay colors for indicating burned out high- and low-lights
181 | export OVERLAY_BLACK=3399FF
182 | export OVERLAY_WHITE=FFFF00
183 |
184 | # Limits for the overlays. Some scanners have absolute black as grey #02
185 | # To get grey #02 and below marked as blown black, set BLOWN_BLACK_BT to 3,3,3
186 | export BLOWN_WHITE_BT=255,255,255
187 | export BLOWN_WHITE_WT=254,254,254
188 | export BLOWN_BLACK_BT=1,1,1
189 | export BLOWN_BLACK_WT=0,0,0
190 |
191 | # Snippets are inserted verbatim at the top of the folder and the image pages.
192 | # Use them for specifying things like delivery date or provider notes.
193 | # Note that these snippets can be overridden on a per-folder and per-image basis
194 | # by creating special files in the source tree (see SPECIFIC_FOLDER_SNIPPET and
195 | # SPECIFIC_IMAGE_SNIPPET_EXTENSION below).
196 | export SNIPPET_FOLDER=""
197 | export SNIPPET_IMAGE=""
198 |
199 | # Temporary folder used for .mpc files and similar
200 | export DEFAULT_QUACK_TMP="/tmp"
201 |
202 | # End default settings. User-supplied overrides will be loaded from quack.settings
203 |
204 | # If a file with this name is present in a source folder, its content will be
205 | # inserted into the generated folder HTML file.
206 | export SPECIFIC_FOLDER_SNIPPET="folder.snippet"
207 | # How to sort the list of sub folders. Possible values are "changed", "changed_rev",
208 | # "name" and "name_rev", where "changed" refers to the "last updated" timestamp for
209 | # the sub-folder and the "_rev"-suffix triggers reverse sorting.
210 | export SUB_FOLDER_LIST_SORT="changed"
211 |
212 | # If a file with image basename + this extension is encountered, the content will
213 | # be inserted into the generated image HTML file.
214 | export SPECIFIC_IMAGE_SNIPPET_EXTENSION=".snippet"
215 |
216 | # If no OpenSeadragon is present, the script attempts to download this version.
217 | OSD_ZIP="openseadragon-bin-1.0.0.zip"
218 | OSD_DIRECT="http://github.com/openseadragon/openseadragon/releases/download/v1.0.0/$OSD_ZIP"
219 |
220 | # The blacklist and whitelist are files with regular expressions, used when traversing the
221 | # source folder. One expression/line.
222 | export BLACKLIST="quack.blacklist"
223 | export WHITELIST="quack.whitelist"
224 |
225 | START_PATH=`pwd`
226 | pushd `dirname $0` > /dev/null
227 | export ROOT=`pwd`
228 |
229 | if [ -e "quack.settings" ]; then
230 | echo "Sourcing user settings from quack.settings in `pwd`"
231 | source "quack.settings"
232 | fi
233 | if [ -e "$BLACKLIST" ]; then
234 | echo "Using $BLACKLIST in `pwd`"
235 | export BLACKLIST_FILE="`pwd`/$BLACKLIST"
236 | fi
237 | if [ -e "$WHITELIST" ]; then
238 | echo "Using $WHITELIST in `pwd`"
239 | export WHITELIST_FILE="`pwd`/$WHITELIST"
240 | fi
241 | # functions for generating identify-files and extracting greyscale statistics
242 | source "analyze.sh"
243 | source "quack_helper_common.sh"
244 | export PAGE_SCRIPT="`pwd`/quack_helper_imagepage.sh"
245 | popd > /dev/null
246 |
247 | # Local settings override general settings
248 | if [ ! "$START_PATH" == "$ROOT" ]; then
249 | if [ -e "quack.settings" ]; then
250 | echo "Sourcing user settings from quack.settings in `pwd`"
251 | source "quack.settings"
252 | fi
253 | if [ -e "$BLACKLIST" ]; then
254 | echo "Using $BLACKLIST in `pwd`"
255 | export BLACKLIST_FILE="`pwd`/$BLACKLIST"
256 | fi
257 | if [ -e "$WHITELIST" ]; then
258 | echo "Using $WHITELIST in `pwd`"
259 | export WHITELIST_FILE="`pwd`/$WHITELIST"
260 | fi
261 | fi
262 |
263 | if [ ".true" == ".$FORCE_BLOWN" ]; then
264 | # When we force regeneration of blown, we must also regenerate the blown thumbs.
265 | export FORCE_BLOWN_THUMBS=true
266 | fi
267 |
268 | PRESENTATION_SCRIPT="$ROOT/presentation.sh"
269 | if [ -f "$START_PATH/presentation_custom.sh" ]; then
270 | echo "Using presentation_custom.sh located in $START_PATH"
271 | PRESENTATION_SCRIPT="$START_PATH/presentation_custom.sh"
272 | fi
273 | if [ -f "$START_PATH/presentation.sh" ]; then
274 | echo "Using presentation.sh located in $START_PATH"
275 | PRESENTATION_SCRIPT="$START_PATH/presentation.sh"
276 | fi
277 | export FOLDER_TEMPLATE="$ROOT/web/folder_template.html"
278 | export IMAGE_TEMPLATE="$ROOT/web/image_template.html"
279 | export IMAGELINK_TEMPLATE="$ROOT/web/imagelink_template.html"
280 | export THUMB_TEMPLATE="$ROOT/web/thumb_template.html"
281 | export HIST_TEMPLATE="$ROOT/web/histogram_template.html"
282 | DRAGON="openseadragon.min.js"
283 |
284 | export PAGE_COUNTER=`createCounter page 0`
285 | export IMAGE_COUNTER=`createCounter image 0`
286 | export HIST_COUNTER=`createCounter histogram 0`
287 |
288 | export TILE_TIMING=`createCounter tile_timing 0`
289 | export QA_TIMING=`createCounter qa_timing 0`
290 | export PRESENTATION_TIMING=`createCounter presentation_timing 0`
291 | export OVERLAY_TIMING=`createCounter overlay_timing 0`
292 | export THUMB_TIMING=`createCounter thumb_timing 0`
293 | export HIST_TIMING=`createCounter hist_timing 0`
294 | export TOTAL_TIMING=`createCounter total_timing 0`
295 |
296 | ALL_COUNTERS="$PAGE_COUNTER $IMAGE_COUNTER $HIST_COUNTER $TILE_TIMING $QA_TIMING $PRESENTATION_TIMING $THUMB_TIMING $HIST_TIMING $OVERLAY_TIMING $TOTAL_TIMING"
297 | TOTAL_START_TIME=`date +%s%N`
298 |
299 |
300 | function check_dependencies() {
301 | if [ "." == ".`which gm`" ]; then
302 | echo "Error: gm missing: Please install Graphics Magick" >&2
303 | exit 2
304 | fi
305 | local GM_J2K="$(gm convert -list format | grep JPEG-2000)"
306 | echo "***$GM_J2K***"
307 | if [[ "$J2K_DECOMPRESS" == "gm" && -z "$GM_J2K" ]]; then
308 | >&2 echo "Error: J2K_DECOMPRESS==gm but the available GraphicsMagick does not have JPEG 2000 support (gm convert -list format)"
309 | exit 3
310 | fi
311 | # TODO: Turn all of this off if source bitmaps are not JPEG 2000
312 | if [[ "$J2K_DECOMPRESS" == "auto" ]]; then
313 | if [[ -z "$GM_J2K" ]]; then
314 | echo "Setting J2K_DECOMPRESS=opj_decompress as initial J2K_DECOMPRESS==auto and local GraphicsMagick does not have JPEG 2000 support"
315 | J2K_DECOMPRESS=opj_decompress
316 | else
317 | echo "Setting J2K_DECOMPRESS=gm as initial J2K_DECOMPRESS==auto and local GraphicsMagick has JPEG 2000 support"
318 | J2K_DECOMPRESS=gm
319 | fi
320 | fi
321 | if [[ "$J2K_DECOMPRESS" == "opj_decompress" && -z "$(which opj_decompress)" ]]; then
322 | >&2 echo "Error: J2K_DECOMPRESS==opj_decompress but opj_decompress is not installed"
323 | exit 2
324 | fi
325 |
326 | if [ "." == ".`which convert`" ]; then
327 | echo "Error: convert missing: Please install Image Magick" >&2
328 | exit 2
329 | fi
330 | if [ "." == ".`which deepzoom`" -a "true" == "$TILE" ]; then
331 | echo "Error: deepzoom missing and TILE=true: Please install deepzoom" >&2
332 | exit 2
333 | fi
334 | }
335 |
336 | function usage() {
337 | echo "quack 1.6 beta - Quality Assurance oriented ALTO viewer"
338 | echo ""
339 | echo "Usage: ./quack.sh source destination"
340 | echo ""
341 | echo "source: The top folder for images with ALTO files"
342 | echo "destination: The wanted location of the presentation structure"
343 | echo ""
344 | echo "See comments in script and README.md for details."
345 | }
346 |
347 | SOURCE=$1
348 | if [ "." == ".$SOURCE" ]; then
349 | echo "Error: Missing source" >&2
350 | echo ""
351 | usage
352 | exit 2
353 | fi
354 | if [[ ! -d "$SOURCE" ]]; then
355 | >&2 echo "Error: Source folder '$SOURCE' was not available"
356 | echo ""
357 | usage
358 | exit 5
359 | fi
360 | pushd "$SOURCE" > /dev/null
361 | export SOURCE_FULL=`pwd`
362 | popd > /dev/null
363 |
364 | DEST=$2
365 | if [ "." == ".$DEST" ]; then
366 | echo "Error: Missing destination" >&2
367 | echo ""
368 | usage
369 | exit 2
370 | fi
371 |
372 | if [ ! -f "$ROOT/web/$DRAGON" ]; then
373 | if [ -f "$ROOT/$DRAGON" ]; then
374 | echo "Copying $DRAGON from Quack root to the web folder"
375 | cp "$ROOT/$DRAGON" "$ROOT/web/"
376 | else
377 | echo "The file $ROOT/$DRAGON or $ROOT/web/$DRAGON does not exist" >&2
378 | if [ "." == ".`which wget`" -o "." == ".`which unzip`" ]; then
379 | echo "Please download it at http://openseadragon.github.io/#download" >&2
380 | echo "Tested version is 1.0.0, which can be downloaded from" >&2
381 | echo "$OSD_DIRECT" >&2
382 | exit
383 | else
384 | echo "Attempting download of OpenSeadragon from" >&2
385 | echo "$OSD_DIRECT"
386 | wget "$OSD_DIRECT" -O "$ROOT/web/$OSD_ZIP"
387 | pushd "$ROOT/web" > /dev/null
388 | unzip "$ROOT/web/$OSD_ZIP"
389 | mv "openseadragon-bin-1.0.0/openseadragon.min.js" "$DRAGON"
390 | mv "openseadragon-bin-1.0.0/images" "$ROOT/web"
391 | rm -r "openseadragon-bin-1.0.0"
392 | popd > /dev/null
393 | rm "$ROOT/web/$OSD_ZIP"
394 | if [ ! -f "$ROOT/web/$DRAGON" ]; then
395 | echo "Automatic OpenSeadragon download and installation failed." >&2
396 | echo "Please download it at http://openseadragon.github.io/#download" >&2
397 | echo "Tested version is 1.0.0, which can be downloaded from" >&2
398 | echo "$OSD_DIRECT" >&2
399 | exit 2
400 | fi
401 | echo "Automatic download and installation of OpenSeadragon successful."
402 | fi
403 | fi
404 | fi
405 |
406 | if [ -z "$QUACK_TMP" ]; then
407 | export QUACK_TMP=$DEFAULT_QUACK_TMP
408 | fi
409 |
410 | # Copy OpenSeadragon and all css-files to destination
411 | function copyFiles () {
412 | if [ ! -d "$DEST" ]; then
413 | echo "Creating folder $DEST"
414 | mkdir -p "$DEST"
415 | fi
416 | cp -r ${ROOT}/web/*.js ${ROOT}/web/*.css ${ROOT}/web/images "$DEST"
417 | }
418 |
419 | # http://stackoverflow.com/questions/14434549/how-to-expand-shell-variables-in-a-text-file
420 | # Input: template-file
421 | function ctemplate() {
422 | local TMP="`mktemp --suffix .sh`"
423 | echo 'cat <<END_OF_TEXT' > $TMP
424 | cat "$1" >> $TMP
425 | echo 'END_OF_TEXT' >> $TMP
426 | . $TMP
427 | rm $TMP
428 | }
429 |
430 | # Creates the bash environment variables corresponding to those used by makeImages
431 | # This is used to separate HTML generation from the actual image processing
432 | # srcFolder dstFolder image
433 | # Output: SOURCE_IMAGE DEST_IMAGE HIST_IMAGE THUMB
434 | function makeImageParams() {
435 | local SRC_FOLDER="$1"
436 | local DEST_FOLDER="$2"
437 | local IMAGE="$3"
438 |
439 | local SANS_PATH=${IMAGE##*/}
440 | local BASE=${SANS_PATH%.*}
441 |
442 | # Used by function caller
443 | # Must be mirrored in makeImages
444 | SOURCE_IMAGE="${SRC_FOLDER}/${IMAGE}"
445 | DEST_IMAGE="${DEST_FOLDER}/${BASE}.${IMAGE_DISP_EXT}"
446 | HIST_IMAGE="${DEST_FOLDER}/${BASE}.histogram.png"
447 | HISTOGRAM_LINK=${HIST_IMAGE##*/}
448 | THUMB_IMAGE="${DEST_FOLDER}/${BASE}.thumb.jpg"
449 | THUMB_LINK=${THUMB_IMAGE##*/}
450 | WHITE_IMAGE="${DEST_FOLDER}/${BASE}.white.png"
451 | BLACK_IMAGE="${DEST_FOLDER}/${BASE}.black.png"
452 | PRESENTATION_IMAGE="${DEST_FOLDER}/${BASE}.presentation.jpg"
453 | TILE_FOLDER="${DEST_FOLDER}/${BASE}_files"
454 | PRESENTATION_TILE_FOLDER="${DEST_FOLDER}/${BASE}.presentation_files"
455 | ALTO_DEST="${DEST_FOLDER}/${BASE}.alto.xml"
456 | }
457 |
458 | # If force is true and image exists, image is deleted and true returned
459 | # If force is true and image does not exist, true is returned
460 | # If force is false and image exists, false is returned
461 | # If force is false and image does not exist, true is returned
462 | # Input: force image designation
463 | # Output: true/false. Use with 'if shouldGenerate true dummy; then'
464 | shouldGenerate() {
465 | local FORCE="$1"
466 | local IMG="$2"
467 | local DES="$3"
468 |
469 | if [ ".true" == ".$FORCE" -a -e "$IMG" ]; then
470 | rm -rf "$IMG"
471 | fi
472 | if [ ! -e "$IMG" -a "." != ".$DES" ]; then
473 | echo " - ${IMG##*/} ($DES)"
474 | fi
475 | [ ! -e "$IMG" ]
476 | }
477 | export -f shouldGenerate
478 |
479 | # Handles creation of the intermediate mpc image for speeding up
480 | # repeated GraphicsMagick calls on the same source image
481 | # Input: src dest
482 | function ensureIntermediate() {
483 | local D="$2"
484 | if [ ! -s "$D" ]; then
485 | gm convert "$1" "$D"
486 | # Trap does not work here as new traps for the same signal override the old ones
487 | trap "rm -f \"${D%.*}.cache\" \"$D\"" EXIT
488 | fi
489 | }
490 | export -f ensureIntermediate
491 |
492 | function removeIntermediate() {
493 | local D="$1"
494 | rm -f "$D" "${D%.*}.cache"
495 | }
496 | export -f removeIntermediate
497 |
498 | # Creates a presentation image and a histogram for the given image
499 | # srcFolder dstFolder image crop presentation_script tile
500 | function makeImages() {
501 | local SRC_FOLDER="$1"
502 | local DEST_FOLDER="$2"
503 | local IMAGE="$3"
504 | local CROP_PERCENT="$5"
505 | local PRESENTATION_SCRIPT="$6"
506 | local TILE="$7"
507 |
508 | # echo "makeImages $SRC_FOLDER $DEST_FOLDER"
509 |
510 | local SANS_PATH=${IMAGE##*/}
511 | local BASE=${SANS_PATH%.*}
512 |
513 | # Must mirror the ones in makeImageParams
514 | # Do not cheat by calling makeImageParams as makeImages might
515 | # be called in parallel
516 | local SOURCE_IMAGE="${SRC_FOLDER}/${IMAGE}"
517 | local DEST_IMAGE="${DEST_FOLDER}/${BASE}.${IMAGE_DISP_EXT}"
518 | local HIST_IMAGE="${DEST_FOLDER}/${BASE}.histogram.png"
519 | local HISTOGRAM_LINK=${HIST_IMAGE##*/}
520 | local THUMB_IMAGE="${DEST_FOLDER}/${BASE}.thumb.jpg"
521 | local THUMB_LINK=${THUMB_IMAGE##*/}
522 | local WHITE_IMAGE="${DEST_FOLDER}/${BASE}.white.png"
523 | local BLACK_IMAGE="${DEST_FOLDER}/${BASE}.black.png"
524 | local THUMB_OVERLAY_WHITE="${DEST_FOLDER}/${BASE}.white.thumb.png"
525 | local THUMB_OVERLAY_BLACK="${DEST_FOLDER}/${BASE}.black.thumb.png"
526 | local PRESENTATION_IMAGE="${DEST_FOLDER}/${BASE}.presentation.jpg"
527 | local TILE_FOLDER="${DEST_FOLDER}/${BASE}_files"
528 | local PRESENTATION_TILE_FOLDER="${DEST_FOLDER}/${BASE}.presentation_files"
529 | local ALTO_DEST="${DEST_FOLDER}/${BASE}.alto.xml"
530 |
531 |
532 | # The intermediate format mpc is memory-mapped and very fast for reuse
533 | local GM_INTERMEDIATE=`echo "${DEST_FOLDER}/${BASE}.mpc" | sed 's@/@_@g'`
534 | local GM_INTERMEDIATE="$QUACK_TMP/$GM_INTERMEDIATE"
535 |
536 | if [ ! -f "$SOURCE_IMAGE" ]; then
537 | echo "Error in makeImages: The source image '$SOURCE_IMAGE' does not exist" >&2
538 | exit
539 | fi
540 |
541 | local CREATED_IMAGES=`addGetCounter $IMAGE_COUNTER`
542 |
543 | # Even if TILE="true", we create the full main presentational image as it
544 | # might be requested for download
545 | if shouldGenerate "$FORCE_QAIMAGE" "$DEST_IMAGE" "QA (${CREATED_IMAGES}/${TOTAL_IMAGES})"; then
546 | local START=`date +%s%N`
547 | ensureIntermediate "$SOURCE_IMAGE" "$GM_INTERMEDIATE"
548 | gm convert "$GM_INTERMEDIATE" $QA_EXTRA -quality $IMAGE_DISP_QUALITY "$DEST_IMAGE"
549 | updateTiming $QA_TIMING $START > /dev/null
550 | fi
551 |
552 | if [ "png" == "${IMAGE_DISP_EXT}" ]; then
553 | # PNG is fairly fast to decode so use that as source
554 | local CONV="$DEST_IMAGE"
555 | else
556 | local CONV="$SOURCE_IMAGE"
557 | fi
558 |
559 | if [ ".true" == ".$PRESENTATION" ]; then
560 | local START=`date +%s%N`
561 | if shouldGenerate "$FORCE_PRESENTATION" "$PRESENTATION_IMAGE" "presentation"; then
562 | $PRESENTATION_SCRIPT "$CONV" "$PRESENTATION_IMAGE"
563 | fi
564 | updateTiming $PRESENTATION_TIMING $START > /dev/null
565 | fi
566 |
567 | if [ ".true" == ".$TILE" ]; then
568 | local START=`date +%s%N`
569 | if shouldGenerate "$FORCE_TILES" "$TILE_FOLDER" "tiles"; then
570 | # TODO: Specify JPEG quality
571 | deepzoom "$CONV" -format $IMAGE_DISP_EXT -path "${DEST_FOLDER}/"
572 | fi
573 |
574 | if [ ".true" == ".$PRESENTATION" ]; then
575 | if shouldGenerate "$FORCE_TILES" "$PRESENTATION_TILE_FOLDER" "presentation tiles"; then
576 | if [ ! -f "$PRESENTATION_IMAGE" ]; then
577 | echo "Error: The image $PRESENTATION_IMAGE does not exist"
578 | else
579 | # TODO: Specify JPEG quality
580 | deepzoom "$PRESENTATION_IMAGE" -format $PRESENTATION_IMAGE_DISP_EXT -path "${DEST_FOLDER}/"
581 | fi
582 | fi
583 | fi
584 | updateTiming $TILE_TIMING $START > /dev/null
585 | fi
586 |
587 | local START_OVERLAY=`date +%s%N`
588 | if shouldGenerate "$FORCE_BLOWN" "$WHITE_IMAGE" "overlay"; then
589 | ensureIntermediate "$SOURCE_IMAGE" "$GM_INTERMEDIATE"
590 | gm convert "$GM_INTERMEDIATE" -black-threshold $BLOWN_WHITE_BT -white-threshold $BLOWN_WHITE_WT -negate -fill \#$OVERLAY_WHITE -opaque black -colors 2 -matte -transparent white "$WHITE_IMAGE"
591 | if [[ ! -s "$WHITE_IMAGE" ]]; then
592 | >&2 echo "Error: Unable to generate overlay for whiteout '$WHITE_IMAGE'"
593 | fi
594 | fi
595 |
596 | if shouldGenerate "$FORCE_BLOWN" "$BLACK_IMAGE" "overlay"; then
597 | ensureIntermediate "$SOURCE_IMAGE" "$GM_INTERMEDIATE"
598 | gm convert "$GM_INTERMEDIATE" -black-threshold $BLOWN_BLACK_BT -white-threshold $BLOWN_BLACK_WT -fill \#$OVERLAY_BLACK -opaque black -colors 2 -matte -transparent white "$BLACK_IMAGE"
599 | if [[ ! -s "$BLACK_IMAGE" ]]; then
600 | >&2 echo "Error: Unable to generate overlay for blackout '$BLACK_IMAGE'"
601 | fi
602 | fi
603 | updateTiming $OVERLAY_TIMING $START_OVERLAY > /dev/null
604 |
605 | local START_THUMB=`date +%s%N`
606 | if shouldGenerate "$FORCE_THUMBNAILS" "$THUMB_IMAGE" "thumbnail"; then
607 | ensureIntermediate "$SOURCE_IMAGE" "$GM_INTERMEDIATE"
608 | gm convert "$GM_INTERMEDIATE" -sharpen 3 -enhance -resize $THUMB_IMAGE_SIZE "$THUMB_IMAGE"
609 | fi
610 |
611 | if shouldGenerate "$FORCE_BLOWN_THUMBS" "$THUMB_OVERLAY_WHITE" "thumb overlay"; then
612 | echo " - ${THUMB_OVERLAY_WHITE##*/}"
613 | # Note: We use ImageMagick here as older versions of GraphicsMagick do not
614 | # handle resizing of alpha-channel PNGs followed by color reduction
615 | gm convert "$WHITE_IMAGE" -resize $THUMB_IMAGE_SIZE "$THUMB_OVERLAY_WHITE"
616 | fi
617 | if shouldGenerate "$FORCE_BLOWN_THUMBS" "$THUMB_OVERLAY_BLACK" "thumb overlay"; then
618 | echo " - ${THUMB_OVERLAY_BLACK##*/}"
619 | # Note: We use ImageMagick here as older versions of GraphicsMagick do not
620 | # handle resizing of alpha-channel PNGs followed by color reduction
621 | gm convert "$BLACK_IMAGE" -resize $THUMB_IMAGE_SIZE "$THUMB_OVERLAY_BLACK"
622 | fi
623 |
624 | removeIntermediate "$GM_INTERMEDIATE"
625 |
626 | updateTiming $THUMB_TIMING $START_THUMB > /dev/null
627 | }
628 | export -f makeImages
629 |
630 | # Histogram generation is separated from generic image generation as it takes a lot of memory
631 | # srcFolder dstFolder image crop presentation_script tile
632 | function makeHistograms() {
633 | local SRC_FOLDER="$1"
634 | local DEST_FOLDER="$2"
635 | local IMAGE="$3"
636 | local CROP_PERCENT="$5"
637 | local PRESENTATION_SCRIPT="$6"
638 | local TILE="$7"
639 |
640 | local START=`date +%s%N`
641 | # echo "makeImages $SRC_FOLDER $DEST_FOLDER"
642 |
643 | local SANS_PATH=${IMAGE##*/}
644 | local BASE=${SANS_PATH%.*}
645 |
646 | local DEST_IMAGE="${DEST_FOLDER}/${BASE}.${IMAGE_DISP_EXT}"
647 | local SOURCE_IMAGE="${SRC_FOLDER}/${IMAGE}"
648 |
649 | # Must mirror the ones in makeImageParams
650 | # Do not cheat by calling makeImageParams as makeImages might
651 | # be called in parallel
652 | local HIST_IMAGE="${DEST_FOLDER}/${BASE}.histogram.png"
653 |
654 | if [ ! -f "$SOURCE_IMAGE" ]; then
655 | echo "Error in makeHistograms: The source image $SOURCE_IMAGE does not exist" >&2
656 | exit
657 | fi
658 |
659 | local CREATED_HIST=`addGetCounter $HIST_COUNTER`
660 |
661 | if [ "png" == "${IMAGE_DISP_EXT}" ]; then
662 | # PNG is fairly fast to decode so use that as source
663 | local CONV="$DEST_IMAGE"
664 | else
665 | local CONV="$SOURCE_IMAGE"
666 | fi
667 |
668 | if shouldGenerate "$FORCE_HISTOGRAM" "$HIST_IMAGE" "histogram (${CREATED_HIST}/${TOTAL_IMAGES})"; then
669 | if [ "." == ".$HISTOGRAM_PHEIGHT" -o "auto" == "$HISTOGRAM_PHEIGHT" ]; then
670 | # Remove "-separate -append" to generate a RGB histogram
671 | # http://www.imagemagick.org/Usage/files/#histogram
672 | if [ "." == ".$CROP_PERCENT" ]; then
673 | convert "$CONV" -separate -append -define histogram:unique-colors=false -write histogram:mpr:hgram +delete mpr:hgram -negate -strip "$HIST_IMAGE"
674 | else
675 | convert "$CONV" -gravity Center -crop $CROP_PERCENT%x+0+0 -separate -append -define histogram:unique-colors=false -write histogram:mpr:hgram +delete mpr:hgram -negate -strip "$HIST_IMAGE"
676 | fi
677 | else
678 | histogramScript "$CONV" 200 false "$HIST_IMAGE"
679 | fi
680 | fi
681 | updateTiming $HIST_TIMING $START > /dev/null
682 | }
683 | export -f makeHistograms
684 |
685 | # Input: [recursive]
686 | # Output: Images in the current folder, matching $IMAGE_GLOB and
687 | # obeying white- and black-list.
688 | function listImages() {
689 | local RECURSIVE="$1"
690 |
691 | if [ -n "$BLACKLIST_FILE" ]; then
692 | if [ -n "$WHITELIST_FILE" ]; then
693 | ls $IMAGE_GLOB 2> /dev/null | grep -f "$WHITELIST_FILE" | grep -v -f "$BLACKLIST_FILE"
694 | else
695 | ls $IMAGE_GLOB 2> /dev/null | grep -v -f "$BLACKLIST_FILE"
696 | fi
697 | else
698 | if [ -n "$WHITELIST_FILE" ]; then
699 | ls $IMAGE_GLOB 2> /dev/null | grep -f "$WHITELIST_FILE"
700 | else
701 | ls $IMAGE_GLOB 2> /dev/null
702 | fi
703 | fi
704 |
705 | if [ ".true" == ".$RECURSIVE" ]; then
706 | for SUB in `ls -d */ 2> /dev/null`; do
707 | pushd $SUB > /dev/null
708 | listImages $RECURSIVE
709 | popd > /dev/null
710 | done
711 | fi
712 | }
713 |
714 | # Input: up parent srcFolder dstFolder
715 | #
716 | function makeIndex() {
717 | local UP="$1"
718 | local PARENT="$2"
719 | local SRC_FOLDER="$3"
720 | local DEST_FOLDER="$4"
721 | # echo "Processing level '$PARENT' from $SRC_FOLDER"
722 |
723 | if [ ! -d "$SRC_FOLDER" ]; then
724 | echo "Error in makeIndex: Unable to locate folder $SRC_FOLDER from `pwd`" >&2
725 | exit
726 | fi
727 | pushd "$SRC_FOLDER" > /dev/null
728 | local SRC_FOLDER=`pwd`
729 | popd > /dev/null
730 | echo "Processing $SRC_FOLDER `date +%H:%M:%S`"
731 |
732 | if [ ! -d "$DEST_FOLDER" ]; then
733 | # echo "Creating folder $DEST_FOLDER"
734 | mkdir -p "$DEST_FOLDER"
735 | fi
736 | pushd "$DEST_FOLDER" > /dev/null
737 | local DEST_FOLDER=`pwd`
738 | popd > /dev/null
739 |
740 | pushd "$SRC_FOLDER" > /dev/null
741 | local PP="${DEST_FOLDER}/index.html"
742 |
743 | if [ "." == ".$PARENT" ]; then
744 | true
745 | # echo "Parent: N/A" >> $PP
746 | fi
747 |
748 | # Images
749 | local IMAGES=`listImages`
750 |
751 | # Generate graphics
752 | # http://stackoverflow.com/questions/11003418/calling-functions-with-xargs-within-a-bash-script
753 | echo "$IMAGES" | xargs -I'{}' -P $THREADS bash -c 'makeImages "$@"' _ "$SRC_FOLDER" "$DEST_FOLDER" "{}" "$THUMB_IMAGE_SIZE" "$CROP_PERCENT" "$PRESENTATION_SCRIPT" "$TILE" \;
754 |
755 | # Generate histograms
756 | echo "$IMAGES" | xargs -I'{}' -P $HISTOGRAM_THREADS bash -c 'makeHistograms "$@"' _ "$SRC_FOLDER" "$DEST_FOLDER" "{}" "$THUMB_IMAGE_SIZE" "$CROP_PERCENT" "$PRESENTATION_SCRIPT" "$TILE" \;
757 |
758 | # Generate pages
759 | echo "$IMAGES" | xargs -I'{}' -P $PAGE_THREADS bash -c '$PAGE_SCRIPT "$@"' _ "$UP" "$PARENT" "$SRC_FOLDER" "$DEST_FOLDER" "{}" "$IMAGES" \;
760 |
761 | # if [ ! "." == ".$IMAGES" ]; then
762 | # for I in $IMAGES; do
763 | # makePreviewPage "$UP" "$PARENT" "$SRC_FOLDER" "$DEST_FOLDER" "$I" "$IMAGES"
764 | #"$PREV_IMAGE" "$NEXT_IMAGE"
765 | # done
766 | # fi
767 |
768 | # Generate links, thumbs and histograms from the pages for the folder view
769 | local THUMBS_HTML=""
770 | local HISTOGRAMS_HTML=""
771 | local ILIST_HTML=""
772 | if [ "." == ".$IMAGES" ]; then
773 | local THUMBS_HTML="
No images
"$'\n'
774 | local HISTOGRAMS_HTML="
No images
"$'\n'
775 | else
776 | for I in $IMAGES; do
777 | local SANS_PATH=${I##*/}
778 | local BASE=${SANS_PATH%.*}
779 | # Must be kept in sync with quack_helper_imagepage
780 | local ILINK="${DEST_FOLDER}/${BASE}.link.html"
781 | local TLINK="${DEST_FOLDER}/${BASE}.thumb.html"
782 | local HLINK="${DEST_FOLDER}/${BASE}.hist.html"
783 | local ILIST_HTML="${ILIST_HTML}`cat \"$ILINK\"`"$'\n'
784 | local THUMBS_HTML="${THUMBS_HTML}`cat \"$TLINK\"`"$'\n'
785 | local HISTOGRAMS_HTML="${HISTOGRAMS_HTML}`cat \"$HLINK\"`"$'\n'
786 | done
787 | fi
788 |
789 | case ".$SUB_FOLDER_LIST_SORT" in
790 | .changed) local SUBS=`ls -rt "$SRC_FOLDER"` ;;
791 | .changed_rev) local SUBS=`ls -rt "$SRC_FOLDER" | tac` ;;
792 | .name_rev) local SUBS=`ls "$SRC_FOLDER" | tac` ;;
793 | *) local SUBS=`ls "$SRC_FOLDER"` ;;
794 | esac
795 |
796 | if [ "." == ".$SUBS" ]; then
797 | SUBFOLDERS_HTML="
No subfolders
"$'\n'
798 | else
799 | SUBFOLDERS_HTML="
"$'\n'"
Folder
Changed
Images
"$'\n'
800 | # TODO: Make the iterator handle spaces
801 | for F in $SUBS; do
802 | if [ -d "$F" ]; then
803 | local CHANGED=`date -r "$SRC_FOLDER/$F" +%Y%m%d-%H%M`
804 | pushd "$SRC_FOLDER/$F" > /dev/null
805 | local SUB_COUNT=`listImages true | wc -l`
806 | popd > /dev/null
807 | SUBFOLDERS_HTML="${SUBFOLDERS_HTML}
"$'\n'
808 | fi
809 | done
810 | SUBFOLDERS_HTML="${SUBFOLDERS_HTML}
"$'\n'
811 | fi
812 |
813 | if ! compgen -G "*.Edition.xml" > /dev/null; then
814 | # TODO: Only warn if there are images
815 | EDITION_HTML=`echo "
No edition
"`
816 | else
817 | EDITION_HTML=""
818 | for E in *.Edition.xml; do
819 | local EDTMP=`mktemp`
820 | # echo to get newlines
821 | EDITION_HTML="${EDITION_HTML}
"$'\n'
828 | done
829 | fi
830 |
831 | pushd "$SRC_FOLDER" > /dev/null
832 | if [ -f "$SPECIFIC_FOLDER_SNIPPET" ]; then
833 | SNIPPET=`cat "$SPECIFIC_FOLDER_SNIPPET"`
834 | else
835 | SNIPPET="$SNIPPET_FOLDER"
836 | fi
837 | popd > /dev/null
838 |
839 | # UP, PARENT, SRC_FOLDER, DEST_FOLDER, ILIST_HTML, THUMBS_HTML, HISTOGRAMS_HTML, SUBFOLDERS_HTML, EDITION_HTML, SNIPPET
840 | ctemplate $FOLDER_TEMPLATE > $PP
841 |
842 | # Generate pages for sub folders
843 | # We do this at the end to avoid overriding of variables
844 | for F in $SUBS; do
845 | if [ -d "$F" ]; then
846 | makeIndex "${UP}../" "${PARENT}${F}/" "${SRC_FOLDER}/${F}" "${DEST_FOLDER}/${F}"
847 | fi
848 | done
849 |
850 | popd > /dev/null
851 | }
852 |
853 | function pAverage() {
854 | if [ -z "$TOTAL_IMAGES" ] || [ "0" -eq "$TOTAL_IMAGES" ]; then
855 | echo "`getCounter "$1"` ms"
856 | return
857 | fi
858 | local COUNTER=`getCounter "$1"`
859 | local AVG=$(($COUNTER / $TOTAL_IMAGES))
860 | echo "$COUNTER ms ($AVG ms/image)"
861 | }
862 |
863 | function performanceStats() {
864 | echo "Performance measurements"
865 | echo " - total time (clock): `pAverage $TOTAL_TIMING`"
866 | echo " - tiles (cpu): `pAverage "$TILE_TIMING"`"
867 | echo " - qa images (cpu): `pAverage "$QA_TIMING"`"
868 | echo " - presentation images (cpu): `pAverage "$PRESENTATION_TIMING"`"
869 | echo " - thumbs (cpu): `pAverage "$THUMB_TIMING"`"
870 | echo " - histograms (cpu): `pAverage "$HIST_TIMING"`"
871 | echo " - overlays (cpu): `pAverage "$OVERLAY_TIMING"`"
872 | }
873 |
874 | echo "Quack starting at `date`"
875 | check_dependencies
876 | copyFiles
877 | pushd "$SOURCE" > /dev/null
878 | export TOTAL_IMAGES=`listImages true | wc -l`
879 | popd > /dev/null
880 | makeIndex "" "" "$SOURCE" "$DEST"
881 | updateTiming $TOTAL_TIMING $TOTAL_START_TIME > /dev/null
882 | performanceStats
883 | for COUNTER in $ALL_COUNTERS; do
884 | deleteCount $COUNTER
885 | done
886 | echo "All done at `date`"
887 | echo "Please open ${DEST}/index.html in a browser"
888 |
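
The white- and black-list handling in listImages relies on `grep -f`, which reads one pattern per line from a file. The following is a minimal sketch of that filtering step in isolation, assuming patterns are regular expressions; the file names and patterns are made up for illustration and are not part of quack.

```shell
#!/bin/bash
# Hypothetical working area and file names, used only for this illustration
WORK=$(mktemp -d)
printf '%s\n' page-0001.jpg page-0002.jpg cover-0001.jpg notes.txt > "$WORK/files"

# One regular expression per line, as grep -f expects
printf '%s\n' '\.jpg$' > "$WORK/white"
printf '%s\n' '^cover-' > "$WORK/black"

# Keep lines matching any whitelist pattern, then drop lines matching any blacklist pattern
FILTERED=$(grep -f "$WORK/white" "$WORK/files" | grep -v -f "$WORK/black")
echo "$FILTERED"

rm -rf "$WORK"
```

Because the whitelist is applied first, an entry that matches both lists is removed by the blacklist.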
--------------------------------------------------------------------------------
/quack_helper_common.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | #
4 | # Functions used by different quack scripts
5 | #
6 |
7 | # Input: id initialValue
8 | # Output: lockname
9 | function createCounter() {
10 | local ID="$1"
11 | local INITIAL="$2"
12 | pushd `dirname $0` > /dev/null
13 | local LOCKNAME="`pwd`/lock.${ID}_$$"
14 | popd > /dev/null
15 | local COUNTFILE="${LOCKNAME}.counter"
16 | if [ "." == ".$INITIAL" ]; then
17 | local INITIAL=1
18 | fi
19 | echo "$INITIAL" > $COUNTFILE
20 | echo "$LOCKNAME"
21 | }
22 | export -f createCounter
23 |
24 | # Input: lockname delta
25 | # Output: Old counter from lock file + delta
26 | function addDeltaGetCounter() {
27 | local LOCKNAME="$1"
28 | local DELTA="$2"
29 | if [ "." == ".$LOCKNAME" ]; then
30 | echo "addDeltaGetCounter: The lockname must be specified" 1>&2
31 | exit 1
32 | fi
33 | local COUNTFILE="${LOCKNAME}.counter"
34 |
35 | # http://stackoverflow.com/questions/8231847/bash-script-to-count-number-of-times-script-has-run
36 | mkdir $LOCKNAME 2> /dev/null
37 | while [[ $? -ne 0 ]] ; do
38 | sleep 0.1
39 | mkdir $LOCKNAME 2> /dev/null
40 | done
41 | local COUNTER=`cat "$COUNTFILE"`
42 | local COUNTER=$((COUNTER+DELTA))
43 | echo $COUNTER > "$COUNTFILE"
44 | rm -rf $LOCKNAME
45 | echo $COUNTER
46 | }
47 | export -f addDeltaGetCounter
48 |
49 | # Input: lockname
50 | # Output: Old counter from lock file + 1
51 | function addGetCounter() {
52 | addDeltaGetCounter "$1" 1
53 | }
54 | export -f addGetCounter
55 |
56 | # Bash does not support adding fractions and a call to bc is costly,
57 | # so timings are kept as integer milliseconds: the last six digits of
58 | # the nanosecond timestamps are dropped before subtracting.
59 | # Input: lockname starttime (in nanoseconds)
60 | # Increments the counter with milliseconds from currenttime-starttime
61 | function updateTiming() {
62 | local START=$2
63 | local START=${START:0:${#START}-6}
64 | local END=`date +%s%N`
65 | local END=${END:0:${#END}-6}
66 | addDeltaGetCounter $1 $((END-START))
67 | }
68 | export -f updateTiming
69 |
70 | # Input: lockname
71 | # Output: Old counter from lock file
72 | function getCounter() {
73 | addDeltaGetCounter "$1" 0
74 | }
75 | export -f getCounter
76 |
77 | # Removes old count files
78 | function deleteCount() {
79 | local LOCKNAME="$1"
80 | if [ "." == ".$LOCKNAME" ]; then
81 | echo "deleteCount: The lockname must be specified" 1>&2
82 | exit
83 | fi
84 | local COUNTFILE="${LOCKNAME}.counter"
85 |
86 | if [ -d "$LOCKNAME" ]; then
87 | rm -r "$LOCKNAME"
88 | fi
89 | if [ -f "$COUNTFILE" ]; then
90 | rm -r "$COUNTFILE"
91 | fi
92 | }
93 | export -f deleteCount
94 |
95 | #L=`createCounter foo 0`
96 | #addGetCounter $L
97 | #addGetCounter $L
98 |
99 | # Skips the given number of lines and returns the rest
100 | # If negative lines are given, the end is skipped. A skip of 0 returns an empty string
101 | # Input: string lines
102 | function skipLines() {
103 | local TEXT="$1"
104 | local SKIP="$2"
105 |
106 | if [ 0 -eq $SKIP ]; then
107 | echo ""
108 | return
109 | fi
110 |
111 | if [ $SKIP -le 0 ]; then
112 | local TAIL=true
113 | local SKIP=$(((-1)*$SKIP))
114 | else
115 | local TAIL=false
116 | fi
117 |
118 | local LENGTH=`echo "$TEXT" | wc -l`
119 | if [ $LENGTH -le $SKIP ]; then
120 | echo ""
121 | return
122 | fi
123 | if [ "true" == "$TAIL" ]; then
124 | echo "$TEXT" | head -n $((LENGTH-SKIP))
125 | else
126 | echo "$TEXT" | tail -n $((LENGTH-SKIP))
127 | fi
128 | }
129 | export -f skipLines
130 |
131 | #skipLines "$1" "$2"
132 |
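
The counters above use `mkdir` as a mutex: creating a directory either succeeds or fails as a whole, so only one process at a time can read, update and write the counter file. The following is a minimal sketch of that idea under simplified assumptions; `demo_counter_init` and `demo_counter_add` are hypothetical names and not the functions from this file.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of quack_helper_common.sh

# Creates a counter file and prints the lock path that identifies it
demo_counter_init() {
    local dir="$1" id="$2" initial="${3:-0}"
    local lock="${dir}/lock.${id}"
    echo "$initial" > "${lock}.counter"
    echo "$lock"
}

# Adds a delta to the counter under the lock and prints the new value
demo_counter_add() {
    local lock="$1" delta="$2"
    # mkdir fails if the directory already exists, so it acts as the lock
    until mkdir "$lock" 2> /dev/null; do
        sleep 0.1
    done
    local value
    value=$(cat "${lock}.counter")
    value=$((value + delta))
    echo "$value" > "${lock}.counter"
    rmdir "$lock"
    echo "$value"
}

WORK=$(mktemp -d)
LOCK=$(demo_counter_init "$WORK" demo 0)
# Eight background jobs each add 1; the lock serializes the read-modify-write
for i in 1 2 3 4 5 6 7 8; do
    demo_counter_add "$LOCK" 1 > /dev/null &
done
wait
# A delta of 0 reads the current value, as getCounter does
FINAL=$(demo_counter_add "$LOCK" 0)
echo "final count: $FINAL"
rm -rf "$WORK"
```

One design consequence worth noting: if a process dies while holding the lock, the directory stays behind and later callers wait forever, which is why a cleanup step such as deleteCount is useful.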
--------------------------------------------------------------------------------
/quack_helper_imagepage.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Get helper functions
4 | pushd `dirname $0` > /dev/null
5 | source "analyze.sh"
6 | source "quack_helper_common.sh"
7 | popd > /dev/null
8 |
9 | # TODO: Double-defined in quack.sh. Move to common script
10 | # http://stackoverflow.com/questions/14434549/how-to-expand-shell-variables-in-a-text-file
11 | # Input: template-file
12 | function ctemplate() {
13 | local TMP=`mktemp --suffix .sh`
14 | echo 'cat <<END_OF_TEXT' > $TMP
15 | cat "$1" >> $TMP
16 | echo 'END_OF_TEXT' >> $TMP
17 | . $TMP
18 | rm $TMP
19 | }
20 |
21 | # Searches from the root for alternative versions of the given image
22 | # Very specific to Statsbiblioteket
23 | # src_folder image
24 | # Output: ALTERNATIVES_HTML
25 | function resolveAlternatives() {
26 | local SRC_FOLDER="$1"
27 | local IMAGE="$2"
28 | local FULL="${SRC_FOLDER}/${IMAGE}"
29 |
30 | # local ID=`echo "$IMAGE" | grep -o "[0-9][0-9][0-9][0-9]-.*"`
31 | # local ID="${IMAGE%.*}"
32 | # if [ "." == ".$ID" ]; then
33 | # echo " Unable to extract ID for \"$IMAGE\". No alternatives lookup"
34 | # return
35 | # fi
36 |
37 | pushd "$SOURCE_FULL" > /dev/null
38 | ALTERNATIVES_HTML="
"$'\n'
39 | # for A in `find "$SOURCE_FULL" -name "${ID}*" | sort`; do
40 | # dot to get relative path
41 | for A in `find . -name "${IMAGE}" | sort`; do
42 | # "../../.././Apex/B3/2012-01-05-01/Dagbladet-2012-01-05-01-0130B.jp2 -> Apex/B3
43 | local LINK=`echo "$A" | sed 's/[./]\\+\\([^\\/]\\+\\/[^\\/]\\+\\).*/\\1/g'`
44 | local LINK=$(dirname "$LINK")/
45 | local D="${A%.*}"
46 | ALTERNATIVES_HTML="${ALTERNATIVES_HTML}
"$'\n'
49 | popd > /dev/null
50 | }
51 |
52 | # Generates JavaScript snippet for black and white overlays
53 | # Input: src imageWidth imageHeight
54 | # Output: OVERLAYS (not terminated with ']')
55 | function blackWhite() {
56 | local SRC="$1"
57 | local IMAGE_WIDTH=$2
58 | local IMAGE_HEIGHT=$3
59 | local REL_HEIGHT=`echo "scale=2;$IMAGE_HEIGHT/$IMAGE_WIDTH" | bc`
60 |
61 | if [ "." == ".$CROP_PERCENT" ]; then
62 | local CROP_X_FRACTION="0.0"
63 | local CROP_Y_FRACTION="0.0"
64 | local CROP_WIDTH_FRACTION="1.0"
65 | local CROP_HEIGHT_FRACTION="$REL_HEIGHT"
66 | else
67 | local PERCENT=`echo "$CROP_PERCENT" | grep -o "[0-9]\+"`
68 | # TODO: Rounding is quite rough. Consider keeping fractions and skipping intermediates
69 | local PERCENT=$(((100-$PERCENT)/2))
70 | local CROP_X=$((PERCENT*IMAGE_WIDTH/100))
71 | local CROP_Y=$((PERCENT*IMAGE_HEIGHT/100))
72 | local CROP_WIDTH=$((IMAGE_WIDTH-(2*CROP_X)))
73 | local CROP_HEIGHT=$((IMAGE_HEIGHT-(2*CROP_Y)))
74 | local CROP_X_FRACTION=`echo "scale=2;x=$CROP_X/$IMAGE_WIDTH; if(x<1) print 0; x" | bc`
75 | local CROP_Y_FRACTION=`echo "scale=2;x=$CROP_Y/$IMAGE_WIDTH; if(x<1) print 0; x" | bc`
76 | local CROP_WIDTH_FRACTION=`echo "scale=2;x=$CROP_WIDTH/$IMAGE_WIDTH; if(x<1) print 0; x" | bc`
77 | local CROP_HEIGHT_FRACTION=`echo "scale=2;x=$CROP_HEIGHT/$IMAGE_WIDTH; if(x<1) print 0; x" | bc`
78 | fi
79 |
80 | # Special overlays to show absolute black and absolute white pixels
81 | # The REL_HEIGHT is a hack as OpenSeadragon scales with respect to width
82 | OVERLAYS="overlays: ["$'\n'
83 | OVERLAYS="${OVERLAYS}{id: 'white',"$'\n'
84 | OVERLAYS="${OVERLAYS} x: 0.0, y: 0.0, width: 1.0, height: $REL_HEIGHT,"$'\n'
85 | OVERLAYS="${OVERLAYS} className: 'whiteoverlay'"$'\n'
86 | OVERLAYS="${OVERLAYS}},"$'\n'
87 | OVERLAYS="${OVERLAYS}{id: 'black',"$'\n'
88 | OVERLAYS="${OVERLAYS} x: 0.0, y: 0.0, width: 1.0, height: $REL_HEIGHT,"$'\n'
89 | OVERLAYS="${OVERLAYS} className: 'blackoverlay'"$'\n'
90 | OVERLAYS="${OVERLAYS}},"$'\n'
91 | OVERLAYS="${OVERLAYS}{id: 'cropbox',"$'\n'
92 | OVERLAYS="${OVERLAYS} x: $CROP_X_FRACTION, y: $CROP_Y_FRACTION, width: $CROP_WIDTH_FRACTION, height: $CROP_HEIGHT_FRACTION,"$'\n'
93 | OVERLAYS="${OVERLAYS} className: 'cropoverlay'"$'\n'
94 | OVERLAYS="${OVERLAYS}},"$'\n'
95 | }
96 |
97 | # Generates overlays for the stated block and updates idnext & idprev
98 | # altoxml (newlines removed) tag class
99 | # Output (addition): IDNEXTS IDPREVS OVERLAYS OCR_CONTENT
100 | function processElements() {
101 | local ALTOFLAT=$1
102 | local TAG=$2
103 | local CLASS=$3
104 |
105 | # echo "processGenericOverlay $TAG $CLASS"
106 | # Insert newlines before $TAG>
107 | ELEMENTS=`echo $ALTOFLAT | sed "s/<$TAG/\\n<$TAG/g" | grep "<$TAG"`
108 | # local ELEMENTS=`echo $ALTOFLAT | sed "s/<\/$TAG>/<\/$TAG>\\n/g"`
109 | local SAVEIFS=$IFS
110 | IFS=$(echo -en "\n\b")
111 | # http://mywiki.wooledge.org/BashFAQ/001
112 | while IFS= read -r B
113 | do
114 | # echo -n "."
115 | # for B in $ELEMENTS ; do
116 | local BTAG=`echo "$B" | grep -o "<$TAG[^>]\+>"`
117 | local BID=`echo $BTAG | sed 's/.*ID=\"\([^"]\+\)".*/\\1/g'`
118 | if [ "." == ".$BID" ]; then
119 | continue
120 | fi
121 | local BIDNEXT=`echo $BTAG | sed 's/.*IDNEXT=\"\([^"]\+\)".*/\\1/g'`
122 | if [ "." != ".$BIDNEXT" -a "$BTAG" != "$BIDNEXT" ]; then
123 | local PRE_ART=`echo "$BIDNEXT" | grep -o "^ART"`
124 | if [ ".true" == ".$SKIP_NEXT_ART" ]; then
125 | if [ ".ART" == ".$PRE_ART" ]; then
126 | BIDNEXT=""
127 | fi
128 | fi
129 | IDNEXTS="${IDNEXTS}nexts[\"${BID}\"] = \"$BIDNEXT\";"$'\n'
130 | IDPREVS="${IDPREVS}prevs[\"${BIDNEXT}\"] = \"$BID\";"$'\n'
131 | fi
132 | local BHEIGHT=`echo $BTAG | sed 's/.*HEIGHT=\"\([^"]\+\)".*/\\1/g'`
133 | local BWIDTH=`echo $BTAG | sed 's/.*WIDTH=\"\([^"]\+\)".*/\\1/g'`
134 | local BHPOS=`echo $BTAG | sed 's/.*HPOS=\"\([^"]\+\)".*/\\1/g'`
135 | local BVPOS=`echo $BTAG | sed 's/.*VPOS=\"\([^"]\+\)".*/\\1/g'`
136 |
137 | local SWIDTH=`echo "scale=6;$BWIDTH/$PWIDTH*$ALTO_SCALE_FACTOR" | bc | sed 's/^\./0./'`
138 | # TODO: Seems like there is some mismatch going on here with some deliveries
139 | local SHEIGHT=`echo "scale=6;$BHEIGHT/$PHEIGHT*$ALTO_SCALE_FACTOR" | bc | sed 's/^\./0./'`
140 | # SHEIGHT=`echo "scale=6;$BHEIGHT/$PWIDTH" | bc | sed 's/^\./0./'`
141 | local SHPOS=`echo "scale=6;$BHPOS/$PWIDTH*$ALTO_SCALE_FACTOR" | bc | sed 's/^\./0./'`
142 | local SVPOS=`echo "scale=6;$BVPOS/$PHEIGHT*$ALTO_SCALE_FACTOR" | bc | sed 's/^\./0./'`
143 |
144 | # Special handling of TextBlock
145 | if [ "TextBlock" == "$TAG" ]; then
146 | BCONTENT=`echo "$B" | grep -o ' CONTENT="[^"]\+"' | sed 's/ CONTENT="\\([^"]\\+\\)"/\\1/g' | sed ':a;N;$!ba;s/\\n/ /g' | sed 's/\\\\/\\\\\\\\/g'`
147 | # TODO: Handle entity-escaped content as well as quotes and backslash
148 | OCR_CONTENT="${OCR_CONTENT}ocrs[\"${BID}\"] = \"$BCONTENT\";"$'\n'
149 | # echo "ocrs[\"${BID}\"] = \"$BCONTENT\";"$'\n'
150 | fi
151 |
152 | OVERLAYS="${OVERLAYS} {id: '$BID',"$'\n'
153 | OVERLAYS="${OVERLAYS} x: $SHPOS, y: $SVPOS, width: $SWIDTH, height: $SHEIGHT,"$'\n'
154 | OVERLAYS="${OVERLAYS} className: '$CLASS'"$'\n'
155 | OVERLAYS="${OVERLAYS} },"$'\n'
156 | done <<< "$ELEMENTS"
157 | IFS=$SAVEIFS
158 | }
159 |
160 | # Generates overlays
161 | # src dest altofile width height
162 | # Output: ELEMENTS_HTML OVERLAYS OCR_CONTENT IDNEXT_CONTENT FULL_RELATIVE_HEIGHT ACCURACY
163 | function processALTO() {
164 | local SRC="$1"
165 | local DEST="$2"
166 | local ALTO_FILE="$3"
167 | local IMAGE_WIDTH=$4
168 | local IMAGE_HEIGHT=$5
169 | # local WIDTH=$4
170 | # local HEIGHT=$5
171 |
172 | # Used by caller
173 | OVERLAYS=""
174 | ELEMENTS_HTML=""
175 | OCR_CONTENT=""
176 | IDNEXT_CONTENT=""
177 | FULL_RELATIVE_HEIGHT="1"
178 | ACCURACY="N/A"
179 |
180 | local ALTO="${SRC_FOLDER}/${ALTO_FILE}"
181 | if [[ ! -s "$ALTO" ]]; then
182 | >&2 echo "Error: '$ALTO' not available from folder $(pwd)"
183 | fi
184 | blackWhite "$SRC" $IMAGE_WIDTH $IMAGE_HEIGHT
185 | # TODO: Extract relevant elements from the Alto for display
186 | if [ ! -f "$ALTO" ]; then
187 | # TODO: Better handling of non-existence
188 | ELEMENTS_HTML="
No ALTO file at $ALTO
"$'\n'
189 | # Terminate the black/white overlay and return
190 | OVERLAYS="${OVERLAYS}]"
191 | return
192 | fi
193 |
194 | cp "$ALTO" "$ALTO_DEST"
195 | # Extract key elements from the ALTO
196 | BLOCK_COUNT=`grep -c "&2 echo "Error: '$ALTO' not available from folder $(pwd). Unable to generate compact representation"
202 | fi
203 | >&2 echo "Alto file from $(pwd) is $ALTO_FILE"
204 |
205 |
206 | local ALTO_COMPACT=`cat "$ALTO_FILE" | sed ':a;N;$!ba;s/\\n/ /g'`
207 | # local PTAG=`echo "$ALTO_COMPACT" | grep -o "]\\+>"`
208 | local PTAG=`echo "$ALTO_COMPACT" | grep -o "]\\+>"`
209 | local PHEIGHT=`echo $PTAG | sed 's/.*HEIGHT=\"\([^"]\+\)".*/\\1/g'`
210 | local PWIDTH=`echo $PTAG | sed 's/.*WIDTH=\"\([^"]\+\)".*/\\1/g'`
211 | ACCURACY=`echo $PTAG | sed 's/.*PC=\"\([^"]\+\)".*/\\1/g'`
212 | ACCURACY=`echo "scale=2;x=$ACCURACY*100/1; if(x<1) print 0; x" | bc`
213 |
214 | FULL_RELATIVE_HEIGHT=`echo "scale=6;$PHEIGHT/$PWIDTH" | bc | sed 's/^\./0./'`
215 | # TODO: Ponder how relative positioning works and why this hack is necessary
216 | # Theory #1: OpenSeadragon messes up the vertical relative positioning
217 | PHEIGHT=$PWIDTH
218 |
219 | ELEMENTS_HTML="
Key
Value
"$'\n'
220 | for E in $ALTO_ELEMENTS; do
221 | SAVEIFS=$IFS
222 | IFS=$(echo -en "\n\b")
223 | for V in `echo "$ALTO_COMPACT" | grep -o "<${E}>[^<]\\+</${E}>"`; do
224 | TV=`echo "$V" | sed 's/.*>\(.*\)<.*/\\1/g'`
225 | ELEMENTS_HTML="${ELEMENTS_HTML}
"$'\n'
230 |
231 | OCR_CONTENT=""
232 | IDNEXTS=""
233 | IDPREVS=""
234 |
235 | # Remove newlines from the ALTO
236 | if [[ ! -s "$ALTO" ]]; then
237 | >&2 echo "Error: '$ALTO' not available from folder $(pwd). Unable to generate no-newline representation"
238 | fi
239 | SANS=`cat "$ALTO" | sed ':a;N;$!ba;s/\\n/ /g'`
240 |
241 | processElements "$SANS" "ComposedBlock" "composed"
242 | processElements "$SANS" "Illustration" "illustration"
243 | processElements "$SANS" "TextBlock" "highlight"
244 |
245 | OVERLAYS="${OVERLAYS} ]"$'\n'
246 | }
247 |
248 | #
249 | # Creates a HTML page representing a single image.
250 | # The image files used by this function must be created (function makeImages) before calling
251 | # makePreviewPage.
252 | #
253 | # Input: up parent srcFolder dstFolder image images
254 | # Output: PAGE_LINK BASE THUMB_LINK THUMB_WIDTH THUMB_HEIGHT HISTOGRAM_LINK HISTOGRAM_WIDTH HISTOGRAM_HEIGHT ILINK
255 | function makePreviewPage() {
256 | local UP="$1"
257 | local PARENT="$2"
258 | local SRC_FOLDER="$3"
259 | local DEST_FOLDER="$4"
260 | local IMAGE="$5"
261 | local IMAGES="$6"
262 |
263 | local PREV_IMAGE=`echo "$IMAGES" | grep -F -x -B 1 -- "$IMAGE" | head -n 1 | grep -F -x -v -- "$IMAGE"`
264 | local NEXT_IMAGE=`echo "$IMAGES" | grep -F -x -A 1 -- "$IMAGE" | tail -n 1 | grep -F -x -v -- "$IMAGE"`
265 |
266 | local SANS_PATH=${IMAGE##*/}
267 | BASE=${SANS_PATH%.*}
268 | P="${DEST_FOLDER}/${BASE}.html"
269 |
270 | # Must be synced with quack.makeImageParams()
271 | local SOURCE_IMAGE="${SRC_FOLDER}/${IMAGE}"
272 | local DEST_IMAGE="${DEST_FOLDER}/${BASE}.${IMAGE_DISP_EXT}"
273 | local HIST_IMAGE="${DEST_FOLDER}/${BASE}.histogram.png"
274 | local HISTOGRAM_LINK=${HIST_IMAGE##*/}
275 | local THUMB_IMAGE="${DEST_FOLDER}/${BASE}.thumb.jpg"
276 | local THUMB_LINK=${THUMB_IMAGE##*/}
277 | local WHITE_IMAGE="${DEST_FOLDER}/${BASE}.white.png"
278 | local BLACK_IMAGE="${DEST_FOLDER}/${BASE}.black.png"
279 | local PRESENTATION_IMAGE="${DEST_FOLDER}/${BASE}.presentation.jpg"
280 | local TILE_FOLDER="${DEST_FOLDER}/${BASE}_files"
281 | local PRESENTATION_TILE_FOLDER="${DEST_FOLDER}/${BASE}.presentation_files"
282 | local ALTO_DEST="${DEST_FOLDER}/${BASE}.alto.xml"
283 |
284 | # Must be kept in sync with quack.makeIndex()
285 | local ILINK="${DEST_FOLDER}/${BASE}.link.html"
286 | local TLINK="${DEST_FOLDER}/${BASE}.thumb.html"
287 | local HLINK="${DEST_FOLDER}/${BASE}.hist.html"
288 |
289 | local SSNIP="${BASE}${SPECIFIC_IMAGE_SNIPPET_EXTENSION}"
290 |
291 | if [ -f "$SSNIP" ]; then
292 | SNIPPET=`cat "$SSNIP"`
293 | else
294 | SNIPPET="$SNIPPET_FOLDER"
295 | fi
296 |
297 | # Used by function caller
298 | PAGE_LINK="${BASE}.html"
299 |
300 | # makeImageParams "$SRC_FOLDER" "$DEST_FOLDER" "$IMAGE"
301 |
302 | if [ ! -e "$DEST_IMAGE" ]; then
303 | echo "The destination image '$DEST_IMAGE' for '$IMAGE' has not been created" >&2
304 | exit
305 | fi
306 |
307 | set -- junk `isize "$DEST_IMAGE"`
308 | shift
309 | IMAGE_WIDTH=$1
310 | IMAGE_HEIGHT=$2
311 | IMAGE_MP=`echo "scale=1;x=$IMAGE_WIDTH*$IMAGE_HEIGHT/1000000; if(x<1) print 0; x" | bc`
312 |
313 | set -- junk `isize "$THUMB_IMAGE"`
314 | shift
315 | THUMB_WIDTH=$1
316 | THUMB_HEIGHT=$2
317 |
318 | set -- junk `isize "$HIST_IMAGE"`
319 | shift
320 | HISTOGRAM_WIDTH=$1
321 | HISTOGRAM_HEIGHT=$2
322 |
323 | if [ ".true" == ".$PRESENTATION" ]; then
324 | set -- junk `isize "$PRESENTATION_IMAGE"`
325 | shift
326 | PRESENTATION_WIDTH=$1
327 | PRESENTATION_HEIGHT=$2
328 | fi
329 |
330 | local CREATED_PAGES=`addGetCounter $PAGE_COUNTER`
331 |
332 | if [ "true" != "$FORCE_PAGES" -a -e "$P" ]; then
333 | return
334 | fi
335 |
336 | echo " - ${P##*/} (${CREATED_PAGES}/${TOTAL_IMAGES})"
337 |
338 | local ALTO_FILE="${BASE}${ALTO_EXT}"
339 | processALTO "$SRC_FOLDER" "$DEST_FOLDER" "$ALTO_FILE" $IMAGE_WIDTH $IMAGE_HEIGHT
340 | # $IMAGE_WIDTH $IMAGE_HEIGHT
341 |
342 | local NAVIGATION=""
343 | if [ ! "." == ".$PREV_IMAGE" ]; then
344 | local PSANS_PATH=${PREV_IMAGE##*/}
345 | local PBASE=${PSANS_PATH%.*}
346 | NAVIGATION="previous | "
347 | else
348 | # We write the text to keep the positions of the links constant
349 | NAVIGATION="previous | "
350 | fi
351 | NAVIGATION="${NAVIGATION}up"
352 | if [ ! "." == ".$NEXT_IMAGE" ]; then
353 | local NSANS_PATH=${NEXT_IMAGE##*/}
354 | local NBASE=${NSANS_PATH%.*}
355 | NAVIGATION="${NAVIGATION} | next"
356 | else
357 | NAVIGATION="${NAVIGATION} | next"
358 | fi
359 |
360 | # PARENT, DATE, UP, NAVIGATION, BASE, SOURCE, FULL_RELATIVE_HEIGHT, EDEST, IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_MP, TILE_SOURCES, THUMB, THUMB_WIDTH, THUMB_HEIGHT, PRESENTATION, PRESENTATION_WIDTH, PRESENTATION_HEIGHT, WHITE, BLACK, OVERLAYS, OCR_CONTENT, IDNEXTS, IDPREVS, ALTO_ELEMENTS_HTML, HISTOGRAM, ALTO, ALTERNATIVES
361 | # SOURCE="$SOURCE_IMAGE"
362 | SOURCE_SHORT=${SOURCE##*/}
363 | SOURCE_SIZE=`du -k "$SOURCE_IMAGE" | grep -o "^[0-9]\+"`
364 | EDEST=${DEST_IMAGE##*/}
365 | IMAGE="$EDEST"
366 |
367 | # tileSources: 'http://achernar/iipsrv/?DeepZoom=/net/zone1.isilon.sblokalnet/ifs/archive/avis-upload/B400027132055-RT1/400027132055-08/1899-11-25-01/adresseavisen1759-1899-11-25-01-0298.jp2.dzi',
368 |
369 |
370 |
371 | # ***************** Imageserver *******************
372 | # We get the relative path by subtracting SOURCE_FULL from SRC_FOLDER
373 |
374 | if [ "." != ".$IIPSRV" ]; then
375 |
376 | # Ensure the SOURCE_FULL ends with a slash:
377 | local SRC_FULL=`echo "$SOURCE_FULL" | sed 's&\([^/]\)$&\1/&'`
378 | if [ -z "$IIPSRV_FOLLOW_SYMLINKS_EXTHACK" ]; then
379 | local SRC_IMG="$IMAGE"
380 | else
381 | local SRC_IMG="${IMAGE%.*}$IIPSRV_FOLLOW_SYMLINKS_EXTHACK"
382 | fi
383 |
384 | if [ ".true" == ".$IIPSRV_FOLLOW_SYMLINKS" -a -n "`readlink $SRC_FOLDER/$SRC_IMG`" ]; then
385 | if [ -n "$IIPSRV_FOLLOW_SYMLINKS_ROOT" ]; then
386 | local SRC_FULL=`echo "$IIPSRV_FOLLOW_SYMLINKS_ROOT" | sed 's&\([^/]\)$&\1/&'`
387 | fi
388 | local TRUE_IMAGE=`readlink $SRC_FOLDER/$SRC_IMG`
389 | local TRUE_IMAGE=`echo $(cd $(dirname $TRUE_IMAGE); pwd)/$(basename $TRUE_IMAGE)`
390 | # Remove the source root prefix
391 | local IMG_REL=`echo "$TRUE_IMAGE" | sed "s&$SRC_FULL&&"`
392 | else
393 | # Remove the source root prefix
394 | local SRC_REL=`echo "$SRC_FOLDER" | sed "s&$SRC_FULL&&"`
395 | # Add the image
396 | local IMG_REL="$SRC_REL/$SRC_IMG"
397 | fi
398 |
399 | # Set the DZI-extension
400 | local SRC_DZI="${IMG_REL%.*}$IIPSRV_DZI_EXT"
401 |
402 | TILE_SOURCES="'http://achernar/iipsrv/?DeepZoom=/net/zone1.isilon.sblokalnet/ifs/archive/avis-upload/$SRC_DZI'"
403 | # TODO: Consider adding PRESENTATION_TILE_SOURCES here
404 | else
405 | if [ "true" == "$TILE" ]; then
406 | # ***************** Tiles *******************
407 | TILE_SOURCES="{ Image: {\
408 | xmlns: \"http://schemas.microsoft.com/deepzoom/2008\",\
409 | Url: \"${TILE_FOLDER##*/}/\",\
410 | Format: \"$IMAGE_DISP_EXT\",\
411 | Overlap: \"4\",\
412 | TileSize: \"256\",\
413 | Size: {\
414 | Width: \"$IMAGE_WIDTH\",\
415 | Height: \"$IMAGE_HEIGHT\"\
416 | }\
417 | }}"$'\n'
418 | if [ ".true" == ".$PRESENTATION" ]; then
419 | PRESENTATION_TILE_SOURCES=" Image: {\
420 | xmlns: \"http://schemas.microsoft.com/deepzoom/2008\",\
421 | Url: \"${PRESENTATION_TILE_FOLDER##*/}/\",\
422 | Format: \"$PRESENTATION_IMAGE_DISP_EXT\",\
423 | Overlap: \"4\",\
424 | TileSize: \"256\",\
425 | Size: {\
426 | Width: \"$PRESENTATION_WIDTH\",\
427 | Height: \"$PRESENTATION_HEIGHT\"\
428 | }\
429 | }"$'\n'
430 | else
431 | PRESENTATION_TILE_SOURCES=""
432 | fi
433 |
434 | else
435 | # ***************** No Tiles *******************
436 | TILE_SOURCES="{ type: 'legacy-image-pyramid',\
437 | levels:[\
438 | {\
439 | url: '${EDEST}',\
440 | width: ${IMAGE_WIDTH},\
441 | height: ${IMAGE_HEIGHT}\
442 | }\
443 | ]}"$'\n'
444 | if [ ".true" == ".$PRESENTATION" ]; then
445 | PRESENTATION_TILE_SOURCES=" type: 'legacy-image-pyramid',\
446 | levels:[\
447 | {\
448 | url: '${PRESENTATION_IMAGE##*/}',\
449 | width: ${PRESENTATION_WIDTH},\
450 | height: ${PRESENTATION_HEIGHT}\
451 | }\
452 | ]"$'\n'
453 | else
454 | PRESENTATION_TILE_SOURCES=""
455 | fi
456 | fi
457 | fi
458 |
459 | THUMB="$THUMB_LINK"
460 | WHITE_LINK=${WHITE_IMAGE##*/}
461 | WHITE="$WHITE_LINK"
462 | BLACK_LINK=${BLACK_IMAGE##*/}
463 | BLACK="$BLACK_LINK"
464 |
465 | ALTO_ELEMENTS_HTML="$ELEMENTS_HTML"
466 | EHIST=${HIST_IMAGE##*/}
467 | HISTOGRAM="$EHIST"
468 | ALTO="$ALTO_FILE"
469 | if [ "true" == "$RESOLVE_ALTERNATIVES" ]; then
470 | resolveAlternatives "$SRC_FOLDER" "$(basename "$SOURCE_IMAGE")"
471 | else
472 | local ALTERNATIVES_HTML=""
473 | fi
474 | ALTERNATIVES="$ALTERNATIVES_HTML"
475 |
476 | # TODO: Use destination if that is lossless and faster to open?
477 | local GREY=`grey_stats "$SOURCE_IMAGE" "$DEST_FOLDER"`
478 |
479 | # $PIXELS $UNIQUE $FIRST_COUNT $PERCENT_FIRST $FIRST_GREY $LAST_COUNT $PERCENT_LAST $LAST_GREY $COUNT_SPIKE $PERCENT_SPIKE $GREY_SPIKE $ZEROES $HOLES
480 | # 1000095 512 82362 8.23 (0,0,0) 255 .02 (255,255,255)
481 | GREY_PIXELS=`echo "$GREY" | cut -d\ -f1`
482 | GREY_UNIQUE=`echo "$GREY" | cut -d\ -f2`
483 | GREY_COUNT_FIRST=`echo "$GREY" | cut -d\ -f3`
484 | GREY_PERCENT_FIRST=`echo "$GREY" | cut -d\ -f4`
485 | GREY_FIRST=`echo "$GREY" | cut -d\ -f5`
486 | GREY_COUNT_LAST=`echo "$GREY" | cut -d\ -f6`
487 | GREY_PERCENT_LAST=`echo "$GREY" | cut -d\ -f7`
488 | GREY_LAST=`echo "$GREY" | cut -d\ -f8`
489 | GREY_COUNT_SPIKE=`echo "$GREY" | cut -d\ -f9`
490 | GREY_PERCENT_SPIKE=`echo "$GREY" | cut -d\ -f10`
491 | GREY_SPIKE=`echo "$GREY" | cut -d\ -f11`
492 | GREY_ZEROES=`echo "$GREY" | cut -d\ -f12`
493 | GREY_HOLES=`echo "$GREY" | cut -d\ -f13`
494 | local GREY_ALL_SOURCE=`im_identify "$SOURCE_IMAGE" "$DEST_FOLDER"`
495 | GREY_ALL=`cat "$GREY_ALL_SOURCE" | grep -A 256 Histogram | tail -n 256`
496 |
497 | ctemplate $IMAGE_TEMPLATE > $P
498 | ctemplate $IMAGELINK_TEMPLATE > $ILINK
499 | ctemplate $HIST_TEMPLATE > $HLINK
500 | ctemplate $THUMB_TEMPLATE > $TLINK
501 |
502 | # ls -l "$IMAGE"
503 | # echo "$GREY"
504 | # ***
505 | # echo ""
506 |
507 | # cat $P
508 | # exit
509 |
510 |
511 | }
512 | makePreviewPage "$@"
513 |
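
The ctemplate function expands shell variables in a template by wrapping the template in a here-document and sourcing the result. The following is a minimal sketch of that technique, assuming a template that ends with a newline; `demo_template` and the sample template text are hypothetical and only illustrate the idea.

```shell
#!/bin/bash
# Hypothetical helper mirroring the ctemplate idea; not the project's function
demo_template() {
    local template="$1"
    local tmp
    tmp=$(mktemp)
    {
        echo 'cat <<END_OF_TEXT'
        cat "$template"
        echo 'END_OF_TEXT'
    } > "$tmp"
    # Sourcing runs the here-document in the current shell, so variables expand
    . "$tmp"
    rm "$tmp"
}

TEMPLATE=$(mktemp)
# Single quotes keep the placeholders literal until the template is expanded
echo 'Image ${BASE} is ${IMAGE_WIDTH}x${IMAGE_HEIGHT} pixels' > "$TEMPLATE"

BASE="ACE-17950616-0018B"
IMAGE_WIDTH=800
IMAGE_HEIGHT=1200
PAGE=$(demo_template "$TEMPLATE")
echo "$PAGE"
rm "$TEMPLATE"
```

Since the template is executed as shell input, command substitutions inside it are run as well, so this approach is only suitable for trusted templates.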
--------------------------------------------------------------------------------
/samples/ACE-17950616-0018B.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tokee/quack/5cbda6327fff58c409faf3d92e3b90b4439498a4/samples/ACE-17950616-0018B.jpg
--------------------------------------------------------------------------------
/samples/ACE-17950616-0018B.snippet:
--------------------------------------------------------------------------------
1 |
Frontpage for the newspaper
2 |
--------------------------------------------------------------------------------
/samples/ACE-17950616-0019A.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tokee/quack/5cbda6327fff58c409faf3d92e3b90b4439498a4/samples/ACE-17950616-0019A.jpg
--------------------------------------------------------------------------------
/samples/ACE-17950616-0019B.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tokee/quack/5cbda6327fff58c409faf3d92e3b90b4439498a4/samples/ACE-17950616-0019B.jpg
--------------------------------------------------------------------------------
/samples/README:
--------------------------------------------------------------------------------
1 | *** Copyright and source
2 |
3 | The originating newspaper issue for the sample images is from 1795 and thus
4 | out of Danish copyright. The rights to the digital images as well as the
5 | corresponding ALTO files have been released by Statsbiblioteket (State and
6 | University Library, Denmark).
7 |
8 | The sample files are downscaled and strongly compressed to avoid bloating
9 | the git repository. They should not be seen as representative of production
10 | scans. The original scans are 300 DPI equivalent, in lossless JPEG 2000.
11 |
12 |
13 | *** How to use the sample
14 |
15 | From the main quack folder, issue the following command
16 | ./quack.sh samples samples_qa
17 | then open the file samples_qa/index.html in a browser.
18 |
--------------------------------------------------------------------------------
/samples/folder.snippet:
--------------------------------------------------------------------------------
1 |
Sample images from a 200+ year old Danish newspaper
Permission is hereby granted, free of charge, to any person obtaining
15 | a
16 | copy of this software and associated documentation files (the
17 | "Software"), to deal in the Software without restriction, including
18 | without limitation the rights to use, copy, modify, merge, publish,
19 | distribute, sublicense, and/or sell copies of the Software, and to
20 | permit persons to whom the Software is furnished to do so, subject to
21 | the following conditions:
22 |
23 |
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
24 | EXPRESS
25 | OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
26 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
27 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
28 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
29 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
30 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.