├── .gitattributes ├── AUTHORS ├── LICENSE ├── Makefile ├── README.md ├── draw_bpp_plot ├── draw_heatmap ├── ecoli_tRNA ├── ecoli_tRNA_bpp ├── example.shape ├── gflags.py ├── linearpartition ├── src ├── LinearFoldEval.h ├── LinearPartition.cpp ├── LinearPartition.h ├── Utils │ ├── energy_parameter.h │ ├── feature_weight.h │ ├── intl11.h │ ├── intl21.h │ ├── intl22.h │ ├── utility.h │ └── utility_v.h ├── bpp.cpp └── scripts │ └── script_draw_bpp.py ├── testseq └── vis_examples ├── bpp_plot.pdf ├── bpp_plot.png ├── heatmap.pdf └── heatmap.png /.gitattributes: -------------------------------------------------------------------------------- 1 | *.cpp linguist-language=c++ 2 | *.h linguist-language=c++ -------------------------------------------------------------------------------- /AUTHORS: -------------------------------------------------------------------------------- 1 | Liang Huang (all parts) 2 | He Zhang (all parts) 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Disclaimer and Copyright 2 | 3 | The programs, library and source code of the LinearPartition Package are free 4 | software. They are distributed in the hope that they will be useful 5 | but WITHOUT ANY WARRANTY; without even the implied warranty of 6 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 7 | 8 | Permission is granted for research, educational, and commercial use 9 | and modification so long as 1) the package and any derived works are not 10 | redistributed for any fee, other than media costs, 2) proper credit is 11 | given to the authors. 12 | 13 | corresponding author: Liang Huang 14 | 15 | If you want to include this software in a commercial product, please contact 16 | the corresponding author. 
17 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | ################################ 2 | # Makefile 3 | # 4 | # author: He Zhang 5 | # edited by: 03/2019 6 | ################################ 7 | 8 | CC=g++ 9 | DEPS=src/bpp.cpp src/LinearPartition.h src/Utils/energy_parameter.h src/Utils/feature_weight.h src/Utils/intl11.h src/Utils/intl21.h src/Utils/intl22.h src/Utils/utility_v.h src/Utils/utility.h 10 | CFLAGS=-std=c++11 -O3 11 | .PHONY : clean linearpartition 12 | objects=bin/linearpartition_v bin/linearpartition_c 13 | 14 | linearpartition: src/LinearPartition.cpp $(DEPS) 15 | chmod +x linearpartition draw_bpp_plot draw_heatmap 16 | mkdir -p bin 17 | $(CC) src/LinearPartition.cpp $(CFLAGS) -Dlpv -o bin/linearpartition_v 18 | $(CC) src/LinearPartition.cpp $(CFLAGS) -o bin/linearpartition_c 19 | 20 | clean: 21 | -rm $(objects) -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # LinearPartition: Linear-Time Approximation of RNA Folding Partition Function and Base Pairing Probabilities 2 | 3 | This repository contains the C++ source code for the LinearPartition project, the first linear-time partition function and base pair probabilities calculation algorithm/software for RNA secondary structures. 4 | 5 | [LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities](https://academic.oup.com/bioinformatics/article/36/Supplement_1/i258/5870487). Bioinformatics, Volume 36, Issue Supplement_1, July 2020, Pages i258–i267. 
ISMB 2020 6 | 7 | He Zhang, Liang Zhang, David Mathews, Liang Huang* 8 | 9 | \* corresponding author 10 | 11 | Web server: http://linearfold.org/partition 12 | 13 | 14 | ## Dependencies 15 | gcc 4.8.5 or above; 16 | python2.7 17 | LaTeX for drawing circular plot 18 | numpy, pandas, seaborn and matplotlib for drawing heatmap plot 19 | 20 | ## To Compile 21 | ``` 22 | make 23 | ``` 24 | 25 | ## To Run 26 | LinearPartition can be run with: 27 | ``` 28 | echo SEQUENCE | ./linearpartition [OPTIONS] 29 | 30 | OR 31 | 32 | cat SEQ_OR_FASTA_FILE | ./linearpartition [OPTIONS] 33 | ``` 34 | Both FASTA format and pure-sequence format are supported for input. 35 | 36 | OPTIONS: 37 | ``` 38 | --beamsize BEAM_SIZE or -b BEAM_SIZE 39 | ``` 40 | The beam size (default 100). Use 0 for infinite beam. 41 | ``` 42 | --Vienna or -V 43 | ``` 44 | Switches LinearPartition-C (by default) to LinearPartition-V. 45 | ``` 46 | --fasta 47 | ``` 48 | Specifies that the input is in FASTA format. (default False) 49 | ``` 50 | --verbose 51 | ``` 52 | Prints out beamsize, Log Partition Coefficient (default mode) or free energy of ensemble (-V mode) and runtime information. (default False) 53 | ``` 54 | --sharpturn 55 | ``` 56 | Enables sharp turns in prediction. (default False) 57 | ``` 58 | --output FILE_NAME or -o FILE_NAME 59 | ``` 60 | Outputs base pairing probability matrix to a file with user specified name. (default False) 61 | ``` 62 | --rewrite FILE_NAME or -r FILE_NAME 63 | ``` 64 | Outputs base pairing probability matrix to a file with user specified name (overwrite if the file exists). (default False) 65 | ``` 66 | --prefix PREFIX_NAME 67 | ``` 68 | Outputs base pairing probability matrices to files with user specified prefix. (default False) 69 | ``` 70 | --part or -p 71 | ``` 72 | Partition function calculation only. (default False) 73 | ``` 74 | --cutoff CUTOFF or -c CUTOFF 75 | ``` 76 | Only output base pair probability larger than user specified threshold (CUTOFF) between 0 and 1. 
(DEFAULT=0.0) 77 | ``` 78 | 79 | --dumpforest or -f 80 | ``` 81 | dump forest (all nodes with inside [and outside] log partition functions but no hyperedges) for downstream tasks such as sampling and accessibility (DEFAULT=None) 82 | 83 | ``` 84 | --mea or -M 85 | ``` 86 | get MEA structure, (DEFAULT=FALSE) 87 | 88 | ``` 89 | --gamma GAMMA or -g GAMMA 90 | ``` 91 | set MEA gamma, (DEFAULT=3.0) 92 | 93 | ``` 94 | --bpseq 95 | ``` 96 | output MEA structure(s) in bpseq format instead of dot-bracket format 97 | 98 | ``` 99 | --mea_prefix 100 | ``` 101 | output MEA structure(s) to file(s) with user specified prefix name 102 | 103 | ``` 104 | --threshknot or -T 105 | ``` 106 | get ThreshKnot structure, (DEFAULT=FALSE) 107 | 108 | ``` 109 | --threshold 110 | ``` 111 | set ThreshKnot threshold, (DEFAULT=0.3) 112 | 113 | ``` 114 | --threshknot_prefix 115 | ``` 116 | output ThreshKnot structure(s) to file(s) with user specified prefix name (default False) 117 | 118 | ``` 119 | --shape FILE_NAME 120 | ``` 121 | use SHAPE reactivity data (for -V mode only) 122 | Please refer to this link for the SHAPE data format: 123 | https://rna.urmc.rochester.edu/Text/File_Formats.html#SHAPE 124 | 125 | ``` 126 | --evaly y 127 | ``` 128 | prints p(y | x) and -kT log Q(x), e.g., 129 | ``` 130 | $ echo -ne "CCCAAAGGG" | ./linearpartition -V --evaly "(((...)))" 131 | CCCAAAGGG 132 | Free Energy of Ensemble: -1.41344 kcal/mol 133 | x= CCCAAAGGG y= (((...))) DeltaG(x,y)= -1.20 -kTlogQ(x)= -1.41344 p(y|x)= 0.70729 134 | ``` 135 | Note that this mode can be used in batch mode where you evaluate `p(y|x)` for many `x` sequences and a particular `y` structure. 136 | 137 | 138 | ## To Visualize 139 | LinearPartition provides two ways to visualize base pairing probabilities: a circular plot and a heatmap plot. 140 | 141 | In a circular plot, the darkness of each arc represents the probability of each base pair (see an example below). 
142 | To draw a circular plot, run the command: 143 | ``` 144 | cat TARGET_FILE | ./draw_bpp_plot BASE_PAIRING_PROBABILITY_FILE 145 | ``` 146 | TARGET_FILE contains one sequence and its structure; see the "ecoli_tRNA" file as an example. 147 | BASE_PAIRING_PROBABILITY_FILE can be a probability file generated by LinearPartition, or a file with the same format; see "ecoli_tRNA_bpp" as an example. 148 | 149 | To draw a heatmap plot, run the command: 150 | ``` 151 | cat BASE_PAIRING_PROBABILITY_FILE | ./draw_heatmap SEQUENCE_LENGTH 152 | ``` 153 | SEQUENCE_LENGTH is the length of the sequence. 154 | 155 | ## Example: Run Prediction 156 | ``` 157 | cat testseq | ./linearpartition -V --prefix testseq_output 158 | Free Energy of Ensemble: -1.96 kcal/mol 159 | Outputing base pairing probability matrix to testseq_output_1... 160 | Done! 161 | Free Energy of Ensemble: -9.41 kcal/mol 162 | Outputing base pairing probability matrix to testseq_output_2... 163 | Done! 164 | Free Energy of Ensemble: -7.72 kcal/mol 165 | Outputing base pairing probability matrix to testseq_output_3... 166 | Done! 167 | Free Energy of Ensemble: -9.09 kcal/mol 168 | Outputing base pairing probability matrix to testseq_output_4... 169 | Done! 170 | Free Energy of Ensemble: -13.58 kcal/mol 171 | Outputing base pairing probability matrix to testseq_output_5... 172 | Done! 173 | 174 | echo GGGCUCGUAGAUCAGCGGUAGAUCGCUUCCUUCGCAAGGAAGCCCUGGGUUCAAAUCCCAGCGAGUCCACCA | ./linearpartition -o output 175 | Log Partition Coefficient: 15.88268 176 | Outputing base pairing probability matrix to output... 177 | Done! 178 | ``` 179 | 180 | ## Example: Run Partition Function Calculation Only 181 | ``` 182 | echo GGGCUCGUAGAUCAGCGGUAGAUCGCUUCCUUCGCAAGGAAGCCCUGGGUUCAAAUCCCAGCGAGUCCACCA | ./linearpartition -V -p --verbose 183 | beam size: 100 184 | Free Energy of Ensemble: -32.14 kcal/mol 185 | Partition Function Calculation Time: 0.01 seconds. 
186 | ``` 187 | 188 | ## Example: Run Prediction and Output MEA structure 189 | ``` 190 | echo GGGCUCGUAGAUCAGCGGUAGAUCGCUUCCUUCGCAAGGAAGCCCUGGGUUCAAAUCCCAGCGAGUCCACCA | ./linearpartition -V -M 191 | Free Energy of Ensemble: -32.14 kcal/mol 192 | GGGCUCGUAGAUCAGCGGUAGAUCGCUUCCUUCGCAAGGAAGCCCUGGGUUCAAAUCCCAGCGAGUCCACCA 193 | (((((((..((((.......))))((((((((...)))))))).(((((.......)))))))))))).... 194 | ``` 195 | 196 | ## Example: Run Prediction and Output ThreshKnot structure in bpseq format 197 | ``` 198 | echo GUUGUUAUAGCAUAAGAAGUGCAUUUGUUUUAAGCGUAAAAGAUAUGGGACAACUCCA | ./linearpartition -V -T --threshold 0 199 | Free Energy of Ensemble: -8.74 kcal/mol 200 | GUUGUUAUAGCAUAAGAAGUGCAUUUGUUUUAAGCGUAAAAGAUAUGGGACAACUCCA 201 | 1 G 54 202 | 2 U 53 203 | 3 U 52 204 | 4 G 51 205 | 5 U 50 206 | 6 U 49 207 | 7 A 0 208 | 8 U 0 209 | 9 A 0 210 | 10 G 22 211 | 11 C 21 212 | 12 A 20 213 | 13 U 19 214 | 14 A 0 215 | 15 A 0 216 | 16 G 0 217 | 17 A 0 218 | 18 A 0 219 | 19 G 13 220 | 20 U 12 221 | 21 G 11 222 | 22 C 10 223 | 23 A 0 224 | 24 U 34 225 | 25 U 33 226 | 26 U 45 227 | 27 G 44 228 | 28 U 43 229 | 29 U 42 230 | 30 U 41 231 | 31 U 40 232 | 32 A 37 233 | 33 A 25 234 | 34 G 24 235 | 35 C 47 236 | 36 G 46 237 | 37 U 32 238 | 38 A 0 239 | 39 A 0 240 | 40 A 31 241 | 41 A 30 242 | 42 G 29 243 | 43 A 28 244 | 44 U 27 245 | 45 A 26 246 | 46 U 36 247 | 47 G 35 248 | 48 G 0 249 | 49 G 6 250 | 50 A 5 251 | 51 C 4 252 | 52 A 3 253 | 53 A 2 254 | 54 C 1 255 | 55 U 0 256 | 56 C 0 257 | 57 C 0 258 | 58 A 0 259 | ``` 260 | 261 | ## Example: Run LinearPartition with SHAPE data 262 | ``` 263 | echo GCCUGGUGACCAUAGCGAGUCGGUACCACCCCUUCCCAUCCCGAACAGGACCGUGAAACGACUCCGCGCCGAUGAUAGUGCGGAUUCCCGUGUGAAAGUAGGUCAUCGCCAGGC | ./linearpartition -V --shape example.shape 264 | Free Energy of Ensemble: -67.82 kcal/mol 265 | ``` 266 | 267 | 268 | ## Example: Draw Circular Plot 269 | ``` 270 | cat ecoli_tRNA | ./draw_bpp_plot ecoli_tRNA_bpp 271 | ``` 272 | 273 | 274 | ## Example: Draw Heatmap Plot 275 | ``` 276 
| cat ecoli_tRNA_bpp | ./draw_heatmap 76 277 | ``` 278 | 279 | 280 | References 281 | ------------- 282 | 283 | Liang Zhang, He Zhang, David H Mathews, and Liang Huang\*. Threshknot: Thresholded probknot for improved RNA secondary structure prediction. arXiv preprint arXiv:1912.12796. 284 | 285 | \* corresponding author -------------------------------------------------------------------------------- /draw_bpp_plot: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python2.7 2 | import os, sys, subprocess 3 | from collections import defaultdict 4 | import math 5 | import sys 6 | 7 | bpp_file = sys.argv[1] 8 | 9 | i = 0 10 | seq, res = "", "" 11 | for line in sys.stdin.readlines(): 12 | if line.startswith(">"): continue 13 | if i == 0: 14 | seq = line.strip() 15 | i += 1 16 | elif i == 1: 17 | res = line.strip() 18 | i += 1 19 | else: break 20 | 21 | print seq 22 | print res 23 | if len(seq) != len(res): 24 | print "sequence and structure lengths are not the same!" 25 | sys.exit() 26 | 27 | 28 | ref = "." * len(res) 29 | 30 | print "Processing seq..." 
31 | 32 | os.system("echo \"%s %s %s\" | ./src/scripts/script_draw_bpp.py %s > %s.tex" % \ 33 | (seq, res, ref, bpp_file, "bpp_plot")) 34 | 35 | os.system("pdflatex bpp_plot.tex") -------------------------------------------------------------------------------- /draw_heatmap: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import numpy as np 4 | import pandas as pd 5 | import seaborn as sns 6 | import matplotlib.pyplot as plt 7 | import sys 8 | 9 | # bpp_file = sys.argv[1] 10 | seq_length = int(sys.argv[1]) 11 | 12 | matrix = [[0.0 for x in range(seq_length)] for y in range(seq_length)] 13 | 14 | # data processing 15 | # for line in open(bpp_file): 16 | for line in sys.stdin.readlines(): 17 | line = line.strip() 18 | if line == "": break 19 | i, j, prob = line.split() 20 | matrix[int(j)-1][int(i)-1] = float(prob) 21 | 22 | sns.set(style="white") 23 | 24 | matrix_ticks = pd.DataFrame(data=matrix, 25 | columns=range(1,seq_length+1), 26 | index=range(1,seq_length+1)) 27 | 28 | # Generate a mask for the upper triangle (np.bool was removed in NumPy 1.24; use the builtin bool) 29 | mask = np.triu(np.ones_like(matrix, dtype=bool)) 30 | 31 | # Set up the matplotlib figure 32 | f, ax = plt.subplots(figsize=(10, 10)) 33 | 34 | # Generate a custom diverging colormap 35 | cmap = sns.diverging_palette(220, 10, as_cmap=True) 36 | # cmap = sns.diverging_palette(220, 10) 37 | 38 | sns.heatmap(matrix_ticks, mask=mask, cmap=cmap, vmax=1., center=0, 39 | square=True, linewidths=.5, cbar_kws={"shrink": .5}, 40 | xticklabels="auto", yticklabels="auto") 41 | # plt.savefig(bpp_file+"_heatmap", format="pdf") 42 | plt.savefig("heatmap", format="pdf") 43 | 44 | 45 | -------------------------------------------------------------------------------- /ecoli_tRNA: -------------------------------------------------------------------------------- 1 | > 2 | GCGGGAAUAGCUCAGUUGGUAGAGCACGACCUUGCCAAGGUCGGGGUCGCGAGUUCGAGUCUCGUUUCCCGCUCCA 3 | 
(((((((..((((........)))).(((((.......))))).....(((((.......)))))))))))).... 4 | -------------------------------------------------------------------------------- /ecoli_tRNA_bpp: -------------------------------------------------------------------------------- 1 | 1 72 9.9434e-01 2 | 1 75 1.6668e-04 3 | 2 71 9.9780e-01 4 | 3 69 7.4465e-05 5 | 3 70 9.9778e-01 6 | 3 74 1.8160e-04 7 | 3 75 4.3606e-04 8 | 4 68 1.5118e-04 9 | 4 69 9.9773e-01 10 | 4 73 1.8213e-04 11 | 4 74 4.3294e-04 12 | 4 75 1.2602e-03 13 | 5 67 1.8357e-04 14 | 5 68 9.9754e-01 15 | 5 72 2.0723e-04 16 | 5 73 2.6520e-04 17 | 5 74 1.2662e-03 18 | 6 66 6.5654e-03 19 | 6 67 9.6748e-01 20 | 6 73 1.1456e-03 21 | 7 65 1.2150e-02 22 | 7 66 7.3970e-01 23 | 7 67 3.2235e-04 24 | 8 18 3.9230e-04 25 | 8 21 1.8437e-03 26 | 8 26 5.9377e-03 27 | 8 64 1.5946e-02 28 | 9 17 5.2056e-04 29 | 9 20 2.0319e-03 30 | 9 62 3.5227e-02 31 | 9 65 5.0072e-03 32 | 9 66 5.4785e-04 33 | 10 16 5.6498e-04 34 | 10 20 1.4671e-03 35 | 10 25 9.2382e-01 36 | 10 47 5.5794e-05 37 | 10 54 4.5545e-05 38 | 10 60 2.6842e-02 39 | 10 61 3.6403e-02 40 | 10 63 4.2200e-03 41 | 10 65 2.2279e-04 42 | 11 15 5.6095e-04 43 | 11 18 2.1528e-03 44 | 11 19 1.5428e-03 45 | 11 24 9.2594e-01 46 | 11 46 8.1882e-05 47 | 11 53 5.8837e-05 48 | 11 59 6.7790e-02 49 | 11 64 2.2482e-04 50 | 12 18 1.1785e-03 51 | 12 19 1.1318e-04 52 | 12 23 9.2565e-01 53 | 12 45 8.1241e-05 54 | 12 52 5.8584e-05 55 | 12 58 6.7797e-02 56 | 13 18 3.3537e-04 57 | 13 22 9.2534e-01 58 | 13 44 7.9196e-05 59 | 13 51 5.7328e-05 60 | 13 57 6.7652e-02 61 | 14 20 2.1549e-02 62 | 14 54 1.9869e-03 63 | 14 55 4.3316e-02 64 | 14 60 1.7808e-04 65 | 14 62 2.3542e-04 66 | 15 20 2.3099e-03 67 | 15 25 4.1548e-04 68 | 15 54 3.9536e-02 69 | 15 55 1.5709e-03 70 | 15 56 1.9412e-03 71 | 15 61 2.4346e-04 72 | 16 21 1.4353e-03 73 | 16 24 4.1477e-04 74 | 16 52 3.3550e-03 75 | 16 53 4.6749e-02 76 | 16 58 2.8808e-04 77 | 17 21 1.4082e-04 78 | 17 23 4.0349e-04 79 | 17 26 6.9410e-05 80 | 17 51 3.2951e-03 81 | 17 52 
5.4780e-02 82 | 17 53 5.9619e-04 83 | 17 57 2.9238e-04 84 | 17 58 1.6998e-05 85 | 18 25 1.1703e-04 86 | 18 50 2.8718e-03 87 | 18 56 3.0943e-04 88 | 19 50 5.8938e-02 89 | 19 55 2.7318e-04 90 | 19 56 8.8610e-05 91 | 20 49 5.1885e-02 92 | 21 55 2.2843e-05 93 | 22 47 7.5176e-05 94 | 22 48 6.8993e-02 95 | 22 54 9.5211e-05 96 | 22 56 2.6745e-03 97 | 23 47 6.8748e-02 98 | 23 55 2.6823e-03 99 | 23 73 1.9744e-05 100 | 24 47 3.4730e-04 101 | 24 50 9.7681e-05 102 | 24 54 2.6810e-03 103 | 24 72 2.0724e-05 104 | 25 44 6.1051e-04 105 | 25 45 6.4111e-02 106 | 25 46 5.6528e-04 107 | 25 49 6.7014e-05 108 | 25 51 3.2293e-04 109 | 25 53 2.6655e-03 110 | 25 71 2.0169e-05 111 | 26 47 6.9667e-05 112 | 27 43 2.0946e-01 113 | 27 44 3.2483e-04 114 | 27 49 5.3843e-01 115 | 28 42 2.1479e-01 116 | 28 48 7.8442e-01 117 | 29 41 2.1478e-01 118 | 29 47 7.8483e-01 119 | 30 40 2.1482e-01 120 | 30 45 3.7594e-04 121 | 30 46 7.8472e-01 122 | 31 39 2.1479e-01 123 | 31 44 7.6494e-04 124 | 31 45 7.8435e-01 125 | 32 37 1.5840e-04 126 | 32 38 2.0844e-01 127 | 32 43 1.1177e-03 128 | 32 44 7.8077e-01 129 | 33 37 1.4441e-01 130 | 33 38 1.3835e-03 131 | 33 43 7.6791e-01 132 | 34 41 2.0629e-02 133 | 34 42 7.5988e-01 134 | 35 39 4.9338e-04 135 | 35 40 7.2373e-01 136 | 35 44 9.0636e-04 137 | 36 40 2.0004e-02 138 | 36 43 9.6971e-04 139 | 37 41 1.3677e-04 140 | 43 48 1.4646e-04 141 | 43 62 4.8404e-04 142 | 43 63 7.4174e-04 143 | 44 48 9.6662e-04 144 | 44 50 1.0481e-04 145 | 44 61 4.9682e-03 146 | 44 62 1.1047e-03 147 | 44 63 2.7970e-02 148 | 44 70 2.1040e-03 149 | 45 50 1.2468e-04 150 | 45 60 4.9609e-03 151 | 45 61 1.2183e-03 152 | 45 62 2.8118e-02 153 | 45 63 5.9910e-05 154 | 45 69 2.1153e-03 155 | 46 60 1.0396e-03 156 | 46 61 2.8170e-02 157 | 46 66 1.1914e-04 158 | 46 67 4.9006e-05 159 | 46 68 2.1226e-03 160 | 47 58 6.3944e-03 161 | 48 57 6.4726e-03 162 | 48 59 2.4863e-02 163 | 48 64 1.5076e-04 164 | 49 56 5.9609e-03 165 | 49 60 2.3546e-04 166 | 49 63 1.6601e-04 167 | 49 65 3.3061e-01 168 | 49 66 1.8422e-04 169 | 
50 57 2.8184e-02 170 | 50 59 2.9539e-04 171 | 50 64 8.7147e-01 172 | 51 56 2.7625e-02 173 | 51 61 5.6578e-04 174 | 51 63 8.9364e-01 175 | 52 60 5.6024e-04 176 | 52 62 8.9326e-01 177 | 53 61 8.9094e-01 178 | 54 58 5.5866e-04 179 | 54 59 2.0675e-02 180 | 55 59 6.6319e-03 181 | 56 64 1.1428e-03 182 | 57 63 1.2483e-03 183 | 58 62 9.2624e-04 184 | 185 | -------------------------------------------------------------------------------- /example.shape: -------------------------------------------------------------------------------- 1 | 1 NA 2 | 2 NA 3 | 3 NA 4 | 4 NA 5 | 5 NA 6 | 6 NA 7 | 7 NA 8 | 8 NA 9 | 9 2.0303 10 | 10 3.4545 11 | 11 3.9093 12 | 12 3.3998 13 | 13 2.4215 14 | 14 1.5185 15 | 15 0.26599 16 | 16 0.044856 17 | 17 0 18 | 18 0.24481 19 | 19 0.26915 20 | 20 0.39516 21 | 21 0.42047 22 | 22 0.29272 23 | 23 0.1189 24 | 24 0.10819 25 | 25 0.06756 26 | 26 0.1099 27 | 27 1.1082 28 | 28 0.51335 29 | 29 0.8531 30 | 30 1.2355 31 | 31 1.5989 32 | 32 1.234 33 | 33 1.014 34 | 34 1.1945 35 | 35 0.098511 36 | 36 0.11049 37 | 37 0.061875 38 | 38 0.10532 39 | 39 0.096157 40 | 40 0.10929 41 | 41 0.042156 42 | 42 0.16029 43 | 43 0.32049 44 | 44 1.1684 45 | 45 1.4619 46 | 46 0.040105 47 | 47 0.0025812 48 | 48 0.034826 49 | 49 0.057291 50 | 50 0.039333 51 | 51 0 52 | 52 0.028677 53 | 53 0.21812 54 | 54 0.022805 55 | 55 0.00021709 56 | 56 0.013873 57 | 57 0.013754 58 | 58 0 59 | 59 0 60 | 60 0 61 | 61 0.25117 62 | 62 0.43657 63 | 63 0.43743 64 | 64 0.50744 65 | 65 0.085134 66 | 66 0.019398 67 | 67 0.023139 68 | 68 0.00074496 69 | 69 0 70 | 70 0 71 | 71 0.39526 72 | 72 0.059399 73 | 73 0.033083 74 | 74 0.01536 75 | 75 0.0031007 76 | 76 0.013488 77 | 77 0.010088 78 | 78 0 79 | 79 0 80 | 80 0 81 | 81 0 82 | 82 0.0076068 83 | 83 0.061887 84 | 84 0.031508 85 | 85 0 86 | 86 0 87 | 87 0.023431 88 | 88 0.045501 89 | 89 0.093281 90 | 90 0.27861 91 | 91 0.3089 92 | 92 2.4142 93 | 93 0.24535 94 | 94 0.18949 95 | 95 0.033237 96 | 96 0 97 | 97 0 98 | 98 0.020459 99 | 99 0.24343 100 | 100 1.5 
101 | 101 0.14023 102 | 102 0.40887 103 | 103 0.022548 104 | 104 0.018817 105 | 105 0.013818 106 | 106 0.0041629 107 | 107 0.0095489 108 | 108 0.013491 109 | 109 0 110 | 110 0 111 | 111 0.018041 112 | 112 0.013599 113 | 113 0 114 | 114 0.0035904 115 | -------------------------------------------------------------------------------- /linearpartition: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | import gflags as flags 4 | import subprocess 5 | import sys 6 | import os 7 | 8 | FLAGS = flags.FLAGS 9 | 10 | def setgflags(): 11 | flags.DEFINE_integer('beamsize', 100, "set beam size", short_name='b') 12 | flags.DEFINE_boolean('Vienna', False, "use vienna parameters", short_name='V') 13 | flags.DEFINE_boolean('sharpturn', False, "enable sharp turn in prediction") 14 | flags.DEFINE_boolean('verbose', False, "print out beamsize, Log Partition Coefficient (default mode) or free energy of ensemble (-V mode) and runtime information") 15 | flags.DEFINE_string('output', '', "output base pairing probability matrix to a file with user specified name", short_name="o") # output mode 16 | flags.DEFINE_string('prefix', '', "output base pairing probability matrices to file(s) with user specified prefix name") # prefix of file name 17 | flags.DEFINE_boolean('part', False, "only do partition function calculation", short_name='p') # adding partition function mode 18 | flags.DEFINE_string('rewrite', '', "output base pairing probability matrix to a file with user specified name (rewrite if the file exists)", short_name='r') # output (rewrite) mode 19 | flags.DEFINE_float('cutoff', None, "only output base pair probability bigger than user specified threshold between 0 and 1", short_name='c') # bpp cutoff 20 | flags.DEFINE_string("dumpforest", "", "dump forest (all nodes with inside [and outside] log partition functions but no hyperedges) for downstream tasks such as sampling and accessibility (DEFAULT=None)", 
short_name="f") # output (rewrite) mode 21 | flags.DEFINE_boolean('mea', False, "get MEA structure", short_name='M') 22 | flags.DEFINE_float('gamma', 3.0, "set MEA gamma", short_name='g') 23 | flags.DEFINE_string('mea_prefix', '', "output MEA structure(s) to file(s) with user specified prefix name") # output (rewrite) mode 24 | flags.DEFINE_boolean('bpseq', False, "output MEA structure(s) in bpseq format instead of dot-bracket format") # output (rewrite) mode 25 | flags.DEFINE_boolean('threshknot', False, "get ThreshKnot structure", short_name='T') 26 | flags.DEFINE_float('threshold', 0.3, "set ThreshKnot threshold") 27 | flags.DEFINE_string('threshknot_prefix', '', "output ThreshKnot structure(s) to file(s) in bpseq format with user specified prefix name") # prefix of file name 28 | flags.DEFINE_string('shape', '', "import SHAPE data for SHAPE guided LinearPartition (DEFAULT: not use SHAPE data)") # SHAPE 29 | flags.DEFINE_boolean('fasta', False, "input is in fasta format") # FASTA format 30 | flags.DEFINE_integer('dangles', 2, "the way to treat `dangling end' energies for bases adjacent to helices in free ends and multi-loops (only supporting `0' or `2', default=`2')", short_name="d") 31 | flags.DEFINE_string('evaly', "", "batch eval all sequences against structure for p(y|x)", short_name='y') # p(y|x) 32 | argv = FLAGS(sys.argv) 33 | 34 | def main(): 35 | use_vienna = FLAGS.V 36 | beamsize = str(FLAGS.b) 37 | is_sharpturn = '1' if FLAGS.sharpturn else '0' 38 | is_verbose = '1' if FLAGS.verbose else '0' 39 | bpp_file = str(FLAGS.o) 40 | bpp_prefix = str(FLAGS.prefix) + "_" if FLAGS.prefix else '' 41 | pf_only = '1' if (FLAGS.p and not (FLAGS.mea or FLAGS.threshknot)) else '0' 42 | bpp_cutoff = str(FLAGS.c) 43 | forest_file = str(FLAGS.dumpforest) 44 | mea = '1' if FLAGS.mea else '0' 45 | gamma = str(FLAGS.g) 46 | MEA_bpseq = '1' if FLAGS.bpseq else '0' 47 | MEA_prefix = str(FLAGS.mea_prefix) + "_" if FLAGS.mea_prefix else '' 48 | TK = '1' if FLAGS.threshknot else 
'0' 49 | threshold = str(FLAGS.threshold) 50 | ThreshKnot_prefix = str(FLAGS.threshknot_prefix) + "_" if FLAGS.threshknot_prefix else '' 51 | shape_file_path = str(FLAGS.shape) 52 | is_fasta = '1' if FLAGS.fasta else '0' 53 | dangles = str(FLAGS.dangles) 54 | evaly = FLAGS.evaly 55 | 56 | if FLAGS.p and (FLAGS.o or FLAGS.prefix): 57 | print("\nWARNING: -p mode has no output for base pairing probability matrix!\n"); 58 | 59 | if FLAGS.o and FLAGS.r: 60 | print("WARNING: choose either -o mode or -r mode!\n"); 61 | print("Exit!\n"); 62 | exit(); 63 | 64 | if (FLAGS.o or FLAGS.r) and FLAGS.prefix: 65 | print("WARNING: choose either -o/-r mode or --prefix mode!\n"); 66 | print("Exit!\n"); 67 | exit(); 68 | 69 | if FLAGS.o: 70 | if os.path.exists(bpp_file): 71 | print("WARNING: this file name has already been taken. Choose another name or use -r mode.\n"); 72 | print("Exit!\n"); 73 | exit(); 74 | 75 | if FLAGS.r: 76 | bpp_file = str(FLAGS.r) 77 | if os.path.exists(bpp_file): os.remove(bpp_file) 78 | 79 | if FLAGS.c: 80 | if float(bpp_cutoff) < 0.0 or float(bpp_cutoff) > 1.0: 81 | print("WARNING: base pair probability cutoff should be between 0.0 and 1.0\n"); 82 | print("Exit!\n"); 83 | exit(); 84 | 85 | if FLAGS.evaly: 86 | use_vienna = True # eval p(y|x) only in Vienna mode 87 | 88 | path = os.path.dirname(os.path.abspath(__file__)) 89 | cmd = ["%s/%s" % (path, ('bin/linearpartition_v' if use_vienna else 'bin/linearpartition_c')), beamsize, is_sharpturn, is_verbose, bpp_file, bpp_prefix, pf_only, bpp_cutoff, forest_file, mea, gamma, TK, threshold, ThreshKnot_prefix, MEA_prefix, MEA_bpseq, shape_file_path, is_fasta, dangles, evaly] 90 | subprocess.call(cmd, stdin=sys.stdin) 91 | 92 | if __name__ == '__main__': 93 | setgflags() 94 | main() 95 | 96 | -------------------------------------------------------------------------------- /src/LinearFoldEval.h: -------------------------------------------------------------------------------- 1 | /* 2 | *LinearFoldEval.cpp* 3 | 
Evaluate the energy of a given RNA structure. 4 | 5 | author: He Zhang 6 | edited by: 12/2018 7 | */ 8 | 9 | #ifndef LINEARFOLDEVAL_H 10 | #define LINEARFOLDEVAL_H 11 | 12 | #include 13 | #include 14 | #include 15 | 16 | //#include "LinearFold.h" // actually no need; should remove it in LF 17 | 18 | #include "Utils/utility_v.h" 19 | #include "Utils/utility.h" 20 | 21 | using namespace std; 22 | 23 | long eval(string seq, string ref, bool is_verbose, int dangle_model) { 24 | 25 | int seq_length = seq.length(); 26 | 27 | vector if_tetraloops; 28 | vector if_hexaloops; 29 | vector if_triloops; 30 | 31 | v_init_tetra_hex_tri(seq, seq_length, if_tetraloops, if_hexaloops, if_triloops); // calculate if_tetraloops, if_hexaloops, if_triloops 32 | 33 | vector eval_nucs; 34 | eval_nucs.clear(); 35 | eval_nucs.resize(seq_length); 36 | for (int i = 0; i < seq_length; ++i) { 37 | eval_nucs[i] = GET_ACGU_NUM(seq[i]); 38 | } 39 | 40 | long total_energy = 0; 41 | long external_energy = 0; 42 | long M1_energy[seq_length]; 43 | long multi_number_unpaired[seq_length]; 44 | // int external_number_unpaired = 0; 45 | 46 | stack> stk; // tuple of (index, page) 47 | tuple inner_loop; 48 | 49 | for (int j=0; j top = stk.top(); 68 | int i = get<0>(top), page = get<1>(top); 69 | stk.pop(); 70 | 71 | int nuci = eval_nucs[i]; 72 | int nucj = eval_nucs[j]; 73 | int nuci1 = (i + 1) < seq_length ? eval_nucs[i + 1] : -1; 74 | int nucj_1 = (j - 1) > -1 ? eval_nucs[j - 1] : -1; 75 | int nuci_1 = (i-1>-1) ? eval_nucs[i-1] : -1; // only for calculating v_score_M1 76 | int nucj1 = (j+1) < seq_length ? 
eval_nucs[j+1] : -1; // only for calculating v_score_M1 77 | 78 | if (page == 0) { // hairpin 79 | int tetra_hex_tri = -1; 80 | if (j-i-1 == 4) // 6:tetra 81 | tetra_hex_tri = if_tetraloops[i]; 82 | else if (j-i-1 == 6) // 8:hexa 83 | tetra_hex_tri = if_hexaloops[i]; 84 | else if (j-i-1 == 3) // 5:tri 85 | tetra_hex_tri = if_triloops[i]; 86 | 87 | int newscore = - v_score_hairpin(i, j, nuci, nuci1, nucj_1, nucj, tetra_hex_tri); 88 | if (is_verbose) 89 | printf("Hairpin loop ( %d, %d) %c%c : %.2f\n", i+1, j+1, seq[i], seq[j], newscore / -100.0); 90 | total_energy += newscore; 91 | } 92 | 93 | else if (page == 1) { //single 94 | int p = get<0>(inner_loop), q = get<1>(inner_loop); 95 | 96 | int nucp_1 = eval_nucs[p-1], nucp = eval_nucs[p], nucq = eval_nucs[q], nucq1 = eval_nucs[q+1]; 97 | 98 | int newscore = - v_score_single(i,j,p,q, nuci, nuci1, nucj_1, nucj, 99 | nucp_1, nucp, nucq, nucq1); 100 | if (is_verbose) 101 | printf("Interior loop ( %d, %d) %c%c; ( %d, %d) %c%c : %.2f\n", i+1, j+1, seq[i], seq[j], p+1, q+1, seq[p],seq[q], newscore / -100.0); 102 | total_energy += newscore; 103 | } 104 | 105 | else { //multi 106 | int multi_score = 0; 107 | multi_score += M1_energy[i]; 108 | multi_score += - v_score_multi(i, j, nuci, nuci1, nucj_1, nucj, seq_length, dangle_model); 109 | multi_score += - v_score_multi_unpaired(i+1, i + multi_number_unpaired[i]); // current model is 0 110 | if (is_verbose) 111 | printf("Multi loop ( %d, %d) %c%c : %.2f\n", i+1, j+1, seq[i], seq[j], multi_score / -100.0); 112 | total_energy += multi_score; 113 | } 114 | 115 | //update inner_loop 116 | inner_loop = make_tuple(i, j); 117 | 118 | // possible M 119 | if (!stk.empty()) 120 | M1_energy[stk.top().first] += - v_score_M1(i, j, j, nuci_1, nuci, nucj, nucj1, seq_length, dangle_model); 121 | 122 | // check if adding external energy 123 | if (stk.empty()) { 124 | int k = i - 1; 125 | int nuck = k > -1 ? 
eval_nucs[k] : -1; 126 | int nuck1 = eval_nucs[k+1]; 127 | external_energy += - v_score_external_paired(k+1, j, nuck, nuck1, 128 | nucj, nucj1, seq_length, dangle_model); 129 | // external_energy += 0; currently external unpaired is 0 130 | } 131 | } 132 | } 133 | 134 | if (is_verbose) 135 | printf("External loop : %.2f\n", external_energy / -100.0); 136 | total_energy += external_energy; 137 | return total_energy; 138 | } 139 | 140 | #endif // LINEARFOLDEVAL_H 141 | -------------------------------------------------------------------------------- /src/LinearPartition.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | *LinearPartition.cpp* 3 | The main code for LinearPartition: Linear-Time Approximation of 4 | RNA Folding Partition Function 5 | and Base Pairing Probabilities 6 | 7 | author: He Zhang 8 | created by: 03/2019 9 | */ 10 | 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | #include 18 | #include 19 | #include 20 | #include 21 | #include 22 | 23 | #include "LinearPartition.h" 24 | #include "Utils/utility.h" 25 | #include "Utils/utility_v.h" 26 | #include "bpp.cpp" 27 | 28 | #include "LinearFoldEval.h" // for p(y|x) 29 | 30 | #define SPECIAL_HP 31 | 32 | using namespace std; 33 | 34 | unsigned long quickselect_partition(vector>& scores, unsigned long lower, unsigned long upper) { 35 | pf_type pivot = scores[upper].first; 36 | while (lower < upper) { 37 | while (scores[lower].first < pivot) ++lower; 38 | while (scores[upper].first > pivot) --upper; 39 | if (scores[lower].first == scores[upper].first) ++lower; 40 | else if (lower < upper) swap(scores[lower], scores[upper]); 41 | } 42 | return upper; 43 | } 44 | 45 | // in-place quick-select 46 | pf_type quickselect(vector>& scores, unsigned long lower, unsigned long upper, unsigned long k) { 47 | if ( lower == upper ) return scores[lower].first; 48 | unsigned long split = quickselect_partition(scores, lower, upper); 49 | 
unsigned long length = split - lower + 1; 50 | if (length == k) return scores[split].first; 51 | else if (k < length) return quickselect(scores, lower, split-1, k); 52 | else return quickselect(scores, split+1, upper, k - length); 53 | } 54 | 55 | 56 | pf_type BeamCKYParser::beam_prune(std::unordered_map &beamstep) { 57 | scores.clear(); 58 | for (auto &item : beamstep) { 59 | int i = item.first; 60 | State &cand = item.second; 61 | int k = i - 1; 62 | pf_type newalpha = (k >= 0 ? bestC[k].alpha : pf_type(0.0)) + cand.alpha; 63 | scores.push_back(make_pair(newalpha, i)); 64 | } 65 | if (scores.size() <= beam) return VALUE_MIN; 66 | pf_type threshold = quickselect(scores, 0, scores.size() - 1, scores.size() - beam); 67 | for (auto &p : scores) { 68 | if (p.first < threshold) beamstep.erase(p.second); 69 | } 70 | 71 | return threshold; 72 | } 73 | 74 | void BeamCKYParser::prepare(unsigned len) { 75 | seq_length = len; 76 | 77 | nucs = new int[seq_length]; 78 | bestC = new State[seq_length]; 79 | bestH = new unordered_map[seq_length]; 80 | bestP = new unordered_map[seq_length]; 81 | bestM = new unordered_map[seq_length]; 82 | bestM2 = new unordered_map[seq_length]; 83 | bestMulti = new unordered_map[seq_length]; 84 | 85 | scores.reserve(seq_length); 86 | } 87 | 88 | void BeamCKYParser::postprocess() { 89 | 90 | delete[] bestC; 91 | delete[] bestH; 92 | delete[] bestP; 93 | delete[] bestM; 94 | delete[] bestM2; 95 | delete[] bestMulti; 96 | 97 | delete[] nucs; 98 | } 99 | 100 | double BeamCKYParser::parse(string& seq) { 101 | 102 | struct timeval parse_starttime, parse_endtime; 103 | 104 | gettimeofday(&parse_starttime, NULL); 105 | 106 | prepare(static_cast(seq.length())); 107 | 108 | for (int i = 0; i < seq_length; ++i) 109 | nucs[i] = GET_ACGU_NUM(seq[i]); 110 | 111 | vector next_pair[NOTON]; 112 | { 113 | for (int nuci = 0; nuci < NOTON; ++nuci) { 114 | // next_pair 115 | next_pair[nuci].resize(seq_length, -1); 116 | int next = -1; 117 | for (int j = seq_length-1; 
j >=0; --j) { 118 | next_pair[nuci][j] = next; 119 | if (_allowed_pairs[nuci][nucs[j]]) next = j; 120 | } 121 | } 122 | } 123 | 124 | #ifdef SPECIAL_HP 125 | #ifdef lpv 126 | v_init_tetra_hex_tri(seq, seq_length, if_tetraloops, if_hexaloops, if_triloops); 127 | #endif 128 | #endif 129 | 130 | #ifdef lpv 131 | if(seq_length > 0) bestC[0].alpha = 0.0; 132 | if(seq_length > 1) bestC[1].alpha = 0.0; 133 | #else 134 | if(seq_length > 0) Fast_LogPlusEquals(bestC[0].alpha, score_external_unpaired(0, 0)); 135 | if(seq_length > 1) Fast_LogPlusEquals(bestC[1].alpha, score_external_unpaired(0, 1)); 136 | #endif 137 | 138 | value_type newscore; 139 | for(int j = 0; j < seq_length; ++j) { 140 | int nucj = nucs[j]; 141 | int nucj1 = (j+1) < seq_length ? nucs[j+1] : -1; 142 | 143 | unordered_map<int, State>& beamstepH = bestH[j]; 144 | unordered_map<int, State>& beamstepMulti = bestMulti[j]; 145 | unordered_map<int, State>& beamstepP = bestP[j]; 146 | unordered_map<int, State>& beamstepM2 = bestM2[j]; 147 | unordered_map<int, State>& beamstepM = bestM[j]; 148 | State& beamstepC = bestC[j]; 149 | 150 | // beam of H 151 | { 152 | if (beam > 0 && beamstepH.size() > beam) beam_prune(beamstepH); 153 | 154 | { 155 | // for nucj put H(j, j_next) into H[j_next] 156 | int jnext = next_pair[nucj][j]; 157 | if (no_sharp_turn) while (jnext - j < 4 && jnext != -1) jnext = next_pair[nucj][jnext]; 158 | if (jnext != -1) { 159 | int nucjnext = nucs[jnext]; 160 | int nucjnext_1 = (jnext - 1) > -1 ? 
nucs[jnext - 1] : -1; 161 | #ifdef lpv 162 | int tetra_hex_tri = -1; 163 | #ifdef SPECIAL_HP 164 | if (jnext-j-1 == 4) // 6:tetra 165 | tetra_hex_tri = if_tetraloops[j]; 166 | else if (jnext-j-1 == 6) // 8:hexa 167 | tetra_hex_tri = if_hexaloops[j]; 168 | else if (jnext-j-1 == 3) // 5:tri 169 | tetra_hex_tri = if_triloops[j]; 170 | #endif 171 | newscore = - v_score_hairpin(j, jnext, nucj, nucj1, nucjnext_1, nucjnext, tetra_hex_tri); 172 | Fast_LogPlusEquals(bestH[jnext][j].alpha, newscore/kT); 173 | #else 174 | newscore = score_hairpin(j, jnext, nucj, nucj1, nucjnext_1, nucjnext); 175 | Fast_LogPlusEquals(bestH[jnext][j].alpha, newscore); 176 | #endif 177 | } 178 | } 179 | 180 | { 181 | // for every state h in H[j] 182 | // 1. extend h(i, j) to h(i, jnext) 183 | // 2. generate p(i, j) 184 | for (auto &item : beamstepH) { 185 | int i = item.first; 186 | State &state = item.second; 187 | int nuci = nucs[i]; 188 | int jnext = next_pair[nuci][j]; 189 | 190 | if (jnext != -1) { 191 | int nuci1 = (i + 1) < seq_length ? nucs[i + 1] : -1; 192 | int nucjnext = nucs[jnext]; 193 | int nucjnext_1 = (jnext - 1) > -1 ? nucs[jnext - 1] : -1; 194 | 195 | // 1. extend h(i, j) to h(i, jnext)= 196 | #ifdef lpv 197 | int tetra_hex_tri = -1; 198 | #ifdef SPECIAL_HP 199 | if (jnext-i-1 == 4) // 6:tetra 200 | tetra_hex_tri = if_tetraloops[i]; 201 | else if (jnext-i-1 == 6) // 8:hexa 202 | tetra_hex_tri = if_hexaloops[i]; 203 | else if (jnext-i-1 == 3) // 5:tri 204 | tetra_hex_tri = if_triloops[i]; 205 | #endif 206 | newscore = - v_score_hairpin(i, jnext, nuci, nuci1, nucjnext_1, nucjnext, tetra_hex_tri); 207 | Fast_LogPlusEquals(bestH[jnext][i].alpha, newscore/kT); 208 | #else 209 | newscore = score_hairpin(i, jnext, nuci, nuci1, nucjnext_1, nucjnext); 210 | Fast_LogPlusEquals(bestH[jnext][i].alpha, newscore); 211 | #endif 212 | } 213 | 214 | // 2. 
generate p(i, j) 215 | Fast_LogPlusEquals(beamstepP[i].alpha, state.alpha); 216 | } 217 | } 218 | } 219 | if (j == 0) continue; 220 | 221 | // beam of Multi 222 | { 223 | if (beam > 0 && beamstepMulti.size() > beam) beam_prune(beamstepMulti); 224 | 225 | for(auto& item : beamstepMulti) { 226 | int i = item.first; 227 | State& state = item.second; 228 | 229 | int nuci = nucs[i]; 230 | int nuci1 = nucs[i+1]; 231 | int jnext = next_pair[nuci][j]; 232 | 233 | // 1. extend (i, j) to (i, jnext) 234 | { 235 | if (jnext != -1) { 236 | #ifdef lpv 237 | Fast_LogPlusEquals(bestMulti[jnext][i].alpha, state.alpha); 238 | #else 239 | newscore = score_multi_unpaired(j, jnext - 1); 240 | Fast_LogPlusEquals(bestMulti[jnext][i].alpha, state.alpha + newscore); 241 | #endif 242 | } 243 | } 244 | 245 | // 2. generate P (i, j) 246 | { 247 | #ifdef lpv 248 | newscore = - v_score_multi(i, j, nuci, nuci1, nucs[j-1], nucj, seq_length, dangle_mode); 249 | Fast_LogPlusEquals(beamstepP[i].alpha, state.alpha + newscore/kT); 250 | #else 251 | newscore = score_multi(i, j, nuci, nuci1, nucs[j-1], nucj, seq_length); 252 | Fast_LogPlusEquals(beamstepP[i].alpha, state.alpha + newscore); 253 | #endif 254 | } 255 | } 256 | } 257 | 258 | // beam of P 259 | { 260 | if (beam > 0 && beamstepP.size() > beam) beam_prune(beamstepP); 261 | 262 | // for every state in P[j] 263 | // 1. generate new helix/bulge 264 | // 2. M = P 265 | // 3. M2 = M + P 266 | // 4. C = C + P 267 | for(auto& item : beamstepP) { 268 | int i = item.first; 269 | State& state = item.second; 270 | int nuci = nucs[i]; 271 | int nuci_1 = (i-1>-1) ? nucs[i-1] : -1; 272 | 273 | // 1. 
generate new helix / single_branch 274 | // new state is of shape p..i..j..q 275 | if (i >0 && j<seq_length-1) { 276 | #ifndef lpv 277 | value_type precomputed = score_junction_B(j, i, nucj, nucj1, nuci_1, nuci); 278 | #endif 279 | for (int p = i - 1; p >= std::max(i - SINGLE_MAX_LEN, 0); --p) { 280 | int nucp = nucs[p]; 281 | int nucp1 = nucs[p + 1]; 282 | int q = next_pair[nucp][j]; 283 | while (q != -1 && ((i - p) + (q - j) - 2 <= SINGLE_MAX_LEN)) { 284 | int nucq = nucs[q]; 285 | int nucq_1 = nucs[q - 1]; 286 | 287 | if (p == i - 1 && q == j + 1) { 288 | // helix 289 | #ifdef lpv 290 | newscore = -v_score_single(p,q,i,j, nucp, nucp1, nucq_1, nucq, 291 | nuci_1, nuci, nucj, nucj1); 292 | 293 | // SHAPE for Vienna only 294 | if (use_shape) 295 | newscore += -(pseudo_energy_stack[p] + pseudo_energy_stack[i] + pseudo_energy_stack[j] + pseudo_energy_stack[q]); 296 | Fast_LogPlusEquals(bestP[q][p].alpha, state.alpha + newscore/kT); 297 | #else 298 | newscore = score_helix(nucp, nucp1, nucq_1, nucq); 299 | Fast_LogPlusEquals(bestP[q][p].alpha, state.alpha + newscore); 300 | #endif 301 | } else { 302 | // single branch 303 | #ifdef lpv 304 | newscore = - v_score_single(p,q,i,j, nucp, nucp1, nucq_1, nucq, 305 | nuci_1, nuci, nucj, nucj1); 306 | Fast_LogPlusEquals(bestP[q][p].alpha, state.alpha + newscore/kT); 307 | #else 308 | newscore = score_junction_B(p, q, nucp, nucp1, nucq_1, nucq) + 309 | precomputed + 310 | score_single_without_junctionB(p, q, i, j, 311 | nuci_1, nuci, nucj, nucj1); 312 | Fast_LogPlusEquals(bestP[q][p].alpha, state.alpha + newscore); 313 | #endif 314 | } 315 | q = next_pair[nucp][q]; 316 | } 317 | } 318 | } 319 | 320 | // 2. M = P 321 | if(i > 0 && j < seq_length-1){ 322 | #ifdef lpv 323 | newscore = - v_score_M1(i, j, j, nuci_1, nuci, nucj, nucj1, seq_length, dangle_mode); 324 | Fast_LogPlusEquals(beamstepM[i].alpha, state.alpha + newscore/kT); 325 | #else 326 | newscore = score_M1(i, j, j, nuci_1, nuci, nucj, nucj1, seq_length); 327 | Fast_LogPlusEquals(beamstepM[i].alpha, state.alpha + newscore); 328 | #endif 329 | } 330 | 331 | // 3. 
M2 = M + P 332 | int k = i - 1; 333 | if ( k > 0 && !bestM[k].empty()) { 334 | #ifdef lpv 335 | newscore = - v_score_M1(i, j, j, nuci_1, nuci, nucj, nucj1, seq_length, dangle_mode); 336 | pf_type m1_alpha = state.alpha + newscore/kT; 337 | #else 338 | newscore = score_M1(i, j, j, nuci_1, nuci, nucj, nucj1, seq_length); 339 | pf_type m1_alpha = state.alpha + newscore; 340 | #endif 341 | for (auto &m : bestM[k]) { 342 | int newi = m.first; 343 | State& m_state = m.second; 344 | Fast_LogPlusEquals(beamstepM2[newi].alpha, m_state.alpha + m1_alpha); 345 | } 346 | } 347 | 348 | // 4. C = C + P 349 | { 350 | int k = i - 1; 351 | if (k >= 0) { 352 | State& prefix_C = bestC[k]; 353 | int nuck = nuci_1; 354 | int nuck1 = nuci; 355 | #ifdef lpv 356 | newscore = - v_score_external_paired(k+1, j, nuck, nuck1, 357 | nucj, nucj1, seq_length, dangle_mode); 358 | Fast_LogPlusEquals(beamstepC.alpha, prefix_C.alpha + state.alpha + newscore/kT); 359 | #else 360 | newscore = score_external_paired(k+1, j, nuck, nuck1, 361 | nucj, nucj1, seq_length); 362 | Fast_LogPlusEquals(beamstepC.alpha, prefix_C.alpha + state.alpha + newscore); 363 | #endif 364 | } else { 365 | #ifdef lpv 366 | newscore = - v_score_external_paired(0, j, -1, nucs[0], 367 | nucj, nucj1, seq_length, dangle_mode); 368 | Fast_LogPlusEquals(beamstepC.alpha, state.alpha + newscore/kT); 369 | #else 370 | newscore = score_external_paired(0, j, -1, nucs[0], 371 | nucj, nucj1, seq_length); 372 | Fast_LogPlusEquals(beamstepC.alpha, state.alpha + newscore); 373 | #endif 374 | } 375 | } 376 | } 377 | } 378 | 379 | // beam of M2 380 | { 381 | if (beam > 0 && beamstepM2.size() > beam) beam_prune(beamstepM2); 382 | 383 | for(auto& item : beamstepM2) { 384 | int i = item.first; 385 | State& state = item.second; 386 | 387 | // 1. 
multi-loop 388 | for (int p = i-1; p >= std::max(i - SINGLE_MAX_LEN, 0); --p) { 389 | int nucp = nucs[p]; 390 | int q = next_pair[nucp][j]; 391 | if (q != -1 && ((i - p - 1) <= SINGLE_MAX_LEN)) { 392 | #ifdef lpv 393 | Fast_LogPlusEquals(bestMulti[q][p].alpha, state.alpha); 394 | 395 | #else 396 | newscore = score_multi_unpaired(p+1, i-1) + 397 | score_multi_unpaired(j+1, q-1); 398 | Fast_LogPlusEquals(bestMulti[q][p].alpha, state.alpha + newscore); 399 | #endif 400 | } 401 | } 402 | 403 | // 2. M = M2 404 | Fast_LogPlusEquals(beamstepM[i].alpha, state.alpha); 405 | } 406 | } 407 | 408 | // beam of M 409 | { 410 | if (beam > 0 && beamstepM.size() > beam) beam_prune(beamstepM); 411 | 412 | for(auto& item : beamstepM) { 413 | int i = item.first; 414 | State& state = item.second; 415 | if (j < seq_length-1) { 416 | #ifdef lpv 417 | Fast_LogPlusEquals(bestM[j+1][i].alpha, state.alpha); 418 | #else 419 | newscore = score_multi_unpaired(j + 1, j + 1); 420 | Fast_LogPlusEquals(bestM[j+1][i].alpha, state.alpha + newscore); 421 | #endif 422 | } 423 | } 424 | } 425 | 426 | // beam of C 427 | { 428 | // C = C + U 429 | if (j < seq_length-1) { 430 | #ifdef lpv 431 | Fast_LogPlusEquals(bestC[j+1].alpha, beamstepC.alpha); 432 | 433 | #else 434 | newscore = score_external_unpaired(j+1, j+1); 435 | Fast_LogPlusEquals(bestC[j+1].alpha, beamstepC.alpha + newscore); 436 | #endif 437 | } 438 | } 439 | } // end of for-loop over j 440 | 441 | State& viterbi = bestC[seq_length-1]; 442 | 443 | gettimeofday(&parse_endtime, NULL); 444 | double parse_elapsed_time = parse_endtime.tv_sec - parse_starttime.tv_sec + (parse_endtime.tv_usec-parse_starttime.tv_usec)/1000000.0; 445 | 446 | double ensemble; 447 | #ifdef lpv 448 | ensemble = -kT * viterbi.alpha / 100.0; // -kT log(Q(x)) 449 | fprintf(stderr,"Free Energy of Ensemble: %.5f kcal/mol\n", ensemble); 450 | #else 451 | ensemble = viterbi.alpha; 452 | fprintf(stderr,"Log Partition Coefficient: %.5f\n", ensemble); 453 | #endif 454 | 455 | 
if(is_verbose) fprintf(stderr,"Partition Function Calculation Time: %.2f seconds.\n", parse_elapsed_time); 456 | 457 | fflush(stdout); 458 | 459 | // lhuang 460 | if(pf_only && !forest_file.empty()) dump_forest(seq, true); // inside-only forest 461 | 462 | if(!pf_only){ 463 | outside(next_pair); 464 | if (!forest_file.empty()) 465 | dump_forest(seq, false); // inside-outside forest 466 | cal_PairProb(viterbi); 467 | 468 | if (mea_) PairProb_MEA(seq); 469 | 470 | if (threshknot_) ThreshKnot(seq); 471 | } 472 | postprocess(); 473 | return ensemble; 474 | } 475 | 476 | void BeamCKYParser::print_states(FILE *fptr, unordered_map<int, State>& states, int j, string label, bool inside_only, double threshold) { 477 | for (auto & item : states) { 478 | int i = item.first; 479 | State & state = item.second; 480 | if (inside_only) fprintf(fptr, "%s %d %d %.5lf\n", label.c_str(), i+1, j+1, state.alpha); 481 | else if (state.alpha + state.beta > threshold) // lhuang : alpha + beta - totalZ < ... 482 | fprintf(fptr, "%s %d %d %.5lf %.5lf\n", label.c_str(), i+1, j+1, state.alpha, state.beta); 483 | } 484 | } 485 | 486 | void BeamCKYParser::dump_forest(string seq, bool inside_only) { 487 | printf("Dumping (%s) Forest to %s...\n", (inside_only ? "Inside-Only" : "Inside-Outside"), forest_file.c_str()); 488 | FILE *fptr = fopen(forest_file.c_str(), "w"); // lhuang: should be fout >> 489 | fprintf(fptr, "%s\n", seq.c_str()); 490 | int n = seq.length(), j; 491 | for (j = 0; j < n; j++) { 492 | if (inside_only) fprintf(fptr, "E %d %.5lf\n", j+1, bestC[j].alpha); 493 | else fprintf(fptr, "E %d %.5lf %.5lf\n", j+1, bestC[j].alpha, bestC[j].beta); 494 | } 495 | double threshold = bestC[n-1].alpha - 9.91152; // lhuang -9.xxx or ? 
496 | for (j = 0; j < n; j++) 497 | print_states(fptr, bestP[j], j, "P", inside_only, threshold); 498 | for (j = 0; j < n; j++) 499 | print_states(fptr, bestM[j], j, "M", inside_only, threshold); 500 | for (j = 0; j < n; j++) 501 | print_states(fptr, bestM2[j], j, "M2", inside_only, threshold); 502 | for (j = 0; j < n; j++) 503 | print_states(fptr, bestMulti[j], j, "Multi", inside_only, threshold); 504 | } 505 | 506 | BeamCKYParser::BeamCKYParser(int beam_size, 507 | bool nosharpturn, 508 | bool verbose, 509 | string bppfile, 510 | string bppfileindex, 511 | bool pfonly, 512 | float bppcutoff, 513 | string forestfile, 514 | bool mea, 515 | float MEA_gamma, 516 | string MEA_file_index, 517 | bool MEA_bpseq, 518 | bool ThreshKnot, 519 | float ThreshKnot_threshold, 520 | string ThreshKnot_file_index, 521 | string shape_file_path, 522 | bool fasta, 523 | int dangles) 524 | : beam(beam_size), 525 | no_sharp_turn(nosharpturn), 526 | is_verbose(verbose), 527 | bpp_file(bppfile), 528 | bpp_file_index(bppfileindex), 529 | pf_only(pfonly), 530 | bpp_cutoff(bppcutoff), 531 | forest_file(forestfile), 532 | mea_(mea), 533 | gamma(MEA_gamma), 534 | mea_file_index(MEA_file_index), 535 | bpseq(MEA_bpseq), 536 | threshknot_(ThreshKnot), 537 | threshknot_threshold(ThreshKnot_threshold), 538 | threshknot_file_index(ThreshKnot_file_index), 539 | is_fasta(fasta), 540 | dangle_mode(dangles) { 541 | #ifdef lpv 542 | initialize(); 543 | #else 544 | initialize(); 545 | initialize_cachesingle(); 546 | #endif 547 | 548 | if (shape_file_path != "" ){ 549 | use_shape = true; 550 | int position; 551 | string data; 552 | 553 | double temp_after_mb_shape; 554 | 555 | ifstream in(shape_file_path); 556 | 557 | if (!in.good()){ 558 | cout<<"Reading SHAPE file error!"<> position >> data).fail()) { 564 | if (isdigit(int(data[0])) == 0){ 565 | SHAPE_data.push_back(double((-1.000000))); 566 | } 567 | 568 | else { 569 | SHAPE_data.push_back(stod(data)); 570 | } 571 | } 572 | 573 | for (int i = 0; i 1) { 
619 | beamsize = atoi(argv[1]); 620 | sharpturn = atoi(argv[2]) == 1; 621 | is_verbose = atoi(argv[3]) == 1; 622 | bpp_file = argv[4]; 623 | bpp_prefix = argv[5]; 624 | pf_only = atoi(argv[6]) == 1; 625 | bpp_cutoff = atof(argv[7]); 626 | forest_file = argv[8]; 627 | mea = atoi(argv[9]) == 1; 628 | MEA_gamma = atof(argv[10]); 629 | ThreshKnot = atoi(argv[11]) == 1; 630 | ThreshKnot_threshold = atof(argv[12]); 631 | ThresKnot_prefix = argv[13]; 632 | MEA_prefix = argv[14]; 633 | MEA_bpseq = atoi(argv[15]) == 1; 634 | shape_file_path = argv[16]; 635 | fasta = atoi(argv[17]) == 1; 636 | dangles = atoi(argv[18]); 637 | ystruct = argv[19]; // for p(y|x) 638 | } 639 | 640 | if (is_verbose) printf("beam size: %d\n", beamsize); 641 | 642 | // variables for decoding 643 | int num=0, total_len = 0; 644 | unsigned long long total_states = 0; 645 | double total_score = .0; 646 | double total_time = .0; 647 | 648 | int seq_index = 0; 649 | string bpp_file_index = ""; 650 | string ThreshKnot_file_index = ""; 651 | string MEA_file_index = ""; 652 | 653 | string rna_seq; 654 | vector<string> rna_seq_list, rna_name_list; 655 | if (fasta){ 656 | for (string seq; getline(cin, seq);){ 657 | if (seq.empty()) continue; 658 | else if (seq[0] == '>' or seq[0] == ';'){ 659 | rna_name_list.push_back(seq); // sequence name 660 | if (!rna_seq.empty()) 661 | rna_seq_list.push_back(rna_seq); 662 | rna_seq.clear(); 663 | continue; 664 | }else{ 665 | rtrim(seq); 666 | rna_seq += seq; 667 | } 668 | } 669 | if (!rna_seq.empty()) 670 | rna_seq_list.push_back(rna_seq); 671 | } else { 672 | for (string seq; getline(cin, seq);){ 673 | if (seq.empty()) continue; 674 | if (seq[0] == '>' or seq[0] == ';') continue; 675 | if (!isalpha(seq[0])){ 676 | printf("Unrecognized sequence: %s\n", seq.c_str()); 677 | continue; 678 | } 679 | rna_seq_list.push_back(seq); 680 | } 681 | } 682 | 683 | // TODO: no need to store all seqs 684 | for(int i = 0; i < rna_seq_list.size(); i++){ 685 | if (rna_name_list.size() > i) 686 | 
printf("%s\n", rna_name_list[i].c_str()); 687 | rna_seq = rna_seq_list[i]; 688 | 689 | printf("%s\n", rna_seq.c_str()); 690 | if (!bpp_file.empty()) { 691 | FILE *fptr = fopen(bpp_file.c_str(), "a"); 692 | if (fptr == NULL) { 693 | printf("Could not open file!\n"); 694 | return 0; 695 | } 696 | if (rna_name_list.size() > i) 697 | fprintf(fptr, "%s\n", rna_name_list[i].c_str()); 698 | fclose(fptr); 699 | } 700 | 701 | seq_index ++; 702 | if (!bpp_prefix.empty()) bpp_file_index = bpp_prefix + to_string(seq_index); 703 | 704 | if (!ThresKnot_prefix.empty()) ThreshKnot_file_index = ThresKnot_prefix + to_string(seq_index); 705 | 706 | if (!MEA_prefix.empty()) MEA_file_index = MEA_prefix + to_string(seq_index); 707 | 708 | // convert to uppercase 709 | transform(rna_seq.begin(), rna_seq.end(), rna_seq.begin(), ::toupper); 710 | 711 | // convert T to U 712 | replace(rna_seq.begin(), rna_seq.end(), 'T', 'U'); 713 | 714 | // lhuang: moved inside loop, fixing an obscure but crucial bug in initialization 715 | BeamCKYParser parser(beamsize, !sharpturn, is_verbose, bpp_file, bpp_file_index, pf_only, bpp_cutoff, forest_file, mea, MEA_gamma, MEA_file_index, MEA_bpseq, ThreshKnot, ThreshKnot_threshold, ThreshKnot_file_index, shape_file_path, fasta, dangles); 716 | 717 | double ensemble = parser.parse(rna_seq); // ensemble free energy 718 | 719 | if (!ystruct.empty()) { 720 | double energy = -eval(rna_seq, ystruct, is_verbose, dangles)/100.; // LinearFoldEval.h 721 | double prob = exp(100*(energy-ensemble)/ -kT); 722 | printf("x= %s\ty= %s\tDeltaG(x,y)= %.2f\t-kTlogQ(x)= %.5f\tp(y|x)= %.5f\n", rna_seq.c_str(), ystruct.c_str(), energy, ensemble, prob); 723 | } 724 | } 725 | 726 | gettimeofday(&total_endtime, NULL); 727 | double total_elapsed_time = total_endtime.tv_sec - total_starttime.tv_sec + (total_endtime.tv_usec-total_starttime.tv_usec)/1000000.0; 728 | 729 | if(is_verbose) fprintf(stderr,"Total Time: %.2f seconds.\n", total_elapsed_time); 730 | 731 | return 0; 732 | } 733 | 
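The final step of `main` above turns the structure energy and the ensemble free energy into a conditional probability p(y|x). The arithmetic can be sketched in isolation as below. This is a minimal sketch, not code from the repository: `KT_SKETCH`, `ensemble_free_energy`, and `structure_probability` are hypothetical names, and the constant is assumed to match the `kT` macro in src/LinearPartition.h (units of 0.01 kcal/mol).

```cpp
#include <cmath>

// Assumption: same value as the `kT` macro in src/LinearPartition.h,
// expressed in units of 0.01 kcal/mol.
const double KT_SKETCH = 61.63207755;

// Hypothetical helper, not part of LinearPartition.
// Converts a log partition function (viterbi.alpha in the lpv build)
// into the ensemble free energy -kT log Q(x), in kcal/mol.
double ensemble_free_energy(double log_Z) {
    return -KT_SKETCH * log_Z / 100.0;
}

// Hypothetical helper, not part of LinearPartition.
// Boltzmann probability p(y|x) of one structure, given its free energy
// DeltaG(x,y) and the ensemble free energy, both in kcal/mol.
// Mirrors the expression exp(100*(energy-ensemble)/ -kT) used in main.
double structure_probability(double energy_kcal, double ensemble_kcal) {
    return std::exp(100.0 * (energy_kcal - ensemble_kcal) / -KT_SKETCH);
}
```

A structure whose free energy equals the ensemble free energy gets probability 1, and each additional kT (about 0.616 kcal/mol) above the ensemble value scales the probability by a factor of 1/e.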
-------------------------------------------------------------------------------- /src/LinearPartition.h: -------------------------------------------------------------------------------- 1 | /* 2 | *LinearPartition.h* 3 | header file for LinearPartition.cpp. 4 | 5 | author: He Zhang 6 | created by: 03/2019 7 | */ 8 | 9 | #ifndef FASTCKY_BEAMCKYPAR_H 10 | #define FASTCKY_BEAMCKYPAR_H 11 | 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | #include 18 | 19 | // #define MIN_CUBE_PRUNING_SIZE 20 20 | #define kT 61.63207755 21 | 22 | #define NEG_INF -2e20 23 | // #define testtime 24 | 25 | using namespace std; 26 | 27 | #ifdef lpv 28 | typedef float pf_type; 29 | #else 30 | typedef double pf_type; 31 | #endif 32 | 33 | 34 | #ifdef lpv 35 | typedef int value_type; 36 | #define VALUE_MIN numeric_limits::lowest() 37 | #else 38 | typedef double value_type; 39 | #define VALUE_MIN numeric_limits::lowest() 40 | #endif 41 | 42 | // A hash function used to hash a pair of any kind 43 | struct hash_pair { 44 | template 45 | size_t operator()(const pair& p) const 46 | { 47 | auto hash1 = hash{}(p.first); 48 | auto hash2 = hash{}(p.second); 49 | return hash1 ^ hash2; 50 | } 51 | }; 52 | 53 | 54 | 55 | struct comp 56 | { 57 | template 58 | bool operator()(const T& l, const T& r) const 59 | { 60 | if (l.first == r.first) 61 | return l.second < r.second; 62 | 63 | return l.first < r.first; 64 | } 65 | }; 66 | 67 | struct State { 68 | 69 | pf_type alpha; 70 | pf_type beta; 71 | 72 | State(): alpha(VALUE_MIN), beta(VALUE_MIN) {}; 73 | }; 74 | 75 | 76 | class BeamCKYParser { 77 | public: 78 | int beam; 79 | bool no_sharp_turn; 80 | bool is_verbose; 81 | string bpp_file; 82 | string bpp_file_index; 83 | bool pf_only; 84 | float bpp_cutoff; 85 | string forest_file; 86 | bool mea_; 87 | float gamma; 88 | string mea_file_index; 89 | bool bpseq; 90 | bool threshknot_; 91 | float threshknot_threshold; 92 | string threshknot_file_index; 93 | bool is_fasta; 94 | int 
dangle_mode; 95 | 96 | // SHAPE 97 | bool use_shape = false; 98 | double m = 1.8; 99 | double b = -0.6; 100 | 101 | 102 | BeamCKYParser(int beam_size=100, 103 | bool nosharpturn=true, 104 | bool is_verbose=false, 105 | string bppfile="", 106 | string bppfileindex="", 107 | bool pf_only=false, 108 | float bpp_cutoff=0.0, 109 | string forestfile="", 110 | bool mea_=false, 111 | float gamma=3.0, 112 | string mea_file_index="", 113 | bool bpseq=false, 114 | bool threshknot_=false, 115 | float threshknot_threshold=0.3, 116 | string threshknot_file_index="", 117 | string shape_file_path="", 118 | bool is_fasta=false, 119 | int dangles=1); 120 | 121 | // DecoderResult parse(string& seq); 122 | double parse(string& seq); 123 | 124 | private: 125 | void get_parentheses(char* result, string& seq); 126 | 127 | unsigned seq_length; 128 | 129 | unordered_map<int, State> *bestH, *bestP, *bestM2, *bestMulti, *bestM; 130 | 131 | vector<int> if_tetraloops; 132 | vector<int> if_hexaloops; 133 | vector<int> if_triloops; 134 | 135 | State *bestC; 136 | 137 | int *nucs; 138 | 139 | void prepare(unsigned len); 140 | void postprocess(); 141 | 142 | void cal_PairProb(State& viterbi); 143 | 144 | void PairProb_MEA(string & seq); 145 | 146 | void ThreshKnot(string & seq); 147 | 148 | string back_trace(const int i, const int j, const vector<vector<int> >& back_pointer); 149 | map<int, int> get_pairs(string & structure); 150 | 151 | void outside(vector<int> next_pair[]); 152 | 153 | void dump_forest(string seq, bool inside_only); 154 | void print_states(FILE *fptr, unordered_map<int, State>& states, int j, string label, bool inside_only, double threshold); 155 | 156 | pf_type beam_prune(unordered_map<int, State>& beamstep); 157 | 158 | vector<pair<pf_type, int>> scores; 159 | 160 | unordered_map<pair<int,int>, pf_type, hash_pair> Pij; 161 | 162 | void output_to_file(string file_name, const char * type); 163 | void output_to_file_MEA_threshknot_bpseq(string file_name, const char * type, map<int,int> & pairs, string & seq); 164 | 165 | 166 | 167 | // SHAPE 168 | std::vector<double> SHAPE_data; 169 | 170 | std::vector<int> 
pseudo_energy_stack; 171 | 172 | }; 173 | 174 | // log space: borrowed from CONTRAfold 175 | 176 | inline pf_type Fast_LogExpPlusOne(pf_type x){ 177 | 178 | // Bounds for tolerance of 7.05e-06: (0, 11.8625) 179 | // Approximating interval: (0, 0.661537) --> ((T(-0.0065591595)*x+T(0.1276442762))*x+T(0.4996554598))*x+T(0.6931542306); 180 | // Approximating interval: (0.661537, 1.63202) --> ((T(-0.0155157557)*x+T(0.1446775699))*x+T(0.4882939746))*x+T(0.6958092989); 181 | // Approximating interval: (1.63202, 2.49126) --> ((T(-0.0128909247)*x+T(0.1301028251))*x+T(0.5150398748))*x+T(0.6795585882); 182 | // Approximating interval: (2.49126, 3.37925) --> ((T(-0.0072142647)*x+T(0.0877540853))*x+T(0.6208708362))*x+T(0.5909675829); 183 | // Approximating interval: (3.37925, 4.42617) --> ((T(-0.0031455354)*x+T(0.0467229449))*x+T(0.7592532310))*x+T(0.4348794399); 184 | // Approximating interval: (4.42617, 5.78907) --> ((T(-0.0010110698)*x+T(0.0185943421))*x+T(0.8831730747))*x+T(0.2523695427); 185 | // Approximating interval: (5.78907, 7.81627) --> ((T(-0.0001962780)*x+T(0.0046084408))*x+T(0.9634431978))*x+T(0.0983148903); 186 | // Approximating interval: (7.81627, 11.8625) --> ((T(-0.0000113994)*x+T(0.0003734731))*x+T(0.9959107193))*x+T(0.0149855051); 187 | // 8 polynomials needed. 
188 | 189 | assert(pf_type(0.0000000000) <= x && x <= pf_type(11.8624794162) && "Argument out-of-range."); 190 | if (x < pf_type(3.3792499610)) 191 | { 192 | if (x < pf_type(1.6320158198)) 193 | { 194 | if (x < pf_type(0.6615367791)) 195 | return ((pf_type(-0.0065591595)*x+pf_type(0.1276442762))*x+pf_type(0.4996554598))*x+pf_type(0.6931542306); 196 | return ((pf_type(-0.0155157557)*x+pf_type(0.1446775699))*x+pf_type(0.4882939746))*x+pf_type(0.6958092989); 197 | } 198 | if (x < pf_type(2.4912588184)) 199 | return ((pf_type(-0.0128909247)*x+pf_type(0.1301028251))*x+pf_type(0.5150398748))*x+pf_type(0.6795585882); 200 | return ((pf_type(-0.0072142647)*x+pf_type(0.0877540853))*x+pf_type(0.6208708362))*x+pf_type(0.5909675829); 201 | } 202 | if (x < pf_type(5.7890710412)) 203 | { 204 | if (x < pf_type(4.4261691294)) 205 | return ((pf_type(-0.0031455354)*x+pf_type(0.0467229449))*x+pf_type(0.7592532310))*x+pf_type(0.4348794399); 206 | return ((pf_type(-0.0010110698)*x+pf_type(0.0185943421))*x+pf_type(0.8831730747))*x+pf_type(0.2523695427); 207 | } 208 | if (x < pf_type(7.8162726752)) 209 | return ((pf_type(-0.0001962780)*x+pf_type(0.0046084408))*x+pf_type(0.9634431978))*x+pf_type(0.0983148903); 210 | return ((pf_type(-0.0000113994)*x+pf_type(0.0003734731))*x+pf_type(0.9959107193))*x+pf_type(0.0149855051); 211 | } 212 | 213 | inline void Fast_LogPlusEquals (pf_type &x, pf_type y) 214 | { 215 | if (x < y) std::swap (x, y); 216 | if (y > pf_type(NEG_INF/2) && x-y < pf_type(11.8624794162)) 217 | x = Fast_LogExpPlusOne(x-y) + y; 218 | } 219 | 220 | inline pf_type Fast_Exp(pf_type x) 221 | { 222 | // Bounds for tolerance of 4.96e-05: (-9.91152, 0) 223 | // Approximating interval: (-9.91152, -5.86228) --> ((T(0.0000803850)*x+T(0.0021627428))*x+T(0.0194708555))*x+T(0.0588080014); 224 | // Approximating interval: (-5.86228, -3.83966) --> ((T(0.0013889414)*x+T(0.0244676474))*x+T(0.1471290604))*x+T(0.3042757740); 225 | // Approximating interval: (-3.83966, -2.4915) --> 
((T(0.0072335607)*x+T(0.0906002677))*x+T(0.3983111356))*x+T(0.6245959221); 226 | // Approximating interval: (-2.4915, -1.48054) --> ((T(0.0232410351)*x+T(0.2085645908))*x+T(0.6906367911))*x+T(0.8682322329); 227 | // Approximating interval: (-1.48054, -0.672505) --> ((T(0.0573782771)*x+T(0.3580258429))*x+T(0.9121133217))*x+T(0.9793091728); 228 | // Approximating interval: (-0.672505, -3.9145e-11) --> ((T(0.1199175927)*x+T(0.4815668234))*x+T(0.9975991939))*x+T(0.9999505077); 229 | // 6 polynomials needed. 230 | 231 | if (x < pf_type(-2.4915033807)) 232 | { 233 | if (x < pf_type(-5.8622823336)) 234 | { 235 | if (x < pf_type(-9.91152)) 236 | return pf_type(0); 237 | return ((pf_type(0.0000803850)*x+pf_type(0.0021627428))*x+pf_type(0.0194708555))*x+pf_type(0.0588080014); 238 | } 239 | if (x < pf_type(-3.8396630909)) 240 | return ((pf_type(0.0013889414)*x+pf_type(0.0244676474))*x+pf_type(0.1471290604))*x+pf_type(0.3042757740); 241 | return ((pf_type(0.0072335607)*x+pf_type(0.0906002677))*x+pf_type(0.3983111356))*x+pf_type(0.6245959221); 242 | } 243 | if (x < pf_type(-0.6725053211)) 244 | { 245 | if (x < pf_type(-1.4805375919)) 246 | return ((pf_type(0.0232410351)*x+pf_type(0.2085645908))*x+pf_type(0.6906367911))*x+pf_type(0.8682322329); 247 | return ((pf_type(0.0573782771)*x+pf_type(0.3580258429))*x+pf_type(0.9121133217))*x+pf_type(0.9793091728); 248 | } 249 | if (x < pf_type(0)) 250 | return ((pf_type(0.1199175927)*x+pf_type(0.4815668234))*x+pf_type(0.9975991939))*x+pf_type(0.9999505077); 251 | return (x > pf_type(46.052) ? pf_type(1e20) : expf(x)); 252 | } 253 | 254 | #endif //FASTCKY_BEAMCKYPAR_H 255 | -------------------------------------------------------------------------------- /src/Utils/energy_parameter.h: -------------------------------------------------------------------------------- 1 | /* 2 | *energy_parameter.h* 3 | feature values from ViennaRNA. 
4 | 5 | author: Dezhong Deng, He Zhang 6 | edited by: 02/2018 7 | */ 8 | 9 | #ifndef VIE_INF 10 | // #define VIE_INF 999999999 11 | #define VIE_INF 10000000 // to be the same as in vienna 12 | #endif 13 | #ifndef NBPAIRS 14 | #define NBPAIRS 7 15 | #endif 16 | 17 | #define SPECIAL_HP 18 | //int special_hp = 1; 19 | 20 | double lxc37=107.856; 21 | int ML_intern37=-90; 22 | // int ML_intern37=-60; 23 | int ML_closing37=930; 24 | int ML_BASE37=0; 25 | int MAX_NINIO=300; 26 | int ninio37=60; 27 | int TerminalAU37=50; // lhuang: outermost pair is AU or GU; also used in tetra_loop triloop 28 | 29 | char Triloops[241] = 30 | "CAACG " 31 | "GUUAC " 32 | ; 33 | int Triloop37[2] = { 680, 690}; 34 | 35 | char Tetraloops[281] = 36 | "CAACGG " 37 | "CCAAGG " 38 | "CCACGG " 39 | "CCCAGG " 40 | "CCGAGG " 41 | "CCGCGG " 42 | "CCUAGG " 43 | "CCUCGG " 44 | "CUAAGG " 45 | "CUACGG " 46 | "CUCAGG " 47 | "CUCCGG " 48 | "CUGCGG " 49 | "CUUAGG " 50 | "CUUCGG " 51 | "CUUUGG " 52 | ; 53 | 54 | int Tetraloop37[16] = { 550, 330, 370, 340, 350, 360, 370, 250, 360, 280, 370, 270, 280, 350, 370, 370}; 55 | 56 | char Hexaloops[361] = 57 | "ACAGUACU " 58 | "ACAGUGAU " 59 | "ACAGUGCU " 60 | "ACAGUGUU " 61 | ; 62 | int Hexaloop37[4] = { 280, 360, 290, 180}; 63 | 64 | int stack37[NBPAIRS+1][NBPAIRS+1] = 65 | // CG GC GU UG AU UA NN 66 | {{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 67 | /*CG*/,{ VIE_INF, -240, -330, -210, -140, -210, -210, -140} 68 | /*GC*/,{ VIE_INF, -330, -340, -250, -150, -220, -240, -150} 69 | /*GU*/,{ VIE_INF, -210, -250, 130, -50, -140, -130, 130} 70 | /*UG*/,{ VIE_INF, -140, -150, -50, 30, -60, -100, 30} 71 | /*AU*/,{ VIE_INF, -210, -220, -140, -60, -110, -90, -60} 72 | /*UA*/,{ VIE_INF, -210, -240, -130, -100, -90, -130, -90} 73 | /*NN*/,{ VIE_INF, -140, -150, 130, 30, -60, -90, 130}}; 74 | 75 | int hairpin37[31] = { VIE_INF, VIE_INF, VIE_INF, 540, 560, 570, 540, 600, 550, 640, 650, 660, 670, 680, 690, 690, 700, 710, 710, 720, 720, 730, 730, 740, 
740, 750, 750, 750, 760, 760, 770}; 76 | int bulge37[31] = { VIE_INF, 380, 280, 320, 360, 400, 440, 460, 470, 480, 490, 500, 510, 520, 530, 540, 540, 550, 550, 560, 570, 570, 580, 580, 580, 590, 590, 600, 600, 600, 610}; 77 | int internal_loop37[31] = { VIE_INF, VIE_INF, 100, 100, 110, 200, 200, 210, 230, 240, 250, 260, 270, 280, 290, 290, 300, 310, 310, 320, 330, 330, 340, 340, 350, 350, 350, 360, 360, 370, 370}; 78 | 79 | int mismatchI37[NBPAIRS+1][5][5] = 80 | {{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 81 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 82 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 83 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 84 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 85 | } 86 | ,{{ 0, 0, 0, 0, 0} 87 | ,{ 0, 0, 0, -80, 0} 88 | ,{ 0, 0, 0, 0, 0} 89 | ,{ 0, -100, 0, -100, 0} 90 | ,{ 0, 0, 0, 0, -60} 91 | } 92 | ,{{ 0, 0, 0, 0, 0} 93 | ,{ 0, 0, 0, -80, 0} 94 | ,{ 0, 0, 0, 0, 0} 95 | ,{ 0, -100, 0, -100, 0} 96 | ,{ 0, 0, 0, 0, -60} 97 | } 98 | ,{{ 70, 70, 70, 70, 70} 99 | ,{ 70, 70, 70, -10, 70} 100 | ,{ 70, 70, 70, 70, 70} 101 | ,{ 70, -30, 70, -30, 70} 102 | ,{ 70, 70, 70, 70, 10} 103 | } 104 | ,{{ 70, 70, 70, 70, 70} 105 | ,{ 70, 70, 70, -10, 70} 106 | ,{ 70, 70, 70, 70, 70} 107 | ,{ 70, -30, 70, -30, 70} 108 | ,{ 70, 70, 70, 70, 10} 109 | } 110 | ,{{ 70, 70, 70, 70, 70} 111 | ,{ 70, 70, 70, -10, 70} 112 | ,{ 70, 70, 70, 70, 70} 113 | ,{ 70, -30, 70, -30, 70} 114 | ,{ 70, 70, 70, 70, 10} 115 | } 116 | ,{{ 70, 70, 70, 70, 70} 117 | ,{ 70, 70, 70, -10, 70} 118 | ,{ 70, 70, 70, 70, 70} 119 | ,{ 70, -30, 70, -30, 70} 120 | ,{ 70, 70, 70, 70, 10} 121 | } 122 | ,{{ 70, 70, 70, 70, 70} 123 | ,{ 70, 70, 70, -10, 70} 124 | ,{ 70, 70, 70, 70, 70} 125 | ,{ 70, -30, 70, -30, 70} 126 | ,{ 70, 70, 70, 70, 10} 127 | }}; 128 | 129 | int mismatchH37[NBPAIRS+1][5][5] = 130 | {{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 131 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 132 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 
133 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 134 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 135 | } 136 | // lhuang: CG.. 137 | ,{{ -80, -100, -110, -100, -80} 138 | ,{ -140, -150, -150, -140, -150} 139 | ,{ -80, -100, -110, -100, -80} 140 | ,{ -150, -230, -150, -240, -150} 141 | ,{ -100, -100, -140, -100, -210} 142 | } 143 | // lhuang: GC.. 144 | ,{{ -50, -110, -70, -110, -50} 145 | ,{ -110, -110, -150, -130, -150} 146 | ,{ -50, -110, -70, -110, -50} 147 | ,{ -150, -250, -150, -220, -150} 148 | ,{ -100, -110, -100, -110, -160} 149 | } 150 | ,{{ 20, 20, -20, -10, -20} 151 | ,{ 20, 20, -50, -30, -50} 152 | ,{ -10, -10, -20, -10, -20} 153 | ,{ -50, -100, -50, -110, -50} 154 | ,{ -10, -10, -30, -10, -100} 155 | } 156 | ,{{ 0, -20, -10, -20, 0} 157 | ,{ -30, -50, -30, -60, -30} 158 | ,{ 0, -20, -10, -20, 0} 159 | ,{ -30, -90, -30, -110, -30} 160 | ,{ -10, -20, -10, -20, -90} 161 | } 162 | ,{{ -10, -10, -20, -10, -20} 163 | ,{ -30, -30, -50, -30, -50} 164 | ,{ -10, -10, -20, -10, -20} 165 | ,{ -50, -120, -50, -110, -50} 166 | ,{ -10, -10, -30, -10, -120} 167 | } 168 | ,{{ 0, -20, -10, -20, 0} 169 | ,{ -30, -50, -30, -50, -30} 170 | ,{ 0, -20, -10, -20, 0} 171 | ,{ -30, -150, -30, -150, -30} 172 | ,{ -10, -20, -10, -20, -90} 173 | } 174 | ,{{ 20, 20, -10, -10, 0} 175 | ,{ 20, 20, -30, -30, -30} 176 | ,{ 0, -10, -10, -10, 0} 177 | ,{ -30, -90, -30, -110, -30} 178 | ,{ -10, -10, -10, -10, -90} 179 | }}; 180 | 181 | int mismatchM37[NBPAIRS+1][5][5] = 182 | {{ /* NP.. */ 183 | { VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 184 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 185 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 186 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 187 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 188 | }, 189 | { /* CG.. 
*/ 190 | { -50, -110, -50, -140, -70} 191 | ,{ -110, -110, -110, -160, -110} 192 | ,{ -70, -150, -70, -150, -100} 193 | ,{ -110, -130, -110, -140, -110} 194 | ,{ -50, -150, -50, -150, -70} 195 | }, 196 | { /* GC.. */ 197 | { -80, -140, -80, -140, -100} 198 | ,{ -100, -150, -100, -140, -100} 199 | ,{ -110, -150, -110, -150, -140} 200 | ,{ -100, -140, -100, -160, -100} 201 | ,{ -80, -150, -80, -150, -120} 202 | }, 203 | { /* GU.. */ 204 | { -50, -80, -50, -50, -50} 205 | ,{ -50, -100, -70, -50, -70} 206 | ,{ -60, -80, -60, -80, -60} 207 | ,{ -70, -110, -70, -80, -70} 208 | ,{ -50, -80, -50, -80, -50} 209 | }, 210 | { /* UG.. */ 211 | { -30, -30, -60, -60, -60} 212 | ,{ -30, -30, -60, -60, -60} 213 | ,{ -70, -100, -70, -100, -80} 214 | ,{ -60, -80, -60, -80, -60} 215 | ,{ -60, -100, -70, -100, -60} 216 | }, 217 | { /* AU.. */ 218 | { -50, -80, -50, -80, -50} 219 | ,{ -70, -100, -70, -110, -70} 220 | ,{ -60, -80, -60, -80, -60} 221 | ,{ -70, -110, -70, -120, -70} 222 | ,{ -50, -80, -50, -80, -50} 223 | }, 224 | { /* UA.. */ 225 | { -60, -80, -60, -80, -60} 226 | ,{ -60, -80, -60, -80, -60} 227 | ,{ -70, -100, -70, -100, -80} 228 | ,{ -60, -80, -60, -80, -60} 229 | ,{ -70, -100, -70, -100, -80} 230 | }, 231 | { /* NN.. 
*/ 232 | { -30, -30, -50, -50, -50} 233 | ,{ -30, -30, -60, -50, -60} 234 | ,{ -60, -80, -60, -80, -60} 235 | ,{ -60, -80, -60, -80, -60} 236 | ,{ -50, -80, -50, -80, -50} 237 | }}; 238 | 239 | int mismatch1nI37[NBPAIRS+1][5][5] = 240 | {{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 241 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 242 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 243 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 244 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 245 | } 246 | ,{{ 0, 0, 0, 0, 0} 247 | ,{ 0, 0, 0, 0, 0} 248 | ,{ 0, 0, 0, 0, 0} 249 | ,{ 0, 0, 0, 0, 0} 250 | ,{ 0, 0, 0, 0, 0} 251 | } 252 | ,{{ 0, 0, 0, 0, 0} 253 | ,{ 0, 0, 0, 0, 0} 254 | ,{ 0, 0, 0, 0, 0} 255 | ,{ 0, 0, 0, 0, 0} 256 | ,{ 0, 0, 0, 0, 0} 257 | } 258 | ,{{ 70, 70, 70, 70, 70} 259 | ,{ 70, 70, 70, 70, 70} 260 | ,{ 70, 70, 70, 70, 70} 261 | ,{ 70, 70, 70, 70, 70} 262 | ,{ 70, 70, 70, 70, 70} 263 | } 264 | ,{{ 70, 70, 70, 70, 70} 265 | ,{ 70, 70, 70, 70, 70} 266 | ,{ 70, 70, 70, 70, 70} 267 | ,{ 70, 70, 70, 70, 70} 268 | ,{ 70, 70, 70, 70, 70} 269 | } 270 | ,{{ 70, 70, 70, 70, 70} 271 | ,{ 70, 70, 70, 70, 70} 272 | ,{ 70, 70, 70, 70, 70} 273 | ,{ 70, 70, 70, 70, 70} 274 | ,{ 70, 70, 70, 70, 70} 275 | } 276 | ,{{ 70, 70, 70, 70, 70} 277 | ,{ 70, 70, 70, 70, 70} 278 | ,{ 70, 70, 70, 70, 70} 279 | ,{ 70, 70, 70, 70, 70} 280 | ,{ 70, 70, 70, 70, 70} 281 | } 282 | ,{{ 70, 70, 70, 70, 70} 283 | ,{ 70, 70, 70, 70, 70} 284 | ,{ 70, 70, 70, 70, 70} 285 | ,{ 70, 70, 70, 70, 70} 286 | ,{ 70, 70, 70, 70, 70} 287 | }}; 288 | 289 | int mismatch23I37[NBPAIRS+1][5][5] = 290 | {{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 291 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 292 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 293 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 294 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 295 | } 296 | ,{{ 0, 0, 0, 0, 0} 297 | ,{ 0, 0, 0, -50, 0} 298 | ,{ 0, 0, 0, 0, 0} 299 | ,{ 0, -110, 0, -70, 0} 300 | ,{ 0, 0, 0, 0, -30} 
301 | } 302 | ,{{ 0, 0, 0, 0, 0} 303 | ,{ 0, 0, 0, 0, 0} 304 | ,{ 0, 0, 0, 0, 0} 305 | ,{ 0, -120, 0, -70, 0} 306 | ,{ 0, 0, 0, 0, -30} 307 | } 308 | ,{{ 70, 70, 70, 70, 70} 309 | ,{ 70, 70, 70, 70, 70} 310 | ,{ 70, 70, 70, 70, 70} 311 | ,{ 70, -40, 70, 0, 70} 312 | ,{ 70, 70, 70, 70, 40} 313 | } 314 | ,{{ 70, 70, 70, 70, 70} 315 | ,{ 70, 70, 70, 20, 70} 316 | ,{ 70, 70, 70, 70, 70} 317 | ,{ 70, -40, 70, 0, 70} 318 | ,{ 70, 70, 70, 70, 40} 319 | } 320 | ,{{ 70, 70, 70, 70, 70} 321 | ,{ 70, 70, 70, 70, 70} 322 | ,{ 70, 70, 70, 70, 70} 323 | ,{ 70, -40, 70, 0, 70} 324 | ,{ 70, 70, 70, 70, 40} 325 | } 326 | ,{{ 70, 70, 70, 70, 70} 327 | ,{ 70, 70, 70, 20, 70} 328 | ,{ 70, 70, 70, 70, 70} 329 | ,{ 70, -40, 70, 0, 70} 330 | ,{ 70, 70, 70, 70, 40} 331 | } 332 | ,{{ 70, 70, 70, 70, 70} 333 | ,{ 70, 70, 70, 70, 70} 334 | ,{ 70, 70, 70, 70, 70} 335 | ,{ 70, -40, 70, 0, 70} 336 | ,{ 70, 70, 70, 70, 40} 337 | }}; 338 | 339 | int mismatchExt37[NBPAIRS+1][5][5] = 340 | {{ /* NP.. */ 341 | { VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 342 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 343 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 344 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 345 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 346 | }, 347 | { /* CG.. */ 348 | { -50, -110, -50, -140, -70} 349 | ,{ -110, -110, -110, -160, -110} 350 | ,{ -70, -150, -70, -150, -100} 351 | ,{ -110, -130, -110, -140, -110} 352 | ,{ -50, -150, -50, -150, -70} 353 | }, 354 | { /* GC.. */ 355 | { -80, -140, -80, -140, -100} 356 | ,{ -100, -150, -100, -140, -100} 357 | ,{ -110, -150, -110, -150, -140} 358 | ,{ -100, -140, -100, -160, -100} 359 | ,{ -80, -150, -80, -150, -120} 360 | }, 361 | { /* GU.. */ 362 | { -50, -80, -50, -50, -50} 363 | ,{ -50, -100, -70, -50, -70} 364 | ,{ -60, -80, -60, -80, -60} 365 | ,{ -70, -110, -70, -80, -70} 366 | ,{ -50, -80, -50, -80, -50} 367 | }, 368 | { /* UG.. 
*/ 369 | { -30, -30, -60, -60, -60} 370 | ,{ -30, -30, -60, -60, -60} 371 | ,{ -70, -100, -70, -100, -80} 372 | ,{ -60, -80, -60, -80, -60} 373 | ,{ -60, -100, -70, -100, -60} 374 | }, 375 | { /* AU.. */ 376 | { -50, -80, -50, -80, -50} 377 | ,{ -70, -100, -70, -110, -70} 378 | ,{ -60, -80, -60, -80, -60} 379 | ,{ -70, -110, -70, -120, -70} 380 | ,{ -50, -80, -50, -80, -50} 381 | }, 382 | { /* UA.. */ 383 | { -60, -80, -60, -80, -60} 384 | ,{ -60, -80, -60, -80, -60} 385 | ,{ -70, -100, -70, -100, -80} 386 | ,{ -60, -80, -60, -80, -60} 387 | ,{ -70, -100, -70, -100, -80} 388 | }, 389 | { /* NN.. */ 390 | { -30, -30, -50, -50, -50} 391 | ,{ -30, -30, -60, -50, -60} 392 | ,{ -60, -80, -60, -80, -60} 393 | ,{ -60, -80, -60, -80, -60} 394 | ,{ -50, -80, -50, -80, -50} 395 | }}; 396 | 397 | /* dangle5 */ 398 | int dangle5_37[NBPAIRS+1][5] = 399 | { /* N A C G U */ 400 | /* NP */ { VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF}, 401 | /* CG */ { -10, -50, -30, -20, -10}, 402 | /* GC */ { -0, -20, -30, -0, -0}, 403 | /* GU */ { -20, -30, -30, -40, -20}, 404 | /* UG */ { -10, -30, -10, -20, -20}, 405 | /* AU */ { -20, -30, -30, -40, -20}, 406 | /* UA */ { -10, -30, -10, -20, -20}, 407 | /* NN */ { -0, -20, -10, -0, -0} 408 | }; 409 | 410 | /* dangle3 */ 411 | int dangle3_37[NBPAIRS+1][5] = 412 | { /* N A C G U */ 413 | /* NP */ { VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF}, 414 | /* CG */ { -40, -110, -40, -130, -60}, 415 | /* GC */ { -80, -170, -80, -170, -120}, 416 | /* GU */ { -10, -70, -10, -70, -10}, 417 | /* UG */ { -50, -80, -50, -80, -60}, 418 | /* AU */ { -10, -70, -10, -70, -10}, 419 | /* UA */ { -50, -80, -50, -80, -60}, 420 | /* NN */ { -10, -70, -10, -70, -10} 421 | }; 422 | -------------------------------------------------------------------------------- /src/Utils/feature_weight.h: -------------------------------------------------------------------------------- 1 | /* 2 | *feature_weight.h* 3 | the feature weight for the parser. 
This code is automatically generated from feature weights in other formats. 4 | 5 | author: Kai Zhao, Dezhong Deng 6 | edited by: 02/2018 7 | */ 8 | 9 | 10 | #ifndef FASTCKY_W 11 | #define FASTCKY_W 12 | double multi_base = -1.199055076; 13 | double multi_unpaired = -0.1983300391; 14 | double multi_paired = -0.9253883752; 15 | double external_unpaired = -0.00972883093; 16 | double external_paired = -0.0009674111431; 17 | double base_pair[25] = {0.0,0.0,0.0,0.59791199,0.0,0.0,0.0,1.544290641,0.0,0.0,0.0,1.544290641,0.0,-0.01304754992,0.0,0.59791199,0.0,-0.01304754992,0.0,0.0,0.0,0.0,0.0,0.0,0.0}; 18 | double internal_1x1_nucleotides[25] = {0.2944404686,0.08641360967,-0.3664197228,-0.2053107048,0.0,0.08641360967,-0.1582543624,0.4175273724,0.1368762582,0.0,-0.3664197228,0.4175273724,-0.1193514754,-0.4188101413,0.0,-0.2053107048,0.1368762582,-0.4188101413,0.147140653,0.0,0.0,0.0,0.0,0.0,0.0}; 19 | double helix_stacking[625] = {0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1482005248,0.0,0.0,0.0,0.4343497127,0.0,0.0,0.0,0.7079642577,0.0,-0.1010777582,0.0,0.243256656,0.0,0.1623654243,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4878707793,0.0,0.0,0.0,0.8481320247,0.0,0.0,0.0,0.4784248478,0.0,-0.1811268205,0.0,0.7079642577,0.0,0.4849351028,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5551785831,0.0,0.0,0.0,0.5008324248,0.0,0.0,0.0,0.8481320247,0.0,0.2165962476,0.0,0.4343497127,0.0,0.4864603589,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.04665365028,0.0,0.0,0.0,0.4864603589,0.0,0.0,0.0,0.4849351028,0.0,0.1833447295,0.0,0.1623654243,0.0,-0.2858970755,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.3897593783,0.0,0.0,0.0,0.5551785831,0.0,0.0,0.0,0.4878707793,0.0,-0.1157333764,0.0,0.1482005248,0.0,-0.04665365028,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.1157333764,0.0,0.0,0.0,0.2165962476,0.0,0.0,0.0,-0.1811268205,0.0,0.120296538,0.0,-0.1010777582,0.0,0.1833447295,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0}; 20 | double terminal_mismatch[625] = 
{0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.184546064,-0.1181844187,-0.4461469607,-0.6175254495,0.0,0.004788458708,0.08319395146,-0.2249479995,-0.3981327204,0.0,0.5191110288,-0.3524119307,-0.4056429433,-0.7733932162,0.0,-0.01574403519,0.268570042,-0.0934388741,0.3373711531,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08386423535,-0.2520716816,-0.6711841881,-0.3816350028,0.0,0.1117852189,-0.1704393624,-0.2179987732,-0.459267635,0.0,0.8520640313,-0.9332488517,-0.3289551692,-0.7778822056,0.0,-0.2422339958,-0.03780509247,-0.4322334143,-0.2419976114,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.1703136025,-0.09154056357,-0.2522413002,-0.8520314799,0.0,0.04763224188,-0.2428654283,-0.2079275061,-0.1874270053,0.0,0.6540033983,-0.7823988605,0.1995898255,-0.4432169392,0.0,-0.1736921762,0.288494362,-0.01638238057,0.6757988971,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.4871607613,0.1105031953,0.363373916,-0.6193199348,0.0,0.3451056056,0.0314944976,-0.3799172956,-0.03222973182,0.0,0.4948638637,-0.2821952552,-0.2702227211,-0.06658395291,0.0,-0.4306154451,-0.09497863465,-0.3130794485,-0.22832
42981,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0115363879,-0.3923408221,0.05661063599,-0.1251485388,0.0,-0.06545074758,-0.3167200568,0.002258383981,-0.422217724,0.0,0.5458416646,-0.2085887954,-0.1971766062,-0.4722410132,0.0,-0.1779642496,0.1643454344,-0.5005617032,0.1333867679,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1218741278,0.1990260141,0.04681893928,0.3256264491,0.0,0.1186812326,-0.1851065102,-0.04311512683,-0.6150608139,0.0,0.754933218,-0.3150708483,0.1569582926,-0.514970007,0.0,-0.2926246029,0.1373068149,-0.05422333363,0.03086776921,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0}; 21 | double bulge_0x1_nucleotides[5] = {-0.1216861662,-0.07111241127,0.008947026647,-0.002685763742,0.0}; 22 | double helix_closing[25] = {0.0,0.0,0.0,-0.9770893163,0.0,0.0,0.0,-0.4574650937,0.0,0.0,0.0,-0.8265995623,0.0,-1.051678928,0.0,-0.9246140521,0.0,-0.3698708172,0.0,0.0,0.0,0.0,0.0,0.0,0.0}; 23 | double dangle_left[125] = 
{0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.1251037681,0.0441606708,-0.02541879082,0.00785098466,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.07224381372,0.05279281874,0.1009554299,-0.1515059013,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.1829535099,0.03393000394,0.1335339061,-0.1604274506,0.0,0.0,0.0,0.0,0.0,0.0,-0.06517511341,-0.04250882422,0.02875971806,-0.04359727428,0.0,0.0,0.0,0.0,0.0,0.0,-0.03373847659,-0.005070324324,-0.1186861149,-0.01162357727,0.0,0.0,0.0,0.0,0.0,0.0,-0.08047139148,0.001608000669,0.1016272216,-0.09200842832,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0}; 24 | double dangle_right[125] = {0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03232578201,-0.09096819493,-0.0740750973,-0.01621157379,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2133964379,-0.06234810991,-0.07008531041,-0.2141912285,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01581957549,0.005644320058,-0.00943297687,-0.2597793095,0.0,0.0,0.0,0.0,0.0,0.0,-0.04480271781,-0.07321213002,0.01270494867,-0.05717033985,0.0,0.0,0.0,0.0,0.0,0.0,-0.1631918513,0.06769304994,-0.08789074414,-0.05525570007,0.0,0.0,0.0,0.0,0.0,0.0,0.04105458185,-0.008136642572,-0.03808592022,-0.08629373429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0}; 25 | double internal_explicit[21] = {0.0,0.0,0.0,0.0,0.0,-0.1754591076,0.03083787104,-0.171565435,-0.2294680983,0.0,-0.1304072693,-0.07730329553,0.2782767264,0.0,0.0,-0.02898949617,0.3112350694,0.0,0.0,0.0,-0.3226348245}; 26 | double hairpin_length[31] = 
{-5.993180158,-9.10128592,-8.6843882853,-6.4789692193,-4.5522195273,-5.1395440602,-5.222301238,-4.6439122536,-5.3660005908,-5.5385880532,-5.8410970399,-5.8707286338,-6.7976282286,-6.82920576838,-6.93145297848,-6.74131224388,-6.83412134214,-6.66507650134,-6.74680216605,-7.09139606915,-7.20054636315,-7.49089873245,-7.83027009915,-8.02180651085,-8.07199860464,-8.11074481388,-8.06323010636,-7.9957868871,-7.89856812984,-7.73125495654,-7.49826123164}; 27 | double bulge_length[31] = {0.0,-2.399548472,-3.2940667837,-4.2029218746,-5.0441693501,-5.4807172844,-6.0506360645,-5.8503526421,-5.0964765063,-5.7009810518,-6.4210758616,-6.9347480537,-7.2962207216,-7.5576661608,-7.7170588501,-7.80330553291,-7.83437644287,-7.84534866319,-7.81533646036,-7.76774522247,-7.81070694312,-7.82862593974,-7.90663145496,-7.97762471926,-8.03530424822,-8.08164219503,-8.11723639959,-8.14398574353,-8.16217532325,-8.17269833057,-8.17785195742}; 28 | double internal_length[31] = {0.0,0.0,-0.429061443,-0.7822725931,-1.1786523466,-1.4897722641,-1.7449668113,-1.79645798028,-1.83964800435,-1.83766251486,-2.01381382847,-2.27778244917,-2.62384380687,-2.91650411476,-2.95274661783,-3.07274199393,-3.11628971319,-3.19838264454,-3.20549587058,-3.18194762206,-3.15127788635,-3.21746029729,-3.34906953559,-3.48986908699,-3.55587200561,-3.63366405305,-3.6845060657,-3.72590482171,-3.72262823831,-3.71670365547,-3.70982791746}; 29 | double internal_symmetric_length[16] = {0.0,-0.5467082599,-0.9321784246,-1.1910250647,-1.4251087392,-1.2800509627,-1.9363442142,-2.2384530511,-2.26877580377,-2.62057020957,-2.83648346017,-2.95931050557,-3.11453136507,-3.1999425725,-3.24586367049,-3.26818601285}; 30 | double internal_asymmetry[29] = 
{0.0,-2.105646719,-2.6576607621,-3.2347315291,-3.8483983138,-4.1541139979,-4.269619198,-4.4801804211,-4.7947547341,-5.1096509022,-5.19983279712,-5.41983547652,-5.56048380082,-5.77672492672,-5.94927807022,-6.10516925682,-6.20925512312,-6.2789319654,-6.31999174034,-6.3356979835,-6.32187797711,-6.28055809148,-6.24461623198,-6.21639436916,-6.20002851042,-6.17452794867,-6.14104762074,-6.10132837662,-6.10387349055}; 31 | double hairpin_length_at_least[31] = {-5.993180158,-3.108105762,0.4168976347,2.205419066,1.926749692,-0.5873245329,-0.0827571778,0.5783889844,-0.7220883372,-0.1725874624,-0.3025089867,-0.0296315939,-0.9268995948,-0.03157753978,-0.1022472101,0.1901407346,-0.09280909826,0.1690448408,-0.08172566471,-0.3445939031,-0.109150294,-0.2903523693,-0.3393713667,-0.1915364117,-0.05019209379,-0.03874620924,0.04751470752,0.06744321926,0.09721875726,0.1673131733,0.2329937249}; 32 | double bulge_length_at_least[31] = {0.0,-2.399548472,-0.8945183117,-0.9088550909,-0.8412474755,-0.4365479343,-0.5699187801,0.2002834224,0.7538761358,-0.6045045455,-0.7200948098,-0.5136721921,-0.3614726679,-0.2614454392,-0.1593926893,-0.08624668281,-0.03107090996,-0.01097222032,0.03001220283,0.04759123789,-0.04296172065,-0.01791899662,-0.07800551522,-0.0709932643,-0.05767952896,-0.04633794681,-0.03559420456,-0.02674934394,-0.01818957972,-0.01052300732,-0.005153626846}; 33 | double internal_length_at_least[31] = {0.0,0.0,-0.429061443,-0.3532111501,-0.3963797535,-0.3111199175,-0.2551945472,-0.05149116898,-0.04319002407,0.001985489485,-0.1761513136,-0.2639686207,-0.3460613577,-0.2926603079,-0.03624250307,-0.1199953761,-0.04354771926,-0.08209293135,-0.007113226038,0.02354824852,0.03066973571,-0.06618241094,-0.1316092383,-0.1407995514,-0.06600291862,-0.07779204744,-0.05084201265,-0.04139875601,0.003276583405,0.00592458284,0.006875738004}; 34 | double internal_symmetric_length_at_least[16] = 
{0.0,-0.5467082599,-0.3854701647,-0.2588466401,-0.2340836745,0.1450577765,-0.6562932515,-0.3021088369,-0.03032275267,-0.3517944058,-0.2159132506,-0.1228270454,-0.1552208595,-0.08541120743,-0.04592109799,-0.02232234236}; 35 | double internal_asymmetry_at_least[29] = {0.0,-2.105646719,-0.5520140431,-0.577070767,-0.6136667847,-0.3057156841,-0.1155052001,-0.2105612231,-0.314574313,-0.3148961681,-0.09018189492,-0.2200026794,-0.1406483243,-0.2162411259,-0.1725531435,-0.1558911866,-0.1040858663,-0.06967684228,-0.04105977494,-0.01570624316,0.01382000639,0.04131988563,0.0359418595,0.02822186282,0.01636585874,0.02550056175,0.03348032793,0.03971924412,-0.002545113932}; 36 | #endif -------------------------------------------------------------------------------- /src/Utils/intl11.h: -------------------------------------------------------------------------------- 1 | #ifndef VIE_INF 2 | #define VIE_INF 999999999 3 | #endif 4 | #ifndef NBPAIRS 5 | #define NBPAIRS 7 6 | #endif 7 | 8 | int int11_37[NBPAIRS+1][NBPAIRS+1][5][5] = 9 | {{{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 10 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 11 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 12 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 13 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 14 | } 15 | ,{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 16 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 17 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 18 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 19 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 20 | } 21 | ,{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 22 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 23 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 24 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 25 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 26 | } 27 | ,{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 28 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 29 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, 
VIE_INF} 30 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 31 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 32 | } 33 | ,{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 34 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 35 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 36 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 37 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 38 | } 39 | ,{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 40 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 41 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 42 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 43 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 44 | } 45 | ,{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 46 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 47 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 48 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 49 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 50 | } 51 | ,{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 52 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 53 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 54 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 55 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 56 | } 57 | } 58 | ,{{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 59 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 60 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 61 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 62 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 63 | } 64 | ,{{ 90, 90, 50, 50, 50} 65 | ,{ 90, 90, 50, 50, 50} 66 | ,{ 50, 50, 50, 50, 50} 67 | ,{ 50, 50, 50, -140, 50} 68 | ,{ 50, 50, 50, 50, 40} 69 | } 70 | ,{{ 90, 90, 50, 50, 60} 71 | ,{ 90, 90, -40, 50, 50} 72 | ,{ 60, 30, 50, 50, 60} 73 | ,{ 50, -10, 50, -220, 50} 74 | ,{ 50, 50, 0, 50, -10} 75 | } 76 | ,{{ 120, 120, 120, 120, 120} 77 | ,{ 120, 60, 50, 120, 120} 78 | ,{ 120, 120, 120, 120, 120} 79 | ,{ 120, -20, 120, -140, 120} 80 | ,{ 120, 120, 100, 120, 110} 81 | } 82 | ,{{ 220, 220, 170, 120, 
120} 83 | ,{ 220, 220, 130, 120, 120} 84 | ,{ 170, 120, 170, 120, 120} 85 | ,{ 120, 120, 120, -140, 120} 86 | ,{ 120, 120, 120, 120, 110} 87 | } 88 | ,{{ 120, 120, 120, 120, 120} 89 | ,{ 120, 120, 120, 120, 120} 90 | ,{ 120, 120, 120, 120, 120} 91 | ,{ 120, 120, 120, -140, 120} 92 | ,{ 120, 120, 120, 120, 80} 93 | } 94 | ,{{ 120, 120, 120, 120, 120} 95 | ,{ 120, 120, 120, 120, 120} 96 | ,{ 120, 120, 120, 120, 120} 97 | ,{ 120, 120, 120, -140, 120} 98 | ,{ 120, 120, 120, 120, 120} 99 | } 100 | ,{{ 220, 220, 170, 120, 120} 101 | ,{ 220, 220, 130, 120, 120} 102 | ,{ 170, 120, 170, 120, 120} 103 | ,{ 120, 120, 120, -140, 120} 104 | ,{ 120, 120, 120, 120, 120} 105 | } 106 | } 107 | ,{{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 108 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 109 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 110 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 111 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 112 | } 113 | ,{{ 90, 90, 60, 50, 50} 114 | ,{ 90, 90, 30, -10, 50} 115 | ,{ 50, -40, 50, 50, 0} 116 | ,{ 50, 50, 50, -220, 50} 117 | ,{ 60, 50, 60, 50, -10} 118 | } 119 | ,{{ 80, 80, 50, 50, 50} 120 | ,{ 80, 80, 50, 50, 50} 121 | ,{ 50, 50, 50, 50, 50} 122 | ,{ 50, 50, 50, -230, 50} 123 | ,{ 50, 50, 50, 50, -60} 124 | } 125 | ,{{ 190, 190, 120, 150, 150} 126 | ,{ 190, 190, 120, 150, 120} 127 | ,{ 120, 120, 120, 120, 120} 128 | ,{ 120, 120, 120, -140, 120} 129 | ,{ 150, 120, 120, 120, 150} 130 | } 131 | ,{{ 160, 160, 120, 120, 120} 132 | ,{ 160, 160, 120, 100, 120} 133 | ,{ 120, 120, 120, 120, 120} 134 | ,{ 120, 120, 120, -140, 120} 135 | ,{ 120, 120, 120, 120, 70} 136 | } 137 | ,{{ 120, 120, 120, 120, 120} 138 | ,{ 120, 120, 120, 120, 120} 139 | ,{ 120, 120, 120, 120, 120} 140 | ,{ 120, 120, 120, -140, 120} 141 | ,{ 120, 120, 120, 120, 80} 142 | } 143 | ,{{ 120, 120, 120, 120, 120} 144 | ,{ 120, 120, 120, 120, 120} 145 | ,{ 120, 120, 120, 120, 120} 146 | ,{ 120, 120, 120, -140, 120} 147 | ,{ 120, 120, 120, 120, 120} 148 | } 149 | 
,{{ 190, 190, 120, 150, 150} 150 | ,{ 190, 190, 120, 150, 120} 151 | ,{ 120, 120, 120, 120, 120} 152 | ,{ 120, 120, 120, -140, 120} 153 | ,{ 150, 120, 120, 120, 150} 154 | } 155 | } 156 | ,{{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 157 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 158 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 159 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 160 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 161 | } 162 | ,{{ 120, 120, 120, 120, 120} 163 | ,{ 120, 60, 120, -20, 120} 164 | ,{ 120, 50, 120, 120, 100} 165 | ,{ 120, 120, 120, -140, 120} 166 | ,{ 120, 120, 120, 120, 110} 167 | } 168 | ,{{ 190, 190, 120, 120, 150} 169 | ,{ 190, 190, 120, 120, 120} 170 | ,{ 120, 120, 120, 120, 120} 171 | ,{ 150, 150, 120, -140, 120} 172 | ,{ 150, 120, 120, 120, 150} 173 | } 174 | ,{{ 190, 190, 190, 190, 190} 175 | ,{ 190, 190, 190, 190, 190} 176 | ,{ 190, 190, 190, 190, 190} 177 | ,{ 190, 190, 190, -70, 190} 178 | ,{ 190, 190, 190, 190, 120} 179 | } 180 | ,{{ 190, 190, 190, 190, 190} 181 | ,{ 190, 190, 190, 190, 190} 182 | ,{ 190, 190, 190, 190, 190} 183 | ,{ 190, 190, 190, -70, 190} 184 | ,{ 190, 190, 190, 190, 160} 185 | } 186 | ,{{ 190, 190, 190, 190, 190} 187 | ,{ 190, 190, 190, 190, 190} 188 | ,{ 190, 190, 190, 190, 190} 189 | ,{ 190, 190, 190, -70, 190} 190 | ,{ 190, 190, 190, 190, 120} 191 | } 192 | ,{{ 190, 190, 190, 190, 190} 193 | ,{ 190, 190, 190, 190, 190} 194 | ,{ 190, 190, 190, 190, 190} 195 | ,{ 190, 190, 190, -70, 190} 196 | ,{ 190, 190, 190, 190, 160} 197 | } 198 | ,{{ 190, 190, 190, 190, 190} 199 | ,{ 190, 190, 190, 190, 190} 200 | ,{ 190, 190, 190, 190, 190} 201 | ,{ 190, 190, 190, -70, 190} 202 | ,{ 190, 190, 190, 190, 160} 203 | } 204 | } 205 | ,{{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 206 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 207 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 208 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 209 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 210 
| } 211 | ,{{ 220, 220, 170, 120, 120} 212 | ,{ 220, 220, 120, 120, 120} 213 | ,{ 170, 130, 170, 120, 120} 214 | ,{ 120, 120, 120, -140, 120} 215 | ,{ 120, 120, 120, 120, 110} 216 | } 217 | ,{{ 160, 160, 120, 120, 120} 218 | ,{ 160, 160, 120, 120, 120} 219 | ,{ 120, 120, 120, 120, 120} 220 | ,{ 120, 100, 120, -140, 120} 221 | ,{ 120, 120, 120, 120, 70} 222 | } 223 | ,{{ 190, 190, 190, 190, 190} 224 | ,{ 190, 190, 190, 190, 190} 225 | ,{ 190, 190, 190, 190, 190} 226 | ,{ 190, 190, 190, -70, 190} 227 | ,{ 190, 190, 190, 190, 160} 228 | } 229 | ,{{ 190, 190, 190, 190, 190} 230 | ,{ 190, 190, 190, 190, 190} 231 | ,{ 190, 190, 190, 190, 190} 232 | ,{ 190, 190, 190, -70, 190} 233 | ,{ 190, 190, 190, 190, 190} 234 | } 235 | ,{{ 190, 190, 190, 190, 190} 236 | ,{ 190, 190, 190, 190, 190} 237 | ,{ 190, 190, 190, 190, 190} 238 | ,{ 190, 190, 190, -70, 190} 239 | ,{ 190, 190, 190, 190, 160} 240 | } 241 | ,{{ 190, 190, 190, 190, 190} 242 | ,{ 190, 190, 190, 190, 190} 243 | ,{ 190, 190, 190, 190, 190} 244 | ,{ 190, 190, 190, -70, 190} 245 | ,{ 190, 190, 190, 190, 190} 246 | } 247 | ,{{ 220, 220, 190, 190, 190} 248 | ,{ 220, 220, 190, 190, 190} 249 | ,{ 190, 190, 190, 190, 190} 250 | ,{ 190, 190, 190, -70, 190} 251 | ,{ 190, 190, 190, 190, 190} 252 | } 253 | } 254 | ,{{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 255 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 256 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 257 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 258 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 259 | } 260 | ,{{ 120, 120, 120, 120, 120} 261 | ,{ 120, 120, 120, 120, 120} 262 | ,{ 120, 120, 120, 120, 120} 263 | ,{ 120, 120, 120, -140, 120} 264 | ,{ 120, 120, 120, 120, 80} 265 | } 266 | ,{{ 120, 120, 120, 120, 120} 267 | ,{ 120, 120, 120, 120, 120} 268 | ,{ 120, 120, 120, 120, 120} 269 | ,{ 120, 120, 120, -140, 120} 270 | ,{ 120, 120, 120, 120, 80} 271 | } 272 | ,{{ 190, 190, 190, 190, 190} 273 | ,{ 190, 190, 190, 190, 190} 274 | ,{ 190, 190, 190, 
190, 190} 275 | ,{ 190, 190, 190, -70, 190} 276 | ,{ 190, 190, 190, 190, 120} 277 | } 278 | ,{{ 190, 190, 190, 190, 190} 279 | ,{ 190, 190, 190, 190, 190} 280 | ,{ 190, 190, 190, 190, 190} 281 | ,{ 190, 190, 190, -70, 190} 282 | ,{ 190, 190, 190, 190, 160} 283 | } 284 | ,{{ 190, 190, 190, 190, 190} 285 | ,{ 190, 190, 190, 190, 190} 286 | ,{ 190, 190, 190, 190, 190} 287 | ,{ 190, 190, 190, -70, 190} 288 | ,{ 190, 190, 190, 190, 120} 289 | } 290 | ,{{ 190, 190, 190, 190, 190} 291 | ,{ 190, 190, 190, 190, 190} 292 | ,{ 190, 190, 190, 190, 190} 293 | ,{ 190, 190, 190, -70, 190} 294 | ,{ 190, 190, 190, 190, 150} 295 | } 296 | ,{{ 190, 190, 190, 190, 190} 297 | ,{ 190, 190, 190, 190, 190} 298 | ,{ 190, 190, 190, 190, 190} 299 | ,{ 190, 190, 190, -70, 190} 300 | ,{ 190, 190, 190, 190, 160} 301 | } 302 | } 303 | ,{{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 304 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 305 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 306 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 307 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 308 | } 309 | ,{{ 120, 120, 120, 120, 120} 310 | ,{ 120, 120, 120, 120, 120} 311 | ,{ 120, 120, 120, 120, 120} 312 | ,{ 120, 120, 120, -140, 120} 313 | ,{ 120, 120, 120, 120, 120} 314 | } 315 | ,{{ 120, 120, 120, 120, 120} 316 | ,{ 120, 120, 120, 120, 120} 317 | ,{ 120, 120, 120, 120, 120} 318 | ,{ 120, 120, 120, -140, 120} 319 | ,{ 120, 120, 120, 120, 120} 320 | } 321 | ,{{ 190, 190, 190, 190, 190} 322 | ,{ 190, 190, 190, 190, 190} 323 | ,{ 190, 190, 190, 190, 190} 324 | ,{ 190, 190, 190, -70, 190} 325 | ,{ 190, 190, 190, 190, 160} 326 | } 327 | ,{{ 190, 190, 190, 190, 190} 328 | ,{ 190, 190, 190, 190, 190} 329 | ,{ 190, 190, 190, 190, 190} 330 | ,{ 190, 190, 190, -70, 190} 331 | ,{ 190, 190, 190, 190, 190} 332 | } 333 | ,{{ 190, 190, 190, 190, 190} 334 | ,{ 190, 190, 190, 190, 190} 335 | ,{ 190, 190, 190, 190, 190} 336 | ,{ 190, 190, 190, -70, 190} 337 | ,{ 190, 190, 190, 190, 150} 338 | } 339 | ,{{ 
190, 190, 190, 190, 190} 340 | ,{ 190, 190, 190, 190, 190} 341 | ,{ 190, 190, 190, 190, 190} 342 | ,{ 190, 190, 190, -70, 190} 343 | ,{ 190, 190, 190, 190, 170} 344 | } 345 | ,{{ 190, 190, 190, 190, 190} 346 | ,{ 190, 190, 190, 190, 190} 347 | ,{ 190, 190, 190, 190, 190} 348 | ,{ 190, 190, 190, -70, 190} 349 | ,{ 190, 190, 190, 190, 190} 350 | } 351 | } 352 | ,{{{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 353 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 354 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 355 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 356 | ,{ VIE_INF, VIE_INF, VIE_INF, VIE_INF, VIE_INF} 357 | } 358 | ,{{ 220, 220, 170, 120, 120} 359 | ,{ 220, 220, 120, 120, 120} 360 | ,{ 170, 130, 170, 120, 120} 361 | ,{ 120, 120, 120, -140, 120} 362 | ,{ 120, 120, 120, 120, 120} 363 | } 364 | ,{{ 190, 190, 120, 120, 150} 365 | ,{ 190, 190, 120, 120, 120} 366 | ,{ 120, 120, 120, 120, 120} 367 | ,{ 150, 150, 120, -140, 120} 368 | ,{ 150, 120, 120, 120, 150} 369 | } 370 | ,{{ 190, 190, 190, 190, 190} 371 | ,{ 190, 190, 190, 190, 190} 372 | ,{ 190, 190, 190, 190, 190} 373 | ,{ 190, 190, 190, -70, 190} 374 | ,{ 190, 190, 190, 190, 160} 375 | } 376 | ,{{ 220, 220, 190, 190, 190} 377 | ,{ 220, 220, 190, 190, 190} 378 | ,{ 190, 190, 190, 190, 190} 379 | ,{ 190, 190, 190, -70, 190} 380 | ,{ 190, 190, 190, 190, 190} 381 | } 382 | ,{{ 190, 190, 190, 190, 190} 383 | ,{ 190, 190, 190, 190, 190} 384 | ,{ 190, 190, 190, 190, 190} 385 | ,{ 190, 190, 190, -70, 190} 386 | ,{ 190, 190, 190, 190, 160} 387 | } 388 | ,{{ 190, 190, 190, 190, 190} 389 | ,{ 190, 190, 190, 190, 190} 390 | ,{ 190, 190, 190, 190, 190} 391 | ,{ 190, 190, 190, -70, 190} 392 | ,{ 190, 190, 190, 190, 190} 393 | } 394 | ,{{ 220, 220, 190, 190, 190} 395 | ,{ 220, 220, 190, 190, 190} 396 | ,{ 190, 190, 190, 190, 190} 397 | ,{ 190, 190, 190, -70, 190} 398 | ,{ 190, 190, 190, 190, 190} 399 | } 400 | }}; 401 | -------------------------------------------------------------------------------- 
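Note on indexing the table above: in `utility_v.h` the 2x2 interior-loop lookup is written `int22_37[type][type_2][si1][sp1][sq1][sj1]`, where the first two indices are pair types produced by `NUM_TO_PAIR` and the last four are nucleotide codes produced by `NUM_TO_NUC`. The sketch below restates those two macros as plain functions so the encoding is easier to read. The function and constant names (`pair_type`, `vienna_nuc`, `NUC_A`, ...) are hypothetical and not part of the repository, and it assumes the inputs are the 0..4 codes that `GET_ACGU_NUM` in `utility.h` assigns.

```cpp
#include <cassert>

// Nucleotide codes as assigned by GET_ACGU_NUM in utility.h (assumed input encoding):
// A=0, C=1, G=2, U=3, anything else=4.
constexpr int NUC_A = 0, NUC_C = 1, NUC_G = 2, NUC_U = 3, NUC_OTHER = 4;

// Function form of the NUM_TO_PAIR macro (hypothetical name).
// Returns 1..6 for the six allowed pairs and 0 for everything else.
inline int pair_type(int x, int y) {
    switch (x) {
        case NUC_A: return y == NUC_U ? 5 : 0;                     // AU
        case NUC_C: return y == NUC_G ? 1 : 0;                     // CG
        case NUC_G: return y == NUC_C ? 2 : (y == NUC_U ? 3 : 0);  // GC, GU
        case NUC_U: return y == NUC_G ? 4 : (y == NUC_A ? 6 : 0);  // UG, UA
        default:    return 0;                                      // not a pair
    }
}

// Function form of the NUM_TO_NUC macro (hypothetical name).
// Shifts A,C,G,U from 0..3 to 1..4, maps the "other" code 4 to 0,
// and passes -1 (no neighbour) through unchanged.
inline int vienna_nuc(int x) {
    if (x == -1) return -1;
    return x == NUC_OTHER ? 0 : x + 1;
}
```

With this encoding, pair types greater than 2 are exactly the pairs that contain a U (GU, UG, AU, UA), which is the condition `type > 2` that the scoring functions in `utility_v.h` use before adding `TerminalAU37`.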
/src/Utils/utility.h: -------------------------------------------------------------------------------- 1 | /* 2 | *utility.h* 3 | provides feature functions. 4 | 5 | author: Kai Zhao, Dezhong Deng 6 | edited by: 02/2018 7 | */ 8 | 9 | #ifndef FASTCKY_UTILITY_H 10 | #define FASTCKY_UTILITY_H 11 | 12 | #include 13 | #include 14 | #include 15 | 16 | #include "feature_weight.h" 17 | 18 | #define INF 1000000007 19 | 20 | #define NOTON 5 // NUM_OF_TYPE_OF_NUCS 21 | #define NOTOND 25 22 | #define NOTONT 125 23 | 24 | #define EXPLICIT_MAX_LEN 4 25 | #define SINGLE_MIN_LEN 0 26 | #define SINGLE_MAX_LEN 30 // NOTE: *must* <= sizeof(char), otherwise modify State::TraceInfo accordingly 27 | 28 | #define MULTI_MAX_LEN 30 29 | 30 | #define HAIRPIN_MAX_LEN 30 31 | #define BULGE_MAX_LEN SINGLE_MAX_LEN 32 | #define INTERNAL_MAX_LEN SINGLE_MAX_LEN 33 | #define SYMMETRIC_MAX_LEN 15 34 | #define ASYMMETRY_MAX_LEN 28 35 | 36 | #define GET_ACGU_NUM(x) ((x=='A'? 0 : (x=='C'? 1 : (x=='G'? 2 : (x=='U'?3: 4))))) 37 | #define HELIX_STACKING_OLD(x, y, z, w) (_helix_stacking[GET_ACGU_NUM(x)][GET_ACGU_NUM(y)][GET_ACGU_NUM(z)][GET_ACGU_NUM(w)]) 38 | 39 | bool _allowed_pairs[NOTON][NOTON]; 40 | bool _helix_stacking[NOTON][NOTON][NOTON][NOTON]; 41 | double cache_single[SINGLE_MAX_LEN+1][SINGLE_MAX_LEN+1]; 42 | 43 | void initialize_cachesingle() 44 | { 45 | memset(cache_single, 0, sizeof(cache_single)); 46 | for (int l1 = SINGLE_MIN_LEN; l1 <= SINGLE_MAX_LEN; l1 ++) 47 | for (int l2 = SINGLE_MIN_LEN; l2 <= SINGLE_MAX_LEN; l2 ++) 48 | { 49 | if (l1 == 0 && l2 == 0) 50 | continue; 51 | 52 | // bulge 53 | else if (l1 == 0) 54 | cache_single[l1][l2] += bulge_length[l2]; 55 | else if (l2 == 0) 56 | cache_single[l1][l2] += bulge_length[l1]; 57 | else 58 | { 59 | 60 | // internal 61 | cache_single[l1][l2] += internal_length[std::min(l1+l2, INTERNAL_MAX_LEN)]; 62 | 63 | // internal explicit 64 | if (l1 <= EXPLICIT_MAX_LEN && l2 <= EXPLICIT_MAX_LEN) 65 | cache_single[l1][l2] += 66 | internal_explicit[l1<=l2 
? l1*EXPLICIT_MAX_LEN+l2 : l2*EXPLICIT_MAX_LEN+l1]; 67 | 68 | // internal symmetry 69 | if (l1 == l2) 70 | cache_single[l1][l2] += internal_symmetric_length[std::min(l1, SYMMETRIC_MAX_LEN)]; 71 | 72 | else { // internal asymmetry 73 | int diff = l1 - l2; if (diff < 0) diff = -diff; 74 | cache_single[l1][l2] += internal_asymmetry[std::min(diff, ASYMMETRY_MAX_LEN)]; 75 | } 76 | } 77 | } 78 | return; 79 | } 80 | 81 | void initialize() 82 | { 83 | _allowed_pairs[GET_ACGU_NUM('A')][GET_ACGU_NUM('U')] = true; 84 | _allowed_pairs[GET_ACGU_NUM('U')][GET_ACGU_NUM('A')] = true; 85 | _allowed_pairs[GET_ACGU_NUM('C')][GET_ACGU_NUM('G')] = true; 86 | _allowed_pairs[GET_ACGU_NUM('G')][GET_ACGU_NUM('C')] = true; 87 | _allowed_pairs[GET_ACGU_NUM('G')][GET_ACGU_NUM('U')] = true; 88 | _allowed_pairs[GET_ACGU_NUM('U')][GET_ACGU_NUM('G')] = true; 89 | 90 | HELIX_STACKING_OLD('A', 'U', 'A', 'U') = true; 91 | HELIX_STACKING_OLD('A', 'U', 'C', 'G') = true; 92 | HELIX_STACKING_OLD('A', 'U', 'G', 'C') = true; 93 | HELIX_STACKING_OLD('A', 'U', 'G', 'U') = true; 94 | HELIX_STACKING_OLD('A', 'U', 'U', 'A') = true; 95 | HELIX_STACKING_OLD('A', 'U', 'U', 'G') = true; 96 | HELIX_STACKING_OLD('C', 'G', 'A', 'U') = true; 97 | HELIX_STACKING_OLD('C', 'G', 'C', 'G') = true; 98 | HELIX_STACKING_OLD('C', 'G', 'G', 'C') = true; 99 | HELIX_STACKING_OLD('C', 'G', 'G', 'U') = true; 100 | HELIX_STACKING_OLD('C', 'G', 'U', 'G') = true; 101 | HELIX_STACKING_OLD('G', 'C', 'A', 'U') = true; 102 | HELIX_STACKING_OLD('G', 'C', 'C', 'G') = true; 103 | HELIX_STACKING_OLD('G', 'C', 'G', 'U') = true; 104 | HELIX_STACKING_OLD('G', 'C', 'U', 'G') = true; 105 | HELIX_STACKING_OLD('G', 'U', 'A', 'U') = true; 106 | HELIX_STACKING_OLD('G', 'U', 'G', 'U') = true; 107 | HELIX_STACKING_OLD('G', 'U', 'U', 'G') = true; 108 | HELIX_STACKING_OLD('U', 'A', 'A', 'U') = true; 109 | HELIX_STACKING_OLD('U', 'A', 'G', 'U') = true; 110 | HELIX_STACKING_OLD('U', 'G', 'G', 'U') = true; 111 | } 112 | 113 | // ------------- nucs based 
scores ------------- 114 | 115 | // parameters: nucs[i], nucs[j] 116 | inline double base_pair_score(int nuci, int nucj) { 117 | return base_pair[nucj*NOTON + nuci]; 118 | } 119 | 120 | // parameters: nucs[i], nucs[i+1], nucs[j-1], nucs[j] 121 | inline double helix_stacking_score(int nuci, int nuci1, int nucj_1, int nucj) { 122 | return helix_stacking[nuci*NOTONT + nucj*NOTOND + nuci1*NOTON + nucj_1]; 123 | } 124 | 125 | // parameters: nucs[i], nucs[j] 126 | inline double helix_closing_score(int nuci, int nucj) { 127 | return helix_closing[nuci*NOTON + nucj]; 128 | } 129 | 130 | // parameters: nucs[i], nucs[i+1], nucs[j-1], nucs[j] 131 | inline double terminal_mismatch_score(int nuci, int nuci1, int nucj_1, int nucj) { 132 | return terminal_mismatch[nuci*NOTONT+nucj*NOTOND + nuci1*NOTON + nucj_1]; 133 | } 134 | 135 | 136 | 137 | // parameter: nucs[i] 138 | inline double bulge_nuc_score(int nuci) { 139 | return bulge_0x1_nucleotides[nuci]; 140 | } 141 | 142 | // parameters: nucs[i], nucs[j] 143 | inline double internal_nuc_score(int nuci, int nucj) { 144 | return internal_1x1_nucleotides[nuci*NOTON + nucj]; 145 | } 146 | 147 | // parameters: nucs[i], nucs[i+1], nucs[j] 148 | inline double dangle_left_score(int nuci, int nuci1, int nucj) { 149 | return dangle_left[nuci*NOTOND + nucj*NOTON + nuci1]; 150 | } 151 | 152 | // parameters: nucs[i], nucs[j-1], nucs[j] 153 | inline double dangle_right_score(int nuci, int nucj_1, int nucj) { 154 | return dangle_right[nuci*NOTOND + nucj*NOTON + nucj_1]; 155 | } 156 | 157 | 158 | 159 | // ------------- length based scores ------------- 160 | 161 | inline double hairpin_score(int i, int j) { 162 | return hairpin_length[std::min(j-i-1, HAIRPIN_MAX_LEN)]; 163 | } 164 | 165 | inline double internal_length_score(int l) { 166 | return internal_length[std::min(l, INTERNAL_MAX_LEN)]; 167 | } 168 | 169 | inline double internal_explicit_score(int l1, int l2){ 170 | int l1_ = std::min(l1, EXPLICIT_MAX_LEN); 171 | int l2_ = std::min(l2, 
EXPLICIT_MAX_LEN); 172 | return internal_explicit[l1_<=l2_ ? l1_*NOTON+l2_ : l2_*NOTON+l1_]; 173 | } 174 | 175 | inline double internal_sym_score(int l) { 176 | return internal_symmetric_length[std::min(l, SYMMETRIC_MAX_LEN)]; 177 | } 178 | 179 | inline double internal_asym_score(int l1, int l2) 180 | { 181 | int diff = l1 - l2; if (diff < 0) diff = -diff; 182 | return internal_asymmetry[std::min(diff, ASYMMETRY_MAX_LEN)]; 183 | } 184 | 185 | inline double bulge_length_score(int l){ 186 | return bulge_length[std::min(l, BULGE_MAX_LEN)]; 187 | } 188 | 189 | inline double hairpin_at_least_score(int l) { 190 | return hairpin_length_at_least[std::min(l, HAIRPIN_MAX_LEN)]; 191 | } 192 | 193 | inline double buldge_length_at_least_score(int l) { 194 | return bulge_length_at_least[std::min(l, BULGE_MAX_LEN)]; 195 | } 196 | 197 | inline double internal_length_at_least_score(int l) { 198 | return internal_length_at_least[std::min(l, INTERNAL_MAX_LEN)]; 199 | } 200 | 201 | 202 | //----------------------------------------------------- 203 | inline double score_junction_A(int i, int j, int nuci, int nuci1, int nucj_1, int nucj, int len) { 204 | return helix_closing_score(nuci, nucj) + 205 | (i < len - 1 ? dangle_left_score(nuci, nuci1, nucj) : 0) + 206 | (j > 0 ? 
dangle_right_score(nuci, nucj_1, nucj) : 0); 207 | } 208 | 209 | inline double score_junction_B(int i, int j, int nuci, int nuci1, int nucj_1, int nucj) { 210 | return helix_closing_score(nuci, nucj) + terminal_mismatch_score(nuci, nuci1, nucj_1, nucj); 211 | } 212 | 213 | inline double score_hairpin_length(int len) { 214 | return hairpin_length[std::min(len, HAIRPIN_MAX_LEN)]; 215 | } 216 | 217 | inline double score_hairpin(int i, int j, int nuci, int nuci1, int nucj_1, int nucj) { 218 | return hairpin_length[std::min(j-i-1, HAIRPIN_MAX_LEN)] + 219 | score_junction_B(i, j, nuci, nuci1, nucj_1, nucj); 220 | } 221 | 222 | inline double score_helix(int nuci, int nuci1, int nucj_1, int nucj) { 223 | return helix_stacking_score(nuci, nuci1, nucj_1, nucj) + base_pair_score(nuci1, nucj_1); 224 | } 225 | 226 | inline double score_single_nuc(int i, int j, int p, int q, int nucp_1, int nucq1) { 227 | int l1 = p-i-1, l2=j-q-1; 228 | if (l1==0 && l2==1) return bulge_nuc_score(nucq1); 229 | if (l1==1 && l2==0) return bulge_nuc_score(nucp_1); 230 | if (l1==1 && l2==1) return internal_nuc_score(nucp_1, nucq1); 231 | return 0; 232 | } 233 | 234 | inline double score_single(int i, int j, int p, int q, int len, 235 | int nuci, int nuci1, int nucj_1, int nucj, 236 | int nucp_1, int nucp, int nucq, int nucq1) { 237 | int l1 = p-i-1, l2=j-q-1; 238 | return cache_single[l1][l2] + 239 | base_pair_score(nucp, nucq) + 240 | score_junction_B(i, j, nuci, nuci1, nucj_1, nucj) + 241 | score_junction_B(q, p, nucq, nucq1, nucp_1, nucp) + 242 | score_single_nuc(i, j, p, q, nucp_1, nucq1); 243 | } 244 | 245 | // score_single without score_junction_B 246 | inline double score_single_without_junctionB(int i, int j, int p, int q, 247 | int nucp_1, int nucp, int nucq, int nucq1) { 248 | int l1 = p-i-1, l2=j-q-1; 249 | return cache_single[l1][l2] + 250 | base_pair_score(nucp, nucq) + 251 | score_single_nuc(i, j, p, q, nucp_1, nucq1); 252 | } 253 | 254 | inline double score_multi(int i, int j, int 
nuci, int nuci1, int nucj_1, int nucj, int len) { 255 | return score_junction_A(i, j, nuci, nuci1, nucj_1, nucj, len) + 256 | multi_paired + multi_base; 257 | } 258 | 259 | inline double score_multi_unpaired(int i, int j) { 260 | return (j-i+1) * multi_unpaired; 261 | } 262 | 263 | inline double score_M1(int i, int j, int k, int nuci_1, int nuci, int nuck, int nuck1, int len) { 264 | return score_junction_A(k, i, nuck, nuck1, nuci_1, nuci, len) + 265 | score_multi_unpaired(k+1, j) + base_pair_score(nuci, nuck) + multi_paired; 266 | } 267 | 268 | inline double score_external_paired(int i, int j, int nuci_1, int nuci, int nucj, int nucj1, int len) { 269 | return score_junction_A(j, i, nucj, nucj1, nuci_1, nuci, len) + 270 | external_paired + base_pair_score(nuci, nucj); 271 | } 272 | 273 | inline double score_external_unpaired(int i, int j) { 274 | return (j-i+1) * external_unpaired; 275 | } 276 | 277 | #endif //FASTCKY_UTILITY_H 278 | -------------------------------------------------------------------------------- /src/Utils/utility_v.h: -------------------------------------------------------------------------------- 1 | /* 2 | *utility_v.h* 3 | provides feature functions for vienna model. 4 | 5 | author: Kai Zhao, Dezhong Deng 6 | edited by: 02/2018 7 | */ 8 | 9 | #ifndef FASTCKY_UTILITY_V_H 10 | #define FASTCKY_UTILITY_V_H 11 | 12 | #define NUM_TO_NUC(x) (x==-1?-1:((x==4?0:(x+1)))) 13 | #define NUM_TO_PAIR(x,y) (x==0? (y==3?5:0) : (x==1? (y==2?1:0) : (x==2 ? (y==1?2:(y==3?3:0)) : (x==3 ? (y==2?4:(y==0?6:0)) : 0)))) 14 | #define NUC_TO_PAIR(x,y) (x==1? (y==4?5:0) : (x==2? (y==3?1:0) : (x==3 ? (y==2?2:(y==4?3:0)) : (x==4 ? 
(y==3?4:(y==1?6:0)) : 0)))) 15 | 16 | // bool _allowed_pairs[NOTON][NOTON]; 17 | 18 | #include 19 | #include 20 | 21 | #include "energy_parameter.h" // energy_parameter stuff 22 | #include "intl11.h" 23 | #include "intl21.h" 24 | #include "intl22.h" 25 | 26 | #define MAXLOOP 30 27 | 28 | inline int MIN2(int a, int b) {if (a <= b)return a;else return b;} 29 | inline int MAX2(int a, int b) {if (a >= b)return a;else return b;} 30 | 31 | /* void v_initialize() */ 32 | /* { */ 33 | /* _allowed_pairs[GET_ACGU_NUM('A')][GET_ACGU_NUM('U')] = true; */ 34 | /* _allowed_pairs[GET_ACGU_NUM('U')][GET_ACGU_NUM('A')] = true; */ 35 | /* _allowed_pairs[GET_ACGU_NUM('C')][GET_ACGU_NUM('G')] = true; */ 36 | /* _allowed_pairs[GET_ACGU_NUM('G')][GET_ACGU_NUM('C')] = true; */ 37 | /* _allowed_pairs[GET_ACGU_NUM('G')][GET_ACGU_NUM('U')] = true; */ 38 | /* _allowed_pairs[GET_ACGU_NUM('U')][GET_ACGU_NUM('G')] = true; */ 39 | 40 | /* } */ 41 | 42 | inline void v_init_tetra_hex_tri(std::string& seq, int seq_length, std::vector& if_tetraloops, std::vector& if_hexaloops, std::vector& if_triloops) { 43 | 44 | // TetraLoops 45 | if_tetraloops.resize(seq_length-5<0?0:seq_length-5, -1); 46 | for (int i = 0; i < seq_length-5; ++i) { 47 | if (!(seq[i] == 'C' && seq[i+5] == 'G')) 48 | continue; 49 | char *ts; 50 | if ((ts=strstr(Tetraloops, seq.substr(i,6).c_str()))) 51 | if_tetraloops[i] = (ts - Tetraloops)/7; 52 | } 53 | 54 | // Triloops 55 | if_triloops.resize(seq_length-4<0?0:seq_length-4, -1); 56 | for (int i = 0; i < seq_length-4; ++i) { 57 | if (!((seq[i] == 'C' && seq[i+4] == 'G') || (seq[i] == 'G' && seq[i+4] == 'C'))) 58 | continue; 59 | char *ts; 60 | if ((ts=strstr(Triloops, seq.substr(i,5).c_str()))) 61 | if_triloops[i] = (ts - Triloops)/6; 62 | } 63 | 64 | // Hexaloops 65 | if_hexaloops.resize(seq_length-7<0?0:seq_length-7, -1); 66 | for (int i = 0; i < seq_length-7; ++i) { 67 | if (!(seq[i] == 'A' && seq[i+7] == 'U')) 68 | continue; 69 | char *ts; 70 | if ((ts=strstr(Hexaloops, 
seq.substr(i,8).c_str()))) 71 | if_hexaloops[i] = (ts - Hexaloops)/9; 72 | } 73 | return; 74 | } 75 | 76 | inline int v_score_hairpin(int i, int j, int nuci, int nuci1, int nucj_1, int nucj, int tetra_hex_tri_index = -1) { 77 | int size = j-i-1; 78 | int type = NUM_TO_PAIR(nuci, nucj); 79 | int si1 = NUM_TO_NUC(nuci1); 80 | int sj1 = NUM_TO_NUC(nucj_1); 81 | 82 | int energy; 83 | 84 | if(size <= 30) 85 | energy = hairpin37[size]; 86 | else 87 | energy = hairpin37[30] + (int)(lxc37*log((size)/30.)); 88 | 89 | if(size < 3) return energy; /* should only be the case when folding alignments */ 90 | #ifdef SPECIAL_HP 91 | // if(special_hp){ 92 | if (size == 4 && tetra_hex_tri_index > -1) 93 | return Tetraloop37[tetra_hex_tri_index]; 94 | else if (size == 6 && tetra_hex_tri_index > -1) 95 | return Hexaloop37[tetra_hex_tri_index]; 96 | else if (size == 3) { 97 | if (tetra_hex_tri_index > -1) 98 | return Triloop37[tetra_hex_tri_index]; 99 | return (energy + (type>2 ? TerminalAU37 : 0)); 100 | } 101 | // } 102 | #endif 103 | 104 | energy += mismatchH37[type][si1][sj1]; 105 | 106 | return energy; 107 | } 108 | 109 | inline int v_score_single(int i, int j, int p, int q, 110 | int nuci, int nuci1, int nucj_1, int nucj, 111 | int nucp_1, int nucp, int nucq, int nucq1){ 112 | int si1 = NUM_TO_NUC(nuci1); 113 | int sj1 = NUM_TO_NUC(nucj_1); 114 | int sp1 = NUM_TO_NUC(nucp_1); 115 | int sq1 = NUM_TO_NUC(nucq1); 116 | int type = NUM_TO_PAIR(nuci, nucj); 117 | int type_2 = NUM_TO_PAIR(nucq, nucp); 118 | int n1 = p-i-1; 119 | int n2 = j-q-1; 120 | int nl, ns, u, energy; 121 | energy = 0; 122 | 123 | if (n1>n2) { nl=n1; ns=n2;} 124 | else {nl=n2; ns=n1;} 125 | 126 | if (nl == 0) 127 | return stack37[type][type_2]; /* stack */ 128 | 129 | if (ns==0) { /* bulge */ 130 | energy = (nl<=MAXLOOP)?bulge37[nl]: 131 | (bulge37[30]+(int)(lxc37*log(nl/30.))); 132 | if (nl==1) energy += stack37[type][type_2]; 133 | else { 134 | if (type>2) energy += TerminalAU37; 135 | if (type_2>2) energy += 
TerminalAU37; 136 | } 137 | return energy; 138 | } 139 | else { /* interior loop */ 140 | if (ns==1) { 141 | if (nl==1) /* 1x1 loop */ 142 | return int11_37[type][type_2][si1][sj1]; 143 | if (nl==2) { /* 2x1 loop */ 144 | if (n1==1) 145 | energy = int21_37[type][type_2][si1][sq1][sj1]; 146 | else 147 | energy = int21_37[type_2][type][sq1][si1][sp1]; 148 | return energy; 149 | } 150 | else { /* 1xn loop */ 151 | energy = (nl+1<=MAXLOOP)?(internal_loop37[nl+1]) : (internal_loop37[30]+(int)(lxc37*log((nl+1)/30.))); 152 | energy += MIN2(MAX_NINIO, (nl-ns)*ninio37); 153 | energy += mismatch1nI37[type][si1][sj1] + mismatch1nI37[type_2][sq1][sp1]; 154 | return energy; 155 | } 156 | } 157 | else if (ns==2) { 158 | if(nl==2) { /* 2x2 loop */ 159 | return int22_37[type][type_2][si1][sp1][sq1][sj1];} 160 | else if (nl==3){ /* 2x3 loop */ 161 | energy = internal_loop37[5]+ninio37; 162 | energy += mismatch23I37[type][si1][sj1] + mismatch23I37[type_2][sq1][sp1]; 163 | return energy; 164 | } 165 | 166 | } 167 | { /* generic interior loop (no else here!)*/ 168 | u = nl + ns; 169 | energy = (u <= MAXLOOP) ? 
(internal_loop37[u]) : (internal_loop37[30]+(int)(lxc37*log((u)/30.))); 170 | 171 | energy += MIN2(MAX_NINIO, (nl-ns)*ninio37); 172 | 173 | energy += mismatchI37[type][si1][sj1] + mismatchI37[type_2][sq1][sp1]; 174 | } 175 | } 176 | return energy; 177 | } 178 | 179 | // multi_loop 180 | inline int E_MLstem(int type, int si1, int sj1, int dangle_mode) { 181 | int energy = 0; 182 | 183 | if (dangle_mode != 0) { 184 | if(si1 >= 0 && sj1 >= 0){ 185 | energy += mismatchM37[type][si1][sj1]; 186 | } 187 | else if (si1 >= 0){ 188 | energy += dangle5_37[type][si1]; 189 | } 190 | else if (sj1 >= 0){ 191 | energy += dangle3_37[type][sj1]; 192 | } 193 | } 194 | 195 | if(type > 2) { 196 | energy += TerminalAU37; 197 | } 198 | 199 | energy += ML_intern37; 200 | 201 | return energy; 202 | } 203 | 204 | inline int v_score_M1(int i, int j, int k, int nuci_1, int nuci, int nuck, int nuck1, int len, int dangle_mode) { 205 | int p = i; 206 | int q = k; 207 | int tt = NUM_TO_PAIR(nuci, nuck); 208 | int sp1 = NUM_TO_NUC(nuci_1); 209 | int sq1 = NUM_TO_NUC(nuck1); 210 | 211 | return E_MLstem(tt, sp1, sq1, dangle_mode); 212 | 213 | } 214 | 215 | inline int v_score_multi_unpaired(int i, int j) { 216 | return 0; 217 | } 218 | 219 | inline int v_score_multi(int i, int j, int nuci, int nuci1, int nucj_1, int nucj, int len, int dangle_mode) { 220 | int tt = NUM_TO_PAIR(nucj, nuci); 221 | int si1 = NUM_TO_NUC(nuci1); 222 | int sj1 = NUM_TO_NUC(nucj_1); 223 | 224 | return E_MLstem(tt, sj1, si1, dangle_mode) + ML_closing37; 225 | } 226 | 227 | // exterior_loop 228 | inline int v_score_external_paired(int i, int j, int nuci_1, int nuci, int nucj, int nucj1, int len, int dangle_mode) { 229 | int type = NUM_TO_PAIR(nuci, nucj); 230 | int si1 = NUM_TO_NUC(nuci_1); 231 | int sj1 = NUM_TO_NUC(nucj1); 232 | int energy = 0; 233 | 234 | if (dangle_mode != 0) { 235 | if(si1 >= 0 && sj1 >= 0){ 236 | energy += mismatchExt37[type][si1][sj1]; 237 | } 238 | else if (si1 >= 0){ 239 | energy += 
dangle5_37[type][si1]; 240 | } 241 | else if (sj1 >= 0){ 242 | energy += dangle3_37[type][sj1]; 243 | } 244 | } 245 | if(type > 2) 246 | energy += TerminalAU37; 247 | return energy; 248 | } 249 | 250 | inline int v_score_external_unpaired(int i, int j) { 251 | return 0; 252 | } 253 | 254 | #endif //FASTCKY_UTILITY_V_H 255 | -------------------------------------------------------------------------------- /src/bpp.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | *bpp.cpp* 3 | The main code for base pair probability calculation. 4 | 5 | author: He Zhang 6 | created by: 04/2019 7 | */ 8 | 9 | #include 10 | #include 11 | #include 12 | #include "LinearPartition.h" 13 | 14 | using namespace std; 15 | 16 | void BeamCKYParser::output_to_file(string file_name, const char * type) { 17 | if(!file_name.empty()) { 18 | printf("Outputting base pairing probability matrix to %s...\n", file_name.c_str()); 19 | FILE *fptr = fopen(file_name.c_str(), type); 20 | if (fptr == NULL) { 21 | printf("Could not open file!\n"); 22 | return; 23 | } 24 | 25 | int turn = no_sharp_turn?3:0; 26 | for (int i = 1; i <= seq_length; i++) { 27 | for (int j = i + turn + 1; j <= seq_length; j++) { 28 | pair key = make_pair(i,j); 29 | auto got = Pij.find(key); 30 | if (got != Pij.end()){ 31 | fprintf(fptr, "%d %d %.5f\n", i, j, got->second); // lhuang: %.4e->%.5f 32 | } 33 | } 34 | } 35 | fprintf(fptr, "\n"); 36 | fclose(fptr); 37 | printf("Done!\n"); 38 | } 39 | 40 | return; 41 | } 42 | 43 | void BeamCKYParser::output_to_file_MEA_threshknot_bpseq(string file_name, const char * type, map& pairs, string & seq) { 44 | 45 | int i,j; 46 | char nuc; 47 | if(!file_name.empty()) { 48 | printf("Outputting base pairs in bpseq format to %s...\n", file_name.c_str()); 49 | FILE *fptr = fopen(file_name.c_str(), type); 50 | if (fptr == NULL) { 51 | printf("Could not open file!\n"); 52 | return; 53 | } 54 | 55 | for (int i = 1; i <= seq_length; i++) { 56 | if (pairs.find(i) 
!= pairs.end()){ 57 | j = pairs[i]; 58 | } 59 | else{ 60 | j = 0; 61 | } 62 | nuc = seq[i-1]; 63 | fprintf(fptr, "%d %c %d\n", i, nuc, j); 64 | } 65 | 66 | fprintf(fptr, "\n"); 67 | fclose(fptr); 68 | printf("Done!\n"); 69 | } 70 | else{ 71 | for (int i = 1; i <= seq_length; i++) { 72 | if (pairs.find(i) != pairs.end()){ 73 | j = pairs[i]; 74 | } 75 | else{ 76 | j = 0; 77 | } 78 | nuc = seq[i-1]; 79 | printf("%d %c %d\n", i, nuc, j); 80 | } 81 | printf("\n"); 82 | } 83 | 84 | } 85 | 86 | void BeamCKYParser::cal_PairProb(State& viterbi) { 87 | 88 | for(int j=0; j pf_type(-9.91152)) { 95 | pf_type prob = Fast_Exp(temp_prob_inside); 96 | if(prob > pf_type(1.0)) prob = pf_type(1.0); 97 | if(prob < pf_type(bpp_cutoff)) continue; 98 | Pij[make_pair(i+1, j+1)] = prob; 99 | } 100 | } 101 | } 102 | 103 | // -o mode: output to a single file with user specified name; 104 | // bpp matrices for different sequences are separated with empty lines 105 | if (!bpp_file.empty()){ 106 | output_to_file(bpp_file, "a"); 107 | } 108 | 109 | // -prefix mode: output to multiple files with user specified prefix; 110 | else if (!bpp_file_index.empty()) { 111 | output_to_file(bpp_file_index, "w"); 112 | } 113 | return; 114 | } 115 | 116 | 117 | string BeamCKYParser::back_trace(const int i, const int j, const vector >& back_pointer){ 118 | 119 | if (i>j) return ""; 120 | if (back_pointer[i][j] == -1){ 121 | if (i == j) return "."; 122 | else return "." 
+ back_trace(i+1,j, back_pointer); 123 | }else if (back_pointer[i][j] != 0){ 124 | int k = back_pointer[i][j]; 125 | assert(k + 1 > 0 && k + 1 <= seq_length); 126 | string temp; 127 | if (k == j) temp = ""; 128 | else temp = back_trace(k+1,j, back_pointer); 129 | return "(" + back_trace(i+1,k-1, back_pointer) + ")" + temp; 130 | } 131 | assert(false); 132 | return ""; 133 | } 134 | 135 | map BeamCKYParser::get_pairs(string & structure){ 136 | map pairs; 137 | stack s; 138 | int index = 1; 139 | int pre_index = 0; 140 | for (auto & elem : structure){ 141 | if (elem == '(') s.push(index); 142 | else if(elem == ')'){ 143 | pre_index = s.top(); 144 | pairs[pre_index] = index; 145 | pairs[index] = pre_index; 146 | s.pop(); 147 | } 148 | index++; 149 | } 150 | return pairs; 151 | } 152 | 153 | void BeamCKYParser::ThreshKnot(string & seq){ 154 | 155 | map rowprob; 156 | vector > prob_list; 157 | 158 | map pairs; 159 | set visited; 160 | 161 | for(auto& pij : Pij){ 162 | auto i = pij.first.first; //index starts from 1 163 | auto j = pij.first.second; 164 | auto score = pij.second; 165 | 166 | if (score < threshknot_threshold) continue; 167 | 168 | prob_list.push_back(make_tuple(i,j,score)); 169 | 170 | rowprob[i] = max(rowprob[i], score); 171 | rowprob[j] = max(rowprob[j], score); 172 | 173 | } 174 | 175 | for(auto& elem : prob_list){ 176 | 177 | auto i = std::get<0>(elem); 178 | auto j = std::get<1>(elem); 179 | auto score = std::get<2>(elem); 180 | 181 | if (score == rowprob[i] && score == rowprob[j]){ 182 | 183 | if ((visited.find(i) != visited.end()) || (visited.find(j) != visited.end())) continue; 184 | visited.insert(i); 185 | visited.insert(j); 186 | 187 | pairs[i] = j; 188 | pairs[j] = i; 189 | } 190 | } 191 | 192 | fprintf(stderr, "%s\n", seq.c_str()); 193 | output_to_file_MEA_threshknot_bpseq(threshknot_file_index, "w", pairs, seq); 194 | } 195 | 196 | void BeamCKYParser::PairProb_MEA(string & seq) { 197 | 198 | vector > OPT; 199 | OPT.resize(seq_length); 200 | 
201 | for (int i = 0; i < seq_length; ++i) OPT[i].resize(seq_length); 202 | 203 | vector> P; 204 | P.resize(seq_length); 205 | 206 | for (int i = 0; i < seq_length; ++i) P[i].resize(seq_length); 207 | 208 | vector > back_pointer; 209 | back_pointer.resize(seq_length); 210 | 211 | for (int i = 0; i < seq_length; ++i) back_pointer[i].resize(seq_length); 212 | 213 | vector> paired; 214 | paired.resize(seq_length); 215 | 216 | vector Q; 217 | for (int i = 0; i < seq_length; ++i) Q.push_back(pf_type(1.0)); 218 | 219 | for(auto& pij : Pij){ 220 | auto i = pij.first.first-1; 221 | auto j = pij.first.second-1; 222 | auto score = pij.second; 223 | 224 | P[i][j] = score; 225 | 226 | paired[i].push_back(j); 227 | Q[i] -= score; 228 | Q[j] -= score; 229 | } 230 | 231 | for (int i = 0; i < seq_length; ++i) std::sort (paired[i].begin(), paired[i].end()); 232 | for (int l = 0; l< seq_length; l++){ 233 | for (int i = 0; ij) break; 244 | pf_type temp_OPT_k1_j; 245 | if (k next_pair[]){ 283 | 284 | struct timeval bpp_starttime, bpp_endtime; 285 | gettimeofday(&bpp_starttime, NULL); 286 | 287 | bestC[seq_length-1].beta = 0.0; 288 | 289 | // from right to left 290 | value_type newscore; 291 | for(int j = seq_length-1; j > 0; --j) { 292 | int nucj = nucs[j]; 293 | int nucj1 = (j+1) < seq_length ? 
nucs[j+1] : -1; 294 | 295 | unordered_map& beamstepH = bestH[j]; 296 | unordered_map& beamstepMulti = bestMulti[j]; 297 | unordered_map& beamstepP = bestP[j]; 298 | unordered_map& beamstepM2 = bestM2[j]; 299 | unordered_map& beamstepM = bestM[j]; 300 | State& beamstepC = bestC[j]; 301 | 302 | // beam of C 303 | { 304 | // C = C + U 305 | if (j < seq_length-1) { 306 | #ifdef lpv 307 | Fast_LogPlusEquals(beamstepC.beta, (bestC[j+1].beta)); 308 | 309 | #else 310 | newscore = score_external_unpaired(j+1, j+1); 311 | Fast_LogPlusEquals(beamstepC.beta, bestC[j+1].beta + newscore); 312 | #endif 313 | } 314 | } 315 | 316 | // beam of M 317 | { 318 | for(auto& item : beamstepM) { 319 | int i = item.first; 320 | State& state = item.second; 321 | if (j < seq_length-1) { 322 | #ifdef lpv 323 | Fast_LogPlusEquals(state.beta, bestM[j+1][i].beta); 324 | #else 325 | newscore = score_multi_unpaired(j + 1, j + 1); 326 | Fast_LogPlusEquals(state.beta, bestM[j+1][i].beta + newscore); 327 | #endif 328 | } 329 | } 330 | } 331 | 332 | // beam of M2 333 | { 334 | for(auto& item : beamstepM2) { 335 | int i = item.first; 336 | State& state = item.second; 337 | 338 | // 1. multi-loop 339 | { 340 | for (int p = i-1; p >= std::max(i - SINGLE_MAX_LEN, 0); --p) { 341 | int nucp = nucs[p]; 342 | int q = next_pair[nucp][j]; 343 | if (q != -1 && ((i - p - 1) <= SINGLE_MAX_LEN)) { 344 | #ifdef lpv 345 | Fast_LogPlusEquals(state.beta, bestMulti[q][p].beta); 346 | #else 347 | newscore = score_multi_unpaired(p+1, i-1) + 348 | score_multi_unpaired(j+1, q-1); 349 | Fast_LogPlusEquals(state.beta, bestMulti[q][p].beta + newscore); 350 | #endif 351 | } 352 | } 353 | } 354 | 355 | // 2. M = M2 356 | Fast_LogPlusEquals(state.beta, beamstepM[i].beta); 357 | } 358 | } 359 | 360 | // beam of P 361 | { 362 | for(auto& item : beamstepP) { 363 | int i = item.first; 364 | State& state = item.second; 365 | int nuci = nucs[i]; 366 | int nuci_1 = (i-1>-1) ? 
nucs[i-1] : -1; 367 | 368 | if (i >0 && j= std::max(i - SINGLE_MAX_LEN, 0); --p) { 373 | int nucp = nucs[p]; 374 | int nucp1 = nucs[p + 1]; 375 | int q = next_pair[nucp][j]; 376 | while (q != -1 && ((i - p) + (q - j) - 2 <= SINGLE_MAX_LEN)) { 377 | int nucq = nucs[q]; 378 | int nucq_1 = nucs[q - 1]; 379 | 380 | if (p == i - 1 && q == j + 1) { 381 | // helix 382 | #ifdef lpv 383 | newscore = -v_score_single(p,q,i,j, nucp, nucp1, nucq_1, nucq, 384 | nuci_1, nuci, nucj, nucj1); 385 | // SHAPE for Vienna only 386 | if (use_shape) 387 | { 388 | newscore += -(pseudo_energy_stack[p] + pseudo_energy_stack[i] + pseudo_energy_stack[j] + pseudo_energy_stack[q]); 389 | } 390 | 391 | 392 | Fast_LogPlusEquals(state.beta, bestP[q][p].beta + newscore/kT); 393 | #else 394 | newscore = score_helix(nucp, nucp1, nucq_1, nucq); 395 | Fast_LogPlusEquals(state.beta, bestP[q][p].beta + newscore); 396 | #endif 397 | } else { 398 | // single branch 399 | #ifdef lpv 400 | newscore = - v_score_single(p,q,i,j, nucp, nucp1, nucq_1, nucq, 401 | nuci_1, nuci, nucj, nucj1); 402 | Fast_LogPlusEquals(state.beta, bestP[q][p].beta + newscore/kT); 403 | #else 404 | newscore = score_junction_B(p, q, nucp, nucp1, nucq_1, nucq) + 405 | precomputed + 406 | score_single_without_junctionB(p, q, i, j, nuci_1, nuci, nucj, nucj1); 407 | Fast_LogPlusEquals(state.beta, bestP[q][p].beta + newscore); 408 | #endif 409 | } 410 | q = next_pair[nucp][q]; 411 | } 412 | } 413 | } 414 | 415 | // 2. M = P 416 | if(i > 0 && j < seq_length-1){ 417 | #ifdef lpv 418 | newscore = - v_score_M1(i, j, j, nuci_1, nuci, nucj, nucj1, seq_length, dangle_mode); 419 | Fast_LogPlusEquals(state.beta, beamstepM[i].beta + newscore/kT); 420 | #else 421 | newscore = score_M1(i, j, j, nuci_1, nuci, nucj, nucj1, seq_length); 422 | Fast_LogPlusEquals(state.beta, beamstepM[i].beta + newscore); 423 | #endif 424 | } 425 | 426 | // 3. 
M2 = M + P 427 | int k = i - 1; 428 | if ( k > 0 && !bestM[k].empty()) { 429 | #ifdef lpv 430 | newscore = - v_score_M1(i, j, j, nuci_1, nuci, nucj, nucj1, seq_length, dangle_mode); 431 | pf_type m1_alpha = newscore/kT; 432 | #else 433 | newscore = score_M1(i, j, j, nuci_1, nuci, nucj, nucj1, seq_length); 434 | pf_type m1_alpha = newscore; 435 | #endif 436 | pf_type m1_plus_P_alpha = state.alpha + m1_alpha; 437 | for (auto &m : bestM[k]) { 438 | int newi = m.first; 439 | State& m_state = m.second; 440 | Fast_LogPlusEquals(state.beta, (beamstepM2[newi].beta + m_state.alpha + m1_alpha)); 441 | Fast_LogPlusEquals(m_state.beta, (beamstepM2[newi].beta + m1_plus_P_alpha)); 442 | } 443 | } 444 | 445 | // 4. C = C + P 446 | { 447 | int k = i - 1; 448 | if (k >= 0) { 449 | int nuck = nuci_1; 450 | int nuck1 = nuci; 451 | #ifdef lpv 452 | newscore = - v_score_external_paired(k+1, j, nuck, nuck1, 453 | nucj, nucj1, seq_length, dangle_mode); 454 | pf_type external_paired_alpha_plus_beamstepC_beta = beamstepC.beta + newscore/kT; 455 | 456 | #else 457 | newscore = score_external_paired(k+1, j, nuck, nuck1, nucj, nucj1, seq_length); 458 | pf_type external_paired_alpha_plus_beamstepC_beta = beamstepC.beta + newscore; 459 | #endif 460 | Fast_LogPlusEquals(bestC[k].beta, state.alpha + external_paired_alpha_plus_beamstepC_beta); 461 | Fast_LogPlusEquals(state.beta, bestC[k].alpha + external_paired_alpha_plus_beamstepC_beta); 462 | } else { 463 | // value_type newscore; 464 | #ifdef lpv 465 | newscore = - v_score_external_paired(0, j, -1, nucs[0], 466 | nucj, nucj1, seq_length, dangle_mode); 467 | Fast_LogPlusEquals(state.beta, (beamstepC.beta + newscore/kT)); 468 | #else 469 | newscore = score_external_paired(0, j, -1, nucs[0], 470 | nucj, nucj1, seq_length); 471 | Fast_LogPlusEquals(state.beta, beamstepC.beta + newscore); 472 | #endif 473 | } 474 | } 475 | } 476 | } 477 | 478 | // beam of Multi 479 | { 480 | for(auto& item : beamstepMulti) { 481 | int i = item.first; 482 | State& 
state = item.second; 483 | 484 | int nuci = nucs[i]; 485 | int nuci1 = nucs[i+1]; 486 | int jnext = next_pair[nuci][j]; 487 | 488 | // 1. extend (i, j) to (i, jnext) 489 | { 490 | if (jnext != -1) { 491 | #ifdef lpv 492 | Fast_LogPlusEquals(state.beta, (bestMulti[jnext][i].beta)); 493 | #else 494 | newscore = score_multi_unpaired(j, jnext - 1); 495 | Fast_LogPlusEquals(state.beta, bestMulti[jnext][i].beta + newscore); 496 | #endif 497 | } 498 | } 499 | 500 | // 2. generate P (i, j) 501 | { 502 | #ifdef lpv 503 | newscore = - v_score_multi(i, j, nuci, nuci1, nucs[j-1], nucj, seq_length, dangle_mode); 504 | Fast_LogPlusEquals(state.beta, beamstepP[i].beta + newscore/kT); 505 | #else 506 | newscore = score_multi(i, j, nuci, nuci1, nucs[j-1], nucj, seq_length); 507 | Fast_LogPlusEquals(state.beta, beamstepP[i].beta + newscore); 508 | #endif 509 | } 510 | } 511 | } 512 | } // end of for-loop j 513 | 514 | gettimeofday(&bpp_endtime, NULL); 515 | double bpp_elapsed_time = bpp_endtime.tv_sec - bpp_starttime.tv_sec + (bpp_endtime.tv_usec-bpp_starttime.tv_usec)/1000000.0; 516 | 517 | if(is_verbose) fprintf(stderr,"Base Pairing Probabilities Calculation Time: %.2f seconds.\n", bpp_elapsed_time); 518 | 519 | fflush(stdout); 520 | 521 | return; 522 | } 523 | 524 | -------------------------------------------------------------------------------- /src/scripts/script_draw_bpp.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python2.7 2 | 3 | import sys 4 | import math 5 | from collections import defaultdict 6 | logs = sys.stderr 7 | 8 | preamble = r''' 9 | \documentclass{standalone} 10 | %\usepackage{fullpage} 11 | \pagestyle{empty} 12 | \usepackage{tikz} 13 | \usepackage{tkz-euclide} 14 | \usepackage{siunitx} 15 | \usetikzlibrary{shapes, shapes.multipart} 16 | \usepackage{xcolor} 17 | 18 | \usepackage{verbatim} 19 | \usepackage{lipsum} 20 | \begin{document} 21 | ''' 22 | 23 | picturepre = r''' 24 | %\hspace{-3cm} 25 |
%\resizebox{1.2\textwidth}{!}{ 26 | \begin{tikzpicture}[darkstyle/.style={}, scale=2] %circle,draw,fill=gray!10}] 27 | ''' 28 | 29 | print preamble 30 | 31 | dataset = "." 32 | MAXLEN = 5650 33 | MINLEN = 0 34 | circular = True 35 | 36 | lbs = [ 37 | '(', 38 | '[', 39 | '{', 40 | '<' 41 | ] 42 | 43 | rbs = [ 44 | ')', 45 | ']', 46 | '}', 47 | '>' 48 | ] 49 | 50 | counter_clockwise = False # hzhang 51 | rotate = 180 -5 52 | # rotate = 180 53 | 54 | def drawarc_clockwise(a, b, deg, style, length, lengthfix): ## counterclock wise 55 | angle_a = 360./(length+lengthfix)*a 56 | angle_b = 360./(length+lengthfix)*b 57 | angle_a = angle_a-70 + rotate 58 | angle_b = angle_b-70 + rotate 59 | 60 | alpha = (angle_b-angle_a) 61 | pa = angle_a * math.pi / 180. 62 | pb = angle_b * math.pi / 180. 63 | palpha = alpha * math.pi / 180. 64 | if alpha < 170: 65 | print "\\draw[line width = 0.0mm] %s ([shift=(%f:10cm)]0,0) arc (%f:%f:%fcm); %% %d %d %d" % (style, angle_b, 66 | angle_b + 90, 67 | angle_a + 270, 68 | # 10*math.sin(palpha/2.) / math.sin(math.pi/2. - palpha/2.), 69 | 10 * math.tan(palpha/2.), 70 | a, b, length 71 | ) 72 | elif alpha > 190: 73 | beta = 360. - alpha 74 | pbeta = beta * math.pi / 180. 75 | print "\\draw[line width = 0.0mm] %s ([shift=(%f:10cm)]0,0) arc (%f:%f:%fcm); %% %d %d %d" % (style, angle_a, 76 | angle_a + 90, 77 | angle_b - 90, 78 | # 10*math.sin(palpha/2.) / math.sin(math.pi/2. - palpha/2.), 79 | 10 * math.tan(pbeta/2.), 80 | a, b, length 81 | ) 82 | else: 83 | print "\\draw[line width = 0.0mm] %s (%d) to [bend left=%.1f] (%d);" % (style, 84 | a, 85 | 2*deg, 86 | b) 87 | 88 | def drawarc_counterclockwise(a, b, deg, style, length, lengthfix): 89 | a,b = b,a 90 | angle_a = 360./(length+lengthfix)*a 91 | angle_b = 360./(length+lengthfix)*b 92 | angle_a = 450 - angle_a -20 93 | angle_b = 450 - angle_b -20 94 | 95 | alpha = (angle_b-angle_a) 96 | pa = angle_a * math.pi / 180. 97 | pb = angle_b * math.pi / 180. 98 | palpha = alpha * math.pi / 180. 
99 | if alpha < 170: 100 | print "\\draw[line width = 0.0mm] %s ([shift=(%f:10cm)]0,0) arc (%f:%f:%fcm); %% %d %d %d" % (style, angle_b, 101 | angle_b + 90, 102 | angle_a + 270, 103 | # 10*math.sin(palpha/2.) / math.sin(math.pi/2. - palpha/2.), 104 | 10 * math.tan(palpha/2.), 105 | a, b, length 106 | ) 107 | elif alpha > 190: 108 | beta = 360. - alpha 109 | pbeta = beta * math.pi / 180. 110 | print "\\draw[line width = 0.0mm] %s ([shift=(%f:10cm)]0,0) arc (%f:%f:%fcm); %% %d %d %d" % (style, angle_a, 111 | angle_a + 90, 112 | angle_b - 90, 113 | # 10*math.sin(palpha/2.) / math.sin(math.pi/2. - palpha/2.), 114 | 10 * math.tan(pbeta/2.), 115 | a, b, length 116 | ) 117 | else: 118 | print "\\draw[line width = 0.0mm] %s (%d) to [bend left=%.1f] (%d);" % (style, 119 | a, 120 | 2*deg, 121 | b) 122 | 123 | def agree(pres, pref, a, b): ## pres[a] = b 124 | if pref[a] == b: 125 | return True 126 | elif pref.get(a-1,-1) == b or pref.get(a+1,-1) == b: 127 | return True 128 | elif pref.get(b-1,-1) == a or pref.get(b+1,-1) == a: 129 | return True 130 | else: 131 | return False 132 | 133 | def pair_agree(pres, pref, a, b): 134 | if pres[a] == b or pres.get(a-1,-1) == b or pres.get(a+1,-1) == b or pres.get(b-1,-1) == a or pres.get(b+1,-1) == a: 135 | return True 136 | else: 137 | return False 138 | 139 | 140 | bpp_file = sys.argv[1] 141 | # bpp_dict = defaultdict(int) 142 | bpp_dict = [] 143 | for line in open(bpp_file).readlines(): 144 | if line.startswith(">"): continue 145 | line = line.strip().split() 146 | if len(line) != 3: continue 147 | i, j, prob = int(line[0]), int(line[1]), float(line[2]) 148 | # print >> logs, "%d %d %f" % (i, j, prob) 149 | 150 | if prob > 3e-3: # threshold to avoid "Dimension too large" error 151 | # bpp_dict[i,j] = prob 152 | # print >> logs, "%f" % (prob) 153 | if prob > 1.0: 154 | prob = 1.0 155 | bpp_dict.append((i, j, prob)) 156 | bpp_dict.sort(key=lambda x: x[2]) 157 | 158 | 159 | note = False 160 | notes = "" 161 | num_hasknot = 0 162 | for 
index, line in enumerate(sys.stdin): 163 | pairs = [] 164 | goldpairs = [] 165 | pairset = set() 166 | respair = defaultdict(lambda: -1) 167 | goldpair = defaultdict(lambda: -1) 168 | bases = [''] 169 | 170 | pseudoknot = 0 171 | tmp = line.strip().split() 172 | if len(tmp) == 3: 173 | seq, res, ref = tmp 174 | filename = "seq %d" % index 175 | elif len(tmp) == 4: 176 | note = True 177 | seq, res, ref, notes = tmp 178 | elif len(tmp) == 5: 179 | note = True 180 | seq, res, ref, filename, notes = tmp 181 | else: 182 | print "input format error!" 183 | sys.exit(1) 184 | 185 | stacks = [] 186 | for _ in xrange(len(lbs)): 187 | stacks.append([]) 188 | for i, item in enumerate(res): 189 | if item in lbs: 190 | stackindex = lbs.index(item) 191 | stacks[stackindex].append(i) 192 | elif item in rbs: 193 | stackindex = rbs.index(item) 194 | left = stacks[stackindex][-1] 195 | stacks[stackindex] = stacks[stackindex][:-1] 196 | pairs.append((left+1,i+1, stackindex)) 197 | respair[left+1] = i+1 198 | respair[i+1] = left+1 199 | pairset.add((left+1,i+1)) 200 | notes += ";pair=%d" % (len(respair)//2) 201 | 202 | stacks = [] 203 | for _ in xrange(len(lbs)): 204 | stacks.append([]) 205 | for i, item in enumerate(ref): 206 | if item in lbs: 207 | stackindex = lbs.index(item) 208 | stacks[stackindex].append(i) 209 | elif item in rbs: 210 | stackindex = rbs.index(item) 211 | left = stacks[stackindex][-1] 212 | stacks[stackindex] = stacks[stackindex][:-1] 213 | goldpairs.append((left+1,i+1, stackindex)) 214 | goldpair[left+1] = i+1 215 | goldpair[i+1] = left+1 216 | 217 | length = len(seq) 218 | bases = [''] + list(seq) 219 | 220 | if length > MAXLEN: 221 | print >> logs, "%s too long (%d)" % (filename, length) 222 | continue # too long 223 | 224 | if length < MINLEN: 225 | print >> logs, "%s too short (%d)" % (filename, length) 226 | continue # too short 227 | 228 | print picturepre 229 | 230 | 231 | lengthfix = int(length/9.0) 232 | for i, base in enumerate(bases[1:], 1): 233 | if 
circular: 234 | angle = 360./(length+lengthfix)*i 235 | if counter_clockwise: # hzhang 236 | angle = 450-angle -20 237 | else: 238 | angle = angle-70 + rotate 239 | 240 | print "\\node [darkstyle] (%d) at (%f:10cm) {};" % (i, angle) 241 | if length <= 100: 242 | gap = 5 243 | elif length <= 200: 244 | gap = 10 245 | elif length <= 300: 246 | gap = 20 247 | elif length <= 700: 248 | gap = 50 249 | elif length <= 2000: 250 | gap = 100 251 | elif length <= 3000: 252 | gap = 200 253 | elif length <= 5000: 254 | gap = 300 255 | else: 256 | gap = 400 257 | 258 | if i % gap == 0 and i < len(bases)-10: 259 | print "\\node [scale=2] (%d,1) at (%f:10.8cm) {\Huge %d};" % (i, angle, i) 260 | if i > 1: 261 | print "\\draw (%d.center) -- (%d.center);" % (i, i-1) 262 | 263 | else: 264 | print "\\node [darkstyle] (%d) at (%d,0) {};" % (i, i) 265 | if i % 5 == 0: 266 | print "\\node [] (%d,1) at (%d,-1) {%d};" % (i, i, i) 267 | 268 | if circular: 269 | 270 | for j in range(4): 271 | i += 1 272 | angle = 360./(length+lengthfix)*(i-4) 273 | 274 | if counter_clockwise: # hzhang 275 | angle = 450-angle -20 276 | else: 277 | angle = angle-70 + rotate 278 | 279 | print "\\node [darkstyle] (%d) at (%f:10cm) {};" % (i, angle) 280 | 281 | angle = 360./(length+lengthfix) 282 | angle = 450-angle 283 | if length <= 50: 284 | angle5 = angle - 30 285 | angle3 = angle + 30 286 | elif length <= 100: 287 | angle5 = angle - 20 288 | angle3 = angle + 20 289 | elif length <= 200: 290 | angle5 = angle - 17 291 | angle3 = angle + 17 292 | else: 293 | angle5 = angle - 13 294 | angle3 = angle + 13 295 | print "\\node [scale=2](3prime) at (%f:10cm) {\LARGE \\textbf{%s}};" % (angle3, "5'") 296 | print "\\node [right=9.5cm of 3prime, scale=2] {\LARGE \\textbf{%s}};" % "3'" 297 | 298 | 299 | # add for bpp 300 | ##################################### 301 | for a, b, prob in bpp_dict: 302 | color = "red" # not in gold 303 | if pair_agree(goldpair, respair, a, b): # in gold 304 | color = "blue" 305 | lw = 
",thick" 306 | prob *= 100 307 | color += "!" + str(prob) 308 | style = "[" + color + lw + "]" 309 | 310 | if circular: 311 | dist = b - a 312 | revdist = length+lengthfix - dist 313 | 314 | deg = 90 * (0.5-(dist+.0) / (length+lengthfix+.0)) 315 | 316 | else: 317 | deg = 20 318 | 319 | if counter_clockwise: 320 | drawarc_counterclockwise(a, b, deg, style, length, lengthfix) 321 | else: 322 | drawarc_clockwise(a, b, deg, style, length, lengthfix) 323 | 324 | print "\\end{tikzpicture}" 325 | 326 | print "\\end{document}" 327 | 328 | print >> logs, "%d out of %d sequences have pseudoknots" % (num_hasknot, index) #TODO 329 | -------------------------------------------------------------------------------- /testseq: -------------------------------------------------------------------------------- 1 | UGAGUUCUCGAUCUCUAAAAUCG 2 | AAAACGGUCCUUAUCAGGACCAAACA 3 | AUUCUUGCUUCAACAGUGUUUGAACGGAAU 4 | UCGGCCACAAACACACAAUCUACUGUUGGUCGA 5 | GUUUUUAUCUUACACACGCUUGUGUAAGAUAGUUA 6 | -------------------------------------------------------------------------------- /vis_examples/bpp_plot.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinearFold/LinearPartition/b450fb3e63189073b68d385589035f992080aa3a/vis_examples/bpp_plot.pdf -------------------------------------------------------------------------------- /vis_examples/bpp_plot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinearFold/LinearPartition/b450fb3e63189073b68d385589035f992080aa3a/vis_examples/bpp_plot.png -------------------------------------------------------------------------------- /vis_examples/heatmap.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinearFold/LinearPartition/b450fb3e63189073b68d385589035f992080aa3a/vis_examples/heatmap.pdf 
-------------------------------------------------------------------------------- /vis_examples/heatmap.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinearFold/LinearPartition/b450fb3e63189073b68d385589035f992080aa3a/vis_examples/heatmap.png --------------------------------------------------------------------------------