├── .gitignore ├── README.md ├── data ├── FB15K237 │ ├── FB15K237.pickle │ ├── README.txt │ ├── entities.dict │ ├── relations.dict │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── WN18RR │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── WN18RR_v1 │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── WN18RR_v1_ind │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── WN18RR_v2 │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── WN18RR_v2_ind │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── WN18RR_v3 │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── WN18RR_v3_ind │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── WN18RR_v4 │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── WN18RR_v4_ind │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── fb237_v1 │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── fb237_v1_ind │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── fb237_v2 │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── fb237_v2_ind │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── fb237_v3 │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── fb237_v3_ind │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── fb237_v4 │ ├── test.txt │ ├── train.txt │ └── valid.txt └── fb237_v4_ind │ ├── test.txt │ ├── train.txt │ └── valid.txt ├── managers ├── __pycache__ │ ├── evaluator.cpython-36.pyc │ └── trainer.cpython-36.pyc ├── evaluator.py └── trainer.py ├── model └── dgl │ ├── __init__.py │ ├── __pycache__ │ ├── __init__.cpython-36.pyc │ ├── aggregators.cpython-36.pyc │ ├── batch_gru.cpython-36.pyc │ ├── discriminator.cpython-36.pyc │ ├── graph_classifier.cpython-36.pyc │ ├── layers.cpython-36.pyc │ └── rgcn_model.cpython-36.pyc │ ├── aggregators.py │ ├── batch_gru.py │ ├── discriminator.py │ ├── graph_classifier.py │ ├── layers.py │ └── rgcn_model.py ├── requirements.txt ├── snri.png ├── subgraph_extraction ├── __pycache__ │ ├── datasets.cpython-36.pyc │ └── graph_sampler.cpython-36.pyc ├── datasets.py └── graph_sampler.py ├── test_auc.py ├── test_ranking.py ├── train.py └── utils 
├── __pycache__ ├── data_utils.cpython-36.pyc ├── dgl_utils.cpython-36.pyc ├── graph_utils.cpython-36.pyc └── initialization_utils.cpython-36.pyc ├── clean_data.py ├── data_utils.py ├── dgl_utils.py ├── graph_utils.py ├── initialization_utils.py └── prepare_meta_data.py
/.gitignore:
--------------------------------------------------------------------------------
1 | .DS_Store
2 | __pycache__/
3 | tmp.txt
4 | experiments/
5 | data/
6 |
7 | #Saved and downloaded data files
8 | *.nt.gz
9 | *.npz
10 | *.pkl
11 | *.ipynb
12 | *.npy
13 | *.pyc
14 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # SNRI - Subgraph Neighboring Relations Infomax for Inductive Link Prediction on Knowledge Graphs
2 |
3 | Code for the paper [Subgraph Neighboring Relations Infomax for Inductive Link Prediction on Knowledge Graphs](https://arxiv.org/abs/2208.00850) by Xiaohan Xu, Peng Zhang, Yongquan He, Chengpeng Chao and Chaoyang Yan. IJCAI 2022.
4 |
5 |
6 |
7 | Inductive link prediction on knowledge graphs aims to predict missing links between unseen entities, i.e., entities that do not appear during training. Most previous works learn entity-specific embeddings, which cannot handle unseen entities. Several recent methods use enclosing subgraphs to obtain inductive ability. However, these works consider only the enclosing part of the subgraph and omit the complete neighboring relations, so some neighboring relations are neglected and sparse subgraphs are hard to handle. To address this, we propose Subgraph Neighboring Relations Infomax (SNRI), which exploits complete neighboring relations from two aspects: a *neighboring relational feature* for node features and a *neighboring relational path* for sparse subgraphs.
To further model neighboring relations in a global way, we apply mutual information (MI) maximization to knowledge graphs. Experiments show that SNRI outperforms existing state-of-the-art methods by a large margin on the inductive link prediction task, and verify the effectiveness of exploring complete neighboring relations in a global way to characterize node features and reason on sparse subgraphs.
8 |
9 | ## Requirements
10 | dgl
11 | lmdb
12 | networkx
13 | scikit-learn
14 | torch
15 | tqdm
16 |
17 | ## Usage
18 |
19 | Training and test data are located in the `data` folder.
20 |
21 | ### Training
22 |
23 | Train on the WN18RR datasets using the following commands:
24 |
25 | ```shell script
26 | python train.py -d WN18RR_v1 -e snri_wn_v1
27 | python train.py -d WN18RR_v2 -e snri_wn_v2
28 | python train.py -d WN18RR_v3 -e snri_wn_v3
29 | python train.py -d WN18RR_v4 -e snri_wn_v4
30 | ```
31 |
32 | Train on the FB15K237 datasets using the following commands:
33 | ```shell script
34 | python train.py -d fb237_v1 -e snri_fb_v1
35 | python train.py -d fb237_v2 -e snri_fb_v2
36 | python train.py -d fb237_v3 -e snri_fb_v3
37 | python train.py -d fb237_v4 -e snri_fb_v4
38 | ```
39 |
40 | ### Evaluation
41 |
42 | Evaluate a trained model using commands such as:
43 | ```shell script
44 | python test_auc.py -d WN18RR_v4_ind -e snri_wn_v4
45 | python test_ranking.py -d WN18RR_v4_ind -e snri_wn_v4
46 | ```
47 |
48 | ### Ablation Study
49 |
50 | Run the following commands to train the ablation variants:
51 | ```shell script
52 | python train.py -d WN18RR_v4 -e snri_wn_v4 --nei_rel_path # without neighboring relational path module
53 | python train.py -d WN18RR_v4 -e snri_wn_v4 --init_nei_rels no # without neighboring relational feature module
54 | python train.py -d WN18RR_v4 -e snri_wn_v4 --coef_dgi_loss 0 # without MI module
55 | ```
56 |
57 | ## Citation
58 | If you use the source code included in this toolkit in your work, please cite the following paper.
The BibTeX entry is listed below:
59 |
60 | @inproceedings{ijcai2022p325,
61 |     title = {Subgraph Neighboring Relations Infomax for Inductive Link Prediction on Knowledge Graphs},
62 |     author = {Xu, Xiaohan and Zhang, Peng and He, Yongquan and Chao, Chengpeng and Yan, Chaoyang},
63 |     booktitle = {Proceedings of the Thirty-First International Joint Conference on
64 |     Artificial Intelligence, {IJCAI-22}},
65 |     publisher = {International Joint Conferences on Artificial Intelligence Organization},
66 |     editor = {Luc De Raedt},
67 |     pages = {2341--2347},
68 |     year = {2022},
69 |     month = {7},
70 |     note = {Main Track},
71 |     doi = {10.24963/ijcai.2022/325},
72 |     url = {https://doi.org/10.24963/ijcai.2022/325},
73 | }
74 |
75 | ## Acknowledgement
76 | Our implementation refers to the code of [GraIL](https://github.com/kkteru/grail). We thank its authors for their contributions.
77 |
--------------------------------------------------------------------------------
/data/FB15K237/FB15K237.pickle:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/data/FB15K237/FB15K237.pickle
--------------------------------------------------------------------------------
/data/FB15K237/README.txt:
--------------------------------------------------------------------------------
1 | FB15K-237 Knowledge Base Completion Dataset
2 |
3 | This dataset contains knowledge base relation triples and textual mentions of Freebase entity pairs, as used in the work published in [1] and [2].
4 | The knowledge base triples are a subset of the FB15K set [3], originally derived from Freebase. The textual mentions are derived from 200 million sentences from the ClueWeb12 [5] corpus coupled with Freebase entity mention annotations [4].
5 |
6 |
7 | FILE FORMAT DETAILS
8 |
9 | The files train.txt, valid.txt, and test.txt contain the training, development, and test set knowledge base triples used in both [1] and [2].
10 | The file text_cvsc.txt contains the textual triples used in [2] and the file text_emnlp.txt contains the textual triples used in [1].
11 |
12 | The knowledge base triples contain lines like this:
13 |
14 | /m/0grwj /people/person/profession /m/05sxg2
15 |
16 | The format is:
17 |
18 | mid1 relation mid2
19 |
20 | The separator is a tab character; the mids are Freebase ids of entities, and the relation is a single or a two-hop relation from Freebase, where an intermediate complex value type entity has been collapsed out.
21 |
22 | The textual mentions files have lines like this:
23 |
24 | /m/02qkt [XXX]:<-nn>:fact:<-pobj>:in:<-prep>:game:<-nsubj>:'s::pivot::[YYY] /m/05sb1 3
25 |
26 | This indicates the mids of two Freebase entities, together with a fully lexicalized dependency path between the entities. The last element in the tuple is the number of occurrences of the specified entity pair with the given dependency path in sentences from ClueWeb12.
27 | The dependency paths are specified as sequences of words (like the word "fact" above) and labeled dependency links (like <-nsubj> above). The direction of traversal of a dependency arc is indicated by whether there is a - sign in front of the arc label, e.g. <-nsubj> vs <nsubj>.
28 |
29 |
30 | REFERENCES
31 |
32 | [1] Kristina Toutanova, Danqi Chen, Patrick Pantel, Hoifung Poon, Pallavi Choudhury, and Michael Gamon. Representing text for joint embedding of text and knowledge bases. In Proceedings of EMNLP 2015.
33 | [2] Kristina Toutanova and Danqi Chen. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and Their Compositionality 2015.
34 | [3] Antoine Bordes, Nicolas Usunier, Alberto Garcia Duran, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multirelational data. In Advances in Neural Information Processing Systems (NIPS) 2013.
35 | [4] Evgeniy Gabrilovich, Michael Ringgaard, and Amarnag Subramanya.
FACC1: Freebase annotation of ClueWeb corpora, Version 1 (release date 2013-06-26, format version 1, correction level 0). http://lemurproject.org/clueweb12/FACC1/ 36 | [5] http://lemurproject.org/clueweb12/ 37 | 38 | 39 | CONTACT 40 | 41 | Please contact Kristina Toutanova kristout@microsoft.com if you have questions about the dataset. 42 | -------------------------------------------------------------------------------- /data/FB15K237/relations.dict: -------------------------------------------------------------------------------- 1 | 0 /organization/organization/headquarters./location/mailing_address/state_province_region 2 | 1 /education/educational_institution/colors 3 | 2 /people/person/profession 4 | 3 /film/film/costume_design_by 5 | 4 /film/film/genre 6 | 5 /celebrities/celebrity/celebrity_friends./celebrities/friendship/friend 7 | 6 /tv/tv_producer/programs_produced./tv/tv_producer_term/producer_type 8 | 7 /film/film/executive_produced_by 9 | 8 /sports/sports_team/roster./basketball/basketball_roster_position/position 10 | 9 /award/award_nominee/award_nominations./award/award_nomination/nominated_for 11 | 10 /award/award_category/winners./award/award_honor/award_winner 12 | 11 /award/award_winner/awards_won./award/award_honor/award_winner 13 | 12 /music/artist/origin 14 | 13 /food/food/nutrients./food/nutrition_fact/nutrient 15 | 14 /film/film/distributors./film/film_film_distributor_relationship/region 16 | 15 /time/event/instance_of_recurring_event 17 | 16 /sports/professional_sports_team/draft_picks./sports/sports_league_draft_pick/school 18 | 17 /film/film/language 19 | 18 /location/statistical_region/places_exported_to./location/imports_and_exports/exported_to 20 | 19 /music/group_member/membership./music/group_membership/group 21 | 20 /tv/tv_network/programs./tv/tv_network_duration/program 22 | 21 /award/award_winning_work/awards_won./award/award_honor/award_winner 23 | 22 /people/person/places_lived./people/place_lived/location 24 | 23 
/travel/travel_destination/climate./travel/travel_destination_monthly_climate/month 25 | 24 /broadcast/content/artist 26 | 25 /base/americancomedy/celebrity_impressionist/celebrities_impersonated 27 | 26 /base/popstra/celebrity/breakup./base/popstra/breakup/participant 28 | 27 /organization/organization/place_founded 29 | 28 /people/person/employment_history./business/employment_tenure/company 30 | 29 /location/statistical_region/gdp_nominal_per_capita./measurement_unit/dated_money_value/currency 31 | 30 /people/person/place_of_birth 32 | 31 /location/location/contains 33 | 32 /base/popstra/celebrity/dated./base/popstra/dated/participant 34 | 33 /user/ktrueman/default_domain/international_organization/member_states 35 | 34 /government/legislative_session/members./government/government_position_held/legislative_sessions 36 | 35 /film/film/estimated_budget./measurement_unit/dated_money_value/currency 37 | 36 /organization/non_profit_organization/registered_with./organization/non_profit_registration/registering_agency 38 | 37 /organization/organization/headquarters./location/mailing_address/country 39 | 38 /base/biblioness/bibs_location/country 40 | 39 /education/educational_institution/students_graduates./education/education/student 41 | 40 /music/group_member/membership./music/group_membership/role 42 | 41 /location/administrative_division/country 43 | 42 /award/ranked_item/appears_in_ranked_lists./award/ranking/list 44 | 43 /base/eating/practicer_of_diet/diet 45 | 44 /film/special_film_performance_type/film_performance_type./film/performance/film 46 | 45 /award/award_nominated_work/award_nominations./award/award_nomination/nominated_for 47 | 46 /film/director/film 48 | 47 /base/x2010fifaworldcupsouthafrica/world_cup_squad/current_world_cup_squad./base/x2010fifaworldcupsouthafrica/current_world_cup_squad/current_club 49 | 48 /olympics/olympic_games/participating_countries 50 | 49 /music/performance_role/regular_performances./music/group_membership/role 51 | 50 
/music/artist/track_contributions./music/track_contribution/role 52 | 51 /base/aareas/schema/administrative_area/administrative_area_type 53 | 52 /film/film/distributors./film/film_film_distributor_relationship/film_distribution_medium 54 | 53 /olympics/olympic_games/sports 55 | 54 /soccer/football_team/current_roster./soccer/football_roster_position/position 56 | 55 /olympics/olympic_participating_country/athletes./olympics/olympic_athlete_affiliation/olympics 57 | 56 /military/military_combatant/military_conflicts./military/military_combatant_group/combatants 58 | 57 /tv/tv_personality/tv_regular_appearances./tv/tv_regular_personal_appearance/program 59 | 58 /common/topic/webpage./common/webpage/category 60 | 59 /music/genre/artists 61 | 60 /film/film/featured_film_locations 62 | 61 /location/location/adjoin_s./location/adjoining_relationship/adjoins 63 | 62 /sports/sports_team/colors 64 | 63 /tv/tv_program/program_creator 65 | 64 /business/business_operation/operating_income./measurement_unit/dated_money_value/currency 66 | 65 /ice_hockey/hockey_team/current_roster./sports/sports_team_roster/position 67 | 66 /film/film/prequel 68 | 67 /organization/endowed_organization/endowment./measurement_unit/dated_money_value/currency 69 | 68 /film/film_set_designer/film_sets_designed 70 | 69 /film/film/film_art_direction_by 71 | 70 /language/human_language/countries_spoken_in 72 | 71 /people/marriage_union_type/unions_of_this_type./people/marriage/location_of_ceremony 73 | 72 /tv/tv_writer/tv_programs./tv/tv_program_writer_relationship/tv_program 74 | 73 /government/political_party/politicians_in_this_party./government/political_party_tenure/politician 75 | 74 /sports/sports_team/roster./american_football/football_historical_roster_position/position_s 76 | 75 /film/film/release_date_s./film/film_regional_release_date/film_release_region 77 | 76 /film/film/release_date_s./film/film_regional_release_date/film_regional_debut_venue 78 | 77 
/award/award_winning_work/awards_won./award/award_honor/honored_for 79 | 78 /location/capital_of_administrative_division/capital_of./location/administrative_division_capital_relationship/administrative_division 80 | 79 /location/hud_foreclosure_area/estimated_number_of_mortgages./measurement_unit/dated_integer/source 81 | 80 /award/award_category/winners./award/award_honor/ceremony 82 | 81 /people/person/languages 83 | 82 /film/actor/film./film/performance/film 84 | 83 /business/business_operation/revenue./measurement_unit/dated_money_value/currency 85 | 84 /base/petbreeds/city_with_dogs/top_breeds./base/petbreeds/dog_city_relationship/dog_breed 86 | 85 /sports/sports_team_location/teams 87 | 86 /film/film/music 88 | 87 /sports/professional_sports_team/draft_picks./sports/sports_league_draft_pick/draft 89 | 88 /education/educational_institution/students_graduates./education/education/major_field_of_study 90 | 89 /people/ethnicity/geographic_distribution 91 | 90 /sports/sports_league/teams./sports/sports_league_participation/team 92 | 91 /education/educational_degree/people_with_this_degree./education/education/student 93 | 92 /government/politician/government_positions_held./government/government_position_held/jurisdiction_of_office 94 | 93 /base/aareas/schema/administrative_area/capital 95 | 94 /film/film/film_production_design_by 96 | 95 /user/jg/default_domain/olympic_games/sports 97 | 96 /award/award_category/category_of 98 | 97 /education/educational_institution/school_type 99 | 98 /sports/sports_team/roster./baseball/baseball_roster_position/position 100 | 99 /tv/tv_producer/programs_produced./tv/tv_producer_term/program 101 | 100 /location/us_county/county_seat 102 | 101 /education/university/fraternities_and_sororities 103 | 102 /film/film/other_crew./film/film_crew_gig/crewmember 104 | 103 /military/military_conflict/combatants./military/military_combatant_group/combatants 105 | 104 /base/popstra/celebrity/canoodled./base/popstra/canoodled/participant 106 
| 105 /education/educational_degree/people_with_this_degree./education/education/institution 107 | 106 /organization/organization/child./organization/organization_relationship/child 108 | 107 /travel/travel_destination/how_to_get_here./travel/transportation/mode_of_transportation 109 | 108 /award/award_category/nominees./award/award_nomination/nominated_for 110 | 109 /medicine/symptom/symptom_of 111 | 110 /people/ethnicity/people 112 | 111 /film/film/other_crew./film/film_crew_gig/film_crew_role 113 | 112 /government/governmental_body/members./government/government_position_held/legislative_sessions 114 | 113 /business/business_operation/industry 115 | 114 /film/film/country 116 | 115 /people/profession/specialization_of 117 | 116 /location/hud_county_place/place 118 | 117 /organization/role/leaders./organization/leadership/organization 119 | 118 /music/instrument/instrumentalists 120 | 119 /time/event/locations 121 | 120 /film/film/produced_by 122 | 121 /music/performance_role/track_performances./music/track_contribution/role 123 | 122 /film/film/runtime./film/film_cut/film_release_region 124 | 123 /olympics/olympic_sport/athletes./olympics/olympic_athlete_affiliation/country 125 | 124 /tv/tv_program/regular_cast./tv/regular_tv_appearance/actor 126 | 125 /award/award_nominee/award_nominations./award/award_nomination/award 127 | 126 /people/person/spouse_s./people/marriage/type_of_union 128 | 127 /film/actor/dubbing_performances./film/dubbing_performance/language 129 | 128 /sports/sports_position/players./sports/sports_team_roster/team 130 | 129 /award/award_ceremony/awards_presented./award/award_honor/honored_for 131 | 130 /sports/sports_team/sport 132 | 131 /tv/tv_program/country_of_origin 133 | 132 /award/award_category/disciplines_or_subjects 134 | 133 /base/popstra/celebrity/friendship./base/popstra/friendship/participant 135 | 134 /people/ethnicity/languages_spoken 136 | 135 /tv/tv_program/genre 137 | 136 
/education/educational_degree/people_with_this_degree./education/education/major_field_of_study 138 | 137 /people/person/sibling_s./people/sibling_relationship/sibling 139 | 138 /business/business_operation/assets./measurement_unit/dated_money_value/currency 140 | 139 /olympics/olympic_games/medals_awarded./olympics/olympic_medal_honor/medal 141 | 140 /film/film/edited_by 142 | 141 /film/actor/film./film/performance/special_performance_type 143 | 142 /education/educational_institution_campus/educational_institution 144 | 143 /film/film/written_by 145 | 144 /sports/sports_position/players./sports/sports_team_roster/position 146 | 145 /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/service_location 147 | 146 /film/film/personal_appearances./film/personal_film_appearance/person 148 | 147 /user/tsegaran/random/taxonomy_subject/entry./user/tsegaran/random/taxonomy_entry/taxonomy 149 | 148 /people/person/gender 150 | 149 /people/deceased_person/place_of_death 151 | 150 /location/statistical_region/rent50_2./measurement_unit/dated_money_value/currency 152 | 151 /music/performance_role/guest_performances./music/recording_contribution/performance_role 153 | 152 /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/medal 154 | 153 /dataworld/gardening_hint/split_to 155 | 154 /location/country/capital 156 | 155 /award/award_winning_work/awards_won./award/award_honor/award 157 | 156 /tv/tv_program/tv_producer./tv/tv_producer_term/producer_type 158 | 157 /base/biblioness/bibs_location/state 159 | 158 /influence/influence_node/peers./influence/peer_relationship/peers 160 | 159 /film/film/story_by 161 | 160 /location/administrative_division/first_level_division_of 162 | 161 /baseball/baseball_team/team_stats./baseball/baseball_team_stats/season 163 | 162 /award/hall_of_fame/inductees./award/hall_of_fame_induction/inductee 164 | 163 /sports/sports_team/roster./american_football/football_roster_position/position 165 | 
164 /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/service_language 166 | 165 /sports/sports_position/players./american_football/football_historical_roster_position/position_s 167 | 166 /media_common/netflix_genre/titles 168 | 167 /people/person/spouse_s./people/marriage/spouse 169 | 168 /people/cause_of_death/people 170 | 169 /organization/organization_founder/organizations_founded 171 | 170 /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office 172 | 171 /tv/tv_program/languages 173 | 172 /base/popstra/location/vacationers./base/popstra/vacation_choice/vacationer 174 | 173 /influence/influence_node/influenced_by 175 | 174 /location/country/second_level_divisions 176 | 175 /sports/sport/pro_athletes./sports/pro_sports_played/athlete 177 | 176 /government/legislative_session/members./government/government_position_held/district_represented 178 | 177 /olympics/olympic_sport/athletes./olympics/olympic_athlete_affiliation/olympics 179 | 178 /medicine/disease/risk_factors 180 | 179 /award/award_ceremony/awards_presented./award/award_honor/award_winner 181 | 180 /american_football/football_team/current_roster./sports/sports_team_roster/position 182 | 181 /music/artist/contribution./music/recording_contribution/performance_role 183 | 182 /education/educational_institution/campuses 184 | 183 /location/country/form_of_government 185 | 184 /base/marchmadness/ncaa_basketball_tournament/seeds./base/marchmadness/ncaa_tournament_seed/team 186 | 185 /education/field_of_study/students_majoring./education/education/major_field_of_study 187 | 186 /people/person/nationality 188 | 187 /film/film/release_date_s./film/film_regional_release_date/film_release_distribution_medium 189 | 188 /film/film/film_format 190 | 189 /soccer/football_player/current_team./sports/sports_team_roster/team 191 | 190 
/government/politician/government_positions_held./government/government_position_held/legislative_sessions 192 | 191 /film/film/cinematography 193 | 192 /people/deceased_person/place_of_burial 194 | 193 /base/aareas/schema/administrative_area/administrative_parent 195 | 194 /music/genre/parent_genre 196 | 195 /sports/sports_league_draft/picks./sports/sports_league_draft_pick/school 197 | 196 /location/statistical_region/religions./location/religion_percentage/religion 198 | 197 /location/location/time_zones 199 | 198 /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/olympics 200 | 199 /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film 201 | 200 /film/film/dubbing_performances./film/dubbing_performance/actor 202 | 201 /organization/organization/headquarters./location/mailing_address/citytown 203 | 202 /sports/pro_athlete/teams./sports/sports_team_roster/team 204 | 203 /education/university/local_tuition./measurement_unit/dated_money_value/currency 205 | 204 /music/record_label/artist 206 | 205 /business/job_title/people_with_this_title./business/employment_tenure/company 207 | 206 /music/instrument/family 208 | 207 /user/alexander/philosophy/philosopher/interests 209 | 208 /location/statistical_region/gdp_real./measurement_unit/adjusted_money_value/adjustment_currency 210 | 209 /tv/non_character_role/tv_regular_personal_appearances./tv/tv_regular_personal_appearance/person 211 | 210 /location/hud_county_place/county 212 | 211 /government/politician/government_positions_held./government/government_position_held/basic_title 213 | 212 /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/contact_category 214 | 213 /people/person/religion 215 | 214 /education/university/domestic_tuition./measurement_unit/dated_money_value/currency 216 | 215 /award/award_nominee/award_nominations./award/award_nomination/award_nominee 217 | 216 
/music/performance_role/regular_performances./music/group_membership/group 218 | 217 /education/university/international_tuition./measurement_unit/dated_money_value/currency 219 | 218 /film/film/film_festivals 220 | 219 /location/statistical_region/gdp_nominal./measurement_unit/dated_money_value/currency 221 | 220 /base/saturdaynightlive/snl_cast_member/seasons./base/saturdaynightlive/snl_season_tenure/cast_members 222 | 221 /education/field_of_study/students_majoring./education/education/student 223 | 222 /location/statistical_region/gni_per_capita_in_ppp_dollars./measurement_unit/dated_money_value/currency 224 | 223 /base/localfood/seasonal_month/produce_available./base/localfood/produce_availability/seasonal_months 225 | 224 /film/film_subject/films 226 | 225 /soccer/football_team/current_roster./sports/sports_team_roster/position 227 | 226 /location/location/partially_contains 228 | 227 /celebrities/celebrity/sexual_relationships./celebrities/romantic_relationship/celebrity 229 | 228 /people/person/spouse_s./people/marriage/location_of_ceremony 230 | 229 /base/culturalevent/event/entity_involved 231 | 230 /organization/organization_member/member_of./organization/organization_membership/organization 232 | 231 /base/locations/continents/countries_within 233 | 232 /location/country/official_language 234 | 233 /film/film/production_companies 235 | 234 /base/schemastaging/person_extra/net_worth./measurement_unit/dated_money_value/currency 236 | 235 /medicine/disease/notable_people_with_this_condition 237 | 236 /film/person_or_entity_appearing_in_film/films./film/personal_film_appearance/type_of_appearance 238 | -------------------------------------------------------------------------------- /data/WN18RR_v1_ind/test.txt: -------------------------------------------------------------------------------- 1 | 00445169 _similar_to 00444519 2 | 02666239 _derivationally_related_form 01410363 3 | 03420559 _derivationally_related_form 01087197 4 | 01149494 _also_see 01364008 5 
| 00233335 _derivationally_related_form 05162455 6 | 10341660 _derivationally_related_form 02987454 7 | 03354613 _derivationally_related_form 01340439 8 | 00088481 _derivationally_related_form 02272549 9 | 02979662 _derivationally_related_form 01662771 10 | 01021128 _derivationally_related_form 05925366 11 | 02512305 _derivationally_related_form 01153548 12 | 00456740 _derivationally_related_form 07369604 13 | 01292885 _derivationally_related_form 07976936 14 | 01410363 _derivationally_related_form 04750164 15 | 00082308 _derivationally_related_form 03075768 16 | 07520612 _derivationally_related_form 01780941 17 | 05849789 _derivationally_related_form 02630189 18 | 08612786 _derivationally_related_form 01276361 19 | 00847340 _derivationally_related_form 01428853 20 | 02509287 _derivationally_related_form 00808182 21 | 00527572 _derivationally_related_form 13491060 22 | 04750164 _derivationally_related_form 01410363 23 | 00267349 _derivationally_related_form 01590171 24 | 00915722 _derivationally_related_form 01742726 25 | 07423001 _hypernym 07355887 26 | 02728440 _derivationally_related_form 00046534 27 | 01167146 _derivationally_related_form 01612053 28 | 03600977 _hypernym 03605915 29 | 03264542 _hypernym 08592656 30 | 02443049 _derivationally_related_form 01135529 31 | 01779165 _derivationally_related_form 04143712 32 | 01779165 _derivationally_related_form 07520612 33 | 01753596 _derivationally_related_form 09972157 34 | 00922438 _hypernym 00921738 35 | 00443384 _derivationally_related_form 14110411 36 | 00064095 _derivationally_related_form 03879854 37 | 00149084 _derivationally_related_form 01285440 38 | 03779621 _derivationally_related_form 01662771 39 | 00353782 _derivationally_related_form 00429060 40 | 04659287 _derivationally_related_form 01026262 41 | 10371741 _derivationally_related_form 00752335 42 | 01662771 _derivationally_related_form 13913566 43 | 01240432 _hypernym 01240210 44 | 05085572 _derivationally_related_form 00444519 45 | 09941964 
_derivationally_related_form 02441022 46 | 00082308 _derivationally_related_form 00354884 47 | 00429060 _derivationally_related_form 00359903 48 | 00444519 _derivationally_related_form 05085572 49 | 00751887 _derivationally_related_form 09941964 50 | 00201923 _derivationally_related_form 00462092 51 | 01130607 _derivationally_related_form 03878963 52 | 01059400 _also_see 02095311 53 | 04713332 _hypernym 04712735 54 | 13999663 _derivationally_related_form 01301410 55 | 03391301 _derivationally_related_form 01586850 56 | 00290740 _derivationally_related_form 00351638 57 | 00833702 _derivationally_related_form 00893955 58 | 01340439 _derivationally_related_form 10080337 59 | 01780941 _hypernym 01779165 60 | 01474513 _also_see 02451113 61 | 00119873 _derivationally_related_form 02987454 62 | 10078806 _derivationally_related_form 01739814 63 | 00795008 _derivationally_related_form 00047317 64 | 10012815 _derivationally_related_form 00650353 65 | 15224293 _derivationally_related_form 00233335 66 | 01687569 _derivationally_related_form 01159964 67 | 02447001 _derivationally_related_form 10298912 68 | 02659763 _derivationally_related_form 04930307 69 | 01819554 _derivationally_related_form 10525134 70 | 09800249 _hypernym 09952163 71 | 02506555 _also_see 02064745 72 | 02700104 _derivationally_related_form 00552841 73 | 00083334 _derivationally_related_form 00149084 74 | 01364008 _also_see 01368192 75 | 01667449 _derivationally_related_form 04033995 76 | 07366289 _derivationally_related_form 02661252 77 | 00595146 _derivationally_related_form 10164233 78 | 01340439 _hypernym 01296462 79 | 02661252 _derivationally_related_form 04802776 80 | 02502536 _derivationally_related_form 00320852 81 | 00898804 _derivationally_related_form 01697027 82 | 03600977 _derivationally_related_form 02660147 83 | 13860793 _derivationally_related_form 00445169 84 | 01643464 _derivationally_related_form 10029068 85 | 07355887 _derivationally_related_form 00152887 86 | 00751887 
_derivationally_related_form 09941383 87 | 00482893 _derivationally_related_form 00185104 88 | 00456740 _derivationally_related_form 05696020 89 | 00233335 _derivationally_related_form 15224293 90 | 10093908 _derivationally_related_form 02702830 91 | 09442838 _derivationally_related_form 00709625 92 | 03496892 _hypernym 03322940 93 | 02646931 _derivationally_related_form 14442530 94 | 01834304 _derivationally_related_form 00410247 95 | 01531375 _also_see 01508719 96 | 00299580 _derivationally_related_form 07369604 97 | 00233335 _derivationally_related_form 10525134 98 | 00046534 _also_see 00044149 99 | 08592656 _hypernym 08512259 100 | 00087152 _also_see 01922763 101 | 02875013 _derivationally_related_form 01467370 102 | 02806907 _derivationally_related_form 06806469 103 | 02441022 _derivationally_related_form 09941964 104 | 00764902 _derivationally_related_form 01205827 105 | 14441825 _derivationally_related_form 00791227 106 | 00651991 _derivationally_related_form 05748285 107 | 07254057 _hypernym 07253637 108 | 01582645 _derivationally_related_form 03234306 109 | 02542280 _derivationally_related_form 00696518 110 | 04181228 _derivationally_related_form 01085474 111 | 00859325 _derivationally_related_form 04630689 112 | 09941571 _hypernym 09943541 113 | 03573282 _derivationally_related_form 00187526 114 | 13860793 _hypernym 00027807 115 | 00413876 _derivationally_related_form 01051331 116 | 07515560 _derivationally_related_form 01922763 117 | 08552138 _derivationally_related_form 02512150 118 | 00462092 _derivationally_related_form 01070892 119 | 00290740 _derivationally_related_form 00355252 120 | 09779790 _derivationally_related_form 00245457 121 | 01467370 _derivationally_related_form 08512736 122 | 01753596 _derivationally_related_form 03178782 123 | 01428853 _derivationally_related_form 10300303 124 | 07337390 _derivationally_related_form 01876907 125 | 00236289 _derivationally_related_form 04181228 126 | 01551871 _verb_group 01684337 127 | 00764902 
_derivationally_related_form 00759551 128 | 09952163 _derivationally_related_form 01765392 129 | 09941571 _derivationally_related_form 00590626 130 | 02700104 _derivationally_related_form 04713118 131 | 01029852 _derivationally_related_form 07202579 132 | 05117660 _hypernym 05093890 133 | 00236289 _hypernym 00233335 134 | 10388440 _derivationally_related_form 02539334 135 | 09779790 _derivationally_related_form 01104406 136 | 01782218 _hypernym 01780202 137 | 03051540 _derivationally_related_form 00050652 138 | 03792334 _derivationally_related_form 01660640 139 | 00661213 _derivationally_related_form 05748786 140 | 01146039 _derivationally_related_form 02553697 141 | 09812338 _derivationally_related_form 02991122 142 | 08612786 _hypernym 08512259 143 | 13971561 _derivationally_related_form 00764902 144 | 07519253 _derivationally_related_form 01779165 145 | 00100044 _derivationally_related_form 00893955 146 | 02204692 _derivationally_related_form 10389398 147 | 00169651 _hypernym 00170844 148 | 03670849 _has_part 02845576 149 | 07177437 _derivationally_related_form 00482893 150 | 05696020 _derivationally_related_form 00456740 151 | 06526291 _derivationally_related_form 10402417 152 | 06773976 _derivationally_related_form 01647867 153 | 13969243 _derivationally_related_form 02700104 154 | 03852280 _hypernym 03574816 155 | 05641959 _derivationally_related_form 00597385 156 | 04748836 _derivationally_related_form 00119524 157 | 01684337 _derivationally_related_form 04157320 158 | 00709625 _derivationally_related_form 00928077 159 | 00321956 _derivationally_related_form 00187526 160 | 00047745 _derivationally_related_form 03051540 161 | 04085873 _hypernym 03315644 162 | 04641153 _hypernym 04640927 163 | 07254057 _derivationally_related_form 01781180 164 | 05844105 _derivationally_related_form 01687569 165 | 10525134 _derivationally_related_form 01301051 166 | 14442530 _hypernym 14441825 167 | 00650016 _derivationally_related_form 10012815 168 | 00696518 _also_see 
02564986 169 | 01697027 _derivationally_related_form 00898804 170 | 01301410 _derivationally_related_form 13998781 171 | 10388924 _derivationally_related_form 00809465 172 | 03779621 _derivationally_related_form 01697027 173 | 00152887 _derivationally_related_form 07355887 174 | 08677628 _derivationally_related_form 02695895 175 | 01612053 _also_see 01123148 176 | 05162455 _hypernym 05161614 177 | 00044149 _verb_group 00044797 178 | 02539334 _derivationally_related_form 00791227 179 | 00040962 _hypernym 00040804 180 | 04051825 _derivationally_related_form 01128071 181 | 04905188 _derivationally_related_form 01026262 182 | 01320009 _derivationally_related_form 00921790 183 | 00354884 _derivationally_related_form 01815185 184 | 03721797 _derivationally_related_form 00921738 185 | 02064745 _derivationally_related_form 02666239 186 | 13491060 _derivationally_related_form 00527572 187 | 02928413 _derivationally_related_form 01498713 188 | 00224901 _derivationally_related_form 09476521 189 | -------------------------------------------------------------------------------- /data/WN18RR_v1_ind/valid.txt: -------------------------------------------------------------------------------- 1 | 09953178 _hypernym 09931640 2 | 01027263 _derivationally_related_form 00299580 3 | 03728811 _derivationally_related_form 01292885 4 | 04928903 _derivationally_related_form 01687569 5 | 01301051 _derivationally_related_form 10525134 6 | 09273291 _derivationally_related_form 02711114 7 | 13969700 _derivationally_related_form 01765392 8 | 00590626 _derivationally_related_form 09780828 9 | 13903079 _derivationally_related_form 01466978 10 | 00050652 _hypernym 00046534 11 | 10029068 _derivationally_related_form 00935940 12 | 10676877 _derivationally_related_form 02443049 13 | 02420232 _derivationally_related_form 10078806 14 | 01256157 _derivationally_related_form 10566072 15 | 00151689 _derivationally_related_form 05111835 16 | 04770911 _derivationally_related_form 01876907 17 | 00262703 
_derivationally_related_form 03745285 18 | 00708017 _similar_to 00709625 19 | 02695895 _hypernym 02694933 20 | 03051540 _derivationally_related_form 00047745 21 | 01291069 _derivationally_related_form 00145218 22 | 13905792 _derivationally_related_form 01276361 23 | 03878963 _derivationally_related_form 01130607 24 | 00898804 _derivationally_related_form 01743784 25 | 10448983 _derivationally_related_form 00752335 26 | 00082308 _derivationally_related_form 14445379 27 | 00921790 _derivationally_related_form 01320009 28 | 03792048 _derivationally_related_form 01660640 29 | 10525134 _derivationally_related_form 01301410 30 | 01711749 _derivationally_related_form 08664443 31 | 10525134 _derivationally_related_form 00233335 32 | 01693881 _derivationally_related_form 03104594 33 | 02928413 _hypernym 03600977 34 | 03104594 _derivationally_related_form 01693881 35 | 10529231 _derivationally_related_form 02204692 36 | 10689564 _derivationally_related_form 04160372 37 | 13454318 _derivationally_related_form 01742726 38 | 03265479 _hypernym 02875013 39 | 05844105 _derivationally_related_form 10155849 40 | 03932670 _hypernym 03932203 41 | 00083809 _synset_domain_topic_of 00612160 42 | 00151689 _derivationally_related_form 13458571 43 | 01069190 _verb_group 01069391 44 | 01612053 _derivationally_related_form 02542795 45 | 04644512 _derivationally_related_form 02564986 46 | 01647867 _derivationally_related_form 13970236 47 | 00321956 _derivationally_related_form 01580467 48 | 03257343 _derivationally_related_form 01735308 49 | 01410905 _derivationally_related_form 04750164 50 | 00919513 _derivationally_related_form 01567275 51 | 00043683 _derivationally_related_form 02728440 52 | 00764902 _derivationally_related_form 01026262 53 | 04748836 _derivationally_related_form 00651991 54 | 10160412 _derivationally_related_form 00482473 55 | 01739814 _derivationally_related_form 00916464 56 | 13998781 _derivationally_related_form 01301410 57 | 01363613 _also_see 01148283 58 | 03282060 
_has_part 04085873 59 | 03792048 _derivationally_related_form 01660386 60 | 00730301 _derivationally_related_form 08512736 61 | 13085864 _derivationally_related_form 01741446 62 | 06998748 _derivationally_related_form 09812338 63 | 00933566 _derivationally_related_form 05117660 64 | 07369604 _derivationally_related_form 00299580 65 | 05902327 _derivationally_related_form 01743784 66 | 07369604 _derivationally_related_form 00300537 67 | 01151110 _hypernym 01987160 68 | 01135529 _derivationally_related_form 02443049 69 | 00696882 _derivationally_related_form 00082714 70 | 00233335 _derivationally_related_form 05846355 71 | 09941964 _derivationally_related_form 00751887 72 | 01743784 _derivationally_related_form 05902327 73 | 00084230 _derivationally_related_form 00612160 74 | 01020936 _hypernym 01019524 75 | 10529231 _derivationally_related_form 02203362 76 | 03777283 _derivationally_related_form 01697406 77 | 01167146 _derivationally_related_form 02542795 78 | 02657219 _derivationally_related_form 03728811 79 | 03327234 _derivationally_related_form 01588134 80 | 01020936 _derivationally_related_form 01742886 81 | 10668450 _hypernym 10525134 82 | 00796047 _derivationally_related_form 06893885 83 | 04613158 _derivationally_related_form 01492052 84 | 04905842 _derivationally_related_form 02388145 85 | 00765213 _derivationally_related_form 09800249 86 | 04463273 _hypernym 03234306 87 | 04930307 _derivationally_related_form 02659763 88 | 01662771 _derivationally_related_form 03779370 89 | 13998576 _derivationally_related_form 02711114 90 | 01813884 _derivationally_related_form 07527352 91 | 03386011 _derivationally_related_form 01606205 92 | 01052853 _derivationally_related_form 01613239 93 | 14442530 _derivationally_related_form 02646931 94 | 01640550 _derivationally_related_form 09972157 95 | 00267349 _hypernym 00266806 96 | 01711749 _derivationally_related_form 08677628 97 | 10155849 _derivationally_related_form 05844105 98 | 01159964 _derivationally_related_form 
01687569 99 | 01662771 _derivationally_related_form 00909899 100 | 01780941 _derivationally_related_form 01222666 101 | 13913566 _hypernym 13860793 102 | 00761713 _derivationally_related_form 10351874 103 | 00909363 _also_see 01149494 104 | 01711749 _derivationally_related_form 05075602 105 | 02899439 _hypernym 03673971 106 | 07369604 _derivationally_related_form 00482893 107 | 01248191 _derivationally_related_form 02502536 108 | 05902327 _derivationally_related_form 01683582 109 | 10317007 _hypernym 10582746 110 | 00764902 _derivationally_related_form 07151122 111 | 03322099 _derivationally_related_form 02420232 112 | 02388145 _derivationally_related_form 04905842 113 | 00482473 _hypernym 00296178 114 | 07527352 _derivationally_related_form 01363613 115 | 01765392 _derivationally_related_form 01151407 116 | 00730499 _derivationally_related_form 08592656 117 | 01051331 _derivationally_related_form 01496630 118 | 14441825 _derivationally_related_form 02646931 119 | 00047317 _derivationally_related_form 00795008 120 | 02512305 _derivationally_related_form 10012815 121 | 07537068 _hypernym 07532440 122 | 10093908 _derivationally_related_form 00300537 123 | 05937112 _derivationally_related_form 02723733 124 | 04433185 _derivationally_related_form 01285440 125 | 00651991 _derivationally_related_form 07270179 126 | 02389346 _derivationally_related_form 00145218 127 | 01819554 _derivationally_related_form 01222477 128 | 00409211 _derivationally_related_form 02443849 129 | 00083809 _derivationally_related_form 00671351 130 | 02671279 _derivationally_related_form 13321495 131 | 01224744 _derivationally_related_form 10378412 132 | 01291069 _hypernym 01354673 133 | 10378780 _hypernym 09882007 134 | 09476521 _derivationally_related_form 00290740 135 | 03285912 _derivationally_related_form 02711114 136 | 00150287 _derivationally_related_form 09957614 137 | 05765415 _derivationally_related_form 02806907 138 | 00915722 _hypernym 00913705 139 | 01190884 _hypernym 01187810 140 | 
13427078 _derivationally_related_form 00150287 141 | 06791372 _derivationally_related_form 02296984 142 | 01922763 _also_see 01740892 143 | 00119074 _derivationally_related_form 04748836 144 | 07066659 _derivationally_related_form 10155849 145 | 02003725 _derivationally_related_form 00236592 146 | 00916464 _has_part 00921790 147 | 01765392 _derivationally_related_form 07515790 148 | 01222477 _derivationally_related_form 01819554 149 | 00364479 _also_see 01368192 150 | 01684337 _verb_group 01551871 151 | 01148283 _also_see 00999817 152 | 01340439 _derivationally_related_form 00147595 153 | 01922763 _derivationally_related_form 07515560 154 | 00815644 _derivationally_related_form 01150559 155 | 00462092 _derivationally_related_form 04361641 156 | 01876907 _derivationally_related_form 00348571 157 | 07366627 _hypernym 07366289 158 | 05846355 _derivationally_related_form 00235368 159 | 09956578 _derivationally_related_form 00462092 160 | 00650016 _derivationally_related_form 05748054 161 | 03091374 _derivationally_related_form 01354673 162 | 00751887 _hypernym 02539334 163 | 02840361 _hypernym 03496892 164 | 00300537 _derivationally_related_form 04930307 165 | 08592656 _derivationally_related_form 00730499 166 | 01624568 _derivationally_related_form 13913566 167 | 04630689 _derivationally_related_form 00859153 168 | 02991122 _derivationally_related_form 02743547 169 | 01148283 _also_see 00362467 170 | 06003682 _hypernym 06000644 171 | 05198036 _derivationally_related_form 10388440 172 | 02372326 _derivationally_related_form 00040152 173 | 00795008 _derivationally_related_form 02659763 174 | 10645611 _hypernym 10676877 175 | 07527352 _derivationally_related_form 01813884 176 | 03779370 _derivationally_related_form 01697027 177 | 13489037 _derivationally_related_form 00245457 178 | 01051331 _derivationally_related_form 02333689 179 | 02991122 _derivationally_related_form 09812338 180 | 10689564 _hypernym 10120816 181 | 05844105 _derivationally_related_form 01666894 182 | 
02539359 _derivationally_related_form 00047745 183 | 03932203 _derivationally_related_form 01656788 184 | 00696518 _also_see 01612053 185 | 01492052 _derivationally_related_form 04612840 186 | -------------------------------------------------------------------------------- /data/WN18RR_v2_ind/test.txt: -------------------------------------------------------------------------------- 1 | 08858942 _has_part 08890097 2 | 01725712 _derivationally_related_form 07480896 3 | 02542280 _derivationally_related_form 01203676 4 | 10515194 _derivationally_related_form 05945508 5 | 01958615 _derivationally_related_form 00299217 6 | 01398212 _hypernym 14989820 7 | 02112891 _hypernym 02112029 8 | 04695963 _derivationally_related_form 01537409 9 | 02491383 _derivationally_related_form 10526096 10 | 10489944 _hypernym 10707233 11 | 01856225 _member_meronym 01856553 12 | 05748786 _derivationally_related_form 02666882 13 | 00812526 _derivationally_related_form 01572978 14 | 01949110 _hypernym 01955984 15 | 02064131 _derivationally_related_form 13774404 16 | 09334396 _derivationally_related_form 01502762 17 | 09688008 _derivationally_related_form 02957823 18 | 05715864 _derivationally_related_form 02194495 19 | 04926427 _hypernym 04924103 20 | 14798450 _derivationally_related_form 02627221 21 | 01098869 _derivationally_related_form 01156438 22 | 01926984 _verb_group 02099829 23 | 02964389 _derivationally_related_form 01612084 24 | 07635155 _hypernym 07628870 25 | 13291189 _derivationally_related_form 02543874 26 | 00753428 _hypernym 00752493 27 | 01144657 _derivationally_related_form 00452293 28 | 01131043 _also_see 01125429 29 | 01073241 _derivationally_related_form 01182293 30 | 01074650 _also_see 02530861 31 | 10488016 _hypernym 10632576 32 | 02632567 _derivationally_related_form 14493426 33 | 08277805 _derivationally_related_form 09759311 34 | 05050379 _hypernym 05050115 35 | 00470084 _derivationally_related_form 00233386 36 | 02666943 _derivationally_related_form 01322854 37 | 
08873622 _has_part 08597023 38 | 01781983 _derivationally_related_form 14405931 39 | 00299217 _derivationally_related_form 01958615 40 | 06220616 _hypernym 06212839 41 | 01904293 _derivationally_related_form 09281777 42 | 00481739 _derivationally_related_form 06667317 43 | 01182293 _derivationally_related_form 04638585 44 | 02046755 _derivationally_related_form 13878112 45 | 07813107 _derivationally_related_form 00213353 46 | 00467717 _derivationally_related_form 05924920 47 | 09759311 _derivationally_related_form 02669885 48 | 00409211 _derivationally_related_form 02443849 49 | 01775535 _hypernym 01775164 50 | 10672662 _derivationally_related_form 05186306 51 | 05707146 _derivationally_related_form 00614999 52 | 02519991 _derivationally_related_form 00259643 53 | 01845627 _member_meronym 01855672 54 | 01390616 _derivationally_related_form 00113113 55 | 14299070 _derivationally_related_form 00091124 56 | 01955508 _derivationally_related_form 00121645 57 | 01389329 _derivationally_related_form 00616083 58 | 00596393 _derivationally_related_form 10464542 59 | 00601822 _derivationally_related_form 00180770 60 | 02530861 _also_see 00853776 61 | 00560893 _derivationally_related_form 00358931 62 | 01856748 _hypernym 01507175 63 | 08877208 _instance_hypernym 08633957 64 | 06682794 _derivationally_related_form 01955127 65 | 13855627 _derivationally_related_form 00661213 66 | 00838367 _derivationally_related_form 01179865 67 | 03150232 _derivationally_related_form 01522276 68 | 01073822 _derivationally_related_form 04993413 69 | 01944692 _derivationally_related_form 02858304 70 | 00350889 _hypernym 00350461 71 | 02478059 _derivationally_related_form 14455700 72 | 07238102 _derivationally_related_form 02677332 73 | 09424489 _derivationally_related_form 01816431 74 | 02887209 _hypernym 04336034 75 | 13580723 _derivationally_related_form 01193721 76 | 10093658 _derivationally_related_form 01140794 77 | 00330160 _derivationally_related_form 00438178 78 | 13552270 
_derivationally_related_form 00239614 79 | 01224744 _verb_group 00597385 80 | 05219724 _has_part 05514905 81 | 01354006 _derivationally_related_form 14705718 82 | 01234345 _derivationally_related_form 00421535 83 | 04576211 _has_part 04574999 84 | 02270165 _derivationally_related_form 10330189 85 | 00044673 _derivationally_related_form 00417001 86 | 02250625 _derivationally_related_form 13282550 87 | 00891216 _derivationally_related_form 13344804 88 | 00182213 _derivationally_related_form 02461314 89 | 14622893 _has_part 14619225 90 | 05238282 _has_part 05244934 91 | 05174653 _derivationally_related_form 02519991 92 | 13282550 _derivationally_related_form 02519991 93 | 03933529 _derivationally_related_form 02085742 94 | 00366547 _derivationally_related_form 00357680 95 | 02519991 _derivationally_related_form 13290676 96 | 01493897 _derivationally_related_form 14425974 97 | 04828255 _derivationally_related_form 01782519 98 | 08277805 _hypernym 08276720 99 | 01406356 _verb_group 01406512 100 | 01487311 _hypernym 01488956 101 | 04980656 _derivationally_related_form 01053144 102 | 15129927 _hypernym 05816790 103 | 01537409 _derivationally_related_form 05244934 104 | 02531422 _also_see 00856860 105 | 05003090 _derivationally_related_form 01882170 106 | 02125641 _derivationally_related_form 05714466 107 | 00113113 _derivationally_related_form 01754105 108 | 00695523 _also_see 01475282 109 | 00658052 _derivationally_related_form 01009871 110 | 06561942 _derivationally_related_form 00844298 111 | 02235666 _verb_group 02537407 112 | 13874073 _derivationally_related_form 00417001 113 | 14971519 _derivationally_related_form 00238867 114 | 02291708 _synset_domain_topic_of 13333237 115 | 01150467 _hypernym 01150200 116 | 13282550 _derivationally_related_form 02253456 117 | 13720096 _hypernym 13716084 118 | 01167780 _derivationally_related_form 03200357 119 | 01475282 _also_see 02451951 120 | 10330189 _derivationally_related_form 02269894 121 | 08612049 _has_part 08495617 122 | 
00105778 _derivationally_related_form 09930876 123 | 10478960 _derivationally_related_form 02163301 124 | 00842989 _derivationally_related_form 09762385 125 | 01097031 _derivationally_related_form 08397255 126 | 00657728 _derivationally_related_form 00874977 127 | 02196690 _derivationally_related_form 14599641 128 | 01684337 _derivationally_related_form 00937656 129 | 00980908 _derivationally_related_form 06790042 130 | 00658052 _derivationally_related_form 06483454 131 | 05659365 _hypernym 05659621 132 | 01074650 _derivationally_related_form 10112591 133 | 02072501 _derivationally_related_form 13649791 134 | 01227137 _also_see 01370590 135 | 00800930 _hypernym 00685683 136 | 02250625 _derivationally_related_form 00259894 137 | 13573666 _hypernym 13575433 138 | 07551052 _derivationally_related_form 00859604 139 | 05527216 _has_part 05525628 140 | 00590148 _derivationally_related_form 09906986 141 | 09334396 _derivationally_related_form 01292727 142 | 00859604 _derivationally_related_form 01073241 143 | 00843468 _derivationally_related_form 06730780 144 | 09203827 _member_meronym 09316454 145 | 00239614 _verb_group 00238867 146 | 01958615 _synset_domain_topic_of 00450335 147 | 02700104 _derivationally_related_form 00552841 148 | 01960911 _derivationally_related_form 00442115 149 | 04236001 _hypernym 03895866 150 | 06685456 _derivationally_related_form 00890100 151 | 00365188 _verb_group 00365647 152 | 07450343 _hypernym 07447641 153 | 08871007 _has_part 08877208 154 | 00622584 _derivationally_related_form 01143838 155 | 06767777 _derivationally_related_form 01058880 156 | 01820302 _derivationally_related_form 07491981 157 | 10293332 _derivationally_related_form 01924505 158 | 09759311 _derivationally_related_form 08280124 159 | 02080577 _derivationally_related_form 02671880 160 | 00622266 _derivationally_related_form 01574292 161 | 04493505 _derivationally_related_form 02079525 162 | 03216828 _derivationally_related_form 01305731 163 | 00657550 
_derivationally_related_form 05737153 164 | 13969243 _derivationally_related_form 02700104 165 | 09612291 _derivationally_related_form 00869596 166 | 01526956 _derivationally_related_form 03872495 167 | 02672540 _derivationally_related_form 00259643 168 | 02058794 _also_see 02523275 169 | 05524615 _hypernym 05525252 170 | 05014099 _derivationally_related_form 00328128 171 | 02191766 _derivationally_related_form 05715864 172 | 00812274 _derivationally_related_form 01216004 173 | 10298912 _derivationally_related_form 00595146 174 | 01216522 _derivationally_related_form 00812526 175 | 00299580 _derivationally_related_form 05755486 176 | 00085678 _derivationally_related_form 02274482 177 | 01547641 _verb_group 01547390 178 | 15183428 _hypernym 15157225 179 | 06200010 _derivationally_related_form 00350461 180 | 01027263 _derivationally_related_form 04659090 181 | 02085742 _derivationally_related_form 10655169 182 | 01158690 _derivationally_related_form 00467717 183 | 00851933 _derivationally_related_form 01224517 184 | 00259643 _derivationally_related_form 02519991 185 | 09826204 _derivationally_related_form 01941093 186 | 13876371 _derivationally_related_form 02738544 187 | 02120458 _derivationally_related_form 00513401 188 | 01524298 _hypernym 01524871 189 | 02653996 _derivationally_related_form 02945161 190 | 10058411 _derivationally_related_form 01820302 191 | 02834778 _has_part 04289690 192 | 02446164 _derivationally_related_form 00583461 193 | 00015303 _derivationally_related_form 00858849 194 | 00842989 _derivationally_related_form 07234230 195 | 00320486 _derivationally_related_form 02001858 196 | 00963241 _hypernym 00962129 197 | 00356790 _derivationally_related_form 01387786 198 | 14585519 _derivationally_related_form 00330144 199 | 01354405 _derivationally_related_form 04049405 200 | 15145586 _has_part 15146545 201 | 02163301 _derivationally_related_form 01135529 202 | 04854389 _hypernym 04827652 203 | 01385920 _derivationally_related_form 09307300 204 | 
00937656 _derivationally_related_form 01551871 205 | 02326695 _also_see 02451951 206 | 13279262 _hypernym 13281275 207 | 01179865 _derivationally_related_form 07800091 208 | 13875970 _derivationally_related_form 00143204 209 | 01940403 _derivationally_related_form 00302394 210 | 00061290 _derivationally_related_form 01955127 211 | 01951276 _derivationally_related_form 13326198 212 | 00467717 _derivationally_related_form 13617952 213 | 02632567 _hypernym 02632353 214 | 00963283 _derivationally_related_form 00963241 215 | 01432601 _derivationally_related_form 10395073 216 | 01193099 _hypernym 01166351 217 | 01961691 _synset_domain_topic_of 00441824 218 | 00443231 _hypernym 00442115 219 | 13649791 _hypernym 13603305 220 | 05058140 _derivationally_related_form 00438178 221 | 01273263 _derivationally_related_form 02784732 222 | 01131043 _also_see 02037272 223 | 00136800 _derivationally_related_form 13446197 224 | 01510827 _derivationally_related_form 14008806 225 | 00683185 _also_see 01880531 226 | 00073828 _hypernym 00070965 227 | 01640850 _also_see 00666058 228 | 00658052 _derivationally_related_form 14429608 229 | 04565375 _derivationally_related_form 01087197 230 | 02123672 _derivationally_related_form 05713737 231 | 04046810 _hypernym 04048568 232 | 01958615 _synset_domain_topic_of 00299217 233 | 07480068 _hypernym 00026192 234 | 03933529 _derivationally_related_form 01305731 235 | 08894456 _has_part 08895928 236 | 01193099 _derivationally_related_form 01073655 237 | 02261464 _derivationally_related_form 08069878 238 | 01073655 _hypernym 01073241 239 | 01535709 _also_see 01640850 240 | 00854000 _derivationally_related_form 01226600 241 | 03000447 _derivationally_related_form 13575433 242 | 10527334 _hypernym 10503452 243 | 01840238 _derivationally_related_form 10096217 244 | 00890100 _derivationally_related_form 06685456 245 | 13282007 _derivationally_related_form 02249741 246 | 00014742 _derivationally_related_form 15273626 247 | 14798450 _hypernym 15010703 248 | 
00208943 _derivationally_related_form 02405252 249 | 02667228 _hypernym 02666882 250 | 02332999 _derivationally_related_form 07556637 251 | 01100145 _derivationally_related_form 10782791 252 | 14493426 _derivationally_related_form 02632567 253 | 09361517 _hypernym 09437454 254 | 01235258 _derivationally_related_form 01153486 255 | 00852922 _hypernym 00849080 256 | 00841628 _derivationally_related_form 01193721 257 | 01167981 _derivationally_related_form 03200357 258 | 00619183 _derivationally_related_form 15159819 259 | 01854415 _hypernym 01852861 260 | 09906848 _hypernym 10474645 261 | 13858045 _hypernym 13857486 262 | 01845627 _member_meronym 01854047 263 | 00105778 _derivationally_related_form 00513401 264 | 01510827 _derivationally_related_form 00409211 265 | 01100145 _derivationally_related_form 10782940 266 | 01639105 _derivationally_related_form 03772269 267 | 02384686 _derivationally_related_form 07186148 268 | 14436875 _derivationally_related_form 02237631 269 | 14429608 _hypernym 14429985 270 | 01182024 _derivationally_related_form 01197338 271 | 00459114 _hypernym 00458754 272 | 05645199 _derivationally_related_form 00609100 273 | 10768585 _derivationally_related_form 01093172 274 | 00609506 _derivationally_related_form 01941093 275 | 01820302 _derivationally_related_form 10058411 276 | 04135315 _hypernym 03872495 277 | 00300317 _hypernym 00299580 278 | 09622302 _derivationally_related_form 01775535 279 | 07238102 _derivationally_related_form 02636921 280 | 07183151 _derivationally_related_form 00869126 281 | 02043982 _derivationally_related_form 08612049 282 | 00853633 _derivationally_related_form 06778102 283 | 14889479 _derivationally_related_form 00239614 284 | 01941093 _derivationally_related_form 10433164 285 | 00043480 _derivationally_related_form 05714466 286 | 08278324 _hypernym 08276342 287 | 09861946 _derivationally_related_form 01944692 288 | 09747329 _derivationally_related_form 03130073 289 | 05658603 _derivationally_related_form 02124748 
290 | 00724029 _derivationally_related_form 13421462 291 | 01190840 _derivationally_related_form 07891726 292 | 02274482 _derivationally_related_form 00085678 293 | 01020117 _similar_to 01017738 294 | 02174311 _derivationally_related_form 07380144 295 | 02464342 _hypernym 02463704 296 | 07460104 _derivationally_related_form 01914947 297 | 00588473 _derivationally_related_form 09759501 298 | 00135718 _also_see 01880531 299 | 02053941 _verb_group 01849746 300 | 00550016 _derivationally_related_form 10318892 301 | 01021579 _derivationally_related_form 00958823 302 | 01475282 _also_see 01613463 303 | 00384620 _derivationally_related_form 00261405 304 | 01203676 _derivationally_related_form 02662979 305 | 13529616 _derivationally_related_form 02672859 306 | 05046471 _hypernym 05046009 307 | 01387786 _derivationally_related_form 01741562 308 | 00366547 _verb_group 00364868 309 | 00858849 _derivationally_related_form 00016380 310 | 00076072 _hypernym 00074790 311 | 03199901 _has_part 03905730 312 | 00231567 _derivationally_related_form 02478059 313 | 15166462 _has_part 15169421 314 | 08654360 _hypernym 08491826 315 | 13622591 _has_part 13622209 316 | 00657550 _derivationally_related_form 00874977 317 | 00453935 _derivationally_related_form 01140794 318 | 00754731 _derivationally_related_form 10672192 319 | 04665813 _derivationally_related_form 02529284 320 | 02402409 _derivationally_related_form 00215838 321 | 01882170 _derivationally_related_form 04544979 322 | 00302394 _derivationally_related_form 01941093 323 | 14619225 _has_part 09272085 324 | 02652494 _derivationally_related_form 10269458 325 | 04046810 _derivationally_related_form 01954559 326 | 01186208 _hypernym 01194418 327 | 00759694 _derivationally_related_form 10702781 328 | 00660102 _derivationally_related_form 10506762 329 | 15122231 _derivationally_related_form 00490968 330 | 08571139 _hypernym 08523483 331 | 15166462 _has_part 15169248 332 | 04440749 _hypernym 03533972 333 | 02519991 
_derivationally_related_form 13341756 334 | 07847198 _derivationally_related_form 01418037 335 | 01242716 _derivationally_related_form 02512922 336 | 09889941 _hypernym 10744164 337 | 02252931 _derivationally_related_form 01120448 338 | 02566227 _derivationally_related_form 10157744 339 | 04157320 _derivationally_related_form 01551871 340 | 09334396 _derivationally_related_form 02022359 341 | 01996735 _derivationally_related_form 08428019 342 | 10529965 _derivationally_related_form 01957529 343 | 10769321 _derivationally_related_form 02493260 344 | 01940403 _derivationally_related_form 10096217 345 | 00555648 _derivationally_related_form 02055649 346 | 06508816 _derivationally_related_form 01001643 347 | 03309465 _has_part 04082886 348 | 00239614 _hypernym 00239321 349 | 02005756 _derivationally_related_form 04864515 350 | 01782218 _derivationally_related_form 07520612 351 | 01223182 _derivationally_related_form 13877918 352 | 01136614 _derivationally_related_form 03467984 353 | 02478059 _derivationally_related_form 00231567 354 | 00595146 _derivationally_related_form 10298912 355 | 00980908 _derivationally_related_form 05960464 356 | 00365188 _derivationally_related_form 00365471 357 | 07543288 _derivationally_related_form 01775164 358 | 06778102 _derivationally_related_form 00105554 359 | 00891216 _derivationally_related_form 10209731 360 | 10760340 _derivationally_related_form 02461314 361 | 13358549 _hypernym 13384557 362 | 00074790 _hypernym 00070965 363 | 00452293 _derivationally_related_form 02003601 364 | 02994858 _derivationally_related_form 00329831 365 | 15165490 _hypernym 15228378 366 | 02210119 _verb_group 02236124 367 | 00643197 _derivationally_related_form 09790278 368 | 02529284 _derivationally_related_form 00066397 369 | 00264875 _derivationally_related_form 13426238 370 | 00754731 _derivationally_related_form 06513366 371 | 01219706 _derivationally_related_form 02886599 372 | 08587828 _hypernym 08491826 373 | 01765392 _derivationally_related_form 
00759551 374 | 02046755 _derivationally_related_form 07440979 375 | 06201136 _derivationally_related_form 10402086 376 | 00882961 _derivationally_related_form 02123672 377 | 04660080 _hypernym 05207130 378 | 04879658 _derivationally_related_form 00963283 379 | 10051975 _derivationally_related_form 00416135 380 | 02079525 _derivationally_related_form 05246511 381 | 01510827 _derivationally_related_form 07338114 382 | 14889479 _hypernym 14865800 383 | 10084635 _derivationally_related_form 00800421 384 | 09790278 _hypernym 10488016 385 | 07246742 _derivationally_related_form 00807461 386 | 00087152 _also_see 01922763 387 | 07204911 _derivationally_related_form 02473431 388 | 10299250 _derivationally_related_form 02592397 389 | 07677593 _hypernym 07675627 390 | 05714161 _hypernym 05713737 391 | 10433737 _derivationally_related_form 01180975 392 | 00849080 _derivationally_related_form 10561320 393 | 00302394 _derivationally_related_form 01940403 394 | 02344243 _hypernym 02344060 395 | 00258854 _derivationally_related_form 00199659 396 | 01960911 _derivationally_related_form 10683126 397 | 01912893 _derivationally_related_form 04544979 398 | 02368336 _also_see 02395115 399 | 02037272 _also_see 01549291 400 | 02738031 _hypernym 04566257 401 | 07186148 _derivationally_related_form 01470225 402 | 00031820 _also_see 00802136 403 | 02395115 _also_see 01073822 404 | 02583139 _derivationally_related_form 00962129 405 | 04465933 _hypernym 03605722 406 | 02523275 _derivationally_related_form 05042871 407 | 02657219 _derivationally_related_form 04713428 408 | 00070965 _derivationally_related_form 00842538 409 | 09759311 _derivationally_related_form 08279298 410 | 01522052 _derivationally_related_form 00345641 411 | 14425974 _derivationally_related_form 01489722 412 | 15159819 _derivationally_related_form 00619183 413 | 01570562 _hypernym 01387786 414 | 00616857 _derivationally_related_form 10351625 415 | 00739270 _derivationally_related_form 00311663 416 | 02274482 
_derivationally_related_form 09810364 417 | 02418205 _derivationally_related_form 05769471 418 | 15170786 _has_part 15171008 419 | 01857632 _derivationally_related_form 01053339 420 | 09366017 _hypernym 09287968 421 | 04842515 _derivationally_related_form 01587077 422 | 02462580 _verb_group 02461314 423 | 02389220 _derivationally_related_form 13939353 424 | 02022486 _derivationally_related_form 09334396 425 | 04713428 _derivationally_related_form 02657219 426 | 00511212 _hypernym 00510189 427 | 09879744 _derivationally_related_form 02566227 428 | 00321486 _derivationally_related_form 03873064 429 | 00259894 _derivationally_related_form 02250625 430 | 02924116 _derivationally_related_form 01949110 431 | 10002760 _derivationally_related_form 02521410 432 | 00135718 _derivationally_related_form 04721650 433 | 02055649 _derivationally_related_form 00330160 434 | 01522276 _derivationally_related_form 10781984 435 | 00922867 _derivationally_related_form 01004582 436 | 02667900 _hypernym 02657219 437 | 08895771 _instance_hypernym 08524735 438 | 15137047 _hypernym 15163005 439 | 02700104 _verb_group 02657219 440 | 03624966 _derivationally_related_form 01671039 441 | 02402409 _hypernym 02405252 442 | -------------------------------------------------------------------------------- /data/WN18RR_v2_ind/valid.txt: -------------------------------------------------------------------------------- 1 | 11450566 _derivationally_related_form 00505802 2 | 04839676 _hypernym 04854389 3 | 01176567 _derivationally_related_form 07891726 4 | 10062996 _derivationally_related_form 02599004 5 | 01176232 _derivationally_related_form 07557165 6 | 01510827 _derivationally_related_form 00140393 7 | 00417643 _derivationally_related_form 01425511 8 | 06271778 _derivationally_related_form 00790703 9 | 05716577 _derivationally_related_form 02337667 10 | 01936537 _derivationally_related_form 04046810 11 | 01816431 _derivationally_related_form 13986679 12 | 07424109 _derivationally_related_form 10527334 
13 | 01765392 _derivationally_related_form 07515790 14 | 01504699 _derivationally_related_form 00447540 15 | 02566015 _also_see 00695523 16 | 07678729 _derivationally_related_form 00542809 17 | 14705718 _derivationally_related_form 01354006 18 | 00713250 _derivationally_related_form 01266895 19 | 02666882 _derivationally_related_form 05748786 20 | 01305731 _derivationally_related_form 03933529 21 | 00420132 _derivationally_related_form 00148057 22 | 02043982 _derivationally_related_form 07440979 23 | 02792903 _derivationally_related_form 08276720 24 | 01003729 _derivationally_related_form 00657550 25 | 02521816 _hypernym 02521410 26 | 05514905 _has_part 05524615 27 | 06778102 _derivationally_related_form 00853633 28 | 01387786 _derivationally_related_form 00356790 29 | 04713692 _hypernym 04713428 30 | 01673472 _hypernym 01672014 31 | 00891850 _hypernym 00884466 32 | 08860123 _member_of_domain_usage 07711080 33 | 00136800 _derivationally_related_form 13897996 34 | 00492677 _derivationally_related_form 08615374 35 | 10744164 _hypernym 09626031 36 | 10672192 _derivationally_related_form 00754731 37 | 04146050 _derivationally_related_form 02792903 38 | 08858942 _member_meronym 09700964 39 | 01646941 _also_see 02100709 40 | 00854000 _derivationally_related_form 01431230 41 | 00942234 _derivationally_related_form 01256157 42 | 01926311 _verb_group 01914947 43 | 04049405 _hypernym 03057021 44 | 02443849 _derivationally_related_form 10378780 45 | 10224098 _hypernym 09940146 46 | 01135795 _derivationally_related_form 02593354 47 | 00853958 _derivationally_related_form 07153727 48 | 03467984 _hypernym 04565375 49 | 02566227 _derivationally_related_form 09879744 50 | 03024882 _derivationally_related_form 01215694 51 | 01489161 _derivationally_related_form 02964389 52 | 07512465 _derivationally_related_form 02105990 53 | 02023992 _derivationally_related_form 03420559 54 | 01009871 _derivationally_related_form 00745499 55 | 01072072 _derivationally_related_form 01828736 56 | 
05186306 _derivationally_related_form 10672908 57 | 14738752 _derivationally_related_form 00458471 58 | 02531422 _also_see 01257612 59 | 14599641 _hypernym 14599168 60 | 01957529 _verb_group 02102398 61 | 01020117 _derivationally_related_form 01738347 62 | 00492410 _derivationally_related_form 10451858 63 | 01068012 _derivationally_related_form 02466496 64 | 00375021 _derivationally_related_form 05014099 65 | 01216670 _derivationally_related_form 00812526 66 | 02599939 _derivationally_related_form 09759069 67 | 04873550 _hypernym 04827652 68 | 09316454 _derivationally_related_form 10217436 69 | 05256862 _has_part 05257737 70 | 00713952 _derivationally_related_form 01612084 71 | 04993882 _hypernym 04992163 72 | 01370590 _also_see 01227137 73 | 00593108 _derivationally_related_form 10162991 74 | 10351625 _derivationally_related_form 00616153 75 | 00759551 _derivationally_related_form 00764902 76 | 02700104 _derivationally_related_form 13969243 77 | 01322854 _derivationally_related_form 00620424 78 | 00204199 _derivationally_related_form 00810557 79 | 04544979 _derivationally_related_form 01912893 80 | 01373138 _derivationally_related_form 14619225 81 | 00259643 _hypernym 00258854 82 | 01899360 _also_see 00311663 83 | 05681117 _hypernym 14024882 84 | 10618848 _derivationally_related_form 06220616 85 | 00091124 _derivationally_related_form 14299336 86 | 02746365 _has_part 04322026 87 | 02001858 _derivationally_related_form 00487874 88 | 03999992 _derivationally_related_form 01754105 89 | 00894552 _derivationally_related_form 00606093 90 | 10160412 _derivationally_related_form 00483181 91 | 01179707 _derivationally_related_form 01613463 92 | 09849598 _derivationally_related_form 01775164 93 | 01919391 _derivationally_related_form 08428019 94 | 09986189 _derivationally_related_form 02834778 95 | 05246796 _hypernym 05246511 96 | 10224098 _derivationally_related_form 00853633 97 | 08280124 _derivationally_related_form 09759501 98 | 10196965 _derivationally_related_form 
01637633 99 | 06695579 _derivationally_related_form 00880227 100 | 00605310 _derivationally_related_form 10118382 101 | 00272448 _derivationally_related_form 02579447 102 | 08894456 _has_part 09430771 103 | 00324231 _derivationally_related_form 00247792 104 | 00510189 _hypernym 00509846 105 | 08892766 _instance_hypernym 08552138 106 | 00810729 _derivationally_related_form 00740712 107 | 08543496 _hypernym 08543223 108 | 01136614 _derivationally_related_form 10152083 109 | 01952750 _derivationally_related_form 08616311 110 | 01021579 _derivationally_related_form 00350461 111 | 00123234 _derivationally_related_form 01133825 112 | 02893338 _derivationally_related_form 10020890 113 | 00804802 _derivationally_related_form 07180787 114 | 00357680 _derivationally_related_form 00366547 115 | 13450636 _synset_domain_topic_of 06055946 116 | 02344060 _derivationally_related_form 01235137 117 | 10782791 _derivationally_related_form 01100145 118 | 08892058 _instance_hypernym 08552138 119 | 09624168 _has_part 05219724 120 | 06561942 _derivationally_related_form 00869931 121 | 00052146 _derivationally_related_form 01305731 122 | 04595855 _derivationally_related_form 02354536 123 | 02167571 _derivationally_related_form 00984609 124 | 05003090 _derivationally_related_form 01912893 125 | 01637633 _derivationally_related_form 05768806 126 | 07710616 _hypernym 07710283 127 | 04694809 _derivationally_related_form 01532329 128 | 01723224 _derivationally_related_form 00897026 129 | 09334396 _derivationally_related_form 01981279 130 | 08523483 _derivationally_related_form 01498498 131 | 01224744 _derivationally_related_form 00409211 132 | 00854150 _hypernym 00853633 133 | 01194418 _derivationally_related_form 10299250 134 | 02543874 _derivationally_related_form 00233386 135 | 02124332 _hypernym 02123672 136 | 01322221 _hypernym 01321854 137 | 01646941 _derivationally_related_form 07944050 138 | 00812526 _derivationally_related_form 01220303 139 | 01373844 _derivationally_related_form 
02754103 140 | 07557434 _has_part 07809096 141 | 00643197 _derivationally_related_form 00704305 142 | 06686174 _derivationally_related_form 00889555 143 | 01845627 _member_meronym 01853379 144 | 01613463 _also_see 02451951 145 | 02521816 _derivationally_related_form 01177703 146 | 02192992 _derivationally_related_form 00882702 147 | 04794751 _derivationally_related_form 01672607 148 | 08871007 _has_part 08873412 149 | 02126382 _derivationally_related_form 03916470 150 | 00302394 _derivationally_related_form 01840238 151 | 01106272 _derivationally_related_form 01489161 152 | 00416135 _hypernym 01856626 153 | 05091316 _derivationally_related_form 00658052 154 | 05681117 _derivationally_related_form 00014742 155 | 06773976 _derivationally_related_form 01647867 156 | 03024746 _derivationally_related_form 01570562 157 | 04728376 _derivationally_related_form 02341266 158 | 10495555 _derivationally_related_form 02302817 159 | 02546075 _derivationally_related_form 06696483 160 | 01055073 _derivationally_related_form 04980008 161 | 09917593 _derivationally_related_form 14427065 162 | 07557165 _derivationally_related_form 01176232 163 | 05716744 _hypernym 05715283 164 | 13855627 _hypernym 13854649 165 | 03058603 _derivationally_related_form 00051511 166 | 00329831 _derivationally_related_form 08523483 167 | 08612049 _derivationally_related_form 02043982 168 | 00273963 _derivationally_related_form 13453428 169 | 01646941 _also_see 01640850 170 | 01637633 _derivationally_related_form 10196965 171 | 10566072 _derivationally_related_form 01684337 172 | 02967626 _hypernym 03872495 173 | 00350889 _derivationally_related_form 04863074 174 | 00458754 _derivationally_related_form 14738752 175 | 10388440 _derivationally_related_form 00595684 176 | 04159058 _derivationally_related_form 01531265 177 | 15228162 _hypernym 15154774 178 | 01183573 _derivationally_related_form 07532112 179 | 08280124 _member_meronym 09759501 180 | 05717342 _derivationally_related_form 02196214 181 | 02192992 
_derivationally_related_form 05658226 182 | 00467717 _derivationally_related_form 07260623 183 | 14427065 _derivationally_related_form 09918554 184 | 06561942 _derivationally_related_form 00843468 185 | 02124332 _derivationally_related_form 05713737 186 | 00276813 _derivationally_related_form 00492410 187 | 02346895 _verb_group 02542280 188 | 01941093 _verb_group 01847845 189 | 02542280 _verb_group 00351406 190 | 07674749 _hypernym 07673397 191 | 01156438 _derivationally_related_form 01098869 192 | 01177033 _derivationally_related_form 02521410 193 | 04907991 _hypernym 04907269 194 | 02253456 _derivationally_related_form 13279262 195 | 07679356 _hypernym 07622061 196 | 05513807 _has_part 05526384 197 | 02039156 _derivationally_related_form 00341548 198 | 01425709 _derivationally_related_form 00854000 199 | 07813107 _hypernym 07809368 200 | 00417397 _derivationally_related_form 01424456 201 | 10672662 _hypernym 10084635 202 | 00962722 _derivationally_related_form 10527334 203 | 06271778 _hypernym 06254669 204 | 00550016 _derivationally_related_form 01724185 205 | 01850035 _member_meronym 01850192 206 | 15137890 _derivationally_related_form 02708707 207 | 10330189 _hypernym 09847010 208 | 02708707 _hypernym 02708420 209 | 10782791 _derivationally_related_form 02288295 210 | 05524615 _has_part 05525807 211 | 00609100 _derivationally_related_form 05645199 212 | 00800940 _derivationally_related_form 00265386 213 | 02671880 _derivationally_related_form 09424489 214 | 08657249 _derivationally_related_form 02512808 215 | 04411264 _derivationally_related_form 02653996 216 | 00365188 _derivationally_related_form 07313241 217 | 00252710 _derivationally_related_form 09772029 218 | 00031820 _derivationally_related_form 10248876 219 | 00447540 _derivationally_related_form 01574292 220 | 10403876 _synset_domain_topic_of 02858304 221 | 00739270 _derivationally_related_form 00614999 222 | 00504592 _also_see 01675190 223 | 00409211 _derivationally_related_form 01510827 224 | 
01523986 _derivationally_related_form 03065424 225 | 01489989 _derivationally_related_form 02964389 226 | 01489734 _derivationally_related_form 03933529 227 | 15147850 _has_part 15146004 228 | 02037272 _also_see 02513740 229 | 02290196 _derivationally_related_form 13279262 230 | 00293916 _derivationally_related_form 01926311 231 | 14425974 _derivationally_related_form 01493897 232 | 01305361 _derivationally_related_form 03933529 233 | 00199912 _derivationally_related_form 10512982 234 | 04565375 _derivationally_related_form 02334867 235 | 06878071 _derivationally_related_form 00028565 236 | 00043480 _derivationally_related_form 03916470 237 | 02527651 _derivationally_related_form 01263018 238 | 08033194 _synset_domain_topic_of 00759694 239 | 01072565 _derivationally_related_form 01183573 240 | 03656484 _hypernym 03851341 241 | 02944826 _hypernym 03763727 242 | 00601822 _hypernym 00599472 243 | 00274283 _verb_group 00273963 244 | 01782519 _derivationally_related_form 04828255 245 | 02653159 _derivationally_related_form 02839200 246 | 01647867 _derivationally_related_form 09952163 247 | 13650045 _hypernym 13603305 248 | 01941093 _derivationally_related_form 00609506 249 | 00285557 _derivationally_related_form 02091410 250 | 10610465 _derivationally_related_form 00014742 251 | 01185981 _hypernym 01166351 252 | 03101796 _hypernym 03101986 253 | 05050115 _derivationally_related_form 01731351 254 | 00247792 _derivationally_related_form 00323856 255 | 02191546 _derivationally_related_form 05658226 256 | 10042300 _derivationally_related_form 01166351 257 | 08873622 _has_part 08875547 258 | 10561613 _hypernym 10042300 259 | 02000868 _derivationally_related_form 10100124 260 | 01088749 _derivationally_related_form 14455206 261 | 01572978 _derivationally_related_form 00812526 262 | 00369802 _hypernym 00358931 263 | 01055165 _derivationally_related_form 02653996 264 | 00622266 _derivationally_related_form 01504699 265 | 08892971 _instance_hypernym 08524735 266 | 15228267 
_hypernym 15154774 267 | 02124106 _derivationally_related_form 05714894 268 | 02048891 _derivationally_related_form 13878112 269 | 00843468 _derivationally_related_form 09762385 270 | 15122231 _derivationally_related_form 00297906 271 | 00026385 _derivationally_related_form 01064148 272 | 06700169 _hypernym 06700030 273 | 01027263 _derivationally_related_form 00299580 274 | 02792552 _derivationally_related_form 01954852 275 | 02792903 _derivationally_related_form 04146050 276 | 01158690 _derivationally_related_form 00682436 277 | 01743531 _derivationally_related_form 10318892 278 | 00853776 _derivationally_related_form 04626280 279 | 07891726 _hypernym 07884567 280 | 01725712 _also_see 01256332 281 | 02463990 _derivationally_related_form 10564800 282 | 01587077 _derivationally_related_form 04782116 283 | 02462580 _derivationally_related_form 00182213 284 | 01431230 _derivationally_related_form 10237196 285 | 02950256 _hypernym 02746365 286 | 09762385 _derivationally_related_form 00842989 287 | 01105737 _derivationally_related_form 02909006 288 | 01904293 _derivationally_related_form 00443231 289 | 02669885 _derivationally_related_form 09759501 290 | 08871007 _has_part 08879197 291 | 04055030 _hypernym 03872495 292 | 02700104 _derivationally_related_form 04713332 293 | 07449862 _derivationally_related_form 01186208 294 | 08279298 _derivationally_related_form 09759069 295 | 00259643 _derivationally_related_form 02253456 296 | 13766896 _derivationally_related_form 01180351 297 | 06685456 _derivationally_related_form 00891936 298 | 00015498 _derivationally_related_form 00858377 299 | 02579447 _derivationally_related_form 10257647 300 | 03325769 _derivationally_related_form 02631659 301 | 08223263 _derivationally_related_form 02346895 302 | 06220616 _derivationally_related_form 00298041 303 | 07557434 _has_part 07829412 304 | 00014742 _derivationally_related_form 14024882 305 | 02672859 _hypernym 02672540 306 | 02491383 _derivationally_related_form 07390945 307 | 
01112364 _derivationally_related_form 05737153 308 | 02344381 _similar_to 02341266 309 | 13118569 _hypernym 13112664 310 | 10334567 _derivationally_related_form 01922895 311 | 00357680 _derivationally_related_form 00366275 312 | 00606335 _derivationally_related_form 00894552 313 | 10720097 _hypernym 10193026 314 | 10211203 _hypernym 10020890 315 | 09811852 _derivationally_related_form 02950482 316 | 08428019 _derivationally_related_form 01996735 317 | 02085742 _derivationally_related_form 03216828 318 | 00810557 _derivationally_related_form 00740712 319 | 07520612 _derivationally_related_form 01780941 320 | 10117017 _derivationally_related_form 02729965 321 | 03470387 _has_part 03340723 322 | 10472799 _hypernym 09807754 323 | 00227507 _also_see 00504592 324 | 00853776 _also_see 01725712 325 | 08036849 _instance_hypernym 08392137 326 | 01497292 _derivationally_related_form 02967626 327 | 04561734 _derivationally_related_form 01398941 328 | 01222645 _derivationally_related_form 00343249 329 | 00356790 _hypernym 00113113 330 | 13417410 _hypernym 13398241 331 | 10480730 _hypernym 09759069 332 | 00028565 _derivationally_related_form 06878071 333 | 04839676 _derivationally_related_form 00957176 334 | 01828736 _derivationally_related_form 07543288 335 | 08871007 _has_part 08873622 336 | 01149911 _derivationally_related_form 01387786 337 | 13384557 _derivationally_related_form 09934921 338 | 08688247 _derivationally_related_form 02512150 339 | 05008227 _derivationally_related_form 01476685 340 | 01121855 _hypernym 01120448 341 | 02125641 _derivationally_related_form 04980008 342 | 06877578 _hypernym 06877078 343 | 01183573 _derivationally_related_form 13580723 344 | 01606205 _derivationally_related_form 03386011 345 | 02089420 _derivationally_related_form 04500060 346 | 00138221 _derivationally_related_form 01431230 347 | 01183573 _derivationally_related_form 02080577 348 | 03024882 _hypernym 03814906 349 | 01185981 _derivationally_related_form 08253640 350 | 03065424 
_derivationally_related_form 01523986 351 | 02708707 _derivationally_related_form 15137890 352 | 03963028 _derivationally_related_form 01395049 353 | 13381734 _derivationally_related_form 01064999 354 | 01851996 _hypernym 01507175 355 | 08871007 _has_part 08876975 356 | 08046759 _instance_hypernym 08392137 357 | 00384620 _derivationally_related_form 00258854 358 | 10211203 _derivationally_related_form 00593837 359 | 02950256 _derivationally_related_form 09811852 360 | 00442115 _derivationally_related_form 01904293 361 | 03386011 _derivationally_related_form 01155421 362 | 01167188 _derivationally_related_form 07556637 363 | 08657249 _derivationally_related_form 00506952 364 | 09827683 _derivationally_related_form 02570267 365 | 02513740 _also_see 02037272 366 | 02331175 _derivationally_related_form 03933529 367 | 05813229 _derivationally_related_form 01828736 368 | 00945916 _derivationally_related_form 02269894 369 | 06950528 _derivationally_related_form 02958126 370 | 07515560 _hypernym 07514968 371 | 00487874 _derivationally_related_form 02001858 372 | 02463990 _hypernym 02463704 373 | 01186208 _derivationally_related_form 08253640 374 | 01240979 _derivationally_related_form 02478059 375 | 00123234 _derivationally_related_form 01489332 376 | 13282007 _hypernym 13278375 377 | 00962567 _hypernym 00973077 378 | 03075768 _derivationally_related_form 01765392 379 | 08392137 _synset_domain_topic_of 00759694 380 | 00031820 _derivationally_related_form 06778102 381 | 14836127 _hypernym 14971519 382 | 00512843 _derivationally_related_form 00854150 383 | 00340989 _derivationally_related_form 02048891 384 | 01856626 _derivationally_related_form 01123095 385 | 07628870 _hypernym 07622061 386 | 04665813 _derivationally_related_form 00754873 387 | 05769726 _derivationally_related_form 01637633 388 | 03959936 _derivationally_related_form 01395049 389 | 01387786 _derivationally_related_form 05289601 390 | 04795545 _hypernym 04794751 391 | 05960464 _derivationally_related_form 
00980908 392 | 01780941 _derivationally_related_form 07520612 393 | 07775375 _derivationally_related_form 01140794 394 | 04842993 _derivationally_related_form 02105990 395 | 01093587 _derivationally_related_form 09952163 396 | 11449907 _derivationally_related_form 00505802 397 | 09879744 _derivationally_related_form 00013172 398 | 09811852 _derivationally_related_form 02950256 399 | 00417859 _derivationally_related_form 01424456 400 | 01845627 _member_meronym 01858441 401 | 00105778 _derivationally_related_form 06781383 402 | 00804802 _derivationally_related_form 10018021 403 | 14427239 _hypernym 14425974 404 | 00345641 _derivationally_related_form 01522052 405 | 10693459 _derivationally_related_form 05636402 406 | 02902079 _hypernym 04008947 407 | 03679986 _derivationally_related_form 01612084 408 | 00467717 _derivationally_related_form 00999245 409 | 01612084 _derivationally_related_form 00713952 410 | 08647945 _hypernym 08523483 411 | 01775164 _derivationally_related_form 07543288 412 | -------------------------------------------------------------------------------- /data/fb237_v1_ind/test.txt: -------------------------------------------------------------------------------- 1 | /m/0gq9h /award/award_category/winners./award/award_honor/ceremony /m/0bzlrh 2 | /m/080knyg /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/022q4l9 3 | /m/046qq /film/actor/film./film/performance/film /m/03shpq 4 | /m/060c4 /business/job_title/people_with_this_title./business/employment_tenure/company /m/0jpn8 5 | /m/0x67 /people/ethnicity/people /m/02h9_l 6 | /m/0qf2t /film/film/genre /m/01t_vv 7 | /m/06dfg /location/location/adjoin_s./location/adjoining_relationship/adjoins /m/07tp2 8 | /m/0l9k1 /award/award_nominee/award_nominations./award/award_nomination/award /m/0gq9h 9 | /m/01wgxtl /base/popstra/celebrity/dated./base/popstra/dated/participant /m/022q32 10 | /m/01qb5d /film/film/genre /m/02kdv5l 11 | /m/01qrbf /film/actor/film./film/performance/film 
/m/020bv3 12 | /m/03mp8k /music/record_label/artist /m/0127s7 13 | /m/03_8r /olympics/olympic_sport/athletes./olympics/olympic_athlete_affiliation/country /m/03h2c 14 | /m/01d259 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/06mzp 15 | /m/06fq2 /education/educational_institution/colors /m/036k5h 16 | /m/025sc50 /music/genre/artists /m/0412f5y 17 | /m/03h2c /location/country/official_language /m/06nm1 18 | /m/05np4c /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0k2mxq 19 | /m/05b5c /business/business_operation/industry /m/01mf0 20 | /m/0h1p /award/award_nominee/award_nominations./award/award_nomination/award /m/0gq9h 21 | /m/02rg_4 /education/educational_institution/school_type /m/05pcjw 22 | /m/025cn2 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0m_h6 23 | /m/0dq9p /people/cause_of_death/people /m/0hpz8 24 | /m/05qd_ /award/award_winner/awards_won./award/award_honor/award_winner /m/02q_cc 25 | /m/079dy /government/politician/government_positions_held./government/government_position_held/basic_title /m/060c4 26 | /m/041rx /people/ethnicity/people /m/01xndd 27 | /m/0x67 /people/ethnicity/languages_spoken /m/06nm1 28 | /m/025cn2 /people/person/place_of_birth /m/02_286 29 | /m/02rn00y /film/film/other_crew./film/film_crew_gig/film_crew_role /m/0d2b38 30 | /m/0b_c7 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0f5mdz 31 | /m/04ftdq /education/educational_institution/colors /m/083jv 32 | /m/016z9n /film/film/featured_film_locations /m/0rh6k 33 | /m/01vn0t_ /people/person/profession /m/016z4k 34 | /m/0l9k1 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0p_rk 35 | /m/0ml_m /location/location/adjoin_s./location/adjoining_relationship/adjoins /m/0mlw1 36 | /m/01wgxtl /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01wlt3k 37 | /m/016sp_ 
/award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01l03w2 38 | /m/0pqc5 /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office /m/0f2s6 39 | /m/022q32 /people/person/places_lived./people/place_lived/location /m/01jr6 40 | /m/01900g /people/person/profession /m/018gz8 41 | /m/04sx9_ /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01qscs 42 | /m/01l1b90 /base/popstra/celebrity/friendship./base/popstra/friendship/participant /m/01kgxf 43 | /m/027dtxw /award/award_category/nominees./award/award_nomination/nominated_for /m/05sy_5 44 | /m/0127s7 /award/award_nominee/award_nominations./award/award_nomination/award /m/01c99j 45 | /m/04zwtdy /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/05nyqk 46 | /m/09q17 /media_common/netflix_genre/titles /m/0640m69 47 | /m/0x67 /people/ethnicity/people /m/01sg7_ 48 | /m/01cwdk /education/educational_institution/school_type /m/05jxkf 49 | /m/0ct5zc /film/film/story_by /m/0jt90f5 50 | /m/03295l /people/ethnicity/languages_spoken /m/01jb8r 51 | /m/063g7l /film/actor/film./film/performance/film /m/01718w 52 | /m/0crh5_f /film/film/film_festivals /m/0fpkxfd 53 | /m/05br10 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0jqj5 54 | /m/03xgm3 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01l4zqz 55 | /m/0bdjd /film/film/featured_film_locations /m/0fsv2 56 | /m/04ydr95 /film/film/produced_by /m/0pmhf 57 | /m/01g257 /film/actor/film./film/performance/film /m/02pg45 58 | /m/027f2w /education/educational_degree/people_with_this_degree./education/education/institution /m/01jssp 59 | /m/0jqp3 /film/film/produced_by /m/09ftwr 60 | /m/02glc4 /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02cg7g 61 | /m/0gq_d /award/award_category/winners./award/award_honor/ceremony /m/0bzlrh 62 | 
/m/0bzlrh /award/award_ceremony/awards_presented./award/award_honor/honored_for /m/0p_rk 63 | /m/0pkyh /award/award_winner/awards_won./award/award_honor/award_winner /m/02cx90 64 | /m/022_lg /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0pd64 65 | /m/02nwxc /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/05p5nc 66 | /m/033qdy /film/film/language /m/064_8sq 67 | /m/0fd_1 /people/person/places_lived./people/place_lived/location /m/02_286 68 | /m/01rc6f /education/educational_institution/colors /m/083jv 69 | /m/02681_5 /award/award_category/winners./award/award_honor/award_winner /m/01wj18h 70 | /m/09k5jh7 /award/award_ceremony/awards_presented./award/award_honor/honored_for /m/02rn00y 71 | /m/0k_kr /music/record_label/artist /m/07c0j 72 | /m/09gdh6k /film/film/film_festivals /m/0bmj62v 73 | /m/0f63n /location/location/adjoin_s./location/adjoining_relationship/adjoins /m/0f6_4 74 | /m/0jw67 /film/director/film /m/012jfb 75 | /m/01vxlbm /people/person/profession /m/016z4k 76 | /m/01xcr4 /people/person/profession /m/0d8qb 77 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/02_kd 78 | /m/01rtm4 /education/educational_institution/students_graduates./education/education/student /m/01xndd 79 | /m/01nvmd_ /people/person/place_of_birth /m/01_d4 80 | /m/0170vn /people/person/places_lived./people/place_lived/location /m/02_286 81 | /m/011j5x /music/genre/artists /m/01dw_f 82 | /m/04g_wd /people/person/profession /m/0d8qb 83 | /m/0184jc /award/award_winner/awards_won./award/award_honor/award_winner /m/0c35b1 84 | /m/08052t3 /film/film/language /m/06nm1 85 | /m/02482c /education/educational_institution/school_type /m/07tf8 86 | /m/0p9xd /music/genre/artists /m/01304j 87 | /m/01rnly /film/film/featured_film_locations /m/02_286 88 | /m/0jrv_ /music/genre/artists /m/04rcr 89 | /m/0gmcwlb /award/award_winning_work/awards_won./award/award_honor/award /m/0gq9h 90 | /m/01900g 
/film/actor/film./film/performance/film /m/0234j5 91 | /m/0d608 /film/actor/film./film/performance/film /m/033pf1 92 | /m/0721cy /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/03xp8d5 93 | /m/05qd_ /business/business_operation/industry /m/02vxn 94 | /m/0k_kr /music/record_label/artist /m/0kr_t 95 | /m/0kz2w /education/educational_institution/school_type /m/07tf8 96 | /m/01q2nx /film/film/featured_film_locations /m/0rh6k 97 | /m/039wsf /people/person/place_of_birth /m/02_286 98 | /m/0337vz /people/person/place_of_birth /m/01_d4 99 | /m/01ymvk /education/educational_institution/colors /m/083jv 100 | /m/0m93 /influence/influence_node/influenced_by /m/0gz_ 101 | /m/0d608 /people/person/profession /m/018gz8 102 | /m/03kmyy /education/educational_institution/school_type /m/05pcjw 103 | /m/080dwhx /award/award_winning_work/awards_won./award/award_honor/award_winner /m/02l6dy 104 | /m/02kxbwx /film/director/film /m/0bz3jx 105 | /m/02fsn /music/performance_role/regular_performances./music/group_membership/role /m/07_l6 106 | /m/01wbsdz /music/artist/origin /m/01smm 107 | /m/025sc50 /music/genre/artists /m/01ws9n6 108 | /m/08wq0g /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0cmt6q 109 | /m/0l3h /location/statistical_region/religions./location/religion_percentage/religion /m/072w0 110 | /m/01p7b6b /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/01jc6q 111 | /m/02t_h3 /film/film/genre /m/01q03 112 | /m/060bp /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office /m/03548 113 | /m/08y2fn /tv/tv_program/regular_cast./tv/regular_tv_appearance/actor /m/06j8wx 114 | /m/0k60 /people/person/places_lived./people/place_lived/location /m/0fp5z 115 | /m/02d413 /film/film/cinematography /m/0f3zsq 116 | /m/09f0bj /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/05np4c 117 | /m/02cg7g 
/government/legislative_session/members./government/government_position_held/legislative_sessions /m/060ny2 118 | /m/059_gf /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/03shpq 119 | /m/01wk7ql /award/award_nominee/award_nominations./award/award_nomination/award /m/01c99j 120 | /m/03hr1p /olympics/olympic_sport/athletes./olympics/olympic_athlete_affiliation/country /m/0d060g 121 | /m/01m1dzc /award/award_winner/awards_won./award/award_honor/award_winner /m/01l03w2 122 | /m/0pqzh /influence/influence_node/influenced_by /m/03hnd 123 | /m/0x67 /people/ethnicity/people /m/059_gf 124 | /m/03q3sy /people/person/nationality /m/0d060g 125 | /m/0gvs1kt /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/02vzc 126 | /m/06_9lg /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/service_location /m/09c17 127 | /m/07s3m4g /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/02vzc 128 | /m/064_8sq /language/human_language/countries_spoken_in /m/05cc1 129 | /m/06qn87 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/08ct6 130 | /m/020h2v /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/011yth 131 | /m/02rdxsh /award/award_category/nominees./award/award_nomination/nominated_for /m/04b2qn 132 | /m/027dtxw /award/award_category/winners./award/award_honor/award_winner /m/01qscs 133 | /m/07l450 /film/film/language /m/064_8sq 134 | /m/041rx /people/ethnicity/people /m/0lrh 135 | /m/07lp1 /influence/influence_node/influenced_by /m/0lrh 136 | /m/0jsqk /film/film/language /m/064_8sq 137 | /m/01g257 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/015v3r 138 | /m/027dtxw /award/award_category/nominees./award/award_nomination/nominated_for /m/02_kd 139 | /m/02q_cc /award/award_nominee/award_nominations./award/award_nomination/award /m/0gq9h 140 | /m/02lgj6 
/award/award_winner/awards_won./award/award_honor/award_winner /m/02lgfh 141 | /m/02p3cr5 /music/record_label/artist /m/01323p 142 | /m/06k75 /time/event/locations /m/0d0kn 143 | /m/02xj3rw /award/award_category/disciplines_or_subjects /m/02vxn 144 | /m/031f_m /film/film/genre /m/02kdv5l 145 | /m/02x4wr9 /award/award_category/nominees./award/award_nomination/nominated_for /m/017z49 146 | /m/016srn /people/person/profession /m/016z4k 147 | /m/0js9s /film/director/film /m/017gl1 148 | /m/08hp53 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/04fzfj 149 | /m/0kv2hv /film/film/executive_produced_by /m/0bgrsl 150 | /m/0184jc /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/08c4yn 151 | /m/02681xs /award/award_category/winners./award/award_honor/ceremony /m/08pc1x 152 | /m/01hw5kk /film/film/production_companies /m/04rtpt 153 | /m/0721cy /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01jgpsh 154 | /m/01c99j /award/award_category/winners./award/award_honor/ceremony /m/05pd94v 155 | /m/02_kd /award/award_winning_work/awards_won./award/award_honor/award_winner /m/02l4rh 156 | /m/01tlmw /location/hud_county_place/county /m/0m2fr 157 | /m/04zwc /education/educational_institution/school_type /m/05jxkf 158 | /m/01m15br /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01l4zqz 159 | /m/07_l6 /music/instrument/instrumentalists /m/011zf2 160 | /m/023322 /music/group_member/membership./music/group_membership/role /m/0dwt5 161 | /m/02cx90 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01m15br 162 | /m/05qd_ /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/0jsqk 163 | /m/04bdzg /people/person/places_lived./people/place_lived/location /m/02_286 164 | /m/02kfzz /film/film/genre /m/02kdv5l 165 | /m/01q9b9 /people/person/profession /m/0d8qb 166 | /m/05_6_y 
/sports/pro_athlete/teams./sports/sports_team_roster/team /m/02b15h 167 | /m/02bqn1 /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02cg7g 168 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/011yth 169 | /m/02bn_p /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02glc4 170 | /m/03q3sy /film/actor/film./film/performance/film /m/05t0_2v 171 | /m/05szq8z /film/film/film_format /m/0cj16 172 | /m/02qny_ /soccer/football_player/current_team./sports/sports_team_roster/team /m/01k2xy 173 | /m/016sp_ /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/015c4g 174 | /m/0k2mxq /award/award_winner/awards_won./award/award_honor/award_winner /m/05np4c 175 | /m/0dgrwqr /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/06mzp 176 | /m/0b2v79 /film/film/genre /m/02kdv5l 177 | /m/034r25 /film/film/genre /m/02kdv5l 178 | /m/0qf2t /film/film/genre /m/01q03 179 | /m/0b1xl /education/educational_institution/students_graduates./education/education/student /m/012gq6 180 | /m/041rx /people/ethnicity/people /m/0161sp 181 | /m/03b3j /sports/professional_sports_team/draft_picks./sports/sports_league_draft_pick/school /m/01rc6f 182 | /m/03xp8d5 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0721cy 183 | /m/035s95 /film/film/production_companies /m/02slt7 184 | /m/08pc1x /award/award_ceremony/awards_presented./award/award_honor/award_winner /m/011zf2 185 | /m/0plw /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/service_language /m/06nm1 186 | /m/063_t /people/deceased_person/place_of_burial /m/0nb1s 187 | /m/01xzb6 /people/person/profession /m/09lbv 188 | /m/0dq9p /people/cause_of_death/people /m/0c12h 189 | /m/03q3sy /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/05t0_2v 190 | /m/08wq0g 
/award/award_winner/awards_won./award/award_honor/award_winner /m/0bt7ws 191 | /m/04fc6c /music/record_label/artist /m/03y82t6 192 | /m/06nm1 /media_common/netflix_genre/titles /m/091z_p 193 | /m/05qb8vx /award/award_ceremony/awards_presented./award/award_honor/honored_for /m/02z0f6l 194 | /m/08bqy9 /people/person/place_of_birth /m/09c17 195 | /m/0gxtknx /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/0d060g 196 | /m/03y82t6 /base/popstra/celebrity/friendship./base/popstra/friendship/participant /m/02bc74 197 | /m/084w8 /influence/influence_node/influenced_by /m/07dnx 198 | /m/01kwlwp /people/person/place_of_birth /m/0f2s6 199 | /m/0dq3c /business/job_title/people_with_this_title./business/employment_tenure/company /m/03mnk 200 | /m/01t6b4 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/038bht 201 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/017gl1 202 | /m/02tk74 /people/person/spouse_s./people/marriage/spouse /m/0h5g_ 203 | /m/02x8m /music/genre/artists /m/016376 204 | /m/020h2v /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/035s95 205 | /m/04yqlk /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/015p37 206 | -------------------------------------------------------------------------------- /data/fb237_v1_ind/valid.txt: -------------------------------------------------------------------------------- 1 | /m/022q32 /base/popstra/celebrity/breakup./base/popstra/breakup/participant /m/015pkc 2 | /m/04ly1 /location/location/contains /m/02vkzcx 3 | /m/08052t3 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/05sb1 4 | /m/02fwfb /film/film/genre /m/01t_vv 5 | /m/0dgrwqr /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/04v3q 6 | /m/01p4vl /people/person/profession /m/018gz8 7 | /m/02xj3rw 
/award/award_category/winners./award/award_honor/award_winner /m/0683n 8 | /m/06s6hs /people/person/places_lived./people/place_lived/location /m/0fr0t 9 | /m/0k_kr /music/record_label/artist /m/01323p 10 | /m/0x67 /people/ethnicity/people /m/080knyg 11 | /m/0x67 /people/ethnicity/people /m/0126y2 12 | /m/03m8y5 /film/film/genre /m/01t_vv 13 | /m/01pvxl /film/film/production_companies /m/016tt2 14 | /m/016zgj /music/genre/artists /m/082brv 15 | /m/09f0bj /base/popstra/celebrity/friendship./base/popstra/friendship/participant /m/06s6hs 16 | /m/0127s7 /base/popstra/celebrity/dated./base/popstra/dated/participant /m/015pkc 17 | /m/09kvv /education/educational_institution/students_graduates./education/education/student /m/01rc4p 18 | /m/06mzp /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/olympics /m/0l6ny 19 | /m/0kz2w /education/educational_institution/school_type /m/05pcjw 20 | /m/011j5x /music/genre/artists /m/0326tc 21 | /m/0411q /music/artist/track_contributions./music/track_contribution/role /m/01qzyz 22 | /m/011j5x /music/genre/parent_genre /m/06cqb 23 | /m/05qd_ /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0bmpm 24 | /m/0d608 /people/person/nationality /m/0d060g 25 | /m/01wj18h /award/award_nominee/award_nominations./award/award_nomination/award /m/02f73p 26 | /m/02681_5 /award/award_category/winners./award/award_honor/ceremony /m/092868 27 | /m/02x8m /music/genre/artists /m/01304j 28 | /m/0d060g /location/country/form_of_government /m/018wl5 29 | /m/0175wg /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/06mmb 30 | /m/02bc74 /people/person/profession /m/016z4k 31 | /m/0pmhf /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/034hzj 32 | /m/023tp8 /award/award_nominee/award_nominations./award/award_nomination/award /m/0cqh6z 33 | /m/0178g /organization/organization/headquarters./location/mailing_address/citytown /m/01_d4 34 | /m/0l6ny 
/user/jg/default_domain/olympic_games/sports /m/03_8r 35 | /m/05pd94v /award/award_ceremony/awards_presented./award/award_honor/award_winner /m/011zf2 36 | /m/01qrbf /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/05vsxz 37 | /m/0h7jp /location/location/contains /m/01b85 38 | /m/0h5g_ /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01qrbf 39 | /m/02l4rh /film/actor/film./film/performance/film /m/04w7rn 40 | /m/07yk1xz /film/film/other_crew./film/film_crew_gig/film_crew_role /m/0d2b38 41 | /m/025vl4m /award/award_winner/awards_won./award/award_honor/award_winner /m/026n3rs 42 | /m/02bqmq /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02bqn1 43 | /m/02482c /education/educational_institution/school_type /m/05jxkf 44 | /m/02chhq /film/film/genre /m/017fp 45 | /m/01vwbts /people/person/profession /m/016z4k 46 | /m/05zlld0 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/06mzp 47 | /m/041rx /people/ethnicity/people /m/05xpv 48 | /m/031f_m /film/film/genre /m/01zhp 49 | /m/02fwfb /film/film/production_companies /m/0gfmc_ 50 | /m/041rx /people/ethnicity/people /m/0l9k1 51 | /m/02x4wr9 /award/award_category/nominees./award/award_nomination/nominated_for /m/047myg9 52 | /m/09f0bj /award/award_winner/awards_won./award/award_honor/award_winner /m/05np4c 53 | /m/0248jb /award/award_category/winners./award/award_honor/ceremony /m/01mh_q 54 | /m/020h2v /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0pc62 55 | /m/03qjlz /people/person/places_lived./people/place_lived/location /m/02_286 56 | /m/0209xj /film/film/language /m/064_8sq 57 | /m/0992d9 /film/film/genre /m/02kdv5l 58 | /m/0738b8 /people/person/profession /m/018gz8 59 | /m/0d060g /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/olympics /m/0jkvj 60 | /m/02vzc 
/olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/olympics /m/0jkvj 61 | /m/01_j71 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/02_n5d 62 | /m/01s0ps /music/performance_role/regular_performances./music/group_membership/role /m/0dwt5 63 | /m/0269kx /education/educational_institution/colors /m/036k5h 64 | /m/02yxjs /education/educational_institution/students_graduates./education/education/student /m/0glmv 65 | /m/02sddg /sports/sports_position/players./sports/sports_team_roster/team /m/049n7 66 | /m/04fzfj /award/award_winning_work/awards_won./award/award_honor/award_winner /m/08hp53 67 | /m/05qd_ /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/0bmpm 68 | /m/06mzp /location/location/contains /m/0lfyd 69 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/0jqj5 70 | /m/0jnlm /sports/sports_team/colors /m/083jv 71 | /m/02y_rq5 /award/award_category/nominees./award/award_nomination/nominated_for /m/02_06s 72 | /m/0g5ff /award/award_nominee/award_nominations./award/award_nomination/award /m/0265vt 73 | /m/0234j5 /film/film/genre /m/02kdv5l 74 | /m/09dt7 /award/award_nominee/award_nominations./award/award_nomination/award /m/0265vt 75 | /m/02mhfy /film/actor/film./film/performance/film /m/0372j5 76 | /m/0478__m /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01wj18h 77 | /m/0js9s /film/actor/film./film/performance/film /m/08k40m 78 | /m/060c4 /business/job_title/people_with_this_title./business/employment_tenure/company /m/02482c 79 | /m/041rx /people/ethnicity/people /m/03f0fnk 80 | /m/022dp5 /people/ethnicity/people /m/01nvmd_ 81 | /m/021yzs /film/actor/film./film/performance/film /m/02pg45 82 | /m/01xndd /people/person/place_of_birth /m/02_286 83 | /m/016tt2 /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/07s3m4g 84 | /m/02_3zj 
/award/award_category/nominees./award/award_nomination/nominated_for /m/080dwhx 85 | /m/0b7l4x /film/film/other_crew./film/film_crew_gig/film_crew_role /m/0d2b38 86 | /m/0pkyh /influence/influence_node/influenced_by /m/041h0 87 | /m/01qb5d /film/film/featured_film_locations /m/02_286 88 | /m/02bqmq /government/legislative_session/members./government/government_position_held/legislative_sessions /m/060ny2 89 | /m/016sp_ /music/artist/origin /m/01smm 90 | /m/041rx /people/ethnicity/people /m/063_t 91 | /m/0pb33 /film/film/genre /m/02kdv5l 92 | /m/0d060g /olympics/olympic_participating_country/athletes./olympics/olympic_athlete_affiliation/olympics /m/0lbbj 93 | /m/01ct6 /sports/sports_team/colors /m/083jv 94 | /m/05b7q /location/statistical_region/religions./location/religion_percentage/religion /m/092bf5 95 | /m/0x67 /people/ethnicity/people /m/01wlt3k 96 | /m/0jm4v /sports/professional_sports_team/draft_picks./sports/sports_league_draft_pick/draft /m/0f4vx0 97 | /m/064_8sq /language/human_language/countries_spoken_in /m/03676 98 | /m/01l4zqz /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/011zf2 99 | /m/023gxx /film/film/genre /m/017fp 100 | /m/047q2k1 /film/film/genre /m/01chg 101 | /m/073v6 /influence/influence_node/influenced_by /m/07dnx 102 | /m/0fsv2 /location/hud_county_place/county /m/0m24v 103 | /m/05qd_ /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0l9k1 104 | /m/07w5rq /education/educational_institution/school_type /m/01_srz 105 | /m/03xp8d5 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/07_s4b 106 | /m/0jqj5 /award/award_winning_work/awards_won./award/award_honor/award_winner /m/05br10 107 | /m/04b2qn /film/film/genre /m/04228s 108 | /m/01hkhq /award/award_winner/awards_won./award/award_honor/award_winner /m/02l4rh 109 | /m/03m2fg /people/person/profession /m/018gz8 110 | /m/06k75 
/military/military_conflict/combatants./military/military_combatant_group/combatants /m/02vzc 111 | /m/01ljpm /education/educational_institution/students_graduates./education/education/major_field_of_study /m/01jzxy 112 | /m/0l1589 /music/performance_role/regular_performances./music/group_membership/role /m/01s0ps 113 | /m/0bmpm /award/award_winning_work/awards_won./award/award_honor/award_winner /m/05qd_ 114 | /m/0p_qr /award/award_winning_work/awards_won./award/award_honor/award_winner /m/046qq 115 | /m/04ly1 /location/location/contains /m/03x33n 116 | /m/02y_rq5 /award/award_category/nominees./award/award_nomination/nominated_for /m/0p_qr 117 | /m/05np4c /award/award_winner/awards_won./award/award_honor/award_winner /m/06s6hs 118 | /m/0m6x4 /film/actor/film./film/performance/film /m/083skw 119 | /m/04gmlt /music/record_label/artist /m/0249kn 120 | /m/041rx /people/ethnicity/people /m/01z_g6 121 | /m/06j8wx /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01hkhq 122 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/02rn00y 123 | /m/0dn3n /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/01pvxl 124 | /m/0209xj /film/film/written_by /m/01_f_5 125 | /m/016tt2 /business/business_operation/industry /m/02vxn 126 | /m/02ldkf /education/educational_institution/school_type /m/05jxkf 127 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/03pc89 128 | /m/03nqnk3 /award/award_category/winners./award/award_honor/award_winner /m/033rq 129 | /m/012gbb /base/popstra/celebrity/dated./base/popstra/dated/participant /m/0gmtm 130 | /m/0pd64 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/02vzc 131 | /m/060c4 /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office /m/05b7q 132 | /m/064nh4k /award/award_nominee/award_nominations./award/award_nomination/award_nominee 
/m/032q8q 133 | /m/01hlwv /organization/organization/headquarters./location/mailing_address/citytown /m/02_286 134 | /m/0p9tm /film/film/production_companies /m/016tt2 135 | /m/0d060g /olympics/olympic_participating_country/athletes./olympics/olympic_athlete_affiliation/olympics /m/0jkvj 136 | /m/015q1n /education/educational_institution/school_type /m/05jxkf 137 | /m/0x67 /people/ethnicity/people /m/01vxlbm 138 | /m/0dq9p /people/cause_of_death/people /m/05xpv 139 | /m/064nh4k /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/022q4l9 140 | /m/016srn /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/02cx90 141 | /m/0j6b5 /film/film/other_crew./film/film_crew_gig/film_crew_role /m/094hwz 142 | /m/0jqp3 /film/film/produced_by /m/0170vn 143 | /m/019pkm /people/person/places_lived./people/place_lived/location /m/02_286 144 | /m/0j11 /location/administrative_division/first_level_division_of /m/049nq 145 | /m/0h5g_ /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/020bv3 146 | /m/01l4zqz /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/03xgm3 147 | /m/01c99j /award/award_category/winners./award/award_honor/ceremony /m/01mh_q 148 | /m/05650n /award/award_winning_work/awards_won./award/award_honor/award /m/02qyxs5 149 | /m/0crh5_f /film/film/release_date_s./film/film_regional_release_date/film_regional_debut_venue /m/0fpkxfd 150 | /m/04dsnp /film/film/produced_by /m/0jw67 151 | /m/027dtxw /award/award_category/nominees./award/award_nomination/nominated_for /m/0b2v79 152 | /m/0pd64 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/02_286 153 | /m/01f85k /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/0d060g 154 | /m/015q1n /education/educational_institution/students_graduates./education/education/student /m/01nvmd_ 155 | /m/0m6x4 
/award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0jsqk 156 | /m/03k8th /film/film/featured_film_locations /m/02_286 157 | /m/0jpn8 /education/educational_institution/colors /m/036k5h 158 | /m/0hkxq /food/food/nutrients./food/nutrition_fact/nutrient /m/0h1yy 159 | /m/025sc50 /music/genre/artists /m/02h9_l 160 | /m/0bv7t /influence/influence_node/influenced_by /m/017_pb 161 | /m/041rx /people/ethnicity/people /m/058vp 162 | /m/0pd64 /award/award_winning_work/awards_won./award/award_honor/award /m/0gq9h 163 | /m/0175wg /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/020bv3 164 | /m/047n8xt /film/film/written_by /m/02r6c_ 165 | /m/08052t3 /film/film/genre /m/02kdv5l 166 | /m/01ws9n6 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01wlt3k 167 | /m/0pqc5 /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office /m/07bcn 168 | /m/0gdh5 /people/person/profession /m/016z4k 169 | /m/01qb5d /film/film/production_companies /m/016tt2 170 | /m/032q8q /film/actor/film./film/performance/film /m/01q2nx 171 | /m/02l6dy /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01wy5m 172 | /m/03548 /location/country/official_language /m/064_8sq 173 | /m/01718w /film/film/music /m/0417z2 174 | /m/0gvs1kt /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/04v3q 175 | /m/01s0ps /music/performance_role/regular_performances./music/group_membership/role /m/0l1589 176 | /m/022q4l9 /award/award_winner/awards_won./award/award_honor/award_winner /m/064nh4k 177 | /m/0161sp /base/popstra/celebrity/friendship./base/popstra/friendship/participant /m/0pmhf 178 | /m/0gq_d /award/award_category/winners./award/award_honor/ceremony /m/0bzn6_ 179 | /m/06cgy /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0jqj5 180 | /m/05p5nc 
/award/award_nominee/award_nominations./award/award_nomination/award /m/0cqh6z 181 | /m/01m15br /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/02cx90 182 | /m/0c12h /award/award_nominee/award_nominations./award/award_nomination/award /m/027dtxw 183 | /m/04mby /people/person/places_lived./people/place_lived/location /m/071cn 184 | /m/017gl1 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/06mzp 185 | /m/015pkc /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/02s2ft 186 | /m/02_3zj /award/award_category/nominees./award/award_nomination/nominated_for /m/0524b41 187 | /m/02bf58 /organization/organization/headquarters./location/mailing_address/citytown /m/01_d4 188 | /m/032q8q /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/080knyg 189 | /m/0146hc /education/educational_institution/school_type /m/05jxkf 190 | /m/02y_rq5 /award/award_category/nominees./award/award_nomination/nominated_for /m/0294mx 191 | /m/01jgpsh /film/actor/film./film/performance/film /m/03nqnnk 192 | /m/04fzfj /film/film/story_by /m/08hp53 193 | /m/01_d4 /location/location/contains /m/065r8g 194 | /m/020bv3 /film/film/production_companies /m/03sb38 195 | /m/02z0f6l /film/film/genre /m/017fp 196 | /m/04nw9 /film/actor/film./film/performance/film /m/0gnjh 197 | /m/03ft8 /people/person/places_lived./people/place_lived/location /m/0100mt 198 | /m/02qny_ /soccer/football_player/current_team./sports/sports_team_roster/team /m/0175rc 199 | /m/024lt6 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/04v3q 200 | /m/0jkvj /olympics/olympic_games/sports /m/03hr1p 201 | /m/012gk9 /film/film/written_by /m/03ft8 202 | /m/02cg7g /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02glc4 203 | /m/0f6zs /location/location/contains /m/0ybkj 204 | /m/07gql /music/instrument/instrumentalists /m/03xgm3 205 | 
/m/041rx /people/ethnicity/people /m/073v6 206 | /m/05ztm4r /people/person/places_lived./people/place_lived/location /m/02_286 207 | -------------------------------------------------------------------------------- /managers/__pycache__/evaluator.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/managers/__pycache__/evaluator.cpython-36.pyc -------------------------------------------------------------------------------- /managers/__pycache__/trainer.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/managers/__pycache__/trainer.cpython-36.pyc -------------------------------------------------------------------------------- /managers/evaluator.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import torch 4 | import pdb 5 | from sklearn import metrics 6 | import torch.nn.functional as F 7 | from torch.utils.data import DataLoader 8 | 9 | 10 | class Evaluator(): 11 | def __init__(self, params, graph_classifier, data): 12 | self.params = params 13 | self.graph_classifier = graph_classifier 14 | self.data = data 15 | 16 | def eval(self, save=False): 17 | pos_scores = [] 18 | pos_labels = [] 19 | neg_scores = [] 20 | neg_labels = [] 21 | pos_nodes_num_list = [] 22 | neg_nodes_num_list = [] 23 | dataloader = DataLoader(self.data, batch_size=self.params.batch_size, shuffle=False, num_workers=self.params.num_workers, collate_fn=self.params.collate_fn) 24 | 25 | self.graph_classifier.eval() 26 | with torch.no_grad(): 27 | for b_idx, batch in enumerate(dataloader): 28 | 29 | data_pos, targets_pos, data_neg, targets_neg, _ = self.params.move_batch_to_device(batch, self.params.device) 30 | 
pos_nodes_num_list.extend(data_pos[0].batch_num_nodes) 31 | neg_nodes_num_list.extend(data_neg[0].batch_num_nodes) 32 | score_pos = self.graph_classifier(data_pos) 33 | score_neg = self.graph_classifier(data_neg) 34 | 35 | pos_scores += score_pos.squeeze(1).detach().cpu().tolist() 36 | neg_scores += score_neg.squeeze(1).detach().cpu().tolist() 37 | pos_labels += targets_pos.tolist() 38 | neg_labels += targets_neg.tolist() 39 | 40 | assert not torch.any(torch.isnan(torch.tensor(pos_scores + neg_scores))) 41 | 42 | auc = metrics.roc_auc_score(pos_labels + neg_labels, pos_scores + neg_scores) 43 | auc_pr = metrics.average_precision_score(pos_labels + neg_labels, pos_scores + neg_scores) 44 | 45 | if save: 46 | pos_test_triplets_path = os.path.join(self.params.main_dir, 'data/{}/{}.txt'.format(self.params.dataset, self.data.file_name)) 47 | with open(pos_test_triplets_path) as f: 48 | pos_triplets = [line.split() for line in f.read().split('\n')[:-1]] 49 | pos_file_path = os.path.join(self.params.main_dir, 'data/{}/grail_{}_predictions.txt'.format(self.params.dataset, self.data.file_name)) 50 | with open(pos_file_path, "w") as f: 51 | for ([s, r, o], score, nodes_num) in zip(pos_triplets, pos_scores, pos_nodes_num_list): 52 | f.write('\t'.join([s, r, o, str(score), str(nodes_num)]) + '\n') 53 | 54 | neg_test_triplets_path = os.path.join(self.params.main_dir, 'data/{}/neg_{}_0.txt'.format(self.params.dataset, self.data.file_name)) 55 | with open(neg_test_triplets_path) as f: 56 | neg_triplets = [line.split() for line in f.read().split('\n')[:-1]] 57 | neg_file_path = os.path.join(self.params.main_dir, 'data/{}/grail_neg_{}_{}_predictions.txt'.format(self.params.dataset, self.data.file_name, self.params.constrained_neg_prob)) 58 | with open(neg_file_path, "w") as f: 59 | for ([s, r, o], score, nodes_num) in zip(neg_triplets, neg_scores, neg_nodes_num_list): 60 | f.write('\t'.join([s, r, o, str(score), str(nodes_num)]) + '\n') 61 | 62 | return {'auc': auc, 'auc_pr': 
auc_pr} 63 | -------------------------------------------------------------------------------- /managers/trainer.py: -------------------------------------------------------------------------------- 1 | import statistics 2 | import timeit 3 | import os 4 | import logging 5 | import pdb 6 | import numpy as np 7 | import time 8 | 9 | import torch 10 | import torch.nn as nn 11 | import torch.optim as optim 12 | import torch.nn.functional as F 13 | from torch.utils.data import DataLoader 14 | import dgl 15 | from sklearn import metrics 16 | 17 | 18 | class Trainer(): 19 | def __init__(self, params, graph_classifier, train, valid_evaluator=None): 20 | self.graph_classifier = graph_classifier 21 | self.valid_evaluator = valid_evaluator 22 | self.params = params 23 | self.train_data = train 24 | 25 | self.updates_counter = 0 26 | 27 | model_params = list(self.graph_classifier.parameters()) 28 | logging.info('Total number of parameters: %d' % sum(map(lambda x: x.numel(), model_params))) 29 | 30 | if params.optimizer == "SGD": 31 | self.optimizer = optim.SGD(model_params, lr=params.lr, momentum=params.momentum, weight_decay=self.params.l2) 32 | if params.optimizer == "Adam": 33 | self.optimizer = optim.Adam(model_params, lr=params.lr, weight_decay=self.params.l2) 34 | 35 | self.criterion = nn.MarginRankingLoss(self.params.margin, reduction='mean') 36 | self.b_xent = nn.BCEWithLogitsLoss() 37 | self.reset_training_state() 38 | 39 | def reset_training_state(self): 40 | self.best_metric = 0 41 | self.last_metric = 0 42 | self.not_improved_count = 0 43 | 44 | def train_epoch(self): 45 | total_loss = 0 46 | total_MI_loss = 0 47 | all_labels = [] 48 | all_scores = [] 49 | 50 | dataloader = DataLoader(self.train_data, batch_size=self.params.batch_size, shuffle=True, num_workers=self.params.num_workers, collate_fn=self.params.collate_fn) 51 | self.graph_classifier.train() 52 | model_params = list(self.graph_classifier.parameters()) 53 | 54 | for b_idx, batch in enumerate(dataloader): 
55 | # print("batch:", b_idx) 56 | 57 | # Input positive and negative graph 58 | data_pos, targets_pos, data_neg, targets_neg, data_cor = self.params.move_batch_to_device(batch, self.params.device) 59 | self.optimizer.zero_grad() 60 | self.graph_classifier.train() 61 | score_pos, s_G_pos, s_g_pos = self.graph_classifier(data_pos, is_return_emb=True) 62 | score_neg = self.graph_classifier(data_neg) 63 | 64 | # loss = self.criterion(score_pos, score_neg.view(len(score_pos), -1).mean(dim=1), torch.Tensor([1]).to(device=self.params.device)) 65 | loss = self.criterion(score_pos.squeeze(-1), score_neg.view(len(score_pos), -1).mean(dim=1), torch.Tensor([1]).to(device=self.params.device)) 66 | # print(f"loss: {loss}") 67 | 68 | dgi_loss = 0 69 | if self.params.coef_dgi_loss: 70 | _, _, s_g_cor = self.graph_classifier(data_cor, is_return_emb=True, cor_graph=True) 71 | 72 | # Calculate the DGI loss 73 | lbl_1 = torch.ones(data_pos[0].batch_size) 74 | lbl_2 = torch.zeros(data_pos[0].batch_size) 75 | lbl = torch.cat((lbl_1, lbl_2)).to(self.params.device) 76 | logits = self.graph_classifier.get_logits(s_G_pos, s_g_pos, s_g_cor) 77 | dgi_loss = self.b_xent(logits, lbl) 78 | 79 | print(f'supervised loss: {loss}, NCE loss: {dgi_loss}') 80 | loss = loss + self.params.coef_dgi_loss * dgi_loss 81 | 82 | loss.backward() 83 | self.optimizer.step() 84 | self.updates_counter += 1 85 | 86 | with torch.no_grad(): 87 | all_scores += score_pos.squeeze().detach().cpu().tolist() + score_neg.squeeze().detach().cpu().tolist() 88 | all_labels += targets_pos.tolist() + targets_neg.tolist() 89 | total_loss += loss 90 | total_MI_loss += dgi_loss 91 | 92 | if self.valid_evaluator and self.params.eval_every_iter and self.updates_counter % self.params.eval_every_iter == 0: 93 | tic = time.time() 94 | result = self.valid_evaluator.eval() 95 | logging.info('\nPerformance:' + str(result) + 'in ' + str(time.time() - tic)) 96 | 97 | if result['auc'] >= self.best_metric: 98 | self.save_classifier() 99 | 
self.best_metric = result['auc'] 100 | self.not_improved_count = 0 101 | else: 102 | self.not_improved_count += 1 103 | if self.not_improved_count > self.params.early_stop: 104 | logging.info(f"Validation performance didn't improve for {self.params.early_stop} evaluations. Training stops.") 105 | break 106 | 107 | self.last_metric = result['auc'] 108 | 109 | auc = metrics.roc_auc_score(all_labels, all_scores) 110 | auc_pr = metrics.average_precision_score(all_labels, all_scores) 111 | 112 | weight_norm = sum(map(lambda x: torch.norm(x), model_params)) 113 | 114 | return total_loss, total_MI_loss, auc, auc_pr, weight_norm 115 | 116 | def train(self): 117 | self.reset_training_state() 118 | 119 | for epoch in range(1, self.params.num_epochs + 1): 120 | time_start = time.time() 121 | loss, MI_loss, auc, auc_pr, weight_norm = self.train_epoch() 122 | time_elapsed = time.time() - time_start 123 | logging.info(f'Epoch {epoch} with loss: {loss}, MI loss: {MI_loss}, training auc: {auc}, training auc_pr: {auc_pr}, best validation AUC: {self.best_metric}, weight_norm: {weight_norm} in {time_elapsed}') 124 | 125 | if epoch % self.params.save_every == 0: 126 | torch.save(self.graph_classifier, os.path.join(self.params.exp_dir, 'graph_classifier_chk.pth')) 127 | 128 | def save_classifier(self): 129 | torch.save(self.graph_classifier, os.path.join(self.params.exp_dir, 'best_graph_classifier.pth')) # torch.save overwrites any existing file at this path 130 | logging.info('Better model found w.r.t. validation AUC.
Saved it!') 131 | -------------------------------------------------------------------------------- /model/dgl/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__init__.py -------------------------------------------------------------------------------- /model/dgl/__pycache__/__init__.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/__init__.cpython-36.pyc -------------------------------------------------------------------------------- /model/dgl/__pycache__/aggregators.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/aggregators.cpython-36.pyc -------------------------------------------------------------------------------- /model/dgl/__pycache__/batch_gru.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/batch_gru.cpython-36.pyc -------------------------------------------------------------------------------- /model/dgl/__pycache__/discriminator.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/discriminator.cpython-36.pyc -------------------------------------------------------------------------------- /model/dgl/__pycache__/graph_classifier.cpython-36.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/graph_classifier.cpython-36.pyc -------------------------------------------------------------------------------- /model/dgl/__pycache__/layers.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/layers.cpython-36.pyc -------------------------------------------------------------------------------- /model/dgl/__pycache__/rgcn_model.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/rgcn_model.cpython-36.pyc -------------------------------------------------------------------------------- /model/dgl/aggregators.py: -------------------------------------------------------------------------------- 1 | import abc 2 | import torch.nn as nn 3 | import torch 4 | import torch.nn.functional as F 5 | 6 | 7 | class Aggregator(nn.Module): 8 | def __init__(self, emb_dim): 9 | super(Aggregator, self).__init__() 10 | 11 | def forward(self, node): 12 | curr_emb = node.mailbox['curr_emb'][:, 0, :] # (B, F) 13 | nei_msg = torch.bmm(node.mailbox['alpha'].transpose(1, 2), node.mailbox['msg']).squeeze(1) # (B, F) 14 | # nei_msg, _ = torch.max(node.mailbox['msg'], 1) # (B, F) 15 | 16 | new_emb = self.update_embedding(curr_emb, nei_msg) 17 | 18 | return {'h': new_emb} 19 | 20 | @abc.abstractmethod 21 | def update_embedding(self, curr_emb, nei_msg): 22 | raise NotImplementedError 23 | 24 | 25 | class SumAggregator(Aggregator): 26 | def __init__(self, emb_dim): 27 | super(SumAggregator, self).__init__(emb_dim) 28 | 29 | def update_embedding(self, curr_emb, nei_msg): 30 | new_emb = nei_msg + curr_emb 31 | 32 | return new_emb 33 | 34 | 35 | class MLPAggregator(Aggregator): 36 | def
__init__(self, emb_dim): 37 | super(MLPAggregator, self).__init__(emb_dim) 38 | self.linear = nn.Linear(2 * emb_dim, emb_dim) 39 | 40 | def update_embedding(self, curr_emb, nei_msg): 41 | inp = torch.cat((nei_msg, curr_emb), 1) 42 | new_emb = F.relu(self.linear(inp)) 43 | 44 | return new_emb 45 | 46 | 47 | class GRUAggregator(Aggregator): 48 | def __init__(self, emb_dim): 49 | super(GRUAggregator, self).__init__(emb_dim) 50 | self.gru = nn.GRUCell(emb_dim, emb_dim) 51 | 52 | def update_embedding(self, curr_emb, nei_msg): 53 | new_emb = self.gru(nei_msg, curr_emb) 54 | 55 | return new_emb 56 | -------------------------------------------------------------------------------- /model/dgl/batch_gru.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch 3 | import torch.nn.functional as F 4 | import math 5 | 6 | class BatchGRU(nn.Module): 7 | def __init__(self, hidden_size=300): 8 | super(BatchGRU, self).__init__() 9 | self.hidden_size = hidden_size 10 | self.gru = nn.GRU(self.hidden_size, self.hidden_size, batch_first=True, 11 | bidirectional=True) 12 | self.bias = nn.Parameter(torch.Tensor(self.hidden_size)) 13 | self.bias.data.uniform_(-1.0 / math.sqrt(self.hidden_size), 14 | 1.0 / math.sqrt(self.hidden_size)) 15 | 16 | 17 | def forward(self, node, a_scope): 18 | hidden = node 19 | # print(hidden.shape) 20 | message = F.relu(node + self.bias) 21 | MAX_node_len = max(a_scope) 22 | # padding 23 | message_lst = [] 24 | hidden_lst = [] 25 | a_start = 0 26 | for i in a_scope: 27 | i = int(i) 28 | if i == 0: 29 | assert 0 30 | cur_message = message.narrow(0, a_start, i) 31 | cur_hidden = hidden.narrow(0, a_start, i) 32 | hidden_lst.append(cur_hidden.max(0)[0].unsqueeze(0).unsqueeze(0)) 33 | a_start += i 34 | cur_message = torch.nn.ZeroPad2d((0,0,0,MAX_node_len-cur_message.shape[0]))(cur_message) 35 | message_lst.append(cur_message.unsqueeze(0)) 36 | 37 | message_lst = torch.cat(message_lst, 0) 38 | 
hidden_lst = torch.cat(hidden_lst, 1) 39 | hidden_lst = hidden_lst.repeat(2,1,1) 40 | cur_message, cur_hidden = self.gru(message_lst, hidden_lst) 41 | 42 | # unpadding 43 | cur_message_unpadding = [] 44 | kk = 0 45 | for a_size in a_scope: 46 | a_size = int(a_size) 47 | cur_message_unpadding.append(cur_message[kk, :a_size].view(-1, 2*self.hidden_size)) 48 | kk += 1 49 | cur_message_unpadding = torch.cat(cur_message_unpadding, 0) 50 | 51 | return cur_message_unpadding -------------------------------------------------------------------------------- /model/dgl/discriminator.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import time 4 | class Discriminator(nn.Module): 5 | r""" Discriminator module for calculating MI""" 6 | 7 | def __init__(self, n_e, n_g): 8 | """ 9 | param: n_e: dimension of edge embedding 10 | param: n_g: dimension of graph embedding 11 | """ 12 | super(Discriminator, self).__init__() 13 | self.f_k = nn.Bilinear(n_e, n_g, 1) 14 | 15 | for m in self.modules(): 16 | self.weights_init(m) 17 | 18 | def weights_init(self, m): 19 | if isinstance(m, nn.Bilinear): 20 | torch.nn.init.xavier_uniform_(m.weight.data) 21 | if m.bias is not None: 22 | m.bias.data.fill_(0.0) 23 | 24 | def forward(self, c, h_pl, h_mi, s_bias1=None, s_bias2=None): 25 | c_x = torch.unsqueeze(c, 0) # [1, F] 26 | c_x = c_x.expand_as(h_pl) #[B, F] 27 | 28 | sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 1) # [B]; self.f_k(h_pl, c_x): [B, 1] 29 | sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 1) # [B] 30 | 31 | # print('Discriminator time:', time.time() - ts) 32 | if s_bias1 is not None: 33 | sc_1 += s_bias1 34 | if s_bias2 is not None: 35 | sc_2 += s_bias2 36 | 37 | logits = torch.cat((sc_1, sc_2)) 38 | 39 | return logits 40 | -------------------------------------------------------------------------------- /model/dgl/graph_classifier.py: 
-------------------------------------------------------------------------------- 1 | from .rgcn_model import RGCN 2 | from dgl import mean_nodes 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import torch 6 | import time 7 | import numpy as np 8 | from .discriminator import Discriminator 9 | from .batch_gru import BatchGRU 10 | """ 11 | File based off of dgl tutorial on RGCN 12 | Source: https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn 13 | """ 14 | 15 | 16 | class GraphClassifier(nn.Module): 17 | def __init__(self, params, relation2id, ent2rels): # in_dim, h_dim, rel_emb_dim, out_dim, num_rels, num_bases): 18 | super().__init__() 19 | 20 | self.params = params 21 | self.relation2id = relation2id 22 | self.ent2rels = ent2rels 23 | self.gnn = RGCN(params) # in_dim, h_dim, h_dim, num_rels, num_bases) 24 | 25 | # num_rels + 1 instead of nums_rels, in order to add a "padding" relation. 26 | self.rel_emb = nn.Embedding(self.params.num_rels + 1, self.params.inp_dim, sparse=False, padding_idx=self.params.num_rels) 27 | 28 | self.ent_padding = nn.Parameter(torch.FloatTensor(1, self.params.sem_dim).uniform_(-1, 1)) 29 | if self.params.init_nei_rels == 'both': 30 | self.w_rel2ent = nn.Linear(2 * self.params.inp_dim, self.params.sem_dim) 31 | elif self.params.init_nei_rels in ('out', 'in'): 32 | self.w_rel2ent = nn.Linear(self.params.inp_dim, self.params.sem_dim) 33 | 34 | self.sigmoid = nn.Sigmoid() 35 | self.nei_rels_dropout = nn.Dropout(self.params.nei_rels_dropout) 36 | self.dropout = nn.Dropout(self.params.dropout) 37 | self.softmax = nn.Softmax(dim=1) 38 | 39 | if self.params.add_ht_emb: 40 | # self.fc_layer = nn.Linear(3 * self.params.num_gcn_layers * self.params.emb_dim + self.params.rel_emb_dim, 1) 41 | self.fc_layer = nn.Linear(3 * self.params.num_gcn_layers * self.params.emb_dim + self.params.emb_dim, 1) 42 | else: 43 | self.fc_layer = nn.Linear(self.params.num_gcn_layers * self.params.emb_dim + self.params.rel_emb_dim, 1) 44 | 45 | 
if self.params.comp_hrt: 46 | self.fc_layer = nn.Linear(2 * self.params.num_gcn_layers * self.params.emb_dim, 1) 47 | 48 | if self.params.nei_rel_path: 49 | self.fc_layer = nn.Linear(3 * self.params.num_gcn_layers * self.params.emb_dim + 2 * self.params.emb_dim, 1) 50 | 51 | if self.params.comp_ht == 'mlp': 52 | self.fc_comp = nn.Linear(2 * self.params.emb_dim, self.params.emb_dim) 53 | 54 | if self.params.nei_rel_path: 55 | self.disc = Discriminator(self.params.num_gcn_layers * self.params.emb_dim + self.params.emb_dim, self.params.num_gcn_layers * self.params.emb_dim + self.params.emb_dim) 56 | else: 57 | self.disc = Discriminator(self.params.num_gcn_layers * self.params.emb_dim , self.params.num_gcn_layers * self.params.emb_dim) 58 | 59 | self.rnn = torch.nn.GRU(self.params.emb_dim, self.params.emb_dim, batch_first=True) 60 | 61 | self.batch_gru = BatchGRU(self.params.num_gcn_layers * self.params.emb_dim ) 62 | 63 | self.W_o = nn.Linear(self.params.num_gcn_layers * self.params.emb_dim * 2, self.params.num_gcn_layers * self.params.emb_dim) 64 | 65 | def init_ent_emb_matrix(self, g): 66 | """ Initialize feature of entities by matrix form """ 67 | out_nei_rels = g.ndata['out_nei_rels'] 68 | in_nei_rels = g.ndata['in_nei_rels'] 69 | 70 | target_rels = g.ndata['r_label'] 71 | out_nei_rels_emb = self.rel_emb(out_nei_rels) 72 | in_nei_rels_emb = self.rel_emb(in_nei_rels) 73 | target_rels_emb = self.rel_emb(target_rels).unsqueeze(2) 74 | 75 | out_atts = self.softmax(self.nei_rels_dropout(torch.matmul(out_nei_rels_emb, target_rels_emb).squeeze(2))) 76 | in_atts = self.softmax(self.nei_rels_dropout(torch.matmul(in_nei_rels_emb, target_rels_emb).squeeze(2))) 77 | out_sem_feats = torch.matmul(out_atts.unsqueeze(1), out_nei_rels_emb).squeeze(1) 78 | in_sem_feats = torch.matmul(in_atts.unsqueeze(1), in_nei_rels_emb).squeeze(1) 79 | 80 | if self.params.init_nei_rels == 'both': 81 | ent_sem_feats = self.sigmoid(self.w_rel2ent(torch.cat([out_sem_feats, in_sem_feats], dim=1))) 82 
| elif self.params.init_nei_rels == 'out': 83 | ent_sem_feats = self.sigmoid(self.w_rel2ent(out_sem_feats)) 84 | elif self.params.init_nei_rels == 'in': 85 | ent_sem_feats = self.sigmoid(self.w_rel2ent(in_sem_feats)) 86 | 87 | g.ndata['init'] = torch.cat([g.ndata['feat'], ent_sem_feats], dim=1) # [B, self.inp_dim] 88 | 89 | def comp_ht_emb(self, head_embs, tail_embs): 90 | if self.params.comp_ht == 'mult': 91 | ht_embs = head_embs * tail_embs 92 | elif self.params.comp_ht == 'mlp': 93 | ht_embs = self.fc_comp(torch.cat([head_embs, tail_embs], dim=1)) 94 | elif self.params.comp_ht == 'sum': 95 | ht_embs = head_embs + tail_embs 96 | else: 97 | raise KeyError(f'composition operator of head and tail embedding {self.params.comp_ht} not recognized.') 98 | 99 | return ht_embs 100 | 101 | def comp_hrt_emb(self, head_embs, tail_embs, rel_embs): 102 | rel_embs = rel_embs.repeat(1, self.params.num_gcn_layers) 103 | if self.params.comp_hrt == 'TransE': 104 | hrt_embs = head_embs + rel_embs - tail_embs 105 | elif self.params.comp_hrt == 'DistMult': 106 | hrt_embs = head_embs * rel_embs * tail_embs 107 | else: raise KeyError(f'composition operator of (h, r, t) embedding {self.params.comp_hrt} not recognized.') 108 | 109 | return hrt_embs 110 | 111 | def nei_rel_path(self, g, rel_labels, r_emb_out): 112 | """ Neighboring relational path module """ 113 | # Only consider in-degree relations first. 
114 | nei_rels = g.ndata['in_nei_rels'] 115 | head_ids = (g.ndata['id'] == 1).nonzero().squeeze(1) 116 | tail_ids = (g.ndata['id'] == 2).nonzero().squeeze(1) 117 | heads_rels = nei_rels[head_ids] 118 | tails_rels = nei_rels[tail_ids] 119 | 120 | # Extract neighboring relational paths 121 | batch_paths = [] 122 | for (head_rels, r_t, tail_rels) in zip(heads_rels, rel_labels, tails_rels): 123 | paths = [] 124 | for h_r in head_rels: 125 | for t_r in tail_rels: 126 | path = [h_r, r_t, t_r] 127 | paths.append(path) 128 | batch_paths.append(paths) # [B, n_paths, 3] , n_paths = n_head_rels * n_tail_rels 129 | 130 | batch_paths = torch.LongTensor(batch_paths).to(rel_labels.device)# [B, n_paths, 3], n_paths = n_head_rels * n_tail_rels 131 | batch_size = batch_paths.shape[0] 132 | batch_paths = batch_paths.view(batch_size * len(paths), -1) # [B * n_paths, 3] 133 | 134 | batch_paths_embs = F.embedding(batch_paths, r_emb_out, padding_idx=-1) # [B * n_paths, 3, inp_dim] 135 | 136 | # Input RNN 137 | _, last_state = self.rnn(batch_paths_embs) # last_state: [1, B * n_paths, inp_dim] 138 | last_state = last_state.squeeze(0) # squeeze the dim 0 139 | last_state = last_state.view(batch_size, len(paths), self.params.emb_dim) # [B, n_paths, inp_dim] 140 | # Aggregate paths by mean or by attention 141 | if self.params.path_agg == 'mean': 142 | output = torch.mean(last_state, 1) # [B, inp_dim] 143 | 144 | elif self.params.path_agg == 'att': 145 | r_label_embs = F.embedding(rel_labels, r_emb_out, padding_idx=-1).unsqueeze(2) # [B, inp_dim, 1] 146 | atts = torch.matmul(last_state, r_label_embs).squeeze(2) # [B, n_paths] 147 | atts = F.softmax(atts, dim=1).unsqueeze(1) # [B, 1, n_paths] 148 | output = torch.matmul(atts, last_state).squeeze(1) # [B, 1, n_paths] * [B, n_paths, inp_dim] -> [B, 1, inp_dim] -> [B, inp_dim] 149 | else: 150 | raise ValueError('unknown path_agg') 151 | 152 | return output # [B, inp_dim] 153 | 154 | def get_logits(self, s_G, s_g_pos, s_g_cor): 155 | ret = self.disc(s_G, 
s_g_pos, s_g_cor) 156 | return ret 157 | 158 | def forward(self, data, is_return_emb=False, cor_graph=False): 159 | # Initialize the embedding of entities 160 | g, rel_labels = data 161 | 162 | # Neighboring Relational Feature Module 163 | ## Initialize the embedding of nodes by neighbor relations 164 | if self.params.init_nei_rels == 'no': 165 | g.ndata['init'] = g.ndata['feat'].clone() 166 | else: 167 | self.init_ent_emb_matrix(g) 168 | 169 | # Corrupt the node feature 170 | if cor_graph: 171 | g.ndata['init'] = g.ndata['init'][torch.randperm(g.ndata['feat'].shape[0])] 172 | 173 | # r: Embedding of relation 174 | r = self.rel_emb.weight.clone() 175 | 176 | # Input graph into GNN to get embeddings. 177 | g.ndata['h'], r_emb_out = self.gnn(g, r) 178 | 179 | # GRU layer for nodes 180 | graph_sizes = g.batch_num_nodes() 181 | out_dim = self.params.num_gcn_layers * self.params.emb_dim 182 | g.ndata['repr'] = F.relu(self.batch_gru(g.ndata['repr'].view(-1, out_dim), graph_sizes)) 183 | node_hiddens = F.relu(self.W_o(g.ndata['repr'])) # num_nodes x hidden 184 | g.ndata['repr'] = self.dropout(node_hiddens) # num_nodes x hidden 185 | g_out = mean_nodes(g, 'repr').view(-1, out_dim) 186 | 187 | # Get embedding of target nodes (i.e. head and tail nodes) 188 | head_ids = (g.ndata['id'] == 1).nonzero().squeeze(1) 189 | head_embs = g.ndata['repr'][head_ids] 190 | tail_ids = (g.ndata['id'] == 2).nonzero().squeeze(1) 191 | tail_embs = g.ndata['repr'][tail_ids] 192 | 193 | if self.params.add_ht_emb: 194 | g_rep = torch.cat([g_out, 195 | head_embs.view(-1, out_dim), 196 | tail_embs.view(-1, out_dim), 197 | F.embedding(rel_labels, r_emb_out, padding_idx=-1)], dim=1) 198 | else: 199 | g_rep = torch.cat([g_out, self.rel_emb(rel_labels)], dim=1) 200 | 201 | # Represent subgraph by composing (h,r,t) in some way. 
(Not used in the paper) 202 | if self.params.comp_hrt: 203 | edge_embs = self.comp_hrt_emb(head_embs.view(-1, out_dim), tail_embs.view(-1, out_dim), F.embedding(rel_labels, r_emb_out, padding_idx=-1)) 204 | g_rep = torch.cat([g_out, edge_embs], dim=1) 205 | 206 | # Model neighboring relational paths 207 | if self.params.nei_rel_path: 208 | # Model neighboring relational path 209 | g_p = self.nei_rel_path(g, rel_labels, r_emb_out) 210 | g_rep = torch.cat([g_rep, g_p], dim=1) 211 | s_g = torch.cat([g_out, g_p], dim=1) 212 | else: 213 | s_g = g_out 214 | output = self.fc_layer(g_rep) 215 | 216 | self.r_emb_out = r_emb_out 217 | 218 | if not is_return_emb: 219 | return output 220 | else: 221 | # Get the subgraph-level embedding 222 | s_G = s_g.mean(0) 223 | return output, s_G, s_g 224 | 225 | 226 | 227 | -------------------------------------------------------------------------------- /model/dgl/layers.py: -------------------------------------------------------------------------------- 1 | """ 2 | File based off of dgl tutorial on RGCN 3 | Source: https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn 4 | """ 5 | import torch 6 | import torch.nn as nn 7 | import torch.nn.functional as F 8 | 9 | 10 | class Identity(nn.Module): 11 | """A placeholder identity operator that is argument-insensitive. 
12 | (Identity has already been supported by PyTorch 1.2, we will directly 13 | import torch.nn.Identity in the future) 14 | """ 15 | 16 | def __init__(self): 17 | super(Identity, self).__init__() 18 | 19 | def forward(self, x): 20 | """Return input""" 21 | return x 22 | 23 | 24 | class RGCNLayer(nn.Module): 25 | def __init__(self, inp_dim, out_dim, aggregator, bias=None, activation=None, dropout=0.0, edge_dropout=0.0, is_input_layer=False): 26 | super(RGCNLayer, self).__init__() 27 | self.bias = bias 28 | self.activation = activation 29 | 30 | if self.bias: 31 | self.bias = nn.Parameter(torch.Tensor(out_dim)) 32 | nn.init.xavier_uniform_(self.bias, 33 | gain=nn.init.calculate_gain('relu')) 34 | 35 | self.aggregator = aggregator 36 | 37 | if dropout: 38 | self.dropout = nn.Dropout(dropout) 39 | else: 40 | self.dropout = None 41 | 42 | if edge_dropout: 43 | self.edge_dropout = nn.Dropout(edge_dropout) 44 | else: 45 | self.edge_dropout = Identity() 46 | 47 | # define how propagation is done in subclass 48 | def propagate(self, g): 49 | raise NotImplementedError 50 | 51 | def forward(self, g, rel_emb, attn_rel_emb=None): 52 | raise NotImplementedError 53 | 54 | class RGCNBasisLayer(RGCNLayer): 55 | def __init__(self, inp_dim, out_dim, aggregator, attn_rel_emb_dim, num_rels, num_bases=-1, bias=None, 56 | activation=None, dropout=0.0, edge_dropout=0.0, is_input_layer=False, has_attn=False, is_comp=''): 57 | super( 58 | RGCNBasisLayer, 59 | self).__init__( 60 | inp_dim, 61 | out_dim, 62 | aggregator, 63 | bias, 64 | activation, 65 | dropout=dropout, 66 | edge_dropout=edge_dropout, 67 | is_input_layer=is_input_layer) 68 | self.inp_dim = inp_dim 69 | self.out_dim = out_dim 70 | self.attn_rel_emb_dim = attn_rel_emb_dim 71 | self.num_rels = num_rels 72 | self.num_bases = num_bases 73 | self.is_input_layer = is_input_layer 74 | self.has_attn = has_attn 75 | self.is_comp = is_comp 76 | 77 | if self.num_bases <= 0 or self.num_bases > self.num_rels: 78 | self.num_bases = 
self.num_rels 79 | 80 | # add basis weights 81 | # self.weight = basis_weights 82 | self.weight = nn.Parameter(torch.Tensor(self.num_bases, self.inp_dim, self.out_dim)) 83 | self.w_comp = nn.Parameter(torch.Tensor(self.num_rels, self.num_bases)) 84 | # Project relation embedding to current node input embedding 85 | self.w_rel = nn.Parameter(torch.Tensor(self.inp_dim, self.out_dim)) 86 | if self.has_attn: 87 | self.A = nn.Linear(2 * self.inp_dim + 2 * self.attn_rel_emb_dim, inp_dim) 88 | self.B = nn.Linear(inp_dim, 1) 89 | 90 | self.self_loop_weight = nn.Parameter(torch.Tensor(self.inp_dim, self.out_dim)) 91 | 92 | nn.init.xavier_uniform_(self.self_loop_weight, gain=nn.init.calculate_gain('relu')) 93 | nn.init.xavier_uniform_(self.weight, gain=nn.init.calculate_gain('relu')) 94 | nn.init.xavier_uniform_(self.w_comp, gain=nn.init.calculate_gain('relu')) 95 | nn.init.xavier_uniform_(self.w_rel, gain=nn.init.calculate_gain('relu')) 96 | 97 | def propagate(self, g, attn_rel_emb=None): 98 | 99 | # generate all weights from bases 100 | weight = self.weight.view(self.num_bases, 101 | self.inp_dim * self.out_dim) 102 | weight = torch.matmul(self.w_comp, weight).view( 103 | self.num_rels, self.inp_dim, self.out_dim) 104 | 105 | g.edata['w'] = self.edge_dropout(torch.ones(g.number_of_edges(), 1).to(weight.device)) 106 | 107 | # input_ = 'feat' if self.is_input_layer else 'h' 108 | input_ = 'init' if self.is_input_layer else 'h' 109 | 110 | def comp(h, edge_data): 111 | """ Refer to CompGCN """ 112 | if self.is_comp == 'mult': 113 | return h * edge_data 114 | elif self.is_comp == 'sub': 115 | return h - edge_data 116 | else: 117 | raise KeyError(f'composition operator {self.is_comp} not recognized.') 118 | 119 | def msg_func(edges): 120 | w = weight.index_select(0, edges.data['type']) 121 | 122 | # Similar to CompGCN to interact nodes and relations 123 | if self.is_comp: 124 | edge_data = comp(edges.src[input_], F.embedding(edges.data['type'], self.rel_emb, padding_idx=-1)) 125 | 
else: 126 | edge_data = edges.src[input_] 127 | 128 | msg = edges.data['w'] * torch.bmm(edge_data.unsqueeze(1), w).squeeze(1) 129 | 130 | curr_emb = torch.mm(edges.dst[input_], self.self_loop_weight) # (B, F) 131 | 132 | if self.has_attn: 133 | e = torch.cat([edges.src[input_], edges.dst[input_], attn_rel_emb(edges.data['type']), attn_rel_emb(edges.data['label'])], dim=1) 134 | a = torch.sigmoid(self.B(F.relu(self.A(e)))) 135 | else: 136 | a = torch.ones((len(edges), 1)).to(device=w.device) 137 | 138 | return {'curr_emb': curr_emb, 'msg': msg, 'alpha': a} 139 | 140 | g.update_all(msg_func, self.aggregator, None) 141 | 142 | def forward(self, g, rel_emb, attn_rel_emb=None): 143 | self.rel_emb = rel_emb 144 | self.propagate(g, attn_rel_emb) 145 | 146 | # apply bias and activation 147 | node_repr = g.ndata['h'] 148 | if self.bias: 149 | node_repr = node_repr + self.bias 150 | if self.activation: 151 | node_repr = self.activation(node_repr) 152 | if self.dropout: 153 | node_repr = self.dropout(node_repr) 154 | 155 | g.ndata['h'] = node_repr 156 | 157 | if self.is_input_layer: 158 | g.ndata['repr'] = g.ndata['h'].unsqueeze(1) 159 | else: 160 | g.ndata['repr'] = torch.cat([g.ndata['repr'], g.ndata['h'].unsqueeze(1)], dim=1) 161 | 162 | rel_emb_out = torch.matmul(self.rel_emb, self.w_rel) 163 | rel_emb_out[-1, :].zero_() # padding embedding as 0 164 | return rel_emb_out 165 | -------------------------------------------------------------------------------- /model/dgl/rgcn_model.py: -------------------------------------------------------------------------------- 1 | """ 2 | File based off of dgl tutorial on RGCN 3 | Source: https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn 4 | """ 5 | 6 | import torch 7 | import torch.nn as nn 8 | import torch.nn.functional as F 9 | from .layers import RGCNBasisLayer as RGCNLayer 10 | 11 | from .aggregators import SumAggregator, MLPAggregator, GRUAggregator 12 | 13 | 14 | class RGCN(nn.Module): 15 | def __init__(self, params): 
16 | super(RGCN, self).__init__() 17 | 18 | self.max_label_value = params.max_label_value 19 | self.inp_dim = params.inp_dim 20 | self.emb_dim = params.emb_dim 21 | self.attn_rel_emb_dim = params.attn_rel_emb_dim 22 | self.num_rels = params.num_rels 23 | self.aug_num_rels = params.aug_num_rels 24 | self.num_bases = params.num_bases 25 | self.num_hidden_layers = params.num_gcn_layers 26 | self.dropout = params.dropout 27 | self.edge_dropout = params.edge_dropout 28 | # self.aggregator_type = params.gnn_agg_type 29 | self.has_attn = params.has_attn 30 | 31 | self.is_comp = params.is_comp 32 | 33 | self.device = params.device 34 | 35 | if self.has_attn: 36 | self.attn_rel_emb = nn.Embedding(self.num_rels, self.attn_rel_emb_dim, sparse=False) 37 | else: 38 | self.attn_rel_emb = None 39 | 40 | # initialize aggregators for input and hidden layers 41 | if params.gnn_agg_type == "sum": 42 | self.aggregator = SumAggregator(self.emb_dim) 43 | elif params.gnn_agg_type == "mlp": 44 | self.aggregator = MLPAggregator(self.emb_dim) 45 | elif params.gnn_agg_type == "gru": 46 | self.aggregator = GRUAggregator(self.emb_dim) 47 | 48 | # initialize basis weights for input and hidden layers 49 | # self.input_basis_weights = nn.Parameter(torch.Tensor(self.num_bases, self.inp_dim, self.emb_dim)) 50 | # self.basis_weights = nn.Parameter(torch.Tensor(self.num_bases, self.emb_dim, self.emb_dim)) 51 | 52 | # create rgcn layers 53 | self.build_model() 54 | 55 | # create initial features 56 | self.features = self.create_features() 57 | 58 | def create_features(self): 59 | features = torch.arange(self.inp_dim).to(device=self.device) 60 | return features 61 | 62 | def build_model(self): 63 | self.layers = nn.ModuleList() 64 | # i2h 65 | i2h = self.build_input_layer() 66 | if i2h is not None: 67 | self.layers.append(i2h) 68 | # h2h 69 | for idx in range(self.num_hidden_layers - 1): 70 | h2h = self.build_hidden_layer(idx) 71 | self.layers.append(h2h) 72 | 73 | def build_input_layer(self): 74 | 
return RGCNLayer(self.inp_dim, 75 | self.emb_dim, 76 | # self.input_basis_weights, 77 | self.aggregator, 78 | self.attn_rel_emb_dim, 79 | self.aug_num_rels, 80 | self.num_bases, 81 | activation=F.relu, 82 | dropout=self.dropout, 83 | edge_dropout=self.edge_dropout, 84 | is_input_layer=True, 85 | has_attn=self.has_attn, 86 | is_comp=self.is_comp) 87 | 88 | def build_hidden_layer(self, idx): 89 | return RGCNLayer(self.emb_dim, 90 | self.emb_dim, 91 | # self.basis_weights, 92 | self.aggregator, 93 | self.attn_rel_emb_dim, 94 | self.aug_num_rels, 95 | self.num_bases, 96 | activation=F.relu, 97 | dropout=self.dropout, 98 | edge_dropout=self.edge_dropout, 99 | has_attn=self.has_attn, 100 | is_comp=self.is_comp) 101 | 102 | def forward(self, g, r): 103 | for layer in self.layers: 104 | r = layer(g, r, self.attn_rel_emb) 105 | return g.ndata.pop('h'), r 106 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | dgl==0.4.2 2 | lmdb==0.98 3 | networkx==2.4 4 | scikit-learn==0.22.1 5 | torch==1.4.0 6 | tqdm==4.43.0 -------------------------------------------------------------------------------- /snri.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/snri.png -------------------------------------------------------------------------------- /subgraph_extraction/__pycache__/datasets.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/subgraph_extraction/__pycache__/datasets.cpython-36.pyc -------------------------------------------------------------------------------- /subgraph_extraction/__pycache__/graph_sampler.cpython-36.pyc: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/subgraph_extraction/__pycache__/graph_sampler.cpython-36.pyc -------------------------------------------------------------------------------- /subgraph_extraction/datasets.py: -------------------------------------------------------------------------------- 1 | from torch.utils.data import Dataset 2 | import timeit 3 | import os 4 | import logging 5 | import lmdb 6 | import numpy as np 7 | import json 8 | import pickle 9 | import dgl 10 | from utils.graph_utils import ssp_multigraph_to_dgl, incidence_matrix 11 | from utils.data_utils import process_files, save_to_file, plot_rel_dist 12 | from .graph_sampler import * 13 | import pdb 14 | 15 | 16 | def generate_subgraph_datasets(params, splits=['train', 'valid'], saved_relation2id=None, max_label_value=None, is_ent2rels=None): 17 | 18 | testing = 'test' in splits 19 | adj_list, triplets, entity2id, relation2id, id2entity, id2relation, h2r, m_h2r, t2r, m_t2r = process_files(params.file_paths, saved_relation2id, sort_data=params.sort_data) 20 | 21 | # plot_rel_dist(adj_list, os.path.join(params.main_dir, f'data/{params.dataset}/rel_dist.png')) 22 | 23 | data_path = os.path.join(params.main_dir, f'data/{params.dataset}/relation2id.json') 24 | if not os.path.isdir(data_path) and not testing: 25 | with open(data_path, 'w') as f: 26 | json.dump(relation2id, f) 27 | 28 | graphs = {} 29 | 30 | for split_name in splits: 31 | graphs[split_name] = {'triplets': triplets[split_name], 'max_size': params.max_links} 32 | 33 | # Sample train and valid/test links 34 | for split_name, split in graphs.items(): 35 | logging.info(f"Sampling negative links for {split_name}") 36 | split['pos'], split['neg'] = sample_neg(adj_list, split['triplets'], params.num_neg_samples_per_link, max_size=split['max_size'], constrained_neg_prob=params.constrained_neg_prob) 37 | 38 | if 
testing: 39 | directory = os.path.join(params.main_dir, 'data/{}/'.format(params.dataset)) 40 | save_to_file(directory, f'neg_{params.test_file}_{params.constrained_neg_prob}.txt', graphs['test']['neg'], id2entity, id2relation) 41 | 42 | links2subgraphs(adj_list, graphs, params, max_label_value) 43 | 44 | 45 | def get_kge_embeddings(dataset, kge_model): 46 | 47 | path = './experiments/kge_baselines/{}_{}'.format(kge_model, dataset) 48 | node_features = np.load(os.path.join(path, 'entity_embedding.npy')) 49 | with open(os.path.join(path, 'id2entity.json')) as json_file: 50 | kge_id2entity = json.load(json_file) 51 | kge_entity2id = {v: int(k) for k, v in kge_id2entity.items()} 52 | 53 | return node_features, kge_entity2id 54 | 55 | 56 | class SubgraphDataset(Dataset): 57 | """Extracted, labeled, subgraph dataset -- DGL Only""" 58 | 59 | def __init__(self, db_path, db_name_pos, db_name_neg, raw_data_paths, included_relations=None, add_traspose_rels=False, num_neg_samples_per_link=1, use_kge_embeddings=False, dataset='', kge_model='', file_name='', is_ret_nodes_num=False): 60 | 61 | self.main_env = lmdb.open(db_path, readonly=True, max_dbs=3, lock=False) 62 | self.db_pos = self.main_env.open_db(db_name_pos.encode()) 63 | self.db_neg = self.main_env.open_db(db_name_neg.encode()) 64 | self.node_features, self.kge_entity2id = get_kge_embeddings(dataset, kge_model) if use_kge_embeddings else (None, None) 65 | self.num_neg_samples_per_link = num_neg_samples_per_link 66 | self.file_name = file_name 67 | self.is_ret_nodes_num = is_ret_nodes_num 68 | 69 | ssp_graph, __, __, __, id2entity, id2relation, h2r, m_h2r, t2r, m_t2r = process_files(raw_data_paths, included_relations, add_traspose_rels) 70 | self.num_rels = len(ssp_graph) 71 | 72 | # Add transpose matrices to handle both directions of relations. 
73 | if add_traspose_rels: 74 | ssp_graph_t = [adj.T for adj in ssp_graph] 75 | ssp_graph += ssp_graph_t 76 | 77 | # the effective number of relations after adding symmetric adjacency matrices and/or self connections 78 | self.aug_num_rels = len(ssp_graph) 79 | self.graph = ssp_multigraph_to_dgl(ssp_graph) 80 | self.ssp_graph = ssp_graph 81 | self.id2entity = id2entity 82 | self.id2relation = id2relation 83 | self.m_h2r = m_h2r 84 | self.m_t2r = m_t2r 85 | 86 | self.max_n_label = np.array([0, 0]) 87 | with self.main_env.begin() as txn: 88 | self.max_n_label[0] = int.from_bytes(txn.get('max_n_label_sub'.encode()), byteorder='little') 89 | self.max_n_label[1] = int.from_bytes(txn.get('max_n_label_obj'.encode()), byteorder='little') 90 | 91 | self.avg_subgraph_size = struct.unpack('f', txn.get('avg_subgraph_size'.encode())) 92 | self.min_subgraph_size = struct.unpack('f', txn.get('min_subgraph_size'.encode())) 93 | self.max_subgraph_size = struct.unpack('f', txn.get('max_subgraph_size'.encode())) 94 | self.std_subgraph_size = struct.unpack('f', txn.get('std_subgraph_size'.encode())) 95 | 96 | self.avg_enc_ratio = struct.unpack('f', txn.get('avg_enc_ratio'.encode())) 97 | self.min_enc_ratio = struct.unpack('f', txn.get('min_enc_ratio'.encode())) 98 | self.max_enc_ratio = struct.unpack('f', txn.get('max_enc_ratio'.encode())) 99 | self.std_enc_ratio = struct.unpack('f', txn.get('std_enc_ratio'.encode())) 100 | 101 | self.avg_num_pruned_nodes = struct.unpack('f', txn.get('avg_num_pruned_nodes'.encode())) 102 | self.min_num_pruned_nodes = struct.unpack('f', txn.get('min_num_pruned_nodes'.encode())) 103 | self.max_num_pruned_nodes = struct.unpack('f', txn.get('max_num_pruned_nodes'.encode())) 104 | self.std_num_pruned_nodes = struct.unpack('f', txn.get('std_num_pruned_nodes'.encode())) 105 | 106 | logging.info(f"Max distance from sub : {self.max_n_label[0]}, Max distance from obj : {self.max_n_label[1]}") 107 | 108 | # logging.info('=====================') 109 | # 
logging.info(f"Subgraph size stats: \n Avg size {self.avg_subgraph_size}, \n Min size {self.min_subgraph_size}, \n Max size {self.max_subgraph_size}, \n Std {self.std_subgraph_size}") 110 | 111 | # logging.info('=====================') 112 | # logging.info(f"Enclosed nodes ratio stats: \n Avg size {self.avg_enc_ratio}, \n Min size {self.min_enc_ratio}, \n Max size {self.max_enc_ratio}, \n Std {self.std_enc_ratio}") 113 | 114 | # logging.info('=====================') 115 | # logging.info(f"# of pruned nodes stats: \n Avg size {self.avg_num_pruned_nodes}, \n Min size {self.min_num_pruned_nodes}, \n Max size {self.max_num_pruned_nodes}, \n Std {self.std_num_pruned_nodes}") 116 | 117 | with self.main_env.begin(db=self.db_pos) as txn: 118 | self.num_graphs_pos = int.from_bytes(txn.get('num_graphs'.encode()), byteorder='little') 119 | with self.main_env.begin(db=self.db_neg) as txn: 120 | self.num_graphs_neg = int.from_bytes(txn.get('num_graphs'.encode()), byteorder='little') 121 | 122 | self.__getitem__(0) 123 | 124 | def __getitem__(self, index): 125 | with self.main_env.begin(db=self.db_pos) as txn: 126 | str_id = '{:08}'.format(index).encode('ascii') 127 | nodes_pos, r_label_pos, g_label_pos, n_labels_pos = deserialize(txn.get(str_id)).values() 128 | subgraph_pos = self._prepare_subgraphs(nodes_pos, r_label_pos, n_labels_pos) 129 | 130 | # Get the neighbor relations of target head and tails 131 | # nei_rels_pos = [self.ent2rels[nodes_pos[0]], self.ent2rels[nodes_pos[1]]] 132 | nei_rels_pos = [[0, 1], [0, 1]] 133 | 134 | subgraphs_neg = [] 135 | r_labels_neg = [] 136 | g_labels_neg = [] 137 | nei_rels_negs = [] 138 | 139 | with self.main_env.begin(db=self.db_neg) as txn: 140 | for i in range(self.num_neg_samples_per_link): 141 | str_id = '{:08}'.format(index + i * (self.num_graphs_pos)).encode('ascii') 142 | nodes_neg, r_label_neg, g_label_neg, n_labels_neg = deserialize(txn.get(str_id)).values() 143 | subgraphs_neg.append(self._prepare_subgraphs(nodes_neg, 
r_label_neg, n_labels_neg)) 144 | # Get the neighbor relations of target head and tails 145 | # nei_rels_neg = [self.ent2rels[nodes_neg[0]], self.ent2rels[nodes_neg[1]]] 146 | nei_rels_neg = [[0, 1], [0, 1]] 147 | nei_rels_negs.append(nei_rels_neg) 148 | r_labels_neg.append(r_label_neg) 149 | g_labels_neg.append(g_label_neg) 150 | 151 | # print("Nodes of subgraph: ", len(subgraph_pos.nodes())) 152 | return subgraph_pos, g_label_pos, r_label_pos, subgraphs_neg, g_labels_neg, r_labels_neg, 153 | 154 | def __len__(self): 155 | return self.num_graphs_pos 156 | 157 | def _prepare_subgraphs(self, nodes, r_label, n_labels): 158 | subgraph: dgl.DGLGraph = self.graph.subgraph(nodes) 159 | subgraph.edata['type'] = self.graph.edata['type'][subgraph.edata[dgl.EID]] 160 | subgraph.edata['label'] = torch.tensor(r_label * np.ones(subgraph.edata['type'].shape), dtype=torch.long) 161 | 162 | # Check if the target relation is in the subgraph 163 | has_rel = subgraph.has_edges_between(0, 1) 164 | if has_rel: 165 | edges_btw_roots = subgraph.edge_ids(0, 1) 166 | # rel_link = np.nonzero(subgraph.edata['type'][edges_btw_roots] == r_label) 167 | rel_link = subgraph.edata['type'][edges_btw_roots].item() == r_label # If the target relation is not in the subgraph, add a self-loop to the subgraph 168 | # if rel_link.squeeze().nelement() == 0: 169 | if not has_rel or not rel_link : # If there is no relation between the roots, or the target relation is not in the subgraph (these two cases may only occur for neg sample, because for neg sample, the target relation may not really exist, so we have to add this edge manually) 170 | # subgraph.add_edge(0, 1) 171 | subgraph.add_edges([0], [1]) 172 | subgraph.edata['type'][-1] = torch.tensor(r_label).type(torch.LongTensor) 173 | subgraph.edata['label'][-1] = torch.tensor(r_label).type(torch.LongTensor) 174 | 175 | # map the id read by GraIL to the entity IDs as registered by the KGE embeddings 176 | kge_nodes = [self.kge_entity2id[self.id2entity[n]] 
for n in nodes] if self.kge_entity2id else None 177 | n_feats = self.node_features[kge_nodes] if self.node_features is not None else None 178 | subgraph = self._prepare_features_new(subgraph, n_labels, r_label, n_feats) 179 | 180 | # Add the original node id feature 181 | subgraph.ndata['parent_id'] = self.graph.subgraph(nodes).ndata[dgl.NID] 182 | # Add the neighbor relations 183 | subgraph.ndata['out_nei_rels'] = torch.LongTensor(self.m_h2r[subgraph.ndata['parent_id']]) 184 | subgraph.ndata['in_nei_rels'] = torch.LongTensor(self.m_t2r[subgraph.ndata['parent_id']]) 185 | 186 | return subgraph 187 | 188 | def _prepare_features(self, subgraph, n_labels, n_feats=None): 189 | # One-hot encode both node label components and concatenate them with n_feats 190 | n_nodes = subgraph.number_of_nodes() 191 | label_feats = np.zeros((n_nodes, self.max_n_label[0] + 1 + self.max_n_label[1] + 1)) 192 | label_feats[np.arange(n_nodes), n_labels[:, 0]] = 1 193 | label_feats[np.arange(n_nodes), self.max_n_label[0] + 1 + n_labels[:, 1]] = 1 194 | n_feats = np.concatenate((label_feats, n_feats), axis=1) if n_feats is not None else label_feats 195 | subgraph.ndata['feat'] = torch.FloatTensor(n_feats) 196 | self.n_feat_dim = n_feats.shape[1] # Find cleaner way to do this -- i.e. 
set the n_feat_dim 197 | return subgraph 198 | 199 | def _prepare_features_new(self, subgraph, n_labels, r_label, n_feats=None): 200 | # One-hot encode the node label feature and concatenate it with n_feats 201 | n_nodes = subgraph.number_of_nodes() 202 | label_feats = np.zeros((n_nodes, self.max_n_label[0] + 1 + self.max_n_label[1] + 1)) 203 | label_feats[np.arange(n_nodes), n_labels[:, 0]] = 1 204 | label_feats[np.arange(n_nodes), self.max_n_label[0] + 1 + n_labels[:, 1]] = 1 205 | # label_feats = np.zeros((n_nodes, self.max_n_label[0] + 1 + self.max_n_label[1] + 1)) 206 | # label_feats[np.arange(n_nodes), 0] = 1 207 | # label_feats[np.arange(n_nodes), self.max_n_label[0] + 1] = 1 208 | n_feats = np.concatenate((label_feats, n_feats), axis=1) if n_feats is not None else label_feats 209 | subgraph.ndata['feat'] = torch.FloatTensor(n_feats) 210 | 211 | head_id = np.argwhere([label[0] == 0 and label[1] == 1 for label in n_labels]) 212 | tail_id = np.argwhere([label[0] == 1 and label[1] == 0 for label in n_labels]) 213 | n_ids = np.zeros(n_nodes) 214 | n_ids[head_id] = 1 # head 215 | n_ids[tail_id] = 2 # tail 216 | 217 | # 'id' is used to represent the target head and target tail nodes 218 | subgraph.ndata['id'] = torch.FloatTensor(n_ids) 219 | 220 | # 'r_label' is used to represent the relation label of this subgraph 221 | subgraph.ndata['r_label'] = torch.LongTensor(np.ones(n_nodes) * r_label) 222 | self.n_feat_dim = n_feats.shape[1] # Find cleaner way to do this -- i.e. 
set the n_feat_dim 223 | 224 | 225 | return subgraph 226 | -------------------------------------------------------------------------------- /subgraph_extraction/graph_sampler.py: -------------------------------------------------------------------------------- 1 | import os 2 | import math 3 | import struct 4 | import logging 5 | import random 6 | import pickle as pkl 7 | import pdb 8 | from tqdm import tqdm 9 | import lmdb 10 | import multiprocessing as mp 11 | import numpy as np 12 | import scipy.io as sio 13 | import scipy.sparse as ssp 14 | import sys 15 | import torch 16 | from scipy.special import softmax 17 | from utils.dgl_utils import _bfs_relational 18 | from utils.graph_utils import incidence_matrix, remove_nodes, ssp_to_torch, serialize, deserialize, get_edge_count, diameter, radius 19 | import networkx as nx 20 | 21 | 22 | def sample_neg(adj_list, edges, num_neg_samples_per_link=1, max_size=1000000, constrained_neg_prob=0): 23 | pos_edges = edges 24 | neg_edges = [] 25 | 26 | # if max_size is set, randomly sample train links 27 | if max_size < len(pos_edges): 28 | perm = np.random.permutation(len(pos_edges))[:max_size] 29 | pos_edges = pos_edges[perm] 30 | 31 | # sample negative links for train/test 32 | n, r = adj_list[0].shape[0], len(adj_list) 33 | 34 | # distribution of edges across relations 35 | theta = 0.001 36 | edge_count = get_edge_count(adj_list) 37 | rel_dist = np.zeros(edge_count.shape) 38 | idx = np.nonzero(edge_count) 39 | rel_dist[idx] = softmax(theta * edge_count[idx]) 40 | 41 | # possible head and tails for each relation 42 | valid_heads = [adj.tocoo().row.tolist() for adj in adj_list] 43 | valid_tails = [adj.tocoo().col.tolist() for adj in adj_list] 44 | 45 | pbar = tqdm(total=len(pos_edges)) 46 | while len(neg_edges) < num_neg_samples_per_link * len(pos_edges): 47 | neg_head, neg_tail, rel = pos_edges[pbar.n % len(pos_edges)][0], pos_edges[pbar.n % len(pos_edges)][1], pos_edges[pbar.n % len(pos_edges)][2] 48 | if np.random.uniform() 
< constrained_neg_prob: 49 | if np.random.uniform() < 0.5: 50 | neg_head = np.random.choice(valid_heads[rel]) 51 | else: 52 | neg_tail = np.random.choice(valid_tails[rel]) 53 | else: 54 | if np.random.uniform() < 0.5: 55 | neg_head = np.random.choice(n) 56 | else: 57 | neg_tail = np.random.choice(n) 58 | 59 | if neg_head != neg_tail and adj_list[rel][neg_head, neg_tail] == 0: 60 | neg_edges.append([neg_head, neg_tail, rel]) 61 | pbar.update(1) 62 | 63 | pbar.close() 64 | 65 | neg_edges = np.array(neg_edges) 66 | return pos_edges, neg_edges 67 | 68 | 69 | def links2subgraphs(A, graphs, params, max_label_value=None): 70 | ''' 71 | extract enclosing subgraphs, write map mode + named dbs 72 | ''' 73 | max_n_label = {'value': np.array([0, 0])} 74 | subgraph_sizes = [] 75 | enc_ratios = [] 76 | num_pruned_nodes = [] 77 | 78 | BYTES_PER_DATUM = get_average_subgraph_size(100, list(graphs.values())[0]['pos'], A, params) * 1.5 79 | links_length = 0 80 | for split_name, split in graphs.items(): 81 | links_length += (len(split['pos']) + len(split['neg'])) * 2 82 | map_size = links_length * BYTES_PER_DATUM 83 | 84 | env = lmdb.open(params.db_path, map_size=map_size, max_dbs=6) 85 | 86 | def extraction_helper(A, links, g_labels, split_env): 87 | 88 | with env.begin(write=True, db=split_env) as txn: 89 | txn.put('num_graphs'.encode(), (len(links)).to_bytes(int.bit_length(len(links)), byteorder='little')) 90 | 91 | with mp.Pool(processes=None, initializer=intialize_worker, initargs=(A, params, max_label_value)) as p: 92 | args_ = zip(range(len(links)), links, g_labels) 93 | for (str_id, datum) in tqdm(p.imap(extract_save_subgraph, args_), total=len(links)): 94 | max_n_label['value'] = np.maximum(np.max(datum['n_labels'], axis=0), max_n_label['value']) 95 | subgraph_sizes.append(datum['subgraph_size']) 96 | enc_ratios.append(datum['enc_ratio']) 97 | num_pruned_nodes.append(datum['num_pruned_nodes']) 98 | 99 | with env.begin(write=True, db=split_env) as txn: 100 | txn.put(str_id, 
serialize(datum)) 101 | 102 | for split_name, split in graphs.items(): 103 | logging.info(f"Extracting enclosing subgraphs for positive links in {split_name} set") 104 | labels = np.ones(len(split['pos'])) 105 | db_name_pos = split_name + '_pos' 106 | split_env = env.open_db(db_name_pos.encode()) 107 | extraction_helper(A, split['pos'], labels, split_env) 108 | 109 | logging.info(f"Extracting enclosing subgraphs for negative links in {split_name} set") 110 | labels = np.zeros(len(split['neg'])) 111 | db_name_neg = split_name + '_neg' 112 | split_env = env.open_db(db_name_neg.encode()) 113 | extraction_helper(A, split['neg'], labels, split_env) 114 | 115 | max_n_label['value'] = max_label_value if max_label_value is not None else max_n_label['value'] 116 | 117 | with env.begin(write=True) as txn: 118 | bit_len_label_sub = int.bit_length(int(max_n_label['value'][0])) 119 | bit_len_label_obj = int.bit_length(int(max_n_label['value'][1])) 120 | txn.put('max_n_label_sub'.encode(), (int(max_n_label['value'][0])).to_bytes(bit_len_label_sub, byteorder='little')) 121 | txn.put('max_n_label_obj'.encode(), (int(max_n_label['value'][1])).to_bytes(bit_len_label_obj, byteorder='little')) 122 | 123 | txn.put('avg_subgraph_size'.encode(), struct.pack('f', float(np.mean(subgraph_sizes)))) 124 | txn.put('min_subgraph_size'.encode(), struct.pack('f', float(np.min(subgraph_sizes)))) 125 | txn.put('max_subgraph_size'.encode(), struct.pack('f', float(np.max(subgraph_sizes)))) 126 | txn.put('std_subgraph_size'.encode(), struct.pack('f', float(np.std(subgraph_sizes)))) 127 | 128 | txn.put('avg_enc_ratio'.encode(), struct.pack('f', float(np.mean(enc_ratios)))) 129 | txn.put('min_enc_ratio'.encode(), struct.pack('f', float(np.min(enc_ratios)))) 130 | txn.put('max_enc_ratio'.encode(), struct.pack('f', float(np.max(enc_ratios)))) 131 | txn.put('std_enc_ratio'.encode(), struct.pack('f', float(np.std(enc_ratios)))) 132 | 133 | txn.put('avg_num_pruned_nodes'.encode(), struct.pack('f', 
float(np.mean(num_pruned_nodes)))) 134 | txn.put('min_num_pruned_nodes'.encode(), struct.pack('f', float(np.min(num_pruned_nodes)))) 135 | txn.put('max_num_pruned_nodes'.encode(), struct.pack('f', float(np.max(num_pruned_nodes)))) 136 | txn.put('std_num_pruned_nodes'.encode(), struct.pack('f', float(np.std(num_pruned_nodes)))) 137 | 138 | 139 | def get_average_subgraph_size(sample_size, links, A, params): 140 | total_size = 0 141 | for (n1, n2, r_label) in links[np.random.choice(len(links), sample_size)]: 142 | nodes, n_labels, subgraph_size, enc_ratio, num_pruned_nodes = subgraph_extraction_labeling((n1, n2), r_label, A, params.hop, params.enclosing_sub_graph, params.max_nodes_per_hop) 143 | datum = {'nodes': nodes, 'r_label': r_label, 'g_label': 0, 'n_labels': n_labels, 'subgraph_size': subgraph_size, 'enc_ratio': enc_ratio, 'num_pruned_nodes': num_pruned_nodes} 144 | total_size += len(serialize(datum)) 145 | return total_size / sample_size 146 | 147 | 148 | def intialize_worker(A, params, max_label_value): 149 | global A_, params_, max_label_value_ 150 | A_, params_, max_label_value_ = A, params, max_label_value 151 | 152 | 153 | def extract_save_subgraph(args_): 154 | idx, (n1, n2, r_label), g_label = args_ 155 | nodes, n_labels, subgraph_size, enc_ratio, num_pruned_nodes = subgraph_extraction_labeling((n1, n2), r_label, A_, params_.hop, params_.enclosing_sub_graph, params_.max_nodes_per_hop) 156 | 157 | # max_label_value_ is to set the maximum possible value of node label while doing double-radius labelling. 
158 | if max_label_value_ is not None: 159 | n_labels = np.array([np.minimum(label, max_label_value_).tolist() for label in n_labels]) 160 | 161 | datum = {'nodes': nodes, 'r_label': r_label, 'g_label': g_label, 'n_labels': n_labels, 'subgraph_size': subgraph_size, 'enc_ratio': enc_ratio, 'num_pruned_nodes': num_pruned_nodes} 162 | str_id = '{:08}'.format(idx).encode('ascii') 163 | 164 | return (str_id, datum) 165 | 166 | 167 | def get_neighbor_nodes(roots, adj, h=1, max_nodes_per_hop=None): 168 | bfs_generator = _bfs_relational(adj, roots, max_nodes_per_hop) 169 | lvls = list() 170 | for _ in range(h): 171 | try: 172 | lvls.append(next(bfs_generator)) 173 | except StopIteration: 174 | pass 175 | return set().union(*lvls) 176 | 177 | 178 | def subgraph_extraction_labeling(ind, rel, A_list, h=1, enclosing_sub_graph=False, max_nodes_per_hop=None, max_node_label_value=None): 179 | # extract the h-hop enclosing subgraphs around link 'ind' 180 | A_incidence = incidence_matrix(A_list) 181 | A_incidence += A_incidence.T 182 | 183 | root1_nei = get_neighbor_nodes(set([ind[0]]), A_incidence, h, max_nodes_per_hop) 184 | root2_nei = get_neighbor_nodes(set([ind[1]]), A_incidence, h, max_nodes_per_hop) 185 | 186 | subgraph_nei_nodes_int = root1_nei.intersection(root2_nei) 187 | subgraph_nei_nodes_un = root1_nei.union(root2_nei) 188 | 189 | # Extract subgraph | Roots being in the front is essential for labelling and the model to work properly. 
190 | if enclosing_sub_graph: 191 | subgraph_nodes = list(ind) + list(subgraph_nei_nodes_int) 192 | else: 193 | subgraph_nodes = list(ind) + list(subgraph_nei_nodes_un) 194 | 195 | subgraph = [adj[subgraph_nodes, :][:, subgraph_nodes] for adj in A_list] 196 | 197 | labels, enclosing_subgraph_nodes = node_label(incidence_matrix(subgraph), max_distance=h) 198 | 199 | pruned_subgraph_nodes = np.array(subgraph_nodes)[enclosing_subgraph_nodes].tolist() 200 | pruned_labels = labels[enclosing_subgraph_nodes] 201 | # pruned_subgraph_nodes = subgraph_nodes 202 | # pruned_labels = labels 203 | 204 | if max_node_label_value is not None: 205 | pruned_labels = np.array([np.minimum(label, max_node_label_value).tolist() for label in pruned_labels]) 206 | 207 | subgraph_size = len(pruned_subgraph_nodes) 208 | enc_ratio = len(subgraph_nei_nodes_int) / (len(subgraph_nei_nodes_un) + 1e-3) 209 | num_pruned_nodes = len(subgraph_nodes) - len(pruned_subgraph_nodes) 210 | 211 | return pruned_subgraph_nodes, pruned_labels, subgraph_size, enc_ratio, num_pruned_nodes 212 | 213 | 214 | def node_label(subgraph, max_distance=1): 215 | # implementation of the node labeling scheme described in the paper 216 | roots = [0, 1] 217 | sgs_single_root = [remove_nodes(subgraph, [root]) for root in roots] 218 | dist_to_roots = [np.clip(ssp.csgraph.dijkstra(sg, indices=[0], directed=False, unweighted=True, limit=1e6)[:, 1:], 0, 1e7) for r, sg in enumerate(sgs_single_root)] 219 | dist_to_roots = np.array(list(zip(dist_to_roots[0][0], dist_to_roots[1][0])), dtype=int) 220 | 221 | target_node_labels = np.array([[0, 1], [1, 0]]) 222 | labels = np.concatenate((target_node_labels, dist_to_roots)) if dist_to_roots.size else target_node_labels 223 | 224 | enclosing_subgraph_nodes = np.where(np.max(labels, axis=1) <= max_distance)[0] 225 | return labels, enclosing_subgraph_nodes 226 | -------------------------------------------------------------------------------- /test_auc.py: 
-------------------------------------------------------------------------------- 1 | # from comet_ml import Experiment 2 | 3 | import pdb 4 | import os 5 | os.environ['OPENBLAS_NUM_THREADS'] = '1' 6 | import argparse 7 | import logging 8 | import torch 9 | from scipy.sparse import SparseEfficiencyWarning 10 | import numpy as np 11 | 12 | from subgraph_extraction.datasets import SubgraphDataset, generate_subgraph_datasets 13 | from utils.initialization_utils import initialize_experiment, initialize_model 14 | from utils.graph_utils import collate_dgl, move_batch_to_device_dgl, collate_dgl_train, move_batch_to_device_dgl_train 15 | from managers.evaluator import Evaluator 16 | from utils.data_utils import process_files 17 | 18 | from warnings import simplefilter 19 | 20 | 21 | def main(params): 22 | simplefilter(action='ignore', category=UserWarning) 23 | simplefilter(action='ignore', category=SparseEfficiencyWarning) 24 | 25 | graph_classifier = initialize_model(params, None, load_model=True) 26 | adj_list, triplets, entity2id, relation2id, id2entity, id2relation, _,_,_,_ = process_files(params.file_paths, graph_classifier.relation2id) 27 | # ent2rels = {k: torch.LongTensor(v).to(device=params.device) for k, v in ent2rels.items()} 28 | # graph_classifier.ent2rels = ent2rels 29 | 30 | logging.info(f"Device: {params.device}") 31 | 32 | all_auc = [] 33 | auc_mean = 0 34 | 35 | all_auc_pr = [] 36 | auc_pr_mean = 0 37 | for r in range(1, params.runs + 1): 38 | 39 | params.db_path = os.path.join(params.main_dir, f'data/{params.dataset}/test_subgraphs_{params.experiment_name}_{params.constrained_neg_prob}_en_{params.enclosing_sub_graph}') 40 | 41 | generate_subgraph_datasets(params, splits=['test'], 42 | saved_relation2id=graph_classifier.relation2id, 43 | max_label_value=graph_classifier.gnn.max_label_value) 44 | 45 | test = SubgraphDataset(params.db_path, 'test_pos', 'test_neg', params.file_paths, graph_classifier.relation2id, 46 | 
add_traspose_rels=params.add_traspose_rels, 47 | num_neg_samples_per_link=params.num_neg_samples_per_link, 48 | use_kge_embeddings=params.use_kge_embeddings, dataset=params.dataset, 49 | kge_model=params.kge_model, file_name=params.test_file) 50 | 51 | test_evaluator = Evaluator(params, graph_classifier, test) 52 | 53 | result = test_evaluator.eval(save=True) 54 | logging.info('\nTest Set Performance:' + str(result)) 55 | all_auc.append(result['auc']) 56 | auc_mean = auc_mean + (result['auc'] - auc_mean) / r 57 | 58 | all_auc_pr.append(result['auc_pr']) 59 | auc_pr_mean = auc_pr_mean + (result['auc_pr'] - auc_pr_mean) / r 60 | 61 | auc_std = np.std(all_auc) 62 | auc_pr_std = np.std(all_auc_pr) 63 | 64 | logging.info('\nAvg test Set Performance -- mean auc :' + str(np.mean(all_auc)) + ' std auc: ' + str(np.std(all_auc))) 65 | logging.info('\nAvg test Set Performance -- mean auc_pr :' + str(np.mean(all_auc_pr)) + ' std auc_pr: ' + str(np.std(all_auc_pr))) 66 | 67 | 68 | if __name__ == '__main__': 69 | 70 | logging.basicConfig(level=logging.INFO) 71 | 72 | parser = argparse.ArgumentParser(description='TransE model') 73 | 74 | # Experiment setup params 75 | parser.add_argument("--experiment_name", "-e", type=str, default="default", 76 | help="A folder with this name would be created to dump saved models and log files") 77 | parser.add_argument("--dataset", "-d", type=str, default="Toy", 78 | help="Dataset string") 79 | parser.add_argument("--train_file", "-tf", type=str, default="train", 80 | help="Name of file containing training triplets") 81 | parser.add_argument("--test_file", "-t", type=str, default="test", 82 | help="Name of file containing test triplets") 83 | parser.add_argument("--runs", type=int, default=1, 84 | help="How many runs to perform for mean and std?") 85 | parser.add_argument("--gpu", type=int, default=0, 86 | help="Which GPU to use?") 87 | parser.add_argument('--disable_cuda', action='store_true', # default value is False 88 | help='Disable CUDA') 
89 | 90 | # Data processing pipeline params 91 | parser.add_argument("--max_links", type=int, default=100000, 92 | help="Set maximum number of links (to fit into memory)") 93 | parser.add_argument("--hop", type=int, default=3, 94 | help="Enclosing subgraph hop number") 95 | parser.add_argument("--max_nodes_per_hop", "-max_h", type=int, default=None, 96 | help="if > 0, upper bound the # nodes per hop by subsampling") 97 | parser.add_argument("--use_kge_embeddings", "-kge", type=bool, default=False, 98 | help='whether to use pretrained KGE embeddings') 99 | parser.add_argument("--kge_model", type=str, default="TransE", 100 | help="Which KGE model to load entity embeddings from") 101 | parser.add_argument('--model_type', '-m', type=str, choices=['dgl'], default='dgl', 102 | help='what format to store subgraphs in for model') 103 | parser.add_argument('--constrained_neg_prob', '-cn', type=float, default=0, 104 | help='with what probability to sample constrained heads/tails while neg sampling') 105 | parser.add_argument("--num_neg_samples_per_link", '-neg', type=int, default=1, 106 | help="Number of negative examples to sample per positive link") 107 | parser.add_argument("--batch_size", type=int, default=16, 108 | help="Batch size") 109 | parser.add_argument("--num_workers", type=int, default=8, 110 | help="Number of dataloading processes") 111 | parser.add_argument('--add_traspose_rels', '-tr', type=bool, default=False, 112 | help='whether to append adj matrix list with symmetric relations') 113 | parser.add_argument('--enclosing_sub_graph', '-en', type=bool, default=True, 114 | help='whether to only consider enclosing subgraph') 115 | # parser.add_argument('--comp_hrt', type=str, default='TransE') 116 | parser.add_argument('--sort_data', type=bool, default=False) 117 | 118 | params = parser.parse_args() 119 | initialize_experiment(params, __file__) 120 | 121 | params.file_paths = { 122 | 'train': os.path.join(params.main_dir, 'data/{}/{}.txt'.format(params.dataset, 
params.train_file)), 123 | 'test': os.path.join(params.main_dir, 'data/{}/{}.txt'.format(params.dataset, params.test_file)) 124 | } 125 | 126 | if not params.disable_cuda and torch.cuda.is_available(): 127 | params.device = torch.device('cuda:%d' % params.gpu) 128 | else: 129 | params.device = torch.device('cpu') 130 | 131 | params.collate_fn = collate_dgl_train 132 | params.move_batch_to_device = move_batch_to_device_dgl_train 133 | 134 | main(params) 135 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import os 2 | os.environ['OPENBLAS_NUM_THREADS'] = '1' 3 | import argparse 4 | import logging 5 | import torch 6 | from scipy.sparse import SparseEfficiencyWarning 7 | 8 | from subgraph_extraction.datasets import SubgraphDataset, generate_subgraph_datasets 9 | from utils.initialization_utils import initialize_experiment, initialize_model 10 | from utils.graph_utils import collate_dgl, collate_dgl_train, move_batch_to_device_dgl, move_batch_to_device_dgl_train 11 | 12 | from model.dgl.graph_classifier import GraphClassifier as dgl_model 13 | 14 | from managers.evaluator import Evaluator 15 | from managers.trainer import Trainer 16 | 17 | from warnings import simplefilter 18 | 19 | 20 | def main(params): 21 | simplefilter(action='ignore', category=UserWarning) 22 | simplefilter(action='ignore', category=SparseEfficiencyWarning) 23 | 24 | params.db_path = os.path.join(params.main_dir, f'./data/{params.dataset}/subgraphs_en_{params.enclosing_sub_graph}_neg_{params.num_neg_samples_per_link}_hop_{params.hop}') 25 | 26 | if not os.path.isdir(params.db_path): 27 | generate_subgraph_datasets(params) 28 | 29 | train = SubgraphDataset(params.db_path, 'train_pos', 'train_neg', params.file_paths, 30 | add_traspose_rels=params.add_traspose_rels, 31 | num_neg_samples_per_link=params.num_neg_samples_per_link, 32 | 
use_kge_embeddings=params.use_kge_embeddings, dataset=params.dataset, 33 | kge_model=params.kge_model, file_name=params.train_file) 34 | valid = SubgraphDataset(params.db_path, 'valid_pos', 'valid_neg', params.file_paths, 35 | add_traspose_rels=params.add_traspose_rels, 36 | num_neg_samples_per_link=params.num_neg_samples_per_link, 37 | use_kge_embeddings=params.use_kge_embeddings, dataset=params.dataset, 38 | kge_model=params.kge_model, file_name=params.valid_file) 39 | 40 | # Avoid that m_h2r and m_t2r of train and valid are different 41 | valid.m_h2r = train.m_h2r 42 | valid.m_t2r = train.m_t2r 43 | 44 | params.num_rels = train.num_rels 45 | params.aug_num_rels = train.aug_num_rels 46 | 47 | # Set the embedding dimension of relation and node 48 | if params.init_nei_rels == 'no': 49 | params.inp_dim = train.n_feat_dim 50 | else: 51 | params.inp_dim = train.n_feat_dim + params.sem_dim 52 | 53 | # Log the max label value to save it in the model. This will be used to cap the labels generated on test set. 
54 | params.max_label_value = train.max_n_label 55 | 56 | graph_classifier = initialize_model(params, dgl_model, params.load_model) 57 | 58 | logging.info(f"Device: {params.device}") 59 | logging.info(f"Input dim : {params.inp_dim}, # Relations : {params.num_rels}, # Augmented relations : {params.aug_num_rels}") 60 | 61 | valid_evaluator = Evaluator(params, graph_classifier, valid) 62 | 63 | trainer = Trainer(params, graph_classifier, train, valid_evaluator) 64 | 65 | logging.info('Starting training with full batch...') 66 | 67 | trainer.train() 68 | 69 | 70 | if __name__ == '__main__': 71 | 72 | logging.basicConfig(level=logging.INFO) 73 | 74 | parser = argparse.ArgumentParser(description='TransE model') 75 | 76 | # Experiment setup params 77 | parser.add_argument("--experiment_name", "-e", type=str, default="default", 78 | help="A folder with this name would be created to dump saved models and log files") 79 | parser.add_argument("--dataset", "-d", type=str, 80 | help="Dataset string") 81 | parser.add_argument("--gpu", type=int, default=0, 82 | help="Which GPU to use?") 83 | parser.add_argument('--disable_cuda', action='store_true', 84 | help='Disable CUDA') 85 | parser.add_argument('--load_model', action='store_true', 86 | help='Load existing model?') 87 | parser.add_argument("--train_file", "-tf", type=str, default="train", 88 | help="Name of file containing training triplets") 89 | parser.add_argument("--valid_file", "-vf", type=str, default="valid", 90 | help="Name of file containing validation triplets") 91 | 92 | # Training regime params 93 | parser.add_argument("--num_epochs", "-ne", type=int, default=30, 94 | help="Number of training epochs") 95 | parser.add_argument("--eval_every", type=int, default=3, 96 | help="Interval of epochs to evaluate the model?") 97 | parser.add_argument("--eval_every_iter", type=int, default=455, 98 | help="Interval of iterations to evaluate the model?") 99 | parser.add_argument("--save_every", type=int, default=10, 100 | 
help="Interval of epochs to save a checkpoint of the model?") 101 | parser.add_argument("--early_stop", type=int, default=100, 102 | help="Early stopping patience") 103 | parser.add_argument("--optimizer", type=str, default="Adam", 104 | help="Which optimizer to use?") 105 | parser.add_argument("--lr", type=float, default=0.001, 106 | help="Learning rate of the optimizer") 107 | parser.add_argument("--clip", type=int, default=1000, 108 | help="Maximum gradient norm allowed") 109 | parser.add_argument("--l2", type=float, default=5e-4, 110 | help="Regularization constant for GNN weights") 111 | parser.add_argument("--margin", type=float, default=10, 112 | help="The margin between positive and negative samples in the max-margin loss") 113 | 114 | # Data processing pipeline params 115 | parser.add_argument("--max_links", type=int, default=1000000, 116 | help="Set maximum number of train links (to fit into memory)") 117 | parser.add_argument("--hop", type=int, default=3, 118 | help="Enclosing subgraph hop number") 119 | parser.add_argument("--max_nodes_per_hop", "-max_h", type=int, default=None, 120 | help="if > 0, upper bound the # nodes per hop by subsampling") 121 | parser.add_argument("--use_kge_embeddings", "-kge", type=bool, default=False, 122 | help='whether to use pretrained KGE embeddings') 123 | parser.add_argument("--kge_model", type=str, default="TransE", 124 | help="Which KGE model to load entity embeddings from") 125 | parser.add_argument('--model_type', '-m', type=str, choices=['ssp', 'dgl'], default='dgl', 126 | help='what format to store subgraphs in for model') 127 | parser.add_argument('--constrained_neg_prob', '-cn', type=float, default=0.0, 128 | help='with what probability to sample constrained heads/tails while neg sampling') 129 | parser.add_argument("--batch_size", type=int, default=64, 130 | help="Batch size") 131 | parser.add_argument("--num_neg_samples_per_link", '-neg', type=int, default=1, 132 | help="Number of negative examples to sample 
per positive link") 133 | parser.add_argument("--num_workers", type=int, default=2, 134 | help="Number of dataloading processes") 135 | parser.add_argument('--add_traspose_rels', '-tr', type=bool, default=False, 136 | help='whether to append adj matrix list with symmetric relations') 137 | parser.add_argument('--enclosing_sub_graph', '-en', type=bool, default=True, 138 | help='whether to only consider enclosing subgraph') 139 | 140 | # Model params 141 | parser.add_argument("--rel_emb_dim", "-r_dim", type=int, default=32, 142 | help="Relation embedding size") 143 | parser.add_argument("--attn_rel_emb_dim", "-ar_dim", type=int, default=32, 144 | help="Relation embedding size for attention") 145 | parser.add_argument("--emb_dim", "-dim", type=int, default=32, 146 | help="Entity embedding size") 147 | parser.add_argument("--num_gcn_layers", "-l", type=int, default=3, 148 | help="Number of GCN layers") 149 | parser.add_argument("--num_bases", "-b", type=int, default=4, 150 | help="Number of basis functions to use for GCN weights") 151 | parser.add_argument("--dropout", type=float, default=0, 152 | help="Dropout rate in GNN layers") 153 | parser.add_argument("--edge_dropout", type=float, default=0.5, 154 | help="Dropout rate in edges of the subgraphs") 155 | parser.add_argument('--gnn_agg_type', '-a', type=str, choices=['sum', 'mlp', 'gru'], default='sum', 156 | help='what type of aggregation to do in gnn msg passing') 157 | parser.add_argument('--add_ht_emb', '-ht', type=bool, default=True, 158 | help='whether to concatenate head/tail embedding with pooled graph representation') 159 | parser.add_argument('--has_attn', '-attn', type=bool, default=True, 160 | help='whether to have attn in model or not') 161 | parser.add_argument('--sem_dim', type=int, default=24, 162 | help='the dimension of semantic part of node embedding') 163 | parser.add_argument('--max_nei_rels', type=int, default=10, help='the maximum num of neighbor relations of each node when initializing the node 
embedding.') 164 | parser.add_argument('--nei_rels_dropout', type=float, default=0.4, help='Dropout rate in aggregating relation embeddings.') 165 | parser.add_argument('--is_comp', type=str, default='mult', choices=['mult', 'sub'], help='The composition manner of node and relation') 166 | parser.add_argument('--comp_ht', type=str, choices=['mult', 'mlp', 'sum'], default='sum', help='The composition operator of head and tail embedding') 167 | parser.add_argument('--comp_hrt', type=str, choices=['TransE', 'DistMult'], default=None, help='The composition operator of (h, r, t) embedding') 168 | parser.add_argument('--coef_dgi_loss', type=float, default=5, help='Coefficient of MI loss') 169 | parser.add_argument('--init_nei_rels', type=str, choices=['no', 'out', 'in', 'both'], default='in', help='the manner of utilizing relations when initializing entity embedding') 170 | parser.add_argument('--sort_data', type=bool, default=True, 171 | help='whether to sort training data by relation id') 172 | parser.add_argument('--nei_rel_path', action='store_false', 173 | help='neighboring relational paths are used by default; pass this flag to disable them') 174 | parser.add_argument('--path_agg', type=str, choices=['mean', 'att'], default='att', help='the manner of aggregating neighboring relational paths.') 175 | 176 | params = parser.parse_args() 177 | initialize_experiment(params, __file__) 178 | 179 | params.file_paths = { 180 | 'train': os.path.join(params.main_dir, 'data/{}/{}.txt'.format(params.dataset, params.train_file)), 181 | 'valid': os.path.join(params.main_dir, 'data/{}/{}.txt'.format(params.dataset, params.valid_file)) 182 | } 183 | 184 | if not params.disable_cuda and torch.cuda.is_available(): 185 | params.device = torch.device('cuda:%d' % params.gpu) 186 | else: 187 | params.device = torch.device('cpu') 188 | 189 | params.collate_fn = collate_dgl_train 190 | params.move_batch_to_device = move_batch_to_device_dgl_train 191 | main(params) 192 | 
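Note on the `type=bool` flags in train.py and test_auc.py (e.g. `--enclosing_sub_graph`, `--use_kge_embeddings`, `--sort_data`): argparse applies `bool()` to the raw string, so any non-empty value, including `False`, parses as `True`. The sketch below reproduces this and shows one possible workaround. `str2bool` and `build_parser` are hypothetical helpers written for illustration; they are not part of this repository.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: map common true/false spellings to a bool."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser(bool_type):
    # Mirrors the shape of one flag from train.py; only the type differs.
    parser = argparse.ArgumentParser()
    parser.add_argument("--enclosing_sub_graph", "-en", type=bool_type, default=True)
    return parser


# type=bool: bool("False") is True because the string is non-empty.
naive = build_parser(bool).parse_args(["-en", "False"])
# type=str2bool: the string is interpreted before conversion.
fixed = build_parser(str2bool).parse_args(["-en", "False"])

print(naive.enclosing_sub_graph)  # True
print(fixed.enclosing_sub_graph)  # False
```

With the scripts as written, the only reliable way to get `False` for such a flag is to pass an empty string (e.g. `-en ""`) or to rely on a `False` default.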
-------------------------------------------------------------------------------- /utils/__pycache__/data_utils.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/utils/__pycache__/data_utils.cpython-36.pyc -------------------------------------------------------------------------------- /utils/__pycache__/dgl_utils.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/utils/__pycache__/dgl_utils.cpython-36.pyc -------------------------------------------------------------------------------- /utils/__pycache__/graph_utils.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/utils/__pycache__/graph_utils.cpython-36.pyc -------------------------------------------------------------------------------- /utils/__pycache__/initialization_utils.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/utils/__pycache__/initialization_utils.cpython-36.pyc -------------------------------------------------------------------------------- /utils/clean_data.py: -------------------------------------------------------------------------------- 1 | import os 2 | import argparse 3 | import numpy as np 4 | 5 | 6 | def write_to_file(file_name, data): 7 | with open(file_name, "w") as f: 8 | for s, r, o in data: 9 | f.write('\t'.join([s, r, o]) + '\n') 10 | 11 | 12 | def main(params): 13 | with open(os.path.join(params.main_dir, 'data', params.dataset, 'train.txt')) as f: 14 | train_data = [line.split() for line in f.read().split('\n')[:-1]] 15 | with 
open(os.path.join(params.main_dir, 'data', params.dataset, 'valid.txt')) as f: 16 | valid_data = [line.split() for line in f.read().split('\n')[:-1]] 17 | with open(os.path.join(params.main_dir, 'data', params.dataset, 'test.txt')) as f: 18 | test_data = [line.split() for line in f.read().split('\n')[:-1]] 19 | 20 | train_tails = set([d[2] for d in train_data]) 21 | train_heads = set([d[0] for d in train_data]) 22 | train_ent = train_tails.union(train_heads) 23 | train_rels = set([d[1] for d in train_data]) 24 | 25 | filtered_valid_data = [] 26 | for d in valid_data: 27 | if d[0] in train_ent and d[1] in train_rels and d[2] in train_ent: 28 | filtered_valid_data.append(d) 29 | else: 30 | train_data.append(d) 31 | train_ent = train_ent.union(set([d[0], d[2]])) 32 | train_rels = train_rels.union(set([d[1]])) 33 | 34 | filtered_test_data = [] 35 | for d in test_data: 36 | if d[0] in train_ent and d[1] in train_rels and d[2] in train_ent: 37 | filtered_test_data.append(d) 38 | else: 39 | train_data.append(d) 40 | train_ent = train_ent.union(set([d[0], d[2]])) 41 | train_rels = train_rels.union(set([d[1]])) 42 | 43 | data_dir = os.path.join(params.main_dir, 'data/{}'.format(params.dataset)) 44 | write_to_file(os.path.join(data_dir, 'train.txt'), train_data) 45 | write_to_file(os.path.join(data_dir, 'valid.txt'), filtered_valid_data) 46 | write_to_file(os.path.join(data_dir, 'test.txt'), filtered_test_data) 47 | 48 | with open(os.path.join(params.main_dir, 'data', params.dataset + '_meta', 'train.txt')) as f: 49 | meta_train_data = [line.split() for line in f.read().split('\n')[:-1]] 50 | with open(os.path.join(params.main_dir, 'data', params.dataset + '_meta', 'valid.txt')) as f: 51 | meta_valid_data = [line.split() for line in f.read().split('\n')[:-1]] 52 | with open(os.path.join(params.main_dir, 'data', params.dataset + '_meta', 'test.txt')) as f: 53 | meta_test_data = [line.split() for line in f.read().split('\n')[:-1]] 54 | 55 | meta_train_tails = set([d[2] for d 
in meta_train_data]) 56 | meta_train_heads = set([d[0] for d in meta_train_data]) 57 | meta_train_ent = meta_train_tails.union(meta_train_heads) 58 | meta_train_rels = set([d[1] for d in meta_train_data]) 59 | 60 | filtered_meta_valid_data = [] 61 | for d in meta_valid_data: 62 | if d[0] in meta_train_ent and d[1] in meta_train_rels and d[2] in meta_train_ent: 63 | filtered_meta_valid_data.append(d) 64 | else: 65 | meta_train_data.append(d) 66 | meta_train_ent = meta_train_ent.union(set([d[0], d[2]])) 67 | meta_train_rels = meta_train_rels.union(set([d[1]])) 68 | 69 | filtered_meta_test_data = [] 70 | for d in meta_test_data: 71 | if d[0] in meta_train_ent and d[1] in meta_train_rels and d[2] in meta_train_ent: 72 | filtered_meta_test_data.append(d) 73 | else: 74 | meta_train_data.append(d) 75 | meta_train_ent = meta_train_ent.union(set([d[0], d[2]])) 76 | meta_train_rels = meta_train_rels.union(set([d[1]])) 77 | 78 | meta_data_dir = os.path.join(params.main_dir, 'data/{}_meta'.format(params.dataset)) 79 | write_to_file(os.path.join(meta_data_dir, 'train.txt'), meta_train_data) 80 | write_to_file(os.path.join(meta_data_dir, 'valid.txt'), filtered_meta_valid_data) 81 | write_to_file(os.path.join(meta_data_dir, 'test.txt'), filtered_meta_test_data) 82 | 83 | 84 | if __name__ == '__main__': 85 | parser = argparse.ArgumentParser(description='Move new entities from test/valid to train') 86 | 87 | parser.add_argument("--dataset", "-d", type=str, default="fb237_v1_copy", 88 | help="Dataset string") 89 | params = parser.parse_args() 90 | 91 | params.main_dir = os.path.join(os.path.relpath(os.path.dirname(os.path.abspath(__file__))), '..') 92 | 93 | main(params) 94 | -------------------------------------------------------------------------------- /utils/data_utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pdb 3 | import logging 4 | import numpy as np 5 | from scipy.sparse import csc_matrix 6 | import 
matplotlib.pyplot as plt 7 | 8 | 9 | def plot_rel_dist(adj_list, filename): 10 | rel_count = [] 11 | for adj in adj_list: 12 | rel_count.append(adj.count_nonzero()) 13 | 14 | fig = plt.figure(figsize=(12, 8)) 15 | plt.plot(rel_count) 16 | fig.savefig(filename, dpi=fig.dpi) 17 | 18 | 19 | def process_files(files, saved_relation2id=None, add_traspose_rels=False, sort_data=False): 20 | ''' 21 | files: Dictionary map of file paths to read the triplets from. 22 | saved_relation2id: Saved relation2id (mostly passed from a trained model) which can be used to map relations to pre-defined indices and filter out the unknown ones. 23 | ''' 24 | entity2id = {} 25 | relation2id = {} if saved_relation2id is None else saved_relation2id 26 | 27 | triplets = {} 28 | 29 | ent = 0 30 | rel = 0 31 | 32 | for file_type, file_path in files.items(): 33 | 34 | data = [] 35 | with open(file_path) as f: 36 | file_data = [line.split() for line in f.read().split('\n')[:-1]] 37 | 38 | for triplet in file_data: 39 | if triplet[0] not in entity2id: 40 | entity2id[triplet[0]] = ent 41 | ent += 1 42 | if triplet[2] not in entity2id: 43 | entity2id[triplet[2]] = ent 44 | ent += 1 45 | if not saved_relation2id and triplet[1] not in relation2id: 46 | relation2id[triplet[1]] = rel 47 | rel += 1 48 | 49 | # Save the triplets corresponding to only the known relations 50 | if triplet[1] in relation2id: 51 | data.append([entity2id[triplet[0]], entity2id[triplet[2]], relation2id[triplet[1]]]) 52 | 53 | triplets[file_type] = np.array(data) 54 | 55 | id2entity = {v: k for k, v in entity2id.items()} 56 | id2relation = {v: k for k, v in relation2id.items()} 57 | 58 | # Construct the neighboring relations of each entity 59 | num_rels = len(id2relation) 60 | num_ents = len(entity2id) 61 | h2r = {} 62 | h2r_len = {} 63 | t2r = {} 64 | t2r_len = {} 65 | 66 | for triplet in triplets['train']: 67 | h, t, r = triplet 68 | if h not in h2r: 69 | h2r_len[h] = 1 70 | h2r[h] = [r] 71 | else: 72 | h2r_len[h] += 1 73 | 
h2r[h].append(r) 74 | 75 | if add_traspose_rels: 76 | # Consider the reverse relation, the id of reverse relation is (relation + #relations) 77 | if t not in t2r: 78 | t2r[t] = [r + num_rels] 79 | else: 80 | t2r[t].append(r + num_rels) 81 | if t not in t2r: 82 | t2r[t] = [r] 83 | t2r_len[t] = 1 84 | else: 85 | t2r[t].append(r) 86 | t2r_len[t] = t2r_len.get(t, 0) + 1 # t may already be in t2r via the reverse relation above 87 | 88 | # Consider nodes with no neighbors as index '-1' and their relation index: num_rels. 89 | # ent2rels[-1] = [num_rels] 90 | 91 | # Construct the matrix of ent2rels 92 | # rels_len = triplets['train'].shape(0) // num_ents 93 | h_nei_rels_len = int(np.percentile(list(h2r_len.values()), 75)) 94 | t_nei_rels_len = int(np.percentile(list(t2r_len.values()), 75)) 95 | logging.info(f"75th-percentile number of neighboring relations per node: head: {h_nei_rels_len}, tail: {t_nei_rels_len}") 96 | 97 | # The index "num_rels" of relation is considered as "padding" relation. 98 | # Use padding relation to initialize matrix of ent2rels. 99 | m_h2r = np.ones([num_ents, h_nei_rels_len]) * num_rels 100 | for ent, rels in h2r.items(): 101 | if len(rels) > h_nei_rels_len: 102 | rels = np.array(rels)[np.random.choice(np.arange(len(rels)), h_nei_rels_len)] 103 | m_h2r[ent] = rels 104 | else: 105 | rels = np.array(rels) 106 | m_h2r[ent][: rels.shape[0]] = rels 107 | 108 | m_t2r = np.ones([num_ents, t_nei_rels_len]) * num_rels 109 | for ent, rels in t2r.items(): 110 | if len(rels) > t_nei_rels_len: 111 | rels = np.array(rels)[np.random.choice(np.arange(len(rels)), t_nei_rels_len)] 112 | m_t2r[ent] = rels 113 | else: 114 | rels = np.array(rels) 115 | m_t2r[ent][: rels.shape[0]] = rels 116 | 117 | print("Construct matrix of ent2rels done!") 118 | 119 | # Sort the data according to relation id 120 | if sort_data: 121 | triplets['train'] = triplets['train'][np.argsort(triplets['train'][:,2])] 122 | 123 | adj_list = [] 124 | for i in range(len(relation2id)): 125 | idx = np.argwhere(triplets['train'][:, 2] == i) 126 | 
adj_list.append(csc_matrix((np.ones(len(idx), dtype=np.uint8), (triplets['train'][:, 0][idx].squeeze(1), triplets['train'][:, 1][idx].squeeze(1))), shape=(len(entity2id), len(entity2id)))) 127 | 128 | return adj_list, triplets, entity2id, relation2id, id2entity, id2relation, h2r, m_h2r, t2r, m_t2r 129 | 130 | 131 | def save_to_file(directory, file_name, triplets, id2entity, id2relation): 132 | file_path = os.path.join(directory, file_name) 133 | with open(file_path, "w") as f: 134 | for s, o, r in triplets: 135 | f.write('\t'.join([id2entity[s], id2relation[r], id2entity[o]]) + '\n') 136 | -------------------------------------------------------------------------------- /utils/dgl_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import scipy.sparse as ssp 3 | import random 4 | 5 | """All functions in this file are from dgl.contrib.data.knowledge_graph""" 6 | 7 | 8 | def _bfs_relational(adj, roots, max_nodes_per_hop=None): 9 | """ 10 | BFS for graphs. 11 | Modified from dgl.contrib.data.knowledge_graph to accommodate node sampling 12 | """ 13 | visited = set() 14 | current_lvl = set(roots) 15 | 16 | next_lvl = set() 17 | 18 | while current_lvl: 19 | 20 | for v in current_lvl: 21 | visited.add(v) 22 | 23 | next_lvl = _get_neighbors(adj, current_lvl) 24 | next_lvl -= visited # set difference 25 | 26 | if max_nodes_per_hop and max_nodes_per_hop < len(next_lvl): 27 | next_lvl = set(random.sample(list(next_lvl), max_nodes_per_hop)) # random.sample rejects sets on Python 3.11+ 28 | 29 | yield next_lvl 30 | 31 | current_lvl = set.union(next_lvl) 32 | 33 | 34 | def _get_neighbors(adj, nodes): 35 | """Takes a set of nodes and a graph adjacency matrix and returns a set of neighbors. 
36 | Directly copied from dgl.contrib.data.knowledge_graph""" 37 | sp_nodes = _sp_row_vec_from_idx_list(list(nodes), adj.shape[1]) 38 | sp_neighbors = sp_nodes.dot(adj) 39 | neighbors = set(ssp.find(sp_neighbors)[1]) # convert to set of indices 40 | return neighbors 41 | 42 | 43 | def _sp_row_vec_from_idx_list(idx_list, dim): 44 | """Create sparse vector of dimensionality dim from a list of indices.""" 45 | shape = (1, dim) 46 | data = np.ones(len(idx_list)) 47 | row_ind = np.zeros(len(idx_list)) 48 | col_ind = list(idx_list) 49 | return ssp.csr_matrix((data, (row_ind, col_ind)), shape=shape) 50 | -------------------------------------------------------------------------------- /utils/graph_utils.py: -------------------------------------------------------------------------------- 1 | import statistics 2 | import numpy as np 3 | import scipy.sparse as ssp 4 | import torch 5 | import networkx as nx 6 | import dgl 7 | import pickle 8 | 9 | 10 | def serialize(data): 11 | data_tuple = tuple(data.values()) 12 | return pickle.dumps(data_tuple) 13 | 14 | 15 | def deserialize(data): 16 | data_tuple = pickle.loads(data) 17 | keys = ('nodes', 'r_label', 'g_label', 'n_label') 18 | return dict(zip(keys, data_tuple)) 19 | 20 | 21 | def get_edge_count(adj_list): 22 | count = [] 23 | for adj in adj_list: 24 | count.append(len(adj.tocoo().row.tolist())) 25 | return np.array(count) 26 | 27 | 28 | def incidence_matrix(adj_list): 29 | ''' 30 | adj_list: List of sparse adjacency matrices 31 | ''' 32 | 33 | rows, cols, dats = [], [], [] 34 | dim = adj_list[0].shape 35 | for adj in adj_list: 36 | adjcoo = adj.tocoo() 37 | rows += adjcoo.row.tolist() 38 | cols += adjcoo.col.tolist() 39 | dats += adjcoo.data.tolist() 40 | row = np.array(rows) 41 | col = np.array(cols) 42 | data = np.array(dats) 43 | return ssp.csc_matrix((data, (row, col)), shape=dim) 44 | 45 | 46 | def remove_nodes(A_incidence, nodes): 47 | idxs_wo_nodes = list(set(range(A_incidence.shape[1])) - set(nodes)) 48 | return 
A_incidence[idxs_wo_nodes, :][:, idxs_wo_nodes] 49 | 50 | 51 | def ssp_to_torch(A, device, dense=False): 52 | ''' 53 | A : Sparse adjacency matrix 54 | ''' 55 | idx = torch.LongTensor([A.tocoo().row, A.tocoo().col]) 56 | dat = torch.FloatTensor(A.tocoo().data) 57 | A = torch.sparse.FloatTensor(idx, dat, torch.Size([A.shape[0], A.shape[1]])).to(device=device) 58 | return A 59 | 60 | 61 | def ssp_multigraph_to_dgl(graph, n_feats=None): 62 | """ 63 | Converting ssp multigraph (i.e. list of adjs) to dgl multigraph. 64 | """ 65 | 66 | g_nx = nx.MultiDiGraph() 67 | g_nx.add_nodes_from(list(range(graph[0].shape[0]))) 68 | # Add edges 69 | for rel, adj in enumerate(graph): 70 | # Convert adjacency matrix to tuples for nx0 71 | nx_triplets = [] 72 | for src, dst in list(zip(adj.tocoo().row, adj.tocoo().col)): 73 | nx_triplets.append((src, dst, {'type': rel})) 74 | g_nx.add_edges_from(nx_triplets) 75 | 76 | # make dgl graph 77 | g_dgl = dgl.from_networkx(g_nx, edge_attrs=['type']) 78 | # add node features 79 | if n_feats is not None: 80 | g_dgl.ndata['feat'] = torch.tensor(n_feats) 81 | 82 | return g_dgl 83 | 84 | 85 | def collate_dgl(samples): 86 | # The input `samples` is a list of pairs 87 | graphs_pos, g_labels_pos, r_labels_pos, graphs_negs, g_labels_negs, r_labels_negs = map(list, zip(*samples)) 88 | batched_graph_pos = dgl.batch(graphs_pos) 89 | # batched_nei_rels_pos = nei_rels_poss 90 | # batched_nei_rels_pos = [sublist for sublist in nei_rels_poss] 91 | 92 | graphs_neg = [item for sublist in graphs_negs for item in sublist] 93 | g_labels_neg = [item for sublist in g_labels_negs for item in sublist] 94 | r_labels_neg = [item for sublist in r_labels_negs for item in sublist] 95 | 96 | batched_graph_neg = dgl.batch(graphs_neg) 97 | # batched_nei_rels_neg = [item for sublist in nei_rels_negs for item in sublist] 98 | 99 | return (batched_graph_pos, r_labels_pos), g_labels_pos, (batched_graph_neg, r_labels_neg), g_labels_neg 100 | 101 | def collate_dgl_train(samples): 
102 | # The input `samples` is a list of pairs 103 | graphs_pos, g_labels_pos, r_labels_pos, graphs_negs, g_labels_negs, r_labels_negs = map(list, zip(*samples)) 104 | batched_graph_pos = dgl.batch(graphs_pos) 105 | batched_graph_cor = dgl.batch(graphs_pos) 106 | # batched_nei_rels_pos = nei_rels_poss 107 | # batched_nei_rels_pos = [sublist for sublist in nei_rels_poss] 108 | 109 | graphs_neg = [item for sublist in graphs_negs for item in sublist] 110 | g_labels_neg = [item for sublist in g_labels_negs for item in sublist] 111 | r_labels_neg = [item for sublist in r_labels_negs for item in sublist] 112 | 113 | batched_graph_neg = dgl.batch(graphs_neg) 114 | # batched_nei_rels_neg = [item for sublist in nei_rels_negs for item in sublist] 115 | 116 | return (batched_graph_pos, batched_graph_cor, r_labels_pos), g_labels_pos, (batched_graph_neg, r_labels_neg), g_labels_neg 117 | 118 | def move_batch_to_device_dgl(batch, device): 119 | ((g_dgl_pos, r_labels_pos), targets_pos, (g_dgl_neg, r_labels_neg), targets_neg) = batch 120 | 121 | targets_pos = torch.LongTensor(targets_pos).to(device=device) 122 | r_labels_pos = torch.LongTensor(r_labels_pos).to(device=device) 123 | 124 | targets_neg = torch.LongTensor(targets_neg).to(device=device) 125 | r_labels_neg = torch.LongTensor(r_labels_neg).to(device=device) 126 | 127 | g_dgl_pos = send_graph_to_device(g_dgl_pos, device) 128 | g_dgl_neg = send_graph_to_device(g_dgl_neg, device) 129 | 130 | # ent2rels = {key: torch.LongTensor(value).to(device=device) for key, value in ent2rels.items()} 131 | # nei_rels_pos = torch.LongTensor(nei_rels_pos).to(device=device) 132 | # nei_rels_neg = torch.LongTensor(nei_rels_neg).to(device=device) 133 | 134 | return ((g_dgl_pos, r_labels_pos), targets_pos, (g_dgl_neg, r_labels_neg), targets_neg) 135 | 136 | def move_batch_to_device_dgl_train(batch, device): 137 | ((g_dgl_pos, g_dgl_cor, r_labels_pos), targets_pos, (g_dgl_neg, r_labels_neg), targets_neg) = batch 138 | 139 | targets_pos = 
torch.LongTensor(targets_pos).to(device=device) 140 | r_labels_pos = torch.LongTensor(r_labels_pos).to(device=device) 141 | 142 | targets_neg = torch.LongTensor(targets_neg).to(device=device) 143 | r_labels_neg = torch.LongTensor(r_labels_neg).to(device=device) 144 | 145 | g_dgl_pos = send_graph_to_device(g_dgl_pos, device) 146 | g_dgl_cor = send_graph_to_device(g_dgl_cor, device) 147 | g_dgl_neg = send_graph_to_device(g_dgl_neg, device) 148 | 149 | # ent2rels = {key: torch.LongTensor(value).to(device=device) for key, value in ent2rels.items()} 150 | # nei_rels_pos = torch.LongTensor(nei_rels_pos).to(device=device) 151 | # nei_rels_neg = torch.LongTensor(nei_rels_neg).to(device=device) 152 | 153 | return ((g_dgl_pos, r_labels_pos), targets_pos, (g_dgl_neg, r_labels_neg), targets_neg, (g_dgl_cor, r_labels_pos)) 154 | 155 | def send_graph_to_device(g, device): 156 | # # nodes 157 | # labels = g.node_attr_schemes() 158 | # for l in labels.keys(): 159 | # g.ndata[l] = g.ndata.pop(l).to(device) 160 | 161 | # # edges 162 | # labels = g.edge_attr_schemes() 163 | # for l in labels.keys(): 164 | # g.edata[l] = g.edata.pop(l).to(device) 165 | # return g 166 | g = g.to(device) 167 | return g 168 | 169 | # The following three functions are modified from the networkx source code to 170 | # accommodate diameter and radius for directed graphs 171 | 172 | 173 | def eccentricity(G): 174 | e = {} 175 | for n in G.nbunch_iter(): 176 | length = nx.single_source_shortest_path_length(G, n) 177 | e[n] = max(length.values()) 178 | return e 179 | 180 | 181 | def radius(G): 182 | e = eccentricity(G) 183 | e = np.where(np.array(list(e.values())) > 0, list(e.values()), np.inf) 184 | return min(e) 185 | 186 | 187 | def diameter(G): 188 | e = eccentricity(G) 189 | return max(e.values()) 190 | -------------------------------------------------------------------------------- /utils/initialization_utils.py: -------------------------------------------------------------------------------- 1 | import os 
2 | import logging 3 | import json 4 | import torch 5 | 6 | 7 | def initialize_experiment(params, file_name): 8 | ''' 9 | Makes the experiment directory, sets standard paths and initializes the logger 10 | ''' 11 | params.main_dir = os.path.join(os.path.relpath(os.path.dirname(os.path.abspath(__file__))), '..') 12 | exps_dir = os.path.join(params.main_dir, 'experiments') 13 | if not os.path.exists(exps_dir): 14 | os.makedirs(exps_dir) 15 | 16 | params.exp_dir = os.path.join(exps_dir, params.experiment_name) 17 | 18 | if not os.path.exists(params.exp_dir): 19 | os.makedirs(params.exp_dir) 20 | 21 | if file_name == 'test_auc.py': 22 | params.test_exp_dir = os.path.join(params.exp_dir, f"test_{params.dataset}_{params.constrained_neg_prob}") 23 | if not os.path.exists(params.test_exp_dir): 24 | os.makedirs(params.test_exp_dir) 25 | file_handler = logging.FileHandler(os.path.join(params.test_exp_dir, f"log_test.txt")) 26 | else: 27 | file_handler = logging.FileHandler(os.path.join(params.exp_dir, "log_train.txt")) 28 | logger = logging.getLogger() 29 | logger.addHandler(file_handler) 30 | 31 | logger.info('============ Initialized logger ============') 32 | logger.info('\n'.join('%s: %s' % (k, str(v)) for k, v 33 | in sorted(dict(vars(params)).items()))) 34 | logger.info('============================================') 35 | 36 | with open(os.path.join(params.exp_dir, "params.json"), 'w') as fout: 37 | json.dump(vars(params), fout) 38 | 39 | 40 | def initialize_model(params, model, ent2rels=None, load_model=False): 41 | ''' 42 | relation2id: the relation to id mapping, this is stored in the model and used when testing 43 | model: the type of model to initialize/load 44 | load_model: flag which decide to initialize the model or load a saved model 45 | ''' 46 | 47 | if load_model and os.path.exists(os.path.join(params.exp_dir, 'best_graph_classifier.pth')): 48 | logging.info('Loading existing model from %s' % os.path.join(params.exp_dir, 'best_graph_classifier.pth')) 49 | 
graph_classifier = torch.load(os.path.join(params.exp_dir, 'best_graph_classifier.pth')).to(device=params.device) 50 | else: 51 | relation2id_path = os.path.join(params.main_dir, f'data/{params.dataset}/relation2id.json') 52 | with open(relation2id_path) as f: 53 | relation2id = json.load(f) 54 | 55 | logging.info('No existing model found. Initializing new model..') 56 | graph_classifier = model(params, relation2id, ent2rels).to(device=params.device) 57 | 58 | return graph_classifier 59 | -------------------------------------------------------------------------------- /utils/prepare_meta_data.py: -------------------------------------------------------------------------------- 1 | import pdb 2 | import os 3 | import math 4 | import random 5 | import argparse 6 | import numpy as np 7 | 8 | from graph_utils import incidence_matrix, get_edge_count 9 | from dgl_utils import _bfs_relational 10 | from data_utils import process_files, save_to_file 11 | 12 | 13 | def get_active_relations(adj_list): 14 | act_rels = [] 15 | for r, adj in enumerate(adj_list): 16 | if len(adj.tocoo().row.tolist()) > 0: 17 | act_rels.append(r) 18 | return act_rels 19 | 20 | 21 | def get_avg_degree(adj_list): 22 | adj_mat = incidence_matrix(adj_list) 23 | degree = [] 24 | for node in range(adj_list[0].shape[0]): 25 | degree.append(np.sum(adj_mat[node, :])) 26 | return np.mean(degree) 27 | 28 | 29 | def get_splits(adj_list, nodes, valid_rels=None, valid_ratio=0.1, test_ratio=0.1): 30 | ''' 31 | Get train/valid/test splits of the sub-graph defined by the given set of nodes. The relations in this subgraph are limited to those in the given valid_rels. 
32 | ''' 33 | 34 | # Extract the subgraph 35 | subgraph = [adj[nodes, :][:, nodes] for adj in adj_list] 36 | 37 | # Get the relations that are allowed to be sampled 38 | active_rels = get_active_relations(subgraph) 39 | common_rels = list(set(active_rels).intersection(set(valid_rels))) 40 | 41 | print('Average degree : ', get_avg_degree(subgraph)) 42 | print('Nodes: ', len(nodes)) 43 | print('Links: ', np.sum(get_edge_count(subgraph))) 44 | print('Active relations: ', len(common_rels)) 45 | 46 | # get all the triplets satisfying the given constraints 47 | all_triplets = [] 48 | for r in common_rels: 49 | # print(r, len(subgraph[r].tocoo().row)) 50 | for (i, j) in zip(subgraph[r].tocoo().row, subgraph[r].tocoo().col): 51 | all_triplets.append([nodes[i], nodes[j], r]) 52 | all_triplets = np.array(all_triplets) 53 | 54 | # delete the triplets which correspond to self connections 55 | ind = np.argwhere(all_triplets[:, 0] == all_triplets[:, 1]) 56 | all_triplets = np.delete(all_triplets, ind, axis=0) 57 | print('Links after deleting self connections : %d' % len(all_triplets)) 58 | 59 | # get the splits according to the given ratio 60 | np.random.shuffle(all_triplets) 61 | train_split = int(math.ceil(len(all_triplets) * (1 - valid_ratio - test_ratio))) 62 | valid_split = int(math.ceil(len(all_triplets) * (1 - test_ratio))) 63 | 64 | train_triplets = all_triplets[:train_split] 65 | valid_triplets = all_triplets[train_split: valid_split] 66 | test_triplets = all_triplets[valid_split:] 67 | 68 | return train_triplets, valid_triplets, test_triplets, common_rels 69 | 70 | 71 | def get_subgraph(adj_list, hops, max_nodes_per_hop): 72 | ''' 73 | Samples a subgraph around randomly chosen root nodes, up to `hops` hops away, with at most max_nodes_per_hop nodes selected per hop 74 | ''' 75 | 76 | # collapse the list of adjacency matrices into a single matrix 77 | A_incidence = incidence_matrix(adj_list) 78 | 79 | # choose a set of random root nodes 80 | idx = 
np.random.choice(range(len(A_incidence.tocoo().row)), size=params.n_roots, replace=False) 81 | roots = set([A_incidence.tocoo().row[id] for id in idx] + [A_incidence.tocoo().col[id] for id in idx]) 82 | 83 | # get the neighbor nodes within a limit of hops 84 | bfs_generator = _bfs_relational(A_incidence, roots, max_nodes_per_hop) 85 | lvls = list() 86 | for _ in range(hops): 87 | lvls.append(next(bfs_generator)) 88 | 89 | nodes = list(roots) + list(set().union(*lvls)) 90 | 91 | return nodes 92 | 93 | 94 | def mask_nodes(adj_list, nodes): 95 | ''' 96 | mask a set of nodes from a given graph 97 | ''' 98 | 99 | masked_adj_list = [adj.copy() for adj in adj_list] 100 | for node in nodes: 101 | for adj in masked_adj_list: 102 | adj.data[adj.indptr[node]:adj.indptr[node + 1]] = 0 103 | adj = adj.tocsr() 104 | adj.data[adj.indptr[node]:adj.indptr[node + 1]] = 0 105 | adj = adj.tocsc() 106 | for adj in masked_adj_list: 107 | adj.eliminate_zeros() 108 | return masked_adj_list 109 | 110 | 111 | def main(params): 112 | 113 | adj_list, triplets, entity2id, relation2id, id2entity, id2relation, *_ = process_files(files) # process_files also returns h2r, m_h2r, t2r, m_t2r, which are unused here 114 | 115 | meta_train_nodes = get_subgraph(adj_list, params.hops, params.max_nodes_per_hop) # list(range(750, 8500)) # 116 | 117 | masked_adj_list = mask_nodes(adj_list, meta_train_nodes) 118 | 119 | meta_test_nodes = get_subgraph(masked_adj_list, params.hops_test + 1, params.max_nodes_per_hop_test) # list(range(0, 750)) # 120 | 121 | print('Common nodes among the two disjoint datasets (should ideally be zero): ', set(meta_train_nodes).intersection(set(meta_test_nodes))) 122 | tmp = [adj[meta_train_nodes, :][:, meta_train_nodes] for adj in masked_adj_list] 123 | print('Residual edges (should be zero) : ', np.sum(get_edge_count(tmp))) 124 | 125 | print("================") 126 | print("Train graph stats") 127 | print("================") 128 | train_triplets, valid_triplets, test_triplets, train_active_rels = get_splits(adj_list, meta_train_nodes, range(len(adj_list))) 
129 | print("================") 130 | print("Meta-test graph stats") 131 | print("================") 132 | meta_train_triplets, meta_valid_triplets, meta_test_triplets, meta_active_rels = get_splits(adj_list, meta_test_nodes, train_active_rels) 133 | 134 | print("================") 135 | print('Extra rels (should be empty): ', set(meta_active_rels) - set(train_active_rels)) 136 | 137 | # TODO: ABSTRACT THIS INTO A METHOD 138 | data_dir = os.path.join(params.main_dir, 'data/{}'.format(params.new_dataset)) 139 | if not os.path.exists(data_dir): 140 | os.makedirs(data_dir) 141 | 142 | save_to_file(data_dir, 'train.txt', train_triplets, id2entity, id2relation) 143 | save_to_file(data_dir, 'valid.txt', valid_triplets, id2entity, id2relation) 144 | save_to_file(data_dir, 'test.txt', test_triplets, id2entity, id2relation) 145 | 146 | meta_data_dir = os.path.join(params.main_dir, 'data/{}'.format(params.new_dataset + '_meta')) 147 | if not os.path.exists(meta_data_dir): 148 | os.makedirs(meta_data_dir) 149 | 150 | save_to_file(meta_data_dir, 'train.txt', meta_train_triplets, id2entity, id2relation) 151 | save_to_file(meta_data_dir, 'valid.txt', meta_valid_triplets, id2entity, id2relation) 152 | save_to_file(meta_data_dir, 'test.txt', meta_test_triplets, id2entity, id2relation) 153 | 154 | 155 | if __name__ == '__main__': 156 | 157 | parser = argparse.ArgumentParser(description='Save adjacency matrices and triplets') 158 | 159 | parser.add_argument("--dataset", "-d", type=str, default="FB15K237", 160 | help="Name of the source dataset directory under data/") 161 | parser.add_argument("--new_dataset", "-nd", type=str, default="fb_v3", 162 | help="Name of the new dataset directory to write under data/") 163 | parser.add_argument("--n_roots", "-n", type=int, default="1", 164 | help="Number of roots to sample the neighborhood from") 165 | parser.add_argument("--hops", "-H", type=int, default="3", 166 | help="Number of hops to sample the neighborhood") 167 | parser.add_argument("--max_nodes_per_hop", "-m", type=int, default="2500", 168 | help="Number 
of nodes in the neighborhood") 169 | parser.add_argument("--hops_test", "-HT", type=int, default="3", 170 | help="Number of hops to sample the neighborhood") 171 | parser.add_argument("--max_nodes_per_hop_test", "-mt", type=int, default="2500", 172 | help="Number of nodes in the neighborhood") 173 | parser.add_argument("--seed", "-s", type=int, default="28", 174 | help="Numpy random seed") 175 | 176 | params = parser.parse_args() 177 | 178 | np.random.seed(params.seed) 179 | random.seed(params.seed) 180 | 181 | params.main_dir = os.path.join(os.path.relpath(os.path.dirname(os.path.abspath(__file__))), '..') 182 | 183 | files = { 184 | 'train': os.path.join(params.main_dir, 'data/{}/train.txt'.format(params.dataset)), 185 | 'valid': os.path.join(params.main_dir, 'data/{}/valid.txt'.format(params.dataset)), 186 | 'test': os.path.join(params.main_dir, 'data/{}/test.txt'.format(params.dataset)) 187 | } 188 | 189 | main(params) 190 | --------------------------------------------------------------------------------
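The ratio-based split in `get_splits` can be looked at in isolation. The sketch below reproduces only its boundary arithmetic (both cut points rounded up with `math.ceil`) on a plain Python list. `split_by_ratio` and its `seed` parameter are hypothetical names, not part of the repository, and the sketch shuffles with the standard-library `random` module where the script uses `np.random.shuffle`.

```python
import math
import random


def split_by_ratio(items, valid_ratio=0.1, test_ratio=0.1, seed=28):
    """Shuffle a copy of `items` and cut it into train/valid/test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Same boundary arithmetic as get_splits: both cut points are rounded up.
    train_end = int(math.ceil(len(shuffled) * (1 - valid_ratio - test_ratio)))
    valid_end = int(math.ceil(len(shuffled) * (1 - test_ratio)))
    return shuffled[:train_end], shuffled[train_end:valid_end], shuffled[valid_end:]


# Ten toy (head, tail, relation) triplets, all with relation id 0.
triplets = [(h, h + 1, 0) for h in range(10)]
train, valid, test = split_by_ratio(triplets)
print(len(train), len(valid), len(test))
```

Because both boundaries are rounded up, small inputs favour the training part, and the test part can be empty when the number of triplets is very small.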