├── .gitignore
├── README.md
├── data
│   ├── FB15K237
│   │   ├── FB15K237.pickle
│   │   ├── README.txt
│   │   ├── entities.dict
│   │   ├── relations.dict
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── WN18RR
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── WN18RR_v1
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── WN18RR_v1_ind
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── WN18RR_v2
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── WN18RR_v2_ind
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── WN18RR_v3
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── WN18RR_v3_ind
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── WN18RR_v4
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── WN18RR_v4_ind
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── fb237_v1
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── fb237_v1_ind
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── fb237_v2
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── fb237_v2_ind
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── fb237_v3
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── fb237_v3_ind
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   ├── fb237_v4
│   │   ├── test.txt
│   │   ├── train.txt
│   │   └── valid.txt
│   └── fb237_v4_ind
│       ├── test.txt
│       ├── train.txt
│       └── valid.txt
├── managers
│   ├── __pycache__
│   │   ├── evaluator.cpython-36.pyc
│   │   └── trainer.cpython-36.pyc
│   ├── evaluator.py
│   └── trainer.py
├── model
│   └── dgl
│       ├── __init__.py
│       ├── __pycache__
│       │   ├── __init__.cpython-36.pyc
│       │   ├── aggregators.cpython-36.pyc
│       │   ├── batch_gru.cpython-36.pyc
│       │   ├── discriminator.cpython-36.pyc
│       │   ├── graph_classifier.cpython-36.pyc
│       │   ├── layers.cpython-36.pyc
│       │   └── rgcn_model.cpython-36.pyc
│       ├── aggregators.py
│       ├── batch_gru.py
│       ├── discriminator.py
│       ├── graph_classifier.py
│       ├── layers.py
│       └── rgcn_model.py
├── requirements.txt
├── snri.png
├── subgraph_extraction
│   ├── __pycache__
│   │   ├── datasets.cpython-36.pyc
│   │   └── graph_sampler.cpython-36.pyc
│   ├── datasets.py
│   └── graph_sampler.py
├── test_auc.py
├── test_ranking.py
├── train.py
└── utils
    ├── __pycache__
    │   ├── data_utils.cpython-36.pyc
    │   ├── dgl_utils.cpython-36.pyc
    │   ├── graph_utils.cpython-36.pyc
    │   └── initialization_utils.cpython-36.pyc
    ├── clean_data.py
    ├── data_utils.py
    ├── dgl_utils.py
    ├── graph_utils.py
    ├── initialization_utils.py
    └── prepare_meta_data.py
/.gitignore:
--------------------------------------------------------------------------------
1 | .DS_Store
2 | __pycache__/
3 | tmp.txt
4 | experiments/
5 | data/
6 |
7 | #Saved and downloaded data files
8 | *.nt.gz
9 | *.npz
10 | *.pkl
11 | *.ipynb
12 | *.npy
13 | *.pyc
14 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # SNRI - Subgraph Neighboring Relations Infomax for Inductive Link Prediction on Knowledge Graphs
2 |
3 | Code for the paper [Subgraph Neighboring Relations Infomax for Inductive Link Prediction on Knowledge Graphs](https://arxiv.org/abs/2208.00850) by Xiaohan Xu, Peng Zhang, Yongquan He, Chengpeng Chao and Chaoyang Yan. IJCAI 2022.
4 |
5 |
6 |
7 | Inductive link prediction on knowledge graphs aims to predict missing links between unseen entities, i.e., entities that do not appear during training. Most previous works learn entity-specific embeddings, which cannot handle unseen entities. Several recent methods use enclosing subgraphs to obtain inductive ability. However, these works consider only the enclosing part of the subgraph and omit the complete neighboring relations, so part of the neighboring relations is neglected and sparse subgraphs are hard to handle. To address this, we propose Subgraph Neighboring Relations Infomax (SNRI), which exploits complete neighboring relations in two ways: a *neighboring relational feature* for node features and a *neighboring relational path* for sparse subgraphs. To further model neighboring relations globally, we apply mutual information (MI) maximization to knowledge graphs. Experiments show that SNRI outperforms existing state-of-the-art methods by a large margin on the inductive link prediction task, and verify the effectiveness of exploring complete neighboring relations in a global way to characterize node features and reason on sparse subgraphs.
8 |
9 | ## Requirements
10 | dgl
11 | lmdb
12 | networkx
13 | scikit-learn
14 | torch
15 | tqdm
16 |
17 | ## Usage
18 |
19 | Training and test data are located in the `data` folder.
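
As a rough illustration of how the split folders under `data` could be read, here is a minimal sketch. It assumes each folder holds `train.txt`, `valid.txt`, and `test.txt` with one whitespace-separated `head relation tail` triple per line, as in the files shown in this repository. The helpers `read_triples`, `load_split`, and `entities` are hypothetical names for illustration and are not the loaders in `utils/data_utils.py`.

```python
from pathlib import Path
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def read_triples(path: Path) -> List[Triple]:
    """Read one triple per line; lines without exactly three fields are skipped."""
    triples = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.split()
            if len(parts) == 3:
                triples.append((parts[0], parts[1], parts[2]))
    return triples


def load_split(data_root: str, name: str) -> Dict[str, List[Triple]]:
    """Load train/valid/test triples from data_root/name (hypothetical helper)."""
    root = Path(data_root) / name
    return {part: read_triples(root / f"{part}.txt")
            for part in ("train", "valid", "test")}


def entities(triples: List[Triple]) -> Set[str]:
    """Collect every entity that appears as a head or a tail."""
    return {entity for head, _, tail in triples for entity in (head, tail)}
```

For example, `entities(load_split("data", "WN18RR_v1")["train"]) & entities(load_split("data", "WN18RR_v1_ind")["train"])` would list entities shared by the two graphs; whether that set is empty for a given split is something to check rather than assume.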
20 |
21 | ### Training
22 |
23 | Train on the WN18RR datasets using the following commands:
24 |
25 | ```shell script
26 | python train.py -d WN18RR_v1 -e snri_wn_v1
27 | python train.py -d WN18RR_v2 -e snri_wn_v2
28 | python train.py -d WN18RR_v3 -e snri_wn_v3
29 | python train.py -d WN18RR_v4 -e snri_wn_v4
30 | ```
31 |
32 | Train on the FB15K237 datasets using the following commands:
33 | ```shell script
34 | python train.py -d fb237_v1 -e snri_fb_v1
35 | python train.py -d fb237_v2 -e snri_fb_v2
36 | python train.py -d fb237_v3 -e snri_fb_v3
37 | python train.py -d fb237_v4 -e snri_fb_v4
38 | ```
39 |
40 | ### Evaluation
41 |
42 | Evaluate a trained model with commands such as:
43 | ```shell script
44 | python test_auc.py -d WN18RR_v4_ind -e snri_wn_v4
45 | python test_ranking.py -d WN18RR_v4_ind -e snri_wn_v4
46 | ```
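
The naming pattern in the commands above (train on `<dataset>_vN`, then evaluate on `<dataset>_vN_ind` with the same experiment name) can be sketched in Python. The helper `build_commands` is hypothetical and not part of this repository. Extending the evaluation pattern beyond the `WN18RR_v4` example is an assumption based on the folder names under `data`.

```python
from typing import List, Sequence, Tuple


def build_commands(family: str, tag: str,
                   versions: Sequence[int] = (1, 2, 3, 4)) -> List[Tuple[str, str, str]]:
    """Return (train, test_auc, test_ranking) command strings per dataset version.

    family: dataset prefix, e.g. "WN18RR" or "fb237"
    tag:    short name used in the experiment name, e.g. "wn" or "fb"
    """
    commands = []
    for version in versions:
        train_set = f"{family}_v{version}"
        experiment = f"snri_{tag}_v{version}"
        # Assumption: the inductive test graph is the folder with the "_ind" suffix.
        test_set = f"{train_set}_ind"
        commands.append((
            f"python train.py -d {train_set} -e {experiment}",
            f"python test_auc.py -d {test_set} -e {experiment}",
            f"python test_ranking.py -d {test_set} -e {experiment}",
        ))
    return commands


if __name__ == "__main__":
    for group in build_commands("WN18RR", "wn") + build_commands("fb237", "fb"):
        print("\n".join(group))
```

The script only prints the commands, so they can be reviewed before being run by hand or piped to a shell.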
47 |
48 | ### Ablation Study
49 |
50 | Run the following commands to train the different model variants:
51 | ```shell script
52 | python train.py -d WN18RR_v4 -e snri_wn_v4 --nei_rel_path # without neighboring relational path module
53 | python train.py -d WN18RR_v4 -e snri_wn_v4 --init_nei_rels no # without neighboring relational feature module
54 | python train.py -d WN18RR_v4 -e snri_wn_v4 --coef_dgi_loss 0 # without MI module
55 | ```
56 |
57 | ## Citation
58 | If you use the source code in this repository in your work, please cite the following paper. The BibTeX entry is listed below:
59 |
60 | @inproceedings{ijcai2022p325,
61 | title = {Subgraph Neighboring Relations Infomax for Inductive Link Prediction on Knowledge Graphs},
62 | author = {Xu, Xiaohan and Zhang, Peng and He, Yongquan and Chao, Chengpeng and Yan, Chaoyang},
63 | booktitle = {Proceedings of the Thirty-First International Joint Conference on
64 | Artificial Intelligence, {IJCAI-22}},
65 | publisher = {International Joint Conferences on Artificial Intelligence Organization},
66 | editor = {Luc De Raedt},
67 | pages = {2341--2347},
68 | year = {2022},
69 | month = {7},
70 | note = {Main Track},
71 | doi = {10.24963/ijcai.2022/325},
72 | url = {https://doi.org/10.24963/ijcai.2022/325},
73 | }
74 |
75 | ## Acknowledgement
76 | This code builds on the code of [GraIL](https://github.com/kkteru/grail). We thank its authors for their contributions.
77 |
--------------------------------------------------------------------------------
/data/FB15K237/FB15K237.pickle:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/data/FB15K237/FB15K237.pickle
--------------------------------------------------------------------------------
/data/FB15K237/README.txt:
--------------------------------------------------------------------------------
1 | FB15K-237 Knowledge Base Completion Dataset
2 |
3 | This dataset contains knowledge base relation triples and textual mentions of Freebase entity pairs, as used in the work published in [1] and [2].
4 | The knowledge base triples are a subset of the FB15K set [3], originally derived from Freebase. The textual mentions are derived from 200 million sentences from the ClueWeb12 [5] corpus coupled with Freebase entity mention annotations [4].
5 |
6 |
7 | FILE FORMAT DETAILS
8 |
9 | The files train.txt, valid.txt, and test.txt contain the training, development, and test set knowledge base triples used in both [1] and [2].
10 | The file text_cvsc.txt contains the textual triples used in [2] and the file text_emnlp.txt contains the textual triples used in [1].
11 |
12 | The knowledge base triples contain lines like this:
13 |
14 | /m/0grwj /people/person/profession /m/05sxg2
15 |
16 | The format is:
17 |
18 | mid1 relation mid2
19 |
20 | The separator is a tab character; the mids are Freebase ids of entities, and the relation is a single or a two-hop relation from Freebase, where an intermediate complex value type entity has been collapsed out.
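
The triple format described above can be parsed with a few lines of Python. This is a minimal sketch: `Triple`, `parse_triple`, and `relation_hops` are hypothetical names, and splitting a two-hop relation on a "." that is directly followed by "/" is an assumption inferred from relation names such as those in relations.dict, not something this README specifies.

```python
import re
from typing import List, NamedTuple


class Triple(NamedTuple):
    head: str      # mid1, a Freebase id
    relation: str  # single or two-hop Freebase relation
    tail: str      # mid2, a Freebase id


def parse_triple(line: str) -> Triple:
    """Split one tab-separated 'mid1 relation mid2' line into its fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(fields)}: {line!r}")
    return Triple(*fields)


def relation_hops(relation: str) -> List[str]:
    """Split a relation into its hops.

    Assumption: hops are joined by a '.' that is immediately followed by '/'.
    """
    return re.split(r"\.(?=/)", relation)
```

A single-hop relation such as /people/person/profession comes back as a one-element list, so callers can treat both cases uniformly.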
21 |
22 | The textual mentions files have lines like this:
23 |
24 | /m/02qkt [XXX]:<-nn>:fact:<-pobj>:in:<-prep>:game:<-nsubj>:'s::pivot::[YYY] /m/05sb1 3
25 |
26 | This indicates the mids of two Freebase entities, together with a fully lexicalized dependency path between the entities. The last element in the tuple is the number of occurrences of the specified entity pair with the given dependency path in sentences from ClueWeb12.
27 | The dependency paths are specified as sequences of words (like the word "fact" above) and labeled dependency links (like <-nsubj> above). The direction of traversal of a dependency arc is indicated by whether there is a - sign in front of the arc label, e.g. <-nsubj> vs <nsubj>.
28 |
29 |
30 | REFERENCES
31 |
32 | [1] Kristina Toutanova, Danqi Chen, Patrick Pantel, Hoifung Poon, Pallavi Choudhury, and Michael Gamon. Representing text for joint embedding of text and knowledge bases. In Proceedings of EMNLP 2015.
33 | [2] Kristina Toutanova and Danqi Chen. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and Their Compositionality 2015.
34 | [3] Antoine Bordes, Nicolas Usunier, Alberto Garcia Duran, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multirelational data. In Advances in Neural Information Processing Systems (NIPS) 2013.
35 | [4] Evgeniy Gabrilovich, Michael Ringgaard, and Amarnag Subramanya. FACC1: Freebase annotation of ClueWeb corpora, Version 1 (release date 2013-06-26, format version 1, correction level 0). http://lemurproject.org/clueweb12/FACC1/
36 | [5] http://lemurproject.org/clueweb12/
37 |
38 |
39 | CONTACT
40 |
41 | Please contact Kristina Toutanova kristout@microsoft.com if you have questions about the dataset.
42 |
--------------------------------------------------------------------------------
/data/FB15K237/relations.dict:
--------------------------------------------------------------------------------
1 | 0 /organization/organization/headquarters./location/mailing_address/state_province_region
2 | 1 /education/educational_institution/colors
3 | 2 /people/person/profession
4 | 3 /film/film/costume_design_by
5 | 4 /film/film/genre
6 | 5 /celebrities/celebrity/celebrity_friends./celebrities/friendship/friend
7 | 6 /tv/tv_producer/programs_produced./tv/tv_producer_term/producer_type
8 | 7 /film/film/executive_produced_by
9 | 8 /sports/sports_team/roster./basketball/basketball_roster_position/position
10 | 9 /award/award_nominee/award_nominations./award/award_nomination/nominated_for
11 | 10 /award/award_category/winners./award/award_honor/award_winner
12 | 11 /award/award_winner/awards_won./award/award_honor/award_winner
13 | 12 /music/artist/origin
14 | 13 /food/food/nutrients./food/nutrition_fact/nutrient
15 | 14 /film/film/distributors./film/film_film_distributor_relationship/region
16 | 15 /time/event/instance_of_recurring_event
17 | 16 /sports/professional_sports_team/draft_picks./sports/sports_league_draft_pick/school
18 | 17 /film/film/language
19 | 18 /location/statistical_region/places_exported_to./location/imports_and_exports/exported_to
20 | 19 /music/group_member/membership./music/group_membership/group
21 | 20 /tv/tv_network/programs./tv/tv_network_duration/program
22 | 21 /award/award_winning_work/awards_won./award/award_honor/award_winner
23 | 22 /people/person/places_lived./people/place_lived/location
24 | 23 /travel/travel_destination/climate./travel/travel_destination_monthly_climate/month
25 | 24 /broadcast/content/artist
26 | 25 /base/americancomedy/celebrity_impressionist/celebrities_impersonated
27 | 26 /base/popstra/celebrity/breakup./base/popstra/breakup/participant
28 | 27 /organization/organization/place_founded
29 | 28 /people/person/employment_history./business/employment_tenure/company
30 | 29 /location/statistical_region/gdp_nominal_per_capita./measurement_unit/dated_money_value/currency
31 | 30 /people/person/place_of_birth
32 | 31 /location/location/contains
33 | 32 /base/popstra/celebrity/dated./base/popstra/dated/participant
34 | 33 /user/ktrueman/default_domain/international_organization/member_states
35 | 34 /government/legislative_session/members./government/government_position_held/legislative_sessions
36 | 35 /film/film/estimated_budget./measurement_unit/dated_money_value/currency
37 | 36 /organization/non_profit_organization/registered_with./organization/non_profit_registration/registering_agency
38 | 37 /organization/organization/headquarters./location/mailing_address/country
39 | 38 /base/biblioness/bibs_location/country
40 | 39 /education/educational_institution/students_graduates./education/education/student
41 | 40 /music/group_member/membership./music/group_membership/role
42 | 41 /location/administrative_division/country
43 | 42 /award/ranked_item/appears_in_ranked_lists./award/ranking/list
44 | 43 /base/eating/practicer_of_diet/diet
45 | 44 /film/special_film_performance_type/film_performance_type./film/performance/film
46 | 45 /award/award_nominated_work/award_nominations./award/award_nomination/nominated_for
47 | 46 /film/director/film
48 | 47 /base/x2010fifaworldcupsouthafrica/world_cup_squad/current_world_cup_squad./base/x2010fifaworldcupsouthafrica/current_world_cup_squad/current_club
49 | 48 /olympics/olympic_games/participating_countries
50 | 49 /music/performance_role/regular_performances./music/group_membership/role
51 | 50 /music/artist/track_contributions./music/track_contribution/role
52 | 51 /base/aareas/schema/administrative_area/administrative_area_type
53 | 52 /film/film/distributors./film/film_film_distributor_relationship/film_distribution_medium
54 | 53 /olympics/olympic_games/sports
55 | 54 /soccer/football_team/current_roster./soccer/football_roster_position/position
56 | 55 /olympics/olympic_participating_country/athletes./olympics/olympic_athlete_affiliation/olympics
57 | 56 /military/military_combatant/military_conflicts./military/military_combatant_group/combatants
58 | 57 /tv/tv_personality/tv_regular_appearances./tv/tv_regular_personal_appearance/program
59 | 58 /common/topic/webpage./common/webpage/category
60 | 59 /music/genre/artists
61 | 60 /film/film/featured_film_locations
62 | 61 /location/location/adjoin_s./location/adjoining_relationship/adjoins
63 | 62 /sports/sports_team/colors
64 | 63 /tv/tv_program/program_creator
65 | 64 /business/business_operation/operating_income./measurement_unit/dated_money_value/currency
66 | 65 /ice_hockey/hockey_team/current_roster./sports/sports_team_roster/position
67 | 66 /film/film/prequel
68 | 67 /organization/endowed_organization/endowment./measurement_unit/dated_money_value/currency
69 | 68 /film/film_set_designer/film_sets_designed
70 | 69 /film/film/film_art_direction_by
71 | 70 /language/human_language/countries_spoken_in
72 | 71 /people/marriage_union_type/unions_of_this_type./people/marriage/location_of_ceremony
73 | 72 /tv/tv_writer/tv_programs./tv/tv_program_writer_relationship/tv_program
74 | 73 /government/political_party/politicians_in_this_party./government/political_party_tenure/politician
75 | 74 /sports/sports_team/roster./american_football/football_historical_roster_position/position_s
76 | 75 /film/film/release_date_s./film/film_regional_release_date/film_release_region
77 | 76 /film/film/release_date_s./film/film_regional_release_date/film_regional_debut_venue
78 | 77 /award/award_winning_work/awards_won./award/award_honor/honored_for
79 | 78 /location/capital_of_administrative_division/capital_of./location/administrative_division_capital_relationship/administrative_division
80 | 79 /location/hud_foreclosure_area/estimated_number_of_mortgages./measurement_unit/dated_integer/source
81 | 80 /award/award_category/winners./award/award_honor/ceremony
82 | 81 /people/person/languages
83 | 82 /film/actor/film./film/performance/film
84 | 83 /business/business_operation/revenue./measurement_unit/dated_money_value/currency
85 | 84 /base/petbreeds/city_with_dogs/top_breeds./base/petbreeds/dog_city_relationship/dog_breed
86 | 85 /sports/sports_team_location/teams
87 | 86 /film/film/music
88 | 87 /sports/professional_sports_team/draft_picks./sports/sports_league_draft_pick/draft
89 | 88 /education/educational_institution/students_graduates./education/education/major_field_of_study
90 | 89 /people/ethnicity/geographic_distribution
91 | 90 /sports/sports_league/teams./sports/sports_league_participation/team
92 | 91 /education/educational_degree/people_with_this_degree./education/education/student
93 | 92 /government/politician/government_positions_held./government/government_position_held/jurisdiction_of_office
94 | 93 /base/aareas/schema/administrative_area/capital
95 | 94 /film/film/film_production_design_by
96 | 95 /user/jg/default_domain/olympic_games/sports
97 | 96 /award/award_category/category_of
98 | 97 /education/educational_institution/school_type
99 | 98 /sports/sports_team/roster./baseball/baseball_roster_position/position
100 | 99 /tv/tv_producer/programs_produced./tv/tv_producer_term/program
101 | 100 /location/us_county/county_seat
102 | 101 /education/university/fraternities_and_sororities
103 | 102 /film/film/other_crew./film/film_crew_gig/crewmember
104 | 103 /military/military_conflict/combatants./military/military_combatant_group/combatants
105 | 104 /base/popstra/celebrity/canoodled./base/popstra/canoodled/participant
106 | 105 /education/educational_degree/people_with_this_degree./education/education/institution
107 | 106 /organization/organization/child./organization/organization_relationship/child
108 | 107 /travel/travel_destination/how_to_get_here./travel/transportation/mode_of_transportation
109 | 108 /award/award_category/nominees./award/award_nomination/nominated_for
110 | 109 /medicine/symptom/symptom_of
111 | 110 /people/ethnicity/people
112 | 111 /film/film/other_crew./film/film_crew_gig/film_crew_role
113 | 112 /government/governmental_body/members./government/government_position_held/legislative_sessions
114 | 113 /business/business_operation/industry
115 | 114 /film/film/country
116 | 115 /people/profession/specialization_of
117 | 116 /location/hud_county_place/place
118 | 117 /organization/role/leaders./organization/leadership/organization
119 | 118 /music/instrument/instrumentalists
120 | 119 /time/event/locations
121 | 120 /film/film/produced_by
122 | 121 /music/performance_role/track_performances./music/track_contribution/role
123 | 122 /film/film/runtime./film/film_cut/film_release_region
124 | 123 /olympics/olympic_sport/athletes./olympics/olympic_athlete_affiliation/country
125 | 124 /tv/tv_program/regular_cast./tv/regular_tv_appearance/actor
126 | 125 /award/award_nominee/award_nominations./award/award_nomination/award
127 | 126 /people/person/spouse_s./people/marriage/type_of_union
128 | 127 /film/actor/dubbing_performances./film/dubbing_performance/language
129 | 128 /sports/sports_position/players./sports/sports_team_roster/team
130 | 129 /award/award_ceremony/awards_presented./award/award_honor/honored_for
131 | 130 /sports/sports_team/sport
132 | 131 /tv/tv_program/country_of_origin
133 | 132 /award/award_category/disciplines_or_subjects
134 | 133 /base/popstra/celebrity/friendship./base/popstra/friendship/participant
135 | 134 /people/ethnicity/languages_spoken
136 | 135 /tv/tv_program/genre
137 | 136 /education/educational_degree/people_with_this_degree./education/education/major_field_of_study
138 | 137 /people/person/sibling_s./people/sibling_relationship/sibling
139 | 138 /business/business_operation/assets./measurement_unit/dated_money_value/currency
140 | 139 /olympics/olympic_games/medals_awarded./olympics/olympic_medal_honor/medal
141 | 140 /film/film/edited_by
142 | 141 /film/actor/film./film/performance/special_performance_type
143 | 142 /education/educational_institution_campus/educational_institution
144 | 143 /film/film/written_by
145 | 144 /sports/sports_position/players./sports/sports_team_roster/position
146 | 145 /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/service_location
147 | 146 /film/film/personal_appearances./film/personal_film_appearance/person
148 | 147 /user/tsegaran/random/taxonomy_subject/entry./user/tsegaran/random/taxonomy_entry/taxonomy
149 | 148 /people/person/gender
150 | 149 /people/deceased_person/place_of_death
151 | 150 /location/statistical_region/rent50_2./measurement_unit/dated_money_value/currency
152 | 151 /music/performance_role/guest_performances./music/recording_contribution/performance_role
153 | 152 /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/medal
154 | 153 /dataworld/gardening_hint/split_to
155 | 154 /location/country/capital
156 | 155 /award/award_winning_work/awards_won./award/award_honor/award
157 | 156 /tv/tv_program/tv_producer./tv/tv_producer_term/producer_type
158 | 157 /base/biblioness/bibs_location/state
159 | 158 /influence/influence_node/peers./influence/peer_relationship/peers
160 | 159 /film/film/story_by
161 | 160 /location/administrative_division/first_level_division_of
162 | 161 /baseball/baseball_team/team_stats./baseball/baseball_team_stats/season
163 | 162 /award/hall_of_fame/inductees./award/hall_of_fame_induction/inductee
164 | 163 /sports/sports_team/roster./american_football/football_roster_position/position
165 | 164 /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/service_language
166 | 165 /sports/sports_position/players./american_football/football_historical_roster_position/position_s
167 | 166 /media_common/netflix_genre/titles
168 | 167 /people/person/spouse_s./people/marriage/spouse
169 | 168 /people/cause_of_death/people
170 | 169 /organization/organization_founder/organizations_founded
171 | 170 /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office
172 | 171 /tv/tv_program/languages
173 | 172 /base/popstra/location/vacationers./base/popstra/vacation_choice/vacationer
174 | 173 /influence/influence_node/influenced_by
175 | 174 /location/country/second_level_divisions
176 | 175 /sports/sport/pro_athletes./sports/pro_sports_played/athlete
177 | 176 /government/legislative_session/members./government/government_position_held/district_represented
178 | 177 /olympics/olympic_sport/athletes./olympics/olympic_athlete_affiliation/olympics
179 | 178 /medicine/disease/risk_factors
180 | 179 /award/award_ceremony/awards_presented./award/award_honor/award_winner
181 | 180 /american_football/football_team/current_roster./sports/sports_team_roster/position
182 | 181 /music/artist/contribution./music/recording_contribution/performance_role
183 | 182 /education/educational_institution/campuses
184 | 183 /location/country/form_of_government
185 | 184 /base/marchmadness/ncaa_basketball_tournament/seeds./base/marchmadness/ncaa_tournament_seed/team
186 | 185 /education/field_of_study/students_majoring./education/education/major_field_of_study
187 | 186 /people/person/nationality
188 | 187 /film/film/release_date_s./film/film_regional_release_date/film_release_distribution_medium
189 | 188 /film/film/film_format
190 | 189 /soccer/football_player/current_team./sports/sports_team_roster/team
191 | 190 /government/politician/government_positions_held./government/government_position_held/legislative_sessions
192 | 191 /film/film/cinematography
193 | 192 /people/deceased_person/place_of_burial
194 | 193 /base/aareas/schema/administrative_area/administrative_parent
195 | 194 /music/genre/parent_genre
196 | 195 /sports/sports_league_draft/picks./sports/sports_league_draft_pick/school
197 | 196 /location/statistical_region/religions./location/religion_percentage/religion
198 | 197 /location/location/time_zones
199 | 198 /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/olympics
200 | 199 /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film
201 | 200 /film/film/dubbing_performances./film/dubbing_performance/actor
202 | 201 /organization/organization/headquarters./location/mailing_address/citytown
203 | 202 /sports/pro_athlete/teams./sports/sports_team_roster/team
204 | 203 /education/university/local_tuition./measurement_unit/dated_money_value/currency
205 | 204 /music/record_label/artist
206 | 205 /business/job_title/people_with_this_title./business/employment_tenure/company
207 | 206 /music/instrument/family
208 | 207 /user/alexander/philosophy/philosopher/interests
209 | 208 /location/statistical_region/gdp_real./measurement_unit/adjusted_money_value/adjustment_currency
210 | 209 /tv/non_character_role/tv_regular_personal_appearances./tv/tv_regular_personal_appearance/person
211 | 210 /location/hud_county_place/county
212 | 211 /government/politician/government_positions_held./government/government_position_held/basic_title
213 | 212 /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/contact_category
214 | 213 /people/person/religion
215 | 214 /education/university/domestic_tuition./measurement_unit/dated_money_value/currency
216 | 215 /award/award_nominee/award_nominations./award/award_nomination/award_nominee
217 | 216 /music/performance_role/regular_performances./music/group_membership/group
218 | 217 /education/university/international_tuition./measurement_unit/dated_money_value/currency
219 | 218 /film/film/film_festivals
220 | 219 /location/statistical_region/gdp_nominal./measurement_unit/dated_money_value/currency
221 | 220 /base/saturdaynightlive/snl_cast_member/seasons./base/saturdaynightlive/snl_season_tenure/cast_members
222 | 221 /education/field_of_study/students_majoring./education/education/student
223 | 222 /location/statistical_region/gni_per_capita_in_ppp_dollars./measurement_unit/dated_money_value/currency
224 | 223 /base/localfood/seasonal_month/produce_available./base/localfood/produce_availability/seasonal_months
225 | 224 /film/film_subject/films
226 | 225 /soccer/football_team/current_roster./sports/sports_team_roster/position
227 | 226 /location/location/partially_contains
228 | 227 /celebrities/celebrity/sexual_relationships./celebrities/romantic_relationship/celebrity
229 | 228 /people/person/spouse_s./people/marriage/location_of_ceremony
230 | 229 /base/culturalevent/event/entity_involved
231 | 230 /organization/organization_member/member_of./organization/organization_membership/organization
232 | 231 /base/locations/continents/countries_within
233 | 232 /location/country/official_language
234 | 233 /film/film/production_companies
235 | 234 /base/schemastaging/person_extra/net_worth./measurement_unit/dated_money_value/currency
236 | 235 /medicine/disease/notable_people_with_this_condition
237 | 236 /film/person_or_entity_appearing_in_film/films./film/personal_film_appearance/type_of_appearance
238 |
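
The listing above maps integer ids 0 through 236 to relation names, one pair per line. A minimal sketch for loading such a file into lookup tables follows. The function name `load_relation_dict` is hypothetical, and the code assumes the id and the name are separated by whitespace and that relation names contain no spaces.

```python
from typing import Dict, Iterable, Tuple


def load_relation_dict(lines: Iterable[str]) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Build relation->id and id->relation maps from 'id<whitespace>relation' lines."""
    rel2id: Dict[str, int] = {}
    id2rel: Dict[int, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        index_text, name = line.split(None, 1)
        index = int(index_text)
        rel2id[name] = index
        id2rel[index] = name
    return rel2id, id2rel
```

With the file on disk, `load_relation_dict(open("data/FB15K237/relations.dict", encoding="utf-8"))` would return both maps, and `len(rel2id)` gives the number of distinct relations.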
--------------------------------------------------------------------------------
/data/WN18RR_v1_ind/test.txt:
--------------------------------------------------------------------------------
1 | 00445169 _similar_to 00444519
2 | 02666239 _derivationally_related_form 01410363
3 | 03420559 _derivationally_related_form 01087197
4 | 01149494 _also_see 01364008
5 | 00233335 _derivationally_related_form 05162455
6 | 10341660 _derivationally_related_form 02987454
7 | 03354613 _derivationally_related_form 01340439
8 | 00088481 _derivationally_related_form 02272549
9 | 02979662 _derivationally_related_form 01662771
10 | 01021128 _derivationally_related_form 05925366
11 | 02512305 _derivationally_related_form 01153548
12 | 00456740 _derivationally_related_form 07369604
13 | 01292885 _derivationally_related_form 07976936
14 | 01410363 _derivationally_related_form 04750164
15 | 00082308 _derivationally_related_form 03075768
16 | 07520612 _derivationally_related_form 01780941
17 | 05849789 _derivationally_related_form 02630189
18 | 08612786 _derivationally_related_form 01276361
19 | 00847340 _derivationally_related_form 01428853
20 | 02509287 _derivationally_related_form 00808182
21 | 00527572 _derivationally_related_form 13491060
22 | 04750164 _derivationally_related_form 01410363
23 | 00267349 _derivationally_related_form 01590171
24 | 00915722 _derivationally_related_form 01742726
25 | 07423001 _hypernym 07355887
26 | 02728440 _derivationally_related_form 00046534
27 | 01167146 _derivationally_related_form 01612053
28 | 03600977 _hypernym 03605915
29 | 03264542 _hypernym 08592656
30 | 02443049 _derivationally_related_form 01135529
31 | 01779165 _derivationally_related_form 04143712
32 | 01779165 _derivationally_related_form 07520612
33 | 01753596 _derivationally_related_form 09972157
34 | 00922438 _hypernym 00921738
35 | 00443384 _derivationally_related_form 14110411
36 | 00064095 _derivationally_related_form 03879854
37 | 00149084 _derivationally_related_form 01285440
38 | 03779621 _derivationally_related_form 01662771
39 | 00353782 _derivationally_related_form 00429060
40 | 04659287 _derivationally_related_form 01026262
41 | 10371741 _derivationally_related_form 00752335
42 | 01662771 _derivationally_related_form 13913566
43 | 01240432 _hypernym 01240210
44 | 05085572 _derivationally_related_form 00444519
45 | 09941964 _derivationally_related_form 02441022
46 | 00082308 _derivationally_related_form 00354884
47 | 00429060 _derivationally_related_form 00359903
48 | 00444519 _derivationally_related_form 05085572
49 | 00751887 _derivationally_related_form 09941964
50 | 00201923 _derivationally_related_form 00462092
51 | 01130607 _derivationally_related_form 03878963
52 | 01059400 _also_see 02095311
53 | 04713332 _hypernym 04712735
54 | 13999663 _derivationally_related_form 01301410
55 | 03391301 _derivationally_related_form 01586850
56 | 00290740 _derivationally_related_form 00351638
57 | 00833702 _derivationally_related_form 00893955
58 | 01340439 _derivationally_related_form 10080337
59 | 01780941 _hypernym 01779165
60 | 01474513 _also_see 02451113
61 | 00119873 _derivationally_related_form 02987454
62 | 10078806 _derivationally_related_form 01739814
63 | 00795008 _derivationally_related_form 00047317
64 | 10012815 _derivationally_related_form 00650353
65 | 15224293 _derivationally_related_form 00233335
66 | 01687569 _derivationally_related_form 01159964
67 | 02447001 _derivationally_related_form 10298912
68 | 02659763 _derivationally_related_form 04930307
69 | 01819554 _derivationally_related_form 10525134
70 | 09800249 _hypernym 09952163
71 | 02506555 _also_see 02064745
72 | 02700104 _derivationally_related_form 00552841
73 | 00083334 _derivationally_related_form 00149084
74 | 01364008 _also_see 01368192
75 | 01667449 _derivationally_related_form 04033995
76 | 07366289 _derivationally_related_form 02661252
77 | 00595146 _derivationally_related_form 10164233
78 | 01340439 _hypernym 01296462
79 | 02661252 _derivationally_related_form 04802776
80 | 02502536 _derivationally_related_form 00320852
81 | 00898804 _derivationally_related_form 01697027
82 | 03600977 _derivationally_related_form 02660147
83 | 13860793 _derivationally_related_form 00445169
84 | 01643464 _derivationally_related_form 10029068
85 | 07355887 _derivationally_related_form 00152887
86 | 00751887 _derivationally_related_form 09941383
87 | 00482893 _derivationally_related_form 00185104
88 | 00456740 _derivationally_related_form 05696020
89 | 00233335 _derivationally_related_form 15224293
90 | 10093908 _derivationally_related_form 02702830
91 | 09442838 _derivationally_related_form 00709625
92 | 03496892 _hypernym 03322940
93 | 02646931 _derivationally_related_form 14442530
94 | 01834304 _derivationally_related_form 00410247
95 | 01531375 _also_see 01508719
96 | 00299580 _derivationally_related_form 07369604
97 | 00233335 _derivationally_related_form 10525134
98 | 00046534 _also_see 00044149
99 | 08592656 _hypernym 08512259
100 | 00087152 _also_see 01922763
101 | 02875013 _derivationally_related_form 01467370
102 | 02806907 _derivationally_related_form 06806469
103 | 02441022 _derivationally_related_form 09941964
104 | 00764902 _derivationally_related_form 01205827
105 | 14441825 _derivationally_related_form 00791227
106 | 00651991 _derivationally_related_form 05748285
107 | 07254057 _hypernym 07253637
108 | 01582645 _derivationally_related_form 03234306
109 | 02542280 _derivationally_related_form 00696518
110 | 04181228 _derivationally_related_form 01085474
111 | 00859325 _derivationally_related_form 04630689
112 | 09941571 _hypernym 09943541
113 | 03573282 _derivationally_related_form 00187526
114 | 13860793 _hypernym 00027807
115 | 00413876 _derivationally_related_form 01051331
116 | 07515560 _derivationally_related_form 01922763
117 | 08552138 _derivationally_related_form 02512150
118 | 00462092 _derivationally_related_form 01070892
119 | 00290740 _derivationally_related_form 00355252
120 | 09779790 _derivationally_related_form 00245457
121 | 01467370 _derivationally_related_form 08512736
122 | 01753596 _derivationally_related_form 03178782
123 | 01428853 _derivationally_related_form 10300303
124 | 07337390 _derivationally_related_form 01876907
125 | 00236289 _derivationally_related_form 04181228
126 | 01551871 _verb_group 01684337
127 | 00764902 _derivationally_related_form 00759551
128 | 09952163 _derivationally_related_form 01765392
129 | 09941571 _derivationally_related_form 00590626
130 | 02700104 _derivationally_related_form 04713118
131 | 01029852 _derivationally_related_form 07202579
132 | 05117660 _hypernym 05093890
133 | 00236289 _hypernym 00233335
134 | 10388440 _derivationally_related_form 02539334
135 | 09779790 _derivationally_related_form 01104406
136 | 01782218 _hypernym 01780202
137 | 03051540 _derivationally_related_form 00050652
138 | 03792334 _derivationally_related_form 01660640
139 | 00661213 _derivationally_related_form 05748786
140 | 01146039 _derivationally_related_form 02553697
141 | 09812338 _derivationally_related_form 02991122
142 | 08612786 _hypernym 08512259
143 | 13971561 _derivationally_related_form 00764902
144 | 07519253 _derivationally_related_form 01779165
145 | 00100044 _derivationally_related_form 00893955
146 | 02204692 _derivationally_related_form 10389398
147 | 00169651 _hypernym 00170844
148 | 03670849 _has_part 02845576
149 | 07177437 _derivationally_related_form 00482893
150 | 05696020 _derivationally_related_form 00456740
151 | 06526291 _derivationally_related_form 10402417
152 | 06773976 _derivationally_related_form 01647867
153 | 13969243 _derivationally_related_form 02700104
154 | 03852280 _hypernym 03574816
155 | 05641959 _derivationally_related_form 00597385
156 | 04748836 _derivationally_related_form 00119524
157 | 01684337 _derivationally_related_form 04157320
158 | 00709625 _derivationally_related_form 00928077
159 | 00321956 _derivationally_related_form 00187526
160 | 00047745 _derivationally_related_form 03051540
161 | 04085873 _hypernym 03315644
162 | 04641153 _hypernym 04640927
163 | 07254057 _derivationally_related_form 01781180
164 | 05844105 _derivationally_related_form 01687569
165 | 10525134 _derivationally_related_form 01301051
166 | 14442530 _hypernym 14441825
167 | 00650016 _derivationally_related_form 10012815
168 | 00696518 _also_see 02564986
169 | 01697027 _derivationally_related_form 00898804
170 | 01301410 _derivationally_related_form 13998781
171 | 10388924 _derivationally_related_form 00809465
172 | 03779621 _derivationally_related_form 01697027
173 | 00152887 _derivationally_related_form 07355887
174 | 08677628 _derivationally_related_form 02695895
175 | 01612053 _also_see 01123148
176 | 05162455 _hypernym 05161614
177 | 00044149 _verb_group 00044797
178 | 02539334 _derivationally_related_form 00791227
179 | 00040962 _hypernym 00040804
180 | 04051825 _derivationally_related_form 01128071
181 | 04905188 _derivationally_related_form 01026262
182 | 01320009 _derivationally_related_form 00921790
183 | 00354884 _derivationally_related_form 01815185
184 | 03721797 _derivationally_related_form 00921738
185 | 02064745 _derivationally_related_form 02666239
186 | 13491060 _derivationally_related_form 00527572
187 | 02928413 _derivationally_related_form 01498713
188 | 00224901 _derivationally_related_form 09476521
189 |
--------------------------------------------------------------------------------
/data/WN18RR_v1_ind/valid.txt:
--------------------------------------------------------------------------------
1 | 09953178 _hypernym 09931640
2 | 01027263 _derivationally_related_form 00299580
3 | 03728811 _derivationally_related_form 01292885
4 | 04928903 _derivationally_related_form 01687569
5 | 01301051 _derivationally_related_form 10525134
6 | 09273291 _derivationally_related_form 02711114
7 | 13969700 _derivationally_related_form 01765392
8 | 00590626 _derivationally_related_form 09780828
9 | 13903079 _derivationally_related_form 01466978
10 | 00050652 _hypernym 00046534
11 | 10029068 _derivationally_related_form 00935940
12 | 10676877 _derivationally_related_form 02443049
13 | 02420232 _derivationally_related_form 10078806
14 | 01256157 _derivationally_related_form 10566072
15 | 00151689 _derivationally_related_form 05111835
16 | 04770911 _derivationally_related_form 01876907
17 | 00262703 _derivationally_related_form 03745285
18 | 00708017 _similar_to 00709625
19 | 02695895 _hypernym 02694933
20 | 03051540 _derivationally_related_form 00047745
21 | 01291069 _derivationally_related_form 00145218
22 | 13905792 _derivationally_related_form 01276361
23 | 03878963 _derivationally_related_form 01130607
24 | 00898804 _derivationally_related_form 01743784
25 | 10448983 _derivationally_related_form 00752335
26 | 00082308 _derivationally_related_form 14445379
27 | 00921790 _derivationally_related_form 01320009
28 | 03792048 _derivationally_related_form 01660640
29 | 10525134 _derivationally_related_form 01301410
30 | 01711749 _derivationally_related_form 08664443
31 | 10525134 _derivationally_related_form 00233335
32 | 01693881 _derivationally_related_form 03104594
33 | 02928413 _hypernym 03600977
34 | 03104594 _derivationally_related_form 01693881
35 | 10529231 _derivationally_related_form 02204692
36 | 10689564 _derivationally_related_form 04160372
37 | 13454318 _derivationally_related_form 01742726
38 | 03265479 _hypernym 02875013
39 | 05844105 _derivationally_related_form 10155849
40 | 03932670 _hypernym 03932203
41 | 00083809 _synset_domain_topic_of 00612160
42 | 00151689 _derivationally_related_form 13458571
43 | 01069190 _verb_group 01069391
44 | 01612053 _derivationally_related_form 02542795
45 | 04644512 _derivationally_related_form 02564986
46 | 01647867 _derivationally_related_form 13970236
47 | 00321956 _derivationally_related_form 01580467
48 | 03257343 _derivationally_related_form 01735308
49 | 01410905 _derivationally_related_form 04750164
50 | 00919513 _derivationally_related_form 01567275
51 | 00043683 _derivationally_related_form 02728440
52 | 00764902 _derivationally_related_form 01026262
53 | 04748836 _derivationally_related_form 00651991
54 | 10160412 _derivationally_related_form 00482473
55 | 01739814 _derivationally_related_form 00916464
56 | 13998781 _derivationally_related_form 01301410
57 | 01363613 _also_see 01148283
58 | 03282060 _has_part 04085873
59 | 03792048 _derivationally_related_form 01660386
60 | 00730301 _derivationally_related_form 08512736
61 | 13085864 _derivationally_related_form 01741446
62 | 06998748 _derivationally_related_form 09812338
63 | 00933566 _derivationally_related_form 05117660
64 | 07369604 _derivationally_related_form 00299580
65 | 05902327 _derivationally_related_form 01743784
66 | 07369604 _derivationally_related_form 00300537
67 | 01151110 _hypernym 01987160
68 | 01135529 _derivationally_related_form 02443049
69 | 00696882 _derivationally_related_form 00082714
70 | 00233335 _derivationally_related_form 05846355
71 | 09941964 _derivationally_related_form 00751887
72 | 01743784 _derivationally_related_form 05902327
73 | 00084230 _derivationally_related_form 00612160
74 | 01020936 _hypernym 01019524
75 | 10529231 _derivationally_related_form 02203362
76 | 03777283 _derivationally_related_form 01697406
77 | 01167146 _derivationally_related_form 02542795
78 | 02657219 _derivationally_related_form 03728811
79 | 03327234 _derivationally_related_form 01588134
80 | 01020936 _derivationally_related_form 01742886
81 | 10668450 _hypernym 10525134
82 | 00796047 _derivationally_related_form 06893885
83 | 04613158 _derivationally_related_form 01492052
84 | 04905842 _derivationally_related_form 02388145
85 | 00765213 _derivationally_related_form 09800249
86 | 04463273 _hypernym 03234306
87 | 04930307 _derivationally_related_form 02659763
88 | 01662771 _derivationally_related_form 03779370
89 | 13998576 _derivationally_related_form 02711114
90 | 01813884 _derivationally_related_form 07527352
91 | 03386011 _derivationally_related_form 01606205
92 | 01052853 _derivationally_related_form 01613239
93 | 14442530 _derivationally_related_form 02646931
94 | 01640550 _derivationally_related_form 09972157
95 | 00267349 _hypernym 00266806
96 | 01711749 _derivationally_related_form 08677628
97 | 10155849 _derivationally_related_form 05844105
98 | 01159964 _derivationally_related_form 01687569
99 | 01662771 _derivationally_related_form 00909899
100 | 01780941 _derivationally_related_form 01222666
101 | 13913566 _hypernym 13860793
102 | 00761713 _derivationally_related_form 10351874
103 | 00909363 _also_see 01149494
104 | 01711749 _derivationally_related_form 05075602
105 | 02899439 _hypernym 03673971
106 | 07369604 _derivationally_related_form 00482893
107 | 01248191 _derivationally_related_form 02502536
108 | 05902327 _derivationally_related_form 01683582
109 | 10317007 _hypernym 10582746
110 | 00764902 _derivationally_related_form 07151122
111 | 03322099 _derivationally_related_form 02420232
112 | 02388145 _derivationally_related_form 04905842
113 | 00482473 _hypernym 00296178
114 | 07527352 _derivationally_related_form 01363613
115 | 01765392 _derivationally_related_form 01151407
116 | 00730499 _derivationally_related_form 08592656
117 | 01051331 _derivationally_related_form 01496630
118 | 14441825 _derivationally_related_form 02646931
119 | 00047317 _derivationally_related_form 00795008
120 | 02512305 _derivationally_related_form 10012815
121 | 07537068 _hypernym 07532440
122 | 10093908 _derivationally_related_form 00300537
123 | 05937112 _derivationally_related_form 02723733
124 | 04433185 _derivationally_related_form 01285440
125 | 00651991 _derivationally_related_form 07270179
126 | 02389346 _derivationally_related_form 00145218
127 | 01819554 _derivationally_related_form 01222477
128 | 00409211 _derivationally_related_form 02443849
129 | 00083809 _derivationally_related_form 00671351
130 | 02671279 _derivationally_related_form 13321495
131 | 01224744 _derivationally_related_form 10378412
132 | 01291069 _hypernym 01354673
133 | 10378780 _hypernym 09882007
134 | 09476521 _derivationally_related_form 00290740
135 | 03285912 _derivationally_related_form 02711114
136 | 00150287 _derivationally_related_form 09957614
137 | 05765415 _derivationally_related_form 02806907
138 | 00915722 _hypernym 00913705
139 | 01190884 _hypernym 01187810
140 | 13427078 _derivationally_related_form 00150287
141 | 06791372 _derivationally_related_form 02296984
142 | 01922763 _also_see 01740892
143 | 00119074 _derivationally_related_form 04748836
144 | 07066659 _derivationally_related_form 10155849
145 | 02003725 _derivationally_related_form 00236592
146 | 00916464 _has_part 00921790
147 | 01765392 _derivationally_related_form 07515790
148 | 01222477 _derivationally_related_form 01819554
149 | 00364479 _also_see 01368192
150 | 01684337 _verb_group 01551871
151 | 01148283 _also_see 00999817
152 | 01340439 _derivationally_related_form 00147595
153 | 01922763 _derivationally_related_form 07515560
154 | 00815644 _derivationally_related_form 01150559
155 | 00462092 _derivationally_related_form 04361641
156 | 01876907 _derivationally_related_form 00348571
157 | 07366627 _hypernym 07366289
158 | 05846355 _derivationally_related_form 00235368
159 | 09956578 _derivationally_related_form 00462092
160 | 00650016 _derivationally_related_form 05748054
161 | 03091374 _derivationally_related_form 01354673
162 | 00751887 _hypernym 02539334
163 | 02840361 _hypernym 03496892
164 | 00300537 _derivationally_related_form 04930307
165 | 08592656 _derivationally_related_form 00730499
166 | 01624568 _derivationally_related_form 13913566
167 | 04630689 _derivationally_related_form 00859153
168 | 02991122 _derivationally_related_form 02743547
169 | 01148283 _also_see 00362467
170 | 06003682 _hypernym 06000644
171 | 05198036 _derivationally_related_form 10388440
172 | 02372326 _derivationally_related_form 00040152
173 | 00795008 _derivationally_related_form 02659763
174 | 10645611 _hypernym 10676877
175 | 07527352 _derivationally_related_form 01813884
176 | 03779370 _derivationally_related_form 01697027
177 | 13489037 _derivationally_related_form 00245457
178 | 01051331 _derivationally_related_form 02333689
179 | 02991122 _derivationally_related_form 09812338
180 | 10689564 _hypernym 10120816
181 | 05844105 _derivationally_related_form 01666894
182 | 02539359 _derivationally_related_form 00047745
183 | 03932203 _derivationally_related_form 01656788
184 | 00696518 _also_see 01612053
185 | 01492052 _derivationally_related_form 04612840
186 |
--------------------------------------------------------------------------------
/data/WN18RR_v2_ind/test.txt:
--------------------------------------------------------------------------------
1 | 08858942 _has_part 08890097
2 | 01725712 _derivationally_related_form 07480896
3 | 02542280 _derivationally_related_form 01203676
4 | 10515194 _derivationally_related_form 05945508
5 | 01958615 _derivationally_related_form 00299217
6 | 01398212 _hypernym 14989820
7 | 02112891 _hypernym 02112029
8 | 04695963 _derivationally_related_form 01537409
9 | 02491383 _derivationally_related_form 10526096
10 | 10489944 _hypernym 10707233
11 | 01856225 _member_meronym 01856553
12 | 05748786 _derivationally_related_form 02666882
13 | 00812526 _derivationally_related_form 01572978
14 | 01949110 _hypernym 01955984
15 | 02064131 _derivationally_related_form 13774404
16 | 09334396 _derivationally_related_form 01502762
17 | 09688008 _derivationally_related_form 02957823
18 | 05715864 _derivationally_related_form 02194495
19 | 04926427 _hypernym 04924103
20 | 14798450 _derivationally_related_form 02627221
21 | 01098869 _derivationally_related_form 01156438
22 | 01926984 _verb_group 02099829
23 | 02964389 _derivationally_related_form 01612084
24 | 07635155 _hypernym 07628870
25 | 13291189 _derivationally_related_form 02543874
26 | 00753428 _hypernym 00752493
27 | 01144657 _derivationally_related_form 00452293
28 | 01131043 _also_see 01125429
29 | 01073241 _derivationally_related_form 01182293
30 | 01074650 _also_see 02530861
31 | 10488016 _hypernym 10632576
32 | 02632567 _derivationally_related_form 14493426
33 | 08277805 _derivationally_related_form 09759311
34 | 05050379 _hypernym 05050115
35 | 00470084 _derivationally_related_form 00233386
36 | 02666943 _derivationally_related_form 01322854
37 | 08873622 _has_part 08597023
38 | 01781983 _derivationally_related_form 14405931
39 | 00299217 _derivationally_related_form 01958615
40 | 06220616 _hypernym 06212839
41 | 01904293 _derivationally_related_form 09281777
42 | 00481739 _derivationally_related_form 06667317
43 | 01182293 _derivationally_related_form 04638585
44 | 02046755 _derivationally_related_form 13878112
45 | 07813107 _derivationally_related_form 00213353
46 | 00467717 _derivationally_related_form 05924920
47 | 09759311 _derivationally_related_form 02669885
48 | 00409211 _derivationally_related_form 02443849
49 | 01775535 _hypernym 01775164
50 | 10672662 _derivationally_related_form 05186306
51 | 05707146 _derivationally_related_form 00614999
52 | 02519991 _derivationally_related_form 00259643
53 | 01845627 _member_meronym 01855672
54 | 01390616 _derivationally_related_form 00113113
55 | 14299070 _derivationally_related_form 00091124
56 | 01955508 _derivationally_related_form 00121645
57 | 01389329 _derivationally_related_form 00616083
58 | 00596393 _derivationally_related_form 10464542
59 | 00601822 _derivationally_related_form 00180770
60 | 02530861 _also_see 00853776
61 | 00560893 _derivationally_related_form 00358931
62 | 01856748 _hypernym 01507175
63 | 08877208 _instance_hypernym 08633957
64 | 06682794 _derivationally_related_form 01955127
65 | 13855627 _derivationally_related_form 00661213
66 | 00838367 _derivationally_related_form 01179865
67 | 03150232 _derivationally_related_form 01522276
68 | 01073822 _derivationally_related_form 04993413
69 | 01944692 _derivationally_related_form 02858304
70 | 00350889 _hypernym 00350461
71 | 02478059 _derivationally_related_form 14455700
72 | 07238102 _derivationally_related_form 02677332
73 | 09424489 _derivationally_related_form 01816431
74 | 02887209 _hypernym 04336034
75 | 13580723 _derivationally_related_form 01193721
76 | 10093658 _derivationally_related_form 01140794
77 | 00330160 _derivationally_related_form 00438178
78 | 13552270 _derivationally_related_form 00239614
79 | 01224744 _verb_group 00597385
80 | 05219724 _has_part 05514905
81 | 01354006 _derivationally_related_form 14705718
82 | 01234345 _derivationally_related_form 00421535
83 | 04576211 _has_part 04574999
84 | 02270165 _derivationally_related_form 10330189
85 | 00044673 _derivationally_related_form 00417001
86 | 02250625 _derivationally_related_form 13282550
87 | 00891216 _derivationally_related_form 13344804
88 | 00182213 _derivationally_related_form 02461314
89 | 14622893 _has_part 14619225
90 | 05238282 _has_part 05244934
91 | 05174653 _derivationally_related_form 02519991
92 | 13282550 _derivationally_related_form 02519991
93 | 03933529 _derivationally_related_form 02085742
94 | 00366547 _derivationally_related_form 00357680
95 | 02519991 _derivationally_related_form 13290676
96 | 01493897 _derivationally_related_form 14425974
97 | 04828255 _derivationally_related_form 01782519
98 | 08277805 _hypernym 08276720
99 | 01406356 _verb_group 01406512
100 | 01487311 _hypernym 01488956
101 | 04980656 _derivationally_related_form 01053144
102 | 15129927 _hypernym 05816790
103 | 01537409 _derivationally_related_form 05244934
104 | 02531422 _also_see 00856860
105 | 05003090 _derivationally_related_form 01882170
106 | 02125641 _derivationally_related_form 05714466
107 | 00113113 _derivationally_related_form 01754105
108 | 00695523 _also_see 01475282
109 | 00658052 _derivationally_related_form 01009871
110 | 06561942 _derivationally_related_form 00844298
111 | 02235666 _verb_group 02537407
112 | 13874073 _derivationally_related_form 00417001
113 | 14971519 _derivationally_related_form 00238867
114 | 02291708 _synset_domain_topic_of 13333237
115 | 01150467 _hypernym 01150200
116 | 13282550 _derivationally_related_form 02253456
117 | 13720096 _hypernym 13716084
118 | 01167780 _derivationally_related_form 03200357
119 | 01475282 _also_see 02451951
120 | 10330189 _derivationally_related_form 02269894
121 | 08612049 _has_part 08495617
122 | 00105778 _derivationally_related_form 09930876
123 | 10478960 _derivationally_related_form 02163301
124 | 00842989 _derivationally_related_form 09762385
125 | 01097031 _derivationally_related_form 08397255
126 | 00657728 _derivationally_related_form 00874977
127 | 02196690 _derivationally_related_form 14599641
128 | 01684337 _derivationally_related_form 00937656
129 | 00980908 _derivationally_related_form 06790042
130 | 00658052 _derivationally_related_form 06483454
131 | 05659365 _hypernym 05659621
132 | 01074650 _derivationally_related_form 10112591
133 | 02072501 _derivationally_related_form 13649791
134 | 01227137 _also_see 01370590
135 | 00800930 _hypernym 00685683
136 | 02250625 _derivationally_related_form 00259894
137 | 13573666 _hypernym 13575433
138 | 07551052 _derivationally_related_form 00859604
139 | 05527216 _has_part 05525628
140 | 00590148 _derivationally_related_form 09906986
141 | 09334396 _derivationally_related_form 01292727
142 | 00859604 _derivationally_related_form 01073241
143 | 00843468 _derivationally_related_form 06730780
144 | 09203827 _member_meronym 09316454
145 | 00239614 _verb_group 00238867
146 | 01958615 _synset_domain_topic_of 00450335
147 | 02700104 _derivationally_related_form 00552841
148 | 01960911 _derivationally_related_form 00442115
149 | 04236001 _hypernym 03895866
150 | 06685456 _derivationally_related_form 00890100
151 | 00365188 _verb_group 00365647
152 | 07450343 _hypernym 07447641
153 | 08871007 _has_part 08877208
154 | 00622584 _derivationally_related_form 01143838
155 | 06767777 _derivationally_related_form 01058880
156 | 01820302 _derivationally_related_form 07491981
157 | 10293332 _derivationally_related_form 01924505
158 | 09759311 _derivationally_related_form 08280124
159 | 02080577 _derivationally_related_form 02671880
160 | 00622266 _derivationally_related_form 01574292
161 | 04493505 _derivationally_related_form 02079525
162 | 03216828 _derivationally_related_form 01305731
163 | 00657550 _derivationally_related_form 05737153
164 | 13969243 _derivationally_related_form 02700104
165 | 09612291 _derivationally_related_form 00869596
166 | 01526956 _derivationally_related_form 03872495
167 | 02672540 _derivationally_related_form 00259643
168 | 02058794 _also_see 02523275
169 | 05524615 _hypernym 05525252
170 | 05014099 _derivationally_related_form 00328128
171 | 02191766 _derivationally_related_form 05715864
172 | 00812274 _derivationally_related_form 01216004
173 | 10298912 _derivationally_related_form 00595146
174 | 01216522 _derivationally_related_form 00812526
175 | 00299580 _derivationally_related_form 05755486
176 | 00085678 _derivationally_related_form 02274482
177 | 01547641 _verb_group 01547390
178 | 15183428 _hypernym 15157225
179 | 06200010 _derivationally_related_form 00350461
180 | 01027263 _derivationally_related_form 04659090
181 | 02085742 _derivationally_related_form 10655169
182 | 01158690 _derivationally_related_form 00467717
183 | 00851933 _derivationally_related_form 01224517
184 | 00259643 _derivationally_related_form 02519991
185 | 09826204 _derivationally_related_form 01941093
186 | 13876371 _derivationally_related_form 02738544
187 | 02120458 _derivationally_related_form 00513401
188 | 01524298 _hypernym 01524871
189 | 02653996 _derivationally_related_form 02945161
190 | 10058411 _derivationally_related_form 01820302
191 | 02834778 _has_part 04289690
192 | 02446164 _derivationally_related_form 00583461
193 | 00015303 _derivationally_related_form 00858849
194 | 00842989 _derivationally_related_form 07234230
195 | 00320486 _derivationally_related_form 02001858
196 | 00963241 _hypernym 00962129
197 | 00356790 _derivationally_related_form 01387786
198 | 14585519 _derivationally_related_form 00330144
199 | 01354405 _derivationally_related_form 04049405
200 | 15145586 _has_part 15146545
201 | 02163301 _derivationally_related_form 01135529
202 | 04854389 _hypernym 04827652
203 | 01385920 _derivationally_related_form 09307300
204 | 00937656 _derivationally_related_form 01551871
205 | 02326695 _also_see 02451951
206 | 13279262 _hypernym 13281275
207 | 01179865 _derivationally_related_form 07800091
208 | 13875970 _derivationally_related_form 00143204
209 | 01940403 _derivationally_related_form 00302394
210 | 00061290 _derivationally_related_form 01955127
211 | 01951276 _derivationally_related_form 13326198
212 | 00467717 _derivationally_related_form 13617952
213 | 02632567 _hypernym 02632353
214 | 00963283 _derivationally_related_form 00963241
215 | 01432601 _derivationally_related_form 10395073
216 | 01193099 _hypernym 01166351
217 | 01961691 _synset_domain_topic_of 00441824
218 | 00443231 _hypernym 00442115
219 | 13649791 _hypernym 13603305
220 | 05058140 _derivationally_related_form 00438178
221 | 01273263 _derivationally_related_form 02784732
222 | 01131043 _also_see 02037272
223 | 00136800 _derivationally_related_form 13446197
224 | 01510827 _derivationally_related_form 14008806
225 | 00683185 _also_see 01880531
226 | 00073828 _hypernym 00070965
227 | 01640850 _also_see 00666058
228 | 00658052 _derivationally_related_form 14429608
229 | 04565375 _derivationally_related_form 01087197
230 | 02123672 _derivationally_related_form 05713737
231 | 04046810 _hypernym 04048568
232 | 01958615 _synset_domain_topic_of 00299217
233 | 07480068 _hypernym 00026192
234 | 03933529 _derivationally_related_form 01305731
235 | 08894456 _has_part 08895928
236 | 01193099 _derivationally_related_form 01073655
237 | 02261464 _derivationally_related_form 08069878
238 | 01073655 _hypernym 01073241
239 | 01535709 _also_see 01640850
240 | 00854000 _derivationally_related_form 01226600
241 | 03000447 _derivationally_related_form 13575433
242 | 10527334 _hypernym 10503452
243 | 01840238 _derivationally_related_form 10096217
244 | 00890100 _derivationally_related_form 06685456
245 | 13282007 _derivationally_related_form 02249741
246 | 00014742 _derivationally_related_form 15273626
247 | 14798450 _hypernym 15010703
248 | 00208943 _derivationally_related_form 02405252
249 | 02667228 _hypernym 02666882
250 | 02332999 _derivationally_related_form 07556637
251 | 01100145 _derivationally_related_form 10782791
252 | 14493426 _derivationally_related_form 02632567
253 | 09361517 _hypernym 09437454
254 | 01235258 _derivationally_related_form 01153486
255 | 00852922 _hypernym 00849080
256 | 00841628 _derivationally_related_form 01193721
257 | 01167981 _derivationally_related_form 03200357
258 | 00619183 _derivationally_related_form 15159819
259 | 01854415 _hypernym 01852861
260 | 09906848 _hypernym 10474645
261 | 13858045 _hypernym 13857486
262 | 01845627 _member_meronym 01854047
263 | 00105778 _derivationally_related_form 00513401
264 | 01510827 _derivationally_related_form 00409211
265 | 01100145 _derivationally_related_form 10782940
266 | 01639105 _derivationally_related_form 03772269
267 | 02384686 _derivationally_related_form 07186148
268 | 14436875 _derivationally_related_form 02237631
269 | 14429608 _hypernym 14429985
270 | 01182024 _derivationally_related_form 01197338
271 | 00459114 _hypernym 00458754
272 | 05645199 _derivationally_related_form 00609100
273 | 10768585 _derivationally_related_form 01093172
274 | 00609506 _derivationally_related_form 01941093
275 | 01820302 _derivationally_related_form 10058411
276 | 04135315 _hypernym 03872495
277 | 00300317 _hypernym 00299580
278 | 09622302 _derivationally_related_form 01775535
279 | 07238102 _derivationally_related_form 02636921
280 | 07183151 _derivationally_related_form 00869126
281 | 02043982 _derivationally_related_form 08612049
282 | 00853633 _derivationally_related_form 06778102
283 | 14889479 _derivationally_related_form 00239614
284 | 01941093 _derivationally_related_form 10433164
285 | 00043480 _derivationally_related_form 05714466
286 | 08278324 _hypernym 08276342
287 | 09861946 _derivationally_related_form 01944692
288 | 09747329 _derivationally_related_form 03130073
289 | 05658603 _derivationally_related_form 02124748
290 | 00724029 _derivationally_related_form 13421462
291 | 01190840 _derivationally_related_form 07891726
292 | 02274482 _derivationally_related_form 00085678
293 | 01020117 _similar_to 01017738
294 | 02174311 _derivationally_related_form 07380144
295 | 02464342 _hypernym 02463704
296 | 07460104 _derivationally_related_form 01914947
297 | 00588473 _derivationally_related_form 09759501
298 | 00135718 _also_see 01880531
299 | 02053941 _verb_group 01849746
300 | 00550016 _derivationally_related_form 10318892
301 | 01021579 _derivationally_related_form 00958823
302 | 01475282 _also_see 01613463
303 | 00384620 _derivationally_related_form 00261405
304 | 01203676 _derivationally_related_form 02662979
305 | 13529616 _derivationally_related_form 02672859
306 | 05046471 _hypernym 05046009
307 | 01387786 _derivationally_related_form 01741562
308 | 00366547 _verb_group 00364868
309 | 00858849 _derivationally_related_form 00016380
310 | 00076072 _hypernym 00074790
311 | 03199901 _has_part 03905730
312 | 00231567 _derivationally_related_form 02478059
313 | 15166462 _has_part 15169421
314 | 08654360 _hypernym 08491826
315 | 13622591 _has_part 13622209
316 | 00657550 _derivationally_related_form 00874977
317 | 00453935 _derivationally_related_form 01140794
318 | 00754731 _derivationally_related_form 10672192
319 | 04665813 _derivationally_related_form 02529284
320 | 02402409 _derivationally_related_form 00215838
321 | 01882170 _derivationally_related_form 04544979
322 | 00302394 _derivationally_related_form 01941093
323 | 14619225 _has_part 09272085
324 | 02652494 _derivationally_related_form 10269458
325 | 04046810 _derivationally_related_form 01954559
326 | 01186208 _hypernym 01194418
327 | 00759694 _derivationally_related_form 10702781
328 | 00660102 _derivationally_related_form 10506762
329 | 15122231 _derivationally_related_form 00490968
330 | 08571139 _hypernym 08523483
331 | 15166462 _has_part 15169248
332 | 04440749 _hypernym 03533972
333 | 02519991 _derivationally_related_form 13341756
334 | 07847198 _derivationally_related_form 01418037
335 | 01242716 _derivationally_related_form 02512922
336 | 09889941 _hypernym 10744164
337 | 02252931 _derivationally_related_form 01120448
338 | 02566227 _derivationally_related_form 10157744
339 | 04157320 _derivationally_related_form 01551871
340 | 09334396 _derivationally_related_form 02022359
341 | 01996735 _derivationally_related_form 08428019
342 | 10529965 _derivationally_related_form 01957529
343 | 10769321 _derivationally_related_form 02493260
344 | 01940403 _derivationally_related_form 10096217
345 | 00555648 _derivationally_related_form 02055649
346 | 06508816 _derivationally_related_form 01001643
347 | 03309465 _has_part 04082886
348 | 00239614 _hypernym 00239321
349 | 02005756 _derivationally_related_form 04864515
350 | 01782218 _derivationally_related_form 07520612
351 | 01223182 _derivationally_related_form 13877918
352 | 01136614 _derivationally_related_form 03467984
353 | 02478059 _derivationally_related_form 00231567
354 | 00595146 _derivationally_related_form 10298912
355 | 00980908 _derivationally_related_form 05960464
356 | 00365188 _derivationally_related_form 00365471
357 | 07543288 _derivationally_related_form 01775164
358 | 06778102 _derivationally_related_form 00105554
359 | 00891216 _derivationally_related_form 10209731
360 | 10760340 _derivationally_related_form 02461314
361 | 13358549 _hypernym 13384557
362 | 00074790 _hypernym 00070965
363 | 00452293 _derivationally_related_form 02003601
364 | 02994858 _derivationally_related_form 00329831
365 | 15165490 _hypernym 15228378
366 | 02210119 _verb_group 02236124
367 | 00643197 _derivationally_related_form 09790278
368 | 02529284 _derivationally_related_form 00066397
369 | 00264875 _derivationally_related_form 13426238
370 | 00754731 _derivationally_related_form 06513366
371 | 01219706 _derivationally_related_form 02886599
372 | 08587828 _hypernym 08491826
373 | 01765392 _derivationally_related_form 00759551
374 | 02046755 _derivationally_related_form 07440979
375 | 06201136 _derivationally_related_form 10402086
376 | 00882961 _derivationally_related_form 02123672
377 | 04660080 _hypernym 05207130
378 | 04879658 _derivationally_related_form 00963283
379 | 10051975 _derivationally_related_form 00416135
380 | 02079525 _derivationally_related_form 05246511
381 | 01510827 _derivationally_related_form 07338114
382 | 14889479 _hypernym 14865800
383 | 10084635 _derivationally_related_form 00800421
384 | 09790278 _hypernym 10488016
385 | 07246742 _derivationally_related_form 00807461
386 | 00087152 _also_see 01922763
387 | 07204911 _derivationally_related_form 02473431
388 | 10299250 _derivationally_related_form 02592397
389 | 07677593 _hypernym 07675627
390 | 05714161 _hypernym 05713737
391 | 10433737 _derivationally_related_form 01180975
392 | 00849080 _derivationally_related_form 10561320
393 | 00302394 _derivationally_related_form 01940403
394 | 02344243 _hypernym 02344060
395 | 00258854 _derivationally_related_form 00199659
396 | 01960911 _derivationally_related_form 10683126
397 | 01912893 _derivationally_related_form 04544979
398 | 02368336 _also_see 02395115
399 | 02037272 _also_see 01549291
400 | 02738031 _hypernym 04566257
401 | 07186148 _derivationally_related_form 01470225
402 | 00031820 _also_see 00802136
403 | 02395115 _also_see 01073822
404 | 02583139 _derivationally_related_form 00962129
405 | 04465933 _hypernym 03605722
406 | 02523275 _derivationally_related_form 05042871
407 | 02657219 _derivationally_related_form 04713428
408 | 00070965 _derivationally_related_form 00842538
409 | 09759311 _derivationally_related_form 08279298
410 | 01522052 _derivationally_related_form 00345641
411 | 14425974 _derivationally_related_form 01489722
412 | 15159819 _derivationally_related_form 00619183
413 | 01570562 _hypernym 01387786
414 | 00616857 _derivationally_related_form 10351625
415 | 00739270 _derivationally_related_form 00311663
416 | 02274482 _derivationally_related_form 09810364
417 | 02418205 _derivationally_related_form 05769471
418 | 15170786 _has_part 15171008
419 | 01857632 _derivationally_related_form 01053339
420 | 09366017 _hypernym 09287968
421 | 04842515 _derivationally_related_form 01587077
422 | 02462580 _verb_group 02461314
423 | 02389220 _derivationally_related_form 13939353
424 | 02022486 _derivationally_related_form 09334396
425 | 04713428 _derivationally_related_form 02657219
426 | 00511212 _hypernym 00510189
427 | 09879744 _derivationally_related_form 02566227
428 | 00321486 _derivationally_related_form 03873064
429 | 00259894 _derivationally_related_form 02250625
430 | 02924116 _derivationally_related_form 01949110
431 | 10002760 _derivationally_related_form 02521410
432 | 00135718 _derivationally_related_form 04721650
433 | 02055649 _derivationally_related_form 00330160
434 | 01522276 _derivationally_related_form 10781984
435 | 00922867 _derivationally_related_form 01004582
436 | 02667900 _hypernym 02657219
437 | 08895771 _instance_hypernym 08524735
438 | 15137047 _hypernym 15163005
439 | 02700104 _verb_group 02657219
440 | 03624966 _derivationally_related_form 01671039
441 | 02402409 _hypernym 02405252
442 |
--------------------------------------------------------------------------------
/data/WN18RR_v2_ind/valid.txt:
--------------------------------------------------------------------------------
1 | 11450566 _derivationally_related_form 00505802
2 | 04839676 _hypernym 04854389
3 | 01176567 _derivationally_related_form 07891726
4 | 10062996 _derivationally_related_form 02599004
5 | 01176232 _derivationally_related_form 07557165
6 | 01510827 _derivationally_related_form 00140393
7 | 00417643 _derivationally_related_form 01425511
8 | 06271778 _derivationally_related_form 00790703
9 | 05716577 _derivationally_related_form 02337667
10 | 01936537 _derivationally_related_form 04046810
11 | 01816431 _derivationally_related_form 13986679
12 | 07424109 _derivationally_related_form 10527334
13 | 01765392 _derivationally_related_form 07515790
14 | 01504699 _derivationally_related_form 00447540
15 | 02566015 _also_see 00695523
16 | 07678729 _derivationally_related_form 00542809
17 | 14705718 _derivationally_related_form 01354006
18 | 00713250 _derivationally_related_form 01266895
19 | 02666882 _derivationally_related_form 05748786
20 | 01305731 _derivationally_related_form 03933529
21 | 00420132 _derivationally_related_form 00148057
22 | 02043982 _derivationally_related_form 07440979
23 | 02792903 _derivationally_related_form 08276720
24 | 01003729 _derivationally_related_form 00657550
25 | 02521816 _hypernym 02521410
26 | 05514905 _has_part 05524615
27 | 06778102 _derivationally_related_form 00853633
28 | 01387786 _derivationally_related_form 00356790
29 | 04713692 _hypernym 04713428
30 | 01673472 _hypernym 01672014
31 | 00891850 _hypernym 00884466
32 | 08860123 _member_of_domain_usage 07711080
33 | 00136800 _derivationally_related_form 13897996
34 | 00492677 _derivationally_related_form 08615374
35 | 10744164 _hypernym 09626031
36 | 10672192 _derivationally_related_form 00754731
37 | 04146050 _derivationally_related_form 02792903
38 | 08858942 _member_meronym 09700964
39 | 01646941 _also_see 02100709
40 | 00854000 _derivationally_related_form 01431230
41 | 00942234 _derivationally_related_form 01256157
42 | 01926311 _verb_group 01914947
43 | 04049405 _hypernym 03057021
44 | 02443849 _derivationally_related_form 10378780
45 | 10224098 _hypernym 09940146
46 | 01135795 _derivationally_related_form 02593354
47 | 00853958 _derivationally_related_form 07153727
48 | 03467984 _hypernym 04565375
49 | 02566227 _derivationally_related_form 09879744
50 | 03024882 _derivationally_related_form 01215694
51 | 01489161 _derivationally_related_form 02964389
52 | 07512465 _derivationally_related_form 02105990
53 | 02023992 _derivationally_related_form 03420559
54 | 01009871 _derivationally_related_form 00745499
55 | 01072072 _derivationally_related_form 01828736
56 | 05186306 _derivationally_related_form 10672908
57 | 14738752 _derivationally_related_form 00458471
58 | 02531422 _also_see 01257612
59 | 14599641 _hypernym 14599168
60 | 01957529 _verb_group 02102398
61 | 01020117 _derivationally_related_form 01738347
62 | 00492410 _derivationally_related_form 10451858
63 | 01068012 _derivationally_related_form 02466496
64 | 00375021 _derivationally_related_form 05014099
65 | 01216670 _derivationally_related_form 00812526
66 | 02599939 _derivationally_related_form 09759069
67 | 04873550 _hypernym 04827652
68 | 09316454 _derivationally_related_form 10217436
69 | 05256862 _has_part 05257737
70 | 00713952 _derivationally_related_form 01612084
71 | 04993882 _hypernym 04992163
72 | 01370590 _also_see 01227137
73 | 00593108 _derivationally_related_form 10162991
74 | 10351625 _derivationally_related_form 00616153
75 | 00759551 _derivationally_related_form 00764902
76 | 02700104 _derivationally_related_form 13969243
77 | 01322854 _derivationally_related_form 00620424
78 | 00204199 _derivationally_related_form 00810557
79 | 04544979 _derivationally_related_form 01912893
80 | 01373138 _derivationally_related_form 14619225
81 | 00259643 _hypernym 00258854
82 | 01899360 _also_see 00311663
83 | 05681117 _hypernym 14024882
84 | 10618848 _derivationally_related_form 06220616
85 | 00091124 _derivationally_related_form 14299336
86 | 02746365 _has_part 04322026
87 | 02001858 _derivationally_related_form 00487874
88 | 03999992 _derivationally_related_form 01754105
89 | 00894552 _derivationally_related_form 00606093
90 | 10160412 _derivationally_related_form 00483181
91 | 01179707 _derivationally_related_form 01613463
92 | 09849598 _derivationally_related_form 01775164
93 | 01919391 _derivationally_related_form 08428019
94 | 09986189 _derivationally_related_form 02834778
95 | 05246796 _hypernym 05246511
96 | 10224098 _derivationally_related_form 00853633
97 | 08280124 _derivationally_related_form 09759501
98 | 10196965 _derivationally_related_form 01637633
99 | 06695579 _derivationally_related_form 00880227
100 | 00605310 _derivationally_related_form 10118382
101 | 00272448 _derivationally_related_form 02579447
102 | 08894456 _has_part 09430771
103 | 00324231 _derivationally_related_form 00247792
104 | 00510189 _hypernym 00509846
105 | 08892766 _instance_hypernym 08552138
106 | 00810729 _derivationally_related_form 00740712
107 | 08543496 _hypernym 08543223
108 | 01136614 _derivationally_related_form 10152083
109 | 01952750 _derivationally_related_form 08616311
110 | 01021579 _derivationally_related_form 00350461
111 | 00123234 _derivationally_related_form 01133825
112 | 02893338 _derivationally_related_form 10020890
113 | 00804802 _derivationally_related_form 07180787
114 | 00357680 _derivationally_related_form 00366547
115 | 13450636 _synset_domain_topic_of 06055946
116 | 02344060 _derivationally_related_form 01235137
117 | 10782791 _derivationally_related_form 01100145
118 | 08892058 _instance_hypernym 08552138
119 | 09624168 _has_part 05219724
120 | 06561942 _derivationally_related_form 00869931
121 | 00052146 _derivationally_related_form 01305731
122 | 04595855 _derivationally_related_form 02354536
123 | 02167571 _derivationally_related_form 00984609
124 | 05003090 _derivationally_related_form 01912893
125 | 01637633 _derivationally_related_form 05768806
126 | 07710616 _hypernym 07710283
127 | 04694809 _derivationally_related_form 01532329
128 | 01723224 _derivationally_related_form 00897026
129 | 09334396 _derivationally_related_form 01981279
130 | 08523483 _derivationally_related_form 01498498
131 | 01224744 _derivationally_related_form 00409211
132 | 00854150 _hypernym 00853633
133 | 01194418 _derivationally_related_form 10299250
134 | 02543874 _derivationally_related_form 00233386
135 | 02124332 _hypernym 02123672
136 | 01322221 _hypernym 01321854
137 | 01646941 _derivationally_related_form 07944050
138 | 00812526 _derivationally_related_form 01220303
139 | 01373844 _derivationally_related_form 02754103
140 | 07557434 _has_part 07809096
141 | 00643197 _derivationally_related_form 00704305
142 | 06686174 _derivationally_related_form 00889555
143 | 01845627 _member_meronym 01853379
144 | 01613463 _also_see 02451951
145 | 02521816 _derivationally_related_form 01177703
146 | 02192992 _derivationally_related_form 00882702
147 | 04794751 _derivationally_related_form 01672607
148 | 08871007 _has_part 08873412
149 | 02126382 _derivationally_related_form 03916470
150 | 00302394 _derivationally_related_form 01840238
151 | 01106272 _derivationally_related_form 01489161
152 | 00416135 _hypernym 01856626
153 | 05091316 _derivationally_related_form 00658052
154 | 05681117 _derivationally_related_form 00014742
155 | 06773976 _derivationally_related_form 01647867
156 | 03024746 _derivationally_related_form 01570562
157 | 04728376 _derivationally_related_form 02341266
158 | 10495555 _derivationally_related_form 02302817
159 | 02546075 _derivationally_related_form 06696483
160 | 01055073 _derivationally_related_form 04980008
161 | 09917593 _derivationally_related_form 14427065
162 | 07557165 _derivationally_related_form 01176232
163 | 05716744 _hypernym 05715283
164 | 13855627 _hypernym 13854649
165 | 03058603 _derivationally_related_form 00051511
166 | 00329831 _derivationally_related_form 08523483
167 | 08612049 _derivationally_related_form 02043982
168 | 00273963 _derivationally_related_form 13453428
169 | 01646941 _also_see 01640850
170 | 01637633 _derivationally_related_form 10196965
171 | 10566072 _derivationally_related_form 01684337
172 | 02967626 _hypernym 03872495
173 | 00350889 _derivationally_related_form 04863074
174 | 00458754 _derivationally_related_form 14738752
175 | 10388440 _derivationally_related_form 00595684
176 | 04159058 _derivationally_related_form 01531265
177 | 15228162 _hypernym 15154774
178 | 01183573 _derivationally_related_form 07532112
179 | 08280124 _member_meronym 09759501
180 | 05717342 _derivationally_related_form 02196214
181 | 02192992 _derivationally_related_form 05658226
182 | 00467717 _derivationally_related_form 07260623
183 | 14427065 _derivationally_related_form 09918554
184 | 06561942 _derivationally_related_form 00843468
185 | 02124332 _derivationally_related_form 05713737
186 | 00276813 _derivationally_related_form 00492410
187 | 02346895 _verb_group 02542280
188 | 01941093 _verb_group 01847845
189 | 02542280 _verb_group 00351406
190 | 07674749 _hypernym 07673397
191 | 01156438 _derivationally_related_form 01098869
192 | 01177033 _derivationally_related_form 02521410
193 | 04907991 _hypernym 04907269
194 | 02253456 _derivationally_related_form 13279262
195 | 07679356 _hypernym 07622061
196 | 05513807 _has_part 05526384
197 | 02039156 _derivationally_related_form 00341548
198 | 01425709 _derivationally_related_form 00854000
199 | 07813107 _hypernym 07809368
200 | 00417397 _derivationally_related_form 01424456
201 | 10672662 _hypernym 10084635
202 | 00962722 _derivationally_related_form 10527334
203 | 06271778 _hypernym 06254669
204 | 00550016 _derivationally_related_form 01724185
205 | 01850035 _member_meronym 01850192
206 | 15137890 _derivationally_related_form 02708707
207 | 10330189 _hypernym 09847010
208 | 02708707 _hypernym 02708420
209 | 10782791 _derivationally_related_form 02288295
210 | 05524615 _has_part 05525807
211 | 00609100 _derivationally_related_form 05645199
212 | 00800940 _derivationally_related_form 00265386
213 | 02671880 _derivationally_related_form 09424489
214 | 08657249 _derivationally_related_form 02512808
215 | 04411264 _derivationally_related_form 02653996
216 | 00365188 _derivationally_related_form 07313241
217 | 00252710 _derivationally_related_form 09772029
218 | 00031820 _derivationally_related_form 10248876
219 | 00447540 _derivationally_related_form 01574292
220 | 10403876 _synset_domain_topic_of 02858304
221 | 00739270 _derivationally_related_form 00614999
222 | 00504592 _also_see 01675190
223 | 00409211 _derivationally_related_form 01510827
224 | 01523986 _derivationally_related_form 03065424
225 | 01489989 _derivationally_related_form 02964389
226 | 01489734 _derivationally_related_form 03933529
227 | 15147850 _has_part 15146004
228 | 02037272 _also_see 02513740
229 | 02290196 _derivationally_related_form 13279262
230 | 00293916 _derivationally_related_form 01926311
231 | 14425974 _derivationally_related_form 01493897
232 | 01305361 _derivationally_related_form 03933529
233 | 00199912 _derivationally_related_form 10512982
234 | 04565375 _derivationally_related_form 02334867
235 | 06878071 _derivationally_related_form 00028565
236 | 00043480 _derivationally_related_form 03916470
237 | 02527651 _derivationally_related_form 01263018
238 | 08033194 _synset_domain_topic_of 00759694
239 | 01072565 _derivationally_related_form 01183573
240 | 03656484 _hypernym 03851341
241 | 02944826 _hypernym 03763727
242 | 00601822 _hypernym 00599472
243 | 00274283 _verb_group 00273963
244 | 01782519 _derivationally_related_form 04828255
245 | 02653159 _derivationally_related_form 02839200
246 | 01647867 _derivationally_related_form 09952163
247 | 13650045 _hypernym 13603305
248 | 01941093 _derivationally_related_form 00609506
249 | 00285557 _derivationally_related_form 02091410
250 | 10610465 _derivationally_related_form 00014742
251 | 01185981 _hypernym 01166351
252 | 03101796 _hypernym 03101986
253 | 05050115 _derivationally_related_form 01731351
254 | 00247792 _derivationally_related_form 00323856
255 | 02191546 _derivationally_related_form 05658226
256 | 10042300 _derivationally_related_form 01166351
257 | 08873622 _has_part 08875547
258 | 10561613 _hypernym 10042300
259 | 02000868 _derivationally_related_form 10100124
260 | 01088749 _derivationally_related_form 14455206
261 | 01572978 _derivationally_related_form 00812526
262 | 00369802 _hypernym 00358931
263 | 01055165 _derivationally_related_form 02653996
264 | 00622266 _derivationally_related_form 01504699
265 | 08892971 _instance_hypernym 08524735
266 | 15228267 _hypernym 15154774
267 | 02124106 _derivationally_related_form 05714894
268 | 02048891 _derivationally_related_form 13878112
269 | 00843468 _derivationally_related_form 09762385
270 | 15122231 _derivationally_related_form 00297906
271 | 00026385 _derivationally_related_form 01064148
272 | 06700169 _hypernym 06700030
273 | 01027263 _derivationally_related_form 00299580
274 | 02792552 _derivationally_related_form 01954852
275 | 02792903 _derivationally_related_form 04146050
276 | 01158690 _derivationally_related_form 00682436
277 | 01743531 _derivationally_related_form 10318892
278 | 00853776 _derivationally_related_form 04626280
279 | 07891726 _hypernym 07884567
280 | 01725712 _also_see 01256332
281 | 02463990 _derivationally_related_form 10564800
282 | 01587077 _derivationally_related_form 04782116
283 | 02462580 _derivationally_related_form 00182213
284 | 01431230 _derivationally_related_form 10237196
285 | 02950256 _hypernym 02746365
286 | 09762385 _derivationally_related_form 00842989
287 | 01105737 _derivationally_related_form 02909006
288 | 01904293 _derivationally_related_form 00443231
289 | 02669885 _derivationally_related_form 09759501
290 | 08871007 _has_part 08879197
291 | 04055030 _hypernym 03872495
292 | 02700104 _derivationally_related_form 04713332
293 | 07449862 _derivationally_related_form 01186208
294 | 08279298 _derivationally_related_form 09759069
295 | 00259643 _derivationally_related_form 02253456
296 | 13766896 _derivationally_related_form 01180351
297 | 06685456 _derivationally_related_form 00891936
298 | 00015498 _derivationally_related_form 00858377
299 | 02579447 _derivationally_related_form 10257647
300 | 03325769 _derivationally_related_form 02631659
301 | 08223263 _derivationally_related_form 02346895
302 | 06220616 _derivationally_related_form 00298041
303 | 07557434 _has_part 07829412
304 | 00014742 _derivationally_related_form 14024882
305 | 02672859 _hypernym 02672540
306 | 02491383 _derivationally_related_form 07390945
307 | 01112364 _derivationally_related_form 05737153
308 | 02344381 _similar_to 02341266
309 | 13118569 _hypernym 13112664
310 | 10334567 _derivationally_related_form 01922895
311 | 00357680 _derivationally_related_form 00366275
312 | 00606335 _derivationally_related_form 00894552
313 | 10720097 _hypernym 10193026
314 | 10211203 _hypernym 10020890
315 | 09811852 _derivationally_related_form 02950482
316 | 08428019 _derivationally_related_form 01996735
317 | 02085742 _derivationally_related_form 03216828
318 | 00810557 _derivationally_related_form 00740712
319 | 07520612 _derivationally_related_form 01780941
320 | 10117017 _derivationally_related_form 02729965
321 | 03470387 _has_part 03340723
322 | 10472799 _hypernym 09807754
323 | 00227507 _also_see 00504592
324 | 00853776 _also_see 01725712
325 | 08036849 _instance_hypernym 08392137
326 | 01497292 _derivationally_related_form 02967626
327 | 04561734 _derivationally_related_form 01398941
328 | 01222645 _derivationally_related_form 00343249
329 | 00356790 _hypernym 00113113
330 | 13417410 _hypernym 13398241
331 | 10480730 _hypernym 09759069
332 | 00028565 _derivationally_related_form 06878071
333 | 04839676 _derivationally_related_form 00957176
334 | 01828736 _derivationally_related_form 07543288
335 | 08871007 _has_part 08873622
336 | 01149911 _derivationally_related_form 01387786
337 | 13384557 _derivationally_related_form 09934921
338 | 08688247 _derivationally_related_form 02512150
339 | 05008227 _derivationally_related_form 01476685
340 | 01121855 _hypernym 01120448
341 | 02125641 _derivationally_related_form 04980008
342 | 06877578 _hypernym 06877078
343 | 01183573 _derivationally_related_form 13580723
344 | 01606205 _derivationally_related_form 03386011
345 | 02089420 _derivationally_related_form 04500060
346 | 00138221 _derivationally_related_form 01431230
347 | 01183573 _derivationally_related_form 02080577
348 | 03024882 _hypernym 03814906
349 | 01185981 _derivationally_related_form 08253640
350 | 03065424 _derivationally_related_form 01523986
351 | 02708707 _derivationally_related_form 15137890
352 | 03963028 _derivationally_related_form 01395049
353 | 13381734 _derivationally_related_form 01064999
354 | 01851996 _hypernym 01507175
355 | 08871007 _has_part 08876975
356 | 08046759 _instance_hypernym 08392137
357 | 00384620 _derivationally_related_form 00258854
358 | 10211203 _derivationally_related_form 00593837
359 | 02950256 _derivationally_related_form 09811852
360 | 00442115 _derivationally_related_form 01904293
361 | 03386011 _derivationally_related_form 01155421
362 | 01167188 _derivationally_related_form 07556637
363 | 08657249 _derivationally_related_form 00506952
364 | 09827683 _derivationally_related_form 02570267
365 | 02513740 _also_see 02037272
366 | 02331175 _derivationally_related_form 03933529
367 | 05813229 _derivationally_related_form 01828736
368 | 00945916 _derivationally_related_form 02269894
369 | 06950528 _derivationally_related_form 02958126
370 | 07515560 _hypernym 07514968
371 | 00487874 _derivationally_related_form 02001858
372 | 02463990 _hypernym 02463704
373 | 01186208 _derivationally_related_form 08253640
374 | 01240979 _derivationally_related_form 02478059
375 | 00123234 _derivationally_related_form 01489332
376 | 13282007 _hypernym 13278375
377 | 00962567 _hypernym 00973077
378 | 03075768 _derivationally_related_form 01765392
379 | 08392137 _synset_domain_topic_of 00759694
380 | 00031820 _derivationally_related_form 06778102
381 | 14836127 _hypernym 14971519
382 | 00512843 _derivationally_related_form 00854150
383 | 00340989 _derivationally_related_form 02048891
384 | 01856626 _derivationally_related_form 01123095
385 | 07628870 _hypernym 07622061
386 | 04665813 _derivationally_related_form 00754873
387 | 05769726 _derivationally_related_form 01637633
388 | 03959936 _derivationally_related_form 01395049
389 | 01387786 _derivationally_related_form 05289601
390 | 04795545 _hypernym 04794751
391 | 05960464 _derivationally_related_form 00980908
392 | 01780941 _derivationally_related_form 07520612
393 | 07775375 _derivationally_related_form 01140794
394 | 04842993 _derivationally_related_form 02105990
395 | 01093587 _derivationally_related_form 09952163
396 | 11449907 _derivationally_related_form 00505802
397 | 09879744 _derivationally_related_form 00013172
398 | 09811852 _derivationally_related_form 02950256
399 | 00417859 _derivationally_related_form 01424456
400 | 01845627 _member_meronym 01858441
401 | 00105778 _derivationally_related_form 06781383
402 | 00804802 _derivationally_related_form 10018021
403 | 14427239 _hypernym 14425974
404 | 00345641 _derivationally_related_form 01522052
405 | 10693459 _derivationally_related_form 05636402
406 | 02902079 _hypernym 04008947
407 | 03679986 _derivationally_related_form 01612084
408 | 00467717 _derivationally_related_form 00999245
409 | 01612084 _derivationally_related_form 00713952
410 | 08647945 _hypernym 08523483
411 | 01775164 _derivationally_related_form 07543288
412 |
--------------------------------------------------------------------------------
/data/fb237_v1_ind/test.txt:
--------------------------------------------------------------------------------
1 | /m/0gq9h /award/award_category/winners./award/award_honor/ceremony /m/0bzlrh
2 | /m/080knyg /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/022q4l9
3 | /m/046qq /film/actor/film./film/performance/film /m/03shpq
4 | /m/060c4 /business/job_title/people_with_this_title./business/employment_tenure/company /m/0jpn8
5 | /m/0x67 /people/ethnicity/people /m/02h9_l
6 | /m/0qf2t /film/film/genre /m/01t_vv
7 | /m/06dfg /location/location/adjoin_s./location/adjoining_relationship/adjoins /m/07tp2
8 | /m/0l9k1 /award/award_nominee/award_nominations./award/award_nomination/award /m/0gq9h
9 | /m/01wgxtl /base/popstra/celebrity/dated./base/popstra/dated/participant /m/022q32
10 | /m/01qb5d /film/film/genre /m/02kdv5l
11 | /m/01qrbf /film/actor/film./film/performance/film /m/020bv3
12 | /m/03mp8k /music/record_label/artist /m/0127s7
13 | /m/03_8r /olympics/olympic_sport/athletes./olympics/olympic_athlete_affiliation/country /m/03h2c
14 | /m/01d259 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/06mzp
15 | /m/06fq2 /education/educational_institution/colors /m/036k5h
16 | /m/025sc50 /music/genre/artists /m/0412f5y
17 | /m/03h2c /location/country/official_language /m/06nm1
18 | /m/05np4c /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0k2mxq
19 | /m/05b5c /business/business_operation/industry /m/01mf0
20 | /m/0h1p /award/award_nominee/award_nominations./award/award_nomination/award /m/0gq9h
21 | /m/02rg_4 /education/educational_institution/school_type /m/05pcjw
22 | /m/025cn2 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0m_h6
23 | /m/0dq9p /people/cause_of_death/people /m/0hpz8
24 | /m/05qd_ /award/award_winner/awards_won./award/award_honor/award_winner /m/02q_cc
25 | /m/079dy /government/politician/government_positions_held./government/government_position_held/basic_title /m/060c4
26 | /m/041rx /people/ethnicity/people /m/01xndd
27 | /m/0x67 /people/ethnicity/languages_spoken /m/06nm1
28 | /m/025cn2 /people/person/place_of_birth /m/02_286
29 | /m/02rn00y /film/film/other_crew./film/film_crew_gig/film_crew_role /m/0d2b38
30 | /m/0b_c7 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0f5mdz
31 | /m/04ftdq /education/educational_institution/colors /m/083jv
32 | /m/016z9n /film/film/featured_film_locations /m/0rh6k
33 | /m/01vn0t_ /people/person/profession /m/016z4k
34 | /m/0l9k1 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0p_rk
35 | /m/0ml_m /location/location/adjoin_s./location/adjoining_relationship/adjoins /m/0mlw1
36 | /m/01wgxtl /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01wlt3k
37 | /m/016sp_ /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01l03w2
38 | /m/0pqc5 /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office /m/0f2s6
39 | /m/022q32 /people/person/places_lived./people/place_lived/location /m/01jr6
40 | /m/01900g /people/person/profession /m/018gz8
41 | /m/04sx9_ /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01qscs
42 | /m/01l1b90 /base/popstra/celebrity/friendship./base/popstra/friendship/participant /m/01kgxf
43 | /m/027dtxw /award/award_category/nominees./award/award_nomination/nominated_for /m/05sy_5
44 | /m/0127s7 /award/award_nominee/award_nominations./award/award_nomination/award /m/01c99j
45 | /m/04zwtdy /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/05nyqk
46 | /m/09q17 /media_common/netflix_genre/titles /m/0640m69
47 | /m/0x67 /people/ethnicity/people /m/01sg7_
48 | /m/01cwdk /education/educational_institution/school_type /m/05jxkf
49 | /m/0ct5zc /film/film/story_by /m/0jt90f5
50 | /m/03295l /people/ethnicity/languages_spoken /m/01jb8r
51 | /m/063g7l /film/actor/film./film/performance/film /m/01718w
52 | /m/0crh5_f /film/film/film_festivals /m/0fpkxfd
53 | /m/05br10 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0jqj5
54 | /m/03xgm3 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01l4zqz
55 | /m/0bdjd /film/film/featured_film_locations /m/0fsv2
56 | /m/04ydr95 /film/film/produced_by /m/0pmhf
57 | /m/01g257 /film/actor/film./film/performance/film /m/02pg45
58 | /m/027f2w /education/educational_degree/people_with_this_degree./education/education/institution /m/01jssp
59 | /m/0jqp3 /film/film/produced_by /m/09ftwr
60 | /m/02glc4 /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02cg7g
61 | /m/0gq_d /award/award_category/winners./award/award_honor/ceremony /m/0bzlrh
62 | /m/0bzlrh /award/award_ceremony/awards_presented./award/award_honor/honored_for /m/0p_rk
63 | /m/0pkyh /award/award_winner/awards_won./award/award_honor/award_winner /m/02cx90
64 | /m/022_lg /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0pd64
65 | /m/02nwxc /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/05p5nc
66 | /m/033qdy /film/film/language /m/064_8sq
67 | /m/0fd_1 /people/person/places_lived./people/place_lived/location /m/02_286
68 | /m/01rc6f /education/educational_institution/colors /m/083jv
69 | /m/02681_5 /award/award_category/winners./award/award_honor/award_winner /m/01wj18h
70 | /m/09k5jh7 /award/award_ceremony/awards_presented./award/award_honor/honored_for /m/02rn00y
71 | /m/0k_kr /music/record_label/artist /m/07c0j
72 | /m/09gdh6k /film/film/film_festivals /m/0bmj62v
73 | /m/0f63n /location/location/adjoin_s./location/adjoining_relationship/adjoins /m/0f6_4
74 | /m/0jw67 /film/director/film /m/012jfb
75 | /m/01vxlbm /people/person/profession /m/016z4k
76 | /m/01xcr4 /people/person/profession /m/0d8qb
77 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/02_kd
78 | /m/01rtm4 /education/educational_institution/students_graduates./education/education/student /m/01xndd
79 | /m/01nvmd_ /people/person/place_of_birth /m/01_d4
80 | /m/0170vn /people/person/places_lived./people/place_lived/location /m/02_286
81 | /m/011j5x /music/genre/artists /m/01dw_f
82 | /m/04g_wd /people/person/profession /m/0d8qb
83 | /m/0184jc /award/award_winner/awards_won./award/award_honor/award_winner /m/0c35b1
84 | /m/08052t3 /film/film/language /m/06nm1
85 | /m/02482c /education/educational_institution/school_type /m/07tf8
86 | /m/0p9xd /music/genre/artists /m/01304j
87 | /m/01rnly /film/film/featured_film_locations /m/02_286
88 | /m/0jrv_ /music/genre/artists /m/04rcr
89 | /m/0gmcwlb /award/award_winning_work/awards_won./award/award_honor/award /m/0gq9h
90 | /m/01900g /film/actor/film./film/performance/film /m/0234j5
91 | /m/0d608 /film/actor/film./film/performance/film /m/033pf1
92 | /m/0721cy /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/03xp8d5
93 | /m/05qd_ /business/business_operation/industry /m/02vxn
94 | /m/0k_kr /music/record_label/artist /m/0kr_t
95 | /m/0kz2w /education/educational_institution/school_type /m/07tf8
96 | /m/01q2nx /film/film/featured_film_locations /m/0rh6k
97 | /m/039wsf /people/person/place_of_birth /m/02_286
98 | /m/0337vz /people/person/place_of_birth /m/01_d4
99 | /m/01ymvk /education/educational_institution/colors /m/083jv
100 | /m/0m93 /influence/influence_node/influenced_by /m/0gz_
101 | /m/0d608 /people/person/profession /m/018gz8
102 | /m/03kmyy /education/educational_institution/school_type /m/05pcjw
103 | /m/080dwhx /award/award_winning_work/awards_won./award/award_honor/award_winner /m/02l6dy
104 | /m/02kxbwx /film/director/film /m/0bz3jx
105 | /m/02fsn /music/performance_role/regular_performances./music/group_membership/role /m/07_l6
106 | /m/01wbsdz /music/artist/origin /m/01smm
107 | /m/025sc50 /music/genre/artists /m/01ws9n6
108 | /m/08wq0g /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0cmt6q
109 | /m/0l3h /location/statistical_region/religions./location/religion_percentage/religion /m/072w0
110 | /m/01p7b6b /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/01jc6q
111 | /m/02t_h3 /film/film/genre /m/01q03
112 | /m/060bp /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office /m/03548
113 | /m/08y2fn /tv/tv_program/regular_cast./tv/regular_tv_appearance/actor /m/06j8wx
114 | /m/0k60 /people/person/places_lived./people/place_lived/location /m/0fp5z
115 | /m/02d413 /film/film/cinematography /m/0f3zsq
116 | /m/09f0bj /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/05np4c
117 | /m/02cg7g /government/legislative_session/members./government/government_position_held/legislative_sessions /m/060ny2
118 | /m/059_gf /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/03shpq
119 | /m/01wk7ql /award/award_nominee/award_nominations./award/award_nomination/award /m/01c99j
120 | /m/03hr1p /olympics/olympic_sport/athletes./olympics/olympic_athlete_affiliation/country /m/0d060g
121 | /m/01m1dzc /award/award_winner/awards_won./award/award_honor/award_winner /m/01l03w2
122 | /m/0pqzh /influence/influence_node/influenced_by /m/03hnd
123 | /m/0x67 /people/ethnicity/people /m/059_gf
124 | /m/03q3sy /people/person/nationality /m/0d060g
125 | /m/0gvs1kt /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/02vzc
126 | /m/06_9lg /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/service_location /m/09c17
127 | /m/07s3m4g /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/02vzc
128 | /m/064_8sq /language/human_language/countries_spoken_in /m/05cc1
129 | /m/06qn87 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/08ct6
130 | /m/020h2v /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/011yth
131 | /m/02rdxsh /award/award_category/nominees./award/award_nomination/nominated_for /m/04b2qn
132 | /m/027dtxw /award/award_category/winners./award/award_honor/award_winner /m/01qscs
133 | /m/07l450 /film/film/language /m/064_8sq
134 | /m/041rx /people/ethnicity/people /m/0lrh
135 | /m/07lp1 /influence/influence_node/influenced_by /m/0lrh
136 | /m/0jsqk /film/film/language /m/064_8sq
137 | /m/01g257 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/015v3r
138 | /m/027dtxw /award/award_category/nominees./award/award_nomination/nominated_for /m/02_kd
139 | /m/02q_cc /award/award_nominee/award_nominations./award/award_nomination/award /m/0gq9h
140 | /m/02lgj6 /award/award_winner/awards_won./award/award_honor/award_winner /m/02lgfh
141 | /m/02p3cr5 /music/record_label/artist /m/01323p
142 | /m/06k75 /time/event/locations /m/0d0kn
143 | /m/02xj3rw /award/award_category/disciplines_or_subjects /m/02vxn
144 | /m/031f_m /film/film/genre /m/02kdv5l
145 | /m/02x4wr9 /award/award_category/nominees./award/award_nomination/nominated_for /m/017z49
146 | /m/016srn /people/person/profession /m/016z4k
147 | /m/0js9s /film/director/film /m/017gl1
148 | /m/08hp53 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/04fzfj
149 | /m/0kv2hv /film/film/executive_produced_by /m/0bgrsl
150 | /m/0184jc /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/08c4yn
151 | /m/02681xs /award/award_category/winners./award/award_honor/ceremony /m/08pc1x
152 | /m/01hw5kk /film/film/production_companies /m/04rtpt
153 | /m/0721cy /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01jgpsh
154 | /m/01c99j /award/award_category/winners./award/award_honor/ceremony /m/05pd94v
155 | /m/02_kd /award/award_winning_work/awards_won./award/award_honor/award_winner /m/02l4rh
156 | /m/01tlmw /location/hud_county_place/county /m/0m2fr
157 | /m/04zwc /education/educational_institution/school_type /m/05jxkf
158 | /m/01m15br /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01l4zqz
159 | /m/07_l6 /music/instrument/instrumentalists /m/011zf2
160 | /m/023322 /music/group_member/membership./music/group_membership/role /m/0dwt5
161 | /m/02cx90 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01m15br
162 | /m/05qd_ /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/0jsqk
163 | /m/04bdzg /people/person/places_lived./people/place_lived/location /m/02_286
164 | /m/02kfzz /film/film/genre /m/02kdv5l
165 | /m/01q9b9 /people/person/profession /m/0d8qb
166 | /m/05_6_y /sports/pro_athlete/teams./sports/sports_team_roster/team /m/02b15h
167 | /m/02bqn1 /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02cg7g
168 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/011yth
169 | /m/02bn_p /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02glc4
170 | /m/03q3sy /film/actor/film./film/performance/film /m/05t0_2v
171 | /m/05szq8z /film/film/film_format /m/0cj16
172 | /m/02qny_ /soccer/football_player/current_team./sports/sports_team_roster/team /m/01k2xy
173 | /m/016sp_ /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/015c4g
174 | /m/0k2mxq /award/award_winner/awards_won./award/award_honor/award_winner /m/05np4c
175 | /m/0dgrwqr /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/06mzp
176 | /m/0b2v79 /film/film/genre /m/02kdv5l
177 | /m/034r25 /film/film/genre /m/02kdv5l
178 | /m/0qf2t /film/film/genre /m/01q03
179 | /m/0b1xl /education/educational_institution/students_graduates./education/education/student /m/012gq6
180 | /m/041rx /people/ethnicity/people /m/0161sp
181 | /m/03b3j /sports/professional_sports_team/draft_picks./sports/sports_league_draft_pick/school /m/01rc6f
182 | /m/03xp8d5 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0721cy
183 | /m/035s95 /film/film/production_companies /m/02slt7
184 | /m/08pc1x /award/award_ceremony/awards_presented./award/award_honor/award_winner /m/011zf2
185 | /m/0plw /base/schemastaging/organization_extra/phone_number./base/schemastaging/phone_sandbox/service_language /m/06nm1
186 | /m/063_t /people/deceased_person/place_of_burial /m/0nb1s
187 | /m/01xzb6 /people/person/profession /m/09lbv
188 | /m/0dq9p /people/cause_of_death/people /m/0c12h
189 | /m/03q3sy /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/05t0_2v
190 | /m/08wq0g /award/award_winner/awards_won./award/award_honor/award_winner /m/0bt7ws
191 | /m/04fc6c /music/record_label/artist /m/03y82t6
192 | /m/06nm1 /media_common/netflix_genre/titles /m/091z_p
193 | /m/05qb8vx /award/award_ceremony/awards_presented./award/award_honor/honored_for /m/02z0f6l
194 | /m/08bqy9 /people/person/place_of_birth /m/09c17
195 | /m/0gxtknx /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/0d060g
196 | /m/03y82t6 /base/popstra/celebrity/friendship./base/popstra/friendship/participant /m/02bc74
197 | /m/084w8 /influence/influence_node/influenced_by /m/07dnx
198 | /m/01kwlwp /people/person/place_of_birth /m/0f2s6
199 | /m/0dq3c /business/job_title/people_with_this_title./business/employment_tenure/company /m/03mnk
200 | /m/01t6b4 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/038bht
201 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/017gl1
202 | /m/02tk74 /people/person/spouse_s./people/marriage/spouse /m/0h5g_
203 | /m/02x8m /music/genre/artists /m/016376
204 | /m/020h2v /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/035s95
205 | /m/04yqlk /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/015p37
206 |
--------------------------------------------------------------------------------
/data/fb237_v1_ind/valid.txt:
--------------------------------------------------------------------------------
1 | /m/022q32 /base/popstra/celebrity/breakup./base/popstra/breakup/participant /m/015pkc
2 | /m/04ly1 /location/location/contains /m/02vkzcx
3 | /m/08052t3 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/05sb1
4 | /m/02fwfb /film/film/genre /m/01t_vv
5 | /m/0dgrwqr /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/04v3q
6 | /m/01p4vl /people/person/profession /m/018gz8
7 | /m/02xj3rw /award/award_category/winners./award/award_honor/award_winner /m/0683n
8 | /m/06s6hs /people/person/places_lived./people/place_lived/location /m/0fr0t
9 | /m/0k_kr /music/record_label/artist /m/01323p
10 | /m/0x67 /people/ethnicity/people /m/080knyg
11 | /m/0x67 /people/ethnicity/people /m/0126y2
12 | /m/03m8y5 /film/film/genre /m/01t_vv
13 | /m/01pvxl /film/film/production_companies /m/016tt2
14 | /m/016zgj /music/genre/artists /m/082brv
15 | /m/09f0bj /base/popstra/celebrity/friendship./base/popstra/friendship/participant /m/06s6hs
16 | /m/0127s7 /base/popstra/celebrity/dated./base/popstra/dated/participant /m/015pkc
17 | /m/09kvv /education/educational_institution/students_graduates./education/education/student /m/01rc4p
18 | /m/06mzp /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/olympics /m/0l6ny
19 | /m/0kz2w /education/educational_institution/school_type /m/05pcjw
20 | /m/011j5x /music/genre/artists /m/0326tc
21 | /m/0411q /music/artist/track_contributions./music/track_contribution/role /m/01qzyz
22 | /m/011j5x /music/genre/parent_genre /m/06cqb
23 | /m/05qd_ /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0bmpm
24 | /m/0d608 /people/person/nationality /m/0d060g
25 | /m/01wj18h /award/award_nominee/award_nominations./award/award_nomination/award /m/02f73p
26 | /m/02681_5 /award/award_category/winners./award/award_honor/ceremony /m/092868
27 | /m/02x8m /music/genre/artists /m/01304j
28 | /m/0d060g /location/country/form_of_government /m/018wl5
29 | /m/0175wg /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/06mmb
30 | /m/02bc74 /people/person/profession /m/016z4k
31 | /m/0pmhf /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/034hzj
32 | /m/023tp8 /award/award_nominee/award_nominations./award/award_nomination/award /m/0cqh6z
33 | /m/0178g /organization/organization/headquarters./location/mailing_address/citytown /m/01_d4
34 | /m/0l6ny /user/jg/default_domain/olympic_games/sports /m/03_8r
35 | /m/05pd94v /award/award_ceremony/awards_presented./award/award_honor/award_winner /m/011zf2
36 | /m/01qrbf /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/05vsxz
37 | /m/0h7jp /location/location/contains /m/01b85
38 | /m/0h5g_ /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01qrbf
39 | /m/02l4rh /film/actor/film./film/performance/film /m/04w7rn
40 | /m/07yk1xz /film/film/other_crew./film/film_crew_gig/film_crew_role /m/0d2b38
41 | /m/025vl4m /award/award_winner/awards_won./award/award_honor/award_winner /m/026n3rs
42 | /m/02bqmq /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02bqn1
43 | /m/02482c /education/educational_institution/school_type /m/05jxkf
44 | /m/02chhq /film/film/genre /m/017fp
45 | /m/01vwbts /people/person/profession /m/016z4k
46 | /m/05zlld0 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/06mzp
47 | /m/041rx /people/ethnicity/people /m/05xpv
48 | /m/031f_m /film/film/genre /m/01zhp
49 | /m/02fwfb /film/film/production_companies /m/0gfmc_
50 | /m/041rx /people/ethnicity/people /m/0l9k1
51 | /m/02x4wr9 /award/award_category/nominees./award/award_nomination/nominated_for /m/047myg9
52 | /m/09f0bj /award/award_winner/awards_won./award/award_honor/award_winner /m/05np4c
53 | /m/0248jb /award/award_category/winners./award/award_honor/ceremony /m/01mh_q
54 | /m/020h2v /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0pc62
55 | /m/03qjlz /people/person/places_lived./people/place_lived/location /m/02_286
56 | /m/0209xj /film/film/language /m/064_8sq
57 | /m/0992d9 /film/film/genre /m/02kdv5l
58 | /m/0738b8 /people/person/profession /m/018gz8
59 | /m/0d060g /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/olympics /m/0jkvj
60 | /m/02vzc /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/olympics /m/0jkvj
61 | /m/01_j71 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/02_n5d
62 | /m/01s0ps /music/performance_role/regular_performances./music/group_membership/role /m/0dwt5
63 | /m/0269kx /education/educational_institution/colors /m/036k5h
64 | /m/02yxjs /education/educational_institution/students_graduates./education/education/student /m/0glmv
65 | /m/02sddg /sports/sports_position/players./sports/sports_team_roster/team /m/049n7
66 | /m/04fzfj /award/award_winning_work/awards_won./award/award_honor/award_winner /m/08hp53
67 | /m/05qd_ /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/0bmpm
68 | /m/06mzp /location/location/contains /m/0lfyd
69 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/0jqj5
70 | /m/0jnlm /sports/sports_team/colors /m/083jv
71 | /m/02y_rq5 /award/award_category/nominees./award/award_nomination/nominated_for /m/02_06s
72 | /m/0g5ff /award/award_nominee/award_nominations./award/award_nomination/award /m/0265vt
73 | /m/0234j5 /film/film/genre /m/02kdv5l
74 | /m/09dt7 /award/award_nominee/award_nominations./award/award_nomination/award /m/0265vt
75 | /m/02mhfy /film/actor/film./film/performance/film /m/0372j5
76 | /m/0478__m /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01wj18h
77 | /m/0js9s /film/actor/film./film/performance/film /m/08k40m
78 | /m/060c4 /business/job_title/people_with_this_title./business/employment_tenure/company /m/02482c
79 | /m/041rx /people/ethnicity/people /m/03f0fnk
80 | /m/022dp5 /people/ethnicity/people /m/01nvmd_
81 | /m/021yzs /film/actor/film./film/performance/film /m/02pg45
82 | /m/01xndd /people/person/place_of_birth /m/02_286
83 | /m/016tt2 /film/film_distributor/films_distributed./film/film_film_distributor_relationship/film /m/07s3m4g
84 | /m/02_3zj /award/award_category/nominees./award/award_nomination/nominated_for /m/080dwhx
85 | /m/0b7l4x /film/film/other_crew./film/film_crew_gig/film_crew_role /m/0d2b38
86 | /m/0pkyh /influence/influence_node/influenced_by /m/041h0
87 | /m/01qb5d /film/film/featured_film_locations /m/02_286
88 | /m/02bqmq /government/legislative_session/members./government/government_position_held/legislative_sessions /m/060ny2
89 | /m/016sp_ /music/artist/origin /m/01smm
90 | /m/041rx /people/ethnicity/people /m/063_t
91 | /m/0pb33 /film/film/genre /m/02kdv5l
92 | /m/0d060g /olympics/olympic_participating_country/athletes./olympics/olympic_athlete_affiliation/olympics /m/0lbbj
93 | /m/01ct6 /sports/sports_team/colors /m/083jv
94 | /m/05b7q /location/statistical_region/religions./location/religion_percentage/religion /m/092bf5
95 | /m/0x67 /people/ethnicity/people /m/01wlt3k
96 | /m/0jm4v /sports/professional_sports_team/draft_picks./sports/sports_league_draft_pick/draft /m/0f4vx0
97 | /m/064_8sq /language/human_language/countries_spoken_in /m/03676
98 | /m/01l4zqz /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/011zf2
99 | /m/023gxx /film/film/genre /m/017fp
100 | /m/047q2k1 /film/film/genre /m/01chg
101 | /m/073v6 /influence/influence_node/influenced_by /m/07dnx
102 | /m/0fsv2 /location/hud_county_place/county /m/0m24v
103 | /m/05qd_ /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/0l9k1
104 | /m/07w5rq /education/educational_institution/school_type /m/01_srz
105 | /m/03xp8d5 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/07_s4b
106 | /m/0jqj5 /award/award_winning_work/awards_won./award/award_honor/award_winner /m/05br10
107 | /m/04b2qn /film/film/genre /m/04228s
108 | /m/01hkhq /award/award_winner/awards_won./award/award_honor/award_winner /m/02l4rh
109 | /m/03m2fg /people/person/profession /m/018gz8
110 | /m/06k75 /military/military_conflict/combatants./military/military_combatant_group/combatants /m/02vzc
111 | /m/01ljpm /education/educational_institution/students_graduates./education/education/major_field_of_study /m/01jzxy
112 | /m/0l1589 /music/performance_role/regular_performances./music/group_membership/role /m/01s0ps
113 | /m/0bmpm /award/award_winning_work/awards_won./award/award_honor/award_winner /m/05qd_
114 | /m/0p_qr /award/award_winning_work/awards_won./award/award_honor/award_winner /m/046qq
115 | /m/04ly1 /location/location/contains /m/03x33n
116 | /m/02y_rq5 /award/award_category/nominees./award/award_nomination/nominated_for /m/0p_qr
117 | /m/05np4c /award/award_winner/awards_won./award/award_honor/award_winner /m/06s6hs
118 | /m/0m6x4 /film/actor/film./film/performance/film /m/083skw
119 | /m/04gmlt /music/record_label/artist /m/0249kn
120 | /m/041rx /people/ethnicity/people /m/01z_g6
121 | /m/06j8wx /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01hkhq
122 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/02rn00y
123 | /m/0dn3n /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/01pvxl
124 | /m/0209xj /film/film/written_by /m/01_f_5
125 | /m/016tt2 /business/business_operation/industry /m/02vxn
126 | /m/02ldkf /education/educational_institution/school_type /m/05jxkf
127 | /m/0gq9h /award/award_category/nominees./award/award_nomination/nominated_for /m/03pc89
128 | /m/03nqnk3 /award/award_category/winners./award/award_honor/award_winner /m/033rq
129 | /m/012gbb /base/popstra/celebrity/dated./base/popstra/dated/participant /m/0gmtm
130 | /m/0pd64 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/02vzc
131 | /m/060c4 /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office /m/05b7q
132 | /m/064nh4k /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/032q8q
133 | /m/01hlwv /organization/organization/headquarters./location/mailing_address/citytown /m/02_286
134 | /m/0p9tm /film/film/production_companies /m/016tt2
135 | /m/0d060g /olympics/olympic_participating_country/athletes./olympics/olympic_athlete_affiliation/olympics /m/0jkvj
136 | /m/015q1n /education/educational_institution/school_type /m/05jxkf
137 | /m/0x67 /people/ethnicity/people /m/01vxlbm
138 | /m/0dq9p /people/cause_of_death/people /m/05xpv
139 | /m/064nh4k /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/022q4l9
140 | /m/016srn /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/02cx90
141 | /m/0j6b5 /film/film/other_crew./film/film_crew_gig/film_crew_role /m/094hwz
142 | /m/0jqp3 /film/film/produced_by /m/0170vn
143 | /m/019pkm /people/person/places_lived./people/place_lived/location /m/02_286
144 | /m/0j11 /location/administrative_division/first_level_division_of /m/049nq
145 | /m/0h5g_ /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/020bv3
146 | /m/01l4zqz /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/03xgm3
147 | /m/01c99j /award/award_category/winners./award/award_honor/ceremony /m/01mh_q
148 | /m/05650n /award/award_winning_work/awards_won./award/award_honor/award /m/02qyxs5
149 | /m/0crh5_f /film/film/release_date_s./film/film_regional_release_date/film_regional_debut_venue /m/0fpkxfd
150 | /m/04dsnp /film/film/produced_by /m/0jw67
151 | /m/027dtxw /award/award_category/nominees./award/award_nomination/nominated_for /m/0b2v79
152 | /m/0pd64 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/02_286
153 | /m/01f85k /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/0d060g
154 | /m/015q1n /education/educational_institution/students_graduates./education/education/student /m/01nvmd_
155 | /m/0m6x4 /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0jsqk
156 | /m/03k8th /film/film/featured_film_locations /m/02_286
157 | /m/0jpn8 /education/educational_institution/colors /m/036k5h
158 | /m/0hkxq /food/food/nutrients./food/nutrition_fact/nutrient /m/0h1yy
159 | /m/025sc50 /music/genre/artists /m/02h9_l
160 | /m/0bv7t /influence/influence_node/influenced_by /m/017_pb
161 | /m/041rx /people/ethnicity/people /m/058vp
162 | /m/0pd64 /award/award_winning_work/awards_won./award/award_honor/award /m/0gq9h
163 | /m/0175wg /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/020bv3
164 | /m/047n8xt /film/film/written_by /m/02r6c_
165 | /m/08052t3 /film/film/genre /m/02kdv5l
166 | /m/01ws9n6 /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01wlt3k
167 | /m/0pqc5 /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office /m/07bcn
168 | /m/0gdh5 /people/person/profession /m/016z4k
169 | /m/01qb5d /film/film/production_companies /m/016tt2
170 | /m/032q8q /film/actor/film./film/performance/film /m/01q2nx
171 | /m/02l6dy /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/01wy5m
172 | /m/03548 /location/country/official_language /m/064_8sq
173 | /m/01718w /film/film/music /m/0417z2
174 | /m/0gvs1kt /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/04v3q
175 | /m/01s0ps /music/performance_role/regular_performances./music/group_membership/role /m/0l1589
176 | /m/022q4l9 /award/award_winner/awards_won./award/award_honor/award_winner /m/064nh4k
177 | /m/0161sp /base/popstra/celebrity/friendship./base/popstra/friendship/participant /m/0pmhf
178 | /m/0gq_d /award/award_category/winners./award/award_honor/ceremony /m/0bzn6_
179 | /m/06cgy /award/award_nominee/award_nominations./award/award_nomination/nominated_for /m/0jqj5
180 | /m/05p5nc /award/award_nominee/award_nominations./award/award_nomination/award /m/0cqh6z
181 | /m/01m15br /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/02cx90
182 | /m/0c12h /award/award_nominee/award_nominations./award/award_nomination/award /m/027dtxw
183 | /m/04mby /people/person/places_lived./people/place_lived/location /m/071cn
184 | /m/017gl1 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/06mzp
185 | /m/015pkc /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/02s2ft
186 | /m/02_3zj /award/award_category/nominees./award/award_nomination/nominated_for /m/0524b41
187 | /m/02bf58 /organization/organization/headquarters./location/mailing_address/citytown /m/01_d4
188 | /m/032q8q /award/award_nominee/award_nominations./award/award_nomination/award_nominee /m/080knyg
189 | /m/0146hc /education/educational_institution/school_type /m/05jxkf
190 | /m/02y_rq5 /award/award_category/nominees./award/award_nomination/nominated_for /m/0294mx
191 | /m/01jgpsh /film/actor/film./film/performance/film /m/03nqnnk
192 | /m/04fzfj /film/film/story_by /m/08hp53
193 | /m/01_d4 /location/location/contains /m/065r8g
194 | /m/020bv3 /film/film/production_companies /m/03sb38
195 | /m/02z0f6l /film/film/genre /m/017fp
196 | /m/04nw9 /film/actor/film./film/performance/film /m/0gnjh
197 | /m/03ft8 /people/person/places_lived./people/place_lived/location /m/0100mt
198 | /m/02qny_ /soccer/football_player/current_team./sports/sports_team_roster/team /m/0175rc
199 | /m/024lt6 /film/film/release_date_s./film/film_regional_release_date/film_release_region /m/04v3q
200 | /m/0jkvj /olympics/olympic_games/sports /m/03hr1p
201 | /m/012gk9 /film/film/written_by /m/03ft8
202 | /m/02cg7g /government/legislative_session/members./government/government_position_held/legislative_sessions /m/02glc4
203 | /m/0f6zs /location/location/contains /m/0ybkj
204 | /m/07gql /music/instrument/instrumentalists /m/03xgm3
205 | /m/041rx /people/ethnicity/people /m/073v6
206 | /m/05ztm4r /people/person/places_lived./people/place_lived/location /m/02_286
207 |
--------------------------------------------------------------------------------
/managers/__pycache__/evaluator.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/managers/__pycache__/evaluator.cpython-36.pyc
--------------------------------------------------------------------------------
/managers/__pycache__/trainer.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/managers/__pycache__/trainer.cpython-36.pyc
--------------------------------------------------------------------------------
/managers/evaluator.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import torch
4 | import pdb
5 | from sklearn import metrics
6 | import torch.nn.functional as F
7 | from torch.utils.data import DataLoader
8 |
9 |
10 | class Evaluator():
11 |     def __init__(self, params, graph_classifier, data):
12 |         self.params = params
13 |         self.graph_classifier = graph_classifier
14 |         self.data = data
15 |
16 |     def eval(self, save=False):
17 |         pos_scores = []
18 |         pos_labels = []
19 |         neg_scores = []
20 |         neg_labels = []
21 |         pos_nodes_num_list = []
22 |         neg_nodes_num_list = []
23 |         dataloader = DataLoader(self.data, batch_size=self.params.batch_size, shuffle=False, num_workers=self.params.num_workers, collate_fn=self.params.collate_fn)
24 |
25 |         self.graph_classifier.eval()
26 |         with torch.no_grad():
27 |             for b_idx, batch in enumerate(dataloader):
28 |                 # The fifth batch element (the corrupted graph used by the trainer) is not needed here
29 |                 data_pos, targets_pos, data_neg, targets_neg, _ = self.params.move_batch_to_device(batch, self.params.device)
30 |                 pos_nodes_num_list.extend(data_pos[0].batch_num_nodes)
31 |                 neg_nodes_num_list.extend(data_neg[0].batch_num_nodes)
32 |                 score_pos = self.graph_classifier(data_pos)
33 |                 score_neg = self.graph_classifier(data_neg)
34 |
35 |                 pos_scores += score_pos.squeeze(1).detach().cpu().tolist()
36 |                 neg_scores += score_neg.squeeze(1).detach().cpu().tolist()
37 |                 pos_labels += targets_pos.tolist()
38 |                 neg_labels += targets_neg.tolist()
39 |
40 |         assert not torch.any(torch.isnan(torch.tensor(pos_scores + neg_scores)))
41 |
42 |         auc = metrics.roc_auc_score(pos_labels + neg_labels, pos_scores + neg_scores)
43 |         auc_pr = metrics.average_precision_score(pos_labels + neg_labels, pos_scores + neg_scores)
44 |
45 |         if save:
46 |             pos_test_triplets_path = os.path.join(self.params.main_dir, 'data/{}/{}.txt'.format(self.params.dataset, self.data.file_name))
47 |             with open(pos_test_triplets_path) as f:
48 |                 pos_triplets = [line.split() for line in f if line.strip()]  # skip blank lines; keeps the last triplet even without a trailing newline
49 |             pos_file_path = os.path.join(self.params.main_dir, 'data/{}/grail_{}_predictions.txt'.format(self.params.dataset, self.data.file_name))
50 |             with open(pos_file_path, "w") as f:
51 |                 for ([s, r, o], score, nodes_num) in zip(pos_triplets, pos_scores, pos_nodes_num_list):
52 |                     f.write('\t'.join([s, r, o, str(score), str(nodes_num)]) + '\n')
53 |
54 |             neg_test_triplets_path = os.path.join(self.params.main_dir, 'data/{}/neg_{}_0.txt'.format(self.params.dataset, self.data.file_name))
55 |             with open(neg_test_triplets_path) as f:
56 |                 neg_triplets = [line.split() for line in f if line.strip()]  # same parsing as for the positive triplets
57 |             neg_file_path = os.path.join(self.params.main_dir, 'data/{}/grail_neg_{}_{}_predictions.txt'.format(self.params.dataset, self.data.file_name, self.params.constrained_neg_prob))
58 |             with open(neg_file_path, "w") as f:
59 |                 for ([s, r, o], score, nodes_num) in zip(neg_triplets, neg_scores, neg_nodes_num_list):
60 |                     f.write('\t'.join([s, r, o, str(score), str(nodes_num)]) + '\n')
61 |
62 |         return {'auc': auc, 'auc_pr': auc_pr}
63 |
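The evaluator above pools the positive and negative scores and hands them to `metrics.roc_auc_score`. As a rough illustration of what that number measures, the sketch below computes ROC AUC directly as the fraction of (positive, negative) score pairs that are ranked correctly, with ties counted as half. `pairwise_auc` is a hypothetical helper written for this note and is not part of the repository; it is quadratic in the number of scores and only meant for small inputs.

```python
def pairwise_auc(pos_scores, neg_scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Hypothetical helper, for illustration only.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Three of the four (positive, negative) pairs below are ordered correctly.
print(pairwise_auc([0.9, 0.8], [0.1, 0.85]))
```

An AUC of 1.0 would mean every positive triplet outscores every negative one, and 0.5 corresponds to random ranking.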
--------------------------------------------------------------------------------
/managers/trainer.py:
--------------------------------------------------------------------------------
1 | import statistics
2 | import timeit
3 | import os
4 | import logging
5 | import pdb
6 | import numpy as np
7 | import time
8 |
9 | import torch
10 | import torch.nn as nn
11 | import torch.optim as optim
12 | import torch.nn.functional as F
13 | from torch.utils.data import DataLoader
14 | import dgl
15 | from sklearn import metrics
16 |
17 |
18 | class Trainer():
19 | def __init__(self, params, graph_classifier, train, valid_evaluator=None):
20 | self.graph_classifier = graph_classifier
21 | self.valid_evaluator = valid_evaluator
22 | self.params = params
23 | self.train_data = train
24 |
25 | self.updates_counter = 0
26 |
27 | model_params = list(self.graph_classifier.parameters())
28 | logging.info('Total number of parameters: %d' % sum(map(lambda x: x.numel(), model_params)))
29 |
30 | if params.optimizer == "SGD":
31 | self.optimizer = optim.SGD(model_params, lr=params.lr, momentum=params.momentum, weight_decay=self.params.l2)
32 | if params.optimizer == "Adam":
33 | self.optimizer = optim.Adam(model_params, lr=params.lr, weight_decay=self.params.l2)
34 |
35 | self.criterion = nn.MarginRankingLoss(self.params.margin, reduction='mean')
36 | self.b_xent = nn.BCEWithLogitsLoss()
37 | self.reset_training_state()
38 |
39 | def reset_training_state(self):
40 | self.best_metric = 0
41 | self.last_metric = 0
42 | self.not_improved_count = 0
43 |
44 | def train_epoch(self):
45 | total_loss = 0
46 | total_MI_loss = 0
47 | all_labels = []
48 | all_scores = []
49 |
50 | dataloader = DataLoader(self.train_data, batch_size=self.params.batch_size, shuffle=True, num_workers=self.params.num_workers, collate_fn=self.params.collate_fn)
51 | self.graph_classifier.train()
52 | model_params = list(self.graph_classifier.parameters())
53 |
54 | for b_idx, batch in enumerate(dataloader):
55 | # print("batch:", b_idx)
56 |
57 | # Input positive and negative graph
58 | data_pos, targets_pos, data_neg, targets_neg, data_cor = self.params.move_batch_to_device(batch, self.params.device)
59 | self.optimizer.zero_grad()
60 | self.graph_classifier.train()
61 | score_pos, s_G_pos, s_g_pos = self.graph_classifier(data_pos, is_return_emb=True)
62 | score_neg = self.graph_classifier(data_neg)
63 |
64 | # loss = self.criterion(score_pos, score_neg.view(len(score_pos), -1).mean(dim=1), torch.Tensor([1]).to(device=self.params.device))
65 | loss = self.criterion(score_pos.squeeze(-1), score_neg.view(len(score_pos), -1).mean(dim=1), torch.Tensor([1]).to(device=self.params.device))
66 | # print(f"loss: {loss}")
67 |
68 | dgi_loss = 0
69 | if self.params.coef_dgi_loss:
70 | _, _, s_g_cor = self.graph_classifier(data_cor, is_return_emb=True, cor_graph=True)
71 |
72 | # Calculate the DGI loss
73 | lbl_1 = torch.ones(data_pos[0].batch_size)
74 | lbl_2 = torch.zeros(data_pos[0].batch_size)
75 | lbl = torch.cat((lbl_1, lbl_2)).to(self.params.device)
76 | logits = self.graph_classifier.get_logits(s_G_pos, s_g_pos, s_g_cor)
77 | dgi_loss = self.b_xent(logits, lbl)
78 |
79 | print(f'supervised loss: {loss}, NCE loss: {dgi_loss}')
80 | loss = loss + self.params.coef_dgi_loss * dgi_loss
81 |
82 | loss.backward()
83 | self.optimizer.step()
84 | self.updates_counter += 1
85 |
86 | with torch.no_grad():
87 | all_scores += score_pos.squeeze().detach().cpu().tolist() + score_neg.squeeze().detach().cpu().tolist()
88 | all_labels += targets_pos.tolist() + targets_neg.tolist()
89 | total_loss += loss
90 | total_MI_loss += dgi_loss
91 |
92 | if self.valid_evaluator and self.params.eval_every_iter and self.updates_counter % self.params.eval_every_iter == 0:
93 | tic = time.time()
94 | result = self.valid_evaluator.eval()
95 | logging.info('\nPerformance:' + str(result) + 'in ' + str(time.time() - tic))
96 |
97 | if result['auc'] >= self.best_metric:
98 | self.save_classifier()
99 | self.best_metric = result['auc']
100 | self.not_improved_count = 0
101 | else:
102 | self.not_improved_count += 1
103 | if self.not_improved_count > self.params.early_stop:
104 | logging.info(f"Validation performance didn\'t improve for {self.params.early_stop} epochs. Training stops.")
105 | break
106 |
107 | self.last_metric = result['auc']
108 |
109 | auc = metrics.roc_auc_score(all_labels, all_scores)
110 | auc_pr = metrics.average_precision_score(all_labels, all_scores)
111 |
112 | weight_norm = sum(map(lambda x: torch.norm(x), model_params))
113 |
114 | return total_loss, total_MI_loss, auc, auc_pr, weight_norm
115 |
116 | def train(self):
117 | self.reset_training_state()
118 |
119 | for epoch in range(1, self.params.num_epochs + 1):
120 | time_start = time.time()
121 | loss, MI_loss, auc, auc_pr, weight_norm = self.train_epoch()
122 | time_elapsed = time.time() - time_start
123 | logging.info(f'Epoch {epoch} with loss: {loss}, MI loss: {MI_loss}, training auc: {auc}, training auc_pr: {auc_pr}, best validation AUC: {self.best_metric}, weight_norm: {weight_norm} in {time_elapsed}')
124 |
125 | if epoch % self.params.save_every == 0:
126 | torch.save(self.graph_classifier, os.path.join(self.params.exp_dir, 'graph_classifier_chk.pth'))
127 |
128 | def save_classifier(self):
129 | torch.save(self.graph_classifier, os.path.join(self.params.exp_dir, 'best_graph_classifier.pth')) # torch.save overwrites any existing file at this path
130 | logging.info('Better model found w.r.t. validation AUC. Saved it!')
131 |
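The validation bookkeeping in `train_epoch` above (keep the best AUC, count evaluations without improvement, stop once the count exceeds `early_stop`) can be isolated as a small patience counter. The following is a minimal sketch only: `EarlyStopper` is a hypothetical helper that does not exist in the repository, and the initial `best_metric` of 0 is an assumption, since `reset_training_state` is not shown here.

```python
class EarlyStopper:
    """Hypothetical helper mirroring the trainer's best-metric bookkeeping."""

    def __init__(self, patience):
        self.patience = patience
        self.best_metric = 0.0  # assumed starting value
        self.not_improved_count = 0

    def update(self, metric):
        """Record one validation result; return True when training should stop."""
        if metric >= self.best_metric:
            # Same `>=` comparison as the trainer: ties also reset the counter.
            self.best_metric = metric
            self.not_improved_count = 0
            return False
        self.not_improved_count += 1
        # Same strict `>` check as the trainer.
        return self.not_improved_count > self.patience


stopper = EarlyStopper(patience=2)
decisions = [stopper.update(auc) for auc in [0.70, 0.75, 0.74, 0.73, 0.72]]
```

With `patience=2`, training stops on the third consecutive evaluation that fails to match the best AUC, because the check is a strict `>`.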
--------------------------------------------------------------------------------
/model/dgl/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__init__.py
--------------------------------------------------------------------------------
/model/dgl/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/model/dgl/__pycache__/aggregators.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/aggregators.cpython-36.pyc
--------------------------------------------------------------------------------
/model/dgl/__pycache__/batch_gru.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/batch_gru.cpython-36.pyc
--------------------------------------------------------------------------------
/model/dgl/__pycache__/discriminator.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/discriminator.cpython-36.pyc
--------------------------------------------------------------------------------
/model/dgl/__pycache__/graph_classifier.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/graph_classifier.cpython-36.pyc
--------------------------------------------------------------------------------
/model/dgl/__pycache__/layers.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/layers.cpython-36.pyc
--------------------------------------------------------------------------------
/model/dgl/__pycache__/rgcn_model.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/model/dgl/__pycache__/rgcn_model.cpython-36.pyc
--------------------------------------------------------------------------------
/model/dgl/aggregators.py:
--------------------------------------------------------------------------------
1 | import abc
2 | import torch.nn as nn
3 | import torch
4 | import torch.nn.functional as F
5 |
6 |
7 | class Aggregator(nn.Module):
8 | def __init__(self, emb_dim):
9 | super(Aggregator, self).__init__()
10 |
11 | def forward(self, node):
12 | curr_emb = node.mailbox['curr_emb'][:, 0, :] # (B, F)
13 | nei_msg = torch.bmm(node.mailbox['alpha'].transpose(1, 2), node.mailbox['msg']).squeeze(1) # (B, F)
14 | # nei_msg, _ = torch.max(node.mailbox['msg'], 1) # (B, F)
15 |
16 | new_emb = self.update_embedding(curr_emb, nei_msg)
17 |
18 | return {'h': new_emb}
19 |
20 | @abc.abstractmethod
21 | def update_embedding(self, curr_emb, nei_msg):
22 | raise NotImplementedError
23 |
24 |
25 | class SumAggregator(Aggregator):
26 | def __init__(self, emb_dim):
27 | super(SumAggregator, self).__init__(emb_dim)
28 |
29 | def update_embedding(self, curr_emb, nei_msg):
30 | new_emb = nei_msg + curr_emb
31 |
32 | return new_emb
33 |
34 |
35 | class MLPAggregator(Aggregator):
36 | def __init__(self, emb_dim):
37 | super(MLPAggregator, self).__init__(emb_dim)
38 | self.linear = nn.Linear(2 * emb_dim, emb_dim)
39 |
40 | def update_embedding(self, curr_emb, nei_msg):
41 | inp = torch.cat((nei_msg, curr_emb), 1)
42 | new_emb = F.relu(self.linear(inp))
43 |
44 | return new_emb
45 |
46 |
47 | class GRUAggregator(Aggregator):
48 | def __init__(self, emb_dim):
49 | super(GRUAggregator, self).__init__(emb_dim)
50 | self.gru = nn.GRUCell(emb_dim, emb_dim)
51 |
52 | def update_embedding(self, curr_emb, nei_msg):
53 | new_emb = self.gru(nei_msg, curr_emb)
54 |
55 | return new_emb
56 |
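For a single destination node, the `torch.bmm(alpha.transpose(1, 2), msg)` call in `Aggregator.forward` reduces to an attention-weighted sum of the incoming messages, which `SumAggregator` then adds to the node's own embedding. A plain-Python sketch of that arithmetic follows; the function names are illustrative and not part of the repository.

```python
def weighted_neighbor_sum(alphas, msgs):
    """Compute sum_j alphas[j] * msgs[j] for one destination node."""
    dim = len(msgs[0])
    return [sum(a * m[k] for a, m in zip(alphas, msgs)) for k in range(dim)]


def sum_update(curr_emb, nei_msg):
    """SumAggregator-style update: neighbor message plus current embedding."""
    return [c + n for c, n in zip(curr_emb, nei_msg)]


alphas = [1.0, 0.5]                # one attention weight per incoming edge
msgs = [[1.0, 2.0], [4.0, 6.0]]    # one message vector per incoming edge
nei_msg = weighted_neighbor_sum(alphas, msgs)
new_emb = sum_update([10.0, 20.0], nei_msg)
```

The MLP and GRU aggregators differ only in the final update step: they pass the same `nei_msg` and `curr_emb` through a linear layer or a GRU cell instead of adding them.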
--------------------------------------------------------------------------------
/model/dgl/batch_gru.py:
--------------------------------------------------------------------------------
1 | import torch.nn as nn
2 | import torch
3 | import torch.nn.functional as F
4 | import math
5 |
6 | class BatchGRU(nn.Module):
7 | def __init__(self, hidden_size=300):
8 | super(BatchGRU, self).__init__()
9 | self.hidden_size = hidden_size
10 | self.gru = nn.GRU(self.hidden_size, self.hidden_size, batch_first=True,
11 | bidirectional=True)
12 | self.bias = nn.Parameter(torch.Tensor(self.hidden_size))
13 | self.bias.data.uniform_(-1.0 / math.sqrt(self.hidden_size),
14 | 1.0 / math.sqrt(self.hidden_size))
15 |
16 |
17 | def forward(self, node, a_scope):
18 | hidden = node
19 | # print(hidden.shape)
20 | message = F.relu(node + self.bias)
21 | MAX_node_len = max(a_scope)
22 | # padding
23 | message_lst = []
24 | hidden_lst = []
25 | a_start = 0
26 | for i in a_scope:
27 | i = int(i)
28 | if i == 0:
29 | assert 0
30 | cur_message = message.narrow(0, a_start, i)
31 | cur_hidden = hidden.narrow(0, a_start, i)
32 | hidden_lst.append(cur_hidden.max(0)[0].unsqueeze(0).unsqueeze(0))
33 | a_start += i
34 | cur_message = torch.nn.ZeroPad2d((0,0,0,MAX_node_len-cur_message.shape[0]))(cur_message)
35 | message_lst.append(cur_message.unsqueeze(0))
36 |
37 | message_lst = torch.cat(message_lst, 0)
38 | hidden_lst = torch.cat(hidden_lst, 1)
39 | hidden_lst = hidden_lst.repeat(2,1,1)
40 | cur_message, cur_hidden = self.gru(message_lst, hidden_lst)
41 |
42 | # unpadding
43 | cur_message_unpadding = []
44 | kk = 0
45 | for a_size in a_scope:
46 | a_size = int(a_size)
47 | cur_message_unpadding.append(cur_message[kk, :a_size].view(-1, 2*self.hidden_size))
48 | kk += 1
49 | cur_message_unpadding = torch.cat(cur_message_unpadding, 0)
50 |
51 | return cur_message_unpadding
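`BatchGRU.forward` receives the node rows of all graphs in a batch as one flat tensor, slices it per graph using `a_scope`, zero-pads each slice to the largest graph, and later strips the padding again. The sketch below reproduces only that pad/unpad bookkeeping with plain lists. It leaves out the GRU itself, and the helper names are hypothetical.

```python
def pad_by_scope(rows, scope):
    """Split flat rows into per-graph segments and zero-pad each to max(scope)."""
    max_len = max(scope)
    dim = len(rows[0])
    padded, start = [], 0
    for size in scope:
        segment = rows[start:start + size]
        # Append all-zero rows so every segment has max_len rows.
        segment = segment + [[0.0] * dim for _ in range(max_len - size)]
        padded.append(segment)
        start += size
    return padded


def unpad_by_scope(padded, scope):
    """Drop the padding rows and flatten back to one row per node."""
    flat = []
    for segment, size in zip(padded, scope):
        flat.extend(segment[:size])
    return flat


rows = [[1.0], [2.0], [3.0]]   # three nodes in total
scope = [1, 2]                 # graph 0 has 1 node, graph 1 has 2 nodes
padded = pad_by_scope(rows, scope)
restored = unpad_by_scope(padded, scope)
```

Padding gives every graph the same sequence length, which a batched GRU call with `batch_first=True` requires.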
--------------------------------------------------------------------------------
/model/dgl/discriminator.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import time
4 | class Discriminator(nn.Module):
5 | r""" Discriminator module for calculating MI"""
6 |
7 | def __init__(self, n_e, n_g):
8 | """
9 | param: n_e: dimension of edge embedding
10 | param: n_g: dimension of graph embedding
11 | """
12 | super(Discriminator, self).__init__()
13 | self.f_k = nn.Bilinear(n_e, n_g, 1)
14 |
15 | for m in self.modules():
16 | self.weights_init(m)
17 |
18 | def weights_init(self, m):
19 | if isinstance(m, nn.Bilinear):
20 | torch.nn.init.xavier_uniform_(m.weight.data)
21 | if m.bias is not None:
22 | m.bias.data.fill_(0.0)
23 |
24 | def forward(self, c, h_pl, h_mi, s_bias1=None, s_bias2=None):
25 | c_x = torch.unsqueeze(c, 0) # [1, F]
26 | c_x = c_x.expand_as(h_pl) #[B, F]
27 |
28 | sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 1) # [B]; self.f_k(h_pl, c_x): [B, 1]
29 | sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 1) # [B]
30 |
31 | # print('Discriminator time:', time.time() - ts)
32 | if s_bias1 is not None:
33 | sc_1 += s_bias1
34 | if s_bias2 is not None:
35 | sc_2 += s_bias2
36 |
37 | logits = torch.cat((sc_1, sc_2))
38 |
39 | return logits
40 |
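The discriminator scores each local embedding `h` against the summary vector `c` with a bilinear form, `h^T W c + b`, and concatenates the scores of the positive and corrupted samples. The following plain-Python sketch shows that computation for a single output unit. The weight matrix `W` and all inputs are made-up illustration values, and the function names are not part of the repository.

```python
def bilinear_score(h, W, c, bias=0.0):
    """Compute h^T W c + bias for one pair of vectors."""
    return sum(h[i] * W[i][j] * c[j]
               for i in range(len(h))
               for j in range(len(c))) + bias


def discriminator_logits(c, h_pos, h_neg, W):
    """Scores for positive samples followed by scores for corrupted samples."""
    pos = [bilinear_score(h, W, c) for h in h_pos]
    neg = [bilinear_score(h, W, c) for h in h_neg]
    return pos + neg


W = [[1.0, 0.0], [0.0, 1.0]]          # identity, so the score is a dot product
c = [1.0, 2.0]                         # summary vector shared by all samples
h_pos = [[1.0, 0.0], [0.0, 1.0]]
h_neg = [[1.0, 1.0], [2.0, 0.0]]
logits = discriminator_logits(c, h_pos, h_neg, W)
```

The first half of `logits` belongs to the positive samples and the second half to the corrupted ones, matching the `torch.cat((sc_1, sc_2))` ordering.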
--------------------------------------------------------------------------------
/model/dgl/graph_classifier.py:
--------------------------------------------------------------------------------
1 | from .rgcn_model import RGCN
2 | from dgl import mean_nodes
3 | import torch.nn as nn
4 | import torch.nn.functional as F
5 | import torch
6 | import time
7 | import numpy as np
8 | from .discriminator import Discriminator
9 | from .batch_gru import BatchGRU
10 | """
11 | File based off of dgl tutorial on RGCN
12 | Source: https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn
13 | """
14 |
15 |
16 | class GraphClassifier(nn.Module):
17 | def __init__(self, params, relation2id, ent2rels): # in_dim, h_dim, rel_emb_dim, out_dim, num_rels, num_bases):
18 | super().__init__()
19 |
20 | self.params = params
21 | self.relation2id = relation2id
22 | self.ent2rels = ent2rels
23 | self.gnn = RGCN(params) # in_dim, h_dim, h_dim, num_rels, num_bases)
24 |
25 | # num_rels + 1 instead of num_rels, in order to add a "padding" relation.
26 | self.rel_emb = nn.Embedding(self.params.num_rels + 1, self.params.inp_dim, sparse=False, padding_idx=self.params.num_rels)
27 |
28 | self.ent_padding = nn.Parameter(torch.FloatTensor(1, self.params.sem_dim).uniform_(-1, 1))
29 | if self.params.init_nei_rels == 'both':
30 | self.w_rel2ent = nn.Linear(2 * self.params.inp_dim, self.params.sem_dim)
31 | elif self.params.init_nei_rels in ('out', 'in'):
32 | self.w_rel2ent = nn.Linear(self.params.inp_dim, self.params.sem_dim)
33 |
34 | self.sigmoid = nn.Sigmoid()
35 | self.nei_rels_dropout = nn.Dropout(self.params.nei_rels_dropout)
36 | self.dropout = nn.Dropout(self.params.dropout)
37 | self.softmax = nn.Softmax(dim=1)
38 |
39 | if self.params.add_ht_emb:
40 | # self.fc_layer = nn.Linear(3 * self.params.num_gcn_layers * self.params.emb_dim + self.params.rel_emb_dim, 1)
41 | self.fc_layer = nn.Linear(3 * self.params.num_gcn_layers * self.params.emb_dim + self.params.emb_dim, 1)
42 | else:
43 | self.fc_layer = nn.Linear(self.params.num_gcn_layers * self.params.emb_dim + self.params.rel_emb_dim, 1)
44 |
45 | if self.params.comp_hrt:
46 | self.fc_layer = nn.Linear(2 * self.params.num_gcn_layers * self.params.emb_dim, 1)
47 |
48 | if self.params.nei_rel_path:
49 | self.fc_layer = nn.Linear(3 * self.params.num_gcn_layers * self.params.emb_dim + 2 * self.params.emb_dim, 1)
50 |
51 | if self.params.comp_ht == 'mlp':
52 | self.fc_comp = nn.Linear(2 * self.params.emb_dim, self.params.emb_dim)
53 |
54 | if self.params.nei_rel_path:
55 | self.disc = Discriminator(self.params.num_gcn_layers * self.params.emb_dim + self.params.emb_dim, self.params.num_gcn_layers * self.params.emb_dim + self.params.emb_dim)
56 | else:
57 | self.disc = Discriminator(self.params.num_gcn_layers * self.params.emb_dim , self.params.num_gcn_layers * self.params.emb_dim)
58 |
59 | self.rnn = torch.nn.GRU(self.params.emb_dim, self.params.emb_dim, batch_first=True)
60 |
61 | self.batch_gru = BatchGRU(self.params.num_gcn_layers * self.params.emb_dim )
62 |
63 | self.W_o = nn.Linear(self.params.num_gcn_layers * self.params.emb_dim * 2, self.params.num_gcn_layers * self.params.emb_dim)
64 |
65 | def init_ent_emb_matrix(self, g):
66 | """ Initialize feature of entities by matrix form """
67 | out_nei_rels = g.ndata['out_nei_rels']
68 | in_nei_rels = g.ndata['in_nei_rels']
69 |
70 | target_rels = g.ndata['r_label']
71 | out_nei_rels_emb = self.rel_emb(out_nei_rels)
72 | in_nei_rels_emb = self.rel_emb(in_nei_rels)
73 | target_rels_emb = self.rel_emb(target_rels).unsqueeze(2)
74 |
75 | out_atts = self.softmax(self.nei_rels_dropout(torch.matmul(out_nei_rels_emb, target_rels_emb).squeeze(2)))
76 | in_atts = self.softmax(self.nei_rels_dropout(torch.matmul(in_nei_rels_emb, target_rels_emb).squeeze(2)))
77 | out_sem_feats = torch.matmul(out_atts.unsqueeze(1), out_nei_rels_emb).squeeze(1)
78 | in_sem_feats = torch.matmul(in_atts.unsqueeze(1), in_nei_rels_emb).squeeze(1)
79 |
80 | if self.params.init_nei_rels == 'both':
81 | ent_sem_feats = self.sigmoid(self.w_rel2ent(torch.cat([out_sem_feats, in_sem_feats], dim=1)))
82 | elif self.params.init_nei_rels == 'out':
83 | ent_sem_feats = self.sigmoid(self.w_rel2ent(out_sem_feats))
84 | elif self.params.init_nei_rels == 'in':
85 | ent_sem_feats = self.sigmoid(self.w_rel2ent(in_sem_feats))
86 |
87 | g.ndata['init'] = torch.cat([g.ndata['feat'], ent_sem_feats], dim=1) # [B, self.inp_dim]
88 |
89 | def comp_ht_emb(self, head_embs, tail_embs):
90 | if self.params.comp_ht == 'mult':
91 | ht_embs = head_embs * tail_embs
92 | elif self.params.comp_ht == 'mlp':
93 | ht_embs = self.fc_comp(torch.cat([head_embs, tail_embs], dim=1))
94 | elif self.params.comp_ht == 'sum':
95 | ht_embs = head_embs + tail_embs
96 | else:
97 | raise KeyError(f'composition operator of head and tail embedding {self.params.comp_ht} not recognized.')
98 |
99 | return ht_embs
100 |
101 | def comp_hrt_emb(self, head_embs, tail_embs, rel_embs):
102 | rel_embs = rel_embs.repeat(1, self.params.num_gcn_layers)
103 | if self.params.comp_hrt == 'TransE':
104 | hrt_embs = head_embs + rel_embs - tail_embs
105 | elif self.params.comp_hrt == 'DistMult':
106 | hrt_embs = head_embs * rel_embs * tail_embs
107 | else: raise KeyError(f'composition operator of (h, r, t) embedding {self.params.comp_hrt} not recognized.')
108 |
109 | return hrt_embs
110 |
111 | def nei_rel_path(self, g, rel_labels, r_emb_out):
112 | """ Neighboring relational path module """
113 | # Only consider in-degree relations first.
114 | nei_rels = g.ndata['in_nei_rels']
115 | head_ids = (g.ndata['id'] == 1).nonzero().squeeze(1)
116 | tail_ids = (g.ndata['id'] == 2).nonzero().squeeze(1)
117 | heads_rels = nei_rels[head_ids]
118 | tails_rels = nei_rels[tail_ids]
119 |
120 | # Extract neighboring relational paths
121 | batch_paths = []
122 | for (head_rels, r_t, tail_rels) in zip(heads_rels, rel_labels, tails_rels):
123 | paths = []
124 | for h_r in head_rels:
125 | for t_r in tail_rels:
126 | path = [h_r, r_t, t_r]
127 | paths.append(path)
128 | batch_paths.append(paths) # [B, n_paths, 3] , n_paths = n_head_rels * n_tail_rels
129 |
130 | batch_paths = torch.LongTensor(batch_paths).to(rel_labels.device)# [B, n_paths, 3], n_paths = n_head_rels * n_tail_rels
131 | batch_size = batch_paths.shape[0]
132 | batch_paths = batch_paths.view(batch_size * len(paths), -1) # [B * n_paths, 3]
133 |
134 | batch_paths_embs = F.embedding(batch_paths, r_emb_out, padding_idx=-1) # [B * n_paths, 3, inp_dim]
135 |
136 | # Input RNN
137 | _, last_state = self.rnn(batch_paths_embs) # last_state: [1, B * n_paths, inp_dim]
138 | last_state = last_state.squeeze(0) # squeeze the dim 0
139 | last_state = last_state.view(batch_size, len(paths), self.params.emb_dim) # [B, n_paths, inp_dim]
140 | # Aggregate paths by mean pooling or attention
141 | if self.params.path_agg == 'mean':
142 | output = torch.mean(last_state, 1) # [B, inp_dim]
143 |
144 | elif self.params.path_agg == 'att':
145 | r_label_embs = F.embedding(rel_labels, r_emb_out, padding_idx=-1) .unsqueeze(2) # [B, inp_dim, 1]
146 | atts = torch.matmul(last_state, r_label_embs).squeeze(2) # [B, n_paths]
147 | atts = F.softmax(atts, dim=1).unsqueeze(1) # [B, 1, n_paths]
148 | output = torch.matmul(atts, last_state).squeeze(1) # [B, 1, n_paths] * [B, n_paths, inp_dim] -> [B, 1, inp_dim] -> [B, inp_dim]
149 | else:
150 | raise ValueError('unknown path_agg')
151 |
152 | return output # [B, inp_dim]
153 |
154 | def get_logits(self, s_G, s_g_pos, s_g_cor):
155 | ret = self.disc(s_G, s_g_pos, s_g_cor)
156 | return ret
157 |
158 | def forward(self, data, is_return_emb=False, cor_graph=False):
159 | # Initialize the embedding of entities
160 | g, rel_labels = data
161 |
162 | # Neighboring Relational Feature Module
163 | ## Initialize the embedding of nodes by neighbor relations
164 | if self.params.init_nei_rels == 'no':
165 | g.ndata['init'] = g.ndata['feat'].clone()
166 | else:
167 | self.init_ent_emb_matrix(g)
168 |
169 | # Corrupt the node feature
170 | if cor_graph:
171 | g.ndata['init'] = g.ndata['init'][torch.randperm(g.ndata['feat'].shape[0])]
172 |
173 | # r: Embedding of relation
174 | r = self.rel_emb.weight.clone()
175 |
176 | # Input graph into GNN to get embeddings.
177 | g.ndata['h'], r_emb_out = self.gnn(g, r)
178 |
179 | # GRU layer for nodes
180 | graph_sizes = g.batch_num_nodes()
181 | out_dim = self.params.num_gcn_layers * self.params.emb_dim
182 | g.ndata['repr'] = F.relu(self.batch_gru(g.ndata['repr'].view(-1, out_dim), graph_sizes))
183 | node_hiddens = F.relu(self.W_o(g.ndata['repr'])) # num_nodes x hidden
184 | g.ndata['repr'] = self.dropout(node_hiddens) # num_nodes x hidden
185 | g_out = mean_nodes(g, 'repr').view(-1, out_dim)
186 |
187 | # Get embedding of target nodes (i.e. head and tail nodes)
188 | head_ids = (g.ndata['id'] == 1).nonzero().squeeze(1)
189 | head_embs = g.ndata['repr'][head_ids]
190 | tail_ids = (g.ndata['id'] == 2).nonzero().squeeze(1)
191 | tail_embs = g.ndata['repr'][tail_ids]
192 |
193 | if self.params.add_ht_emb:
194 | g_rep = torch.cat([g_out,
195 | head_embs.view(-1, out_dim),
196 | tail_embs.view(-1, out_dim),
197 | F.embedding(rel_labels, r_emb_out, padding_idx=-1)], dim=1)
198 | else:
199 | g_rep = torch.cat([g_out, self.rel_emb(rel_labels)], dim=1)
200 |
201 | # Represent subgraph by composing (h, r, t) in some way. (Not used in the paper)
202 | if self.params.comp_hrt:
203 | edge_embs = self.comp_hrt_emb(head_embs.view(-1, out_dim), tail_embs.view(-1, out_dim), F.embedding(rel_labels, r_emb_out, padding_idx=-1))
204 | g_rep = torch.cat([g_out, edge_embs], dim=1)
205 |
206 | # Model neighboring relational paths
207 | if self.params.nei_rel_path:
208 | # Model neighboring relational path
209 | g_p = self.nei_rel_path(g, rel_labels, r_emb_out)
210 | g_rep = torch.cat([g_rep, g_p], dim=1)
211 | s_g = torch.cat([g_out, g_p], dim=1)
212 | else:
213 | s_g = g_out
214 | output = self.fc_layer(g_rep)
215 |
216 | self.r_emb_out = r_emb_out
217 |
218 | if not is_return_emb:
219 | return output
220 | else:
221 | # Get the subgraph-level embedding
222 | s_G = s_g.mean(0)
223 | return output, s_G, s_g
224 |
225 |
226 |
227 |
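The nested loops in `nei_rel_path` build, for each target triple, every length-3 relation sequence of the form (relation neighboring the head, target relation, relation neighboring the tail). A minimal stand-alone sketch of that enumeration using `itertools.product` follows. `enumerate_paths` is an illustrative name, and the relation ids are arbitrary.

```python
from itertools import product


def enumerate_paths(head_rels, target_rel, tail_rels):
    """All [head_rel, target_rel, tail_rel] paths, head-major order."""
    return [[h_r, target_rel, t_r] for h_r, t_r in product(head_rels, tail_rels)]


paths = enumerate_paths([3, 7], 5, [1, 2])
```

The number of paths per triple is `len(head_rels) * len(tail_rels)`. The module relies on this being the same for every triple in the batch when it reshapes to `[B * n_paths, 3]`.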
--------------------------------------------------------------------------------
/model/dgl/layers.py:
--------------------------------------------------------------------------------
1 | """
2 | File based off of dgl tutorial on RGCN
3 | Source: https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn
4 | """
5 | import torch
6 | import torch.nn as nn
7 | import torch.nn.functional as F
8 |
9 |
10 | class Identity(nn.Module):
11 | """A placeholder identity operator that is argument-insensitive.
12 | (Identity has already been supported by PyTorch 1.2, we will directly
13 | import torch.nn.Identity in the future)
14 | """
15 |
16 | def __init__(self):
17 | super(Identity, self).__init__()
18 |
19 | def forward(self, x):
20 | """Return input"""
21 | return x
22 |
23 |
24 | class RGCNLayer(nn.Module):
25 | def __init__(self, inp_dim, out_dim, aggregator, bias=None, activation=None, dropout=0.0, edge_dropout=0.0, is_input_layer=False):
26 | super(RGCNLayer, self).__init__()
27 | self.bias = bias
28 | self.activation = activation
29 |
30 | if self.bias:
31 | self.bias = nn.Parameter(torch.Tensor(out_dim))
32 | # xavier_uniform_ requires a tensor with at least 2 dimensions, so zero-initialize the 1-D bias
33 | nn.init.zeros_(self.bias)
34 |
35 | self.aggregator = aggregator
36 |
37 | if dropout:
38 | self.dropout = nn.Dropout(dropout)
39 | else:
40 | self.dropout = None
41 |
42 | if edge_dropout:
43 | self.edge_dropout = nn.Dropout(edge_dropout)
44 | else:
45 | self.edge_dropout = Identity()
46 |
47 | # define how propagation is done in subclass
48 | def propagate(self, g):
49 | raise NotImplementedError
50 |
51 | def forward(self, g, rel_emb, attn_rel_emb=None):
52 | raise NotImplementedError
53 |
54 | class RGCNBasisLayer(RGCNLayer):
55 | def __init__(self, inp_dim, out_dim, aggregator, attn_rel_emb_dim, num_rels, num_bases=-1, bias=None,
56 | activation=None, dropout=0.0, edge_dropout=0.0, is_input_layer=False, has_attn=False, is_comp=''):
57 | super(
58 | RGCNBasisLayer,
59 | self).__init__(
60 | inp_dim,
61 | out_dim,
62 | aggregator,
63 | bias,
64 | activation,
65 | dropout=dropout,
66 | edge_dropout=edge_dropout,
67 | is_input_layer=is_input_layer)
68 | self.inp_dim = inp_dim
69 | self.out_dim = out_dim
70 | self.attn_rel_emb_dim = attn_rel_emb_dim
71 | self.num_rels = num_rels
72 | self.num_bases = num_bases
73 | self.is_input_layer = is_input_layer
74 | self.has_attn = has_attn
75 | self.is_comp = is_comp
76 |
77 | if self.num_bases <= 0 or self.num_bases > self.num_rels:
78 | self.num_bases = self.num_rels
79 |
80 | # add basis weights
81 | # self.weight = basis_weights
82 | self.weight = nn.Parameter(torch.Tensor(self.num_bases, self.inp_dim, self.out_dim))
83 | self.w_comp = nn.Parameter(torch.Tensor(self.num_rels, self.num_bases))
84 | # Project relation embedding to current node input embedding
85 | self.w_rel = nn.Parameter(torch.Tensor(self.inp_dim, self.out_dim))
86 | if self.has_attn:
87 | self.A = nn.Linear(2 * self.inp_dim + 2 * self.attn_rel_emb_dim, inp_dim)
88 | self.B = nn.Linear(inp_dim, 1)
89 |
90 | self.self_loop_weight = nn.Parameter(torch.Tensor(self.inp_dim, self.out_dim))
91 |
92 | nn.init.xavier_uniform_(self.self_loop_weight, gain=nn.init.calculate_gain('relu'))
93 | nn.init.xavier_uniform_(self.weight, gain=nn.init.calculate_gain('relu'))
94 | nn.init.xavier_uniform_(self.w_comp, gain=nn.init.calculate_gain('relu'))
95 | nn.init.xavier_uniform_(self.w_rel, gain=nn.init.calculate_gain('relu'))
96 |
97 | def propagate(self, g, attn_rel_emb=None):
98 |
99 | # generate all weights from bases
100 | weight = self.weight.view(self.num_bases,
101 | self.inp_dim * self.out_dim)
102 | weight = torch.matmul(self.w_comp, weight).view(
103 | self.num_rels, self.inp_dim, self.out_dim)
104 |
105 | g.edata['w'] = self.edge_dropout(torch.ones(g.number_of_edges(), 1).to(weight.device))
106 |
107 | # input_ = 'feat' if self.is_input_layer else 'h'
108 | input_ = 'init' if self.is_input_layer else 'h'
109 |
110 | def comp(h, edge_data):
111 | """ Refer to CompGCN """
112 | if self.is_comp == 'mult':
113 | return h * edge_data
114 | elif self.is_comp == 'sub':
115 | return h - edge_data
116 | else:
117 | raise KeyError(f'composition operator {self.is_comp} not recognized.')
118 |
119 | def msg_func(edges):
120 | w = weight.index_select(0, edges.data['type'])
121 |
122 | # Similar to CompGCN to interact nodes and relations
123 | if self.is_comp:
124 | edge_data = comp(edges.src[input_], F.embedding(edges.data['type'], self.rel_emb, padding_idx=-1))
125 | else:
126 | edge_data = edges.src[input_]
127 |
128 | msg = edges.data['w'] * torch.bmm(edge_data.unsqueeze(1), w).squeeze(1)
129 |
130 | curr_emb = torch.mm(edges.dst[input_], self.self_loop_weight) # (B, F)
131 |
132 | if self.has_attn:
133 | e = torch.cat([edges.src[input_], edges.dst[input_], attn_rel_emb(edges.data['type']), attn_rel_emb(edges.data['label'])], dim=1)
134 | a = torch.sigmoid(self.B(F.relu(self.A(e))))
135 | else:
136 | a = torch.ones((len(edges), 1)).to(device=w.device)
137 |
138 | return {'curr_emb': curr_emb, 'msg': msg, 'alpha': a}
139 |
140 | g.update_all(msg_func, self.aggregator, None)
141 |
142 | def forward(self, g, rel_emb, attn_rel_emb=None):
143 | self.rel_emb = rel_emb
144 | self.propagate(g, attn_rel_emb)
145 |
146 | # apply bias and activation
147 | node_repr = g.ndata['h']
148 | if isinstance(self.bias, torch.Tensor):
149 | node_repr = node_repr + self.bias
150 | if self.activation:
151 | node_repr = self.activation(node_repr)
152 | if self.dropout:
153 | node_repr = self.dropout(node_repr)
154 |
155 | g.ndata['h'] = node_repr
156 |
157 | if self.is_input_layer:
158 | g.ndata['repr'] = g.ndata['h'].unsqueeze(1)
159 | else:
160 | g.ndata['repr'] = torch.cat([g.ndata['repr'], g.ndata['h'].unsqueeze(1)], dim=1)
161 |
162 | rel_emb_out = torch.matmul(self.rel_emb, self.w_rel)
163 | rel_emb_out[-1, :].zero_() # padding embedding as 0
164 | return rel_emb_out
165 |
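`RGCNBasisLayer.propagate` does not store one weight matrix per relation. It composes each relation's matrix as a linear combination of shared basis matrices, `W_r = sum_b w_comp[r][b] * V_b`. The sketch below spells out that composition with nested lists. The function name and the numeric values are purely illustrative.

```python
def compose_relation_weights(w_comp, bases):
    """Return one matrix per relation: W_r = sum_b w_comp[r][b] * bases[b]."""
    num_rows = len(bases[0])
    num_cols = len(bases[0][0])
    return [
        [
            [sum(coeffs[b] * bases[b][i][j] for b in range(len(bases)))
             for j in range(num_cols)]
            for i in range(num_rows)
        ]
        for coeffs in w_comp
    ]


bases = [
    [[1.0, 0.0], [0.0, 1.0]],   # basis 0
    [[0.0, 1.0], [1.0, 0.0]],   # basis 1
]
w_comp = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]   # 3 relations, 2 bases
weights = compose_relation_weights(w_comp, bases)
```

With fewer bases than relations, the parameter count for the weight tensor grows with `num_bases` rather than `num_rels`, which is the usual motivation for basis decomposition in R-GCN.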
--------------------------------------------------------------------------------
/model/dgl/rgcn_model.py:
--------------------------------------------------------------------------------
1 | """
2 | File based off of dgl tutorial on RGCN
3 | Source: https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn
4 | """
5 |
6 | import torch
7 | import torch.nn as nn
8 | import torch.nn.functional as F
9 | from .layers import RGCNBasisLayer as RGCNLayer
10 |
11 | from .aggregators import SumAggregator, MLPAggregator, GRUAggregator
12 |
13 |
14 | class RGCN(nn.Module):
15 | def __init__(self, params):
16 | super(RGCN, self).__init__()
17 |
18 | self.max_label_value = params.max_label_value
19 | self.inp_dim = params.inp_dim
20 | self.emb_dim = params.emb_dim
21 | self.attn_rel_emb_dim = params.attn_rel_emb_dim
22 | self.num_rels = params.num_rels
23 | self.aug_num_rels = params.aug_num_rels
24 | self.num_bases = params.num_bases
25 | self.num_hidden_layers = params.num_gcn_layers
26 | self.dropout = params.dropout
27 | self.edge_dropout = params.edge_dropout
28 | # self.aggregator_type = params.gnn_agg_type
29 | self.has_attn = params.has_attn
30 |
31 | self.is_comp = params.is_comp
32 |
33 | self.device = params.device
34 |
35 | if self.has_attn:
36 | self.attn_rel_emb = nn.Embedding(self.num_rels, self.attn_rel_emb_dim, sparse=False)
37 | else:
38 | self.attn_rel_emb = None
39 |
40 | # initialize aggregators for input and hidden layers
41 | if params.gnn_agg_type == "sum":
42 | self.aggregator = SumAggregator(self.emb_dim)
43 | elif params.gnn_agg_type == "mlp":
44 | self.aggregator = MLPAggregator(self.emb_dim)
45 | elif params.gnn_agg_type == "gru":
46 | self.aggregator = GRUAggregator(self.emb_dim)
47 |
48 | # initialize basis weights for input and hidden layers
49 | # self.input_basis_weights = nn.Parameter(torch.Tensor(self.num_bases, self.inp_dim, self.emb_dim))
50 | # self.basis_weights = nn.Parameter(torch.Tensor(self.num_bases, self.emb_dim, self.emb_dim))
51 |
52 | # create rgcn layers
53 | self.build_model()
54 |
55 | # create initial features
56 | self.features = self.create_features()
57 |
58 | def create_features(self):
59 | features = torch.arange(self.inp_dim).to(device=self.device)
60 | return features
61 |
62 | def build_model(self):
63 | self.layers = nn.ModuleList()
64 | # i2h
65 | i2h = self.build_input_layer()
66 | if i2h is not None:
67 | self.layers.append(i2h)
68 | # h2h
69 | for idx in range(self.num_hidden_layers - 1):
70 | h2h = self.build_hidden_layer(idx)
71 | self.layers.append(h2h)
72 |
73 | def build_input_layer(self):
74 | return RGCNLayer(self.inp_dim,
75 | self.emb_dim,
76 | # self.input_basis_weights,
77 | self.aggregator,
78 | self.attn_rel_emb_dim,
79 | self.aug_num_rels,
80 | self.num_bases,
81 | activation=F.relu,
82 | dropout=self.dropout,
83 | edge_dropout=self.edge_dropout,
84 | is_input_layer=True,
85 | has_attn=self.has_attn,
86 | is_comp=self.is_comp)
87 |
88 | def build_hidden_layer(self, idx):
89 | return RGCNLayer(self.emb_dim,
90 | self.emb_dim,
91 | # self.basis_weights,
92 | self.aggregator,
93 | self.attn_rel_emb_dim,
94 | self.aug_num_rels,
95 | self.num_bases,
96 | activation=F.relu,
97 | dropout=self.dropout,
98 | edge_dropout=self.edge_dropout,
99 | has_attn=self.has_attn,
100 | is_comp=self.is_comp)
101 |
102 | def forward(self, g, r):
103 | for layer in self.layers:
104 | r = layer(g, r, self.attn_rel_emb)
105 | return g.ndata.pop('h'), r
106 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | dgl==0.4.2
2 | lmdb==0.98
3 | networkx==2.4
4 | scikit-learn==0.22.1
5 | torch==1.4.0
6 | tqdm==4.43.0
--------------------------------------------------------------------------------
/snri.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/snri.png
--------------------------------------------------------------------------------
/subgraph_extraction/__pycache__/datasets.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/subgraph_extraction/__pycache__/datasets.cpython-36.pyc
--------------------------------------------------------------------------------
/subgraph_extraction/__pycache__/graph_sampler.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tebmer/SNRI/0ff40e2a92750d5f7d963f828055d8e4edb4b213/subgraph_extraction/__pycache__/graph_sampler.cpython-36.pyc
--------------------------------------------------------------------------------
/subgraph_extraction/datasets.py:
--------------------------------------------------------------------------------
1 | from torch.utils.data import Dataset
2 | import timeit
3 | import os
4 | import logging
5 | import lmdb
6 | import numpy as np
7 | import json
8 | import pickle
9 | import dgl
10 | from utils.graph_utils import ssp_multigraph_to_dgl, incidence_matrix
11 | from utils.data_utils import process_files, save_to_file, plot_rel_dist
12 | from .graph_sampler import *
13 | import pdb
14 |
15 |
16 | def generate_subgraph_datasets(params, splits=['train', 'valid'], saved_relation2id=None, max_label_value=None, is_ent2rels=None):
17 |
18 | testing = 'test' in splits
19 | adj_list, triplets, entity2id, relation2id, id2entity, id2relation, h2r, m_h2r, t2r, m_t2r = process_files(params.file_paths, saved_relation2id, sort_data=params.sort_data)
20 |
21 | # plot_rel_dist(adj_list, os.path.join(params.main_dir, f'data/{params.dataset}/rel_dist.png'))
22 |
23 | data_path = os.path.join(params.main_dir, f'data/{params.dataset}/relation2id.json')
24 | if not os.path.isdir(data_path) and not testing:
25 | with open(data_path, 'w') as f:
26 | json.dump(relation2id, f)
27 |
28 | graphs = {}
29 |
30 | for split_name in splits:
31 | graphs[split_name] = {'triplets': triplets[split_name], 'max_size': params.max_links}
32 |
33 | # Sample train and valid/test links
34 | for split_name, split in graphs.items():
35 | logging.info(f"Sampling negative links for {split_name}")
36 | split['pos'], split['neg'] = sample_neg(adj_list, split['triplets'], params.num_neg_samples_per_link, max_size=split['max_size'], constrained_neg_prob=params.constrained_neg_prob)
37 |
38 | if testing:
39 | directory = os.path.join(params.main_dir, 'data/{}/'.format(params.dataset))
40 | save_to_file(directory, f'neg_{params.test_file}_{params.constrained_neg_prob}.txt', graphs['test']['neg'], id2entity, id2relation)
41 |
42 | links2subgraphs(adj_list, graphs, params, max_label_value)
43 |
44 |
45 | def get_kge_embeddings(dataset, kge_model):
46 |
47 | path = './experiments/kge_baselines/{}_{}'.format(kge_model, dataset)
48 | node_features = np.load(os.path.join(path, 'entity_embedding.npy'))
49 | with open(os.path.join(path, 'id2entity.json')) as json_file:
50 | kge_id2entity = json.load(json_file)
51 | kge_entity2id = {v: int(k) for k, v in kge_id2entity.items()}
52 |
53 | return node_features, kge_entity2id
54 |
55 |
56 | class SubgraphDataset(Dataset):
57 | """Extracted, labeled, subgraph dataset -- DGL Only"""
58 |
59 | def __init__(self, db_path, db_name_pos, db_name_neg, raw_data_paths, included_relations=None, add_traspose_rels=False, num_neg_samples_per_link=1, use_kge_embeddings=False, dataset='', kge_model='', file_name='', is_ret_nodes_num=False):
60 |
61 | self.main_env = lmdb.open(db_path, readonly=True, max_dbs=3, lock=False)
62 | self.db_pos = self.main_env.open_db(db_name_pos.encode())
63 | self.db_neg = self.main_env.open_db(db_name_neg.encode())
64 | self.node_features, self.kge_entity2id = get_kge_embeddings(dataset, kge_model) if use_kge_embeddings else (None, None)
65 | self.num_neg_samples_per_link = num_neg_samples_per_link
66 | self.file_name = file_name
67 | self.is_ret_nodes_num = is_ret_nodes_num
68 |
69 | ssp_graph, __, __, __, id2entity, id2relation, h2r, m_h2r, t2r, m_t2r = process_files(raw_data_paths, included_relations, add_traspose_rels)
70 | self.num_rels = len(ssp_graph)
71 |
72 | # Add transpose matrices to handle both directions of relations.
73 | if add_traspose_rels:
74 | ssp_graph_t = [adj.T for adj in ssp_graph]
75 | ssp_graph += ssp_graph_t
76 |
77 | # the effective number of relations after adding symmetric adjacency matrices and/or self connections
78 | self.aug_num_rels = len(ssp_graph)
79 | self.graph = ssp_multigraph_to_dgl(ssp_graph)
80 | self.ssp_graph = ssp_graph
81 | self.id2entity = id2entity
82 | self.id2relation = id2relation
83 | self.m_h2r = m_h2r
84 | self.m_t2r = m_t2r
85 |
86 | self.max_n_label = np.array([0, 0])
87 | with self.main_env.begin() as txn:
88 | self.max_n_label[0] = int.from_bytes(txn.get('max_n_label_sub'.encode()), byteorder='little')
89 | self.max_n_label[1] = int.from_bytes(txn.get('max_n_label_obj'.encode()), byteorder='little')
90 |
91 | self.avg_subgraph_size = struct.unpack('f', txn.get('avg_subgraph_size'.encode()))
92 | self.min_subgraph_size = struct.unpack('f', txn.get('min_subgraph_size'.encode()))
93 | self.max_subgraph_size = struct.unpack('f', txn.get('max_subgraph_size'.encode()))
94 | self.std_subgraph_size = struct.unpack('f', txn.get('std_subgraph_size'.encode()))
95 |
96 | self.avg_enc_ratio = struct.unpack('f', txn.get('avg_enc_ratio'.encode()))
97 | self.min_enc_ratio = struct.unpack('f', txn.get('min_enc_ratio'.encode()))
98 | self.max_enc_ratio = struct.unpack('f', txn.get('max_enc_ratio'.encode()))
99 | self.std_enc_ratio = struct.unpack('f', txn.get('std_enc_ratio'.encode()))
100 |
101 | self.avg_num_pruned_nodes = struct.unpack('f', txn.get('avg_num_pruned_nodes'.encode()))
102 | self.min_num_pruned_nodes = struct.unpack('f', txn.get('min_num_pruned_nodes'.encode()))
103 | self.max_num_pruned_nodes = struct.unpack('f', txn.get('max_num_pruned_nodes'.encode()))
104 | self.std_num_pruned_nodes = struct.unpack('f', txn.get('std_num_pruned_nodes'.encode()))
105 |
106 | logging.info(f"Max distance from sub : {self.max_n_label[0]}, Max distance from obj : {self.max_n_label[1]}")
107 |
108 | # logging.info('=====================')
109 | # logging.info(f"Subgraph size stats: \n Avg size {self.avg_subgraph_size}, \n Min size {self.min_subgraph_size}, \n Max size {self.max_subgraph_size}, \n Std {self.std_subgraph_size}")
110 |
111 | # logging.info('=====================')
112 | # logging.info(f"Enclosed nodes ratio stats: \n Avg size {self.avg_enc_ratio}, \n Min size {self.min_enc_ratio}, \n Max size {self.max_enc_ratio}, \n Std {self.std_enc_ratio}")
113 |
114 | # logging.info('=====================')
115 | # logging.info(f"# of pruned nodes stats: \n Avg size {self.avg_num_pruned_nodes}, \n Min size {self.min_num_pruned_nodes}, \n Max size {self.max_num_pruned_nodes}, \n Std {self.std_num_pruned_nodes}")
116 |
117 | with self.main_env.begin(db=self.db_pos) as txn:
118 | self.num_graphs_pos = int.from_bytes(txn.get('num_graphs'.encode()), byteorder='little')
119 | with self.main_env.begin(db=self.db_neg) as txn:
120 | self.num_graphs_neg = int.from_bytes(txn.get('num_graphs'.encode()), byteorder='little')
121 |
122 | self.__getitem__(0)
123 |
124 | def __getitem__(self, index):
125 | with self.main_env.begin(db=self.db_pos) as txn:
126 | str_id = '{:08}'.format(index).encode('ascii')
127 | nodes_pos, r_label_pos, g_label_pos, n_labels_pos = deserialize(txn.get(str_id)).values()
128 | subgraph_pos = self._prepare_subgraphs(nodes_pos, r_label_pos, n_labels_pos)
129 |
130 | # Get the neighbor relations of target head and tails
131 | # nei_rels_pos = [self.ent2rels[nodes_pos[0]], self.ent2rels[nodes_pos[1]]]
132 | nei_rels_pos = [[0, 1], [0, 1]]
133 |
134 | subgraphs_neg = []
135 | r_labels_neg = []
136 | g_labels_neg = []
137 | nei_rels_negs = []
138 |
139 | with self.main_env.begin(db=self.db_neg) as txn:
140 | for i in range(self.num_neg_samples_per_link):
141 | str_id = '{:08}'.format(index + i * (self.num_graphs_pos)).encode('ascii')
142 | nodes_neg, r_label_neg, g_label_neg, n_labels_neg = deserialize(txn.get(str_id)).values()
143 | subgraphs_neg.append(self._prepare_subgraphs(nodes_neg, r_label_neg, n_labels_neg))
144 | # Get the neighbor relations of target head and tails
145 | # nei_rels_neg = [self.ent2rels[nodes_neg[0]], self.ent2rels[nodes_neg[1]]]
146 | nei_rels_neg = [[0, 1], [0, 1]]
147 | nei_rels_negs.append(nei_rels_neg)
148 | r_labels_neg.append(r_label_neg)
149 | g_labels_neg.append(g_label_neg)
150 |
151 | # print("Nodes of subgraph: ", len(subgraph_pos.nodes()))
152 | return subgraph_pos, g_label_pos, r_label_pos, subgraphs_neg, g_labels_neg, r_labels_neg
153 |
154 | def __len__(self):
155 | return self.num_graphs_pos
156 |
157 | def _prepare_subgraphs(self, nodes, r_label, n_labels):
158 | subgraph: dgl.DGLGraph = self.graph.subgraph(nodes)
159 | subgraph.edata['type'] = self.graph.edata['type'][subgraph.edata[dgl.EID]]
160 | subgraph.edata['label'] = torch.tensor(r_label * np.ones(subgraph.edata['type'].shape), dtype=torch.long)
161 |
162 | # Check if the target relation is in the subgraph
163 | has_rel = subgraph.has_edges_between(0, 1)
164 | if has_rel:
165 | edges_btw_roots = subgraph.edge_ids(0, 1)
166 | # rel_link = np.nonzero(subgraph.edata['type'][edges_btw_roots] == r_label)
167 | rel_link = subgraph.edata['type'][edges_btw_roots].item() == r_label # True if the edge between the roots already has the target relation type
168 | # if rel_link.squeeze().nelement() == 0:
169 | if not has_rel or not rel_link: # Either no edge connects the roots, or the connecting edge has a different relation type. This is expected for negative samples, whose target triple does not exist in the graph, so the target edge is added manually.
170 | # subgraph.add_edge(0, 1)
171 | subgraph.add_edges([0], [1])
172 | subgraph.edata['type'][-1] = torch.tensor(r_label).type(torch.LongTensor)
173 | subgraph.edata['label'][-1] = torch.tensor(r_label).type(torch.LongTensor)
174 |
175 | # map the id read by GraIL to the entity IDs as registered by the KGE embeddings
176 | kge_nodes = [self.kge_entity2id[self.id2entity[n]] for n in nodes] if self.kge_entity2id else None
177 | n_feats = self.node_features[kge_nodes] if self.node_features is not None else None
178 | subgraph = self._prepare_features_new(subgraph, n_labels, r_label, n_feats)
179 |
180 | # Add the original node id feature
181 | subgraph.ndata['parent_id'] = self.graph.subgraph(nodes).ndata[dgl.NID]
182 | # Add the neighbor relations
183 | subgraph.ndata['out_nei_rels'] = torch.LongTensor(self.m_h2r[subgraph.ndata['parent_id']])
184 | subgraph.ndata['in_nei_rels'] = torch.LongTensor(self.m_t2r[subgraph.ndata['parent_id']])
185 |
186 | return subgraph
187 |
188 | def _prepare_features(self, subgraph, n_labels, n_feats=None):
189 | # One-hot encode the double-radius node labels and concatenate them with n_feats
190 | n_nodes = subgraph.number_of_nodes()
191 | label_feats = np.zeros((n_nodes, self.max_n_label[0] + 1 + self.max_n_label[1] + 1))
192 | label_feats[np.arange(n_nodes), n_labels[:, 0]] = 1
193 | label_feats[np.arange(n_nodes), self.max_n_label[0] + 1 + n_labels[:, 1]] = 1
194 | n_feats = np.concatenate((label_feats, n_feats), axis=1) if n_feats is not None else label_feats
195 | subgraph.ndata['feat'] = torch.FloatTensor(n_feats)
196 | self.n_feat_dim = n_feats.shape[1] # Find cleaner way to do this -- i.e. set the n_feat_dim
197 | return subgraph
198 |
199 | def _prepare_features_new(self, subgraph, n_labels, r_label, n_feats=None):
200 | # One-hot encode the double-radius node labels and concatenate them with n_feats
201 | n_nodes = subgraph.number_of_nodes()
202 | label_feats = np.zeros((n_nodes, self.max_n_label[0] + 1 + self.max_n_label[1] + 1))
203 | label_feats[np.arange(n_nodes), n_labels[:, 0]] = 1
204 | label_feats[np.arange(n_nodes), self.max_n_label[0] + 1 + n_labels[:, 1]] = 1
205 | # label_feats = np.zeros((n_nodes, self.max_n_label[0] + 1 + self.max_n_label[1] + 1))
206 | # label_feats[np.arange(n_nodes), 0] = 1
207 | # label_feats[np.arange(n_nodes), self.max_n_label[0] + 1] = 1
208 | n_feats = np.concatenate((label_feats, n_feats), axis=1) if n_feats is not None else label_feats
209 | subgraph.ndata['feat'] = torch.FloatTensor(n_feats)
210 |
211 | head_id = np.argwhere([label[0] == 0 and label[1] == 1 for label in n_labels])
212 | tail_id = np.argwhere([label[0] == 1 and label[1] == 0 for label in n_labels])
213 | n_ids = np.zeros(n_nodes)
214 | n_ids[head_id] = 1 # head
215 | n_ids[tail_id] = 2 # tail
216 |
217 | # 'id' marks the target head (1) and target tail (2) nodes; all other nodes are 0
218 | subgraph.ndata['id'] = torch.FloatTensor(n_ids)
219 |
220 | # 'r_label' is used to represent the relation label of this subgraph
221 | subgraph.ndata['r_label'] = torch.LongTensor(np.ones(n_nodes) * r_label)
222 | self.n_feat_dim = n_feats.shape[1] # Find cleaner way to do this -- i.e. set the n_feat_dim
223 |
224 |
225 | return subgraph
226 |
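
The feature layout built in `_prepare_features_new` can be illustrated without DGL or NumPy. The following is a minimal sketch only: the function name `one_hot_double_radius` and the toy labels are hypothetical and not part of this repository. It assumes each node label is a pair (distance to head, distance to tail), with the head labeled (0, 1) and the tail labeled (1, 0), as in `node_label`.

```python
def one_hot_double_radius(n_labels, max_sub, max_obj):
    """Encode (dist_to_head, dist_to_tail) pairs as two concatenated one-hot blocks.

    Hypothetical helper for illustration; it mirrors the indexing used in
    _prepare_features_new but works on plain Python lists.
    """
    width = (max_sub + 1) + (max_obj + 1)
    feats = []
    for d_sub, d_obj in n_labels:
        row = [0.0] * width
        row[d_sub] = 1.0                 # first block: distance to the head
        row[max_sub + 1 + d_obj] = 1.0   # second block: distance to the tail
        feats.append(row)
    return feats


# Toy labels: target head, target tail, and one neighbor node.
labels = [(0, 1), (1, 0), (1, 2)]
feats = one_hot_double_radius(labels, max_sub=2, max_obj=2)
for label, row in zip(labels, feats):
    print(label, row)
```

Each row has exactly two non-zero entries, one per block, so the feature width is `max_n_label[0] + 1 + max_n_label[1] + 1` before any KGE embeddings are concatenated.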
--------------------------------------------------------------------------------
/subgraph_extraction/graph_sampler.py:
--------------------------------------------------------------------------------
1 | import os
2 | import math
3 | import struct
4 | import logging
5 | import random
6 | import pickle as pkl
7 | import pdb
8 | from tqdm import tqdm
9 | import lmdb
10 | import multiprocessing as mp
11 | import numpy as np
12 | import scipy.io as sio
13 | import scipy.sparse as ssp
14 | import sys
15 | import torch
16 | from scipy.special import softmax
17 | from utils.dgl_utils import _bfs_relational
18 | from utils.graph_utils import incidence_matrix, remove_nodes, ssp_to_torch, serialize, deserialize, get_edge_count, diameter, radius
19 | import networkx as nx
20 |
21 |
22 | def sample_neg(adj_list, edges, num_neg_samples_per_link=1, max_size=1000000, constrained_neg_prob=0):
23 | pos_edges = edges
24 | neg_edges = []
25 |
26 | # if max_size is set, randomly sample train links
27 | if max_size < len(pos_edges):
28 | perm = np.random.permutation(len(pos_edges))[:max_size]
29 | pos_edges = pos_edges[perm]
30 |
31 | # sample negative links for train/test
32 | n, r = adj_list[0].shape[0], len(adj_list)
33 |
34 | # distribution of edges across relations
35 | theta = 0.001
36 | edge_count = get_edge_count(adj_list)
37 | rel_dist = np.zeros(edge_count.shape)
38 | idx = np.nonzero(edge_count)
39 | rel_dist[idx] = softmax(theta * edge_count[idx])
40 |
41 | # possible head and tails for each relation
42 | valid_heads = [adj.tocoo().row.tolist() for adj in adj_list]
43 | valid_tails = [adj.tocoo().col.tolist() for adj in adj_list]
44 |
45 | pbar = tqdm(total=len(pos_edges))
46 | while len(neg_edges) < num_neg_samples_per_link * len(pos_edges):
47 | neg_head, neg_tail, rel = pos_edges[pbar.n % len(pos_edges)][0], pos_edges[pbar.n % len(pos_edges)][1], pos_edges[pbar.n % len(pos_edges)][2]
48 | if np.random.uniform() < constrained_neg_prob:
49 | if np.random.uniform() < 0.5:
50 | neg_head = np.random.choice(valid_heads[rel])
51 | else:
52 | neg_tail = np.random.choice(valid_tails[rel])
53 | else:
54 | if np.random.uniform() < 0.5:
55 | neg_head = np.random.choice(n)
56 | else:
57 | neg_tail = np.random.choice(n)
58 |
59 | if neg_head != neg_tail and adj_list[rel][neg_head, neg_tail] == 0:
60 | neg_edges.append([neg_head, neg_tail, rel])
61 | pbar.update(1)
62 |
63 | pbar.close()
64 |
65 | neg_edges = np.array(neg_edges)
66 | return pos_edges, neg_edges
67 |
68 |
69 | def links2subgraphs(A, graphs, params, max_label_value=None):
70 | '''
71 | Extract enclosing subgraphs for every split and write them to LMDB, using one named database per split and label (e.g. train_pos, train_neg).
72 | '''
73 | max_n_label = {'value': np.array([0, 0])}
74 | subgraph_sizes = []
75 | enc_ratios = []
76 | num_pruned_nodes = []
77 |
78 | BYTES_PER_DATUM = get_average_subgraph_size(100, list(graphs.values())[0]['pos'], A, params) * 1.5
79 | links_length = 0
80 | for split_name, split in graphs.items():
81 | links_length += (len(split['pos']) + len(split['neg'])) * 2
82 | map_size = links_length * BYTES_PER_DATUM
83 |
84 | env = lmdb.open(params.db_path, map_size=map_size, max_dbs=6)
85 |
86 | def extraction_helper(A, links, g_labels, split_env):
87 |
88 | with env.begin(write=True, db=split_env) as txn:
89 | txn.put('num_graphs'.encode(), (len(links)).to_bytes(int.bit_length(len(links)), byteorder='little'))
90 |
91 | with mp.Pool(processes=None, initializer=intialize_worker, initargs=(A, params, max_label_value)) as p:
92 | args_ = zip(range(len(links)), links, g_labels)
93 | for (str_id, datum) in tqdm(p.imap(extract_save_subgraph, args_), total=len(links)):
94 | max_n_label['value'] = np.maximum(np.max(datum['n_labels'], axis=0), max_n_label['value'])
95 | subgraph_sizes.append(datum['subgraph_size'])
96 | enc_ratios.append(datum['enc_ratio'])
97 | num_pruned_nodes.append(datum['num_pruned_nodes'])
98 |
99 | with env.begin(write=True, db=split_env) as txn:
100 | txn.put(str_id, serialize(datum))
101 |
102 | for split_name, split in graphs.items():
103 | logging.info(f"Extracting enclosing subgraphs for positive links in {split_name} set")
104 | labels = np.ones(len(split['pos']))
105 | db_name_pos = split_name + '_pos'
106 | split_env = env.open_db(db_name_pos.encode())
107 | extraction_helper(A, split['pos'], labels, split_env)
108 |
109 | logging.info(f"Extracting enclosing subgraphs for negative links in {split_name} set")
110 | labels = np.zeros(len(split['neg']))
111 | db_name_neg = split_name + '_neg'
112 | split_env = env.open_db(db_name_neg.encode())
113 | extraction_helper(A, split['neg'], labels, split_env)
114 |
115 | max_n_label['value'] = max_label_value if max_label_value is not None else max_n_label['value']
116 |
117 | with env.begin(write=True) as txn:
118 | bit_len_label_sub = int.bit_length(int(max_n_label['value'][0]))
119 | bit_len_label_obj = int.bit_length(int(max_n_label['value'][1]))
120 | txn.put('max_n_label_sub'.encode(), (int(max_n_label['value'][0])).to_bytes(bit_len_label_sub, byteorder='little'))
121 | txn.put('max_n_label_obj'.encode(), (int(max_n_label['value'][1])).to_bytes(bit_len_label_obj, byteorder='little'))
122 |
123 | txn.put('avg_subgraph_size'.encode(), struct.pack('f', float(np.mean(subgraph_sizes))))
124 | txn.put('min_subgraph_size'.encode(), struct.pack('f', float(np.min(subgraph_sizes))))
125 | txn.put('max_subgraph_size'.encode(), struct.pack('f', float(np.max(subgraph_sizes))))
126 | txn.put('std_subgraph_size'.encode(), struct.pack('f', float(np.std(subgraph_sizes))))
127 |
128 | txn.put('avg_enc_ratio'.encode(), struct.pack('f', float(np.mean(enc_ratios))))
129 | txn.put('min_enc_ratio'.encode(), struct.pack('f', float(np.min(enc_ratios))))
130 | txn.put('max_enc_ratio'.encode(), struct.pack('f', float(np.max(enc_ratios))))
131 | txn.put('std_enc_ratio'.encode(), struct.pack('f', float(np.std(enc_ratios))))
132 |
133 | txn.put('avg_num_pruned_nodes'.encode(), struct.pack('f', float(np.mean(num_pruned_nodes))))
134 | txn.put('min_num_pruned_nodes'.encode(), struct.pack('f', float(np.min(num_pruned_nodes))))
135 | txn.put('max_num_pruned_nodes'.encode(), struct.pack('f', float(np.max(num_pruned_nodes))))
136 | txn.put('std_num_pruned_nodes'.encode(), struct.pack('f', float(np.std(num_pruned_nodes))))
137 |
138 |
139 | def get_average_subgraph_size(sample_size, links, A, params):
140 | total_size = 0
141 | for (n1, n2, r_label) in links[np.random.choice(len(links), sample_size)]:
142 | nodes, n_labels, subgraph_size, enc_ratio, num_pruned_nodes = subgraph_extraction_labeling((n1, n2), r_label, A, params.hop, params.enclosing_sub_graph, params.max_nodes_per_hop)
143 | datum = {'nodes': nodes, 'r_label': r_label, 'g_label': 0, 'n_labels': n_labels, 'subgraph_size': subgraph_size, 'enc_ratio': enc_ratio, 'num_pruned_nodes': num_pruned_nodes}
144 | total_size += len(serialize(datum))
145 | return total_size / sample_size
146 |
147 |
148 | def intialize_worker(A, params, max_label_value):
149 | global A_, params_, max_label_value_
150 | A_, params_, max_label_value_ = A, params, max_label_value
151 |
152 |
153 | def extract_save_subgraph(args_):
154 | idx, (n1, n2, r_label), g_label = args_
155 | nodes, n_labels, subgraph_size, enc_ratio, num_pruned_nodes = subgraph_extraction_labeling((n1, n2), r_label, A_, params_.hop, params_.enclosing_sub_graph, params_.max_nodes_per_hop)
156 |
157 | # max_label_value_ is to set the maximum possible value of node label while doing double-radius labelling.
158 | if max_label_value_ is not None:
159 | n_labels = np.array([np.minimum(label, max_label_value_).tolist() for label in n_labels])
160 |
161 | datum = {'nodes': nodes, 'r_label': r_label, 'g_label': g_label, 'n_labels': n_labels, 'subgraph_size': subgraph_size, 'enc_ratio': enc_ratio, 'num_pruned_nodes': num_pruned_nodes}
162 | str_id = '{:08}'.format(idx).encode('ascii')
163 |
164 | return (str_id, datum)
165 |
166 |
167 | def get_neighbor_nodes(roots, adj, h=1, max_nodes_per_hop=None):
168 | bfs_generator = _bfs_relational(adj, roots, max_nodes_per_hop)
169 | lvls = list()
170 | for _ in range(h):
171 | try:
172 | lvls.append(next(bfs_generator))
173 | except StopIteration:
174 | pass
175 | return set().union(*lvls)
176 |
177 |
178 | def subgraph_extraction_labeling(ind, rel, A_list, h=1, enclosing_sub_graph=False, max_nodes_per_hop=None, max_node_label_value=None):
179 | # extract the h-hop enclosing subgraphs around link 'ind'
180 | A_incidence = incidence_matrix(A_list)
181 | A_incidence += A_incidence.T
182 |
183 | root1_nei = get_neighbor_nodes(set([ind[0]]), A_incidence, h, max_nodes_per_hop)
184 | root2_nei = get_neighbor_nodes(set([ind[1]]), A_incidence, h, max_nodes_per_hop)
185 |
186 | subgraph_nei_nodes_int = root1_nei.intersection(root2_nei)
187 | subgraph_nei_nodes_un = root1_nei.union(root2_nei)
188 |
189 | # Extract subgraph | Roots being in the front is essential for labelling and the model to work properly.
190 | if enclosing_sub_graph:
191 | subgraph_nodes = list(ind) + list(subgraph_nei_nodes_int)
192 | else:
193 | subgraph_nodes = list(ind) + list(subgraph_nei_nodes_un)
194 |
195 | subgraph = [adj[subgraph_nodes, :][:, subgraph_nodes] for adj in A_list]
196 |
197 | labels, enclosing_subgraph_nodes = node_label(incidence_matrix(subgraph), max_distance=h)
198 |
199 | pruned_subgraph_nodes = np.array(subgraph_nodes)[enclosing_subgraph_nodes].tolist()
200 | pruned_labels = labels[enclosing_subgraph_nodes]
201 | # pruned_subgraph_nodes = subgraph_nodes
202 | # pruned_labels = labels
203 |
204 | if max_node_label_value is not None:
205 | pruned_labels = np.array([np.minimum(label, max_node_label_value).tolist() for label in pruned_labels])
206 |
207 | subgraph_size = len(pruned_subgraph_nodes)
208 | enc_ratio = len(subgraph_nei_nodes_int) / (len(subgraph_nei_nodes_un) + 1e-3)
209 | num_pruned_nodes = len(subgraph_nodes) - len(pruned_subgraph_nodes)
210 |
211 | return pruned_subgraph_nodes, pruned_labels, subgraph_size, enc_ratio, num_pruned_nodes
212 |
213 |
214 | def node_label(subgraph, max_distance=1):
215 | # implementation of the node labeling scheme described in the paper
216 | roots = [0, 1]
217 | sgs_single_root = [remove_nodes(subgraph, [root]) for root in roots]
218 | dist_to_roots = [np.clip(ssp.csgraph.dijkstra(sg, indices=[0], directed=False, unweighted=True, limit=1e6)[:, 1:], 0, 1e7) for r, sg in enumerate(sgs_single_root)]
219 | dist_to_roots = np.array(list(zip(dist_to_roots[0][0], dist_to_roots[1][0])), dtype=int)
220 |
221 | target_node_labels = np.array([[0, 1], [1, 0]])
222 | labels = np.concatenate((target_node_labels, dist_to_roots)) if dist_to_roots.size else target_node_labels
223 |
224 | enclosing_subgraph_nodes = np.where(np.max(labels, axis=1) <= max_distance)[0]
225 | return labels, enclosing_subgraph_nodes
226 |
--------------------------------------------------------------------------------
/test_auc.py:
--------------------------------------------------------------------------------
1 | # from comet_ml import Experiment
2 |
3 | import pdb
4 | import os
5 | os.environ['OPENBLAS_NUM_THREADS'] = '1'
6 | import argparse
7 | import logging
8 | import torch
9 | from scipy.sparse import SparseEfficiencyWarning
10 | import numpy as np
11 |
12 | from subgraph_extraction.datasets import SubgraphDataset, generate_subgraph_datasets
13 | from utils.initialization_utils import initialize_experiment, initialize_model
14 | from utils.graph_utils import collate_dgl, move_batch_to_device_dgl, collate_dgl_train, move_batch_to_device_dgl_train
15 | from managers.evaluator import Evaluator
16 | from utils.data_utils import process_files
17 |
18 | from warnings import simplefilter
19 |
20 |
21 | def main(params):
22 | simplefilter(action='ignore', category=UserWarning)
23 | simplefilter(action='ignore', category=SparseEfficiencyWarning)
24 |
25 | graph_classifier = initialize_model(params, None, load_model=True)
26 | adj_list, triplets, entity2id, relation2id, id2entity, id2relation, _,_,_,_ = process_files(params.file_paths, graph_classifier.relation2id)
27 | # ent2rels = {k: torch.LongTensor(v).to(device=params.device) for k, v in ent2rels.items()}
28 | # graph_classifier.ent2rels = ent2rels
29 |
30 | logging.info(f"Device: {params.device}")
31 |
32 | all_auc = []
33 | auc_mean = 0
34 |
35 | all_auc_pr = []
36 | auc_pr_mean = 0
37 | for r in range(1, params.runs + 1):
38 |
39 | params.db_path = os.path.join(params.main_dir, f'data/{params.dataset}/test_subgraphs_{params.experiment_name}_{params.constrained_neg_prob}_en_{params.enclosing_sub_graph}')
40 |
41 | generate_subgraph_datasets(params, splits=['test'],
42 | saved_relation2id=graph_classifier.relation2id,
43 | max_label_value=graph_classifier.gnn.max_label_value)
44 |
45 | test = SubgraphDataset(params.db_path, 'test_pos', 'test_neg', params.file_paths, graph_classifier.relation2id,
46 | add_traspose_rels=params.add_traspose_rels,
47 | num_neg_samples_per_link=params.num_neg_samples_per_link,
48 | use_kge_embeddings=params.use_kge_embeddings, dataset=params.dataset,
49 | kge_model=params.kge_model, file_name=params.test_file)
50 |
51 | test_evaluator = Evaluator(params, graph_classifier, test)
52 |
53 | result = test_evaluator.eval(save=True)
54 | logging.info('\nTest Set Performance:' + str(result))
55 | all_auc.append(result['auc'])
56 | auc_mean = auc_mean + (result['auc'] - auc_mean) / r
57 |
58 | all_auc_pr.append(result['auc_pr'])
59 | auc_pr_mean = auc_pr_mean + (result['auc_pr'] - auc_pr_mean) / r
60 |
61 | auc_std = np.std(all_auc)
62 | auc_pr_std = np.std(all_auc_pr)
63 |
64 | logging.info('\nAvg test Set Performance -- mean auc :' + str(np.mean(all_auc)) + ' std auc: ' + str(np.std(all_auc)))
65 | logging.info('\nAvg test Set Performance -- mean auc_pr :' + str(np.mean(all_auc_pr)) + ' std auc_pr: ' + str(np.std(all_auc_pr)))
66 |
67 |
68 | if __name__ == '__main__':
69 |
70 | logging.basicConfig(level=logging.INFO)
71 |
72 | parser = argparse.ArgumentParser(description='TransE model')
73 |
74 | # Experiment setup params
75 | parser.add_argument("--experiment_name", "-e", type=str, default="default",
76 | help="A folder with this name would be created to dump saved models and log files")
77 | parser.add_argument("--dataset", "-d", type=str, default="Toy",
78 | help="Dataset string")
79 | parser.add_argument("--train_file", "-tf", type=str, default="train",
80 | help="Name of file containing training triplets")
81 | parser.add_argument("--test_file", "-t", type=str, default="test",
82 | help="Name of file containing test triplets")
83 | parser.add_argument("--runs", type=int, default=1,
84 | help="How many runs to perform for mean and std?")
85 | parser.add_argument("--gpu", type=int, default=0,
86 | help="Which GPU to use?")
87 | parser.add_argument('--disable_cuda', action='store_true', # default value is False
88 | help='Disable CUDA')
89 |
90 | # Data processing pipeline params
91 | parser.add_argument("--max_links", type=int, default=100000,
92 | help="Set maximum number of links (to fit into memory)")
93 | parser.add_argument("--hop", type=int, default=3,
94 | help="Enclosing subgraph hop number")
95 | parser.add_argument("--max_nodes_per_hop", "-max_h", type=int, default=None,
96 | help="if > 0, upper bound the # nodes per hop by subsampling")
97 | parser.add_argument("--use_kge_embeddings", "-kge", type=lambda s: s.lower() in ('true', '1', 'yes'), default=False, # type=bool would turn the string "False" into True
98 | help='whether to use pretrained KGE embeddings')
99 | parser.add_argument("--kge_model", type=str, default="TransE",
100 | help="Which KGE model to load entity embeddings from")
101 | parser.add_argument('--model_type', '-m', type=str, choices=['dgl'], default='dgl',
102 | help='what format to store subgraphs in for model')
103 | parser.add_argument('--constrained_neg_prob', '-cn', type=float, default=0,
104 | help='with what probability to sample constrained heads/tails while neg sampling')
105 | parser.add_argument("--num_neg_samples_per_link", '-neg', type=int, default=1,
106 | help="Number of negative examples to sample per positive link")
107 | parser.add_argument("--batch_size", type=int, default=16,
108 | help="Batch size")
109 | parser.add_argument("--num_workers", type=int, default=8,
110 | help="Number of dataloading processes")
111 | parser.add_argument('--add_traspose_rels', '-tr', type=lambda s: s.lower() in ('true', '1', 'yes'), default=False,
112 | help='whether to append adj matrix list with symmetric relations')
113 | parser.add_argument('--enclosing_sub_graph', '-en', type=lambda s: s.lower() in ('true', '1', 'yes'), default=True,
114 | help='whether to only consider enclosing subgraph')
115 | # parser.add_argument('--comp_hrt', type=str, default='TransE')
116 | parser.add_argument('--sort_data', type=lambda s: s.lower() in ('true', '1', 'yes'), default=False)
117 |
118 | params = parser.parse_args()
119 | initialize_experiment(params, __file__)
120 |
121 | params.file_paths = {
122 | 'train': os.path.join(params.main_dir, 'data/{}/{}.txt'.format(params.dataset, params.train_file)),
123 | 'test': os.path.join(params.main_dir, 'data/{}/{}.txt'.format(params.dataset, params.test_file))
124 | }
125 |
126 | if not params.disable_cuda and torch.cuda.is_available():
127 | params.device = torch.device('cuda:%d' % params.gpu)
128 | else:
129 | params.device = torch.device('cpu')
130 |
131 | params.collate_fn = collate_dgl_train
132 | params.move_batch_to_device = move_batch_to_device_dgl_train
133 |
134 | main(params)
135 |
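
The loop in `main` updates `auc_mean` incrementally with `mean + (x - mean) / r`, while the final log line recomputes the mean from the stored list. Both give the same value. The sketch below checks this with the standard library; the function name `running_mean` and the AUC values are made up for illustration and are not results from this repository.

```python
import statistics


def running_mean(values):
    """Incremental mean: after r values, mean_r = mean_{r-1} + (x_r - mean_{r-1}) / r."""
    mean = 0.0
    for r, value in enumerate(values, start=1):
        mean = mean + (value - mean) / r
    return mean


aucs = [0.91, 0.89, 0.93]  # illustrative numbers only
incremental = running_mean(aucs)
batch = statistics.mean(aucs)
# np.std defaults to the population standard deviation (ddof=0),
# which corresponds to statistics.pstdev here.
spread = statistics.pstdev(aucs)
print(incremental, batch, spread)
```

Because the stored lists are needed for the standard deviation anyway, the incremental variables are redundant with `np.mean(all_auc)` and `np.mean(all_auc_pr)`.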
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
1 | import os
2 | os.environ['OPENBLAS_NUM_THREADS'] = '1'
3 | import argparse
4 | import logging
5 | import torch
6 | from scipy.sparse import SparseEfficiencyWarning
7 |
8 | from subgraph_extraction.datasets import SubgraphDataset, generate_subgraph_datasets
9 | from utils.initialization_utils import initialize_experiment, initialize_model
10 | from utils.graph_utils import collate_dgl, collate_dgl_train, move_batch_to_device_dgl, move_batch_to_device_dgl_train
11 |
12 | from model.dgl.graph_classifier import GraphClassifier as dgl_model
13 |
14 | from managers.evaluator import Evaluator
15 | from managers.trainer import Trainer
16 |
17 | from warnings import simplefilter
18 |
19 |
20 | def main(params):
21 | simplefilter(action='ignore', category=UserWarning)
22 | simplefilter(action='ignore', category=SparseEfficiencyWarning)
23 |
24 | params.db_path = os.path.join(params.main_dir, f'./data/{params.dataset}/subgraphs_en_{params.enclosing_sub_graph}_neg_{params.num_neg_samples_per_link}_hop_{params.hop}')
25 |
26 | if not os.path.isdir(params.db_path):
27 | generate_subgraph_datasets(params)
28 |
29 | train = SubgraphDataset(params.db_path, 'train_pos', 'train_neg', params.file_paths,
30 | add_traspose_rels=params.add_traspose_rels,
31 | num_neg_samples_per_link=params.num_neg_samples_per_link,
32 | use_kge_embeddings=params.use_kge_embeddings, dataset=params.dataset,
33 | kge_model=params.kge_model, file_name=params.train_file)
34 | valid = SubgraphDataset(params.db_path, 'valid_pos', 'valid_neg', params.file_paths,
35 | add_traspose_rels=params.add_traspose_rels,
36 | num_neg_samples_per_link=params.num_neg_samples_per_link,
37 | use_kge_embeddings=params.use_kge_embeddings, dataset=params.dataset,
38 | kge_model=params.kge_model, file_name=params.valid_file)
39 |
40 | # Reuse the train m_h2r and m_t2r for valid so both datasets share the same neighbor-relation tables
41 | valid.m_h2r = train.m_h2r
42 | valid.m_t2r = train.m_t2r
43 |
44 | params.num_rels = train.num_rels
45 | params.aug_num_rels = train.aug_num_rels
46 |
47 | # Set the embedding dimension of relation and node
48 | if params.init_nei_rels == 'no':
49 | params.inp_dim = train.n_feat_dim
50 | else:
51 | params.inp_dim = train.n_feat_dim + params.sem_dim
52 |
53 | # Log the max label value to save it in the model. This will be used to cap the labels generated on test set.
54 | params.max_label_value = train.max_n_label
55 |
56 | graph_classifier = initialize_model(params, dgl_model, params.load_model)
57 |
58 | logging.info(f"Device: {params.device}")
59 | logging.info(f"Input dim : {params.inp_dim}, # Relations : {params.num_rels}, # Augmented relations : {params.aug_num_rels}")
60 |
61 | valid_evaluator = Evaluator(params, graph_classifier, valid)
62 |
63 | trainer = Trainer(params, graph_classifier, train, valid_evaluator)
64 |
65 | logging.info('Starting training with full batch...')
66 |
67 | trainer.train()
68 |
69 |
70 | if __name__ == '__main__':
71 |
72 | logging.basicConfig(level=logging.INFO)
73 |
74 | parser = argparse.ArgumentParser(description='TransE model')
75 |
76 | # Experiment setup params
77 | parser.add_argument("--experiment_name", "-e", type=str, default="default",
78 | help="A folder with this name would be created to dump saved models and log files")
79 | parser.add_argument("--dataset", "-d", type=str,
80 | help="Dataset string")
81 | parser.add_argument("--gpu", type=int, default=0,
82 | help="Which GPU to use?")
83 | parser.add_argument('--disable_cuda', action='store_true',
84 | help='Disable CUDA')
85 | parser.add_argument('--load_model', action='store_true',
86 | help='Load existing model?')
87 | parser.add_argument("--train_file", "-tf", type=str, default="train",
88 | help="Name of file containing training triplets")
89 | parser.add_argument("--valid_file", "-vf", type=str, default="valid",
90 | help="Name of file containing validation triplets")
91 |
92 | # Training regime params
93 | parser.add_argument("--num_epochs", "-ne", type=int, default=30,
94 | help="Number of training epochs")
95 | parser.add_argument("--eval_every", type=int, default=3,
96 | help="Interval of epochs to evaluate the model?")
97 | parser.add_argument("--eval_every_iter", type=int, default=455,
98 | help="Interval of iterations to evaluate the model?")
99 | parser.add_argument("--save_every", type=int, default=10,
100 | help="Interval of epochs to save a checkpoint of the model?")
101 | parser.add_argument("--early_stop", type=int, default=100,
102 | help="Early stopping patience")
103 | parser.add_argument("--optimizer", type=str, default="Adam",
104 | help="Which optimizer to use?")
105 | parser.add_argument("--lr", type=float, default=0.001,
106 | help="Learning rate of the optimizer")
107 | parser.add_argument("--clip", type=int, default=1000,
108 | help="Maximum gradient norm allowed")
109 | parser.add_argument("--l2", type=float, default=5e-4,
110 | help="Regularization constant for GNN weights")
111 | parser.add_argument("--margin", type=float, default=10,
112 | help="The margin between positive and negative samples in the max-margin loss")
113 |
114 | # Data processing pipeline params
115 | parser.add_argument("--max_links", type=int, default=1000000,
116 | help="Set maximum number of train links (to fit into memory)")
117 | parser.add_argument("--hop", type=int, default=3,
118 | help="Enclosing subgraph hop number")
119 | parser.add_argument("--max_nodes_per_hop", "-max_h", type=int, default=None,
120 | help="if > 0, upper bound the # nodes per hop by subsampling")
121 | parser.add_argument("--use_kge_embeddings", "-kge", type=lambda s: s.lower() in ('true', '1', 'yes'), default=False, # type=bool would turn the string "False" into True
122 | help='whether to use pretrained KGE embeddings')
123 | parser.add_argument("--kge_model", type=str, default="TransE",
124 | help="Which KGE model to load entity embeddings from")
125 | parser.add_argument('--model_type', '-m', type=str, choices=['ssp', 'dgl'], default='dgl',
126 | help='what format to store subgraphs in for model')
127 | parser.add_argument('--constrained_neg_prob', '-cn', type=float, default=0.0,
128 | help='with what probability to sample constrained heads/tails while neg sampling')
129 | parser.add_argument("--batch_size", type=int, default=64,
130 | help="Batch size")
131 | parser.add_argument("--num_neg_samples_per_link", '-neg', type=int, default=1,
132 | help="Number of negative examples to sample per positive link")
133 | parser.add_argument("--num_workers", type=int, default=2,
134 | help="Number of dataloading processes")
135 | parser.add_argument('--add_traspose_rels', '-tr', type=bool, default=False,
136 | help='whether to append adj matrix list with symmetric relations')
137 | parser.add_argument('--enclosing_sub_graph', '-en', type=bool, default=True,
138 | help='whether to only consider enclosing subgraph')
139 |
140 | # Model params
141 | parser.add_argument("--rel_emb_dim", "-r_dim", type=int, default=32,
142 | help="Relation embedding size")
143 | parser.add_argument("--attn_rel_emb_dim", "-ar_dim", type=int, default=32,
144 | help="Relation embedding size for attention")
145 | parser.add_argument("--emb_dim", "-dim", type=int, default=32,
146 | help="Entity embedding size")
147 | parser.add_argument("--num_gcn_layers", "-l", type=int, default=3,
148 | help="Number of GCN layers")
149 | parser.add_argument("--num_bases", "-b", type=int, default=4,
150 | help="Number of basis functions to use for GCN weights")
151 | parser.add_argument("--dropout", type=float, default=0,
152 | help="Dropout rate in GNN layers")
153 | parser.add_argument("--edge_dropout", type=float, default=0.5,
154 | help="Dropout rate in edges of the subgraphs")
155 | parser.add_argument('--gnn_agg_type', '-a', type=str, choices=['sum', 'mlp', 'gru'], default='sum',
156 | help='what type of aggregation to do in gnn msg passing')
157 | parser.add_argument('--add_ht_emb', '-ht', type=bool, default=True,
158 | help='whether to concatenate head/tail embedding with pooled graph representation')
159 | parser.add_argument('--has_attn', '-attn', type=bool, default=True,
160 | help='whether to have attn in model or not')
161 | parser.add_argument('--sem_dim', type=int, default=24,
162 | help='the dimension of the semantic part of the node embedding')
163 | parser.add_argument('--max_nei_rels', type=int, default=10, help='the maximum number of neighbor relations per node when initializing the node embedding.')
164 | parser.add_argument('--nei_rels_dropout', type=float, default=0.4, help='Dropout rate in aggregating relation embeddings.')
165 | parser.add_argument('--is_comp', type=str, default='mult', choices=['mult', 'sub'], help='The composition manner of node and relation')
166 | parser.add_argument('--comp_ht', type=str, choices=['mult', 'mlp', 'sum'], default='sum', help='The composition operator of head and tail embedding')
167 | parser.add_argument('--comp_hrt', type=str, choices=['TransE', 'DistMult'], default=None, help='The composition operator of (h, r, t) embedding')
168 | parser.add_argument('--coef_dgi_loss', type=float, default=5, help='Coefficient of MI loss')
169 | parser.add_argument('--init_nei_rels', type=str, choices=['no', 'out', 'in', 'both'], default='in', help='how neighboring relations are used when initializing entity embeddings')
170 | parser.add_argument('--sort_data', type=bool, default=True,
171 | help='whether to sort training data by relation id')
172 | parser.add_argument('--nei_rel_path', action='store_false',
173 | help='consider neighboring relational paths (enabled by default; pass this flag to disable)')
174 | parser.add_argument('--path_agg', type=str, choices=['mean', 'att'], default='att', help='the manner of aggregating neighboring relational paths.')
175 |
176 | params = parser.parse_args()
177 | initialize_experiment(params, __file__)
178 |
179 | params.file_paths = {
180 | 'train': os.path.join(params.main_dir, 'data/{}/{}.txt'.format(params.dataset, params.train_file)),
181 | 'valid': os.path.join(params.main_dir, 'data/{}/{}.txt'.format(params.dataset, params.valid_file))
182 | }
183 |
184 | if not params.disable_cuda and torch.cuda.is_available():
185 | params.device = torch.device('cuda:%d' % params.gpu)
186 | else:
187 | params.device = torch.device('cpu')
188 |
189 | params.collate_fn = collate_dgl_train
190 | params.move_batch_to_device = move_batch_to_device_dgl_train
191 | main(params)
192 |
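
Several flags above are declared with `type=bool`. argparse calls `bool()` on the raw string, so any non-empty value, including `False`, parses as `True`. The sketch below shows the effect and one common workaround. `str2bool` is a hypothetical helper introduced here for illustration; it is not part of this repository.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: map common true/false spellings to a bool."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


# Same declaration style as in train.py: bool("False") is True,
# because every non-empty string is truthy.
naive = argparse.ArgumentParser()
naive.add_argument("--add_ht_emb", type=bool, default=True)
naive_args = naive.parse_args(["--add_ht_emb", "False"])

# With the helper, the string is interpreted rather than truth-tested.
strict = argparse.ArgumentParser()
strict.add_argument("--add_ht_emb", type=str2bool, default=True)
strict_args = strict.parse_args(["--add_ht_emb", "False"])

print(naive_args.add_ht_emb, strict_args.add_ht_emb)
```

With the declarations as written in train.py, the only way to get `False` for such a flag from the command line is to pass an empty string.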
--------------------------------------------------------------------------------
/utils/clean_data.py:
--------------------------------------------------------------------------------
1 | import os
2 | import argparse
3 | import numpy as np
4 |
5 |
6 | def write_to_file(file_name, data):
7 | with open(file_name, "w") as f:
8 | for s, r, o in data:
9 | f.write('\t'.join([s, r, o]) + '\n')
10 |
11 |
12 | def main(params):
13 | with open(os.path.join(params.main_dir, 'data', params.dataset, 'train.txt')) as f:
14 | train_data = [line.split() for line in f.read().split('\n')[:-1]]
15 | with open(os.path.join(params.main_dir, 'data', params.dataset, 'valid.txt')) as f:
16 | valid_data = [line.split() for line in f.read().split('\n')[:-1]]
17 | with open(os.path.join(params.main_dir, 'data', params.dataset, 'test.txt')) as f:
18 | test_data = [line.split() for line in f.read().split('\n')[:-1]]
19 |
20 | train_tails = set([d[2] for d in train_data])
21 | train_heads = set([d[0] for d in train_data])
22 | train_ent = train_tails.union(train_heads)
23 | train_rels = set([d[1] for d in train_data])
24 |
25 | filtered_valid_data = []
26 | for d in valid_data:
27 | if d[0] in train_ent and d[1] in train_rels and d[2] in train_ent:
28 | filtered_valid_data.append(d)
29 | else:
30 | train_data.append(d)
31 | train_ent = train_ent.union(set([d[0], d[2]]))
32 | train_rels = train_rels.union(set([d[1]]))
33 |
34 | filtered_test_data = []
35 | for d in test_data:
36 | if d[0] in train_ent and d[1] in train_rels and d[2] in train_ent:
37 | filtered_test_data.append(d)
38 | else:
39 | train_data.append(d)
40 | train_ent = train_ent.union(set([d[0], d[2]]))
41 | train_rels = train_rels.union(set([d[1]]))
42 |
43 | data_dir = os.path.join(params.main_dir, 'data/{}'.format(params.dataset))
44 | write_to_file(os.path.join(data_dir, 'train.txt'), train_data)
45 | write_to_file(os.path.join(data_dir, 'valid.txt'), filtered_valid_data)
46 | write_to_file(os.path.join(data_dir, 'test.txt'), filtered_test_data)
47 |
48 | with open(os.path.join(params.main_dir, 'data', params.dataset + '_meta', 'train.txt')) as f:
49 | meta_train_data = [line.split() for line in f.read().split('\n')[:-1]]
50 | with open(os.path.join(params.main_dir, 'data', params.dataset + '_meta', 'valid.txt')) as f:
51 | meta_valid_data = [line.split() for line in f.read().split('\n')[:-1]]
52 | with open(os.path.join(params.main_dir, 'data', params.dataset + '_meta', 'test.txt')) as f:
53 | meta_test_data = [line.split() for line in f.read().split('\n')[:-1]]
54 |
55 | meta_train_tails = set([d[2] for d in meta_train_data])
56 | meta_train_heads = set([d[0] for d in meta_train_data])
57 | meta_train_ent = meta_train_tails.union(meta_train_heads)
58 | meta_train_rels = set([d[1] for d in meta_train_data])
59 |
60 | filtered_meta_valid_data = []
61 | for d in meta_valid_data:
62 | if d[0] in meta_train_ent and d[1] in meta_train_rels and d[2] in meta_train_ent:
63 | filtered_meta_valid_data.append(d)
64 | else:
65 | meta_train_data.append(d)
66 | meta_train_ent = meta_train_ent.union(set([d[0], d[2]]))
67 | meta_train_rels = meta_train_rels.union(set([d[1]]))
68 |
69 | filtered_meta_test_data = []
70 | for d in meta_test_data:
71 | if d[0] in meta_train_ent and d[1] in meta_train_rels and d[2] in meta_train_ent:
72 | filtered_meta_test_data.append(d)
73 | else:
74 | meta_train_data.append(d)
75 | meta_train_ent = meta_train_ent.union(set([d[0], d[2]]))
76 | meta_train_rels = meta_train_rels.union(set([d[1]]))
77 |
78 | meta_data_dir = os.path.join(params.main_dir, 'data/{}_meta'.format(params.dataset))
79 | write_to_file(os.path.join(meta_data_dir, 'train.txt'), meta_train_data)
80 | write_to_file(os.path.join(meta_data_dir, 'valid.txt'), filtered_meta_valid_data)
81 | write_to_file(os.path.join(meta_data_dir, 'test.txt'), filtered_meta_test_data)
82 |
83 |
84 | if __name__ == '__main__':
85 | parser = argparse.ArgumentParser(description='Move new entities from test/valid to train')
86 |
87 | parser.add_argument("--dataset", "-d", type=str, default="fb237_v1_copy",
88 | help="Dataset string")
89 | params = parser.parse_args()
90 |
91 | params.main_dir = os.path.join(os.path.relpath(os.path.dirname(os.path.abspath(__file__))), '..')
92 |
93 | main(params)
94 |
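
The filtering in `main` can be summarized as: a held-out triple stays in valid/test only if its head, relation and tail have all been seen in training; otherwise it is moved to the training set and its entities and relation become "seen". A minimal pure-Python sketch of that rule follows. The function name `move_unseen_to_train` and the toy triples are assumptions made for illustration.

```python
def move_unseen_to_train(train, held_out):
    """Return (extended_train, kept_held_out) following the rule above."""
    train = list(train)
    entities = {e for h, _, t in train for e in (h, t)}
    relations = {r for _, r, _ in train}
    kept = []
    for h, r, t in held_out:
        if h in entities and r in relations and t in entities:
            kept.append((h, r, t))
        else:
            # Unseen entity or relation: the triple joins the training set,
            # which can make later held-out triples eligible to be kept.
            train.append((h, r, t))
            entities.update((h, t))
            relations.add(r)
    return train, kept


train = [("a", "likes", "b")]
held_out = [("a", "likes", "c"), ("c", "likes", "b"), ("b", "likes", "a")]
new_train, kept = move_unseen_to_train(train, held_out)
print(new_train)
print(kept)
```

Note that the result depends on the order of the held-out triples: the second triple is kept only because the first one introduced entity `c` into the training set.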
--------------------------------------------------------------------------------
/utils/data_utils.py:
--------------------------------------------------------------------------------
1 | import os
2 | import pdb
3 | import logging
4 | import numpy as np
5 | from scipy.sparse import csc_matrix
6 | import matplotlib.pyplot as plt
7 |
8 |
9 | def plot_rel_dist(adj_list, filename):
10 | rel_count = []
11 | for adj in adj_list:
12 | rel_count.append(adj.count_nonzero())
13 |
14 | fig = plt.figure(figsize=(12, 8))
15 | plt.plot(rel_count)
16 | fig.savefig(filename, dpi=fig.dpi)
17 |
18 |
19 | def process_files(files, saved_relation2id=None, add_traspose_rels=False, sort_data=False):
20 | '''
21 | files: Dictionary map of file paths to read the triplets from.
22 | saved_relation2id: Saved relation2id (mostly passed from a trained model) which can be used to map relations to pre-defined indices and filter out the unknown ones.
23 | '''
24 | entity2id = {}
25 | relation2id = {} if saved_relation2id is None else saved_relation2id
26 |
27 | triplets = {}
28 |
29 | ent = 0
30 | rel = 0
31 |
32 | for file_type, file_path in files.items():
33 |
34 | data = []
35 | with open(file_path) as f:
36 | file_data = [line.split() for line in f.read().split('\n')[:-1]]
37 |
38 | for triplet in file_data:
39 | if triplet[0] not in entity2id:
40 | entity2id[triplet[0]] = ent
41 | ent += 1
42 | if triplet[2] not in entity2id:
43 | entity2id[triplet[2]] = ent
44 | ent += 1
45 | if not saved_relation2id and triplet[1] not in relation2id:
46 | relation2id[triplet[1]] = rel
47 | rel += 1
48 |
49 | # Save the triplets corresponding to only the known relations
50 | if triplet[1] in relation2id:
51 | data.append([entity2id[triplet[0]], entity2id[triplet[2]], relation2id[triplet[1]]])
52 |
53 | triplets[file_type] = np.array(data)
54 |
55 | id2entity = {v: k for k, v in entity2id.items()}
56 | id2relation = {v: k for k, v in relation2id.items()}
57 |
58 | # Construct the neighbor relations of each entity
59 | num_rels = len(id2relation)
60 | num_ents = len(entity2id)
61 | h2r = {}
62 | h2r_len = {}
63 | t2r = {}
64 | t2r_len = {}
65 |
66 | for triplet in triplets['train']:
67 | h, t, r = triplet
68 | if h not in h2r:
69 | h2r_len[h] = 1
70 | h2r[h] = [r]
71 | else:
72 | h2r_len[h] += 1
73 | h2r[h].append(r)
74 |
75 | if add_traspose_rels:
76 | # Consider the reverse relation, the id of reverse relation is (relation + #relations)
77 | if t not in t2r:
78 | t2r[t] = [r + num_rels]
79 | else:
80 | t2r[t].append(r + num_rels)
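81 | # Track the relation list and its length separately: t2r[t] may
82 | # already exist here (created by the reverse-relation branch above)
83 | # while t2r_len[t] does not, so `t2r_len[t] += 1` alone would raise
84 | # a KeyError.
85 | t2r.setdefault(t, []).append(r)
86 | t2r_len[t] = t2r_len.get(t, 0) + 1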
87 |
88 | # Consider nodes with no neighbors as index '-1' and their relation index: num_rels.
89 | # ent2rels[-1] = [num_rels]
90 |
91 | # Construct the matrix of ent2rels
92 | # rels_len = triplets['train'].shape(0) // num_ents
93 | h_nei_rels_len = int(np.percentile(list(h2r_len.values()), 75))
94 | t_nei_rels_len = int(np.percentile(list(t2r_len.values()), 75))
95 | logging.info(f"75th-percentile number of neighbor relations per node: head: {h_nei_rels_len}, tail: {t_nei_rels_len}")
96 |
97 | # The index "num_rels" of relation is considered as "padding" relation.
98 | # Use padding relation to initialize matrix of ent2rels.
99 | m_h2r = np.ones([num_ents, h_nei_rels_len]) * num_rels
100 | for ent, rels in h2r.items():
101 | if len(rels) > h_nei_rels_len:
102 | rels = np.array(rels)[np.random.choice(np.arange(len(rels)), h_nei_rels_len)]
103 | m_h2r[ent] = rels
104 | else:
105 | rels = np.array(rels)
106 | m_h2r[ent][: rels.shape[0]] = rels
107 |
108 | m_t2r = np.ones([num_ents, t_nei_rels_len]) * num_rels
109 | for ent, rels in t2r.items():
110 | if len(rels) > t_nei_rels_len:
111 | rels = np.array(rels)[np.random.choice(np.arange(len(rels)), t_nei_rels_len)]
112 | m_t2r[ent] = rels
113 | else:
114 | rels = np.array(rels)
115 | m_t2r[ent][: rels.shape[0]] = rels
116 |
117 | print("Construct matrix of ent2rels done!")
118 |
119 | # Sort the data according to relation id
120 | if sort_data:
121 | triplets['train'] = triplets['train'][np.argsort(triplets['train'][:,2])]
122 |
123 | adj_list = []
124 | for i in range(len(relation2id)):
125 | idx = np.argwhere(triplets['train'][:, 2] == i)
126 | adj_list.append(csc_matrix((np.ones(len(idx), dtype=np.uint8), (triplets['train'][:, 0][idx].squeeze(1), triplets['train'][:, 1][idx].squeeze(1))), shape=(len(entity2id), len(entity2id))))
127 |
128 | return adj_list, triplets, entity2id, relation2id, id2entity, id2relation, h2r, m_h2r, t2r, m_t2r
129 |
130 |
131 | def save_to_file(directory, file_name, triplets, id2entity, id2relation):
132 | file_path = os.path.join(directory, file_name)
133 | with open(file_path, "w") as f:
134 | for s, o, r in triplets:
135 | f.write('\t'.join([id2entity[s], id2relation[r], id2entity[o]]) + '\n')
136 |
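
The `m_h2r` and `m_t2r` matrices built in `process_files` have one fixed-width row per entity: rows are pre-filled with the padding relation id `num_rels`, short relation lists are left-aligned, and long ones are subsampled down to the row width. The sketch below illustrates that layout with plain lists. `pad_neighbor_relations` is a hypothetical name, and, unlike the `np.random.choice` call above (which samples with replacement by default), this sketch samples without replacement.

```python
import random


def pad_neighbor_relations(ent2rels, num_ents, width, pad_id, seed=0):
    """Build a num_ents x width table of relation ids, padded with pad_id."""
    rng = random.Random(seed)
    matrix = [[pad_id] * width for _ in range(num_ents)]
    for ent, rels in ent2rels.items():
        if len(rels) > width:
            # Too many neighbor relations: keep a random subset of size width.
            rels = rng.sample(rels, width)
        # Left-align the relations; the remaining slots keep the padding id.
        matrix[ent][:len(rels)] = rels
    return matrix


# Three entities, three real relations (ids 0..2), so the padding id is 3.
ent2rels = {0: [1], 2: [0, 1, 2, 1]}
table = pad_neighbor_relations(ent2rels, num_ents=3, width=2, pad_id=3)
print(table)
```

Entity 1 has no neighbor relations, so its row consists of padding ids only.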
--------------------------------------------------------------------------------
/utils/dgl_utils.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import scipy.sparse as ssp
3 | import random
4 |
5 | """All functions in this file are from dgl.contrib.data.knowledge_graph"""
6 |
7 |
8 | def _bfs_relational(adj, roots, max_nodes_per_hop=None):
9 | """
10 | BFS for graphs.
11 | Modified from dgl.contrib.data.knowledge_graph to accommodate node sampling
12 | """
13 | visited = set()
14 | current_lvl = set(roots)
15 |
16 | next_lvl = set()
17 |
18 | while current_lvl:
19 |
20 | for v in current_lvl:
21 | visited.add(v)
22 |
23 | next_lvl = _get_neighbors(adj, current_lvl)
24 | next_lvl -= visited # set difference
25 |
26 | if max_nodes_per_hop and max_nodes_per_hop < len(next_lvl):
27 | next_lvl = set(random.sample(list(next_lvl), max_nodes_per_hop))  # random.sample requires a sequence, not a set, on Python 3.11+
28 |
29 | yield next_lvl
30 |
31 | current_lvl = set.union(next_lvl)
32 |
33 |
34 | def _get_neighbors(adj, nodes):
35 | """Takes a set of nodes and a graph adjacency matrix and returns a set of neighbors.
36 | Directly copied from dgl.contrib.data.knowledge_graph"""
37 | sp_nodes = _sp_row_vec_from_idx_list(list(nodes), adj.shape[1])
38 | sp_neighbors = sp_nodes.dot(adj)
39 | neighbors = set(ssp.find(sp_neighbors)[1]) # convert to set of indices
40 | return neighbors
41 |
42 |
43 | def _sp_row_vec_from_idx_list(idx_list, dim):
44 | """Create sparse vector of dimensionality dim from a list of indices."""
45 | shape = (1, dim)
46 | data = np.ones(len(idx_list))
47 | row_ind = np.zeros(len(idx_list))
48 | col_ind = list(idx_list)
49 | return ssp.csr_matrix((data, (row_ind, col_ind)), shape=shape)
50 |
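
`_bfs_relational` is a generator that yields one frontier per hop, optionally capped at `max_nodes_per_hop` by random subsampling. The sketch below reproduces the same control flow on a plain adjacency dict instead of a sparse matrix. `bfs_levels`, the `rng` parameter and the toy graph are assumptions made for illustration.

```python
import random
from itertools import islice


def bfs_levels(neighbors, roots, max_nodes_per_hop=None, rng=None):
    """Yield the set of newly reached nodes for each successive hop."""
    rng = rng or random.Random(0)
    visited = set()
    current = set(roots)
    while current:
        visited |= current
        nxt = set()
        for node in current:
            nxt.update(neighbors.get(node, ()))
        nxt -= visited  # keep only nodes not seen in earlier hops
        if max_nodes_per_hop and max_nodes_per_hop < len(nxt):
            # random.sample needs a sequence, so convert the set first.
            nxt = set(rng.sample(sorted(nxt), max_nodes_per_hop))
        yield nxt
        current = nxt


graph = {0: [1, 2], 1: [3], 2: [3], 3: [4]}
# Take only the first two hops, as a caller with hops=2 would.
levels = list(islice(bfs_levels(graph, {0}), 2))
print(levels)
```

Using `islice` rather than repeated `next()` calls avoids a `StopIteration` when the graph is exhausted before the requested number of hops.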
--------------------------------------------------------------------------------
/utils/graph_utils.py:
--------------------------------------------------------------------------------
1 | import statistics
2 | import numpy as np
3 | import scipy.sparse as ssp
4 | import torch
5 | import networkx as nx
6 | import dgl
7 | import pickle
8 |
9 |
10 | def serialize(data):
11 | data_tuple = tuple(data.values())
12 | return pickle.dumps(data_tuple)
13 |
14 |
15 | def deserialize(data):
16 | data_tuple = pickle.loads(data)
17 | keys = ('nodes', 'r_label', 'g_label', 'n_label')
18 | return dict(zip(keys, data_tuple))
19 |
20 |
21 | def get_edge_count(adj_list):
22 | count = []
23 | for adj in adj_list:
24 | count.append(len(adj.tocoo().row.tolist()))
25 | return np.array(count)
26 |
27 |
28 | def incidence_matrix(adj_list):
29 | '''
30 | adj_list: List of sparse adjacency matrices
31 | '''
32 |
33 | rows, cols, dats = [], [], []
34 | dim = adj_list[0].shape
35 | for adj in adj_list:
36 | adjcoo = adj.tocoo()
37 | rows += adjcoo.row.tolist()
38 | cols += adjcoo.col.tolist()
39 | dats += adjcoo.data.tolist()
40 | row = np.array(rows)
41 | col = np.array(cols)
42 | data = np.array(dats)
43 | return ssp.csc_matrix((data, (row, col)), shape=dim)
44 |
45 |
46 | def remove_nodes(A_incidence, nodes):
47 | idxs_wo_nodes = list(set(range(A_incidence.shape[1])) - set(nodes))
48 | return A_incidence[idxs_wo_nodes, :][:, idxs_wo_nodes]
49 |
50 |
51 | def ssp_to_torch(A, device, dense=False):
52 | '''
53 | A : Sparse adjacency matrix
54 | '''
55 | idx = torch.LongTensor([A.tocoo().row, A.tocoo().col])
56 | dat = torch.FloatTensor(A.tocoo().data)
57 | A = torch.sparse.FloatTensor(idx, dat, torch.Size([A.shape[0], A.shape[1]])).to(device=device)
58 | return A
59 |
60 |
61 | def ssp_multigraph_to_dgl(graph, n_feats=None):
62 | """
63 | Converting ssp multigraph (i.e. list of adjs) to dgl multigraph.
64 | """
65 |
66 | g_nx = nx.MultiDiGraph()
67 | g_nx.add_nodes_from(list(range(graph[0].shape[0])))
68 | # Add edges
69 | for rel, adj in enumerate(graph):
70 | # Convert adjacency matrix to tuples for networkx
71 | nx_triplets = []
72 | for src, dst in list(zip(adj.tocoo().row, adj.tocoo().col)):
73 | nx_triplets.append((src, dst, {'type': rel}))
74 | g_nx.add_edges_from(nx_triplets)
75 |
76 | # make dgl graph
77 | g_dgl = dgl.from_networkx(g_nx, edge_attrs=['type'])
78 | # add node features
79 | if n_feats is not None:
80 | g_dgl.ndata['feat'] = torch.tensor(n_feats)
81 |
82 | return g_dgl
83 |
84 |
85 | def collate_dgl(samples):
86 | # The input `samples` is a list of pairs
87 | graphs_pos, g_labels_pos, r_labels_pos, graphs_negs, g_labels_negs, r_labels_negs = map(list, zip(*samples))
88 | batched_graph_pos = dgl.batch(graphs_pos)
89 | # batched_nei_rels_pos = nei_rels_poss
90 | # batched_nei_rels_pos = [sublist for sublist in nei_rels_poss]
91 |
92 | graphs_neg = [item for sublist in graphs_negs for item in sublist]
93 | g_labels_neg = [item for sublist in g_labels_negs for item in sublist]
94 | r_labels_neg = [item for sublist in r_labels_negs for item in sublist]
95 |
96 | batched_graph_neg = dgl.batch(graphs_neg)
97 | # batched_nei_rels_neg = [item for sublist in nei_rels_negs for item in sublist]
98 |
99 | return (batched_graph_pos, r_labels_pos), g_labels_pos, (batched_graph_neg, r_labels_neg), g_labels_neg
100 |
101 | def collate_dgl_train(samples):
102 | # The input `samples` is a list of pairs
103 | graphs_pos, g_labels_pos, r_labels_pos, graphs_negs, g_labels_negs, r_labels_negs = map(list, zip(*samples))
104 | batched_graph_pos = dgl.batch(graphs_pos)
105 | batched_graph_cor = dgl.batch(graphs_pos)
106 | # batched_nei_rels_pos = nei_rels_poss
107 | # batched_nei_rels_pos = [sublist for sublist in nei_rels_poss]
108 |
109 | graphs_neg = [item for sublist in graphs_negs for item in sublist]
110 | g_labels_neg = [item for sublist in g_labels_negs for item in sublist]
111 | r_labels_neg = [item for sublist in r_labels_negs for item in sublist]
112 |
113 | batched_graph_neg = dgl.batch(graphs_neg)
114 | # batched_nei_rels_neg = [item for sublist in nei_rels_negs for item in sublist]
115 |
116 | return (batched_graph_pos, batched_graph_cor, r_labels_pos), g_labels_pos, (batched_graph_neg, r_labels_neg), g_labels_neg
117 |
118 | def move_batch_to_device_dgl(batch, device):
119 | ((g_dgl_pos, r_labels_pos), targets_pos, (g_dgl_neg, r_labels_neg), targets_neg) = batch
120 |
121 | targets_pos = torch.LongTensor(targets_pos).to(device=device)
122 | r_labels_pos = torch.LongTensor(r_labels_pos).to(device=device)
123 |
124 | targets_neg = torch.LongTensor(targets_neg).to(device=device)
125 | r_labels_neg = torch.LongTensor(r_labels_neg).to(device=device)
126 |
127 | g_dgl_pos = send_graph_to_device(g_dgl_pos, device)
128 | g_dgl_neg = send_graph_to_device(g_dgl_neg, device)
129 |
130 | # ent2rels = {key: torch.LongTensor(value).to(device=device) for key, value in ent2rels.items()}
131 | # nei_rels_pos = torch.LongTensor(nei_rels_pos).to(device=device)
132 | # nei_rels_neg = torch.LongTensor(nei_rels_neg).to(device=device)
133 |
134 | return ((g_dgl_pos, r_labels_pos), targets_pos, (g_dgl_neg, r_labels_neg), targets_neg)
135 |
136 | def move_batch_to_device_dgl_train(batch, device):
137 | ((g_dgl_pos, g_dgl_cor, r_labels_pos), targets_pos, (g_dgl_neg, r_labels_neg), targets_neg) = batch
138 |
139 | targets_pos = torch.LongTensor(targets_pos).to(device=device)
140 | r_labels_pos = torch.LongTensor(r_labels_pos).to(device=device)
141 |
142 | targets_neg = torch.LongTensor(targets_neg).to(device=device)
143 | r_labels_neg = torch.LongTensor(r_labels_neg).to(device=device)
144 |
145 | g_dgl_pos = send_graph_to_device(g_dgl_pos, device)
146 | g_dgl_cor = send_graph_to_device(g_dgl_cor, device)
147 | g_dgl_neg = send_graph_to_device(g_dgl_neg, device)
148 |
149 | # ent2rels = {key: torch.LongTensor(value).to(device=device) for key, value in ent2rels.items()}
150 | # nei_rels_pos = torch.LongTensor(nei_rels_pos).to(device=device)
151 | # nei_rels_neg = torch.LongTensor(nei_rels_neg).to(device=device)
152 |
153 | return ((g_dgl_pos, r_labels_pos), targets_pos, (g_dgl_neg, r_labels_neg), targets_neg, (g_dgl_cor, r_labels_pos))
154 |
155 | def send_graph_to_device(g, device):
156 | # # nodes
157 | # labels = g.node_attr_schemes()
158 | # for l in labels.keys():
159 | # g.ndata[l] = g.ndata.pop(l).to(device)
160 |
161 | # # edges
162 | # labels = g.edge_attr_schemes()
163 | # for l in labels.keys():
164 | # g.edata[l] = g.edata.pop(l).to(device)
165 | # return g
166 | g = g.to(device)
167 | return g
168 |
169 | # The following three functions are modified from the networkx source code to
170 | # accommodate diameter and radius for directed graphs
171 |
172 |
173 | def eccentricity(G):
174 | e = {}
175 | for n in G.nbunch_iter():
176 | length = nx.single_source_shortest_path_length(G, n)
177 | e[n] = max(length.values())
178 | return e
179 |
180 |
181 | def radius(G):
182 | e = eccentricity(G)
183 | e = np.where(np.array(list(e.values())) > 0, list(e.values()), np.inf)
184 | return min(e)
185 |
186 |
187 | def diameter(G):
188 | e = eccentricity(G)
189 | return max(e.values())
190 |
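
The three helpers above compute eccentricity over the nodes reachable from each source, so they stay defined on directed graphs that are not strongly connected; `radius` additionally ignores nodes whose eccentricity is 0. The sketch below shows the same definitions without networkx, on a successor dict. `bfs_lengths` and the toy chain graph are assumptions made for illustration.

```python
from collections import deque


def bfs_lengths(succ, source):
    """Hop distance from source to every node reachable along directed edges."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in succ.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def eccentricity(succ):
    # Largest distance from each node to any node it can reach.
    return {n: max(bfs_lengths(succ, n).values()) for n in succ}


def radius(succ):
    # Nodes that reach nothing (eccentricity 0) are skipped, as above.
    positive = [e for e in eccentricity(succ).values() if e > 0]
    return min(positive) if positive else float("inf")


def diameter(succ):
    return max(eccentricity(succ).values())


chain = {0: [1], 1: [2], 2: []}  # 0 -> 1 -> 2
print(eccentricity(chain), radius(chain), diameter(chain))
```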
--------------------------------------------------------------------------------
/utils/initialization_utils.py:
--------------------------------------------------------------------------------
1 | import os
2 | import logging
3 | import json
4 | import torch
5 |
6 |
7 | def initialize_experiment(params, file_name):
8 | '''
9 | Makes the experiment directory, sets standard paths and initializes the logger
10 | '''
11 | params.main_dir = os.path.join(os.path.relpath(os.path.dirname(os.path.abspath(__file__))), '..')
12 | exps_dir = os.path.join(params.main_dir, 'experiments')
13 | if not os.path.exists(exps_dir):
14 | os.makedirs(exps_dir)
15 |
16 | params.exp_dir = os.path.join(exps_dir, params.experiment_name)
17 |
18 | if not os.path.exists(params.exp_dir):
19 | os.makedirs(params.exp_dir)
20 |
21 | if os.path.basename(file_name) == 'test_auc.py':  # callers pass __file__, which may be a full path
22 | params.test_exp_dir = os.path.join(params.exp_dir, f"test_{params.dataset}_{params.constrained_neg_prob}")
23 | if not os.path.exists(params.test_exp_dir):
24 | os.makedirs(params.test_exp_dir)
25 | file_handler = logging.FileHandler(os.path.join(params.test_exp_dir, f"log_test.txt"))
26 | else:
27 | file_handler = logging.FileHandler(os.path.join(params.exp_dir, "log_train.txt"))
28 | logger = logging.getLogger()
29 | logger.addHandler(file_handler)
30 |
31 | logger.info('============ Initialized logger ============')
32 | logger.info('\n'.join('%s: %s' % (k, str(v)) for k, v
33 | in sorted(dict(vars(params)).items())))
34 | logger.info('============================================')
35 |
36 | with open(os.path.join(params.exp_dir, "params.json"), 'w') as fout:
37 | json.dump(vars(params), fout)
38 |
39 |
40 | def initialize_model(params, model, ent2rels=None, load_model=False):
41 | '''
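42 | model: the model class to instantiate; it receives params, the relation2id mapping (read from data/<dataset>/relation2id.json, stored in the model and used when testing) and ent2rels
43 | ent2rels: passed through unchanged to the model constructor
44 | load_model: if True and a saved best_graph_classifier.pth exists in the experiment directory, load it instead of initializing a new model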
45 | '''
46 |
47 | if load_model and os.path.exists(os.path.join(params.exp_dir, 'best_graph_classifier.pth')):
48 | logging.info('Loading existing model from %s' % os.path.join(params.exp_dir, 'best_graph_classifier.pth'))
49 | graph_classifier = torch.load(os.path.join(params.exp_dir, 'best_graph_classifier.pth')).to(device=params.device)
50 | else:
51 | relation2id_path = os.path.join(params.main_dir, f'data/{params.dataset}/relation2id.json')
52 | with open(relation2id_path) as f:
53 | relation2id = json.load(f)
54 |
55 | logging.info('No existing model found. Initializing new model..')
56 | graph_classifier = model(params, relation2id, ent2rels).to(device=params.device)
57 |
58 | return graph_classifier
59 |
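
`initialize_experiment` creates the experiment directory and writes the parsed arguments to `params.json` with `json.dump(vars(params), ...)`. A minimal sketch of that step follows, run inside a temporary directory. `init_experiment_dir` and the example parameter values are assumptions made for illustration; `os.makedirs(..., exist_ok=True)` is used here in place of the explicit `os.path.exists` checks.

```python
import argparse
import json
import os
import tempfile


def init_experiment_dir(params, root):
    """Create <root>/experiments/<name> and dump the parameters as JSON."""
    exp_dir = os.path.join(root, "experiments", params.experiment_name)
    os.makedirs(exp_dir, exist_ok=True)  # creates parent directories as needed
    with open(os.path.join(exp_dir, "params.json"), "w") as fout:
        # vars() turns the Namespace into a plain dict; every value must be
        # JSON-serializable at this point.
        json.dump(vars(params), fout)
    return exp_dir


params = argparse.Namespace(experiment_name="demo", lr=0.001, hop=3)
with tempfile.TemporaryDirectory() as root:
    exp_dir = init_experiment_dir(params, root)
    with open(os.path.join(exp_dir, "params.json")) as fin:
        saved = json.load(fin)
print(saved)
```

The dump only works while every attribute is JSON-serializable, which is why train.py sets `params.device` and the collate functions after calling `initialize_experiment`.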
--------------------------------------------------------------------------------
/utils/prepare_meta_data.py:
--------------------------------------------------------------------------------
1 | import pdb
2 | import os
3 | import math
4 | import random
5 | import argparse
6 | import numpy as np
7 |
8 | from graph_utils import incidence_matrix, get_edge_count
9 | from dgl_utils import _bfs_relational
10 | from data_utils import process_files, save_to_file
11 |
12 |
13 | def get_active_relations(adj_list):
14 | act_rels = []
15 | for r, adj in enumerate(adj_list):
16 | if len(adj.tocoo().row.tolist()) > 0:
17 | act_rels.append(r)
18 | return act_rels
19 |
20 |
21 | def get_avg_degree(adj_list):
22 | adj_mat = incidence_matrix(adj_list)
23 | degree = []
24 | for node in range(adj_list[0].shape[0]):
25 | degree.append(np.sum(adj_mat[node, :]))
26 | return np.mean(degree)
27 |
28 |
29 | def get_splits(adj_list, nodes, valid_rels=None, valid_ratio=0.1, test_ratio=0.1):
30 | '''
31 | Get train/valid/test splits of the subgraph defined by the given set of nodes. The relations in this subgraph are limited to those in valid_rels.
32 | '''
33 |
34 | # Extract the subgraph
35 | subgraph = [adj[nodes, :][:, nodes] for adj in adj_list]
36 |
37 | # Get the relations that are allowed to be sampled
38 | active_rels = get_active_relations(subgraph)
39 | common_rels = list(set(active_rels).intersection(set(valid_rels)))
40 |
41 | print('Average degree : ', get_avg_degree(subgraph))
42 | print('Nodes: ', len(nodes))
43 | print('Links: ', np.sum(get_edge_count(subgraph)))
44 | print('Active relations: ', len(common_rels))
45 |
46 | # get all the triplets satisfying the given constraints
47 | all_triplets = []
48 | for r in common_rels:
49 | # print(r, len(subgraph[r].tocoo().row))
50 | for (i, j) in zip(subgraph[r].tocoo().row, subgraph[r].tocoo().col):
51 | all_triplets.append([nodes[i], nodes[j], r])
52 | all_triplets = np.array(all_triplets)
53 |
54 | # delete the triplets which correspond to self connections
55 | ind = np.argwhere(all_triplets[:, 0] == all_triplets[:, 1])
56 | all_triplets = np.delete(all_triplets, ind, axis=0)
57 | print('Links after deleting self connections : %d' % len(all_triplets))
58 |
59 | # get the splits according to the given ratio
60 | np.random.shuffle(all_triplets)
61 | train_split = int(math.ceil(len(all_triplets) * (1 - valid_ratio - test_ratio)))
62 | valid_split = int(math.ceil(len(all_triplets) * (1 - test_ratio)))
63 |
64 | train_triplets = all_triplets[:train_split]
65 | valid_triplets = all_triplets[train_split: valid_split]
66 | test_triplets = all_triplets[valid_split:]
67 |
68 | return train_triplets, valid_triplets, test_triplets, common_rels
69 |
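
`get_splits` shuffles the triplets and then cuts the array at two indices derived from the ratios with `math.ceil`. The sketch below isolates that index arithmetic. `split_indices` is a hypothetical name, and a list of integers stands in for the shuffled triplets.

```python
import math


def split_indices(n, valid_ratio=0.1, test_ratio=0.1):
    """Return the end index of the train part and of the valid part."""
    train_end = int(math.ceil(n * (1 - valid_ratio - test_ratio)))
    valid_end = int(math.ceil(n * (1 - test_ratio)))
    return train_end, valid_end


items = list(range(23))  # stand-in for 23 shuffled triplets
train_end, valid_end = split_indices(len(items))
train = items[:train_end]
valid = items[train_end:valid_end]
test = items[valid_end:]
print(len(train), len(valid), len(test))  # 19 2 2
```

Because both cut points are rounded up, the training part absorbs the rounding remainder and the test part can end up empty for very small inputs.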
70 |
71 | def get_subgraph(adj_list, hops, max_nodes_per_hop):
72 | '''
73 | Samples a subgraph around randomly chosen root nodes, up to `hops` hops away, with at most max_nodes_per_hop nodes selected per hop
74 | '''
75 |
76 | # collapse the list of adjacency matrices into a single matrix
77 | A_incidence = incidence_matrix(adj_list)
78 |
79 | # choose a set of random root nodes
80 | idx = np.random.choice(range(len(A_incidence.tocoo().row)), size=params.n_roots, replace=False)
81 | roots = set([A_incidence.tocoo().row[id] for id in idx] + [A_incidence.tocoo().col[id] for id in idx])
82 |
83 | # get the neighbor nodes within a limit of hops
84 | bfs_generator = _bfs_relational(A_incidence, roots, max_nodes_per_hop)
85 | lvls = list()
86 | for _ in range(hops):
87 | lvls.append(next(bfs_generator))
88 |
89 | nodes = list(roots) + list(set().union(*lvls))
90 |
91 | return nodes
92 |
93 |
94 | def mask_nodes(adj_list, nodes):
95 | '''
96 | mask a set of nodes from a given graph
97 | '''
98 |
99 | masked_adj_list = [adj.copy() for adj in adj_list]
100 | for node in nodes:
101 | for adj in masked_adj_list:
102 | adj.data[adj.indptr[node]:adj.indptr[node + 1]] = 0
103 | adj = adj.tocsr()
104 | adj.data[adj.indptr[node]:adj.indptr[node + 1]] = 0
105 | adj = adj.tocsc()
106 | for adj in masked_adj_list:
107 | adj.eliminate_zeros()
108 | return masked_adj_list
109 |
110 |
111 | def main(params):
112 |
113 | adj_list, triplets, entity2id, relation2id, id2entity, id2relation = process_files(files)[:6]  # process_files also returns h2r, m_h2r, t2r, m_t2r, which are unused here
114 |
115 | meta_train_nodes = get_subgraph(adj_list, params.hops, params.max_nodes_per_hop) # list(range(750, 8500)) #
116 |
117 | masked_adj_list = mask_nodes(adj_list, meta_train_nodes)
118 |
119 | meta_test_nodes = get_subgraph(masked_adj_list, params.hops_test + 1, params.max_nodes_per_hop_test) # list(range(0, 750)) #
120 |
121 | print('Common nodes among the two disjoint datasets (should ideally be zero): ', set(meta_train_nodes).intersection(set(meta_test_nodes)))
122 | tmp = [adj[meta_train_nodes, :][:, meta_train_nodes] for adj in masked_adj_list]
123 | print('Residual edges (should be zero) : ', np.sum(get_edge_count(tmp)))
124 |
125 | print("================")
126 | print("Train graph stats")
127 | print("================")
128 | train_triplets, valid_triplets, test_triplets, train_active_rels = get_splits(adj_list, meta_train_nodes, range(len(adj_list)))
129 | print("================")
130 | print("Meta-test graph stats")
131 | print("================")
132 | meta_train_triplets, meta_valid_triplets, meta_test_triplets, meta_active_rels = get_splits(adj_list, meta_test_nodes, train_active_rels)
133 |
134 | print("================")
135 | print('Extra rels (should be empty): ', set(meta_active_rels) - set(train_active_rels))
136 |
137 | # TODO: ABSTRACT THIS INTO A METHOD
138 | data_dir = os.path.join(params.main_dir, 'data/{}'.format(params.new_dataset))
139 | if not os.path.exists(data_dir):
140 | os.makedirs(data_dir)
141 |
142 | save_to_file(data_dir, 'train.txt', train_triplets, id2entity, id2relation)
143 | save_to_file(data_dir, 'valid.txt', valid_triplets, id2entity, id2relation)
144 | save_to_file(data_dir, 'test.txt', test_triplets, id2entity, id2relation)
145 |
146 | meta_data_dir = os.path.join(params.main_dir, 'data/{}'.format(params.new_dataset + '_meta'))
147 | if not os.path.exists(meta_data_dir):
148 | os.makedirs(meta_data_dir)
149 |
150 | save_to_file(meta_data_dir, 'train.txt', meta_train_triplets, id2entity, id2relation)
151 | save_to_file(meta_data_dir, 'valid.txt', meta_valid_triplets, id2entity, id2relation)
152 | save_to_file(meta_data_dir, 'test.txt', meta_test_triplets, id2entity, id2relation)
153 |
154 |
155 | if __name__ == '__main__':
156 |
157 |     parser = argparse.ArgumentParser(description='Save adjacency matrices and triplets')
158 |
159 |     parser.add_argument("--dataset", "-d", type=str, default="FB15K237",
160 |                         help="Name of the source dataset directory under data/")
161 |     parser.add_argument("--new_dataset", "-nd", type=str, default="fb_v3",
162 |                         help="Name of the new dataset directory to write under data/")
163 |     parser.add_argument("--n_roots", "-n", type=int, default=1,
164 |                         help="Number of roots to sample the neighborhood from")
165 |     parser.add_argument("--hops", "-H", type=int, default=3,
166 |                         help="Number of hops to sample the train neighborhood")
167 |     parser.add_argument("--max_nodes_per_hop", "-m", type=int, default=2500,
168 |                         help="Maximum number of nodes sampled per hop for the train neighborhood")
169 |     parser.add_argument("--hops_test", "-HT", type=int, default=3,
170 |                         help="Number of hops to sample the meta-test neighborhood")
171 |     parser.add_argument("--max_nodes_per_hop_test", "-mt", type=int, default=2500,
172 |                         help="Maximum number of nodes sampled per hop for the meta-test neighborhood")
173 |     parser.add_argument("--seed", "-s", type=int, default=28,
174 |                         help="Random seed for numpy and random")
175 |
176 |     params = parser.parse_args()
177 |
178 |     np.random.seed(params.seed)
179 |     random.seed(params.seed)
180 |
181 |     params.main_dir = os.path.join(os.path.relpath(os.path.dirname(os.path.abspath(__file__))), '..')
182 |
183 |     files = {
184 |         'train': os.path.join(params.main_dir, 'data/{}/train.txt'.format(params.dataset)),
185 |         'valid': os.path.join(params.main_dir, 'data/{}/valid.txt'.format(params.dataset)),
186 |         'test': os.path.join(params.main_dir, 'data/{}/test.txt'.format(params.dataset))
187 |     }
188 |
189 |     main(params)
190 |
--------------------------------------------------------------------------------