├── Data ├── general_inference_examples.jsonl ├── general_inference_stat.png ├── mrc_examples.jsonl ├── mrc_stat.png ├── mrc_zh_examples.jsonl ├── mrc_zh_stat.png └── readme.md └── readme.md /Data/general_inference_stat.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csitfun/LogiCoT/42ab9deccf625594598270dc63ee83b0ebf7b1ef/Data/general_inference_stat.png -------------------------------------------------------------------------------- /Data/mrc_examples.jsonl: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 4 | "input": "\nPassage: A professional baseball team manager, in order to have the funds to sign a new second-baseman, discreetly arranged to trade one of the most popular outfielders on the team for a lesser-known player and an undisclosed amount of money. The manager secretly considered the outfielder to be overrated and overpaid. Reporters forcefully criticized the trade, arguing that the team had lost its best player and that the manager had violated his fiduciary duties to the team and the fans. A few weeks after being traded, the outfielder was retraded, for twice the value received by the original team manager. 
Plainly, the outfielder's price shows that the reporters' criticism of the manager's action was accurate.\nQuestion: The reasoning in the argument is vulnerable to the criticism that the argument does which one of the following?\nA. The argument bases its conclusion on what the best decision is for the present on uncertain projections about what the best decision will be for the future.\nB. The argument rejects a well-established way of achieving an end without explaining why an unconventional way is better suited for achieving the end.\nC. The argument ignores the opinions of expert reporters in the field of baseball when there is no superior source of information.\nD. The argument bases its conclusion on facts that could, considering the circumstances, have been the product of circumstances other than those presumed by the argument's proponents.\nAnswer and reasoning step by step:", 5 | "output": "D. D. The argument bases its conclusion on facts that could, considering the circumstances, have been the product of circumstances other than those presumed by the argument's proponents.\n\nReasoning: The passage argues that the reporters' criticism of the manager's decision was accurate because the outfielder was retraded for twice the value. However, this change in the outfielder's value could be due to various factors other than the manager's poor decision-making, such as market fluctuations, the outfielder's performance, or the demands of other teams. The argument does not consider these other possible circumstances when concluding that the manager's decision was wrong." 6 | }, 7 | { 8 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. 
If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 9 | "input": "\nPassage: Quality control investigator: Upon testing samples of products from our supplier that were sent by our field inspectors from various manufacturing locations, our laboratory discovered that over 20 percent of the samples were defective. Since our supplier is contractually required to limit the rate of defects among items it manufactures for us to below 5 percent, it has violated its contract with us.\nQuestion: The reasoning in the quality control investigator's argument is flawed in that the argument\nA. presumes, without providing justification, that the field inspectors were just as likely to choose a defective item for testing as they were to choose a nondefective item\nB. presumes, without providing justification, that the field inspectors made an equal number of visits to each of the various manufacturing sites of the supplier\nC. overlooks the possibility that the field inspectors tend to choose items for testing that they suspect are defective\nD. bases its conclusion on too small a sample of items tested by the laboratory\nAnswer and reasoning step by step:", 10 | "output": "C. C. The reasoning in the quality control investigator's argument is flawed because it overlooks the possibility that the field inspectors tend to choose items for testing that they suspect are defective. This means that the 20 percent defect rate discovered by the laboratory might not accurately represent the overall defect rate among the items manufactured by the supplier." 
11 | }, 12 | { 13 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 14 | "input": "\nPassage: Pediatrician: Swollen tonsils give rise to breathing problems during sleep, and the surgical removal of children's swollen tonsils has been shown to alleviate sleep disturbances. So removing children's tonsils before swelling even occurs will ensure that the children do not experience any breathing problems during sleep.\nQuestion: The pediatrician's argument is most vulnerable to the criticism that it\nA. fails to consider the possibility that there may be other medical reasons for surgically removing a child's tonsils\nB. infers from the fact that an action has a certain effect that the action is intended to produce that effect\nC. fails to consider the possibility that some breathing problems during sleep may be caused by something other than swollen tonsils\nD. relies on an inappropriate appeal to authority\nAnswer and reasoning step by step:", 15 | "output": "C. The pediatrician's argument is most vulnerable to the criticism that it fails to consider the possibility that some breathing problems during sleep may be caused by something other than swollen tonsils. So, the correct answer is C." 16 | }, 17 | { 18 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. 
There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 19 | "input": "\nPassage: The Levant -- the area that borders the eastern Mediterranean -- was heavily populated in prehistoric times. The southern Levant was abandoned about 6,000 years ago, although the northern Levant, which shared the same climate, remained heavily populated. Recently archaeologists have hypothesized that the sudden depopulation in the southern Levant was due to an economic collapse resulting from deforestation.\nQuestion: If the statements above are true and the archaeologists' hypothesis is correct, which one of the following CANNOT be true?\nA. The sheep and goats herded by the peoples of the southern Levant until 6,000 years ago grazed extensively on the seedlings and saplings of indigenous tree species.\nB. Carbon dating of organic remains from the southern Levant reliably demonstrates that there were no forests present in that area prior to 6,000 years ago.\nC. Organic remains from the northern Levant reliably indicate that tree species flourished there without interruption during the period when the southern Levant was being abandoned.\nD. Since there are few traces of either quarried stone or of mud brick in buildings excavated in the southern Levant, it is likely that the buildings built there prior to 6,000 years ago were made almost entirely of timber.\nAnswer and reasoning step by step:", 20 | "output": "B. B. 
If there were no forests present in the southern Levant prior to 6,000 years ago, it would not make sense for the cause of the economic collapse to be deforestation, as the area would have already been deforested. This contradicts the archaeologists' hypothesis, making option B the correct answer." 21 | }, 22 | { 23 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 24 | "input": "\nPassage: If you know a lot about history, it will be easy for you to impress people who are intellectuals. But unfortunately, you will not know much about history if you have not, for example, read a large number of history books. Therefore, if you are not well versed in history due to a lack of reading, it will not be easy for you to impress people who are intellectuals.\nQuestion: The argument's reasoning is flawed because the argument overlooks the possibility that\nA. it is more important to impress people who are not intellectuals than people who are intellectuals\nB. many intellectuals are not widely read in history\nC. there are other easy ways to impress intellectuals that do not involve knowing history\nD. there are people who learn about history who do not impress intellectuals\nAnswer and reasoning step by step:", 25 | "output": "C. C. The argument's reasoning is flawed because it overlooks the possibility that there are other easy ways to impress intellectuals that do not involve knowing history. 
The passage assumes that the only way to impress intellectuals is by having a deep knowledge of history, but there might be other ways to achieve this that haven't been considered." 26 | }, 27 | { 28 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 29 | "input": "\nPassage: The most successful economies have been, and will continue to be, those that train as many people as possible in the human skills required to research, to develop, and to apply new technology. Japan is a model for this sort of training effort. Europe as a whole is in a weaker position: there is a shortage of skilled labor trained to use the new technologies, and there are not enough scientists able to develop and apply the technology. However, even in Japan there is a shortage of technically qualified people, and, like most European countries, Japan has far too many workers qualified to perform only menial tasks.\nQuestion: Which one of the following can be properly inferred from the passage?\nA. To be economically more successful, Europe needs to train more people in the new technologies.\nB. Japan's successful economy depends upon an uncommonly narrow base of highly skilled labor.\nC. Japan is not the best country against which to measure a country's economic success.\nD. European countries have economies that are more successful than those of most other countries.\nAnswer and reasoning step by step:", 30 | "output": "A. A. 
To be economically more successful, Europe needs to train more people in the new technologies.\n\nThe passage states that the most successful economies are those that train as many people as possible in the required human skills. It also mentions that Europe is in a weaker position due to a shortage of skilled labor and scientists in new technologies. Therefore, we can infer that training more people in new technologies would help Europe to become economically more successful. So, the correct answer is A." 31 | }, 32 | { 33 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 34 | "input": "\nPassage: Not all tenured faculty are full professors. Therefore, although every faculty member in the linguistics department has tenure, it must be the case that not all of the faculty members in the linguistics department are full professors.\nQuestion: The flawed pattern of reasoning exhibited by the argument above is most similar to that exhibited by which one of the following?\nA. Although some cathedrals are not built of stone, every cathedral is impressive. Therefore, buildings can be impressive even though they are not built of stone.\nB. Not all public buildings are well designed, but some poorly designed public buildings were originally intended for private use. Therefore, the poorly designed public buildings were all originally designed for private use.\nC. 
All municipal hospital buildings are massive, but not all municipal hospital buildings are forbidding in appearance. Therefore, massive buildings need not present a forbidding appearance.\nD. Although some buildings designed by famous architects are not well proportioned, all government buildings are designed by famous architects. Therefore, some government buildings are not well proportioned.\nAnswer and reasoning step by step:", 35 | "output": "D. B. The flawed pattern of reasoning in the original argument is the assumption that if not all tenured faculty are full professors, then not all faculty members in the linguistics department must be full professors, even though they have tenure. Similarly, option B assumes that if some poorly designed public buildings were originally intended for private use, then all poorly designed public buildings were originally designed for private use. This is a flawed assumption and makes option B the most similar in reasoning to the original argument." 36 | }, 37 | { 38 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 39 | "input": "\nPassage: Councilperson X: We have an obligation to help ensure that electricity rates are the lowest possible. Since the proposed design for a new generating station would clearly allow for the lowest rates, it must be the design we endorse if we agree that we have no choice but to approve construction of a new plant. 
Councilperson Y: Helping to ensure the lowest electricity rates is not the council's only job; we also have an obligation not to lower the quality of life of our community. A plant of the type specified by the design would damage our community's air quality to such an extent that the benefit of lower rates would be outweighed.\nQuestion: Which one of the following is an issue about which the two councilpersons disagree?\nA. It is the council's responsibility to improve the community's quality of life.\nB. If a new generating station is to be built, the council should endorse a plant of the type specified by the design in question.\nC. A plant of the type specified by the design in question would allow for the lowest electricity rates.\nD. The council should recommend the building of a new generating station.\nAnswer and reasoning step by step:", 40 | "output": "B. A. Both councilpersons agree that they have obligations towards the community, so they do not disagree on this.\n\nB. Councilperson X believes that the council should endorse the plant specified by the design in question, while Councilperson Y believes that the potential damage to air quality would outweigh the benefit of lower electricity rates. This is the issue on which they disagree. \n\nC. Both councilpersons agree that the plant specified by the design in question would allow for the lowest electricity rates.\n\nD. Neither councilperson explicitly states their position on recommending the building of a new generating station.\n\nThe correct answer is B. Both Councilperson X and Y have different opinions on whether or not the council should endorse the proposed design for a new generating station, given its potential impact on the community's air quality." 41 | }, 42 | { 43 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. 
If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 44 | "input": "\nPassage: Chemist: The molecules of a certain weed-killer are always present in two forms, one the mirror image of the other. One form of the molecule kills weeds, while the other has no effect on them. As a result, the effectiveness of the weed-killer in a given situation is heavily influenced by which of the two forms is more concentrated in the soil, which in turn varies widely because local soil conditions will usually favor the breakdown of one form or the other. Thus, much of the data on the effects of this weed-killer are probably misleading.\nQuestion: Which one of the following, if true, most strengthens the chemist's argument?\nA. Almost all of the data on the effects of the weed-killer are drawn from laboratory studies in which both forms of the weed-killer's molecules are equally concentrated in the soil and equally likely to break down in that soil.\nB. In general, if the molecules of a weed-killer are always present in two forms, then it is likely that weeds are killed by one of those two forms but unaffected by the other.\nC. Data on the weed-killer's effects that rely solely on the examination of the effects of only one of the two forms of the weed-killer's molecules will almost certainly be misleading.\nD. Of the two forms of the weed-killer's molecules, the one that kills weeds is found in most local soil conditions to be the more concentrated form.\nAnswer and reasoning step by step:", 45 | "output": "A. A. 
This option strengthens the chemist's argument because it shows that the data is primarily derived from lab studies where both forms of the weed-killer are equally concentrated and likely to break down. In reality, local soil conditions will favor one form over the other, so these lab studies do not accurately reflect what happens in real-world conditions. Therefore, the answer is A." 46 | }, 47 | { 48 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 49 | "input": "\nPassage: Preschool children who spend the day in daycare nurseries are ill more often than those who do not. They catch many common illnesses, to which they are exposed by other children in the nurseries. However, when these children reach school age they tend to be ill less often than their classmates who did not spend the day in daycare nurseries during their preschool years.\nQuestion: Which one of the following, if true, best explains the discrepancy in the information above?\nA. The number of infectious illnesses that children in a daycare nursery or school develop is roughly proportional to the number of children in the facility, and daycare nurseries are smaller than most schools.\nB. Although in general the illnesses that children contract through contact with other children at daycare nurseries are not serious, some of those illnesses if untreated have serious complications.\nC. 
Those children who have older siblings are likely to catch any common infectious illnesses that their older siblings have.\nD. By school age, children who have been in daycare nurseries have developed the immunities to common childhood illnesses that children who have not been in such nurseries have yet to develop.\nAnswer and reasoning step by step:", 50 | "output": "D. D. The discrepancy in the information above is that preschool children in daycare nurseries get sick more often, but once they reach school age, they are less sick compared to those who did not spend time in daycare nurseries. Option D explains this by mentioning that children in daycare nurseries have already developed immunities to common childhood illnesses, whereas children who didn't attend daycare nurseries still need to develop these immunities. This makes option D the best explanation for the discrepancy." 51 | }, 52 | { 53 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 54 | "input": "\nPassage: Diomedes Motors has just decided to start using Rapilux Tires on most models of its new cars. The tires cost the same as the previous tires, and the change in tires will not change the sticker price of any car, nor will it change the profit on the sale of any particular car. 
Nevertheless, the CEO of Diomedes Motors expects this change in tires to increase Diomedes' profits in the coming year.\nQuestion: Which of the following, if true, provides the best reason for the expectation?\nA. The change to Rapilux Tires in most Diomedes models moves those models to the highest safety rating category; several national consumer advocacy groups give focused advertising to cars in this category.\nB. Some of the tires previously used on these particular Diomedes models are being phased out by those tire manufacturers, and would have to be replaced anyway, and no tire that would fit these Diomedes Models is any cheaper than the Rapilux Tires chosen.\nC. The new Rapilux Tires facilitate loading the new cars onto the trucks that deliver them to the Diomedes showroom.\nD. Few car buyers are likely to decide against buying a Diomedes Motors model simply because it now comes with Rapilux Tires.\nAnswer and reasoning step by step:", 55 | "output": "A. A. The change to Rapilux Tires in most Diomedes models moves those models to the highest safety rating category; several national consumer advocacy groups give focused advertising to cars in this category.\n\nReasoning: This option indicates that the change to Rapilux Tires will increase the safety rating of Diomedes Motors' cars. With higher safety ratings, the cars may attract more customers due to the focused advertising by consumer advocacy groups. Ultimately, this can lead to higher sales, which can increase the company's profits." 56 | }, 57 | { 58 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. 
Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 59 | "input": "\nPassage: Safety inspector: The number of laboratory samples of rabies virus sent through the university delivery service has recently grown dangerously high. We need to limit this use of the service. Biologist: There is no need for a limit. The university delivery service has been handling rabies virus samples for 20 years with no accidents.\nQuestion: As a rebuttal of the safety inspector's claim, the biologist's reasoning is flawed in that it\nA. fails to explain why the number of samples sent through the service has increased\nB. offers no defense of the university's need for continued delivery of rabies virus samples\nC. overlooks the possibility that there has been a change in the hazardousness of the rabies virus samples themselves\nD. does not address the potential for harm that is posed by the recent increase in the number of samples sent through the service\nAnswer and reasoning step by step:", 60 | "output": "D. D. The biologist's reasoning is flawed because it focuses on the past 20 years of safe handling but does not address the potential harm that the recent increase in the number of samples sent through the service could cause." 61 | }, 62 | { 63 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. 
Read the passage thoroughly to ensure you know what the passage entails.", 64 | "input": "\nPassage: A new machine for harvesting corn will allow rows to be planted only fifteen inches apart, instead of the usual thirty inches. Corn planted this closely will produce lower yields per plant. Nevertheless, the new machine will allow corn growers to double their profits per acre because __.\nQuestion: Which of the following most logically completes the argument?\nA. with the larger number of plants growing per acre, more fertilizer will be required\nB. with the closer spacing of the rows, the acreage on which corn is planted will be utilized much more intensively than it was before, requiring more frequent fallow years in which corn fields are left unplanted\nC. with the closer spacing of the rows, the growing corn plants will quickly form a dense canopy of leaves, which will, by shading the ground, minimize the need for costly weed control and irrigation\nD. with the closer spacing of the rows, corn plants will be forced to grow taller because of increased competition for sunlight from neighboring corn plants\nAnswer and reasoning step by step:", 65 | "output": "C. C. with the closer spacing of the rows, the growing corn plants will quickly form a dense canopy of leaves, which will, by shading the ground, minimize the need for costly weed control and irrigation\n\nReasoning: The question asks us to find the factor that allows corn growers to double their profits per acre. Option C explains that the dense canopy of leaves because of closer spacing will minimize the need for expensive weed control and irrigation, thus reducing costs and increasing profits for corn growers. The other options do not directly contribute to doubling the profits per acre." 66 | }, 67 | { 68 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. 
There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 69 | "input": "\nPassage: Mammals cannot digest cellulose and therefore cannot directly obtain glucose from wood. Mushrooms can, however; and some mushrooms use cellulose to make highly branched polymers, the branches of which are a form of glucose called beta-glucans. Beta-glucan extracts from various types of mushrooms slow, reverse, or prevent the growth of cancerous tumors in mammals, and the antitumor activity of beta-glucans increases as the degree of branching increases. These extracts prevent tumor growth not by killing cancer cells directly but by increasing immune-cell activity.\nQuestion: Which one of the following is most strongly supported by the information above?\nA. If extracts from a type of mushroom slow, reverse, or prevent the growth of cancerous tumors in mammals, then the mushroom is capable of using cellulose to make beta-glucans.\nB. Any organism capable of obtaining glucose from wood can use cellulose to make beta-glucans.\nC. The greater the degree of branching of beta-glucans, the greater the degree of immune-cell activity it triggers in mammals.\nD. Mammals obtain no beneficial health effects from eating cellulose.\nAnswer and reasoning step by step:", 70 | "output": "C. C. 
The greater the degree of branching of beta-glucans, the greater the degree of immune-cell activity it triggers in mammals.\n\nReasoning: In the passage, it is mentioned that \"the antitumor activity of beta-glucans increases as the degree of branching increases\" and that \"these extracts prevent tumor growth not by killing cancer cells directly but by increasing immune-cell activity.\" Therefore, the statement that the greater the degree of branching of beta-glucans, the greater the degree of immune-cell activity it triggers in mammals is most strongly supported by the information given in the passage." 71 | }, 72 | { 73 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 74 | "input": "\nPassage: Between 1951 and 1963, it was illegal in the country 25. of Geronia to manufacture, sell, or transport any alcoholic beverages. Despite this prohibition, however, the death rate from diseases related to excessive alcohol consumption was higher during the first five years of the period than it was during the five years prior to 1951. Ttherefore, the attempt to prevent alcohol use merely made people want and use alcohol more than they would have if it had not been forbidden.\nQuestion: Each of the following, if true, weakens the argument EXCEPT:\nA. Death from an alcohol-related disease generally does not occur until five to ten years after the onset of excessive alcohol consumption.\nB. 
The diseases that can be caused by excessive alcohol consumption can also be caused by other kinds of behavior that increased between 1951 and 1963.\nC. Many who died of alcohol-related diseases between 1951 and 1963 consumed illegally imported alcoholic beverages produced by the same methods as those used within Geronia.\nD. The death rate resulting from alcohol-related diseases increased just as sharply during the ten years before and the ten years after the prohibition of alcohol as it did during the years of prohibition.\nAnswer and reasoning step by step:", 75 | "output": "C. Answer: C\n\nReasoning: The argument claims that the prohibition of alcohol in Geronia made people want and use alcohol more than they would have if it had not been forbidden. Option A weakens the argument by suggesting that the deaths during the prohibition might be the result of alcohol consumption prior to the prohibition. Option B weakens the argument by offering an alternative explanation for the increased death rate, i.e., other behaviors that increased during the prohibition period. Option D weakens the argument by showing that the death rate increase was consistent before, during, and after the prohibition, suggesting an unrelated factor.\n\nOption C, however, does not weaken the argument as it confirms that people were still consuming alcoholic beverages even though it was illegal. In fact, the illegally imported beverages were produced using the same methods as those in Geronia, suggesting that the prohibition did not significantly change people's access to alcohol. Therefore, option C does not weaken the argument and is the correct answer." 76 | }, 77 | { 78 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. 
If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 79 | "input": "\nPassage: Forty miles to the west of Tucson, a developer attempted to develop an \"oasis\" glade in the Sonoran Desert. He created a glade that, the developer claims, is as lush and plant-filled as the ambient 35 degree C temperatures allow. In particular, the developer planted several native fruit-bearing plants, such as prickly-pear cactus. Once the glade was established, a number of desert birds and mammals would regularly inhabit the glade, often eating most of the fruit that falls from the plants.\nQuestion: The considerations given best serve as part of an argument that:\nA. the presence of the desert animals in the glade do not serve as evidence against the claim of the developer.\nB. most of these desert animals in the glade would have perished if the glade had not been created in that location.\nC. the shade from all the plants in the glade measurably reduces the ambient temperature within the glade.\nD. enclosing the fruit -- bearing plants in wire cages to prevent consumption by desert animals would allow for fruit sales to cover the cost of the glade's development.\nAnswer and reasoning step by step:", 80 | "output": "A. A. The passage mentions that the developer created a lush and plant-filled glade, and after its establishment, desert animals started inhabiting the area and eating the fallen fruit. This information supports the developer's claim of creating an oasis in the desert, so the presence of the animals does not serve as evidence against the claim of the developer. Therefore, the correct answer is A." 
81 | }, 82 | { 83 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 84 | "input": "\nPassage: Some scientists believe that 65 million years ago an asteroid struck what is now the Yucatan Peninsula, thereby causing extinction of the dinosaurs. These scientists have established that such a strike could have hurled enough debris into the atmosphere to block sunlight and cool the atmosphere. Without adequate sunlight, food sources for herbivorous dinosaurs would have disappeared, and no dinosaurs could have survived a prolonged period of low temperatures. These same scientists, however, have also established that most debris launched by the asteroid would have settled to the ground within six months, too soon for the plants to disappear or the dinosaurs to freeze.\nQuestion: Which one of the following, if true, most helps to resolve the apparent discrepancy between the scientists' beliefs and the scientists' results, as described above?\nA. Dinosaurs were susceptible to fatal respiratory problems caused by contamination of the air by asteroid debris.\nB. A cloud of debris capable of diminishing sunlight by 20 percent would have cooled the earth's surface by 7 to 10 degrees Celsius.\nC. Loss of the herbivorous dinosaurs would have deprived the carnivorous dinosaurs of their food source.\nD. 
Dinosaurs inhabited most landmasses on the planet but were not especially abundant in the area of the asteroid strike.\nAnswer and reasoning step by step:", 85 | "output": "A. A. This option provides a reason for the dinosaurs' extinction that is not dependent on the disappearance of plants or a prolonged period of low temperatures. With fatal respiratory problems caused by asteroid debris, the extinction of dinosaurs could have happened within the six-month timeline. Therefore, the answer is A." 86 | }, 87 | { 88 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 89 | "input": "\nPassage: The five senses have traditionally been viewed as distinct yet complementary. Each sense is thought to have its own range of stimuli that are incapable of stimulating the other senses. However, recent research has discovered that some people taste a banana and claim that they are tasting blue, or see a color and say that it has a specific smell. This shows that such people, called synesthesiacs, have senses that do not respect the usual boundaries between the five recognized senses.\nQuestion: Which one of the following statements, if true, most seriously weakens the argument?\nA. Recent evidence strongly suggests that there are other senses besides sight, touch, smell, hearing, and taste.\nB. Synesthesiacs can be temporarily rid of their synesthetic experiences by the use of drugs.\nC. 
The particular ways in which sensory experiences overlap in synesthesiacs follow a definite pattern.\nD. Synesthesiacs demonstrate a general, systematic impairment in their ability to use and understand words.\nAnswer and reasoning step by step:", 90 | "output": "D. D. The argument claims that synesthesiacs have senses that don't respect the boundaries between the five recognized senses. If synesthesiacs demonstrate a general, systematic impairment in their ability to use and understand words, then it weakens the argument that their senses are crossing boundaries because it could be a language or communication issue rather than a sensory issue." 91 | }, 92 | { 93 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 94 | "input": "\nPassage: \"Multiple use\" refers to the utilization of natural resources in combinations that will best meet the present and future needs of the public. Designating land as a wilderness area does not necessarily violate the multiple-use philosophy, for even when such use does not provide the greatest dollar return, it can provide the greatest overall benefit from that site.\nQuestion: Which one of the following is an assumption required by the argument?\nA. The multiple-use philosophy takes into account some nonfinancial needs of the public.\nB. Natural resources should be used in combinations that will most greatly benefit present and future generations.\nC. 
The present and future needs of the public would best be met by designating greater numbers of wilderness areas.\nD. Designating a wilderness area prevents any exploitation of natural resources in that area.\nAnswer and reasoning step by step:", 95 | "output": "A. A. The multiple-use philosophy takes into account some nonfinancial needs of the public.\n\nReasoning: The passage states that designating land as a wilderness area does not necessarily violate the multiple-use philosophy, even when it does not provide the greatest dollar return, because it can still provide the greatest overall benefit. This implies that the multiple-use philosophy considers not only financial aspects but also nonfinancial needs of the public." 96 | }, 97 | { 98 | "instruction": "Instructions: You will be presented with a passage and a question about that passage. There are four options to be chosen from, you need to choose the only correct option to answer that question. If the first option is right, you generate the answer 'A', if the second option is right, you generate the answer 'B', if the third option is right, you generate the answer 'C', if the fourth option is right, you generate the answer 'D'. Read the question and options thoroughly and select the correct answer from the four answer labels. Read the passage thoroughly to ensure you know what the passage entails.", 99 | "input": "\nPassage: The Ergonomic Society conducted a study that indicated that many people develop severe back problems during adulthood, and that virtually all such people who received chiropractic treatment showed great improvement. Ttherefore, in order to minimize the proportion of the population that suffers from back pain, the Ergonomic Society recommended that chiropractic treatment be directed toward those adults who suffer from severe back problems.\nQuestion: Which of the following is an assumption on which the argument depends?\nA. 
Severe back pain and other problems often cause individuals to miss workdays.\nB. Chiropractic treatment is more effective in treating severe back problems when utilized over a long period of time, as opposed to sporadically.\nC. Individuals who receive chiropractic or other treatment prior to developing severe back problems are not less likely to develop back pain than those who do not.\nD. Any person who receives chiropractic treatment for back pain may also benefit from other forms of treatment.\nAnswer and reasoning step by step:", 100 | "output": "C. C. The argument assumes that treating individuals before they develop severe back problems would not be more effective in minimizing the proportion of the population suffering from back pain than focusing on treating those who already have severe back problems." 101 | }, 102 | ] 103 | -------------------------------------------------------------------------------- /Data/mrc_stat.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csitfun/LogiCoT/42ab9deccf625594598270dc63ee83b0ebf7b1ef/Data/mrc_stat.png -------------------------------------------------------------------------------- /Data/mrc_zh_examples.jsonl: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 4 | "input": "有些广东人不爱吃辣椒.因此,有些南方人不爱吃辣椒.\n\n以下哪项能保证上述论证的成立?\n\nA.有些广东人爱吃辣椒\n\nB.爱吃辣椒的有些是南方人\n\nC.所有的广东人都是南方人\n\nD.有些广东人不爱吃辣椒也不爱吃甜食", 5 | "output": "C. C\n\n理由:首先,题目给出的两个命题是:\n1. 有些广东人不爱吃辣椒。\n2. 
因此,有些南方人不爱吃辣椒。\n\n为了判断这个推理是否成立,我们需要确定广东人与南方人之间的关系。在这四个选项中,只有选项C提供了这个关系:“所有的广东人都是南方人”。根据这个关系,我们可以推断出如果有些广东人不爱吃辣椒,那么确实有些南方人不爱吃辣椒,因为广东人也属于南方人。所以,选项C能保证上述论证的成立。" 6 | }, 7 | { 8 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 9 | "input": "室内荧光灯的连续照射对患心脏病的仓鼠的健康有益.一群暴露在荧光灯连续照射下的仓鼠平均寿命比另一群同种但生活在黑墙中的仓鼠长2.5%.\n\n下列哪一项问题是进行上述实验的最初动因?\n\nA.医院的光照疗法能被证明对病人的恢复有促进作用吗?\n\nB.生活在灯光下的仓鼠与生活在黑暗中的仓鼠哪一种更长寿?\n\nC.仓鼠患的是什么病?\n\nD.是否有些仓鼠需要一段时间的阴暗?", 10 | "output": "A. A. 医院的光照疗法能被证明对病人的恢复有促进作用吗?\n\n原因:这个实验的关键在于观察患心脏病的仓鼠生活在常亮荧光灯环境下是否对其健康有益。因此,实验的动机应该和疗法效果相关。选项A涉及光照疗法对病人康复的促进作用,与实验内容相关,所以选择A。" 11 | }, 12 | { 13 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 14 | "input": "毫无疑问,未成年人吸烟应该加以禁止.但是,我们不能为了防止给未成年人吸烟以可乘之机,就明令禁止自动售烟机的使用.这种禁令就如同为了禁止无证驾车在道路上设立路障,这道路障自然禁止了无证驾车,但同时也阻挡了99%以上的有证驾驶者.\n\n为了对上述论证作了评价,回答以下哪个问题最为重要?\n\nA.未成年吸烟者在整个吸烟者中所占的比例是否超过1%\n\nB.禁止使用自动售烟机带给成年购烟者的不便究竟有多大?\n\nC.无证驾车者在整个驾车者中所占的比例是否真的不超过1%\n\nD.未成年人吸烟的危害,是否真如公众认为的那样严重?", 15 | "output": "B. A. 未成年吸烟者在整个吸烟者中所占的比例是否超过1% \n这个问题与论证关系不大。因为在论证中,主要讨论的是禁止自动售烟机是否合理,而不是关注未成年人吸烟的比例。\n\nB. 禁止使用自动售烟机带给成年购烟者的不便究竟有多大?\n这个问题最为重要,因为在论证中,作者提到禁止自动售烟机的使用对于成年购烟者是不便的,那么这个不便有多大就是评价这个论证的重要因素。\n\nC. 无证驾车者在整个驾车者中所占的比例是否真的不超过1%\n这个问题与论证关系不大。作者在文中引用无证驾车者占比来说明禁止自动售烟机的不合理性,但这只是一个类比,并不意味着无证驾车者的具体占比与论证息息相关。\n\nD. 未成年人吸烟的危害,是否真如公众认为的那样严重?\n这个问题与论证关系也不大,因为这不是论证的关键点。论证的关键是关注禁止自动售烟机是否合理。\n\n所以,回答B问题最为重要。" 16 | }, 17 | { 18 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 19 | "input": "一份研究报告指出,为接受研究的3-5岁的孩子准备的一个特殊教育方案提高了他们在今后学校教育中获得成功的可能.因此,对所有孩子实行类似的教育方案会提高他们在以后学校教育中取得成功的机会.\n\n以下哪项最能说明以上归纳的逻辑漏洞?\n\nA.儿童在3-5岁时的认知能力是不断变化的\n\nB.在国家的基础上建立这样的教育培训项目需要特别的公共支出\n\nC.很多父母错误地认为早期正规教育将占用孩子们本来可以更好地独立探索世界的时间\n\nD.调查人员并未意识到他们把以前曾受到另一种教育的一大批孩子包括进来", 20 | "output": "A. 
A.儿童在3-5岁时的认知能力是不断变化的\n\n原因: 文中提到的研究报告是针对3-5岁孩子的,这个年龄段的孩子的认知能力是在快速发展变化的。因此,当研究者试图将这个特殊教育方案应用于所有孩子时,这可能并不合适。因为不同年龄段的孩子认知能力的发展速度和需求差异很大,所以对于其他年龄段的孩子来说,这个方案可能不具有同样的效果。这就是归纳过程中的一个逻辑漏洞。" 21 | }, 22 | { 23 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 24 | "input": "卖国贼就是说诳的,所以你是卖国贼,所以我是爱国者.爱国者的话是最有价值的,所以我的话是不错的,我的话既然不错,你就是卖国贼无疑了!\n\n下列哪项犯了与上述相同的逻辑错误?\n\nA.文艺不值得重视.因为现代化建设必须依靠科学技术,科学技术是值得我们重视的,而文艺不是科学技术.\n\nB.不读大学也可能成为杰出的人才.高尔基,爱迪生,法拉第,富兰克林等都没有读过大学,但他们都成为杰出的人才.\n\nC.地球是球体,可以从这样的事实得到证明:我们站在高处看海中帆船驶来,总是先见桅杆后见船身.之所以这样,就因为地球是球体.\n\nD.我几次看见他们从这间化工厂出来,才知道这位热心帮助病人的小青年,原来是这间化工厂的工人.", 25 | "output": "C. A: 文艺不值得重视。因为现代化建设必须依靠科学技术,科学技术是值得我们重视的,而文艺不是科学技术。\n\n这个选项犯了逻辑错误,但与问题中的例子不同。这里的错误在于,将科学技术和文艺之间形成对立关系,根据科学技术值得重视,得出文艺不值得重视的结论,这是一个假设,没有确凿证据。\n\nB: 不读大学也可能成为杰出的人才。高尔基,爱迪生,法拉第,富兰克林等都没有读过大学,但他们都成为杰出的人才。\n\n这个选项逻辑没有错误,举例子证明了不读大学也能成为杰出人才的观点。与问题中的例子不同。\n\nC: 地球是球体,可以从这样的事实得到证明:我们站在高处看海中帆船驶来,总是先见桅杆后见船身。之所以这样,就因为地球是球体。\n\n这个选项逻辑没有错误,提供了观察事实的证据来支持地球是球体的观点。与问题中的例子不同。\n\nD: 我几次看见他们从这间化工厂出来,才知道这位热心帮助病人的小青年,原来是这间化工厂的工人。\n\n这个选项逻辑没有错误,根据观察事实得出小青年是这间化工厂的工人。与问题中的例子不同。\n\n综上所述,没有选项与问题中的例子犯了相同的逻辑错误。我的原先回答是错误的。" 26 | }, 27 | { 28 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 29 | "input": "去年的通货膨胀率是1.2%,今年到目前为止已经达到4%,因此我们可以得出结论:通货膨胀率呈上升趋势,明年的通货膨胀率会更高.\n\n以下哪项如果为真,将严重削弱上述结论?\n\nA.通货膨胀率是根据有代表性的经济灵敏数据样本计算的,而不是根据已有数据\n\nB.去年通货膨胀是最近几年来最低的\n\nC.通货膨胀率上涨造成的虚假繁荣终将导致经济的衰退\n\nD.去年的金融风暴导致通货膨胀率暂时低于近几年来4%的平均水平", 30 | "output": "D. 
D.去年的金融风暴导致通货膨胀率暂时低于近几年来4%的平均水平\n\n原因:\n\n题目给出的结论是:通货膨胀率呈上升趋势,明年的通货膨胀率会更高。我们需要找到一个选项,如果为真,将削弱这个结论。\n\nA选项提到通货膨胀率是根据经济数据样本计算的,并不会影响结论的正确性。\nB选项指出去年通货膨胀是最近几年来最低的,这与结论无关,不会削弱结论。\nC选项提到通货膨胀率上涨会导致经济衰退,但这也不会削弱结论。\n\n而D选项指出,去年金融风暴导致通货膨胀率暂时降低,说明去年的低通货膨胀率可能是特殊情况,今年通货膨胀率上涨并不代表通货膨胀率会持续上涨。如果D为真,将削弱结论指出的通货膨胀率呈上升趋势。因此选D。" 31 | }, 32 | { 33 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 34 | "input": "亚里士多德的《辨谬篇》记载有这样一则诡辩:\"你有一条狗,它是有儿女的,因为它是一个父亲;它是你的,因而它是你的父亲;你打它,就是打你自己的父亲.\"\n\n以下哪项犯有与上述诡辩相同的逻辑谬误?\n\nA.犬有四足,羊有四足,因而犬可以为羊\n\nB.\"起来吧,捷克,公鸡早就叫了.\"\"这关我什么事?我又不是母鸡.\"\n\nC.当有人说欧谛德谟说谎时,他狡辩说:\"谁说谎,谁就是说不存在的东西;不存在的东西是无法说的,因此没有人说谎.\"\n\nD.你的观点被证明是错误的,那么我批评你的观点就是正确的.既然我的观点正确,那么,你的观点错误也就确属无疑了.", 35 | "output": "D. A选项的谬误在于类比不当,将两种不同的事物因为拥有相似的特点就认为是同一个事物。\n\nB选项的谬误在于歧义,将“公鸡叫了”这个事实与“我又不是母鸡”进行不当对比,认为自己和公鸡叫无关。\n\nC选项的谬误在于循环论证,欧谛德谟自己提出了一种无法证明的说法(不存在的东西无法说),并以此作为论据证明自己没有说谎。\n\nD选项与诡辩给出的例子类似,都是通过一种表面上看似合理的论证,实际上具有循环论证或歧义的错误。在这个选项中,首先认为自己批评对方观点是正确的,然后由于自己观点正确,就认为对方观点必然错误。这种论证实际上存在循环论证问题,没有真正证明为什么对方观点错误。\n\n所以,与诡辩中的谬误相同的选项是D。" 36 | }, 37 | { 38 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 39 | "input": "按照\"上帝创世说\",上帝在第一天创造了地球,第二天创造了月亮,第三天才创造出太阳.\n\n下列哪项断定不是上述创世说的文中之义?\n\nA.天体星球的形成是有先有后的\n\nB.关于\"一天\"的含义,上帝与我们的理解和运用是一样的\n\nC.至少地球存在的头两天是没有太阳的\n\nD.没有太阳,一片漆黑,上帝也能工作", 40 | "output": "B. A - 这个选项符合上述创世说的文中之义,因为上帝在不同的天创造了地球、月亮和太阳,这意味着天体星球的形成是有先有后的。\n\nB - 这个选项可能并不是上述创世说的文中之义。因为这个假设上帝与我们有相同的\"一天\"的概念,这可能与创世说的寓意不符。创世纪中的“天数”并不一定等同于人类对地球上一天的理解。因此,最合适的答案是 B。\n\nC - 这个选项符合上述创世说的文中之义,因为地球是在第一天创造的,而太阳是在第三天创造的,所以地球头两天是没有太阳的。\n\nD - 这个选项符合上述创世说的文中之义,因为即使没有太阳,一片漆黑,上帝也能在这个过程中创造地球。" 41 | }, 42 | { 43 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 44 | "input": "某语言学家认为,\"未婚妻\"的说法是矛盾的.因为\"未婚妻\"的词典解释是,\"已经订婚尚未结婚的妻子\",而\"妻子\"的词典解释是,\"男妇两人结婚后,女子是男子的妻子\".没有结婚,怎么能称妻子?\n\n以下哪项是该语言学家的所实际预设的前提?\n\nA.\"未婚妻\"的说法可能会导致同居关系合法化\n\nB.有的虽然没有正式结婚,但其实质是一种事实婚姻\n\nC.事实上,\"未婚妻\"是一种不确定的婚姻关系\n\nD.所谓\"未婚妻\"实际上是指现在没有结婚,但是现在该女子是某男子的妻子", 45 | "output": "D. 
该语言学家认为\"未婚妻\"的说法是矛盾的,其主要依据是:\"未婚妻\"的词典解释为\"已经订婚尚未结婚的妻子\",而\"妻子\"的词典解释是\"男女两人结婚后,女子是男子的妻子\"。因此,问题中提到的前提就是选项D:\"所谓\"未婚妻\"实际上是指现在没有结婚,但是现在该女子是某男子的妻子\"。\n\n该选项恰恰表达了语言学家的看法,即\"未婚妻\"这个词的定义存在矛盾,因为按照词典解释,既然妻子是指结婚后的女子称呼,那么\"未婚妻\"这个词语在定义上就存在问题。\n\nA、B、C选项都没有表述清楚语言学家所认为的矛盾存在。选项A中的合法化问题与语言学家关注的定义矛盾无关。选项B提到事实婚姻,但这与语言学家关注的问题无关。选项C提到的不确定性也无法解释语言学家对\"未婚妻\"这个词的矛盾看法。因此,正确答案是D。" 46 | }, 47 | { 48 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 49 | "input": "某次全市司法工作者会议进行分组讨论.其中有一个小组共8人.会议开始时小组长问大家原来各自是否认识.结果是全组中仅有一人认识小组中的3个,有3人认识小组中的2人,有4人认识小组中的1人.\n\n如以上统计属实,则最能得出以下哪项结论?\n\nA.小组长认识的小组中人数最多,其他的人相互认识的人就少\n\nB.此类会议是第一次举行,大家都是新面孔\n\nC.有些成员所说的认识可能仅仅是在电视或报告会上见过而已\n\nD.虽然组内成员原来的熟人不多,但原来认识的却是至交", 50 | "output": "C. 1. 首先由统计我们知道:仅有一人认识小组中的3个(设为A),有3人认识小组中的2人(设为B1,B2,B3),有4人认识小组中的1人(设为C1,C2,C3,C4)。\n \n2. 因为A认识3个人,所以他认识的这3个人只能是B和其中的两个C,否则A就会认识超过3个人。所以,A认识B1,B2,B3中的某两个人和C1,C2,C3,C4中的某一个人。\n\n3. 设B1和B2是A认识的小组成员,则B3认识的2个人中必然包括A,另一个人必然是C类中的一个,否则会导致假设错误。所以,B1、B2和B3他们中的每个人都认识至少一个C类成员。\n\n4. 现在我们分析C类成员之间的关系。由于C1、C2、C3、C4四个人各自只认识一个人,且他们至少与3个B类成员有联系。所以C类四个成员中,有一个认识A,另外三个分别认识B1、B2和B3。这样C类四个人认识的四个人加上A认识的B1、B2和另一个C类成员,就是完全符合问题描述中的关系的唯一解。\n\n根据以上分析,可以看出C选项符合这种情况:有些成员所说的认识可能仅仅是在电视或报告会上见过而已,因为他们之间认识的人数关系是符合描述的,但不一定意味着私下熟悉。所以答案是C。" 51 | }, 52 | { 53 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 54 | "input": "某连锁店老总对于前一阶段的过渡扩展带来的管理问题束手无策,于是问一位著名管理专家:\"我不知道如何管理好下属的300多个连锁店?\"那位管理专家直截了当回答:\"削减它们.\"老总问:\"削减多少?\"专家说:\"直到你能看得住它们为止.\"\n\n上述管理专家的回答是以下面哪项陈述为前提的?\n\nA.下属连锁店多少与连锁店的性质有关.\n\nB.下属连锁店多少与连锁店的发展有关.\n\nC.下属连锁店多少与连锁店领导层的人数有关.\n\nD.下属连锁店多少与老总的能力有关.", 55 | "output": "D. D.下属连锁店多少与老总的能力有关.\n\n理由:\n1. 管理专家的回答与对老总能力的评估有关。专家建议削减连锁店的数量,直到老总能够有效地管理它们。这意味着连锁店的数量应该取决于老总的管理能力。\n\n2. A,B,C选项并未直接涉及到老总的能力。而管理专家的回答是基于当前老总的能力来给出建议的。所以,这里的前提是D.下属连锁店多少与老总的能力有关。\n\n3. 
老总的能力决定了他能有效管理的连锁店数量。所以,这段文字中管理专家的回答是基于这个陈述作为前提的。" 56 | }, 57 | { 58 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 59 | "input": "李林是公务员,但不是大学毕业生\n\n下列哪项断定可必然推得?\n\nA.并非大学毕业生都是公务员\n\nB.公务员不都是大学毕业生\n\nC.有的大学毕业生不是公务员\n\nD.有的公务员是大学毕业生", 60 | "output": "B. 1. 题干给出信息:李林是公务员,但不是大学毕业生。\n\n2. 根据给出的信息,可以得出结论:存在一个公务员不具有大学毕业生身份。换句话说就是公务员群体中,有的不是大学毕业生。\n\n3. 所以选项B(公务员不都是大学毕业生)是正确答案。" 61 | }, 62 | { 63 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 64 | "input": "某研究机构进行了一次为期10周的减肥试验,参加者平均减肥9公斤,中年组参加者平均减肥13公斤,青年组参加者平均减肥7公斤.研究人员将中青年组减肥差异归结为中年组参加者减肥前体重比青年组参加者重.\n\n从上文可推出以下哪个结论?\n\nA.青年组参加者减肥前体重都比中年组参加者轻\n\nB.所有参加者体重均下降\n\nC.青年组参加者比中年组参加者多\n\nD.中年组参加者比青年组参加者多", 65 | "output": "C. 首先,我们知道试验的参加者分为中年组和青年组,而平均减肥9公斤是针对所有参加者的。中年组参加者平均减肥13公斤,青年组参加者平均减肥7公斤,说明这两组人的减肥效果有差异。\n\nA项:不能确定青年组参加者减肥前的体重都比中年组轻,只能推断出中年组减肥前平均体重比青年组重。\n\nB项:这个结论也不能肯定,因为只知道平均减肥是9公斤,但不同个体可能减肥效果有差异,有的人可能并未瘦下来。\n\nD项:不能确定同年组参加者的多少。\n\n回到C项。由平均减肥量的数据可以发现,中年组的平均减肥量比青年组多6公斤,而整体平均减肥量是9公斤。由此,我们可以推断出青年组参加者数量应该是中年组的数倍,以些建立平均减肥量的数值。因此,可以判断出C项:青年组参加者比中年组参加者多。" 66 | }, 67 | { 68 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 69 | "input": "鸽子能从几百里甚至几千里外准确无误地飞回自己的家.鸽子是怎样认出归家之路的呢?长期以来,这一直是个迷,吸引着许多人来研究这个问题.有些研究人员根据他们所提出的假说,把鸽子装入严密遮挡的笼子,运到一个陌生的地方放飞,他们照样能轻而易举地找到回家的方向.如果把毛玻璃接触透镜装在鸽子眼上,使他们看不到几米以外的东西,然后将这些\"近视眼\"鸽子运到100多公里外放飞,它们也能回家,并且准确地降落在鸽舍附近一到二米的地方——只是在这个距离上才需要眼睛帮忙.\n\n根据以上情况,以下哪项是这些研究人员所提出的假说?\n\nA.鸽子眼神好,记忆力惊人\n\nB.鸽子身上有一个相当精密的导航系统\n\nC.鸽子能按地磁导航\n\nD.在太阳发生强烈磁爆期间以及月蚀时,鸽子会失去定向的能力", 70 | "output": "B. 
根据研究人员所做的实验,鸽子即使被装入严密遮挡的笼子、眼睛上装有毛玻璃透镜使眼睛看不到几米之外的东西(近视眼),仍然能轻而易举地找到回家的方向和准确地降落在鸽舍附近。这说明鸽子的这种能力并不完全依赖于视觉和对地理位置的记忆,而是有一种其他的导向方法。\n\n选项A强调了鸽子的眼神和记忆力,但实验表明凭借这两点不能回答鸽子如何找到回家的路;选项D提到了鸽子在特定情况下失去定向的能力,然而这不能解释鸽子在正常情况下如何找到回家的路。\n\n选项B提到了鸽子身上有一个相当精密的导航系统,这种说法可以解释实验中鸽子在几乎没有视觉和记忆力帮助的情况下找到回家的方向;选项C也提到了地磁导航,这可以被看作是这个导航系统的一部分,然而选项B更全面地描述了鸽子的导航能力。\n\n因此,答案是选项B。" 71 | }, 72 | { 73 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 74 | "input": "为一个大型建筑项目进行的融资是由一家银行提供的.当资金在项目完成前用完时,银行批准了进一步的贷款.现在,随着资金再度用完而项目仍未完成,银行拒绝增加进一步的贷款,虽然没有这些贷款项目就泡汤了.\n\n以下哪项如果为真,能最好地说明银行现在的反应和他们先前反应不同的理由?\n\nA.银行重新评估了完成后的项目的潜在收入,并且得出结论项目可产生的总收入会低于旧货款加上新货款的总利息\n\nB.银行发现了几个其他的项目,如果现在批准对它们的贷款让这些项目启动,这些项目能促使本金更快地回收\n\nC.银行与借方一致同意,建筑贷款以完成的项目作为担保\n\nD.成本超支主要是因为在建筑工程最困难的阶段出现了未能预见的问题", 75 | "output": "A. A. 银行重新评估了完成后的项目的潜在收入,并且得出结论项目可产生的总收入会低于旧货款加上新货款的总利息\n\n理由:这个选项解释了银行为什么现在不愿意再次批准贷款。先前的贷款可能是基于对项目潜在收入的高估,而现在银行重新评估了项目收入和贷款利息之间的关系,认为贷款的收益不再划算。因此,银行与先前的策略不同,现在决定不再提供贷款。其他选项并不直接解释为什么银行改变了与前一次相反的决策。" 76 | }, 77 | { 78 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 79 | "input": "烟草工业财源滚滚,预计今后还会这样.去年美国烟农所售出的烟草总量大大增加,但国内吸烟的人数却在不断下降.\n\n以下各项如果为真,都能成为上述现象的理由,除了:\n\nA.今年开始吸烟的妇女多于已经戒烟的男子\n\nB.持续吸烟的人消费的烟草比往年更多\n\nC.今年开始吸烟草,吸鼻烟的人数多于已经戒烟的人数\n\nD.今年美国生产的香烟出口到其他国家的数量比去年多", 80 | "output": "A. A. 今年开始吸烟的妇女多于已经戒烟的男子 - 这一选项说明吸烟人数有所上升,但是题目给出的情况是吸烟人数在不断下降。所以A选项不能解释现象。\n\nB. 持续吸烟的人消费的烟草比往年更多 - 这一选项说明即使吸烟人数在减少,但是剩下的吸烟者消费了更多的烟草。这可能是烟草工业财源滚滚的一个原因。\n\nC. 今年开始吸烟草,吸鼻烟的人数多于已经戒烟的人数 - 这一选项说明虽然香烟吸食者数量在减少,但更多的人开始吸鼻烟,这也可能是烟草工业财源滚滚的一个原因。\n\nD. 今年美国生产的香烟出口到其他国家的数量比去年多 - 这一选项说明虽然美国国内的吸烟人数在减少,但是美国烟草的国际市场需求增加了。这也是烟草工业财源滚滚的一个原因。\n\n综上所述,A选项不能解释现象。" 81 | }, 82 | { 83 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 84 | "input": "乡下的居民比城里的居民有更强的购买力,因为他们可以把城里及近郊居民用来买食品和房屋的一部分资金用来购买其他所需的东西.\n\n以下哪项最符合上述对乡下居民购买力强的原因的解释?\n\nA.一般来说,乡下居民人数多于城里或近郊\n\nB.乡下居民在食品和住房方面的开支比城里或近郊的居民低\n\nC.城市和郊区居民平均收入水平总体上不比乡下居民高多少\n\nD.无论城市居民还是乡下居民,他们用于购买住房的开支都多于其他有关商品", 85 | "output": "B. 
B\n\n原文提到乡下的居民比城里的居民有更强的购买力,因为他们可以把城里及近郊居民用来买食品和房屋的一部分资金用来购买其他所需的东西。这里的关键是乡下居民能够把本来用于买食品和房屋的钱用于其他方面,说明他们在食品和房屋方面的支出要少于城里或近郊的居民。\n\n所以选项B:“乡下居民在食品和住房方面的开支比城里或近郊的居民低”最符合上述对乡下居民购买力强的原因的解释。" 86 | }, 87 | { 88 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 89 | "input": "1910年的一天,年轻的德国气象学家魏格纳躺在病床上,目光转向墙上的一幅世界地图,无意中发现,非洲西海岸线同南美东海岸线如此吻合,以致像一张撕碎了的纸,可以把它们重新拼合起来,魏格纳脑海里闪现出一个念头:两块大陆原是一整块,只是到后来才破裂,漂移开来,形成现在这个样子.经过进一步的研究,魏格纳提出了一个全新的假说——大陆漂移说.\n\n以下各项都从不同角度进一步支持了\"大陆漂移说\",除了:\n\nA.大西洋两岸以及印度洋两岸相对地层层序(地层构造)相同\n\nB.大西洋两岸的古生物种(植物化石和动物化石)几乎完全相同\n\nC.留在岩层中的痕迹表明,很久以前,今天的北极地区曾经一度是很热的沙漠\n\nD.地质痕迹表明,地球上五大洲几乎是同时形成的", 90 | "output": "D. A. 大西洋两岸以及印度洋两岸相对地层层序(地层构造)相同:这一事实表明,在过去的某个时间点这些地区曾经是连接在一起的。这就进一步支持了大陆漂移说。\n\nB. 大西洋两岸的古生物种(植物化石和动物化石)几乎完全相同:这一事实说明,这些地区原来可能属于同一个生物群落,也就是说,它们曾经是一个整体。这一事实进一步支持了大陆漂移说。\n\nC. 留在岩层中的痕迹表明,很久以前,今天的北极地区曾经一度是很热的沙漠:这一事实可以说明,地球上的地区在不同时期所处的地理位置和气候条件是不同的,这表明地球的外表发生了显著的变化,从而支持了大陆漂移说。\n\nD. 地质痕迹表明,地球上五大洲几乎是同时形成的:这一观点与大陆漂移说的基本思想不符,因为大陆漂移说认为大陆是通过长时间的漂移过程才逐渐形成今天的样子。所以,选项D并未进一步支持大陆漂移说。\n\n综上所述,答案是D。" 91 | }, 92 | { 93 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 94 | "input": "据官方报道,格陵兰的自杀率(每千人的自杀人数)1987年比1960年高17倍.因为格陵兰从1960年到1987年,由狩猎和捕鱼社会变成一个工业社会,自杀率的巨大变化一定是这一社会变化的结果.\n\n下列哪项如果为真,则对以上结论提出最严重的质疑?\n\nA.从狩猎和捕鱼社会变成工业社会,有时会引起重大犯罪率的变化\n\nB.在狩猎的捕鱼社会里也有相当一部分社会成员选择自杀\n\nC.根据官方的记录,1987年格陵兰自杀者数量确有一定程度的增长\n\nD.1987年,大多数发生在格陵兰的自杀案例被官方记录报道,而1960年大多数自杀案例并未记录在案", 95 | "output": "D. 
D\n\n原因如下:\n\n根据题干,1987年格陵兰的自杀率比1960年高17倍,作者认为这一变化是因为格陵兰从1960年到1987年从狩猎和捕鱼社会转变为工业社会。要质疑这个结论,需要找到一个事实来说明这一差距并非是由社会变化导致的。\n\nA项:重大犯罪率的变化与自杀率之间没有直接关系,因此不能用来质疑结论。\n\nB项:狩猎和捕鱼社会里的自杀现象并不否定这种社会变化可能导致自杀率的增长,所以也不能用来质疑结论。\n\nC项:实际上支持了原文的结论,格陵兰1987年的自杀率确实有所上升。\n\nD项:如果1987年的大多数自杀案例都被记录,而1960年的自杀案例却很少被记录,那么这样的数据差异可能是由于统计和记录的不完善,而非社会变化导致。因此,选择D可以对原文的结论提出质疑。" 96 | }, 97 | { 98 | "instruction": "阅读下面一段文字,这段文字后面会有一个问题和ABCD四个选项,运用逻辑推理选出最合适的选项作为问题的答案。", 99 | "input": "只有认识错误,才能改正错误.\"\n\n以下诸项都准确表达了上述断定的含义,除了:.\n\nA.除非认识错误,否则不能改正错误\n\nB.如果不认识错误,那么不能改正错误\n\nC.如果改正错误,那么已经认识错误\n\nD.只要认识错误,就能改正错误", 100 | "output": "D. A. 除非认识错误,否则不能改正错误:意味着要想改正错误必须先认识错误,与原句一致。\nB. 如果不认识错误,那么不能改正错误:意味着没有认识错误就无法改正错误,与原句一致。\nC. 如果改正错误,那么已经认识错误:意味着在改正错误的情况下,我们已经认识到了错误,与原句一致。\nD. 只要认识错误,就能改正错误:意味着只要认识到错误,就一定能够改正,但这里并没有强调认识错误是改正错误的必要条件,与原句不一致。" 101 | }, 102 | ] 103 | -------------------------------------------------------------------------------- /Data/mrc_zh_stat.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csitfun/LogiCoT/42ab9deccf625594598270dc63ee83b0ebf7b1ef/Data/mrc_zh_stat.png -------------------------------------------------------------------------------- /Data/readme.md: -------------------------------------------------------------------------------- 1 | # Instruction following data 2 | > For non-commercial research purpose only. 3 | 4 | **Important**: To request the full dataset, please 5 | 1. Submit an access request to our huggingface page [csitfun/LogiCoT](https://huggingface.co/datasets/csitfun/LogiCoT) through your huggingface account. 6 | 2. Send an email to Hanmeng Liu at hanhaishiyi@gmail.com. Please tell us your huggingface account username, your real name, org, and purpose. You need to guarantee that you will not share the data with others. We will approve your request after your info is provided. 
7 | 8 | 9 | ![](mrc_stat.png) 10 | 11 | `mrc.jsonl`: English instruction tuning data for the logical reading comprehension task 12 | 13 | ![](mrc_zh_stat.png) 14 | 15 | `mrc_zh.jsonl`: Chinese instruction tuning data for the logical reading comprehension task 16 | 17 | ![](general_inference_stat.png) 18 | 19 | `general_inference.jsonl`: English instruction tuning data for the general inference task 20 | 21 | * Remember, this version of the data does not filter out GPT-4 responses that give the wrong answer. If you need to, feel free to filter them out yourself. Share your version back with us if you would like it to be distributed, but remember that the only way to distribute it is through Hanmeng Liu. Please do not distribute it without permission. 22 | -------------------------------------------------------------------------------- /readme.md: -------------------------------------------------------------------------------- 1 | # LogiCoT: Logical Chain-of-Thought Instruction Tuning with GPT-4 2 | 3 | For more information, please refer to our Findings of EMNLP 2023 paper, [LogiCoT: Logical Chain-of-Thought Instruction-tuning Data Collection with GPT-4](https://aclanthology.org/2023.findings-emnlp.191.pdf) 4 | 5 | Updates: 6 | Our updated paper has been accepted to the Findings of EMNLP 2023. 7 | 8 | The dataset is now hosted on the Hugging Face dataset page [datatune/LogiCoT](https://huggingface.co/datasets/datatune/LogiCoT). It is the only distribution channel we currently allow. 9 | 10 | This repository provides the instructions and demonstrations for building generative large language models that perform formal logical reasoning. 11 | ## Seminal Data 12 | * LogicInference 13 | * EntailmentBank 14 | * FOLIO 15 | * ReClor 16 | * LogiQA 17 | ## Instruction types 18 | ### General inference task 19 | 1. Translate the following inference to logic notation 20 | 2. What can be inferred from the following premises in a single inference step (ignoring inferences that add new predicates or constants)?
Name the inference rule being used 21 | 3. What can be inferred from the following premises in a single inference step (ignoring inferences that add new predicates or constants)? 22 | 4. Consider the following premises. <***> Can we infer the following from them? <***> 23 | 5. Consider the following premises. <***> Can we infer <***> from them? If possible, name the inference rules being used at each step 24 | ### Multi-choice reading comprehension task 25 | 1. Identify the claim that must be true or is required in order for the argument to work. 26 | 2. Identify information that would strengthen an argument. 27 | 3. Identify information that would weaken an argument. 28 | 4. Identify information that would explain or resolve a situation. 29 | 5. Identify a flaw in an argument’s reasoning. 30 | ## Models 31 | We fine-tuned the LLaMA 7B model on two A100 GPUs. The model can be downloaded [here](https://huggingface.co/datatune/llama-7b-logicot) 32 | 33 | ## How to cite 34 | ``` 35 | @inproceedings{liu2023logicot, 36 | title={LogiCoT: Logical Chain-of-Thought Instruction Tuning}, 37 | author={Liu, Hanmeng and Teng, Zhiyang and Cui, Leyang and Zhang, Chaoli and Zhou, Qiji and Zhang, Yue}, 38 | booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023}, 39 | pages={2908--2921}, 40 | year={2023} 41 | } 42 | ``` 43 | --------------------------------------------------------------------------------
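The example files under `/Data` (for instance `Data/mrc_examples.jsonl`) appear in this listing as a single JSON array of records with `instruction`, `input`, and `output` fields, closed by a trailing comma before the final `]`, which strict JSON parsers reject. The following is a minimal sketch of reading such a file, assuming that layout holds for the actual files. The helper names `load_examples`, `build_prompt`, and `answer_label` are hypothetical and not part of the repository. Joining instruction and input with a newline, and reading the leading letter of `output` as the answer label, are likewise assumptions based on the samples shown.

```python
import json
import re
from typing import Dict, List, Optional


def load_examples(text: str) -> List[Dict[str, str]]:
    """Parse a JSON array of records, tolerating a trailing comma before the final ']'."""
    # Assumption: the only deviation from strict JSON is ",<whitespace>]" at the end of the file.
    cleaned = re.sub(r",\s*\]\s*$", "]", text.strip())
    return json.loads(cleaned)


def build_prompt(record: Dict[str, str]) -> str:
    """Join instruction and input into one prompt string (hypothetical format)."""
    return record["instruction"] + "\n" + record["input"]


def answer_label(record: Dict[str, str]) -> Optional[str]:
    """Return the leading option letter of the output (e.g. 'C' from 'C. C. ...'), or None."""
    match = re.match(r"\s*([ABCD])\.", record["output"])
    return match.group(1) if match else None


if __name__ == "__main__":
    # Tiny inline sample mimicking the layout above, including the trailing comma.
    sample = '[\n {"instruction": "Choose one option.", "input": "\\nQuestion: ...", "output": "C. C. because ..."},\n]'
    for rec in load_examples(sample):
        print(answer_label(rec), "|", build_prompt(rec).replace("\n", " "))
```

Because this version of the data does not filter out responses with wrong answers, a label extracted this way could be compared against a gold label from the source benchmark before training, if one is available.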