├── README.md
├── data
│   ├── novel
│   │   └── xiaowangzi_main_text.txt
│   ├── random_mask
│   │   └── test.npy
│   └── segmented_novel
│       ├── segmented_Chinense_novel.xlsx
│       ├── segmented_Chinense_novel_preface.xlsx
│       ├── segmented_Chinense_novel_preface_display.xlsx
│       ├── segmented_Chinense_novel_run_1.xlsx
│       ├── segmented_Chinense_novel_run_1_display.xlsx
│       ├── segmented_Chinense_novel_run_2.xlsx
│       ├── segmented_Chinense_novel_run_2_display.xlsx
│       ├── segmented_Chinense_novel_run_3.xlsx
│       ├── segmented_Chinense_novel_run_3_display.xlsx
│       ├── segmented_Chinense_novel_run_4.xlsx
│       ├── segmented_Chinense_novel_run_4_display.xlsx
│       ├── segmented_Chinense_novel_run_5.xlsx
│       ├── segmented_Chinense_novel_run_5_display.xlsx
│       ├── segmented_Chinense_novel_run_6.xlsx
│       ├── segmented_Chinense_novel_run_6_display.xlsx
│       ├── segmented_Chinense_novel_run_7.xlsx
│       └── segmented_Chinense_novel_run_7_display.xlsx
├── data_preprocessing_and_alignment
│   ├── README.md
│   ├── __pycache__
│   │   └── utils.cpython-310.pyc
│   ├── align_eeg_with_sentence.py
│   ├── convert_eeg_to_bids.py
│   ├── forward.py
│   ├── inverse.py
│   ├── preprocessing.py
│   └── utils.py
├── docker
│   ├── Dockerfile
│   └── README.md
├── experiment
│   ├── Error.py
│   ├── README.md
│   └── play_novel.py
├── image
│   ├── cardiac_artifact.jpg
│   ├── channel_artifact.jpg
│   ├── data_processing.png
│   ├── docker.png
│   ├── docker_run.png
│   ├── egi_montage.png
│   ├── exit.png
│   ├── exp_layout.png
│   ├── eye_artifact.jpg
│   ├── ica_001.png
│   ├── ica_006.png
│   ├── ica_007.png
│   ├── ica_010.png
│   ├── ica_015.png
│   ├── ica_topo.png
│   ├── image.png
│   ├── mount_1.png
│   ├── muscle_artifact.jpg
│   ├── pipeline.png
│   ├── pipeline_english.png
│   ├── processing_pipeline.png
│   ├── ps.png
│   ├── pycharm_1.png
│   ├── pycharm_2.png
│   ├── pycharm_3.png
│   ├── pycharm_4.png
│   ├── pycharm_5.png
│   ├── pycharm_6.png
│   ├── pycharm_7.png
│   ├── random_mask1.png
│   ├── random_mask2.png
│   ├── random_mask3.png
│   ├── restart.png
│   ├── screen.png
│   ├── structure_new.png
│   ├── su.png
│   ├── vcxsrv_1.png
│   └── vcxsrv_2.png
├── novel_segmentation_and_text_embeddings
│   ├── README.md
│   ├── cut_chinese_novel.py
│   └── embedding.py
├── random_mask
│   ├── random_mask.py
│   └── readme.md
├── requirements.txt
└── scripts_test
    ├── egi_test.py
    └── eyetracker_test.py

/README.md:
--------------------------------------------------------------------------------
# Chinese Linguistic Corpus EEG Dataset Development and Advanced Semantic Decoding

## Introduction

An electroencephalography (EEG) dataset utilizing rich text stimuli can advance the understanding of how the brain encodes semantic information and contribute to semantic decoding in brain-computer interfaces (BCIs). Addressing the scarcity of EEG datasets featuring Chinese linguistic stimuli, we present the ChineseEEG dataset, a high-density EEG dataset complemented by simultaneous eye-tracking recordings. The dataset was compiled while 10 participants silently read approximately 11 hours of Chinese text from two well-known novels. It provides long-duration EEG recordings, along with pre-processed EEG sensor-level data and semantic embeddings of the reading materials extracted by a pre-trained natural language processing (NLP) model.
**For more detailed information about our dataset, you can read our preprint paper on bioRxiv: [ChineseEEG: A Chinese Linguistic Corpora EEG Dataset for Semantic Alignment and Neural Decoding](https://www.biorxiv.org/content/10.1101/2024.02.08.579481v1).**

**You can find the dataset via the ChineseNeuro Symphony community (CHNNeuro) on the Science Data Bank platform ([https://doi.org/10.57760/sciencedb.CHNNeuro.00007](https://doi.org/10.57760/sciencedb.CHNNeuro.00007)) or via OpenNeuro ([https://openneuro.org/datasets/ds004952](https://openneuro.org/datasets/ds004952)).**

This repository contains all the code needed to reproduce the experiment and the data processing procedure in our paper. It aims to provide a comprehensive paradigm for establishing an EEG dataset based on Chinese linguistic corpora, and to facilitate the advancement of technologies related to EEG-based semantic decoding and brain-computer interfaces.

The project is mainly divided into four modules. The script `cut_chinese_novel.py` in the `novel_segmentation_and_text_embeddings` folder contains the code to prepare the stimulus materials from the source text. The script `play_novel.py` in the `experiment` module contains the code for the experiment, including text stimulus presentation and control of the EGI device and the Tobii Pro Glasses 3 eye tracker. The script `preprocessing.py` in the `data_preprocessing_and_alignment` module contains the main part of the code to pre-process the EEG data. The script `align_eeg_with_sentence.py` in the same module contains the code to align the EEG segments with the corresponding text contents and text embeddings. The `docker` module contains the Docker image required for deploying and running the code, as well as a tutorial on how to use Docker for environment deployment. For detailed information on each module, please refer to the README document in the respective module.

## Pipeline

Our EEG recording and pre-processing pipeline is as follows:

![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/pipeline_english.png)

## Device

### EEG Recording: EGI Geodesic EEG 400 series

During the experiment, EEG data were collected by a `128-channel` EEG system with Geodesic Sensor Net (EGI Inc., Eugene, OR, USA, [Geodesic EEG System 400 series (egi.com)](https://www.egi.com/clinical-division/clinical-division-clinical-products/ges-400-series)). The montage system of this device is `GSN-HydroCel-128`. We recorded the data at a sampling rate of 1000 Hz.

![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/egi_montage.png)

The 128-channel EEG system with Geodesic Sensor Net (GSN) by EGI is a sophisticated brain-activity recording tool designed for high-resolution neuroscientific research. The system features an array of evenly spaced sensors providing complete scalp coverage, ensuring detailed spatial data collection without the need for interpolation. Coupled with the advanced Net Amps 400 amplifiers and the intuitive Net Station 5 software, it delivers low-noise, high-sensitivity EEG data acquisition and powerful data analysis capabilities, making it an ideal choice for dynamic and expanding research environments.
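If you work with the recordings in MNE-Python, the `GSN-HydroCel-128` montage mentioned above ships with MNE as a standard montage. The snippet below is a minimal sketch of attaching it to a raw recording; the file name is a placeholder, not an actual file from this dataset:

```python
import mne

# Placeholder file name; EGI recordings are typically .mff or .raw
raw = mne.io.read_raw_egi("recording.mff", preload=True)

# Attach the standard GSN-HydroCel-128 montage of the Geodesic Sensor Net
montage = mne.channels.make_standard_montage("GSN-HydroCel-128")
raw.set_montage(montage, match_case=False)

print(raw.info["sfreq"])  # the data were recorded at 1000 Hz
```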
### Eye-tracking: Tobii Pro Glasses 3

We utilized the Tobii Pro Glasses 3 ([Tobii Pro Glasses 3 | Latest in wearable eye tracking - Tobii](https://www.tobii.com/products/eye-trackers/wearables/tobii-pro-glasses-3)) to record the participants' eye-movement trajectories and to verify that they followed the instructions of the experiment, i.e., that their gaze moved along with the red highlighted text.

The Tobii Pro Glasses 3 is an advanced wearable eye tracker. It captures natural viewing behavior in real-world environments, providing powerful insights from a first-person perspective. The device features 16 illuminators and four eye cameras integrated into scratch-resistant lenses, a wide-angle scene camera, and a built-in microphone, allowing comprehensive capture of participant behavior and environmental context. Its eye tracking is reliable across different populations, unaffected by eye color or shape. The Tobii Pro Glasses 3 operates at a sampling rate of 50 Hz or 100 Hz and supports a one-point calibration procedure.

## Experiment

In the preparation phase of the experiment, we fitted participants with the EEG cap and eye tracker, seating them at a distance of 67 cm from the screen. We emphasized to the participants that they should keep their heads still during the experiment and that their gaze should follow the red highlighted text, as shown in the figure.

![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/screen.png)

After ensuring that the participants fully understood the instructions, we commenced the experimental procedure: first an eye-tracker calibration phase, followed by a practice reading phase, and finally the formal reading phases. Each formal reading phase lasted approximately 30 minutes. The experimental setup is shown below:

![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/exp_layout.png)

## Usage

Typically, you can follow the steps below to execute the code for preparing the experimental materials, conducting the experiment, and carrying out the subsequent data analysis.

### Environment Settings

First, please ensure that your code environment is properly set up. You can either create a Docker container for this purpose or install the necessary packages directly on your own machine.

If you choose to use Docker, you can refer to the detailed tutorial provided [here](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/docker/README.md). If you plan to install the packages in your local environment, the required packages and their corresponding version information can be found in the [requirements.txt](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/requirements.txt) file located in the project's root directory.

### Experiment Materials Preparation

This step primarily involves preparing the textual reading materials needed for the experiment. You first need to convert your materials into the specific format below:

```
Chinese_novel.txt
Ch0
This is the preface of the novel
Ch1
Chapter 1 of the novel
Ch2
Chapter 2 of the novel
...
...
...
```
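If you want a quick sanity check that your material follows this chapter-marker convention, the following is a minimal, hypothetical sketch (it is not part of the repository) that splits such a file into chapters:

```python
import re

def split_into_chapters(path: str) -> dict[str, str]:
    """Split a novel file into {chapter_id: text} using marker lines such as 'Ch0', 'Ch1', ..."""
    chapters: dict[str, str] = {}
    current = None
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if re.fullmatch(r"Ch\d+", line):  # a marker line opens a new chapter
                current = line
                chapters[current] = ""
            elif current is not None and line:
                chapters[current] += line
    return chapters

chapters = split_into_chapters("Chinese_novel.txt")
print(f"{len(chapters)} chapters found; preface length: {len(chapters.get('Ch0', ''))}")
```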
Then run the `cut_chinese_novel.py` script located in the `novel_segmentation_and_text_embeddings` folder to perform sentence segmentation of the novel text:

```
python cut_chinese_novel.py --divide_nums= --Chinese_novel_path=
```

We have uploaded the text materials used in our experiment to the `text materials` release, including the Chinese versions of two well-known novels, **The Little Prince** and **Garnett Dream**.

For detailed information on the format requirements and script execution commands, please visit the [novel_segmentation_and_text_embeddings](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/tree/main/novel_segmentation_and_text_embeddings) module.

### Experiment

Once we have obtained the text materials cut into the specific format, we can run the experimental program using `play_novel.py` in the `experiment` module. This program presents the text materials according to a specific experimental paradigm and records the participants' EEG and eye-movement data. Before running the program, please ensure that the path to the text materials is correctly set and that the EEG and eye-tracking devices are properly connected. Use the following command to run the program:

```
python play_novel.py --add_mark --add_eyetracker --preface_path= --host_IP= --egi_IP= --eyetracker_hostname= --novel_path= --isFirstSession
```

For detailed information on the specific experimental paradigm, related parameter settings, and more, please refer to the [experiment](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/tree/main/experiment) module.

### Data Pre-processing

After completing the experimental data collection for all participants, we can use `preprocessing.py` in the `data_preprocessing_and_alignment` module for data preprocessing. Our preprocessing workflow includes a series of steps such as data segmentation, downsampling, filtering, bad-channel interpolation, independent component analysis (ICA), and re-referencing. During the bad-channel interpolation and ICA phases, we have implemented automated algorithms, but we also provide options for manual intervention to ensure accuracy. All parameters for these methods can be modified by adjusting the settings in the code.

For detailed information on the preprocessing workflow, explanations of the code, and parameter settings, please refer to the [data_preprocessing_and_alignment](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/tree/main/data_preprocessing_and_alignment) module.

### Text Embeddings

We provide the embeddings of the reading materials. The text stimuli in each run have a corresponding embedding file saved in `.npy` format. These text embeddings provide a foundation for a series of subsequent studies, including alignment analysis of EEG and textual data in the representation space, as well as tasks like EEG-based language decoding. For detailed information, please refer to the [novel_segmentation_and_text_embeddings](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/tree/main/novel_segmentation_and_text_embeddings) module.

### Data Alignment

Once you have the texts, text embeddings, and runs of EEG data, you can align them for subsequent analysis. We provide code to align the EEG data with its corresponding texts and embeddings, as sketched below.
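Conceptually, the alignment pairs each sentence's embedding with the EEG samples between that sentence's onset and offset. The sketch below illustrates the idea only; the file names, event times, and array shapes are hypothetical, and the repository's `align_eeg_with_sentence.py` is the authoritative implementation:

```python
import numpy as np
import mne

# Hypothetical file names; the real paths follow the dataset's BIDS layout
embeddings = np.load("run_1_embeddings.npy")   # shape: (n_sentences, embedding_dim)
raw = mne.io.read_raw_brainvision("run_1_eeg.vhdr", preload=True)

# Hypothetical per-sentence (onset, offset) times in seconds, e.g. read from event marks
sentence_spans = [(12.0, 17.5), (17.5, 22.1)]

sfreq = raw.info["sfreq"]
aligned = []
for (start, stop), emb in zip(sentence_spans, embeddings):
    eeg = raw.get_data(start=int(start * sfreq), stop=int(stop * sfreq))
    aligned.append((eeg, emb))                 # (n_channels, n_samples) paired with its embedding
```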
For detailed information, please refer to the [data_preprocessing_and_alignment](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/tree/main/data_preprocessing_and_alignment) module.

## Credit

- [Mou Xinyu](https://github.com/12485953) - Coder for all parts of the project, data processing.

- [He Cuilin](https://github.com/CuilinHe) - Experiment conductor, data processing.

- [Tan Liwei](https://github.com/tanliwei09) - Experiment conductor, data processing.

- [Zhang Jianyu](https://github.com/ionaaaa) - Coder for Chinese corpus segmentation and EEG random masking.

- [Tian Yan](https://github.com/Bryantianyan) - Experiment conductor.

- [Chen Yizhe] - Experimental instrument debugging.

Feel free to contact us if you have any questions about the project!

## Collaborators

- [Wu Haiyan](https://github.com/haiyan0305) - University of Macau

- [Liu Quanying] - Southern University of Science and Technology

- [Wang Xindi](https://github.com/sandywang)

- [Wang Qing] - Shanghai Jiao Tong University

- [Chen Zijiao] - National University of Singapore

- [Yang Yu-Fang] - Freie Universität Berlin

- [Hu Chuanpeng] - Nanjing Normal University

- [Xu Ting] - Center for the Integrative Developmental Neuroscience, Child Mind Institute, New York

- [Cao Miao] - Swinburne University of Technology

- [Liang Huadong](https://github.com/Romantic-Pumpkin) - iFLYTEK Co., LTD

## Funding

This work was mainly supported by the MindD project of the Tianqiao and Chrissy Chen Institute (TCCI), the Science and Technology Development Fund (FDCT) of Macau [0127/2020/A3, 0041/2022/A], the Natural Science Foundation of Guangdong Province (2021A1515012509), the Shenzhen-Hong Kong-Macao Science and Technology Innovation Project (Category C) (SGDX2020110309280100), and the SRG of the University of Macau (SRG2020-00027-ICI). We also thank all the research assistants who provided general support in participant recruitment and data collection.
147 | -------------------------------------------------------------------------------- /data/novel/xiaowangzi_main_text.txt: -------------------------------------------------------------------------------- 1 | Ch0 2 | 3 | 4 | 5 | 献给莱翁·维尔特, 6 | 7 | 8 | 请孩子们原谅我把这本书献给了一个大人。我有一个很认真的理由:这个大人是我在世界上最好的朋友。我还有另外一个理由:这个大人什么都能懂,即使是给孩子看的书他也懂。我的第三个理由是:这个大人生活在法国,正在挨饿受冻。他很需要得到安慰。倘若所有这些理由加在一起还不够,那我愿意把这本书献给还是孩子时的这个大人。所有的大人起先都是孩子(可是他们中间不大有人记得这一点)。因此我把题献改为: 9 | 10 | 献给还是小男孩的莱翁·维尔特 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Ch1 19 | 20 | 21 | 我六岁那年,在一本描写原始森林的名叫《真实的故事》的书上,看见过一幅精彩的插图,画的是一条蟒蛇在吞吃一头猛兽。我现在把它照样画在上面。 22 | 23 | 书中写道:“蟒蛇把猎物囫囵吞下,嚼都不嚼。然后它就无法动弹,躺上六个月来消化它们。” 24 | 25 | 当时,我对丛林里的奇妙景象想得很多,于是我也用彩色铅笔画了我的第一幅画:我的作品一号。它就像这样: 26 | 27 | 28 | 我把这幅杰作给大人看,问他们我的图画吓不吓人。 29 | 30 | 他们回答说:“一顶帽子怎么会吓人呢?” 31 | 32 | 我画的不是一顶帽子。我画的是一条蟒蛇在消化大象。于是我把蟒蛇肚子的内部画出来,好让这些大人看得明白。他们老是要人给他们解释。我的作品二号是这样的: 33 | 34 | 35 | 36 | 那些大人劝我别再画蟒蛇,甭管它是剖开的,还是没剖开的,全都丢开。他们说,我还是把心思放在地理、历史、算术和语法上好。就这样,我才六岁,就放弃了辉煌的画家生涯。作品一号和作品二号都没成功,我泄了气。那些大人自个儿什么也弄不懂,老要孩子们一遍一遍给他们解释,真烦人。 37 | 38 | 我只好另外选择一个职业,学会了开飞机。世界各地我差不多都飞过。的确,地理学对我非常有用。我一眼就能认出哪是中国,哪是亚利桑那。要是夜里迷了路,这很有用。 39 | 40 | 就这样,我这一生中,跟好多严肃的人打过好多交道。我在那些大人中间生活过很长时间。我仔细地观察过他们。观察下来印象并没好多少。 41 | 42 | 要是碰上一个人,看上去头脑稍许清楚些,我就拿出一直保存着的作品一号,让他试试看。我想知道,他是不是真的能看懂。可是人家总是回答我:“这是一顶帽子。”这时候,我就不跟他说什么蟒蛇啊,原始森林啊,星星啊,都不说了。我就说些他能懂的事情。我跟他说桥,高尔夫,政治,还有领带。于是大人觉得很高兴,认识了这么个通情达理的人。 43 | 44 | 45 | Ch2 46 | 47 | 48 | 我孤独地生活着,没有一个真正谈得来的人,直到六年前,有一次飞机出了故障,降落在撒哈拉大沙漠。发动机里有样什么东西碎掉了。因为我身边既没有机械师,也没有乘客,我就打算单枪匹马来完成一项困难的修复工作。这在我是个生死攸关的问题。我带的水只够喝一星期了。 49 | 50 | 第一天晚上,我睡在这片远离人烟的大沙漠上,比靠一块船板在大海中漂流的遇难者还孤独。所以,当天蒙蒙亮,有个奇怪的声音轻轻把我喊醒的时候,你们可以想象我有多么惊讶。这个声音说: 51 | 52 | “对不起……请给我画只绵羊!” 53 | 54 | “嗯!” 55 | 56 | “请给我画只绵羊……” 57 | 58 | 我像遭了雷击似的,猛地一下子跳了起来。我使劲地揉了揉眼睛,仔细地看了看。只见一个从没见过的小人儿,正一本正经地看着我呢。后来我给他画了一幅非常出色的肖像,就是旁边的这幅。不过我的画,当然远远不及本人可爱。这不是我的错。我的画家生涯在六岁那年就让大人给断送了,除了画剖开和不剖开的蟒蛇,后来再没画过什么。 59 | 60 | 我吃惊地瞪大眼睛瞧着他。你们别忘记,这儿离有人住的地方好远好远呢。可是这个小人儿,看上去并不像迷了路,也不像累得要命、饿得要命、渴得要命或怕得要命。他一点不像在远离人类居住的沙漠里迷路的孩子。等我总算说得出话时,我对他说: 61 | 62 | “可是……你在这儿干吗?” 63 | 64 | 他轻声轻气地又说了一遍,好像那是件很要紧的事情: 65 | 66 | 67 | 68 | 后来我给他画了这幅非常出色的肖像 69 | 70 | “对不起……请给我画一只绵羊……” 71 | 72 | 受到神秘事物强烈冲击时,一个人是不敢不听从的。尽管在我看来,离一切有人居住的地方远而又远,又处于死亡的威胁之下,在这儿想到画画真是匪夷所思,可我还是从口袋里掏出一张纸、一支钢笔。但我想起我只学了地理、历史、算术和语法,所以我就(有点没好气地)对那小人儿说,我不会画画。他回答说: 73 | 74 | “没关系。请给我画一只绵羊。” 75 | 76 | 我因为从没画过绵羊,就在我只会画的两张图画里挑一张给他画了:没剖开的蟒蛇图。可我听到小人儿下面说的话,简直惊呆了: 77 | 78 | “不对!不对!我不要在蟒蛇肚子里的大象。蟒蛇很危险,大象呢,太占地方。在我那儿,什么都是小小的。我要的是一只绵羊。请给我画一只绵羊。” 79 | 80 | 81 | 82 | 我只得画了起来。他专心地看了一会儿,然后说: 83 | 84 | “不对!这只羊已经病得不轻了。另外画一只吧。” 85 | 86 | 我画了右面的这只。 87 | 88 | 89 | 90 | 我的朋友温和地笑了,口气宽容地说: 91 | 92 | “你看看……这只不是绵羊,是山羊。头上长着角……” 93 | 94 | 于是我又画了一张。 95 | 96 | 但这一张也跟前几张一样,没能通过: 97 | 98 | 99 | 100 | “这只太老了。我要一只可以活得很久的绵羊。” 101 | 102 | 我已经没有耐心了,因为我急于要去把发动机拆下来,所以我就胡乱画了一张。 103 | 104 | 我随口说道: 105 | 106 | “这个呢,是个箱子。你要的绵羊就在里面。” 107 | 108 | 但是令我吃惊的是,这个小评判的脸上顿时变得容光焕发了: 109 | 110 | 111 | 112 | “我要的就是这个!你说,这只绵羊会要很多草吗?” 113 | 114 | “问这干嘛?” 115 | 116 | “因为我那儿样样都很小……” 117 | 118 | “肯定够了。我给你的是只很小的绵羊。” 119 | 120 | 他低下头去看那幅画: 121 | 122 | “不算太小……瞧!它睡着了……” 123 | 124 | 就这样,我认识了小王子。 125 | 126 | 127 | 128 | 129 | 130 | 131 | 132 | Ch3 133 | 134 | 135 | 很久以后,我才弄明白他是从哪儿来的。 136 | 137 | 这个小王子,对我提了好多问题,而对我的问题总像没听见似的。我是从他偶尔漏出来的那些话里,一点一点知道这一切的。比如,他第一次瞧见我的飞机时(我没画我的飞机,对我来说,这样的画实在太复杂了),就问我: 138 | 139 | “这是什么东西?” 140 | 141 | “这不是什么东西,它会飞。这是一架飞机,是我的飞机。” 142 | 143 | 我自豪地讲给他听,我在天上飞。他听了就大声说: 144 | 145 | “怎么!你是天上掉下来的?” 146 | 147 | 148 | 149 | “是的,”我谦虚地说。 150 | 151 | “喔!真有趣……” 152 | 153 | 小王子发出一阵清脆的笑声,这下可把我惹恼了。我不喜欢别人拿我的不幸逗趣儿。接着他又说: 154 | 155 | “这么说,你也是从天上来的!你从哪个星球来?” 156 | 157 | 我脑子里闪过一个念头,他的降临之谜好像有了线索,我突如其来地发问: 158 | 159 | “那你是从别的星球来的啰?” 160 | 
161 | 可是他没有回答。他看着我的飞机,轻轻地点了点头: 162 | 163 | “是啊,就靠它,你来的地方不会太远……” 164 | 165 | 说着,他出神地遐想了很久。而后,从袋里拿出我画的绵羊,全神贯注地凝望着这宝贝。 166 | 167 | 你想想看,这个跟“别的星球”有关,说了一半打住的话头,会让我多么惊讶啊。我竭力想多知道一些: 168 | 169 | “你从哪儿来,我的小家伙?‘我那儿’是哪儿?你要把我画的绵羊带到哪儿去?” 170 | 171 | 他若有所思地沉默了一会儿,然后开口对我说: 172 | 173 | “你给了我这个箱子,这就好了,晚上可以给它当屋子。” 174 | 175 | “当然。要是你乖,我还会给你一根绳子,白天可以把它拴住。木桩也有。” 176 | 177 | 这个提议好像使小王子很不以为然: 178 | 179 | “拴住?真是怪念头!” 180 | 181 | “可要是你不把它拴住,它就会到处跑,还会跑丢了……” 182 | 183 | 184 | 185 | 小王子在B612小行星上 186 | 187 | 我的朋友又咯咯地笑了起来: 188 | 189 | “你叫它往哪儿跑呀?” 190 | 191 | “到处跑。笔直往前……” 192 | 193 | 这时,小王子一本正经地说: 194 | 195 | “那也没关系,我那儿就一丁点儿大!” 196 | 197 | 然后,他又说了一句,语气中仿佛有点儿忧郁: 198 | 199 | “就是笔直往前跑,也跑不了多远……” 200 | 201 | 202 | 203 | 204 | 205 | 206 | 207 | Ch4 208 | 209 | 210 | 我由此知道了另一件很重要的事情:他居住的星球比一座房子大不了多少! 211 | 212 | 这并没让我感到很吃惊。我知道,除了像地球、木星、火星、金星这些取了名字的大星球,还有成千上万的星球,它们有时候非常非常小,用望远镜都不大看得见。天文学家找到其中的一个星球,给它编一个号码就算名字了。比如说,他把它叫作“3251号小行星”。 213 | 214 | 215 | 216 | 217 | 218 | 我有很可靠的理由,足以相信小王子原先住的那个星球,就是B612号小行星。这颗小行星只在一九〇九年被人用望远镜望见过一次,那人是一个土耳其天文学家。 219 | 220 | 221 | 222 | 当时,他在一次国际天文学大会上作了长篇论证。可是就为了他的服装的缘故,谁也不信他的话。大人哪,就是这样。 223 | 224 | 幸好,有一个土耳其独裁者下令,全国百姓都要穿欧洲的服装,违令者处死,这一下B612号小行星的名声总算保全了。那个天文学家在一九二〇年重新作报告,穿着一套非常体面的西装。这一回所有的人都同意了他的观点。 225 | 226 | 我之所以要跟你们一五一十地介绍B612号小行星,还把它的编号也讲得明明白白,完全是为了大人。那些大人就喜欢数字。你跟他们讲起一个新朋友,他们总爱问些无关紧要的问题。他们不会问你:“他说话的声音是怎样的?他喜欢玩哪些游戏?他是不是收集蝴蝶标本?”他们问的是:“他几岁?有几个兄弟?他有多重?他父亲挣多少钱?”这样问过以后,他们就以为了解他了。你要是对大人说:“我看见一幢漂亮的房子,红砖墙,窗前种着天竺葵,屋顶上停着鸽子……”他们想象不出这幢房子是怎样的。你得这么跟他们说:“我看见一幢十万法郎的房子。”他们马上会大声嚷嚷:“多漂亮的房子!” 227 | 228 | 所以,如果你对他们说:“小王子是存在的,证据就是他那么可爱,他咯咯地笑,他还想要一只绵羊。一个人想要有只绵羊,这就是他存在的证据嘛”,他们会耸耸肩膀,只当你还是个孩子!可要是你对他们说:“他来自B612号小行星”,他们就会深信不疑,不再问这问那地烦你了。他们就是这样。不必怪他们。孩子应该对大人多多原谅才是。 229 | 230 | 不过,当然,我们懂得生活,我们才不把数字放在眼里呢!我真愿意像讲童话那样来开始讲这个故事。我真想这样说: 231 | 232 | “从前呀,有一个小王子,住在一个跟他身体差不多大的星球上,他想有个朋友……”对那些懂得生活的人来说,这样听上去会真实得多。 233 | 234 | 我不想人家轻率地来读我这本书。我讲述这段往事时,心情是很难过的。我的朋友带着他的绵羊已经离去六年了。我之所以在这儿细细地描述他,就是为了不要忘记他。忘记朋友是件令人伤心的事情。并不是人人都有过一个朋友的。再说,我早晚也会变得像那些只关心数字的大人一样的。也正是为了这个缘故,我买了一盒颜料和一些铅笔。到了我这年纪再重握画笔,是挺费劲的,况且当初我只画过剖开和没剖开的蟒蛇,还是六岁那年!当然,我一定要尽力把它们画得像一些。但做不做得到,我可说不准。有时这一张还行,那一张就不大像了。比如说,身材我就有点记不准确了。这一张里小王子画得太高了。那一张呢太矮了。衣服的颜色也挺让我犯难。我只好信手拿起色笔这儿试一下,那儿试一下。到头来,有些最要紧的细部,说不定都弄错了。不过这一切,大家都得原谅我才是。我的朋友从来不跟我解释什么。他大概以为我是跟他一样的。可是,很遗憾,我已经瞧不见箱子里面的绵羊了。我也许已经有点像那些大人了。我一定是老了。 235 | 236 | 237 | 238 | 239 | 240 | 241 | 242 | Ch5 243 | 244 | 245 | 每天我都会知道一些情况,或者是关于他的星球,或者是关于他怎么离开那儿、怎么来到这儿。这些情况,都是一点一点,碰巧知道的。比如说,在第三天,我知道了猴面包树的悲剧。 246 | 247 | 这一回,起因又是那只绵羊,因为小王子突然向我发问,好像忧心忡忡似的: 248 | 249 | “绵羊当真吃灌木吗?” 250 | 251 | “对。当真。” 252 | 253 | “啊!我真高兴。” 254 | 255 | 我不明白,绵羊吃灌木,为什么会这么重要。小王子接着又说: 256 | 257 | “这么说,它们也吃猴面包树喽?” 258 | 259 | 我告诉小王子,猴面包树不是灌木,而是像教堂那么高的大树,他就是领一群大象来,也吃不完一棵猴面包树呢。 260 | 261 | 领一群大象来的想法,惹得小王子笑了起来: 262 | 263 | “那得让它们叠罗汉了……” 264 | 265 | 不过他很聪明,接着又说: 266 | 267 | “猴面包树在长高以前,起初也是小小的。” 268 | 269 | “一点不错。可你为什么想让绵羊去吃小猴面包树呢?” 270 | 271 | 272 | 273 | 274 | 275 | 他回答说:“咦!这还不明白吗!”就像这是件不言而喻的事情。可是我自己要弄懂这个问题,还着实得动一番脑筋哩。 276 | 277 | 原来,在小王子的星球上,就像在别的星球上一样,有好的植物,也有不好的植物。结果呢,好植物有好种子,坏植物有坏种子。而种子是看不见的。它们悄悄地睡在地底下,直到有一天,其中有一颗忽然想起要醒了……于是它舒展身子,最先羞答答地朝太阳伸出一枝天真可爱的嫩苗。假如那是萝卜或玫瑰的幼苗,可以让它爱怎么长就怎么长。不过,假如那是一株不好的植物,一认出就得拔掉它。在小王子的星球上有一种可怕的种子……就是猴面包树的种子。星球的土壤里有好多猴面包树种子。而猴面包树长得很快,动手稍稍一慢,就甭想再除掉它了。它会占满整个星球,根枝钻来钻去,四处蔓延。要是这颗星球太小,而猴面包树又太多,它们就会把星球撑裂。 278 | 279 | 280 | 281 | 猴面包树 282 | 283 | “这就得有个严格的约束了,”小王子后来告诉我说。“你早晨梳洗好以后,就该仔仔细细地给星球梳洗了。猴面包树小的时候,跟玫瑰幼苗是很像的,那你就得给自己立个规矩,只要分清了哪是玫瑰,哪是猴面包树,就马上把猴面包树拔掉。这个工作很单调,但并不难。” 284 | 285 | 有一天,他劝我好好画一幅画,好让我那儿的孩子们都知道这回事。“要是他们有一天出门旅行,”他对我说,“说不定会用得着。有时候,你把一件该做的事耽搁一下,也没什么关系。可是,碰到猴面包树,这就要捅大娄子了。我知道有一个星球,上面住着一个懒人。有三株幼苗他没在意……” 286 | 287 | 
在小王子的指点下,我画好了那颗星球。我一向不愿意摆出说教的架势。可是对猴面包树的危害,一般人都不了解,要是有人碰巧迷了路停在一颗小行星上,情况就会变得极其严峻。所以这一次,我破例抛开了矜持。我说:“孩子们!当心猴面包树啊!”这幅画我画得格外卖力,就是为了提醒朋友们有这么一种危险存在,他们也像我一样,对在身边潜伏了很久的危险一直毫无觉察。要让大家明白这道理,我多费点劲也是值得的。你们也许会想:“在这本书里,别的画为什么都没有这幅来得奔放有力呢?”回答很简单:我同样努力了,但没能成功。画猴面包树时,我内心非常焦急,情绪就受到了感染。 288 | 289 | 290 | 291 | 292 | 293 | 294 | 295 | Ch6 296 | 297 | 298 | 哦,小王子!就这样,我一点一点知道了你那段忧郁的生活。过去很长的时间里,你唯一的乐趣就是观赏夕阳沉落的温柔晚景。这个新的细节,我是在第四天早晨知道的。当时你对我说: 299 | 300 | “我喜欢看日落。我们去看一回日落吧……” 301 | 302 | “可是得等……” 303 | 304 | “等什么?” 305 | 306 | “等太阳下山呀。” 307 | 308 | 开始,你显得很惊奇,随后你自己笑了起来。你对我说: 309 | 310 | 311 | 312 | “我还以为在家乡呢!” 313 | 314 | 可不。大家都知道,美国的中午,在法国正是黄昏。要是能在一分钟内赶到法国,就可以看到日落。可惜法国实在太远了。而在你那小小的星球上,你只要把椅子挪动几步就行了。那样,你就随时可以看到你想看的夕阳余晖…… 315 | 316 | “有一天,我看了四十三次日落!” 317 | 318 | 过了一会儿,你又说: 319 | 320 | “你知道……一个人感到非常忧伤的时候,他就喜欢看日落……” 321 | 322 | “这么说,看四十三次的那天,你感到非常忧伤啰?” 323 | 324 | 但是小王子没有回答。 325 | 326 | 327 | 328 | 329 | 330 | 331 | 332 | Ch7 333 | 334 | 335 | 第五天,还是羊的事情,把小王子生活的秘密向我揭开了。他好像有个问题默默地思索了很久,终于得出了结论,突然没头没脑地问我: 336 | 337 | “绵羊既然吃灌木,那它也吃花儿啰?” 338 | 339 | “它碰到什么吃什么。” 340 | 341 | “连有刺的花儿也吃?” 342 | 343 | “对。有刺的也吃。” 344 | 345 | “那么,刺有什么用呢?” 346 | 347 | 我不知道该怎么回答。当时我正忙着要从发动机上卸下一颗拧得太紧的螺钉。我发现故障似乎很严重,饮用水也快完了。我担心会发生最坏的情况,心里很着急。 348 | 349 | “那么,刺有什么用呢?” 350 | 351 | 小王子只要提了一个问题,就不依不饶地要得到答案。而那个螺钉正弄得我很恼火,我就随口回答了一句: 352 | 353 | “刺呀,什么用都没有,纯粹是花儿想使坏呗。” 354 | 355 | “喔!” 356 | 357 | 但他沉默了一会儿以后,忿忿然地冲着我说: 358 | 359 | “我不信你的话!花儿是纤弱的,天真的。它们想尽量保护自己。它们以为有了刺就会显得很厉害……” 360 | 361 | 我没作声。我当时想:“要是这颗螺钉再不松开,我就一锤子敲掉它。”小王子又打断了我的思路: 362 | 363 | “可你,你却认为花儿……” 364 | 365 | “行了!行了!我什么也不认为!我只是随口说说。我正忙着干正事呢!” 366 | 367 | 他惊愕地望着我。 368 | 369 | “正事!” 370 | 371 | 他看我握着锤子,手指沾满油污,俯身对着一个他觉得非常丑陋的物件。 372 | 373 | “你说话就像那些大人!” 374 | 375 | 这话使我有些难堪。而他毫不留情地接着说: 376 | 377 | “你什么都分不清……你把什么都搅在一起!” 378 | 379 | 他真的气极了,一头金发在风中摇曳: 380 | 381 | “我到过一个星球,上面住着一个红脸先生。他从没闻过花香。他从没望过星星。他从没爱过一个人。除了算账,他什么事也没做过。他成天像你一样说个没完:‘我有正事要干!我有正事要干!’变得骄气十足。可是这算不得一个人,他是个蘑菇。” 382 | 383 | “是个什么?” 384 | 385 | “是个蘑菇!” 386 | 387 | 小王子这会儿气得脸色发白了。 388 | 389 | “几百万年以前,花儿就长刺了。可几百万年以前,羊也早就在吃花儿了。刺什么用也没有,那花儿为什么要费那份劲去长刺呢,把这弄明白难道不是正事吗?绵羊和花儿的战争难道不重要吗?这难道不比那个胖子红脸先生的算账更重要,更是正事吗?还有,如果我认识一朵世上独一无二的花儿,除了我的星球,哪儿都找不到这样的花儿,而有天早上,一只小羊甚至都不明白自己在做什么,就一口把花儿吃掉了,这难道不重要吗!” 390 | 391 | 他的脸红了起来,接着往下说: 392 | 393 | 394 | 395 | “如果有个人爱上一朵花儿,好几百万好几百万颗星星中间,只有一颗上面长着这朵花儿,那他只要望着许许多多星星,就会感到很幸福。他对自己说:‘我的花儿就在其中的一颗星星上……’可要是绵羊吃掉了这朵花儿,这对他来说,就好像满天的星星突然一下子都熄灭了!这难道不重要吗!” 396 | 397 | 他说不下去了,突然抽抽噎噎地哭了起来。夜色降临。我放下手中的工具。锤子呀,螺钉呀,口渴呀,死亡呀,我全都丢在了脑后。在一颗星星,在一颗我所在的行星,在这个地球上,有个小王子需要安慰!我把他抱在怀里。我摇着他,对他说:“你爱的那朵花儿不会有危险的……我会给你的绵羊画一只嘴罩……我会给你的花儿画一个护栏……我……”我不知道再说什么好了。我觉得自己笨嘴笨舌的。我不知道怎样去接近他,打动他……泪水的世界,是多么神秘啊。 398 | 399 | 400 | 401 | 402 | 403 | 404 | 405 | Ch8 406 | 407 | 408 | 我很快就对这朵花儿有了更多的了解。在小王子的星球上,过去一直长着些很简单的花儿,这些花儿只有一层花瓣,不占地方,也不妨碍任何人。某个早晨她们会在草丛中绽放,一到晚上又都悄悄凋谢了。有一天,一颗不知从哪儿来的种子发了芽,长出的嫩苗跟别的幼苗都不一样。小王子小心翼翼地观察着这株嫩苗,它说不定是猴面包树的一枝幼芽呢。但是这株嫩苗很快就不再长大,渐渐含苞欲放。小王子眼看着它绽出一个很大很大的花蕾,心想这花蕾里一定会出现奇妙的景象,可是这朵花儿待在绿色的花萼里面,磨磨蹭蹭地打扮个没完。她精心挑选着自己的颜色,慢吞吞地穿上衣裙,一片一片地理顺花瓣。她不愿像虞美人那样一亮相就是满脸皱纹。她要让自己美艳照人地来到世间。噢!对。她很爱俏!她那神秘的装扮,就这样日复一日地延续着。然后,有一天早晨,太阳刚升起的时候,她绽放了。 409 | 410 | 411 | 412 | 她精心打扮了那么久,这会儿却打着哈欠说: 413 | 414 | “啊!我刚睡醒……真对不起……头发还是乱蓬蓬的……” 415 | 416 | 这时,小王子的爱慕之情油然而生: 417 | 418 | “您真美!” 419 | 420 | “可不是吗,”花儿柔声答道,“我是跟太阳同时出生的嘛……” 421 | 422 | 小王子感觉到了她不太谦虚,不过她实在太楚楚动人了! 
423 | 424 | “我想,现在该是用早餐的时间了,”她随即又说,“麻烦您也给我……” 425 | 426 | 小王子很不好意思,于是就打来一壶清水,给这朵花儿浇水。 427 | 428 | 就这样,她带着点多疑的虚荣心,很快就把他折磨得够呛。比如说,有一天说起她的四根刺,她对小王子说: 429 | 430 | 431 | 432 | “那些老虎,让它们张着爪子来好了!” 433 | 434 | “我的星球上没有老虎,”小王子顶了她一句,“再说,老虎也不吃草呀。” 435 | 436 | “我不是草,”花儿柔声答道。 437 | 438 | “对不起……” 439 | 440 | “我不怕老虎,可我怕风。您没有风障吗?” 441 | 442 | “怕风……一棵植物到了这份上,那可惨了,”小王子轻声说,“花儿可真难伺候……” 443 | 444 | “晚上您要把我罩起来。您这儿很冷。又没安顿好。我来的那地方……” 445 | 446 | 可是她没说下去。她来的时候是颗种子。她不可能知道别的世界是怎么样的。让人发现她说的谎这么不高明,她又羞又恼,就咳了两三声嗽,想让小王子觉得理亏: 447 | 448 | “风障呢?” 449 | 450 | “我正要去拿,可您跟我搭话了!” 451 | 452 | 于是她咳得更重了些,不管怎么说,她非让他感到内疚不可。 453 | 454 | 455 | 456 | 就这样,小王子尽管真心真意喜欢这朵花儿,可还是很快就对她起了疑心。他对那些无关紧要的话太当真了,结果自己很苦恼。 457 | 458 | “我本来不该去听她说什么的,”有一天他对我说了心里话,“花儿说的话,是听不得的。花儿是让人看,让人闻的。这朵花儿让我的星球芳香四溢,我却不会享受这快乐。老虎爪子那些话,惹得我那么生气,其实我该同情她才是……” 459 | 460 | 他还对我说: 461 | 462 | 463 | 464 | “我当时什么也不懂!看她这个人,应该看她做什么,而不是听她说什么。她给了我芳香,给了我光彩。我真不该逃走!我本该猜到她那小小花招背后的一片柔情。花儿总是这么表里不一!可惜当时我太年轻,还不懂得怎么去爱她。” 465 | 466 | 467 | 468 | 469 | 470 | 471 | 472 | 473 | 474 | Ch9 475 | 476 | 477 | 我想他是趁一群野鸟迁徙的机会出走的。动身的那天早晨,他把星球收拾得井井有条。他仔细地疏通了活火山。星球上有两座活火山,热早餐很方便。还有一座死火山。不过,正像他所说的:“谁说得准呢!”所以这座死火山也照样要疏通。火山疏通过了,就会缓缓地、均匀地燃烧,不会喷发。火山喷发跟烟囱冒火是一样的。当然,在地球上,我们实在太小了,没法去疏通火山。它们造成那么多麻烦,就是由于这个缘故。 478 | 479 | 小王子还拔掉了刚长出来的几株猴面包树幼苗。他心情有点忧郁,心想这一走就再也回不来了。所有这些习惯的活儿,这天早上都显得格外亲切。而当他最后一次给花儿浇水,准备给它盖上罩子的时候,他只觉得想哭。 480 | 481 | “再见啦,”他对花儿说。 482 | 483 | 可是她没有回答。 484 | 485 | “再见啦,”他又说了一遍。 486 | 487 | 花儿咳嗽起来。但不是由于感冒。 488 | 489 | “我以前太傻了,”她终于开口了,“请你原谅我。但愿你能幸福。” 490 | 491 | 他感到吃惊的是,居然没有一声责备。他举着罩子,茫然不知所措地站在那儿。他不懂这般恬淡的柔情。 492 | 493 | “是的,我爱你,”花儿对他说,“但由于我的过错,你一点儿也没领会。这没什么要紧。不过你也和我一样傻。但愿你能幸福……把这罩子放在一边吧,我用不着它了。” 494 | 495 | “可是风……” 496 | 497 | “我并不是那么容易感冒的……夜晚的新鲜空气对我有好处。我是一朵花儿。” 498 | 499 | “可是那些虫子和野兽……” 500 | 501 | 502 | 503 | “我既然想认识蝴蝶,就应该受得了两三条毛虫。我觉得这样挺好。要不然有谁来看我呢?你,你到时候已经走得远远的了。至于野兽,我根本不怕。我也有爪子。” 504 | 505 | 说着,她天真地让他看那四根刺。随后她又说: 506 | 507 | “别磨磨蹭蹭的,让人心烦。你已经决定要走了。那就走吧。” 508 | 509 | 因为她不愿意让他看见自己流泪。她是一朵如此骄傲的花儿…… 510 | 511 | 512 | 513 | 514 | 515 | 516 | 517 | Ch10 518 | 519 | 520 | 这颗星球附近,还有325号、326号、327号、328号、329号和330号小行星。于是他开始拜访这些星球,好给自己找点事干,也好增长些见识。 521 | 522 | 第一颗小行星上住着一个国王。这个国王身穿紫红镶边白鼬皮长袍,端坐在一张简朴而又气派庄严的王座上。 523 | 524 | “哈!来了一个臣民,”国王看见小王子,大声叫了起来。 525 | 526 | 可小王子觉得纳闷: 527 | 528 | “他以前从没见过我,怎么会认识我呢?” 529 | 530 | 他不知道,对国王来说,世界是非常简单的。所有的人都是臣民。 531 | 532 | “你走近点,让我好好看看你,”国王说,他觉得非常骄傲,因为他终于成了某个人的国王。 533 | 534 | 小王子朝四下里看看,想找个地方坐下来,可是整个星球都被那袭华丽的白鼬皮长袍占满了。所以他只好站着,不过,由于他累了,就打了个哈欠。 535 | 536 | “在国王面前打哈欠,有违宫廷礼仪,”国王对他说,“我禁止你打哈欠。” 537 | 538 | “我没忍住,”小王子歉疚地说,“我走了好长的路,一直没睡觉……” 539 | 540 | “那么,”国王对他说,“我命令你打哈欠。我有好几年没见人打哈欠了。我觉得打哈欠挺好玩。来!再打个哈欠。这是命令。” 541 | 542 | “我给吓着了……打不出……”小王子涨红着脸说。 543 | 544 | “呣!呣!”国王回答说。“那么我……我命令你一会儿打哈欠,一会儿……” 545 | 546 | 他嘟嘟哝哝的,看上去不大高兴。 547 | 548 | 国王其实是要别人尊重他的权威。他不能容忍别人不服从命令。他是个专制的君主。不过,因为他很善良,他下的命令都是通情达理的。 549 | 550 | “要是我命令,”这番话他说得流畅极了,“要是我命令一个将军变成一只海鸟,那个将军不服从,这就不是那个将军的错,这是我的错。” 551 | 552 | “我可以坐下吗?”小王子怯生生地问。 553 | 554 | “我命令你坐下,”国王回答他说,庄重地挪了挪白鼬皮长袍的下摆。 555 | 556 | 可是小王子感到很奇怪。这么小的星球,国王能统治什么呢? 
557 | 558 | “陛下……”他说,“请允许我向您提个……” 559 | 560 | “我命令你向我提问题,”国王赶紧抢着说。 561 | 562 | “陛下……您统治什么呢?” 563 | 564 | “一切,”国王的回答简单明了。 565 | 566 | “一切?” 567 | 568 | 国王小心翼翼地做了个手势,指了指他的行星、其他的行星和所有的星星。 569 | 570 | 571 | 572 | “全归您统治?”小王子问。 573 | 574 | “全归我统治……”国王回答说。 575 | 576 | 因为他不仅是一国的专制君主,还是宇宙的君主。 577 | 578 | “那些星星都服从您?” 579 | 580 | “当然,”国王回答说,“我一下命令,它们马上就服从。我不能容忍纪律涣散。” 581 | 582 | 这样的权力使小王子惊叹不已。他如果拥有这样的权力,那么一天就不是看四十三次,而是七十二次,一百次,甚至两百次日落,连椅子都不用挪一挪!由于想起被他遗弃的小星球,他有点难过,所以就壮着胆子向国王提出一个请求: 583 | 584 | “我想看一次日落……请您为我……命令太阳下山……” 585 | 586 | “要是我命令一个将军像蝴蝶一样从一朵花儿飞到另一朵花儿,或者让他写一部悲剧,或者让他变成一只海鸟,而这个将军拒不执行命令,那是谁,是他还是我的错呢?” 587 | 588 | “那是您的错,”小王子肯定地说。 589 | 590 | “正是如此。得让每个人去做他能做到的事情,”国王接着说。“权威首先得建立在合理的基础上。如果你命令你的老百姓都去投海,他们就会造反。我之所以有权让人服从,就是因为我的命令都是合情合理的。” 591 | 592 | “那么我想看的日落呢?”小王子想起了这件事,他对自己提过的问题是不会忘记的。 593 | 594 | “你会看到日落的。我会要它下山的。不过按照我的统治原则,要等到条件成熟的时候。” 595 | 596 | “要等到什么时候呢?”小王子问。 597 | 598 | “呣!呣!”国王先翻看一本厚厚的历书,然后回答说,“呣!呣!要等到,大概……大概……要等到今晚大概七点四十分!你会看到它乖乖地服从我的命令的。” 599 | 600 | 小王子打了个哈欠。看不到日落,让他感到挺遗憾。再说他也已经有点腻烦了: 601 | 602 | “我在这儿没什么事好做了,”他对国王说。“我要走了!” 603 | 604 | “别走,”国王回答说,他有了一个臣民,正骄傲着呢。“别走,我任命你当大臣!” 605 | 606 | “什么大臣?” 607 | 608 | “这个……司法大臣!” 609 | 610 | “可是这儿没有人要审判呀!” 611 | 612 | “那可说不定,”国王对他说。“我还没巡视过我的王国。我太老了,我没地方放马车,走路又累得慌。” 613 | 614 | “噢!可是我已经看过了,”小王子说着,又朝这颗小行星的另一边瞥了一眼。“那边也没有一个人……” 615 | 616 | “那你就审判你自己,”国王回答他说。“这是最难的。审判自己要比审判别人难得多。要是你能审判好自己,你就是个真正的智者。” 617 | 618 | “可我,”小王子说,“我在哪儿都可以自己审判自己。我不必留在这儿呀。” 619 | 620 | “呣!呣!”国王说,“我想哪,在我的星球上有只老耗子。夜里我听见它的声音。你可以审判这只老耗子。你可以不时判它死刑。这样啊,它的生命就取决于你的判决了。不过,这只耗子你得悠着点儿用,每次判决后都得赦免它。因为只有这么一只耗子。” 621 | 622 | “可我,”小王子回答说,“我不喜欢判死刑,我想我还得走。” 623 | 624 | “不行,”国王说。 625 | 626 | 整装待发的小王子不想让老国王难过: 627 | 628 | “陛下如果想让命令立刻得到服从,那就不妨下一道合情合理的命令。比如说,陛下可以命令我在一分钟内离开此地。我觉得条件已经成熟……” 629 | 630 | 国王一声不吭,小王子起先有点犹豫,而后叹了口气,就起程了。 631 | 632 | “我任命你当我的大使,”这时国王赶紧喊道。他的神态威严极了。 633 | 634 | “这些大人真奇怪,”小王子在旅途中自言自语地说。 635 | 636 | 637 | 638 | 639 | 640 | 641 | 642 | Ch11 643 | 644 | 645 | 第二颗行星上住着一个爱虚荣的人。 646 | 647 | “哈哈!有个崇拜者来看我了!”这个爱虚荣的人刚看见小王子,大老远就喊了起来。 648 | 649 | 因为,在爱虚荣的人眼里,别人都是他们的崇拜者。 650 | 651 | “您好,”小王子说。“您这顶帽子挺有趣的。” 652 | 653 | “这是用来致意的,”爱虚荣的人回答说,“人家向我欢呼时,我就用帽子向他们致意。可惜啊,一直没人经过这儿。” 654 | 655 | “是吗?”小王子说,他没明白那人的意思。 656 | 657 | “你用一只手去拍另一只手,”于是爱虚荣的人这样教他。 658 | 659 | 小王子就拍起巴掌来了。爱虚荣的人抬起帽子,谦逊地致意。 660 | 661 | “这比访问那个国王好玩多了,”小王子心想。他又拍起巴掌来了。爱虚荣的人就又抬起帽子致意。这样玩了五分钟,小王子觉得太单调,他都玩累了: 662 | 663 | 664 | 665 | “要想叫这顶帽子掉下来,该怎么做呢?” 666 | 667 | 可是爱虚荣的人没听见他的话。爱虚荣的人只听得见颂扬的话。 668 | 669 | “你真的很崇拜我吗?”他问小王子。 670 | 671 | “崇拜是什么意思?” 672 | 673 | “崇拜的意思就是,承认我是这个星球上最英俊、最摩登、最富有、最有学问的人。” 674 | 675 | “可是这个星球上只有你一个人呀!” 676 | 677 | “帮帮忙。你只管崇拜我就是了!” 678 | 679 | “我崇拜你,”小王子说着,微微耸了耸肩膀,“可是你要这个干什么呢?” 680 | 681 | 说着,小王子就走开了。 682 | 683 | “这些大人真的很怪哟,”一路上,他这么对自己说了一句。 684 | 685 | 686 | 687 | 688 | 689 | 690 | 691 | Ch12 692 | 693 | 694 | 下一颗行星上住着一个酒鬼。这次访问时间很短,却使小王子陷入了深深的怅惘之中。 695 | 696 | 他看见那个酒鬼静静地坐在桌前,面前有一堆空酒瓶和一堆装得满满的酒瓶,他就问:“你在那儿干什么呢?” 697 | 698 | “我喝酒,”酒鬼神情悲伤地回答。 699 | 700 | “你为什么要喝酒呢?”小王子问。 701 | 702 | “为了忘记,”酒鬼回答。 703 | 704 | “忘记什么?”小王子已经有些同情他了。 705 | 706 | “忘记我的羞愧,”酒鬼垂下脑袋坦白说。 707 | 708 | “为什么感到羞愧?”小王子又问,他想帮助这个人。 709 | 710 | 711 | 712 | “为喝酒感到羞愧!”酒鬼说完这句话,就再也不开口了。 713 | 714 | 小王子茫然不解地走了。 715 | 716 | “这些大人真的很怪很怪,”一路上,他自言自语地说。 717 | 718 | 719 | 720 | 721 | 722 | 723 | 724 | Ch13 725 | 726 | 727 | 第四颗行星是个商人的星球。这个人实在太忙碌了,看见小王子来,连头也没抬一下。 728 | 729 | “您好,”小王子对他说,“您的烟卷灭了。” 730 | 731 | “三加二等于五。五加七等于十二。十二加三等于十五。你好。十五加七等于二十二。二十二加六是二十八。没时间再去点着它。二十六加五,三十一。嚯!一共是五亿一百六十二万二千七百三十一。” 732 | 733 | “五亿什么呀?” 734 | 735 | “呣?你还在这儿?五亿一百万……我也不知道是什么……我的工作太多了!我做的都是正事,我没有工夫闲聊!二加五等于七……” 736 | 737 | 738 | 739 | “五亿一百万什么?”小王子又问一遍,他向来是不提问题则罢,提了就决不放过。 740 | 741 | 商人抬起头来: 742 | 743 
| “我在这个星球上住了五十四个年头,只被打搅过三次。第一次是二十二年以前,有只不知从哪儿跑来的金龟子,弄出一片可怕的声音,害得我在一笔账目里出了四个差错。第二次是十一年前,我风湿病发作。我平时缺乏锻炼。我没工夫去闲逛。我是干正事的人。第三次……就是这次!所以我刚才说了,五亿一百万……” 744 | 745 | “五亿一百万什么?” 746 | 747 | 商人明白他是甭想太平了: 748 | 749 | “五亿一百万个小东西,有时候在天空里看得见它们。” 750 | 751 | “苍蝇?” 752 | 753 | “不对,是闪闪发亮的小东西。” 754 | 755 | “蜜蜂?” 756 | 757 | “不对。是些金色的小东西,无所事事的人望着它们会胡思乱想。可我是干正事的人!我没工夫去胡思乱想。” 758 | 759 | “噢!是星星?” 760 | 761 | “对啦。星星。” 762 | 763 | “你拿这五亿颗星星做什么呢?” 764 | 765 | “五亿一百六十二万二千七百三十一颗。我是个认真的人,我讲究精确。” 766 | 767 | “那你拿这些星星来做什么呢?” 768 | 769 | “我拿它们做什么?” 770 | 771 | “是啊。” 772 | 773 | “不做什么。我占有它们。” 774 | 775 | “你占有这些星星?” 776 | 777 | “对。” 778 | 779 | “可我已经见到有个国王,他……” 780 | 781 | “国王并不占有。他们只是‘统治’。这完全是两码事。” 782 | 783 | “占有这些星星对你有什么用呢?” 784 | 785 | “可以使我富有。” 786 | 787 | “富有对你有什么用呢?” 788 | 789 | “可以去买其他的星星——只要有人发现了这样的星星。” 790 | 791 | “这个人,”小王子暗自思忖,“想问题有点像那个酒鬼。” 792 | 793 | 话虽这么说,他还是接着提问题: 794 | 795 | “一个人怎么能够占有这些星星呢?” 796 | 797 | “它们属于谁了?”商人没好气地顶了他一句。 798 | 799 | “我不知道。谁也不属于。” 800 | 801 | “那么它们就属于我,因为是我第一个想到这件事的。” 802 | 803 | “这就够了?” 804 | 805 | “当然。当你发现一颗不属于任何人的钻石,它就属于你。当你发现一个不属于任何人的岛屿,它就属于你。当你最先想出一个主意,你去申请发明专利,它就属于你。现在我占有了这些星星,因为在我以前没有人想到过占有它们。” 806 | 807 | “这倒也是,”小王子说,“可你拿它们来做什么呢?” 808 | 809 | “我经营它们。我一遍又一遍地计算它们的数目,”商人说,“这并不容易。可我是个干正事的人!” 810 | 811 | 小王子还是不满意。 812 | 813 | “我呀,如果我有一块方围巾,我可以把它围在脖子上带走它。如果我有一朵花儿,我可以摘下这朵花儿带走它。可是你没法摘下这些星星呀!” 814 | 815 | “没错,但是我可以把它们存入银行。” 816 | 817 | “这是什么意思?” 818 | 819 | “这就是说,我把我的星星的总数写在一张小纸片上。然后我把这张小纸片放进一个抽屉锁好。” 820 | 821 | “就这些?” 822 | 823 | “这就够了!” 824 | 825 | “真有趣,”小王子心想,“倒挺有诗意的。可这算不上什么正事呀。” 826 | 827 | 小王子对正事的看法,跟大人对正事的看法很不相同。 828 | 829 | “我有一朵花儿,”他又说道,“我每天都给她浇水。我有三座火山,我每星期都把它们疏通一遍。那座死火山我也疏通。因为谁也说不准它还会不会喷发。我占有它们,对火山有好处,对花儿也有好处。可是你占有星星,对它们没有好处。” 830 | 831 | 商人张口结舌,无言以对。小王子就走了。 832 | 833 | “这些大人真的好古怪,”一路上,他只是自言自语说了这么一句。 834 | 835 | 836 | 837 | 838 | 839 | 840 | 841 | Ch14 842 | 843 | 844 | 第五颗行星非常奇怪。这是最小的一颗。上面刚好只能容得下一盏路灯和一个点灯人。小王子好生纳闷,在天空的一个角落,在一个既没有房子也没有居民的行星上,要一盏路灯和一个点灯人,又能有什么用呢?不过他还是对自己说: 845 | 846 | “很可能这个人是有点不正常。但是跟那个国王,那个爱虚荣的人,那个商人和那个酒鬼比起来,他还是要比他们正常些。至少他的工作还有意义。他点亮路灯,就好比唤醒了另一个太阳或者一朵花儿。他熄灭路灯,就好比让这朵花儿或这个太阳睡觉了。这是件很美的事情。既然很美,自然就有用啰!” 847 | 848 | 他一到这个星球,就很尊敬地向点灯人打招呼: 849 | 850 | “早上好。你刚才为什么把路灯熄掉呢?” 851 | 852 | “这是规定,”点灯人回答说,“早上好。” 853 | 854 | “什么规定?” 855 | 856 | “熄灭路灯呗。晚上好。” 857 | 858 | 说着他又点亮了路灯。 859 | 860 | “那你刚才为什么又点亮路灯呢?” 861 | 862 | “这是规定,”点灯人回答说。 863 | 864 | “我弄不懂,”小王子说。 865 | 866 | “没什么要弄懂的,”点灯人说,“规定就是规定。早上好。” 867 | 868 | 说着他熄灭了路灯。 869 | 870 | 然后他用一块有红方格的手帕擦了擦额头。 871 | 872 | “我干的是件非常累人的差事。以前还说得过去。我早晨熄灯,晚上点灯。白天我有时间休息,夜里也有时间睡觉……” 873 | 874 | “那么,后来规定改变了?” 875 | 876 | “规定没有改变,”点灯人说,“惨就惨在这儿!这颗行星一年比一年转得快,可规定却没变!” 877 | 878 | “结果呢?”小王子说。 879 | 880 | “结果现在每分钟转一圈,我连一秒钟的休息时间都没有。我每分钟就要点一次灯,熄一次灯!” 881 | 882 | “这可真有趣!你这儿一天只有一分钟!” 883 | 884 | “一点也不有趣,”点灯人说,“我们说着话,就已经一个月过去了。” 885 | 886 | “一个月?” 887 | 888 | “对。三十分钟。三十天!晚上好。” 889 | 890 | 说着他点亮了路灯。 891 | 892 | 小王子瞧着他,心里喜欢上了这个忠于职守的点灯人。他想起了自己以前的挪椅子看日落。他挺想帮助这个朋友: 893 | 894 | “你知道……我有一个办法,好让你想休息就能休息……” 895 | 896 | 897 | 898 | 我干的是桩非常累人的差事 899 | 900 | “我一直想休息,”点灯人说。 901 | 902 | 因为,一个人可以同时是忠于职守的,又是生性疏懒的。 903 | 904 | 小王子接着说: 905 | 906 | “你的星球小得很,你走三步就绕了一圈。所以你只要走得慢一些,就可以一直待在阳光下。你要想休息了,就往前走……你要白天有多长,它就有多长。” 907 | 908 | “这办法帮不了我多少忙,”点灯人说,“我这人,平生就喜欢睡觉。” 909 | 910 | “真不走运,”小王子说。 911 | 912 | “真不走运,”点灯人说,“早上好。” 913 | 914 | 说着他熄灭了路灯。 915 | 916 | “这个人呀,”小王子一边继续他的旅途,一边在想,“国王也好,爱虚荣的人也好,酒鬼也好,商人也好,他们都会瞧不起这个人。可是,就只有他没让我感到可笑。也许,这是因为他关心的是别的事情,而不是自己。” 917 | 918 | 他惋惜地叹了口气,又自言自语: 919 | 920 | “只有这个人我可以跟他交朋友。可是他的星球实在太小了。两个人挤不下……” 921 | 922 | 小王子不敢承认的是,他留恋这颗受上苍眷顾的星球,是因为每二十四小时就有一千四百四十次日落! 
923 | 924 | 925 | 926 | 927 | 928 | 929 | 930 | Ch15 931 | 932 | 933 | 第六颗行星是一颗大十倍的行星。上面住着一个老先生,他在写一本本大部头的著作。 934 | 935 | “瞧!来了一位探险家!”他一看见小王子,就喊道。 936 | 937 | 小王子坐在桌边,喘了喘气。他刚走了那么多路! 938 | 939 | “你从哪儿来啊?”老先生问他。 940 | 941 | “这一大本是什么书?”小王子说,“您在这儿干什么呢?” 942 | 943 | “我是地理学家,”老先生说。 944 | 945 | “什么叫地理学家?” 946 | 947 | “地理学家是个学者,他知道哪儿有海洋,有河流,有城市,有山脉和沙漠。” 948 | 949 | 950 | 951 | “这挺有趣,”小王子说,“啊,这才是真正的职业!”说着他朝地理学家的星球四周望了一眼。他还从没见过这么雄伟壮丽的星球哩。 952 | 953 | “您的星球真美。它有海洋吗?” 954 | 955 | “这我没法知道,”地理学家说。 956 | 957 | “哦!”小王子有点失望。“那么山脉呢?” 958 | 959 | “这我没法知道,”地理学家说。 960 | 961 | “城市、河流和沙漠呢?” 962 | 963 | “这我也没法知道,”地理学家说。 964 | 965 | “可您是地理学家呀!” 966 | 967 | “一点不错,”地理学家说,“但我不是探险家。我这里一个探险家也没有。地理学家是不出去探测城市、河流、山脉、海洋和沙漠的。地理学家非常重要,他不能到处闲逛。他从不离开自己的书房。不过他会在那里接见探险家。他向他们提问,把他们的旅行回忆记下来。要是他觉得他们中间哪个人的回忆有意思,他就会让人对这个探险家的品行作一番调查。” 968 | 969 | “这是为什么?” 970 | 971 | “因为一个说谎的探险家会给地理书带来灾难性的后果。一个贪杯的探险家也是如此。” 972 | 973 | “这是为什么?”小王子问。 974 | 975 | “因为酒鬼会把一样东西看成两样东西。这样一来,地理学家就会把明明只有一座山的地方写成有两座山了。” 976 | 977 | “我认识一个人,”小王子说,“他当探险家就不行。” 978 | 979 | “这有可能。所以,要等到了解探险家品行良好以后,才对他的发现进行调查。” 980 | 981 | “去看一下?” 982 | 983 | “不。这太复杂了。地理学家只要求探险家提供物证。比如说,他发现了一座大山,地理学家就要求他带一块大石头来。” 984 | 985 | 地理学家忽然激动起来。 986 | 987 | “嗨,你是大老远来的!你是探险家!你给我说说你的星球!” 988 | 989 | 说着,地理学家打开笔记本,削了支铅笔。地理学家一开始只用铅笔记下探险家讲的话。要等到这个探险家提供物证以后,才换用钢笔来记录。 990 | 991 | “怎么样?”地理学家问。 992 | 993 | “哦!我那儿,”小王子说,“并不很有趣,那是颗很小的星球。我有三座火山。两座活火山,一座死火山。不过这也说不定。” 994 | 995 | “这可说不定,”地理学家说。 996 | 997 | “我还有一朵花儿。” 998 | 999 | “花儿我们是不记下来的,”地理学家说。 1000 | 1001 | “这是为什么?花儿是最美的呀!” 1002 | 1003 | “因为花是转瞬即逝的。” 1004 | 1005 | “什么叫‘转瞬即逝’呢?” 1006 | 1007 | “地理书,”地理学家说,“是所有的书中间最宝贵的。地理书永远不会过时。山脉移位的情形是极其罕见的。海洋干涸的情形也是极其罕见的。我们写的都是永恒的事物。” 1008 | 1009 | “可是死火山说不定也会醒来,”小王子插话说,“什么叫‘转瞬即逝’呢?” 1010 | 1011 | “火山睡也好,醒也好,对我们地理学家来说是一码事,”地理学家说,“我们关心的是山。山是一成不变的。” 1012 | 1013 | “可是,什么叫‘转瞬即逝’呢?”小王子追问道,他向来提了问题就不肯放过。 1014 | 1015 | “意思就是‘随时有消逝的危险’。” 1016 | 1017 | “我的花儿随时有消逝的危险吗?” 1018 | 1019 | “当然。” 1020 | 1021 | “我的花儿是转瞬即逝的,”小王子想道,“她只有四根刺可以自卫,可以用来抵御这个世界!而我却丢下她孤零零地在那儿!” 1022 | 1023 | 1024 | 1025 | 想到这儿,他不由得感到了后悔。不过他马上又振作起来: 1026 | 1027 | “依您看,我再去哪儿访问好呢?”他问。 1028 | 1029 | “地球吧,”地理学家回答说,“它的名气挺响……” 1030 | 1031 | 于是小王子走了,一边走一边想着他的花儿。 1032 | 1033 | 1034 | 1035 | 1036 | 1037 | 1038 | 1039 | Ch16 1040 | 1041 | 1042 | 所以,第七颗行星就是地球了。 1043 | 1044 | 地球可不是普普通通的行星!它上面有一百十一个国王(当然,黑人国王也包括在内),七千万个地理学家,九十万个商人,七百五十万个酒鬼,三亿一千一百个爱虚荣的人,总共大约有二十亿个大人。为了让你们对地球的大小有个概念,我就这么对你们说吧,在发明电以前,地球的六大洲上,需要维持一支四十六万二千五百十一人的浩浩荡荡的点灯人大军。 1045 | 1046 | 从稍远些的地方看去,这是一幅壮丽的景观。这支大军行动起来,就像在歌剧院里跳芭蕾舞那样有条不紊。最先上场的是新西兰和澳大利亚的点灯人。点着了灯,他们就退下去睡觉。接着是中国和西伯利亚的点灯人上场,随后他们也退到幕后。下面轮到了俄罗斯和印度的点灯人。接下去是非洲和欧洲的,而后是南美的。再后来是北美的。所有这些点灯人从来不会搞乱上场的次序。这场面真是蔚为壮观。 1047 | 1048 | 只有北极(那儿只有唯一一盏路灯)的点灯人和南极(那儿也只有唯一一盏路灯)的那个同行,过着悠闲懒散的生活:他俩一年干两回活儿。 1049 | 1050 | 1051 | 1052 | 1053 | 1054 | 1055 | 1056 | Ch17 1057 | 1058 | 1059 | 一个人如果想把话说得有趣些,免不了会稍稍撒点谎。我给你们讲点灯人大军的那会儿,就不是很诚实。那些不了解我们行星的人,听了我讲的故事,可能会造成一种错觉。其实人在地球上只占一点点地方。倘若让地球上的二十亿居民全都挨个儿站着,就像集会时那样,那么二十海里长、二十海里宽的一个广场就容得下他们。全人类可以挤在太平洋中最小的一个岛屿上。 1060 | 1061 | 当然,大人是不会相信你们的。他们自以为占了好多好多地方。他们把自己看得跟猴面包树一样重要。你们不妨劝他们好好算一算。他们喜欢数字,说到计算就来劲。不过你们可别浪费时间,去做这种叫人厌烦的事情。根本不用去做。你们相信我就行了。所以小王子一踏上地球,就觉得奇怪,怎么一个人也看不见呢。他正在担心是不是来错了星球,忽然看见沙地上一个月白色的圆环在挪动。 1062 | 1063 | “晚上好,”小王子没把握地招呼说。 1064 | 1065 | “晚上好,”蛇说。 1066 | 1067 | “我落在哪个行星上了?”小王子问。 1068 | 1069 | “在地球上,这是非洲,”蛇回答。 1070 | 1071 | “噢!难道地球上一个人也没有吗?” 1072 | 1073 | “这儿是沙漠。在沙漠上是一个人也没有的。地球大着呢,”蛇说。 1074 | 1075 | 小王子在一块石头上坐下,抬头望着天空: 1076 | 1077 | “我在想,”他说,“这些星星闪闪发亮,大概是要让每个人总有一天能找到自己的那颗星星吧。瞧我的那颗星星。它正好在我们头顶上……可是它离得那么远!” 1078 | 1079 | “它很美,”蛇说,“你到这儿来干吗?” 1080 | 1081 | 1082 | 1083 | “你真是种奇怪的动物,”最后他说,“细得像根手指……” 1084 | 1085 | “我和一朵花儿闹了别扭,”小王子说。 1086 | 1087 | “噢!”蛇说。 
1088 | 1089 | 他俩都沉默了。 1090 | 1091 | “哪儿见得到人呢?”小王子终于又开口了,“在沙漠里真有点孤独……” 1092 | 1093 | “在人群中间,你也会感到孤独,”蛇说。 1094 | 1095 | 小王子久久地注视着蛇: 1096 | 1097 | “你真是种奇怪的动物,”最后他说,“细得像根手指……” 1098 | 1099 | “可我比一个国王的手指还厉害呢,”蛇说。 1100 | 1101 | 小王子笑了: 1102 | 1103 | “你厉害不到哪儿去……你连脚都没有……要出远门你就不行吧?” 1104 | 1105 | “我可以把你带到很远很远的地方去,比一艘船去的地方还远,”蛇说。 1106 | 1107 | 它盘在小王子的脚踝上,像一只金镯子: 1108 | 1109 | “凡是我碰过的人,我都把他们送回老家去,”它又说,“可你这么纯洁,又是从一颗星星那儿来的……” 1110 | 1111 | 小王子没有作声。 1112 | 1113 | “在这个花岗石的地球上,你是这么弱小,我很可怜你。哪天你要是想念你的星星了,我可以帮助你。我可以……” 1114 | 1115 | “噢!我明白你的意思,”小王子说,“可为什么你说的话都像谜似的?” 1116 | 1117 | “这些谜我都能解开,”蛇说。 1118 | 1119 | 然后他们又都沉默了。 1120 | 1121 | 1122 | 1123 | 1124 | 1125 | 1126 | 1127 | Ch18 1128 | 1129 | 1130 | 小王子穿过沙漠,只见到了一朵花儿。一朵长着三片花瓣的花儿,一朵不起眼的花儿…… 1131 | 1132 | “你好,”小王子说。 1133 | 1134 | “你好,”花儿说。 1135 | 1136 | “人们在哪儿呢?”小王子有礼貌地问。 1137 | 1138 | 花儿看见过一支沙漠驼队经过: 1139 | 1140 | “人们?我想是有的,不是六个就是七个。好几年以前,我见过他们。不过谁也不知道,要上哪儿才能找到他们。风把他们一会儿吹到这儿,一会儿吹到那儿。他们没有根,活得很辛苦。” 1141 | 1142 | “再见了,”小王子说。 1143 | 1144 | “再见,”花儿说。 1145 | 1146 | 1147 | 1148 | 1149 | 1150 | 1151 | 1152 | 1153 | 1154 | Ch19 1155 | 1156 | 1157 | 小王子攀上一座高山。他过去只见过三座齐膝高的火山。他还把那座死火山当凳子坐哩。“从一座这么高的山上望下去,”他心想,“我一眼就能看到整个星球和所有的人们……”可是,他看到的只是些陡峭的山峰。 1158 | 1159 | “你们好,”他怯生生地招呼说。 1160 | 1161 | “你们好……你们好……你们好……”回声应道。 1162 | 1163 | “你们是谁呀?”小王子问。 1164 | 1165 | “你们是谁呀……你们是谁呀……你们是谁呀……”回声应道。 1166 | 1167 | “请做我的朋友吧,我很孤独,”他说。 1168 | 1169 | “我很孤独……我很孤独……我很孤独……”回声应道。 1170 | 1171 | “这颗行星可真怪!”他心想,“又干,又尖,又锋利。人们一点想象力都没有。他们老是重复别人对他们说的话……在我那儿有一朵花儿,她总是先开口说话的……” 1172 | 1173 | 1174 | 1175 | 这颗行星又干,又尖,又锋利 1176 | 1177 | 1178 | 1179 | 1180 | 1181 | 1182 | 1183 | Ch20 1184 | 1185 | 1186 | 小王子在沙漠、山岩和雪地上走了很长时间以后,终于发现了一条路。所有的路都通往有人住的地方。 1187 | 1188 | “你们好,”他说。 1189 | 1190 | 眼前是一座玫瑰盛开的花园。 1191 | 1192 | “你好,”玫瑰们说。 1193 | 1194 | 小王子瞧着她们。她们都长得和他的花儿一模一样。 1195 | 1196 | “你们是什么花呀?”他惊奇地问。 1197 | 1198 | “我们是玫瑰花,”玫瑰们说。 1199 | 1200 | “噢!”小王子说…… 1201 | 1202 | 他感到非常伤心。他的花儿跟他说过,她是整个宇宙中独一无二的花儿。可这儿,在一座花园里就有五千朵,全都一模一样! 
1203 | 1204 | “要是让她看到了,”他想,“她一定会非常生气……她会拼命咳嗽,她还会假装死去,免得让人耻笑。我呢,还得假装去照料她,否则她为了让我感到羞愧,说不定真的会让自己死去……” 1205 | 1206 | 1207 | 1208 | 随后他又想:“我还以为自己拥有的是独一无二的一朵花儿呢,可我有的只是普普通通的一朵玫瑰花罢了。这朵花儿,加上那三座只到我膝盖的火山,其中有一座还说不定永远不会再喷发,就凭这些,我怎么也成不了一个伟大的王子……”想着想着,他趴在草地上哭了起来。 1209 | 1210 | 1211 | 1212 | 1213 | 1214 | 1215 | 1216 | 1217 | 1218 | Ch21 1219 | 1220 | 1221 | 就在这时狐狸出现了。 1222 | 1223 | “早哇,”狐狸说。 1224 | 1225 | “早,”小王子有礼貌地回答,他转过身来,却什么也没看到。 1226 | 1227 | “我在这儿呢,”那声音说,“在苹果树下面……” 1228 | 1229 | “你是谁?”小王子说,“你很漂亮。” 1230 | 1231 | “我是一只狐狸,”狐狸说。 1232 | 1233 | “来和我一起玩吧,”小王子提议,“我很不快活……” 1234 | 1235 | “我不能和你一起玩,”狐狸说,“还没人驯养过我呢。” 1236 | 1237 | “啊!对不起,”小王子说。 1238 | 1239 | 不过,他想了想又说: 1240 | 1241 | “‘驯养’是什么意思?” 1242 | 1243 | “你一定不是这儿的人,”狐狸说,“你来寻找什么呢?” 1244 | 1245 | “我来找人,”小王子说。“‘驯养’是什么意思?” 1246 | 1247 | “人哪,”狐狸说,“他们有枪,还打猎。讨厌极了!他们还养母鸡,这总算有点意思。你也找母鸡吗?” 1248 | 1249 | “不找,”小王子说。“我找朋友。‘驯养’是什么意思?” 1250 | 1251 | “这是一件经常被忽略的事情,”狐狸说。“意思是‘建立感情联系’……” 1252 | 1253 | 1254 | 1255 | “建立感情联系?” 1256 | 1257 | “当然,”狐狸说,“现在你对我来说,只不过是个小男孩,跟成千上万别的小男孩毫无两样。我不需要你。你也不需要我。我对你来说,也只不过是个狐狸,跟成千上万别的狐狸毫无两样。但是,你要是驯养了我,我俩就彼此都需要对方了。你对我来说是世界上独一无二的。我对你来说,也是世界上独一无二的……” 1258 | 1259 | “我有点明白了,”小王子说,“有一朵花儿……我想她是驯养了我……” 1260 | 1261 | “有可能,”狐狸说,“这个地球上各色各样的事都有……” 1262 | 1263 | “哦!不是在地球上,”小王子说。 1264 | 1265 | 狐狸看上去很惊讶: 1266 | 1267 | “在另一个星球上?” 1268 | 1269 | “对。” 1270 | 1271 | “在那个星球上有没有猎人呢?” 1272 | 1273 | “没有。” 1274 | 1275 | “哈,这很有意思!那么母鸡呢?” 1276 | 1277 | “没有。” 1278 | 1279 | “没有十全十美的事呵,”狐狸叹气说。 1280 | 1281 | 不过,狐狸很快又回到刚才的想法上来: 1282 | 1283 | “我的生活很单调。我去捉鸡,人来捉我。母鸡全都长得一个模样,人也全都长得一个模样。所以我有点腻了。不过,要是你驯养我,我的生活就会变得充满阳光。我会辨认出一种和其他所有人都不同的脚步声。听见别的脚步声,我会往地底下钻,而你的脚步声,会像音乐一样,把我召唤到洞外。还有,你看!你看到那边的麦田了吗?我是不吃面包的。麦子对我来说毫无用处。我对麦田无动于衷。可悲就可悲在这儿!而你的头发是金黄色的。所以,一旦你驯养了我,事情就变得很美妙了!金黄色的麦子,会让我想起你。我会喜爱风儿吹拂麦浪的声音……” 1284 | 1285 | 狐狸停下来,久久地注视着小王子: 1286 | 1287 | “请你……驯养我吧!”他说。 1288 | 1289 | 1290 | 1291 | 如果你能在下午四点钟来,那么我在三点钟就会开始有一种幸福的感觉。 1292 | 1293 | “我很愿意,”小王子回答说,“可是我时间不多了。我得去找朋友,还得去了解许多东西。” 1294 | 1295 | “只有驯养过的东西,你才会了解它,”狐狸说,“人们也没有时间去了解任何东西。他们总到商店去购买现成的东西。但是不存在出售朋友的商店,所以人们也就不会有朋友。你如果想要有个朋友,就驯养我吧!” 1296 | 1297 | “那么应当做些什么呢?”小王子说。 1298 | 1299 | “应当很有耐心,”狐狸回答说,“你先坐在草地上,离我稍远一些,就像这样。我从眼角里瞅你,而你什么也别说。语言是误解的根源。不过,每天你都可以坐得离我稍稍近一些……” 1300 | 1301 | 第二天,小王子又来了。 1302 | 1303 | “最好你能在同一时间来,”狐狸说,“比如说,下午四点钟吧,那么我在三点钟就会开始感到幸福了。时间越来越近,我就越来越幸福。到了四点钟,我会兴奋得坐立不安;幸福原来也很折磨人的!可要是你随便什么时候来,我就没法知道什么时候该准备好我的心情……还是得有个仪式。” 1304 | 1305 | “什么叫仪式?”小王子问。 1306 | 1307 | “这也是一件经常被忽略的事情,”狐狸说,“就是定下一个日子,使它不同于其他的日子,定下一个时间,使它不同于其他的时间。比如说,猎人有一种仪式。每星期四他们都和村里的姑娘跳舞。所以呢,星期四就是个美妙的日子!这一天我总要到葡萄地里去转悠转悠。要是猎人们随时跳舞,每天不就都一模一样,我不也就没有假期了吗?” 1308 | 1309 | 就这样,小王子驯养了狐狸。而后,眼看分手的时刻临近了: 1310 | 1311 | “哎!”狐狸说,“……我要哭了。” 1312 | 1313 | “这可是你的不是哟,”小王子说,“我本来没想让你受任何伤害,可你却要我驯养你……” 1314 | 1315 | “可不是,”狐狸说。 1316 | 1317 | “不过你要哭了!”小王子说。 1318 | 1319 | “可不是,”狐狸说。 1320 | 1321 | “结果你什么好处也没得到!” 1322 | 1323 | “我得到了,”狐狸说,“是麦田的颜色给我的。” 1324 | 1325 | 他随即又说: 1326 | 1327 | “你再去看看那些玫瑰花吧。你会明白你那朵玫瑰是世界上独一无二的。然后你再回来跟我告别,我要告诉你一个秘密作为临别礼物。” 1328 | 1329 | 1330 | 1331 | 小王子就去看那些玫瑰。 1332 | 1333 | “你们根本不像我那朵玫瑰,你们还什么都不是呢,”他对它们说,“谁都没驯养过你们,你们也谁都没驯养过。你们就像狐狸以前一样。那时候的它,和成千上万别的狐狸毫无两样。可是我现在和它做了朋友,它在世界上就是独一无二的了。” 1334 | 1335 | 玫瑰们都很难为情。 1336 | 1337 | “你们很美,但你们是空虚的,”小王子接着说,“没有人能为你们去死。当然,我那朵玫瑰在一个过路人眼里跟你们也一样。然而对于我来说,单单她这一朵,就比你们全体都重要得多。因为我给浇过水的是她,我给盖过罩子的是她,我给遮过风障的是她,我给除过毛虫的(只把两三条要变成蝴蝶的留下)也是她。我听她抱怨和自诩,有时也和她默默相对。她,是我的玫瑰。” 1338 | 1339 | 说完,他又回到狐狸跟前: 1340 | 1341 | “再见了……”他说。 1342 | 1343 | “再见,”狐狸说,“我告诉你那个秘密,它很简单:只有用心才能看见。本质的东西用眼是看不见的。” 1344 | 1345 | “本质的东西用眼是看不见的,”小王子重复了一遍,他要记住这句话。 1346 | 1347 | “正是你为你的玫瑰花费的时光,才使你的玫瑰变得如此重要。” 1348 | 1349 | “正是我为我的玫瑰花费的时光,才使我的玫瑰变得如此重要,”小王子说,他要记住这句话。 1350 | 1351 | 
“人们已经忘记了这个道理,”狐狸说。“但你不该忘记它。对你驯养过的东西,你永远负有责任。你必须对你的玫瑰负责……” 1352 | 1353 | “我必须对我的玫瑰负责……”小王子重复一遍,他要记住这句话。 1354 | 1355 | 1356 | 1357 | 1358 | 1359 | 1360 | 1361 | Ch22 1362 | 1363 | 1364 | “你好,”小王子说。 1365 | 1366 | “你好,”扳道工说。 1367 | 1368 | “你在这儿做什么?”小王子问。 1369 | 1370 | “我在分送旅客,一千人一拨,”扳道工说,“我发送运载旅客的列车,一会儿往右,一会儿往左。” 1371 | 1372 | 说着,一列灯火通明的快车,像打雷似的轰鸣着驶过,震得扳道房直打颤。 1373 | 1374 | “他们好匆忙,”小王子说,“他们去找什么呢?” 1375 | 1376 | “开火车的人自己也不知道,”扳道工说。 1377 | 1378 | 说话间,又一列灯火通明的快车,朝相反的方向轰鸣而去。 1379 | 1380 | “他们已经回来了?”小王子问。 1381 | 1382 | “不是刚才的那列,”扳道工说,“这是对开列车。” 1383 | 1384 | “他们对原来的地方不满意吗?” 1385 | 1386 | “人们对自己的地方从来不会满意,”扳道工说。第三列灯火通明的快车轰鸣着驶过。 1387 | 1388 | “他们是去追赶第一批旅客吗?”小王子问。 1389 | 1390 | “他们没追赶谁,”扳道工说,“他们在里面睡觉,或者打哈欠。只有孩子把鼻子贴在窗上看外面。” 1391 | 1392 | “只有孩子知道自己在找什么,”小王子说,“他们在一个布娃娃身上花了好些时间,她对他们来说就成了很重要的东西。要是有人夺走他们的布娃娃,他们会哭的……” 1393 | 1394 | “他们真幸运,”扳道工说。 1395 | 1396 | 1397 | 1398 | 1399 | 1400 | 1401 | 1402 | Ch23 1403 | 1404 | 1405 | “你好,”小王子说。 1406 | 1407 | “你好,”商人说。 1408 | 1409 | 他是个卖复方止渴丸的商人。每星期只要吞服一粒,就不会感到口渴了。 1410 | 1411 | “你为什么要卖这东西?”小王子问。 1412 | 1413 | “它可以大大节约时间,”商人说,“专家做过计算。每星期可以省下五十三分钟。” 1414 | 1415 | “省下的五十三分钟做什么用呢?” 1416 | 1417 | “随便怎么用都行……” 1418 | 1419 | “我呀,”小王子心想,“要是我省下这五十三分钟,我就不慌不忙地朝泉水走去……” 1420 | 1421 | 1422 | 1423 | 1424 | 1425 | 1426 | 1427 | 1428 | 1429 | Ch24 1430 | 1431 | 1432 | 这是我降落在沙漠后的第八天,我听着这个商人的故事,喝完了最后一滴备用水。 1433 | 1434 | “喔!”我对小王子说,“你的回忆很动人,可是我飞机还没修好,水也喝完了,要是我能朝泉水走去,那真是有福了!” 1435 | 1436 | “我那狐狸朋友……”他说。 1437 | 1438 | “小家伙,这可不干狐狸的事!” 1439 | 1440 | “为什么?” 1441 | 1442 | “因为我快要渴死了……” 1443 | 1444 | 他没明白我的思路,回答我说: 1445 | 1446 | “有朋友真好,即使就要死了,我也还是这么想。我真高兴,有过一个狐狸朋友……” 1447 | 1448 | “他没明白情势有多凶险,”我心想,“他从来不知道饥渴。只要有点阳光,他就足够了……” 1449 | 1450 | 然而他注视着我,好像知道我心里在想什么: 1451 | 1452 | “我也渴……我们去找一口井吧……” 1453 | 1454 | 我做了个表示厌烦的手势:在一望无垠的沙漠中,漫无目标地去找井,简直是荒唐。然而,我们到底还是上路了。 1455 | 1456 | 默默地走了几个钟头以后,夜幕降临了,星星在天空中闪烁起来。由于渴得厉害,我有点发烧,望着天上的星星,仿佛在梦中。小王子的话在脑海里盘旋舞蹈。 1457 | 1458 | “你也渴?”我问。 1459 | 1460 | 他没有回答我的问题,只对我说: 1461 | 1462 | “水对心灵也有好处……” 1463 | 1464 | 我没听懂他的话,但我没作声……我知道,这会儿不该去问他。 1465 | 1466 | 他累了。他坐了下来。我坐在他身旁。沉默了一会儿,他又说: 1467 | 1468 | “星星很美,因为有一朵看不见的花儿……” 1469 | 1470 | 我说了声“可不是”,就静静地注视着月光下沙漠的褶皱。 1471 | 1472 | “沙漠很美,”他又说。 1473 | 1474 | 没错。我一向喜欢沙漠。我们坐在一个沙丘上。什么也看不见。什么也听不见。然而有什么东西在寂静中发出光芒…… 1475 | 1476 | “沙漠这么美,”小王子说,“是因为有个地方藏着一口井……” 1477 | 1478 | 我非常吃惊,突然间明白了沙漠发光的奥秘。我小时候住在一座老宅里,传说宅子里埋着宝藏。当然,从来没人发现过这宝藏,或许根本没人寻找过它。但是它使整座宅子变得令人着迷。我的宅子在心灵深处藏着一个秘密…… 1479 | 1480 | “对,”我对小王子说,“不管是宅子,还是星星或沙漠,使它们变美的东西,都是看不见的!” 1481 | 1482 | 1483 | 1484 | 他笑了,拉住吊绳,让辘轳转起来 1485 | 1486 | “我很高兴,”他说,“你和狐狸的看法一样了。” 1487 | 1488 | 看小王子睡着了,我把他抱起来,重新上路。我很激动。我觉得就像捧着一件易碎的宝贝。我甚至觉得在地球上,再没有更娇弱的东西了。我在月光下看着他苍白的前额,紧闭的眼睛,还有那随风飘动的发绺,在心里对自己说:“我所看到的只是外貌。最重要的东西是看不见的……” 1489 | 1490 | 当他微微张开的嘴唇绽出一丝笑意时,我又对自己说:“在这个熟睡的小王子身上,最让我感动的,是他对一朵花儿的忠贞,这朵玫瑰的影像,即使在他睡着时,仍然在他身上发出光芒,就像一盏灯的火焰一样……”这时我把他想得更加娇弱了。应该好好保护灯火呵,一阵风就会吹灭它…… 1491 | 1492 | 就这样走啊走啊,我在拂晓时发现了水井。 1493 | 1494 | 1495 | 1496 | 1497 | 1498 | 1499 | 1500 | Ch25 1501 | 1502 | 1503 | “人们挤进快车,”小王子说,“可是又不知道还要去寻找什么。所以他们忙忙碌碌,转来转去……” 1504 | 1505 | 他接着又说: 1506 | 1507 | “其实何必呢……” 1508 | 1509 | 我们找到的这口井,跟撒哈拉沙漠的那些井不一样。那些井,只是沙漠上挖的洞而已。这口井很像村庄里的那种井。可这儿根本就没有村庄呀,我觉得自己在做梦。 1510 | 1511 | “真奇怪,”我对小王子说,“样样都是现成的:辘轳,水桶,吊绳……” 1512 | 1513 | 他笑了,拉住吊绳,让辘轳转起来。辘轳咕咕作响,就像一只吹不到风、沉睡已久的旧风标发出的声音。 1514 | 1515 | “你听见吗,”小王子说,“我们唤醒了这口井,它在唱歌呢……” 1516 | 1517 | 我不想让他多用力气: 1518 | 1519 | “让我来吧,”我说,“这活儿对你来说太重了。”我把水桶缓缓地吊到井栏上,稳稳地搁住。辘轳的歌声还在耳边响着,而在依然晃动着的水面上,我瞧见太阳在晃动。 1520 | 1521 | “我想喝水,”小王子说,“给我喝吧……” 1522 | 1523 | 我这时明白了他在寻找的是什么! 
1524 | 1525 | 我把水桶举到他的嘴边。他喝着水,眼睛没张开。水像节日一般美好。它已经不只是一种维持生命的物质。它来自星光下的跋涉,来自辘轳的歌唱,来自臂膀的用力。它像礼物一样愉悦着心灵。当我是个小男孩时,圣诞树的灯光,午夜弥撒的音乐,人们甜蜜的微笑,都曾像这样辉映着我收到的圣诞礼物,让它熠熠发光。 1526 | 1527 | “你这儿的人,”小王子说,“在一座花园里种出五千朵玫瑰,却没能从中找到自己要找的东西……” 1528 | 1529 | “他们是没能找到……”我应声说。 1530 | 1531 | “然而他们要找的东西,在一朵玫瑰或者一点儿水里就能找到……” 1532 | 1533 | “可不是,”我应声说。 1534 | 1535 | 小王子接着说: 1536 | 1537 | “但是用眼是看不见的。得用心去找。” 1538 | 1539 | 我喝了水。我痛快地呼吸着空气。沙漠在晨曦中泛出蜂蜜的色泽。这种蜂蜜的色泽,也使我心头洋溢着幸福的感觉。我为什么要难过呢…… 1540 | 1541 | “你该实践自己的诺言了,”小王子柔声对我说,他这会儿又坐在了我的身边。 1542 | 1543 | “什么诺言?” 1544 | 1545 | “你知道的……给我的羊画个嘴罩……我要对我的花儿负责!” 1546 | 1547 | 我从衣袋里掏出几幅画稿。小王子瞥了一眼,笑着说: 1548 | 1549 | “你的猴面包树呀,有点像白菜……” 1550 | 1551 | “哦!” 1552 | 1553 | 可我还为这几棵猴面包树感到挺得意哩! 1554 | 1555 | “你的狐狸……它的耳朵……有点像两只角……再说也太长了!” 1556 | 1557 | 说着他又笑了起来。 1558 | 1559 | “你不公平,小家伙,我可就画过剖开和不剖开的蟒蛇,别的都没学过。” 1560 | 1561 | “噢!这就行了,”他说,“孩子们会看懂的。” 1562 | 1563 | 我用铅笔画了一只嘴罩。把画递给他时,我的心揪紧了: 1564 | 1565 | “你有些什么打算,我都不知道……” 1566 | 1567 | 但他没回答,却对我说: 1568 | 1569 | “你知道,我降落到地球上……到明天就满一年了……” 1570 | 1571 | 然后,一阵静默过后,他又说道: 1572 | 1573 | “我就落在这儿附近……” 1574 | 1575 | 说着他的脸红了起来。 1576 | 1577 | 我也不知是什么原因,只觉得又感到一阵异样的忧伤。可是我想到了一个问题: 1578 | 1579 | “这么说,一星期前我遇见你的那个早晨,你独自在这片荒无人烟的沙漠里走来,并不是偶然的了?你是要回到当初降落的地方来吧?” 1580 | 1581 | 小王子的脸又红了。 1582 | 1583 | 我有些犹豫地接着说: 1584 | 1585 | “也许,是为了周年纪念?……” 1586 | 1587 | 小王子脸又红了。他往往不回答人家的问题,但他脸一红,就等于在说“对的”,可不是吗? 1588 | 1589 | “哎!”我对他说,“我怕……” 1590 | 1591 | 他却回答我说: 1592 | 1593 | “现在你该去工作了。你得回到你的飞机那儿去。我在这儿等你。明天晚上再来吧……” 1594 | 1595 | 可是我放心不下。我想起了狐狸的话。一个人要是被驯养过,恐怕难免要哭的…… 1596 | 1597 | 1598 | 1599 | 1600 | 1601 | 1602 | 1603 | Ch26 1604 | 1605 | 1606 | 在水井边上,有一堵残败的旧石墙。第二天傍晚,我干完活儿回来,远远地看见小王子两腿悬空地坐在断墙上。我还听见他在说话: 1607 | 1608 | “难道你不记得了?”他说。“根本不是这儿!” 1609 | 1610 | 想必有一个声音在回答他,只见他在反驳: 1611 | 1612 | “对!对!是今天,可不是这个地方……” 1613 | 1614 | 我往石墙走去。我既没看见人影,也没听见人声。但是小王子又在说: 1615 | 1616 | “……那当然。在沙地上,你会看到我的足迹从哪儿开始的。你只要等着我就行了。今天夜里我就去那儿。” 1617 | 1618 | 我离石墙只有二十米了,可还是什么也没看见。 1619 | 1620 | 停了一会儿,小王子又说道: 1621 | 1622 | “你的毒液管用吗?你有把握不会让我难受很久吗?” 1623 | 1624 | 我心头猛地揪紧,停下了脚步,可我还是什么也不明白。 1625 | 1626 | “现在,来吧,”小王子说,“……我要下来了!”这时,我低头朝墙脚看去,不由得吓了一跳!只见一条半分钟就能叫人致命的黄蛇,昂然竖起身子对着小王子。我一边伸手去掏手枪,一边撒腿往前奔去。可是,那条蛇听见我的声音,就像一条水柱骤然跌落下来,缓缓渗入沙地,不慌不忙地钻进石缝中去,发出轻微的金属声。 1627 | 1628 | 我赶到墙边,正好接住从墙上跳下的小王子,把这个脸色白得像雪的小家伙抱在怀里。 1629 | 1630 | “这是怎么回事!你居然跟蛇在谈话!” 1631 | 1632 | 我解开他一直戴着的金黄色围巾。我用水沾湿他的太阳穴,给他喝了点水。可此刻我不敢再问他什么。他神色凝重地望着我,用双臂搂住我的脖子。我感觉到他的心跳,就像被枪弹击中濒临死亡的小鸟的心跳。他对我说: 1633 | 1634 | 1635 | 1636 | “现在,来吧,”小王子说,“我要下来了!” 1637 | 1638 | “我很高兴,你找到了飞机上缺少的东西。你可以回家了……” 1639 | 1640 | “你怎么知道的?” 1641 | 1642 | 我正想告诉他,就在刚才,在眼看没有希望的情况下,我修好了飞机! 
1643 | 1644 | 他没回答我的问题,但接着说: 1645 | 1646 | “我也一样,今天,我要回家了……” 1647 | 1648 | 然后,忧郁地说: 1649 | 1650 | “那要远得多……难得多……” 1651 | 1652 | 我意识到发生了一件非同寻常的事情。我把他像小孩那样抱在怀里,只觉得他在笔直地滑入一个深渊,而我全然无法拉住他…… 1653 | 1654 | 他的目光很严肃,视线消失在很远很远的地方。 1655 | 1656 | “我有你的绵羊。我有绵羊的箱子。还有嘴罩……” 1657 | 1658 | 说着,他忧郁地微微一笑。 1659 | 1660 | 我等了很久。我感到他的身子渐渐暖了起来: 1661 | 1662 | “小家伙,你受惊了……” 1663 | 1664 | 他刚才受惊了,可不是!但他轻轻地笑了起来: 1665 | 1666 | “今天晚上我要受更大的惊……” 1667 | 1668 | 一种无法补救的感觉,再一次使我凉到了心里。想到从此就再也听不到他的笑声,我感到受不了。他的笑声对我来说,就像沙漠中的清泉。 1669 | 1670 | “小家伙,我还想听到你咯咯地笑……” 1671 | 1672 | 可是他对我说: 1673 | 1674 | “到今天夜里,就是一年了。我的星星就在我去年降落的地方顶上……” 1675 | 1676 | “小家伙,蛇啊,相约啊,星星啊,感情只是场恶梦吧……” 1677 | 1678 | 可是他不回答我的问题。他对我说: 1679 | 1680 | “重要的东西是看不见的……” 1681 | 1682 | “可不是……” 1683 | 1684 | “这就好比花儿一样。要是你喜欢一朵花儿,而她在一颗星星上,那你夜里看着天空,就会觉得很美。所有的星星都像开满了花儿。” 1685 | 1686 | “可不是……” 1687 | 1688 | “这就好比水一样。昨天你给我喝的水,有了那辘轳和吊绳,就像一首乐曲……你还记得吧……那水真好喝。” 1689 | 1690 | “可不是……” 1691 | 1692 | “夜里,你要抬头望着满天的星星。我那颗实在太小了,我都没法指给你看它在哪儿。这样倒也好。我的星星,对你来说就是满天星星中的一颗。所以,你会爱这满天的星星……所有的星星都会是你的朋友。我还要给你一件礼物……” 1693 | 1694 | 他又笑了起来。 1695 | 1696 | “呵!小家伙,小家伙,我喜欢听到这笑声!” 1697 | 1698 | “这正是我的礼物……就像那水……” 1699 | 1700 | “你想说什么?” 1701 | 1702 | “人们眼里的星星,并不是一样的。对旅行的人来说,星星是向导。对有些人来说,它们只不过是天空微弱的亮光。对另一些学者来说,它们就是要探讨的问题。对我那个商人来说,它们就是金子。但是所有这些星星都是静默的。而你,你的那些星星是谁也不曾见过的……” 1703 | 1704 | “你想说什么呢?” 1705 | 1706 | “当你在夜里望着天空时,既然我就在其中的一颗星星上面,既然我在其中一颗星星上笑着,那么对你来说,就好像满天的星星都在笑。只有你一个人,看见的是会笑的星星!” 1707 | 1708 | 说着他又笑了。 1709 | 1710 | “当你感到心情平静以后(每个人总会让自己的心情平静下来),你会因为认识了我而感到高兴。你会永远是我的朋友。你会想要跟我一起笑。有时候,你会心念一动,就打开窗子……你的朋友会惊奇地看到,你望着天空在笑。于是你会对他们说:‘是的,我看见这些星星就会笑!’他们会以为你疯了。我给你闹了个恶作剧……” 1711 | 1712 | 说着他又笑了。 1713 | 1714 | “这样一来,我给你的仿佛不是星星,而是些会笑的小铃铛……” 1715 | 1716 | 1717 | 1718 | 说着他又笑了。随后他变得很严肃: 1719 | 1720 | “今天夜里……你知道……你不要来。” 1721 | 1722 | “我决不离开你。” 1723 | 1724 | “我看上去会很痛苦……会有点像死去的样子。就是这么回事。你还是别看见的好,没这必要。” 1725 | 1726 | “我决不离开你。” 1727 | 1728 | 可是他担心起来。 1729 | 1730 | “我这么说……也是因为蛇的缘故。你可别让它咬着了……蛇,都是很坏的。它们无缘无故也会咬人……” 1731 | 1732 | “我决不离开你。” 1733 | 1734 | 不过,他想到了什么,又觉得放心了: 1735 | 1736 | “可也是,它们咬第二口时,已经没有毒液了……” 1737 | 1738 | 当天夜里,我没看见他起程。他悄没声儿地走了。我好不容易赶上他时,他仍然执著地快步往前走。他只是对我说: 1739 | 1740 | “啊!你来了……” 1741 | 1742 | 说完他就拉住我的手。可是他又感到不安起来: 1743 | 1744 | “你不该来的。你会难过的。我看上去会像死去一样,可那不是真的……” 1745 | 1746 | 我不作声。 1747 | 1748 | 1749 | 1750 | “你是明白的。路太远了。我没法带走这副躯壳。它太沉了。” 1751 | 1752 | 我不作声。 1753 | 1754 | “可这就像一棵老树脱下的树皮。脱下一层树皮,是用不着伤心的……” 1755 | 1756 | 我不作声。 1757 | 1758 | 他有点气馁。但他重又打起精神: 1759 | 1760 | “你知道,这样挺好。我也会望着满天星星的。每颗星星都会有一个生锈辘轳的水井。所有的星星都会倒水给我喝……” 1761 | 1762 | 我不作声。 1763 | 1764 | “这真是太有趣了!你有五亿个铃铛,我有五亿个水井……” 1765 | 1766 | 他也不作声了,因为他哭了…… 1767 | 1768 | “到了。让我独自跨出一步吧。” 1769 | 1770 | 说着他坐了下来,因为他害怕。 1771 | 1772 | 他又说: 1773 | 1774 | “你知道……我的花儿……我对她负有责任!她是那么柔弱!她是那么天真。她只有四根微不足道的刺,用来抵御整个世界……” 1775 | 1776 | 我也坐下,因为我没法再站着了。他说: 1777 | 1778 | “好了……没别的要说了……” 1779 | 1780 | 他稍微犹豫了一下,随即站了起来。他往前跨出了一步,而我却动弹不得。 1781 | 1782 | 只见他的脚踝边上闪过一道黄光。片刻间他一动不动。他没有叫喊。他像一棵树那样缓缓地倒下。由于是沙地,甚至都没有一点声响。 1783 | 1784 | 1785 | 1786 | 1787 | 1788 | 1789 | 1790 | 1791 | 1792 | Ch27 1793 | 1794 | 1795 | 现在,当然,已经过去六年了……我还从来没跟人讲过这个故事。同伴们看见我活着回来,都很高兴。我很忧伤,但我对他们说:“我累了……” 1796 | 1797 | 现在我的心情有点平静了。也就是说……还没有完全平静。而我知道,他已经回到了他的星球,因为那天天亮以后,我没发现他的躯体。他的躯体并不太沉……我喜欢在夜里倾听星星的声音。它们就像五亿个铃铛。 1798 | 1799 | 可是,我想到有件事出了意外。我给小王子画的嘴罩,忘了加上皮带!他没法把它系在绵羊嘴上了。于是我一直在想:“在他的星球上到底会发生什么事呢?说不定绵羊真的把花儿给吃了……” 1800 | 1801 | 有时我对自己说:“肯定不会!小王子每天夜里给花儿盖上玻璃罩,再说他也会仔细看好绵羊的……”于是我感到很幸福。满天的星星轻轻地笑着。 1802 | 1803 | 有时我对自己说:“万一有个疏忽,那就全完了!没准哪天晚上,他忘了盖玻璃罩,或者绵羊在夜里悄悄钻了出来……”于是满天的铃铛全都变成了泪珠!…… 1804 | 1805 | 这可是一个很大很大的秘密哟。对于也爱着小王子的你们,就像对于我一样,要是在我们不知道的哪个地方,有一只我们从没见过的绵羊,吃掉了或者没有吃掉一朵玫瑰,整个宇宙就会完全不一样…… 1806 | 1807 | 你们望着天空,想一想:绵羊到底有没有吃掉花儿?你们就会看到一切都变了样…… 1808 | 
1809 | 而没有一个大人懂得这有多重要呵! 1810 | 1811 | 1812 | 1813 | 对我来说,这是世界上最美丽、最伤感的景色。它跟前一页上画的是同一个景色,而我之所以再画一遍,是为了让你们看清这景色。就是在这儿,小王子在地球上出现,而后又消失。请仔细看看这景色,如果有一天你们到非洲沙漠去旅行,就肯定能认出它来。而要是你们有机会路过那儿,请千万别匆匆走过,请在那颗星星下面等上一会儿!如果这时有个孩子向你们走来,如果他在笑,如果他的头发是金黄色的,如果问他而他不回答,你们一定能猜到他是谁了。那么就请你们做件好事吧!请别让我再这么忧伤:赶快写信告诉我,他又回来了…… 1814 | 1815 | 1816 | -------------------------------------------------------------------------------- /data/random_mask/test.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/random_mask/test.npy -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_preface.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_preface.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_preface_display.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_preface_display.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_1.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_1.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_1_display.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_1_display.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_2.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_2.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_2_display.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_2_display.xlsx 
-------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_3.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_3.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_3_display.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_3_display.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_4.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_4.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_4_display.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_4_display.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_5.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_5.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_5_display.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_5_display.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_6.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_6.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_6_display.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_6_display.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_7.xlsx: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_7.xlsx -------------------------------------------------------------------------------- /data/segmented_novel/segmented_Chinense_novel_run_7_display.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data/segmented_novel/segmented_Chinense_novel_run_7_display.xlsx -------------------------------------------------------------------------------- /data_preprocessing_and_alignment/README.md: -------------------------------------------------------------------------------- 1 | # EEG Pre-processing Document and Alignment 2 | 3 | This document illustrates the pipeline of our EEG pre-processing and how to use our code to deal with the EEG data. Besides, an explanation of our dataset is provided for your reference. 4 | 5 | Additionally, we provide code to align EEG data, texts, and text embeddings. 6 | 7 | ## Data Pre-processing Pipeline 8 | Here, we pre-process our data to remove obvious artifacts while altering the raw signal as little as possible. An overview of our pre-processing pipeline is shown in the figure below. 9 | 10 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/processing_pipeline.png) 11 | 12 | Our processing procedure includes these steps: 13 | 14 | #### Data Segmentation 15 | 16 | We retain a short time period before and after the valid time range, locating the cutting position by referencing the EEG marks. Detailed information can be seen in the method `cut_eeg` in `preprocessing.py`. In our procedure, we set the retained time before the valid range to 10 s. 17 | 18 | #### Resample and Notch Filter 19 | 20 | Before we do ICA, we first follow some basic steps, including down-sampling the data and notch-filtering out the line frequency. In our setting, we set the resample frequency to 256 Hz and the line frequency to 50 Hz. 21 | 22 | #### Filtering 23 | 24 | We filter the data using a band-pass filter to remove artifacts. In our processing, we produce two filtered versions, with one pass band set to 0.5-80 Hz and the other set to 0.5-30 Hz. 25 | 26 | #### Bad Channel Interpolation and Bad Segment Mask 27 | 28 | Then we interpolate bad channels using methods implemented in the `MNE` package. We also mark segments that look bad with the label 'bad' for reference. This can be done in the GUI, which we will explain later. 29 | 30 | #### ICA 31 | 32 | We use ICA to remove ocular artifacts, cardiac artifacts, muscle movement artifacts, and other possible artifacts. In our own processing, we set the parameter `ica_n_components` to 20 to make sure we can find all possible artifacts. We use the `infomax` algorithm. You can change these parameters on your own. Details about how to change parameters will be explained in the Code part. 33 | 34 | #### Re-reference 35 | 36 | Lastly, we re-reference our data. In our implementation, we use the 'average' method. 37 |
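These steps map almost one-to-one onto `MNE` calls. Below is a minimal sketch of the pipeline, assuming a raw EGI recording; the file name, the bad-channel list, and the excluded component indices are placeholders, and the full implementation (data cutting, the interactive GUI stages, and BIDS export) lives in `preprocessing.py`.

```python
import mne
from mne.preprocessing import ICA

# Placeholder file name; any raw EGI .mff recording works here
raw = mne.io.read_raw_egi('subject_xx_eeg_01.mff', preload=True)

raw.resample(256)                    # down-sample to 256 Hz
raw.notch_filter(freqs=50)           # remove 50 Hz line noise
raw.filter(l_freq=0.5, h_freq=80)    # band-pass filter (the 0.5-80 Hz version)

raw.info['bads'] = ['E8']            # hypothetical bad channel marked in the GUI
raw.interpolate_bads()               # bad channel interpolation

ica = ICA(n_components=20, method='infomax')
ica.fit(raw)
ica.exclude = [0, 3]                 # example indices of artifact components
ica.apply(raw)

raw.set_eeg_reference('average')     # re-reference to the average of all channels
```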
38 | ## Code 39 | 40 | ### Environment Requirement 41 | 42 | We recommend Python 3.10, which is our own setting. 43 | 44 | The packages `MNE`, `mne-bids`, and `pybv` are required. You can get these three packages using the following commands: 45 | 46 | ``` 47 | conda install --channel=conda-forge mamba 48 | mamba create --override-channels --channel=conda-forge --name=mne mne 49 | ``` 50 | 51 | ``` 52 | pip install --upgrade mne-bids[full] 53 | ``` 54 | 55 | ``` 56 | pip install pybv 57 | ``` 58 | 59 | **Make sure that the full `mne` package is installed using the command above, otherwise you cannot use the GUI of the MNE methods correctly. `MNE` version>=1.0 is required to support the GUI qt-browser**. `pybv`==0.7.5 is recommended. For more information, you can turn to these pages: https://mne.tools/stable/install/manual_install.html#manual-install, https://mne.tools/mne-bids/stable/install.html, https://pybv.readthedocs.io/en/stable/. 60 | 61 | ### Code Usage 62 | 63 | In the code `preprocessing.py`, you can change the given parameters to pre-process the data with your own settings and **save the raw data and the pre-processed data in BIDS format**. 64 | 65 | This code will first cut the eeg data. A short time will be retained before the start of the valid EEG segment. You can assign this time using the parameter `remaining_time_at_beginning`. After cutting, the code will run the main pre-processing pipeline. In the whole pre-processing procedure, three GUI stages will appear. The first one shows all the ICA component sources. You can exclude the ones you want to drop by clicking on them. The second one shows the band-pass filtered data. In this stage, you can select bad channels by clicking the channels, as in the first stage. You can also mask possible bad segments of the data by annotating them with the label 'bad'. The last stage will show the data after re-referencing, which is the last step of the pre-processing. In this stage, you can inspect whether the pre-processed data meets your needs. 66 | 67 | The detailed information about the parameters is shown below; a usage example follows the table: 68 | 69 | | Parameter | Type | Explanation | 70 | | --------------------------- | ----- | ------------------------------------------------------------ | 71 | | eeg_path | str | data path of the unprocessed eeg | 72 | | sub_id | str | a string of the id of the subject. Pad 0 if the id has only one digit. | 73 | | ses | str | a string describing the session of the current data. It will be contained in the file name when saving the file. | 74 | | task | str | a string describing the task of the current data. It will be contained in the file name when saving the file. | 75 | | run | int | an integer standing for the run number of the data. | 76 | | raw_data_root | str | the path of your raw data, which is also the root of the whole dataset. | 77 | | filtered_data_root | str | the path of your filtered data. | 78 | | processed_data_root | str | the path of your pre-processed data. | 79 | | dataset_name | str | name of the dataset, which will be saved in the dataset_description.json. | 80 | | author | str | author of the dataset. | 81 | | line_freq | float | line frequency of the data. This is needed when saving the data into BIDS format. Default to be 50. | 82 | | start_chapter | str | a string which is the eeg mark of the first chapter in current eeg data. e.g. if your eeg starts with chapter 1, then the argument should be 'CH01'. | 83 | | low_pass_freq | float | the lower cutoff frequency of the band-pass filter (passed to MNE's `l_freq`) | 84 | | high_pass_freq | float | the upper cutoff frequency of the band-pass filter (passed to MNE's `h_freq`) | 85 | | resample_freq | float | the frequency to which the data are resampled | 86 | | remaining_time_at_beginning | float | the remaining time before the start of the valid eeg segment | 87 | | montage_name | str | the montage of the eeg | 88 | | ica_method | str | which ica_method you want to use. See mne tutorial for more information | 89 | | ica_n_components | int | how many ICA components you want to use. See mne tutorial for more information | 90 | | rereference | str | re-reference method you want to use | 91 |
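As a concrete usage sketch, a single run might be pre-processed from the command line as follows (the paths and IDs are placeholders; every flag corresponds to an argparse argument defined at the bottom of `preprocessing.py`):

```
python preprocessing.py --eeg_path subject_07_eeg_01.mff --sub_id 07 \
    --ses LittlePrince --task reading --run 1 \
    --raw_data_root test_dataset \
    --filtered_data_root test_dataset/derivatives/filtered \
    --processed_data_root test_dataset/derivatives/preproc \
    --start_chapter CH01 --low_pass_freq 0.5 --high_pass_freq 80 \
    --resample_freq 256 --ica_n_components 20 --rereference average
```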
92 | ## Dataset 93 | 94 | You can access our dataset via the Openneuro platform (https://openneuro.org/datasets/ds004952) or via the ChineseNeuro Symphony community (CHNNeuro) in the Science Data Bank (ScienceDB) platform (https://doi.org/10.57760/sciencedb.CHNNeuro.00007). 95 | 96 | Our data is formatted following the BIDS standard, as shown in the figure below. 97 | 98 | The detailed format of our data structure can be found in our paper at https://doi.org/10.1101/2024.02.08.579481. 99 | 100 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/structure_new.png) 101 |
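Since the data are in BIDS format, a released run can be loaded back with `mne-bids`. A small sketch, assuming the pre-processed derivatives have been downloaded to a local folder named `dataset/derivatives/preproc` (the subject, session, and run values are just examples):

```python
from mne_bids import BIDSPath, read_raw_bids

# Example entities; adjust to the run you downloaded
bids_path = BIDSPath(subject='07', session='LittlePrince', task='reading',
                     run=1, datatype='eeg', root='dataset/derivatives/preproc')
raw = read_raw_bids(bids_path)
print(raw.info)
```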
102 | ## Manual Processing Criteria 103 | 104 | Example Name: subject_04_eeg_01 105 | 106 | ICA Example Figure: 107 | 108 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/ica_topo.png) 109 | 110 | Components to Exclude: 111 | 112 | ICA001: This component has local maxima in the frontal area, which is a typical feature of eye blink artifacts. 113 | 114 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/ica_001.png) 115 | 116 | ICA006: This component may represent artifacts of eye movements or eye scanning, as it shows local maxima on the frontal and lateral aspects of the scalp. 117 | 118 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/ica_006.png) 119 | 120 | ICA010: This component may be related to eye movement and electrocardiographic artifacts, as it exhibits local maxima in the frontal region and near the ears. 121 | 122 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/ica_010.png) 123 | 124 | ICA007: This component may be temporally related to electrocardiographic artifacts, as it is characterized by prominent maxima near the ears. 125 | 126 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/ica_007.png) 127 | 128 | ICA015: This component may be temporally related to electrocardiographic artifacts, as it exhibits characteristic maxima at the edges of the scalp. 129 | 130 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/ica_015.png) 131 | 132 | ## Data Alignment 133 | 134 | Here we offer the script `align_eeg_with_sentence.py` to get the EEG and its aligned texts and text embeddings. You just need to assign the three parameters to the method: 135 | 136 | | Parameter | Type | Description | 137 | | ------------------- | ---- | ---------------------------------------------------- | 138 | | eeg_path | str | Path to your `.vhdr` EEG file | 139 | | novel_xlsx_path | str | Path to the corresponding run of the novel | 140 | | text_embedding_path | str | Path to the corresponding run of the text embeddings | 141 | 142 | This method will return three variables: `cut_eeg_data`, `texts` and `text_embeddings`. `cut_eeg_data` is a list that contains the EEG segments cut by the markers. `texts` contains the texts corresponding to the cut EEG segments, and `text_embeddings` contains the corresponding text embeddings generated by pretrained language models. 143 | 144 | **Notice: Due to special circumstances during the experimental process, subject-07 in the LangWangMeng session did not read the content of Chapter 18 as intended in the 18th run. Instead, as a substitute, the participant read the content of Chapter 19. Therefore, in this specific case, there is no direct correspondence between the EEG data in the 18th run and the 18th text embedding file.** 145 |
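For example, run 1 of the LittlePrince session could be aligned from the command line as follows (the file names below are the defaults from the script's argparse section; substitute your own paths):

```
python align_eeg_with_sentence.py \
    --eeg_path sub-07_ses-LittlePrince_task-reading_run-01_eeg.vhdr \
    --novel_xlsx_path segmented_Chinense_novel_run_1.xlsx \
    --text_embedding_path LittlePrince_text_embedding_run_1.npy
```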
-------------------------------------------------------------------------------- /data_preprocessing_and_alignment/__pycache__/utils.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/data_preprocessing_and_alignment/__pycache__/utils.cpython-310.pyc -------------------------------------------------------------------------------- /data_preprocessing_and_alignment/align_eeg_with_sentence.py: -------------------------------------------------------------------------------- 1 | import mne 2 | import numpy as np 3 | import openpyxl 4 | import csv 5 | import argparse 6 | from utils import read_eeg_brainvision 7 | 8 | 9 | 10 | 11 | def align_eeg_with_sentence(eeg_path, novel_xlsx_path, text_embedding_path, montage_name='GSN-HydroCel-128'): 12 | ''' 13 | 14 | :param eeg_path: path to the BrainVision (.vhdr) file of the EEG 15 | :param novel_xlsx_path: path to the corresponding run of the novel 16 | :param text_embedding_path: path to the corresponding run of the text embeddings 17 | :return: cut_eeg_data, texts, text_embeddings in alignment. cut_eeg_data is a list containing all 18 | the cut EEG segments divided by the markers. 19 | ''' 20 | 21 | eeg, events, event_id = read_eeg_brainvision(eeg_path, montage_name) 22 | 23 | text_embeddings = np.load(text_embedding_path) 24 | 25 | wb = openpyxl.load_workbook(novel_xlsx_path) 26 | wsheet = wb.active 27 | texts = [] 28 | 29 | 30 | 31 | for i in range(2, wsheet.max_row + 1): # openpyxl is 1-indexed; row 1 is the header 32 | texts.append((wsheet.cell(row=i, column=1)).value) 33 | 34 | start_chapter = int(texts[0]) # the first entry of the sheet stores the chapter number of this run 35 | 36 | 37 | if start_chapter < 10: 38 | start_marker = 'CH0' + str(start_chapter) 39 | else: 40 | start_marker = 'CH' + str(start_chapter) 41 | 42 | 43 | start_marker_id = event_id[start_marker] 44 | events_start_chapter_index = np.where(events[:, 2] == start_marker_id)[0][0] 45 | 46 | eeg_data = eeg.get_data() 47 | 48 | 49 | 50 | ROWS_id = event_id['ROWS'] # marker for the start of a displayed row of text 51 | ROWE_id = event_id['ROWE'] # marker for the end of a displayed row of text 52 | 53 | 54 | rows_onset = [] 55 | rowe_onset = [] 56 | 57 | for event in events[events_start_chapter_index:]: 58 | if event[2] == ROWS_id: 59 | rows_onset.append(event[0]) 60 | 61 | for event in events[events_start_chapter_index:]: 62 | if event[2] == ROWE_id: 63 | rowe_onset.append(event[0]) 64 | 65 | 66 | 67 | 68 | cut_eeg_data = [] 69 | 70 | for i in range(0, len(rows_onset)): 71 | 72 | start_time = rows_onset[i] 73 | end_time = rowe_onset[i] 74 | 75 | cut_eeg_data.append(eeg_data[:, start_time:end_time]) # one EEG segment per displayed row 76 | 77 | 78 | return cut_eeg_data, texts, text_embeddings 79 | 80 | 81 | 82 | parser = argparse.ArgumentParser(description='Parameters that can be changed') 83 | parser.add_argument('--eeg_path', type=str, default=r'sub-07_ses-LittlePrince_task-reading_run-01_eeg.vhdr') 84 | parser.add_argument('--novel_xlsx_path', type=str, default=r'segmented_Chinense_novel_run_1.xlsx') 85 | parser.add_argument('--text_embedding_path', type=str, default=r'LittlePrince_text_embedding_run_1.npy') 86 | 87 | 88 | args = parser.parse_args() 89 | 90 | 91 | 92 | cut_eeg_data, texts, text_embeddings = align_eeg_with_sentence(eeg_path=args.eeg_path, novel_xlsx_path=args.novel_xlsx_path, text_embedding_path=args.text_embedding_path) -------------------------------------------------------------------------------- /data_preprocessing_and_alignment/convert_eeg_to_bids.py: -------------------------------------------------------------------------------- 1 | 2 | # %% 3 | # We are importing everything we need for this example: 4 | import os.path as op 5 | import shutil 6 | import csv 7 | import os 8 | 9 | import mne 10 | import mne_bids 11 | 12 | from mne_bids import write_raw_bids, BIDSPath, print_dir_tree 13 | from mne_bids.stats import count_events 14 | import json 15 | import numpy as np 16 | import matplotlib.pyplot as plt 17 | 18 | 19 | def convert_to_bids(raw, ica_component=None, ica_topo_figs=None, ica_dict=None, bad_channel_dict=None, sub_id='06', ses='LittlePrince', 20 | task='Reading', run=1, bids_root='derivative/preproc', 21 | dataset_name='Novel Reading', dataset_type='derivative', 22 | author='Xinyu Mou, Cuilin He, Liwei Tan', line_freq=50): 23 | ''' 24 | :param raw: the pre-processed raw data which you want to save into BIDS format 25 | :param ica_component: The ICA components when you pre-process the data 26 | :param ica_topo_figs: Topography of ICA components 27 | :param ica_dict: This is a dict containing information of the ICA components. It should 28 | satisfy the following format: {'shape': np.ndarray, 'exclude': list} 29 | The first value represents the shape of the ICA components: (n_components, time_length) 30 | The second value represents the excluded ICA components when pre-processing the data 31 | :param
bad_channel_dict: a dict containing information of the marked bad channel when pre-processing 32 | the data. It should satisfy the following format: {'bad channel': list} 33 | :param sub_id: The id of the subject you are processing 34 | :param ses: The session of the current data 35 | :param run: the run of the current data 36 | :param bids_root: The root of your dataset 37 | :param dataset_name: The name of the dataset, which will be saved in the dataset_description.json 38 | :param dataset_type: The type of the dataset, can be 'raw' or 'derivative', which will be saved in the dataset_description.json 39 | :param author: The author of the dataset, which will be saved in the dataset_description.json 40 | :param line_freq: line frequency of your raw data, normally 50 in China 41 | 42 | This function convert a raw eeg to BIDS format with some sidecar files. The EEG data will be 43 | saved following BrainVision format to satisfy the requirement of BIDS. Make sure that information in the 44 | data will not be lost when converting the data format. 45 | 46 | The parameter ica_component, ica_dict, bad_channel_dict are optional. 47 | ''' 48 | raw.info["line_freq"] = line_freq # specify power line frequency as required by BIDS 49 | 50 | # save files in need, including information about coordinate system, JSON of the eeg, channel information, ... 51 | bids_path = BIDSPath(subject=sub_id, session=ses, task=task, run=run, root=bids_root) 52 | basename = str(bids_path.basename) 53 | 54 | events, event_id = mne.events_from_annotations(raw) 55 | write_raw_bids(raw, bids_path, format='BrainVision', allow_preload=True, overwrite=True) 56 | 57 | fpath = bids_path.fpath 58 | eeg_file_directory = str(fpath.parent) 59 | 60 | ica_component_path = eeg_file_directory + '/' + basename + '_ica_components.npy' 61 | ica_topo_path = eeg_file_directory + '/' + basename + '_ica_components_topography' 62 | ica_json_path = eeg_file_directory + '/' + basename + '_ica_components.json' 63 | bad_channel_path = eeg_file_directory + '/' + basename + '_bad_channels.json' 64 | 65 | 66 | # save ICA components 67 | if ica_component is not None: 68 | np.save(ica_component_path, ica_component) 69 | 70 | # save ICA component JSON file 71 | if ica_dict is not None: 72 | ica_component_json = json.dumps(ica_dict, sort_keys=False, indent=4, separators=(',', ': ')) 73 | f = open(ica_json_path, 'w') 74 | f.write(ica_component_json) 75 | 76 | # save bad channel record JSON file 77 | if bad_channel_dict is not None: 78 | bad_channel_json = json.dumps(bad_channel_dict, sort_keys=False, indent=4, separators=(',', ': ')) 79 | f = open(bad_channel_path, 'w') 80 | f.write(bad_channel_json) 81 | 82 | if ica_topo_figs is not None: 83 | fig_num = 1 84 | if isinstance(ica_topo_figs, list): 85 | for topo in ica_topo_figs: 86 | if fig_num < 10: 87 | fig_num_str = str(0) + str(fig_num) 88 | else: 89 | fig_num_str = str(fig_num) 90 | topo.savefig(ica_topo_path + '_' + fig_num_str + '.png') 91 | fig_num += 1 92 | else: 93 | ica_topo_figs.savefig(ica_topo_path + '.png') 94 | 95 | 96 | 97 | print_dir_tree(bids_root) 98 | 99 | 100 | readme = op.join(bids_root, "README") 101 | with open(readme, "r", encoding="utf-8-sig") as fid: 102 | text = fid.read() 103 | print(text) 104 | 105 | mne_bids.make_dataset_description(path=bids_root, name=dataset_name, dataset_type=dataset_type, 106 | authors=author, overwrite=True) 107 | 108 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | 117 | -------------------------------------------------------------------------------- 
/data_preprocessing_and_alignment/forward.py: -------------------------------------------------------------------------------- 1 | import os.path as op 2 | import numpy as np 3 | import mne 4 | from mne.datasets import eegbci, fetch_fsaverage 5 | import csv 6 | from utils import read_eeg_brainvision 7 | import argparse 8 | 9 | 10 | def forward_solution(eeg_path, save_fwd_path): 11 | ##################### Download template MRI data ##################### 12 | # Download fsaverage files 13 | fs_dir = fetch_fsaverage(verbose=True) 14 | 15 | # The files live in: 16 | subject = "fsaverage" 17 | trans = "fsaverage" # MNE has a built-in fsaverage transformation 18 | src = op.join(fs_dir, "bem", "fsaverage-ico-5-src.fif") 19 | bem = op.join(fs_dir, "bem", "fsaverage-5120-5120-5120-bem-sol.fif") 20 | 21 | ##################### Load EEG data ##################### 22 | raw, _, _ = read_eeg_brainvision(eeg_path) 23 | raw.set_eeg_reference(projection=True) 24 | 25 | 26 | 27 | fwd = mne.make_forward_solution( 28 | raw.info, trans=trans, src=src, bem=bem, eeg=True, mindist=5.0, n_jobs=None 29 | ) 30 | print(fwd) 31 | 32 | mne.write_forward_solution(save_fwd_path, fwd) 33 | 34 | return fwd 35 | 36 | 37 | 38 | parser = argparse.ArgumentParser(description='Parameters that can be changed in this experiment') 39 | parser.add_argument('--eeg_path', type=str, default=r'example_eeg.fif') 40 | parser.add_argument('--save_fwd_path', type=str, default=r'example_eeg-fwd.fif') 41 | 42 | args = parser.parse_args() 43 | 44 | fwd = forward_solution(eeg_path=args.eeg_path, 45 | save_fwd_path=args.save_fwd_path) 46 | 47 | 48 | 49 | 50 | 51 | -------------------------------------------------------------------------------- /data_preprocessing_and_alignment/inverse.py: -------------------------------------------------------------------------------- 1 | import matplotlib.pyplot as plt 2 | import numpy as np 3 | import os.path as op 4 | 5 | import mne 6 | from mne.datasets import sample 7 | from mne.minimum_norm import make_inverse_operator, apply_inverse_raw 8 | from mne.minimum_norm import write_inverse_operator 9 | from mne.datasets import eegbci, fetch_fsaverage 10 | from utils import read_eeg_brainvision 11 | import argparse 12 | 13 | 14 | 15 | def inverse_solution(eeg_path, fname_fwd, inverse_method="dSPM", snr=3.0, hemi="lh", save_inverse_operator_path="eeg-inv.fif", clim=[2, 3, 5], inverse_operator_loose=0.2, inverse_operator_depth=0.8): 16 | ''' 17 | 18 | :param eeg_path: 19 | :param fname_fwd: -fwd.fif 20 | :param save_inverse_operator_path: end with -inv.fif 21 | :param hemi: "lh" or "rh", refering to left hemisphere or right hemisphere 22 | :param clim: [low, middle, high] for the colorbar 23 | :return: 24 | ''' 25 | raw, _, _ = read_eeg_brainvision(eeg_path) 26 | raw.set_eeg_reference(projection=True) 27 | 28 | noise_cov = mne.compute_raw_covariance(raw, method='empirical', method_params=None, 29 | rank=None, verbose=None) 30 | 31 | 32 | fwd = mne.read_forward_solution(fname_fwd) 33 | 34 | 35 | inverse_operator = make_inverse_operator( 36 | raw.info, fwd, noise_cov, loose=inverse_operator_loose, depth=inverse_operator_depth 37 | ) 38 | del fwd 39 | 40 | 41 | 42 | write_inverse_operator( 43 | save_inverse_operator_path, 44 | inverse_operator, overwrite=True 45 | ) 46 | 47 | 48 | method = inverse_method 49 | snr = snr 50 | lambda2 = 1.0 / snr**2 51 | 52 | 53 | stc = apply_inverse_raw( 54 | raw, inverse_operator, lambda2, method, pick_ori=None 55 | ) 56 | 57 | 58 | 59 | vertno_max_lh, time_max_lh = 
stc.get_peak(hemi='lh') 60 | vertno_max_rh, time_max_rh = stc.get_peak(hemi='rh') 61 | 62 | fs_dir = fetch_fsaverage(verbose=True) 63 | subjects_dir = op.dirname(fs_dir) 64 | 65 | 66 | if hemi == "lh": 67 | vertno_max, peak_time = vertno_max_lh, time_max_lh # peak vertex id and peak time of the chosen hemisphere 68 | else: 69 | vertno_max, peak_time = vertno_max_rh, time_max_rh 70 | 71 | surfer_kwargs = dict( 72 | hemi=hemi, 73 | subjects_dir=subjects_dir, 74 | clim=dict(kind="value", lims=clim), 75 | background=(1, 1, 1), # background color: (1, 1, 1) --> white 76 | initial_time=peak_time, # open the viewer at the peak time (in seconds), not at the vertex id 77 | time_unit="s", 78 | size=(800, 800), 79 | smoothing_steps=10, 80 | surface='white', # type of surface 81 | views='lateral', 82 | colormap='mne' 83 | ) 84 | 85 | 86 | 87 | brain = stc.plot(**surfer_kwargs) 88 | brain.add_foci( 89 | vertno_max, # vertex id of the peak; coords_as_verts=True expects vertex ids 90 | coords_as_verts=True, 91 | hemi=hemi, 92 | color="blue", 93 | scale_factor=0.6, 94 | alpha=0.5, 95 | ) 96 | brain.add_text( 97 | 0.1, 0.9, inverse_method + " (plus location of maximal activation)", "title", font_size=14 98 | ) 99 | 100 | 101 | 102 | peak_stc_lh = stc.lh_data[np.searchsorted(stc.lh_vertno, vertno_max_lh), :] # map vertex id to row index 103 | 104 | peak_stc_rh = stc.rh_data[np.searchsorted(stc.rh_vertno, vertno_max_rh), :] 105 | 106 | data = stc.data[:10, :].T 107 | fig, ax = plt.subplots() 108 | colors = ['b', 'g', 'r', 'c', 'm', 'y', 'k', 'orange', 'purple', 'brown'] 109 | 110 | for i in range(10): 111 | ax.plot(stc.times, data[:, i] + i * 3, color=colors[i]) # shift each source time course for visualization 112 | 113 | ax.plot(stc.times, peak_stc_lh + 30, color='cornflowerblue', label='lh') 114 | ax.plot(stc.times, peak_stc_rh + 33, color='firebrick', label='rh') 115 | ax.set(xlabel="time (s)", ylabel="%s value" % method) 116 | ax.set_xticklabels([]) 117 | ax.set_yticklabels([]) 118 | plt.show() 119 | 120 | 121 | 122 | parser = argparse.ArgumentParser(description='Parameters that can be changed in this experiment') 123 | parser.add_argument('--eeg_path', type=str, default=r'example_eeg.fif') 124 | parser.add_argument('--fname_fwd', type=str, default=r'example_eeg-fwd.fif') 125 | parser.add_argument('--inverse_method', type=str, default=r'dSPM') 126 | parser.add_argument('--snr', type=float, default=3.0) 127 | parser.add_argument('--hemi', type=str, default=r'lh') 128 | parser.add_argument('--save_inverse_operator_path', type=str, default=r'example_eeg-inv.fif') 129 | parser.add_argument('--clim', type=float, nargs=3, default=[2, 3, 5]) # three floats for the colorbar: low, middle, high 130 | parser.add_argument('--inverse_operator_loose', type=float, default=0.2) 131 | parser.add_argument('--inverse_operator_depth', type=float, default=0.8) 132 | 133 | 134 | args = parser.parse_args() 135 | 136 | 137 | inverse_solution(args.eeg_path, args.fname_fwd, inverse_method=args.inverse_method, snr=args.snr, 138 | hemi=args.hemi, 139 | save_inverse_operator_path=args.save_inverse_operator_path, 140 | clim=args.clim, inverse_operator_loose=args.inverse_operator_loose, 141 | inverse_operator_depth=args.inverse_operator_depth) 142 | 143 | -------------------------------------------------------------------------------- /data_preprocessing_and_alignment/preprocessing.py: -------------------------------------------------------------------------------- 1 | import mne 2 | from mne.preprocessing import ICA 3 | import numpy as np 4 | import argparse 5 | from convert_eeg_to_bids import convert_to_bids 6 | 7 | 8 | def cut_eeg(raw, start_chapter='CH01', remaining_time_at_beginning=10): 9 | 10 | annotations_onset = raw.annotations.onset 11 | annotations_description = raw.annotations.description 12 | rows_indexes = np.where(annotations_description == start_chapter) 13 | start_index = rows_indexes[0][0] 14 |
start_time = annotations_onset[start_index] - remaining_time_at_beginning 15 | 16 | raw.crop(tmin=start_time) 17 | 18 | return raw, start_time 19 | 20 | 21 | def read_mff_file(eeg_path, montage_name='GSN-HydroCel-128', preload=False): 22 | ''' 23 | Read .mff file, set annotations and pick 128 channels, set montage 24 | ''' 25 | raw = mne.io.read_raw_egi(eeg_path) 26 | events = mne.find_events(raw, ) 27 | event_id = raw.event_id 28 | event_desc = {value: key for key, value in event_id.items()} 29 | 30 | annotations = mne.annotations_from_events(events, sfreq=raw.info['sfreq'], event_desc=event_desc) 31 | 32 | raw.set_annotations(annotations) 33 | 34 | raw.pick_types(eeg=True) 35 | raw.drop_channels(['VREF']) 36 | 37 | montage = mne.channels.make_standard_montage(montage_name) 38 | raw.set_montage(montage) 39 | 40 | if preload == True: 41 | raw = raw.load_data() 42 | 43 | return raw 44 | 45 | def create_new_raw(raw, crop_time_at_beginning, montage_name='GSN-HydroCel-128', preload=False): 46 | ''' 47 | create a raw object which can be saved as BrainVision format without losing annotation information 48 | 49 | :param crop_time_at_beginning: the time of the crop point 50 | ''' 51 | 52 | # get raw data, annotations 53 | data = raw.get_data() 54 | annotations = raw.annotations 55 | onset = annotations.onset 56 | # create new annotations 57 | new_onset = onset - crop_time_at_beginning 58 | new_annotations = mne.Annotations(onset=new_onset, duration=annotations.duration, description=annotations.description) 59 | # create new raw 60 | new_raw = mne.io.RawArray(data, raw.info) 61 | new_raw.set_annotations(new_annotations) 62 | 63 | montage = mne.channels.make_standard_montage(montage_name) 64 | new_raw.set_montage(montage) 65 | 66 | if preload == True: 67 | new_raw = new_raw.load_data() 68 | 69 | return new_raw 70 | 71 | 72 | 73 | def process_single_eeg(eeg_path=None, sub_id='06', ses='LittlePrince', 74 | task='LittlePrince', run=1, raw_data_root='dataset', 75 | filtered_data_root='dataset/derivatives/filtered', 76 | processed_data_root='dataset/derivatives/preproc', 77 | dataset_name='Novel Reading', 78 | author='Xinyu Mou, Cuilin He, Liwei Tan', line_freq=50, 79 | start_chapter=None, low_pass_freq=0.5, 80 | high_pass_freq=80, resample_freq=256, 81 | remaining_time_at_beginning=10, bad_channels=[], montage_name='GSN-HydroCel-128', 82 | ica_method='infomax', ica_n_components=15, rereference='average'): 83 | ''' 84 | :param eeg_path: data path of the unprocessed eeg. 85 | :param sub_id: a string of the id of the subject. Pad 0 if the id has only one digit. 86 | :param ses: a string describing the session of the current data. It will be contained in the file name when saving 87 | the file. 88 | :param task: a string describing the task of the current data. It will be contained in the file name when saving 89 | the file. 90 | :param run: an integer standing for the run number of the data. 91 | :param raw_data_root: the path of your raw data, which is also the root of the whole dataset. 92 | :param filtered_data_root: the path of your filtered data. 93 | :param processed_data_root: the path of your pre-processed data. 94 | :param dataset_name: name of the dataset, which will be saved in the dataset_description.json. 95 | :param author: author of the dataset. 96 | :param line_freq: line frequency of the data. This is needed when saving the data into BIDS format. 97 | Default to be 50. 98 | :param start_chapter: a string which is the eeg mark of the first chapter in current eeg data 99 | e.g. 
if your eeg starts with chapter 1, then the argument should be 'CH01'. 100 | :param low_pass_freq: the lower cutoff frequency of the band-pass filter (passed to `l_freq`). 101 | :param high_pass_freq: the upper cutoff frequency of the band-pass filter (passed to `h_freq`). 102 | :param resample_freq: the frequency to which the data are resampled. 103 | :param remaining_time_at_beginning: the remaining time before the start of the valid eeg segment. 104 | :param bad_channels: bad channels which should be interpolated at beginning. 105 | :param montage_name: the montage of the eeg. 106 | :param ica_method: which ica_method you want to use. See mne tutorial for more information. 107 | :param ica_n_components: how many ICA components you want to use. See mne tutorial for more information. 108 | :param rereference: re-reference method you want to use. 109 | ''' 110 | 111 | raw = read_mff_file(eeg_path=eeg_path, montage_name=montage_name, preload=True) 112 | 113 | convert_to_bids(raw, ica_component=None, ica_topo_figs=None, ica_dict=None, bad_channel_dict=None, sub_id=sub_id, ses=ses, 114 | task=task, run=run, bids_root=raw_data_root, dataset_name=dataset_name, 115 | dataset_type='raw', author=author, line_freq=line_freq) 116 | 117 | #raw = read_mff_file(eeg_path=eeg_path, montage_name=montage_name, preload=True) 118 | 119 | raw.info["bads"].extend(bad_channels) 120 | raw = raw.interpolate_bads() 121 | 122 | # cut data 123 | 124 | if start_chapter is not None: 125 | raw, crop_start_time = cut_eeg(raw=raw, start_chapter=start_chapter, remaining_time_at_beginning=remaining_time_at_beginning) 126 | else: 127 | crop_start_time = 0 128 | print('-------------------- raw cut --------------------') 129 | 130 | 131 | 132 | # downsample 133 | raw.resample(resample_freq) 134 | 135 | print('-------------------- raw resampled --------------------') 136 | 137 | 138 | # notch filter at the power line frequency 139 | raw = raw.notch_filter(freqs=line_freq) 140 | 141 | print('-------------------- notch filter finished --------------------') 142 | 143 | # band pass filter 144 | raw = raw.filter(l_freq=low_pass_freq, h_freq=high_pass_freq) 145 | 146 | filt_raw = create_new_raw(raw=raw, crop_time_at_beginning=crop_start_time, montage_name=montage_name, 147 | preload=False) 148 | 149 | convert_to_bids(filt_raw, ica_component=None, ica_topo_figs=None, ica_dict=None, 150 | bad_channel_dict=None, sub_id=sub_id, ses=ses, 151 | task=task, run=run, bids_root=filtered_data_root, 152 | dataset_name=dataset_name, dataset_type='derivative', 153 | author=author, line_freq=line_freq) 154 | 155 | 156 | print('-------------------- filter finished --------------------') 157 | 158 | # mark bad electrodes and bad part of the data 159 | raw.plot(block=True) 160 | 161 | bad_channels = raw.info['bads'] 162 | print('bad_channels: ', bad_channels) 163 | 164 | bad_channel_dict = {'bad channels': bad_channels} 165 | 166 | # bad channel interpolation 167 | raw = raw.interpolate_bads() 168 | 169 | 170 | print('-------------------- bad channels interpolated --------------------') 171 | 172 | # ICA 173 | ica = ICA(n_components=ica_n_components, max_iter='auto', method=ica_method, random_state=97) 174 | # raw_for_ica = raw.copy().filter(l_freq=1, h_freq=None) 175 | ica.fit(raw, reject_by_annotation=True) 176 | 177 | ica_components = ica.get_sources(raw) 178 | ica_components = ica_components.get_data() 179 | 180 | 181 | ica.plot_sources(raw, show_scrollbars=False, block=True) 182 | 183 | ica_topo_figs = ica.plot_components() 184 | 185 | 186 | 187 | print('exclude ICA components: ', ica.exclude) 188 | 189 | ica_dict = {'shape': ica_components.shape,
'exclude': ica.exclude} 190 | 191 | ica.apply(raw) 192 | 193 | print('-------------------- ICA finished --------------------') 194 | 195 | # re-reference 196 | raw.set_eeg_reference(ref_channels=rereference) 197 | 198 | print('-------------------- rereference finished --------------------') 199 | 200 | 201 | # plot final data 202 | raw.plot(block=True) 203 | 204 | preproc_raw = create_new_raw(raw=raw, crop_time_at_beginning=crop_start_time, montage_name=montage_name, preload=False) 205 | 206 | convert_to_bids(preproc_raw, ica_component=ica_components, ica_topo_figs=ica_topo_figs, ica_dict=ica_dict, bad_channel_dict=bad_channel_dict, sub_id=sub_id, ses=ses, 207 | task=task, run=run, bids_root=processed_data_root, 208 | dataset_name=dataset_name, dataset_type='derivative', 209 | author=author, line_freq=line_freq) 210 | 211 | 212 | parser = argparse.ArgumentParser(description='Parameters that can be changed in this experiment') 213 | parser.add_argument('--eeg_path', type=str, default='subject_07_eeg_01.mff') 214 | parser.add_argument('--sub_id', type=str, default='07') 215 | parser.add_argument('--ses', type=str, default='LittlePrince') 216 | parser.add_argument('--task', type=str, default='reading') 217 | parser.add_argument('--run', type=int, default=1) 218 | parser.add_argument('--raw_data_root', type=str, default='test_dataset') 219 | parser.add_argument('--filtered_data_root', type=str, default='test_dataset/derivatives/filtered') 220 | parser.add_argument('--processed_data_root', type=str, default='test_dataset/derivatives/preproc') 221 | parser.add_argument('--dataset_name', type=str, default='Novel Reading') 222 | parser.add_argument('--author', type=str, default='Xinyu Mou, Cuilin He, Liwei Tan') 223 | parser.add_argument('--line_freq', type=float, default=50) 224 | parser.add_argument('--start_chapter', type=str, default='CH01') 225 | parser.add_argument('--low_pass_freq', type=float, default=0.5) 226 | parser.add_argument('--high_pass_freq', type=float, default=80) 227 | parser.add_argument('--resample_freq', type=float, default=256) 228 | parser.add_argument('--remaining_time_at_beginning', type=float, default=10) 229 | parser.add_argument('--bad_channels', nargs='+', type=str, default=[]) # pass as: --bad_channels E1 E2 ->
['E1', 'E2'] 230 | parser.add_argument('--montage_name', type=str, default='GSN-HydroCel-128') 231 | parser.add_argument('--ica_method', type=str, default='infomax') 232 | parser.add_argument('--ica_n_components', type=int, default=20) 233 | parser.add_argument('--rereference', type=str, default='average') 234 | 235 | args = parser.parse_args() 236 | 237 | process_single_eeg(eeg_path=args.eeg_path, sub_id=args.sub_id, ses=args.ses, 238 | task=args.task, run=args.run, raw_data_root=args.raw_data_root, 239 | filtered_data_root=args.filtered_data_root, 240 | processed_data_root=args.processed_data_root, 241 | dataset_name=args.dataset_name, 242 | author=args.author, line_freq=args.line_freq, 243 | start_chapter=args.start_chapter, low_pass_freq=args.low_pass_freq, 244 | high_pass_freq=args.high_pass_freq, resample_freq=args.resample_freq, 245 | bad_channels=args.bad_channels, 246 | remaining_time_at_beginning=args.remaining_time_at_beginning, montage_name=args.montage_name, 247 | ica_method=args.ica_method, ica_n_components=args.ica_n_components, rereference=args.rereference) 248 | 249 | -------------------------------------------------------------------------------- /data_preprocessing_and_alignment/utils.py: -------------------------------------------------------------------------------- 1 | import mne 2 | import numpy as np 3 | import openpyxl 4 | import csv 5 | import argparse 6 | 7 | 8 | def read_eeg_brainvision(eeg_path, montage_name='GSN-HydroCel-128'): 9 | eeg = mne.io.read_raw_brainvision(eeg_path) 10 | events = [] 11 | event_id = {} 12 | 13 | events_path = eeg_path.replace('eeg.vhdr', 'events.tsv') 14 | 15 | with open(events_path) as events_file: 16 | csv_reader = csv.reader(events_file, delimiter='\t') # use csv.reader to read file 17 | header = next(csv_reader) # read the titles in the first row 18 | for row in csv_reader: # save data 19 | events.append([int(row[4]), 0, int(row[3])]) # select a column and add to data 20 | if row[2] not in event_id.keys(): 21 | event_id[row[2]] = int(row[3]) 22 | 23 | events = np.array(events) 24 | 25 | annotations = mne.annotations_from_events(events, eeg.info['sfreq'], event_id) 26 | eeg.set_annotations(annotations) 27 | 28 | montage = mne.channels.make_standard_montage(montage_name) 29 | eeg.set_montage(montage) 30 | 31 | 32 | return eeg, events, event_id -------------------------------------------------------------------------------- /docker/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM ubuntu:22.04 2 | 3 | # Configures operative system dependencies 4 | ENV LANG C.UTF-8 5 | ENV DEBIAN_FRONTEND noninteractive 6 | ENV DEBCONF_NONINTERACTIVE_SEEN true 7 | 8 | # Install dependencies 9 | RUN apt-get update -qq 10 | RUN echo 'Installing OS dependencies' 11 | RUN apt-get install -qq -y --fix-missing sudo 12 | RUN apt-get install -qq -y --fix-missing software-properties-common 13 | RUN apt-get install -qq -y --fix-missing git 14 | RUN apt-get install -qq -y --fix-missing unzip 15 | RUN apt-get install -qq -y --fix-missing wget 16 | RUN apt-get install -qq -y --fix-missing vim 17 | RUN apt-get install -qq -y --fix-missing curl 18 | RUN apt-get install -qq -y --fix-missing libxext-dev libxrender-dev libxslt1.1 libxtst-dev 19 | RUN apt-get install -qq -y --fix-missing libgtk2.0-0 libcanberra-gtk-module 20 | 21 | RUN echo 'Cleaning up' 22 | RUN apt-get clean -qq -y 23 | RUN apt-get autoclean -qq -y 24 | RUN apt-get autoremove -qq -y 25 | RUN rm -rf /var/lib/apt/lists/* 26 | RUN rm -rf /tmp/* 27 | 28 | 
29 | # Install python3.10 30 | RUN add-apt-repository ppa:deadsnakes/ppa 31 | RUN apt-get install python3.10 -y 32 | RUN ln -s /usr/bin/python3.10 /usr/bin/python 33 | RUN wget https://bootstrap.pypa.io/get-pip.py 34 | RUN python get-pip.py 35 | RUN rm get-pip.py 36 | RUN apt-get install -qq -y --fix-missing python3.10-venv 37 | RUN apt-get install -qq -y --fix-missing python3.10-tk 38 | 39 | 40 | # Install pycharm 41 | ENV PYCHARM /opt/pycharm 42 | RUN mkdir $PYCHARM 43 | RUN wget https://download-cdn.jetbrains.com/python/pycharm-community-2023.2.5.tar.gz -O - | tar xzv --strip-components=1 -C $PYCHARM 44 | ENV PATH $PYCHARM/bin:$PATH 45 | 46 | ENV DISPLAY host.docker.internal:0 47 | 48 | 49 | # set no password for root 50 | RUN passwd --delete root 51 | # add new user: mynewuser 52 | RUN useradd -m mynewuser -s /bin/bash 53 | RUN usermod -aG sudo mynewuser 54 | WORKDIR /home/mynewuser 55 | USER mynewuser 56 | 57 | RUN git clone https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing.git 58 | 59 | WORKDIR /home/mynewuser/Chinese_reading_task_eeg_processing 60 | 61 | # create new venv 62 | RUN python -m venv eeg_dataset_env 63 | RUN ["/bin/bash", "-c", "source ./eeg_dataset_env/bin/activate"] 64 | 65 | # install dependencies 66 | RUN eeg_dataset_env/bin/pip install --upgrade pip 67 | RUN eeg_dataset_env/bin/pip install openpyxl==3.1.2 68 | RUN eeg_dataset_env/bin/pip install mne-bids[full]==0.14 69 | RUN eeg_dataset_env/bin/pip install psychopy==2023.2.3 --no-deps 70 | RUN git clone https://github.com/tobiipro/g3pylib.git 71 | RUN eeg_dataset_env/bin/pip install ./g3pylib 72 | RUN eeg_dataset_env/bin/pip install pybv==0.7.5 73 | RUN eeg_dataset_env/bin/pip install egi-pynetstation==1.0.1 74 | 75 | RUN rm -rf ./g3pylib 76 | 77 | 78 | 79 | WORKDIR /opt/pycharm/bin 80 | 81 | USER root 82 | 83 | 84 | 85 | 86 | -------------------------------------------------------------------------------- /docker/README.md: -------------------------------------------------------------------------------- 1 | # Docker Tutorial 2 | 3 | ## Introduction 4 | 5 | This content is intended to facilitate researchers who are not familiar with setting up a Python environment. Although each section's `README` document already lists the required packages and their versions, we still provide a Docker image that includes all dependencies for the project, to simplify the environment setup process. Docker can run a virtual container on Linux, Windows, or macOS computers, which contains an application and its dependencies. Therefore, users only need to follow this tutorial to download Docker, pull the image, and run the container, to easily set up the environment for running the code. 6 | 7 | The two core concepts of Docker are `images` and `containers`. A Docker `image` is a read-only template used to build containers, which stores and transmits applications. A Docker `container` is a standardized, encapsulated environment for running applications, and can be considered as an independent operating system running on the host machine. In practice, we first pull an image and then use it to build our own container, in which we carry out operations such as running code. 8 | 9 | Our Docker image [mouxinyu/eeg_dataset](https://hub.docker.com/r/mouxinyu/eeg_dataset) is based on [ubuntu:22.04](https://hub.docker.com/_/ubuntu) and includes project code pulled from [GitHub repositories](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing). 
Additionally, the image is configured with [Pycharm](https://www.jetbrains.com/pycharm/download/?section=linux), which supports a graphical user interface, allowing users to operate through the GUI while running the container, avoiding complex command-line editing. The source code of the dockerfile is in the `docker` directory. 10 | 11 | **The following tutorial will detail how to use Docker to run our image on the Windows operating system**. The methods for other operating systems will be slightly different and are not elaborated here. 12 | 13 | If you want to know more about Docker, you can browse its [official website](https://www.docker.com/). 14 | 15 | ## Step-by-step Tutorial 16 | 17 | ### Docker Installation 18 | 19 | For detailed instructions on how to download Docker on a Windows system, it is recommended to refer to the [comprehensive guide](https://docs.docker.com/desktop/install/windows-install/) provided in the official documentation. Here we give brief instructions. 20 | 21 | #### WSL Installation 22 | 23 | Before downloading Docker, ensure that the Windows Subsystem for Linux (WSL) is installed on your system. We recommend that you follow the [official guidelines](https://learn.microsoft.com/en-us/windows/wsl/install) to ensure the proper functioning of WSL. Below, we provide a simplified installation process and some important commands for your reference. 24 | 25 | You can install WSL by opening the command line interface and executing the following command: 26 | 27 | ``` 28 | wsl --install 29 | ``` 30 | 31 | Please ensure that the default version of WSL is set to WSL2. You can specify the default version by executing the following command: 32 | 33 | ``` 34 | wsl --set-default-version 2 35 | ``` 36 | 37 | You can also view the list of Linux distributions installed on your Windows computer by using the following command: 38 | 39 | ``` 40 | wsl -l -v 41 | ``` 42 | 43 | Other basic commands for WSL can be found in its [official documentation](https://learn.microsoft.com/en-us/windows/wsl/basic-commands). 44 | 45 | #### Docker Installation 46 | 47 | Once you have verified that WSL is properly installed, you can proceed with the installation of Docker. Please install the version corresponding to your operating system. The installation guide for Windows systems is provided [here](https://docs.docker.com/desktop/install/windows-install/). 48 | 49 | After completing the download steps as instructed in the document, double-click to open Docker Desktop, and you will see the interface as shown in the following image. 50 | 51 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/docker.png) 52 | 53 | ### VcXsrv Installation 54 | 55 | Docker does not provide a GUI by default. To address this issue, we need to install VcXsrv on the Windows host machine (this solution is also applicable to Mac). Download it from the official website provided [here](https://sourceforge.net/projects/vcxsrv/). 56 | 57 | After downloading, double-click to install it. Once the installation is complete, open VcXsrv (named Xlaunch after installation) from the start menu. If it's not found in the start menu, you can use the search function to locate it. Upon opening, a settings page will appear. 58 | 59 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/vcxsrv_1.png) 60 | 61 | Simply keep the default settings and click 'Next' until you reach the end, and then click 'Finish' to start VcXsrv.
After it starts, you can see the following icon in the dock, indicating a successful launch. 62 | 63 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/vcxsrv_2.png) 64 | 65 | ### Image Pulling 66 | 67 | Now, we will deploy our environment by pulling the image from [Docker Hub](https://hub.docker.com/) and running it to create a container. 68 | 69 | First, please ensure that you have successfully launched Docker Desktop and have opened VcXsrv as previously instructed. Next, open the command line interface and enter the following command to pull the required image from Docker Hub: 70 | 71 | ``` 72 | docker pull mouxinyu/eeg_dataset 73 | ``` 74 | 75 | After you enter this command and press the Enter key to execute it, you should see the following response in the command line interface, indicating that the image was successfully pulled: 76 | 77 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/image.png) 78 | 79 | ### Run Container 80 | 81 | Next, we can create a container using the image we have just pulled. You can create a container by executing the following command: 82 | 83 | ``` 84 | docker run --name eeg_dataset_container -it -v <host_path>:<container_path> mouxinyu/eeg_dataset 85 | ``` 86 | 87 | The explanation of the relevant parameters is as follows: 88 | 89 | - --name: This parameter specifies the name of the container being created, which we can later use to refer to this specific container. 90 | - -it: This option runs the container in interactive mode and allocates a pseudo terminal for the container. 91 | - -v: This option mounts a file system path from the host machine to a corresponding path inside the container, allowing file exchange between the container and the host machine. `:` is used to separate the paths on the two systems, with the left side being the path on the host machine and the right side the path within the container, which is recommended to be set as `/home/mynewuser/mount`. A filled-in example is given below. 92 |
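For instance, on a Windows host the command might look like this (the host path is just an example; the container path follows the recommendation above):

```
docker run --name eeg_dataset_container -it -v C:\Users\yourname\eeg_mount:/home/mynewuser/mount mouxinyu/eeg_dataset
```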
93 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/docker_run.png) 94 | 95 | The container by default has two users: one is the `root` user, and the other is a user named `mynewuser`. When entering the container, by default, you are in the `/opt/pycharm/bin` directory as the `root` user. 96 | 97 | We recommend that you switch to user `mynewuser` for most operations, and you can switch users using the following command: 98 | 99 | ``` 100 | su mynewuser 101 | ``` 102 | 103 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/su.png) 104 | 105 | #### GUI 106 | 107 | In `/opt/pycharm/bin`, enter the following command to open the GUI interface of Pycharm: 108 | 109 | ``` 110 | sh pycharm.sh 111 | ``` 112 | 113 | If you successfully run the GUI, the following interface will appear: 114 | 115 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/pycharm_1.png) 116 | 117 | Confirm the user agreement and enter the Pycharm interface: 118 | 119 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/pycharm_2.png) 120 | 121 | Select the 'Open' option in the middle part, and choose `/home/mynewuser/Chinese_reading_task_eeg_processing` to open the project. 122 | 123 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/pycharm_3.png) 124 | 125 | Here, select 'Trust Project' to open the project. 126 | 127 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/pycharm_4.png) 128 | 129 | Typically, Pycharm will automatically detect the virtual environment `eeg_dataset_env` in the project folder and configure it automatically (this may take some time if it's your first time setting up this environment). However, if Pycharm does not automatically recognize the environment, you can manually configure it using the steps below: 130 | 131 | Click on the bottom right area of the Pycharm page, and select `Add New Interpreter --> Add Local Interpreter`. 132 | 133 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/pycharm_5.png) 134 | 135 | In the PyCharm settings interface, select `Virtualenv Environment` from the sidebar on the left. Then in the `Environment` section, choose the `Existing` option to specify a pre-existing environment. 136 | 137 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/pycharm_6.png) 138 | 139 | Then select `/home/mynewuser/Chinese_reading_task_eeg_processing/eeg_dataset_env/bin/python3.10` as the interpreter. Press `OK` to configure the environment. 140 | 141 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/pycharm_7.png) 142 | 143 | Now, you can edit and run code within PyCharm. 144 | 145 | Caveat: If you intend to use the `matplotlib` package for plotting, please insert the following lines of code at the very beginning of your code file: 146 | 147 | ``` 148 | import matplotlib 149 | matplotlib.use('TkAgg') 150 | import matplotlib.pyplot as plt 151 | import tkinter as tk 152 | ``` 153 | 154 | However, if you are using the integrated plotting functions within the `mne` package, you can disregard this step. 155 | 156 | #### Command Line 157 | 158 | If you are familiar with Linux command line operations, you can also perform the corresponding actions using the command line. We have already configured tools like the vim editor for you. It's important to note that command line operations can only be done when the GUI is closed. 159 | 160 | ### Mount 161 | 162 | To exchange files between your host machine and your container, you'll need to utilize Docker's mounting feature. Please write the files you wish to save to the mount point in the container (if you followed our earlier recommendation, this mount point should be at `/home/mynewuser/mount`). Then, these files will appear at the designated location on your host machine. Similarly, you can place files from the host machine at its mount point, and then access these files at the mount point within the container. We highly recommend loading and writing data through the mount feature to avoid issues with memory or other possible problems. 163 | 164 | ### Exit 165 | 166 | We can switch back from the `mynewuser` user to the `root` user using the `exit` command, and similarly, we can exit the container by using the `exit` command while logged in as the `root` user. 167 | 168 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/exit.png) 169 | 170 | ### Other Docker Commands 171 | 172 | If you have already closed the container and wish to re-enter the container's command line interface, you can follow these steps. First, use the following command to view all existing containers in your system: 173 | 174 | ``` 175 | docker ps -a 176 | ``` 177 | 178 | This command will show all containers, including the stopped ones.
179 | 180 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/ps.png) 181 | 182 | Next, use the following command to start a stopped container and enter its command line interface: 183 | 184 | ``` 185 | docker start eeg_dataset_container 186 | docker exec -it eeg_dataset_container /bin/bash 187 | ``` 188 | 189 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/restart.png) 190 | 191 | ### Caveat 192 | 193 | When using WSL and Docker, please ensure that your system has sufficient memory space. Additionally, in some cases, you may need to connect to a VPN to access external networks. If you don't have enough memory or are not connected to a VPN, you may experience issues such as lag or unresponsive commands. 194 | -------------------------------------------------------------------------------- /experiment/Error.py: -------------------------------------------------------------------------------- 1 | 2 | class EyetrackerNotFoundException(Exception): 3 | def __str__(self): 4 | return 'No eyetracker connected' 5 | 6 | class EgiNotFoundException(Exception): 7 | def __str__(self): 8 | return 'No egi device connected' 9 | 10 | 11 | -------------------------------------------------------------------------------- /experiment/README.md: -------------------------------------------------------------------------------- 1 | # Scrolling display project based on Chinese language corpus 2 | 3 | ## Introduction 4 | 5 | This project aims to conduct a novel reading task with **real-time highlighted characters** based on **Chinese language corpora** while simultaneously recording participants' brainwave (EEG) and eye movement (eye-tracking) data. The collected data can be utilized for EEG language decoding and related research endeavors. 6 | 7 | ## Device Models 8 | 9 | EGI: 128-channel 10 | 11 | Eye tracker: Tobii Glasses 3 12 | 13 | ## Environment 14 | 15 | **This project must be based on Python version 3.10!!!** You can use Anaconda to create a new environment to run this project. The command is as follows: 16 | 17 | ``` 18 | conda create -n psychopy_tobii python=3.10 19 | ``` 20 | 21 | Then you can activate this environment to install the packages we need: 22 | 23 | ``` 24 | conda activate psychopy_tobii 25 | ``` 26 | 27 | ### Psychopy 28 | 29 | The entire experimental program is written based on Psychopy. You can download Psychopy either through the command line using the following command or by downloading it from the Psychopy official website: https://www.psychopy.org/. 30 | 31 | ``` 32 | pip install psychopy 33 | ``` 34 | 35 | There are two modes in Psychopy: builder and coder. In our project, we use the coder to implement our experiment. 36 | 37 | ### g3pylib 38 | 39 | This project utilizes g3pylib, a Python client library for Tobii Glasses 3. You can install this package by following the instructions here: https://github.com/tobiipro/g3pylib 40 | 41 | You should first clone the package: 42 | 43 | ``` 44 | git clone https://github.com/tobiipro/g3pylib.git 45 | ``` 46 | 47 | ``` 48 | cd g3pylib 49 | ``` 50 | 51 | Then you can install the package from the root of the cloned repository using the following command: 52 | 53 | ``` 54 | pip install . 55 | ``` 56 | 57 | ### egi-pynetstation 58 | 59 | egi-pynetstation is a Python package that enables the use of a Python-based API for EGI's NetStation EEG amplifier interface.
Install it using the following command:
60 | 
61 | ```
62 | pip install egi-pynetstation
63 | ```
64 | 
65 | For more information about this package, you can visit this website: https://github.com/nimh-sfim/egi-pynetstation
66 | 
67 | ## Code Explanation
68 | 
69 | Before you conduct the experiment, you should first convert your novel into the required format using `cut_Chinese_novel.py` in the `novel_segmentation_and_text_embeddings` module. You can read the README in that module for detailed information.
70 | 
71 | If you have already successfully done that, you can run `play_novel.py` to start the main experiment.
72 | 
73 | ### Main Experiment
74 | 
75 | #### Main Procedure
76 | 
77 | `play_novel.py` is used to run the main experiment based on Psychopy.
78 | 
79 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/screen.png)
80 | 
81 | The program includes multiple modes, allowing you to choose whether to connect to the EGI device and the eye tracker for the experiment. The experimental procedure mainly consists of eye tracker calibration (if eye tracker mode is selected), practice reading, formal reading, and breaks between chapters. During reading, the novel is presented on the screen with three lines per page, and each line contains no more than ten Chinese characters (excluding punctuation marks). On each page, the middle line is highlighted as the focal point, while the upper and lower lines are displayed with reduced intensity as background. Each character in the middle line is sequentially highlighted for a certain duration, and participants are asked to follow the highlighted cues to read the novel content. Various parameters, such as the highlight duration of each character and the highlight color, can be adjusted by modifying the corresponding settings.
82 | 
83 | #### Parameters
84 | 
85 | Detailed explanations of the parameters are provided below. Note that you must change the bold parameters to your own settings, or you may encounter errors and calculation inaccuracies.
86 | 
87 | | Parameter | type | default | usage |
88 | | ------------------------------ | ----- | -------------------------------------- | ------------------------------------------------------------ |
89 | | highlight_color | str | 'red' | Highlight color of the characters |
90 | | add_eyetracker | bool | True | Whether to connect to the eye tracker |
91 | | add_mark | bool | True | Whether to connect to the EGI device |
92 | | shift_time | float | 0.35 | The shifting time of the highlighted character |
93 | | **host_IP** | str | '10.10.10.42' | The IP address of the Net Station host (the computer that runs this experiment) |
94 | | **egi_IP** | str | '10.10.10.51' | The IP address of the EGI device |
95 | | **eyetracker_hostname** | str | "TG03B-080202024891" | The serial number of the eye tracker |
96 | | novel_path | str | "segmented_Chinese_novel_main.xlsx" | The path of the `.xlsx` format novel you want to play |
97 | | preface_path | str | "segmented_Chinese_novel_preface.xlsx" | The path of the `.xlsx` format preface you want to play |
98 | | fullscreen | bool | True | Whether to use a full screen |
99 | | rest_period | int | 3 | The chapter interval between rests |
100 | | force_rest_time | int | 15 | The forced rest time |
101 | | **distance_screen_eyetracker** | float | 73 | Distance from the center of the screen to the center of the eye tracker, in centimeters |
102 | | **screen_width** | float | 54 | The width of the screen, in centimeters |
103 | | **screen_height** | float | 30.375 | The height of the screen, in centimeters |
104 | | **screen_width_height_ratio** | float | 16/9 | The ratio of the screen width to the screen height |
105 | | **eyetracker_width_degree** | float | 95 | The horizontal scanning range of the eye-tracking camera, in degrees (both sides together) |
106 | | **eyetracker_height_degree** | float | 63 | The vertical scanning range of the eye-tracking camera, in degrees (both sides together) |
107 | | isFirstSession | bool | True | Whether this is the first session of the experiment; this determines whether the preface is displayed before the formal experiment |
108 | 
109 | **Notice**: As mentioned in the previous section, we may run this script multiple times in the experiment to sequentially present each part of the novel. You need to specify the parameter `isFirstSession` every time you run this program to let it know whether this is the first playback. If the value is `True`, the program will play the preface for practice reading before the formal reading begins. If it is `False`, the practice reading part will be skipped, and the program will start directly from the main content.
110 | 
111 | #### EEG Markers
112 | 
113 | If you use the EGI device to record EEG signals during the experiment, our program will place markers at the corresponding time points. These markers will assist you in aligning eye tracker recordings with EEG signals, as well as locating the texts corresponding to specific segments of EEG signals.
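
For orientation, the snippet below is a minimal sketch of how a four-character marker can be sent with `egi-pynetstation`, following that package's documented API. The port number 55513 (NetStation's default ECI port) and the mapping of the two IP addresses to roles are our assumptions here; the exact calls used in `play_novel.py` may differ, so treat the values as placeholders.

```
from egi_pynetstation.NetStation import NetStation

# Placeholder addresses; see host_IP / egi_IP in the Parameters table below.
eci_client = NetStation('10.10.10.42', 55513)  # 55513 is assumed to be the ECI port
eci_client.connect(ntp_ip='10.10.10.51')       # clock synchronization with the amplifier
eci_client.begin_rec()                         # start recording
eci_client.send_event(event_type='CH01')       # markers are 4-character codes
eci_client.end_rec()
eci_client.disconnect()
```
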
114 | 
115 | The detailed information about the markers is shown below:
116 | 
117 | ```
118 | EYES: Eyetracker starts to record
119 | EYEE: Eyetracker stops recording
120 | CALS: Eyetracker calibration starts
121 | CALE: Eyetracker calibration stops
122 | BEGN: EGI starts to record
123 | STOP: EGI stops recording
124 | CH01: Beginning of a specific chapter (numbers correspond to chapters)
125 | ROWS: Beginning of a row
126 | ROWE: End of a row
127 | PRES: Beginning of the preface
128 | PREE: End of the preface
129 | ```
130 | 
131 | #### Calibration Coordinate Transformation
132 | 
133 | In this experimental program, we designed a personalized calibration procedure. A dot appears sequentially at the four corners and the center of the screen, each staying for 5 seconds. Participants are required to fixate on the center of the dot to complete the calibration. For each dot, we record the middle-to-late segment of the participant's gaze data (from 3 s to 4 s) and calculate the average point of gaze as the participant's mean fixation point. We then compare the average fixation point with the actual center position of the dot to calculate the error. By averaging the errors over all five dots, we obtain the final calibration error. If the final error is below the predetermined error threshold, we consider the calibration successful. Otherwise, the experimental program automatically returns to the calibration phase and repeats the process until calibration is achieved.
134 | 
135 | In order to align the coordinate systems of the eye tracker and the Psychopy program and obtain the actual positions of gaze points on the screen, we derived a transformation formula between the two coordinate systems using geometric relationships. This formula is applied during the calibration process. The specific relationship is as follows:
136 | ```math
137 | x_{\text{eyetracker}} = (\frac{{W \cdot x_{\text{psychopy}}}}{{d \cdot r \cdot \tan(\text{width\_degree}/2)}} + 1 ) \cdot \frac{1}{2}
138 | ```
139 | ```math
140 | y_{\text{eyetracker}} = (1 - \frac{{H \cdot y_{\text{psychopy}}}}{{d \cdot \tan(\text{height\_degree}/2)}} )\cdot \frac{1}{2}
141 | ```
142 | where:
143 | 
144 | - $(x_{\text{eyetracker}}, y_{\text{eyetracker}})$ is the coordinate in the eye tracker coordinate system
145 | - $(x_{\text{psychopy}}, y_{\text{psychopy}})$ is the coordinate in the Psychopy coordinate system
146 | - $W$: the width of the screen
147 | - $H$: the height of the screen
148 | - $d$: the distance from the center of the screen to the center of the eye tracker (`distance_screen_eyetracker`)
149 | - $r$: the ratio of the width to the height
150 | - $\text{width\_degree}$: the horizontal scanning range of the eye-tracking camera in degrees (both sides together)
151 | - $\text{height\_degree}$: the vertical scanning range of the eye-tracking camera in degrees (both sides together)
152 | 
153 | A short Python sketch of this mapping is given just before the step-by-step procedure below.
154 | 
155 | 
156 | 
157 | 
158 | 
159 | 
160 | 
161 | 
162 | 
163 | 
164 | 
165 | 
166 | ## Experiment Procedure
167 | 
168 | The experimental setup is as below:
169 | 
170 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/exp_layout.png)
171 | 
172 | Below are the operational steps and an example of starting the project from scratch, using the novel *The Little Prince* as an example.
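
Before the step-by-step walkthrough, here is the Python sketch referenced in the Calibration Coordinate Transformation section above. It is a minimal illustration of the two formulas only: the function name `psychopy_to_eyetracker` and its standalone form are ours, the parameter names mirror the Parameters table, and we assume the scanning-range angles are given in degrees (hence the radian conversion) and that all lengths share the same unit (centimeters, as in the table). `play_novel.py` implements this mapping internally in its own way.

```
import math

def psychopy_to_eyetracker(x_psy, y_psy, W, H, r, d, width_degree, height_degree):
    # Map PsychoPy 'norm' coordinates to the eye tracker's normalized
    # coordinate system, following the two formulas above.
    x_eye = (W * x_psy / (d * r * math.tan(math.radians(width_degree / 2))) + 1) / 2
    y_eye = (1 - H * y_psy / (d * math.tan(math.radians(height_degree / 2)))) / 2
    return x_eye, y_eye

# With the defaults from the Parameters table (lengths in centimeters):
# W=54, H=30.375, r=16/9, d=73, width_degree=95, height_degree=63.
# The screen center (0, 0) maps to the center of the eye tracker's
# normalized coordinate system, (0.5, 0.5):
print(psychopy_to_eyetracker(0, 0, 54, 30.375, 16/9, 73, 95, 63))
```
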
173 | 
174 | ### Activate Environment
175 | 
176 | First, activate the environment we set up before, and then navigate to the directory where the project is located.
177 | 
178 | ```
179 | conda activate psychopy_tobii
180 | cd
181 | ```
182 | 
183 | ### Novel Segmentation
184 | 
185 | Take a `.txt` novel file that meets the format requirements (the format requirements can be found in the "Novel Segmentation" section of the `novel_segmentation_and_text_embeddings` README) as input, and specify the breakpoint parameter to divide the text into several parts. Run `cut_Chinese_novel.py`, and you will obtain the corresponding number of `.xlsx` files. Here, we divided the novel into 4 main parts, resulting in 5 files (4 main body parts and 1 preface part).
186 | 
187 | ```
188 | python cut_Chinese_novel.py --divide_nums=8,16,24 --Chinese_novel_path=xiaowangzi_main_text.txt
189 | ```
190 | 
191 | ### Main Experiment
192 | 
193 | - First, connect the EGI equipment and the eye tracker.
194 | - Next, adjust the parameters and run the main program. Set the paths for the main body and preface parts in the variables `novel_path` and `preface_path`, respectively. Adjust the parameters `add_mark` and `add_eyetracker` to decide whether to connect to the EGI device and the eye tracker. Change `host_IP` and `egi_IP` to the IP addresses of your own devices, and `eyetracker_hostname` to your eye tracker's serial number. Set `isFirstSession` to True during the first run to include the preview session. Other adjustable parameters can be found in the "Parameters" section of the Code Explanation under Main Experiment. Note that you may need to modify some size- and distance-related parameters according to your own setup. In subsequent runs, change `novel_path` to read different parts of the novel and set `isFirstSession` to False.
195 | 
196 | ```
197 | python play_novel.py --add_mark --add_eyetracker --preface_path= --host_IP= --egi_IP= --eyetracker_hostname= --novel_path= --isFirstSession
198 | ```
199 | 
200 | ​ Here are our own settings as an example:
201 | 
202 | ```
203 | First time:
204 | python play_novel.py --add_mark --add_eyetracker --preface_path=segmented_Chinense_novel_preface_display.xlsx --host_IP=10.10.10.42 --egi_IP=10.10.10.51 --eyetracker_hostname=TG03B-080202024891 --novel_path=segmented_Chinense_novel_run_1_display.xlsx --isFirstSession
205 | 
206 | Second time:
207 | python play_novel.py --add_mark --add_eyetracker --preface_path=segmented_Chinense_novel_preface_display.xlsx --host_IP=10.10.10.42 --egi_IP=10.10.10.51 --eyetracker_hostname=TG03B-080202024891 --novel_path=segmented_Chinense_novel_run_2_display.xlsx
208 | 
209 | ...
210 | ```
211 | 
212 | - **During the forced break period, the EGI system will be disconnected.** ***At this time, you need to restart the EGI system to ensure it is in a running state before the participant continues the experiment, or the program will crash!!!***
213 | 
214 | - **At the end of each experimental session, it is necessary to replenish the saline for the participant's EEG cap and replace the eye tracker's batteries to ensure sufficient power. Start and reconnect the eye tracker, and restart the EGI system.** ***Remember not to disconnect the eye tracker or replace its batteries during the experiment (including the rest periods), as doing so may cause the program to crash!!!***
215 | 
216 | - The main process of the experiment includes: calibration - preface session (only in the first part) - formal reading - rest (including mandatory rest and participant-initiated rest periods). After each rest period, recalibration will be performed.
217 | 
218 | - Calibration
219 | 
220 |   **Note: When calibration fails multiple times, the experimenter can choose to skip the calibration and proceed directly to the reading section by pressing the right arrow key on the keyboard at the calibration failure prompt page.**
221 | 
222 | - Preface Reading
223 | 
224 | - Formal Reading
225 | 
226 | - Rest
227 | 
228 |   **Restart the EGI system to ensure it is in a running state before the participant continues the experiment.**
229 | 
230 | 
231 | 
232 | 
233 | 
234 | 
-------------------------------------------------------------------------------- /image/cardiac_artifact.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/cardiac_artifact.jpg -------------------------------------------------------------------------------- /image/channel_artifact.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/channel_artifact.jpg -------------------------------------------------------------------------------- /image/data_processing.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/data_processing.png -------------------------------------------------------------------------------- /image/docker.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/docker.png -------------------------------------------------------------------------------- /image/docker_run.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/docker_run.png -------------------------------------------------------------------------------- /image/egi_montage.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/egi_montage.png -------------------------------------------------------------------------------- /image/exit.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/exit.png -------------------------------------------------------------------------------- /image/exp_layout.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/exp_layout.png -------------------------------------------------------------------------------- /image/eye_artifact.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/eye_artifact.jpg
-------------------------------------------------------------------------------- /image/ica_001.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/ica_001.png -------------------------------------------------------------------------------- /image/ica_006.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/ica_006.png -------------------------------------------------------------------------------- /image/ica_007.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/ica_007.png -------------------------------------------------------------------------------- /image/ica_010.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/ica_010.png -------------------------------------------------------------------------------- /image/ica_015.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/ica_015.png -------------------------------------------------------------------------------- /image/ica_topo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/ica_topo.png -------------------------------------------------------------------------------- /image/image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/image.png -------------------------------------------------------------------------------- /image/mount_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/mount_1.png -------------------------------------------------------------------------------- /image/muscle_artifact.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/muscle_artifact.jpg -------------------------------------------------------------------------------- /image/pipeline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/pipeline.png -------------------------------------------------------------------------------- /image/pipeline_english.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/pipeline_english.png -------------------------------------------------------------------------------- /image/processing_pipeline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/processing_pipeline.png -------------------------------------------------------------------------------- /image/ps.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/ps.png -------------------------------------------------------------------------------- /image/pycharm_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/pycharm_1.png -------------------------------------------------------------------------------- /image/pycharm_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/pycharm_2.png -------------------------------------------------------------------------------- /image/pycharm_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/pycharm_3.png -------------------------------------------------------------------------------- /image/pycharm_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/pycharm_4.png -------------------------------------------------------------------------------- /image/pycharm_5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/pycharm_5.png -------------------------------------------------------------------------------- /image/pycharm_6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/pycharm_6.png -------------------------------------------------------------------------------- /image/pycharm_7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/pycharm_7.png -------------------------------------------------------------------------------- /image/random_mask1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/random_mask1.png -------------------------------------------------------------------------------- 
/image/random_mask2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/random_mask2.png -------------------------------------------------------------------------------- /image/random_mask3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/random_mask3.png -------------------------------------------------------------------------------- /image/restart.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/restart.png -------------------------------------------------------------------------------- /image/screen.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/screen.png -------------------------------------------------------------------------------- /image/structure_new.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/structure_new.png -------------------------------------------------------------------------------- /image/su.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/su.png -------------------------------------------------------------------------------- /image/vcxsrv_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/vcxsrv_1.png -------------------------------------------------------------------------------- /image/vcxsrv_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ncclabsustech/Chinese_reading_task_eeg_processing/279b2a6d5b1ea9d58a7fdb15a7b54fb7d8f72baf/image/vcxsrv_2.png -------------------------------------------------------------------------------- /novel_segmentation_and_text_embeddings/README.md: --------------------------------------------------------------------------------
1 | # Chinese Language Corpus Segmentation and Text Embeddings
2 | 
3 | ## Introduction
4 | 
5 | This README explains how to segment a Chinese novel and obtain its text embeddings using pretrained language models.
6 | 
7 | ## Environment
8 | 
9 | We use `.xlsx` files to save the segmented novel, and the package `openpyxl` to write the content to Excel files. You can install this package with pip:
10 | 
11 | ```
12 | pip install openpyxl
13 | ```
14 | 
15 | We also need packages for deep learning and pretrained language models, so PyTorch and Transformers are required.
16 | 
17 | ```
18 | pip install torch
19 | ```
20 | 
21 | ```
22 | pip install transformers
23 | ```
24 | 
25 | ## Explanation
26 | 
27 | ### Novel Segmentation
28 | 
29 | You should run `cut_Chinese_novel.py` to process your `.txt`-formatted Chinese novel for sentence segmentation, obtaining the correspondingly formatted `.xlsx` files.
30 | 
31 | `cut_Chinese_novel.py` processes the novel into a format suitable for presentation in the Psychopy program. The novel content is segmented into a series of units, each containing no more than 10 Chinese characters, and three adjacent units are combined to form the content of each frame during the screen presentation. The same combination of adjacent units is repeated multiple times to facilitate the subsequent highlighting of each individual character. Additionally, you can divide the novel into several parts at specified chapters and output them as separate files.
32 | 
33 | The input for this script must be Chinese novel content in `.txt` format. The format of the novel content should meet the following requirements:
34 | 
35 | 1. **The beginning of the file should contain the preface section, marked with the sign Ch0.** The preface will be used as a practice run before reading the formal chapters.
36 | 2. Each subsequent chapter should start with the sign **Ch + \<chapter number\>**.
37 | 
38 | Below is an example:
39 | 
40 | ```
41 | Chinese_novel.txt
42 | Ch0
43 | This is the preface of the novel
44 | Ch1
45 | Chapter 1 of the novel
46 | Ch2
47 | Chapter 2 of the novel
48 | ...
49 | ...
50 | ...
51 | ```
52 | 
53 | In real experiments, considering issues like eye tracker battery life, we must perform battery replacement during the experiment. Due to technical constraints, we have to temporarily stop the experiment program during battery replacement. Therefore, we may divide a complete novel into multiple parts and present one part during each complete program run. For example, we may divide a novel with 30 chapters into 4 parts, and run the program several times to present the entire novel content.
54 | 
55 | In this script, you can specify the chapters at which to divide the novel by changing the parameter `divide_nums`, and the program will divide the chapters and output files containing these different parts.
56 | 
57 | The outputs comprise three different types of data files:
58 | 
59 | - The first type contains one specific file with the entire content of the novel, divided according to each line as it is displayed. This file is named `segmented_Chinese_novel.xlsx`. In this type of data, **we removed all lines that do not contain Chinese characters, namely those lines composed entirely of punctuation marks.** This is for the following reason: during the actual display, lines that contain only punctuation are not presented in the central line of the screen for the subjects to read. Therefore, in the analysis of the related EEG data, these lines have no corresponding data segments.
60 | - The second type of file contains the content of each run (i.e., the specified chapters played during one session), also divided according to each line as displayed. These files are named `segmented_Chinese_novel_run_xx.xlsx`, where xx is the number of the run. **We also removed all lines that do not contain Chinese characters, for the same reason as before.**
61 | - The third type of file is a specialized file used for projecting the content on the screen.
It directly corresponds to the second type of file, with each file in the third type being a transformation of the corresponding second-type file. The primary purpose of these files is to present content on the screen via the Psychopy program; hence we append "display" to the second-type file name to denote this distinction. The format is like `segmented_Chinese_novel_run_xx_display.xlsx`.
62 | 
63 | We provide a `.txt` file of the novel *The Little Prince* and the processed `.xlsx` files as an example.
64 | 
65 | #### Parameters
66 | 
67 | | Parameter | type | usage |
68 | | ------------------ | ---- | ------------------------------------------------------------ |
69 | | divide_nums | str | Breakpoints at which you want to divide your novel (comma-separated). E.g., if you want to break the novel at chapters 8, 16, 24, you should pass the argument as --divide_nums=8,16,24 |
70 | | Chinese_novel_path | str | Path to your `.txt` Chinese novel content |
71 | | save_path | str | Path to save the outputs |
72 | 
73 | #### Running
74 | 
75 | You can run this code using the following command:
76 | 
77 | ```
78 | python cut_Chinese_novel.py --divide_nums= --Chinese_novel_path= --save_path=
79 | ```
80 | 
81 | ### Text Embeddings
82 | 
83 | We offer the code `embedding.py` to generate the text embeddings. We employ a deep-learning-based pretrained language model to calculate the text embeddings. The pretrained model we use is [bert-base-chinese](https://huggingface.co/bert-base-chinese). Specifically, we process each line of text in the textual stimuli to generate text embeddings of uniform length. These embeddings are organized by individual run in the experiment and are saved in the `.npy` format. Consequently, each `.npy` file of text embeddings corresponds to the embeddings of each line of textual stimuli in a single run. You should pass the path to the folder containing the `.xlsx` text files of all runs as input (that is, the output path of the novel segmentation in the section above).
84 | 
85 | We have already calculated the embeddings of the two novels we use. They are offered in `derivatives/text_embeddings` in the dataset. You can use them directly for subsequent analysis, including alignment of EEG and text in representation space and EEG semantic decoding. **Notice: Due to special circumstances during the experimental process, subject-07 in the LangWangMeng session did not read the content of Chapter 18 as intended in the 18th run. Instead, as a substitute, the participant read the content of Chapter 19. Therefore, in this specific case, there is no direct correspondence between the EEG data in the 18th run and the 18th text embedding file.**
86 | 
87 | #### Running
88 | 
89 | | Parameter | type | usage |
90 | | ------------------ | ---- | ------------------------------------------------------------ |
91 | | Chinese_novel_path | str | Path to the folder that contains the text stimuli of each run |
92 | | run_num | int | The number of runs in the experiment (i.e., the number of text stimuli files) |
93 | | save_path | str | Path to save the embeddings |
94 | 
-------------------------------------------------------------------------------- /novel_segmentation_and_text_embeddings/cut_chinese_novel.py: --------------------------------------------------------------------------------
1 | import argparse
2 | import re
3 | import openpyxl
4 | import os
5 | 
6 | '''
7 | Author: Jianyu Zhang, Xinyu Mou
8 | 
9 | This is used for novel segmentation and format transformation for the Chinese novel you want to play.
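
Overall pipeline (mirroring the __main__ block at the bottom of this file):
    cut_paragraph -> cut_sentences -> split_chapter_title
    -> arrange_sentences_within_30_words -> split_row
    -> split_preface_main_content -> arrange_sentences_in_psychopy_requirement
    -> save_to_xlsx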
10 | ''' 11 | 12 | def delete_specific_element(str, element): 13 | """Remove specific elements from a string""" 14 | segments = re.split(element, str) 15 | segments = list(filter(lambda x:x != element, segments)) 16 | result = ''.join(segments) 17 | 18 | return result 19 | 20 | def contain_leading_quotation(sentence): 21 | if '“' in sentence: 22 | return True 23 | return False 24 | 25 | def contain_back_quotation(sentence): 26 | if '”' in sentence: 27 | return True 28 | return False 29 | 30 | 31 | def merge_short_sentences(segments): 32 | """Concatenate overly short, split sentences in a sentence""" 33 | results = [] 34 | results.append(segments[0]) 35 | for i in range(1, len(segments)): 36 | if len(results[-1]) + len(segments[i]) <= 10: 37 | results[-1] += segments[i] 38 | else: 39 | results.append(segments[i]) 40 | return results 41 | 42 | def insert_element_to_str(str, element, index): 43 | """Insert a specified element into a specified position in a string""" 44 | str_list = list(str) 45 | str_list.insert(index, element) 46 | result = ''.join(str_list) 47 | return result 48 | 49 | 50 | def calculate_length_without_punctuation_and_indexes(sentence): 51 | """Calculate the length of a sentence excluding punctuation and the coordinates of all non-punctuation positions""" 52 | punctuations = ['\n', '。', ',', '!', '?', ':', ';', '“', '”', '、', '《', '》', '.', '(', ')', '…', '·'] 53 | sentence_list = list(sentence) 54 | length_without_punctuation = 0 55 | indexes = [] 56 | for index, char in enumerate(sentence_list): 57 | if char not in punctuations: 58 | length_without_punctuation += 1 59 | indexes.append(index) 60 | 61 | return length_without_punctuation, indexes 62 | 63 | 64 | 65 | def cut_paragraph(paragraph): 66 | """Split the article into complete sentences""" 67 | # First split the entire sentence 68 | sentences = re.split(r"(。|!|?|”|;)", paragraph) 69 | # Piece together the separate punctuation marks. 70 | sentences = [''.join(i) for i in zip(sentences[0::2], sentences[1::2])] 71 | 72 | # Move the punctuation to the right place 73 | for i in range(len(sentences)): 74 | if sentences[i][0] in ['。', '!', '?', '”', ';']: 75 | sentences[i - 1] += sentences[i][0] 76 | sentences[i] = sentences[i][1:] 77 | 78 | # Remove empty str 79 | sentences = list(filter(lambda x:x != '', sentences)) 80 | sentences = [i.strip() for i in sentences] 81 | 82 | # Remove \n in str 83 | sentences = [delete_specific_element(i, '\n') for i in sentences] 84 | 85 | # Remove space in str 86 | sentences = [delete_specific_element(i, ' ') for i in sentences] 87 | 88 | # Reassemble the double quotes that are not in the same string together 89 | results = [] 90 | isOneSentence = False 91 | for i in range(len(sentences)): 92 | # Both having the opening quotation mark and the closing quotation mark 93 | if contain_leading_quotation(sentences[i]) and contain_back_quotation(sentences[i]): 94 | results.append(sentences[i]) 95 | # Only having opening quotation mark. Subsequent sentence should be added 96 | elif contain_leading_quotation(sentences[i]) and not contain_back_quotation(sentences[i]): 97 | results.append(sentences[i]) 98 | isOneSentence = True 99 | # Only having closing quotation mark. 
Adding is finished 100 | elif contain_back_quotation(sentences[i]): 101 | results[-1] += sentences[i] 102 | isOneSentence = False 103 | # No quotation, but surrounded by quotations 104 | elif isOneSentence == True: 105 | results[-1] += sentences[i] 106 | # No quotation, and not surrounded by quotations 107 | else: 108 | results.append(sentences[i]) 109 | 110 | return results 111 | 112 | 113 | def cut_sentences(sentences): 114 | """Check if each sentence exceeds ten words; if it does, split it at the commas""" 115 | results = [] 116 | for i in range(len(sentences)): 117 | if len(sentences[i]) <= 10: 118 | results.append(sentences[i]) 119 | else: 120 | segments = re.split(r"(,|:)", sentences[i]) 121 | segments.append("") 122 | #print(segments) 123 | # Piece together the separate punctuation marks. 124 | 125 | segments = [''.join(i) for i in zip(segments[0::2], segments[1::2])] 126 | 127 | #print(segments) 128 | # Move the punctuation to the right place 129 | for i in range(len(segments)): 130 | if segments[i][0] in [',', ':']: 131 | segments[i - 1] += segments[i][0] 132 | segments[i] = segments[i][1:] 133 | # Remove empty str 134 | segments = list(filter(lambda x: x != '', segments)) 135 | segments = [k.strip() for k in segments] 136 | #print(segments) 137 | segments = merge_short_sentences(segments) 138 | for j in range(len(segments)): 139 | results.append(segments[j]) 140 | 141 | return results 142 | 143 | 144 | def arrange_sentences_within_30_words(sentences): 145 | """Obtain the text for each frame displayed on the screen 146 | (no more than 30 words per frame).""" 147 | results = [] 148 | results.append(sentences[0]) 149 | 150 | # Integrate the short sentences according to the capacity of the screen, 151 | # assuming that each frame does not exceed 30 words and each line contains 152 | # a maximum of 10 words 153 | for i in range(1, len(sentences)): 154 | length_wiithout_punctuation_last, _ = calculate_length_without_punctuation_and_indexes(results[-1]) 155 | length_wiithout_punctuation_new, _ = calculate_length_without_punctuation_and_indexes(sentences[i]) 156 | 157 | 158 | if sentences[i] in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40']: 159 | results.append(sentences[i]) 160 | elif sentences[i-1] in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40']: 161 | results.append(sentences[i]) 162 | elif length_wiithout_punctuation_last + length_wiithout_punctuation_new < 31: 163 | results[-1] += sentences[i] 164 | else: 165 | results.append(sentences[i]) 166 | 167 | # Insert \n every ten words in each sentence 168 | for i in range(len(results)): 169 | sentence_length_without_punctuation, indexes_of_non_punctuation = calculate_length_without_punctuation_and_indexes(results[i]) 170 | row_num = sentence_length_without_punctuation // 10 171 | for j in range(row_num): 172 | index = indexes_of_non_punctuation[(j+1)*10-1] + 1 + j 173 | results[i] = insert_element_to_str(results[i], '\n', index) 174 | 175 | return results 176 | 177 | 178 | 179 | 180 | 181 | def split_chapter_title(sentences): 182 | """Make each chapter title into a separate sentence.""" 183 | chapter_num = 0 184 | for i in range(len(sentences)): 185 | if 
sentences[i].find('Ch' + str(round(chapter_num))) != -1: 186 | index = sentences[i].find('Ch' + str(round(chapter_num))) 187 | segments = [sentences[i][:(index)], str(round(chapter_num)), sentences[i][(index+len(str(round(chapter_num)))+2):]] 188 | sentences[i] = segments[0] 189 | sentences.insert(i+1, segments[1]) 190 | sentences.insert(i+2, segments[2]) 191 | chapter_num += 1 192 | sentences = list(filter(lambda x:x != '', sentences)) 193 | 194 | return sentences 195 | 196 | 197 | def repeat_sentences(sentences): 198 | """Copy the sentence as many times as its length after removing punctuation, 199 | to facilitate frame switching when highlighting in PsychoPy.""" 200 | results = [] 201 | indexes = [] 202 | punctuations = ['\n', '。', ',', '!', '?', ':', ';', '“', '”', '、', '《', '》', '.', '·'] 203 | for i in range(len(sentences)): 204 | sentence = sentences[i] 205 | sentence_list = list(sentence) 206 | length_without_punctuation = 0 207 | for index, char in enumerate(sentence_list): 208 | if char not in punctuations: 209 | 210 | indexes.append(index) 211 | length_without_punctuation += 1 212 | 213 | for j in range(length_without_punctuation): 214 | results.append(sentence) 215 | 216 | #print(list(results[1])) 217 | 218 | return results, indexes 219 | 220 | 221 | def split_row(sentences): 222 | """Divide the text according to each line as displayed in PsychoPy.""" 223 | results = [] 224 | for i in range(len(sentences)): 225 | sentence_list = sentences[i].split('\n') 226 | sentence_list = list(filter(lambda x: x != '\n' and x != '', sentence_list)) 227 | for j in range(len(sentence_list)): 228 | results.append(sentence_list[j]) 229 | 230 | # Move the punctuation at the beginning of a sentence to the end of the previous sentence. 231 | punctuations = ['。', ',', '!', '?', ':', ';', '”', '、', '》', '.', ')', '…', '·'] 232 | for i in range(len(results)): 233 | if results[i][0] in punctuations: 234 | results[i-1] += results[i][0] 235 | results[i] = results[i][1:] 236 | 237 | results = list(filter(lambda x:x != '', results)) 238 | #print(results) 239 | return results 240 | 241 | 242 | def split_preface_main_content(sentences, divide_nums): 243 | """Separate the preface section and divide the main text into a specified number 244 | of parts according to the chapters.""" 245 | # def get_breakpoints(n, m): 246 | # if m <= 1 or n <= 0: 247 | # return [] 248 | 249 | # breakpoints = [] 250 | # interval = n // m + 1 251 | # for i in range(1, m): 252 | # breakpoint_value = i * interval 253 | # breakpoints.append(breakpoint_value) 254 | 255 | # return breakpoints 256 | 257 | 258 | if '1' in sentences: 259 | first_chapter_index = sentences.index('1') 260 | 261 | else: 262 | first_chapter_index = len(sentences) 263 | 264 | preface = sentences[:first_chapter_index] 265 | preface = preface[1:] # Remove the mark 0 at the beginning 266 | 267 | main_content = sentences[first_chapter_index:] 268 | 269 | 270 | max_chapter = 0 271 | while str(round(max_chapter+1)) in main_content: 272 | max_chapter += 1 273 | 274 | 275 | cut_chapter = divide_nums 276 | for i in range(len(divide_nums)): 277 | if cut_chapter[i] + 1 > max_chapter: 278 | cut_chapter.pop(i) 279 | cut_indexes_last = [main_content.index(str(round(i+1))) for i in cut_chapter] 280 | cut_indexes_last.append(len(main_content)+1) 281 | 282 | 283 | main_content_parts = [] 284 | cut_index_first = 0 285 | for i in cut_indexes_last: 286 | main_content_parts.append(main_content[cut_index_first:i]) 287 | cut_index_first = i 288 | 289 | return preface, 
main_content_parts 290 | 291 | 292 | 293 | 294 | def arrange_sentences_in_psychopy_requirement(sentences): 295 | """The line that needs to be highlighted is in the middle, 296 | with one line above and one below it as background""" 297 | results = [] 298 | indexes = [] 299 | main_row = [] 300 | row_num = [] 301 | for i in range(len(sentences)): 302 | if i == 0 and sentences[i] not in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40']: 303 | length_without_puntuation, indexes_of_non_punc = calculate_length_without_punctuation_and_indexes( 304 | sentences[i]) 305 | for k in range(length_without_puntuation): 306 | results.append(sentences[i] + '\n' + sentences[i + 1]) 307 | indexes.append(indexes_of_non_punc[k]) 308 | main_row.append(0) 309 | row_num.append(2) 310 | elif i == len(sentences) - 1: 311 | length_without_puntuation, indexes_of_non_punc = calculate_length_without_punctuation_and_indexes( 312 | sentences[i]) 313 | for k in range(length_without_puntuation): 314 | results.append(sentences[i-1] + '\n' + sentences[i]) 315 | indexes.append(indexes_of_non_punc[k]) 316 | main_row.append(1) 317 | row_num.append(2) 318 | elif sentences[i] in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40']: 319 | results.append(sentences[i]) 320 | indexes.append(0) 321 | main_row.append(0) 322 | row_num.append(1) 323 | elif sentences[i-1] in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40']: 324 | length_without_puntuation, indexes_of_non_punc = calculate_length_without_punctuation_and_indexes(sentences[i]) 325 | for k in range(length_without_puntuation): 326 | results.append(sentences[i] + '\n' + sentences[i+1]) 327 | indexes.append(indexes_of_non_punc[k]) 328 | main_row.append(0) 329 | row_num.append(2) 330 | elif sentences[i+1] in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40']: 331 | length_without_puntuation, indexes_of_non_punc = calculate_length_without_punctuation_and_indexes( 332 | sentences[i]) 333 | for k in range(length_without_puntuation): 334 | results.append(sentences[i-1] + '\n' + sentences[i]) 335 | indexes.append(indexes_of_non_punc[k]) 336 | main_row.append(1) 337 | row_num.append(2) 338 | else: 339 | length_without_puntuation, indexes_of_non_punc = calculate_length_without_punctuation_and_indexes( 340 | sentences[i]) 341 | for k in range(length_without_puntuation): 342 | results.append(sentences[i-1] + '\n' + sentences[i] + '\n' + sentences[i+1]) 343 | indexes.append(indexes_of_non_punc[k]) 344 | main_row.append(1) 345 | row_num.append(3) 346 | return results, indexes, main_row, row_num 347 | 348 | 349 | def save_to_xlsx(file_path, file_name, text, indexes=None, main_row=None, row_num=None): 350 | workbook = openpyxl.Workbook() 351 | sheet = workbook.active 352 | 353 | 354 | if not os.path.isdir(file_path): 355 | os.makedirs(file_path) 356 | 357 | if 
indexes is not None: 358 | for i, content in enumerate(zip(text, indexes, main_row, row_num)): 359 | sheet.cell(row=i + 2, column=1, value=content[0]) 360 | sheet.cell(row=i + 2, column=2, value=content[1]) 361 | sheet.cell(row=i + 2, column=3, value=content[2]) 362 | sheet.cell(row=i + 2, column=4, value=content[3]) 363 | 364 | sheet.cell(row=1, column=1, value='Chinese_text') 365 | sheet.cell(row=1, column=2, value='index') 366 | sheet.cell(row=1, column=3, value='main_row') 367 | sheet.cell(row=1, column=4, value='row_num') 368 | 369 | workbook.save(file_path + file_name) 370 | else: 371 | for i, content in enumerate(text): 372 | sheet.cell(row=i + 2, column=1, value=content) 373 | 374 | sheet.cell(row=1, column=1, value='Chinese_text') 375 | workbook.save(file_path + file_name) 376 | 377 | if __name__ == '__main__': 378 | parser = argparse.ArgumentParser(description='Parameters that can be changed in this experiment') 379 | 380 | parser.add_argument('--Chinese_novel_path', type=str, default=r'../data/novel/xiaowangzi_main_text.txt', help='Path to your .txt Chinese novel content') 381 | parser.add_argument('--divide_nums', type=str, default='4, 8, 12, 16, 20, 24', help='Breakpoints which you want to divide your novel (comma-separated)') 382 | parser.add_argument('--save_path', type=str, default=r'../data/segmented_novel', 383 | help='Path to save the outputs') 384 | args = parser.parse_args() 385 | 386 | divide_num_list = args.divide_nums.split(',') 387 | 388 | divide_num_list = [int(num) for num in divide_num_list] 389 | 390 | args.divide_nums = divide_num_list 391 | 392 | 393 | with open(args.Chinese_novel_path, encoding='utf-8') as file: 394 | text = file.read() 395 | result = cut_paragraph(text) 396 | result = cut_sentences(result) 397 | result = split_chapter_title(result) 398 | result = arrange_sentences_within_30_words(result) 399 | result = split_row(result) 400 | 401 | 402 | # To be saved for use with PsychoPy 403 | preface, main_content_parts = split_preface_main_content(result, args.divide_nums) 404 | 405 | preface_text, preface_indexes, preface_main_row, preface_row_num = arrange_sentences_in_psychopy_requirement( 406 | preface) 407 | 408 | save_to_xlsx(args.save_path, r'/segmented_Chinense_novel_preface_display.xlsx', preface_text, 409 | preface_indexes, preface_main_row, preface_row_num) 410 | 411 | for i, main_content_part in enumerate(main_content_parts): 412 | text, indexes, main_row, row_num = arrange_sentences_in_psychopy_requirement(main_content_part) 413 | file_name = r'/segmented_Chinense_novel_run_' + str(round(i + 1)) + '_display.xlsx' 414 | save_to_xlsx(args.save_path, file_name, text, indexes, main_row, row_num) 415 | 416 | 417 | 418 | # To be saved for retrieval 419 | 420 | result_without_punc = [] 421 | for row in result: 422 | length_without_punc, _ = calculate_length_without_punctuation_and_indexes(row) 423 | if length_without_punc != 0: 424 | result_without_punc.append(row) 425 | 426 | 427 | save_to_xlsx(args.save_path, r'/segmented_Chinense_novel.xlsx', result_without_punc[1:]) 428 | 429 | preface_without_punc, main_content_parts_without_punc = split_preface_main_content(result_without_punc, args.divide_nums) 430 | 431 | 432 | 433 | save_to_xlsx(args.save_path, r'/segmented_Chinense_novel_preface.xlsx', preface_without_punc) 434 | 435 | for i, content_without_punc in enumerate(main_content_parts_without_punc): 436 | filename = r'/segmented_Chinense_novel_run_' + str(i+1) + '.xlsx' 437 | save_to_xlsx(args.save_path, filename, content_without_punc) 438 | 
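
    # Example invocation with the defaults above:
    #   python cut_chinese_novel.py --divide_nums=4,8,12,16,20,24 \
    #       --Chinese_novel_path=../data/novel/xiaowangzi_main_text.txt \
    #       --save_path=../data/segmented_novel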
439 | 
440 | 
441 | 
442 | 
443 | 
444 | 
-------------------------------------------------------------------------------- /novel_segmentation_and_text_embeddings/embedding.py: --------------------------------------------------------------------------------
1 | from transformers import BertTokenizer, AutoModelForMaskedLM
2 | import openpyxl
3 | import os
4 | import numpy as np
5 | import torch
6 | import argparse
7 | 
8 | os.environ['CURL_CA_BUNDLE'] = ''
9 | 
10 | # Load the pre-trained dictionary and tokenization method
11 | tokenizer = BertTokenizer.from_pretrained(
12 |     pretrained_model_name_or_path='bert-base-chinese',
13 | )
14 | 
15 | model = AutoModelForMaskedLM.from_pretrained("bert-base-chinese")
16 | 
17 | parser = argparse.ArgumentParser(description='Parameters that can be changed')
18 | 
19 | parser.add_argument('--Chinese_novel_path', type=str, default=r'../data/segmented_novel',
20 |                     help='Path to the folder that contains the .xlsx files of the texts')
21 | parser.add_argument('--run_num', type=int, default=7,
22 |                     help='Number of runs of your experiment')
23 | parser.add_argument('--save_path', type=str, default=r'../data/embeddings',
24 |                     help='Path to save the embedding files')
25 | 
26 | args = parser.parse_args()
27 | 
28 | # Make sure the output directory exists before saving
29 | os.makedirs(args.save_path, exist_ok=True)
30 | 
31 | for i in range(args.run_num):
32 |     novel_path = args.Chinese_novel_path + '/segmented_Chinense_novel_run_' + str(i+1) + '.xlsx'
33 | 
34 |     wb = openpyxl.load_workbook(novel_path)
35 |     wsheet = wb.active
36 |     texts = []
37 | 
38 |     # Column 1 holds the text; row 1 is the header
39 |     for j in range(2, wsheet.max_row + 1):
40 |         texts.append((wsheet.cell(row=j, column=1)).value)
41 | 
42 |     print(texts)
43 | 
44 |     embeddings = []
45 |     for k in range(len(texts)):
46 |         token = tokenizer.encode(texts[k], return_tensors='pt')
47 |         embedding = model(token).logits
48 |         embedding = torch.mean(embedding, dim=1)  # mean-pool over token positions
49 |         embeddings.append(embedding.detach().numpy())
50 | 
51 |     embeddings = np.array(embeddings)
52 | 
53 |     embeddings = embeddings.reshape(embeddings.shape[0], embeddings.shape[2])
54 | 
55 |     print(embeddings)
56 | 
57 |     np.save(args.save_path + '/text_embedding_run_' + str(i+1) + '.npy', embeddings)
58 | 
-------------------------------------------------------------------------------- /random_mask/random_mask.py: --------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 | 
4 | np.random.seed(0)
5 | 
6 | def generate_unique_array(length, upper_limit):
7 |     """
8 |     Generate a random vector with non-repeating elements.
It is used to determine which positions in the data are masked.
9 |     :param length: The length of the vector to generate
10 |     :param upper_limit: The upper bound (exclusive) of the elements in the vector
11 |     :return: A random vector with non-repeating elements
12 |     """
13 |     if length > upper_limit:
14 |         raise ValueError("Length cannot be greater than the upper limit.")
15 | 
16 |     unique_elements = np.arange(0, upper_limit)
17 |     np.random.shuffle(unique_elements)
18 |     unique_array = unique_elements[:length]
19 |     return unique_array
20 | 
21 | 
22 | def mask(eeg, mask_rate, same_place=False):
23 |     """
24 |     Randomly mask EEG data.
25 |     :param eeg: numpy array, [n_channel, length]
26 |     :param mask_rate: mask ratio, between [0, 1]
27 |     :param same_place: Whether the mask positions are the same for each channel; the default is False
28 |     :return: masked EEG data, numpy array, [n_channel, length]
29 |     """
30 | 
31 |     mask_length = round(eeg.shape[1] * mask_rate)
32 | 
33 |     if same_place:
34 |         # One set of positions, zeroed in every channel
35 |         mask_places = generate_unique_array(mask_length, eeg.shape[1])
36 |         eeg[:, mask_places] = 0
37 |     else:
38 |         # A fresh set of positions for each channel
39 |         for channel in range(eeg.shape[0]):
40 |             mask_places = generate_unique_array(mask_length, eeg.shape[1])
41 |             eeg[channel, mask_places] = 0
42 | 
43 |     return eeg
44 | 
45 | def main():
46 |     input_eeg = np.load(r"../data/random_mask/test.npy")[:, :1000]  # full array is (15, 2560); keep the first 1000 samples
47 |     input_eeg_copy = np.copy(input_eeg)
48 | 
49 |     masked = mask(input_eeg_copy, mask_rate=0.75, same_place=False)
50 |     plt.scatter(range(masked.shape[1]), masked[0], s=1, label="masked signal 0", c="green")
51 |     plt.scatter(range(masked.shape[1]), masked[1], s=1, label="masked signal 1", c="red")
52 |     plt.legend()
53 |     plt.title("mask_rate=0.75, same_place=False")
54 |     plt.show()
55 | 
56 | 
57 | if __name__ == "__main__":
58 |     main()
59 | 
-------------------------------------------------------------------------------- /random_mask/readme.md: --------------------------------------------------------------------------------
1 | # Random Mask for EEG Data
2 | 
3 | ## Introduction
4 | 
5 | This script implements random masking of EEG data.
6 | 
7 | ## Environment Requirement
8 | 
9 | `matplotlib >= 3.7.0`
10 | 
11 | `numpy >= 1.25.0`
12 | 
13 | ## Example
14 | 
15 | We use EEG data of shape (15, 2560), where 15 is the number of channels and 2560 is the number of samples. The data can be found in `data/random_mask/test.npy`.
16 | 
17 | Here is an example of how to use this script:
18 | 
19 | ```
20 | import numpy as np
21 | import matplotlib.pyplot as plt
22 | from random_mask import mask
23 | 
24 | input_eeg = np.load("test.npy")[:, :1000]  # full array is (15, 2560)
25 | input_eeg_copy = np.copy(input_eeg)
26 | masked = mask(input_eeg_copy, mask_rate=0.75, same_place=True)
27 | 
28 | plt.scatter(range(input_eeg.shape[1]), input_eeg[0], s=1, label="origin signal 0", c="red")
29 | plt.scatter(range(masked.shape[1]), masked[0], s=1, label="masked signal 0", c="green")
30 | plt.legend()
31 | plt.title("mask_rate=0.75, same_place=True")
32 | plt.show()
33 | ```
34 | 
35 | The result is shown below:
36 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/random_mask1.png)
37 | 
38 | If `same_place=True`, all channels are masked at the same positions:
39 | 
40 | ```
41 | masked = mask(input_eeg_copy, mask_rate=0.75, same_place=True)
42 | plt.scatter(range(masked.shape[1]), masked[0], s=1, label="masked signal 0", c="green")
43 | plt.scatter(range(masked.shape[1]), masked[1], s=1, label="masked signal 1", c="red")
44 | plt.legend()
45 | plt.title("mask_rate=0.75, same_place=True")
46 | plt.show()
47 | ```
48 | 
49 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/random_mask2.png)
50 | 
51 | If `same_place=False`, the channels are **not** masked at the same positions:
52 | 
53 | ![](https://github.com/ncclabsustech/Chinese_reading_task_eeg_processing/blob/main/image/random_mask3.png)
54 | 
55 | 
56 | 
57 | 
58 | 
59 | ## Main Functions
60 | 
61 | `generate_unique_array(length, upper_limit)`: Produce a random vector with no duplicated elements. Used to determine the sites for masking.
62 | 
63 | `mask(eeg, mask_rate, same_place=False)`: Randomly mask EEG data.
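
As a quick sanity check (assuming `random_mask.py` is importable from the working directory), you can confirm that the requested fraction of samples is zeroed in every channel:

```
import numpy as np
from random_mask import mask  # assumes random_mask.py is on the import path

data = np.random.randn(15, 1000)
masked = mask(np.copy(data), mask_rate=0.75, same_place=False)

# round(1000 * 0.75) = 750 samples per channel are set to zero,
# so each channel's zero fraction should be almost exactly 0.75:
print((masked == 0).mean(axis=1))
```
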
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
aiohttp==3.8.6
aiortsp @ git+https://github.com/m4reko/aiortsp@16b0e084e2520759ed32ff1dd911d82db84b8f34
aiosignal==1.3.1
arabic-reshaper==3.0.0
astunparse==1.6.3
async-timeout==4.0.3
attrs==23.1.0
av==10.0.0
beautifulsoup4==4.12.2
blosc2==2.3.1
certifi==2023.11.17
cffi==1.16.0
charset-normalizer==3.3.2
colorama==0.4.6
contourpy==1.2.0
cryptography==41.0.5
cycler==0.12.1
decorator==5.1.1
defusedxml==0.7.1
dpkt==1.9.8
dukpy==0.3.0
EDFlib-Python==1.0.8
eeglabio==0.0.2.post4
egi_pynetstation==1.0.1
elementpath==4.1.5
esprima==4.0.1
et-xmlfile==1.1.0
ffpyplayer==4.5.1
fonttools==4.45.0
freetype-py==2.4.0
frozenlist==1.4.0
future==0.18.3
g3pylib @ file:///D:/%E7%89%9F%E6%96%B0%E8%AF%AD%E7%9A%84%E6%96%87%E4%BB%B6%E5%A4%B9/%E7%A7%91%E5%A4%A7%E8%AE%AF%E9%A3%9E%E5%AE%9E%E4%B9%A0/%E9%A1%B9%E7%9B%AE%E5%86%85%E5%AE%B9/EEG%E9%A2%84%E8%AE%AD%E7%BB%83%2B%E8%AF%AD%E8%A8%80%E8%A7%A3%E7%A0%81%E9%A1%B9%E7%9B%AE/%E6%95%B0%E6%8D%AE%E5%A4%84%E7%90%86/%E6%95%B0%E6%8D%AE%E5%A4%84%E7%90%86%E5%89%8D%E4%BA%94%E4%B8%AA/git_new/Chinese_reading_task_eeg_processing/experiment/g3pylib
gevent==23.9.1
gitdb==4.0.11
GitPython==3.1.40
greenlet==3.0.1
h5py==3.10.0
html2text==2020.1.16
idna==3.4
ifaddr==0.2.0
imageio==2.33.0
imageio-ffmpeg==0.4.9
javascripthon==0.12
jedi==0.19.1
Jinja2==3.1.2
json-tricks==3.17.3
kiwisolver==1.4.5
lazy_loader==0.3
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib==3.8.2
mdurl==0.1.2
mne==1.6.0
mne-bids==0.14
msgpack==1.0.7
msgpack-numpy==0.4.8
multidict==6.0.4
ndindex==1.7
nibabel==5.1.0
ntplib==0.4.0
numexpr==2.8.7
numpy @ file:///D:/bld/numpy_1653325554875/work
opencv-python==4.8.1.78
openpyxl==3.1.2
packaging==23.2
pandas==2.1.3
parso==0.8.3
Pillow==10.1.0
platformdirs==4.0.0
pooch==1.8.0
psutil==5.9.6
psychopy==2023.2.3
psychtoolbox==3.0.19.0
py-cpuinfo==9.0.0
pybv==0.7.5
pycparser==2.21
pyglet==1.4.11
pymatreader==0.0.32
pyparallel==0.2.2
pyparsing==3.1.1
pypi-search==2.0
pypiwin32==223
PyQt5==5.15.10
PyQt5-Qt5==5.15.2
PyQt5-sip==12.13.0
pyserial==3.5
python-bidi==0.4.2
python-dateutil==2.8.2
python-gitlab==4.1.1
python-vlc==3.0.11115
pytz==2023.3.post1
pywin32==306
pyWinhook==1.6.2
PyYAML==6.0.1
pyzmq==25.1.1
questplus==2023.1
requests==2.31.0
requests-toolbelt==1.0.0
scipy==1.11.4
six==1.16.0
smmap==5.0.1
soundfile==0.12.1
soupsieve==2.5
tables==3.9.1
tqdm==4.66.1
tzdata==2023.3
ujson==5.8.0
urllib3==2.1.0
websockets==10.4
wxPython==4.2.1
xarray==2023.11.0
xmlschema==2.5.0
xmltodict==0.13.0
yarl==1.9.3
zeroconf==0.47.4
zope.event==5.0
zope.interface==6.1
--------------------------------------------------------------------------------
/scripts_test/egi_test.py:
--------------------------------------------------------------------------------
from egi_pynetstation import NetStation
import time

"""Test whether the EGI device can be connected properly, record EEG data, and send event markers."""

print("import pynetstation")
# IP address of the NetStation host
IP_ns = '10.10.10.42'
# Port that NetStation is listening on, as an integer
port_ns = 55513
ns = NetStation.NetStation(IP_ns, port_ns)
# Set an NTP clock server (the amplifier) address as an IPv4 string
IP_amp = '10.10.10.51'
ns.connect(ntp_ip=IP_amp)

# sync_time = time.time()
# ns.set_synched_time(sync_time)

print("connected to netstation")


ns.begin_rec()
print("start recording")

# Send a test marker every 2 seconds, five times in total
n = 5
test_data = {"dogs": "0001"}  # example payload; not attached to the events sent below
while n > 0:
    time.sleep(2)
    ns.send_event(event_type="test")
    n -= 1

# Stop the recording and disconnect manually when the test is done:
# ns.end_rec()
# ns.disconnect()
--------------------------------------------------------------------------------
/scripts_test/eyetracker_test.py:
--------------------------------------------------------------------------------
import asyncio
import logging
import os

from g3pylib import connect_to_glasses

logging.basicConfig(level=logging.INFO)


async def access_recordings():
    # "G3_HOSTNAME" is the serial number of the eye tracker
    async with connect_to_glasses.with_hostname(os.environ["G3_HOSTNAME"]) as g3:
        async with g3.recordings.keep_updated_in_context():
            logging.info(
                f"Recordings before: {list(map(lambda r: r.uuid, g3.recordings.children))}"
            )
            await g3.recorder.start()
            logging.info("Creating new recording")
            await asyncio.sleep(3)
            await g3.recorder.stop()
            logging.info(
                f"Recordings after: {list(map(lambda r: r.uuid, g3.recordings.children))}"
            )
            creation_time = await g3.recordings[0].get_created()
            logging.info(f"Creation time of last recording in UTC: {creation_time}")


def main():
    asyncio.run(access_recordings())


if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
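A note on running `eyetracker_test.py`: the script reads the glasses' address from the `G3_HOSTNAME` environment variable before connecting. If you prefer to set it from Python rather than the shell, a minimal sketch (the serial number is a placeholder; substitute your own device's) is:

```
import os

os.environ["G3_HOSTNAME"] = "tg03b-12345678"  # placeholder serial number

import eyetracker_test  # run from the scripts_test/ directory
eyetracker_test.main()
```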