├── .DS_Store ├── .gitbook └── assets │ ├── adalNetVsMsalNet.png │ ├── adalNetVsMsalNetAPI.png │ ├── image (1).png │ ├── image.png │ ├── msalMigrationProjectScope.png │ ├── msalMultiLayerResilience.png │ ├── msalNetFlowchart.png │ ├── msalRollout.png │ ├── msalSharedLibraryStructure.png │ └── redisTokenCacheDS.png ├── .gitignore ├── Anti-Bio.md ├── Bio.md ├── Documents ├── .DS_Store ├── 317A-GivingEffectiveFeedback.pdf ├── 7-Observable-Behaviors-of-Executive-Presence.pdf ├── AudiencePostures.pdf ├── CommunicatingWithPrecision.pdf ├── CommunicationStyleCharacters.pptx ├── ConflictResolutionByFredKofman.pdf ├── CoreProtocols_EricZhang.pptx ├── Disk_PeopleReading.pdf ├── Generic IDP .docx ├── MS_CoreExpectations.pdf ├── MS_CoreExpectations.pptx ├── PowerfulQuestions .pdf ├── SituationalLeadership.pdf ├── SituationalLeadership.pptx ├── The Integrated Leader.pdf ├── circleOfFeedback.png ├── energyLeadership_chinese.jpg ├── energyLeadership_chineseBack.jpg └── energyLeadership_english.jpg ├── RandomThoughts ├── PathInEntrepreneurship.md ├── Paths.md ├── Writings │ ├── README.md │ ├── fo-xue-ci-di-hua-kai.md │ ├── fo-xue-tian-long-ba-bu-yu-zhi-nian.md │ ├── guan-xi-hong-lou-zhong-de-xian-yuan-yu-chen-yuan.md │ ├── guan-xi-ji-nian-fu-qin-wu-shi-jiu-sui-sheng-ri.md │ ├── individual-phychology.md │ ├── technology-entrepreneurship.md │ ├── ying-xiang-shi-jie-de-fa-xian-kang-sheng-su.md │ ├── zhe-xue-fu-lan-ke-lin-zi-chuan.md │ └── zhe-xue-ke-zhou-qiu-jian-yu-yu-qie.md └── backlog.md ├── SUMMARY.md ├── _config.yml ├── _layouts └── default.html └── career ├── Meeting notes.md ├── presentations.md ├── publications.md ├── resume_finraintern.md ├── resume_opentextanalytics.md ├── teams_resume.md └── toastMasterFeedback.md /.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.DS_Store 
-------------------------------------------------------------------------------- /.gitbook/assets/adalNetVsMsalNet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/adalNetVsMsalNet.png -------------------------------------------------------------------------------- /.gitbook/assets/adalNetVsMsalNetAPI.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/adalNetVsMsalNetAPI.png -------------------------------------------------------------------------------- /.gitbook/assets/image (1).png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/image (1).png -------------------------------------------------------------------------------- /.gitbook/assets/image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/image.png -------------------------------------------------------------------------------- /.gitbook/assets/msalMigrationProjectScope.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/msalMigrationProjectScope.png -------------------------------------------------------------------------------- /.gitbook/assets/msalMultiLayerResilience.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/msalMultiLayerResilience.png -------------------------------------------------------------------------------- /.gitbook/assets/msalNetFlowchart.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/msalNetFlowchart.png -------------------------------------------------------------------------------- /.gitbook/assets/msalRollout.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/msalRollout.png -------------------------------------------------------------------------------- /.gitbook/assets/msalSharedLibraryStructure.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/msalSharedLibraryStructure.png -------------------------------------------------------------------------------- /.gitbook/assets/redisTokenCacheDS.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/.gitbook/assets/redisTokenCacheDS.png -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | 2 | .DS_Store 3 | .DS_Store 4 | .DS_Store 5 | .DS_Store 6 | -------------------------------------------------------------------------------- /Anti-Bio.md: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Anti-Bio.md -------------------------------------------------------------------------------- /Bio.md: -------------------------------------------------------------------------------- 1 | # About Me 2 | 3 | 4 | -------------------------------------------------------------------------------- /Documents/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/.DS_Store -------------------------------------------------------------------------------- /Documents/317A-GivingEffectiveFeedback.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/317A-GivingEffectiveFeedback.pdf -------------------------------------------------------------------------------- /Documents/7-Observable-Behaviors-of-Executive-Presence.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/7-Observable-Behaviors-of-Executive-Presence.pdf -------------------------------------------------------------------------------- /Documents/AudiencePostures.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/AudiencePostures.pdf -------------------------------------------------------------------------------- /Documents/CommunicatingWithPrecision.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/CommunicatingWithPrecision.pdf -------------------------------------------------------------------------------- /Documents/CommunicationStyleCharacters.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/CommunicationStyleCharacters.pptx -------------------------------------------------------------------------------- /Documents/ConflictResolutionByFredKofman.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/ConflictResolutionByFredKofman.pdf -------------------------------------------------------------------------------- /Documents/CoreProtocols_EricZhang.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/CoreProtocols_EricZhang.pptx -------------------------------------------------------------------------------- /Documents/Disk_PeopleReading.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/Disk_PeopleReading.pdf -------------------------------------------------------------------------------- /Documents/Generic IDP .docx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/Generic IDP .docx -------------------------------------------------------------------------------- 
/Documents/MS_CoreExpectations.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/MS_CoreExpectations.pdf -------------------------------------------------------------------------------- /Documents/MS_CoreExpectations.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/MS_CoreExpectations.pptx -------------------------------------------------------------------------------- /Documents/PowerfulQuestions .pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/PowerfulQuestions .pdf -------------------------------------------------------------------------------- /Documents/SituationalLeadership.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/SituationalLeadership.pdf -------------------------------------------------------------------------------- /Documents/SituationalLeadership.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/SituationalLeadership.pptx -------------------------------------------------------------------------------- /Documents/The Integrated Leader.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/The Integrated Leader.pdf 
-------------------------------------------------------------------------------- /Documents/circleOfFeedback.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/circleOfFeedback.png -------------------------------------------------------------------------------- /Documents/energyLeadership_chinese.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/energyLeadership_chinese.jpg -------------------------------------------------------------------------------- /Documents/energyLeadership_chineseBack.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/energyLeadership_chineseBack.jpg -------------------------------------------------------------------------------- /Documents/energyLeadership_english.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/Documents/energyLeadership_english.jpg -------------------------------------------------------------------------------- /RandomThoughts/PathInEntrepreneurship.md: -------------------------------------------------------------------------------- 1 | * https://femgineer.com/about/ 2 | 3 | * 融云IM: http://www.52im.net/thread-2703-1-1.html -------------------------------------------------------------------------------- /RandomThoughts/Paths.md: -------------------------------------------------------------------------------- 1 | - [Career models](#career-models) 2 | - [Generalist](#generalist) 3 | - [Specialist](#specialist) 4 | - 
[Manager](#manager) 5 | - [References](#references) 6 | - [Next companies](#next-companies) 7 | - [Where am I now](#where-am-i-now) 8 | - [What are my fellow folks doing](#what-are-my-fellow-folks-doing) 9 | - [Standards](#standards) 10 | - [1. High growth](#1-high-growth) 11 | - [2. Job and scope](#2-job-and-scope) 12 | - [3. Interesting domain](#3-interesting-domain) 13 | - [4. Culture and diversity](#4-culture-and-diversity) 14 | - [5. Compensation](#5-compensation) 15 | - [Candidate dimensions](#candidate-dimensions) 16 | - [Next companies](#next-companies-1) 17 | - [Paths](#paths) 18 | 19 | # Career models 20 | ## Generalist 21 | * Joel Spolsky: https://www.joelonsoftware.com/about-me/ 22 | * David Heinemeier Hansson: https://dhh.dk/ 23 | 24 | ## Specialist 25 | * Brendan Gregg: https://www.brendangregg.com/ 26 | * Glen Kohl: https://www.linkedin.com/in/glen-kohl-25bb0a29/ 27 | 28 | ## Manager 29 | * Julia H Grace: http://www.juliahgrace.com/ 30 | 31 | ## References 32 | * Kun Xi's thoughts on engineering career path: https://www.kunxi.org/2021/01/careerup-mentoring-program-recap/ 33 | 34 | # Next companies 35 | ## Where am I now 36 | * Service owner / Auth area expert 37 | 38 | ## What are my fellow folks doing 39 | * Identity at Lyft 40 | * Storage at FB 41 | * IM at FB 42 | 43 | ## Standards 44 | ### 1. High growth 45 | 1. Already passed product market fit stage, mid to late staged startup 46 | 2. Rapidly expanding locations/department in big techs 47 | 48 | ### 2. Job and scope 49 | * Could only see when 50 | 51 | ### 3. Interesting domain 52 | 1. Fin tech / Blockchain 53 | 2. Database 54 | 3. Big data and stream processing 55 | 4. Authentication and authorization 56 | 57 | ### 4. Culture and diversity 58 | 1. Not too much rush for speed 59 | 60 | ### 5. 
Compensation 61 | 62 | ### Candidate dimensions 63 | * [Company and products research](https://docs.google.com/spreadsheets/d/1Pa2ma5UNvy-j_9HYMK6jdslj9V88VZwtapuJPAIMees/edit?usp=sharing) 64 | * [Culture research](https://docs.google.com/spreadsheets/d/1qiEMJvnqP8ZJ7qje5pJRn1dTFT2vAU_QtVKxw7AK6IU/edit#gid=1271895131) 65 | 66 | ## Next companies 67 | | | `High Growth` | `Job and scope` | `Interesting domain` | `Culture and diversity` | `Compensation` | 68 | |---|---|---|---|---|---| 69 | | Databricks | Y | Unknown | Y | Unknown | Unknown | 70 | | Stripe | Y | Unknown | Y | Unknown | Unknown | 71 | | Notion | Y | Unknown | Y | Unknown | Unknown | 72 | | Pinterest | N | Unknown | Y | Unknown | Unknown | 73 | | Airbnb | N | Unknown | Y | Unknown | Unknown | 74 | | Coinbase | N | Unknown | Y | Unknown | Unknown | 75 | | Offerup | Y | Unknown | N | Unknown | Unknown | 76 | | Google | N | Unknown | TBD | Unknown | Unknown | 77 | | Facebook | Y | AR/VR | TBD | Unknown | Unknown | 78 | 79 | ## Paths 80 | * https://femgineer.com/about/ 81 | * 融云IM: http://www.52im.net/thread-2703-1-1.html -------------------------------------------------------------------------------- /RandomThoughts/Writings/README.md: -------------------------------------------------------------------------------- 1 | * Douban (豆瓣): https://www.douban.com/people/205870843/ 2 | * Zhihu (知乎): zhihu.com/people/shijie-zhang-seattle 3 | -------------------------------------------------------------------------------- /RandomThoughts/Writings/fo-xue-ci-di-hua-kai.md: -------------------------------------------------------------------------------- 1 | # Buddhism: 《次第花开》 (Blossoming in Due Order) 2 | 3 | Written on 08/13/2020 4 | 5 | ### Impermanence 6 | 7 | I vaguely remember early March 2020, when Washington State issued its stay-at-home order and I packed up my office supplies to start working from home. At the time I felt more delight than worry. The delight was that working from home would let me arrange my time more flexibly and save the hours spent on office banter and commuting; the worry was that the pandemic would disrupt my social life. Still, it seemed fine on reflection: China had brought the outbreak under control in two or three months, so the United States, with its advanced medical system, surely would not fare much worse, and a few undisturbed months to bury myself in some projects sounded rather nice. But the world is impermanent, and things did not unfold according to my preferences or expectations. 8 | 9 | ### The discriminating mind 10 | 11 | At first I felt somewhat vexed: "When will the pandemic be over?" Some activities were hard to move forward. Seeing that the pandemic would drag on for at least another half year, I took online Hatha yoga classes and realized that months of head-down study and work had made me a little chubby: "hmm, 
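I don't really like it; it feels like a regression." Having reached thirty, the age at which one is expected to be established, I also felt the weight of choices about family and career. I planned to use the pandemic to settle A, B, C, D, and E all at once, imagining that the pressure would ease once I reached the far shore. This discriminating mind sometimes made me anxious too. 12 | 13 | ### Mindfulness 14 | 15 | But life has no far shore to begin with. Living in this world, we cannot avoid birth, aging, sickness, and death; beyond these, different people under different conditions also suffer from meeting those they resent, parting from those they love, failing to get what they seek, and facing what they do not want. Of all these, the most painful is the feeling of being different from everyone else: the suffering of believing that one should not be suffering, and being tangled up in that emotion. 16 | 17 | Mindfulness tells us precisely that suffering is like clouds in a blue sky. If we observe and savor it, behind every suffering there is a faint trace of joy, because it is the fruit of yesterday's causes, and this moment is itself a new cause and effect. 18 | 19 | ### Watching it go 20 | 21 | Wanting to do too much and lacking the strength for it inevitably brings anxiety. Having grand ideas but no plan, and not knowing where to start, also brings anxiety. Having a plan but failing to stick to it, watching time trickle away, makes it hard not to be anxious. 22 | 23 | Having done these things, all that remains is to watch the causes I have lived through recede into the distance. 24 | -------------------------------------------------------------------------------- /RandomThoughts/Writings/fo-xue-tian-long-ba-bu-yu-zhi-nian.md: -------------------------------------------------------------------------------- 1 | # Buddhism: Demi-Gods and Semi-Devils and attachment 2 | 3 | * [Demi-Gods and Semi-Devils and attachment](fo-xue-tian-long-ba-bu-yu-zhi-nian.md#demi-gods-and-semi-devils-and-attachment) 4 | * [Balance comes only from letting go of attachment](fo-xue-tian-long-ba-bu-yu-zhi-nian.md#balance-comes-only-from-letting-go-of-attachment) 5 | * [When faith turns into attachment](fo-xue-tian-long-ba-bu-yu-zhi-nian.md#when-faith-turns-into-attachment) 6 | * [Imbalance under attachment](fo-xue-tian-long-ba-bu-yu-zhi-nian.md#imbalance-under-attachment) 7 | 8 | Written on 11-06-2020 9 | 10 | ## Balance comes only from letting go of attachment 11 | 12 | This week's turbulent atmosphere, if anything, turns my mind to history and to people who are gone. I suddenly realized that two years have passed since Jin Yong (金庸) died on October 30, 2018. As a reader who grew up reading his works countless times without fully understanding them, I have long wanted to write some of my thoughts down, both as a record of my own life and as a tribute to him. 13 | 14 | The two works of Jin Yong that impressed me most are his two longest, *Demi-Gods and Semi-Devils* (《天龙八部》) and *The Deer and the Cauldron* (《鹿鼎记》); the former draws on Buddhist scripture and the latter on history. Today let us start with the former. In 1966 Chen Shixiang (陈世骧) summed up the novel as "every sentient being is caught in karmic suffering, and no one is without grievance." All of its protagonists are helplessly drawn into the cycles and tricks of fate. Some eventually transcend and end the cycle: Xiao Feng ends the feud between Song and Liao with a heroic death; Xuzhu gives up his childhood wish of becoming an eminent Shaolin monk and becomes the carefree master of Lingjiu Palace and prince consort of Western Xia; even Duan Yu finally lets go of his fixation on Wang Yuyan's beauty and marries Mu Wanqing and Zhong Ling. All three enjoy the protagonist's halo, with luck and talent beyond what ordinary people can imagine, and all are positive examples of breaking attachment and the cycle. Yet we often learn and realize more from the negative examples. 15 | 16 | ## When faith turns into attachment 17 | 18 | The character in the novel who feels most real to me, and who changes the most, is actually Murong Fu. The faith of his whole life is to inherit his ancestors' cause and restore the Great Yan, even though more than seven hundred years separate the Song dynasty from the Great Yan, even though the Murong family is few in number, and even though the conditions for using his family's martial art, Dou Zhuan Xing Yi (斗转星移), are in fact very demanding... Rising from commoner to emperor is something only two people have ever managed in all of Chinese history. It is hard indeed, and no wonder Jin Yong playfully called him the "Hard" Murong (难慕容, a pun on his epithet 南慕容, "Murong of the South"). Murong Fu is not a likable character in the novel: aloof, narrow-minded, and defeated again and again. Yet right up until he goes to Western Xia to compete for the princess's hand, I felt that although most people could not accept him, he at least knew what he wanted and was willing to fight for it with all his strength. What changed my view was his answer to Princess Yinchuan's question. 19 | 20 | "Sir, where in your life have you been most carefree and happy?" Murong Fu had heard the maid put this question to forty or fifty people, yet when it was his turn he was suddenly tongue-tied and could not answer. All his life he had toiled ceaselessly, forever rushing about for the restoration of Yan. Others, seeing him young and handsome, supremely skilled, famous throughout the land, and held in awe by the whole martial world, assumed he must be thoroughly content, but in his heart he had never truly been happy. Murong Fu was stunned for a moment and then said: "If I am ever to feel truly happy, that lies in the future, not in the past." 21 | 22 | This answer shows that restoring Yan had become an attachment rather than a faith. He could no longer find any flow in what he was doing; it was well past time to see through it and let go. 23 | 24 | ## Imbalance under attachment 25 | 26 | 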
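Although they call each other "cousin" affectionately, Murong Fu and Wang Yuyan strike me as the most ill-matched pair in the novel, kept together by attachment. From Wang Yuyan's side: she grew up in Mantuo Manor and had little contact with adult men, and she and Murong Fu never went through the stages of a romance, from attraction, uncertainty, exclusivity, and intimacy through to engagement, but jumped straight to being all but engaged. That might have been fine if they were in tune with each other, but they are not. Wang Yuyan's most nerve-racking habit is coaching Murong Fu through his fights in front of everyone as if teaching a child, while most men want to be trusted and accepted. She means well and fears her cousin will come off worse, but there are better ways to offer advice. From Murong Fu's side: although her coaching repeatedly makes him lose his temper in public, he never once talks to her about it. Moreover, he spends all his days running about for the cause of Great Yan and has almost no time or energy to look after Wang Yuyan; so little emotional support would be unacceptable even to someone who has already set her heart on him. 27 | 28 | The six pāramitās of Buddhism teach us to see through attachment and let it go; only then can we better balance our own lives. 29 | -------------------------------------------------------------------------------- /RandomThoughts/Writings/guan-xi-hong-lou-zhong-de-xian-yuan-yu-chen-yuan.md: -------------------------------------------------------------------------------- 1 | # Relationships: Celestial bonds and worldly bonds in Dream of the Red Chamber 2 | 3 | Written on 2020-04-26 4 | 5 | 6 | 7 | Jia Baoyu loves two people most: Daiyu and Xiren. 8 | 9 | ### The celestial bond 10 | 11 | Daiyu and Baoyu share a celestial bond. Deeply entwined in a previous life, they feel at their first meeting in this one as though looking back on something familiar. Daiyu, seeing Baoyu, is startled: "How strangely familiar he looks!" Baoyu, seeing Daiyu, smiles: "I have met this cousin before." 12 | 13 | Their love begins in childhood innocence: by day they walk and sit together, and by night they retire at the same time. Baochai's arrival becomes the occasion for Daiyu's first stirrings of love. In the chapter "Visiting Baochai, Daiyu is half jealous" (《探宝钗黛玉半含酸》), her longing for exclusive affection makes her respond with sour, coquettish reproach to any hint of the "match of gold and jade," while Baoyu, himself in love, catches the subtlety in her words and answers deftly. Love shows itself fully in plain, everyday exchanges. Daiyu's touch of jealousy, Baoyu's small gestures of deference: the sweet-and-sour feeling of first love ripples in both their hearts. 14 | 15 | The entanglements of love always spring from the heart. However trivial a matter is, once it touches a lover's heart it is no longer trivial. In the chapter "The Rong mansion celebrates the Lantern Festival during the imperial consort's visit" (《荣国府归省庆元宵》), Daiyu mistakenly believes that Baoyu has casually given away the purse she embroidered stitch by stitch. In a fit of pique she returns to her room, picks up the sachet Baoyu had asked her to make, and cuts it up. Baoyu hurries over to explain, takes the purse from inside the lapel of his red jacket where he has been wearing it, tosses it into her lap, and walks off. Daiyu grows angrier still; choking with sobs, she bursts into tears, picks up the purse, and starts cutting again. At this Baoyu cannot help apologizing with one "dear cousin" after another. Willfulness and sulking are reserved for the person one loves most. After the anger, the hurt, and the tears, two hearts draw closer once more, and the intimacy deepens step by step. 16 | 17 | Just as every person is a unity of body, mind, and spirit, love unfolds on the same three levels. In the chapter "Fine lines from Romance of the Western Chamber convey playful words" (《西厢记妙词通戏语》), Baoyu and Daiyu read *Romance of the Western Chamber* (《西厢记》) together beneath drifting peach blossoms. When Baoyu, carried away, says, "I am the one 'full of sorrow and sickness,' and you are the one 'whose beauty topples cities and kingdoms,'" Daiyu's cheeks flush with anger: "What wretched nonsense! You bring in these lewd verses and then pick up such foolish talk to bully me with." In *Romance of the Western Chamber*, Cui Yingying and Zhang Sheng go to bed together right after these two lines are spoken, whereas Baoyu and Daiyu, even as desire stirs, are not swept away by physical passion. This self-restraint lets them move toward lasting growth on the spiritual level. 18 | 19 | In the end, everyone has to answer the question of how to regard their own life. In late spring Daiyu buries the fallen flowers in tears, and her "Song of Burying Flowers" (《葬花吟》) opens with a question about life and death: "Flowers fade and fly, filling the sky; their red gone, their fragrance spent, who pities them?" The color is about to vanish and the scent about to end; who will feel compassion for them? If no one grieves at the ending of a life, where does the meaning of life lie? 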
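Its closing line, "One day spring ends and beauty grows old; flowers fall, the person dies, and neither knows of the other," moves from flowers to people: Daiyu's own flowerlike beauty will one day be nowhere to be found. Baoyu had also come to bury flowers, but on hearing the song he sinks to the ground. For the first time in his life he faces life and death directly, and he finally understands: "Although every life will eventually pass away, if on the day I die I receive one person's sincere tears, then my life has had value." 20 | 21 | ### The worldly bond 22 | 23 | When Daiyu buries the flowers, others can hardly understand her obsession, but Baoyu can, and through it he comes to an insight into life and death. And only Daiyu never urges Baoyu to pursue office and fame, never judges this odd boy, who likes to hang around with girls and to taste their rouge, by the standard of "cultivate the self, order the family, bring peace to the world." He does not belong to that world, is not cut out for it, and has no heart for it; Daiyu truly understands him. For Baoyu, Daiyu is the communion of his soul, yet it is hard to imagine the two of them in a physical relationship, marrying and having children. Xiren, by contrast, is Baoyu's worldly bond. She plays nearly every female role in his life: mother, elder sister, concubine, and maid. 24 | 25 | A worldly bond inevitably involves all kinds of scheming and rivalry. Among the maids around Baoyu, Xiren is not the most outstanding in either looks or needlework, but she has a devotion and stubbornness no one else can match: Baoyu alone fills her heart and her eyes. Because she knows his temperament, his mind, and his likes and dislikes inside out, she can always find the right remedy. We see her reading his moods, acting when the moment is right, loosening her grip in order to hold him, and retreating in order to advance, all to manage his many absurd quirks. 26 | 27 | People in the mundane world have so much they are helpless to change, yet *Dream of the Red Chamber* looks on everyone's shortcomings with compassion and understands the pain of living that is rooted in them. Xiren is not content to be merely the most powerful senior maid in Yihong Court. She is ambitious and has a plan for her life: to become Baoyu's recognized concubine. To realize it, Baoyu's attachment alone is far from enough; the only legitimate path is recognition by those in power, that is, by Baoyu's mother, Lady Wang. In the end, after Baoyu's beating, under the hint that she might not be able to bear children, and amid the crisis of Baoyu and Daiyu's ever-deepening romance, she chooses to betray the Grand View Garden and side with Lady Wang. Afterwards Xiren completes her change in status from senior maid to concubine, but also bears the name of an informer. Yet the author of the novel forgives her; what the author sees is a girl doing everything in her power to change her lot in life. 28 | 29 | The celestial bond sees through life and death yet does not let go of deep love; the worldly bond sees clearly into suffering yet is full of compassion. 30 | -------------------------------------------------------------------------------- /RandomThoughts/Writings/guan-xi-ji-nian-fu-qin-wu-shi-jiu-sui-sheng-ri.md: -------------------------------------------------------------------------------- 1 | # Relationships: In honor of my father's fifty-ninth birthday 2 | 3 | Written on 07-27-2020 4 | 5 | 6 | 7 | On Sunday evening, the afterglow of the setting sun falls through the blinds onto my desk, and Xu Wei's song "Joy" (许巍,《喜悦》) plays in my ears: "May this warm sunlight be quietly shining on you too, carrying all my gratitude and my longing for you..." Hearing this serene, warm melody, I can no longer hold back my feelings, and I pick up my pen to write this piece in honor of my father's fifty-ninth birthday, grateful that the two of you gave me such a happy and harmonious home. 8 | 9 | ### Receiving love 10 | 11 | Until I finished my master's degree, my life was uneventful but smooth. With my eyes on grades and work, I moved forward step by step; amid approval and praise, I stayed contentedly absorbed in my own small world of studying well and working hard, hoping that when I grew up I could take more of the load off them. 12 | 13 | In thirty years of marriage, my parents have occasionally quarreled, and even come to blows, over work and family matters. But with one slow-tempered and the other quick, they complement each other well in many things. The harmony of our family and my parents' unconditional support have always been my strongest backing; however hard things get, I trust that I can get through. 14 | 15 | ### Giving love 16 | 17 | Mom is shrewd but suspicious, decisive but impatient, and her faint dimples make her seem warm and welcoming. Dad is resolute but reserved, gentle but stubborn, and his upright back always gives a sense of security. For a long time my conversations with my parents were often as perfunctory as reciting a daily log, until one day I began to realize that they need my support and encouragement too. 18 | 19 | In recent years Mom has voiced many emotional complaints. I used to be easily pulled into the same emotional state by her complaints and blame, so that our conversations ended on a sour note. Looking back, though, I am one of the few people with whom she can share her emotional burden. I need to work at staying calmer when we talk and not take her emotional remarks personally. Instead I should slow the conversation down as much as possible, help her accept what has already happened, attend to what is going on inside her, and try to share my calm and energy with her. 20 | 21 | Dad's anxiety has been growing in recent years. I used to think that he had weathered every storm over the years, so giving him time to be alone would be enough. Then one night he woke at three in the morning, could not get back to sleep, and called me, and I suddenly realized that he truly needs my support: 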
Mom's growing emotionality means the two of them cannot communicate well at the moment, so I need to act as a more proactive go-between and lead the family's conversations. Dad is approaching sixty and is having some trouble keeping up with the times, so I need to help him think through what could come next and share my outlook and experience with him. 22 | 23 | ### Love and joy 24 | 25 | Unlike the happiness of earning top grades at school or recognition at work, every family gathering brings a joy of love that asks for nothing in return. It is like climbing to a mountaintop not to pluck the moon but to let tonight's moonlight wash over me. Thank you for teaching me love and joy. 26 | -------------------------------------------------------------------------------- /RandomThoughts/Writings/individual-phychology.md: -------------------------------------------------------------------------------- 1 | # Intro 2 | * Hello everyone, today I am really glad to share with you something that blew my mind. 3 | * Alfred Adler made the point that "All problems are interpersonal relationship problems". 4 | * This statement might sound a bit inaccurate from a technical perspective. For example, exploring space beyond the solar system and making human beings a multi-planetary species is probably not an interpersonal relationship problem. 5 | * But I do feel a lot of challenges and even friction in interpersonal relationships. For example, I used to arrive at the office at 6:00 a.m. to be the most diligent person in the company and to be seen and praised by my manager. For another example, I struggle a bit when talking to my parents because they bring up the topic of marriage every time we have a phone call. 6 | 7 | * The two most important lessons I learned from this book are 8 | * How to build horizontal relationships instead of vertical ones 9 | * How to build one's happiness on the sense of contribution instead of the size of contribution. 10 | 11 | # Horizontal relationship 12 | ## Start with a question 13 | * Let's start with an easily understood example: teaching your child to do something, or training junior staff in the workplace. Generally speaking, two approaches are considered: one is raising by rebuke, or criticism, and the other is raising by praise. 14 | * Which one is the better choice? To rebuke or to praise? 15 | * When I first thought about this question, my answer was that of course it should be praise. 
16 | 17 | ## Is praise the right approach 18 | * For example, suppose I praised a statement you made by saying, "Good job!" Wouldn't hearing those 19 | words seem strange somehow? I guess it would put me in an unpleasant mood. What's unpleasant is the feeling that, with the words "Good job!", one is being talked down to. 20 | * The mother who praises the child by saying things 21 | like "You're such a good helper!" or "Good job!" is unconsciously creating a hierarchical relationship and seeing the child as beneath her. 22 | * Whether we praise or rebuke others, the only 23 | difference is whether we are using the carrot or the stick. The underlying goal is manipulation; it is like training an animal. 24 | * Alfred Adler categorizes both praise and rebuke as symbols of a vertical relationship, not a horizontal one. 25 | 26 | ## Correct 27 | ### From receiver's perspective 28 | * Being praised is always a motivation for me. From the receiver's point of view, when I do something great but no one praises it, I always feel a little disappointed. 29 | 30 | * If I motivate myself only through praise from others, I implicitly lead myself down a rabbit hole in the pursuit of happiness. To get more praise, you need to make bigger contributions. But happiness actually comes from the sense of contribution, not the size of contribution. Just imagine a person with a disability: he might contribute much less than someone without one and receive much less praise from others. Should he then not motivate himself to do things and feel happy? 31 | * 32 | * We should separate the task of how to feel happy from the task of how to make bigger contributions. 33 | 34 | ### From giver's perspective 35 | * So, concretely speaking, from the giver's perspective, how does one go about this? One cannot praise, and one cannot rebuke. What other words and choices are there? 36 | * The answer is really simple: we can always start with "Thank you" and acknowledge the impact someone has made. 
By saying thank you, you are actually saying that someone is beneficial to the community, and you help them build a true sense of their own worth. 37 | 38 | # Answer 39 | * This mindset does help me a lot. 40 | * In the case of my family, a horizontal-relationship mindset helps me a lot because I no longer try to change myself to win my parents' approval. I understand that even if they don't praise me, or even complain about me, that's not a problem for me. I don't want to be disagreed with or disliked, but how they interpret my behavior is their life topic, not mine. 41 | * In the case of my job, it also helps me focus on making a bigger impact rather than slipping easily into a personal mood. If I make a huge effort, arriving at the office at 6:00 a.m., and my manager never praises it, that's totally fine, because our common goal is to contribute to the team, company, and community. 42 | * In a word, this book, or individual psychology, teaches me how to build horizontal instead of vertical relationships, and how to gain happiness from the sense of contribution rather than the size of contribution. 
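43 | -------------------------------------------------------------------------------- /RandomThoughts/Writings/technology-entrepreneurship.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DreamOfTheRedChamber/think-bigger/01abcdc38369efdb5488e4a9b2358e99dc652df4/RandomThoughts/Writings/technology-entrepreneurship.md -------------------------------------------------------------------------------- /RandomThoughts/Writings/ying-xiang-shi-jie-de-fa-xian-kang-sheng-su.md: -------------------------------------------------------------------------------- 1 | # Discoveries that changed the world: Antibiotics 2 | 3 | Written on 07/27/2020 4 | 5 | 6 | 7 | ### A young humanity 8 | 9 | As Wu Jun (吴军) puts it in an analogy in *The Light of Civilization* (《文明之光》), "if the age of the Earth were compressed into one year, humans would appear only in the last half hour." Compared with the Earth, humanity is still very young and has a long road ahead. For that very reason, the many shortcomings we face today give us little cause for either arrogance or complaint. 10 | 11 | A single outbreak of the novel coronavirus knocked the lives of billions off their normal course. Countries could only pin their hopes on quarantine and vaccines to slow transmission; no specific cure for the virus was to be found. And it is not only viruses. I suddenly realized that for the vast majority of common ailments such as coughs and colds, and chronic conditions such as high blood pressure and high blood sugar, our medicines can at best relieve symptoms and keep the disease from worsening; they are far from curing it. 12 | 13 | ### Greater social responsibility begins with understanding 14 | 15 | To date there is exactly one glorious victory for which humanity can confidently claim to have found a specific cure: antibiotics, the weapon against bacterial infection. Discovered in the 1920s and put to large-scale use in the 1940s, antibiotics cured, almost overnight, afflictions that had plagued humankind for tens of thousands of years, including wound infections, tuberculosis, syphilis, dysentery, and cholera, raising average human life expectancy from 45 to 60. 16 | 17 | From the earliest penicillin and streptomycin to the present, most antibiotics were not invented by humans. But after tens of thousands of years, humanity finally learned to fight disease by borrowing the defense mechanisms that organisms such as fungi and actinomycetes evolved over hundreds of millions of years. 18 | 19 | With my science and engineering background, I have always aspired to take part in major technological innovation that improves human life. To shoulder more social responsibility in the future, I need to start learning and understanding more now; after all, action begins with ideas. 20 | 21 | ### The history of penicillin's application: from 0 to N 22 | 23 | #### Simple but not easy: from 0 to 1, the accidental discovery of penicillin 24 | 25 | My middle-school textbook told the story "Fleming and Penicillin": Fleming was running experiments with bacterial cultures in petri dishes when, by chance, he noticed a spot of blue mold in one culture with no bacteria around it. He seized the opportunity, investigated further, discovered penicillin, and won the Nobel Prize. 26 | 27 | I dimly remember that after that lesson I too dreamed of having such luck someday, worked at cultivating a prepared mind, and felt discouraged by my own lack of inspiration. The story sounds simple, but doing it is anything but easy. Scientists usually prepare many petri dishes for an experiment; the experiment at the time was not aimed at finding a substance that kills bacteria; and with limited laboratory conditions, sample contamination was perfectly normal. Had Fleming investigated every dish that looked slightly off, he would most likely have been wasting his time. To work efficiently he had to know how to ignore noise and stay on topic. Yet by that logic, Fleming would never have discovered penicillin. 28 | 29 | The brain's "cognitive inhibition" helps us ignore unimportant details and focus on what matters. In ordinary people the level of cognitive disinhibition is fairly low, which is why most people are "normal," or at most capable and reliable but not especially creative. In geniuses and madmen alike, cognitive disinhibition is high. The difference is that the genius can keep the brain from being bombarded by masses of irrelevant information and illusions: while remaining sensitive to detail, he can filter out the unimportant details a second time and keep the important ones as a source of inspiration. This ability really is an art; it cannot be put into words and can only be grasped through one's own experience. 30 | 31 | #### The ability to mobilize resources: from 1 to N, extracting medicinal penicillin 32 | 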
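33 | Fleming's paper on penicillin was published as early as 1929, but for a full ten years it drew no attention from the scientific community. The discovery stalled partly because of Fleming's own knowledge and temperament: he did not know chemistry and could not isolate penicillin from the mold, and he did not know pharmacology and could not work out how penicillin kills bacteria. He was also famously poor at expressing himself and rarely communicated with scientists outside his circle. 34 | 35 | An Oxford professor named Florey played the key role in turning penicillin into a drug. Florey was not only well versed in pharmacology; more importantly, he had extraordinary organizational talent and could mobilize many resources, and he was surrounded by a group of capable scientists such as Chain, Abraham, and Heatley. This long journey from 1 to N took another five years, from 1940 to 1944: 36 | 37 | 1. Chain and Abraham first isolated the active ingredient, penicillin, from the Penicillium mold. 38 | 2. To obtain enough experimental samples, Florey hired many local girls at very low wages and used every bottle and jar available at Oxford to culture penicillin. 39 | 3. Searching for new strains 40 | 4. Improving the culture medium 41 | 5. Working out penicillin's bactericidal mechanism, so that people could use it with confidence 42 | 6. Persuading US government agencies and the four well-known major drug makers to jointly develop and produce penicillin 43 | 7. Solving problems encountered in mass production, such as foam removal 44 | 45 | Looking back, the journey from 1 to N depended largely on Florey's extraordinary organizational talent. Without him and the countless scientists, engineers, and workers he brought together, the large-scale pharmaceutical use of penicillin would probably have had to wait many more years. 46 | 47 | ### What I take away 48 | 49 | Reading about the discovery from 0 to 1 left me with a deeper sense of awe: doing a simple thing unexpectedly well is an art. Reading about the application from 1 to N broadened my perspective and helped me better appreciate how important it is to develop the ability to mobilize resources. 50 | -------------------------------------------------------------------------------- /RandomThoughts/Writings/zhe-xue-fu-lan-ke-lin-zi-chuan.md: -------------------------------------------------------------------------------- 1 | # Philosophy: The Autobiography of Benjamin Franklin 2 | 3 | ## Franklin, Printer 4 | 5 | Before I knew it, I had been holed up at home for three weeks, and everything seemed to have slowed down a great deal. Xi'an (西安), perched on the cat tree in the sunlight, gazes raptly down at the budding cherry tree in the yard. Without the commute and the banter with colleagues, I seem to have gained a few hours each day: when tired I can take a nap at any time, and when weary I can steal a moment to watch a show. 6 | 7 | I am not very fond of biographies. For one thing, they demand large blocks of time, as novels do; for another, most of their subjects lived long ago, so the resonance is limited. Franklin's autobiography interested me for two reasons. First, Wu Jun (吴军) once mentioned in his column on Dedao (得到) that Franklin is his spiritual mentor, and I was very curious what kind of person becomes a mentor to someone like Wu Jun. Second, it is an autobiography, which differs from a biography in its attitude of self-examination: the author looks back over his own life and asks which things, in hindsight, mattered most, without regard for outward glamour and noise. 8 | 9 | Franklin held countless titles: entrepreneur, inventor, diplomat... Yet for his tombstone he chose a rather unremarkable identity: "Franklin, Printer." Looking back on his life, what he cherished most was not the great achievements that made him a Founding Father of the United States, but the character he had been forging since his days as a printer. 10 | 11 | ## Happiness and character 12 | 13 | A happier and more meaningful job and a warmer, more harmonious family bring happiness to life. But deliberately striving to maintain and pursue such happiness easily becomes a futile effort, like trying to scoop the moon out of the water. The reason is that work and family are never entirely under our control. The weather turns without warning and the moon waxes and wanes: a delightful job can become dull, and a happy family can be full of conflict. In such troughs and storms, when happiness is scarce, a person needs a meaning beyond happiness to keep inner peace, and I think that meaning is perfecting one's character. 14 | 15 | Human nature has a dark side, such as laziness, envy, anger, and pride, and it also has a good side, such as integrity, honesty, and empathy. Character is one's inner yardstick: however the outside world changes, as long as we hold to our character we have reason to be proud of ourselves, and measuring each day by this yardstick makes it easier to stay calm. 16 | 17 | ## The virtues I value 18 | 19 | Franklin left us his famous thirteen virtues. Which virtues matter most varies from person to person, and the ones we value tend to show in what makes us angry. Checking against myself, I arrive at the following six, many of which I am still working on constantly. 20 | 21 | ### Finishing what I start 22 | 23 | Nothing worth doing is easy, and it is human nature to tire and slacken, and to feel that some other task is easier and more attractive. Without persistence we end up like the bear picking corn in the folk saying, dropping each cob to grab the next: a lot of time spent and nothing accomplished. 24 | 25 | ### Focus 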
26 | 27 | 年龄越大担负的责任越多,容易让人分心的小事和干扰也越多,如果自己不能去做深入思考,只能去完成最简单不用动脑的事,以后的命运也是可想而知的 28 | 29 | ### 积极幽默 30 | 31 | 生活中不如意的事情真的很多,自己首先能保持自己的平静与乐观,同时自己的出现应该总能给别人带来轻松和信心 32 | 33 | ### 守时守诺 34 | 35 | 每个人一生的时间都是有限的,不要养成总是迟到十分钟的习惯;如果有困难不能履行自己的诺言应该提前和别人说清楚,不要打肿脸充胖子耽误别人的事和时间 36 | 37 | ### 谦虚平和 38 | 39 | 如果自己脸皮太薄,不愿意向别人请教,什么事情都要自己摸索的话会走很多弯路;当别人向自己请教时也应该以平等姿态和别人讨论,当别人指出自己错误的时候也应该审慎的看待 40 | 41 | ### 同理共情 42 | 43 | 不要随意去评价别人,毕竟每个人的成长环境那么不同;不随意把责任推在别人身上,毕竟每个人都是参与者,需要大家携手才能攻读难关 44 | -------------------------------------------------------------------------------- /RandomThoughts/Writings/zhe-xue-ke-zhou-qiu-jian-yu-yu-qie.md: -------------------------------------------------------------------------------- 1 | # 哲学 刻舟求剑与瑜伽 2 | 3 | 写于09-09-2020 4 | 5 | ## 相对的坐标系 6 | 7 | 大多数国人在小学就学过刻舟求剑的故事,说楚国人坐船把剑掉进了海里却在船边上做了个记号想等船靠岸之后再把剑捞出来,用以讽刺楚国人做事僵化不知道变通。 8 | 9 | 换个角度看这其实是一种相对外物的坐标系,人生在世,如果也用和他人外物比较的相对坐标系来指导自己的生活,是很难潇洒逍遥的。 10 | 11 | ## 自知自爱自强的坐标系 12 | 13 | 瑜伽对我来说是一种维持正念的方法,不再是相对外物的坐标系,而是一种自知自爱自强的坐标系,这点真得让我好感动。 14 | 15 | 要去完成一个瑜伽的姿势,首先要自知,大脑里明白应该身上的哪一块肌肉在用力,哪一根筋膜在拉伸,用大脑控制自己的身体,这样才不会强为以致于伤害到自己。面对一些特别难的姿势,往往会产生一些畏难情绪,还记得我开始练习乌鸦式的时候总是以训练服瑜伽垫太滑为借口想去换个衣服换个垫子再做,其实更多的是自己的大脑里想得太多;其次是自爱,做的时候要时刻与自己的身体保持连接,能准确地接收到自己身体的反馈,如果太难现在做不到,也不要把给普罗大众制定的标准和评价强加到自己身上,给与自己消极地评价,可以采用一个修改过的更适合的姿势来练习;最后是自强,身体是不会再一两天内发生什么巨大的变化,但是只要自己有耐心不忘初心,能够苦中作乐,即使中断之后也能快速重新拾起,善始善终慢慢就能做好。 16 | 17 | 我开始练习瑜伽的时候会觉得这件事挺费时间,不能较快减肥和塑形,不过我慢慢明白,其实生活中有很多比外在坐标系更重要的东西,那就是保持正念,和自知自爱自强的坐标系保持连接,这样自己的努力才会真正地应用于自身,有累积效应。最理想的情况下就是不用专门腾出时间来让自己回到自知自爱自强的坐标系,身处闹市依然能举重若轻,但目前的我并不是总能做到,尤其是被繁忙的学习工作推动时,有时会像停不下来的陀螺。 18 | 19 | ## 正念下的顺势而为 20 | 21 | 疫情初始,一个人呆在家里,有时候会有点无聊或者有压力,食物唾手可得不知不觉每次休息时候就吃一点,好像就能缓解一下自己的情绪。回到正念,我有一天突然意识到自己其实并不饿,可能只是自己嘴巴希望一点变化,于是每天清晨我都会在书桌旁放一罐一加仑的水,每次想吃东西的时候就去喝点水。我发现这不仅消除了我的情绪,还不会让我由于不知不觉吃了很多东西而变困或者变胖。有效的控制了饮食,变瘦的目标也变得水到渠成。 22 | -------------------------------------------------------------------------------- /RandomThoughts/backlog.md: -------------------------------------------------------------------------------- 1 | # Backlog 2 | 3 | ## Spontaneous 
speaking 4 | * https://www.youtube.com/watch?v=HAnw168huqA&t=45s 5 | 6 | ## Financial 7 | * https://www.amazon.com/How-Happy-Active-Stock-Trader-ebook/dp/B09JL574ND -------------------------------------------------------------------------------- /SUMMARY.md: -------------------------------------------------------------------------------- 1 | # Table of contents 2 | 3 | * [Resume](https://github.com/DreamOfTheRedChamber/my-resume/blob/master/Shijie_resume.pdf) 4 | * [Presentations](career/presentations.md) 5 | * [Publications](career/publications.md) -------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | theme: jekyll-theme-leap-day 2 | 3 | title: [Everything except technical details] 4 | description: [Eric thoughts on career (product, entrepreneurship, management), leadership (impact, communication) and hobbies] 5 | -------------------------------------------------------------------------------- /_layouts/default.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | {% seo %} 8 | 9 | 10 | 11 | 14 | 15 | 16 | {% include head-custom.html %} 17 | 18 | 19 | 20 |
21 |

{{ page.title | default: site.title | default: site.github.repository_name }}

22 |

{{ page.description | default: site.description | default: site.github.project_tagline }}

23 |
24 |
25 | 28 |
29 | {{ content }} 30 | 31 |
32 | 37 |
38 | 39 | -------------------------------------------------------------------------------- /career/Meeting notes.md: -------------------------------------------------------------------------------- 1 | - [Behavior](#behavior) 2 | - [Cassandra](#cassandra) 3 | - [Multi-User chat room](#multi-user-chat-room) 4 | - [Ticket master](#ticket-master) 5 | - [Calendar](#calendar) 6 | - [TopK](#topk) 7 | - [Youtube](#youtube) 8 | - [Donation](#donation) 9 | - [Google drive](#google-drive) 10 | 11 | # Behavior 12 | 请问亚麻recruiter嘉宾,能给个内推吗?感谢running away what?要慎用“running away from bad things” 吧听众能不能mute一下哪个老铁 mute一下啊😯Rong Yao弹琴的可以mu一下吗🙏就是可以你来我往的感觉吧Would you please give an example of a bar raising answer example?主持人可以先全部过一遍slice吗?而不是自己的follow up,这样子大家可以有个全局观,谢谢money incentive 可以搞笑的方式带过diversity and inclusion 的话题怎么回答我就怕我吹大劲儿我也是一定要提team work project 嗎?還是個人開發的也可以Frank, thank you very much! I appreciate it!吹牛就没输过个人开发可以,看你怎么讲了。比如说你参加solo hackathon,赢了,就很牛逼了所以做政府项目的要怎么描述謝謝Q: 对于刚毕业的学生,这些话题大部分都没有经历,HR或者recruiter如果面试NG, BQ questions他们应该会怎么问呢? 还是按部就班的念稿吗,对于NG怎么准备BQ?因为没有工作经验,这些问题大部分都没有体会project或者side project的经验,都可以讲NG 可以说school project,里面也涉及到collaboration、communication 哇这种学校的东西和在工作中,完全不一样非要硬往上帖,只能编了实在不行参加一些hackathon吧,很真实的团队合作了,大部分hackathon 36-48小时。构思,分工,熬夜,meet deadline,pitching。等于是压缩了一个季度的工作能举个例子吗什么叫complex嗯嗯同问Paul 说到点子上了!Paul刚才说啥来着?错过了谢谢 Paul 补充!16 Leadership now!Shopify的特有吧Paul说尽量选一个大的project,能说明你的实力的,方便评级更高Project dive deep 是主要考察技术能力吧感谢ganxie可以讲完slide再一起提问吗今天有录像吗搞过sev0 算不算厉害厉害可以说 没有mistake吗?说了 会很尴尬吗:)如果sev0 不是你写出来的话会没有mistake不太真实blame game is on就说会尴尬么。。。这么答就挂了?有人这样回答我的,我选择没挂他这一题。。。这个故事好,背下来这个故事要强调tech context吗?这样来说, 总体 来说, 感觉我们就会招进来 差不多的人突然想 艺术类 面试 ,越奇葩,越个性 越好晚了,已经背下来了。能讲一下例子吗 fail的例子晚了,背完了已经不能背啥啊没人能crash我, 算法可以。不怕 找人多mock! 
hoho戳中了说的就是我晚了,已经背了。上司的话需要analyze 但是你的customer 不管说什么必须要听例子太好了,深深印入了脑海里只有魔法才能打败魔法晚了,已经背下来了。这样会不会显得manager很笨你可以编一下,说是隔壁组的manager同问 感觉manager听到能挣钱很难会block你跟直接manager有conflict不太合适正常人早就同意了 你还和我争对啊有conflict一般都是 clarity不够,有个人没有拿到所有信息话说面试官是中国人的话是好事吗解释一下就行了,但是这样没scope啊。还是之前的故事好。都背下来背下来"有conflict一般都是 clarity不够,有个人没有拿到所有信息" 13 | 👍Wendy 牛!同样一件事情说到了manager level有没有5分钟中场休息,喝点水上个洗手间的?现在下半场都要结束了😃😂确实new grad太难了,很难讲leadership的Hr会给你发公司valuebq搞定亚麻 = 搞定一切公司对NG要求不会很高说实话,准备好了amazon的bq,其他公司的bq都稳了哈哈哈,Gigas也是也么想的上面说的可能有误导,我一会补充一下请问有过往其他行业工作经历的转专业学生,面试讲故事可以用以前的工作经历吗?还是必须要讲和tech有关的项目或经历?lol爱因斯坦是谁lol🤣太棒了好像是个政客爱泼斯坦他哥?夸父是个好同志大公司不需要招 那个level 的聪明人wendy好牛啊 wendy可以考虑开个小红书账号分享职场面试信息吗有时间的可以看看wiki和公司官网blog和公司的linkedin,然后我觉得每个glassdoor的positive评价也可以很好地看出一个公司地valuebe myself 我怕没人给offer爱因斯坦克be best myself 不用谢~👍 Paul Wendy 在理大牛👍哪家对人大于业务?麦霸leadership ownership ng可当作主人翁精神 肯定可以聊大家可以读一下Ben Horowitz写的一本书叫The hard thing about hard things,里面有一章节叫Take care of the people, the products and the profits - in that order职场俱乐部volunteer 很锻炼leadership 😀这是回应之前的问题“哪家对人大于业务?”又背过一个: 主动帮oncall engineer去解决livesite的incident其实每家公司都是“人大于业务”,也是“业务大于人”,取决于您的观察点对牛人,就是人大于业务, 如果您没有贡献,就是业务大于人,呵呵你就说系统本身的限制,不说别人的问题old system就是很菜觉得ownership跟之前的第二个category “tell me a project you are proud of.” 和第五个category “leadership”都有点overlap? 老师们的例子听起来好像是有点overlap。请问怎样解决一个例子好像可以对应很多问题的情况呢? 谢谢大家你还有第七页?earn trust?我参加的比较晚 请问能把ppt的几点title列一下吗?+1ppt没啥,没有老师讲的故事精彩反客为主6666这就是handle ambiguity把可以问回去amazon的interview feedback要写的很详细的,面试官主要是collect datapoint但是最后也不给feedback面试官会不会光听不写,然后回去编?😂不可能我从来没有见过这样的,那样问题就大了Amazon onsite behavior question https://www.1point3acres.com/bbs/thread-307462-1-1.htmlearn trust是啥故事来着?比如我最近面试一个SDM, 光feedback都写了近2000 words请问我们有recording吗?来晚了,miss掉了前半部分😭如果来面Amazon, 最好真的好好读读LP,一定要不然很难过的What is LP?Leadership Principles每个问题对应的lp在这里找:Amazon onsite behavior question https://www.1point3acres.com/bbs/thread-307462-1-1.html 14 | From zhaozhonghao to Everyone: (8:59 PM) 15 | 想问一下,如果问到tell me about your weakness, 应该怎么说? 
16 | From Paul Lou to Everyone: (8:59 PM) 17 | 另一家重视BQ的公司是Apple,级别高一些的有一半或超过一半的面试是BQ 18 | From Hobite Sun to Everyone: (8:59 PM) 19 | https://www.kraftshala.com/blog/amazon-interview-questions/ 20 | From Hobite Sun to Everyone: (9:00 PM) 21 | 上面的链接是amzn问题对应的lp 22 | From Betty Ho to Everyone: (9:00 PM) 23 | 謝謝 24 | From Changzheng Rao to Everyone: (9:01 PM) 25 | 十分感谢🙏 26 | From Kevin Wen to Everyone: (9:01 PM) 27 | 我们几乎不问“weakness” 了吧,如果被问了,就举一个真实的故事, 还有您怎么自己提高的, 或者如果是manager的话,也可以讲如何利用下属或者同事来check balance终归没有完美的人嘛 28 | From Andrew to Everyone: (9:01 PM) 29 | 问的。我就被问到有什么hash feedback,harsh* 30 | From Kevin Wen to Everyone: (9:02 PM) 31 | Harsh feedback那个是“earn trust”的问题收到feedback,如果是又问题,怎么解决,最后earn the trust back 32 | From Verity Chu to Everyone: (9:03 PM) 33 | 谢谢 Ken 老师 34 | From lei chen to Everyone: (9:03 PM) 35 | earn trust 是承认错误吗 36 | From Tao Mao to Everyone: (9:03 PM) 37 | 很好的回答! 38 | From Paul Lou to Everyone: (9:03 PM) 39 | Kevin Wen很有经验 40 | From Kevin Wen to Everyone: (9:03 PM) 41 | 如果是有问题,为嘛不承认?承认了改了就好 42 | From W D to Everyone: (9:03 PM) 43 | 想问一下对于转码选手,讲的故事和cs不相关可以么? 44 | From Anna to Everyone: (9:03 PM) 45 | 想问对于 handle tight deadline,Couldn’t finish tasks before deadline这类问题,有没有比较好的回答 46 | From lei chen to Everyone: (9:04 PM) 47 | 就主要是不懂earn trust 48 | From Kevin Wen to Everyone: (9:04 PM) 49 | 如果没有问题, 就是要怎么提供资料,对事不对人的解决 50 | From lei chen to Everyone: (9:04 PM) 51 | 有点lost 52 | From Kevin Wen to Everyone: (9:04 PM) 53 | 谢谢Paul 54 | From Andrew to Everyone: (9:04 PM) 55 | 有错误也不能承认啊 56 | From YL to Everyone: (9:04 PM) 57 | help peers也是earn trust 58 | From Kevin Wen to Everyone: (9:04 PM) 59 | learnLearn from your scar 60 | From Becky to Everyone: (9:05 PM) 61 | 想请问一下bq和system design对定level哪个权重比较高呢? 
62 | From Verity Chu to Everyone: (9:05 PM) 63 | 谢谢老师们~ 64 | From Andrew to Everyone: (9:06 PM) 65 | 可不可以试试就知道。。 66 | From Kevin Wen to Everyone: (9:06 PM) 67 | 如果您要到L6+, 一定要“Learn from your scar” 68 | From Pencil to Everyone: (9:06 PM) 69 | 无关的意思不大 70 | From Kevin Wen to Everyone: (9:06 PM) 71 | 不能是大错误哈 72 | From Tony to Everyone: (9:07 PM) 73 | 亚麻最近coding题不算难 74 | From Kevin Wen to Everyone: (9:07 PM) 75 | 您不能说我把AWS或者Azure搞down了 76 | From Gigas to Everyone: (9:07 PM) 77 | amazon 的大量是指多少呢 78 | From Kevin Wen to Everyone: (9:07 PM) 79 | 那就麻烦了 80 | From Yang to Everyone: (9:07 PM) 81 | 好像没有回答刚才的问题 82 | From YL to Everyone: (9:07 PM) 83 | 500+吧 84 | From May Zhu to Everyone: (9:07 PM) 85 | earn trust 就是和团队契合,团队觉得你可以完成工作内容,还有会和团队相处融洽。基本上各种BQ都是考研你值不值得trust吧 86 | From Andrew to Everyone: (9:07 PM) 87 | 500够吗 88 | From Yang to Everyone: (9:07 PM) 89 | 问题问的是已经通过了coding test拿到面试的前提下,bq能不能用其他行业的工作经历来回答 90 | From Kevin Wen to Everyone: (9:08 PM) 91 | 亚麻肯定可以 92 | From jackson to Everyone: (9:08 PM) 93 | 可以的,就想有个人说组织meeting 94 | From Tony to Everyone: (9:09 PM) 95 | 最近面亚麻每一轮都问behavior 96 | From Yang to Everyone: (9:10 PM) 97 | 谢谢大家! 98 | From Kevin Wen to Everyone: (9:10 PM) 99 | 亚麻BR对SDE一般还是要问一个简单的技术问题的 100 | From Tao Mao to Everyone: (9:10 PM) 101 | 说没有! 102 | From keira to Everyone: (9:10 PM) 103 | 谢谢老师们分享! 辛苦! 104 | From Betty Ho to Everyone: (9:11 PM) 105 | 謝謝今天的分享~受益良多 106 | From Lu Li to Everyone: (9:11 PM) 107 | 谢谢老师们分享!两小时干货满满 收获很多! 108 | From Isabella to Everyone: (9:11 PM) 109 | 谢谢大家 110 | From Yang Bai to Everyone: (9:11 PM) 111 | 非常感谢老师! 112 | From Kevin Wen to Everyone: (9:11 PM) 113 | 谢谢大家,我也很受教!! 114 | From Sophie Chen to Everyone: (9:11 PM) 115 | 謝謝各位老师今天的分享~!!! 116 | From May Zhu to Everyone: (9:12 PM) 117 | 非常感谢各位老师的分享,非常受用! 118 | From Selena to Everyone: (9:12 PM) 119 | 对对 earn trust求解 120 | From wz to Everyone: (9:12 PM) 121 | Thank you Ken, Wendy & Frank! Thank you host! 
122 | 123 | 124 | 125 | 126 | # Cassandra 127 | C* consistency levels: one -> quorum -> allmore options to chooseTypical, and more added later…CAP theorem: C vs. A tradeoffnot all company swap cassandra , some big company are still use itbut if you use it in critical data/transaction business, then cassandra is NOT the right choice请问 wide-column store 是啥意思这样讲听不懂没听到Cassandra的应用场景,请问早来的朋友点我几句Quote - “if you use it in critical data/transaction business, then cassandra is NOT the right choice”那cassandra有啥用。。Typical use cases: chat messages, logs, IoT data, etc.@Sh.W For cases that you need fast write: chat, event logs, etc用zookeeper保存的Service discovery感谢解答。那这个Cassandra跟mongodb啥的有啥区别,为啥要用cassandra存log没讨论过,后面可以问~行,谢谢。那我就认为存log吧。。。dns?你把前面放个reverse proxy? lbhttps://teddyma.gitbooks.io/learncassandra/content/client/which_node_to_connect.html最近在看firebase,就是document db,和MongoDB一样可以很好地用来快速储存聊天信息active-passive lbactive-passive lb +10086这些东西应该都是cloud provider管吧?是的是ring0 infra管的跟application layer木啥关系https://www.ibm.com/docs/en/b2b-integrator/5.2?topic=system-installing-apache-cassandra-apache-zookeeper我觉得是问peer2peer怎么propogate 数据变化的query node 1, node1怎么决定找下一个node,如果本node无此数据。复制只是复制到replicatorvector 是啥version vectorquorum-based还能是LWW吗Cassandra没啥关系 看一下service discovery/service mesh一般怎么做能不能share 一下ppt我最后会发一个总结,slides会包含在里面,还有一些扩展阅读为什么update不行??Cassandra released in 2008; dynamo in 2012 为啥说cassandra 抄dynamo?!Dynamo 不是dynamoDB请问可以share一下具体table长什么样吗?0基础有点看不懂啊。Dynamo在SOSP‘07发表的:https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf越删越多跟aws redshift有点像啊Ok cool 抄挺快!对 vacuumHBase 和RocksDB也都有类似合并机制,两者都是LSM treelog structure merge tree吗是的Log Structured Merge Treeprimary key = partition key + clustering key. 
clustering key is orderedDynamoDB 是single leader的 不是P2P根据CAP定理,只能同时满足两个,而由网络分区带来的分区错误风险是必然存在的,因此只能在CA中间选一个,Cassandra选择了AP那cassandra有什么应用场景,跟mongodb 应用场景有什么不同请问MySQL是什么呢?AC吗?@Neal yesCPCassandr一般用在高可用性的场景,即使整个集群就剩下一台机器了,也要能工作,用Raft等协议就保证不了这一点RAFT是啥 猴子哥跟 MySQL 集群具体的高可用和一致性方案有关吧?Raft一致性协议,比Paxos简化谢谢!系统设计如果遇到银行相关的得保证AC,也需要scalable,那他就没有partition tolerance吗?https://donggeitnote.com/2021/10/16/raft/我们前几周讲的raft这个讲raft讲的特别好 128 | http://thesecretlivesofdata.com/raft/确切说,是newbie friendlynewbie dota best dota+1就是来听db选型的类似于sortkey的一个东西?denormalized吧column可以存json 吗column family, all columns within the same column family are stored sequentically on diskWikipedia has thiscolumn family, all columns within the same column family are stored sequentically on disk 牛逼牛逼A wide-column store is a type of NoSQL database. It uses tables, rows, and columns, but unlike a relational database, the names and format of the columns can vary from row to row in the same tablequery within the same column family is faster than row-based db.relational DB对column number有限制,Oracle是1000那为什么还要定义schema那这个wide-column和存json是一个意思吗?就是可以nest column吗?value can be json blob主要是在disk上的存储方式不同 129 | [file: (null)] 130 | Column family使用cluster key来定义的吗妈同事介绍 暴露了哈哈哈哈我也给你介绍一个好不好 答主大家还是注意保护好自己信息不要乱开玩笑哈列式存储同一个column family的data存成一个文件(key: value (key: value)) ?这个稳步不搞明白 不知道往里面存啥这个问题不搞明白 不知道往里面存啥NoSQL有好几种,也不一样parquet是可以的 col-orientedcolumnar storage format大佬最后可以讲一下为啥淘汰了😂找对人, 131 | Hire and develop the best 132 | 哈哈 133 | [file: (null)] 134 | 要deprecated 掉啊?业务导向Cassandra严重依赖gossip,这个有性能问题gossip好我们的Cassandra就10几个node。。超大规模的话 consistency就是bottleneck了答主负责。不用不乱说今天有个群里发过https://blog.softwaremill.com/what-is-wrong-with-apache-cassandra-materialized-views-a7a25431dad👍感谢感谢respect感谢 分享cogs?还是cost?Rebalancing 要花很久 我们也遇到过写烂了是个好的总结好几个小时印度人😂🤣现代分布式系统 不考虑consistency就是原罪那大部分nosql 都躺枪了spanner我上学的时候的proj就是simulate gossip。。。兰伯特论文写的烂我也想当个科学家 现在还来得及吗最近的道路是不是转数据科学家schemalessFB在MySQL上面包了一层: TAO为什么说Cassandra 
schemaless但是我们这里又定义了schema:Cassandra旧版本用的binary protocol Thrift-based API所以还支持schemaless。但是在新的实现里CQL+storage engine已经需要定义schema了。 https://stackoverflow.com/questions/63380973/in-cassandra-how-is-it-possible-to-save-data-in-a-column-name-while-leaving-the/63381339#63381339所以最好用的是不是mysql + shard?Tao这个名字很酷,难怪改名叫MetaThank you for praising my name.Tao, 66666666666666哈哈哈哈哈哈哈技术互喷transactional 是啥意思事务嘛事务是啥要么都成功,要么都失败,ACID甩锅失败 哈哈哈啊,ACID啊2pc懂了你的麦克风好像有点杂音transaction是指的DB的ACID有背景音风很大哈哈我挺赞同东哥的观点+1+1 135 | From 非洲黑猴子 to Everyone: (7:58 PM) 136 | +2 137 | From weibo wang to Everyone: (7:58 PM) 138 | 同意 139 | From Sh.W to Everyone: (7:59 PM) 140 | 东哥6666 141 | From Zooey to Everyone: (7:59 PM) 142 | 教练:这叫意识流选手 143 | From Sh.W to Everyone: (7:59 PM) 144 | 意识流是啥 145 | From Mark Liu to Everyone: (7:59 PM) 146 | 头脑风暴,如果有怀疑的位置,提出来,然后暂停,继续后面。 147 | From Michael Qiu to Everyone: (7:59 PM) 148 | 对 我们就互相默认大家都是菜鸡。 149 | From YL to Everyone: (8:00 PM) 150 | 同意,都是学习的 151 | From Sh.W to Everyone: (8:00 PM) 152 | 同意,都是学习的 153 | From Zooey to Everyone: (8:00 PM) 154 | 我联动的活动我自己都不懂 155 | From Sh.W to Everyone: (8:00 PM) 156 | respect 157 | From Zooey to Everyone: (8:00 PM) 158 | 🤣 159 | From Sh.W to Everyone: (8:00 PM) 160 | 666666666999999999999 161 | From 非洲黑猴子 to Everyone: (8:00 PM) 162 | 讨论才是大头,主要就从讨论中学习 163 | From Cheng Jing to Everyone: (8:00 PM) 164 | 名词全都不知道的有✋ 165 | From Sh.W to Everyone: (8:00 PM) 166 | 6翻了 167 | From Bei Z to Everyone: (8:00 PM) 168 | respect+1 169 | From Sh.W to Everyone: (8:00 PM) 170 | 讨论才是大头,主要就从讨论中学习 171 | From Kevin Hu to Everyone: (8:01 PM) 172 | 谢谢! 173 | From Sh.W to Everyone: (8:01 PM) 174 | 我也讨论时候学的东西好多append only妈? 175 | From Becky to Everyone: (8:02 PM) 176 | 和mysql mvcc比较一下 177 | From 非洲黑猴子 to Everyone: (8:03 PM) 178 | SQL是从单机进化来的,不是原生分布式的,所以容易支持事务 179 | From 木仓馆长 to Everyone: (8:04 PM) 180 | transaction里面那个isolation也非常重要 181 | 182 | 183 | ## Multi-User chat room 184 | From Michael to Everyone: (7:12 PM) 185 | 没声音? 
186 | From 2002079 Xi Zhou to Everyone: (7:12 PM) 187 | 有呀 188 | From Qian Teng to Everyone: (7:12 PM) 189 | +1 190 | From lining to Everyone: (7:12 PM) 191 | 有 192 | From yyh Ace to Everyone: (7:12 PM) 193 | 你要join audio吧 194 | From A to Everyone: (7:13 PM) 195 | 啥时候开始。。 196 | From Michael to Everyone: (7:13 PM) 197 | OK 不小心点错了 198 | From Ken to Everyone: (7:15 PM) 199 | meeting notes: https://docs.google.com/document/d/1Hfnhg09v9ISJ20151u7PDTpvzqjpp5ajiNI5h-KevY0/edit# 200 | From YL to Everyone: (7:16 PM) 201 | 这字体这么fancy吗 202 | From Jian Zhu to Everyone: (7:17 PM) 203 | 看着好难受 - - 204 | From Becky to Everyone: (7:19 PM) 205 | @group member 206 | From Jiabei Luo to Everyone: (7:19 PM) 207 | 开始时候介绍用的slide能share一下嘛?具体就是l4 l5 criteria的那一页。谢谢~ 208 | From A to Everyone: (7:20 PM) 209 | 又要开始算算术了吗 210 | From Weida to Everyone: (7:20 PM) 211 | 感觉面试官不大兴奋? 212 | From 2002079 Xi Zhou to Everyone: (7:21 PM) 213 | lol 214 | From Jiabei Luo to Everyone: (7:21 PM) 215 | can you add additional people to a 1:1 chat?And turn it into group chat 216 | From xinz to Everyone: (7:21 PM) 217 | 面试官是senior 吗? 218 | From A to Everyone: (7:21 PM) 219 | 这是在reverse engineering 微信吗 220 | From xing wang to Everyone: (7:22 PM) 221 | 大多数面试官都这样,把舞台交给应试者 222 | From A to Everyone: (7:22 PM) 223 | 请向答主要个表情包feature,谢谢。 224 | From 2002079 Xi Zhou to Everyone: (7:23 PM) 225 | Edit 已经发送的信息 feature 226 | From tomdi to Everyone: (7:23 PM) 227 | 2min内可以recall message 228 | From Becky to Everyone: (7:23 PM) 229 | Online status 要不要 230 | From v to Everyone: (7:24 PM) 231 | 感觉这边很多聊天软件设计理念和wechat区别挺大。。。总以wechat作为出发点 可能非国人会confuse 232 | From Eddie菜 to Everyone: (7:24 PM) 233 | 正在偷人...... 
234 | From ningdi to Everyone: (7:24 PM) 235 | 要需求的这个聊天方式好棒 236 | From Ken to Everyone: (7:24 PM) 237 | Soft Skills:1: requirements gathering2: make decisions and justify tradeoffs3: describe the solution using clear presentation, concise language and accurate technical termsHard Skills:1: design quality; scalability, reliability, efficiency etc (L4, L5)2: basic facts about existing software solutions and hardware capabilities (L4 - partly, L5)3: project lifecycle awareness, e.g. how a project is developed and maintained (L5) 238 | From Jiabei Luo to Everyone: (7:24 PM) 239 | Edit or delete message (like discord) ? 240 | From A to Everyone: (7:24 PM) 241 | 谁在偷人 242 | From Will to Everyone: (7:24 PM) 243 | 请问record在哪里能看到? 244 | From A to Everyone: (7:24 PM) 245 | 这么刺激 246 | From AAA to Everyone: (7:25 PM) 247 | exciting 248 | From Anita Chen to Everyone: (7:25 PM) 249 | 一般來說和面試官confirim req應該佔多長時間呀? 250 | From Robin to Everyone: (7:25 PM) 251 | 会不会聊太多requirement了,这些很难能在45min内聊清楚吧,我觉得能深入说清楚1:1chat和group chat就很不错了 252 | From Weilong Ding to Everyone: (7:26 PM) 253 | 少于十分钟吧 254 | From Becky to Everyone: (7:26 PM) 255 | Retrieve chat history from local storage or remote? 256 | From A to Everyone: (7:26 PM) 257 | 同感,我觉得requirement时间有点长 258 | From Feng Gao to Everyone: (7:26 PM) 259 | 有以往的mock recording吗 260 | From xing wang to Everyone: (7:26 PM) 261 | 计算存储量了吗? 
262 | From Richard Tu to Everyone: (7:26 PM) 263 | 确实感觉,如果45分钟总时间的话,更符合真实interview 264 | From A to Everyone: (7:26 PM) 265 | 计算存储量为啥?硬盘很贵吗 266 | From yingzhu to Everyone: (7:26 PM) 267 | 没见过这么久clarify的。。。 268 | From kk to Everyone: (7:27 PM) 269 | 同意。感觉确认mvp就好了。 270 | 比如图片功能等等在最后optimization/extension的时候再聊吧 271 | From 非洲黑猴子 to Everyone: (7:27 PM) 272 | 原来不就是个Multi user的聊天室吗?现在微信快设计全了,除了朋友圈 273 | From YL to Everyone: (7:27 PM) 274 | 1+1, group,notification就可以讲很久了 275 | From 2002079 Xi Zhou to Everyone: (7:27 PM) 276 | 一般clarification 大概多少分钟 277 | From Feng Gao to Everyone: (7:27 PM) 278 | NR 一般还会有个HA吧 279 | From xinz to Everyone: (7:27 PM) 280 | 聊天信息的顺序要保证吧 281 | From A to Everyone: (7:27 PM) 282 | NR是啥,HA是啥 283 | From Michael to Everyone: (7:27 PM) 284 | real-time message应该是要的吧 285 | From xinz to Everyone: (7:27 PM) 286 | 还有availability 也要保证吧 287 | From Feng Gao to Everyone: (7:28 PM) 288 | HA: High availability 289 | NR: Non-functional requirement 290 | From A to Everyone: (7:28 PM) 291 | non-functional requirement直接跳过吧 292 | From v to Everyone: (7:28 PM) 293 | 顺序只能保证每个人看到的order一样吧。。没办法global 保证顺序? 294 | From A to Everyone: (7:28 PM) 295 | 说不说的吧 296 | From YL to Everyone: (7:29 PM) 297 | 这咋能跳过呢.. 298 | From Jiabei Luo to Everyone: (7:29 PM) 299 | 顺序在毫秒级别差别应该没那么重要吧 300 | From Michael to Everyone: (7:29 PM) 301 | +1 302 | From v to Everyone: (7:29 PM) 303 | 顺序不对会有些逻辑上的错误 304 | From Sean Gao to Everyone: (7:29 PM) 305 | 顺序是不是在 server 端加 ts 作为 truth ? 306 | From A to Everyone: (7:30 PM) 307 | 为啥不能跳过呢? NR 有啥用呢? 308 | From ningdi to Everyone: (7:30 PM) 309 | server端加上也没办法保证ordering 310 | From v to Everyone: (7:30 PM) 311 | 比如三个人群聊。。在c看来 a对b的回答比b对a的提问先到 312 | From xing wang to Everyone: (7:30 PM) 313 | 45分钟吗? 
314 | From Sean Gao to Everyone: (7:31 PM) 315 | @v 有道理 316 | From Michael to Everyone: (7:31 PM) 317 | 感觉snowflake的id就已经满足大部分要求了。happend-before relation和绝对的顺序感觉还挺难的。 318 | From YL to Everyone: (7:31 PM) 319 | 顺序只能在自己端看到的是一样的 320 | From Fei to Everyone: (7:31 PM) 321 | 顺序无所谓的 322 | From v to Everyone: (7:31 PM) 323 | 顺序能不能用vector clock来解决? 324 | From A to Everyone: (7:31 PM) 325 | 顺序感觉不重要大差不差就得 326 | From Fei to Everyone: (7:32 PM) 327 | 只要保证partial order就可以了 328 | From A to Everyone: (7:32 PM) 329 | partial ordre是啥意思 330 | From AAA to Everyone: (7:32 PM) 331 | 以前微信好像也会出现信息错位的问题 332 | From A to Everyone: (7:32 PM) 333 | 微信现在也有 334 | From AAA to Everyone: (7:32 PM) 335 | 所以不用确保完全正确吧 336 | From ningdi to Everyone: (7:32 PM) 337 | 顺序都保证不了。。 就不是聊天了。。 338 | From Fei to Everyone: (7:32 PM) 339 | 就是A说了话,引起B说话,显示的时候A在B前面,因果关系 340 | From Sean Gao to Everyone: (7:33 PM) 341 | @v 想了想, b对a 的response,一定晚于 a 的问题。如果server 端排序,那不可能b比a先到。 342 | From A to Everyone: (7:33 PM) 343 | 有时候信息都丢了 344 | From ray to Everyone: (7:33 PM) 345 | it's the hardest problem to ensure the high consistency 346 | From Michael to Everyone: (7:33 PM) 347 | @v 我觉得不用vector clock,简单的lamport clock就行。但是要是面试会不会太复杂。 348 | From ningdi to Everyone: (7:33 PM) 349 | 我们以前做过测试,一个群里看到的消息确实可能不是一个顺序 350 | From Ken to Everyone: (7:33 PM) 351 | Meeting notes with QR code to join WeChat group (if you have not joined yet): https://docs.google.com/document/d/1Hfnhg09v9ISJ20151u7PDTpvzqjpp5ajiNI5h-KevY0/edit# 352 | From zepengzhao to Everyone: (7:33 PM) 353 | duplicate 那些是不是reliability的问题呢 354 | From ray to Everyone: (7:33 PM) 355 | duplicate is still consistency issue I think 356 | From A to Everyone: (7:33 PM) 357 | b的response一定晚于a的问题啊。如果a的问题没有deliver,b问题都没看到,怎么会发response? 358 | From xing wang to Everyone: (7:33 PM) 359 | 用什么db讲了吗? 
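Editor's sketch for the Snowflake-ID suggestion in the thread above. It assumes the commonly cited bit split (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence); the epoch constant, the class name, and the choice to reuse the last timestamp when the clock steps backwards are illustrative assumptions, not something the participants specified. IDs from one worker are strictly increasing; IDs from different workers are only roughly time-ordered, which is why the chat goes on to discuss logical clocks.

```python
import threading
import time

# Assumed layout: timestamp (ms) | 10-bit worker ID | 12-bit sequence.
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1
CUSTOM_EPOCH_MS = 1_600_000_000_000  # hypothetical epoch, for illustration only


class SnowflakeLikeIdGenerator:
    """Per-worker ID generator; a simplification, not Twitter's implementation."""

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        # clock returns milliseconds since the Unix epoch; injectable for tests
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock stepped backwards: reuse the last timestamp so that
                # IDs from this worker never decrease.
                now = self._last_ms
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: borrow the next one.
                    now = self._last_ms + 1
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - CUSTOM_EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self._worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Sorting a conversation's messages by such an ID gives one consistent order for every reader of that conversation, but it does not by itself guarantee that a reply sorts after the message it answers when the two were stamped by different workers.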
360 | From A to Everyone: (7:33 PM) 361 | 这和设计没关系。。基本法啊 362 | From Ender to Everyone: (7:34 PM) 363 | 但是deliver到c的顺序可能是b在a前面 364 | From ray to Everyone: (7:34 PM) 365 | I didn't see any multi thread topic popped yet 366 | From v to Everyone: (7:34 PM) 367 | 对于a和b是没问题。。。对于c来说。。。 顺序是乱的 368 | From YL to Everyone: (7:35 PM) 369 | a和b同时向对方发送信息,两个人的顺序就是不一样的 370 | From Sean Gao to Everyone: (7:35 PM) 371 | @v c如果严格读取 group 的 ts,就不会 372 | From v to Everyone: (7:35 PM) 373 | 顺序可能是乱的。。比如a的问题发给c的时候延时特别大 374 | From Sean Gao to Everyone: (7:35 PM) 375 | @yl 他说的是 b回复a 的信息,是有关联的。 376 | From v to Everyone: (7:35 PM) 377 | 服务器的时间是不准确的可能会有回调 378 | From Weilong Ding to Everyone: (7:35 PM) 379 | timestamp是不可靠的 380 | From v to Everyone: (7:35 PM) 381 | 除非用version id类似lambo clock 382 | From Sean Gao to Everyone: (7:35 PM) 383 | 回调确实会 384 | From A to Everyone: (7:36 PM) 385 | 这个delivery顺序没法保证,除非牺牲latency。 上一条msg没收到,你就不发下一条信息。这样太扯了。 386 | From ningdi to Everyone: (7:36 PM) 387 | 那种延迟也是对于client端来说的,但是对于server端,不应该存在servers有不同的 ordering。 388 | From A to Everyone: (7:36 PM) 389 | 而且delivery 有quality of service要求的。要求不高的直接qos=0发了就不管了,drop and go 390 | From 姚剣楠 to Everyone: (7:37 PM) 391 | 用什么协议 需不需要提一下?websocket? 392 | From Jerry to Everyone: (7:37 PM) 393 | 服务器时间一般什么情况会回调? 
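Editor's sketch of the Lamport clock idea raised above for the "C sees B's reply before A's question" case. The class and helper names are illustrative assumptions. Each participant keeps a counter, stamps outgoing messages with it, and on receipt jumps past the sender's stamp, so a reply always carries a larger stamp than the message that caused it.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self):
        self.time = 0

    def send(self):
        # Stamp an outgoing message with the next counter value.
        self.time += 1
        return self.time

    def receive(self, remote_time):
        # Jump past whatever the sender had seen, then count the receive event.
        self.time = max(self.time, remote_time) + 1
        return self.time


def display_order(messages):
    """Sort (lamport_time, sender, text) tuples; sender breaks ties."""
    return sorted(messages)


# A asks, B reads the question and replies.
a, b = LamportClock(), LamportClock()
question = (a.send(), "a", "question")
b.receive(question[0])
reply = (b.send(), "b", "reply")

# C happens to receive the reply first because the question was delayed.
arrived_at_c = [reply, question]
shown_to_c = display_order(arrived_at_c)
```

This preserves the causal (partial) order that several participants said is enough for chat; messages with no causal link are ordered only by the tie-break, which is arbitrary but the same for every reader.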
394 | From Feng Gao to Everyone: (7:37 PM) 395 | message应该就是用服务器端的timestamp 吧,存DB 396 | From zepengzhao to Everyone: (7:37 PM) 397 | 我觉得要踩到点吧 398 | From ray to Everyone: (7:37 PM) 399 | I remember the wechat use the multi paxos 400 | From zepengzhao to Everyone: (7:37 PM) 401 | real time messaging 402 | From A to Everyone: (7:37 PM) 403 | 这种message topic/queue都append only的吧,server发了, 爱收到不收到。。 404 | From zepengzhao to Everyone: (7:37 PM) 405 | paxos 是一致性的 406 | From Weilong Ding to Everyone: (7:37 PM) 407 | 建议去看ddia有讲 408 | From Feng Gao to Everyone: (7:37 PM) 409 | message也要存数据库的吧 410 | From ningdi to Everyone: (7:37 PM) 411 | 不同server的ts可能不一致。。感觉总是要把group放在一个server上 才能真的rely on ts 412 | From Jackie G to Everyone: (7:38 PM) 413 | Do we need to expand on UserMeta? What kind of metadata is stored? 414 | From bernini to Everyone: (7:38 PM) 415 | Ddia是书还是视频? 416 | From Lucas Li to Everyone: (7:38 PM) 417 | 要对比一下通信方式,HTTP Polling, Long Polling, WebSocket之间的区别么 418 | From zepengzhao to Everyone: (7:38 PM) 419 | 要很多machine maintain tcp connection (websocket) 420 | From Weilong Ding to Everyone: (7:38 PM) 421 | 书 422 | From Lucas Li to Everyone: (7:38 PM) 423 | 这种面试,是不是每道题目都要提前准备一下啊 424 | From xing wang to Everyone: (7:38 PM) 425 | 用什么协议 需不需要提一下?websocket?这是重点,应该开始就讲 426 | From zepengzhao to Everyone: (7:38 PM) 427 | trade off 比较 428 | From bernini to Everyone: (7:38 PM) 429 | 我感觉要来不及了。。。 430 | From zepengzhao to Everyone: (7:39 PM) 431 | 基本上requirement 讲太久了 432 | From Ken to Everyone: (7:39 PM) 433 | presentation: https://docs.google.com/presentation/d/1pWuOkQrxk_Eib3oBwEGccXnqSZXisuoeSfdfToYLK7w/edit?usp=sharing 434 | From Michael to Everyone: (7:39 PM) 435 | websocket scale也得注意 436 | From lining to Everyone: (7:39 PM) 437 | 用NTP同步Server时间不就行了 438 | From 姚剣楠 to Everyone: (7:39 PM) 439 | Nginx + WebSocket这里 有个坑 就是6w 端口限制 这块设计好了 会是加分 440 | From zepengzhao to Everyone: (7:39 PM) 441 | 像fb 45分钟 442 | From xing wang to Everyone: (7:39 PM) 443 | 要对比一下通信方式,HTTP Polling, Long 
Polling, WebSocket之间的区别么,,,,同意! 444 | From v to Everyone: (7:39 PM) 445 | websocket scale有啥问题? 446 | From 姚剣楠 to Everyone: (7:39 PM) 447 | nginx能支撑的websocket连接数最大只有 65535吧 448 | From Sean Gao to Everyone: (7:40 PM) 449 | 对,如果server time 同步了, 还有 ts 的问题么 ? 450 | From 姚剣楠 to Everyone: (7:40 PM) 451 | 》 websocket scale有啥问题? 452 | From zepengzhao to Everyone: (7:40 PM) 453 | 10分钟 requirement 10分钟high level,然后剩下20分钟deep dive 454 | From 姚剣楠 to Everyone: (7:40 PM) 455 | Websocket 文件描述符数量的调整下吧 456 | From Sean Gao to Everyone: (7:40 PM) 457 | tcp 链接是 5元组 判定唯一, 65535好像不是平静。 458 | From ningdi to Everyone: (7:40 PM) 459 | Server ts咋同步。。 每时每刻都syc吗。。 多个server如何确定谁是master的ts 460 | From 姚剣楠 to Everyone: (7:40 PM) 461 | 每打开一个tcp链接 占用一个文件描述符 462 | From Sean Gao to Everyone: (7:40 PM) 463 | 瓶颈 464 | From Michael to Everyone: (7:41 PM) 465 | @v 你得记录下哪个client在哪个websocket server上或者,client连上websocket server之后得subscript一个topic 466 | From Sean Gao to Everyone: (7:41 PM) 467 | @ningdi 好像有专门的协议 468 | From lining to Everyone: (7:41 PM) 469 | NTP 470 | From v to Everyone: (7:41 PM) 471 | 我之前看好像一个server可以hold up to 1million websocket connection? 472 | From zepengzhao to Everyone: (7:41 PM) 473 | 要先有个server discover 吧 474 | From Sean Gao to Everyone: (7:41 PM) 475 | @v 那个是 WhatsApp 的 erlang 476 | From v to Everyone: (7:41 PM) 477 | 对。。需要存下connection的信息 478 | From ningdi to Everyone: (7:42 PM) 479 | @Sean, 那time draft的情况也会发生吧 难道还能修复已经persist 到db的records? 480 | From Sean Gao to Everyone: (7:42 PM) 481 | @ningdi 细节我不懂。。。 482 | From v to Everyone: (7:42 PM) 483 | erlang是个类似websocket的协议么 484 | From 姚剣楠 to Everyone: (7:42 PM) 485 | Websocket的连接量不是瓶颈 百万应该也没问题 但是前面要是有nginx 那nginx的6w5端口数 就是瓶颈了 486 | From ray to Everyone: (7:42 PM) 487 | right 488 | From Yijie Shen’s iPhone to Everyone: (7:42 PM) 489 | Scale 可以弄多个websocket handler 吗? 490 | From Sean Gao to Everyone: (7:43 PM) 491 | thanks Jiannan 492 | From ningdi to Everyone: (7:43 PM) 493 | Nginx会成为battleneck? 是因为websocket 这个协议导致的吗? 
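Editor's sketch of the arithmetic behind the "65535 ports" point debated above. A TCP connection is identified by source IP, source port, destination IP, and destination port, so a reverse proxy runs out of source ports per (proxy IP, backend endpoint) pair rather than globally. The function name and the assumption that each proxied WebSocket costs the proxy two sockets (one facing the client, one facing the backend) are illustrative; 65535 is the upper bound quoted in the chat, and the usable ephemeral range on a given host is typically smaller.

```python
def proxy_connection_ceiling(source_ips, backend_endpoints, ports_per_tuple, fd_limit):
    """Rough upper bound on concurrent proxied connections.

    source_ips        -- addresses the proxy can originate upstream connections from
    backend_endpoints -- distinct backend (IP, port) pairs
    ports_per_tuple   -- usable source ports per (source IP, backend endpoint)
    fd_limit          -- file descriptors available to the proxy process
    """
    port_bound = source_ips * backend_endpoints * ports_per_tuple
    # Assumption: one descriptor for the client side, one for the upstream side.
    fd_bound = fd_limit // 2
    return min(port_bound, fd_bound)


one_backend = proxy_connection_ceiling(1, 1, 65535, 2_000_000)
four_backends = proxy_connection_ceiling(1, 4, 65535, 2_000_000)
many_backends = proxy_connection_ceiling(1, 100, 65535, 2_000_000)
```

With one backend the port range is the limit; adding backends or proxy source addresses raises it until the file descriptor budget becomes the tighter bound, which matches the later remark in the thread that the limit scales with the number of backend IPs.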
494 | From Lucas Li to Everyone: (7:43 PM) 495 | websocket连的是http 服务器一个机器大概50K个连接左右? 496 | From 姚剣楠 to Everyone: (7:44 PM) 497 | https://blog.51cto.com/u_15300443/3091841 这里有人也处理过这个坑 498 | From ray to Everyone: (7:44 PM) 499 | how many tcp connections are available for one server? theoretically? 500 | From Lucas Li to Everyone: (7:44 PM) 501 | 有个著名的10K问题 502 | From ningdi to Everyone: (7:44 PM) 503 | 666 感谢 504 | From YL to Everyone: (7:44 PM) 505 | 这不应该发送到group然后再发送给每个user吗 506 | From Lucas Li to Everyone: (7:45 PM) 507 | 后来有100K,1M50K应该没有问题 508 | From zepengzhao to Everyone: (7:45 PM) 509 | 好像后端可以做成pub/sub 510 | From 姚剣楠 to Everyone: (7:45 PM) 511 | how many tcp connections are available for one server? theoretically? 如果你内存 cpu够大 几百万是完全没问题的 512 | From A to Everyone: (7:45 PM) 513 | group chat 必然是push啊 514 | From Sean Gao to Everyone: (7:45 PM) 515 | NTP server time sync 516 | NTP is intended to synchronize all participating computers to within a few milliseconds of Coordinated Universal Time (UTC). It uses the intersection algorithm, a modified version of Marzullo's algorithm, to select accurate time servers and is designed to mitigate the effects of variable network latency. 517 | From zepengzhao to Everyone: (7:45 PM) 518 | group chat里面的参与者都subscribe到某个conversation 519 | From A to Everyone: (7:45 PM) 520 | 肯定是pub、sub,一个groupchat就是一个topic 521 | From 姚剣楠 to Everyone: (7:45 PM) 522 | 我试过 把websocket服务器的文件描述符改成200w 一台也能处理 523 | From ray to Everyone: (7:46 PM) 524 | google global database use time stamp for strong consistency 525 | From tomdi to Everyone: (7:46 PM) 526 | whatspp 一个server可以 5M connection 527 | From zepengzhao to Everyone: (7:46 PM) 528 | 要clarify 不会有很多人 529 | From 姚剣楠 to Everyone: (7:46 PM) 530 | 2. 
文件描述符数量 可能需要调整内核参数,文件描述符的数量其实也是和内存相关的,因为每打开一个tcp连接,就得占用一个文件描述符。 内核参数:fs.file-max 这是和系统资源相关的,也不会是瓶颈 531 | From zepengzhao to Everyone: (7:46 PM) 532 | fanout太多人的话就会有performance问题 533 | From ray to Everyone: (7:46 PM) 534 | NTP is quite an important module for server 535 | From 姚剣楠 to Everyone: (7:46 PM) 536 | 搬运工 供参考 537 | From zepengzhao to Everyone: (7:46 PM) 538 | 这其实跟new feed的道理差不多fanout 539 | From v to Everyone: (7:47 PM) 540 | 感觉最开始提需求挖坑太多了 541 | From Robin to Everyone: (7:47 PM) 542 | +1 挖坑太多了 543 | From Lucas Li to Everyone: (7:47 PM) 544 | 需求跟面试官都确认过的吧 545 | From zzb to Everyone: (7:47 PM) 546 | 是的 应该就简单的 use case 开始做 547 | From ningdi to Everyone: (7:47 PM) 548 | nginx最多只能维持(65535*后端服务器IP个数)条websocket的长连接-> 意思是 我加很多台机器 其实也不算是啥瓶颈咯。 正常来说希望一个机器处理多少connection比较合适呢? 549 | From A to Everyone: (7:47 PM) 550 | IBM 的chatting broker可以handle 最多100万个session 551 | From Lucas Li to Everyone: (7:48 PM) 552 | 是不是面试官点头的,都要讨论啊 553 | From A to Everyone: (7:48 PM) 554 | 为啥用websocket?websocket协议有啥优势吗? 555 | From xing wang to Everyone: (7:48 PM) 556 | 超时了吗?有人记录时间吗? 557 | From Lucas Li to Everyone: (7:48 PM) 558 | 双向通信 559 | From Andrew Hou to Everyone: (7:48 PM) 560 | 好奇的问下 multi user chat的系统设计 不应该是focus on 系统设计上吗,感觉现在是在说多人聊天的功能逻辑 561 | From zepengzhao to Everyone: (7:48 PM) 562 | 而且很重要一点,好像把面试官当coworker会比较好, 而不是给他找个solution 563 | From A to Everyone: (7:48 PM) 564 | 这里面没必要双向通信 565 | From ningdi to Everyone: (7:48 PM) 566 | 说实话 面试官给的这个requirement 我都觉得没必要用websocket了 long pulling貌似都能处理的了 567 | From zepengzhao to Everyone: (7:48 PM) 568 | T5 569 | From kk to Everyone: (7:49 PM) 570 | 聊天为什么没必要双向。。 571 | From zepengzhao to Everyone: (7:49 PM) 572 | 要求drive, 还有如何应对feedback 573 | From v to Everyone: (7:49 PM) 574 | long pulling和web socket之前的pro con分部是啥? 
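Editor's sketch of the "a group chat is a topic that members subscribe to" idea from the thread, kept in memory so it runs on its own. The class name, the callback-based delivery, and the per-user offline inbox are illustrative assumptions; a real deployment would put a connection registry and durable storage where the dictionaries are.

```python
from collections import defaultdict


class ConversationBroker:
    """In-memory fan-out: one conversation, many subscribed members."""

    def __init__(self):
        self._members = defaultdict(set)         # conversation_id -> user ids
        self._online = {}                        # user_id -> delivery callback
        self._offline_inbox = defaultdict(list)  # user_id -> undelivered messages

    def subscribe(self, conversation_id, user_id):
        self._members[conversation_id].add(user_id)

    def connect(self, user_id, deliver):
        # Register the live connection, then flush anything queued while offline.
        self._online[user_id] = deliver
        for message in self._offline_inbox.pop(user_id, []):
            deliver(message)

    def disconnect(self, user_id):
        self._online.pop(user_id, None)

    def publish(self, conversation_id, sender_id, text):
        """Deliver to every member except the sender; return the fan-out count."""
        message = (conversation_id, sender_id, text)
        recipients = self._members[conversation_id] - {sender_id}
        for user_id in sorted(recipients):
            deliver = self._online.get(user_id)
            if deliver is not None:
                deliver(message)                              # push over the live connection
            else:
                self._offline_inbox[user_id].append(message)  # hold until reconnect
        return len(recipients)
```

The return value of `publish` is the write fan-out per message, which is the quantity the participants worry about for very large groups; a 1:1 chat is simply a conversation with two members, so it needs no topic of its own per friend pair.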
From A to Everyone: (7:49 PM)
Because the bidirectional communication is with the server, not between sender and receiver.
From zepengzhao to Everyone: (7:49 PM)
It's all about latency.
From Yijie Shen’s iPhone to Everyone: (7:49 PM)
For real time, isn't WebSocket better?
From bernini to Everyone: (7:49 PM)
Too much overhead?
From A to Everyone: (7:49 PM)
You can decouple sender and receiver.
From v to Everyone: (7:49 PM)
When is long polling better?
From zepengzhao to Everyone: (7:49 PM)
WebSocket has the lowest latency.
From Lucas Li to Everyone: (7:49 PM)
With a short interval the server can't cope; with a long interval the experience is poor.
From kk to Everyone: (7:49 PM)
There is no case where long polling is better. Long polling is plain HTTP and easier to implement. For lazy people.
From bernini to Everyone: (7:50 PM)
It sends too many requests.
From ningdi to Everyone: (7:50 PM)
These requirements seem to conflict: real time on one hand, but no messages received unless logged in on the other...
From Robin to Everyone: (7:50 PM)
WebSocket has low overhead; no need to open a new connection for every message.
From 姚剣楠 to Everyone: (7:50 PM)
For a chat room, either WebSocket or SSE works; long polling has no advantage, right?
From v to Everyone: (7:50 PM)
Yeah.. WebSocket always seems better than long polling.
From Feng Gao to Everyone: (7:50 PM)
Feels like there's no time left to design storage.
From zepengzhao to Everyone: (7:50 PM)
Not logged in, no notifications received.
From 姚剣楠 to Everyone: (7:50 PM)
With WebSocket, if you know the framework, development is actually fast too.
From Andrew Hou to Everyone: (7:50 PM)
Group chat is definitely not real time. First, design notifications; second, asynchronous push.
From ray to Everyone: (7:50 PM)
Why not real time for group chat?
From A to Everyone: (7:50 PM)
Look at her design: the backend is clearly acting as a broker.
From kk to Everyone: (7:50 PM)
Long polling...
From A to Everyone: (7:50 PM)
No need for bidirectional communication.
From Lucas Li to Everyone: (7:50 PM)
Login here should be a different thing from online.
From kk to Everyone: (7:50 PM)
Anything it can do, WS can do too.
From Jerry to Everyone: (7:51 PM)
For group chat, shouldn't we discuss online and offline users as two separate groups?
From Michael to Everyone: (7:51 PM)
Sending to online users and to offline users should be different, right?
From v to Everyone: (7:51 PM)
Why use a message queue? Can Kafka support this many topics?
From Lucas Li to Everyone: (7:51 PM)
Decoupling.
From lining to Everyone: (7:51 PM)
Does 1-to-1 need pub/sub?
From kk to Everyone: (7:51 PM)
Not necessary.
From 姚剣楠 to Everyone: (7:51 PM)
"Can Kafka support this many topics?" Same question here.
From A to Everyone: (7:51 PM)
Decoupling, fancy word.
From v to Everyone: (7:52 PM)
Why does this need decoupling?
From Feng Gao to Everyone: (7:52 PM)
I also find it a bit odd; why not put messages in a DB?
From zzb to Everyone: (7:52 PM)
Whether to use sockets here is an implementation detail. The candidate should focus on the modules: which ones exist, the data types, which data goes in which DB, and how they interact with the system. Explain those clearly.
From v to Everyone: (7:52 PM)
With this many topics, Kafka performance would probably be affected.
From 姚剣楠 to Everyone: (7:52 PM)
Abstract every chat room or 1-to-1 chat as a channel within WebSocket.
From ray to Everyone: (7:52 PM)
Maybe use Redis for the in-memory cache.
From lining to Everyone: (7:52 PM)
If 1-to-1 uses pub/sub, how many topics would that be?
From Lucas Li to Everyone: (7:52 PM)
MQ is FIFO with high throughput.
From zepengzhao to Everyone: (7:52 PM)
You could model a conversation; the chat participants are the conversation's subscribers.
From kk to Everyone: (7:52 PM)
Using an MQ in a chat server is unnecessary.
From Yijie Shen’s iPhone to Everyone: (7:53 PM)
When the other party is offline, put the message in Kafka; when online, use WebSocket.
From zepengzhao to Everyone: (7:53 PM)
Works for both 1-on-1 and group chat, I think.
From ningdi to Everyone: (7:53 PM)
Topics for 1:1 too means n^2 topics.
From v to Everyone: (7:53 PM)
No need... just query directly.
From A to Everyone: (7:53 PM)
Using an MQ in a chat server is very common.
From ray to Everyone: (7:53 PM)
Too many topics.
From yingzhu to Everyone: (7:53 PM)
With WebSocket, is an MQ no longer needed?
From v to Everyone: (7:53 PM)
No need to cache.
From ningdi to Everyone: (7:53 PM)
WS and MQ don't conflict, right?
From ray to Everyone: (7:53 PM)
Message queues look like they're used for service-to-service delivery, not for user-to-user chat :)
From zepengzhao to Everyone: (7:53 PM)
Too many topics can be solved by infra, right?
From kk to Everyone: (7:53 PM)
MQ is common in chat?
From Lucas Li to Everyone: (7:54 PM)
What if the service can't keep up?
From bernini to Everyone: (7:54 PM)
In practice, WeChat drops messages too.
From kk to Everyone: (7:54 PM)
I'm skeptical.
From sherry的 iPhone to Everyone: (7:54 PM)
What if the MQ jams?
From zepengzhao to Everyone: (7:54 PM)
Is having too many topics a problem?
From A to Everyone: (7:54 PM)
An MQ can 1. hold requests that weren't processed in time, and 2. store the message log in the queue for later analysis and storage, serving as an append-only log.
From bernini to Everyone: (7:54 PM)
I've run into problems writing offline sync.
From Lucas Li to Everyone: (7:54 PM)
Horizontal partitioning.
From A to Everyone: (7:54 PM)
What does horizontal partitioning mean?
From xing wang to Everyone: (7:55 PM)
Today's interviewer is too nice.
From ray to Everyone: (7:55 PM)
Maybe the user pulls messages directly from the in-memory cache?
From zepengzhao to Everyone: (7:55 PM)
And nobody asked whether messages should be stored in the cloud.
From A to Everyone: (7:55 PM)
Horizontal sharding?
From zepengzhao to Everyone: (7:55 PM)
None of it was settled.
From A to Everyone: (7:55 PM)
What's the cache for?
From 姚剣楠 to Everyone: (7:55 PM)
Wouldn't storing users' chat history directly raise legal issues?
From v to Everyone: (7:55 PM)
The write fanout amplification in this design is too large... For a 500-person group, every message gets written to Kafka 500 times.
From Yijie Shen’s iPhone to Everyone: (7:55 PM)
The interviewer isn't saying much. Are most interviews like this?
From ningdi to Everyone: (7:56 PM)
If 1:1 uses an MQ, imagine two people become friends and then send a message: create the topic on the spot? If topics are preset, then with 5M users who each have 1000 friends, you'd have to create 5 billion (5k * 1M) topics.
From zepengzhao to Everyone: (7:56 PM)
A fanout of 500 isn't big. The real worry is millions, like Trump's millions of followers.
From A to Everyone: (7:56 PM)
User info has to be stored.
From YL to Everyone: (7:56 PM)
I think it depends on how you communicate with the interviewer.
From zepengzhao to Everyone: (7:56 PM)
500 can be handled with multiple workers.
From kk to Everyone: (7:56 PM)
Once topics multiply like that..
From sherry的 iPhone to Everyone: (7:56 PM)
No need to create 500 topics, right? Just have 500 users subscribe to one topic and be done.
From 姚剣楠 to Everyone: (7:56 PM)
"If 1:1 uses an MQ, imagine two people become friends and then send a message: create the topic on the spot? If topics are preset, then with 5M users who each have 1000 friends, you'd have to create 5 billion topics." Agreed; Kafka would spend most of its time creating and deleting topics.
From Lucas Li to Everyone: (7:56 PM)
One topic per user is enough, right?
From kk to Everyone: (7:56 PM)
Can Kafka hold up? I'm skeptical.
From A to Everyone: (7:56 PM)
If you're worried, just hash it, and peek quietly when you want to look.
From ray to Everyone: (7:57 PM)
Kafka can't hold up.
From A to Everyone: (7:57 PM)
One user is one topic.
From ray to Everyone: (7:57 PM)
How much memory does creating just one new topic take?
From ningdi to Everyone: (7:57 PM)
One topic per user isn't realistic either.
From A to Everyone: (7:57 PM)
Topics are tree-structured: /a/b/c/d/e/. Super awesome.
From Jiayue(Hubert) Wu to Everyone: (7:57 PM)
No need to store 500 copies of each message; store one copy and push it to everyone.
From YL to Everyone: (7:57 PM)
The interviewer seems to have laughed.
From zepengzhao to Everyone: (7:57 PM)
Too many topics: infra solves it. Time's up.
From Lucas Li to Everyone: (7:58 PM)
Imagine 1000 machines, each with 10000 topics.
From 姚剣楠 to Everyone: (7:58 PM)
WebSocket should easily solve those Kafka problems above.. Why not just use sockets directly?
From ningdi to Everyone: (7:58 PM)
Can Kafka brokers already handle this many topics?
From Jerry to Everyone: (7:58 PM)
But what about when the user is offline?
From Lucas Li to Everyone: (7:58 PM)
What if the HTTP server can't keep up?
From YL to Everyone: (7:58 PM)
Handle offline via the notification server.
From Michael to Everyone: (7:59 PM)
Browsers can't use sockets.
From Lucas Li to Everyone: (7:59 PM)
What if the notification server can't keep up?
From Feng Gao to Everyone: (7:59 PM)
Images/videos probably can't go into the DB. Blob storage is needed.
From xing wang to Everyone: (7:59 PM)
Did they explain why not use KV?
From Richard Tu to Everyone: (7:59 PM)
Sorry, I may have missed something. What's the DocumentDB for?
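
A quick check of the preset-topic estimate raised above (5M users, 1000 friends each). This is only a back-of-the-envelope sketch: the function names are made up for illustration, and it assumes friendships are symmetric, so one topic per pair is half of one topic per direction.

```python
def topics_per_direction(users: int, friends_per_user: int) -> int:
    """One topic for each (sender, receiver) ordering."""
    return users * friends_per_user


def topics_per_pair(users: int, friends_per_user: int) -> int:
    """One shared topic per friendship; each friendship is counted twice above."""
    return users * friends_per_user // 2


def topics_per_user(users: int) -> int:
    """One inbox topic per user, regardless of friend count."""
    return users


users, friends = 5_000_000, 1_000
print(f"per direction: {topics_per_direction(users, friends):,}")  # 5,000,000,000
print(f"per pair:      {topics_per_pair(users, friends):,}")       # 2,500,000,000
print(f"per user:      {topics_per_user(users):,}")                # 5,000,000
```

Either pairwise figure is in the billions, while one inbox topic per user stays at 5M, which is the gap the thread is arguing about.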
From bernini to Everyone: (7:59 PM)
Usually it's sockets; sending requests is too expensive.
From 姚剣楠 to Everyone: (7:59 PM)
Wouldn't it be good to estimate network traffic? With 10M users online at once, a message you send is broadcast to everyone in the group, amplified up to 500x. As users grow, traffic gets pretty large.
From bernini to Everyone: (7:59 PM)
KV range search is too expensive.
From 非洲黑猴子 to Everyone: (7:59 PM)
Just use object storage for attachments; it can also compress and transcode.
From 姚剣楠 to Everyone: (7:59 PM)
Our team's service had its network saturated last week by an attack of 4gps per second.
From Lucas Li to Everyone: (8:00 PM)
Send multimedia to S3 and just put a link in the message.
From ningdi to Everyone: (8:00 PM)
Doesn't your team have a blocklist..
From Yijie Shen’s iPhone to Everyone: (8:00 PM)
For attachments like images or videos, isn't S3 better?
From Enze to Everyone: (8:00 PM)
MQTT can support a large number of topics.
From Feng Gao to Everyone: (8:00 PM)
This backend is just one service....
From A to Everyone: (8:00 PM)
Finally someone mentioned MQTT, I knew it lol
From ray to Everyone: (8:01 PM)
How is a message delivered across regions, e.g. one person sends from Asia and another receives in North America? Is there any replication in this case?
From A to Everyone: (8:01 PM)
"Just use object storage for attachments; it can also compress and transcode." Brother Monkey, what is 对象存储?
From A to Everyone: (8:01 PM)
Object storage? S3?
From Lao luo to Everyone: (8:01 PM)
If it's group chat, did they say roughly how many groups? 10M users doesn't necessarily mean many groups.
From 非洲黑猴子 to Everyone: (8:02 PM)
It stores files, KV-style; with the key you can find the file in the cloud.
From ningdi to Everyone: (8:02 PM)
Going straight to the full AWS suite, basically nothing is a bottleneck and nothing is a problem, hahaha.
From Lao luo to Everyone: (8:02 PM)
Kafka does have topics.
From Sean Gao to Everyone: (8:03 PM)
@ray Store in a global NoSQL and replicate to other regions?
From xing wang to Everyone: (8:03 PM)
"Going straight to the full AWS suite, basically nothing is a bottleneck and nothing is a problem, hahaha"... feels a bit like taking a shortcut.
From YL to Everyone: (8:03 PM)
If it's 45 minutes, time is already up… in practice you'd probably have under half an hour.
From Lucas Li to Everyone: (8:03 PM)
Designing S3 is another interview question in itself.
From ningdi to Everyone: (8:03 PM)
GFS
From ray to Everyone: (8:04 PM)
There should be some synchronization process to ensure message consistency for group chat.
From A to Everyone: (8:04 PM)
S3 is great, memorized it. Thanks, Brother 非洲黑猴子.
From Lucas Li to Everyone: (8:04 PM)
If all else fails, skip the multimedia requirement.
From Richard Tu to Everyone: (8:04 PM)
I think the above is also what I wanted to ask. Suppose you as the candidate know the AWS suite very well; from the interviewer's perspective, is that what the interviewer wants?
From A to Everyone: (8:04 PM)
Attachments should be deduplicated too.
From bernini to Everyone: (8:04 PM)
In practice, cross-region sync latency for multimedia is huge.
From Zhengguan Li to Everyone: (8:04 PM)
Isn't the receiver just the MQ consumer?
From ray to Everyone: (8:04 PM)
There is a tunnel for cross-region networking.
From A to Everyone: (8:05 PM)
What's a tunnel?
From ray to Everyone: (8:05 PM)
It would be super high speed.
From Lucas Li to Everyone: (8:05 PM)
In the first test, we set up a Kafka cluster with 5 brokers on different racks. In that cluster, we created 25,000 topics, each with a single partition and 2 replicas, for a total of 50,000 partitions. So, each broker has 10,000 partitions. We then measured the time to do a controlled shutdown of a broker. The results are shown in the table below. https://blogs.apache.org/kafka/entry/apache-kafka-supports-more-partitions
From Becky to Everyone: (8:05 PM)
Is the receiver service a single-machine stream?
From A to Everyone: (8:06 PM)
That's an AWS ASG with 10,000 VMs in it.
From ray to Everyone: (8:06 PM)
We are using Zoom's chat function, LOL.
From v to Everyone: (8:06 PM)
lol
From ray to Everyone: (8:06 PM)
And we can deliver real-time video and audio globally.
From A to Everyone: (8:06 PM)
Zoom is awesome.
From zepengzhao to Everyone: (8:06 PM)
message -> using conversation_id to get chat participants -> get participant’s web socket sessions -> push message to participant’s machine -> ws -> user device
From Lucas Li to Everyone: (8:07 PM)
Zoom has a limited number of concurrent online users.
From ray to Everyone: (8:08 PM)
There are roles and permissions to set up for the chat, especially for groups.
From A to Everyone: (8:08 PM)
The candidate didn't mention IAM.
From ray to Everyone: (8:08 PM)
The admin can create groups and delete groups.
From A to Everyone: (8:08 PM)
Asked for so many requirements, no time left to cover them.
From Jerry to Everyone: (8:09 PM)
What operations happen in the consume step?
From lining to Everyone: (8:09 PM)
Should you control the number of requirements yourself?
From Zhengguan Li to Everyone: (8:09 PM)
Quick question: do you also get to draw diagrams in a real interview?
From ray to Everyone: (8:09 PM)
If it's a message queue, it's pulling messages from the topic.
From YL to Everyone: (8:09 PM)
Yes, you draw.
From Jerry to Everyone: (8:09 PM)
What DB should be used to store and read the group's chat history for recipients?
From ray to Everyone: (8:10 PM)
So that's quite a drawback of pulling, since it's hard to balance producing and consuming.
From Sean Gao to Everyone: (8:10 PM)
LSM, it should be.
From Lucas Li to Everyone: (8:10 PM)
Can pub/sub send messages to services?
From ray to Everyone: (8:10 PM)
There is a push style.
From YL to Everyone: (8:10 PM)
I still think a wide-column DB should be used.
From Jerry to Everyone: (8:10 PM)
Does a message queue have a capacity limit? What happens when the limit is reached?
From aaa to Everyone: (8:11 PM)
Time's about up.
From ray to Everyone: (8:11 PM)
If everyone likes using AWS, we can have an interview on how to design an AWS service like S3 or DocumentDB.
From YL to Everyone: (8:11 PM)
We're already over time.
From Andrew Hou to Everyone: (8:11 PM)
I think this is designing a flowchart rather than designing a system.
From Ken to Everyone: (8:11 PM)
Started 7:13
From shawnzech to Everyone: (8:11 PM)
....
From Zhengguan Li to Everyone: (8:12 PM)
There's a limit: queue depth (# of messages), or size (e.g. 5M), that sort of thing.
From Jiabei Luo to Everyone: (8:12 PM)
Why not use a NoSQL DB?
From ray to Everyone: (8:12 PM)
Good point.
From Sean Gao to Everyone: (8:12 PM)
It must be: write-heavy, few edits, time-ordered.
From ray to Everyone: (8:12 PM)
And there's also WeChat's block-friend feature, LOL.
From Richard Tu to Everyone: (8:13 PM)
Exactly 1 hour.
From lining to Everyone: (8:13 PM)
Time's up.
From Jerry to Everyone: (8:13 PM)
It seems they haven't said how message storage is partitioned?
From Peng Wang to Everyone: (8:14 PM)
What is metadata?
From YL to Everyone: (8:14 PM)
Besides by UserId, what other partitioning methods are there?
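
The delivery path described above (conversation_id -> participants -> WebSocket sessions -> push) can be sketched in memory as follows. This is a hedged illustration, not the candidate's design: `ChatBroker`, its methods, and the plain lists standing in for WebSocket connections are all hypothetical, and it follows the "store one copy, then push to everyone" suggestion rather than writing one copy per recipient.

```python
from collections import defaultdict


class ChatBroker:
    """In-memory stand-in for the chat backend; every name here is hypothetical."""

    def __init__(self):
        self.participants = defaultdict(set)  # conversation_id -> user ids
        self.sessions = defaultdict(list)     # user_id -> open session inboxes
        self.log = defaultdict(list)          # conversation_id -> stored messages
        self.pending = defaultdict(list)      # user_id -> (conversation_id, index)

    def join(self, conversation_id, user_id):
        self.participants[conversation_id].add(user_id)

    def connect(self, user_id):
        """Open a session; the returned list plays the role of a WebSocket."""
        inbox = []
        self.sessions[user_id].append(inbox)
        return inbox

    def send(self, conversation_id, sender, text):
        """Store the message once, then fan out to every other participant."""
        self.log[conversation_id].append((sender, text))
        index = len(self.log[conversation_id]) - 1
        pushed = 0
        for user in self.participants[conversation_id]:
            if user == sender:
                continue
            inboxes = self.sessions.get(user, [])
            if not inboxes:
                # Offline: keep a pointer into the log instead of a copy.
                self.pending[user].append((conversation_id, index))
                continue
            for inbox in inboxes:
                inbox.append((conversation_id, sender, text))
                pushed += 1
        return pushed


broker = ChatBroker()
for user in ("alice", "bob", "carol"):
    broker.join("room-1", user)
bob_socket = broker.connect("bob")            # bob is online, carol is not
pushed = broker.send("room-1", "alice", "hello")
print(pushed, bob_socket, broker.pending["carol"])
```

With a 500-person group this still does up to 500 pushes per message, but only one write to the message log; offline users get a pointer they can resolve on their next login.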
From A to Everyone: (8:14 PM)
It's Facebook's data.
From ray to Everyone: (8:14 PM)
Metadata is like control-plane data, such as the configuration of the message queue.
From lining to Everyone: (8:15 PM)
LOL
From ray to Everyone: (8:15 PM)
Facebook has no face anymore, lol
From 非洲黑猴子 to Everyone: (8:15 PM)
For the metaverse, they gave up their face.
From A to Everyone: (8:15 PM)
If they offer a big package you still have to go.
From AAA to Everyone: (8:16 PM)
😂
From A to Everyone: (8:16 PM)
Looking for a referral to the metaverse company.
From Pencil to Everyone: (8:16 PM)
That’s a good question
From AAA to Everyone: (8:16 PM)
Let me ride your coattails.
From bernini to Everyone: (8:16 PM)
Last time, for E5, FB failed me on SD; I passed the extra round and they still wouldn't take me. Then a few months later they lowered the bar massively.
From A to Everyone: (8:16 PM)
E5 big shot, wow.
From AAA to Everyone: (8:17 PM)
What's your background?
From Jiabei Luo to Everyone: (8:17 PM)
What do you mean by lowering the bar?
From A to Everyone: (8:17 PM)
Even with the bar lowered they didn't want me 😭 The bar came down. Are they still lowering the bar now? Still, I think? The E5 bar became E4's? I interviewed right when the pandemic broke out. 312 Asked to change to E4 and got rejected; they approached you because they were looking for E5. 132132132132132132321132132132132132132132132132132132123123 How much time should we spend in gathering requirements? 123123123123123123213
From Peijin to Everyone: (8:19 PM)
213
From 非洲黑猴子 to Everyone: (8:19 PM)
213
From Kevin to Everyone: (8:19 PM)
123
From 2002079 cici to Everyone: (8:19 PM)
213
From lining to Everyone: (8:19 PM)
213
From A to Everyone: (8:19 PM)
1
From johnc to Everyone: (8:19 PM)
213
From Patrick to Everyone: (8:19 PM)
21
From Spin to Everyone: (8:20 PM)
21
From ningdi to Everyone: (8:20 PM)
2
From Qianwen Huang to Everyone: (8:20 PM)
21
From ray to Everyone: (8:20 PM)
hard skill 213
From yinghuaguan to Everyone: (8:20 PM)
2
From ningdi to Everyone: (8:21 PM)
The requirements-gathering step was really impressive. Learned something.
From A to Everyone: (8:21 PM)
The interviewer should have scoped it down to the chat service only.
From lining to Everyone: (8:22 PM)
Maybe too familiar with WeChat.
From A to Everyone: (8:22 PM)
The candidate asked too many requirement clarifications; the interviewer needed to rein it in.
From Jian Zhu to Everyone: (8:22 PM)
The candidate should control that themselves.
From bernini to Everyone: (8:22 PM)
Too much time was spent on requirements.
From ningdi to Everyone: (8:22 PM)
If it were me, I'd probably just focus on message delivery.. and not care about anything else 😂
From 非洲黑猴子 to Everyone: (8:22 PM)
The interviewer should steer a bit.
From AAA to Everyone: (8:23 PM)
Sorry, sent by mistake.
From A to Everyone: (8:23 PM)
The interviewer needs to steer.
From AAA to Everyone: (8:23 PM)
Is there a recording? Actually, I think in a real interview you still can't rely on the interviewer. If the interviewer isn't positive to begin with, the candidate still has to manage the time as much as possible. Interviewer, be more confident. Yes! If you don't ask about requirements yourself, will you lose points? I feel ballpark calculations like DAU aren't very useful; can they be skipped in an interview? I interviewed at Amazon once and really got two interviewers with completely different styles in two rounds.. one white guy was super engaged, one Asian interviewer just ignored me and it was all me talking. If I become an interviewer someday, I think ballpark calculations can just be skipped. Most interviewers are like this; they hand the stage to the candidate. Ballpark calculation is the most useless. Just skip it. Waste of time. I also think those calculations aren't very useful. You can just ask the interviewer: will I lose points if I don't cover it? There seems to be no API design at all 😂 Unless you have a feel for hardware capability, it's wasted effort. If it's needed later when designing the DB, you can bring it up again. Oh right.
From Zhengguan Li to Everyone: (8:26 PM)
Is the protocol HTTP long polling, WebSocket, that sort of thing?
From Jiabei Luo to Everyone: (8:26 PM)
No API design, only object design, haha.
From Michael to Everyone: (8:26 PM)
It should be real-time chat on the receiver side.
From Michael Qiu to Everyone: (8:26 PM)
Chat needs bidirectional communication.
From bernini to Everyone: (8:27 PM)
WebSocket doesn't need to keep handshaking; low overhead.
From 非洲黑猴子 to Everyone: (8:27 PM)
Netty is worth considering for the underlying layer.
From Sean Gao to Everyone: (8:28 PM)
Java's performance is probably still a bit lacking. C++, C, or Erlang?
From A to Everyone: (8:28 PM)
What's Netty, Brother Monkey?
From bernini to Everyone: (8:28 PM)
NoSQL?
From v to Everyone: (8:28 PM)
Could the interviewer comment on whether Kafka is OK to use here? Can it support this many topics?
From bernini to Everyone: (8:28 PM)
Then wouldn't pulling history on re-login be slow?
From Michael to Everyone: (8:28 PM)
Is it that Java NIO-model framework?
From 非洲黑猴子 to Everyone: (8:29 PM)
A framework used underneath for the RPC layer; Spark, Flink, and Dubbo all use it.
From Sean Gao to Everyone: (8:29 PM)
Yes, an NIO framework.
From 非洲黑猴子 to Everyone: (8:29 PM)
Kafka is aloof; it wrote its own network layer.
From A to Everyone: (8:29 PM)
Is the RPC protocol lightweight?
From zepengzhao to Everyone: (8:30 PM)
Can we discuss where exactly group chat fanout is best done?
From Jiabei Luo to Everyone: (8:30 PM)
Can't a chunk of history be cached in memory?
From zepengzhao to Everyone: (8:31 PM)
So with a 50-person chat room, do we fan one message out into 50, each with the same conv_id but a different recipient?
From A to Everyone: (8:31 PM)
Why put history in a cache?
From 非洲黑猴子 to Everyone: (8:31 PM)
Yes, you can even cache history messages on the client side.
From ningdi to Everyone: (8:31 PM)
Fanout, put simply: there are n messages, and for each one you write it to m people.
From A to Everyone: (8:31 PM)
Your chat history is all stored locally, right?
From Jiabei Luo to Everyone: (8:31 PM)
I mean things like client-side localStorage. Right.
From Zhengguan Li to Everyone: (8:31 PM)
How do the protocol and Kafka points connect?
From zepengzhao to Everyone: (8:32 PM)
Chat history should be clarified during requirements.
From Lao luo to Everyone: (8:32 PM)
Has anyone solved the problem of too many dynamically created topics? We're hitting a similar issue right now.
From xinz to Everyone: (8:33 PM)
Is communication between internal services done via RPC?
From A to Everyone: (8:33 PM)
非洲黑猴子 is a guru. Everyone, listen up. Where is the token stored? The token is stored locally, correct. So is it still using topics? One topic per group chat? SQL too expensive. I think pub/sub + message handler + message queue is more robust. I still disagree. What? SQL? User must use SQL. Writing messages with SQL is too slow, isn't it? The message queue has only one subscriber. First time I've heard of storing the user table in NoSQL. Messages can go in NoSQL. SQL isn't good for messages. User groups can go in SQL. Separate them. Writing messages to SQL, how slow can it be? I don't think it would be especially slow. Use NoSQL for messages. DBs aren't expensive; add a few more. For NoSQL, Bigtable is rock solid, wide-column. Messages in NoSQL, images in S3, everything else in SQL. SQL's main issue is that scalability isn't great; with too many messages, sharding them into NoSQL works well. Right. Storing history messages can be done with asynchronous processing. Why does the user table have to use SQL? Sorry, I didn't get it. Is an object store a database? Cuz user-group needs relational association. There is no point in SQL unless you need to join multiple tables. When you need to see who's in your WeChat group, what they're called, whether they're good-looking, you need SQL. I disagree with any claim about SQL sharding or scalability, because nowadays any sharding/scalability NoSQL can do, SQL can do too. Every time you send/receive a message you need to read/write a row in the SQL database. @Richard What about write efficiency?? Why read from the DB? It's only writes. Messages should go in an append-only NoSQL DB; user, auth and the like are necessarily SQL. An example I usually use is that DynamoDB uses MySQL as its single-node storage node. Yes, Uber's internal one is also modified MySQL, storing real-time geo data. Normalization. But in that case, what's the difference? user-group table needs a secondary key?
From ray to Everyone: (8:40 PM)
DynamoDB uses the MySQL database engine.
From Enze to Everyone: (8:41 PM)
How do you join when the data volume is large?
From A to Everyone: (8:41 PM)
Data normalization has its benefits.
From Jackson to Everyone: (8:41 PM)
Firebase's Firestore: too few people use it. Honestly, although Firebase is NoSQL, it can model SQL-like relationships. Firebase isn't a good example.
From bernini to Everyone: (8:41 PM)
Would it overflow?
From YL to Everyone: (8:41 PM)
By that logic, can everything use SQL + good partitioning?
From Yue Liang to Everyone: (8:42 PM)
^That's my understanding too. TAO's backend also uses MySQL.
From Richard Tu to Everyone: (8:43 PM)
Right, although the engine has now been changed to MyRocks.
From ningdi to Everyone: (8:43 PM)
So in other words, if there's no partitioning, the limit is 10k
From A to Everyone: (8:43 PM)
What is a topic partition?
From ningdi to Everyone: (8:43 PM)
then at most 10k topics.
From Lucas Li to Everyone: (8:43 PM)
Cross-shard join queries?
From Lucas Li to Everyone: (8:44 PM)
In the first test, we set up a Kafka cluster with 5 brokers on different racks. In that cluster, we created 25,000 topics, each with a single partition and 2 replicas, for a total of 50,000 partitions. So, each broker has 10,000 partitions. We then measured the time to do a controlled shutdown of a broker. The results are shown in the table below.

https://blogs.apache.org/kafka/entry/apache-kafka-supports-more-partitions
From Lao luo to Everyone: (8:44 PM)
With too many Kafka topics, performance drops.
From ningdi to Everyone: (8:44 PM)
A 3-second delay is way too much.
From Lao luo to Everyone: (8:44 PM)
It depends on partitions.
From bernini to Everyone: (8:46 PM)
We just use Firebase as a cache.

## Ticket master
From Jun to Everyone: (7:17 PM)
What website is this drawing tool?
From Laoluo to Everyone: (7:17 PM)
It's definitely flash-sale related, but does it have different stations like 12306?
From Ken to Everyone: (7:17 PM)
We have notes for previous meeting. Please scan the QR code on top of the notes doc in this doc: https://docs.google.com/document/d/11hsGVxwAzfBPR6coFB-RiXmokUgqKbnFQ1R7urE6m_s/edit# to join our WeChat group
From Laoluo to Everyone: (7:17 PM)
Having different stations would make it a lot more complex.
From Skit to Everyone: (7:19 PM)
What are P0 and P3?
From Qiang Lu to Everyone: (7:20 PM)
Is it priority?
From Christie to Everyone: (7:20 PM)
Priority 0?
From 非洲黑猴子 to Everyone: (7:20 PM)
Whether the requirement is the most urgent.
From Eric Che to Everyone: (7:20 PM)
The point of this question should be a flash-sale-style design: how to guarantee tickets aren't oversold under high concurrency while still handling the high concurrent load.
From 非洲黑猴子 to Everyone: (7:20 PM)
P0 is basically the MVP.
From Skit to Everyone: (7:23 PM)
Callback function?
From Tekken to Everyone: (7:23 PM)
Jumping straight into API design?
From Alan to Everyone: (7:23 PM)
Can the customer send an order together with payment info, or can the customer place an order first and then pay within an hour? And payment info: assume a payment system.
From Laoluo to Everyone: (7:25 PM)
Curious: the last few times it seems a man has asked the question and a woman has answered.
From Erwin to Everyone: (7:25 PM)
Are non-functional requirements skipped for this one?
From Sun Anna to Everyone: (7:25 PM)
+1, where are the non-functional requirements?
From Jackie G to Everyone: (7:27 PM)
Sorry, what is “item”?
From Skit to Everyone: (7:28 PM)
I think she meant ticket "name" or description.
From YL to Everyone: (7:28 PM)
Does every order have to update all the tickets?
From Yanbin Li to Everyone: (7:28 PM)
What kind of tickets does this ticket system sell? Was that discussed?
From Richard Tu to Everyone: (7:29 PM)
Same question. Doesn't this need to consider seat numbers and such?
From bill.wang to Everyone: (7:29 PM)
TicketMaster: mostly they are concert tickets.
From YL to Everyone: (7:29 PM)
All tickets are the same.
From Erwin to Everyone: (7:29 PM)
Seat numbers should be needed in Ticketmaster so that users can select the seat they want.
From Richard Tu to Everyone: (7:30 PM)
From my perspective, there should be something like an event: an event table mapping to multiple tickets.
From anna to Everyone: (7:30 PM)
Off-topic question: what drawing software is this? It looks really handy; drag-and-drop is super convenient.
From YL to Everyone: (7:30 PM)
You could think of it as music festival tickets or the like, all standing room.
From Tekken to Everyone: (7:30 PM)
After requirements analysis it jumped straight to API design. Wouldn't it be better for the interviewer to steer here? Resource estimation, data flow, service discussion, system design diagram: these should all come before API design, right?
From Richard Tu to Everyone: (7:31 PM)
That works, but was it stated in the requirements? Did I miss a requirement? Re: > You could think of it as music festival tickets or the like, all standing room.
From YL to Everyone: (7:32 PM)
Tickets have no tiers; they're all the same.
From iPad to Everyone: (7:32 PM)
The requirement says all tickets are the same; doesn't that mean there's no seat distinction?
From Erwin to Everyone: (7:32 PM)
I bought tickets from Ticketmaster and there are different types of tickets for one event.
From Yanbin Li to Everyone: (7:32 PM)
Doing API design after requirements analysis is fine, I think. My understanding is that you first use API design to pin down what your service offers, which then makes it easier to design the data model and system architecture needed to provide it.
From Jilong Chen to Everyone: (7:32 PM)
User table should include credit card or other payment methods.
From Christie to Everyone: (7:32 PM)
What's the difference between the ticket service and the order service?
From Yufei Qian to Everyone: (7:33 PM)
Were QPS and TPS analyzed?
From 非洲黑猴子 to Everyone: (7:33 PM)
Probably the backing database tables differ: one is ticket, the other is order.
From Christie to Everyone: (7:34 PM)
Thanks!
From Yufei Qian to Everyone: (7:34 PM)
With this many users hitting the relational DB directly, it'll get overwhelmed, right?
From YL to Everyone: (7:34 PM)
ACID first.
From Erwin to Everyone: (7:35 PM)
Where do we store payment-related info?
From 非洲黑猴子 to Everyone: (7:35 PM)
Depends on the design. Designed well, it won't be overwhelmed; it can carry however much traffic you throw at it.
From Yufei Qian to Everyone: (7:35 PM)
The current design has no cache.
From renyuming to Everyone: (7:37 PM)
One thing to consider with a cache is freshness.
From YL to Everyone: (7:38 PM)
If we add a cache here, what should it store? Available tickets?
From ds awsome to Everyone: (7:38 PM)
10M users ~ 100 QPS, so one server is enough, right?
From Richard Tu to Everyone: (7:38 PM)
So this design leans more toward a flash sale? 10M QPS?
From ds awsome to Everyone: (7:38 PM)
Do we even need a distributed system at all?
From 非洲黑猴子 to Everyone: (7:38 PM)
Let's see what the interviewer says first.
From Shihao Zhong to Everyone: (7:38 PM)
The 10M users are probably all buying at around the same time, so QPS might need to be 10M.
From H.B. to Everyone: (7:38 PM)
Must the ticket count users see be guaranteed up to date?
From Erwin to Everyone: (7:38 PM)
Have we discussed peak QPS yet?
From Shihao Zhong to Everyone: (7:38 PM)
The 10M isn't evenly spread.
From 非洲黑猴子 to Everyone: (7:39 PM)
Users must see the latest, but the check at order time must be against the latest.
From leo zhang to Everyone: (7:39 PM)
How do you get 100 QPS from 10M users? This is a 10M flash sale, not DAU.
From 非洲黑猴子 to Everyone: (7:39 PM)
Users don't necessarily see the latest, but the check at order time must be against the latest.
From Laoluo to Everyone: (7:40 PM)
SQL should be fine, with a cache or read/write splitting.
From Erwin to Everyone: (7:40 PM)
Also, should each ticket have a UUID, so that we can refer to the payment/user-related info?
From v to Everyone: (7:40 PM)
Even with Redis, there's still 100k QPS?
From 非洲黑猴子 to Everyone: (7:40 PM)
Redis supports around 100k operations per second.
From v to Everyone: (7:41 PM)
Even with Redis, 100k QPS will still reach MySQL because there are 100k tickets.
From ds awsome to Everyone: (7:41 PM)
For a 10M flash sale, at least 10 Redis instances?
From Shihao Zhong to Everyone: (7:41 PM)
You can push n tickets to Redis at once. Doesn't Redis have atomic operations?
From iPad to Everyone: (7:41 PM)
At first I didn't even realize this was a flash sale question.
From renyuming to Everyone: (7:41 PM)
Redis has cluster mode and can scale up too. For the 100k reaching MySQL, you'd need to shard tickets.
From 非洲黑猴子 to Everyone: (7:42 PM)
Viewing doesn't mean buying. As for how write requests are handled, let's see how the candidate designs it.
From renyuming to Everyone: (7:42 PM)
Every ticket should be a record.
From ds awsome to Everyone: (7:42 PM)
At this scale MySQL can't hold up, right?
From iPad to Everyone: (7:42 PM)
How does the current design prevent overselling?
From renyuming to Everyone: (7:42 PM)
Redis shields it in front. Each ticket would also have a write lock, right?
From 非洲黑猴子 to Everyone: (7:42 PM)
Lock inventory.
From Richard Tu to Everyone: (7:43 PM)
At least add an MQ and make it async.
From v to Everyone: (7:43 PM)
Can total quantity be maintained separately.. so you don't have to query the database for a count each time?
From Richard Tu to Everyone: (7:43 PM)
Peak shaving and rate limiting.
From Jerry to Everyone: (7:43 PM)
It's only confirmed once payment finally succeeds or is cancelled, right?
From Alan to Everyone: (7:43 PM)
Singles' Day (Double 11) rush buying.
From renyuming to Everyone: (7:44 PM)
Payment and booking could be split into two parts; after booking you have a certain time to pay.
From Christie to Everyone: (7:44 PM)
Could we put it in the MQ first and only update available_quantity once payment succeeds?
From renyuming to Everyone: (7:44 PM)
Because payment generally can't be done at millisecond level.
From ds awsome to Everyone: (7:44 PM)
Is Redis's performance 100k per second or 1M?
From 非洲黑猴子 to Everyone: (7:44 PM)
Alibaba did its own custom development on MySQL, adding write-request queuing.
From renyuming to Everyone: (7:44 PM)
For available_quantity I think you can just use Redis + Lua; it relates only to booking, not to payment.
From v to Everyone: (7:45 PM)
This design will have a thundering herd, right?
From renyuming to Everyone: (7:45 PM)
And it only stores the quantity. Wasn't an MQ added?
1269 | From Jerry to Everyone: (7:46 PM) 1270 | pay的过程中要锁定这部分库存吧但是要加个time out 1271 | From Yufei Qian to Everyone: (7:46 PM) 1272 | thundering herd无法避免,traffic pattern就是这样,需要设计去handle 1273 | From Shihao Zhong to Everyone: (7:47 PM) 1274 | 为什么会有thundering herd,没有理解 1275 | From Laoluo to Everyone: (7:47 PM) 1276 | 掉坑里了 1277 | From YL to Everyone: (7:47 PM) 1278 | 1s就抢没了 1279 | From Christie to Everyone: (7:47 PM) 1280 | 10 sec 東西賣光了 1281 | From leo zhang to Everyone: (7:47 PM) 1282 | 抢票不刷新就没了啊 1283 | From Alan to Everyone: (7:47 PM) 1284 | request进来是不是要先做order request然后排队 1285 | From Alan to Everyone: (7:48 PM) 1286 | 如果拿到ticket,那就create order 1287 | From 非洲黑猴子 to Everyone: (7:48 PM) 1288 | 读的时候读不到最新的没关系,pay的时候不出错就好 1289 | From Yufei Qian to Everyone: (7:48 PM) 1290 | 那样用户体验会比较差 1291 | From Jerry to Everyone: (7:49 PM) 1292 | 限购的需求是不是还没加 1293 | From Yufei Qian to Everyone: (7:49 PM) 1294 | 是的,限购需求没有讨论 1295 | From Alan to Everyone: (7:49 PM) 1296 | 每个ticket system 有个cache, ticket先产生,买个ticket 系统分配一定数量ticket在 cache从cache拿ticket需要synchronous如果需求不多,应该很快,需求多就要排队,因为synchrounous 1297 | From tomdi to Everyone: (7:51 PM) 1298 | payment 和 ticket count update 做一个 transaction 事务, cache write upate只在每个transcation commit之后 1299 | From Laoluo to Everyone: (7:51 PM) 1300 | cache里保证所有的ticket不断地减少,不能不变 1301 | From Alan to Everyone: (7:51 PM) 1302 | ticket是事先产生的啊肯定不能做ticket count啊,笑死人 1303 | From Erwin to Everyone: (7:53 PM) 1304 | 如果这里ticket service的一个server挂了,有没有什么办法保证这个service对应的tickets可以被其他server利用? 1305 | From Alan to Everyone: (7:53 PM) 1306 | data race是难点 1307 | From Skit to Everyone: (7:53 PM) 1308 | ticket先create row,然后book把 1309 | From tomdi to Everyone: (7:53 PM) 1310 | cache只读,db transaction update后再 update cache, cache读data可以有滞后 1311 | From Skit to Everyone: (7:54 PM) 1312 | 这样不需要count 1313 | From v to Everyone: (7:54 PM) 1314 | 用redis的话 如果不写到disk 会有data loss的风险。。。如果写到disk的话 write throughput会很不好吧? 1315 | From Ken to Everyone: (7:54 PM) 1316 | 5 more minutes 1317 | From H.B. 
to Everyone: (7:54 PM) 1318 | 他好像没说啥时候create session? 1319 | From Alan to Everyone: (7:54 PM) 1320 | 有,每个ticket,保存分配到哪个server信息 1321 | From H.B. to Everyone: (7:54 PM) 1322 | session 不是一个小时吗 1323 | From Tekken to Everyone: (7:54 PM) 1324 | 这是道老题目了 如果事前稍微准备下 油管上能找到很多很成熟的设计方案 1325 | From Alan to Everyone: (7:54 PM) 1326 | 如果那个server crash, 他的那些没被order的ticket从新被放回去 1327 | From leo zhang to Everyone: (7:55 PM) 1328 | 可能事先不知道题目 1329 | From YL to Everyone: (7:55 PM) 1330 | 知道吧 1331 | From Laoluo to Everyone: (7:56 PM) 1332 | cache自行减少,不用更新数据库,蛤payment的service来更新ticket 的数量就行了payment量少很多,cache来读数据做同步 1333 | From H.B. to Everyone: (7:56 PM) 1334 | 为了读的块 1335 | From Kasey to Everyone: (7:57 PM) 1336 | 为啥不能直接用redis? 1337 | From 非洲黑猴子 to Everyone: (7:57 PM) 1338 | 好几张表要join 1339 | From H.B. to Everyone: (7:57 PM) 1340 | 卡住了 1341 | From Christie to Everyone: (7:57 PM) 1342 | 所以不用 ticket mysql 的表了? 1343 | From Kasey to Everyone: (7:57 PM) 1344 | 就把所有ticket 存redis里面不行么 1345 | From Laoluo to Everyone: (7:58 PM) 1346 | 关键点没有讨论,特别是怎样削峰 1347 | From Hao to Everyone: (7:58 PM) 1348 | 问个问题,应该什么时候减库存呢?是pay成功才减库存? 1349 | From Kasey to Everyone: (7:58 PM) 1350 | 肯定吧 1351 | From Laoluo to Everyone: (7:59 PM) 1352 | +1 1353 | From H.B. to Everyone: (7:59 PM) 1354 | 肯定pay 成功后 1355 | From Ender Li to Everyone: (7:59 PM) 1356 | Pay成功才减库存不会超卖吗 1357 | From wantong jiang to Everyone: (7:59 PM) 1358 | 这怎么避免超卖呢? 1359 | From leo zhang to Everyone: (7:59 PM) 1360 | 不会啊 1361 | From tomdi to Everyone: (7:59 PM) 1362 | pay成功和减库存是一个 transaction 1363 | From Kasey to Everyone: (7:59 PM) 1364 | pay成功和失败是两个情况 1365 | From H.B. to Everyone: (7:59 PM) 1366 | 你说的库存是mysql里的? 1367 | From Zhengguan Li to Everyone: (7:59 PM) 1368 | pay成功前也可以呃减啊 事先锁定嘛 1369 | From H.B. 
to Everyone: (7:59 PM) 1370 | 还是他说redis里的 1371 | From leo zhang to Everyone: (7:59 PM) 1372 | pay的时候检查库存, 放一个tracnsaction 1373 | From renyuming to Everyone: (8:00 PM) 1374 | 应该是book就lock库存,之后pay失败了就恢复,pay成功了就减掉了 1375 | From leo zhang to Everyone: (8:00 PM) 1376 | order成功没付款还有有风险不能proceeed 1377 | From Kasey to Everyone: (8:00 PM) 1378 | 嗯 1379 | From renyuming to Everyone: (8:00 PM) 1380 | transaction感觉很慢?尤其带3 party的api的? 1381 | From lily liu to Everyone: (8:01 PM) 1382 | order service的时候要减db库存并更新cache了吧,然后payment 成功的时候再调整一次 1383 | From H.B. to Everyone: (8:01 PM) 1384 | 312 1385 | From Zhengguan Li to Everyone: (8:01 PM) 1386 | 321 1387 | From Zidong to Everyone: (8:01 PM) 1388 | 321 1389 | From 非洲黑猴子 to Everyone: (8:01 PM) 1390 | 跟钱相关的不得不transaction 1391 | From Spin to Everyone: (8:01 PM) 1392 | 132 1393 | From leo zhang to Everyone: (8:01 PM) 1394 | 付款 lantency不是最紧急的需求吧? 1395 | From Yufei Qian to Everyone: (8:01 PM) 1396 | 132 1397 | From Simon Z to Everyone: (8:01 PM) 1398 | 321 1399 | From david to Everyone: (8:01 PM) 1400 | 132 1401 | From Christie to Everyone: (8:01 PM) 1402 | 312 1403 | From Jackie G to Everyone: (8:01 PM) 1404 | 132 1405 | From leo zhang to Everyone: (8:01 PM) 1406 | accuracy更重要 1407 | From Kj to Everyone: (8:01 PM) 1408 | 312 1409 | From x to Everyone: (8:01 PM) 1410 | 312 1411 | From xinz to Everyone: (8:02 PM) 1412 | 132 1413 | From 非洲黑猴子 to Everyone: (8:02 PM) 1414 | 312 1415 | From johnc to Everyone: (8:02 PM) 1416 | 312 1417 | From Xiaoqin Fu to Everyone: (8:02 PM) 1418 | 312 1419 | From Julie Long to Everyone: (8:02 PM) 1420 | 312 1421 | From anna to Everyone: (8:02 PM) 1422 | 132 1423 | From YL to Everyone: (8:02 PM) 1424 | 312 1425 | From Qiang Lu to Everyone: (8:02 PM) 1426 | 312 1427 | From HW to Everyone: (8:02 PM) 1428 | 321 1429 | From Peiwen Tian to Everyone: (8:02 PM) 1430 | 312 1431 | From Xuexin Chen to Everyone: (8:02 PM) 1432 | 312 1433 | From Richard Cao to Everyone: (8:02 PM) 1434 | 312 1435 | From TBL to Everyone: 
(8:02 PM) 1436 | 132 1437 | From christie Yu to Everyone: (8:02 PM) 1438 | 312 1439 | From lily liu to Everyone: (8:02 PM) 1440 | 132 1441 | From Ken to Everyone: (8:03 PM) 1442 | Hard skill 1443 | From Zhengguan Li to Everyone: (8:03 PM) 1444 | 312 1445 | From YL to Everyone: (8:03 PM) 1446 | 21 1447 | From christie Yu to Everyone: (8:03 PM) 1448 | 213 1449 | From H.B. to Everyone: (8:03 PM) 1450 | 21 1451 | From Jackie G to Everyone: (8:03 PM) 1452 | 213 1453 | From lining to Everyone: (8:03 PM) 1454 | 21 1455 | From Spin to Everyone: (8:03 PM) 1456 | 21 1457 | From Mark Liu to Everyone: (8:03 PM) 1458 | 213 1459 | From johnc to Everyone: (8:03 PM) 1460 | 21 1461 | From Christie to Everyone: (8:03 PM) 1462 | 21 1463 | From xinz to Everyone: (8:03 PM) 1464 | 21 1465 | From Xiaoqin Fu to Everyone: (8:03 PM) 1466 | 21 1467 | From david to Everyone: (8:03 PM) 1468 | 21 1469 | From HW to Everyone: (8:03 PM) 1470 | 21 1471 | From Shihao Zhong to Everyone: (8:04 PM) 1472 | I still don't understand what option 3 is. 1473 | From TBL to Everyone: (8:04 PM) 1474 | 21 1475 | From Charlie to Everyone: (8:04 PM) 1476 | Posting only a ranking order doesn't show how good or bad each one is; scoring each criterion would be more suitable. 1477 | From H.B. to Everyone: (8:05 PM) 1478 | Yes, I think so too: score each one from 1 to 5. 1479 | From leo zhang to Everyone: (8:07 PM) 1480 | +1. 1481 | From H.B. to Everyone: (8:08 PM) 1482 | The interviewer? 1483 | From Kasey to Everyone: (8:08 PM) 1484 | The interviewer, hahaha 1485 | From lining to Everyone: (8:08 PM) 1486 | 😀 1487 | From Zhengguan Li to Everyone: (8:09 PM) 1488 | Candidate: the interviewer turns out to be me.. haha 1489 | From Jackie G to Everyone: (8:14 PM) 1490 | A naive question: if all tickets are identical, why keep one row per ticket? Wouldn't a single row with the ticket and a count do? 
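A minimal sketch of the single-row idea raised in the question above (one inventory row holding a count rather than one row per ticket), assuming a relational store. The table name `event_inventory`, the helper names, and the use of SQLite are illustrative assumptions, not the candidate's actual schema. The conditional `UPDATE` succeeds only while enough tickets remain, which is one common way to guard against overselling:

```python
import sqlite3


def make_db(total_tickets):
    """Create an in-memory DB with a single inventory row for event 1 (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event_inventory ("
        "event_id INTEGER PRIMARY KEY, remaining INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO event_inventory VALUES (1, ?)", (total_tickets,))
    conn.commit()
    return conn


def reserve(conn, event_id, qty):
    """Take qty tickets in one statement; returns True only if stock was sufficient."""
    cur = conn.execute(
        "UPDATE event_inventory SET remaining = remaining - ? "
        "WHERE event_id = ? AND remaining >= ?",
        (qty, event_id, qty),
    )
    conn.commit()
    # rowcount is 1 when the guard matched, 0 when stock was too low.
    return cur.rowcount == 1


def remaining(conn, event_id):
    row = conn.execute(
        "SELECT remaining FROM event_inventory WHERE event_id = ?", (event_id,)
    ).fetchone()
    return row[0]
```

The check and the decrement happen in the same statement, so there is no window between reading the count and writing it back. Whether the decrement happens at booking or at payment is a separate policy choice that the chat debates.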
1491 | From x to Everyone: (8:14 PM) 1492 | Right, what the candidate actually meant is that only one row is needed. 1493 | From Peijin Sun to Everyone: (8:15 PM) 1494 | Isn't Ticket actually an event? 1495 | From Richard Tu to Everyone: (8:15 PM) 1496 | His table really is the event table; it's a little confusing. 1497 | From lining to Everyone: (8:17 PM) 1498 | Right. 1499 | From Zidong to Everyone: (8:18 PM) 1500 | Can the 5000 be refilled? 1501 | From leo zhang to Everyone: (8:19 PM) 1502 | Rate limiting is fine. 1503 | From Zidong to Everyone: (8:19 PM) 1504 | Like a bucket? 1505 | From iPad to Everyone: (8:19 PM) 1506 | P0: buy ticket, cap = 2 tickets / user, all tickets the same 1507 | From leo zhang to Everyone: (8:19 PM) 1508 | You can simply let a bit more through to the backend, but rate limiting is probably needed, since there is only so much stock. 1509 | From Lu to Everyone: (8:20 PM) 1510 | Does anyone know about NewSQL? I heard of the concept; it supposedly offers ACID and also scales horizontally. 1511 | From leo zhang to Everyone: (8:20 PM) 1512 | Letting 10M of traffic through to the backend is pointless. 1513 | From Shihao Zhong to Everyone: (8:20 PM) 1514 | Is it a good idea to use NewSQL in an interview? 1515 | From Lu to Everyone: (8:21 PM) 1516 | Never tried 😄 1517 | From Jerry to Everyone: (8:21 PM) 1518 | With rate limiting, that assumes each request can buy only one ticket. 1519 | From Brave to Everyone: (8:21 PM) 1520 | Set up a message queue to hold the requests and process them asynchronously. The async processing can merge requests (for example, tickets of the same type can be combined) to avoid hitting the database too many times. 1521 | From Jerry to Everyone: (8:21 PM) 1522 | Do multiple tickets require multiple requests? 1523 | From Richard Tu to Everyone: (8:21 PM) 1524 | Go ahead and use it; just don't say "NewSQL", name a concrete database. 1525 | From Lu to Everyone: (8:21 PM) 1526 | Google Spanner? 1527 | From leo zhang to Everyone: (8:22 PM) 1528 | Limiting 10M of traffic down to 120k is already a huge win. 1529 | From Jerry to Everyone: (8:22 PM) 1530 | Then the candidate's POST URL can't have number_items, right? 1531 | From Shihao Zhong to Everyone: (8:22 PM) 1532 | It's expensive. 1533 | From Jackie G to Everyone: (8:23 PM) 1534 | Does Redis replication support strong consistency, or only eventual consistency? 1535 | From Jerry to Everyone: (8:23 PM) 1536 | Redis has the AOF log for recovery, but in this problem Redis doesn't seem to need recovery; re-reading from the database and refreshing should be enough, right? 1537 | From Lu to Everyone: (8:25 PM) 1538 | Agreed, just cache at the API gateway… 1539 | From leo zhang to Everyone: (8:27 PM) 1540 | At 120K writes/hour, as long as the front filters the traffic, the writes behind it are not a problem. 1541 | From Zhengguan Li to Everyone: (8:30 PM) 1542 | Can Redis set a timeout? 
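The "bucket" with refill asked about above can be sketched as a token bucket. This is only an illustration of the general technique, not the limiter the candidate described; the class name, parameters, and the choice to pass the clock in explicitly (so the behavior is deterministic) are all assumptions:

```python
class TokenBucket:
    """Token-bucket rate limiter: holds at most `capacity` tokens, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full
        self.last = now                # time of the last refill calculation

    def allow(self, now, cost=1):
        """Return True and consume `cost` tokens if available at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage: a bucket of 2 tokens refilled at 1 token per second.
bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
decisions = [bucket.allow(now=0.0) for _ in range(3)]  # third request is rejected
```

Requests rejected at the front never reach the inventory store, which is the point made in the chat about cutting 10M requests down to roughly 120k.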
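1543 | From leo zhang to Everyone: (8:30 PM) 1544 | The MQ offers exactly-once semantics when it cooperates with the consumer. 1545 | From Ender Li to Everyone: (8:32 PM) 1546 | So the -1 operation on Redis has to lock Redis, right? Then only one person can update Redis at a time. Won't that become a performance bottleneck? 1547 | From Shihao Zhong to Everyone: (8:32 PM) 1548 | Redis has atomic CAS operations, so it won't become a bottleneck here. 1549 | From Sean Gao to Everyone: (8:33 PM) 1550 | Can the CAS include the send to Kafka? 1551 | From Sean Gao to Everyone: (8:33 PM) 1552 | Or can the CAS include the persistence step? 1553 | From Brave to Everyone: (8:35 PM) 1554 | Buying train tickets for Spring Festival is exactly that: waiting in a queue. 1555 | From Ender Li to Everyone: (8:36 PM) 1556 | Doesn't CAS mean that when many concurrent requests update Redis, only one succeeds and all the others must retry? Because their old values no longer match, their updates fail, right? 1557 | From Sean Gao to Everyone: (8:36 PM) 1558 | @ender Redis operates entirely in memory, so the performance loss is small. It's also not a full lock, it's CAS. 1559 | From hobite to Everyone: (8:36 PM) 1560 | After the credit card details are entered and the merchant completes authentication with the bank, update at the instant the user clicks submit. 1561 | From Mark Liu to Everyone: (8:37 PM) 1562 | That's not right, is it? Going by 12306, payment gives you about 20 minutes to complete; it only fails if payment doesn't succeed within those 20 minutes. 1563 | From hobite to Everyone: (8:38 PM) 1564 | Commit to Redis, and at the same time send the payment and clean-up tasks to the queue. This also involves what we expect of the queue: we expect a task sent to the queue to succeed with 99.9% probability unless the system crashes. 1565 | From Sean Gao to Everyone: (8:39 PM) 1566 | “Commit to Redis, and at the same time send the payment and clean-up tasks to the queue.” -- can all of this be put into one transaction? 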
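The compare-and-set retry loop that Ender Li describes can be sketched as follows. This is an in-memory stand-in, not Redis code: `VersionedCounter` and `try_reserve` are hypothetical names, and the internal lock only simulates the atomicity a real store would provide for a single operation. In Redis itself the same effect is usually obtained with an atomic command such as `DECR` or with optimistic locking via `WATCH`/`MULTI`/`EXEC`.

```python
import threading


class VersionedCounter:
    """In-memory stand-in for a store that offers an atomic compare-and-set."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # simulates the store's per-operation atomicity

    def get(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new):
        """Write `new` only if the current value still equals `expected`."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def try_reserve(counter, qty=1, max_retries=1000):
    """Optimistic decrement: read, check stock, CAS; retry when another writer won the race."""
    for _ in range(max_retries):
        current = counter.get()
        if current < qty:
            return False  # sold out, no point retrying
        if counter.compare_and_set(current, current - qty):
            return True
    return False  # gave up after too much contention
```

Each failed CAS means some other request succeeded, so losers only retry while stock remains. Note that this covers only the counter; it does not by itself make the later "send to Kafka" or persistence step atomic with the decrement, which is the question Sean Gao raises.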
1567 | From hobite to Everyone: (8:40 PM) 1568 | 另外一个solution, 是用户在query的时候锁住一个seat,submit的时候update或者release lock另外一个用户在lock的期间默认这个seat不available 1569 | From Jerry to Everyone: (8:43 PM) 1570 | redis的数据定期去db里同步更新可以吗 1571 | 1572 | ## Calendar 1573 | https://docs.google.com/document/d/1Zod6Cz0-KGJ-BDeZJoqEFrMQ4sdsSiJ-qT8Jkegwd-I/edit#Meeting notes: https://docs.google.com/document/d/1Zod6Cz0-KGJ-BDeZJoqEFrMQ4sdsSiJ-qT8Jkegwd-I/edit#For friends who just joined zoom: Meeting notes: https://docs.google.com/document/d/1Zod6Cz0-KGJ-BDeZJoqEFrMQ4sdsSiJ-qT8Jkegwd-I/edit#For friends who just joined zoom: Meeting notes: https://docs.google.com/document/d/1Zod6Cz0-KGJ-BDeZJoqEFrMQ4sdsSiJ-qT8Jkegwd-I/edit#For friends who just joined zoom: Interview will take 45 minutes: 6:15-7pm PSTMeeting notes: https://docs.google.com/document/d/1Zod6Cz0-KGJ-BDeZJoqEFrMQ4sdsSiJ-qT8Jkegwd-I/edit#已经30min了For friends who just joined zoom: Interview will take 45 minutes: 6:15-7pm PSTMeeting notes: https://docs.google.com/document/d/1Zod6Cz0-KGJ-BDeZJoqEFrMQ4sdsSiJ-qT8Jkegwd-I/edit#Are User and Calendar in one-to-one relationship?based on the requirement, no ^ThanksDoes the event attendance status need to be consistent? Eg user A updates yes, is it OK to sometimes get the incorrect status?eventual consistency should be okStupid idea, please critique:When the read requests exceeds the number that a SQL server can handle. Can we split out read request to read -pnly replica instead of using a cache?Read-only*of course yes这是mvcc的问题,用sql的话是可以保证strongconsistency, read要么读到 A update之前的status,要么读到 A update之后的status, 取决于read的time stamp是在write commit之前还是write commit之后客户端到现在还闲着呢,最近的事件可以缓存在客户端,可以有效减小服务端的读取压力这里是不是会涉及 跨shard join 的问题? 有没有什么指导原则 ?我是觉得这个东西对一致性要求没那么高,应该问题不大话说是不是到时间了,今天是45分钟来着?有道理Yes. 
Time is up now.Consistency guarantees depend on which part of calendar - updating attendance might be OK with eventual but event privacy (public vs private) likely needs strong consistency.Score比之前只打相对分的靠谱多了稍等,投票再哪?Where is the poll?弹出来的没看到where is the poll?没有有看到pop upNo pop-up for meCould you send one more time?zoom这里是不是用的 eventual consistency.......strong consistency是sql免费送的,NoSQL要考虑consistency的问题用量的大的时候 这个就得考虑k/v storage,像 aggregation event的 join costing 就大太多了,所以SQL最好就别用了,Shard的时候我觉得Shard key 也是应该根据 event time 来做indexing👍1. Requirement gathering - meets1 requirement gathering: exceed, meet, needs1m1 meet2.needs2nn2 n2n看见poll应该要升级zoom 客户端3m3m3n3m3m3 n3n看不到,被盖住了看不到看不到ok可以ok要不用一個poll master 做?https://doodle.com/poll-maker好像hard skill target L5不一定吧热数据相当于cache?冷库不是还要做shardinginterview summary google doc的链接能再分享一下嘛应该还是要做sharding冷库也可以不走cacheMeeting notes: https://docs.google.com/document/d/1Zod6Cz0-KGJ-BDeZJoqEFrMQ4sdsSiJ-qT8Jkegwd-I/edit# 1574 | From Eric Che to Everyone: (7:25 PM) 1575 | 引用kafka? 1576 | From Xinyu Zhang to Everyone: (7:26 PM) 1577 | shedule紧急meeting前几秒钟 延迟几秒 嗯。。 1578 | From Sean Gao to Everyone: (7:26 PM) 1579 | 小概率 1580 | From Richard Tu to Everyone: (7:27 PM) 1581 | ^sheculde meeting 几秒,这概率也太小了。那就完全可以用另一套workflow了 1582 | From Xinyu Zhang to Everyone: (7:28 PM) 1583 | 假如发一个invent, 邀请了全公司所有人,每个人点accept或者propose to new time都算修改么?sorry, invite* 1584 | From Cheng Jing to Everyone: (7:29 PM) 1585 | 算修改吧 1586 | From Xinyu Zhang to Everyone: (7:29 PM) 1587 | 那CEO给全公司发一个,那修改量不小啊 1588 | From Sean Gao to Everyone: (7:29 PM) 1589 | batch 处理 write request,应该还好吧 1590 | From Xinyu Zhang to Everyone: (7:30 PM) 1591 | 恩恩 1592 | From Quan to Everyone: (7:30 PM) 1593 | 为什么1million的用户,有10 million的calendar 1594 | From Cheng Jing to Everyone: (7:30 PM) 1595 | 倒是不用都在同一个时间发invite,我觉得 1596 | From Sean Gao to Everyone: (7:30 PM) 1597 | 不过我也不懂, 1个 sql 改1000行, 和 1000个 sql 每个改一行,性能差别多大? 
1598 | From Richard Tu to Everyone: (7:31 PM) 1599 | I don't think an accept counts as an update, does it? At least it doesn't update the event itself. 1600 | From Xinyu Zhang to Everyone: (7:31 PM) 1601 | By the way, I'm curious: is a calendar table really necessary? Could we just use the event table? 1602 | From Sean Gao to Everyone: (7:31 PM) 1603 | An accept does count, because you can read it back later. What gets updated should be the relation between user and event. 1604 | From Richard Tu to Everyone: (7:32 PM) 1605 | Then in the relation table the keys are all different, so it shouldn't create a hot key. 1606 | From Sean Gao to Everyone: (7:33 PM) 1607 | Right, the writes don't all target a single event. With NoSQL it might be different. Maybe use Redis? 1608 | From Zoom user to Everyone: (7:34 PM) 1609 | 1000 SQL statements written in 1 batch vs 1000 SQL transactions are the same from a consistency-guarantee perspective. But I wonder if there's perf overhead due to locks. 1610 | From Xinyu Zhang to Everyone: (7:36 PM) 1611 | SQL seems to have many advantages here, but does this data structure have to be fixed? 1612 | From Eric Che to Everyone: (7:42 PM) 1613 | Can't it be treated as a single event? 1614 | 1615 | ## TopK 1616 | https://docs.google.com/document/d/1YYrcTZ5Spbz2gauu-U8PgZLv7bsQuH69kml8cY3hO38/edit Does high availability refer to availability over time (i.e. 
24/7 available)还是multiplatform?24/7high availability是service availability吧?500ms 不算low latency了吧?这里的low latency是啥意思。。。 一个trending service关心的不都是real time的trending吗。。 不懂这个low latency在这干嘛的。。。 模版吗。。好像所有的面试回答第一条都是high availability,然后就没有下文了,就是保证server 24/7运行 + backup server in case primary server failure?感觉从头到尾都是模板这几个term好像都是模板感觉design还没开始打开YouTube之后需要加载五秒钟还是挺影响体验的是0.5s吧?开始了喊我0.5s我的意思是这道题low latency还是很有必要的Target是l4 大牛们稍安勿躁请教一下: high available和fault-tolerent是不是重复的概念?estimation这里是不是时间花太多了我觉得是保证server不断电low latency是 get topk的时候 快速返回 还是说 我的last 24 hrs trending是 real time的 还是 有1-2 小时延迟。。后者是consistency吧这种情况design一个function in existing system,我们是不是需要先问问什么已经有了low latency是 get top的时候 快速返回fault tolerance = high availability + proper failure handling谢谢同意 我觉得是不是能assume已经有了一个counting system感觉模板不好用了 连观众都不买帐了我感觉,last 24hrs trending,经常可以延迟产生的 1617 | 我觉得没人会关心这个trending是不是实时刷新500ms不是low latency了吧500ms也能算low吧。500ms对这个top k应该够了,个人意见吧I see感觉500ms有点卡对于YouTube是有点卡,哈哈100m的 dau 然后每个人的点击都会影响trending。。 如何收集 咋收集。。都是个问题 😂无所谓了,你说500ms,100ms最后design出来不都是一套系统。。24hours trending看你用batch还是streaming processing, 一般5到10分钟的delay是可以实现的其实没有必要在意这些细节,毕竟这只是设计,不是实现trending不用实时更改,观众不会那么关心试试更改Trending其实我一直没搞懂traffic estimate的意义在哪里这个时间安排45分钟不够了吧traffic estimation是grokking模版里,我个人认为性价比最低的part,聊胜于无没有意义,只要问一下每秒有多少个view就行了traffic estimation是grokking模版里,我个人认为性价比最低的part,聊胜于无 1618 | 1619 | 同意大家面试会跳过么?还行dau开始算,没必要traffic estimation是grokking模版里,我个人认为性价比最低的part,聊胜于无 1620 | 1621 | 同意‘所以面试的时候 如果考官没有提,是不是可以直接略过这个traffic estimation? 不会成为扣分点吧?有大神指导我说,traffix estimation的一个意义是: 1622 | 选择哪种数据库,是选择sql或nosql我个人,一般traffic大概估算一下,还有storage那块,主要是数据库怎么设计.... 
爲啥traffic 和 db選擇 有關係啊estimation 3-5分钟快速搞完?完全沒有關聯啊求细节,怎么个估算法?怎么选数据库?不会真的有面试官期望你设计一个not distributed system吧数据多就NoSQL?看 read 和 write的啊怎麽可能是看traffic视频基数到底多大 才是top k的基本问题吧。。 nlogk 你好歹要知道n是多少呀。。 难不成30亿的n吗。。事务多就SQL?对,是Read write怎么看read write选数据库?这里的count min是count min sketch吗簡直了嗯,事物也是誰亂講的哪种数据库不能read write呢?重点是ratio去看ddiastreaming system 都有log的,如果从existing system开始讲, 可能会容易点lolddia万能啊面试时候我也这么说哈哈,我瞎讲,多半我就记得不对我感覺top video 可以用 url在 redis 存你想想disk IO,write heavy 做sql 刺激嗎?schema on write 的時候用sql 是什麽體驗payment service: ??所以如果用NoSQL会比较好嘛不用sql用啥如果能用的话选择NoSQL的原因是什么?如果做olap,用 NoSQL 的 join 是不是很刺激因为别人都说用nosql,所以大部分人选了nosql讲道理,你distributed sql的join和nosql join有什么区别。。Count-min 中间插个storage是干什么用的。data collection phase -》 data calculation phase -〉 result read phase。。 这道题是想考的到底是哪几个?NoSQL写的快,但是无法join,事务性也比较难 1623 | 1624 | sql最容易,最好用,但不能handle那么多write“这里的count min是count min sketch吗” - count min sketch一般用来统计频率,unique topk一般用hyperloglog👍只要sql shard出现,就跟joint没啥关系了东欧大哥的topK用的count min sketch东欧大哥,哈哈哈东欧大哥是?LOLyoutuber东欧大哥是?请教Chanel名字谢谢Shouldn’t we have a data schema, then API?這個 fast processors Count-Min 我看不懂https://www.youtube.com/watch?v=kx-XDoPjoHw那是top K的频率 而不是view count,否则数学上推到不了谢谢楼上有点像伯恩 LOL我最近看了一篇google napa data warehousing的paper。主要就是根据不同时间片段来分级aggregate。感觉能用在topK的case。http://vldb.org/pvldb/vol14/p2986-sankaranarayanan.pdffast processors是什么,是个service吗?data warehouse,就不是实时了吧5 mins is a good choice我看里面ingestion是可以streaming进来的,应该可以保证个near-realtimeNapa supports database freshness of near-real time to a few hours小时级别是不是不大够用这5 mins的 设计不cover很多corner case吧Schema感觉又问题,应该存frequency吧trending这种那就1 min 更新一次?micro batch应该也行我感觉B站大概30s一次?real time那更好了这个distributed MQ是干啥的Update latest view奥,makes sense这个也需要设计吗?existing system不就有吗这在东欧大哥视频里是重点只有fast的吗?求个东欧大哥的link我去听听正确答案睡觉去了https://www.youtube.com/watch?v=kx-XDoPjoHw感谢女神我一直以为他是毛子帮我留个feedback,感谢。1. 
load estimate 时间太长了 后面也没用到 2.面试加油。這也是我的問題到底這是count 啥,爲啥是 minCount-min要花点时间解释的面试环境和自己想还是不一样的。林老师已经给了很多hint了,但面试的人太紧张就会get不到。也许换一个环境他就会说的很好,但面试的时候难说了。L4不用考system design,能做到这样,我觉得可以了只是谷歌不考而已只是google不考吧,别的公司还考呢时间真快,45min了amazon L4 SD不考Amzn L4 = Google L5Sry反了哥這個接近L5 水品了Amzn L4 = Google L3這已經很厲害了一般这个群备注的等级,都是按google来的我觉得这题应该有个MR的solution吧??我面某家公司,就给的是spark solutionMR是啥map reduce地图-降低这题跟collect metric的区别在哪里Which one is for senior level, L4 or L5 ?https://docs.google.com/document/d/1YYrcTZ5Spbz2gauu-U8PgZLv7bsQuH69kml8cY3hO38/edit#如何体现project lifecycle awareness? 加monitoring? metrics?怎么evaluate project lifecycle awareness?如何iterate project ? 系统不可能3个月就做完了是吧alarm metricsagree@Ken林老师待会儿可以讲讲project awareness 吗能给个例子么?怎么计算机器数量?怎么根据qps估算机器数量?有公式吗?dau --> qps --> 根据一台机器的qps处理能力,来估计需要几台机器1000 qps你就算一台 一般都没问题所以要背mongoDB, Cassandra的throughput?太卷了吧这个只是算的web server吧nosql 10k tps , sql 1k tps , memory 100k不需额外背吧1000 QPS is a lambda, 一台机大概50000‘1000 qps你就算一台’多大RAM?几个Core?Kafka, SQS?你真的要把时间放在 machine几个core上吗。。。我刚才没听见fast processor是run在什么东西上的直接说 10k qps 我给30台机器你觉得行吗;; 没人会纠结这个的。。 没人会卡你这个 给你ram 给你core然后让你算他的capability。。我没听明白他这个是怎么calculate top K viewed videos这个design真的是workable的吗。。cross team dependency 1625 | handling unusal spike of traffic 1626 | scale upread flow没有讲过这种情况下面试官会期望答案跟组里一致么能走一个case吗。。 就是他如何读topk的? YouTube有30亿个video 假设1个亿的video 在过去5分钟被人view过。。 nlogk的n是1个亿啊。咋store 咋sort。 1627 | From Liang Tan to Everyone: (8:24 PM) 1628 | 请教一下如果用了MR了, 这里的 MQ 还是需要的嘛, 是不是放在GFS上就行了。 1629 | From Zhao to Everyone: (8:28 PM) 1630 | Batch处理可以直接读log file,结果会比较准确。实时的request话不需要每一个都处理,可以做一下sampling,比如从1亿个request降到1M,然后接一个queue来处理 1631 | From Weizhe Liang to Everyone: (8:29 PM) 1632 | 也有道理多一個queue 去reduce queue 的話會好做點 1633 | From bambloo to Everyone: (8:33 PM) 1634 | LRU 1635 | From Zhao to Everyone: (8:34 PM) 1636 | redes 1637 | From bambloo to Everyone: (8:34 PM) 1638 | 可不可以用LFU? 1639 | From ningdi to Everyone: (8:36 PM) 1640 | 终于有人问这个了。。 1641 | From Spin to Everyone: (8:36 PM) 1642 | 这样没法防刷单吧? 
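Several people above ask what count-min actually counts and why it takes a minimum. A minimal sketch of a count-min sketch follows, assuming video IDs are strings. The class, the `top_k` helper, and the width and depth values are illustrative choices, not taken from the interview or the linked video. Each key is hashed into one counter per row; collisions can only inflate a counter, so the minimum across rows is the least-inflated estimate and never undercounts.

```python
import hashlib
import heapq


class CountMinSketch:
    """Approximate frequency counter using depth x width integer counters."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # A different salt per row gives each row an independent-looking hash.
        digest = hashlib.blake2b(
            key.encode("utf-8"), digest_size=8, salt=row.to_bytes(2, "big")
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def estimate(self, key):
        # Collisions only add to counters, so the smallest one is the best estimate.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


def top_k(sketch, candidates, k):
    """Rank a candidate set of keys by estimated count and keep the k largest."""
    return heapq.nlargest(k, candidates, key=sketch.estimate)


# Usage: feed a stream of view events, then rank the videos that were seen.
views = ["v1", "v2", "v1", "v3", "v1", "v2"]
cms = CountMinSketch()
for video_id in views:
    cms.add(video_id)
hottest = top_k(cms, set(views), k=2)
```

The sketch bounds memory regardless of how many distinct videos appear, at the cost of overestimates. A real pipeline still needs some way to track the candidate set, for example a small heap maintained alongside the sketch, and this does nothing about fake or repeated views.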
1643 | From ningdi to Everyone: (8:36 PM) 1644 | Sampling really is one solution... it greatly reduces the number of unique IDs that have only a single view. 1645 | From james to Everyone: (8:41 PM) 1646 | MongoDB seems a good choice 1647 | From Zhao to Everyone: (8:42 PM) 1648 | I think the DB can store data in tiers: for example, keep 365 days of daily data, 24*30 entries of hourly data, and one week of 5-minute data, so whatever granularity you need can be served. 1649 | From admin to Everyone: (8:42 PM) 1650 | Offline + real-time computation: Hive + Flink 1651 | From james to Everyone: (8:42 PM) 1652 | Each video has its own document 1653 | From jao to Everyone: (8:42 PM) 1654 | How often must the leaderboard refresh? Every five minutes? 1655 | From Spin to Everyone: (8:43 PM) 1656 | How do you make sure a unique user's view is counted only once? 1657 | From ningdi to Everyone: (8:44 PM) 1658 | To count a unique user only once, deduplicating on the client side is fairly simple. It will be inaccurate, but I think it should work in most cases. 1659 | From admin to Everyone: (8:46 PM) 1660 | Click events can be sent to Kafka asynchronously and then stored in the data warehouse. 1661 | From Zhao to Everyone: (8:46 PM) 1662 | I agree with tomdi that system design has no single correct answer, so there's no need to argue. In Scott Shi's mocks it seems to be enough as long as your reasoning holds together. 1663 | From Lixuan Zhu to Everyone: (8:54 PM) 1664 | https://www.youtube.com/watch?v=kx-XDoPjoHw 1665 | From Ender to Everyone: (9:00 PM) 1666 | A question: for top-K, are there any points or follow-ups that the Russian guy's video doesn't cover? 1667 | 1668 | ## Youtube 1669 | 1670 | From Ken to Everyone: (6:16 PM) 1671 | Starting around: 6:15. Approximate completion time: 7:00. Welcome to our event. I am taking notes in https://docs.google.com/document/d/1XLpTHQxYaZBDJzRTyVgfHFfxBHD9G79hNFDvNlRwty4/edit# 1672 | From Ken to Everyone: (6:20 PM) 1673 | If you have not joined the Ming Dao WeChat group, you can join using the QR code at the top of the document. Approximate start time: 6:15. End time: 7:00 1674 | From Ken to Everyone: (6:24 PM) 1675 | Welcome to our event. I am taking notes in https://docs.google.com/document/d/1XLpTHQxYaZBDJzRTyVgfHFfxBHD9G79hNFDvNlRwty4/edit# If you have not joined the Ming Dao WeChat group, you can join using the QR code at the top of the document. Approximate start time: 6:15. End time: 7:00 1676 | From Jackie G to Everyone: (6:27 PM) 1677 | Does “width” mean “throughput”? 
1678 | From Bam to Everyone: (6:27 PM) 1679 | Bandwidth I guess 1680 | From Jackie G to Everyone: (6:27 PM) 1681 | thanks 1682 | From A to Everyone: (6:28 PM) 1683 | Thanks to the presenter for doing the arithmetic. I came late and the design hasn't even started. 1684 | From ningdi to Everyone: (6:28 PM) 1685 | Hahah :) 1686 | From Yue's iPad to Everyone: (6:30 PM) 1687 | Where does the 7G per second for video upload come from? 1688 | From Ken to Everyone: (6:30 PM) 1689 | Welcome to our event. I am taking notes in https://docs.google.com/document/d/1XLpTHQxYaZBDJzRTyVgfHFfxBHD9G79hNFDvNlRwty4/edit# If you have not joined the Ming Dao WeChat group, you can join using the QR code at the top of the document. Approximate start time: 6:15. End time: 7:00 1690 | From Bot to Everyone: (6:30 PM) 1691 | That 1:200 in the estimate reminds me of the one on educative.io. 1692 | From Jackie G to Everyone: (6:31 PM) 1693 | Why does uploadVideo accept a videoId? I thought the video ID is generated by the system when the video is uploaded. Does he mean “video title” instead? Thanks 1694 | From ningdi to Everyone: (6:31 PM) 1695 | There is no download requirement at all, but after memorizing the template too many times he went straight to a ratio.. 1696 | From Charlie to Everyone: (6:31 PM) 1697 | What is the 683T/day storage figure based on? 1698 | From 非洲黑猴子 to Everyone: (6:32 PM) 1699 | Passing an offset may not work well. Once an offset is passed to the server, doesn't that mean file uploads and downloads have to go through the server? The NIC of its gateway or LB could become the bottleneck. 1700 | From Charles to Everyone: (6:33 PM) 1701 | Upload rate 1702 | From A to Everyone: (6:33 PM) 1703 | Is he reciting the educative solution? 1704 | From Spin to Everyone: (6:33 PM) 1705 | Does this refer to the client's last viewed position? 1706 | From A to Everyone: (6:33 PM) 1707 | I'll go check the answer. 1708 | From ningdi to Everyone: (6:33 PM) 1709 | A 3-hour video is probably cut into many small chunks on the backend, and the offset lets you quickly locate which chunk to load. 1710 | From ningdi to Everyone: (6:33 PM) 1711 | That's how I think it works.. 1712 | From A to Everyone: (6:35 PM) 1713 | What "av"? 1714 | From Sean Gao to Everyone: (6:35 PM) 1715 | @猴子哥 I feel a GFS-like system could provide the offset, right? 
Or the server first resolves the offset, via the metadata, into a GFS chunk address and then streams from GFS. 1716 | From Bot to Everyone: (6:35 PM) 1717 | avi 1718 | From Sean Gao to Everyone: (6:36 PM) 1719 | AWS provides an API here to fetch part of a file: https://stackoverflow.com/questions/36436057/s3-how-to-do-a-partial-read-seek-without-downloading-the-complete-file 1720 | From Ken to Everyone: (6:36 PM) 1721 | Welcome to our event. I am taking notes in https://docs.google.com/document/d/1XLpTHQxYaZBDJzRTyVgfHFfxBHD9G79hNFDvNlRwty4/edit# If you have not joined the Ming Dao WeChat group, you can join using the QR code at the top of the document. Approximate start time: 6:15. End time: 7:00 1722 | From Robin to Everyone: (6:37 PM) 1723 | Yes, I think (videoID + offset) maps to a short segment of video. The offset follows some preset granularity; for example, the backend might only support chunking by the minute. 1724 | From Yi to Everyone: (6:37 PM) 1725 | No offset is needed: the client splits the file into small chunks directly and uploads these data blobs, the server returns a hash ID for each blob, and then the client concatenates the IDs and commits them to the metadata service. 1726 | From Erwin to Everyone: (6:38 PM) 1727 | Doesn't the client upload to S3 using a presigned URL? 1728 | From ningdi to Everyone: (6:38 PM) 1729 | This is only saying that playing a video needs an offset. 1730 | From Sean Gao to Everyone: (6:38 PM) 1731 | Then when getVideo has to start reading from a chunk in the middle, how does the server know which blob to start sending from? 1732 | From Yi to Everyone: (6:38 PM) 1733 | I misread, I thought it was about upload.. 1734 | From Sean Gao to Everyone: (6:38 PM) 1735 | ack 1736 | From ningdi to Everyone: (6:40 PM) 1737 | What is this encode service for, may I ask... S3 is already in use... doesn't S3 return the video address directly.. 1738 | From Robin to Everyone: (6:41 PM) 1739 | Maybe to support multiple resolutions? 1740 | From Sean Gao to Everyone: (6:41 PM) 1741 | post processing 1742 | From Erwin to Everyone: (6:41 PM) 1743 | It's probably encoding into different resolutions. 1744 | From Shihao Zhong to Everyone: (6:41 PM) 1745 | It's probably converting the video into different formats or resolutions to suit different devices. 1746 | From ningdi to Everyone: (6:41 PM) 1747 | Ah, that could well be it.. 1748 | From Zhengguan Li to Everyone: (6:41 PM) 1749 | Various playback formats and resolutions, I suppose: a mobile format, a desktop format. 1750 | From Robin to Everyone: (6:41 PM) 1751 | But it's true the requirements didn't mention multiple resolutions. 1752 | From 非洲黑猴子 to Everyone: (6:41 PM) 1753 | S3 can turn a file into a message and send it to the MQ? Is S3 that obliging? 
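The chunking ideas discussed above (client-side chunks identified by hash, and an offset that maps to a specific chunk) can be sketched like this. It assumes fixed-size chunks and byte offsets; the function names, the use of SHA-256 as the chunk ID, and the tiny chunk size are illustrative assumptions rather than anything the candidate specified.

```python
import hashlib


def split_into_chunks(data, chunk_size):
    """Cut a byte string into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def build_manifest(data, chunk_size):
    """Ordered list of content hashes: what a client might commit to a metadata service."""
    return [hashlib.sha256(chunk).hexdigest()
            for chunk in split_into_chunks(data, chunk_size)]


def locate(offset, chunk_size):
    """Map a byte offset to (chunk index, offset inside that chunk)."""
    return divmod(offset, chunk_size)


# Usage with a toy 10-byte "video" and 4-byte chunks.
video = b"abcdefghij"
manifest = build_manifest(video, chunk_size=4)
chunk_index, inner_offset = locate(9, chunk_size=4)
```

With such a manifest, the server answers a seek by looking up `manifest[chunk_index]` and streaming from that blob onward, which addresses Sean Gao's question of how the server knows where to start. Time-based offsets (for example per-minute segments, as Robin suggests) would need a time-to-chunk index instead of plain division.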
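1754 | From Shihao Zhong to Everyone: (6:42 PM) 1755 | He did mention various different devices. 1756 | From ningdi to Everyone: (6:42 PM) 1757 | This is a classic case of coming to the exam already knowing the answer, and then the question and the answer don't quite match 😂 1758 | From Shihao Zhong to Everyone: (6:42 PM) 1759 | What he described just now was a dual write, not S3 sending the message. The MQ could also be SQS, so S3 events can be observed as well. 1760 | From 非洲黑猴子 to Everyone: (6:43 PM) 1761 | Might as well just use Qiniu Cloud from China; it comes with all kinds of video transcoding and image scaling built in. 1762 | From Shihao Zhong to Everyone: (6:43 PM) 1763 | S3 events, such as get and put. 1764 | From Erwin to Everyone: (6:43 PM) 1765 | S3 itself can generate events to SQS/SNS. 1766 | From Sean Gao to Everyone: (6:43 PM) 1767 | change capture 1768 | From Erwin to Everyone: (6:43 PM) 1769 | https://docs.aws.amazon.com/AmazonS3/latest/userguide/notification-how-to-event-types-and-destinations.html#supported-notification-destinations 1770 | From ningdi to Everyone: (6:43 PM) 1771 | Can you really use this much of the AWS suite in a system design interview? Everything is basically made foolproof... S3 for storage, SQS for notifications... there's hardly anything left to design.. 1772 | From Shihao Zhong to Everyone: (6:44 PM) 1773 | I don't know, I'm curious about that too. But using infra hosted by another team is about the same, isn't it? It just becomes HDFS and Kafka? 1774 | From Laoluo to Everyone: (6:44 PM) 1775 | Not recommended, unless you're interviewing at AWS and catering to the interviewer. But you need to be able to explain the reasoning behind it. 1776 | From Kasey to Everyone: (6:44 PM) 1777 | Usually you finish the design first, and then you can go into detail on specific implementations. 1778 | From Richard Tu to Everyone: (6:44 PM) 1779 | Of course you can use them; the interviewer will also ask whether you understand the internals. 1780 | From tomdi to Everyone: (6:44 PM) 1781 | S3 is just storage; the MQ message should be triggered by the upload service. 1782 | From Kasey to Everyone: (6:44 PM) 1783 | Otherwise don't bring them up. 1784 | From 非洲黑猴子 to Everyone: (6:44 PM) 1785 | Haha, personally I feel the level in China is higher. After all, the concurrency there isn't on the same order of magnitude as in North America, and with too many people chasing too few positions the competition is fierce. 1786 | From ningdi to Everyone: (6:44 PM) 1787 | The main thing is that I've never used the AWS suite.. I'm considering whether I should catch up on it.. 1788 | From Sean Gao to Everyone: (6:44 PM) 1789 | How is consistency between the metadata svc and S3 guaranteed here? 1790 | From Richard Tu to Everyone: (6:44 PM) 1791 | Otherwise, just knowing the buzzword isn't good. 1792 | From Ken to Everyone: (6:45 PM) 1793 | Welcome to our event. I am taking notes in https://docs.google.com/document/d/1XLpTHQxYaZBDJzRTyVgfHFfxBHD9G79hNFDvNlRwty4/edit# If you have not joined the Ming Dao WeChat group, you can join using the QR code at the top of the document. Approximate start time: 6:15. 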
End time: 7:00. About 15 minutes to go. 1794 | From Sean Gao to Everyone: (6:45 PM) 1795 | How is consistency between the metadata svc and S3 guaranteed here? Should we retry if S3 fails? 1796 | From 蔡海林 to Everyone: (6:46 PM) 1797 | The meta service stores the video's metadata; its link to the original in S3 is established through the upload URL returned to the client. 1798 | From ningdi to Everyone: (6:46 PM) 1799 | The only thing I've used is S3. If an S3 upload fails, it tells you the upload failed.. 1800 | From Kasey to Everyone: (6:46 PM) 1801 | Using HDFS instead of S3 would be the same, right? 1802 | From 蔡海林 to Everyone: (6:46 PM) 1803 | The upload path definitely has a retry mechanism. 1804 | From Kasey to Everyone: (6:46 PM) 1805 | HDFS is also a storage service; is there any difference? 1806 | From 蔡海林 to Everyone: (6:46 PM) 1807 | Also, uploads basically have to be chunked; otherwise it's hard to resume after a partial upload failure. 1808 | From Kasey to Everyone: (6:47 PM) 1809 | He hasn't mentioned the CDN, which is very important here... 1810 | From 蔡海林 to Everyone: (6:47 PM) 1811 | It's still early; there's no LB yet either. 1812 | From Sean Gao to Everyone: (6:47 PM) 1813 | @蔡 Thanks. The meta DB should also hold an upload status, and if the upload fails completely, prompt the user to retry manually. 1814 | From 蔡海林 to Everyone: (6:48 PM) 1815 | Yeah, 1816 | From Sean Gao to Everyone: (6:48 PM) 1817 | That's for after the YouTube page is closed; otherwise there's no way to retry. 1818 | From ningdi to Everyone: (6:49 PM) 1819 | Question: if you already use S3, do you still need a CDN? 1820 | From Laoluo to Everyone: (6:49 PM) 1821 | Yes. 1822 | From Sean Gao to Everyone: (6:49 PM) 1823 | You do. 1824 | From 蔡海林 to Everyone: (6:49 PM) 1825 | The upload status can actually also be kept in the client's localStorage, and the server is simply notified once the upload completes. 1826 | From Kasey to Everyone: (6:49 PM) 1827 | Yes, you need it. 1828 | From 蔡海林 to Everyone: (6:49 PM) 1829 | S3 isn't fast enough; there must be a CDN in front of it. 1830 | From Ryan to Everyone: (6:49 PM) 1831 | S3 buckets are regional. 1832 | From Kasey to Everyone: (6:49 PM) 1833 | And the CDN can be multi-tiered. 1834 | From Sean Gao to Everyone: (6:49 PM) 1835 | The problem with localStorage is that it can't sync across multiple devices. 1836 | From ningdi to Everyone: (6:49 PM) 1837 | So is the CDN handled on the client side here, or by S3? 1838 | From Richard Tu to Everyone: (6:49 PM) 1839 | CloudFront + S3 1840 | From Kai Z. to Everyone: (6:50 PM) 1841 | Do we need to save on storage? 1842 | From Ryan to Everyone: (6:50 PM) 1843 | +1 cloudfront 1844 | From Yumin Gui to Everyone: (6:50 PM) 1845 | Are we really not considering cost? With AWS S3 you could end up burning a billion dollars a day. With 100M active users already, you still wouldn't build your own storage service? 
1846 | From Sean Gao to Everyone: (6:50 PM) 1847 | Storage does need to be economized, I think. 1848 | From Yi to Everyone: (6:50 PM) 1849 | Building your own isn't necessarily cheaper than S3. 1850 | From Ryan to Everyone: (6:51 PM) 1851 | s3 glacier 1852 | From ningdi to Everyone: (6:51 PM) 1853 | Doesn't S3 have something called Glacier? 1854 | From Laoluo to Everyone: (6:51 PM) 1855 | Glacier is archive. 1856 | From Kasey to Everyone: (6:51 PM) 1857 | Glacier is for archiving. 1858 | From ningdi to Everyone: (6:51 PM) 1859 | For videos that haven't been read in a long time, keep only the low-resolution version in S3 and put the rest into Glacier... that counts as one way to save money... 1860 | From Kasey to Everyone: (6:52 PM) 1861 | You can configure a lifecycle policy. 1862 | From ningdi to Everyone: (6:52 PM) 1863 | So the AWS suite really does design everything... 😂 1864 | From Sean Gao to Everyone: (6:52 PM) 1865 | Does Glacier mean compressed storage? 1866 | From Shihao Zhong to Everyone: (6:52 PM) 1867 | It's not that bad. At 700T/day, with plain S3 now at 0.02 per GB per month, storing 5 years of data comes to roughly 25 million per month. 1868 | From ningdi to Everyone: (6:52 PM) 1869 | Supposedly its response time is slow. 1870 | From Richard Tu to Everyone: (6:52 PM) 1871 | Glacier uses HDDs. 1872 | From ningdi to Everyone: (6:53 PM) 1873 | Is there a cache for video playback? 1874 | From Sean Gao to Everyone: (6:53 PM) 1875 | cdn 1876 | From ningdi to Everyone: (6:53 PM) 1877 | A cache for this kind of file... 1878 | From Kasey to Everyone: (6:53 PM) 1879 | Isn't CloudFront exactly a CDN? 1880 | From Shihao Zhong to Everyone: (6:53 PM) 1881 | If you put 80% into archive, it's about 6 million per month, which doesn't seem especially high. 1882 | From ningdi to Everyone: (6:54 PM) 1883 | Looks like I need to read up properly on CDNs.. 1884 | From A to Everyone: (6:54 PM) 1885 | S3 Glacier is cold storage; storing videos there makes no sense. 1886 | From Zhengguan Li to Everyone: (6:54 PM) 1887 | 6 million isn't high? 1888 | From Sean Gao to Everyone: (6:55 PM) 1889 | Right, not high for YouTube. 1890 | From Bam to Everyone: (6:55 PM) 1891 | Only five minutes left, surprisingly. 1892 | From Ender Li to Everyone: (6:56 PM) 1893 | Hasn't search been designed yet? 1894 | From Richard Tu to Everyone: (6:56 PM) 1895 | Glacier's get SLA is measured in hours. If video files are fetched from it, the users will probably all be gone. 1896 | From Ryan to Everyone: (6:56 PM) 1897 | TikTok seems to use self-built storage. 1898 | From Kasey to Everyone: (6:57 PM) 1899 | Can a lifecycle policy be set for non-popular videos? 1900 | From Mossaka to Everyone: (6:57 PM) 1901 | Does a CDN automatically provide DASH service? 
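Shihao Zhong's back-of-the-envelope storage cost above can be reproduced with a few lines of arithmetic. The 700 TB/day, 5-year retention, 0.02 per GB per month, and 80% archive share come from the chat; the archive price of 0.001 per GB per month is an assumption picked only to show how a figure near 6 million could arise, and the function name is hypothetical. Currency is taken to be US dollars, and 1 TB is treated as 1000 GB.

```python
def monthly_storage_cost(daily_tb, years, hot_price, cold_price=0.0, cold_fraction=0.0):
    """Monthly bill once `years` of uploads have accumulated.

    Prices are per GB per month. `cold_fraction` of the data sits in the archive tier.
    """
    total_gb = daily_tb * 1000 * 365 * years
    hot_gb = total_gb * (1 - cold_fraction)
    cold_gb = total_gb * cold_fraction
    return hot_gb * hot_price + cold_gb * cold_price


# Everything in standard storage: about 25.5 million per month.
all_hot = monthly_storage_cost(daily_tb=700, years=5, hot_price=0.02)

# 80% moved to an archive tier at an assumed 0.001 per GB per month: about 6.1 million.
tiered = monthly_storage_cost(
    daily_tb=700, years=5, hot_price=0.02, cold_price=0.001, cold_fraction=0.8
)
```

This ignores request, retrieval, and egress charges, which for a video service are likely to matter at least as much as storage at rest.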
1902 | From Ender Li to Everyone: (6:57 PM) 1903 | A question: do all videos go on the CDN, or only the hot/popular ones? 1904 | From 蔡海林 to Everyone: (6:58 PM) 1905 | You can't put everything in CDN storage; CDN storage is expensive too :) 1906 | From Ken to Everyone: (6:58 PM) 1907 | 2 minutes to go. Start time: 6:15, end time ~7:00pm 1908 | From Ryan to Everyone: (6:58 PM) 1909 | Why cache comments... 1910 | From Kasey to Everyone: (6:58 PM) 1911 | It depends on the users' latency requirements. 1912 | From Ryan to Everyone: (6:59 PM) 1913 | Loading the video surely has higher latency. 1914 | From Kasey to Everyone: (6:59 PM) 1915 | For something like YouTube I think quite a lot has to be on the CDN. 1916 | From 蔡海林 to Everyone: (6:59 PM) 1917 | If comments need to be broadcast to all users watching the same video, that's a different design. 1918 | From Shihao Zhong to Everyone: (6:59 PM) 1919 | What database is good for storing very complex comments, especially deeply nested ones? 1920 | From Zhengguan Li to Everyone: (6:59 PM) 1921 | “Glacier's archive retrieval latency (archives are available after 3-5 hours)” means it takes 3-5 hours to find something 1922 | From Erwin to Everyone: (6:59 PM) 1923 | S3 Glacier Instant Retrieval has get latency at the millisecond level too. 1924 | From Zhengguan Li to Everyone: (6:59 PM) 1925 | ? 1926 | From Zhao to Everyone: (6:59 PM) 1927 | How do you decide which video is stored on which CDN? 1928 | From Ken to Everyone: (7:00 PM) 1929 | Time is up 1930 | From Sean Gao to Everyone: (7:00 PM) 1931 | YouTube comments probably aren't broadcast. 1932 | From 蔡海林 to Everyone: (7:00 PM) 1933 | There are several approaches: 1) track how hot each video's playback is; 2) go by estimates made in advance, e.g. when a very popular TV series releases a new season, it definitely has to be pushed to the CDN. 1934 | From Sean Gao to Everyone: (7:01 PM) 1935 | Reddit comments seem to be stored directly in MySQL? 1936 | From Ryan to Everyone: (7:01 PM) 1937 | Broadcast is a completely different topic. 1938 | From 非洲黑猴子 to Everyone: (7:01 PM) 1939 | Redis has RDB and AOF for persisting to disk. 1940 | From Shihao Zhong to Everyone: (7:01 PM) 1941 | Then if comments nest many levels deep, wouldn't the MySQL query just fall over? Or does their product simply not allow deeply nested comments? 1942 | From 蔡海林 to Everyone: (7:02 PM) 1943 | 3) You can also push videos to the CDN of the corresponding region based on the viewing habits of users in different regions. 1944 | From 非洲黑猴子 to Everyone: (7:02 PM) 1945 | Tell the interviewer that Redis can serve as a cache, a database, and an MQ. 1946 | From First Last to Everyone: (7:02 PM) 1947 | time is over. 
1948 | From 非洲黑猴子 to Everyone: (7:03 PM) 1949 | Master-replica replication. 1950 | From Jerry to Everyone: (7:03 PM) 1951 | Hasn't the getVideo service been designed yet? 1952 | From 蔡海林 to Everyone: (7:03 PM) 1953 | Right. 1954 | From Bam to Everyone: (7:03 PM) 1955 | It has: the CDN. 1956 | From 蔡海林 to Everyone: (7:03 PM) 1957 | Quite a few things were missed. 1958 | From Jerry to Everyone: (7:04 PM) 1959 | How is playback progress recorded? 1960 | From Yi to Everyone: (7:04 PM) 1961 | Mainly, the interviewer didn't dig into that either. 1962 | From Bam to Everyone: (7:04 PM) 1963 | That wasn't mentioned. 1964 | From Charlie to Everyone: (7:04 PM) 1965 | How far does a system design have to go to count as a pass? 1966 | From Sean Gao to Everyone: (7:04 PM) 1967 | I feel it should be a pass? 1968 | From 蔡海林 to Everyone: (7:04 PM) 1969 | At the very least it should hold together. No, I don't think it passed. 1970 | From kevin to Everyone: (7:05 PM) 1971 | It would be best to discuss the bar here. 1972 | From 蔡海林 to Everyone: (7:05 PM) 1973 | It is L5, after all. 1974 | From Sean Gao to Everyone: (7:05 PM) 1975 | Where does it fall short? 1976 | From Kasey to Everyone: (7:05 PM) 1977 | For L5 it's a bit of a stretch. 1978 | From kevin to Everyone: (7:05 PM) 1979 | I find this bar very vague. 1980 | From Spin to Everyone: (7:05 PM) 1981 | Close enough, I'd say. 1982 | From kevin to Everyone: (7:05 PM) 1983 | Could someone senior give a hint on whether this passes the bar? 1984 | From Kasey to Everyone: (7:05 PM) 1985 | But first of all, YouTube is a hard system design question. 1986 | From J to Everyone: (7:06 PM) 1987 | Then which questions are easy and which are hard? Asking for advice. 1988 | From Shihao Zhong to Everyone: (7:06 PM) 1989 | Could you give examples of easy and hard system design questions? 1990 | From Bam to Everyone: (7:06 PM) 1991 | tinyurl 1992 | From Kasey to Everyone: (7:06 PM) 1993 | tinyurl 1994 | From Ping Lu to Everyone: (7:09 PM) 1995 | May I ask, what is this diagramming software? 1996 | From Kai Z. 
to Everyone: (7:12 PM) 1997 | decision呢 1998 | From Shihao to Everyone: (7:13 PM) 1999 | 这个选什么云的服务怎么回答啊 S3 啊 azure blob不都差不多 2000 | From J to Everyone: (7:14 PM) 2001 | L5 这题如果答得好应该是怎么用的样* 2002 | From Liang Tan to Everyone: (7:14 PM) 2003 | 请问db design 应该放在是跟service上边画图边做,还是放到service前或者后比较好。是不是用一个upload service就好了猴子哥说的网卡是什么呢?实际的production上没有从orginal取的,全都是从cdn取。cdn费用肯定比 server便宜。如果是冷门视频呢 ?如果upload或者download的数据经过自己的service,则会打满后者的网卡的风险牛upload和download数据和信令都是分离的。冷门数据也要用cdnCDN上啥都有,那为啥还要S3呢Xing Wang 大佬🐂🍺s3存orignal👍牛面试官期待面试者自己deep dive topic吗,我觉得deep dive是需要面试官去问出来的吧Jane Liu 您的建议是先口述一个user journey,再问面试官面试关注的feature是吗streaming基本上都是从cdn从streaming的,建议看看dash和hls streaming arcS3 good for video streaming: https://stackoverflow.com/questions/3505612/amazon-s3-hosting-streaming-videoYou can send 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in an Amazon S3 bucket. There are no limits to the number of prefixes that you can have in your bucket.https://aws.amazon.com/premiumsupport/knowledge-center/s3-request-limit-avoid-throttling/你们都过于依赖aws了,事实上所有的video chunks都是依赖于cdn的。netflix的核心竞争力是他的自建cdn,而不是依赖在aws上。冷门视频如何处理呢 ?所有的streaming相关的公司,cdn都是他们成本考虑的最重要因素。CDN 挡了90%的流量aws只handle信令和meta data,video chunk从来都不经过micro seevicestaotao 说 “所有chunks”从ops角度讲,cdn得挡99%的流量 2004 | From Sean Gao to Everyone: (7:42 PM) 2005 | true 2006 | From Ning to Everyone: (7:42 PM) 2007 | 记得看过Netflix 说是90% 2008 | From Joselyn phone to Everyone: (7:42 PM) 2009 | 如果冷门的视频,是不是也是从cdn读比较好 2010 | From Sean Gao to Everyone: (7:43 PM) 2011 | 冷门视频可能不在cdn里面 2012 | From Jia to Everyone: (7:43 PM) 2013 | 有大神能recap下upload,download该如何scale吗?sorry刚没听清。download用自建的cdn cache或aws cloudfront类似的service,upload用queue? 2014 | From Joselyn phone to Everyone: (7:43 PM) 2015 | 那冷门视频从哪里读,就是直接分布文件系统吗? 
2016 | From Bam to Everyone: (7:44 PM) 2017 | There are definitely videos that aren't on the CDN, such as freshly uploaded videos or videos nobody has watched in 10 years. 2018 | From nz to Everyone: (7:44 PM) 2019 | you drive 2020 | From kabuka to Everyone: (7:45 PM) 2021 | I interviewed for SD at one company, and one piece of feedback was that the candidate should proactively drive the interview. 2022 | From Taotao to Everyone: (7:48 PM) 2023 | Netflix's video storage is all self-built, 2024 | From Richard Tu to Everyone: (7:48 PM) 2025 | You'll certainly run into it; it's a matter of probability. 2026 | From nz to Everyone: (7:49 PM) 2027 | communication skill 2028 | From Sean Gao to Everyone: (7:49 PM) 2029 | 👍 2030 | From Zhao to Everyone: (7:51 PM) 2031 | I think you can treat a design interview as the process of reviewing a design with the architect at your own company. Usually start with the high level, then deep dive, frequently asking for feedback, questions, etc. along the way. Then cover the positive path, the negative path, and how to scale up. 2032 | From nz to Everyone: (7:53 PM) 2033 | No right or wrong answer. You should present a solution and alternative solutions. What are the tradeoffs? 2034 | From Sean Gao to Everyone: (7:53 PM) 2035 | Decoupling, async, peak shaving, valley filling. 2036 | From Shihao Zhong to Everyone: (7:54 PM) 2037 | How do you say 削峰填谷 (peak shaving and valley filling) in English? 2038 | From Sean Gao to Everyone: (7:54 PM) 2039 | shift loading 2040 | From kabuka to Everyone: (7:54 PM) 2041 | How can a video upload be async? 
For example, uploading a 1 GB video at 1 MB/s of bandwidth takes at least 1024 seconds no matter what.
From 非洲黑猴子 to Everyone: (7:54 PM)
Decoupling and async; peak shaving and valley filling.
From Shihao Zhong to Everyone: (7:54 PM)
That works.
From nz to Everyone: (7:54 PM)
buffer
From Ning to Everyone: (7:55 PM)
There should be a dedicated protocol for this.
From Zhao to Everyone: (7:56 PM)
I recommend the tech-focused WeChat official accounts. They have many good articles, especially for learning the common high-concurrency approaches used in China and case studies from big companies.
From Bam to Everyone: (7:57 PM)
Please recommend some accounts.
From Laoluo to Everyone: (7:57 PM)
You could post them here for everyone's reference. Once everyone improves, the level of discussion will gradually rise.
From Zhao to Everyone: (7:57 PM)
I often read 51CTO技术栈.
From Sean Gao to Everyone: (7:57 PM)
You can find plenty by googling.
From Taotao to Everyone: (7:57 PM)
It should be called transcoding.
From 石登辉 to Everyone: (8:02 PM)
Applying HTTP gzip to video is pointless.
From 石登辉 to Everyone: (8:02 PM)
It is usually for text files.
From Zhao to Everyone: (8:06 PM)
A question for everyone: can some of the detailed discussion be deferred? After the HLD, briefly cover failure handling and scale-up first, then decide what to dig into (including schema design, etc.) based on the interviewer's reaction. Has anyone used this strategy?
From Taotao to Everyone: (8:06 PM)
What's being discussed now isn't very meaningful; better to read the DASH and HLS specs. The current design looks like amateur science. There are dedicated protocols for this.
From First Last to Everyone: (8:07 PM)
Link please.
From Sean Gao to Everyone: (8:07 PM)
The point is that interviews don't test DASH.
From Zhao to Everyone: (8:07 PM)
😅
From Taotao to Everyone: (8:13 PM)
Bookmark sync is also a key area that gets tested.
From xing wang to Everyone: (8:14 PM)
Thanks for sharing!
From Sean Gao to Everyone: (8:14 PM)
Thanks, everyone.
From John to Everyone: (8:14 PM)
Thanks!
From Laoluo to Everyone: (8:14 PM)
Thanks!
From Yvonne to Everyone: (8:14 PM)
Thanks
From 非洲黑猴子 to Everyone: (8:14 PM)
Thanks

## Donation
From Ken to Everyone: (7:12 PM)
7:12 Meeting notes: https://docs.google.com/document/d/19wMqh91tdvcTw9UqeWljFZSxbVVl2KRw_FX2ZC13974/edit
From ningdi to Everyone: (7:14 PM)
Oh, so it's donating money by rounding up the amount 😂 I thought it was donating food.
From Ken to Everyone: (7:18 PM)
Interview 7:12->7:57.
Meeting notes: https://docs.google.com/document/d/19wMqh91tdvcTw9UqeWljFZSxbVVl2KRw_FX2ZC13974/edit
From ningdi to Everyone: (7:25 PM)
Haha, food orders are basically concentrated in a few hours.
From james to Everyone: (7:25 PM)
1M/Day = 12qps !
From ningdi to Everyone: (7:25 PM)
QPS shouldn't be computed that way; the system would easily be overwhelmed.
From shawnzheng to Everyone: (7:26 PM)
Just joined. What does donation have to do with DoorDash?
From ningdi to Everyone: (7:26 PM)
At checkout there is an option asking you for money to donate to the charities you choose.
From shawnzheng to Everyone: (7:28 PM)
Hmm, but many people won't donate. I'm a bit confused about computing order QPS.
From ningdi to Everyone: (7:28 PM)
😂 Yes.
From Cheng Jing to Everyone: (7:28 PM)
Good point, I've basically never donated 🤦‍♂️
From ningdi to Everyone: (7:29 PM)
I suggest donating a little before the interview...
From kevin to Everyone: (7:29 PM)
shawnzheng hit the nail on the head.
From Hu to Everyone: (7:29 PM)
I don't get why dividing by 3600 is enough, without also dividing by 24.
From fengxiong to Everyone: (7:29 PM)
Because people only order food during peak hours.
From ningdi to Everyone: (7:29 PM)
This is fairly easy to understand: not every order comes with a donation, but donations are inseparable from orders.
From Cheng Jing to Everyone: (7:30 PM)
The point is that DoorDash orders are concentrated around mealtimes rather than spread evenly over every hour.
From Yanbin Li to Everyone: (7:30 PM)
The interviewer just corrected it: it's 3 million per hour; the notes just weren't updated.
From ningdi to Everyone: (7:30 PM)
... DoorDash has that many users now?
From kevin to Everyone: (7:31 PM)
It's an assumption.
From Yanbin Li to Everyone: (7:31 PM)
In the QPS part, one important number wasn't asked: how much QPS the third-party payment provider can sustain. That directly affects the solution, as does the third party's latency.
From fengxiong to Everyone: (7:32 PM)
If it can handle orders, it can handle donations.
From ningdi to Everyone: (7:32 PM)
You're a client of the third party. You need to care about latency, but its QPS capacity isn't used by you alone, so why care about it?
From kevin to Everyone: (7:32 PM)
Yanbin, bonus points for you.
From Yanbin Li to Everyone: (7:32 PM)
Integrations like this usually come with an SLA; you can't just call it however you like.
From kevin to Everyone: (7:32 PM)
Very well said.
From Ken to Everyone: (7:32 PM)
Interview 7:12->7:57.
Meeting notes: https://docs.google.com/document/d/19wMqh91tdvcTw9UqeWljFZSxbVVl2KRw_FX2ZC13974/edit
From ningdi to Everyone: (7:33 PM)
The SLA is about latency. The QPS the third party supports isn't exclusively yours, so knowing it wouldn't help, would it?
From Ender Li to Everyone: (7:34 PM)
Large customers usually agree up front on roughly how many calls per second they will make. Bank payment APIs are different from ordinary public web APIs; they aren't assumed to absorb any volume.
From claire lin to Everyone: (7:35 PM)
How does this tell the restaurant that an order was placed, and when the food will arrive?
From ningdi to Everyone: (7:35 PM)
So in this question, would the hard part be that the third party's QPS can't support your order volume? Emmmm...
From kevin to Everyone: (7:36 PM)
Generally you assume the third party can handle the traffic, but it's best for the candidate to ask; it shows thoroughness.
From ningdi to Everyone: (7:37 PM)
It cuts both ways... there's no clear line between being thorough and over-designing 😂
From claire lin to Everyone: (7:37 PM)
If you use an async call, how do you confirm?
From Bam to Everyone: (7:38 PM)
So ordering and donating are two APIs?
From kevin to Everyone: (7:38 PM)
There's nothing to design; just state that you assume the third-party API can handle it, nothing more, unless the interviewer says otherwise and gives you specifics.
From ningdi to Everyone: (7:38 PM)
It looks like the food order and the donation order are completely separated.
From Bam to Everyone: (7:38 PM)
I don't think that's great; they should go together, both or nothing.
From ningdi to Everyone: (7:39 PM)
It was verified earlier that the donation amount must not affect the regular order.
From Ender Li to Everyone: (7:39 PM)
If you think of it and ask while clarifying requirements, that's being thorough. If you assume the third party can't cope without even asking and then design around it, that's over-design.
From kevin to Everyone: (7:39 PM)
Ender is right.
From ningdi to Everyone: (7:39 PM)
Ah yes, yes, yes.
From Ender Li to Everyone: (7:40 PM)
Personally I think "the third party can't support your QPS" is a great follow-up.
From kevin to Everyone: (7:40 PM)
Yes, it can be a follow-up, and not only on the technical side; it can involve product design too.
From Bam to Everyone: (7:41 PM)
By the way, what if the order fails but the donation succeeds?
From Ender Li to Everyone: (7:42 PM)
Place the donation order only after the food order succeeds; make it a workflow.
From Bam to Everyone: (7:42 PM)
That works.
From kevin to Everyone: (7:42 PM)
Very good.
From fengxiong to Everyone: (7:42 PM)
Awesome.
From kevin to Everyone: (7:43 PM)
Handling this case earns bonus points; it shows deep thinking.
From claire lin to Everyone: (7:44 PM)
Usually you place a hold on the funds first, then place the order; if the order succeeds you charge, otherwise you cancel the hold.
From Ender Li to Everyone: (7:45 PM)
Sorry, I didn't catch it; what is payment method?
From Sebastian Su to Everyone: (7:45 PM)
Is apiToken something like a JWT?
From Laoluo to Everyone: (7:45 PM)
In principle, are the order and the donation in the same transaction?
From Bam to Everyone: (7:46 PM)
No; if only the donation fails, there is no rollback.
From Kai’s iPhone to Everyone: (7:46 PM)
It shouldn't be, right?
From Peng Su to Everyone: (7:46 PM)
What is apiToken used for?
From s to Everyone: (7:46 PM)
Donation QPS should be much lower than order QPS, so isn't it unlikely that the third party can't handle it?
From kevin to Everyone: (7:46 PM)
It can be, but it doesn't have to be.
From ningdi to Everyone: (7:46 PM)
So far this design looks like the REST API topic from our interview experience...
From Laoluo to Everyone: (7:46 PM)
The API token doesn't really need to be listed separately; different implementations have different parameters, such as a secret key.
From Kai’s iPhone to Everyone: (7:46 PM)
Donations must not affect the core business. If the donation API goes down, wouldn't all your orders fail?
From Laoluo to Everyone: (7:46 PM)
Makes sense; ordering comes first.
From kevin to Everyone: (7:47 PM)
This needs clarifying: if the donation fails, can the order still succeed?
From Ender Li to Everyone: (7:47 PM)
Whether it's one transaction depends on how the agreement is written, but companies generally prefer terms where the user accepts that the order can succeed while the donation fails. I'd recommend asking the interviewer about this.
From Laoluo to Everyone: (7:47 PM)
So the discussion here is to handle the donation separately, after the order?
From Ken to Everyone: (7:48 PM)
Interview 7:12->7:57. Meeting notes: https://docs.google.com/document/d/19wMqh91tdvcTw9UqeWljFZSxbVVl2KRw_FX2ZC13974/edit
From Zhao to Everyone: (7:48 PM)
The donation doesn't have to be executed on the spot; the company can aggregate weekly or monthly and then donate. So order + donation can be one transaction that simply records the donation amount.
From xiao to Everyone: (7:48 PM)
What is Update_ts for?
From james to Everyone: (7:49 PM)
How much QPS can SQL handle?
From Dingwen Chen to Everyone: (7:49 PM)
Does the order transaction include payment and donation?
From Ender Li to Everyone: (7:49 PM)
I think the question would be more interesting framed this way: DoorDash now wants to add a donation feature; how would you design it?
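The hold-then-charge flow and the "a failed donation must not roll back the order" point above can be sketched together as follows. This is a minimal sketch under assumptions: `FakeGateway`, its methods, and the amounts are hypothetical stand-ins, not any real payment provider's API.

```python
class PaymentError(Exception):
    """Raised by the hypothetical gateway when a charge cannot be made."""


class FakeGateway:
    """Hypothetical payment gateway used only for illustration."""

    def __init__(self, fail_donation=False):
        self.fail_donation = fail_donation
        self.log = []  # records the calls that succeeded, in order

    def hold(self, amount_cents):
        self.log.append(("hold", amount_cents))
        return "hold-1"

    def capture(self, hold_id):
        self.log.append(("capture", hold_id))

    def release(self, hold_id):
        self.log.append(("release", hold_id))

    def charge_donation(self, amount_cents):
        if self.fail_donation:
            raise PaymentError("donation provider unavailable")
        self.log.append(("donation", amount_cents))


def checkout(gateway, place_order, order_cents, donation_cents):
    """Hold funds, place the order, capture; then attempt the donation."""
    hold_id = gateway.hold(order_cents)
    try:
        order_id = place_order()
    except Exception:
        gateway.release(hold_id)  # order failed: cancel the hold, charge nothing
        raise
    gateway.capture(hold_id)      # order succeeded: charge the held amount

    donated = False
    if donation_cents > 0:
        try:
            gateway.charge_donation(donation_cents)
            donated = True
        except PaymentError:
            donated = False       # order stays successful; a real system would retry later
    return {"order_id": order_id, "donated": donated}


gw = FakeGateway(fail_donation=True)
result = checkout(gw, lambda: "order-42", order_cents=2350, donation_cents=65)
print(result)  # {'order_id': 'order-42', 'donated': False}
```

The donation step runs only after the order is captured, which matches the workflow suggestion in the chat; whether the two should instead be one transaction is exactly the point to clarify with the interviewer.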
From fengxiong to Everyone: (7:49 PM)
A message queue is being used.
From ningdi to Everyone: (7:49 PM)
In the implementation I wouldn't want donation code written into the normal food-order code; if something breaks, both go down together. Better to separate them at the system level so they don't affect each other. Add a separate MQ consumer that listens for payment-success messages, then checks whether the order includes a donation and processes the donation asynchronously.
From james to Everyone: (7:49 PM)
Above what QPS can you no longer use SQL?
From kevin to Everyone: (7:50 PM)
Zhao put it very well; thinking from the business perspective is a big plus.
From Cheng Jing to Everyone: (7:50 PM)
Above 1000 QPS, I'd say.
From Dingwen Chen to Everyone: (7:50 PM)
Put it in payment, with options for cash back, donation, and tips.
From Pu Wang to Everyone: (7:50 PM)
There is no such rule. 1000 QPS is for a single-node SQL DB; SQL DBs are scalable too.
From ningdi to Everyone: (7:50 PM)
One SQL node handles about 1000 with mixed reads and writes; for pure writes I don't know.
From Ken to Everyone: (7:51 PM)
Interview 7:12->7:57. Meeting notes: https://docs.google.com/document/d/19wMqh91tdvcTw9UqeWljFZSxbVVl2KRw_FX2ZC13974/edit
From Kai’s iPhone to Everyone: (7:51 PM)
Makes sense, because the order may be cancelled.
From Dingwen Chen to Everyone: (7:51 PM)
Add one more VIP ID for donation.
From Kai’s iPhone to Everyone: (7:51 PM)
So the donation may be rolled back.
From ningdi to Everyone: (7:52 PM)
Actually, after you order and donate, how many transactions show up at your bank?
From Shihao Zhong to Everyone: (7:52 PM)
Definitely two.
From kevin to Everyone: (7:52 PM)
Ender's idea is very good; consider it feedback for the interviewer.
From Bam to Everyone: (7:52 PM)
How was this diagram generated?
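The back-of-envelope numbers from this thread can be written out explicitly. The sketch below uses only figures quoted in the chat: "1M/Day", the interviewer's corrected 3 million per hour, and the roughly 1000 QPS single-node SQL rule of thumb, which is a participant's estimate rather than a measured limit.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
SECONDS_PER_HOUR = 60 * 60       # 3,600


def average_qps(requests, window_seconds):
    """Average requests per second over a time window."""
    return requests / window_seconds


# Spreading 1M orders evenly over a day gives the "12 qps" figure.
daily_avg = average_qps(1_000_000, SECONDS_PER_DAY)

# Orders cluster around mealtimes, so size for the peak hour instead:
# 3 million orders in one hour, divided by 3600 only.
peak_qps = average_qps(3_000_000, SECONDS_PER_HOUR)

# Rule of thumb quoted in the chat for one SQL node (an assumption, not a benchmark).
SINGLE_NODE_SQL_QPS = 1000

print(round(daily_avg, 1))                       # about 11.6
print(round(peak_qps, 1))                        # about 833.3
print(round(peak_qps / SINGLE_NODE_SQL_QPS, 2))  # about 0.83 of one node
```

Under these assumed numbers the peak sits close to the single-node figure, which is why the choice of averaging window changes the design conversation.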
From Dingwen Chen to Everyone: (7:52 PM)
One, I think.
From ningdi to Everyone: (7:52 PM)
Is "definitely two" from a real example or just an assumption...? I haven't donated so far, so I don't know how many there are.
From Ender Li to Everyone: (7:53 PM)
One payee is DoorDash and the other is the charity, so it can't be one, can it?
From Shihao Zhong to Everyone: (7:53 PM)
Right.
From ningdi to Everyone: (7:53 PM)
DoorDash holds all the money and settles with merchants at the end of the month...
From Bam to Everyone: (7:53 PM)
DoorDash could also aggregate and pay out monthly.
From Pu Wang to Everyone: (7:54 PM)
Yes, usually you just record it and process the donation later.
From ningdi to Everyone: (7:54 PM)
If the money is paid out directly, you can't deduct it in case of complaints and the like...
From Bam to Everyone: (7:54 PM)
Then a single transfer is better; otherwise the transaction fees are unbearable: you donate 20 cents and the bank takes 10.
From kevin to Everyone: (7:55 PM)
Yes.
From Dingwen Chen to Everyone: (7:55 PM)
It's like tips: unless a different payment method is used, it's one transaction.
From Bam to Everyone: (7:55 PM)
And we can assume the user pays with the same card, so both succeed or both fail together.
From Ender Li to Everyone: (7:55 PM)
Great point, so it's more reasonable for DoorDash to aggregate donations monthly.
From Shihao Zhong to Everyone: (7:56 PM)
Makes sense.
From Bam to Everyone: (7:56 PM)
Nobody would deliberately pay with two cards, right?
From kevin to Everyone: (7:56 PM)
Thumbs up for Zhao.
From Ender Li to Everyone: (7:56 PM)
Up front it can be a single transaction to DoorDash, which also makes the logic simpler.
From Eric Che to Everyone: (7:56 PM)
Why use Kafka rather than an MQ? Kafka can act as an MQ, but it cannot fully replace one.
From kevin to Everyone: (7:57 PM)
One implementation, two uses
From christie Yu to Everyone: (7:57 PM)
Isn't a SQL DB hard to shard?
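The "record now, pay out monthly" idea can be sketched as a ledger that is summed per charity per month, so many small donations become one transfer each. This is an illustrative sketch only; the ledger rows, charity ids, and month labels are hypothetical, and amounts are integer cents to avoid floating-point rounding.

```python
from collections import defaultdict


def monthly_payouts(donations):
    """Sum recorded donations into one payout per (month, charity).

    donations: iterable of (month, charity_id, amount_cents) tuples.
    """
    totals = defaultdict(int)
    for month, charity_id, amount_cents in donations:
        totals[(month, charity_id)] += amount_cents
    return dict(totals)


# Hypothetical ledger rows written at checkout time, one per donating order.
ledger = [
    ("month-1", "charity-a", 20),
    ("month-1", "charity-a", 35),
    ("month-1", "charity-b", 50),
    ("month-2", "charity-a", 10),
]

payouts = monthly_payouts(ledger)
print(payouts[("month-1", "charity-a")])  # 55
print(len(ledger), "donations ->", len(payouts), "transfers")
```

With per-transfer bank fees, paying three aggregated transfers instead of four tiny ones is the saving Bam describes; the gap grows with volume.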
From kevin to Everyone: (7:57 PM)
It's just one implementation.
From ningdi to Everyone: (7:57 PM)
What exactly did Zhao say...
From ningdi to Everyone: (7:57 PM)
What "one implementation, two uses"... I even scrolled back up...
From Jing to Everyone: (7:58 PM)
Why read-heavy? Orders should be write-heavy, right?
From kevin to Everyone: (7:58 PM)
Zhao said donations are aggregated and settled monthly.
From tomdi to Everyone: (7:58 PM)
An order cache isn't really needed.
From xiao to Everyone: (7:58 PM)
How is the order cache used here?
From Sebastian Su to Everyone: (7:59 PM)
Indeed, an order cache isn't really needed.
From Sean Gao to Everyone: (7:59 PM)
For querying orders?
From ningdi to Everyone: (7:59 PM)
When reading status; recent status in particular is read frequently.
From tomdi to Everyone: (7:59 PM)
1 master
From Yibin to Everyone: (7:59 PM)
1 master for consistency
From Shihao Zhong to Everyone: (7:59 PM)
How do you have two masters with MySQL?
From Sean Gao to Everyone: (7:59 PM)
With multiple data centers, is it 1 master per colo?
From ningdi to Everyone: (8:00 PM)
Two masters can also stay consistent: you just need to guarantee that any given piece of data is only ever updated by one of the masters, so that for each record it is effectively single-master.
From Shihao Zhong to Everyone: (8:01 PM)
Oh, then you need to add a middleware layer.
From Bam to Everyone: (8:01 PM)
Does MySQL have Read/Quorum?
From christie Yu to Everyone: (8:01 PM)
I don't think so. Only leaderless replication has read/write quorum, right?
From Am to Everyone: (8:02 PM)
Is L4 SDE2?
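ningdi's point that two masters stay consistent if each record is only ever written through one of them amounts to a deterministic routing rule, which is what the "middleware" would implement. A minimal sketch, assuming hypothetical node names and a simple hash of the record key; this is not a feature of MySQL itself.

```python
import zlib

MASTERS = ["master-a", "master-b"]  # hypothetical node names


def owning_master(record_key: str) -> str:
    """Return the single master allowed to write this record.

    zlib.crc32 is used because it is stable across processes and restarts,
    unlike Python's built-in hash() for strings.
    """
    index = zlib.crc32(record_key.encode("utf-8")) % len(MASTERS)
    return MASTERS[index]


# Every writer that routes through this function sends updates for a given
# order to the same master, so that record is effectively single-master.
for key in ("order-1001", "order-1002", "order-1003"):
    print(key, "->", owning_master(key))
```

Changing the number of masters remaps keys under this scheme, so a real deployment would need a migration plan or a more stable mapping; that is outside what the chat discussed.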
From ningdi to Everyone: (8:02 PM)
The levels he mentions are all Google's.
From Am to Everyone: (8:02 PM)
thx
From Ken to Everyone: (8:06 PM)
https://docs.google.com/document/d/19wMqh91tdvcTw9UqeWljFZSxbVVl2KRw_FX2ZC13974/edit#
From Yibin to Everyone: (8:09 PM)
Could the interviewer say what else would need to improve for L5?
From Sean Gao to Everyone: (8:11 PM)
+1
From Kai’s iPhone to Everyone: (8:13 PM)
+1
From Jiayue(Hubert) Wu to Everyone: (8:18 PM)
What is a database hook? I searched but couldn't find it.
From kevin to Everyone: (8:18 PM)
My guess is database triggers.
From Cheng Jing to Everyone: (8:18 PM)
sql triggers?
From Sebastian Su to Everyone: (8:20 PM)
+1, I don't understand what a DB hook is or what it does.
From Mark to Everyone: (8:23 PM)
For L5, would this be a hire?
From Lucas Li to Everyone: (8:23 PM)
Could this lady make L5?
From Chris to Everyone: (8:23 PM)
The question is too easy.
From tomdi to Everyone: (8:23 PM)
For L5 they could add another round.
From Chris to Everyone: (8:23 PM)
For L5 they would ask harder things, like idempotency.
From Kai’s iPhone to Everyone: (8:26 PM)
The interviewer is very experienced.
From Mark to Everyone: (8:26 PM)
Most interviewers don't know that.
From ningdi to Everyone: (8:27 PM)
At the debrief, with so little content in that diagram, can it back up the case...?
From james to Everyone: (8:29 PM)
Could you talk about choosing between SQL and NoSQL?
From Joselyn phone to Everyone: (8:30 PM)
Same question about DB hook.
From Jiashen to Everyone: (8:31 PM)
Could you share the summary doc?
From Savannah Tong to Everyone: (8:32 PM)
https://docs.google.com/document/d/19wMqh91tdvcTw9UqeWljFZSxbVVl2KRw_FX2ZC13974/edit#
From ningdi to Everyone: (8:32 PM)
😂 Boiling dumplings in a teapot (it's all in there but can't be poured out)
From Sean Gao to Everyone: (8:32 PM)
👍
From ningdi to Everyone: (8:32 PM)
Nice metaphor.
From Taotao to Everyone: (8:32 PM)
This splash of cold water was well placed; the points were all well made and very helpful to hear. But I think the candidate's communication was already very good.
From Yibin to Everyone: (8:33 PM)
Thanks for sharing! Very helpful!
From Kai’s iPhone to Everyone: (8:33 PM)
Thanks.
From Mark to Everyone: (8:34 PM)
Who knows what interviewers value.
From Sean Gao to Everyone: (8:37 PM)
But it involves distributed transactions, right?
From ningdi to Everyone: (8:37 PM)
You don't get the question before the interview. 😂 I thought the best interview flow is to draw the high-level diagram, then go into detail on whichever module the interviewer wants to expand... not sure if that's right...
From Mark to Everyone: (8:39 PM)
From a business perspective, which materials are better to study?
From Chris to Everyone: (8:48 PM)
It depends on the question; some questions require estimation.
From s to Everyone: (8:48 PM)
It does relate to the solution: if QPS is small, you may not need to scale at all.
From Sebastian Su to Everyone: (8:50 PM)
Below roughly what QPS do you not need a distributed setup?
From s to Everyone: (8:50 PM)
This is an easy way to check whether you really have experience.
From Chris to Everyone: (8:51 PM)
If QPS isn't high, you don't even need a queue. SPOF: active-passive.
From Savannah Tong to Everyone: (8:56 PM)
db hook https://orientdb.com/docs/2.2.x/Hook.html
From Lucas Li to Everyone: (9:09 PM)
You can't use an async queue, can you?
What if the credit card information is wrong?

## Google drive
From Tekken to Everyone: (7:06 PM)
First time as a mock interviewer; please point out anything I could do better 🙏
From 老黄瓜 to Everyone: (7:06 PM)
You're too modest, bro. Let the show begin!
From Ken to Everyone: (7:15 PM)
https://docs.google.com/document/d/19WtV88EbH8_t5J8bxZ6tAbMJz0iO6EtsKetzun55UC8/edit#
From Yu Zheng to Everyone: (7:17 PM)
google drive has desktop version too...
From Li to Everyone: (7:17 PM)
+1
From jun to Everyone: (7:17 PM)
+1
From Xinyu Zhang to Everyone: (7:17 PM)
+1
From 老黄瓜 to Everyone: (7:17 PM)
So what's the difference between Dropbox and Google Drive, in terms of user functionality?
From Ken to Everyone: (7:18 PM)
https://docs.google.com/document/d/19WtV88EbH8_t5J8bxZ6tAbMJz0iO6EtsKetzun55UC8/edit#
includes meeting notes and QR code
Time: 7:16-8:01
From jun to Everyone: (7:19 PM)
Thanks
From Ken to Everyone: (7:19 PM)
Calendar for future events: https://www.designclub.mingdaoschool.com/calendar.html
From HW to Everyone: (7:20 PM)
Is Tom the interviewee?
From jun to Everyone: (7:20 PM)
Delete files?
From 老黄瓜 to Everyone: (7:20 PM)
Update files?
From Xinyu Zhang to Everyone: (7:20 PM)
Why does multi-user real-time file editing count as a bonus feature?
From Yu Zheng to Everyone: (7:20 PM)
Right... why is the candidate writing all the functionality himself?
From EE to Everyone: (7:21 PM)
This design looks more like a file system.
From xiaonan to Everyone: (7:21 PM)
Because it's so competitive...
From Zhengguan Li to Everyone: (7:22 PM)
...
From HW to Everyone: (7:22 PM)
Requirements gathering should be part of what is assessed.
From jun to Everyone: (7:22 PM)
+1
From Yu Zheng to Everyone: (7:22 PM)
Right...
From James to Everyone: (7:22 PM)
Is file sharing being ignored? The interviewer brought it up several times.
From 应Jianghong to Everyone: (7:22 PM)
Three nines is too low.
From EE to Everyone: (7:22 PM)
Drive also makes it easy for others to edit and collaborate on files
From Tony Y to Everyone: (7:22 PM)
Interviewee, please don't look at the chat.
From Xinyu Zhang to Everyone: (7:23 PM)
Also, multiple people edit files at the same time, so consistency has to be handled.
From Peng Su to Everyone: (7:23 PM)
In real interviews I've never seen one strictly follow this process; they are all back-and-forth discussions.
From Yu Zheng to Everyone: (7:23 PM)
Because the question was known in advance... so the preparation was a bit overdone.
From jun to Everyone: (7:23 PM)
It is driven by the interviewee now
From 老黄瓜 to Everyone: (7:23 PM)
@Peng Su could you describe what the back-and-forth looks like?
From Peng Su to Everyone: (7:23 PM)
The initial prompt also comes with more detail; it isn't just a five-word question.
From la s er to Everyone: (7:24 PM)
Is the candidate being a bit too aggressive, or is it just me?
From lw to Everyone: (7:24 PM)
If I've never used it, am I not allowed to interview...?
From EE to Everyone: (7:24 PM)
Is this mock at senior level?
From Yu Zheng to Everyone: (7:24 PM)
If you haven't used it, talk through the use cases with the interviewer.
From James to Everyone: (7:24 PM)
+1 If you haven't used it, talk through the use cases with the interviewer.
From Ken to Everyone: (7:24 PM)
https://docs.google.com/document/d/19WtV88EbH8_t5J8bxZ6tAbMJz0iO6EtsKetzun55UC8/edit#
includes meeting notes and QR code
Time: 7:16-8:01
From lw to Everyone: (7:24 PM)
Right. I think you should just talk with the interviewer.
From xiaonan to Everyone: (7:25 PM)
In my experience, if you haven't used it you basically fail, unless you're a genius.
From 老黄瓜 to Everyone: (7:25 PM)
Huh, not having used it is normal. What would you say for "design Tinder"?
From lw to Everyone: (7:25 PM)
There's plenty I haven't used. Isn't stock exchange also in the poll?
From 应Jianghong to Everyone: (7:26 PM)
Well, you might actually have used Tinder.
From Yu Zheng to Everyone: (7:26 PM)
Because the features come from the interviewer... you only need to know the use cases behind the features...
From Ken to Everyone: (7:26 PM)
Ming Dao School event calendar: https://www.designclub.mingdaoschool.com/calendar.html
Vote for popular questions: https://www.designclub.mingdaoschool.com/popular-interview.html
https://docs.google.com/document/d/19WtV88EbH8_t5J8bxZ6tAbMJz0iO6EtsKetzun55UC8/edit#
includes meeting notes and QR code
Time: 7:16-8:01
From iPad to Everyone: (7:26 PM)
Not having used it is perfectly normal.
From 应Jianghong to Everyone: (7:26 PM)
Some interview questions are about features that haven't been productionized.
From EE to Everyone: (7:26 PM)
Cloud storage and Google Drive have different pain points.
From Peng Su to Everyone: (7:27 PM)
Usually you design an MVP first without considering scale, then go in different directions based on the interviewer's questions.
From xiaonan to Everyone: (7:27 PM)
If you haven't used it, you can only lean on generic points and lose what is specific to the question, and those specifics are usually what is being tested.
From Peng Su to Everyone: (7:27 PM)
For example, some ask about resiliency, some about scale.
From Xinyu Zhang to Everyone: (7:27 PM)
No need to compute bandwidth so precisely; agreeing on the rough order of magnitude with the interviewer should be enough.
From xiaonan to Everyone: (7:27 PM)
Not every interviewer engages in back-and-forth.
From lw to Everyone: (7:27 PM)
You can ask the interviewer. That's what communication is. Otherwise isn't it just reciting answers...
From Yu Zheng to Everyone: (7:27 PM)
Not having used a product doesn't mean you haven't used something similar... I haven't used pin, but I may have used a competitor.
From xiaonan to Everyone: (7:27 PM)
At L5 you basically have to drive the whole process. Of course you can ask questions like "does it make sense?"
From Cory Wang to Everyone: (7:28 PM)
Driving is fine, but when scoping requirements you should still ask the interviewer.
From xiaonan to Everyone: (7:28 PM)
But the interviewers I've met just give you no hints at all.
From Yu Zheng to Everyone: (7:28 PM)
You drive the design... not the requirements...
From xiaonan to Everyone: (7:29 PM)
We have to prepare for the worst.
From Peng Su to Everyone: (7:29 PM)
Right, driving means there are many paths and you pick one based on the requirements.
From EE to Everyone: (7:29 PM)
So this is a cloud file system, not Google Drive.
From jun to Everyone: (7:29 PM)
20 files
From Yu Zheng to Everyone: (7:29 PM)
You drive the conversation... not make up requirements in your head... there's a difference...
From Peng Su to Everyone: (7:29 PM)
Driving doesn't mean I just drive along my own route.
From jun to Everyone: (7:29 PM)
1024/50
From Xinyu Zhang to Everyone: (7:29 PM)
(requirements 7 minutes + capacity 6 minutes)
From Ken to Everyone: (7:30 PM)
Time: 7:16-8:01
Meeting notes and QR code
https://docs.google.com/document/d/19WtV88EbH8_t5J8bxZ6tAbMJz0iO6EtsKetzun55UC8/edit#
From jun to Everyone: (7:31 PM)
permission is at the front
From 老黄瓜 to Everyone: (7:31 PM)
One service per CRUD operation?
From Kun Zhang to Everyone: (7:32 PM)
As a serverless?
From jun to Everyone: (7:32 PM)
no cache?
From Eric Haung to Everyone: (7:32 PM)
Shouldn't the auth service sit in front of the upload, download, delete, list-files services, etc.?
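Eric's point that the auth check belongs in front of the file services, rather than beside them, can be sketched as a wrapper that every handler passes through. This is a minimal sketch under assumptions: the `PERMISSIONS` table, user names, and handler signatures are hypothetical and stand in for a real auth service.

```python
import functools


class Forbidden(Exception):
    """Raised when a user is not allowed to perform an action."""


# Hypothetical permission table: (user, action) pairs that are allowed.
PERMISSIONS = {("alice", "upload"), ("alice", "download"), ("bob", "download")}


def requires(action):
    """Decorator that runs the permission check before the wrapped handler."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(user, *args, **kwargs):
            if (user, action) not in PERMISSIONS:
                raise Forbidden(f"{user} may not {action}")
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator


@requires("upload")
def upload(user, name, data):
    return f"stored {name} ({len(data)} bytes) for {user}"


@requires("download")
def download(user, name):
    return f"sending {name} to {user}"


print(upload("alice", "notes.txt", b"hello"))  # allowed
print(download("bob", "notes.txt"))            # allowed
try:
    upload("bob", "notes.txt", b"hi")          # bob has no upload permission
except Forbidden as err:
    print("rejected:", err)
```

Because the check wraps each handler, no upload or download code runs for an unauthorized caller, which is the ordering the chat argues for.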
From 老黄瓜 to Everyone: (7:33 PM)
I think I'm already confused.
From Xinyu Zhang to Everyone: (7:33 PM)
DB
From Tony Y to Everyone: (7:33 PM)
Calm down... it may just be a general API.
From James to Everyone: (7:33 PM)
Ignoring hints. Not great.
From 应Jianghong to Everyone: (7:33 PM)
Eric is right.
From Xinyu Zhang to Everyone: (7:33 PM)
There isn't even a DB.
From Yu Zheng to Everyone: (7:33 PM)
Indeed... ignoring hints seems to be a common mistake.
From 应Jianghong to Everyone: (7:33 PM)
The auth service should be a cross-cutting layer.
From jun to Everyone: (7:34 PM)
Google Drive should use single sign-on, right?
From Li to Everyone: (7:34 PM)
"There isn't even a DB" +1
From Kai to Everyone: (7:34 PM)
Is it overkill to build one service for each upload/download/delete action?
From Eric Haung to Everyone: (7:35 PM)
Cloud file storage is probably what he means by the DB.
From Tong Liu to Everyone: (7:35 PM)
Can the cache be left out at first and added at the scaling step?
From Li to Everyone: (7:35 PM)
The DB needs to store metadata.
From Kai to Everyone: (7:35 PM)
Metadata +1
From jun to Everyone: (7:35 PM)
metadata+1
From Vivian huai to Everyone: (7:35 PM)
It feels like the interviewee doesn't need the interviewer to speak at all >_<
From Li to Everyone: (7:35 PM)
File metadata, user metadata, per-device metadata, etc.
From 应Jianghong to Everyone: (7:35 PM)
In theory, one service per CRUD operation allows horizontal scaling.
From 老黄瓜 to Everyone: (7:35 PM)
I think it could be a bit more high-level, e.g. user -> API gateway -> CRUD service -> DB, etc., and then expand on the details of each part; that may be easier for the audience to follow.
From 应Jianghong to Everyone: (7:36 PM)
In practice, though, I suspect the code would be very bloated.
From Xinyu Zhang to Everyone: (7:36 PM)
The DB has a lot to store. The FS and the DB are completely different things.
From Qiqi to Everyone: (7:36 PM)
Why can't metadata just be stored in the cloud?
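The distinction several people raise here, metadata in a database and file bytes in blob storage, can be sketched as two separate stores linked by a key. This is only an illustration: `FileMeta`, `Drive`, and the two dictionaries are hypothetical stand-ins for a real database table and a real object store.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FileMeta:
    """One metadata row; it points at the blob rather than containing it."""
    file_id: str
    owner: str
    name: str
    size: int
    blob_key: str


class Drive:
    def __init__(self):
        self.metadata_db = {}  # stand-in for a database table keyed by file_id
        self.blob_store = {}   # stand-in for object storage keyed by content hash

    def upload(self, owner, name, data: bytes) -> FileMeta:
        blob_key = hashlib.sha256(data).hexdigest()
        self.blob_store[blob_key] = data  # bytes go to blob storage
        meta = FileMeta(f"{owner}/{name}", owner, name, len(data), blob_key)
        self.metadata_db[meta.file_id] = meta  # small row goes to the DB
        return meta

    def list_files(self, owner):
        # Listing touches only metadata; no file bytes are read.
        return sorted(m.name for m in self.metadata_db.values() if m.owner == owner)

    def download(self, file_id) -> bytes:
        meta = self.metadata_db[file_id]
        return self.blob_store[meta.blob_key]


drive = Drive()
drive.upload("alice", "a.txt", b"hello")
drive.upload("alice", "b.txt", b"world!")
print(drive.list_files("alice"))       # ['a.txt', 'b.txt']
print(drive.download("alice/b.txt"))   # b'world!'
```

Keeping the row small is what makes queries such as "list my files" cheap, and it gives a natural home for the user and device metadata Li mentions.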
From jun to Everyone: (7:36 PM)
It does not have a big picture
From Eric Haung to Everyone: (7:36 PM)
It's moving too fast. He should walk through each component once to make sure he is on the same page as the interviewer.
From Tony Y to Everyone: (7:36 PM)
How about building an MVP first and then considering scale?
From Xinyu Zhang to Everyone: (7:36 PM)
Also, this API is not very RESTful.
From James to Everyone: (7:36 PM)
The communication feels a bit disconnected.
From Eric Haung to Everyone: (7:36 PM)
The interviewer probably has many questions by now, but he is already writing APIs...
From Cory Wang to Everyone: (7:36 PM)
+1 communication is disconnected
From lw to Everyone: (7:37 PM)
Do the APIs need to be this detailed (genuine question)? Can we skip writing so many args?
From 应Jianghong to Everyone: (7:37 PM)
Actually the services in the middle should be merged into one application layer.
From Kd to Everyone: (7:37 PM)
I think he hasn't interviewed enough; nothing with depth has been drawn. That diagram probably carries no information.
From Eric Haung to Everyone: (7:37 PM)
+1
From jun to Everyone: (7:37 PM)
+1
From Yu Zheng to Everyone: (7:37 PM)
"do you know that..." is very bad wording lol
From Vivian huai to Everyone: (7:37 PM)
+1 communication is disconnected
From xiaonan to Everyone: (7:37 PM)
The candidate will probably update his diagram in a bit.
From Tony Y to Everyone: (7:38 PM)
Offline sync is a functional requirement.
From Xinyu Zhang to Everyone: (7:38 PM)
If the API is to be written out, it could be file/ with GET/PUT/DELETE.
From jun to Everyone: (7:38 PM)
drive to details too fast
From Eric Haung to Everyone: (7:38 PM)
POST, GET, DELETE
From jun to Everyone: (7:38 PM)
HEAD?
From Tony Y to Everyone: (7:38 PM)
I often don't write the API...
From jun to Everyone: (7:39 PM)
no user db?
From 应Jianghong to Everyone: (7:39 PM)
I just realized something: isn't a front end missing here?
From jun to Everyone: (7:39 PM)
web/clientapp
From 应Jianghong to Everyone: (7:39 PM)
Otherwise how do you sync?
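The suggestion of a single file resource with GET/PUT/DELETE (plus HEAD) can be sketched as one dispatch function over method and path. This is a toy model, not a web framework: `handle`, the `/files/` prefix, and the in-memory `FILES` dictionary are hypothetical, and the status codes follow common HTTP usage.

```python
FILES = {}  # in-memory stand-in for file storage, keyed by file id


def handle(method, path, body=None):
    """Dispatch one request on a /files/<id> resource; returns (status, body)."""
    prefix = "/files/"
    if not path.startswith(prefix) or len(path) == len(prefix):
        return 404, None
    file_id = path[len(prefix):]

    if method == "PUT":               # create or replace the file
        FILES[file_id] = body
        return 200, None
    if method not in ("GET", "HEAD", "DELETE"):
        return 405, None              # method not allowed on this resource
    if file_id not in FILES:
        return 404, None
    if method == "GET":               # return the content
        return 200, FILES[file_id]
    if method == "HEAD":              # existence check without the content
        return 200, None
    del FILES[file_id]                # DELETE
    return 204, None


print(handle("PUT", "/files/report.txt", b"v1"))  # (200, None)
print(handle("GET", "/files/report.txt"))         # (200, b'v1')
print(handle("DELETE", "/files/report.txt"))      # (204, None)
print(handle("GET", "/files/report.txt"))         # (404, None)
```

One resource with several verbs keeps the argument lists short, which speaks to lw's question about whether every API needs so many args.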
From iPhone to Everyone: (7:39 PM)
Is that the User he drew? It's blocked from view…
From Eric Haung to Everyone: (7:40 PM)
It seems the candidate wants to write out all his ideas at once and then explain them slowly. The downside is that if it is wrong from the start, a lot of time is wasted.
From Qi Wang to Everyone: (7:40 PM)
The service split doesn't seem right; it should separate read and write services, then use a cache to handle reads.
From jun to Everyone: (7:40 PM)
It looks like the user calls the REST API directly
From James to Everyone: (7:40 PM)
I think the APIs can be written during the functional requirements step.
From Li to Everyone: (7:41 PM)
The tables are stored in the file system; that is clearly wrong.
From Xinyu Zhang to Everyone: (7:41 PM)
Why does this need a transaction?
From Yu Zheng to Everyone: (7:42 PM)
Haha, too confident.
From emma to Everyone: (7:42 PM)
seems like the candidate doesn't know the difference between db and file storage
From Li to Everyone: (7:42 PM)
"seems like the candidate doesn't know the difference between db and file storage" +1
From Yu Zheng to Everyone: (7:42 PM)
indeed
From I to Everyone: (7:42 PM)
+1
From 老黄瓜 to Everyone: (7:42 PM)
It seems hard to collect signals from this design.
From jun to Everyone: (7:42 PM)
The question is too big to answer in 1 hour
From xiaonan to Everyone: (7:43 PM)
+1
From Vivian huai to Everyone: (7:43 PM)
It feels a bit like reciting an answer rather than real understanding.
From Ming to Everyone: (7:43 PM)
Metadata is usually stored in a DB, right?
From Phia to Everyone: (7:43 PM)
Shouldn't he follow the interviewer's direction? It feels like he just wants to say what he planned to say.
From jun to Everyone: (7:43 PM)
NOSQL DB
From Yu Zheng to Everyone: (7:43 PM)
Yeah, from the features onward it was clearly a recited answer... so everything gets forced into what he prepared...
From iPhone to Everyone: (7:43 PM)
Designing an MVP first feels safer.
From Jinmin’s iPad to Everyone: (7:43 PM)
I’d like to see the relationship between the permission service and the other services. I’d like to see a workflow about how permission service, upload service and db work together.
From 老黄瓜 to Everyone: (7:43 PM)
Are the file blob and the metadata stored together? A transaction only needs to cover the metadata, right?
From Huimin Yang to Everyone: (7:44 PM)
Why the sudden jump to chunks...
From Shihao Zhong to Everyone: (7:44 PM)
A question: is the core of this problem building a distributed file system, or the product's features?
From Xinyu Zhang to Everyone: (7:44 PM)
And he went straight to NoSQL; the per-API QPS wasn't used at all. The 6 minutes spent on capacity were barely used.
From First Last to Everyone: (7:44 PM)
Reciting answers +1
From Kd to Everyone: (7:44 PM)
There's just no pacing to speak of.
From Ming to Everyone: (7:44 PM)
Because he wants to cover multipart upload.
From 老黄瓜 to Everyone: (7:44 PM)
Feels like picking too many small things.
From Kd to Everyone: (7:44 PM)
He is already into details before the whole flow works end to end.
From Ming to Everyone: (7:44 PM)
It really is too jumpy.
From Eric Haung to Everyone: (7:44 PM)
Reciting answers +1
From First Last to Everyone: (7:44 PM)
It feels like the candidate doesn't understand this.
From Li to Everyone: (7:44 PM)
This design can't sync file changes across multiple devices. It needs a major rework.
From Yu Zheng to Everyone: (7:45 PM)
Because the answer he found looks like this...
From Jay to Everyone: (7:45 PM)
lol
From 老黄瓜 to Everyone: (7:45 PM)
Compression/chunk + zip can be covered in one sentence.
From lw to Everyone: (7:45 PM)
But the answer isn't like this either...
From iPhone to Everyone: (7:45 PM)
Where did chunks come from?
From First Last to Everyone: (7:45 PM)
The components in the diagram are not like this at all, and neither is the answer...
From jun to Everyone: (7:45 PM)
+1
From Kd to Everyone: (7:45 PM)
Any links to a good answer?
From Xinyu Zhang to Everyone: (7:45 PM)
"But the answer isn't like this either." +1
From Tony Y to Everyone: (7:45 PM)
It's getting back on track; starting with chunks leads to why metadata is needed.
From 应Jianghong to Everyone: (7:45 PM)
I think it is still best to drive top-down.
From Kd to Everyone: (7:45 PM)
YouTube link? Web link?
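Tony Y's remark that chunks lead to the need for metadata can be made concrete: once a file is split into fixed-size chunks, a list of chunk hashes (a manifest) is the metadata that tells a client which chunks to re-upload. A minimal sketch, assuming fixed-size chunking and SHA-256; the function names and the 4-byte chunk size are illustrative only.

```python
import hashlib


def split_into_chunks(data: bytes, chunk_size: int):
    """Cut data into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_manifest(data: bytes, chunk_size: int):
    """Return the ordered list of chunk hashes; this is the metadata to store."""
    return [hashlib.sha256(c).hexdigest() for c in split_into_chunks(data, chunk_size)]


def changed_chunks(old_manifest, new_manifest):
    """Indices of chunks in the new version that differ from the old one."""
    return [
        i for i, digest in enumerate(new_manifest)
        if i >= len(old_manifest) or old_manifest[i] != digest
    ]


old = chunk_manifest(b"aaaabbbbcccc", chunk_size=4)
new = chunk_manifest(b"aaaaXbbbcccc", chunk_size=4)
print(changed_chunks(old, new))  # [1]: only the second chunk must be uploaded
```

The same manifest is what a multipart or resumable upload would consult to know which parts have already arrived; as noted in the chat, this only pays off when files are large enough to need it.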
From Ming to Everyone: (7:45 PM)
S3 does use chunks.
From Yu Zheng to Everyone: (7:45 PM)
Later we can look at the reference answer the interviewer prepared.
From Ken to Everyone: (7:45 PM)
meeting notes and QR code
https://docs.google.com/document/d/19WtV88EbH8_t5J8bxZ6tAbMJz0iO6EtsKetzun55UC8/edit#
Time: 7:16-8:01
Ming Dao School event calendar: https://www.designclub.mingdaoschool.com/calendar.html
Vote for popular questions: https://www.designclub.mingdaoschool.com/popular-interview.html
From 应Jianghong to Everyone: (7:45 PM)
The interviewer may not care about chunks.
From Catherine zhang to Everyone: (7:45 PM)
Let's not assume too much. Besides, people who interview here are expected to prepare in advance.
From Esther to Everyone: (7:46 PM)
The interviewer may not care about chunks +1
From Li to Everyone: (7:46 PM)
+1
From lw to Everyone: (7:46 PM)
Reciting answers is OK.
From Zhengguan Li to Everyone: (7:46 PM)
(Why do I feel he is doing okay...)
From First Last to Everyone: (7:46 PM)
Reference answer: https://www.youtube.com/watch?v=PE4gwstWhmc Dropbox Senior Engineer design at Stanford University.
From lw to Everyone: (7:46 PM)
For the same question, of course you recite.
From Xinyu Zhang to Everyone: (7:46 PM)
30 min in: 7 min requirements + 6 min capacity + 17 min for the rest. A real interview would basically be over by now.
From 老黄瓜 to Everyone: (7:46 PM)
The data estimation didn't mention large files, so chunks may not be needed either.
From kk to Everyone: (7:46 PM)
Let's be constructive: how do you avoid giving the impression of reciting answers? Any techniques to keep in mind?
From jun to Everyone: (7:46 PM)
Suppose you are a user
From Yu Zheng to Everyone: (7:47 PM)
When you get a question you have prepared, don't celebrate too early... dig for what the interviewer is interested in... rather than forcing your own template.
From lw to Everyone: (7:47 PM)
I think it comes out of talking with the interviewer.
From Li to Everyone: (7:47 PM)
Keeping local files consistent across multiple clients/devices of the same user wasn't discussed at all.
From lw to Everyone: (7:47 PM)
If it comes out of the conversation, it's natural.
From iPhone to Everyone: (7:47 PM)
If you understand the principles, reciting answers is fine, as long as you can handle the interviewer's questions.
From Yu Zheng to Everyone: (7:47 PM)
For example, for features you can state what you prepared, then ask what they would like to see...
From 应Jianghong to Everyone: (7:47 PM)
This sync does have some technical difficulty.
From lw to Everyone: (7:47 PM)
And the interviewer knows you've studied it...
From Vivian huai to Everyone: (7:47 PM)
Talking and communicating with the interviewer is very important.
From Kd to Everyone: (7:47 PM)
I think the key is to follow the interviewer's requirements at the start, then map them onto what you prepared.
From Yu Zheng to Everyone: (7:47 PM)
Don't challenge the interviewer... collaborate to solve the problem...
From Ming to Everyone: (7:48 PM)
+1
From DEFA WANG to Everyone: (7:48 PM)
Spinning up 5 services....
From Kd to Everyone: (7:48 PM)
If you follow your own ideas too much at the start and ignore the interviewer's feedback, it comes across as reciting an answer.
From Yu Zheng to Everyone: (7:48 PM)
For example, the interviewer just asked whether the meta DB and the file DB should be separate. Just answer that they should be...
From Vivian huai to Everyone: (7:48 PM)
And he really wasn't listening to what the interviewer said.
From lw to Everyone: (7:49 PM)
These services can be split later. Having so many similar ones at the start isn't necessary.
From Cory Wang to Everyone: (7:49 PM)
Is the high-level design finished?
This diagram isn't a high-level design, is it?
From 老黄瓜 to Everyone: (7:49 PM)
I don't understand how his data is stored. Is NoSQL for metadata, or both?
From jun to Everyone: (7:49 PM)
Sometimes interviewers work like that
From Cory Wang to Everyone: (7:49 PM)
Not workable solution
From Vivian huai to Everyone: (7:49 PM)
Many things that should have been made clear weren't.
From Xinyu Zhang to Everyone: (7:49 PM)
Actually it's fine without a chunker; it's just less efficient. The key parts are the file detector and how to check each file's version. Get a minimal working solution first, then discuss how different users sync and resolve conflicts.
From iPhone to Everyone: (7:49 PM)
Why introduce an MQ? There wasn't much tradeoff discussion or justification.
From kk to Everyone: (7:49 PM)
Thanks everyone for the examples!
From First Last to Everyone: (7:49 PM)
Not workable solution + 1
From 老黄瓜 to Everyone: (7:50 PM)
Discussing some failure cases would probably make things clearer. For example: if an upload fails halfway, does it restart or resume from a checkpoint? Is lag allowed when the user pulls the latest files? If a user has two phones and one deletes a file while the other adds one, how is that handled?
From Catherine zhang to Everyone: (7:50 PM)
What is the MQ doing here? I didn't follow.
From Huimin Yang to Everyone: (7:51 PM)
Being on the same page with the interviewer is important... you can't talk past each other.
From tom to Everyone: (7:51 PM)
An MQ is needed for real-time collaborative editing.
From jun to Everyone: (7:51 PM)
If you delete a file, the operation will go to a mq
From Catherine zhang to Everyone: (7:52 PM)
Is the tom in this chat the interviewee?
From Kevin to Everyone: (7:52 PM)
@Catherine, yes.
From iPhone to Everyone: (7:52 PM)
I'm not saying the MQ is entirely wrong here; I just think it should be explained, otherwise it feels like a jump.
From Vivian huai to Everyone: (7:52 PM)
It feels like the interviewee is just cobbling things together and hoping to hit keywords.
From Kd to Everyone: (7:52 PM)
When is an MQ generally needed? Decoupling?
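The per-file version check and the two-phones case raised above can be sketched with optimistic versioning: every write names the version it was based on, and a write based on a stale version is rejected so the client can pull and resolve. This is one possible approach under assumptions, not the design from the session; `FileStore` and `VersionConflict` are hypothetical names.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a version that is no longer current."""


class FileStore:
    def __init__(self):
        self._files = {}  # name -> (version, content)

    def read(self, name):
        """Return (version, content); version 0 means the file does not exist."""
        return self._files.get(name, (0, None))

    def write(self, name, content, base_version):
        current_version, _ = self.read(name)
        if base_version != current_version:
            raise VersionConflict(
                f"{name}: based on v{base_version}, server has v{current_version}"
            )
        new_version = current_version + 1
        self._files[name] = (new_version, content)
        return new_version


store = FileStore()
store.write("todo.txt", "buy milk", base_version=0)  # creates v1

# Two phones both read v1, then edit offline.
phone_a_base, _ = store.read("todo.txt")
phone_b_base, _ = store.read("todo.txt")

store.write("todo.txt", "buy milk, eggs", base_version=phone_a_base)  # v2
try:
    store.write("todo.txt", "buy bread", base_version=phone_b_base)   # stale
except VersionConflict as err:
    print("conflict:", err)  # phone B must pull v2 and merge or keep both copies
```

What the client does after a conflict (merge, keep both copies, last writer wins) is a product decision the sketch leaves open.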
2865 | From Anony to Everyone: (7:52 PM) 2866 | Tom & Jerry 2867 | From Cory Wang to Everyone: (7:52 PM) 2868 | 😂 2869 | From Catherine zhang to Everyone: (7:52 PM) 2870 | Impressive, multi-tasking 2871 | From jun to Everyone: (7:53 PM) 2872 | multi operation on a single resource 2873 | From 应Jianghong to Everyone: (7:53 PM) 2874 | An MQ is certainly needed, but bringing it up without covering the service and storage architecture feels like a jump. 2875 | From Jerry to Everyone: (7:53 PM) 2876 | How do you resolve version conflicts? 2877 | From Anony to Everyone: (7:53 PM) 2878 | So there really is a Jerry 2879 | From Huimin Yang to Everyone: (7:54 PM) 2880 | ... 2881 | From Vivian huai to Everyone: (7:54 PM) 2882 | He gives the interviewer no chance to dig deeper. Whenever the interviewer asks something, he changes the subject. 2883 | From Shihao Zhong to Everyone: (7:54 PM) 2884 | Don't do this, bro 2885 | From christie Yu to Everyone: (7:54 PM) 2886 | Why discuss client connection options? 2887 | From emma to Everyone: (7:54 PM) 2888 | a walk through of a single flow is necessary 2889 | From christie Yu to Everyone: (7:54 PM) 2890 | Is this about editing the same file concurrently? 2891 | From Kd to Everyone: (7:54 PM) 2892 | https://www.youtube.com/watch?v=PE4gwstWhmc This one has no MQ, right? 2893 | From 老黄瓜 to Everyone: (7:54 PM) 2894 | The discussion is all details, without much high-level design. 2895 | From Tony Y to Everyone: (7:54 PM) 2896 | This is very realistic... My first Amazon interview went like this. I wasn't prepared, and Amazon gave me a year-and-a-half cooldown. 2897 | From emma to Everyone: (7:55 PM) 2898 | The discussion is all details, without much high-level design. +1 2899 | From Dingwen Chen to Everyone: (7:55 PM) 2900 | Seems like the MQ isn't needed, or at least it wasn't explained clearly. 2901 | From 应Jianghong to Everyone: (7:55 PM) 2902 | What about network latency? 2903 | From Hao Wu to Everyone: (7:55 PM) 2904 | I don't think a real interview needs this much detail. 2905 | From Li to Everyone: (7:55 PM) 2906 | “He gives the interviewer no chance to dig deeper. Whenever the interviewer asks something, he changes the subject.” +1 2907 | From Xinyu Zhang to Everyone: (7:55 PM) 2908 | FIFO? The order in which things land in the queue is also subject to network delay. 2909 | From Qi Wang to Everyone: (7:55 PM) 2910 | Why use WebSocket? Is there a scenario that needs it? 2911 | From jun to Everyone: (7:56 PM) 2912 | The interview do watch chats!
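2913 | From Ming to Everyone: (7:56 PM) 2914 | I think the detail is lacking too. For many things he just starts explaining and moves on before getting into detail. 2915 | From Tony Y to Everyone: (7:56 PM) 2916 | The interviewee shouldn't read the chat, it'll be distracting. 2917 | From iPhone to Everyone: (7:56 PM) 2918 | Permissions haven't been checked yet. Is the user even allowed to upload? 2919 | From ZZB to Everyone: (7:56 PM) 2920 | I think the permission/auth layer will not work in this design. It can not be parallel with other ops 2921 | From Catherine zhang to Everyone: (7:57 PM) 2922 | This problem is huge. There's no time to cover everything, so pick one feature, explain it clearly, then move on. 2923 | From Ping Lu to Everyone: (7:57 PM) 2924 | Isn't this system too big and complex? How do you decide what to cut so you can explain it clearly? System design questions like this feel hard to interview on. 2925 | From Kd to Everyone: (7:57 PM) 2926 | That's a good strategy 2927 | From Xinyu Zhang to Everyone: (7:57 PM) 2928 | Does this question need fanout? 2929 | From Yu Zheng to Everyone: (7:57 PM) 2930 | If you really want to dig into it, you probably could, but it's not necessary. 2931 | From xiaonan to Everyone: (7:57 PM) 2932 | Wouldn't the chunk check be easier to do on the client side? 2933 | From Huimin Yang to Everyone: (7:57 PM) 2934 | Ask the interviewer where they want to dive deep, then cover that part. 2935 | From Xinyu Zhang to Everyone: (7:57 PM) 2936 | Wouldn't pull mode work? 2937 | From ZZB to Everyone: (7:57 PM) 2938 | What is the MQ for?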
2939 | From Ming to Everyone: (7:58 PM) 2940 | The interviewer will guide you, just follow along. Most of the time, after the high-level design you pick one or two areas to discuss in depth. 2941 | From Vivian huai to Everyone: (7:58 PM) 2942 | He spread things very wide, but nothing was explained clearly. 2943 | From jun to Everyone: (7:58 PM) 2944 | +1 2945 | From First Last to Everyone: (7:58 PM) 2946 | The logic is very muddled 2947 | From Ping Lu to Everyone: (7:59 PM) 2948 | Some interviewers let you decide for yourself, 2949 | From Yu Zheng to Everyone: (7:59 PM) 2950 | Better to get an MVP first... the interviewer kept giving hints 2951 | From 老黄瓜 to Everyone: (7:59 PM) 2952 | He should walk through the core data flow and meet the user requirements first, then add a cache or anything else. 2953 | From Kevin Li to Everyone: (7:59 PM) 2954 | That's the third time the interviewer has said to run a use case... 2955 | From Cory Wang to Everyone: (7:59 PM) 2956 | +1 2957 | From 老黄瓜 to Everyone: (7:59 PM) 2958 | Satisfy the functional requirements first, then think about the non-functional ones. 2959 | From jun to Everyone: (7:59 PM) 2960 | The interviewer has done all he can 2961 | From Cory Wang to Everyone: (7:59 PM) 2962 | Agree with 老黄瓜 2963 | From Jerry to Everyone: (7:59 PM) 2964 | The MQ is probably for uploads that take too long. The other operations seem to need strong, timely consistency. 2965 | From Ken to Everyone: (8:00 PM) 2966 | meeting notes and QR code: https://docs.google.com/document/d/19WtV88EbH8_t5J8bxZ6tAbMJz0iO6EtsKetzun55UC8/edit# Time: 7:16-8:01 2967 | From Vivian huai to Everyone: (8:00 PM) 2968 | The interviewer should start managing the time 2969 | From First Last to Everyone: (8:00 PM) 2970 | Time's up! 2971 | From Neng Wang to Everyone: (8:00 PM) 2972 | Time's up, right? 2973 | From iPad to Everyone: (8:00 PM) 2974 | The interviewer has said ten thousand times to put down what you're doing and run through a use case 2975 | From iPhone to Everyone: (8:00 PM) 2976 | Too many terms. Are there two MQs? Sorry, I'm having trouble keeping up. 2977 | From Ken to Everyone: (8:00 PM) 2978 | 30 seconds.. 2979 | From jun to Everyone: (8:00 PM) 2980 | By the time of the upload, the commands in the MQ have already been executed 2981 | From Li to Everyone: (8:00 PM) 2982 | Total chaos now...... 2983 | From Yu Zheng to Everyone: (8:00 PM) 2984 | What a tragedy. Time's up and there's still nothing workable 2985 | From Huimin Yang to Everyone: (8:01 PM) 2986 | ... 2987 | From Li to Everyone: (8:01 PM) 2988 | Because even the simplest case doesn't work, he keeps adding and changing things, and the design has spun out of control 2989 | From ZZB to Everyone: (8:01 PM) 2990 | Why is there dedupe too? 2991 | From lw to Everyone: (8:01 PM) 2992 | Dedup isn't that important either. 2993 | From iPhone to Everyone: (8:01 PM) 2994 | Could the interviewer and interviewee walk everyone through this solution later?
2995 | From 老黄瓜 to Everyone: (8:01 PM) 2996 | Tom clearly has the knowledge. He'd do better after some mock practice on interview technique. 2997 | From jun to Everyone: (8:01 PM) 2998 | Feels like he memorized a pile of details 2999 | From Tony Y to Everyone: (8:01 PM) 3000 | dedupe is an optimization 3001 | From Vivian huai to Everyone: (8:01 PM) 3002 | He's bringing out every memorized talking point... 3003 | From Kd to Everyone: (8:01 PM) 3004 | I think Tom has read DDIA 3005 | From Vivian huai to Everyone: (8:02 PM) 3006 | What is DDIA? 3007 | From Xinyu Zhang to Everyone: (8:02 PM) 3008 | What kind of dup is this? Identical files? (Storage is cheap) 3009 | From lw to Everyone: (8:02 PM) 3010 | Details are OK, but only when the interviewer asks. The interviewer doesn't even want to hear this. 3011 | From Ping Lu to Everyone: (8:02 PM) 3012 | Reading it and being able to apply it is impressive 3013 | From kk to Everyone: (8:02 PM) 3014 | Could the interviewer talk us through a reasonable time allocation later? 3015 | From iphone to Everyone: (8:02 PM) 3016 | Dedupe: remove duplicate 3017 | From Jerry to Everyone: (8:02 PM) 3018 | Can a Merkle hash tree be used for dedup? 3019 | From Vivian huai to Everyone: (8:03 PM) 3020 | May I ask what DDIA is? 3021 | From Selena to Everyone: (8:03 PM) 3022 | The one with the wild boar on the cover 3023 | From Catherine zhang to Everyone: (8:03 PM) 3024 | There are many ways to dedup. It depends on the use case. 3025 | From Jerry to Everyone: (8:03 PM) 3026 | The book Designing Data-Intensive Applications (DDIA) 3027 | From Vivian huai to Everyone: (8:03 PM) 3028 | Thanks 3029 | From jun to Everyone: (8:04 PM) 3030 | Could you share the slides? Thanks 3031 | From Neng Wang to Everyone: (8:04 PM) 3032 | Didn't receive the survey 3033 | From Vivian huai to Everyone: (8:04 PM) 3034 | Didn't receive the survey 3035 | From 老黄瓜 to Everyone: (8:04 PM) 3036 | Didn't receive it either 3037 | From 老黄瓜 to Everyone: (8:06 PM) 3038 | Got the hard skill survey. 3039 | From Vivian huai to Everyone: (8:07 PM) 3040 | +1 3041 | From Qi Wang to Everyone: (8:07 PM) 3042 | If you didn't get the survey, it's because your Zoom client isn't on the latest version 3043 | From iPad to Everyone: (8:07 PM) 3044 | On iPad, it seems the poll doesn't pop up while the chat is open 3045 | From Neng Wang to Everyone: (8:07 PM) 3046 | I got it this time too 3047 | From 应Jianghong to Everyone: (8:07 PM) 3048 | Next time the question should be designing a Zoom poll system 3049 | From iPhone to Everyone: (8:07 PM) 3050 | Could the interviewer and interviewee walk everyone through this solution? Thanks 3051 | From 老黄瓜 to Everyone: (8:08 PM) 3052 | Haha, next time let's do design a Zoom poll feature
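3053 | From Ken to Everyone: (8:08 PM) 3054 | https://www.designclub.mingdaoschool.com/popular-interview.html 3055 | From Xinyu Zhang to Everyone: (8:08 PM) 3056 | That's a fanout lol 3057 | From tom to Everyone: (8:08 PM) 3058 | tinder system design 3059 | From 老黄瓜 to Everyone: (8:09 PM) 3060 | @tom Tinder system design isn't simple 😅 3061 | From HW to Everyone: (8:09 PM) 3062 | Will there be machine learning system design? 3063 | From ZZB to Everyone: (8:10 PM) 3064 | I can not see the screen. Now I can 3065 | From ggg to Everyone: (8:11 PM) 3066 | The question is a bit complex. Too general. As I recall, Tinder needs geofencing and also a distribution mechanism. There's a simple Tinder mock on YouTube. Isn't Tinder's main problem the recommendation system? It depends on what exactly the interviewer asks... I've never used Tinder. Can you sign up under a fake name? The interviewer is pretty nice 3067 | From Catherine zhang to Everyone: (8:14 PM) 3068 | Both 3069 | From Xinyu Zhang to Everyone: (8:14 PM) 3070 | Both 3071 | From Esther to Everyone: (8:15 PM) 3072 | Don't interrupt +1 3073 | From Yu Zheng to Everyone: (8:15 PM) 3074 | If hints are given and not taken, there's nothing more you can do... 3075 | From Kd to Everyone: (8:15 PM) 3076 | As an interviewee I'd definitely want the interviewer to interrupt me, but the interviewers I've met were all so nice that they let me go further and further down the wrong path 3077 | From Cory Wang to Everyone: (8:15 PM) 3078 | Don't interrupt, then give a no hire at the end? 3079 | From Huimin Yang to Everyone: (8:15 PM) 3080 | I also think interrupting when appropriate is better 3081 | From jun to Everyone: (8:15 PM) 3082 | That's killing with kindness 3083 | From Huimin Yang to Everyone: (8:16 PM) 3084 | Otherwise the whole thing goes off the rails by the end 3085 | From jun to Everyone: (8:16 PM) 3086 | And interrupting is killing with a stick 3087 | From 老黄瓜 to Everyone: (8:16 PM) 3088 | Tinder: 1. Discover people nearby 2. Each user has a profile page and can upload photos 3. Matched users can start a (real-time) conversation 4. Users can unmatch 3089 | From Cory Wang to Everyone: (8:16 PM) 3090 | Not interrupting is killing with kindness, interrupting is killing with a stick 😂 3091 | From 应Jianghong to Everyone: (8:16 PM) 3092 | ???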
3093 | From emma to Everyone: (8:16 PM) 3094 | omg 3095 | From Qi Wang to Everyone: (8:16 PM) 3096 | The interviewer is right on this point, 3097 | From Xinyu Zhang to Everyone: (8:16 PM) 3098 | It's better to draw them separately in the diagram 3099 | From iPhone to Everyone: (8:17 PM) 3100 | I think the interviewer meant that drawing them in two boxes would be clearer 3101 | From Ming to Everyone: (8:17 PM) 3102 | They're always kept separate 3103 | From Yu Zheng to Everyone: (8:17 PM) 3104 | Every mock interview draws these separately, right? 3105 | From 1705081 Shimingyi Chen to Everyone: (8:17 PM) 3106 | The DB and file storage are always separate, right? 3107 | From Qi Wang to Everyone: (8:17 PM) 3108 | Logically, cloud file storage and the DB are not the same thing 3109 | From Cory Wang to Everyone: (8:18 PM) 3110 | If I don't interrupt but the design isn't what I wanted, is that a hire or a no hire? 3111 | From iphone to Everyone: (8:18 PM) 3112 | The purpose of interview is to please the interviewer(boss) 3113 | From Cory Wang to Everyone: (8:18 PM) 3114 | +1 3115 | From Yu Zheng to Everyone: (8:18 PM) 3116 | and provide enough signal for hire.. 3117 | From emma to Everyone: (8:19 PM) 3118 | Logically, cloud file storage and the DB are not the same thing +1 3119 | From Cory Wang to Everyone: (8:19 PM) 3120 | 😂 3121 | From NL to Everyone: (8:19 PM) 3122 | Interviewers who are willing to interrupt mean well 3123 | From Vivian huai to Everyone: (8:19 PM) 3124 | You can't control this about interviewers. Everyone has a different style. 3125 | From Catherine zhang to Everyone: (8:19 PM) 3126 | Yes 3127 | From Jenny Xu to Everyone: (8:19 PM) 3128 | An interviewer gradually going quiet is not a good sign 3129 | From Cory Wang to Everyone: (8:20 PM) 3130 | The interviewer quietly opens their laptop and starts doing their own work 3131 | From Tony Y to Everyone: (8:20 PM) 3132 | Yeah, could go straight to discussing sync 3133 | From 老黄瓜 to Everyone: (8:20 PM) 3134 | Agree with what @panfeng said, spot on 3135 | From 应Jianghong to Everyone: (8:20 PM) 3136 | At the service level, one service might give you both the metadata and the files, but that absolutely doesn't mean the DB and file storage are one logical component 3137 | From Cory Wang to Everyone: (8:21 PM) 3138 | Hahaha 3139 | From Xinyu Zhang to Everyone: (8:21 PM) 3140 | If it's unclear, ask the interviewer +1 3141 | From Jenny Xu to Everyone: (8:22 PM) 3142 | +1 3143 | From iPhone to Everyone: (8:22 PM) 3144 | Ask the interviewer +1 3145 | From Yu Zheng to Everyone: (8:22 PM) 3146 | Don't challenge the interviewer... 3147 | From NL to Everyone: (8:22 PM) 3148 | The interviewee is super confident 👍 3149 | From Vivian
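huai to Everyone: (8:22 PM) 3150 | Don't challenge the interviewer +1 3151 | From Ming to Everyone: (8:22 PM) 3152 | They're always kept separate. The interviewee isn't showing the interviewer much respect. We ran into this once: an immediate pass. 3153 | From Cory Wang to Everyone: (8:22 PM) 3154 | Don't challenge the interviewer +1 3155 | From yao yao to Everyone: (8:22 PM) 3156 | Then you might be a super interviewee... 3157 | From Qi Wang to Everyone: (8:22 PM) 3158 | It's perfectly normal, and kind, for the interviewer to ask about sync. 3159 | From lw to Everyone: (8:23 PM) 3160 | Sync is the point being tested. How could he not ask? 3161 | From Li to Everyone: (8:23 PM) 3162 | “The interviewee isn't showing the interviewer much respect. We ran into this once: an immediate pass.” +1 3163 | From Qi Wang to Everyone: (8:23 PM) 3164 | Sync, or write locking, is one of the key points this question tests 3165 | From lw to Everyone: (8:23 PM) 3166 | Things like dedupe are minor. 3167 | From First Last to Everyone: (8:24 PM) 3168 | “The interviewee isn't showing the interviewer much respect. We ran into this once: an immediate pass.” +1 3169 | From emma to Everyone: (8:24 PM) 3170 | “The interviewee isn't showing the interviewer much respect. We ran into this once: an immediate pass.” +1 3171 | From Esther to Everyone: (8:24 PM) 3172 | “The interviewee isn't showing the interviewer much respect. We ran into this once: an immediate pass.” +1 3173 | From Xinyu Zhang to Everyone: (8:24 PM) 3174 | Does "pass" mean the candidate passed? 3175 | From Catherine zhang to Everyone: (8:24 PM) 3176 | At our company, someone who can't follow hints counts as not a team player lol 3177 | From lw to Everyone: (8:24 PM) 3178 | "Pass" as in next candidate. 3179 | From Cory Wang to Everyone: (8:24 PM) 3180 | "Pass" means the candidate was failed 3181 | From jun to Everyone: (8:24 PM) 3182 | So funny 3183 | From Yu Zheng to Everyone: (8:24 PM) 3184 | pass candidate, next one lol 3185 | From Xinyu Zhang to Everyone: (8:24 PM) 3186 | lol 3187 | From Ming to Everyone: (8:24 PM) 3188 | A fail, of course. Didn't even bother with feedback 3189 | From First Last to Everyone: (8:24 PM) 3190 | "Pass" as in pass over: ignore and fail!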
3191 | From Xinyu Zhang to Everyone: (8:25 PM) 3192 | I've spent too long on 1point3acres, where "pass" in interview write-ups means you got through. 3193 | From lw to Everyone: (8:25 PM) 3194 | He was reminded, right? 3195 | From Esther to Everyone: (8:26 PM) 3196 | He was reminded, right? +1 3197 | From Jenny Xu to Everyone: (8:26 PM) 3198 | Reminded quite a few times 😂 3199 | From Cory Wang to Everyone: (8:26 PM) 3200 | He was reminded +1 3201 | From lw to Everyone: (8:26 PM) 3202 | The main issue is there's no time. You have to go by what the interviewer wants to test. 3203 | From Xinyu Zhang to Everyone: (8:26 PM) 3204 | Also, after all that talk about QPS he never gave a concrete order of magnitude, such as how many thousand, and in the end it wasn't used for the API or DB choices either 3205 | From Jenny Xu to Everyone: (8:27 PM) 3206 | The interviewee is just like me job-hunting a few years ago... completely carried away with himself 3207 | From iPhone to Everyone: (8:27 PM) 3208 | Designing the MVP first is the safer approach 3209 | From Kd to Everyone: (8:27 PM) 3210 | What is MVP? 3211 | From Xinyu Zhang to Everyone: (8:27 PM) 3212 | Could the interviewer please walk us through an answer? 3213 | From Jenny Xu to Everyone: (8:27 PM) 3214 | Minimum Viable Product 3215 | From Vivian huai to Everyone: (8:27 PM) 3216 | Can we discuss how this question should be answered? 3217 | From lw to Everyone: (8:28 PM) 3218 | Even if you tell me 9999, I can't quantify it. 3219 | From jun to Everyone: (8:28 PM) 3220 | Covering this question alone could take 100 hours 3221 | From lw to Everyone: (8:28 PM) 3222 | Right. 3223 | From Shihao Zhong to Everyone: (8:28 PM) 3224 | I have the same question 3225 | From emma to Everyone: (8:29 PM) 3226 | I think the interviewer could be more assertive 3227 | From Vivian huai to Everyone: (8:29 PM) 3228 | +1 3229 | From Huimin Yang to Everyone: (8:29 PM) 3230 | Is the L4 mentioned here Google's L4? 3231 | From Cory Wang to Everyone: (8:29 PM) 3232 | Being an interviewer isn't easy either 3233 | From 老黄瓜 to Everyone: (8:29 PM) 3234 | Doesn't L4 skip system design? 😂 3235 | From lw to Everyone: (8:30 PM) 3236 | Is there any system that doesn't need availability? It mainly depends on whether the interviewer tests it. 3237 | From Panfeng Xue to Everyone: (8:30 PM) 3238 | L4 gets tested on it too 3239 | From ZZ to Everyone: (8:30 PM) 3240 | It is tested 3241 | From Catherine zhang to Everyone: (8:30 PM) 3242 | It'll be simpler 3243 | From 201703005 Di Ha to Everyone: (8:30 PM) 3244 | Only Google's L4 skips SD. At other companies, mid and senior levels are all tested on SD 3245 | From Yue to Everyone: (8:31 PM) 3246 | Are MLEs tested on SD? 3247 | From Yu Zheng to Everyone: (8:32 PM) 3248 | Yes, just with a different emphasis 3249 | From Catherine zhang to Everyone: (8:32 PM) 3250 | They're tested on ML SD 3251 | From lw to Everyone: (8:33 PM) 3252 | kafka
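consumer group? 3253 | From Tony Y to Everyone: (8:33 PM) 3254 | The book seems to have been written before Kafka existed 3255 | From HW to Everyone: (8:33 PM) 3256 | I'd actually like to learn about ML SD interviews. I wonder whether Mingdao has any plans for that 3257 | From Shihao Zhong to Everyone: (8:33 PM) 3258 | I feel this has a consistency problem 3259 | From jun to Everyone: (8:33 PM) 3260 | That can't be helped 3261 | From Tony Y to Everyone: (8:34 PM) 3262 | It says each user needs their own queue. When I read it, I thought that wasn't necessarily optimal 3263 | From Xinyu Zhang to Everyone: (8:34 PM) 3264 | How do you handle different clients writing at the same time? 3265 | From emma to Everyone: (8:34 PM) 3266 | I think Dropbox uses long polling 3267 | From Yu Zheng to Everyone: (8:34 PM) 3268 | Google Drive itself actually has quite a few unresolved consistency issues 3269 | From Tony Y to Everyone: (8:34 PM) 3270 | The book doesn't cover the consistency problem either. My idea is that a conflict creates a new branch, treated as a copy of the file 3271 | From Jerry to Everyone: (8:37 PM) 3272 | Should it be: create the metadata first, then upload the file chunks, and finally update the metadata on completion? 3273 | From 老黄瓜 to Everyone: (8:37 PM) 3274 | I think you need checksums, and a per-file status in the metadata, so that the metadata status is never complete while chunks are missing 3275 | From Jerry to Everyone: (8:38 PM) 3276 | Does Drive count as object storage? 3277 | From 老黄瓜 to Everyone: (8:38 PM) 3278 | For example: file_123, in-progress, 5/8, while also maintaining a mapping file_123 -> {chunk_123_0, …} 3279 | From Li to Everyone: (8:38 PM) 3280 | “Should it be: create the metadata first, then upload the file chunks, and finally update the metadata on completion?” +1 3281 | From Tekken to Everyone: (8:38 PM) 3282 | https://whimsical.com/google-drive-BBsXFU8DQX7tp9CMyTXASs 3283 | From Jerry to Everyone: (8:39 PM) 3284 | If it can't finish in one go, you also need resumable upload 3285 | From Kd to Everyone: (8:39 PM) 3286 | If the upload fails, can 3287 | From jun to Everyone: (8:40 PM) 3288 | Is it stored contiguously on disk? The client has to store metadata locally, then compare which chunk was changed. The chunker does the splitting. Having a chunker in the client just improves performance, but it's not a basic necessity. Right, it's probably not part of the MVP. What does workspace mean in the diagram? Agreed, you can assume users can only upload files <10MB and then finish the MVP. @Ming, well said. It gets blocked at the front end 3289 | From iPhone to Everyone: (8:42 PM) 3290 | You definitely hit the Metadata Service first. Don't permissions have to be checked before writing to the File Server? 3291 | From Qi Wang to Everyone: (8:43 PM) 3292 | I think the basic idea of this diagram is fine. 3293 | From Tony Y to Everyone: (8:44 PM) 3294 | Duplicate detection may be time-consuming, so store to cloud storage first and dedupe afterward?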
3295 | From Xinyu Zhang to Everyone: (8:44 PM) 3296 | Curious about syncing from cloud to local: assuming the download is slow, the client stores the file first and then updates the local DB, right? 3297 | From jun to Everyone: (8:45 PM) 3298 | I think this is getting too deep into details 3299 | From Shihao Zhong to Everyone: (8:45 PM) 3300 | I think this is getting too deep into details 3301 | +1 3302 | From Randy to Everyone: (8:45 PM) 3303 | “Should it be: create the metadata first, then upload the file chunks, and finally update the metadata on completion?” +1 3304 | The client first POSTs to https://FQDN/files/upload 3305 | and gets a short-lived storage URI, then the client splits the content into chunks and uploads them to the storage URI one by one 3306 | When done, it sets the metadata to upload finished 3307 | From iPhone to Everyone: (8:45 PM) 3308 | Install a browser extension, I guess 3309 | From Shihao Zhong to Everyone: (8:45 PM) 3310 | @Randy and it's even an idempotent operation 3311 | From Catherine zhang to Everyone: (8:46 PM) 3312 | Agree with @randy's solution 3313 | From lw to Everyone: (8:46 PM) 3314 | That falls under collaborative editing. 3315 | From Qi Wang to Everyone: (8:46 PM) 3316 | When gathering requirements at the start, don't bring up merge conflicts. That's a trap 3317 | From 老黄瓜 to Everyone: (8:46 PM) 3318 | This could be optimized a little to support out-of-order, parallel, batched uploads 3319 | From iphone to Everyone: (8:46 PM) 3320 | Google drive and google doc is a different topic 3321 | From Qi Wang to Everyone: (8:46 PM) 3322 | Just add a write lock and you're done 3323 | From emma to Everyone: (8:47 PM) 3324 | What is workspace? 3325 | From Qi Wang to Everyone: (8:47 PM) 3326 | Keep file updates atomic 3327 | From iPhone to Everyone: (8:47 PM) 3328 | Just don't allow the edit 3329 | From Xinyu Zhang to Everyone: (8:47 PM) 3330 | But suppose one user saved a very important file and another user saved a useless file with the same name, and the update gets locked 3331 | From tom to Everyone: (8:47 PM) 3332 | Write locks are too slow 3333 | From Cory Wang to Everyone: (8:47 PM) 3334 | 😂 3335 | From Qi Wang to Everyone: (8:47 PM) 3336 | Why would a write lock be slow? It's not a global write lock. The write lock here is logical, not a database write lock. 3337 | From HW to Everyone: (8:48 PM) 3338 | This is actually a good place to discuss tradeoffs 3339 | From jun to Everyone: (8:49 PM) 3340 | +1 3341 | From Shihao Zhong to Everyone: (8:49 PM) 3342 | Just upload it as a new version and let the user decide which version is valid 3343 | From Tony Y to Everyone: (8:49 PM) 3344 | Locks don't help with offline sync, do they? 3345 | From jun to Everyone: (8:49 PM) 3346 | Feels like tech experts debating a design 3347 | From tom to Everyone: (8:49 PM) 3348 | version diff conflict resolution + eventual consistency, roll back when the conflict can't be resolved 3349 | From Kun Zhang to
Everyone: (8:49 PM) 3350 | What is the data model for the per line lock? 3351 | From Jerry to Everyone: (8:49 PM) 3352 | Then that's just GitHub, haha 3353 | From 应Jianghong to Everyone: (8:50 PM) 3354 | Seen this way, putting the metadata in front actually makes the locking problem easier to solve. Or pull the lock metadata out separately and put it in front 3355 | From iPhone to Everyone: (8:51 PM) 3356 | In HDFS the client chunks first, then consults the metadata on the NameNode, then uploads to or downloads from the DataNodes the NameNode assigns 3357 | From Amity to Everyone: (8:51 PM) 3358 | 😂, Merge, solve conflict before allow update. 3359 | From Xinyu Zhang to Everyone: (8:51 PM) 3360 | Curious: what if the local file is manually modified while it's being downloaded from the cloud? What's the best way to handle that locally? 3361 | From Tony Y to Everyone: (8:51 PM) 3362 | start a new branch will solve conflict 3363 | From Panfeng Xue to Everyone: (8:52 PM) 3364 | merge conflict 3365 | From ZZB to Everyone: (8:54 PM) 3366 | Is Long polling ok for pulling newest meta data (and download …) 3367 | From Qi Wang to Everyone: (8:54 PM) 3368 | No, it's that when one of your clients adds a new file, that file's info needs to be synced to the other connected clients. 3369 | From 老黄瓜 to Everyone: (8:55 PM) 3370 | I also feel push isn't necessary. It could introduce race conditions, and pulling the same file concurrently would need version isolation 3371 | From Ken to Everyone: (8:58 PM) 3372 | Thanks everyone for coming. If you want to join our WeChat group, the QRCode is here on top of the doc: https://docs.google.com/document/d/19WtV88EbH8_t5J8bxZ6tAbMJz0iO6EtsKetzun55UC8/edit# 3373 | From Xinyu Zhang to Everyone: (8:58 PM) 3374 | Isn't that still updating after the upload completes? 3375 | From 老黄瓜 to Everyone: (9:01 PM) 3376 | That would cause consistency issues. You need to guarantee that either all chunks are updated or none are 3377 | From iPhone to Everyone: (9:01 PM) 3378 | The metadata contains permission info. You might not have permission to write to a given URL directory in Cloud Storage at all 3379 | From 应Jianghong to Everyone: (9:02 PM) 3380 | Done this way, how do you solve multi-client sync? 3381 | From 老黄瓜 to Everyone: (9:02 PM) 3382 | No need to update for every chunk. Wait until the metadata is complete. At worst the E2E latency is a bit higher 3383 | From Qi to Everyone: (9:02 PM) 3384 | File uploads can't go through the LB. Usually the system hands out a URL and the upload goes directly to S3 or an equivalent object store 3385 | From Ken to Everyone: (9:04 PM) 3386 | Thanks everyone for coming.
If you want to join our WeChat group, the QRCode is here on top of the doc: https://docs.google.com/document/d/19WtV88EbH8_t5J8bxZ6tAbMJz0iO6EtsKetzun55UC8/edit# 3387 | From Xinyu Zhang to Everyone: (9:04 PM) 3388 | If the metadata is missing, just start over. Storage is cheap anyway 3389 | From 老黄瓜 to Everyone: (9:05 PM) 3390 | As long as the metadata DB guarantees transactions, guaranteeing ALO message delivery is not a problem 3391 | From Jingru Li to Everyone: (9:05 PM) 3392 | Thank you so much for hosting this session! :) :) :) 3393 | From Ken to Everyone: (9:06 PM) 3394 | Thanks, everyone! 3395 | From ZZB to Everyone: (9:07 PM) 3396 | Does the metadata include chunk info? 3397 | From Vivian huai to Everyone: (9:08 PM) 3398 | The notification server is mainly for syncing across different users, right? 3399 | From Kai Sun to Everyone: (9:09 PM) 3400 | Thank you so much for hosting this session 🙏🏼 Will the recording be published afterward? 3401 | From 老黄瓜 to Everyone: (9:09 PM) 3402 | I think it works fine without notifications too 3403 | From Ken to Everyone: (9:10 PM) 3404 | There are currently no plans to publish the recording, @KS. All the notes can be found at designclub.mingdaoschool.com. 3405 | From Vivian huai to Everyone: (9:10 PM) 3406 | What is the notification server mainly for? 3407 | From Kai Sun to Everyone: (9:10 PM) 3408 | Thanks Ken 🙏🏼 3409 | From 老黄瓜 to Everyone: (9:12 PM) 3410 | It can do lazy updates 3411 | From Zhengguan Li to Everyone: (9:13 PM) 3412 | Isn't getting the latest just a full scan? 3413 | From 老黄瓜 to Everyone: (9:13 PM) 3414 | No need for a full scan on every pull 3415 | From Zhengguan Li to Everyone: (9:13 PM) 3416 | Notifications only carry what changed 3417 | From Tiger to Everyone: (9:15 PM) 3418 | If a persistent connection is possible, why can't the client just pull the status every few seconds? Isn't that similar? 3419 | From lw to Everyone: (9:15 PM) 3420 | Every pull needs a response. Long polling doesn't have to respond every time. 3421 | From Zhengguan Li to Everyone: (9:16 PM) 3422 | SSE uses the HTTP protocol, which all existing server software supports. WebSocket is a separate protocol. - https://www.bookstack.cn/read/webapi-tutorial/spilt.2.docs-server-sent-events.md 3423 | From Ken to Everyone: (9:16 PM) 3424 | https://mint-lillipilli-1b9.notion.site/Comparison-for-4-methods-for-asynchronous-request-and-response-over-HTTP-a372edb5b8614d63ba115fa7156187af 3425 | From Guest to Everyone: (9:17 PM) 3426 | Push is implemented by long polling?
3427 | From emma to Everyone: (9:17 PM) 3428 | np 3429 | From Zhengguan Li to Everyone: (9:20 PM) 3430 | Can a virtual machine have more ports added? 3431 | From Eric Haung to Everyone: (9:21 PM) 3432 | What does LP stand for? 3433 | From Guest to Everyone: (9:25 PM) 3434 | Push is also http 3435 | From emma to Everyone: (9:30 PM) 3436 | strong hire 3437 | From Ming to Everyone: (9:30 PM) 3438 | Good enough for L5 too 3439 | From SmallCracker to Everyone: (9:58 PM) 3440 | What tool is this diagram drawn with? 3441 | From Tekken to Everyone: (9:58 PM) 3442 | whimsical 3443 | From SmallCracker to Everyone: (9:59 PM) 3444 | Thanks! 3445 | From Peace to Everyone: (10:05 PM) 3446 | Where can I watch recordings of past mock interviews? 3447 | From Yao Xiao to Everyone: (10:07 PM) 3448 | The host has probably gone offline 3449 | From Ping Lu to Everyone: (10:07 PM) 3450 | Thanks! 3451 | From SmallCracker to Everyone: (10:07 PM) 3452 | Thanks! Good night 3453 | From Yao Xiao to Everyone: (10:07 PM) 3454 | Let's just disperse on our own 3455 | From Yang Bai to Everyone: (10:07 PM) 3456 | 👍 3457 | From Yao Xiao to Everyone: (10:07 PM) 3458 | Good night 3459 | From Tekken to Everyone: (10:07 PM) 3460 | https://whimsical.com/google-drive-BBsXFU8DQX7tp9CMyTXASs -------------------------------------------------------------------------------- /career/presentations.md: -------------------------------------------------------------------------------- 1 | - [Communication](#communication) 2 | - [Public speaking](#public-speaking) 3 | - [Nonverbal communication](#nonverbal-communication) 4 | - [Conflict management](#conflict-management) 5 | - [Phychology](#phychology) 6 | - [Tech](#tech) 7 | - [Information security](#information-security) 8 | - [Messaging](#messaging) 9 | - [Migration](#migration) 10 | 11 | # Communication 12 | ## Public speaking 13 | * [Deal with Stage Fright](https://docs.google.com/presentation/d/1gUTG_TimlRaxrjKyZuXKRoW6PIkKuWAvfh0_0nEJdzQ/edit?usp=sharing) 14 | * [Concise communication](https://docs.google.com/presentation/d/18UxFs9fwyx1mdlcHSLiScxqX6PSQn6I_BbymYwsb-VA/edit?usp=sharing) 15 | 16 | ## Nonverbal communication 17 | * [Gestures in
presentation](https://docs.google.com/presentation/d/1dXOwDSuaVHY33dRAp3Rd0Z-SRHwpstcuUMCczjFRBAs/edit?usp=sharing) 18 | 19 | ## Conflict management 20 | * [Crucial Conversations](https://docs.google.com/presentation/d/1JRJThU1g_dqQ9xWhCRmIvp1-5vFl7E5QB0awUVbRUhY/edit#slide=id.p) 21 | 22 | ## Phychology 23 | * [The Psychology of Procrastination](https://docs.google.com/presentation/d/1Ur8_y3Egzg4c0UipQ1OYe7Mm5vCTH6tp/edit?usp=sharing&ouid=117883747511694701500&rtpof=true&sd=true) 24 | 25 | # Tech 26 | ## Information security 27 | * [Keyboard and covert channels](https://www.slideshare.net/ShijieZhang2/keyboard-covert-channels) 28 | * [Searchable encryption: Path Oram](https://www.slideshare.net/ShijieZhang2/path-oram) 29 | * [2014 Finra Intern project: Leveraging Splunk to Manage AWS Access](https://www.slideshare.net/ShijieZhang2/aws-64257730) 30 | 31 | ## Messaging 32 | * [Design multi-user group chat](https://docs.google.com/presentation/d/1USZsFZDCY9kUosPrSSI4WaDN4koqe801p0MjPV_1n5U/edit?usp=sharing) 33 | 34 | ## Migration 35 | * [Auth library migration](https://docs.google.com/presentation/d/1MpL-0gpj26kJaeYDu2-QH0z3wNSU0a58tI9oKb9dLUw/edit?usp=sharing) 36 | 37 | -------------------------------------------------------------------------------- /career/publications.md: -------------------------------------------------------------------------------- 1 | * [A novel segmentation based video-denoising method with noise level estimation](https://www.sciencedirect.com/science/article/abs/pii/S0020025514005830?via%3Dihub) 2 | * Information Sciences, Volume 281, 10 October 2014, Pages 507-520 3 | * Yang Cao, Shijie Zhang, Zheng-Jun Zha, Jing Zhang, Chang Wen Chen 4 | * [Underwater stereo image enhancement using a new physical model](https://ieeexplore.ieee.org/document/7026097) 5 | * 2014 IEEE International Conference on Image Processing (ICIP) 6 | * Shijie Zhang; Jing Zhang; Shuai Fang; Yang Cao 7 | * [Automatic tag saliency ranking for stereo
images](https://www.sciencedirect.com/science/article/abs/pii/S0925231215006049) 8 | * Neurocomputing, Volume 172, 8 January 2016, Pages 9-18 9 | * Yang Cao, Kai Kang,Shijie Zhang, Jing Zhang, Zengfu Wang 10 | -------------------------------------------------------------------------------- /career/resume_finraintern.md: -------------------------------------------------------------------------------- 1 | --- 2 | description: A role based access control system on AWS 3 | --- 4 | 5 | # SWE Intern @ Finra 14 Summer 6 | 7 | ## Featured content 8 | 9 | * [SlideShare link: Intern project presentation](https://www.slideshare.net/ShijieZhang2/aws-64257730) 10 | * [Presentation link: Official product based on intern project: Leveraging Splunk to Manage Your AWS Environment](http://technology.finra.org/articles/video/using-splunk-to-manage-aws.html) 11 | * [News: FINRA takes control of its big data challenges with Splunk Cloud](http://diginomica.com/2014/10/07/finra-takes-control-big-data-challenges-splunk-cloud/) 12 | 13 | ### Intern project 14 | 15 | ![](../.gitbook/assets/image.png) 16 | 17 | ### Final product 18 | 19 | ![](<../.gitbook/assets/image (1).png>) 20 | 21 | ## Background 22 | 23 | ### FINRA 24 | 25 | #### Def 26 | 27 | * FINRA, one of the largest independent securities regulators in the United States, was established to monitor and regulate financial trading practices. FINRA protects investors by regulating brokers and brokerage firms by monitoring trading on U.S. stock markets. 28 | * The regulatory body is responsible for all of the data that relates to trades occurring in the United States. To put that into perspective, this equates to dealing with more data than either Twitter or Visa heft on a daily basis. We are talking petabytes of data. 29 | 30 | #### Company Missions 31 | 32 | * FINRA watches over 6 billion shares traded on the stock market each day (Or up to 75 billion events per day, over 20 petabytes of storage). 
FINRA uses this large volume of monitored data to: 33 | * Reconstruct the market from trillions of events 34 | * Data from broker-dealers and exchanges 35 | * Equities, Options, Fixed Income 36 | * Build a graph of market order events 37 | * Analyze the data looking for financial fraud 38 | * Insider trading, layering, cross-product manipulation, front running & many more 39 | * Looking for a needle in a haystack 40 | 41 | #### Company Visions 42 | 43 | * To respond to rapidly changing market dynamics, FINRA moved about 75 percent of its operations to Amazon Web Services, using AWS to capture, analyze, and store a daily influx of 75 billion records. The company estimates it will save up to $20 million annually by using AWS instead of a physical data center infrastructure. 44 | * FINRA has a goal to move its on-premises data center to the cloud within 5 years. AWS is currently FINRA's primary cloud provider. 45 | 46 | #### FINRA and AWS 47 | 48 | * AWS services used by FINRA 49 | * Data management 50 | * Amazon RDS 51 | * Amazon S3 52 | * Amazon Glacier 53 | * Amazon EC2 54 | * Data analytics 55 | * Machine learning 56 | * Presto 57 | * Amazon EMR 58 | * Amazon Redshift 59 | * Hive 60 | * Usage statistics 61 | * 30k+ EC2 nodes per day. 93% of EC2 usage is EMR based 62 | * 20 PB+ storage (Amazon S3, Amazon Glacier) 63 | * 60% PROD, 25% QC/UAT, 15% DEV 64 | * Node lifecycles 65 | * 50%: Under 2h 66 | * 35%: 2h to 5h 67 | * 15%: Over 5h 68 | 69 | ### Team 70 | 71 | * Our team is responsible for monitoring all security-related events that happen inside FINRA. The team uses Splunk, AWS EMR, AWS IAM, and AWS CloudTrail. Sample projects in our team include 72 | 73 | #### AWS shared responsibility 74 | 75 | **AWS responsibility - Security of the cloud** 76 | 77 | * Protecting the network through automated monitoring systems and robust internet access to prevent distributed denial of service (DDoS) attacks.
78 | * Performing background checks on employees who have access to sensitive areas. 79 | * Decommissioning storage services by physically destroying them after end of life. 80 | * Ensuring physical and environmental security of data centers, including fire protection and security staff. 81 | 82 | **Your responsibility - Security in the cloud** 83 | 84 | * Encrypting network traffic to prevent attackers from reading or manipulating data (for example, HTTPS) 85 | * Configuring a firewall for your virtual private network that controls incoming and outgoing traffic with security groups and ACLs. 86 | * Managing patches for the OS and additional software on virtual servers. 87 | * Implementing access management with IAM that restricts access to AWS resources like S3 and EC2 to the minimum necessary. 88 | 89 | #### Splunk cloud 90 | 91 | * [AWS Splunk Cloud](http://diginomica.com/2014/10/07/finra-takes-control-big-data-challenges-splunk-cloud/) 92 | * We were one of the first people to use Splunk Cloud. Our company in general has a good presence in cloud and we are one of the bigger customers of AWS. We wanted to build a SIEM, but we needed huge amounts of data storage and data volume storage, and if we were going to do it on premise then we would need to buy and manage this box and that box – it ends up that you spend a greater amount of time on maintenance of these things than you do adding value. We thought it was a good idea to offload that to Splunk – we are happy with the cloud. Let them own the base layer and we will start adding value on top of that. With cloud I can put an exact dollar amount on data. I can say that it is going to cost us $10,000 a year to give you that information – is it worth it? It makes those type of decisions a little bit easier.
93 | 94 | ### Project 95 | 96 | * The purpose of this project is as the first step in building a security 97 | 98 | ## Technical components 99 | 100 | ### IAM 101 | 102 | #### Def 103 | 104 | * The Identity and Access Management (IAM) service provides granular control over access to your AWS account. 105 | * Users and Groups 106 | * Unique security credentials 107 | * Temporary security credentials 108 | * Policies & permissions 109 | * Roles 110 | * Multi-factor authentication 111 | 112 | #### Access control policy 113 | 114 | **Def** 115 | 116 | * You can grant or deny access by defining: 117 | * Who can access your resources 118 | * What actions they can take 119 | * Which resources they can access 120 | * How they can access your resources 121 | * The policy language is about authorization. 122 | 123 | **Structure** 124 | 125 | * Policies are JSON-formatted documents containing statements that specify 126 | * What actions a principal can perform 127 | * Which resources can be accessed 128 | 129 | ```json 130 | { 131 | "Statement": [ 132 | { 133 | "Effect": "Allow", 134 | "Action": ["s3:Get*", "s3:List*"], 135 | "Resource": "*" 136 | } 137 | ] 138 | } 139 | ``` 140 | 141 | **Principals** 142 | 143 | * A principal is an entity that is allowed or denied access to a resource. 144 | * The Principal element is required for resource-based policies. 145 | 146 | **Actions** 147 | 148 | * Actions describe the type of access that should be allowed or denied. 149 | * Statements must include either an Action or NotAction element. 150 | 151 | **Resources** 152 | 153 | * Resources describe objects that are being requested. 154 | * Statements must include either a Resource or a NotResource element. 155 | 156 | **Condition** 157 | 158 | * Allows a user to access a resource under the following conditions: 159 | * The time is after 12:00 p.m. on 8/16/2013 160 | * The time is before 3:00 p.m.
on 8/16/2013 161 | * The request comes from an IP address in the 192.0.2.0 /24 or 203.0.113.0 /24 range 162 | 163 | **Types** 164 | 165 | **Resource-based policies vs Identity-based policies** 166 | 167 | * For some AWS services, you can grant cross-account access to your resources. To do this, you attach a policy directly to the resource that you want to share, instead of using a role as a proxy. The resource that you want to share must support resource-based policies. Unlike a user-based policy, a resource-based policy specifies who (in the form of a list of AWS account ID numbers) can access that resource. 168 | * Advantage: 169 | * Cross-account access with a resource-based policy has an advantage over a role. With a resource that is accessed through a resource-based policy, the user still works in the trusted account and does not have to give up his or her user permissions in place of the role permissions. In other words, the user continues to have access to resources in the trusted account at the same time as he or she has access to the resource in the trusting account. This is useful for tasks such as copying information to or from the shared resource in the other account. 170 | * Disadvantage: 171 | * The disadvantage is that not all services support resource-based policies. A few of the AWS services that support resource-based policies are listed here: 172 | * Amazon S3 buckets – The policy is attached to the bucket, but the policy controls access to both the bucket and the objects in it. For more information, go to Access Control in the Amazon Simple Storage Service Developer Guide. 173 | * Amazon Simple Notification Service (Amazon SNS) topics – For more information, go to Managing Access to Your Amazon SNS Topics in the Amazon Simple Notification Service Developer Guide. 174 | * Amazon Simple Queue Service (Amazon SQS) queues – For more information, go to Appendix: The Access Policy Language in the Amazon Simple Queue Service Developer Guide. 
175 | * For a complete list of the growing number of AWS services that support attaching permission policies to resources instead of principals, see AWS Services That Work with IAM and look for the services that have Yes in the Resource Based column. 176 | 177 | **Managed policies vs Inline policies** 178 | 179 | * Using IAM, you apply permissions to IAM users, groups, and roles (which we refer to as principal entities) by creating policies. You can create two types of IAM (identity-based) policies: 180 | * Managed policies – Standalone policies that you can attach to multiple users, groups, and roles in your AWS account. Managed policies apply only to identities (users, groups, and roles) - not resources. You can use two types of managed policies: 181 | * AWS managed policies – Managed policies that are created and managed by AWS. If you are new to using policies, we recommend that you start by using AWS managed policies. 182 | * Customer managed policies – Managed policies that you create and manage in your AWS account. Using customer managed policies, you have more precise control over your policies than when using AWS managed policies. 183 | * Inline policies – Policies that you create and manage, and that are embedded directly into a single user, group, or role. Resource-based policies are another form of inline policy. Resource-based policies are not discussed here. For more information about resource-based policies, see Identity-Based (IAM) Permissions and Resource-Based Permissions. 184 | * Identity-based (IAM) policies can be either inline or managed. Resource-based policies are attached to the resources (inline) only and are not managed. 185 | 186 | Generally speaking, the content of the policies is the same in all cases: each kind of policy defines a set of permissions using a common structure and a common syntax. 
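Because every policy type shares the same statement structure, the core evaluation rule can be sketched in a few lines. The following is a minimal sketch, not AWS's actual evaluator: the function name `evaluate` and the constant `EXAMPLE_POLICY` are hypothetical, and the sketch deliberately ignores `Resource`, `Condition`, `Principal`, and `NotAction`. It only illustrates that an explicit `Deny` overrides any `Allow`, and that an action with no matching `Allow` is denied by default. The statements mirror the S3 policy example in this section.

```python
from fnmatch import fnmatchcase

# Hypothetical policy document in the common JSON structure described above.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:DeleteBucket", "s3:DeleteObject"], "Resource": "*"},
        {"Effect": "Deny", "Action": ["s3:DeleteBucket", "s3:ListBucket"], "Resource": "*"},
    ],
}


def evaluate(policies, action):
    """Return "Allow" or "Deny" for one action across several policy documents."""
    allowed = False
    for policy in policies:
        for statement in policy["Statement"]:
            actions = statement["Action"]
            if isinstance(actions, str):  # "Action" may be a single string or a list
                actions = [actions]
            # Simplification: compare in lowercase and treat "*" as a glob wildcard.
            if not any(fnmatchcase(action.lower(), a.lower()) for a in actions):
                continue
            if statement["Effect"] == "Deny":
                return "Deny"  # an explicit deny always wins
            allowed = True
    return "Allow" if allowed else "Deny"  # no matching allow means implicit deny


for api in ["s3:DeleteBucket", "s3:DeleteObject", "s3:ListBucket", "s3:PutObject"]:
    print(api, evaluate([EXAMPLE_POLICY], api))
```

Under these assumptions only `s3:DeleteObject` comes back as allowed: `s3:DeleteBucket` and `s3:ListBucket` hit the explicit deny, and `s3:PutObject` matches no statement at all.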
187 | 188 | **Policy example** 189 | 190 | * Given 4 APIs: 191 | * "s3:DeleteBucket", 192 | * "s3:DeleteObject", 193 | * "s3:ListBucket", 194 | * "s3:PutObject" 195 | 196 | ```json 197 | { 198 | "Version": "2012-10-17", 199 | "Statement": [ 200 | { 201 | "Sid": "Stmt1483586463716", 202 | "Action": [ 203 | "s3:DeleteBucket", 204 | "s3:DeleteObject" 205 | ], 206 | "Effect": "Allow", 207 | "Resource": "*" 208 | }, 209 | { 210 | "Sid": "Stmt1483586463717", 211 | "Action": [ 212 | "s3:DeleteBucket", 213 | "s3:ListBucket" 214 | ], 215 | "Effect": "Deny", 216 | "Resource": "*" 217 | } 218 | ] 219 | } 220 | ``` 221 | 222 | **Evaluating policies** 223 | 224 | * [Slides 46](http://www.slideshare.net/AmazonWebServices/mastering-access-control-policies-sec302-aws-reinvent-2013/7): an explicit Deny overrides any Allow, and an action that matches no Allow statement is denied by default. In the policy example above, only s3:DeleteObject ends up allowed. 225 | 226 | #### Identities 227 | 228 | **Root user** 229 | 230 | * The root account has unrestricted access to all resources within the account, including access to billing information. 231 | 232 | **Users** 233 | 234 | * An IAM user is an entity that you create in AWS to represent the person or service that uses it to interact with AWS. A user in AWS consists of a name and credentials. 235 | * You can access AWS in different ways using the different types of credentials that can be associated with a user: 236 | * Console password: A password that the user can type to sign in to interactive sessions such as the AWS Management Console. 237 | * Access keys: An access key is the combination of an access key ID and a secret access key. A user can have up to two access keys at a time. These can be used to make programmatic calls to AWS when using the API in program code or at a command prompt when using the AWS CLI or the AWS PowerShell tools. 238 | 239 | **Groups** 240 | 241 | * An IAM group is a collection of IAM users. Groups let you specify permissions for multiple users, which can make it easier to manage the permissions for those users. 
242 | * For example, you could have a group called Admins and give that group the types of permissions that administrators typically need. Any user in that group automatically has the permissions that are assigned to the group. If a new user joins your organization and needs administrator privileges, you can assign the appropriate permissions by adding the user to that group. Similarly, if a person changes jobs in your organization, instead of editing that user's permissions, you can remove him or her from the old groups and add him or her to the appropriate new groups. 243 | * Note that a group is not truly an "identity" in IAM because it cannot be identified as a Principal in a permission policy. It is simply a way to attach policies to multiple users at one time. 244 | * Following are some important characteristics of groups: 245 | * A group can contain many users, and a user can belong to multiple groups. 246 | * Groups can't be nested; they can contain only users, not other groups. 247 | * There's no default group that automatically includes all users in the AWS account. If you want to have a group like that, you need to create it and assign each new user to it. 248 | * There's a limit to the number of groups you can have, and a limit to how many groups a user can be in. For more information, see Limitations on IAM Entities and Objects. 249 | * [A group example](http://docs.aws.amazon.com/IAM/latest/UserGuide/id\_groups.html) 250 | 251 | **Roles** 252 | 253 | * An IAM role is similar to a user, in that it is an AWS identity with permission policies that determine what the identity can and cannot do in AWS. However, instead of being uniquely associated with one person, a role is intended to be assumable by anyone who needs it. Also, a role does not have any credentials (password or access keys) associated with it. Instead, if a user is assigned to a role, access keys are created dynamically and provided to the user. 
254 | 255 | **Use case** 256 | 257 | * You can use roles to delegate access to users, applications, or services that don't normally have access to your AWS resources. For example, you might want to grant users in your AWS account access to resources they don't usually have, or grant users in one AWS account access to resources in another account. 258 | * You might want to allow a mobile app to use AWS resources, but not want to embed AWS keys within the app (where they can be difficult to rotate and where users can potentially extract them). Sometimes you want to give AWS access to users who already have identities defined outside of AWS, such as in your corporate directory. Or, you might want to grant access to your account to third parties so that they can perform an audit on your resources. 259 | * For these scenarios, you can delegate access to AWS resources using an IAM role. This section introduces roles and the different ways you can use them, when and how to choose among approaches, and how to create, manage, switch to (or assume), and delete roles. 260 | 261 | **Comparison** 262 | 263 | | Categories | Root user | IAM user | IAM role | 264 | | -------------------------------------- | --------------------- | -------- | -------- | 265 | | Can have a password | Always | Yes | No | 266 | | Can have an access key | Yes (not recommended) | Yes | No | 267 | | Can belong to a group | No | Yes | No | 268 | | Can be associated with an EC2 instance | No | No | Yes | 269 | 270 | ### S3 271 | 272 | #### Characteristics 273 | 274 | * Amazon S3 provides the most feature-rich object storage platform available in the cloud today. 275 | * Amazon S3 provides durable infrastructure to store important data and is designed for durability of 99.999999999% of objects. Your data is redundantly stored across multiple facilities and multiple devices in each facility. 
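To get a feel for what eleven nines of durability means, here is a small back-of-the-envelope sketch. It assumes the 99.999999999% figure is an annual, per-object design target and that losses are independent; the function names are hypothetical. Exact fractions are used because `1 - 0.99999999999` loses precision in floating point.

```python
from fractions import Fraction

# Assumption: 99.999999999% is an annual, per-object durability target.
ANNUAL_LOSS_PROBABILITY = 1 - Fraction(99_999_999_999, 100_000_000_000)  # 1e-11


def expected_objects_lost_per_year(object_count):
    """Expected number of objects lost in one year under the assumption above."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count):
    """Average number of years between single-object losses."""
    return 1 / expected_objects_lost_per_year(object_count)


print(years_per_expected_loss(10_000_000))  # 10000
```

In other words, under these assumptions a store of ten million objects would expect to lose a single object about once every 10,000 years.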
276 | 277 | #### Object storage vs file system storage 278 | 279 | **Access mode** 280 | 281 | * Object storage access: RESTful API 282 | * File system access: Special protocols 283 | 284 | #### Comparisons 285 | 286 | * S3 vs Glacier vs DynamoDB vs RDS vs ElastiCache 287 | 288 | #### Output 289 | 290 | * Data fields for the output 291 | * Timestamp 292 | * AWS accountID 293 | * AWS region 294 | * Resources 295 | * Environment 296 | * UserName 297 | * Service Name 298 | * API Name 299 | * Expected 300 | * Observed 301 | * Policy Name 302 | * Expected 303 | * Severity 304 | * Incident type 305 | * API capability 306 | * Environment 307 | 308 | ### CloudTrail 309 | 310 | #### Def 311 | 312 | * AWS CloudTrail captures AWS API calls and related events made by or on behalf of an AWS account and delivers log files to an Amazon S3 bucket that you specify. 313 | 314 | #### Use cases 315 | 316 | * Operations: 317 | * Track changes to AWS resources: Track creation, modification, and deletion of AWS resources such as Amazon EC2, VPC security groups and Amazon EBS volumes. 318 | * Troubleshoot operational issues: Quickly identify the most recent changes made to resources in your environment. 319 | * Who started that ec2 in development 320 | * Who stopped that ec2 in production 321 | * Security: Use log files as an input into log management and analysis solutions to perform security analysis and to detect user behavior patterns. 322 | * Was that change to the security group authorized 323 | * Why was that user added to the group 324 | * Why is this ID generating so many AuthFailure/AccessDenied 325 | * Application: 326 | * My application worked yesterday, what changed 327 | * Have I been added to the monitoring group yet 328 | 329 | #### Workflow 330 | 331 | * AWS CloudTrail captures AWS API calls and related events made by or on behalf of an AWS account and delivers log files to an Amazon S3 bucket that you specify. 
332 | * You can also choose to receive Amazon SNS notifications each time a log file is delivered to your bucket. 333 | 334 | #### Methods 335 | 336 | * You can create a trail with the CloudTrail console, the AWS CLI, or the CloudTrail API. A trail is a configuration that enables logging of the AWS API activity and related events in your account. 337 | 338 | #### Log example 339 | 340 | **Information in a recorded API call** 341 | 342 | * Who made the API call 343 | * userName: 344 | * When the API call was made 345 | * eventTime: 2013-10-23T23:30:42Z 346 | * What API call was made 347 | * API and service name 348 | * Which resources were acted upon in the API call 349 | * Where the API call was made from 350 | * IP address 351 | * AWS region 352 | 353 | #### Monitor and receive notifications 354 | 355 | * You can monitor any specific event recorded by CloudTrail and receive notifications from CloudWatch. Prioritize security- or network-related events that are likely to have a high blast radius. 356 | * Popular examples based on customer feedback 357 | * Creation, deletion and modification of security groups and VPCs. 358 | * Changes to IAM policies or S3 bucket policies. 359 | * Failed AWS management console sign-in events. 360 | * API calls that resulted in authorization failures. 361 | * Launching, terminating, stopping, starting and rebooting EC2 instances. 362 | * Use a fully defined, pre-built CloudFormation template to get started. 363 | 364 | ### CloudTrail + Splunk 365 | 366 | #### Responsibilities 367 | 368 | * True security requires collecting all data. 369 | * AWS CloudTrail delivers valuable visibility into user account activity. 370 | * CloudTrail captures everything, but the Splunk app allows for filtering. 371 | 372 | ## Most challenging part 373 | 374 | * How to finish it within three months 375 | * Have daily standup meetings with my manager to make sure we are in sync. 
376 | * When I need to collaborate with other team members, We will notify them ahead of time to save time. 377 | 378 | ## How to test 379 | 380 | * AWS policy generator 381 | * AWS policy simulator 382 | 383 | ## Algorithm 384 | 385 | ### Estimate the amount of data 386 | 387 | > 10KB \* 3 \* 10 ^ 4 = 300MB 388 | 389 | ### Only consider action and effect 390 | 391 | ``` 392 | for each user 393 | { 394 | fetch all policies associated with the user 395 | 1. List of user policies 396 | 2. List of group policies 397 | 3. List of role policies 398 | Map( API, Allow/Deny ) 399 | for each policy 400 | { 401 | for each statement 402 | { 403 | if ( effect == deny ) 404 | { 405 | Map.put( action, deny ) 406 | } 407 | else 408 | { 409 | Map.putIfAbsent( action, allow ) 410 | } 411 | } 412 | } 413 | 414 | Compare the map with permissions inside csv 415 | } 416 | ``` 417 | 418 | ## Possible improvements 419 | 420 | ### Add access control policies on resources 421 | 422 | ### Use new AWS technologies 423 | 424 | #### Combined 425 | 426 | * AWS Lambda provides an easy way to build back ends without managing servers. API Gateway and Lambda together can be powerful to create and deploy serverless Web applications. In this walkthrough, you learn how to create Lambda functions and build an API Gateway API to enable a Web client to call the Lambda functions synchronously. For more information about Lambda, see the AWS Lambda Developer Guide. For information about asynchronous invocation of Lambda functions, see Create an API as a Lambda Proxy. 427 | 428 | #### AWS CloudWatch logs 429 | 430 | * One of the ways that you can work with CloudTrail logs is to monitor them by sending them to CloudWatch Logs. For a trail that is enabled in all regions in your account, CloudTrail sends log files from all those regions to a CloudWatch Logs log group. You define CloudWatch Logs metric filters that will evaluate your CloudTrail log events for matches in terms, phrases, or values. 
You assign CloudWatch metrics to the metric filters. You also create CloudWatch alarms that are triggered according to thresholds and time periods that you specify. You can configure an alarm to send a notification when the alarm is triggered so that you can take immediate action. You can also configure CloudWatch to automatically perform an action in response to an alarm. CloudTrail events are protected by SSL encryption as they are delivered from CloudTrail to the CloudWatch Logs log grou 431 | * Advantages of CloudWatch logs: What makes CloudWatch Logs more preferable over other third party tools? 432 | * CloudWatch is the single platform to monitor resource usage and logs. 433 | * CloudWatch Logs pricing is based on pay as you use model which may turn out to be cheaper than third party tools that work on per node licence model. Here you will be paying for log storage and bandwidth used to upload the files. 434 | -------------------------------------------------------------------------------- /career/resume_opentextanalytics.md: -------------------------------------------------------------------------------- 1 | # OpenText Analytics 2 | 3 | 4 | 5 | ## Product - BIRT 6 | * My team works on business intelligence reporting tools. It has an open source version https://projects.eclipse.org/projects/technology.birt and a commercialized version 7 | * On the technology side, we use Java, Maven, Highcharts and Tomcat. 8 | 9 | ### Overview 10 | * First, you can design your reports inside our report studio by dragging and dropping. 11 | * Then, the report data connector will connect to different types of data sources and retrieve the data we need. 12 | * Next, all kinds of data transformation defined in the report such as filtering, grouping and slicing are applied to the data. 13 | * After data is ready, different report items such as charts, cross tabs, pivot tables could be generated. 
14 | * Finally, the in-memory report can be emitted in the desired output format such as Excel, PDF, HTML, PowerPoint or Word. 15 | 16 | #### Design service 17 | * To create a persistent report, the report engine transforms and summarizes the data and caches the generated report in an intermediate binary file, the report document file. This caching mechanism enables BIRT to scale to handle large quantities of data. 18 | 19 | #### Report service 20 | * The BIRT report engine enables XML report designs created by the BIRT report designer to be used by a J2EE/Java application. To support this functionality, the report engine provides two core services, generation and presentation. 21 | 22 | * Generation services: The generation service within the report engine connects to the data sources specified in a report design, uses the data engine to retrieve and process data, creates the report layout, and generates the report document. The report document can be either viewed immediately using the presentation services, or saved for later use. 23 | 24 | * Presentation services: The presentation services process the report document created by the generation services and render the report to the requested format and the layout specified in the design. The presentation services use the data engine to retrieve and process data from the report document. The presentation services use whichever report emitter they require to generate a report in the requested format. BIRT has several standard emitters: HTML, PDF, DOC, PPT, PS and XLS. 25 | 26 | #### Data service 27 | * The data engine contains the APIs and provides services to retrieve and transform data. The data services retrieve data from its source and process the data as specified by the report design. When used by the generation engine, the data services retrieve data from the data source specified in the design. When used by the presentation engine, the data services retrieve data from the report document. 
28 | * The data engine provides two key service types: data access services and data transformation services. The data access services communicate with the ODA framework to retrieve data. The data transformation services perform operations such as sorting, grouping, aggregating, and filtering the data returned by the data access services. 29 | * BIRT uses the ODA framework provided by the Eclipse Data Tools Platform project to manage ODA and native drivers, load drivers, open connections and manage data requests. 30 | 31 | #### Chart service 32 | * Creates all types of charts. 33 | 34 | ## The most challenging part - How to dive into a really large code base 35 | 36 | ### Approaches you might consider 37 | 1. Try to find out what the code is supposed to do, in business terms. 38 | 2. Read all the documentation that exists, no matter how bad it is. 39 | 3. Talk to anyone who might know something about the code. 40 | 4. Step through the code in the debugger. 41 | 5. Introduce small changes and see what breaks. 42 | 6. Make small changes to the code to make it clearer. 43 | 44 | ### Some of the things I do to clarify code are 45 | 1. Run a code prettifier to format the code nicely. 46 | 2. Add comments to explain what I think it might do 47 | 3. Change variable names to make them clearer (using a refactoring tool) 48 | 4. Use a tool that highlights all the uses of a particular symbol 49 | 5. Reduce clutter in the code - commented out code, meaningless comments, pointless variable initializations and so forth. 50 | 6. Change the code to use current code conventions (again using refactoring tools) 51 | 7. Start to extract functionality into meaningful routines 52 | 8. Start to add tests where possible (not often possible) 53 | 9. Get rid of magic numbers 54 | 10. Reduce duplication where possible 55 | 56 | ### As for the place to start? 57 | * Start with what you do know. I suggest inputs and outputs. 
You can often get a handle on what these are supposed to be and what they are used for. Follow data through the application and see where it goes and how it is changed. 58 | * One of the problems I have with all this is motivation - it can be a real slog. It helps me to think of the whole business as a puzzle, and to celebrate the progress that I'm making, no matter how small. 59 | 60 | ## Features I implemented 61 | ### RLE on cache layer 62 | * We attempt to reduce the size of the data cache layer so that BI reports could load them into memory faster. The data source could be thought of as a two dimensional table. And right now the problem is how we store this two dimensional table more efficiently. There are three levels of optimization altogether and I implemented the last part. 63 | 64 | #### Column-based 65 | * The two dimensional table is stored in a column-oriented manner. There are two reasons for this: 66 | - First, row-oriented storage is great for write efficiency because it is really easy to add/modify a record. Column-oriented storage is great for read efficiency because we can read only columns which we are interested in. In addition, column-oriented storage usually has higher compression efficiency because similar values are grouped together. As a result, they usually have less entropy and are easier to compact and encode. 67 | - Second, typically a business intelligence report data source contains hundreds of columns and millions of rows. A common use case for report users is they only use a few columns from hundreds of columns. In addition, when a business intelligence report runs, we want the data source being able to be loaded into the memory as fast as possible. This is exactly the use case of column-oriented storage. 68 | 69 | #### Fact/Dictionary table 70 | * There are usually a lot of repetitive values and the number of unique entries is far less than the total number of entries. So it is efficient to use a dictionary/fact table encoding. 
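The dictionary/fact idea can be sketched as follows. This is a minimal illustration rather than the BIRT implementation: the function names and the sample `country` column are hypothetical. Each distinct value is stored once in a dictionary table, and the column itself (the fact side) keeps only small integer codes.

```python
def dictionary_encode(column):
    """Split a column into (dictionary, codes)."""
    dictionary = []  # unique values in first-seen order
    index = {}       # value -> position in the dictionary
    codes = []       # one small integer per row
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


country = ["US", "US", "CN", "US", "CN", "DE", "US"]
dictionary, codes = dictionary_encode(country)
print(dictionary)  # ['US', 'CN', 'DE']
print(codes)       # [0, 0, 1, 0, 1, 2, 0]
assert dictionary_decode(dictionary, codes) == country
```

The saving grows with the ratio of total rows to unique values: long strings are stored once, and the per-row codes can be packed into a few bits each.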
71 | 72 | #### Run length encoding 73 | * In business intelligence reports, columns often serve as sorted keys, so identical values are grouped together. As a result, run-length encoding can be applied on these columns to further reduce the size of the report. 74 | 75 | #### Possible improvements 76 | * We apply encoding only to string values, not numeric values. If the numeric values exhibit strong locality, a number of encoding algorithms could be applied to them as well. 77 | 78 | ### Implement integration test 79 | * Our reporting tools can be deployed into web applications as war files. We implemented integration tests to guarantee that the generated war files are always correct. We use the Maven Failsafe plugin to run the integration tests, PhantomJS for headless (browserless) testing, and Selenium to drive it. 80 | 81 | #### Components 82 | - **PhantomJS** 83 | - **as a plugin**: phantomjs-maven-plugin for installing phantomjs to path ${phantomjs.binary} 84 | - **as a dependency**: phantomjsdriver used inside integration testing classes 85 | - **Selenium** 86 | - **as a dependency**: selenium-java used inside integration testing classes 87 | - **FailSafe** 88 | - **as a plugin**: Runs the integration test classes matching "*IT.java" 89 | - Relies on the ${phantomjs.binary} property set up previously 90 | - **Embedded Jetty** 91 | - Runs the integration tests against the war under ${project.build.directory} 92 | - HTTP port 9999 93 | 94 | #### Whole process 95 | 1. Copy/unpack the war to be run during the package phase. 96 | 2. Launch the Jetty server during the pre-integration-test phase. 97 | 3. The Failsafe plugin runs the integration tests with PhantomJS during the integration-test phase. 98 | 4. Jetty stops the server during the post-integration-test phase. 99 | 5. Failsafe verifies that the tests ran correctly during the verify phase. 
100 | -------------------------------------------------------------------------------- /career/teams_resume.md: -------------------------------------------------------------------------------- 1 | - [Project deepdive](#project-deepdive) 2 | - [Non-Technical skills](#non-technical-skills) 3 | - [People](#people) 4 | - [Product](#product) 5 | - [Process](#process) 6 | - [Tell me about yourself](#tell-me-about-yourself) 7 | - [Tips](#tips) 8 | - [Bonus points](#bonus-points) 9 | - [Why this company and position are good matches](#why-this-company-and-position-are-good-matches) 10 | - [My answer](#my-answer) 11 | - [A success project](#a-success-project) 12 | - [Tips](#tips-1) 13 | - [Goal](#goal) 14 | - [STAR method](#star-method) 15 | - [Go to the details](#go-to-the-details) 16 | - [My answer](#my-answer-1) 17 | - [A failure experience](#a-failure-experience) 18 | - [Tips](#tips-2) 19 | - [My answer](#my-answer-2) 20 | - [Conflict management](#conflict-management) 21 | - [Goal](#goal-1) 22 | - [Patterns](#patterns) 23 | - [My answer](#my-answer-3) 24 | - [Leadership](#leadership) 25 | - [Ownership / Responsibility](#ownership--responsibility) 26 | - [Questions to ask back](#questions-to-ask-back) 27 | - [TODO](#todo) 28 | 29 | # Project deepdive 30 | * [Auth library migration](https://docs.google.com/presentation/d/1MpL-0gpj26kJaeYDu2-QH0z3wNSU0a58tI9oKb9dLUw/edit?usp=sharing) 31 | * TODO: Sign in service migration 32 | * TODO: Stale cache: Design open source packages 33 | * TODO: [K/V usage in Redis/Memcached in Weibo](https://time.geekbang.org/qconplus/detail/100091366) 34 | 35 | # Non-Technical skills 36 | ## People 37 | 38 | ## Product 39 | 40 | 41 | ## Process 42 | ### Tell me about yourself 43 | #### Tips 44 | 1. Assume: Interviewer has already read resume. 45 | 2. Overview: Appreciate and learn from the current position. 46 | 3. Limitations of the current position, naturally leads to why new and old role matches. 
47 | 48 | #### Bonus points 49 | ##### Why this company and position are good matches 50 | * Heart: Industry, project, customer 51 | * Impact: Scope 52 | * Money (you can think about it, but it is better to follow interview convention and not talk about it) 53 | 54 | ### My answer 55 | * I have been working in Microsoft Teams for the past 4.5 years. Just to give you a bit more context, Microsoft Teams is a group chat SaaS application for Office products. 56 | * Within this position, I have worked as a service owner, authentication specialist and SDK designer. I learnt how to drive things E2E and collaborate across teams / orgs. 57 | * I want to work closer to products and cutting edge technologies. This position within DoorDash could provide me with such opportunities. 58 | 59 | ### A success project 60 | #### Tips 61 | ##### Goal 62 | * Behind the scenes, the interviewer wants to understand the scope and complexity of the project. 63 | * Similar to performance review / promotion discussion. 64 | 65 | ##### STAR method 66 | * Situation/Task: Describe the situation you were in or the task that you needed to accomplish. You must describe a specific event or situation, not a generalized description of what you have done in the past. 67 | * Action: Describe the action you took and be sure to keep the focus on you. Even if you are discussing a group project or effort, describe what you did. 68 | * Result: What 69 | 70 | ##### Go to the details 71 | * The interviewer wants to see what methodology and logic you use to make decisions. Focus on the why and the what. 72 | 73 | #### My answer 74 | * Situation: One vertical market that has not adopted SaaS products is the financial industry, because of its strict compliance requirements and high volume. In order to sell M365 products to the financial industry, the Teams product was chosen as the pioneer to implement a new type of authentication token. 
75 | * Task: As a critical service within the Teams product, ours was the first service to explore this area and lead the designs. We worked with auth SDK owners within Microsoft to design and refine the protocol. We also worked with different client teams (web, mobile, desktop, etc.) to guarantee all types of auth scenarios are supported. The main challenge here was balancing quality and speed. 76 | * On one hand, since our service is a pioneer service, we wanted to do this in a way that could establish reusable patterns and SDKs for all client-facing services within Teams. 77 | * On the other hand, since we were the first adopter, we wanted to balance speed and quality so that we could test the performance at scale quickly 78 | * Action: 79 | * First we held a weekly meeting with all stakeholders so that all important concerns could be raised in time. We also documented every design tradeoff we made and every corner we cut for comments and reviews. 80 | * We implemented all of the functionality in a dedicated middleware that any service could easily plug into its request handling pipeline. 81 | * Result: 82 | * We delivered this feature within the given timeline, implemented it in a way that is easy for other teams to adopt, and documented all the tradeoffs for future review and adoption. 83 | 84 | ### A failure experience 85 | #### Tips 86 | * Template: 87 | * Not perfect in the beginning. Could not foresee the risks and dependencies of the project. 88 | * In the end, 89 | * Failure redflag: 90 | * The failure you describe must not conflict with core values. 
91 | 92 | #### My answer 93 | * 94 | 95 | ### Conflict management 96 | #### Goal 97 | * Scope of the 98 | * Empathy 99 | * Core values 100 | 101 | #### Patterns 102 | * Possible scenarios 103 | * Technology: 104 | * Tradeoffs 105 | * Background and perspectives: 106 | * With PM 107 | * Process: 108 | * Recognize the differences 109 | * Learn the differences 110 | * Understand the culture 111 | * e.g. Amazon: "Are Right, A Lot" 112 | * e.g. People, flexible 113 | * Possible solutions 114 | * Convince 115 | * Adopt the suggestion 116 | * Redflag: 117 | * Avoid conflict 118 | * Too aggressive 119 | * Too concerned about your own needs 120 | * Emotional 121 | 122 | #### My answer 123 | * Situation: Our team was building org-level reusable shared libraries. While I was working on the library migration project, I proposed that we put the reusable components inside the shared libraries so that all other teams could reuse them. However, my manager said that there were not many resources available and would prefer not to invest too much in this project. 124 | * Action: 125 | * I first synced with my manager / PM to make sure that I understood their priorities in the coming quarter, and who the stakeholders were for these incoming features. 126 | * Then I held a discussion with the partner teams who would probably use my package if I built it in a reusable way, to understand how much they would benefit from it. 127 | * Next I held a meeting with my manager and PM, demonstrating how much more impact we could make if we switched one project in the coming quarter. 128 | 129 | ### Leadership 130 | * Process 131 | * Influence without authority 132 | * Firefighter 133 | * Manage up 134 | * Manage peer 135 | * Result: Gain the team win and get trust from peers. 136 | * Provide multiple examples and ask the interviewer which one they are interested in. 

### Ownership / Responsibility
* Two core points:
  * Volunteer
  * Originally outside of your plate
* Add more details when describing it:
  * Evaluating the resources, prioritizing work, and communicating with partners.
* Red flag
  * Not assigned to you, so you wait for directions
  * Hesitated to take action

### Questions to ask back
* Tech stack, roadmap, team structure, current challenges, etc.


# TODO
* On behalf of: https://curity.io/resources/learn/on-behalf-of-flow/

--------------------------------------------------------------------------------
/career/toastMasterFeedback.md:
--------------------------------------------------------------------------------
# Concise communication
1. Good structure and gesture. For the last point “now what”, you may want to give an example of breaking “now what” into 3 points.

2. From scott to Me: (Privately) (12:32 PM)
Like the opening, personal story (small talk vs. conciseness with manager). Why slide is very clear, with focus on bi-directional (allowing others to talk). Really like your framework: What, so what (implication), now what (AIs). Great example of Sean Penn: “if you don’t vote, you don’t matter”. Reminded me of the “if you are not at the desk, you are on the menu” type of concise, strong message. Good flow: e.g. at this point you are talking about meaning vs. details. So what slide (power of 3): would be nice to have another example (because now the bar is higher, I as an audience want to hear another example here). Nice ending as well. Overall very nice; maybe slow it down even a bit more between slides to let the audience digest/think.

3. From Li to Me: (Privately) (12:33 PM)
Great presentation, thank you for putting this together. Really like the way you delivered. Maybe adding an example of what, so what, now what can help. Overall it’s a great presentation; like the pictures and the flow. Great job!
--------------------------------------------------------------------------------