├── .github
│   └── FUNDING.yml
├── open-culture-institute.md
├── LICENSE
├── README.md
├── twitter-community-archive.md
└── current_projects.md
/.github/FUNDING.yml:
--------------------------------------------------------------------------------
1 | ko_fi: defenderofbasic
2 |
--------------------------------------------------------------------------------
/open-culture-institute.md:
--------------------------------------------------------------------------------
1 | # Open Culture Institute
2 |
3 | > I need something in between the Russian troll farms @bennjordan described, and like, a public research lab. An "open culture institute" that A/B tests people and publishes the result, for the general public
4 | >
5 | > https://x.com/DefenderOfBasic/status/1859754830962823415
6 |
7 | ### Aella
8 |
9 | Aella does this work very explicitly: (1) running rigorous surveys & publishing the results for the public, and (2) being intentionally vocal about fixing bugs in culture, questioning the source of stigma & reframing narratives for the public good.
10 |
11 | > Fwiw I'm a sex worker and also widely hated by the far right but also have severe issues with academia and how PhDs like this work. Just wanted to register the displeasure in the system is not just a far right thing!
12 | >
13 | > https://x.com/Aella_Girl/status/1863276185851224459
14 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2025 Defender
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Defender's PhD
2 |
3 | Pursuing independent research in "culture science". My current focus is developing experimental techniques, documenting them, and carrying out experiments to corroborate or falsify theories. See: [list of people I believe are in this field](https://github.com/DefenderOfBasic/PhD/issues/2).
4 |
5 | My background is in computer science & math. I worked in tech for 6 years (geospatial software).
6 |
7 | You can follow [my writing on Substack](https://defenderofthebasic.substack.com/).
8 |
9 | ### 📜 Research statement
10 |
11 | The behavior of groups of humans can be predicted with great precision. Private entities that have a lot of data about people can already do this. We should study this in the open so we can protect ourselves, and also because we'll have much better data, and much better models. This will help us align our incentives with those with power, or confirm it's not possible, and instead find & cooperate with the groups who are aligned with us.
12 |
13 | > solve cooperation, use it to solve everything else - @IvanVendrov
14 |
15 | My top priority is figuring out how to study, predict, and engineer our culture safely. I'm doing this work 100% in the open for this reason.
16 |
17 | ### 🔭 Directions I'm currently working on
18 |
19 | See [current_projects.md](current_projects.md)
20 |
21 | ### 🧑‍🏫 Advisors
22 |
23 | - [Slime Mold Time Mold](https://slimemoldtimemold.com), _(active researchers on the internet, of "potato diet" fame)_
24 | - [Park Doing](https://ethics.engineering.cornell.edu/archives/retired-staff/), PhD _(retired, Department of Science and Technology Studies at Cornell, author of "Velvet Revolution at the Synchrotron: Biology, Physics, and Change in Science")_
25 |
26 | ### 💰 Funding
27 |
28 | Currently funded by:
29 |
30 | - Tekne Labs (https://x.com/TekneLabs)
31 | - Kanro (https://kanro.fi/)
32 | - Analogue (https://analoguegroup.org/)
33 |
34 | Plus various community contributions.
35 |
36 |
37 |
38 |
39 |
40 | ### 🧭 Long term vision
41 |
42 | If I can do a good job learning, writing as I go, getting feedback from researchers on the internet, and eventually publishing something novel & useful, I can get an endorsement from people with academic credentials. My reward would be status that I can use to further my career as a researcher, or to hold a position in a corporation furthering this work.
43 |
44 | I want to contribute to a world where all tribes on earth have access to tools to study themselves, safely in private if needed, and to signal what they want to signal publicly about themselves. We would see culture as open source: art that is forked & remixed, not frozen by copyright.
45 |
46 | Our culture is open source & needs ongoing maintenance.
47 |
48 | > (1) humans need tribes (2) tribes need walls (3) "the other side is evil" is an effective way of reinforcing those walls, but it's not true (4) finding a way to reinforce walls that IS true is better for everyone
49 | > https://x.com/DefenderOfBasic/status/1858204201211334690
50 |
51 |
52 |
--------------------------------------------------------------------------------
/twitter-community-archive.md:
--------------------------------------------------------------------------------
1 | # Twitter Community Archive
2 |
3 | This is a project to (1) collect Twitter archives from volunteers so we can study ourselves, and (2) build open source tools for collection & analysis so that other communities can reproduce this work on their own data.
4 |
5 | I started this project in July 2024 with [Xiq](https://x.com/exgenesis). It currently has 4 million tweets from 187 accounts.
6 |
7 | - Project homepage: https://www.community-archive.org/
8 | - GitHub: https://github.com/TheExGenesis/community-archive
9 |
10 |
11 |
12 | ### Visualizing the evolution of language over time
13 |
14 | App link: https://labs-community-archive.streamlit.app/
15 |
16 | This is an "ngram viewer" over the corpus of collected tweets. It's very useful for seeing how language unique to a specific community has evolved over time. For example, most of the people in the archive belong to a community called "tpot". But it wasn't always called that. At some point it was called "ingroup", and/or "postrat" (short for post rationalist).
17 |
18 | With this kind of tool we could see (1) exactly when this name emerged, (2) how long it took to settle as the official name, and (3) who first coined it.
19 |
20 |
21 |
22 | It would be interesting to recreate the timeline of how it spread. Did a few specific users cause an inflection point in its usage? Can we visualize the first occurrence of it in _each_ user's lexicon? Can we do this with other words in the community's dictionary to identify if there are "thought leaders" who consistently adopt & spread new terms ahead of the curve? Or is this always a diffuse process with no specific leader?
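
The two measurements described above (how often a term shows up per year, and when each user first used it) can be sketched in a few lines. This is a minimal illustration, not the code behind the app: the `TWEETS` rows, the usernames, and the function names are all made up for the example, and matching is done on whole lowercase words only.

```python
from collections import Counter
from datetime import date

# Toy stand-in for archive rows: (username, date posted, tweet text).
# Every name and text here is invented for illustration.
TWEETS = [
    ("alice", date(2020, 3, 1), "the ingroup is growing fast"),
    ("bob", date(2021, 6, 12), "is tpot the new name for ingroup?"),
    ("alice", date(2021, 9, 2), "tpot meetup this weekend"),
    ("carol", date(2022, 1, 20), "new to tpot, hello everyone"),
]


def tokenize(text):
    """Lowercase, split on whitespace, and trim surrounding punctuation."""
    return [w.strip(".,!?\"'()") for w in text.lower().split()]


def yearly_frequency(tweets, term):
    """Fraction of tweets in each year that contain the term."""
    totals, hits = Counter(), Counter()
    for _user, day, text in tweets:
        totals[day.year] += 1
        if term in tokenize(text):
            hits[day.year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


def first_use_by_user(tweets, term):
    """Earliest date each user posted the term, earliest adopter first."""
    first = {}
    for user, day, text in tweets:
        if term in tokenize(text) and (user not in first or day < first[user]):
            first[user] = day
    return sorted(first.items(), key=lambda item: item[1])


print(yearly_frequency(TWEETS, "tpot"))
print(first_use_by_user(TWEETS, "tpot"))
```

Running `first_use_by_user` over many community terms and checking whether the same few users keep appearing near the front of the list is one way to look for the "thought leaders" question above.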
23 |
24 | Another interesting example is the pair of words **egregore** and **psychofauna**. They mean the same thing, but `@visakanv` coined the latter on [Nov 5 2022](https://x.com/i/web/status/1588796630287147009) as an attempt to create an easier-to-remember term:
25 |
26 |
27 |
28 | The word **psyop** entered this particular community's vocabulary in 2020.
29 |
30 |
31 |
32 | ### Lexicon analysis
33 |
34 | I created this simple Jupyter notebook as a template: https://colab.research.google.com/drive/109XOgTWj-sajpAYhDCNPfts5zvdkpi_s
35 |
36 | I generated the 10 most common bigrams for each user, and asked people on Twitter to guess who was who (these are the bigrams for users `@eshear`, `@NathanpmYoung`, and `@nosilverv`):
37 |
38 |
39 |
40 | It was interesting to see "don't know" appear in almost everyone's top 10 bigrams. My current theory is that this is because this particular community cares a lot about truth & epistemology, and this comes out in the language used, but we'll need a control group to verify this.
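
A minimal sketch of the per-user bigram count, and of checking which bigrams land in everyone's top list. It is not the notebook linked above: the `SAMPLE` tweets, usernames, and function names are assumptions made up for this example, and the tokenizer is deliberately crude.

```python
from collections import Counter, defaultdict

# Toy stand-in for archive rows: (username, tweet text). Invented data.
SAMPLE = [
    ("alice", "I don't know if this is true"),
    ("alice", "honestly I don't know"),
    ("bob", "don't know, don't care"),
]


def bigrams(text):
    """Adjacent word pairs, lowercased, with surrounding punctuation trimmed."""
    words = [w.strip(".,!?\"()") for w in text.lower().split()]
    words = [w for w in words if w]
    return list(zip(words, words[1:]))


def top_bigrams_by_user(tweets, n=10):
    """Map each user to their n most common bigrams with counts."""
    counts = defaultdict(Counter)
    for user, text in tweets:
        counts[user].update(bigrams(text))
    return {user: c.most_common(n) for user, c in counts.items()}


def shared_top_bigrams(top):
    """Bigrams that appear in every user's top list."""
    per_user = [{pair for pair, _count in ranked} for ranked in top.values()]
    return set.intersection(*per_user) if per_user else set()


top = top_bigrams_by_user(SAMPLE)
print(shared_top_bigrams(top))
```

The control group mentioned above would be the same computation run over tweets from an unrelated community, to see whether "don't know" is shared there too.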
41 |
42 | ### Personal semantic search
43 |
44 | Source code: https://github.com/DefenderOfBasic/twitter-semantic-search
45 |
46 | I built a basic semantic search, which is very useful for finding tweets & threads by meaning, or that are related to a general topic. Here the query is a paraphrase of a tweet I made, and the system can find it even though the tweet's keywords aren't in the search query:
47 |
48 |
49 |
50 | This can be the basis of a lot of future work, like automatically finding topics of overlap between two or more users. It can also be used to cluster tweets together and graph the clusters over time, to see how the ideas & themes in each person's writing have evolved:
51 |
52 |
53 |
54 | _(the table above is from an [app that Xiq is working on](https://x.com/DefenderOfBasic/status/1866486560239313136/))_
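
The ranking step behind a semantic search like this can be sketched as a cosine similarity between a query vector and one vector per tweet. This is an illustration under stated assumptions, not the code in the linked repository: the 3-dimensional vectors in `INDEX` are made up, whereas a real system would get much longer vectors from an embedding model, and the function names are hypothetical.

```python
import math

# Toy index of (tweet text, embedding vector). The vectors are invented;
# in practice they would come from an embedding model.
INDEX = [
    ("culture is open source", [0.9, 0.1, 0.0]),
    ("potato diet results are in", [0.0, 0.2, 0.9]),
    ("we should fork and remix art", [0.8, 0.3, 0.1]),
]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector, indexed_tweets, top_k=3):
    """Return the top_k (score, text) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in indexed_tweets
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# A made-up query vector standing in for an embedded paraphrase.
for score, text in search([1.0, 0.2, 0.0], INDEX):
    print(f"{score:.3f}  {text}")
```

Because matching happens on vectors rather than words, a paraphrase can rank highly without sharing any keywords with the tweet it finds.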
55 |
56 | ### Future work
57 |
58 | The general areas of focus are (1) coordination and (2) sense making. When new discourse flares up, I want to know the history of it & how we got here. When I meet someone new, I want to know how much our beliefs & values overlap. Tools that make my writing legible & searchable help me find others who are aligned with my work.
59 |
60 | Some specific ideas:
61 |
62 | - A Chrome extension that scrapes tweets as I view them and adds them to the open database. This would give us a "twitter firehose" of data comparable to the Bluesky firehose (focused on a specific community)
63 | - An app similar to the "ngram viewer" but with semantic search. Would allow us to trace the evolution & timeline not just of specific words but of ideas & meaning
64 |
--------------------------------------------------------------------------------
/current_projects.md:
--------------------------------------------------------------------------------
1 | # Current projects
2 |
3 | These are current directions I'm actively pursuing.
4 |
5 | ### Anatomy of an Internet Argument
6 |
7 | https://defenderofthebasic.substack.com/p/anatomy-of-an-internet-argument
8 |
9 | This is a series of case studies about how to resolve internet disputes. It's an empirical test of the theory that humans speak different languages, and if you can translate each tribe's different language, you can resolve disputes. This is a framework for "participatory science", inviting people to learn the technique, apply it in the real world, and if they find it doesn't work, they can "disprove" it, or update the theory, or get peer feedback.
10 |
11 |
12 |
13 |
14 | The object level goal is to teach people how to have more productive conversations online, which leads to greater empathy and social cohesion. The meta goal is to make a fun, accessible way to engage the average person in a process of science & truth seeking. It's a big game where you win by understanding the other side and proving you can predict them (being understood is often the other side's goal too).
15 |
16 | Trying to collect the theory & case studies in an open source book: https://defenderofbasic.github.io/in-good-faith-handbook/
17 |
18 |
19 |
20 | ### Twitter community archive
21 |
22 | https://www.community-archive.org/
23 |
24 | A project that asks users to export & share their data as part of a public archive, so that anyone can study it. I wrote about ["visualizing how ideas spread"](https://github.com/TheExGenesis/community-archive/wiki/Exploring-historical-trends-in-the-community-archive) with a Google Trends-like app on this data.
25 |
26 |
27 |
28 |
29 | My initial plan was to just publish my own archive and study it to see how & why I changed significantly over a span of ~1 year as I was exposed to specific ideas. This is a fairly invasive analysis, but I thought that doing it on myself for the public good would be the easiest way to get permission, and that if I found something useful or interesting for myself, others might do it too. And if there is anything dangerous about it, others will see that too. Either way it will contribute to the field.
30 |
31 | I believe this can become a common paradigm. We don't need to ask permission from big companies to study ourselves; every user can export their own data and share it with whomever they choose. The only other notable example of this pattern is [Washington Post asking its followers to export and give them their TikTok user data](https://omarshehata.substack.com/p/washington-post-is-collecting-tiktok).
32 |
33 |
34 |
35 |
36 |
37 | ### Create empirical maps of discourse with LLM semantic embeddings
38 |
39 | See [twitter thread with examples](https://x.com/DefenderOfBasic/status/1856002128327643289).
40 |
41 | This gives us a rigorous answer to things like "does our culture think capitalism is more evil than communism?" There is an empirical answer to this question. You can create this map with semantic embeddings.
42 |
43 | The missing piece is that this only gives you the aggregate. I want different tribes to take this quiz, to map out what _they_ think it should be. The reality is there are large groups who would say the concept of "capitalism" is much closer to "evil" than "communism" is. I want to build these conceptual maps that are specific to different tribes, because they let us identify (1) exactly in what way people see a different reality, and (2) exactly what our common ground is.
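
One way to compare two tribe-specific maps, sketched under loud assumptions: each map is a dictionary from a concept pair to a closeness score between 0 and 1, however that score was obtained (quiz answers, or similarities between embeddings). The scores, tribe names, and function name below are invented for illustration and are not results.

```python
# Hypothetical closeness scores (0 = unrelated, 1 = same concept) for two tribes.
# All numbers are made up to show the shape of the comparison.
TRIBE_A = {
    ("capitalism", "evil"): 0.8,
    ("communism", "evil"): 0.3,
    ("family", "good"): 0.9,
}
TRIBE_B = {
    ("capitalism", "evil"): 0.2,
    ("communism", "evil"): 0.7,
    ("family", "good"): 0.85,
}


def compare_maps(map_a, map_b):
    """Rank shared concept pairs from most agreement to most disagreement.

    Returns (ranked_pairs, gaps), where gaps[pair] is the absolute
    difference between the two tribes' scores for that pair.
    """
    shared = sorted(set(map_a) & set(map_b))
    gaps = {pair: abs(map_a[pair] - map_b[pair]) for pair in shared}
    ranked = sorted(shared, key=lambda pair: gaps[pair])
    return ranked, gaps


ranked, gaps = compare_maps(TRIBE_A, TRIBE_B)
print("common ground:", ranked[0], round(gaps[ranked[0]], 2))
print("different reality:", ranked[-1], round(gaps[ranked[-1]], 2))
```

The pairs at the start of the ranking are candidates for common ground, and the pairs at the end are where the two groups see a different reality.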
44 |
45 | ### Communal culture science
46 |
47 | I want to create tools or a platform for people to (1) ask an interesting question, (2) collect data and choose whether to publish it or keep it private among participants, and (3) let anyone fork and reproduce the study.
48 |
49 |
50 |
51 | Aella does this a lot, but I want this to somehow be easier, a more common pattern. Like the [communal daily plot](https://perthirtysix.com/communal-plot-daily-poll), but in a way that lets me fork it or submit questions?
52 |
53 | This is partially about creating the tech tools and partially about spreading the idea that anyone can do this, that you can analyze other people's data, etc.
54 |
55 | Forking is really important because Aella's polls, for example, sometimes get crazy answers or conclusions where people say "that's not true/representative". I want people to be able to run the study within their own friend network, and compare their network to the global/sampled population. This is how to get things to "spread sideways", into areas that internet culture doesn't touch. My mom isn't going to see these surveys, but if the game is "poll your network", I can go sit with her and do it.
56 |
57 |
58 | ### IRL cultural experiments
59 |
60 | It is not typical in the culture where I live for friends to "just stop by" your house without calling first. But I can change this by agreement with my friends, and if it works, we can write about it. Does it spread? What about "friendship breakups"? And normalizing "saying no"? Can we just find all the bottlenecks in our local culture and fix them? And then do "cultural peer review" on the internet?
61 |
62 | On normalizing talking to people at coffee shops:
63 |
64 | > this is incredible. my secret plan is to collect stories like this, get people to try it, if it works for many people & communities independently, it spreads. Cultural peer review
65 | > https://x.com/DefenderOfBasic/status/1855620094107230688
66 |
--------------------------------------------------------------------------------