Founders@Google Presents: The Cultural Anthropology of Stack Exchange


MIKE WINTON: Hi, I’m
Mike Winton. And for those of you who don’t
know me, I lead Developer Relations here at Google. And today I would like to
introduce our special guest, Joel Spolsky. Joel’s an expert on software
development dating back many years. Long ago, he worked at
Microsoft and designed VBA as a member of the Excel team. He’s also worked at Juno Online
Services on their internet client. And a little bit more recently,
he co-founded Fog Creek Software and
is co-creator of stackoverflow.com. He’s also got a popular website
that’s been translated into over 30 languages. He’s written four books about
software development. And he’s here to talk to us
today about something else he’s an expert at, which is
communities and the dynamics and the anthropology
of the communities. And I would say we are so
thrilled here at Google about the existence of
Stack Overflow. And that’s something that’s
really one of the fundamentals about how we do developer
support and how we do developer relations, is going
out and participating in the conversations and in the
community that developers are having there at
stackoverflow.com. So with that, I’ll hand
you over to Joel. Thank you, Joel.
JOEL SPOLSKY: Thanks. [APPLAUSE]
JOEL SPOLSKY: So this
talk is called– Can I walk around? Yes. I’m not sure what
I’m attached to. This talk is called
The Cultural Anthropology of Stack Exchange. Did anybody take anthropology
in college? I did. It was the worst class
I ever took. It was a class called Cultural
Anthropology, and I found it to be unbelievably boring. And there was all this stuff
about Trobriand Islanders and there was a tribe somewhere that
exchanged blankets called potlatch blankets. This is a Canadian thing. And there was kind of a lot
of stuff that I found unbelievably irrelevant
and boring. And then I discovered that this
turned out to be my day job, actually studying
anthropology, especially as it applies online. The Stack Exchange network gets
266 million page views every month, 25 million global
monthly uniques. That’s one way we have
of counting it. We have another way of counting
it, which is 50 million uniques. I like that way better but
let’s go with 25 million different people visit
every single month. If we were a country, we would
be up there with some of the most reputable countries with– [LAUGHTER] –their governance. Not really a good– Not really an impressive list. If we were a state, we would
probably be the third largest state and we’re about
to break– about to pass Texas. So that’s why the only way you
can study this large group of people is kind of the tools of
anthropology, the tools of cultural anthropology, studying
people and what they try to make, and how they
create from the smallest group, which is just a group of
people to come together to do something temporarily and
then disband to larger [INAUDIBLE] like groups. Now when I started programming
a long time ago– that’s me. No, I’m just kidding. We had these tape drives that– No. We thought that computing was
about computing, like getting a number, getting a
result, solving a problem using a computer. And you felt like you were lucky
if you could somehow manage to create computer
software that, at the end of the day, ran the payroll at
least once every two weeks when the payroll was due. And if you actually got payroll
numbers, or tax reports, or whatever, then
you felt like you had accomplished something. But then people invented
time-sharing in the ’60s, and I actually got my start sort
of in this generation. And so this is a little printing
terminal called a DECwriter and they were
very popular. And it would hook up with– It had to be a 300 baud modem,
because any faster than that and the print head couldn’t
keep up, even though there were faster modems probably
already being invented. They came up with a later
version of the DECwriter that could do 1,200 baud. And multiple people were online
at once, for the first time, instead of doing
the batch jobs. And that meant that all of a
sudden you had the possibility of communication, but 300 baud
was still too slow to put large amounts of text. So you didn’t really
have conversations. Email hadn’t yet
been invented. Until the first terminals came
out, the first what they called– this was called a smart
terminal, because it had the ability to move
the cursor. The VT-100 was sort of a
classic of this era. And they could go up
to 1,200 baud, or even 9,600 baud later. And so, actually, that starts
to be about as fast as your eyes can read. I would say most people read
at about 2,400 baud. And it starts to become
reasonable to imagine doing things like sending emails,
so email was invented. And the first versions of online
support groups, and the first ways that people
communicated in groups online, were just extensions of email. Google Groups, to this day, you
could trace it all the way back to Usenet. And Usenet you can trace all the
way back to this idea of just sending around mailing
lists, where there would be some stupid daemon that would
just receive email at one address, and send
it out to people according to a list, right? A mailer daemon. So Usenet was really
interesting, because I used Usenet a lot in college, and
now it’s almost completely gone, except for distributing
illegal things. But this is a fairly
late version of the Usenet software TRN, the threaded RN. RN stands for “read news.” And
up in the right-hand corner, you can actually see the thread,
and the idea of the thread being that a conversation
can have a parent and multiple conversations can
have multiple parents. It’s all wonderful for computer
scientists, not for regular people. But one of the things that I
noticed about Usenet was a very interesting phenomenon– I know it’s very hard to see
this in the back– but you would get these conversations
that showed artifacts of the software, OK? So this is kind of
interesting. This is a conversation on
talk.politics.mideast, which I participated in for about
a month, until it went into a loop. And then it never emerged,
it never got out of it. But an interesting thing to
notice is when you hit R to reply to a message in the RN
program, the first thing you have to know about Usenet is
it’s not client-server. It was a totally distributed
system, meaning everybody just connected to each other and sent
each other any messages they hadn’t already seen,
and most nodes didn’t have enough disk space. Hard drives were expensive, and
so the average Usenet node would have maybe three days
worth of archives and throw away everything older than
three or four days. They would keep enough so that
you could go away for the weekend, come back on Monday, and not
have missed anything. But very rarely did anybody have
a big enough hard drive to keep a whole week worth
of Usenet, or two weeks. So– which is one reason why it was
so hard to assemble the Usenet history, because it was
never in one place. Because you– when you’re replying to a
message, if there is a probability that the person
who’s seeing your reply does not have access any more to
the thing that you were replying to, that makes it hard
to have a conversation. And so people started quoting
the message they were replying to. And then the software said, hey,
let’s make that a feature of the software to
make that easy. So when you hit R, which meant
reply, you’ve got this thing– it still happens in some
email clients– where, the message you are
responding to, every single line got a little greater than
in front of it, saying this is what I’m replying to. And what people did when
confronted with the software is interesting. They would then go intersperse
their own comments in between the other people’s comments. And it was very easy to follow
what was going on because the greater thans were the
original message. And then you could have your
nitpicky little answers interspersed on every
single sentence. And so here you have– clearly
this is nonsense, this is clearly nonsense, this
is nonsense, this is nonsense, et cetera. What you would do is you
would pick apart someone’s argument by picking apart
every sentence of their argument, which is a certain
type of culture. It was very popular on Usenet,
the nitpicky culture. Still exists on the internet
in certain places today, I believe. And the bloggers later
reinvented this, and they called it fisking, if
you’re a blogger. But this particular style
is an artifact. It’s an accidental thing that
happened because the software happened to behave in
a particular way. And what’s interesting is, until
you start noticing that the culture is acting according
to the software, then you may forget to design
the software in a way that makes the culture work. So this is at Moscone
Center actually. And it’s just an architectural
image that shows the concept of if you build
something in a certain shape, the people will go
into that shape. And that happens in architecture
all the time. Sometimes it’s accidental. You happen to have a nice
curve that’s awesome for skateboarding, and you’ll
get kids skateboarding. Well, this is a little bit more
intentional, where if you build a table that looks like it
might be the right size for a chessboard, and it’s got
an 8 x 8 grid on it. And you put two things that
are the shape of humanoid chairs on either side, then old
men will come and sit down and play chess. Whatever you build, somebody
will find a way to come use it in the way that you built
it, sometimes completely non-intentionally. This is the Spanish Steps in
Rome, where you’ll see people come in and just sort of
hang out, sitting on the steps. It was built because there are
two roads at different heights and they can’t be connected
with a road. It would be too steep. So they built steps. But actually, that turns out to
be a really good place to sit down if you’re a teenager,
or you’re a backpacker, and get your hair braided by
gypsies, because it’s just sort of the perfect physical
environment for that. And then you can copy it. So, in Times Square, they built
this staircase that doesn’t go anywhere in hopes
that backpackers and gypsies will come sit down there. If you don’t know, if you’re not
paying attention, here’s what happened in software. We had Usenet and then
everything moved to the web and they said well let’s
not use NNTP as our protocol anymore. And they started building these
web versions of Usenet. And the web versions of Usenet,
which are still out there all over the
place today. There’s software called
[INAUDIBLE] and phpBB. There’s a whole bunch of
these packages. They’re actually just
copying Usenet. They work in the same way,
except that they have the Smiley face that rolls to the
left and the right when you’ enter a rolling smiley
face command. But other than that, they’re
functionally equivalent to Usenet, which is an accidental
design that actually, literally came from email. So, look, nothing has been
sort of innovated here. So if you want to be Utopian– I don’t know why I have
this picture here. I guess this is a
Utopian picture. You have to actually design
stuff a little bit intentionally. That’s what we started to
do with Stack Exchange. I feel like we’re about
10% of the way there. And I just want to show you
some examples of how we’ve done this at Stack Overflow and
Stack Exchange over the four years that we’ve been
out and in business. All right, so one thing which
we focused very much on is first impressions. Does anybody know what this
is a picture of, anybody?
AUDIENCE: Occupy Wall Street.
JOEL SPOLSKY: Occupy
Wall Street. It’s already starting to
be historical, but they’re still there. And there’s a lot of clues,
obviously, besides the, well, it’s on Wall Street, so that
helps, if you saw that little thingy there. Like, this fellow here, who one
of these days is going to accost me after one of these
events because I don’t know who it is. But he’s got a Zapata t-shirt,
which is awesome. I don’t know if he knows who
that is, he probably knows. He’s got kind of like a
kaffiyeh, but it’s like interesting colors,
bright colors. Here we have a person who loves
the 99% with really, really expensive headphones. [LAUGHTER] That’s OK. There’s just all kinds– everybody is trying to sort of
put out their signals as to who they are, what they believe
in, because they want to attract more people like them
to their cause, right? So what’s the first impression
that we got? I sent Jeff Atwood out, and I
said, when we started this project four years ago, more
than four years ago. I said, go look at some Q&A
sites that are out there. Because here’s my idea,
it’s different. And what he found was Yahoo! Answers is the big one. Answers.com is also
pretty big. This is Yahoo! Answers. What’s your first impression
of Yahoo! Answers? This is a screen shot
of the home page I took at some point. What are you– I use to clean out
my coffee maker. What is your favorite
plant of all time? Anyone up for a food/drink,
true/false survey? What are you listening to? What kind of questions
are these? I mean, they’re questions,
right? They have question marks– sometimes. What are you listening to? What is the last
thing you ate? Can I die from carbon
monoxide? No, no, it’s OK. No. This is the clue here. It’s all the way
at the bottom. I keep forgetting to
do my homework. That’s the clue, the
secret clue. Does anybody know the secret? Yahoo! Answers?
AUDIENCE: Six-year-olds.
JOEL SPOLSKY: Well,
it’s slightly– They’re 12. They’re latchkey kids. It’s really, really active in
the afternoon when kids get home from school, especially
girls, because they have no permanent identity here. It’s not like Facebook, where
you set up an identity and then people can be creepy. You can be totally anonymous,
and then you could just be another person tomorrow when you
ask a different question. So Yahoo! Answers became a chat
room for teenagers. And when you look at the
website, you would be deterred from using Yahoo! Answers for any purpose other
than talking about how to, you know, what you’re
listening to. Here’s Answers.com again. What kind of attorney is needed
for advice on getting someone you know committed
to a mental institution? I like that question. That’s a good question. So, never mind. What supplies does the
U.S. get from Ghana? I don’t know. Who cares. Another clue. What are some examples of a
welcome address for JS prom? JS prom, anybody? It’s not JavaScript, that’s
what I thought. There is no JavaScript prom. This is obviously a kid that
is so young that they don’t even get that this prom that
they have at their school, called the JS prom,
is not, like, a universal internet thing. It’s just specific
to their school. Or they don’t get that the
internet is a universal thing. Either way, once again,
what’s going on here is this is for kids. What is the rights
of McDonald’s? That’s a good one. Askville. Amazon bought this little site
called Askville and then proceeded to ignore it. How can I start making the
right choices in my life? What is the 21 largest states? What is the interval,
some sort of math– This is actually a question,
at least, that makes some sense. And the answer is, like,
this is homework. We don’t do homework
questions here. So again, what’s the first
impression on Stack Overflow? The first impression on Stack
Overflow is, if you’re a programmer, you get that these
are programmer questions. You’re like, oh look, it’s full
of programmer questions. If you’re not a programmer, you
don’t understand a single thing and you leave. Which is a good thing,
this is what we want. Here’s one of our sites, a
network of 90 Stack Exchange sites, for Jewish life
and learning. Again, unless you actually went
to a yeshiva and studied sort of extensively into the
details of orthodox Jewish law, you probably don’t even
understand the questions because they’re written in
a special language called Yeshiva English, which has
pretty much replaced Yiddish. It’s English but, I mean, all
those words are Hebrew. Take this one. Is it possible to have
hametz on the Shabbat directly after Pesach? In English, that would be
is it possible to– Are you permitted to eat
leavened bread on the Sabbath which immediately follows
Passover? So I can translate
that to English, but it’s not in English. Why isn’t it in English? Because you’re trying to push
away the people that don’t speak Yeshiva English
because, this is– Not only is this a site for
Jewish life and learning as advertised, it’s actually
kind of orthodox, right? And we actually kind of don’t
really want conservative, uneducated Jews hanging
out here. That’s not my decision. I mean, I founded a conservative
Kibbutz. I’m a strong– but I also went to Yeshiva and
I get what’s going on here. Cross Validated is a site
about statistics. Comparing two methods of
sampling from bivariate– Again, I don’t understand it,
but if you’re into statistics, you immediately recognize it. You Immediately say, OK, this
is a real statistics site. This is for people that actually
do statistics, and understand it. And know it. Now Askville does
have sections. So to be fair, I tried to go
into the Askville math section and see what was going on,
what they had going on in their math section. Write an algorithm to find the
number of between 7 and 100 which is exactly divisible by– OK. Apply for apartments online. Asked 13– What does that tell you,
asked 13 hours ago?
AUDIENCE: It’s spam.
JOEL SPOLSKY: It’s spam. What else? What does it tell you, though?
AUDIENCE: [INAUDIBLE]
JOEL SPOLSKY: Nobody cares. Like, nobody’s cleaning up the
spam here, so nobody goes here, right? Because there’s still spam and
nobody’s cleaning it up. What is the size of a plot
in the Caribbean? I like that question. These are the full
text, I think. 40, maybe 50. OK. So here’s my problem. I’m 20 years old. Didn’t really attend high school
and know only super basic math, meaning plus. All right. So if you’re a Fields medallist
or a math professor at Berkeley, you don’t
go to that site. You go to this one. This is our math site. We’ve got another. We have two math sites, because
the mathematicians have bifurcated. This is the PhD-level
mathematics. Again, I can’t understand
a single thing there. I don’t even know what they’re
talking about. I barely understand the tags
in terms of knowing about areas of mathematics. But we have two sites. Math overflow has a rule that if
a math professor is likely to know the answer to
your question, don’t ask it on math overflow. Because math overflow is for
research-level questions, the kind of things that most
people probably don’t know the answer to. Otherwise, you have to go to
math stack exchange, which as you can see, is liberal and just
allows anybody to ask any kind of crappy math question
that they want on there. So once again, it’s all about
what’s that first impression that you give. This just a random picture
that I took. You could– There’s a few things going
on here that you can immediately tell. There’s some kids are playing
Ultimate Frisbee. This is Gail. Because it looks like Gail. There’s no leaves on the trees, which means it’s winter. And yet they’re wearing
shorts. So it’s either the first day
of spring or, more likely, their Californians
because they’re playing Ultimate Frisbee. And they also kind of look like
jocks to me, actually, based on the sunglasses, and
the backwards baseball cap, and all that kind of stuff. When you see this scene, you’re
walking down campus and you see these kids, you
immediately say, oh my God, I want to join that because I am
a Californian at Yale who plays Ultimate Frisbee and
I’m kind of a jock. Or there’s a million things
there to turn you off. It doesn’t matter. But everything about a community
either draws people into the community or
pushes them away. And so many people in web design
have been trying to figure out how do we make a
web page that sucks every single person in the
known universe in. And when you’re trying to get
expert answers to difficult questions, that is the
opposite problem. You actually want to drive away
as many morons as you possibly can, hopefully as
quickly as possible. All right. So that was one big area, that’s
first impressions. And that’s a really important
thing to us in the Stack Exchange. Number two, voting, obviously, is an important
part of Stack Exchange. I’ll breeze through this
because you’re all familiar with this. Questions are voted up. Answers are even voted up. Here are some hot questions
from the previous week. People can vote up the
questions they like. They vote down the questions
they don’t like. They vote up the answers
that they like. That’s really cool because it
sorts the answers in the order of how good they are. But what it also means is that
you can’t have random conversations because the quotes
keep getting quoted out of context. So you can’t get into
a back and forth argument on Stack Exchange. You can in the comments,
but we’ll delete that. But you can’t get into a back
and forth argument because those back and forth arguments
are useless. You’re not creating a useful
artifact for the internet. Voting is mostly important
because it leads to reputation. Reputation is this thing that
tells you do I trust people? This is Colin Powell. This is what the US Army calls
“fruit salad” that he’s wearing, which tells you all
kinds of useful stuff about who he’s been and campaigns
he’s been on. And his reputation is he’s got
four little silver stars on his epaulette. And we’ve got the same thing. Here are some top users from
Stack Overflow by reputation. Jon Skeet, who you may know,
works for you guys in London.
JON SKEET: How you doing?
JOEL SPOLSKY: Oh! Oh my god, he’s here. [LAUGHTER] One thing that’s interesting
to notice here– I’d never noticed this until
we started putting locations on the site. Marc Gravell, who works for
us now, Forest of Dean. I don’t know where that is. It’s in Never Never Land,
somewhere, it’s Middle Earth, UK. We don’t care. He’s just a brain in a box. He types code for us. It’s awesome. Rouen, France. Curacao. Madison, Wisconsin. New Jersey. France. Alex Martelli. Is Alex here? He actually works here
at this office. He’s probably heard this
speech three times. But other than Alex, I don’t
think there’s anyone here in California. Very, very little participation
from the Silicon Valley, guys. I don’t know why. Those little badges that
you were seeing. When you start out on the site,
you start with a little teeny tiny badge. And this is somebody who
is– they’ve made a name, Geek Matter. They haven’t customized
their avatar. So we gave them something based
on their IP address. We gave them some triangles. And you start out with one
reputation which you get for successfully typing
in your name. But you start to earn more
and more reputation as people vote you up. So Favolas, who’s there, has
208 points and also some
and some bronze badges there. And we’ll also let you customize
your avatar, and the accept rate. At one point, we were trying to
encourage people to accept answers that were good and so
we’re displaying also what percentage of answers
that you accept. And you could earn sort of
more and more points. Daniel Hilgarth here– It’s hard to see, but his avatar
is displayed with a drop shadow. And the drop shadow is the
subtle hint that, if you move your mouse over that, you get
like a little customized profile that shows up. So when you earn, I believe
10,000 reps, you start to get the right to customize the
profile that shows up when somebody mouses over you. And there’s John, and you can
tell you what date this from based on when I took the
screen shot, based on when you have 400– only 403,000 rep.
JON SKEET: March?
JOEL SPOLSKY: March. And you were just about to hit
3000 badges over there. Has a good accept
rate, as well. But you can actually go higher
than Jon Skeet, which is you could become a moderator. That’s an elected position and
mostly a burden that involves having to spend a lot of time
on the site deleting bad behavior, instead of answering
questions. And then you get this extra
little diamond that shows up next your name. So that’s all the flair. People wear flair for all
kinds of reasons. It’s an important part, flair
that you wear in real life. This fellow, I think, has got
five things going on here, which you may need to notice. He’s got the Confederate flag
on his cap, which I have to explain, when I’m outside
the United States, what that means. He’s got a lot of tattoos. And I won’t even try
to explain what each of these means. I’m sure they mean something. He’s got big muscles. He’s wearing a tank top, or a
wife beater, as we call it. He’s got a spare tank top, in
case the main, primary, tank top fails for some reason. And a Harley Davidson
bike logo. So just think about all the
things that he had to think of in the morning, when he was
like, I’m going to go outside and I’m going to project
this image of myself. And he sort of decorated
himself Shah-of-Iran style, in a way, just to sort of kind of
festoon himself with all those little graphics. So that’s an important thing
that happens in real life. And we do it in Stack
Exchange, too. We’ve got these badges. The badges are interesting,
because the first thing people say to me about badges is,
like, I don’t really care about badges. Who cares about badges. They’re stupid. How do you get people to
care about badges? They’re not worth anything. The badges probably motivate
maybe 1% of the participants in our site. Very, very small number of
people that actually say I want to earn this badge. But everybody on the site
knows about them. And all you have to do is sort of
imagine that one person has seen your flair to actually
care about it. And the neat thing about the
badge is that they tell everybody the behavior that we
want to incentivize, even if they’re not directly
incentivizing it. So there’s all kinds of things
that are norms of Stack Exchange, and we have
communicated to the world, hey, these are norms, because
we give you badges for them. And so if you were ever
wondering whether you’re allowed to answer your own
question, then just go look. You’ll see that there’s a badge
that you can earn for answering your own question. If you’ve ever wondered if we
think it’s a good idea to answer a question that was asked two
years ago, well, you earn a badge for that. And so it’s a sort of way of
saying, hey, all of these are behaviors that we want
to see on the site. Reputation translates into jobs,
actually, and so the monetization scheme, if you
could call it that, of Stack Overflow, is to show job
listings to people. And because we have this large
audience of people that use Stack Overflow, and we know
what they know and what they’re good at, we can fill
positions very well. All right, government. Every culture of three or more
people has some kind of government. Two or more people? One person and a dog
has a government? There’s always all kinds
of rules around there. We try to push a lot of the
simple governance, the policing, on to the
population. And so, as you earn reputation
on Stack Overflow, we decide basically, all right. You know how the system works,
and so we can let you do stuff to self-moderate, so that the
community can sort of self-moderate itself
in many ways. So for example, if you have
1,500 reputation, you can create a tag. If you have 10,000 reputation,
you can vote to delete a question which has already
been closed, et cetera, et cetera. So as you earn more and more
reputation, you get rights to do stuff on the site. And that sort of the mass
terrorism of the population terrorizing itself. I’m trying to think what Kafka
story that corresponds to. But government also– Another interesting thing is– Well, let me do a poll
in this room. How many people in this room
use Stack Overflow? In any way, shape, or form? Everybody. How many people have been
on Meta Stack Overflow? If you look around,
that’s about 10%. It’s much smaller. Meta Stack Overflow is the
place where the actual policies of the site– it’s the site about
Stack Overflow– are discussed. And it’s sort of the back room,
if you’re really deeply involved, deeply interested. You may actually also be
interested in going on there to see kind of how the things
govern themselves. And this is sort of the
equivalent of the civic society kind of club, or
whatever, where people talk behind the scenes. And then there’s an even deeper
level, which is– How many people have been
in our chat system? That is three. Four. So this is the teacher’s
lounge. This is where the 275
moderators on the network hang out. And there is a chat
system there. Anybody can go in there, and
make rooms, and have conversations out there. But again, it’s sort of the
highest level of deep, dark engagement in Stack Overflow. As you become bored with
the boring little question-and-answer game that we
gave you, this is the place where you can actually sort
of talk about things. And that’s actually where
the live government happens. And this is actually where
you’ll see live decisions being taken and discussed by
moderators about how to moderate certain things. We’ve also got a blog. And the blog is all about sort
of promulgating the decrees from Stack Overflow, with
Stack Exchange central. And we have laws, so let me talk
about law, because every society has its rules
of the government. And government essentially
comes up with the laws. We only have one important rule,
which is we hate fun. This is the logo of Stack
Overflow hates fun. And we hate fun is all about
how there are millions of things you’re not allowed
to do on Stack Overflow, especially anything that you
might enjoy, or that might be popular, or that might get on
Hacker News, or Reddit. And that comes from sort of an
observation that there’s a lot of things you could do in
discussion groups online that don’t really leave
a useful artifact behind on the internet. So a very, very important
observation, which I have to keep repeating again and
again and again, because nobody gets it. But this is the most
important thing. If somebody asks you about the
design principles of Stack Overflow, of Stack Exchange, the
most important thing you have to remember is that
the question is asked by one person. It’s answered by, let’s
say, one to four, five people usually. But it’s viewed by hundreds
of people. And hundreds of people will
get benefit from that question, out of just that one
person who asked for it. So if you ask us who we’re
optimizing for, it’s not the person asking the question. It’s not the people answering
the question, although we want them both to be somewhat
happy. We’re doing this all for
the hundreds of people. A very fundamental part of the
initial design direction of stack overflow is that
Google is the user interface to Stack Overflow. You’re on our site because you
typed a question on Google. I used to say search engines. [LAUGHTER] Google is our user interface. You typed a question on Google,
and you found a page. And if we have inventory
there, it has to be really good. And that means everything is
optimized for creating this great artifact, this
historical record. And the biggest problem with
Usenet and the old phpBB sites, and Experts, whatever
it was called. Can’t remember. And all kinds of other sites,
that were out there, the biggest problem is that the
inventory that they had to give Google was just these
crystallized, captured, old conversations that took
place a long time ago. And there were all kinds
of things that were wrong about it. First of all, it’s
a conversation. In a conversation, you see this
back and forth and you see this ridiculous,
hey, did anybody– Does anybody know how to
solve this problem? I don’t know, but
did you try X? Yeah, that didn’t work. Well, did you try this
other thing? Look, I already asked you if you
said da da da, and I said, in my thing, da da da. OK. Hey, does anybody know the
answer to da da da? This is now answer number 17. Hey, does anybody know the
answer, because I’m having the same problem– one year later. And then, if you do get an
answer, it’s on page seven, and it’s wrong, and it involves
some kind of a cross-site scripting
vulnerability when you copied the code. So if you’re not thinking about creating a useful artifact– And then Google also has this
incidental problem, which is that the older a page is, the
more respectable it looks. It’s PageRank. And so, a lot of times you would
search for things, and they would get you answers that
were just really old. And that were crystallized and
that were not even close to being the actual solution
to a problem. So we started optimizing
around that. We said, you’ve got to vote. You’ve got to stop having
conversations. You’ve got to have everything
be editable. Because, for us, it’s the
artifact that we’re creating, a lot like Wikipedia. There are people asking
questions, and there are people answering them, but the
goal is for the 300 people that find that question
later, to be able to get a good answer. So we have these things
called close reasons. And people hate us because
we close questions. And they’re like, this is the
end of Stack Overflow. It’s going to all collapse
because you moderators are heavy handed. And you’re closing all kinds
of useful stuff that would have been fun, or entertaining,
or had thousands of page views, or went
all the way to number one in Hacker News. And we have five reasons
for closing a question, specifically. And these are the five. And I’m going to talk
about them. They’re mostly non-obvious,
right? Exact duplicate. Pretty obvious. But remember, the reason we
close duplicates is because, as long as we’re trying to
create a record for how to solve a particular problem, we
want to create that record in one single place. Because then you have everybody
going to that page and it can be more canonical. It can be more useful, more
helpful, if you have 100 eyes on that question, instead of
50 eyes on this one and 50 eyes on that one. The question is just
going to get better and better over time. So here’s an example of an
exact duplicate question. Can anyone explain Monads? And then that goes off
to another page. It goes into great
detail on that. We don’t get a lot of things
closed as exact duplicates, because they really
have to be exact. If we have a slightly different
question, we do want to answer it. OK, this is what we
call off-topic. Toilet issue in my company. Not strictly a programming
problem, but any help will be appreciated. I have a lady employee who is
joining the company tomorrow, and I want to convey her a
message that the bathroom facilities in the office
are out of order. How do I tell her to relieve
herself before arriving in the morning? It is technically not a
programming question. And while it may be important
to you, and this is a respected member of our site,
we’re not going to answer it. Sorry. It just doesn’t belong
on the site. Off-topic. People generally understand
what that means, to be off topic, although we have
a rather narrow definition of on-topic. OK. Not constructive. Again, the word not constructive
is not clear what that means to people. Question is not a good
fit to our format. We expect answers to
involve facts, references, specific expertise. This question will likely
solicit opinion, debate, arguments, polling, or
extended discussion. Those are all the
things we hate. Once again, opinion, debate,
argument, polling, extended discussion. They’re great things. But they do not create
an artifact. They do not create a useful
resource that anybody can learn from. If you’ve ever seen one of these
Emacs versus Vim, or Mac versus Windows. This one is Web forms versus
MVC developer. If you’ve ever seen one of these
debates on the internet, they always look so fun. But they’re not. And they’re not interesting. And what’s interesting
about these debates, is they get heated. And then it draws a crowd,
because everybody wants to see the fight. Because you’ve evolved. If there’s something dangerous
going on, a tiger is eating a person, you’ve evolved to pay
close attention to that tiger eating the person. And that is way more important
than the people being friendly over there. You pay attention
to the tiger. And so the stuff that
draws in the page views on the internet? It’s conflict. And it’s useless
as an artifact. It’s fun to participate in, so
go to Hacker News or wherever. Go elsewhere. Buy a subscription on
experts-exchange and have your debates. But on our site, that’s not
what we’re trying to do. And so we want things that can
be answered, that are useful, that have answers. We don’t want “which is better,
x or y?” We don’t want shopping questions. “Which video
monitor should I buy?” That stuff, believe it or not,
I know you need to know which video monitor you should buy. And I know that the– or display
adapter– and I know that the programmers on Stack
Overflow all have amazing opinions about this and they’re
great people to ask, but you can’t do it on our site
because you’re going to make a useless piece of crap
that’s going to come up in someone’s Google results. And they’re going to be pissed
off because they will have wasted time. So we close them. It’s not constructive. Now, not a real question
is another rule. Need ideas about mobile apps,
Android iPhone, which has never been created before. Thanks for the help, please. I mean this is– You can see how it’s
a question. It’s got a question
mark and stuff. And he’s going to get a lot
of answers, probably, I think. But this is what we call the
question is ambiguous, vague, incomplete, overly broad,
rhetorical. You know, one thing we have is
we basically say, if your question is one sentence, and
the correct answer would be a book, just don’t ask it. Because we don’t want to see
those questions on our site. OK. I’m missing a bracket
somewhere. [LAUGHTER] I think you recognize
this question. [LAUGHTER] This is– we have a weird
terminology for that. We call this “too localized.”
That’s because Jeff Atwood doesn’t know anything about
localization or internationalization, so he
used that word for another purpose, which is a question
that really only applies to one person in one
circumstance. It’s never going
to apply again. Once again, to the global
internet audience, it’s just not going to be useful to
anybody ever. And so, sorry, don’t ask it. We’re not your debugger. A great question would be “I’m
using such-and-such a programming language, with
such-and-such debugger. What’s the easiest way to find
missing parentheses.” That’s a great question. Or “how would you approach
finding missing parentheses?” That’s a good question, and
I’m sure it’s in there. And I’m sure it got
a good answer. But find it for me,
it’s like no. Go away. Go away. This is– the other example that I give
people that are not programmers of a too-localized
question, “Why is there a green Honda Civic parked on
my street?” I don’t know. Is it still there? Go check. Where do you live? So this is Seoul, Korea. Stack Overflow. Stack Exchange. 25 million people. Seoul is about 20 million
people, the population of Seoul. And so the number of things that
are going on in, like, one of the world’s largest
cities, that’s essentially what happens on Stack
Exchange. And as you could tell, there
are certain things that are common denominators. I mean, really, everybody in
Seoul speaks Korean, except me, when I was there. But everyone else does. And yet, there’s lots
of high rises, and lots of little buildings. There’s all kinds of
different cultures. There’s all kinds
of subcultures. There’s also some things that
are kind of constant. When you think about the
millions of stories that are going on in a big city, that’s
what’s really happening in Stack Exchange. I mean, we started out
building software. When I started building
software, we were lucky just to get computations. We ended up building software
for hundreds, or thousands, or millions, or millions upon
millions of people, where the actual interaction between the
people is what we’re trying to create and what we’re trying
to make happen. And that’s where you
need anthropology. So thank you very much. That’s my prepared talk. I will take questions. [APPLAUSE] I will take applause. AUDIENCE: Is it really true that
participation in Silicon Valley is low? JOEL SPOLSKY: Participation
in Silicon Valley– No, I don’t think it’s true
that participation– AUDIENCE: Can you
repeat the question? JOEL SPOLSKY: Yeah, the question
was is it really true that participation in Silicon
Valley is low. I’ve never actually compared it
to the actual population, but Silicon Valley does not
have a whole lot of programmers, as a percentage
of the programmers in the world. We have our own little
theories, like most programmers in Silicon Valley
have other sources. If you work at Google, actually,
there’s 800 people you could ask questions if you
need help programming. Whereas, if you work at the
Department of Forestry in Nebraska, there may be nobody
that you can ask those questions to. You’re also, if you work at the
Department of Forestry in Nebraska, probably
under-challenged at work. And so finding a place where
you help other people by answering questions may
help challenge you a little bit more. Please don’t challenge Jon
Skeet any further. Yes. AUDIENCE: I also find that Stack
Overflow is more useful to me than a book. [INAUDIBLE] small, whereas [INAUDIBLE]. JOEL SPOLSKY: Stack Overflow
is more useful than a book, was the comment. But the granularity is really
tiny, and so it doesn’t really quite scale in the same way. One of the things that I said,
when I was starting this, I had written some books and so I
was friends with the people at Apress, who published
programmer books. And I actually said to the
founder of Apress, and then the founder of O’Reilly, I
said, you know, what we believe is that people are going
to stop learning how to program from books. Or that they already have. And that the way to learn
programming is usually going to be to find a tutorial, or
some sample code, or to take over someone else’s code. And just to start typing
and, essentially, page fault in knowledge. Every time you get stuck, type
a question in Google. Try to learn that one thing. And then move on and
keep doing that. It’s not a very thorough way to
learn programming, but that seems to be the way most of
them are learning it. And the founder of Apress wanted
to invest, and the O’Reilly people were very
insulted that I should say that books are going away. And they made a competitive site
to Stack Overflow called O’Reilly Answers, which
is still on the web, believe it or not. And has at least five pages. Yes. AUDIENCE: You mentioned,
actually, that there is a badge for [INAUDIBLE] asking
a really good question or answering your own questions. JOEL SPOLSKY: There
is a badge for answering your own question. Yeah. AUDIENCE: And I was kind of
wondering, is there interest in avoiding dupes? And also, there are certain
questions that people would ask that you could probably deliver
an answer to based on analytics of the
existing data. JOEL SPOLSKY: Right. So the first question is are we
really deliberately trying to avoid dupes? And secondly, a lot of times
we can answer questions– We already have the answer to
the question that they’re actually typing. So we have a feature right now
that, when you type the title of a question, we’ll actually
do some keyword searching, which is pretty naive, and show
you some other questions which might answer your question. That’s done in a relatively
naive way, without very much machine learning behind it. And nevertheless, it does
successfully intercept an awful lot of questions being
asked another time. We do very much want to
get rid of dupes. However, there is sort of a
syndrome on Stack Overflow we haven’t been able to cure
completely, where people will answer a question that has been
asked 1,000 times before, either just to earn reputation
quickly, or because it’s more fun and easier to answer a
question the 37th time. They’re just like, hey, I can
answer this, da da da, rather than actually try to search and
see if that thing’s been asked before. So we do get an awful
lot of dupes. We’re going to start trying to
attack that with the machine learning, and better machine
learning algorithms, over the next year or two. But that is– the dupes
are actually still kind of a bit of a problem. We’re also starting to see one
of the problems that happens with dupes is that the answer
has changed over time, right? Like, Android has changed. Something you couldn’t
do, you now can do. So there’s 37 things saying you
can’t do it, all of which are highly ranked in Google, and
then there’s one thing saying, no, you can do it now, or it
has changed, or whatever. And those are too new,
actually, to rank. And so, one of the problems
with the lack of de-duping that we’re doing is that we do
have stale answers that just don’t get enough eyeballs to
fix them, to edit them. So that’s something we want to
work on, and we’re going to rely on machine learning for
that, because humans obviously don’t want to do it for us. Yes. AUDIENCE: So how concept
[INAUDIBLE] work outside of programming. So I’m envisioning
a [INAUDIBLE] used and successful. So how successful are
we [INAUDIBLE] can you actually discuss the
Middle East, something– JOEL SPOLSKY: Well, the Middle
East, there are no answers. So you can’t discuss the Middle
East on Stack Exchange. So the question was how
do you discuss– how well does this
scale outside of the programming community? Does it even work outside the
programming community? And how well are other
communities going to work? And could you ever discuss the
Middle East on our network? There are now 90 Stack
Exchange sites. I think 35 of them have
graduated from beta. We have a beta process that’s
pretty rigorous. We don’t let them graduate until
we think they’re going to be around for
the long haul. If you look at the site’s
statistics, a lot of the growth is coming from the Stack
Exchange– what we’ll call the Stack Exchange network,
which is already probably about a third of our
traffic, and probably gets as many unique page views as Stack
Overflow itself, the other sites. However, a lot of them are
kind of semi-geeky. Like, they’re not programmers,
but they’re Server Fault for system administrators, Super User
for PCs, WordPress, Drupal, database administrators,
TeX, the math typesetting language. So there’s a lot of these
sites that are– like a lot of our traffic
comes from fairly geeky domains. And it’s not clear whether
that’s a historical accident, like, we started with an
audience of programmers, and then we say what else you
want to talk about? And they said Drupal. Or if that’s because there’s
actually something about this mechanism that appeals to
programmers and it works better for programmers. However, we are getting pretty
far afield, and we have some sites that are pretty
successful that are not geeky at all. I mean, there’s a
parenting site. There’s a home improvement
site. Photography, which is geeky
in its own way. But the home improvement site
is kind of interesting, because it’s a bunch of
contractors talking about drywall application
techniques. So we are starting to
move beyond that. And all the growth is happening
on the Stack Exchange side. Not all of it. Stack Overflow is growing now
at about 56% a year, because we already have– just in terms
of reach, like number of unique visitors. Stack Overflow has only grown
56% last year because it’s hard to find more programmers,
at this point, that are not using it. But the Stack Exchange network
itself is growing at 350% a year, so much, much faster. And it’s already, I think, a
third to half of our traffic. So it’s getting bigger faster. We think that it’s sort of like
a system of concentric circles, right? We’ve got the programmers. And now, you ask the programmers
what they want to talk about, and they might say
other programmer-y things. And some of them are going to
be photographers, but there are no programmers who are
also lawyers, very few. Because those are both sort of
all encompassing things. And so we haven’t gotten to
law, although we have an announcement coming next week. So pay attention next week, when
we announce something. Yes. AUDIENCE: Do you think Quora is
a different kind of market? Or– JOEL SPOLSKY: The question
is would I say Quora is a different kind of market. Yeah. I mean, I don’t want to speak
for the Quora guys. To me, it looks like provoked
blogging, meaning it’s sort of like a blogging platform, except
that there’s kind of all kinds of provocation on
there to write an awesome blog post about your opinion, about
a particular thing, that happens to address a particular
question. So I feel like they don’t– because Quora is wider, and
anything is allowed on topic, it hasn’t attracted experts in
really anything other than Silicon Valley start-ups. And there’s no particular
field of expertise where you’ll find those experts
on Quora yet. And I don’t know if there
can be, because when I look at Yahoo! Answers, if I were a Fields
medallist mathematician, I would never think
to go on Yahoo! Answers and ask a hard
math question. Just like they won’t go
to Quora and ask it. But when they see Math Overflow,
they will, because they see a whole bunch of other
really, really hard math questions around, and
so that makes sense. So I think that, if you start
with something that’s broad and horizontal in general,
you’re never going to attract the hard-core obsessive experts,
or the people who this is their job, it’s
their profession. You can’t get vertical
from horizontal. On the other hand, if we can
build a lot of verticals, we can, at some point, be kind of
indistinguishable from a horizontal because we’ve
got all the verticals. Yes. AUDIENCE: I seem to remember
that Stack Exchange does not have an open source– or it’s not open source. So if I’m a company and I want
to use something internally or I want to use it for
my [INAUDIBLE] or something like that, how? JOEL SPOLSKY: Yeah. Stack Exchange is not actually
open source itself. So we actually think that the
valuable data is the text that people have typed. And that is open source in
the sense that it’s all Creative Commons. So we have a public open
API, where you can access any of our data. We have database dumps where
we would take an entire SQL database and make it available
on a monthly basis for anybody that wants to download it
and do anything that they want with it. Remix, reuse, whatever. Just don’t– If you put ads on it, and then
remove all the links to Stack Overflow, then we’ll probably be
pissed off, but we may not be able to do anything
about it. But the actual software itself
is sort of a little bit incidental to our system. So it’s not open source. There are multiple open source
clones of Stack Overflow that you can get. OSQA is probably the biggest
one, which is an open source project that kind of clones
the way it works. We think, again, the value is
in the community and the questions that they
ask and answer. Jon Skeet. JON SKEET: You talked about
trying to put off people who aren’t natural members
of the community. It seems that every day on Meta,
there is someone who has got upset with being down-voted
and says, why are you so harsh when, OK, I clicked
saying I understand all this and then I posted
a rubbish question. Do you think that you will– Assuming that has to change in
some way, is it going to change by better education,
so they really don’t ask a rubbish question? Or are we going to put them
off so they don’t even ask [INAUDIBLE]? JOEL SPOLSKY: That’s a
really good question. So the question is sort of
every day, you have large numbers of people showing up on
Meta being angry that their question has been closed,
because it’s idiotic, and it doesn’t follow the rules,
or it’s badly formed, or whatever it may be. And they are very, very
adamant about their rights to– There’s a sense of entitlement
that people believe, well, you’ve given me an edit
box on the internet. I have a right to type words
in that edit box. And then when it gets down-voted
or deleted, they sort of say, wait, what
is going on here? How do I not have the right for
my words to appear on the internet for everybody to see? And this is a bit of
an ongoing problem. So we beat those people down
and they get upset and complain and they say that Stack
Overflow is getting– that the moderators are full
of themselves and whatever. I’m being censored, et cetera. And then they– and then, when
they say they’re getting censored, of course anybody
who’s listening from far away is like, you shouldn’t
be censoring people on Stack Overflow. That’s horrible. With a site as large as
Stack Overflow, this is a growing problem. We have 7,000 questions a day. They’re not all gems. There is a large category
of people who cannot do their jobs. They have been hired as
programmers, and they’re incapable of being
programmers. And the only hope that they
have, somebody has told them, well, just type whatever your
boss told you to do into Stack Overflow, and somebody
will help you. And the bad thing is that, sort
of like if you’ve ever done dog training with the
intermittent reinforcement, sometimes they get answers,
so they keep trying. All of you would really get
swatted down badly enough. I don’t know what the natural
end state is going to be. Education, as you mentioned,
would be helpful. We’ve tried the thing, which
I’m not a big believer in, where somebody new shows
up at the site. They ask a question. You say wait stop. Read all this text. Make sure you are doing
all these things. It’s the Eric Raymond “How
to Ask Questions on the Internet.” It’s a
book this long. And when you’re done,
you probably won’t have a question anymore. You’re not one of those people
that has to ask questions. Because the exhaustive nature
of what Eric Raymond would have you do before you can
ask your question. And the truth is sometimes you
put the best questions on Stack Overflow just by not
looking something up. Just by saying, you know
what, I’m reading this documentation. And it makes me wonder
da da da da da. And maybe I could get the answer
somewhere else already, but I’m just going to ask on
Stack Overflow because somebody’s going to answer it. And that’s going to help 100
people who are going to have that same exact question
when they read the same documentation that
I just read. So I don’t know if there’s
a great answer to this. There’s stuff that
we’re definitely doing to work on it. The number one thing that
we’re trying to do– I would say there’s sort of two
things we’re trying to do to work on it. One is, in the very short term,
we have a contest up on Kaggle which, if you’re into
machine learning, go answer that Kaggle contest, you can
win valuable prizes. And it’s a contest to identify
questions that would likely be closed, just based
on their text. So what we’re going to try to
do is develop some kind of machine learning algorithm that
can look at a question and predict whether or not it’s
later going to be closed. We actually discovered that one
of the strongest signals, before we did this
Kaggle contest, the strongest signal– Does anyone want to guess what
the strongest signal is that question is going
to be closed? AUDIENCE: Spelling errors? JOEL SPOLSKY: Spelling
errors, nope. AUDIENCE: Length. JOEL SPOLSKY: Length? AUDIENCE: Length. JOEL SPOLSKY: Length, no. AUDIENCE: [INAUDIBLE]. JOEL SPOLSKY: Sorry? Books? No. AUDIENCE: [INAUDIBLE] poster. JOEL SPOLSKY: All right, I’m
just going to tell you. Sentences that start with a
lowercase letter is one of the strongest features,
essentially, if you try to do the machine learning. So we’re going to try to
improve that so we can actually try to block some
questions earlier. But all that really does
is make people capitalize their sentences. They really want to ask their
stupid question and then it needs to be– We’re going to keep
working on that. In the long run, one thing
which worries me about Wikipedia– and that allows me to launch
into a little speech about what Wikipedia– Wikipedia also has these
non-intuitive rules. We have this non-intuitive
rule. Don’t ask shopping questions,
for example. This is non-intuitive because
you might think oh, this is a great place to ask which 30-inch
monitor should I buy? But it’s not. And don’t ask subjective
questions. This is a really, really
important rule for us. And I have tried to explain– I have now explained to you
why we have this rule. And many of you can go back
and say, well, here’s why Stack Overflow closes questions
that they think are too localized. You may not agree, but at least
why we’re doing that. But Wikipedia has similar
rules that nobody gets. And they’re always confused
by these rules. So for example, there’s a rule
that says we need tertiary sources, not primary sources,
for Wikipedia. Wikipedia is making an
encyclopedia and we don’t allow say, Joel Spolsky to go
into the article about Joel Spolsky and correct things
that I know to be false because I am not trustworthy. Who is trustworthy? The Village Voice. The New York Times, people that
know absolutely nothing about me and actually, the
information that they have, they got from me, when they
interviewed me that time. So it seems like a funny
rule on Wikipedia. And this rule– There’s sort of another rule
about notoriety, which is you cannot have an article on
Wikipedia about something which is not notorious enough. And then people start saying, well,
the internet is not running out of pages. Why do you have to shut down
this awesome article I wrote about the five tires that are
stacked up in the garage at the new Google hacker space,
or whatever this is. Why can’t there be an article
about this on Wikipedia? I’ll make it, where
we’re all here. We all see that we can write
down the model numbers and the paint color, and stuff
like that. But there’s no notoriety
there. That’s not notorious. So who cares. Well, the reason we care is
because there’s not anything published that we can go back
in a book, in a library, to check if those facts on
Wikipedia are correct. So you have to have this
combination of two rules, that the thing be notorious
and that all the facts have citations. If you don’t have those two
rules, stuff gets in Wikipedia, which is
not verifiable. We don’t care if it’s
right or wrong. It just has to be verifiable. There just has to be
a way to check. Because if there’s no way
to check, it could never possibly be right. And when you think about
this logically, you say, aha, I see. I now understand why Wikipedia
has to have those two rules, which sound really nasty
and anti-democratic. Some famous writer, I’m
trying to remember who it was, Philip Roth? Yeah, just went on Wikipedia
to try to correct something about one of his own books. And the editor told him, look
I know you’re Philip Roth. I get it. We don’t accept you as the
source unless it’s been published somewhere. So I think what he did is he
did an interview somewhere, and then cited the interview. And they were like OK. And that’s usually the
way this is resolved. But that rule sounds
so ridiculous. And people just get angry with
Wikipedia and they actually drop out of Wikipedia. They say, I am sick and tired of
trying to correct Wikipedia entries that are just
wrong because I can’t do this anymore. And we have the same fear
over Stack Overflow. We keep closing localized
questions and people drop out of the community because they
feel like we’re too strict, or we’re nasty, or our moderators
are cruel. Then we’re going to lose them. But we have to have
those rules. That’s why we have a site that’s
awesome because we have sort of strict rules. So, education. I’m all out of time. Thanks very much for
coming to hear me. I really appreciate it all. You’ve been great. Keep answering questions
and asking questions on Stack Overflow. I’ll be hanging out here
for at least another 15 minutes or so. If you have any more questions
you want to ask, come up and ask me. Thank you. [APPLAUSE]

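Editorial sketch: Joel mentions that sentences starting with a lowercase letter turned out to be one of the strongest features for predicting whether a question will later be closed. The talk does not say how that feature was computed, so the function below is only one plausible way to turn the observation into a number. The sentence-splitting rule and the name `lowercase_sentence_ratio` are assumptions for illustration, and a real classifier would combine this with many other features.

```python
import re


def lowercase_sentence_ratio(text):
    """Return the fraction of sentences whose first letter is lowercase."""
    # Naive split (an assumption): a sentence ends at '.', '!' or '?'
    # followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    starts = []
    for sentence in sentences:
        # Skip leading digits, quotes and punctuation; look at the first letter.
        first_letter = next((ch for ch in sentence if ch.isalpha()), None)
        if first_letter is not None:
            starts.append(first_letter.islower())
    if not starts:
        return 0.0
    return sum(starts) / len(starts)


print(lowercase_sentence_ratio("need ideas about mobile apps. thanks for the help, please."))
print(lowercase_sentence_ratio("I'm missing a bracket somewhere. Where is it?"))
```

As Joel points out, a filter built on a signal like this mostly teaches people to capitalize their sentences, so it is better read as a hint for reviewers than as a gate on posting.
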
3 thoughts on “Founders@Google Presents: The Cultural Anthropology of Stack Exchange”

  1. "Until you start noticing, that the culture is acting according to the [x], then you may forget to design the [x] in a way that makes the culture work."
    – such a beautiful multidisciplinary comment: software, legislation, product development, architecture, leadership—everything, basically.
