The JSON Saga

This is an unofficial transcript of a talk delivered by Douglas Crockford. There is more than one video of Crockford delivering a talk with the title "The JSON Saga." The one covered here was uploaded to YouTube in August 2011.

— Mike Amundsen

Status

Video: YouTube
Audio: MP3
Transcript: HTML

Transcript

Douglas: Good evening, I’m Doug Crockford of Yahoo! This is the JSON Saga. This is going to be the true story about JSON. So first, a warning. I am a heretic. So if you don’t wanna hear any heresy, I recommend you leave right now. I discovered JSON. I do not claim to have invented JSON because it already existed in nature. What I did was I found it, I named it, I described how it was useful. I don’t claim to be the first person to have discovered it. I know that there are other people who discovered it at least a year before I did. The earliest occurrence I found was someone at Netscape who was using JavaScript array literals for doing data communication as early as 1996, which was at least 5 years before I stumbled onto the idea. So the idea’s been around for a while. What I did was I gave it a specification and a little website. All the rest of it happened by itself and I’m going to explain what that means.

So the story for me starts in 2001. Chip Morningstar and I started a company which was going to be developing application frameworks for what today would be called Ajax and Comet applications. We didn’t know what to call the company, so our working title was Veil, the idea that we would unveil the company later when we’d figured out what the name was going to be. Even though it was only a temporary name, I designed this logo for it. And I was really sorry that it was going to be a temporary throw-away name because I really like the logo. I thought it turned out really, really well. So what we became was State Software. And our advertising agency came up with this logo: a couple of frisky parameciums. And the negative space in between them kind of looks like an S, as for State. So that’s what we called it. So this is the very first JSON message. Not quite as momentous as "what hath God wrought?" or "Mr. Watson, come here, I need you!" We didn’t know that we were making history at the time. We were in Chip’s garage. His server sent this message to my laptop, in response to a form submit post. And this is what came back. So the text in green was the very first JSON message.

In our framework, messages were addressed to specific objects. And in this case, it was addressed to the session object. Usually, the session object would be instantiating objects which represent the application. In this case, it was just handling a test. So do was the method, which was the test here. And so we embedded it in an HTML document because it came back and got put into a frame as part of the submit process. We did it this way because it worked as well in IE as it did on Netscape 4. And it was really important for us to work on Netscape 4 in 2001 because it was still an important browser. There’s a lot of talk about how awful IE 6 is, but at that point in time, IE 6 was the best browser that had ever been. Netscape 4 was so bad, it made Microsoft look brilliant and competent. That’s just how bad it was. It was a crime against humanity. We wanted to be able to support it because there were a lot of technologically backward companies that were stuck on it; they would not allow their employees to use IE 6. And we wanted to do business with some of those, including Sun Microsystems and IBM. So this was the scheme we came up with to do the communication at that time.

So the document contained a script. The first line of the script set the document.domain, so we could get around the same origin policy. And the second statement called the receive method on the session object in the containing frame. And so that was how we caused the message to get delivered, and then the message was included in the document. So it was a really nice form. I’d like to tell you that this worked, but it didn’t. The very first message we sent failed. And the reason it failed was, we had a reserved word in there. It turns out do is a reserved word in JavaScript. So it took us a while to figure out what happened. Did you send it? Yeah, I sent it! Where’d it go? So this produced a syntax error, and it took us a couple of minutes to figure that out. That was when we discovered the unquoted name problem.

It turns out ECMAScript 3 has a whack reserved word policy. So reserved words must be quoted in the key position, which is really a nuisance. When I got around to formalizing this into a standard, I didn’t want to have to put all of the reserved words in the standard, because it would look really stupid. And at the time, I was trying to convince people: yeah, you can write applications in JavaScript, it’s actually going to work and it’s a good language. And I didn’t want to then, at the same time, say: and look at this really stupid thing they did! So I decided, instead, let’s just quote the keys. That way, we don’t have to tell anybody about how whack it is. And that’s why, to this day, keys are quoted in JSON. It also had a secondary benefit in that it significantly simplified the JSON grammar. If you have names, that means you have to have some definition for what a letter is. And it turns out when you’re using Unicode, the question of what is a letter is surprisingly complicated. And by saying hell with it, we’re just going to quote everything, we completely avoided all of that complexity. Also, it turns out Python has the same notation built into it, and Python does require the quoting of the keys. So that kind of aligned us with Python, and we thought it might make us more attractive to the Python community.
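Illustration (not part of the talk): a minimal Python sketch of the quoted-key rule, using the standard json module. The message contents are hypothetical; only the key name do comes from the talk. It shows that a message with quoted keys parses, while the same message with bare names is rejected.

```python
import json

# Quoted keys: valid JSON, even though "do" is a JavaScript reserved word.
quoted = '{"to": "session", "do": "test"}'
message = json.loads(quoted)

# Bare names: fine as some JavaScript object literals, but not JSON.
unquoted = '{to: "session", do: "test"}'
try:
    json.loads(unquoted)
    unquoted_ok = True
except json.JSONDecodeError:
    unquoted_ok = False

print(message["do"], unquoted_ok)
```

Because every key is just a string, a parser never needs a table of reserved words or a Unicode-aware definition of an identifier.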

Another problem we found in using HTML as the envelope for JSON was that if any of the strings in your data happened to look like HTML—and in particular if it happened to look like a script tag—that would close the block right there. Then you would get a syntax error because you didn’t get the whole thing delivered, which was a nuisance. So another thing I added to the JSON standard was tolerance of a backslash in front of a slash so that we could avoid that. So now you’ve got stuff that looks like HTML but doesn’t look like HTML to the browser that was necessary to get stuff through. JSON doesn’t require that you escape the slash, but it tolerates it, and this is why it tolerates it. We decided to give it a name, so we called it JSML, rhymes with dismal, the JavaScript Message Language. But it turned out there’s another standard that nobody has ever heard of in the Java world, called Java Speech Markup Language. So I was like OK, we need to come up with another name. So we came up with JSON: JavaScript Object Notation. There’s a lot of argument about how you pronounce that, but I strictly don’t care. I think probably the correct pronunciation is "Je son".
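Illustration (not part of the talk): a small Python sketch of the tolerated backslash-slash escape, using the standard json module. The replace step is one assumed way to keep a closing script tag out of the serialized text; it is not taken from the talk or from any particular library.

```python
import json

# JSON text in which the slash is escaped, and the same string unescaped.
escaped = r'"<\/script>"'
plain = '"</script>"'

# Both decode to the same value: the escape is tolerated, not required.
decoded_escaped = json.loads(escaped)
decoded_plain = json.loads(plain)

# Python's encoder leaves "/" alone, so this sketch escapes "</" afterwards
# to make the output safe to embed inside an HTML script block.
safe = json.dumps("</script>").replace("</", "<\\/")

print(decoded_escaped == decoded_plain, safe)
```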

But we found it worked really well. So it was extremely effective for the thing that we invented it for, being browser-server communication. But we also used it a lot for inter-server communication. Our platform scaled hugely, and so we could have lots and lots of boxes, and they needed to be kept in sync. And we found JSON was perfect for sending messages between the servers. We also used JSON to implement a simple database. So we just have keys, and for each key, we’d store some JSON data. And so it made it really efficient for storing stuff and getting it back. So we liked it a lot, and we tried to convince our customers that it was good. And our customers said, "Well, we hate it because we’ve never heard of it." And some of our customers said, "Oh, I wish you’d told us this six months ago because we just decided to go with XML, so we can’t consider anything else now." And some of the people we talked to said, "It’s not a standard, so we can’t use it." I said, "It is a standard, it’s a subset of ECMA 262." They said, "No, that’s not a standard." Okay. So in order to use this, I have to declare that this is a standard. So that’s what I did. I decided it’s going to be a standard from now on.

So I bought JSON.org. I put up a one-page website that described JSON. And on that one page, I had the grammar for the language three ways: in a simplified BNF, in a format that Bill McKeeman of Dartmouth recommended; in Railroad Diagrams, which I really like, and which date back to Burroughs; and in informal English. I figured anybody who’s going to use this has got to be able to understand at least one of those. And then I included a Java reference implementation, just so that people could look at code that actually parsed JSON, and see how you did it. This was very late in 2002. And by that time I decided to retire. We had spent the two years previously trying to raise money in the post-bubble, post-9/11 environment. It was just way too hard to raise money, and by that point, we had run out. I decided to do something else for a while, so I went back into consumer electronics. I was doing consulting on high definition television and the digital conversion. I thought I’d let everybody else worry about the internet for a few years. That’s all I did. Basically, I put a message format in a bottle, threw it into the internet, and I was pretty much done with it at that point.

Over time, a number of people stumbled onto my web page and looked at it, and said, "Yeah, that looks like something I could use." And they started using it. And then a few of them started sending code back to me. I got contributors who said, "I’ve done a port of the JSON stuff to Ruby, or Python. Can you put a link to my stuff on your page?" And so yeah, okay, I could do that. And after a while, I got support for all of these languages. One of the benefits of having a really simple description of a data format is it doesn’t take much code to implement it. And when you’ve got code that’s this easy to write, there are a lot of people who will be willing to write it and share it. So there’s all this stuff out here for all of these languages. So you can have applications written in any pair of these languages, and they can communicate using JSON. And it’s because JSON is the intersection of all of these languages; it’s the intersection of all modern programming languages.

All languages have some sense of data and structures of data. They all have simple values like numbers, strings, and booleans. They all have some sense of a sequence of values. Different languages will call it different things; some say it’s an array, some say it’s a vector, some say it’s a list or some other thing. Every language has some sense of a collection of named values; it might be an object, or a record, or a struct, or a hash, or a property list, or something. All languages have these; these are universal ideas. Every language expresses these differently and will add a lot of other stuff on top of it, like type systems, and semantics. But they all have the same idea about what the data looks like, and JSON has the thing that’s common to everything. And by being at the intersection, it turns out to be the thing that everybody can agree on. And so it’s really easy to pass data back and forth. Prior data interchange formats tended to try to be the union of all the languages, and that turns out to be horrendously complex, and really difficult to deal with. JSON, by being so simple, actually became really easy to use.
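Illustration (not part of the talk): a short Python sketch of how the simple values, sequences, and named collections described above land on one language's native types. The sample document is made up for this example.

```python
import json

# A hypothetical document that uses every kind of JSON value.
text = ('{"name": "JSON", "year": 2001, "stable": true, '
        '"tags": ["simple", "small"], "version": null}')

value = json.loads(text)

# Record which native Python type each JSON value was mapped to.
kinds = {key: type(item).__name__ for key, item in value.items()}

# Serializing and parsing again yields the same structure.
round_trip = json.loads(json.dumps(value))

print(kinds)
```

Other languages would pick their own equivalents (a hash, a vector, a struct), but the shape of the data stays the same.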

On the JSON site, there are examples of how you can implement a JSON parser in lots of different techniques. This is a snippet from a recursive descent compiler. Really, really easy to write. This is a snippet from a finite state machine, using a push-down automaton. Most of the work happens in the green statement, in which we go to a table and get to the current token, and the current state, and execute the function that’s stored there. Turns out JavaScript is brilliant for writing state machines because you can put functions right in the state transition tables. So really, really nice for that. The way most people use JSON and JavaScript is using either the JSON 2 library or something very similar to it, in which you use Eval to actually use the JavaScript compiler to parse the JSON for you. That turns out to be really unsafe, so it’s guarded by four regular expressions. It started off as one, and someone said whoops, that got through. And it’s like okay, add a second regular expression. Whoops, that got through. So in the end, it took four of them, which is kind of a nuisance. So we’re not getting the full performance benefit that we’d hoped for getting Eval. Fortunately, that’s getting fixed in the fifth edition of ECMAScript. JSON.parse is now built into the language. It’s going to have its own compiler, which will be faster than the Eval compiler, so performance should be really, really good. It’ll be really safe, really reliable. We expect to have ECMAScript Fifth Edition finished and approved this year, but JSON.parse is available now in better browsers everywhere.
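Illustration (not part of the talk, and not the code from the JSON site): a minimal recursive descent parser sketched in Python, to show how little code the grammar needs. All function names here are made up for the example. It leans on regular expressions for strings and numbers and does not combine surrogate pairs, so treat it as a sketch rather than a conforming implementation.

```python
import json
import re

NUMBER = re.compile(r'-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?')
STRING = re.compile(r'"((?:[^"\\\x00-\x1f]|\\["\\/bfnrt]|\\u[0-9a-fA-F]{4})*)"')
ESCAPES = {'"': '"', '\\': '\\', '/': '/', 'b': '\b', 'f': '\f',
           'n': '\n', 'r': '\r', 't': '\t'}
LITERALS = (("true", True), ("false", False), ("null", None))

def skip_ws(text, i):
    """Advance past the four whitespace characters JSON allows."""
    while i < len(text) and text[i] in " \t\n\r":
        i += 1
    return i

def unescape(body):
    """Decode backslash escapes (surrogate pairs are not combined here)."""
    def repl(m):
        esc = m.group(1)
        return chr(int(esc[1:], 16)) if esc[0] == 'u' else ESCAPES[esc]
    return re.sub(r'\\(u[0-9a-fA-F]{4}|.)', repl, body)

def parse_string(text, i):
    m = STRING.match(text, i)
    if not m:
        raise ValueError(f"expected string at {i}")
    return unescape(m.group(1)), m.end()

def parse_value(text, i):
    """Parse one value starting at index i; return (value, next index)."""
    i = skip_ws(text, i)
    if text.startswith('{', i):
        return parse_object(text, i + 1)
    if text.startswith('[', i):
        return parse_array(text, i + 1)
    if text.startswith('"', i):
        return parse_string(text, i)
    for word, value in LITERALS:
        if text.startswith(word, i):
            return value, i + len(word)
    m = NUMBER.match(text, i)
    if not m:
        raise ValueError(f"unexpected input at {i}")
    s = m.group()
    return (float(s) if any(c in s for c in ".eE") else int(s)), m.end()

def parse_array(text, i):
    items = []
    i = skip_ws(text, i)
    if text.startswith(']', i):
        return items, i + 1
    while True:
        value, i = parse_value(text, i)
        items.append(value)
        i = skip_ws(text, i)
        if text.startswith(']', i):
            return items, i + 1
        if not text.startswith(',', i):
            raise ValueError(f"expected ',' or ']' at {i}")
        i += 1

def parse_object(text, i):
    obj = {}
    i = skip_ws(text, i)
    if text.startswith('}', i):
        return obj, i + 1
    while True:
        key, i = parse_string(text, skip_ws(text, i))
        i = skip_ws(text, i)
        if not text.startswith(':', i):
            raise ValueError(f"expected ':' at {i}")
        obj[key], i = parse_value(text, i + 1)
        i = skip_ws(text, i)
        if text.startswith('}', i):
            return obj, i + 1
        if not text.startswith(',', i):
            raise ValueError(f"expected ',' or closing brace at {i}")
        i += 1

def parse(text):
    """Parse a complete JSON text, rejecting anything left over."""
    value, i = parse_value(text, 0)
    if skip_ws(text, i) != len(text):
        raise ValueError(f"trailing data at {i}")
    return value

# Hypothetical sample; compare against the standard-library parser.
sample = '{"do": "test", "n": [1, -2.5, 1e3, true, null], "s": "a\\/b\\u0041"}'
result = parse(sample)
print(result == json.loads(sample))
```

Each grammar rule becomes one function that returns the parsed value together with the index where parsing should continue, which is why no separate tokenizer is needed.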

Another benefit of having a really simple description of the language is that it doesn’t take a lot of work to translate it into another human language. And so I was really happy to have wonderful people from all over the world submitting translations to me. And so now the JSON page is available in all of these languages, which is just wonderful. If it turns out that you’re fluent in a language which isn’t on the list, and you’d like to help out, that’d be really great. But the thing that really happened that caused people to take notice of JSON was Ajax. In 2005, Jesse James Garrett discovered that you could use web browsers to have fully interactive applications without having to do a page replacement after every user interaction. A lot of us had been doing that for five years previous, but it was really important when Jesse discovered it because suddenly everybody wanted to do it. We couldn’t give it away in 2001, but suddenly it was really hot in 2005. So a lot of web developers discovered that XML was really tedious to work with, but JSON was really easy. And so it was Ajax that pushed the popularity of JSON. Now, there were some cranks at the time that said, "Wait a minute, Jesse James Garrett said that the X stands for XML. So you can’t use JSON, you have to use XML." That didn’t last for very long.

And so now we had a growing community of people using the language, and I started observing things people were doing that I went ugh, I didn’t anticipate that. One of them was that people were putting instructions to the parser in comments, which was a really bad thing because that would totally break interoperability because there’s this whole level of meta-language which wouldn’t be common, would be outside of the standard. So I revised the definition of JSON to remove the comments. It had had slash-slash and slash-star but those are gone now. It also turned out they added a lot of unnecessary complexity. In our use, we’d never used the comments; I just put them in initially because I thought it might be useful. But it turned out they weren’t that useful. And for some of the ports to other languages, about half of the complexity of doing the thing was just doing the comments. They were surprisingly difficult. I never understood quite why. But taking them out made it easier to port JSON into other languages, and that was desirable.

Also, there was another data interchange format called YAML, which stands for something funny. And YAML, coincidentally, was almost a proper superset of JSON; just similar ideas, and came out almost the same. The biggest point of difference was that JSON had comments in that style and YAML didn’t. So by taking the comments out, JSON became more closely aligned with YAML, and there appeared to be some benefit in doing that. Then the very last change was, I added scientific notation to number. When we were working at State, we were doing business applications and never realized the need for them. But as Ajax got bigger, we found all sorts of things happening in Ajax, so I put them in. And at that point, closed the door. There’ll be no more changes to JSON, ever. Because I never put a version number in it, so there’s no way to indicate what version of JSON you’re using. So there’s no safe way to extend it or redefine it. So as a consequence, JSON will not be changed. If you put a version number on something, if there’s a 1.0, you know there’s going to be a 1.1, and then a 2.0. And everything’s crap until it’s 3.0. So we’re just going to avoid that. We’re not going to have any numbers on this thing, it’s just JSON.
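Illustration (not part of the talk): a quick Python check, using the standard json module, of the two late changes described above. Comments are rejected and exponent notation is accepted. The helper name is_json is made up for this sketch.

```python
import json

def is_json(text):
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Both comment styles that were once allowed are now syntax errors.
with_line_comment = '{"a": 1} // note'
with_block_comment = '{/* note */ "a": 1}'

# Scientific notation is part of the number grammar.
avogadro = json.loads("6.02e23")
small = json.loads("1E-2")

print(is_json(with_line_comment), is_json(with_block_comment), avogadro, small)
```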

Stability is much more important than any feature we can think of. Over the years, I’ve heard a lot of suggestions for stuff people could put into JSON, and it’s all useless. Everything you need to be able to do, you can do with it now. So I expect, some day, JSON will be replaced by something which is bigger and more exotic or whatever. And I’m actually looking forward to that day in the future. But until then, JSON will be just the way it is. And after that, JSON will stay the way it is too. JSON will be the way it is until the end of time so that there’s, at least, one piece of the stack you can depend on forever. One of the key design goals behind JSON was minimalism. My idea was that the less we have to agree on in order to inter-operate, the more likely we’re going to be able to inter-operate well. If the interfaces are really simple, we can easily connect. And if the interfaces are real complicated, the likelihood that something’s going to go wrong goes way, way up.

So I endeavored to make JSON as simple as possible. And I had a goal of being able to put the JSON standard on the back of a business card. And this is the card. So come see me if you want one of these cards. It’s the JSON card; it’s got the JSON standard on the back. Now, I’m not suggesting that that should be a goal for every standard. There are some standards that are just necessarily more complicated than what you can put on the back of a business card. But I think it’s a really nice thing to aspire to, so that when you’re in the standards committee meeting, you’re going: gee, is there any way we could simplify this more, so that we could actually fit it on a card? Because generally, standards committees don’t think about things like that. It’s easy to make things bigger, it’s hard to make things better.

So JSON had a lot of influences on its design. It didn’t just come out of my head. It’s based on a lot of things that I had observed over the years. The first one, maybe the greatest influence, was Lisp, John McCarthy’s work out of MIT in 1958. Lisp was built on a textual representation of simple binary trees. And it was really powerful, and syntactically almost nothing, but it was kind of visually confusing because it’s tons and tons of nested parentheses. The thing that was brilliant about Lisp was it used exactly the same representation for programs and data. So originally, the idea was that you would have programs that could act on themselves as data and do interesting things. There were people who recommended that S-expressions should become a standard data interchange format, which would have been a good idea but it was never going to happen for the same reason that Lisp never became a mainstream language. Which is just, the mainstream likes syntax. And Lisp is just too goofy looking. So that never happened.

Another influence was Rebol. Rebol’s a more modern language, but with some very similar ideas to Lisp in that it’s all built upon a representation of data, which is then executable as programs. But it’s a much richer thing syntactically. Rebol is a brilliant language, and it’s a shame it’s not more popular because it deserves to be. Obviously, JavaScript was a huge influence, because JSON is JavaScript. I mean, that’s where it came from. I seem to have been making a career out of finding little bits of goodness in JavaScript. Like I wrote this pamphlet on the good parts of JavaScript. JSON is another of the good parts of JavaScript. There’s some good stuff in that language. And it’s not by accident. Brendan Eich, who is the designer of the language, is a brilliant guy. And there’s brilliant stuff in JavaScript. There’s other stuff too, but you don’t need to use that. And one surprising thing is that JavaScript, Python, and NewtonScript were all designed at about the same time, all in isolation. None of the three designers were paying attention to what the other guys were doing. They all came up with exactly the same notation for doing nested objects and arrays. That could be an amazing coincidence, or I think it may be an indication that this was just sort of a natural idea that’s been in the air for a long time, and they all kind of put it together at the same time.

Another example of this is at NeXT, working on the OpenStep platform in '93. They had something called Property Lists, which were basically JSON structures. Syntactically they were slightly different; they had equal signs instead of colons, and they used semi-colons instead of commas. But basically, it was JSON, it was the same idea. They got it right in '93, then threw it away later with OS X. But they had it right in the idea that we can express data structures and keep our data in this form which is comfortable for people and really efficient for machines. That’s part of the core idea of this stuff. It’s been around for a long time, JSON just gave it a name. Then there’s XML. So the interesting thing for me about XML is not any of its characteristics, but how it became a standard, and how it became so popular so quickly. The world rejected it as a document format back when it was called SGML. XML changed some aspects but didn’t repair any of the things that made it a bad document format. I’ll offer as evidence of that the fact that XHTML has totally failed to displace HTML. If XML were a superior document format, XHTML should easily have won over HTML and it hasn’t. HTML is still dominant, XHTML is failing.

So XML in the first place isn’t a very good document format, and it’s an even worse data interchange format. So given that it doesn’t really effectively do any of the things that it was intended to do, how did it become so popular? I think its roots were in HTML. Now, HTML is also based on SGML. But HTML actually improved significantly on SGML by simplifying it. Took a lot of crap out of it, reduced it down to basics, and also made it more resilient. Because it turns out one of the things which is bad about this document format is it’s really difficult to get it right. Just getting all the things to balance, and getting everything quoted is apparently really hard. I don’t know why it’s so hard, but the evidence is that nobody has ever done it right. Nobody can open up a text editor and write HTML and get it right. And so the browsers, from the beginning, had to be extremely resilient and forgiving and intelligent about trying to make sense of the markup. And any approach which says if we find the slightest error anywhere, kill it and show nothing, it’s just death to the web. And that never took off.

But at the time that the web was emerging, there were a lot of Grade-A CTOs and technologists who looked at it and said well, this is obviously not going to work. This is deficient in so many ways, this is obviously not going to work, this is bad, let’s wait for the next thing. But there were a lot more B-level and C-level technologists who said wow, this looks great! And then they got it, and that created the avalanche effect, and eventually HTML won. So those A-list CTOs, they weren’t wrong, because we’re suffering still every day from the problems that they identified. Everything they cited as deficient was correct, they just asked the wrong question. They shouldn’t have asked if it was good enough, they should have asked is it going to be popular enough? And so when XML came out, it’s from the people who gave us HTML and it’s got angle brackets, so well it’s a no-brainer, it’s obviously going to win. So they stepped out of the way and let it go.

In April 2002, I saw John Seely Brown talking at the CTO Forum. Brown ran Xerox PARC for many, many years. He was in charge there when they came up with object-oriented programming, graphical user interfaces, local area networking, laser printers; a whole lot of stuff that we take for granted today happened on his watch. Brilliant guy. He was talking about how the next generation was going to be made out of loosely coupled systems, and he thought XML was going to be the thing that would bind them together. He said, "Maybe only something this simple could work." It was a really interesting talk. A couple months later, I went to another conference and heard another guy talking who was a little closer to the ground, also talking about XML. He said, "Maybe only something this complicated could work." And that really struck me. In just a couple of months, it went from something that was so simple to something that was so complicated. What does that indicate? What should we learn from this?

And it occurred to me that it’s complicated because it doesn’t fit. It solves the wrong problem; it doesn’t really adapt itself to doing the thing that we need to do well. There were other people who noticed this too. For example, there was a popular site called XML Sucks. The title of the site was "Why XML is technologically terrible, but you have to use it anyway." So there are basically two schools of thought about XML. One which said this is perfection. We started with SGML which the world loved, and then we got it right, perfect. And then there’s the school that said it’s awful. But there was one thing they could both agree on, and that is: XML is the standard, so shut up. Shut up! But not everybody shut up. There were a lot of tinkerers who were all aware that there’s something wrong here and started trying to fix it. So this is a list of XML alternatives. Each one of these has a crazy inventor behind it who had observed that there was something really deeply wrong with XML, and he thought that he could fix it. This list was compiled by a guy named Paul T. I don’t know who he is, but he was one of the guys, and he was hoping that his would float to the top, and it didn’t. And so when mine floated to the top, he said OK, it’s done, and stopped keeping it up to date.

But each of these guys was right in that they saw that XML was deficient, but there was no way you could build a community out of this stuff. There’s probably no guy on this list who would look at someone else’s and say yeah, he got it better. No one would do that, it’s just a bunch of crackpots. And so none of them could rise above their own noise, except for one, basically because of the Ajax effect. So Ajax won. The XML community took notice of the ascendance of JSON. They had, early on, been happy about being a disruptive technology, and then were very unhappy that they were starting to be disrupted themselves and tried to stop it. Early on, there were vague threats…weren’t quite threats, more like stuttering, like "You’ll rue the day you ever questioned the technological superiority of XML!" You know that kind of stuff. I’ll rue the day someday, I’m sure that’s true. As JSON started ascending, they started getting a little bit more nasty. "Okay, your little web application, JSON, that’s fine. I know we said it wouldn’t work, but OK, you got that working, that’s good. But if you’re doing real applications, manly applications, you need the complexity of XML. That complexity is there for a reason. And if you don’t have it, you will fail." And they could never articulate exactly why you would fail, but they were pretty confident that you would. Well since then, a lot of manly applications have been written with JSON. And what happened was they didn’t fail, they just got faster.

Then finally, there were the death threats. Yeah, death threats. For example, Dave Winer, just before Christmas in 2006 had just discovered JSON, and wrote, "It’s not even XML! Who did this travesty? Let’s find a tree and string them up. Now." What an ugly thing to say. Fortunately, nobody listens to Dave Winer. James Clark, who was one of the principal architects of XML, a few months later wrote, "Any damn fool could produce a better data format than XML." Which it turns out is true. So somehow in the whole XML hysteria, we’d forgotten the first rule of workmanship, which is use the right tool for the right job. Instead, we got distracted on this other thing which was one tool to rule them all. And that’s not good engineering, that’s not good craftsmanship, that’s not the way you do things. It might be desirable to have one super tool that did everything. But there’s never been such a tool, and tools have always been specialized. And part of the craft of engineering is determining, of all the tools available to you, what is the best tool for solving any problem. And there’s this weird period of time where we forgot how to do that.

So one of the benefits of JSON becoming tolerable is that we’re now allowed to consider the best tool now. JSON isn’t necessarily the best tool for every job. But for the ones it is, you can use it. And for the ones that it isn’t, there are other tools out there that you can use. So good engineering has become popular again. I think that’s a nice benefit. So then that made me think. Where did the idea come from that data should be represented by a document format? For me looking back on it, it doesn’t make any sense. Where did that idea come from? It seemed a really powerful idea because for a while everybody bought into it. But it just doesn’t make any sense. So I started looking back through the fossil record to try to figure out where this idea came from. So go all the way back to a program called RUNOFF. This started off at MIT, and then found its way onto Tech Systems, and Multics, and a bunch of other mainframes. This was in the mainframe era. Some of the first versions of this program used punch cards. And in those days, punch cards came only in upper case, so you could insert special codes for indicating which of the upper case letters were intended to be lower case, so you could print out nice documents.

And then a card which started with a letter was going to have text on it and the text would get filled into paragraphs. And the cards that start with a period in column one are commands, which indicate that we’re going to skip one blank line, or we’re going to tab over four spaces. So there’s a lot of explicit control going on here. And it was sufficient for making manuals and things like that, but there were obviously better things you could do with this. Charles Goldfarb from IBM got the idea of doing something he called generic markup language. And some of these tag names should be eerily familiar to you. So he started with a piece of unexpected punctuation in column one. And then he also came up with the idea of having a closing punctuation so that you could then put text on the same line. One thing you might not recognize is the EOL tag, which doesn’t map exactly onto anything we use now, but you might guess as to what it meant. And so as Goldfarb was playing with this, he went through this evolution where it was first a special purpose tag, and then he generalized it. And then he stumbled onto the idea of angle brackets. One place we can still see this stuff in HTML today is in entities. So an entity has got some crazy piece of punctuation and then some letters and then another piece of crazy punctuation. How could that have ever made sense? Well, this is where it came from. He ran out of angle brackets at that point, he didn’t have anything else to wrap them in.

The first place where document systems were done right was in Brian Reid’s Scribe, which he developed at Carnegie-Mellon and published in 1980. Scribe was the first document format that separated document structure from formatting and did it brilliantly. Not only that, he had a really nice notation for expressing the document, which was much easier to write than HTML, and much easier to get right than HTML, and certainly easier than SGML. He only had one reserved character and that was the @. So if you want to have a literal @, you just do double @ and that was that. @ followed by a word, that was a tag. Generally, the tag was followed by a block of stuff with a begin character and an end character. And within that block, you can’t use the begin character and end character literally, but any other character you could use. And he had six sets of begin and end characters that you could have so that you didn’t have the Lisp problem of having all these parens that you had to balance. You had something that was much more tractable visually, including, I should point out, angle brackets. And Goldfarb saw that and went oh yeah, angle brackets.

He also had a nice form where, for something that was really long, like a chapter or a table or something, you could say begin and end and the argument of it would be the tag. And so in this form, you could have anything except end quote in there, and you don’t have to worry about confusion of characters. So it was a really resilient format, syntactically really simple. It was just one of these brilliant ideas. Reid was a really brilliant guy. It’s really a shame that Tim Berners-Lee hadn’t been more knowledgeable of document formats. If he had based his World Wide Web on Scribe instead of on SGML, the World Wide Web would be a better place today. But this doesn’t quite answer the question I took you on this journey with. There’s one more thing to look at. Scribe also had support for bibliographies. So here we have a description of a tech report, a description of a book. And within those we’ve got data. In fact, it looks like JSON. It’s name-value pairs separated by colons. And while it’s in a document, this is data. This is data describing documents. So I believe this is the first time when a document format was used to represent data. And Scribe had a big influence on Goldfarb, unfortunately not a big enough influence.

So he took these things, and these became the attributes in SGML. But he just didn’t get the rest of it right. But that took the meme into the SGML community that yeah, we can represent data in the document format because Reid did it. So we can do it. And that idea survived into the XML age.

When I put the reference implementation onto the website, I needed to put a software license on it. And I looked up all the licenses that are available, and there were a lot of them. And I decided that the one I liked the best was the MIT license, which was a notice that you would put on your source, and it would say, "You’re allowed to use this for any purpose you want. Just leave the notice in the source and don’t sue me." I love that license, it’s really good. But this was late in 2002, we’d just started the War On Terror. And we were going after the evil-doers with the President, and the Vice-President, and I felt like I needed to do my part. So I added one more line to my license, which was, "The Software shall be used for Good, not Evil." I thought I’d done my job. About once a year I’ll get a letter from a crank who says, "I should have a right to use it for evil! I’m not going to use it until you change your license!" Or they’ll write to me and say, "How do I know if it’s evil or not? I don’t think it’s evil, but someone else might think it’s evil. So I’m not going to use it." Great, it’s working. My license works, I’m stopping the evil-doers!

Audience Member: If you ask for a separate license, can you use it for evil?

Douglas: That’s an interesting point. Also about once a year, I get a letter from a lawyer, every year a different lawyer at a company. I don’t want to embarrass the company by saying their name, so I’ll just say their initials, IBM, saying that they want to use something I wrote. Because I put this on everything I write now. They want to use something that I wrote in something that they wrote. And they were pretty sure they weren’t going to use it for evil but they couldn’t say for sure about their customers. So could I give them a special license for that? Of course. So I wrote back, this happened literally two weeks ago. I said, "I give permission for IBM, its customers, partners, and minions, to use JSLint for evil." And the attorney wrote back and said, "Thanks very much, Douglas!"

So I’ve got to wrap up now, but before I do that I want to talk about the logo. When I put the web page up in 2002, I decided I should have a logo to class up the page, and make it look more substantial. So I came up with this thing. It’s based on a famous optical illusion called the Impossible Torus, which is sort of related to the Ambihelical Hexnut. So what I did was I took it, I made it round, I reoriented it, and gave it some nice shading. I liked it for a number of reasons. One was if you look at it as a two-dimensional figure, it’s made up of two components which are identical but out of phase. So it kind of suggests the two sides of a conversation, because it keeps going around and around. Also, I could see letter forms in it. There’s a J in there, and an N maybe. Clearly an O is in there. And so it had most of the initials that were in the name of the thing. But after looking at this for several years, I noticed something: it’s not impossible. What it is, is a square which is extruded in a circle. And as it goes around, it does one rotation and comes back.

Audience Member: It’s like a Mobius strip.

Douglas: Except it’s a full rotation. Otherwise, it would be like a Mobius strip, but it does a full rotation. It does one rotation and one orbit. So it’s not an impossible shape, it’s actually a simple shape. It’s a square and a circle with a twist. I think it works really nicely as a symbol for JSON. Once I figured that out, it was like OK, so I can put a mathematical model behind it. So I rendered this in JavaScript using Canvas and put some extreme shading on it for a t-shirt design. It’s kind of nice that JavaScript is now powerful enough to render its own logos. This is the design that I did for the business card. I wanted something that looked like it could have been around for 100 years. So: "JSON, the data interchange format mothers have learned to trust for many generations." Then finally, the last one for the night, this one was inspired by Shepard Fairey’s Obama poster. I call it, Data Interchange We Can Believe In. Thank you, and good night.
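The shape described above, a square swept around a circle while making one full rotation per orbit, can be written down as a small parametric model. This is only a sketch under assumptions: the function name `twistedSquareCorner` and the parameters `R`, `r`, and `turns` are hypothetical, and this is not the code Crockford used to render the logo.

```javascript
// Hypothetical helper: position of one corner of the square cross-section.
//   R     - radius of the circular orbit (centre of the square to the axis)
//   r     - half-diagonal of the square
//   theta - angle travelled around the orbit, in radians
//   k     - which corner of the square, 0 to 3
//   turns - rotations of the square per orbit (1 = the full rotation
//           described in the talk; 0.25 would shift each corner onto its
//           neighbour, which is the Mobius-like case)
function twistedSquareCorner(R, r, theta, k, turns) {
  if (turns === undefined) {
    turns = 1;
  }
  // Angle of this corner within the cross-section plane.
  var phi = turns * theta + Math.PI / 4 + k * Math.PI / 2;
  // Distance of the corner from the central axis.
  var radial = R + r * Math.cos(phi);
  return {
    x: radial * Math.cos(theta),
    y: radial * Math.sin(theta),
    z: r * Math.sin(phi)
  };
}

// After one complete orbit with one full rotation, a corner is back
// where it started, so each edge of the square traces a closed curve.
var start = twistedSquareCorner(3, 1, 0, 0);
var end = twistedSquareCorner(3, 1, 2 * Math.PI, 0);
```

Sampling `theta` from 0 to 2π for each of the four corners gives four closed curves that could be joined into faces and shaded, for example on a canvas.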

What do I think would make HTML better? Making it more extensible. Having to fit everything into the limited set of tags that we have just doesn’t work. The thing you mentioned about making headings work in documents which are not heading-full, doesn’t fit. I would like to be able to use CSS to say I want a new thing which is a title, or an ad, or a controller, or whatever. I can give it the name that I want, and I just specify in CSS what it’s supposed to do. And that is all I need to do in order to extend the language, to make it map my application. That’d be a trivial thing to do, but HTML 5 is going off in a different direction.

Is there case sensitivity in Unicode Hex characters with a backslash U?

Audience Member: Yes.

Douglas: No, you can use upper or lower case.
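A minimal sketch of that answer, assuming a JavaScript engine with the built-in `JSON` object: the hex digits after a backslash u may be written in either case, and a conforming parser decodes both spellings to the same character. The variable names are only illustrative.

```javascript
// The same escape written with lower-case and upper-case hex digits.
// The doubled backslash is JavaScript string escaping; the JSON text
// that reaches the parser is "\u00e9" and "\u00E9" respectively.
var lower = JSON.parse('"\\u00e9"');
var upper = JSON.parse('"\\u00E9"');

// Both spellings decode to the same one-character string.
var sameCharacter = lower === upper; // true
```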

Audience Member: Do you think you could add that to your spec? I was using JSON-C, and it doesn’t know about that JSON sensitivity. And then it turns out that the web page doesn’t know about it, either.

Douglas: Huh. I’ll have to look at that. I wasn’t aware of that.

Audience Member: Okay.

Douglas: In the meantime, you should use lower case. What would I like to see replace JSON? We’re seeing templating languages for JSON now. Like, the JSONT language I think is absolutely brilliant.

Audience Member: Right.

Douglas: So that doesn’t need to be in the format. One of the biggest weaknesses in the JSON format is also a weakness in XML, which is that it cannot easily represent cyclical structures, and can’t represent general DAGs. For most applications that’s not a requirement, but there are some applications where that’d be really desirable. I felt bad about leaving that out of JSON, but I had to leave it out because it wasn’t in JavaScript. And one of my other design rules was it had to be a subset of JavaScript. So I missed that boat. Someday I’d like to be able to take the quotes off the keys because it looks stupid.
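The DAG limitation mentioned above can be sketched as follows, assuming a JavaScript engine with the built-in `JSON` object. The object and property names are made up for illustration. A node that is reachable by two paths is simply written out twice, so the sharing is lost after a round trip.

```javascript
// One node that is referenced from two places: the smallest DAG
// that is not a tree.
var shared = { name: 'shared node' };
var dag = { left: shared, right: shared };

// Round-trip the structure through JSON text.
var copy = JSON.parse(JSON.stringify(dag));

// Before: both properties point at the very same object.
var sharedBefore = dag.left === dag.right;   // true
// After: the node was serialized twice, so there are two separate copies.
var sharedAfter = copy.left === copy.right;  // false
```

The values survive the round trip, but the fact that `left` and `right` were the same object does not.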

Audience Member: What is an example that you could give of an application that couldn’t use the…

Douglas: Okay. The simplest thing that you cannot encode in JSON. Make an array. So A equals empty array, A sub zero equals A. That’s a cyclical structure. And if you ask JSON to serialize that, you’ll get an infinite number of open brackets. And then you’ll die before you generate the first closing bracket.

What do I think about schemas and DTDs? I don’t care. If you want to do that, that’s fine. There are some very clever people who have been working on schemas for JSON. Kris Zyp over at Dojo has done some really good work. I considered doing schemas for JSON very early on, because as JSON was starting to ascend, a lot of people were coming in from the XML world saying, "We can’t do it. We can’t use it until it’s got schemas." And they would say that for about a month until they figured out how JSON worked, and then it’s oh, never mind. So we never got to that point. And so I had designed a schema, but I never implemented it because there really didn’t seem to be much need for it. Some people think there is a need for it, and I’m happy to have them go off and do that. But the core data format itself doesn’t have to change in order to make that useful.

The main reason I took comments out was that I saw people who were trying to control what the parser would do based on what was in the comments. And that totally broke interoperability. So there’s no way I could control the way they were using comments, so the most effective fix was to take the comments out.
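Two points from this answer can be tried directly. This is a minimal sketch, assuming a modern JavaScript engine with the built-in `JSON` object. Such engines detect the cycle and throw a `TypeError` rather than emitting open brackets forever, and they reject comments in JSON text with a `SyntaxError`. The helper name `attempt` is hypothetical.

```javascript
// The structure from the talk: an array whose first element is itself.
var a = [];
a[0] = a;

// Hypothetical helper: run a function and return either its result
// or the name of the error it threw.
function attempt(fn) {
  try {
    return fn();
  } catch (e) {
    return 'failed: ' + e.name;
  }
}

// Serializing the cyclical array is refused.
var cyclic = attempt(function () {
  return JSON.stringify(a);
});
// cyclic is 'failed: TypeError'

// An ordinary nested array serializes without trouble.
var nested = attempt(function () {
  return JSON.stringify([[]]);
});
// nested is '[[]]'

// Comments are not part of the format, so a parser rejects them.
var commented = attempt(function () {
  return JSON.parse('{"a": 1 /* note */}');
});
// commented is 'failed: SyntaxError'
```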