Saturday, February 22, 2014

React Demystified

This entry will be a bit of a departure from the usual content of this blog, which is mostly about parsing and low-level programming. Lately I've had some interest in JavaScript frameworks, including Facebook's React. Some recent articles I have read, particularly The Future of JavaScript MVC Frameworks, have convinced me that there are some deep and powerful ideas in React, but none of the articles or documentation I could find explained the core abstractions in a way that satisfied me. Much like my previous article LL and LR Parsing Demystified, this article is an attempt to explain the core ideas in a way that makes sense to me.

The 1000-Foot View

In a traditional web app, you interact extensively with the DOM, usually using jQuery:


I made the DOM red because updating the DOM is expensive. Now sometimes the "App" will have model classes that it uses internally to represent state, but for our purposes that is an implementation detail that is internal to the app.

React's primary goal is to provide a different and more efficient way of performing DOM updates. Instead of mutating the DOM directly, your app builds a "virtual DOM", and React handles updating the real DOM to match:


How does introducing an extra layer make things faster? Doesn't that imply that the browsers have sub-optimal DOM implementations, if adding a layer on top of them can speed them up?

It would mean that, except that the virtual DOM has different semantics than the real DOM. Most notably, changes to the virtual DOM are not guaranteed to take effect immediately. This allows React to wait until the end of its event loop before it even touches the real DOM at all. At that point it calculates a nearly-minimal diff and applies it to the real DOM in as few steps as possible.

Batching DOM updates and applying minimal diffs are things that an application could do on its own. Any application that did this would be as efficient as React. But doing this manually is tedious and error-prone. React handles that for you.
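To make the batching idea concrete, here is a minimal sketch of what "doing it on its own" could look like. This is not React's code: `createBatcher`, `realDom`, and `writeCount` are names I made up for illustration, a plain object stands in for the real DOM, and an explicit `flush()` call stands in for "the end of the event loop".

```javascript
// Hypothetical sketch, not React's implementation.
// A plain object stands in for the (expensive to touch) real DOM.
const realDom = { title: "", color: "" };

function createBatcher(target) {
  const pending = new Map(); // latest requested value per property
  let writes = 0;            // how many times we actually touched the target

  return {
    // Record a requested change; nothing is written yet.
    set(key, value) {
      pending.set(key, value); // a later request overwrites an earlier one
    },
    // Apply all pending changes in one pass, skipping no-op writes.
    flush() {
      for (const [key, value] of pending) {
        if (target[key] !== value) {
          target[key] = value;
          writes += 1;
        }
      }
      pending.clear();
    },
    writeCount() {
      return writes;
    },
  };
}

const batcher = createBatcher(realDom);
batcher.set("title", "a");   // superseded before it is ever written
batcher.set("title", "b");
batcher.set("color", "red");
batcher.flush();             // stands in for "end of the event loop"
```

Three requested changes turn into two writes here, because the first title was overwritten before the flush. Keeping this bookkeeping correct by hand across a whole application is the tedious part.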

Components

I mentioned that the virtual DOM has different semantics than the real DOM, but it also has a noticeably different API. The nodes in the DOM tree are elements, but the nodes of the virtual DOM are a completely different abstraction called components.

The use of components is very important to React, because components are designed to make calculating the DOM diff much more efficient than the O(n^3) that the fully general tree-diff algorithm would cost.

To find out why, we'll have to dig into the design of components a bit. Let's take the React "Hello, World" example from its front page:
/** @jsx React.DOM */
var HelloMessage = React.createClass({
  render: function() {
    return <div>Hello {this.props.name}</div>;
  }
});

React.renderComponent(<HelloMessage name="John" />, mountNode);
There is an awful lot going on here that isn't entirely explained. Even this short example illustrates some big ideas, so I'm going to take some time here and go slow.

This example creates a React component class "HelloMessage", then creates a virtual DOM with one component (<HelloMessage>, essentially an "instance" of the HelloMessage class) and "mounts" it onto the real DOM element mountNode.

The first thing to notice is that the React virtual DOM is made up of custom, application-defined components (in this case <HelloMessage>). This is a significant departure from the real DOM where all of the elements are browser built-ins like <p>, <ul>, etc. The real DOM carries no application-specific logic; it is just a passive data structure that lets you attach event handlers. The React virtual DOM, on the other hand, is built from application-specific components that can carry application-specific APIs and internal logic. This is more than a DOM-updating library; it is a new abstraction and framework for building views.

As a side note: If you've been keeping up with all things HTML you may know that HTML custom elements may be coming to browsers soon. This will bring to the real DOM a similar capability: defining application-specific DOM elements with their own logic. But React has no need to wait for official custom elements because the virtual DOM isn't a real DOM. This allows it to jump the gun and integrate features similar to custom elements and Shadow DOM before browsers add those features to the real DOM.

Getting back to our example, we have established that it creates a component called <HelloMessage> and "mounts" it on mountNode. I want to diagram this initial situation in a couple of ways. First let's visualize the relationship between the virtual DOM and the real DOM. Let's assume that mountNode is the document's <body> tag:


The arrow indicates that the virtual element is mounted on the real DOM element, which we'll see in action shortly. But let's also take a look at the logical illustration of our application's view right now:


That is to say, our entire web page's content is represented by our custom component <HelloMessage>. But what does a <HelloMessage> look like?

The rendering of a component is defined by its render() function. React does not say exactly when or how often it will call render(), only that it will call it often enough to notice any changes. Whatever you return from your render() method represents how your view should look in the real browser DOM.

In our case, render() returns a <div> with some content in it. React calls our render() function, gets the <div>, and updates the real DOM to match. So now the picture looks more like this:


It doesn't just update the DOM though; it remembers what it updated it to. This is how it will perform fast diffs later.
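Here is a rough sketch of why remembering the previous result helps. It is my own simplification, not React's actual algorithm: the `node` and `diff` functions and the patch format are hypothetical, and plain objects stand in for whatever React keeps internally. The idea is to compare the remembered tree and the new tree position by position, and to replace a whole subtree as soon as the kind of node differs instead of searching for a cleverer match.

```javascript
// Hypothetical sketch of a position-by-position tree diff.
function node(tag, text, children = []) {
  return { tag, text, children };
}

// Compare the remembered tree (prev) with the newly rendered tree (next)
// and collect the changes that would have to be applied to the real DOM.
function diff(prev, next, path = "root", patches = []) {
  if (prev === undefined) {
    patches.push({ op: "create", path, tag: next.tag });
  } else if (next === undefined) {
    patches.push({ op: "remove", path });
  } else if (prev.tag !== next.tag) {
    // Different kind of node: replace the subtree without looking inside.
    patches.push({ op: "replace", path, tag: next.tag });
  } else {
    if (prev.text !== next.text) {
      patches.push({ op: "setText", path, text: next.text });
    }
    const count = Math.max(prev.children.length, next.children.length);
    for (let i = 0; i < count; i++) {
      diff(prev.children[i], next.children[i], `${path}/${i}`, patches);
    }
  }
  return patches;
}

const before = node("div", "", [node("span", "Hello John"), node("p", "unchanged")]);
const after = node("div", "", [node("span", "Hello Jane"), node("p", "unchanged")]);
const patches = diff(before, after);
```

In this example `patches` holds a single `setText` entry for `root/0`, so only one text update would reach the real DOM. Each position is visited once, so the work grows with the size of the trees rather than with its cube.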

I glossed over one thing, which is how a render() function can return DOM nodes. This is obscured by the JSX, which isn't plain JavaScript. It's instructive to see what this JSX compiles to:
/** @jsx React.DOM */
var HelloMessage = React.createClass({displayName: 'HelloMessage',
  render: function() {
    return React.DOM.div(null, "Hello ", this.props.name);
  }
});

React.renderComponent(HelloMessage( {name:"John"} ), mountNode);
Aha, so what we're returning aren't real DOM elements, but React virtual DOM equivalents (like React.DOM.div) of real DOM elements. So the React virtual DOM contains no real DOM nodes at all.
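To convince myself of this, it helps to imagine a stand-in for React.DOM.div that returns nothing but a description. The sketch below is my own illustration and does not use React: `div`, `helloMessage`, and `toMarkup` are hypothetical names, and `toMarkup` exists only to make the description visible (it does no escaping).

```javascript
// Hypothetical stand-in for React.DOM.div: returns a description, not a DOM node.
function div(props, ...children) {
  return { tag: "div", props: props || {}, children };
}

// Mirrors the compiled render() above: a function from props to a description.
function helloMessage(props) {
  return div(null, "Hello ", props.name);
}

// Turn a description into markup, only so we can look at the result.
// No escaping is done; this is for illustration only.
function toMarkup(vnode) {
  if (typeof vnode === "string") return vnode;
  const inner = vnode.children.map(toMarkup).join("");
  return `<${vnode.tag}>${inner}</${vnode.tag}>`;
}

const description = helloMessage({ name: "John" });
const markup = toMarkup(description);
```

Nothing here needs a browser. The description is ordinary data, which is what makes it cheap to build, remember, and compare.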

Representing State and Changes

So far I've left out a huge piece of the story, which is how a component is allowed to change. If a component weren't allowed to change, then React would be nothing more than a static rendering framework, similar to a plain templating engine like Mustache or Handlebars. But the entire point of React is to do updates efficiently. To do updates, components must be allowed to change.

React models its state as a state property of the component. This is illustrated in the second example on the React web page:
/** @jsx React.DOM */
var Timer = React.createClass({
  getInitialState: function() {
    return {secondsElapsed: 0};
  },
  tick: function() {
    this.setState({secondsElapsed: this.state.secondsElapsed + 1});
  },
  componentDidMount: function() {
    this.interval = setInterval(this.tick, 1000);
  },
  componentWillUnmount: function() {
    clearInterval(this.interval);
  },
  render: function() {
    return (
      <div>Seconds Elapsed: {this.state.secondsElapsed}</div>
    );
  }
});

React.renderComponent(<Timer />, mountNode);
The callbacks getInitialState(), componentDidMount(), and componentWillUnmount() are all invoked by React at appropriate times, and their names should pretty clearly give away their meanings given the concepts we have explained so far.

So the basic assumptions behind a component and its state changes are:
  1. render() is only a function of the component's state and props.
  2. the state does not change except when setState() is called.
  3. the props do not change except when our parent re-renders us with different props.
(I did not explicitly mention props before, but they are the attributes passed down by a component's parent when it is rendered.)

So earlier when I said that React would call render "often enough", that means that React has no reason to call render() again until somebody calls setState() on that component, or it gets re-rendered by its parent with different props.
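The three assumptions can be sketched in a few lines of plain JavaScript. This is a toy model of my own, not React's implementation: `createComponent`, `receiveProps`, and `renderCount` are hypothetical names, and the toy re-renders immediately on every change, whereas React is free to defer and batch that work.

```javascript
// Hypothetical toy model of a component; not React's implementation.
function createComponent(spec, props) {
  let renders = 0;

  const component = {
    props,
    state: spec.getInitialState(),
    output: null,
    // Assumption 2: state changes only through setState().
    setState(partial) {
      component.state = Object.assign({}, component.state, partial);
      rerender();
    },
    // Assumption 3: props change only when the parent re-renders us.
    receiveProps(nextProps) {
      component.props = nextProps;
      rerender();
    },
    renderCount() {
      return renders;
    },
  };

  // Assumption 1: the output depends only on props and state.
  function rerender() {
    renders += 1;
    component.output = spec.render(component.props, component.state);
  }

  rerender(); // initial render
  return component;
}

const timer = createComponent(
  {
    getInitialState: () => ({ secondsElapsed: 0 }),
    render: (props, state) => `${props.label}: ${state.secondsElapsed}`,
  },
  { label: "Seconds Elapsed" }
);

timer.setState({ secondsElapsed: timer.state.secondsElapsed + 1 });
```

Because the only two entry points for change are `setState()` and `receiveProps()`, the toy never has to poll or guess: render() runs once at creation and once per change, and not otherwise.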

We can put all of this information together to illustrate the data flow when the app initiates a virtual DOM change (for example, in response to an AJAX call):


Getting Data from the DOM

So far we have only talked about propagating changes to the real DOM. But in a real application, we'll want to get data from the DOM also, because that is how we receive all input from the user. To see how this works, we can examine the third example on the React home page:
/** @jsx React.DOM */
var TodoList = React.createClass({
  render: function() {
    var createItem = function(itemText) {
      return <li>{itemText}</li>;
    };
    return <ul>{this.props.items.map(createItem)}</ul>;
  }
});
var TodoApp = React.createClass({
  getInitialState: function() {
    return {items: [], text: ''};
  },
  onChange: function(e) {
    this.setState({text: e.target.value});
  },
  handleSubmit: function(e) {
    e.preventDefault();
    var nextItems = this.state.items.concat([this.state.text]);
    var nextText = '';
    this.setState({items: nextItems, text: nextText});
  },
  render: function() {
    return (
      <div>
        <h3>TODO</h3>
        <TodoList items={this.state.items} />
        <form onSubmit={this.handleSubmit}>
          <input onChange={this.onChange} value={this.state.text} />
          <button>{'Add #' + (this.state.items.length + 1)}</button>
        </form>
      </div>
    );
  }
});
React.renderComponent(<TodoApp />, mountNode);
The short answer is, you handle DOM events manually (as with the onChange() handler in this example), and your event handler can call setState() to update the UI. If your app has model classes, your event handlers will probably want to update your model appropriately and also call setState() so React also knows there were changes. If you've gotten used to frameworks that provide automatic two-way data binding, where changes to your model are automatically propagated to the view and vice versa, this may seem like a step backwards.

There is more to this example than meets the eye though. Despite how this example may look, React will not actually install an "onChange" handler on the <input> element on the real DOM. Instead it installs handlers at the document level, lets events bubble up, and then dispatches them into the appropriate element of the virtual DOM. This gives benefits such as speed (installing lots of handlers on the real DOM can be slow) and consistent behavior across browsers (even on browsers that have non-standard behavior for how events are delivered or what properties they have).
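Here is a small sketch of that delegation idea as I understand it. It is not React's event system: `on`, `dispatch`, and the handler registry are hypothetical, and plain objects with a `parent` pointer stand in for DOM nodes so that the example runs without a browser.

```javascript
// Hypothetical sketch of top-level event delegation.
const handlers = new Map(); // node id -> { eventType: handler }

// Register a handler in the registry instead of on the node itself.
function on(id, type, handler) {
  if (!handlers.has(id)) handlers.set(id, {});
  handlers.get(id)[type] = handler;
}

// Plays the role of the single document-level listener: walk from the
// target up through its ancestors and call any registered handler.
function dispatch(type, target) {
  const event = { type, target };
  for (let n = target; n; n = n.parent) {
    const entry = handlers.get(n.id);
    if (entry && entry[type]) entry[type](event);
  }
}

const form = { id: "form", parent: null };
const input = { id: "input", parent: form, value: "buy milk" };

const log = [];
on("input", "change", (e) => log.push(`input saw ${e.target.value}`));
on("form", "change", () => log.push("form saw change"));

dispatch("change", input);
```

Only one real listener would be needed however many handlers the application registers, and since `dispatch` builds the event object itself, it can present the same shape regardless of which browser produced the original event.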

So putting all of this together, we can finally get a full picture of the data flow when a user event (e.g., a mouse click) results in a DOM update:



Conclusions

I learned a lot about React by writing this entry. Here are my primary takeaways.

React is a view library. React doesn't impose anything about your models. A React component is a view-level concept and a component's state is just the state of that portion of the UI. You could bind any sort of model library to React (though certain ways of writing the model will make it easier to optimize updates further, as the Om post explains).

React's component abstraction is very good at pushing changes to the DOM. The component abstraction is principled and composes well, and efficient DOM updates fall out of the design.

React components are less convenient for getting updates from the DOM. Writing event handlers gives React a distinctly lower-level feel than libraries that automatically propagate view changes into the model.

React is a leaky abstraction. Most of the time you will program only to the virtual DOM, but sometimes you need to escape this and interact with the real DOM directly. The React docs talk more about this and the cases where this is necessary in their Working With the Browser section.

With my new knowledge I am inclined to more closely examine the claims made in the article The Future of JavaScript MVC Frameworks, but that is a slightly different topic that will have to wait for another entry.

I am not an expert in React, so kindly let me know of any mistakes.

2 comments:

  1. This was a remarkably clear and concise high-level view of React. Thanks for writing it.

  2. I really appreciate your objective view towards the Framework. I will probably need to re-read the article to get a better understanding, so in saying that I have bookmarked the post. Thanks again.
