Shared posts

30 Nov 14:46

Video interview: Xu Li on the three technical challenges of live video streaming

by 徐立

We interviewed Qiniu Cloud's Xu Li (徐立) on site at ArchSummit Shenzhen. He explained the advantages Qiniu found in building its live-streaming servers in Go, the technical difficulties of live streaming, the problems Qiniu runs into when working with third-party CDNs and how they are solved, and shared some optimization strategies for live broadcasts.

By 徐立
18 Aug 07:48

How Do Visually Impaired Programmers Code?

by 牧师

[Editor's note from 伯乐在线 (Jobbole)]:

First, the story of a blind programmer, T.V. Raman: he lost his sight to glaucoma at 14 and finished his undergraduate studies with the help of volunteers. In 1989, equipped with a speech synthesizer for the blind and the most advanced screen-reading software of the day, he went to the United States for a PhD and later became a computer scientist. Raman now works at Google Research, after earlier stints at IBM and Adobe, and can solve a Braille Rubik's Cube in 23 seconds. Today he uses a computer without any difficulty, browses the web every day, and can even read maps on a specially adapted phone. His full story is here: http://blog.jobbole.com/12176/

After reading Raman's story, someone asked on Quora: "How do visually impaired programmers code?" Jobbole has translated excerpts from three answers, two of them written by blind programmers themselves.

1. Tommy MacWilliam, Engineering Manager at Quora, 285 upvotes

Have you ever taken part in a Python spelling bee? Now imagine that being your everyday life.

One of my best friends was diagnosed in his senior year of high school with Leber's hereditary optic neuropathy, a condition that causes progressive loss of vision. By the time he started college, he had lost the sight in both eyes. Like me, he majored in computer science; unlike me, watching him program is one of the most incredible things I have ever seen.

In college, he rigged together a screen magnifier and text-to-speech software. The magnifier, called MAGic, could instantly zoom in on a spot of the screen until only a few letters were visible. The display itself was a huge, 30-inch, projector-like contraption, optimized specifically for people with visual impairments. As for the text-to-speech software, I remember he tried many different products and preferred JAWS. I particularly remember him saying that the existing open-source solutions were a joke compared with the expensive, heavyweight equipment he used (which, entirely to our school's credit, it paid for).

He always set his screen reader to maximum speed, around 300 words per minute, while audiobooks are typically narrated at about half that pace. To me it sounded like a completely different language, but to him it was perfectly fluent. In fact, he preferred a very old version of Firefox (around 3.5), because it supported his screen reader especially well. He relied almost entirely on keyboard shortcuts, which let him locate applications and switch between windows rapidly. In other words, the screen, magnified 40x, could barely keep up with him, yet he was more productive than many other programmers I have seen.

As for how he actually programmed: he used the Emacs editor (I suspect because he was already extremely fluent with its keyboard shortcuts), and as he scrolled through a file, the screen reader read the code aloud. Likewise, the screen reader read out terminal output, so anything that could not be turned into text and speech was off-limits. As you might imagine, programming languages are even harder to follow by ear than English words. Languages that do not lean heavily on syntactic symbols are easier to follow; English-like, friendly languages such as Python and Ruby fare better. That said, structuring a program with indentation is harder to follow than with curly braces, because you have to listen for the number of tabs on every line.

To illustrate what I mean, here is something that happened in our second introductory course. The class was taught in OCaml, a functional language with idiosyncratic syntax. He had to listen to baffling streams like "let rec fib n equals return match n with return pipe one hyphen greater than… semicolon semicolon", and at one point was working in a codebase that would not compile. He listened to the absurd syntax read aloud over and over, and everything sounded fine. He struggled through the class until a sighted teaching assistant pointed out that, for one reason or another, the screen reader was pronouncing the digit "0" as the letter "o" (that is, with an "oh" sound). And that is a whole class of bug that a sighted programmer never has to deal with.

It's worth mentioning that he is also passionate about HTML accessibility standards, especially ARIA. Most sites on the web ignore them entirely, even though implementing them is trivial. For him, a site with ARIA support versus one without is the difference between night and day.

Today, he is a full-time software engineer.

2. Parham Doustdar, 1,200+ upvotes

I am a blind PHP developer, and the way I program differs from the methods described in the other answers. Before I go on, let me tell you a little about myself.

I was born blind. The specifics of my eye condition don't really matter; the point is that I never had to cope with "losing" my sight, because that could never happen to me. That makes a lot of things simpler.

My development tool of choice is an IDE. Many of my blind friends don't use IDEs, most likely because IDE interfaces are not fully accessible to our screen readers. One problem with our industry is that it is so busy serving the majority that the voices of the minority go unheard. For example, the entire JetBrains IDE family is inaccessible; the ticket IDEA-111425 is the request to make it accessible to blind and visually impaired developers.

I personally use Zend Studio, an Eclipse-based IDE that most of you probably know or have used. Eclipse is one of those products that delight you: its accessibility support is implemented very well. It isn't 100% accessible, but 80% is enough for me. As a blind person, you learn to live with whatever you can get.

This means I don't have to memorize method signatures, documentation, and everything else, which frees my brain to think harder about other things, such as why a legacy codebase is so hard to understand.

I don't use a Braille keyboard. A Braille keyboard has only six keys, and every character requires a chord of those keys pressed together, which is much slower than a regular keyboard where many keys can be tapped one after another in quick succession. I don't use a Braille display either; I simply set my screen reader to 420 words per minute, which is far faster than reading a Braille display.

Most important of all, I believe, is that you must stay ahead of your peers. As a blind person you start out facing many difficulties, because you lack a sense most people have. That would not be a big problem were it not for the majority-versus-minority dynamic I mentioned earlier. Being in the minority, you have to find your own ways of doing the things everyone around you takes for granted. In this, I believe, strength is built on top of the ability you lack: it teaches you to improve yourself, to push forward, and to break through the impossible, instead of worrying every day about losing your job.

3. Lucas Radaelli, Google engineer, 17,900+ upvotes

I am completely blind. I work at Google, writing code that changes the ranking algorithm. In my experience, the way many blind programmers work is not very different from that of their sighted colleagues. Most of the time I use the Emacs editor (which has a speech extension called Emacspeak) and a browser for reading Google's internal pages.

(Lucas Radaelli and his guide dog)

The main difference is that we either listen to what is on the screen or read it on a Braille display. I can't comment on using a Braille display, because I have never bought one (they are far too expensive), but I can talk about how a programmer writes code by ear.

The biggest challenge of programming purely by listening is that you have to remember a great deal. You listen to code line by line; you can also listen word by word or character by character, but the problem is getting through the code on screen in a reasonable amount of time. Even then you can't start typing right away: you look up a function's definition, learn the names of the parameters it takes, and commit them to memory. If I expect to look up a definition again later, I set a mark there. As you can see, all this can eat up precious time, so a good memory is enormously useful.

I love programming with Emacspeak. When writing C++ code, for example, it provides a lot of very cool features: it uses different voice inflections when reading variables, functions, and other language elements, so they are easy to tell apart. You can think of it as an audio version of syntax highlighting.

One last note to satisfy everyone's curiosity: blind programmers don't indent code as they type. We usually write the code first and indent it afterwards; we gain nothing from indentation ourselves.

You might ask: what about Python, then? I actually quite like Python, even on the indentation front… and I have invented a few tricks, such as leaving a blank line at the end of every indented block so that I can quickly tell where a block ends. When reading other people's code, I first turn on a screen-reader option that announces indentation levels, though I find it a bit annoying, because for every line I read, the screen reader also reads out the number of spaces on that line.

"How Do Visually Impaired Programmers Code?" was first published on 伯乐在线 (Jobbole).

04 Dec 01:36

Travis-CI: What, Why, How

by Sayanee Basu

Travis CI makes working in a team for a software project easier with automated builds. These builds are triggered automatically when each developer checks in their code to the repository. In this article, we will go through how we can integrate Travis CI easily with our project, which is hosted on Github. With automation, notification and testing in place, we can focus on our coding and creating, while Travis CI does the hard work of continuous integration!


Hello Travis & CI!

Travis CI is a hosted continuous integration platform that is free for all open source projects hosted on Github. With just a file called .travis.yml containing some information about our project, we can trigger automated builds with every change to our code base in the master branch, other branches or even a pull request.

Before we get started with how we can integrate Travis with our project, the following prerequisites will be helpful:

  1. Git
  2. Github
  3. Basic NodeJS
  4. NPM
  5. GruntJS

At the heart of using Travis is the concept of continuous integration (CI). Let's say we are working on one feature and after we are done coding, we will typically build the project so as to create the executable as well as other files necessary to run the application. After the build is completed, good practices include running all the tests to ensure they are all passing and everything is working as expected.

The last step is ensuring that whatever we coded is indeed working even after we integrate it into the mainline code. At this point we build and test again. If the integrated build succeeds we can consider that the feature has been fully implemented. Travis CI automates this exact step of triggering a build and test upon each integration to the master branch, other branches or even a pull request, accelerating the time to detection of a potential integration bug.

In the following sections, we will take a simple project and trigger a failing build, correct it and then pass it. We will also see how Travis CI easily works with Github pull requests.


Travis Interface

When we land on the main homepage, we can see the "busyness" of many open source projects going through automated builds. Let's deconstruct the interface and understand its various parts:

travis-interface
  1. Sidebar: This shows the list of public open source projects on Github currently going through automated builds. Each item shows the hyperlinked project name, the duration of the build so far, and the sequential build number.
  2. Build in progress [yellow]: A little yellow colored circle beside the project name indicates that the build is in progress.
  3. Build failed [red]: A little red colored circle beside the project name indicates that the build is complete and it has failed.
  4. Build passed [green]: A little green colored circle beside the project name indicates that the build is complete and it has passed.
  5. Project name and links: The title is in the format username/repository and it is linked to the Travis CI build page. The little Octocat symbol beside it links to the Github page of the repository containing its source code.
  6. Types of build: The automated builds can be triggered by committing the code to the master branch, other branches or even a pull request. By visiting the individual tab, we can get more information about the builds.
  7. Build activity: This section will include information about each of the tasks that the build is running.

Step 1: Hello World!

Before we integrate Travis CI, we will create a simple “hello world” project and create some build tasks. Travis supports various programming languages including Python, Ruby, PHP and JavaScript with NodeJS. For the purpose of our demo, we will use NodeJS. Let’s create a very simple file hello.js as defined on the main website of NodeJS:

var http = require('http');

http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n') // missing semi-colon will fail the build
}).listen(1337, '127.0.0.1');

console.log('Server running at http://127.0.0.1:1337/');

Do notice that there is a missing semi-colon, so that later on JSHint, a JavaScript linter, will be able to detect this and raise an error. We will build the project using a task runner called GruntJS that will include JSHint. This is of course an illustration, but in real projects we can go on to include various testing, publishing, linting and hinting tasks.

To indicate the various packages required for GruntJS, JSHint and the rest, we will create a second file called package.json. This file will first contain the name and version number of our simple application. Next, we will define the dependencies needed under devDependencies, including the GruntJS-related packages and JSHint. With scripts, we tell Travis CI to run the test suite using the command grunt --verbose. Let's see the full contents of package.json:

{
  "name": "node-travis",
  "version": "0.1.0",
  "devDependencies": {
    "grunt": "0.4.1",
    "grunt-cli": "0.1.9",
    "grunt-contrib-jshint": "0.6.0"
  },
  "scripts": {
    "test": "grunt --verbose"
  }
}

Next, let’s prepare the Gruntfile.js that will include all the tasks required to run our build. For simplicity, we can include just one task – JavaScript linting with JSHint.

module.exports = function(grunt) {

  grunt.initConfig({
    jshint: {
      all: ['Gruntfile.js', 'hello.js']
    }
  });

  grunt.loadNpmTasks('grunt-contrib-jshint');
  grunt.registerTask('default', 'jshint');

};

Finally, we will run the build that contains only one task after we download all the related packages with npm install:

$ npm install
$ grunt

As expected, the build will not pass because the JSHint will detect a missing semi-colon. But if we place the semi-colon back into the hello.js file and run the grunt command once again, we will see that the build will pass.

travis-fail-pass

Now that we have created a simple project locally, we will push this project to our Github account and integrate Travis CI to trigger the build automatically.


Step 2: Hello World With Travis CI

The very first step in integrating Travis CI is to create a file called .travis.yml which will contain the essential information about the environment and configurations for the build to run. For simplicity, we will just include the programming environment and the version. In our simple project, it is NodeJS version 0.10. The final contents of the file .travis.yml will be as follows:

language: node_js
node_js:
  - "0.10"
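Travis can also run the build against several runtimes at once: listing more than one version makes Travis run one build per entry. A hedged sketch, with the extra version number chosen purely for illustration:

```yaml
language: node_js
node_js:
  - "0.10"
  - "0.8"
```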

Now our project will consist of the following files along with README.md and .gitignore as required:

$ tree
.
|-- .travis.yml
|-- Gruntfile.js
|-- hello.js
|-- .gitignore
|-- README.md
`-- package.json

Let’s now create a git repository and push to a new remote repository hosted on Github:

git init
git add .
git commit -m "first commit"
git remote add origin git@github.com:[username]/[repository].git
git push -u origin master
travis-github

Next, log in to Travis CI and authorize Travis CI to access your Github account. Afterwards, visit your profile page to turn on the hook for the Github repository to trigger automated builds with Travis CI.

travis-profile

As a final step to trigger our very first build, we will need to push to Github. Let’s remove the semi-colon in the file hello.js to make a failing build and then push to Github. This will trigger the automated build in Travis CI. Let’s visit the URL: https://travis-ci.org/[username]/[repo] to see the first build in progress!

git add hello.js
git commit -m "removed semi-colon"
git push
travis-build-fail

This failing build in the above example is really a simple illustration. But this situation is reflective of something that might happen in our real projects – we try to integrate our code and the automated build fails. By default, after each build is completed, Travis CI will send emails to the commit author and repository owner. In this way, the developer that pushed the code is immediately alerted and can then fix the integration errors. In our case, let’s just insert the missing semi-colon and push to Github one more time.

git add hello.js
git commit -m "added semi-colon to pass the build"
git push
travis-build-pass

Hurray! The automated build has passed this time. Our code is integrated passing all the required tests. Now each time we try to integrate our changes whether it is to the master branch or even other branches, Travis CI will trigger an automated build.


Pull Requests

Once we have integrated Travis CI into our project, a pull request will also trigger an automated build. This is immensely useful for the repository owner or the developer who is in charge of merging the code base. Let’s see how Travis CI will advise whether the pull request is good to merge or not.

First, using another Github account, let's fork the original repository and open a pull request with the following steps:

  1. Fork the original repository
  2. Create a new branch in the forked repository
  3. Make the new changes and commit them
  4. Ensure the feature branch is chosen
  5. Compare and pull request

Merge With Caution

To simulate a failing build in the pull request, we will once again remove the semi-colon in the file hello.js, commit and push the changes and finally pull request.

travis-pull

Upon each pull request, Travis CI will automatically trigger the build. This time, we can also visit the “Pull Requests” tab to see the history of current or past builds triggered due to a pull request.

travis-pull-fail

After Travis CI completes the build, if we visit the pull request page from the original repository, we will see that Travis CI has appended some user-interface changes to alert us that the build has failed.

travis-pull-fail-advise

Good to Merge

The repository owner as well as the developer who opened the pull request will be notified immediately of this failing build status. Depending on the reason for the failure, it can then be rectified with another commit to the same branch. So let's add the semi-colon back and update the pull request one last time. Github will automatically update the pull request page as well.

travis-pull-pass

And finally, when we come back to the original repository’s pull request page, this time we will see a “green” signal to go ahead and do a merge as our build is passing!

travis-pull-pass-advise

Build Configurations

The file .travis.yml defines the build configurations. Our example included just the language type and version, but we can add-on more useful ones as follows:

  1. Language specific. This is an example for Ruby:

        language: ruby
        rvm:
          - 1.9.3

  2. Commands or scripts to run before or after each build. This is an example of a command before running a script:

        before_script:
          - git config --global user.name [myname]

  3. Notifications in terms of emails or chat alerts are sent as declared by the build configurations. This is an example of turning off emails and sending notifications to IRC:

        notifications:
          email: false
          irc: "chat.freenode.net#travis"

Validating .travis.yml

As you can see, the file .travis.yml is critical in triggering automated builds. If this file is not valid, Travis CI will not trigger the build upon each push to Github. Hence it is important to ensure we have a valid file that Travis CI will interpret correctly. For this, we will install a gem called travis-lint and run it against the file .travis.yml:

$ gem install travis-lint
$ travis-lint .travis.yml
travis-lint

Build Status Images

It’s really helpful to include a little image to indicate the current status of the build. The image itself can be accessed from the URL pattern http://travis-ci.org/[username]/[repository-name].png. Another way to quickly access the images embedded in various formats is on the Travis CI project page itself. For example, we can copy the Markdown format and embed in the project’s README.md file.
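For instance, using the URL pattern above, the Markdown embedded in README.md would look roughly like this (a sketch: [username] and [repository-name] are placeholders, and the exact snippet can be copied from the Travis CI project page):

```markdown
[![Build Status](http://travis-ci.org/[username]/[repository-name].png)](http://travis-ci.org/[username]/[repository-name])
```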

travis-build

Another cool way to track the build statuses of various open source projects while surfing around Github is to install one of the browser extensions. This will put the build status images prominently right next to each of the project names.


Resources on Travis CI

Here are some resources on the concept of continuous integration as well as learning and integrating Travis CI into our Github projects:

A fantastic way to learn what and how to include the various build configurations in the .travis.yml file is to actually browse through many of the popular open source repositories that already integrate Travis CI. Here are a few:

  1. Ruby on Rails (Ruby)
  2. BackboneJS (JavaScript)
  3. Composer (PHP)
  4. Flask (Python)

I hope this gave you a brief introduction to how we can easily integrate Travis CI in our Github projects. It’s really easy to use, so give it a try and make continuous integration a breeze for your team!

03 Dec 09:41

A Complete Guide to the Table Element

by Chris Coyier
Yunchenge

table

The <table> element in HTML is used for displaying tabular data. You can think of it as a way to describe and display data that would make sense in spreadsheet software. Essentially: columns and rows. In this article we're going to look at how to use them, when to use them, and everything else you need to know.

A Very Basic Example

Here's a very simple demo of tabular data:

See the Pen xkrGs by Chris Coyier (@chriscoyier) on CodePen

It is data that is useful across multiple axes. Imagine running your finger across a row (horizontal) to see a single person and relevant information about them. Or up and down a column (vertical) to get a sense of the variety or pattern of data on that point.

Head and Body

One thing we didn't do in the very basic example above is semantically indicate that the first row was the header of the table. We probably should have. That entire first row contains no data, it is simply the titles of columns. We can do that with the <thead> element, which would wrap the first <tr> (it could wrap as many rows as needed that are all header information).

That HTML would be like this:

See the Pen tzjid by Chris Coyier (@chriscoyier) on CodePen

When you use <thead>, there must be no <tr> that is a direct child of <table>. All rows must be within either the <thead>, <tbody>, or <tfoot>. Notice that we also wrapped all the rows of data in <tbody> here.

Foot

Along with <thead> and <tbody> there is <tfoot>, for wrapping the table rows that make up the footer of the table. Like <thead>, it is best used for semantically indicating that these are not data rows but ancillary information.

What is unique about <tfoot> is its placement in the HTML. It comes after <thead> and before <tbody>! You might think it would be the last thing before the end of <table>, but not so in this case. I believe this is an accessibility concern: since the footer may contain information necessary to understand the table, it should come before the data in the source order.

Despite coming first in the source order, <tfoot> does indeed render at the bottom of the table, making it a rather unusual HTML element.

It can be used, for example, to repeat the header in the case of a visually very tall/long table where it may be easier to see the column titles at the bottom than the top. Although you don't necessarily need to use it that way.
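To make the source order concrete, here is a minimal sketch (the cell contents are made up): <tfoot> sits between <thead> and <tbody> in the markup, yet renders at the bottom.

```html
<table>
  <thead>
    <tr><th>Name</th><th>Score</th></tr>
  </thead>
  <tfoot>
    <!-- written second, rendered last -->
    <tr><th>Name</th><th>Score</th></tr>
  </tfoot>
  <tbody>
    <tr><td>Alice</td><td>90</td></tr>
    <tr><td>Bob</td><td>85</td></tr>
  </tbody>
</table>
```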

See the Pen mIjil by Chris Coyier (@chriscoyier) on CodePen

<tfoot> is just begging for clever tricks with layout where the position of elements jumps around from bottom to top depending on needs. For instance a nav bar along the bottom of a screen, but with the HTML source toward the beginning where navigation should be.

The Cells: td and th

The individual cells of a table are always one of two elements: <td> or <th>. You can put whatever you want inside a table cell, but these are the elements that make them a table cell. <th> elements are "tabular headers" and <td> elements are "tabular data".

Using our existing simple demo, the top row is all headers. Not data, just titles for the data. All the rest of the rows are data. So:

See the Pen npvAf by Chris Coyier (@chriscoyier) on CodePen

<th> elements are not necessarily limited to being within the <thead>. They simply indicate header information. So they could be used, for instance, at the start of a row in the <tbody>, if that was relevant. We'll cover a case like that later.

Basic Styling

Most tables you will ever see use colors and lines to distinguish different parts of the table. Borders are very common. By default, all table cells are spaced out from one another by 2px (via the user-agent stylesheet), like this:

See the Pen GmsEc by Chris Coyier (@chriscoyier) on CodePen

Notice the slight extra gap between the first row and the rest. That is caused by the default border-spacing applied to the <thead> and <tbody>, pushing them apart a bit extra. These aren't margins, so they don't collapse. You can control that spacing like so:

table {
  border-spacing: 0.5rem;
}

But far more common is to remove that space. That property is completely ignored and the space collapsed if you do:

table {
  border-collapse: collapse;
}

Just a little padding, borders, and making those <th> elements be left-aligned goes a long way to make for a simple, styled table:

See the Pen kaErt by Chris Coyier (@chriscoyier) on CodePen

Connecting Cells

There are two important attributes that can go on any table cell element (<th> or <td>): colspan and rowspan. They accept any positive integer 2 or larger. If a td has a colspan of 2 (i.e. <td colspan="2">) it will still be a single cell, but it will take up the space of two cells in a row horizontally. Likewise with rowspan, but vertically.

See the Pen fixJg by Chris Coyier (@chriscoyier) on CodePen

You'll have to do a bit of mental math every time you start working with connected cells. Colspan is fairly easy: any table cell is "worth" one, unless it has a colspan attribute, in which case it's worth that many. Add up the values of the cells in each table row; every row should total the same value, or else you'll get an awkward table layout that doesn't form a rectangle (the longest row will stick out).

Rowspan is similar, just a little harder and more of a mental leap, because columns aren't grouped the way rows are. If a table cell has rowspan="2", it spans two rows vertically. That means the row below it gets +1 to its table cell count, and needs one fewer table cell to complete the row.

It can be awkward to work out in your head, but we're developers here, we can do it =).
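As a quick sketch of that arithmetic (the cell labels are made up), every row in the table below is "worth" 3. The first row is 2 + 1; in the second row one cell carries rowspan="2", so the third row needs only two cells of its own:

```html
<table>
  <tr>
    <td colspan="2">A (worth 2)</td>
    <td>B</td>
  </tr>
  <tr>
    <td rowspan="2">C (spans down)</td>
    <td>D</td>
    <td>E</td>
  </tr>
  <tr>
    <!-- C occupies the first slot of this row -->
    <td>F</td>
    <td>G</td>
  </tr>
</table>
```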

Often these attributes are used in really simple ways like connecting a few related table headers:

See the Pen AlxGt by Chris Coyier (@chriscoyier) on CodePen

As Wide As They Need To Be... Or fill the container... Or beyond

The table element itself is unusual in how wide it is. It behaves like a block-level element (e.g. <div>) in that if you put one table after another, each will break down onto its own line. But the actual width of the table is only as wide as it needs to be.

See the Pen JnLIt by Chris Coyier (@chriscoyier) on CodePen

If the text in the table's widest row happens to be only 100px wide, the table will be 100px wide. If the amount of text (if put on one line) would be wider than the container, it will wrap.

However if text is told to not wrap (i.e. white-space: nowrap;) the table is happy to bust out of the container and go wider. Table cells will also not wrap, so if there are too many to fit, the table will also go wider.
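For instance, a single declaration like this (the selector is just for illustration) is enough to let long cell content push the table wider than its container:

```css
td {
  white-space: nowrap; /* cells refuse to wrap, so the table grows instead */
}
```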

See the Pen ILrKi by Chris Coyier (@chriscoyier) on CodePen

Two Axis Tables

Sometimes it makes sense for tabular data to have two axes. Like a cross-reference situation. A multiplication table is an example of this:

See the Pen Multipication Table by Chris Coyier (@chriscoyier) on CodePen

I might skip a <thead> in this situation even though that first row is all headers. It's no more important than the vertical column of headers, so it feels weird to group the top row alone. Just make one row of all <th>, then each subsequent row with only the first cell as a <th>.

When To Use Tables

It's a good time to take a break and discuss the when of tables. Perhaps you've heard the generic advice: tables are for tabular data (see the first sentence of this blog post). The "would this make sense in a spreadsheet?" test is usually appropriate.

What kinds of things are appropriate in tables? Here are some: a plan/pricing/features comparison, bowling scores, an internal grid of employee data, financial data, a calendar, the nutrition facts information panel, a logic puzzle solver, etc.

You might occasionally hear: tables are unsemantic. That's not true - they semantically indicate tabular data. Tables are the right choice when that is the case.

When NOT To Use Tables

An inappropriate use for tables is layout. That may seem counter-intuitive. A glance at how tables work may make them seem ideal for layout: easy to control, extremely logical, predictable, and not at all fragile.

There are some significant problems with using tables for layout though. First, HTML tags mean things. As we covered, table elements semantically describe tabular data. Using them for anything else is a breach of semantic duty. You aren't going to get a fine in the mail, but you aren't getting as much value from your HTML as you could.

Talking about semantics is a little difficult sometimes (some reads: 1, 2, 3, 4, 5), so let's talk about something we all generally agree on (even if we aren't as good at it as we want to be): websites should be accessible. One part of accessibility is screen readers. Screen readers read tables from top to bottom, left to right. That means the order in which your site is presented is dictated by the table structure, which is dictated by visual choices rather than accessibility choices. Not to mention that a screen reader may even announce the start of tabular data, which would be worse than useless.

Speaking of source order, that affects more than accessibility. Imagine a "sidebar on the left" layout. A table would dictate that table comes first in the source order, which while also being bad for accessibility, is likely bad for SEO as well, potentially valuing your ancillary content above primary content.

Could you fix the SEO issues by using semantic tags within the table tags? Possibly somewhat, but now you're using double the HTML. If you really need the layout abilities of a table but want to use semantic tags, see the next section. If you are somehow absolutely stuck using table tags for layout, use the ARIA role="presentation" on the table to indicate it as such.

As I write this in the latter half of 2013, tables have become far less prevalent and even appealing as a layout choice. We're seeing a lot more use of fixed and absolute positioning which you cannot do inside a table. We're seeing flexbox being awesome and being right on the edge of mainstream usability. We're seeing grid layout starting to grow up. We're seeing inline-block be used powerfully. We're seeing the fragility of floats in the olden days fade away.

Rarely do you see modern websites touch tables for layout. The last holdout is HTML emails. The landscape of what renders emails is super wide. It is everything we deal with on the web, plus the world of native apps on both mobile and desktop on operating systems new and ancient. You can do some progressive enhancement for emails, but the layout itself is still generally regarded as being safest done in tables. That is substantiated by the fact that the major email sending services still all offer templates as tables.

Making Semantic Elements Behave Like a Table

CSS has properties to make any element you wish behave as if it was a table element. You'll need to structure them essentially as you would a table, and it will be subject to the same source-order-dependency as a table, but you can do it. I'm not crapping on it either, it's genuinely useful sometimes. If that layout style solves a problem and has no negative order implications, use it.

Don't use inline styles, but just for understanding here's how that would go:

<section style="display: table;">
  <header style="display: table-row;">
    <div style="display: table-cell;"></div>
    <div style="display: table-cell;"></div>
    <div style="display: table-cell;"></div>
  </header>
  <div style="display: table-row;">
    <div style="display: table-cell;"></div>
    <div style="display: table-cell;"></div>
    <div style="display: table-cell;"></div>
  </div>
</section>

A handy trick here is that you don't even need the table-row element in there if you don't want. A bunch of display: table-cell; elements that are children of a display: table; element will behave like they are all in one row.
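A minimal sketch of that trick (the contents are made up): with no table-row element at all, these cells still line up as one row.

```html
<div style="display: table;">
  <div style="display: table-cell;">One</div>
  <div style="display: table-cell;">Two</div>
  <div style="display: table-cell;">Three</div>
</div>
```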

You always alter the display property of the element to get the table-style behavior. Here are the values:

display: table                /* <table>     */
display: table-cell           /* <td>        */
display: table-row            /* <tr>        */
display: table-column         /* <col>       */
display: table-column-group   /* <colgroup>  */
display: table-footer-group   /* <tfoot>     */
display: table-header-group   /* <thead>     */

Notice there is no <th> alternative. That is for semantic value only. It otherwise behaves just like a <td>, so, no need to replicate it in CSS.

There is also display: inline-table;, which is pretty interesting. Remember, we talked above about how weird table element widths are: they are only as wide as they need to be, yet they break onto new lines. It's almost as if they are inline-block elements which happen to force a break. inline-table makes them literally like inline-block elements, without the breaking.

If you want to learn a lot more about using semantic elements but also table-style layout, check out the book Everything You Know About CSS Is Wrong!

I've never been a huge fan of that title, as it suggests that using this table-style layout is the right way and any other layout technique is the wrong way. But as I've said, this can be tremendously useful and I'm glad it's in CSS. Just be acutely aware that no matter what kind of elements you use to create a table-based layout, it's still subject to the same problems (largely source order dependency).

All Table Related Elements

There are a few elements above that we haven't touched on yet. Let's look at all the HTML table-related elements. You know what, we might as well use a table to do it:

Element | What it is
<table> | The table itself
<caption> | The caption for the table. Like a figcaption to a figure.
<thead> | The table header
<tbody> | The table body
<tfoot> | The table footer
<tr> | A table row
<th> | A table cell that is a header
<td> | A table cell that is data
<col> | A column (a no-content element)
<colgroup> | A group of columns

All Table Related Attributes

There are surprisingly few attributes that are specific to tables. Of course you can use class and ID and all the typical global attributes. There used to be quite a few, but most of them were specific to styling and thus deprecated (as that is CSS's job).

Attribute | Element(s) found on | What it does
colspan | th, td | Extends a cell to be as wide as 2 or more cells
rowspan | th, td | Extends a cell to be as tall as 2 or more cells
span | col | Makes the column apply to 2 or more columns
sortable | table | Indicates the table should allow sorting
headers | td | A space-separated string corresponding to the IDs of the <th> elements relevant to that data
scope | th | row, col, rowgroup, or colgroup (default). Essentially specifies the axis of the header. The default is that a header heads a column, which is typical, but a row might start with a header too, in which case you would scope that header to the row or rowgroup.
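For instance, a minimal sketch of scoping a header to its row (the data is made up):

```html
<tr>
  <th scope="row">2013</th>
  <td>$10</td>
  <td>$12</td>
</tr>
```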

Deprecated Attributes

Don't use any of these. They are deprecated. While they may work in some browsers today, there is a chance they will stop working in the future.

Deprecated Attribute What to use instead
align Use float property instead
valign Use vertical-align property instead
char The correct answer is to use text-align: "x"; where x is the character to align on, but it's not implemented anywhere yet. But this attribute isn't supported either, so no big loss.
charoff See above
bgcolor Use background property instead
abbr "consider starting the cell content by an independent abbreviated content itself or use the abbreviated content as the cell content and use the long content as the description of the cell by putting it in the title attribute"
axis Use the scope attribute instead
border Use border property instead
cellpadding Use padding property instead
cellspacing Use border-spacing property instead
frame Use border property instead
rules Use border property instead
summary Use <caption> element instead
width Use width property instead

The Table Stack

There is an implied vertical stacking of table elements, just like there is in any HTML parent > descendent scenario. It is important to understand in tables because it can be particularly tempting to apply things like backgrounds to the table itself or table rows, only to have the background on a table cell "override" it (it is actually just sitting on top).

Here's how that looks (using Firefox 3D feature in its dev tools):

Important Style Rules for Tables

You can use most CSS properties on table elements. font-family works on tables just like it does on any other element, for example. And the rules of the cascade apply. If you apply font-family to the table but a different font-family to a table cell, the table cell wins, because that is the actual element containing the text.
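For example:

```css
table { font-family: Georgia, serif; }  /* inherited by everything inside */
td    { font-family: monospace; }       /* but the cell's own rule wins for cell text */
```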

These properties are either unique to table elements or they behave uniquely on table elements.

CSS Property Possible values What it does
vertical-align baseline
sub
super
text-top
text-bottom
middle
top
bottom
%
length
Aligns the content inside a cell. Works particularly well in tables, although only the top/bottom/middle make much sense in that context.
white-space normal
pre
nowrap
pre-wrap
pre-line
Controls how text wraps in a cell. Some data may need to be all on one line to make sense.
border-collapse collapse
separate
Applied to the table to determine if borders collapse into themselves (sort of like margin collapsing only bi-directional) or not.
border-spacing length If border-collapse is separate, you can specify how far cells should be spaced out from each other. Modern version of cellspacing attribute. And speaking of that, padding is the modern version of the cellpadding attribute.
width length Width works on table cells just about how you would think it does, except when there is some kind of conflict. For instance if you tell the table itself to be 400px wide then the first cell of a three-cell row to be 100px wide and leave the others alone, that first cell will be 100px wide and the other two will split up the remaining space. But if you tell all three of them to be 10000px wide, the table will still be 400px and it will just give each of them a third of the space. That's assuming white-space or elements like an image don't come into play. This is probably a whole post in itself!
border length Border works on any of the table elements and just about how you would expect. The quirks come in when you collapse the borders. In this case all table cells will have only one border width between them, rather than the two you would expect them to have (border-right on the first cell and border-left on the next cell). In order to remove a border in a collapsed environment, both cells need to "agree" to remove it. Like td:nth-child(2) { border-right: 0; } td:nth-child(3) { border-left: 0; } Otherwise, source order/specificity wins which border is shown on which edge.
table-layout auto
fixed
auto is the default. The width of the table and its cells depends on the content inside. If you change this to fixed, the table and column widths are set by the widths of table and col elements or by the width of the first row of cells. Cells in subsequent rows do not affect column widths, which can speed up rendering. If content in subsequent cells can't fit, the overflow property determines what happens.

This list isn't exhaustive. There are other CSS quirks that are relevant to tables. For instance, you can't relatively position a table cell in order to either nudge it around or absolutely position things within it. There are ways around that, though.

If you can think of more CSS weirdness with tables, share in the comments below.

Default Styles / User Agent Stylesheet

WebKit does this:

table {
    display: table;
    border-collapse: separate;
    border-spacing: 2px;
    border-color: gray
}

thead {
    display: table-header-group;
    vertical-align: middle;
    border-color: inherit
}

tbody {
    display: table-row-group;
    vertical-align: middle;
    border-color: inherit
}

tfoot {
    display: table-footer-group;
    vertical-align: middle;
    border-color: inherit
}

table > tr {
    vertical-align: middle;
}

col {
    display: table-column
}

colgroup {
    display: table-column-group
}

tr {
    display: table-row;
    vertical-align: inherit;
    border-color: inherit
}

td, th {
    display: table-cell;
    vertical-align: inherit
}

th {
    font-weight: bold
}

caption {
    display: table-caption;
    text-align: -webkit-center
}

I inspected each element in Chrome Dev Tools too, which is now on Blink, and it's still the same.

It's funny though. For sure, the text in <th>s is centered (text-align: center;) by default. But that's not in the UA stylesheet. Not a huge deal but rather mysterious and makes you wonder what other mysterious things happen in rendering.

The UA stylesheet for tables differs from browser to browser. For example, in Firefox (here's 3.6's UA Stylesheet, but this is true in v23 too) table cells have this:

td { 
  display: table-cell;
  vertical-align: inherit;
  text-align: inherit; 
  padding: 1px;
}

Most notably, 1px of padding that WebKit doesn't have. Not a huge deal in most cases, surely, but it is different. That's what CSS resets (and related projects) are all about: removing the differences. So let's check those out.

Resetting Default Table Styles

The most popular CSS reset in the world, the Meyer Reset, does this to tables:

table, caption, tbody, tfoot, thead, tr, th, td {
  margin: 0;
  padding: 0;
  border: 0;
  font-size: 100%;
  font: inherit;
  vertical-align: baseline;
}
table {
  border-collapse: collapse;
  border-spacing: 0;
}

It's done the same way in the HTML5 Reset and the HTML5 (Doctor) Reset Stylesheet.

There is an alternative to CSS resets though, Normalize.css. The philosophy is slightly different. Rather than zero out all styles, it specifically sets known-to-be inconsistent styles to a reasonable default. My general advice on using Normalize.css is: don't remove anything from it. If it's in there, it needs to be for consistency. But feel free to change anything in there.

Normalize only does this to tables:

table {
  border-collapse: collapse;
  border-spacing: 0;
}

I'll have to dig into the reasoning here a little deeper because it seems unusual...

  1. I'm a fan of border-collapse: collapse because spacing between cells is usually way awkward, but the default in every browser I know of is border-collapse: separate; so it isn't in need of normalization.
  2. If border-collapse is collapse, border-spacing doesn't matter.
  3. Table cell elements are in need of normalization (e.g. Firefox padding difference) but that isn't there.

Not a huge deal.

This is the kind of thing I would probably normally do:

table {
  border-collapse: collapse;
  width: 100%;
}
th, td {
  padding: 0.25rem;
  text-align: left;
  border: 1px solid #ccc;
}

"Implied" Elements and Unclosed Tags

Check out this rather awkward bit of HTML:

<table>
  <col>
  <tr>
    <td>Cell
</table>

This may be weird to look at, but it's valid. What's going on here?

  • The <col> tag is just one of those no-content elements that doesn't ever need a closing tag. Like <br> / <br />
  • The <td> element doesn't need to be closed in certain circumstances: "The end tag may be omitted, if it is immediately followed by a <th> or <td> element or if there are no more data in its parent element."
  • The missing closing </tr> tag is the same story: "The end tag may be omitted if the <tr> element is immediately followed by a <tr> element, or if the parent table group (<thead>, <tbody> or <tfoot>) element doesn't have any more content."

If we inspect the rendered table in the browser, we can see that the tags that were missing their closing tags are shown with closing tags. Those are automatically added for us. But there are also some brand new elements in there:

One thing to notice is the <col> is wrapped within a <colgroup> automatically. Even if we were to do:

<table>
  <col>
  <colgroup>
    <col>
  </colgroup>
  <tr>
    <td>Cell
    <td>Cell
</table>

Then:

colgroup:first-child {
  background: red;
}

You would think the second column would be red, not the first, because that explicit colgroup only affects the second column. But when rendered, the bare <col> gets wrapped in an implied <colgroup> of its own, so the CSS selector matches that one and the first column ends up red.

The <tbody> element is also implied. If you don't use any of tbody, thead, or tfoot, the whole guts of the table will be wrapped in a tbody. If you use a thead, the whole table will be wrapped in that until it finds a tbody; then it will auto-close the thead if you don't, and wrap the rest in tbody (also optional to close). If it finds a tfoot, you can imagine what happens (although remember tfoot should come before tbody).

You can actually use these elements in CSS selectors even though you didn't put them in your actual HTML. I probably wouldn't advise it just because that's weird, confusing, and styling tag selectors usually isn't advisable anyway.

Making a Table Not a Table

A situation may arise someday where you need to force a table element to not exhibit its table-style layout behavior and behave more like a regular element.

The trick is essentially to reset the display property of the table cells:

th, td {
  display: inline;
}

We can pretty quickly un-table a table:

See the Pen Untabling by Chris Coyier (@chriscoyier) on CodePen

Just to be safe, I'd reset the whole she-bang. Just makes me feel better knowing parent elements are also along for the ride and won't get freaky.

table, thead, tbody, tfoot, tr, td, th, caption {
  display: block;
}

This is primarily useful in responsive design where the traditional table layout makes sense on large screens but needs significant shifts to make sense on smaller screens. There is a whole section on that below.

Table Accessibility

We already talked about the problems with using tables for layout and accessibility. But assuming table is being correctly used for tabular data, there are still quite a few accessibility concerns.

There are some great articles on this out there:

Zebra Striping Tables

If you don't set a background-color on the table cell elements, you can set them on the table rows themselves. So at the most basic, you can do:

tbody tr:nth-child(odd) {
  background: #eee;
}

We're using the tbody in the selector because it's unlikely you'd want to stripe header and footer rows. Set the even rows as well if you want to be specific about it instead of letting what is underneath show through.

If you need to support browsers that don't understand :nth-child() (pretty damn old) you could use jQuery to do it.

See the Pen Zebra Striped Table by Chris Coyier (@chriscoyier) on CodePen

Studies seem to show that zebra striping is generally a good idea.

Highlighting Rows and Columns

Highlighting a particular row is fairly easy. You could add a class name to a row specifically for that:

<tr class="highlight">
  ...
</tr>

See the Pen Zebra Striped Table by Chris Coyier (@chriscoyier) on CodePen

Highlighting a column is a bit trickier. One possibility is to use the <col> element, which does allow us to set styles for cells that appear in that column. It's weird to wrap your head around, because the cells that are affected by <col> aren't actually descendants of it. The browser just kinda knows what you mean.

A table with four columns in each row would have four <col> elements:

<table>
  <col>
  <col>
  <col>
  <col>

  <thead>
     ...

</table>

Then you could highlight a particular one, like:

col:nth-child(3) {
  background: yellow; 
}

See the Pen Zebra Striped Table by Chris Coyier (@chriscoyier) on CodePen

However this is rarely useful. If you set the background of a row element or table cell element, that will always beat a background of a column element. Regardless of specificity.

You're probably better off setting a class name on each individual table cell element that happens to match that column position in the row. Like:

td:nth-child(2),
th:nth-child(2){
  background: yellow;
}

Highlighting Column/Row/Cell on Hover

Cell highlighting is very easy. You can do it right in CSS:

td:hover { /* th:hover also if you wish */
  background: yellow;
}

Row highlighting is just as easy. You can set the background on table rows and it will show as long as you don't set a background on the table cells.

tbody tr:hover {
  background: yellow;
}

See the Pen Zebra Striped Table by Chris Coyier (@chriscoyier) on CodePen

If you do set a background on the table cells, you can always just do tr:hover td, tr:hover th { }, so it's still pretty easy.

Column highlighting is trickier. You can't use col:hover because those columns aren't actual elements that take up pixel space on the screen that you could hover over. The only option is JavaScript.

I wrote it up in Vanilla JavaScript here, just for fun:

See the Pen Column Highlighting in Vanilla JavaScript by Chris Coyier (@chriscoyier) on CodePen

It works like this:

  1. Get a collection of all cells
  2. Bind a mouseover and mouseout event to all those cells
  3. When the mouseover event fires, get the position in the row of that cell
  4. Loop through all rows and add a highlighting class to each cell in that row that matches that position
  5. When the mouseout event fires, remove the highlighting class from all cells
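Steps 3 and 4 are really just index bookkeeping, which can be sketched without any DOM at all (this ignores colspan, which would need real cellIndex handling):

```javascript
// Given a grid of cells (an array of rows) and the hovered cell's
// column position, collect the matching cell from every row.
function columnCells(grid, colIndex) {
  return grid.map(function(row) { return row[colIndex]; });
}

var grid = [
  ['a1', 'b1'],
  ['a2', 'b2'],
  ['a3', 'b3']
];
// the second column: b1, b2, b3
console.log(columnCells(grid, 1).join(','));
```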

And here I've combined both row and column highlighting. I used jQuery to make it all 12 lines of code (the raw JavaScript was getting pretty exhausting).

See the Pen Row & Column Highlighting in jQuery by Chris Coyier (@chriscoyier) on CodePen

It's the same concept, it's just much easier to make element collections, and find and select by indexes in jQuery.

Nicely Styled Tables

Some depth, visually distinct headers, and a footer matching the header.

See the Pen IyDpa by Phelipe M. Peres (@mestremind) on CodePen


When the table is hovered, only the currently highlighted row keeps dark text; the others fade back. Also note on this one: the rounded corners on the table itself are only possible while you have border-collapse: separate;

See the Pen Crisp table by JOAQUIN RAPHAEL ARCAJO (@jrarcajo) on CodePen


Here's another where the non-hovered rows literally blur:

See the Pen Fade and Blur on Hover Data Table by Jack Rugile (@jackrugile) on CodePen


Twitter Bootstrap has very minimal table styling:

See the Pen KnJfk by Chris Coyier (@chriscoyier) on CodePen


This one, as a bonus, has keyboard control!

See the Pen HeavyTable by Victor Darras (@victordarras) on CodePen


I'm trying to keep a collection of well-designed tables for reference. So if you have any good ones, let me know. Hong Kiat also has a blog post collection.

Table Search

Where table sorting can be quite complicated, table search can be quite easy. Add a search input, and if the value in there matches text anywhere in a row, show it, and hide the others. With jQuery that might be as easy as:

var allRows = $("tr");
$("input#search").on("keydown keyup", function() {
  allRows.hide();
  $("tr:contains('" + $(this).val() + "')").show();
});
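The matching itself can be pulled out into a plain, DOM-free function. One assumption to flag: this sketch matches case-insensitively, whereas jQuery's :contains selector above is case-sensitive.

```javascript
// Each row is represented by its text content; a row stays visible
// if it contains the query anywhere (case-insensitively).
function filterRows(rowTexts, query) {
  var q = query.toLowerCase();
  return rowTexts.filter(function(text) {
    return text.toLowerCase().indexOf(q) !== -1;
  });
}

console.log(filterRows(['Alice 42', 'Bob 7'], 'ali')); // keeps only the Alice row
```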

Here's a take with RegExp instead:

See the Pen Quick Table Search by Alexander Standke (@XanderStrike) on CodePen

And here's one in raw JavaScript:

See the Pen Light Javascript Table Filter by Chris Coyier (@chriscoyier) on CodePen

Tables Can Be Difficult in Fluid/Responsive Designs

I've written about this in the past, and I think this graphic kind of sums up the experience of a data table on a small screen:

I ultimately created a roundup once a variety of interesting solutions came around.

Real quick though:

Here's a couple of styled live demos with different takes:

See the Pen Responsive Table by Geoff Yuen (@geoffyuen) on CodePen

See the Pen A responsive table by Israel Lemus (@izzrael) on CodePen

Fixed Header Tables

This is another thing I've written about in the past as well as done a little screencast. Those are fairly old, but the demo still works.

The most modern way of handling fixed headers is position: sticky; Here's an article on that. I'm honestly not quite sure of the recommended way to use it with tables, though. It doesn't work on <thead> under normal circumstances. That kinda makes sense, because you can't absolutely position table innards. But it does work on <th>. Anyway, if someone wants to figure that out, that'd be a good update to this article (or something).
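If you want to experiment, a minimal sketch of the <th> approach might look like this (whether it holds up across browsers is exactly the open question above):

```css
th {
  position: sticky;
  top: 0;
  background: #fff; /* otherwise scrolling rows show through the header */
}
```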

Here's a live demo of a jQuery plugin that does the trick. I'd probably go for something like this these days until sticky shakes out more.

See the Pen Table with fixed header on scroll by jgx (@jgx) on CodePen

Using Emmet for Creating Table Markup

Emmet is a great tool for a bunch of reasons. One of which is writing HTML abbreviations and having them expand out into real HTML. Since tables are so repetitive and verbose, Emmet is perfect for them. Emmet works on CodePen too =)

Simple four rows and four columns

table>tr*4>td*4

Five rows with the header on the left

table>tr*5>th+td*4

A row of headers on the top

table>tr>th*5^tr*3>td*5

Employees with incrementing IDs

table>tr>th{Name}+th{ID}+th{Favorite Color}^tr*3>td{Name}+td{$$$$$}+td{Blue}

Table with header, footer, and content

table>thead>tr>th*5^^tfoot>tr>th*5^^tbody>tr*10>th+td*4

Same but with cell content in each cell

table>thead>tr>th{Header Cell}*5^^tfoot>tr>th{Footer Cell}*5^^tbody>tr*10>th{Row Header}+td{Cell Data}*4

JavaScript Generated Tables

JavaScript provides some very specific methods for dealing with tables through the HTMLTableElement API. Louis Lazaris wrote a little about it recently. You can use it to create tables with JavaScript, access sub-elements, and change properties in very specific ways. Here's the MDN page with the scoop.

Here's that at work:

See the Pen inosC by Chris Coyier (@chriscoyier) on CodePen

Table Sorting

Imagine a table with two columns. One for Employee ID's and another for Employee Email Address. There are headers for each column. It would be handy to be able to click those headers and sort the table by the data inside. For instance, numerical order, alternating between ascending and descending, for the ID's and alphabetical for the email addresses. That's what table sorting is all about. Making the data more useful.
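The comparison logic underneath any of these sorters is simple enough to sketch. Here each row is an array of cell strings, and the comparator falls back to alphabetical order when a column isn't numeric (the function and data are made up for illustration):

```javascript
function sortRows(rows, col, ascending) {
  return rows.slice().sort(function(a, b) {
    var x = a[col], y = b[col];
    var nx = parseFloat(x), ny = parseFloat(y);
    // numeric compare when both cells parse as numbers, alphabetical otherwise
    var cmp = (!isNaN(nx) && !isNaN(ny)) ? nx - ny : x.localeCompare(y);
    return ascending ? cmp : -cmp;
  });
}

var rows = [
  ['3',  'carol@example.com'],
  ['10', 'alice@example.com'],
  ['2',  'bob@example.com']
];
// Numeric sort on the ID column gives 2, 3, 10 (not the alphabetical 10, 2, 3)
console.log(sortRows(rows, 0, true).map(function(r) { return r[0]; }).join(','));
```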

This is such a common and generic need that there is actually a specification ready for it. Just put the sortable attribute on the table and it will automatically do it, as long as you follow a couple of rules laid out in the spec.

At the time of this writing, I don't know of any browsers supporting table sorting natively. But there are lots of third-party options!

  • tablesorter - jQuery-based "Flexible client-side table sorting"
  • sorttable - raw JavaScript
  • tablesort - "a small & simple sorting component for tables. Written in Javascript and dependency free"

What's with table sorting scripts and lowercase? Anyway, here's a demo of tablesorter:

See the Pen Table Exercise by egon0119 (@egon0119) on CodePen

If those don't do it for you, Codrops rounded up 33 different table sorting scripts, so there are plenty to choose from.

And those are all JavaScript solutions. It's certainly possible to sort data on the back-end and display the table already sorted in the HTML. That might be required in the case of paginated tables where all the data isn't available right in the DOM.

More Information


A Complete Guide to the Table Element is a post from CSS-Tricks

25 Nov 04:59

Build a Complete MVC Website With ExpressJS

by Krasimir Tsonev


In this article we’ll be building a complete website with a front-facing client side, as well as a control panel for managing the site’s content. As you may guess, the final working version of the application contains a lot of different files. I wrote this tutorial step by step, following the development process, but I didn’t include every single file, as that would make this a very long and boring read. However, the source code is available on GitHub and I strongly recommend that you take a look.


Introduction

Express is one of the best frameworks for Node. It has great support and a bunch of helpful features. There are a lot of great articles out there, which cover all of the basics. However, this time I want to dig in a little bit deeper and share my workflow for creating a complete website. In general, this article is not only for Express, but for using it in combination with some other great tools that are available for Node developers.

I assume that you are familiar with Node.js, have it installed on your system, and that you have probably built some applications with it already.

At the heart of Express is Connect. This is a middleware framework, which comes with a lot of useful stuff. If you’re wondering what exactly a middleware is, here is a quick example:

var connect = require('connect'),
    http = require('http');

var app = connect()
    .use(function(req, res, next) {
        console.log("That's my first middleware");
        next();
    })
    .use(function(req, res, next) {
        console.log("That's my second middleware");
        next();
    })
    .use(function(req, res, next) {
        console.log("end");
        res.end("hello world");
    });

http.createServer(app).listen(3000);

Middleware is basically a function which accepts request and response objects and a next function. Each middleware can decide to respond by using a response object or pass the flow to the next function by calling the next callback. In the example above, if you remove the next() method call in the second middleware, the hello world string will never be sent to the browser. In general, that’s how Express works. There are some predefined middlewares, which of course, save you a lot of time. Like for example, Body parser which parses request bodies and supports application/json, application/x-www-form-urlencoded, and multipart/form-data. Or the Cookie parser, which parses cookie headers and populates req.cookies with an object keyed by the cookie’s name.

Express actually wraps Connect and adds some new functionality around it. Like for example, routing logic, which makes the process much smoother. Here’s an example of handling a GET request:

app.get('/hello.txt', function(req, res){
    var body = 'Hello World';
    res.setHeader('Content-Type', 'text/plain');
    res.setHeader('Content-Length', body.length);
    res.end(body);
});

Setup

There are two ways to setup Express. The first one is by placing it in your package.json file and running npm install (there’s a joke that npm means no problem man :)).

{
    "name": "MyWebSite",
    "description": "My website",
    "version": "0.0.1",
    "dependencies": {
        "express": "3.x"
    }
}

The framework’s code will be placed in node_modules and you will be able to create an instance of it. However, I prefer an alternative option, by using the command line tool. Just install Express globally with npm install -g express. By doing this, you now have a brand new CLI instrument. For example if you run:

express --sessions --css less --hogan app

Express will create an application skeleton with a few things already configured for you. Here are the usage options for the express(1) command:

Usage: express [options]
Options:
  -h, --help          output usage information
  -V, --version       output the version number
  -s, --sessions      add session support
  -e, --ejs           add ejs engine support (defaults to jade)
  -J, --jshtml        add jshtml engine support (defaults to jade)
  -H, --hogan         add hogan.js engine support
  -c, --css   add stylesheet  support (less|stylus) (defaults to plain css)
  -f, --force         force on non-empty directory

As you can see, there are just a few options available, but for me they are enough. Normally I’m using less as the CSS preprocessor and hogan as the templating engine. In this example, we will also need session support, so the --sessions argument solves that problem. When the above command finishes, our project looks like the following:

/public
    /images
    /javascripts
    /stylesheets
/routes
    /index.js
    /user.js
/views
    /index.hjs
/app.js
/package.json

If you check out the package.json file, you will see that all the dependencies which we need are added here, although they haven't been installed yet. To do so, just run npm install and then a node_modules folder will pop up.

I realize that the above approach is not always appropriate. You may want to place your route handlers in another directory or something similar. But, as you'll see in the next few chapters, I'll make changes to the already generated structure, which is pretty easy to do. So you should just think of the express(1) command as a boilerplate generator.


FastDelivery

For this tutorial, I designed a simple website of a fake company named FastDelivery. Here’s a screenshot of the complete design:


At the end of this tutorial, we will have a complete web application, with a working control panel. The idea is to manage every part of the site in separate restricted areas. The layout was created in Photoshop and sliced into CSS (less) and HTML (hogan) files. Now, I'm not going to be covering the slicing process, because it's not the subject of this article, but if you have any questions regarding this, don't hesitate to ask. After the slicing, we have the following files and app structure:

/public
    /images (there are several images exported from Photoshop)
    /javascripts
    /stylesheets
        /home.less
        /inner.less
        /style.css
        /style.less (imports home.less and inner.less)
/routes
    /index.js
/views
    /index.hjs (home page)
    /inner.hjs (template for every other page of the site)
/app.js
/package.json

Here is a list of the site’s elements that we are going to administrate:

  • Home (the banner in the middle – title and text)
  • Blog (adding, removing and editing of articles)
  • Services page
  • Careers page
  • Contacts page

Configuration

There are a few things that we have to do before we can start the real implementation. The configuration setup is one of them. Let’s imagine that our little site should be deployed to three different places – a local server, a staging server and a production server. Of course the settings for every environment are different and we should implement a mechanism which is flexible enough. As you know, every node script is run as a console program. So, we can easily send command line arguments which will define the current environment. I wrapped that part in a separate module in order to write a test for it later. Here is the /config/index.js file:

var config = {
    local: {
        mode: 'local',
        port: 3000
    },
    staging: {
        mode: 'staging',
        port: 4000
    },
    production: {
        mode: 'production',
        port: 5000
    }
}
module.exports = function(mode) {
    return config[mode || process.argv[2] || 'local'] || config.local;
}
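That fallback chain can be exercised on its own. In this sketch the argv array is passed in as a parameter instead of read from process.argv, purely to keep it testable:

```javascript
var config = {
  local:      { mode: 'local',      port: 3000 },
  staging:    { mode: 'staging',    port: 4000 },
  production: { mode: 'production', port: 5000 }
};

// Same precedence as the module: explicit mode, then the CLI argument,
// then 'local'; unknown names fall back to the local settings.
function load(mode, argv) {
  return config[mode || argv[2] || 'local'] || config.local;
}

console.log(load('staging', []).port);        // 4000
console.log(load(undefined, []).mode);        // local
console.log(load('no-such-mode', []).mode);   // local (safe fallback)
```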

There are only two settings (for now) – mode and port. As you may guess, the application uses different ports for the different servers. That’s why we have to update the entry point of the site, in app.js.

...
var config = require('./config')();
...
http.createServer(app).listen(config.port, function(){
    console.log('Express server listening on port ' + config.port);
});

To switch between the configurations, just add the environment at the end. For example:

node app.js staging

Will produce:

Express server listening on port 4000

Now we have all our settings in one place and they are easily manageable.


Tests

I’m a big fan of TDD. I’ll try to cover all the base classes used in this article. Of course, having tests for absolutely everything will make this writing too long, but in general, that’s how you should proceed when creating your own apps. One of my favorite frameworks for testing is jasmine. Of course it’s available in the npm registry:

npm install -g jasmine-node

Let’s create a tests directory which will hold our tests. The first thing that we are going to check is our configuration setup. The spec files must end with .spec.js, so the file should be called config.spec.js.

describe("Configuration setup", function() {
    it("should load local configurations", function(next) {
        var config = require('../config')();
        expect(config.mode).toBe('local');
        next();
    });
    it("should load staging configurations", function(next) {
        var config = require('../config')('staging');
        expect(config.mode).toBe('staging');
        next();
    });
    it("should load production configurations", function(next) {
        var config = require('../config')('production');
        expect(config.mode).toBe('production');
        next();
    });
});

Run jasmine-node ./tests and you should see the following:

Finished in 0.008 seconds
3 tests, 6 assertions, 0 failures, 0 skipped

This time, I wrote the implementation first and the test second. That’s not exactly the TDD way of doing things, but over the next few chapters I’ll do the opposite.

I strongly recommend spending a good amount of time writing tests. There is nothing better than a fully tested application.

A couple of years ago I realized something very important, which may help you to produce better programs. Each time you start writing a new class, a new module, or just a new piece of logic, ask yourself:

How can I test this?

The answer to this question will help you to code much more efficiently, create better APIs, and put everything into nicely separated blocks. You can’t write tests for spaghetti code. For example, in the configuration file above (/config/index.js) I added the possibility to send the mode in the module’s constructor. You may wonder, why do I do that when the main idea is to get the mode from the command line arguments? It’s simple … because I needed to test it. Let’s imagine that one month later I need to check something in a production configuration, but the node script is run with a staging parameter. I won’t be able to make this change without that little improvement. That one previous little step now actually prevents problems in the future.


Database

Since we are building a dynamic website, we need a database to store our data in. I chose to use mongodb for this tutorial. Mongo is a NoSQL document database. The installation instructions can be found here and because I’m a Windows user, I followed the Windows installation instead. Once you finish with the installation, run the MongoDB daemon, which by default listens on port 27017. So, in theory, we should be able to connect to this port and communicate with the mongodb server. To do this from a node script, we need a mongodb module/driver. If you downloaded the source files for this tutorial, the module is already added in the package.json file. If not, just add "mongodb": "1.3.10" to your dependencies and run npm install.

Next, we are going to write a test, which checks if there is a mongodb server running. /tests/mongodb.spec.js file:

describe("MongoDB", function() {
    it("is there a server running", function(next) {
        var MongoClient = require('mongodb').MongoClient;
        MongoClient.connect('mongodb://127.0.0.1:27017/fastdelivery', function(err, db) {
            expect(err).toBe(null);
            next();
        });
    });
});

The callback in the .connect method of the mongodb client receives a db object. We will use it later to manage our data, which means that we need access to it inside our models. It’s not a good idea to create a new MongoClient object every time when we have to make a request to the database. That’s why I moved the running of the express server inside the callback of the connect function:

MongoClient.connect('mongodb://127.0.0.1:27017/fastdelivery', function(err, db) {
    if(err) {
        console.log('Sorry, there is no mongo db server running.');
    } else {
        var attachDB = function(req, res, next) {
            req.db = db;
            next();
        };
        http.createServer(app).listen(config.port, function(){
            console.log('Express server listening on port ' + config.port);
        });
    }
});

Even better, since we have a configuration setup, it would be a good idea to place the mongodb host and port in there and then change the connect URL to:

'mongodb://' + config.mongo.host + ':' + config.mongo.port + '/fastdelivery'
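For reference, here is a minimal sketch of what such a configuration module might look like; the config.mongo.host and config.mongo.port property names are assumptions, so adapt them to your own /config/index.js:

```javascript
// Hypothetical /config/index.js fragment: per-mode settings, including
// the mongodb connection details used to build the connect URL.
var config = {
    local: {
        mode: 'local',
        port: 3000,
        mongo: { host: '127.0.0.1', port: 27017 }
    },
    staging: {
        mode: 'staging',
        port: 4000,
        mongo: { host: '127.0.0.1', port: 27017 }
    }
};

// Pick the mode (normally derived from process.argv; hard-coded here for brevity).
var settings = config['local'];

// Build the connection string from the configuration values.
var url = 'mongodb://' + settings.mongo.host + ':' + settings.mongo.port + '/fastdelivery';
console.log(url);
```
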

Pay close attention to the middleware: attachDB, which I added just before the call to the http.createServer function. Thanks to this little addition, we will populate a .db property of the request object. The good news is that we can attach several functions during the route definition. For example:

app.get('/', attachDB, function(req, res, next) {
    ...
})

So with that, Express calls attachDB before our route handler is reached. Once this happens, the request object will have the .db property and we can use it to access the database.
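The chaining works because Express invokes each function in turn and hands control forward through next(). A dependency-free imitation of that mechanism (run outside of Express, purely for illustration) looks like this:

```javascript
// Minimal imitation of Express's middleware chain: each function receives
// (req, res, next) and calls next() to pass control to the following one.
function runChain(middlewares, req, res) {
    var index = 0;
    function next() {
        var fn = middlewares[index++];
        if (fn) fn(req, res, next);
    }
    next();
}

var db = { name: 'fake-db' };

// Same idea as the article's attachDB: populate req.db, then continue.
var attachDB = function(req, res, next) {
    req.db = db;
    next();
};

// The final handler can rely on the property the earlier middleware attached.
var handler = function(req, res, next) {
    res.output = 'handled with ' + req.db.name;
};

var req = {}, res = {};
runChain([attachDB, handler], req, res);
console.log(res.output);
```
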


MVC

We all know the MVC pattern. The question is how this applies to Express. More or less, it’s a matter of interpretation. In the next few chapters I’ll create modules, which act as a model, view and controller.

Model

The model is what will be handling the data that’s in our application. It should have access to a db object, returned by MongoClient. Our model should also have a method for extending it, because we may want to create different types of models. For example, we might want a BlogModel or a ContactsModel. So we need to write a new spec: /tests/base.model.spec.js, in order to test these two model features. And remember, by defining these functionalities before we start coding the implementation, we can guarantee that our module will do only what we want it to do.

var Model = require("../models/Base"),
    dbMockup = {};
describe("Models", function() {
    it("should create a new model", function(next) {
        var model = new Model(dbMockup);
        expect(model.db).toBeDefined();
        expect(model.extend).toBeDefined();
        next();
    });
    it("should be extendable", function(next) {
        var model = new Model(dbMockup);
        var OtherTypeOfModel = model.extend({
            myCustomModelMethod: function() { }
        });
        var model2 = new OtherTypeOfModel(dbMockup);
        expect(model2.db).toBeDefined();
        expect(model2.myCustomModelMethod).toBeDefined();
        next();
    })
});

Instead of a real db object, I decided to pass a mockup object. That’s because later, I may want to test something specific, which depends on information coming from the database. It will be much easier to define this data manually.

The implementation of the extend method is a little bit tricky, because we have to change the prototype of module.exports, but still keep the original constructor. Thankfully, we have a nice test already written, which proves that our code works. A version which passes the above, looks like this:

module.exports = function(db) {
    this.db = db;
};
module.exports.prototype = {
    extend: function(properties) {
        var Child = module.exports;
        Child.prototype = module.exports.prototype;
        for(var key in properties) {
            Child.prototype[key] = properties[key];
        }
        return Child;
    },
    setDB: function(db) {
        this.db = db;
    },
    collection: function() {
        if(this._collection) return this._collection;
        return this._collection = this.db.collection('fastdelivery-content');
    }
}

Here, there are two helper methods. A setter for the db object and a getter for our database collection.

View

The view will render information to the screen. Essentially, the view is a class which sends a response to the browser. Express provides a short way to do this:

res.render('index', { title: 'Express' });

The response object is a wrapper, which has a nice API, making our life easier. However, I’d prefer to create a module which will encapsulate this functionality. The default views directory will be changed to templates and a new one will be created, which will host the Base view class. This little change now requires another change. We should notify Express that our template files are now placed in another directory:

app.set('views', __dirname + '/templates');

First, I’ll define what I need, write the test, and after that, write the implementation. We need a module matching the following rules:

  • Its constructor should receive a response object and a template name.
  • It should have a render method which accepts a data object.
  • It should be extendable.

You may wonder why I’m extending the View class. Isn’t it just calling the response.render method? Well in practice, there are cases in which you will want to send a different header or maybe manipulate the response object somehow. Like for example, serving JSON data:

var data = {"developer": "Krasimir Tsonev"};
response.contentType('application/json');
response.send(JSON.stringify(data));

Instead of doing this every time, it would be nice to have an HTMLView class and a JSONView class. Or even an XMLView class for sending XML data to the browser. It’s just better, if you build a large website, to wrap such functionalities instead of copy-pasting the same code over and over again.
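As an illustration of that idea (not part of the tutorial's source code), here is how a hypothetical JSONView could be built on top of the Base view; the Base class is inlined, with a slightly cleaner extend, so that the snippet stands alone:

```javascript
// Inlined stand-in for the article's Base view: stores a response object
// and a template name, and delegates rendering to response.render.
var BaseView = function(response, template) {
    this.response = response;
    this.template = template;
};
BaseView.prototype = {
    extend: function(properties) {
        // Create a proper child constructor instead of mutating the base.
        var Child = function(response, template) {
            BaseView.call(this, response, template);
        };
        Child.prototype = Object.create(BaseView.prototype);
        for (var key in properties) {
            Child.prototype[key] = properties[key];
        }
        return Child;
    },
    render: function(data) {
        if (this.response && this.template) {
            this.response.render(this.template, data);
        }
    }
};

// Hypothetical JSONView: instead of rendering a template, it serializes
// the data and sends it with a JSON content type.
var JSONView = new BaseView().extend({
    render: function(data) {
        this.response.contentType('application/json');
        this.response.send(JSON.stringify(data));
    }
});

// Mockup response object, standing in for Express's res.
var sent;
var responseMockup = {
    contentType: function(type) { this.type = type; },
    send: function(body) { sent = body; }
};
new JSONView(responseMockup).render({ developer: 'Krasimir Tsonev' });
```
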

Here is the spec for the /views/Base.js:

var View = require("../views/Base");
describe("Base view", function() {
    it("create and render new view", function(next) {
        var responseMockup = {
            render: function(template, data) {
                expect(data.myProperty).toBe('value');
                expect(template).toBe('template-file');
                next();
            }
        }
        var v = new View(responseMockup, 'template-file');
        v.render({myProperty: 'value'});
    });
    it("should be extendable", function(next) {
        var v = new View();
        var OtherView = v.extend({
            render: function(data) {
                expect(data.prop).toBe('yes');
                next();
            }
        });
        var otherViewInstance = new OtherView();
        expect(otherViewInstance.render).toBeDefined();
        otherViewInstance.render({prop: 'yes'});
    });
});

In order to test the rendering, I had to create a mockup. In this case, I created an object which imitates the Express’s response object. In the second part of the test, I created another View class which inherits the base one and applies a custom render method. Here is the /views/Base.js class.

module.exports = function(response, template) {
    this.response = response;
    this.template = template;
};
module.exports.prototype = {
    extend: function(properties) {
        var Child = module.exports;
        Child.prototype = module.exports.prototype;
        for(var key in properties) {
            Child.prototype[key] = properties[key];
        }
        return Child;
    },
    render: function(data) {
        if(this.response && this.template) {
            this.response.render(this.template, data);
        }
    }
}

Now we have three specs in our tests directory and if you run jasmine-node ./tests the result should be:

Finished in 0.009 seconds
7 tests, 18 assertions, 0 failures, 0 skipped

Controller

Remember the routes and how they were defined?

app.get('/', routes.index);

The second argument — routes.index in the example above — is actually the controller. It’s just a middleware function which accepts request, response and next.

exports.index = function(req, res, next) {
    res.render('index', { title: 'Express' });
};

Above is how your controller should look in the context of Express. The express(1) command line tool creates a directory named routes, but in our case it is better for it to be named controllers, so I changed it to reflect this naming scheme.

Since we’re not just building a teeny tiny application, it would be wise if we created a base class, which we can extend. If we ever need to pass some kind of functionality to all of our controllers, this base class would be the perfect place. Again, I’ll write the test first, so let’s define what we need:

  • it should have an extend method, which accepts an object and returns a new child instance
  • the child instance should have a run method, which is the old middleware function
  • there should be a name property, which identifies the controller
  • we should be able to create independent objects, based on the class

So just a few things for now, but we may add more functionality later. The test would look something like this:

var BaseController = require("../controllers/Base");
describe("Base controller", function() {
    it("should have a method extend which returns a child instance", function(next) {
        expect(BaseController.extend).toBeDefined();
        var child = BaseController.extend({ name: "my child controller" });
        expect(child.run).toBeDefined();
        expect(child.name).toBe("my child controller");
        next();
    });
    it("should be able to create different childs", function(next) {
        var childA = BaseController.extend({ name: "child A", customProperty: 'value' });
        var childB = BaseController.extend({ name: "child B" });
        expect(childA.name).not.toBe(childB.name);
        expect(childB.customProperty).not.toBeDefined();
        next();
    });
});

And here is the implementation of /controllers/Base.js:

var _ = require("underscore");
module.exports = {
    name: "base",
    extend: function(child) {
        return _.extend({}, this, child);
    },
    run: function(req, res, next) {

    }
}

Of course, every child class should define its own run method, along with its own logic.


FastDelivery Website

Ok, we have a good set of classes for our MVC architecture and we’ve covered our newly created modules with tests. Now we are ready to continue with the site, of our fake company, FastDelivery. Let’s imagine that the site has two parts – a front-end and an administration panel. The front-end will be used to display the information written in the database to our end users. The admin panel will be used to manage that data. Let’s start with our admin (control) panel.

Control Panel

Let’s first create a simple controller which will serve as the administration page. /controllers/Admin.js file:

var BaseController = require("./Base"),
    View = require("../views/Base");
module.exports = BaseController.extend({ 
    name: "Admin",
    run: function(req, res, next) {
        var v = new View(res, 'admin');
        v.render({
            title: 'Administration',
            content: 'Welcome to the control panel'
        });
    }
});

By using the pre-written base classes for our controllers and views, we can easily create the entry point for the control panel. The View class accepts a name of a template file. According to the code above, the file should be called admin.hjs and should be placed in /templates. The content would look something like this:

<!DOCTYPE html>
<html>
    <head>
        <title>{{ title }}</title>
        <link rel='stylesheet' href='/stylesheets/style.css' />
    </head>
    <body>
        <div class="container">
            <h1>{{ content }}</h1>
        </div>
    </body>
</html>

(In order to keep this tutorial fairly short and in an easy to read format, I’m not going to show every single view template. I strongly recommend that you download the source code from GitHub.)

Now to make the controller visible, we have to add a route to it in app.js:

var Admin = require('./controllers/Admin');
...
var attachDB = function(req, res, next) {
    req.db = db;
    next();
};
...
app.all('/admin*', attachDB, function(req, res, next) {
    Admin.run(req, res, next);
});

Note that we are not sending the Admin.run method directly as middleware. That’s because we want to keep the context. If we do this:

app.all('/admin*', Admin.run);

the word this in Admin will point to something else.

Protecting the Administration Panel

Every page which starts with /admin should be protected. To achieve this, we are going to use Express’s middleware: Sessions. It simply attaches an object to the request called session. We should now change our Admin controller to do two additional things:

  • It should check if there is a session available. If not, then display a login form.
  • It should accept the data sent by the login form and authorize the user if the username and password match.

Here is a little helper function we can use to accomplish this:

authorize: function(req) {
    return (
        req.session && 
        req.session.fastdelivery && 
        req.session.fastdelivery === true
    ) || (
        req.body &&
        req.body.username === this.username &&
        req.body.password === this.password
    );
}

First, we have a statement which tries to recognize the user via the session object. Secondly, we check if a form has been submitted. If so, the data from the form is available in the request.body object, which is filled by the bodyParser middleware. Then we just check if the username and password match.

And now here is the run method of the controller, which uses our new helper. If the user is authorized, we display the control panel itself; otherwise, we display the login page:

run: function(req, res, next) {
    if(this.authorize(req)) {
        req.session.fastdelivery = true;
        req.session.save(function(err) {
            var v = new View(res, 'admin');
            v.render({
                title: 'Administration',
                content: 'Welcome to the control panel'
            });
        });         
    } else {
        var v = new View(res, 'admin-login');
        v.render({
            title: 'Please login'
        });
    }       
}

Managing Content

As I pointed out in the beginning of this article, we have plenty of things to administrate. To simplify the process, let’s keep all the data in one collection. Every record will have title, text, picture and type properties. The type property will determine the owner of the record. For example, the Contacts page will need only one record with type: 'contacts', while the Blog page will require more records. So, we need three new pages for adding, editing and showing records. Before we jump into creating new templates, styling, and putting new stuff into the controller, we should write our model class, which stands between the MongoDB server and our application and, of course, provides a meaningful API.

// /models/ContentModel.js

var Model = require("./Base"),
    crypto = require("crypto"),
    model = new Model();
var ContentModel = model.extend({
    insert: function(data, callback) {
        data.ID = crypto.randomBytes(20).toString('hex'); 
        this.collection().insert(data, {}, callback || function(){ });
    },
    update: function(data, callback) {
        this.collection().update({ID: data.ID}, data, {}, callback || function(){ });   
    },
    getlist: function(callback, query) {
        this.collection().find(query || {}).toArray(callback);
    },
    remove: function(ID, callback) {
        this.collection().findAndModify({ID: ID}, [], {}, {remove: true}, callback);
    }
});
module.exports = ContentModel;

The model takes care of generating a unique ID for every record. We will need it in order to update the information later on.

If we want to add a new record for our Contacts page, we can simply use:

var model = new (require("../models/ContentModel"));
model.insert({
    title: "Contacts",
    text: "...",
    type: "contacts"
});

So, we have a nice API to manage the data in our mongodb collection. Now we are ready to write the UI for using this functionality. For this part, the Admin controller will need to be changed quite a bit. To simplify the task I decided to combine the list of the added records and the form for adding/editing them. As you can see on the screenshot below, the left part of the page is reserved for the list and the right part for the form.

(Screenshot: the control panel — the list of records on the left, the add/edit form on the right)

Having everything on one page means that we have to focus on the part which renders the page or to be more specific, on the data which we are sending to the template. That’s why I created several helper functions which are combined, like so:

var self = this;
...
var v = new View(res, 'admin');
self.del(req, function() {
    self.form(req, res, function(formMarkup) {
        self.list(function(listMarkup) {
            v.render({
                title: 'Administration',
                content: 'Welcome to the control panel',
                list: listMarkup,
                form: formMarkup
            });
        });
    });
});

It looks a little bit ugly, but it works as I wanted. The first helper is a del method which checks the current GET parameters and if it finds action=delete&id=[id of the record], it removes data from the collection. The second function is called form and it is responsible mainly for showing the form on the right side of the page. It checks if the form is submitted and properly updates or creates records in the database. At the end, the list method fetches the information and prepares an HTML table, which is later sent to the template. The implementation of these three helpers can be found in the source code for this tutorial.

Here, I’ve decided to show you the function which handles the file upload:

handleFileUpload: function(req) {
    if(!req.files || !req.files.picture || !req.files.picture.name) {
        return req.body.currentPicture || '';
    }
    var data = fs.readFileSync(req.files.picture.path);
    var fileName = req.files.picture.name;
    var uid = crypto.randomBytes(10).toString('hex');
    var dir = __dirname + "/../public/uploads/" + uid;
    fs.mkdirSync(dir, '0777');
    fs.writeFileSync(dir + "/" + fileName, data);
    return '/uploads/' + uid + "/" + fileName;
}

If a file is submitted, the .files property of the request object is filled with data. In our case, we have the following HTML element:

<input type="file" name="picture" />

This means that we could access the submitted data via req.files.picture. In the code snippet above, req.files.picture.path is used to get the raw content of the file. Later, the same data is written to a newly created directory and at the end, a proper URL is returned. All of these operations are synchronous, but it’s good practice to use the asynchronous counterparts of these functions (readFile, mkdir and writeFile) instead.

Front-End

The hard work is now complete. The administration panel is working and we have a ContentModel class, which gives us access to the information stored in the database. What we have to do now, is to write the front-end controllers and bind them to the saved content.

Here is the controller for the Home page – /controllers/Home.js

module.exports = BaseController.extend({ 
    name: "Home",
    content: null,
    run: function(req, res, next) {
        model.setDB(req.db);
        var self = this;
        this.getContent(function() {
            var v = new View(res, 'home');
            v.render(self.content);
        })
    },
    getContent: function(callback) {
        var self = this;
        this.content = {};
        model.getlist(function(err, records) {
            ... storing data to content object
            model.getlist(function(err, records) {
                ... storing data to content object
                callback();
            }, { type: 'blog' });
        }, { type: 'home' });
    }
});

The home page needs one record with a type of home and four records with a type of blog. Once the controller is done, we just have to add a route to it in app.js:

app.all('/', attachDB, function(req, res, next) {
    Home.run(req, res, next);
});

Again, we are attaching the db object to the request. Pretty much the same workflow as the one used in the administration panel.

The other pages for our front-end (client side) are almost identical, in that they all have a controller, which fetches data by using the model class and of course a route defined. There are two interesting situations which I’d like to explain in more detail. The first one is related to the blog page. It should be able to show all the articles, but also to present only one. So, we have to register two routes:

app.all('/blog/:id', attachDB, function(req, res, next) {
    Blog.runArticle(req, res, next);
}); 
app.all('/blog', attachDB, function(req, res, next) {
    Blog.run(req, res, next);
});

They both use the same controller: Blog, but call different run methods. Pay attention to the /blog/:id string. This route will match URLs like /blog/4e3455635b4a6f6dccfaa1e50ee71f1cde75222b and the long hash will be available in req.params.id. In other words, we are able to define dynamic parameters. In our case, that’s the ID of the record. Once we have this information, we are able to create a unique page for every article.

The second interesting part is how I built the Services, Careers and Contacts pages. It is clear that they use only one record from the database. If we had to create a different controller for every page then we’d have to copy/paste the same code and just change the type field. There is a better way to achieve this though, by having only one controller, which accepts the type in its run method. So here are the routes:

app.all('/services', attachDB, function(req, res, next) {
    Page.run('services', req, res, next);
}); 
app.all('/careers', attachDB, function(req, res, next) {
    Page.run('careers', req, res, next);
}); 
app.all('/contacts', attachDB, function(req, res, next) {
    Page.run('contacts', req, res, next);
});

And the controller would look like this:

module.exports = BaseController.extend({ 
    name: "Page",
    content: null,
    run: function(type, req, res, next) {
        model.setDB(req.db);
        var self = this;
        this.getContent(type, function() {
            var v = new View(res, 'inner');
            v.render(self.content);
        });
    },
    getContent: function(type, callback) {
        var self = this;
        this.content = {}
        model.getlist(function(err, records) {
            if(records.length > 0) {
                self.content = records[0];
            }
            callback();
        }, { type: type });
    }
});

Deployment

Deploying an Express based website is actually the same as deploying any other Node.js application:

  • The files are placed on the server.
  • The node process should be stopped (if it is running).
  • An npm install command should be run in order to install the new dependencies (if any).
  • The main script should then be run again.

Keep in mind that Node is still fairly young, so not everything may work as you'd expect, but improvements are being made all the time. For example, forever guarantees that your Node.js program will run continuously. You can use it by issuing the following command:

forever start yourapp.js

This is what I’m using on my servers as well. It’s a nice little tool, but it solves a big problem: if you run your app with just node yourapp.js, the server goes down as soon as your script exits unexpectedly. forever simply restarts the application.

Now I’m not a system administrator, but I wanted to share my experience integrating node apps with Apache or Nginx, because I think that this is somehow part of the development workflow.

As you know, Apache normally runs on port 80, which means that if you open http://localhost or http://localhost:80 you will see a page served by your Apache server and most likely your node script is listening on a different port. So, you need to add a virtual host that accepts the requests and sends them to the right port. For example, let’s say that I want to host the site, that we’ve just built, on my local Apache server under the expresscompletewebsite.dev address. The first thing that we have to do is to add our domain to the hosts file.

127.0.0.1   expresscompletewebsite.dev

After that, we have to edit the httpd-vhosts.conf file under the Apache configuration directory and add

# expresscompletewebsite.dev
<VirtualHost *:80>
    ServerName expresscompletewebsite.dev
    ServerAlias www.expresscompletewebsite.dev
    ProxyRequests off
    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>
    <Location />
        ProxyPass http://localhost:3000/
        ProxyPassReverse http://localhost:3000/
    </Location>
</VirtualHost>

The server still accepts requests on port 80, but forwards them to port 3000, where node is listening.

The Nginx setup is much, much easier and, to be honest, it’s a better choice for hosting Node.js based apps. You still have to add the domain name to your hosts file. After that, simply create a new file in the /sites-enabled directory under the Nginx installation. The content of the file would look something like this:

server {
    listen 80;
    server_name expresscompletewebsite.dev;
    location / {
            proxy_pass http://127.0.0.1:3000;
            proxy_set_header Host $http_host;
    }
}

Keep in mind that you can’t run both Apache and Nginx with the above hosts setup. That’s because they both require port 80. Also, you may want to do a little bit of additional research about better server configuration if you plan to use the above code snippets in a production environment. As I said, I’m not an expert in this area.


Conclusion

Express is a great framework, which gives you a good starting point to begin building your applications. As you can see, it’s a matter of choice on how you will extend it and what you will use to build with it. It simplifies the boring tasks by using a few great middlewares and leaves the fun parts to the developer.

Source code

The source code for this sample site that we built is available on GitHub – https://github.com/tutsplus/build-complete-website-expressjs. Feel free to fork it and play with it. Here are the steps for running the site.

  • Download the source code
  • Go to the app directory
  • Run npm install
  • Run the mongodb daemon
  • Run node app.js
18 Nov 07:40

Make Your Own AngularJS (1): Scopes and Digest

by 穆逸伦
Yunchenge


Angular is a mature and powerful JavaScript framework. It is also a fairly large one, and mastering it means coming to grips with the many new concepts it introduces. Many web developers flocking to Angular run into the same obstacles. How exactly does the digest work? What are the different ways to define a directive? What is the difference between a service and a provider?

Angular's documentation is quite good, and third-party resources keep getting richer, but there is no better way to learn a new technology than to take it apart and study how it works.

In this series of articles, I will build an implementation of AngularJS from the ground up. As the walkthrough progresses step by step, you will gain a deep understanding of how Angular works under the hood.

In this first part, you will see how Angular's scopes work and how things like $eval, $digest, and $apply can be implemented. Angular's dirty-checking logic may seem almost magical, but you will see that it is not.

The Basics

The full source code for this project is available on GitHub. Rather than just copying it, I recommend that you build your own implementation from scratch and explore each step of the code from different angles. In the original article, some JSBin snippets are embedded so you can interact with the code directly within the article. (Translator's note: since this translation is hosted on GitHub, the JSBin embeds could not be integrated, so only the links are given.)

We will use the Lo-Dash library to handle some of the lower-level operations on arrays and objects. Angular itself does not use Lo-Dash, but for our purposes it makes sense to ignore low-level details that aren't really relevant. Whenever you see an underscore (_) in the code, a Lo-Dash function is being called.

We will also use the console.assert function for some ad-hoc tests. It should be available in all modern JavaScript environments.

Here is an example using Lo-Dash and the assert function:

http://jsbin.com/UGOVUk/4/embed?js,console

Scope Objects

Angular scopes are POJOs (plain old JavaScript objects), and you can attach properties to them just like you would to any other object. Scope objects are created with a constructor function; let's write the simplest possible version of it:

function Scope() {
}

Now we can use the new operator to create a Scope object, and we can attach some properties to it:

var aScope = new Scope();
aScope.firstName = 'Jane';
aScope.lastName = 'Smith';

There is nothing special about these properties. No special setters need to be called, and there are no restrictions on the values you assign. Instead, the magic happens in two special functions: $watch and $digest.

Watching Object Properties: $watch and $digest

$watch and $digest are two sides of the same coin. Together they form the core of Angular scopes: reacting to changes in data.

With $watch you can attach a watcher to a scope. A watcher is something that is notified when a change occurs on the scope. You create a watcher by giving $watch two functions:

  • A watch function, which specifies the piece of data you are interested in.
  • A listener function, which will be called whenever that data changes.

As an Angular user, you typically watch an expression instead of providing a watch function. A watch expression is a string, such as "user.firstName", that you specify in a data binding, a directive attribute, or in JavaScript code, and that Angular parses and compiles into a watch function. We will look at how that is done later in this series. In this article we take the slightly lower-level approach of providing the watch functions directly.

To implement $watch, we need to store all the watchers that have been registered. Let's add an array for them in the Scope constructor:

function Scope() {
  this.$$watchers = [];
}

In the Angular framework, the double-dollar prefix $$ signals that a variable should be considered private and should not be called from application code.

Now we can define the $watch method. It takes the two functions as arguments and stores them in the $$watchers array. Since we want every Scope instance to have this method, we put it on the prototype of Scope:

Scope.prototype.$watch = function(watchFn, listenerFn) {
  var watcher = {
    watchFn: watchFn,
    listenerFn: listenerFn
  };
  this.$$watchers.push(watcher);
};

The other side of the coin is the $digest function. It runs all the watchers that have been registered on the scope. Let's implement a simplified version that iterates over all the watchers and calls their listener functions:

Scope.prototype.$digest = function() {
  _.forEach(this.$$watchers, function(watch) {
    watch.listenerFn();
  });  
};

Now we can attach watchers and run $digest, which will call the listener functions:

http://jsbin.com/oMaQoxa/2/embed?js,console
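Since the JSBin embed is only linked here, a self-contained equivalent of the example can be run directly in Node.js (the native Array.prototype.forEach stands in for Lo-Dash's _.forEach):

```javascript
// The Scope built so far: $watch registers watchers, and $digest simply
// runs every listener once, with no change detection yet.
function Scope() {
    this.$$watchers = [];
}
Scope.prototype.$watch = function(watchFn, listenerFn) {
    this.$$watchers.push({ watchFn: watchFn, listenerFn: listenerFn });
};
Scope.prototype.$digest = function() {
    this.$$watchers.forEach(function(watcher) {
        watcher.listenerFn();
    });
};

var scope = new Scope();
var counter = 0;
scope.$watch(
    function(s) { return s.someValue; },
    function() { counter++; }
);

// The listener runs on every digest, whether or not anything changed.
scope.$digest();
scope.$digest();
console.log(counter);
```
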

This isn't very useful by itself. What we really want is to check whether the values specified by the watch functions have actually changed, and only then call the listener functions.

Dirty-Checking

As described above, the watch function of a watcher should return the piece of data whose changes we are interested in. Usually that data lives on the scope itself. To make accessing the scope more convenient, the watch function is called with the current scope as an argument. A watcher interested in the firstName attribute of the scope looks like this:

function(scope) {
  return scope.firstName;
}

This is the general form of a watch function: grab some value from the scope and return it.

The $digest function's job is to call this watch function and compare the value it returns with whatever the same function returned the last time. If the values differ, the watcher is dirty and its listener function should be called.

To do that, $digest needs to remember the last value each watch function returned. Since we have already created an object for each watcher, we can simply store the last value there. Here is a new implementation of $digest that checks each watch function for value changes:

Scope.prototype.$digest = function() {
  var self = this;
  _.forEach(this.$$watchers, function(watch) {
    var newValue = watch.watchFn(self);
    var oldValue = watch.last;
    if (newValue !== oldValue) {
      watch.listenerFn(newValue, oldValue, self);
    }
    watch.last = newValue;
  });  
};

For each watcher, we call the watch function, passing the scope itself as an argument, and then compare its return value to the one the same function returned last time. If the values differ, we call the listener. For convenience, we pass both values, as well as the scope, to the listener function. Finally, we set the last attribute of the watcher to the newly returned value, so that we can compare against it the next time around.

With this implementation in place, we can see how listener functions get run when $digest is called:

http://jsbin.com/OsITIZu/3/embed?js,console
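Again, in place of the JSBin embed, here is a self-contained version of the dirty-checking example (Lo-Dash's _.forEach replaced by the native forEach):

```javascript
function Scope() {
    this.$$watchers = [];
}
Scope.prototype.$watch = function(watchFn, listenerFn) {
    this.$$watchers.push({ watchFn: watchFn, listenerFn: listenerFn });
};
// Digest with dirty-checking: a listener fires only when the watched
// value differs from what the watch function returned last time.
Scope.prototype.$digest = function() {
    var self = this;
    this.$$watchers.forEach(function(watcher) {
        var newValue = watcher.watchFn(self);
        var oldValue = watcher.last;
        if (newValue !== oldValue) {
            watcher.listenerFn(newValue, oldValue, self);
        }
        watcher.last = newValue;
    });
};

var scope = new Scope();
scope.firstName = 'Joe';
var calls = 0;
scope.$watch(
    function(s) { return s.firstName; },
    function(newValue, oldValue) { calls++; }
);

scope.$digest(); // first run: undefined -> 'Joe', so the listener fires
scope.$digest(); // no change, so the listener stays quiet
scope.firstName = 'Jane';
scope.$digest(); // 'Joe' -> 'Jane', the listener fires again
console.log(calls);
```
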

We have now implemented the essence of Angular scopes: attaching watchers and running them in a digest.

We can also already see a couple of important performance characteristics that Angular scopes have:

  • Attaching data to a scope does not by itself have a performance cost. If no watcher is watching a property, it doesn't matter whether it's on the scope or not. Angular does not iterate over the properties of a scope; it iterates over the watchers.
  • Every watch function is called during every $digest, so it's a good idea to keep an eye on the number of watchers, as well as on the performance of each individual watch function or expression.

Getting Notified Of Digests

If you want to be notified whenever an Angular scope is digested, you can exploit the fact that each digest runs every watcher: just register a watcher without a listener function.

To support this use case, we need to check in $watch whether the listener has been omitted, and if so, substitute an empty no-op function for it:

Scope.prototype.$watch = function(watchFn, listenerFn) {
  var watcher = {
    watchFn: watchFn,
    listenerFn: listenerFn || function() { }
  };
  this.$$watchers.push(watcher);
};

If you use this pattern, keep in mind that Angular will look at the return value of watchFn even when there is no listenerFn. If you return a value, that value is subject to dirty-checking. To use this pattern without causing extra work, just have the watch function return nothing, in which case the value of the watcher will always be undefined.

http://jsbin.com/OsITIZu/4/embed?js,console

That is the core of the implementation, but we are still far from done. For example, there is a fairly typical scenario we don't support yet: what if a listener function itself changes a property on the scope? If that happens, and there is another watcher watching the property that was changed, it may not notice the change during the same digest:

http://jsbin.com/eTIpUyE/2/embed?js,console

Let's fix that.

Keep Digesting While Dirty

We need to change the digest so that it keeps iterating over all the watchers until the watched values stop changing.

First, let's rename the current $digest function to $$digestOnce. It runs all the watchers once and returns a boolean that says whether there were any changes:

Scope.prototype.$$digestOnce = function() {
  var self  = this;
  var dirty;
  _.forEach(this.$$watchers, function(watch) {
    var newValue = watch.watchFn(self);
    var oldValue = watch.last;
    if (newValue !== oldValue) {
      watch.listenerFn(newValue, oldValue, self);
      dirty = true;
    }
    watch.last = newValue;
  });
  return dirty;
};

Then, let's redefine $digest as an "outer loop" that invokes $$digestOnce as long as changes keep occurring:

Scope.prototype.$digest = function() {
  var dirty;
  do {
    dirty = this.$$digestOnce();
  } while (dirty);
};

$digest now runs every watcher at least once. If, after the first pass, any watched value has changed, the pass is marked as dirty and all the watchers are run a second time. This goes on until none of the watched values change and the whole situation has stabilized.

Angular scopes don't actually have a function called $$digestOnce; instead, the digest loop is contained entirely within $digest. Our goal is clarity more than performance, so extracting the inner loop into a function is fine for us.

Here is the new implementation:

http://jsbin.com/Imoyosa/3/embed?js,console

We can now make another important observation about Angular watchers: they may be run many times during a single digest. This is why people often say watchers should be idempotent: a watch function should have no side effects, or only side effects that can happen any number of times. If, for example, a watch function fires an Ajax request, there are no guarantees about how many requests your app is making.

There is one glaring omission in our current implementation though: what happens if two watchers watch changes made by each other? That is, what if the state never stabilizes? The situation is shown in the code below. In the example, the $digest call has been commented out; remove the comment to see what happens:

http://jsbin.com/eKEvOYa/3/embed?js,console

JSBin stops execution after a while (on my machine it runs about 100,000 iterations). If you run it in something else, such as Node.js, it will keep running forever.

Giving Up On An Unstable Digest

What we need to do is keep the digest running for a bounded number of iterations. If, after that many passes, the scope is still changing, we throw our hands up and declare it's probably never going to stabilize. At that point we throw an exception, since whatever state the scope ends up in, it's unlikely to be what the user intended.

This maximum number of iterations is called the TTL (short for "Time To Live"). It defaults to 10, which may seem low (we just ran the digest 100,000 times!), but bear in mind this is a performance-sensitive area, since digests happen often and every digest runs all the watchers. It's also unlikely that users will create more than 10 chained watchers.

The TTL in Angular is actually adjustable. We'll return to this topic in later articles when we discuss providers and dependency injection.

Let's go ahead and add a loop counter to the outer digest loop. If it reaches the TTL, we throw an exception:

Scope.prototype.$digest = function() {
  var ttl = 10;
  var dirty;
  do {
    dirty = this.$$digestOnce();
    if (dirty && !(ttl--)) {
      throw "10 digest iterations reached";
    }
  } while (dirty);
};

Here's the updated version, which lets our circular watch example throw an exception:

http://jsbin.com/uNapUWe/2/embed?js,console

That should take care of the digest loop itself.

Now, let's turn our attention to how changes are detected.

Value-Based Dirty-Checking

So far we've been comparing old and new values with the strict equality operator (===). This is fine in the majority of cases, such as for all primitives (numbers, strings, etc.), and it also detects when an object or array is replaced with a new one. But Angular has another way to detect changes as well, for when something changes inside an object or array. That is: you can watch for changes in value, not just in reference.

This kind of dirty-checking is activated by passing a third, optional boolean flag to the $watch function. When the flag is true, value-based checking is used. Let's redefine $watch to take this argument and store it in the watcher:

Scope.prototype.$watch = function(watchFn, listenerFn, valueEq) {
  var watcher = {
    watchFn: watchFn,
    listenerFn: listenerFn,
    valueEq: !!valueEq
  };
  this.$$watchers.push(watcher);
};

All we do here is add the flag to the watcher, coercing it to a boolean by negating it twice. When a user calls $watch without the third argument, valueEq will be undefined, which becomes false in the watcher object.

Value-based dirty-checking implies that if the old or new values are objects or arrays, we have to iterate through everything contained in them. If there's any difference between the two, the watcher is dirty. If the value has other objects or arrays nested within, those are also recursively compared by value.

Angular ships with its own equality-checking function, but we'll use the one provided by Lo-Dash instead. Let's define a new function that takes two values and the boolean flag, and compares the values accordingly:

Scope.prototype.$$areEqual = function(newValue, oldValue, valueEq) {
  if (valueEq) {
    return _.isEqual(newValue, oldValue);
  } else {
    return newValue === oldValue;
  }
};

To notice changes in value, we also need to change the way we store the old value for each watcher. It isn't enough to store a reference to the current value, because any changes made within that value will also apply to the reference we're holding; $$areEqual would always compare two references to the same value, and never detect a change. For this reason, we need to make a deep copy of the current value and store that instead.

Just like with the equality check, Angular ships with its own deep-copy function, but we'll use the one from Lo-Dash. Let's amend $$digestOnce so that it uses the new $$areEqual function internally and, when needed, copies the value it stores as the last reference:

Scope.prototype.$$digestOnce = function() {
  var self  = this;
  var dirty;
  _.forEach(this.$$watchers, function(watch) {
    var newValue = watch.watchFn(self);
    var oldValue = watch.last;
    if (!self.$$areEqual(newValue, oldValue, watch.valueEq)) {
      watch.listenerFn(newValue, oldValue, self);
      dirty = true;
    }
    watch.last = (watch.valueEq ? _.cloneDeep(newValue) : newValue);
  });
  return dirty;
};

Now we can see the difference between the two dirty-checking mechanisms:

http://jsbin.com/ARiWENO/3/embed?js,console

Checking by value is obviously a more involved operation than just checking a reference. Walking a nested data structure takes time, and holding on to a deep copy of it also takes up a fair amount of memory. That's why Angular does not use value-based dirty-checking by default; users need to explicitly set the flag to turn it on.

Angular also offers a third dirty-checking mechanism: collection watching. Like value-based checking, it notices changes within objects and arrays. But unlike value-based checking, it does a shallow check that does not recurse into deeper levels, which makes it more efficient. Collection watching is available through the $watchCollection function; we'll look at how it's implemented in a later part of this series.

Before we're done with value comparison, there's one more JavaScript quirk to deal with.

Not A Number (NaN)

In JavaScript, NaN (Not-a-Number) is not equal to itself. This may sound strange, but that's just how it is. If we don't explicitly handle NaN in our dirty-checking function, a watcher whose value happens to be NaN will always stay dirty.

For value-based dirty-checking, this case is already handled by Lo-Dash's isEqual function. For reference-based checking, we need to handle it ourselves. Let's amend the code of the $$areEqual function:

Scope.prototype.$$areEqual = function(newValue, oldValue, valueEq) {
  if (valueEq) {
    return _.isEqual(newValue, oldValue);
  } else {
    return newValue === oldValue ||
      (typeof newValue === 'number' && typeof oldValue === 'number' &&
       isNaN(newValue) && isNaN(oldValue));
  }
};

Now watchers with NaN values behave properly, too:

http://jsbin.com/ijINaRA/2/embed?js,console
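The effect of the NaN guard can be seen in a condensed sketch that plugs the $$areEqual check above into a minimal TTL-bounded digest (Array.prototype.forEach again stands in for Lo-Dash, and the valueEq branch is omitted since only reference checking is exercised here):

```javascript
function Scope() {
  this.$$watchers = [];
}
Scope.prototype.$watch = function(watchFn, listenerFn) {
  this.$$watchers.push({
    watchFn: watchFn,
    listenerFn: listenerFn || function() { }
  });
};
// Reference check with the NaN special case from above.
Scope.prototype.$$areEqual = function(newValue, oldValue) {
  return newValue === oldValue ||
    (typeof newValue === 'number' && typeof oldValue === 'number' &&
     isNaN(newValue) && isNaN(oldValue));
};
Scope.prototype.$$digestOnce = function() {
  var self = this;
  var dirty;
  this.$$watchers.forEach(function(watch) {
    var newValue = watch.watchFn(self);
    var oldValue = watch.last;
    if (!self.$$areEqual(newValue, oldValue)) {
      watch.listenerFn(newValue, oldValue, self);
      dirty = true;
    }
    watch.last = newValue;
  });
  return dirty;
};
Scope.prototype.$digest = function() {
  var ttl = 10;
  var dirty;
  do {
    dirty = this.$$digestOnce();
    if (dirty && !(ttl--)) {
      throw "10 digest iterations reached";
    }
  } while (dirty);
};

var scope = new Scope();
scope.number = 0 / 0; // NaN
var listenerCalls = 0;
scope.$watch(
  function(s) { return s.number; },
  function() { listenerCalls++; }
);
// Without the NaN case in $$areEqual, this digest would hit the TTL
// and throw, since NaN !== NaN on every pass.
scope.$digest();
scope.$digest();
console.log(listenerCalls); // 1
```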

With value-based checking done, it's time to turn our attention to how application code interacts with the scope.

$eval – Evaluating Code In The Context Of A Scope

There are a few ways in Angular to execute code in the context of a scope. The simplest of these is $eval. It takes a function as its argument, and all it does is immediately execute that function, passing the scope itself to it as an argument. It then returns the function's return value. $eval also takes an optional second argument, which it simply passes along to the function.

The implementation of $eval is straightforward:

Scope.prototype.$eval = function(expr, locals) {
  return expr(this, locals);
};

Using $eval is just as straightforward:

http://jsbin.com/UzaWUC/1/embed?js,console
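A couple of lines are enough to demonstrate $eval's contract (the aValue property is just an arbitrary example name, not anything prescribed by the API):

```javascript
function Scope() { }
// $eval as defined above: invoke the function with the scope and
// an optional extra argument, returning its result.
Scope.prototype.$eval = function(expr, locals) {
  return expr(this, locals);
};

var scope = new Scope();
scope.aValue = 42;
var result = scope.$eval(function(s, arg) {
  return s.aValue + arg; // s is the scope, arg is the second argument
}, 2);
console.log(result); // 44
```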

So why would you go through the trouble of using such a seemingly redundant way to call a function? One argument is that $eval makes it slightly more explicit that a piece of code is dealing specifically with the contents of a scope. $eval is also a building block of $apply, which we'll cover next.

However, probably the most interesting use of $eval comes when we give it not a function but an expression. Just like with $watch, you'll be able to pass $eval a string expression, which it compiles and then executes in the context of the scope. We'll implement this later in the series.

$apply – Integrating External Code With The Digest Cycle

Perhaps the best known of all the functions on Scope is $apply. It's touted as the standard way to integrate external libraries with Angular, and there's a good reason for that.

$apply takes a function as its argument. It executes that function using $eval, and then kicks off the digest cycle by invoking $digest. Here's a simple implementation:

Scope.prototype.$apply = function(expr) {
  try {
    return this.$eval(expr);
  } finally {
    this.$digest();
  }
};

The call to $digest is placed in a finally block, to make sure the digest happens even if the function throws an exception.

The big idea of $apply is that we can execute some code that isn't aware of Angular. That code may still change things on the scope, and as long as we wrap it in $apply, we can be sure that the watchers on the scope will pick up those changes. When people talk about integrating code into the "Angular lifecycle" using $apply, this is what they mean; there's nothing more to it.

Here's $apply in action:

http://jsbin.com/UzaWUC/2/embed?js,console
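The same idea can be shown inline against a condensed Scope (Array.prototype.forEach in place of Lo-Dash, reference checking only): code that knows nothing about digests mutates the scope inside $apply, and the finally-block digest picks the change up.

```javascript
function Scope() {
  this.$$watchers = [];
}
Scope.prototype.$watch = function(watchFn, listenerFn) {
  this.$$watchers.push({
    watchFn: watchFn,
    listenerFn: listenerFn || function() { }
  });
};
Scope.prototype.$digest = function() {
  var self = this;
  var dirty;
  do {
    dirty = false;
    self.$$watchers.forEach(function(watch) {
      var newValue = watch.watchFn(self);
      var oldValue = watch.last;
      if (newValue !== oldValue) {
        watch.listenerFn(newValue, oldValue, self);
        dirty = true;
      }
      watch.last = newValue;
    });
  } while (dirty);
};
Scope.prototype.$eval = function(expr, locals) {
  return expr(this, locals);
};
Scope.prototype.$apply = function(expr) {
  try {
    return this.$eval(expr);
  } finally {
    this.$digest(); // runs even if expr throws
  }
};

var scope = new Scope();
scope.aValue = 'initial';
var observed;
scope.$watch(
  function(s) { return s.aValue; },
  function(newValue) { observed = newValue; }
);
scope.$digest();
// "External" code, unaware of Angular, wrapped in $apply:
scope.$apply(function(s) {
  s.aValue = 'changed';
});
console.log(observed); // "changed"
```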

Deferred Execution – $evalAsync

In JavaScript it's very common to "defer" the execution of a piece of code – to postpone it to some point in the future when the current execution context has finished. The usual way to do this is by calling setTimeout() with a zero (or very small) delay parameter.

This pattern applies to Angular applications as well, though the preferred way to do it is using the $timeout service, which, among other things, integrates the deferred function into the digest lifecycle with $apply.

But there's another way to defer code in Angular, and that's the $evalAsync function on Scope. $evalAsync takes a function and schedules it to run later, but still during the ongoing digest, or just before the next one. You can, for example, defer some code from within a watcher's listener function, knowing that while it is deferred, it will still be invoked within the current digest iteration.

The first thing we need is a place to store the tasks $evalAsync schedules. We can do that by initializing an array in the Scope constructor:

function Scope() {
  this.$$watchers = [];
  this.$$asyncQueue = [];
}

Next, let's define $evalAsync, which adds a function to be executed on this queue:

Scope.prototype.$evalAsync = function(expr) {
  this.$$asyncQueue.push({scope: this, expression: expr});
};

The reason we explicitly set the current scope on the queued object has to do with scope inheritance, which we'll discuss in the next article of this series.

Then, let's make the first thing that happens inside $digest be the consumption of this queue: we take everything off it and invoke all the deferred functions using $eval:

Scope.prototype.$digest = function() {
  var ttl = 10;
  var dirty;
  do {
    while (this.$$asyncQueue.length) {
      var asyncTask = this.$$asyncQueue.shift();
      this.$eval(asyncTask.expression);
    }
    dirty = this.$$digestOnce();
    if (dirty && !(ttl--)) {
      throw "10 digest iterations reached";
    }
  } while (dirty);
};

This implementation guarantees that if you defer a function while the scope is still dirty, the function will be invoked later, but still within the same digest.

Here's an example of how $evalAsync can be used:

http://jsbin.com/ilepOwI/1/embed?js,console
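The key guarantee – a function deferred from a listener still runs during the same, ongoing digest – can be verified synchronously with a condensed version of the code so far (plain Array.prototype.forEach, reference checking only):

```javascript
function Scope() {
  this.$$watchers = [];
  this.$$asyncQueue = [];
}
Scope.prototype.$watch = function(watchFn, listenerFn) {
  this.$$watchers.push({
    watchFn: watchFn,
    listenerFn: listenerFn || function() { }
  });
};
Scope.prototype.$eval = function(expr, locals) {
  return expr(this, locals);
};
Scope.prototype.$evalAsync = function(expr) {
  this.$$asyncQueue.push({scope: this, expression: expr});
};
Scope.prototype.$$digestOnce = function() {
  var self = this;
  var dirty;
  this.$$watchers.forEach(function(watch) {
    var newValue = watch.watchFn(self);
    var oldValue = watch.last;
    if (newValue !== oldValue) {
      watch.listenerFn(newValue, oldValue, self);
      dirty = true;
    }
    watch.last = newValue;
  });
  return dirty;
};
Scope.prototype.$digest = function() {
  var ttl = 10;
  var dirty;
  do {
    // Deferred tasks are drained on every pass, so a task queued by a
    // listener runs within this very digest.
    while (this.$$asyncQueue.length) {
      var asyncTask = this.$$asyncQueue.shift();
      this.$eval(asyncTask.expression);
    }
    dirty = this.$$digestOnce();
    if (dirty && !(ttl--)) {
      throw "10 digest iterations reached";
    }
  } while (dirty);
};

var scope = new Scope();
scope.aValue = [1, 2, 3];
scope.asyncEvaluated = false;
scope.$watch(
  function(s) { return s.aValue; },
  function(newValue, oldValue, s) {
    s.$evalAsync(function(scope) {
      scope.asyncEvaluated = true;
    });
  }
);
scope.$digest();
console.log(scope.asyncEvaluated); // true - it ran inside this digest
```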

Scope Phases

There's one more thing $evalAsync does, and that's to schedule a digest if one isn't already running. This means that whenever you call $evalAsync, you can be sure the function you're deferring will be invoked "very soon", rather than whenever something else happens to trigger a digest.

$evalAsync needs some way to check whether a digest is already running, since it shouldn't interfere with one that's already been scheduled. For this purpose, Angular scopes implement something called a phase, which is simply a string attribute on the scope that stores information about what's currently going on.

In the Scope constructor, let's introduce a field called $$phase, initialized as null:

function Scope() {
  this.$$watchers = [];
  this.$$asyncQueue = [];
  this.$$phase = null;
}

Then, let's define a couple of functions for controlling the phase: one for setting it and one for clearing it. Let's also add an extra check to make sure nobody tries to set a phase when one is already active:

Scope.prototype.$beginPhase = function(phase) {
  if (this.$$phase) {
    throw this.$$phase + ' already in progress.';
  }
  this.$$phase = phase;
};

Scope.prototype.$clearPhase = function() {
  this.$$phase = null;
};

In the $digest method, let's now set the phase to "$digest" for the duration of the outer loop:

Scope.prototype.$digest = function() {
  var ttl = 10;
  var dirty;
  this.$beginPhase("$digest");
  do {
    while (this.$$asyncQueue.length) {
      var asyncTask = this.$$asyncQueue.shift();
      this.$eval(asyncTask.expression);
    }
    dirty = this.$$digestOnce();
    if (dirty && !(ttl--)) {
      this.$clearPhase();
      throw "10 digest iterations reached";
    }
  } while (dirty);
  this.$clearPhase();
};

Let's also modify $apply to set a phase of its own. This will come in handy when debugging:

Scope.prototype.$apply = function(expr) {
  try {
    this.$beginPhase("$apply");
    return this.$eval(expr);
  } finally {
    this.$clearPhase();
    this.$digest();
  }
};

Finally, let's put the digest scheduling into $evalAsync. It checks the scope's current phase, and if there isn't one (and no async tasks have been scheduled yet), it schedules the digest:

Scope.prototype.$evalAsync = function(expr) {
  var self = this;
  if (!self.$$phase && !self.$$asyncQueue.length) {
    setTimeout(function() {
      if (self.$$asyncQueue.length) {
        self.$digest();
      }
    }, 0);
  }
  self.$$asyncQueue.push({scope: self, expression: expr});
};

With this implementation in place, no matter when or where you call $evalAsync, you can be sure that a digest will happen in the near future.

http://jsbin.com/iKeSaGi/1/embed?js,console

Running Code After A Digest – $$postDigest

There's one more way to attach code to the digest cycle, and that's by scheduling a $$postDigest function.

In Angular, a double-dollar prefix on a function name signifies that it's considered internal, not something application developers should use. But it does exist, so we'll implement it too.

Just like $evalAsync, $$postDigest schedules a function to run "later". Specifically, the function will run after the next digest has finished. Scheduling a $$postDigest function does not cause a digest to be scheduled, so the function's execution is deferred until a digest happens for some other reason. As the name implies, $$postDigest functions run after the digest, so if you change the scope from within a $$postDigest function, you need to call $digest or $apply manually to have those changes picked up.

First, let's add a queue for $$postDigest functions to the Scope constructor:

function Scope() {
  this.$$watchers = [];
  this.$$asyncQueue = [];
  this.$$postDigestQueue = [];
  this.$$phase = null;
}

Next, let's add $$postDigest itself. All it does is add the given function to the queue:

Scope.prototype.$$postDigest = function(fn) {
  this.$$postDigestQueue.push(fn);
};

Finally, in $digest, let's drain the queue and invoke the functions once the digest has finished:

Scope.prototype.$digest = function() {
  var ttl = 10;
  var dirty;
  this.$beginPhase("$digest");
  do {
    while (this.$$asyncQueue.length) {
      var asyncTask = this.$$asyncQueue.shift();
      this.$eval(asyncTask.expression);
    }
    dirty = this.$$digestOnce();
    if (dirty && !(ttl--)) {
      this.$clearPhase();
      throw "10 digest iterations reached";
    }
  } while (dirty);
  this.$clearPhase();

  while (this.$$postDigestQueue.length) {
    this.$$postDigestQueue.shift()();
  }
};

Here's how the $$postDigest function can be used:

http://jsbin.com/IMEhowO/1/embed?js,console
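Both properties of $$postDigest – that scheduling alone triggers nothing, and that the queued function runs once the digest finishes – show up in a condensed sketch (the watcher machinery is reduced to a bare dirty loop here):

```javascript
function Scope() {
  this.$$watchers = [];
  this.$$postDigestQueue = [];
}
Scope.prototype.$$postDigest = function(fn) {
  this.$$postDigestQueue.push(fn);
};
Scope.prototype.$digest = function() {
  var self = this;
  var dirty;
  do {
    dirty = false;
    self.$$watchers.forEach(function(watch) {
      var newValue = watch.watchFn(self);
      if (newValue !== watch.last) {
        watch.listenerFn(newValue, watch.last, self);
        dirty = true;
      }
      watch.last = newValue;
    });
  } while (dirty);
  // Post-digest functions run exactly once, after the digest has settled.
  while (this.$$postDigestQueue.length) {
    this.$$postDigestQueue.shift()();
  }
};

var scope = new Scope();
var postDigestRan = false;
scope.$$postDigest(function() {
  postDigestRan = true;
});
console.log(postDigestRan); // false - scheduling does not trigger a digest
scope.$digest();
console.log(postDigestRan); // true
```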

Handling Exceptions

Our Scope implementation is getting closer and closer to what's actually in Angular, but it's quite brittle, since we haven't spent much effort on exception handling so far.

Angular scopes are quite robust when it comes to errors: when an exception occurs, whether in a watch function, in an $evalAsync function, or in a $$postDigest function, it never terminates the digest. In our current implementation, an exception in any of those places blows up the whole $digest.

We can fix this easily, by wrapping each of those three calls in try...catch.

Angular actually forwards these exceptions to its $exceptionHandler service. Since we don't have one yet, we'll just dump the exceptions to the console.

Exception handling for $evalAsync and $$postDigest goes into the $digest function. In both cases, an exception thrown by a scheduled function is logged, and the functions after it still run normally:

Scope.prototype.$digest = function() {
  var ttl = 10;
  var dirty;
  this.$beginPhase("$digest");
  do {
    while (this.$$asyncQueue.length) {
      try {
        var asyncTask = this.$$asyncQueue.shift();
        this.$eval(asyncTask.expression);
      } catch (e) {
        (console.error || console.log)(e);
      }
    }
    dirty = this.$$digestOnce();
    if (dirty && !(ttl--)) {
      this.$clearPhase();
      throw "10 digest iterations reached";
    }
  } while (dirty);
  this.$clearPhase();

  while (this.$$postDigestQueue.length) {
    try {
      this.$$postDigestQueue.shift()();
    } catch (e) {
      (console.error || console.log)(e);
    }
  }
};

Exception handling for watchers goes into $$digestOnce:

Scope.prototype.$$digestOnce = function() {
  var self  = this;
  var dirty;
  _.forEach(this.$$watchers, function(watch) {
    try {
      var newValue = watch.watchFn(self);
      var oldValue = watch.last;
      if (!self.$$areEqual(newValue, oldValue, watch.valueEq)) {
        watch.listenerFn(newValue, oldValue, self);
        dirty = true;
      }
      watch.last = (watch.valueEq ? _.cloneDeep(newValue) : newValue);
    } catch (e) {
      (console.error || console.log)(e);
    }
  });
  return dirty;
};

Our digest cycle is now a lot more robust in the face of exceptions:

http://jsbin.com/IMEhowO/2/embed?js,console

Destroying A Watcher

When you register a watcher, you usually want it to stay around for as long as the scope itself, so you rarely remove one explicitly. There are some cases, though, where you want to keep the scope but get rid of a particular watcher.

The $watch function in Angular has a return value: it's a function which, when invoked, destroys the watcher that was just registered. To implement this in our version, all we need to do is return a function that removes the watcher from the $$watchers array:

Scope.prototype.$watch = function(watchFn, listenerFn, valueEq) {
  var self = this;
  var watcher = {
    watchFn: watchFn,
    listenerFn: listenerFn,
    valueEq: !!valueEq
  };
  self.$$watchers.push(watcher);
  return function() {
    var index = self.$$watchers.indexOf(watcher);
    if (index >= 0) {
      self.$$watchers.splice(index, 1);
    }
  };
};

We can now store the return value of $watch and call it later to remove the watcher:

http://jsbin.com/IMEhowO/4/embed?js,console
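A condensed demonstration (Array.prototype.forEach in place of Lo-Dash, reference checking only): after the returned removal function is called, later digests no longer invoke the listener.

```javascript
function Scope() {
  this.$$watchers = [];
}
Scope.prototype.$watch = function(watchFn, listenerFn) {
  var self = this;
  var watcher = {
    watchFn: watchFn,
    listenerFn: listenerFn || function() { }
  };
  self.$$watchers.push(watcher);
  // The returned function splices this watcher out of the array.
  return function() {
    var index = self.$$watchers.indexOf(watcher);
    if (index >= 0) {
      self.$$watchers.splice(index, 1);
    }
  };
};
Scope.prototype.$digest = function() {
  var self = this;
  var dirty;
  do {
    dirty = false;
    self.$$watchers.forEach(function(watch) {
      var newValue = watch.watchFn(self);
      if (newValue !== watch.last) {
        watch.listenerFn(newValue, watch.last, self);
        dirty = true;
      }
      watch.last = newValue;
    });
  } while (dirty);
};

var scope = new Scope();
scope.aValue = 'abc';
var counter = 0;
var destroyWatch = scope.$watch(
  function(s) { return s.aValue; },
  function() { counter++; }
);
scope.$digest();      // counter becomes 1
scope.aValue = 'def';
destroyWatch();       // remove the watcher before the next digest
scope.$digest();      // the change to aValue goes unnoticed
console.log(counter); // 1
```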

What's Next

We've come a long way, and have a perfectly functioning implementation of a dirty-checking scope system like Angular's. But Angular scopes do a lot more than this.

Perhaps most significantly, in Angular scopes are not isolated objects: a scope can inherit from other scopes, and watchers can watch not only properties of the scope they're attached to, but also properties of that scope's parents. This approach, while conceptually simple, is a frequent source of confusion for beginners. That's why the topic of the next article in this series will be scope inheritance.

After that, we'll discuss Angular's event system, which is also implemented on Scope.

Build Your Own AngularJS (1): Scope and Digest was first published on the 伯乐在线 blog.

01 Aug 02:45

A Tour of Git, the Open-Source Distributed Version Control Tool

Yunchenge

git, introduction

Ever since its birth, Git – open source, simple, fast, distributed, and efficient – has handled the complex development needs of projects such as the Linux kernel source tree. While covering Git's installation and usage, this article focuses on Git's design philosophy and architecture, along with a range of practical features, including the Git branching model, Git tags, submitting patches with Git, migrating from CVS to Git, and migrating from SVN to Git.