Chapter 22

Test and Launch the Site

Summary

The time has come to launch. You need to check off the final items on your list, make sure the site is running in its production environment, and launch. Then, you need to make sure you resolve the issues that crop up after the site goes live.

Does this Apply?

At some point in the lifespan of every project, it will need to be tested and launched. Therefore, this chapter applies to every project, though in varying degrees. Some of the differentiation will depend on whether or not you completely control your production environment. In many cases you may deploy to an environment controlled by other organizations, which will affect your communication plan and distribution of work.

Narrative

Well, here we are. It’s launch time.

Almost.

There are actually quite a few things you need to do up to – and surrounding – the launch. Website launches aren’t magic. They feel great because the moment has come, but there are few things worse than a botched launch.

First, you need to make sure your website is tested. QA is not a single thing so much as a family of related practices.

Additionally, you might need to choreograph the actual deployment. There’s often a series of tasks required to get a website project out the door, and you’ll need a checklist to track them all.

Once it launches, you need to be vigilant for the unforeseen issues that always tend to crop up right after a new project sees the light of day. There’s often an acute period of “stabilization” that needs to happen in the days and weeks after launch.

Finally, you need to make sure immediate operational processes and reporting lines are in place. We’re going to talk about governance and product management later, but for now, you just need to make sure you know who to call if something goes wrong.

Let’s get started.


QA / Testing

Quality Assurance (QA) is not something you embark upon right before launch. If you haven’t done any QA before this point, you likely have some big lurking issues you haven’t found yet. QA should be built into every code release, and your testers should have been working in the developing site early and often.

Not all QA is created equal. We can separate the issues we find into two rough groups.

  • “Hard” Issues: This is stuff that’s simply broken. This stuff needs to be fixed before launch.

  • “Soft” Issues: This is stuff that isn’t broken, but could be better. We’re actually not going to talk about this in this chapter – these are issues that stem all the way back to your work on your information architecture, content strategy, and design. Your site might work perfectly, but people can’t find anything. This is not a problem that a late-stage QA process is going to fix.

We point out this distinction because people might tell you that something is wrong with the website, and you need to decide whether this is a “Bug/Issue” or just a “Change Request.” Just because someone wants something changed doesn’t mean you have a QA problem. You might just need to refine some earlier thinking, and you might do this after launch as you work to optimize your website as a product of the organization, rather than a project with a start and end date.

What we are going to discuss here are three different axes that a QA issue can turn on. An issue’s scope is based on the specific combination of characteristics from the variables below:

  • Public or Private: A public error affects everyone on the site. A private error affects only editors or those who work for your organization.

  • Absolute or Partial: An absolute error prevents any productive work or content consumption from happening. A partial error just impedes part of work or consumption, or only occurs under outlying usage patterns.

  • Contained or Uncontained: An uncontained error is one that “blows up ugly” when someone attempts to do something – a raw error message or crash. A contained error is when functionality is broken, but the affected user is presented with a friendlier error experience – an error page, or some other intended message.

Here are some examples:

  • Whenever anyone attempts to access your website, they get an unformatted page saying “Error Code 4023: Database connection failed.” This issue is public (everyone can see it), absolute (every page is broken), and uncontained (the error message is raw).

  • When editors try to change the privacy policy, the CMS reports a “Permissions Error. Please contact your administrator.” This issue is private (only editors see it), partial (it only affects one content object), and contained (the editors get an error page with more instructions).

  • When viewing the home page, the widget that shows the weather returns the wrong temperature. This issue is partial (it’s only one piece of the page), and is likely private and contained – even though it’s in clear view, it doesn’t appear like an error and no one would know it was an error unless they actually checked the temp from a different source. (Unless it was showing something like 150 degrees, in which case it would be public – whether or not someone knows there’s an error can matter.)

The larger point here is that not all QA errors are the same. In addition to the prior delineation of “hard” and “soft,” some errors are, well…harder than others. This matters because you’ll be collecting issues in the run-up to launch, and you’ll need to triage them to decide what will actually delay your launch, and what can be shuffled to the backlog.
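One way to operationalize this triage is to turn the three axes into a rough severity score. Here’s a minimal sketch in Python; the weighting is a hypothetical example for illustration, not a standard.

```python
# A rough severity score built from the three axes described above.
# The weights are a hypothetical example, not an industry standard.
def triage_score(public: bool, absolute: bool, uncontained: bool) -> int:
    """Higher scores suggest the issue should block launch."""
    score = 0
    score += 4 if public else 0       # visible to all visitors
    score += 2 if absolute else 0     # blocks all work or consumption
    score += 1 if uncontained else 0  # "blows up ugly"
    return score

# The database-connection example: public, absolute, and uncontained.
print(triage_score(True, True, True))
```

A scheme like this gives your team a shared vocabulary for sorting the issue queue, even if the exact weights are debated.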

Types of Testing

“Testing” and “QA” are terms so generic as to be useless. There are actually lots of different styles and gradations. Here are some of the most common.

  • User Acceptance Testing: This is when your users – humans – work with a system or a change to ensure it meets their requirements. This is usually done as a component of development with the intention of providing feedback.

  • Unit/Functional Testing: This is when a developer writes some code to test some other code. They might have code that adds two numbers together. They write some code that passes “1” and “3” to that code and ensures that it returns “4”.

  • Accessibility/Usability Testing: This is testing to ensure an interface makes sense and is usable by people across the full range of cognitive and physical abilities.

  • Load Testing: This means artificially pushing massive volumes of traffic at the website and measuring response times and error rates. How will your website handle the traffic your Super Bowl ad generates?

  • Penetration Testing: This is when a website is probed for common security problems, either by an automated battery of tests, or an actual human that tries to hack into it.

  • Fuzz Testing: This is testing that pushes atypical inputs to a program in an attempt to provoke an error. What happens when someone passes the full text of “War and Peace” to your web form?

  • Device Compatibility Testing: This used to be called “browser testing,” but now it’s expanded past just browsers to the broader concept of ensuring your content looks acceptable on all devices.

  • Regression Testing: This is not a separate category of testing, but rather a discipline of repeating your past testing whenever you make a change, to ensure that everything still works. Occasionally, a change will break some remote functionality that has always worked, and unless you go back to re-test everything, you wouldn’t know until someone complained.

  • Link/Request Testing: An HTTP request is made to a specific URL (or a hyperlink is followed), and the resulting status code or content is evaluated. This is usually done as a full crawl of a website: a page is requested, all the links on the page are followed, and so on until the entire website has been processed.
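The unit-testing item above can be made concrete in a few lines. Here’s a sketch in Python, where the `add` function is a hypothetical stand-in for the code under test:

```python
# The code under test: a function that adds two numbers together.
def add(a: int, b: int) -> int:
    return a + b

# The test code: pass known inputs and verify the expected outputs.
def test_add() -> None:
    assert add(1, 3) == 4
    assert add(-2, 2) == 0
    assert add(0, 0) == 0

test_add()
print("all assertions passed")
```

In practice, tests like these live in a framework such as `unittest` or `pytest` and run automatically on every check-in.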

Testing Practices

The actual process of testing generally falls into two categories:

  • Manual: A human interacts with your website – either informally, or according to some defined test scripts to find problems.

  • Automated: A program of some kind performs the tests and provides a report of the result.

Some tests can only be one or the other.

  • It’s unlikely you could ever coordinate enough humans to complete a useful load test.

  • A user acceptance test requires a human to experience and certify a system.

Ideally, you want to automate as much as possible. Some forms of rote testing don’t need the intricacies of human judgment for performance or evaluation.

However, many aspects of your website’s functioning will require a human tester to evaluate. These testers will need a set of test scripts describing tasks to complete and the outcomes they should expect.

Generally, you’ll have test scripts for each release to the testing environment, which occur upon the completion of each sprint or stage of work. These scripts will get longer and longer as more and more functionality is completed. A regression test requires you to retest past functionality, and it’s not at all uncommon to run into new problems with existing features as new features are released.

Automated testing can take a few forms, from scripts that crawl your website and report on errors, all the way to virtual browser testing. Some software can create a virtual browser in memory, then navigate through your site as if a human were experiencing it, and ensure that certain content appears in certain elements. More advanced versions can record video of this process (which, remember, is not actually happening visually, but rather in a “fake” browser) and provide screen captures of any errors that occur.
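The crawl-style script mentioned above can be sketched with just the Python standard library. The `fetch` function is injected so the crawl logic can be exercised without a network; in a real script it would wrap `urllib.request.urlopen`. The URLs here are placeholders.

```python
# A minimal sketch of a crawl-style link checker: request a page, collect
# its links, follow them, and record the status code of every URL reached.
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags as the HTML is parsed."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch):
    """fetch(url) -> (status_code, html). Returns {url: status_code}."""
    seen = {}
    queue = [start_url]
    while queue:
        url = queue.pop()
        if url in seen:
            continue
        status, html = fetch(url)
        seen[url] = status
        if status == 200:
            parser = LinkExtractor()
            parser.feed(html)
            for href in parser.links:
                absolute = urljoin(url, href)
                # Stay within the site being tested.
                if absolute.startswith(start_url) and absolute not in seen:
                    queue.append(absolute)
    return seen
```

A report of every URL whose status isn’t 200 is then a one-line dictionary comprehension over the result.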

Many projects will have testing completed as part of the code check-in and deployment process. When a developer checks code into the source code repository, these automated tests can initiate and run unattended. The development team lead and project manager can get reports of tests which failed, allowing them to make decisions before deploying to new environments.

Final Launch Checklist

In addition to release/regression testing, you need to keep a final launch checklist to make sure all necessary tasks are complete before launch. Too often, a team launches a project having forgotten multiple housekeeping items. These items tend to fall into the awkward space between development and content, which means no one person or group takes responsibility for them.

Consider the following:

  • URL Redirection: Is there a method in place to redirect old URLs? Are all old/new mappings in place and functioning?

  • Error and Not Found Pages: If someone accesses a page that doesn’t exist, what happens? If you force a server error, what happens? Are these scenarios handled gracefully (a “contained” error, as we discussed above), or do they make scary noises (an “uncontained” error)?

  • META and Open Graph Information: Does all your content have adequate META tags and Open Graph data? This isn’t visible to a browser-based user, so it often gets overlooked.

  • ROBOTS.TXT: Is there a ROBOTS.TXT file in place to manage search crawlers? Perhaps more importantly, if one was in place to hide a testing environment from Google, has it been removed?

  • Caching: If you turned off caching for your testing environment, has it been turned back on? Are you prepared to reset the cache immediately after launching?

  • Email: If your website sends email, does the production environment have access to an SMTP server?

  • META File Links: There are lots of URL paths in META and other tags in your HTML – a favicon, an Apple Touch Icon, tile images for Microsoft Windows, custom fonts, etc. Your link checker might not have validated these links – are you sure all the files are there?

  • Analytics and Other Tools: Are all analytics and tag manager tools in place? These often get removed from testing environments to avoid polluting the stats – have they been put back?

  • SSL Certificate: Do you have the SSL certificate ready for installation on your new hosting platform?

There are a lot of details to manage, and without a checklist you will forget them. Make sure you have some place to collect these items throughout your entire process.
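Several of these checklist items can be verified with a small script rather than by eyeballing them. Here’s a sketch that checks the URL redirection item; the URL mappings and the injected `fetch_headers` function are hypothetical stand-ins for your real redirect map and an HTTP client.

```python
# Sketch: verify that each old URL redirects to the expected new URL.
# fetch_headers(url) -> (status_code, location_header_or_None) is injected
# so the check can be run against a fake server in tests; in production it
# would issue a real HTTP request without following redirects.

REDIRECT_STATUSES = (301, 302, 307, 308)

def check_redirects(mappings, fetch_headers):
    """mappings: {old_url: expected_new_url}. Returns a list of failures,
    each as (old_url, actual_status, actual_location)."""
    failures = []
    for old, expected in mappings.items():
        status, location = fetch_headers(old)
        if status not in REDIRECT_STATUSES or location != expected:
            failures.append((old, status, location))
    return failures
```

Run against the real production environment before launch day, an empty failure list is one less thing to worry about.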


Deployment

Done properly, the actual launch of your website shouldn’t be stressful. If you’re under a lot of stress and tension, then maybe question if you’re actually ready to launch.

By this point, several of your requirements will likely have been thrown overboard. That’s not uncommon, and brings us back to your backlog. Approaching launch, you need to be constantly triaging your remaining issues and asking yourself which ones are absolutely required for launch. Some will be sacrosanct, but others will be more fluid. They can get pushed onto the backlog for the proverbial “Phase 2” release. A looming deadline tends to clarify these things.

First off, please don’t wait until launch day to make your first pushes to the production environment. Some “pre-launch” preparation is crucial.

Refer back to our discussion of DevOps and environments. Throughout the project, your development team will have been moving code through multiple different environments. Individual developers will push their code into the integration environment, it will be tested in the testing environment, and then will sometimes be pushed into a pre-production environment.

The final production environment isn’t usually available right away. Integration and testing environments are easier to establish – they use lighter-weight resources and are easier to modify and swap after creation. Production, however, requires a more significant infrastructure investment and more planning, so it tends to come online later.

You want to ensure that your production environment does come online sometime during the development of your project, well before launch. It’s not helpful for this environment to only become available the day before launch day.

Ideally, at about the two-thirds point in your project, the development team should start pushing into the future production environment quite often. Problems can crop up when deploying into production, and you’ll want to ensure that you have significant lead time to correct these problems before launch day comes around.

This “pre-launching” is possible because only very rarely is your new website going to exist on the same computing instance as your old website. Back in the day, we had to reuse physical server hardware. This meant we would physically install a CMS on a server somewhere, so we would have to take the existing website offline while we spent the time getting the new website up and running.

Thankfully, this is no longer the case. Almost universally today, the new website is developed in a parallel environment while the current environment still serves the old website. In fact, many organizations take this opportunity to upgrade their computing capacity. To install the new website in the old environment would likely cause a downgrade in performance.

This means for a brief moment in time, your organization will have two completed websites running right next to each other. The entire concept of launch, then, simply boils down to changing where your domain name points, as we discussed in a previous chapter.

This has drastically lowered the stress level of launches. Not only is the launch a simple configuration change, but if there’s a disaster post-launch, it becomes much easier to fall back to the previous production environment.

Additionally, ensure that your production deployments are just as automated as all your other deployments. The last thing you want is some one-off manual process for deploying to production. If there’s any deployment that should be protected from careless mistakes, it’s the one to production, so there’s no reason it shouldn’t be as automated as everything else.

So that’s the big rule: launch early and launch often. Make the actual launch day as boring as possible.


Post-Launch Stabilization and Testing

Earlier, we talked about the idea of regression testing, where you test old functionality after new functionality has been introduced.

The same is true after launch. You may think nothing has changed since you tested right before launch, but understand that the switch in domain names can introduce some very strange bugs. In some cases, functionality might be referencing the organization’s domain name (meaning it was referring to the current – now old – website) or the pre-production domain name, which might no longer exist. These bugs would be invisible until the domain names change and things start to break.
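One way to catch these domain-name bugs is to scan the rendered pages for hostnames that should no longer appear anywhere in the output. A sketch; the domain names here are hypothetical placeholders for your own old and pre-production hostnames.

```python
# Sketch: scan rendered HTML for references to domains that should no
# longer appear post-launch. The hostnames are placeholder examples.
STALE_DOMAINS = ["staging.example.test", "old.example.test"]

def find_stale_references(pages):
    """pages: {url: rendered_html}. Returns {url: [stale domains found]}."""
    problems = {}
    for url, html in pages.items():
        found = [domain for domain in STALE_DOMAINS if domain in html]
        if found:
            problems[url] = found
    return problems
```

Fed by the output of a post-launch crawl, this surfaces hard-coded links, image paths, and API endpoints that were quietly pointing at the wrong environment.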

Plan on doing a complete regression test immediately post-launch, to test all the functionality in the installation. Do not consider yourself completely launched until this regression test has been completed and passed. It’s amazing the number of bugs that can be introduced simply by changing environments.

Once your public regression test has been completed, you need to get editors in and using the system to shake out any private bugs that might be hidden by the CMS. Your editors might find problems that are not visible from the public side of the website. They need to be able to create content, edit content, delete content, and navigate the system without error.

In the days immediately following launch, the team needs to be hyper-vigilant for problems. If there are problems that are going to require a rollback to a prior version of the website, you need to find these problems before you get too deep into editing and content creation. One of the worst scenarios would be to create hundreds of new content items and perform thousands of edits only to discover a heretofore unknown problem that would require you to scrap the current launch and all of your content changes with it.

Production Infrastructure Configuration

There are a few things your server administrators need to get in place immediately post-launch. These are things that are normally not running during pre-production.

  • Backup: There needs to be some scheduled process for backing up the resources of the website. This includes both the database and the file system. A test restore needs to happen as soon as possible to ensure integrity. A backup that can’t be restored is worse than nothing, because it provides a false sense of security.

  • Monitoring: Systems need to be put in place to monitor the website for uptime. If the website experiences an error or downtime, the server administrator needs to be notified.

Until these two systems are in place, you need to consider your newly launched website to be at-risk. A new website cannot be considered stable until these two things are configured and tested in the production environment.
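The monitoring item boils down to a probe-and-notify loop. Here’s a minimal sketch; the `probe` and `notify` functions are injected as hypothetical stand-ins for a real HTTP request and a real alerting channel (email, paging service, etc.).

```python
# Sketch of a minimal uptime check. probe(url) returns an HTTP status code
# (or raises on a connection failure); notify(message) alerts the server
# administrator. Both are injected so the logic can be tested offline.

def check_uptime(url, probe, notify):
    """Returns True if the site looks healthy, False otherwise."""
    try:
        status = probe(url)
    except Exception as exc:
        notify(f"{url} is unreachable: {exc}")
        return False
    if status != 200:
        notify(f"{url} returned HTTP {status}")
        return False
    return True
```

A real deployment would run a check like this on a schedule from a machine *outside* the production environment, so the monitor doesn’t go down with the site it’s watching.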


Post-Launch Operational Policy

There are some immediate resource concerns that you need to establish. We’ll talk about larger governance concepts later, but in the short term you need to get the team together and ensure that everybody understands the lines of communication.

Specifically, every member of the team needs to understand the answers to these questions:

  • If they encounter an error, who do they notify about that error?

  • Where do they log that error?

  • In the event of catastrophic downtime, who is the emergency contact?

  • And for that contact, who is the technical resource that can actually bring the environment back up? You don’t want three project managers agreeing that an issue is technical, but not knowing who can fix it. There’s a moment in the lifecycle of many issues when responsibility transitions from a project resource to a technical resource outside the immediate project team. If a technical resource has to be brought in to actually fix the problem, make sure everybody on the team knows what that line of communication looks like.


Congratulations, it’s been a long road. Your website is launched, hopefully it’s stable, and you’ll be able to move on to larger concerns like transitioning from a project to a product and start planning out future development.

As we’ll get into in the next few chapters, this is where the real work starts. By this point you’ve hopefully developed a base of functionality and process to enable you to take a long view of your project and plan the future out so that it aligns with your organization.

Too many people think that launch day is the finish line. In reality, launch day is the starting line. This is where the fun begins. This is where the digital property that you just created actually starts providing value for the organization.

This can be exciting and terrifying, all at the same time.

Inputs and Outputs

Given that this chapter comes very late in the lifecycle of your project, it’s safe to say that the output of this phase is a website running in production: tested for functionality, with working backups and monitoring, and with a communication plan in place to deal with errors and downtime.

Another output might be an exhausted team on the verge of burnout. Be prepared to take a beat and collectively catch your breath at this stage.

The Big Picture

This will likely mark the end of the project. Now it transitions into a product or a process that needs to be managed. Instead of working toward a single point in time and the completion of tasks, you’ll start working toward the achievement of metrics and overall goals.

Staffing

The duties described in this chapter will require a QA or testing staff, even if that’s just a single person. On smaller teams, this person may actually do double duty from the content strategy or design teams, as long as they have an eye for detail and are well-organized. The deployment and launch aspects of this chapter will need to be handled by your server administration team, or the equivalent staff that your hosting provider has made available.