Thursday, February 21, 2008

Life on the Bungie Farm: Fun Things to Do with 180 servers and 350 processors

This lecture was given by Luis Villegas and Sean Shypula. It was primarily about the server farm and distributed computing system that Bungie created for automated builds of code and content.

Advantages:
  • Faster iterations -> more polished games
  • Keeps complexity under control

Binary Builds (game and tools)

  • Automated tests are run on tool builds only

Lightmap Rendering

  • Precomputes lighting in scenes (photon mapping and custom algorithms from Hao and crew)
  • Bakes the level files (output)

Content Builds

  • Compiles assets into monolithic files

Website (bungie.net) Builds

Patches (maintenance items for servers)

Halo 1 -> All assets processed by hand, very few automated tasks

Halo 2 -> More automation (3 servers in farm -> one for each function)

Halo 3 -> Unified systems into single extensible system

The latest iteration, a rewrite created with Halo 3, did a few new things.

  • Unified codebases, implemented single cluster.
  • One farm
  • Updated code to .NET (C#), easier to develop/maintain

Stats

  • Over 11,000 builds (exe/dll)
  • Over 9,000 lightmap builds
  • Over 28,000 other types of builds
  • Halo 3 would not have shipped in current form without the farm.

Interface for users (developers)

  • Had to be easy, simple with "one-button" submit operation
  • Even if users are developers they still don't want to know what is going on behind the scenes

Architecture

  • Single system/multiple workflows
  • Plug-in based
  • Workflows divided into client / server plugins (isolation from each other)
  • Server schedules jobs (messages clients)
  • Clients start jobs and send status and results back to server
  • Server manages state of jobs
  • All communications via SQL Server
  • Incremental builds by default
  • Sits between continuous integration and scheduled builds (devs run builds ad hoc, and there is a scheduled nightly build)
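
The scheduling flow above (server queues jobs, clients pick them up and report back, everything passing through a database) can be sketched roughly as follows. This is only an illustration: sqlite3 stands in for SQL Server so the example runs anywhere, and the table layout, state names, and function names are all assumptions, not details from the talk.

```python
import sqlite3


def make_db():
    """Create the shared jobs table (hypothetical schema, in-memory database)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE jobs ("
        " id INTEGER PRIMARY KEY,"
        " workflow TEXT NOT NULL,"
        " state TEXT NOT NULL DEFAULT 'queued',"
        " client TEXT,"
        " result TEXT)"
    )
    return db


def schedule(db, workflow):
    """Server side: queue a job and return its id."""
    cur = db.execute("INSERT INTO jobs (workflow) VALUES (?)", (workflow,))
    db.commit()
    return cur.lastrowid


def claim(db, client):
    """Client side: take the oldest queued job, or return None if none is queued.

    A real multi-client farm would need an atomic claim; this sketch uses a
    single connection and skips that.
    """
    row = db.execute(
        "SELECT id, workflow FROM jobs WHERE state = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute(
        "UPDATE jobs SET state = 'running', client = ? WHERE id = ?",
        (client, row[0]),
    )
    db.commit()
    return row


def report(db, job_id, ok, result):
    """Client side: write the final status and result back for the server to read."""
    db.execute(
        "UPDATE jobs SET state = ?, result = ? WHERE id = ?",
        ("succeeded" if ok else "failed", result, job_id),
    )
    db.commit()


def state_of(db, job_id):
    """Server side: read the current state of a job."""
    return db.execute(
        "SELECT state FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()[0]
```

Because both sides only read and write rows, neither needs a direct connection to the other, which is one plausible reason to route all communication through the database.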

Symbol Server used (Debugging Tools For Windows)

  • Symbols registered on server

Source Stamping

  • Linker setting for source location
  • Set at compile time
  • Engineers can attach to any client from any client as long as they have Visual Studio installed.

Lightmapper was written specifically for the farm

  • Chunks job parts to clients
  • Merges results
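
The chunk-and-merge pattern can be sketched as below. This is an assumption-laden illustration, not the lightmapper itself: threads stand in for farm clients, `render_chunk` is a placeholder for the real per-client work, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_count):
    """Split items into at most chunk_count contiguous, near-equal chunks."""
    chunk_count = max(1, min(chunk_count, len(items)))
    size, extra = divmod(len(items), chunk_count)
    chunks, start = [], 0
    for i in range(chunk_count):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def render_chunk(chunk):
    """Placeholder for the per-client work; here it just squares each value."""
    return [value * value for value in chunk]


def run_job(items, client_count):
    """Hand one chunk to each simulated client, then merge the partial results."""
    chunks = split_into_chunks(items, client_count)
    # Threads stand in for farm clients; map() returns results in chunk order.
    with ThreadPoolExecutor(max_workers=client_count) as pool:
        partials = list(pool.map(render_chunk, chunks))
    # Merge step: concatenate the partial results into a single output.
    return [value for part in partials for value in part]
```

Keeping the chunks contiguous and ordered makes the merge a plain concatenation; a real lightmap merge would presumably be more involved.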

Simple SLB

  • Min / Max configurable
  • More clients used to support workload if clients are mostly idle
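
One possible reading of the min/max rule is sketched below: a job is always promised its configured minimum, and it may grow into idle clients up to its configured maximum. The function name and the exact policy are assumptions; the notes do not say how the farm actually decided.

```python
def clients_for_job(min_clients, max_clients, idle_clients):
    """Return how many clients to give a job under a simple min/max policy.

    The job gets as many idle clients as are available, clamped to the
    configured [min_clients, max_clients] range (hypothetical policy).
    """
    if min_clients < 0 or min_clients > max_clients:
        raise ValueError("expected 0 <= min_clients <= max_clients")
    return max(min_clients, min(max_clients, idle_clients))
```

For example, with a minimum of 2 and a maximum of 8, a mostly idle farm with 20 free clients yields 8, while a busy farm with no free clients still yields the promised 2.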

Cubemap farms

  • Used Xboxes and PCs for rendering and assembly.
  • Pools of Xbox Dev Kits
  • No client code on Xbox
  • Few changes for Xbox Support

Implementation Details

  • All C# (.NET)
  • Objects were serialized to XML at first, then switched to binary serialization later (speed and memory benefits)
  • Downsides (memory bottlenecks, forced GCs, should have been more careful with memory)
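
To give a feel for why moving from XML to a binary format can save space, here is a small sketch that encodes the same record both ways. It uses Python's standard library as a stand-in, not the .NET serializers the farm used, and the `JobStatus` record with its two fields is invented for illustration.

```python
import struct
import xml.etree.ElementTree as ET

# Little-endian, no padding: unsigned 32-bit job id + 64-bit float progress.
_BINARY_FORMAT = "<Id"


def to_xml(job_id, progress):
    """Encode a (hypothetical) job status record as XML bytes."""
    root = ET.Element("JobStatus")
    ET.SubElement(root, "JobId").text = str(job_id)
    ET.SubElement(root, "Progress").text = repr(progress)
    return ET.tostring(root)


def from_xml(data):
    """Decode the XML form back into (job_id, progress)."""
    root = ET.fromstring(data)
    return int(root.find("JobId").text), float(root.find("Progress").text)


def to_binary(job_id, progress):
    """Encode the same record as a fixed 12-byte binary layout."""
    return struct.pack(_BINARY_FORMAT, job_id, progress)


def from_binary(data):
    """Decode the binary form back into (job_id, progress)."""
    return struct.unpack(_BINARY_FORMAT, data)
```

Both forms round-trip the same values, but the binary layout carries no tag names and needs no text parsing, which is the general source of the size and speed gains.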
