The whole thing was in reference to what Dustin Kirkland heard at a 45min Paul Gunn talk made in 2010 at linux.conf.au Systems Administration Miniconf in
Weta Digital - Challenges in Data Centre Growth
or “You need how many processors to finish the movie???”
Paul’s been at Weta 9 and a half years. Focusing on the machine rooms.
Weta Digital are a visual effects house specialising in movie effects. Founded in 1993: Heavenly Creatures, Contact, etc., done on proprietary systems, no render walls.
Paul covers 2000 onwards, moving towards generic hardware, Linux on the desktop, Linux render wall: LotR, Eragon, District 9, Avatar, etc.
If it took 24 hours to make a movie, visual effects would start at lunchtime. About 2pm the first production footage would arrive, scan or copy it, then hiring artists, moving on to sending to the client, taking feedback, applying changes. By 11 pm the effects team should be finishing. After that colour grading and sound happen.
(Some directors run a bit later, e.g. LotR).
There’s a pre-vis process, where the director gets low-res renders as a preview before shooting starts, to get a feel for the shots before committing to the expensive filming.
Final delivery is a frame: 12 MB, 2K x 1.5K res. 170,000 frames for a decent-length movie.
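To put those two numbers together, here is a rough back-of-envelope sketch. The decimal units and the 24 fps frame rate are assumptions made for illustration, not figures from the talk.

```python
# Back-of-envelope size of the final delivery, using the figures quoted above.
FRAME_SIZE_MB = 12        # one final frame at roughly 2K x 1.5K
FRAME_COUNT = 170_000     # "a decent length movie"


def delivery_size_tb(frames: int = FRAME_COUNT, frame_mb: float = FRAME_SIZE_MB) -> float:
    """Total size in terabytes, assuming decimal units (1 TB = 1,000,000 MB)."""
    return frames * frame_mb / 1_000_000


def running_time_minutes(frames: int = FRAME_COUNT, fps: int = 24) -> float:
    """Running time in minutes; 24 fps is an assumption, not from the talk."""
    return frames / fps / 60


print(f"final frames: about {delivery_size_tb():.2f} TB")
print(f"running time: about {running_time_minutes():.0f} minutes")
```

This covers only the delivered frames; intermediate renders and source material are not counted here.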
Average artist has 1 - 2 high-end desktops, 8 core, 16 GB of RAM currently, with whatever is the graphics card of the day. Most artists are Linux - 90%. Some packages aren’t on Linux, e.g. Photoshop, but most artists use Maya (70%). Shake and Nuke for 2D work, Alfred (by Pixar) to manage jobs hitting the render wall.
Render wall: Originally used pizza boxes, moved to blades for efficiency. HP blades, 8 cores, 24 GB of RAM currently. Linux right the way through, upgrade every couple of years, Ubuntu.
Storage is about a petabyte of NFS disk, with a few more petabytes of tier 2 SATA disks.
Not a large single image: isolated nodes running farm-style. Nodes are sacrificial; they can lose a node and restart its tasks as needed. Nodes pluck jobs off a queue the artists submit into. Jobs can run from 10 seconds to 48 hours.
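The "sacrificial node" idea can be sketched as a shared queue where a failed task simply goes back on the queue. This is a toy simulation only: `run_farm`, its parameters, and the random failure model are hypothetical and have nothing to do with how Alfred or Weta's wall is actually implemented.

```python
import random
from collections import deque


def run_farm(tasks, node_count=4, failure_rate=0.2, seed=0):
    """Simulate nodes pulling tasks from a shared queue.

    A failed node loses only the task it was running; that task goes
    back on the queue and is picked up again later.
    """
    rng = random.Random(seed)      # seeded so the simulation is repeatable
    queue = deque(tasks)
    completed = []
    restarts = 0
    while queue:
        for _node in range(node_count):
            if not queue:
                break
            task = queue.popleft()
            if rng.random() < failure_rate:
                queue.append(task)  # node died mid-task: requeue the work
                restarts += 1
            else:
                completed.append(task)
    return completed, restarts


done, restarts = run_farm([f"frame-{i:04d}" for i in range(100)])
print(len(done), "tasks completed,", restarts, "restarts")
```

Because no node holds state that matters, losing one costs only the work in flight, which is the property the farm-style design relies on.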
At peak there were 10,000 jobs and a million tasks a day last year.
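As a rough sense of scale, the peak figures work out as follows. This sketch assumes the load is spread evenly over the day, which a real farm would not do.

```python
# Peak-day figures quoted above.
JOBS_PER_DAY = 10_000
TASKS_PER_DAY = 1_000_000
SECONDS_PER_DAY = 24 * 60 * 60

# Average fan-out of a job into tasks.
tasks_per_job = TASKS_PER_DAY / JOBS_PER_DAY

# Average dispatch rate, assuming tasks arrive evenly around the clock.
tasks_per_second = TASKS_PER_DAY / SECONDS_PER_DAY

print(f"{tasks_per_job:.0f} tasks per job on average")
print(f"{tasks_per_second:.1f} tasks started per second on average")
```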
The rendering pipeline hasn’t changed much, but the hardware has; the first render wall was purchased in 2000, 13 machines, 100 Mb ethernet, 1 GB RAM, a 2 Gb uplink in the rack. Current render wall is 3700 machines, 1 Gb per machine, 500 10 Gb ports active servicing the wall/storage.
The machine room used to be a single entity with everything crammed in; now there are 6 machine rooms and 7 wiring closets. One room has a “hot” room, a storage room, other servers.
2000 - 2003 went to 3,000 cores for LotR; 2009 was over 35,000 cores.
Cold racks are 7 kW, the hot racks are 22 kW - 27 kW; heat is the biggest challenge. Thermal load for the render wall in total runs at 700 kW for almost the entire week, with a couple of dips on Sunday evenings and on Mondays.
Started with standard machine room design: raised floors, hot aisle/cold aisles alternating, cooled by standard computer room aircons; this was good for 2-3 kW per square metre. The current hot room is 6.5 kW per square metre.
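A quick way to see what the density difference means is to ask how much floor the render wall's 700 kW would need at each figure. The sketch below assumes the whole load sits in one room at a uniform density, which is a simplification for illustration.

```python
WALL_LOAD_KW = 700  # render wall thermal load quoted in the notes


def floor_area_m2(load_kw: float, density_kw_per_m2: float) -> float:
    """Floor area needed to host a load at a given power density."""
    return load_kw / density_kw_per_m2


# Densities from the notes: 2-3 kW/m2 traditional, 6.5 kW/m2 current hot room.
for label, density in [
    ("traditional, low end", 2.0),
    ("traditional, high end", 3.0),
    ("current hot room", 6.5),
]:
    area = floor_area_m2(WALL_LOAD_KW, density)
    print(f"{label}: about {area:.0f} square metres")
```

On these assumptions the hot room needs roughly a third to a half of the floor that a traditional design would.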
The first room was pre-existing, with 20 cm of floor space shared between network, power, and cold air, replete with hotspots. It had a smoke detection system, which was triggered by the compressor units losing gas, and then the fire service was called.
Second edition machine room was ready for Kong; 60 cm floor, high ceiling, 4 x 60 kW aircon units. All services above-floor; nothing under it except air and flood detectors. Was a fine room, but it couldn’t scale, since it was limited by how far air could blow under the floor.
Third gen machine room: Building started in 2008, 9 months, built into a pre-existing building. Concrete plinths, 1.2 metres, for earthquake retention. 30cm pipes for services; 6 rows in the room, for 60 cabinets. Core service pipes branch off. Rittal racks. The racks are sitting 1.2m up on the plinth. All services are above the cabinets. The fire sensors can shut down individual racks. Incidentally, wire for two sensors to go off for fire, not one. Data pre-run into the racks for ease of build. Power comes top-down, too.
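The two-sensor rule amounts to a simple vote. A minimal sketch, with a hypothetical `fire_confirmed` helper that is not taken from any real fire panel:

```python
def fire_confirmed(sensor_states, required=2):
    """Return True only when at least `required` sensors are tripped.

    `sensor_states` is an iterable of booleans, one per sensor covering
    the same rack or zone.
    """
    tripped = sum(1 for state in sensor_states if state)
    return tripped >= required


print(fire_confirmed([True, False]))   # a single sensor is treated as a false alarm
print(fire_confirmed([True, True]))    # two agreeing sensors trigger the shutdown
```

Requiring agreement trades a slightly slower response for far fewer spurious shutdowns, which matters when a false alarm takes out a rack or calls out the fire service.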
Racks are fully enclosed once the door closes. Hot air comes out at about 40 degrees; the doors are water-cooled, and air exits the rack at 20 degrees. 1800 litres of water per minute pump through the racks at peak load.
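The flow rate can be sanity-checked against the heat load with the usual relation Q = m × c × ΔT. The sketch below assumes, purely for illustration, that the full 700 kW render wall load ends up in this water loop and that the water has the properties of plain water; neither is stated in the talk.

```python
FLOW_L_PER_MIN = 1800              # peak flow quoted in the notes
WATER_DENSITY_KG_PER_L = 1.0       # approximate, plain water
SPECIFIC_HEAT_J_PER_KG_K = 4186    # approximate, plain water


def water_temp_rise_k(heat_kw: float, flow_l_per_min: float = FLOW_L_PER_MIN) -> float:
    """Temperature rise of the water when it absorbs `heat_kw` of heat."""
    mass_flow_kg_per_s = flow_l_per_min * WATER_DENSITY_KG_PER_L / 60
    return heat_kw * 1000 / (mass_flow_kg_per_s * SPECIFIC_HEAT_J_PER_KG_K)


# Assumed load: the 700 kW render wall figure from the notes.
print(f"water warms by about {water_temp_rise_k(700):.1f} K")
```

A rise of only a few kelvin across the racks is what makes the later free-cooling step plausible: the return water is not much warmer than the supply.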
Seismic plates; for the low-density room it was more of a challenge, since they weren’t sure what they’d install. There’s a table solution; a steel table. A plenum above the room extracts hot air above the racks.
Main switchboard: 3 transformers feeding in, big UPS.
Ladder racking: over a kilometre at the end of the first stage, up to two kilometres now; managing the navigation of the racks is quite a challenge.
Plant room: Rather than using individual compressors in each room/rack, the compressors have been centralised into the plant room, which pumps water. 2 x chillers, 2 x 1 MW, 1 x 500 kW, provides a lot of redundancy. The chillers are incredibly quiet compared to most machine rooms. Magnetic bearings, so reduced wear and tear on the compressors.
Efficiency: Cooling at the rack means they aren’t worrying about ambient cooling; the heat is trapped in the back of the rack. Water is more efficient than aircon units. Free cooling: rather than pumping the heat and chilling it elsewhere, the water is pumped to the roof and run across the roof, getting natural cooling.
The render wall is deliberately run hot: 20 degrees is the norm, the wall runs at 25 degrees, with no noticeable impact on performance or lifespan. HP agreed to warrant it. Saves big money: tens of thousands per month per degree.
Traditional machine rooms: $1 per server = $2 - $3 on plant; in the right weather, ratio is 1:0.23.
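To make the ratio concrete, the sketch below applies both figures to the same server spend. The $1,000 amount is an arbitrary illustration, and the ratios are used exactly as quoted in the notes.

```python
def plant_cost(server_spend: float, plant_ratio: float) -> float:
    """Plant spend implied by a server spend and a plant-per-server ratio."""
    return server_spend * plant_ratio


SERVER_SPEND = 1_000.0  # arbitrary illustrative amount

traditional_low = plant_cost(SERVER_SPEND, 2.0)
traditional_high = plant_cost(SERVER_SPEND, 3.0)
free_cooling = plant_cost(SERVER_SPEND, 0.23)   # "in the right weather"

print(f"traditional plant: ${traditional_low:,.0f} to ${traditional_high:,.0f}")
print(f"free cooling:      ${free_cooling:,.0f}")
print(f"reduction factor:  {traditional_low / free_cooling:.1f}x to "
      f"{traditional_high / free_cooling:.1f}x")
```

The 0.23 figure applies only when the weather cooperates, so the year-round average will sit somewhere between the two cases.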
Make room to expand. You can never have too much space. Don’t put anything under the floor except air, flood sensors, and maybe lights. There’s more volume under the floor than water in the plant, making the environment flood-proof. Plenums to manage/direct hot air. Water cooling is magical. Free cooling is good. Anywhere with a bit of bad weather is a win. Run servers a bit hot. Not so much your disks.
Challenges (or mistakes)
Ratios of server to storage to render walls. Structured cabling - really regret not doing it. JFDI regardless. Supersize racks. Sharing the humidifier between the hot room and cold room was a mistake, since the hot room can’t run as hot as it might.
Still have space to expand. Water cooling for storage? Vendors are increasing density. Run renderwall @ 28 degrees. DC power? Maybe, but there are better options right now. Let someone else be first.