OFJ Field Journal from Greg LaBorde - 11/9/95
WE ARRIVE AT JUPITER! AGAIN!
My journal doesn't need to cover a week. Already one DAY can fill it up...
"Beep......Beep.....Beep....." Oh no! Not the alarm already! It is 6:00am PST on Thursday, November 9, 1995. I do NOT want to get out of bed. Can't I just sleep an extra half-hour and still make it to work in time to start today's "Relay/JOI version 5 with faults" test? "Waaaaa... Waaaaa...." My 15-month-old son Christopher's wakeup cry comes from down the hall, stopping that deliberation. I roll out of bed, stagger down the hall, scoop him up, and deliver him to "Ma-ma." Just as well. I probably couldn't have made it if I had slept until 6:30.
Showered, dressed, I am out the door by 7am. I head to the Donut shop right up the road from JPL. It is a time-honored tradition that the Testbed test conductor, or TC (see my bio for details), buys donuts for the test-team the morning of the test (well they all SAY that it is...). I pick up copies of the procedure we will use from the reproduction room, and I am in the Testbed at 7:30am. Over the next 40 minutes, experts on various parts of the spacecraft arrive and start to "bring up" (power up and run through the activities required to put in the right mode) their respective support equipment and data displays. So far, we've got representatives from the Command and Data Subsystem (CDS, the spacecraft's master computer) and Attitude and Articulation Control System (AACS, controls pointing, engine firing, and movement of the spacecraft and its parts).
Today we will do the "Relay/JOI version 5 with faults test."
Relay/JOI is the collective name for all of the critical activities which the Orbiter MUST successfully complete on December 7th. These are broadly grouped into two major activities: 1) "Probe Data Relay," collecting the Jupiter atmospheric data radioed from the Probe and storing it onto the tape recorder and into the computer RAM (memory chips, you know); and 2) Jupiter Orbit Insertion, the spinning up of the spacecraft to 10.5 revolutions per minute followed by a 49-minute firing of its main engine to enter orbit.
"Version 5" is the 5th time the "Critical Sequence" (program) controlling these activities has undergone significant changes requiring new analyses, new testing, and NEW PAPERWORK!
"With faults" means we will be trying to make the sequence fail by causing all sorts of problems to the spacecraft. Because of the tape recorder problem (which by now you have read a lot about) the Critical Sequence has been changed to protect the tape recorder before, during, and after recording of Probe data. It must now survive another battle in the Testbed.
Two days ago (Tuesday), Mary Oldham led the "Relay/JOI Standalone test" (no faults - just to see if the computer sequence can run to the end on its own without crashing) which ran through the entire 21-day-long sequence from first to last command. We can do this a lot faster than 21 days by "loadclocking," or jumping the spacecraft's clock. Still, it took 24 hours to complete. Mary is the "lead-TC" for Relay/JOI testing, responsible for planning and coordinating all the tests, and has also done the bulk of the preparation for this particular test. Preparation and planning are a big part of the TC's job in any test. The DMS (tape recorder) recovery activities have taken a lot of my time, so I am very glad Mary has done so much. I worked from 4pm to 4am on Tuesday's test (that morning I took Nicholas, my four-year-old son, to see the IMAX film "Destiny In Space." Galileo's launch from the Space Shuttle Atlantis on October 18, 1989, was a featured segment of that film), and I had to go back to work Wednesday morning for MORE meetings.
Mary "handed" me the procedure (by email) after my last preparation meeting ended at 4pm. It was not quite ready, with a few hours of work left for me to do on it. My wife Kristy (who also works at JPL) had a class until 9pm, so I had to run out at 5pm to my kids' daycare center right near the Lab to sign out _another_ kid so our babysitter could take him home too. It is already dark at 5:30pm in LA these days, and I was very tired. Aaaargh! I dashed back and made the final changes, dumped it to the printer (which was having its own problems), and plopped a 200-page tome into the overnight repro-man's in-basket (I'm serious. This guy works all night getting our Xerox jobs done so they are ready in the morning. I do not know what we would do without him). It has been a LONG week ALREADY... Such is life on Galileo (our fifth family member, according to Kristy).
The CDS and AACS people are all here, and the Testbed is buzzing with the sounds of humming equipment and people going about the activities required to turn this untidy pile of metal boxes into a living spacecraft. I pop upstairs for a cup of coffee (I've been drinking WAY too much lately). We're "on the net" now. We wear headsets with microphones that plug into convenient (well, not really) outlets located all around the Testbed. This helps us hear better over the hum of all the equipment and allows other people involved to listen in from their offices. It is really quite loud in the Testbed; imagine a room with 60 PCs all churning away. Our Testbed also has its own air-conditioner.
Today, our job is to simulate the most critical portions of the Relay/JOI sequence - from about 30 minutes after Io flyby through Relay, to just after the JOI burn ends. We are also going to throw in seven "faults" to determine that the critical sequence will do what it is designed to do (and has done in testing the previous four versions): pick up and keep going from wherever it was when a fault occurs.
Since it takes an infinite amount of effort to build a spacecraft that works "right," we design them to work when things go wrong. The Galileo CDS has two complete flight computers, or "strings", either of which can run the whole spacecraft by itself (though we'd like to avoid that situation). Normally both computers are operating, doing pretty much the same thing (one sort of "leads" the other, and thus is called "primary"). *WHEN* one computer crashes (and if you have ever worked with computers, you *know* that this will happen at the *MOST* inconvenient time), the other computer continues to operate the spacecraft. When a fault occurs in the CDS in normal operations (such as when one of the computers gets reset because of a radiation "hit"), the string with the problem goes "down" and stops doing _anything._ The other string then issues commands to "safe" the spacecraft - any currently running sequences are canceled, and instruments and systems are turned on or off so that the spacecraft can wait safely while the flight team analyzes the situation, then sends up commands to bring the down string back "up" and continue on. "Safing" can turn the spacecraft to point at the Earth, turn certain heaters on and off, put some science instruments into safe states (including OFF), etc. All of this is done by programs in the running string based on what the spacecraft was doing, without the need for humans to send commands.
This approach to dealing with problems would not work during Relay/JOI. Unlike most spacecraft activities, the timing of Relay and JOI is *critical*. Even the Probe Release and Orbit Deflection Maneuver, which took place last Summer, could have been delayed for *weeks* (although that would have cost us some propellant and additional work). At Jupiter, however, the laws of physics determine when the Probe will enter the atmosphere and begin transmitting data to the Orbiter, and when and at what orientation the engine firing _must_ take place for the Orbiter to enter orbit. Light (and radio waves) will take 52 minutes to traverse the gulf between Earth and Jupiter on December 7th. That means that any response by the flight team to a problem would require a minimum of 104 minutes, plus analysis and decision time. We do not have time for that. It is simply not possible to respond to problems during Relay/JOI in the "standard" way. Instead, the spacecraft's main computer has to do what it can by itself to recover from any problems and then keep on running the critical sequence from the last spot it "remembers."
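The timing argument above is simple arithmetic, and a short Python sketch makes it concrete. The 52-minute one-way light time and the 49-minute burn come from this journal; the function name and the speed-of-light conversion are illustrative additions, not anything taken from the flight team's tools.

```python
# Figures quoted in this journal.
ONE_WAY_LIGHT_MIN = 52   # Earth-Jupiter light time on December 7th
JOI_BURN_MIN = 49        # length of the main-engine firing

# Physical constant, used only to turn light time into a distance.
SPEED_OF_LIGHT_KM_S = 299_792.458


def min_response_minutes(one_way_min, analysis_min=0):
    """Earliest a ground command could arrive after a fault occurs.

    The fault report needs one light time to reach Earth and the
    command needs one more to reach the spacecraft; analysis time
    (hypothetical parameter) only adds to that.
    """
    return 2 * one_way_min + analysis_min


round_trip = min_response_minutes(ONE_WAY_LIGHT_MIN)
distance_km = ONE_WAY_LIGHT_MIN * 60 * SPEED_OF_LIGHT_KM_S

print(round_trip)                  # 104
print(round_trip > JOI_BURN_MIN)   # True
print(round(distance_km / 1e6))    # 935 (million km)
```

Even with zero analysis time, the round trip is more than twice as long as the whole JOI burn, so a fault at the start of the burn could not be answered from the ground before the burn was due to end.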
Back in the testbed, we are "initializing". When the testbed "spacecraft" awakes at power-up, it believes that it is serenely floating out in the middle of some vaguely-defined "outer space." So the first thing we have to do is convince the AACS that it is really at Jupiter on December 7th. We load the AACS up with the appropriate data (star locations, planet/moon locations, attitude, spin-rate, etc.) and also have the simulator feed it the right stimuli to match. For instance, there is a lamp that shines on the sun-sensor whenever the simulator believes that the spacecraft should be seeing the sun, and there's an LED "star-field" that gives the Star-Scanner something to look at so it can compute the spacecraft's attitude. Then the CDS has to be put into the right "mode." Finally, other things are set just to match the state of the real spacecraft to make the simulation as real as possible (for instance, the Probe is set to "released" even though we have no Probe in the Testbed to set free anyway). All this takes about 4 hours to complete.
Six of the faults we will throw at the spacecraft are the worst. "Inverter Swaps," we call them. An inverter is a device which produces alternating current (AC) from a direct-current (DC) power supply. It is not something used only on spacecraft - for example, many campers/RVs have inverters to power televisions, VCRs, and other household appliances from the vehicle's batteries. Galileo has two inverters, a main and a backup, which are used to convert the DC power that comes out of the spacecraft's nuclear power units into AC power that can be used by the spacecraft systems.
When there is a drop in the voltage on board the spacecraft, the spacecraft power system automatically takes steps to restore full power. All of those steps take no longer than 400 milliseconds to complete. If AC power is still not restored after those steps are done, that means that the main inverter has failed and the power system switches to the backup inverter. On board the real spacecraft, this can only happen once, but we can do it again and again in the testbed. To simulate this, one of our control panels has a button marked "Inverter Switch" which removes power from the entire spacecraft for 400 milliseconds, and then tells the spacecraft computer that it has switched to the backup inverter. It is a serious problem for the computers: basically we pull the power plug from the entire spacecraft for 400 milliseconds. You know, of course, what happens when you do that to your computer at home (or school? Try it) ?!? REBOOT! And if you're *lucky* your hard drive and RAM are still OK when you are done.
"They" (there is always a "they" in everything we do. In this case it is the people responsible for planning all of the Relay/JOI activities) have strategically chosen the times at which we will trigger these faults to produce the most damage, and give the spacecraft the hardest time to recover. If the spacecraft can survive these blows, it can survive anything that real life will throw at it (real life is actually quite calm compared to all the violence we do in testing). To pass these tests, the CDS must restart the computer's command sequence from where it left off, and re-issue any commands that might not have completed because of the fault. The AACS must ensure that the Relay Radio Antenna, the small dish antenna that the Orbiter uses to listen to the Probe, is still pointed at the Probe (if this happens during the Probe mission), or that the main-engine burn for JOI is restarted and continues until the correct time if the burn was interrupted. We will also do an inverter swap right before the JOI engine burn, when the AACS spins the spacecraft up from 3 Revolutions Per Minute to 10.5 RPM. The AACS's own computers have to ensure that the spacecraft spins up properly after the fault in time for the burn.
There's one other fault, a "bus reset," a problem that has actually occurred a few times on the real spacecraft. Another switch will be flipped to cause a short between Galileo's spinning and non-spinning sections. This will cause the primary CDS string to crash right near the end of the Probe Relay. The secondary CDS string should switch the tape recorder to itself (we call it "grabbing the tape recorder") to finish recording the Relay data.
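The pass criterion for the inverter swaps (restart the sequence from where it left off, and re-issue any command that might not have completed) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class, its fields, and the command names are hypothetical and have nothing to do with the real CDS flight software. It assumes the "remembered" position survives the power drop and that every command is safe to send twice.

```python
class CriticalSequence:
    """Toy model of a command sequence that resumes after a fault."""

    def __init__(self, commands):
        self.commands = list(commands)
        self.completed = 0     # remembered progress (assumed to survive a fault)
        self.issued_log = []   # every command sent, including repeats

    def run(self, fault_during=None):
        """Run from the last remembered spot to the end.

        fault_during is the index of a command that gets interrupted
        once by a simulated power drop (hypothetical test hook).
        """
        while self.completed < len(self.commands):
            i = self.completed
            self.issued_log.append(self.commands[i])
            if i == fault_during:
                fault_during = None   # the fault only happens once
                continue              # never confirmed complete: re-issue it
            self.completed = i + 1    # advance only after completion


seq = CriticalSequence(["spin_up", "start_burn", "stop_burn"])
seq.run(fault_during=1)
print(seq.issued_log)
# ['spin_up', 'start_burn', 'start_burn', 'stop_burn']
```

The design point is that progress is recorded only after a command finishes, so a fault can cost a repeated command but never a skipped one. That only works if repeating a command is harmless, which is why the sketch assumes it.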
The last step of initialization is to run through a file of "catchup" commands. These are commands in the Relay/JOI sequence that take place before the point where we start testing it. Once these commands are completed, and the Testbed is as closely matched as we can make it to the spacecraft's expected state at 10:15am PST on Thursday, December 7, 1995, we are ready to start the test. We "perform" the sequence (like typing "RUN"), and then Loadclock to skip to the correct time. Relay/JOI is underway!
Mary arrives. We decided ahead of time that we wanted as many pairs of knowledgeable eyes looking at this as possible. There is a lot going on: data to record, printouts to make, commands to send, etc. We also have Jesse Glance, a relatively new guy on our team who has been working on computer simulation testing of Galileo (when THEY make a mistake, they can just reboot; we have to start over). He is getting some experience a little bit closer to the real spacecraft. We do the first inverter swap. This one is to check that some changes to DMS (the tape recorder) commands have been done correctly. We would also like to repeat some commands that will move the scan platform (home of the spacecraft's cameras) when the CDS re-issues commands during the recovery process; to do this, we have a 2-second window in which to issue the fault. "...3...2...1...MARK!"
CDS Operator Qui Chau presses the button. All the lights on the panel, indicating which parts of the spacecraft are powered, go out. Then they light again as the inverter comes back on-line, and more come on as the power system starts the recovery. The line-printer, a big tractor-feed type that most people see now only in movies, starts chugging away printing out all of the commands that the CDS spits out as the spacecraft recovers from the fault. The sounds of an Inverter Swap, mostly from the printer, are very distinctive. The net reverberates with reports from the analysts: the tape recorder remained "ready" as it should have, and the AACS saw its command repeated. A perfect fault!
We are ready to jump the clock again to the start of Probe Relay when a minor problem appears. The simulator that produces the Probe data is OFF! Apparently Ron Morgan, the guy who makes it go, thought the test was TOMORROW. We pause while he brings "Lassie" to life (taken from its abbreviation, LSSSE. We *like* to name things...). Then we do four more Inverter Swaps in rapid succession. Jesse is having WAY too much fun. The CDS Operators even have him push the button for TWO of the Inverter Swaps because they have too much to do. It is amazing that all of these closely-spaced faults go so smoothly. This team has "done" the Relay/JOI sequence at _least_ 18 times that I can count, and we are getting VERY good at it.
Finally we do the bus reset. Another minor snag here. Joel Mirelez, assigned to flip the switch that causes the reset, does not realize that he is to flip it back immediately. This sort of thing has happened before. We all understand what is supposed to happen. Unfortunately *WHAT* we each understand is DIFFERENT! It is really important for us all to discuss these things in meetings, write it all down, and review it to be SURE that we are going to do everything right. The switch stays set for 2 minutes. Man, that was some reset! However, the CDS is designed to be able to handle it, and, in fact, the spacecraft recovers nicely. The last Inverter Swap interrupts the spin-up to 10.5 RPM, and we again recover and start the Jupiter Orbit Insertion burn. We're almost finished with the test.
It is now 6:30pm, and while the JOI burn is a MAJOR event, there is very little for us to *DO* while it goes on. Only Tracy Neilson, the AACS engineer, has to busily record what her subsystem is doing. Poor Tracy. CDS Analyst Al Nikora and I jump in his car and go to our favorite pizza take-out place (Domenico's, if you must know) and bring back a couple of pizzas for dinner. We chase away the raccoons outside the door (JPL must be like a Denny's to wildlife. There are a lot of raccoons, possums, and deer, and they all look very well-fed. The raccoons can tell you are bringing back pizza just from the way you drive, I think) and consume pizza. We laugh as we insult our absent coworkers. Since they left at a normal hour, this is the price they must pay.
The JOI burn is over! Successful again. I am finally losing it (my mind), so I turn the helm over to Mary to finish up the data readouts and tape playback of the Probe data. Tomorrow, experts will pore over the data to check that the effects were just what we would expect from the faults introduced (actually they find a slight problem, an explanation, and a solution, but that is another story...). I toodle up to my office to make sure I do not have any critical voice-mail or email that I should know about, and then I am off. I am home at 10:00. I actually get to see Nicholas (who should be in BED!). I feel very happy that our test went so well and so smoothly, and that the Relay/JOI sequence completed successfully. And there are only 28 days until Relay/JOI!!!
Tomorrow (Friday) I'll be in by 8am, because it is my turn to bring breakfast for our "Breakfast Club" (they are getting bagels... TOUGH LUCK if they expect anything fancier!). In the morning, we'll be sending commands to the real spacecraft to play back 40 more seconds from the tape recorder. There will be a meeting to sum up today's test, and then I will be in the control room to look at the data from the tape recorder playback (it takes several hours, including round-trip light-time, between the commands and the results, so I'll get a lot done in between). But that is yet another story.
Once again, we've made it to Jupiter.