The A330 ADIRU Dilemma

of unresolvable anomalous behaviour

 

Refs

a.    http://www2.cs.uidaho.edu/~krings/CS449/Notes.S07/449-07-28.pdf

b.    http://www.atsb.gov.au/newsroom/2009/release/2009_02.aspx

c.    link to Interim Report

d.    Further ADIRU incidents   (link)

ADIRU fault detection and tolerance

  Preface: (From ref B)

1. 07 Oct 08  "the A330-303 aircraft abruptly pitched nose-down twice while in normal cruise flight. The aircraft (registered VH-QPA) was being operated on a scheduled passenger service (QF72) from Singapore to Perth. At 1240, while cruising at 37,000 ft, the aircraft experienced two significant uncommanded pitch-down events while responding to various system failure indications. The crew made a PAN urgency broadcast to air traffic control and requested a clearance to divert to and track direct to Learmonth. After receiving advice from the cabin of several serious injuries, the crew declared a MAYDAY".

2.  27 Dec 08  "The second event occurred on 27 December 2008, when another Qantas A330-303 aircraft (VH-QPG) was on a flight from Perth to Singapore. In response to a similar pattern of fault messages as occurred on the 7 October 2008 flight, the crew completed the relevant procedures (introduced since the 7 October 2008 occurrence) to select both parts of the ADIRU off and returned to Perth for a normal landing."

3.  Two other occurrences have been identified involving similar anomalous ADIRU behaviour, but in neither case was there an in-flight upset. The ATSB also investigated an in-flight upset occurrence related to an ADIRU failure on a Boeing 777-200 aircraft, which occurred on 1 August 2005, 240 km north-west of Perth. The ADIRU on that aircraft was made by a different manufacturer and was of a different type to that on VH-QPA. However, it is noteworthy that: "On August 29, 2005, the U.S. FAA issued an emergency AD (2005-18-51) to prevent the operational program software (OPS) of the air data inertial reference unit (ADIRU) of the B777 from using data from failed sensors, which could result in anomalies of a.o. the fly-by-wire primary flight control and autopilot." This too was an interim fix, issued without a root-cause resolution.

  The "AD" in ADIRU stands for "air data".

The ADR part of the ADIRU supplies barometric altitude, speed, Mach, angle of attack (AOA) and temperature information to other aircraft systems. It receives air data from the aircraft’s pitot probes, static pressure ports, AOA sensors, and total air temperature probes. Air data modules convert pneumatic data from pitot and static sources into electrical signals for the ADIRUs.

Examination of Ref A indicates that each ADIRU (in fact all three) accepts air data inputs at "face value", whilst the system vets for electrical power supply, electronic, software-corruption and mechanical faults and failures. What are those raw air data inputs, and how could they induce a momentary error in one ADIRU that won't cause the other ADIRUs to vote it out as unreliable?

There are three theories and all involve the air sensor inputs. All three theories presuppose that there are independent static sources, at least two pitot supply lines and that each ADIRU's air inputs are pneumatically plumbed separately (i.e. from some point onwards to each ADIRU module's measuring transducer).

One theory involves water in the static lines that may (or may not) freeze in flight. If the lines run through vulnerable unpressurized and unheated areas, the water may freeze and partially (or fully) choke off any static pressure changes, locking in the ambient static pressure value at the time, and height, of the event. How can water be introduced into a static system? By aircraft washing, by tropical downpours, and by being sucked in as it flows over unplugged ports while the aircraft is parked on the ground, especially while the atmospheric pressure is increasing. Static systems have low-point drains through which trapped water can be removed, but this is done quite irregularly. Why would a static pressure error be confused with an AoA spike? The former quickly leads to the latter, and only the latter is recorded by the DFDR, leading to a false attribution of causation/blame.

Another involves air leaks that open and/or close as atmospheric and adiabatic heating cause the fuselage (and the lines and their joints) to expand and contract. Even protracted exposure of one side of the aircraft to the sun, or in-flight passage from day to night, might play a part in opening or closing leaks, as can fuselage expansion and contraction due to pressurization.

A third theory, just as plausible, involves the manometric effect of changes in static air pressure acting on water in the lines, particularly if that column of water is being "pushed uphill" by a suddenly increasing or decreasing ambient pressure. This can happen when an aircraft suddenly passes from one air mass to another (such as in transit through a jetstream; see the meteorological information below), when it changes pitch attitude (inducing flows), or when the volume of trapped water passes a critical level and the water flows into an adjacent or further section of line. More on that later. However, please reflect upon what an initial induced pitch change will do to any free-flowing water in the static system. As the pitch attitude steepens during an upset, the flow (and therefore its follow-up effect) will be magnified. Think of it in terms of trying to balance 100 lb of mercury in a large flat bowl: the further you stray from a level attitude, the more difficult the balancing task becomes. A small volume of water in static lines may have a similar "mercurial" effect upon the Flight Control Computers, with no system provision for the haphazard outcome to be "error-trapped" or disregarded. The magnitude of these small damped changes can be magnified by the sensitivity of the system. What may have happened in recent memory to increase system sensitivity to minuscule air pressure changes at height? RVSM certification and the system fine-tuning it required, perhaps?
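To put a rough scale on the manometric argument, the sketch below converts a vertical head of water in a static line into the equivalent altitude error. It assumes the International Standard Atmosphere, a purely hydrostatic water column, and a linearised pressure gradient; the function names and the sample water heads are illustrative and do not come from the ATSB material.

```python
import math

G = 9.80665          # m/s^2, standard gravity
R_AIR = 287.053      # J/(kg K), specific gas constant for dry air
RHO_WATER = 1000.0   # kg/m^3
FT_PER_M = 3.28084


def isa_pressure_pa(alt_m):
    """ISA static pressure, valid from sea level to 20 km."""
    if alt_m <= 11000.0:
        return 101325.0 * (1.0 - 2.25577e-5 * alt_m) ** 5.25588
    return 22632.1 * math.exp(-G * (alt_m - 11000.0) / (R_AIR * 216.65))


def isa_temperature_k(alt_m):
    """ISA temperature: linear lapse up to 11 km, constant above."""
    return max(288.15 - 0.0065 * alt_m, 216.65)


def altitude_error_ft(water_head_cm, alt_ft):
    """Altitude error equivalent to a vertical water column in a static line."""
    alt_m = alt_ft / FT_PER_M
    dp = RHO_WATER * G * water_head_cm / 100.0            # hydrostatic head, Pa
    rho_air = isa_pressure_pa(alt_m) / (R_AIR * isa_temperature_k(alt_m))
    return dp / (rho_air * G) * FT_PER_M                  # linearised dp/dh


# Hypothetical water heads, evaluated at the QF72 cruise level.
for head_cm in (0.5, 1.0, 2.0):
    err = altitude_error_ft(head_cm, 37000)
    print(f"{head_cm:.1f} cm of water at FL370 is worth about {err:.0f} ft")
```

Under these assumptions a single centimetre of water head at FL370 is worth a little under 100 ft of indicated altitude, several times what the same head is worth at sea level, because the air is so much thinner at height.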

Meteorological information

The Bureau of Meteorology provided the following information regarding the weather conditions prevailing at the location and time of the occurrence:

• A ridge extended over southern Western Australia with a surface trough developing along the north and west coasts during the day.

• A sharpening upper level trough extended from the Great Australian Bight through Perth and into the Indian Ocean.

• Some thunderstorm activity was recorded from about Karratha to just north of Learmonth, with cloud tops to about flight level (FL) 330 (33,000 ft).

• The axis of a 120 kt sub-tropical jet stream lay north-west to south-east between Learmonth and Carnarvon at FL 400 (40,000 ft). A shear line was developing south of the jet-stream as the upper trough developed.

• Data obtained at 0600 UTC (1400 local time) on 7 October 2008 showed a shear line associated with the upper level trough well south of the jet stream. There was no evidence of any penetration of cold air under the jet stream that could have led to increased vertical wind shear.

• Three model-generated forecasts predicted an area of moderate turbulence associated with the jet stream.

• At the time of the occurrence, the aircraft appeared to be in the vicinity of the sub-tropical jet stream, to the near north of a shear line and well south of any significant convection activity.

• Turbulence at a moderate or greater level was unlikely to have influenced the aircraft at the time of the occurrence.

But why would the triple-redundancy voting system overlook air input errors, whether transient or not, of sufficient magnitude to cause a flight upset? It may well be the same design philosophy that permits a radar altimeter's incorrect read-outs to be accepted as valid (as in the recent Turkish Airlines 737-800 accident at Schiphol). In that Amsterdam crash there was no rejection of the #1 RadAlt's sudden assertion that the aircraft was at -8 ft, and so the auto-throttle was motored to idle whilst the aircraft was still at around 2000 ft on the ILS glideslope.

From ref B: "immediately prior to the autopilot disconnect, one of the air data inertial reference units (ADIRUs) started providing erroneous data (spikes) on many parameters to other aircraft systems". So, assuming that a similarly "acceptable" spurious sensor input isn't ruled out by the triple-ADIRU voting, what could be the effect upon the aircraft's operation? Back to pneumatic basics.

The aircraft's motion through the air (its airspeed) is measured by the pitot system, which takes in dynamic plus static pressure via a forward-facing pitot tube. The static pressure S1 taken in at the pitot should be the same as the static pressure S2 measured at the aircraft's static ports, but that will only be true as long as the dedicated static lines are heated, drained and leak-free (that is, free of internal air leaks, ice and water). If the air from the static ports suddenly becomes locked at a value, what would happen in a simple pitot-static system? Firstly, the VSI/RCDI would read zero and the altimeter wouldn't register a climb or descent. And what of the ASI? The indicated airspeed is derived from the pitot-measured pressure (dynamic D plus static S1) minus the static-port pressure S2. In a perfect system S1 = S2, so IAS = D + S1 – S2 = D.

Following the second upset event, the crew continued to review the ECAM messages and other flight deck indications. The IR1 FAULT light and the PRIM 3 FAULT light on the overhead panel were illuminated. There were no other fault lights illuminated. Messages associated with these faults were again displayed on the ECAM, along with several other messages. The crew reported that the messages were constantly scrolling, and they could not effectively interact with the ECAM to action and/or clear the messages. The crew reported that master caution chimes associated with the messages were regularly occurring, and they continued to receive aural stall warnings.

Maintenance watch suggested that the crew could consider switching PRIM 3 off, and this action was carried out. This action did not appear to have any effect on the scrolling ECAM messages, or the erratic airspeed and altitude information.

But what happens to the IAS when S1 – S2 doesn't sum to zero? In level flight, not much. But in a climb? With S2 (locked at its lower-altitude value) growing ever greater than S1, the airspeed indication will quickly, over a couple of thousand feet, wind back to zero (as D + S1 – S2 sums to zero or a negative value).

Conversely, in a descent, the ASI will over-read massively. (Justification, from ref B: "An overspeed parameter was recorded by the FDR. The first overspeed warning occurred at 0440:54 UTC and numerous such warnings were recorded from this time until 0502:01 UTC when the aircraft was descending through an altitude of 25,400 ft.")
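The climb and descent cases can be sketched numerically. The example below is a simplified model only: it assumes an ISA atmosphere, the standard subsonic calibrated-airspeed relation, and a static line that is completely locked at the pressure of one altitude. The function names, speeds and altitudes are illustrative, and a real air data computer applies corrections that are ignored here.

```python
import math

P0 = 101325.0    # Pa, ISA sea-level pressure
A0_KT = 661.47   # kt, ISA sea-level speed of sound
G = 9.80665
R_AIR = 287.053


def isa_pressure_pa(alt_ft):
    """ISA static pressure, valid from sea level to about 65,000 ft."""
    alt_m = alt_ft * 0.3048
    if alt_m <= 11000.0:
        return P0 * (1.0 - 2.25577e-5 * alt_m) ** 5.25588
    return 22632.1 * math.exp(-G * (alt_m - 11000.0) / (R_AIR * 216.65))


def impact_pressure_pa(cas_kt):
    """Dynamic (impact) pressure D for a given calibrated airspeed."""
    return P0 * ((1.0 + 0.2 * (cas_kt / A0_KT) ** 2) ** 3.5 - 1.0)


def indicated_kt(pitot_total_pa, static_line_pa):
    """Airspeed derived from (D + S1) - S2; a non-positive result reads zero."""
    qc = pitot_total_pa - static_line_pa
    if qc <= 0.0:
        return 0.0
    return A0_KT * math.sqrt(5.0 * ((qc / P0 + 1.0) ** (2.0 / 7.0) - 1.0))


def asi_reading(true_cas_kt, actual_alt_ft, locked_at_ft):
    d = impact_pressure_pa(true_cas_kt)
    s1 = isa_pressure_pa(actual_alt_ft)   # static component arriving via the pitot
    s2 = isa_pressure_pa(locked_at_ft)    # pressure trapped in the static line
    return indicated_kt(d + s1, s2)


# Hypothetical case: static line locked at FL370, true CAS held at 270 kt.
for alt_ft in (35000, 37000, 39000):
    print(alt_ft, "ft:", round(asi_reading(270, alt_ft, 37000)), "kt indicated")
```

With the static line locked, the reading falls below the true value in the climb and rises above it in the descent. How quickly it winds all the way back to zero depends on D and on the local pressure gradient, so in this model it happens over far less height at low speed and low altitude than in high-speed cruise.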

What about the effect of even a minor S1/S2 disparity (due to one of theories 1 to 3 and a transit of an area of horizontal pressure change, such as a jetstream) upon a barometrically based autopilot that's maintaining an FMS-dictated barometric level (and a set speed)? Will it suddenly seek to correct, and would this lead to the plunge/plummet that was seen on the 07 Oct Qantas A330 flight?

From ref B: "some of the spikes in angle of attack were not filtered by the aircraft's flight control computers." Chicken and egg? What came first? I'd suggest it was the autopilot seeking its proper barometric level, and that this then led to rapid AoA changes and fluctuations.

The flight crew described the first pitch-down movement as very abrupt, but smooth. It did not have the characteristics of a typical turbulence-related event and the aircraft’s movement was solely in the pitching plane. They did not detect any movement in the rolling plane.

From ref B: "The crew were also receiving aural stall warning indications at this time, and the airspeed and altitude indications on the captain’s primary flight display (PFD) were also fluctuating."

To summarize, the contention is that the ADIRU system logic is attuned to accepting the aircraft's various pneumatic inputs as gospel, and not to rejecting or questioning them as long as they are credible and within (or quickly regain) acceptable ranges. However, the rate of change of these parameters may not be so selectively filtered. In some circumstances this transducer-driven complicity may be sufficient to invite flight control mayhem, as the Primary Flight Control Computers accept momentarily invalid or erratic pneumatic sensor inputs without any cross-checking verification and react per their hair-trigger algorithms. Although many system fault messages were generated or precipitated by the 07 Oct event, many were found to be spurious and, to date, extensive trouble-shooting has failed to produce a root cause. System automation and integration to this extent needs only one unvetted wild card (such as trapped water) to create a downstream maelstrom of confusion. The recent A320 accident off Perpignan is further evidence of those possibilities.
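The distinction between a range check and a rate-of-change check can be illustrated with a few lines of code. This is a generic plausibility filter, not the A330's actual logic: the limits, the sample interval and the angle-of-attack values are all hypothetical.

```python
def classify(values, dt_s, lo, hi, max_rate):
    """Label each sample 'ok', 'range' (outside limits) or 'rate' (moved too fast).

    The rate is measured against the last accepted sample, over the time
    elapsed since that sample, so one bad reading does not taint the next.
    """
    labels = []
    last_good = None
    elapsed = 0.0
    for v in values:
        elapsed += dt_s
        if not lo <= v <= hi:
            labels.append("range")
        elif last_good is not None and abs(v - last_good) / elapsed > max_rate:
            labels.append("rate")
        else:
            labels.append("ok")
            last_good, elapsed = v, 0.0
    return labels


# Hypothetical AoA trace in degrees with one in-range spike.
aoa = [2.1, 2.2, 12.0, 2.3, 2.2]

# Range check only (rate limit effectively disabled): the spike passes.
print(classify(aoa, dt_s=0.05, lo=-10.0, hi=30.0, max_rate=float("inf")))

# Range plus rate check: the spike is flagged.
print(classify(aoa, dt_s=0.05, lo=-10.0, hi=30.0, max_rate=20.0))
```

A 12-degree reading is perfectly credible as a value, which is why the range check alone lets it through. Only the rate check notices that the parameter could not physically have moved that far in one sample interval.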

Ref B: "For most of the ADIRU parameters, the PRIMs obtained three different values of the same parameter. Each value came from a different sensor and was processed by a different ADIRU. The PRIMs compared the value of the parameter coming from each ADIRU. If the value of any of the parameters differed from the median (middle) value by more than a threshold amount for more than a set period of time, then the relevant part (that is, ADR or IR) of the associated ADIRU would no longer be used by the PRIMs."
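The comparison logic quoted above can be sketched as follows. This is a minimal reading of the quoted description only: the threshold, the confirmation period (expressed here as a number of consecutive samples) and the sample values are hypothetical, and the real PRIM implementation is not given in the source.

```python
from statistics import median


def first_rejections(samples, threshold, confirm_steps):
    """Return, per source, the step at which it is rejected (None if never).

    samples: one tuple of readings per time step, one reading per source.
    A source is rejected once it has differed from the median of the
    still-valid sources by more than `threshold` for `confirm_steps`
    consecutive steps.
    """
    n = len(samples[0])
    over = [0] * n              # consecutive out-of-tolerance steps per source
    rejected = [None] * n
    for step, values in enumerate(samples):
        live = [v for v, r in zip(values, rejected) if r is None]
        mid = median(live)
        for i, v in enumerate(values):
            if rejected[i] is not None:
                continue
            over[i] = over[i] + 1 if abs(v - mid) > threshold else 0
            if over[i] >= confirm_steps:
                rejected[i] = step
    return rejected


good = (2.1, 2.0, 2.2)
bad = (50.0, 2.0, 2.2)

# Source 0 stuck at a wild value: voted out once the period expires.
stuck = [bad] * 10
print(first_rejections(stuck, threshold=1.0, confirm_steps=5))

# Source 0 spiking for two steps in every five: never voted out.
spiky = ([bad] * 2 + [good] * 3) * 4
print(first_rejections(spiky, threshold=1.0, confirm_steps=5))
```

In this sketch a persistent fault is rejected, while an intermittent one that keeps returning to a credible value resets the counter each time and so stays in the vote indefinitely.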

 
One reason why this rejection (as described in the quote above) may not occur is that, even though the end result is dynamic (i.e. the flight upsets), the actual static error induced by manometric water flow is akin to hysteresis: its initial discrepancy is large enough to disrupt, but very soon thereafter the parameter reaches an approximate system equilibrium again, with little remnant pressure disparity. It is an upset applecart where no apples actually hit the ground. But in a system galvanized to react to changing environmental conditions, the Primary Flight Control Computers respond readily to lagging discrepant parameters and, ergo, you have an "upset".