Challenger shuttle disaster: Predicting O-ring failure using a Regression Model

CodeSerra
Dec 17, 2020


Motivation

On January 28, 1986, the Space Shuttle Challenger (OV-099) broke apart 73 seconds into its flight, a catastrophe of the highest order that took the lives of all seven crew members aboard.

The investigation revealed the root cause of this tragic accident to be the failure of an O-ring seal in the shuttle’s right solid-fuel rocket booster. The seal was designed to keep hot gas from leaking out of the booster, but in the frigid launch-day temperatures it stiffened and failed, letting hot gas escape and burn into the adjacent external fuel tank. The fuel tank collapsed and tore apart, and the resulting flood of liquid oxygen and hydrogen created the huge fireball believed by many to be an explosion.

Problem Statement:

Predict the number of O-rings that experience thermal distress on a flight at 31 degrees F, given data on the previous 23 shuttle flights.

Attribute Information:

1. Number of O-rings at risk on a given flight

2. Number experiencing thermal distress

3. Launch temperature (degrees F)

4. Leak-check pressure (psi)

5. Temporal order of flight

The dataset source can be found here and here.
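To make the attribute list concrete, here is a minimal sketch of reading such rows into named records. It assumes the raw file is whitespace-separated with one flight per line and the columns in the order listed above; apart from RingsInDistress, the column names are my own illustrative choices, and the two sample rows are made-up values, not the real data.

```python
# Column order follows the attribute list; names other than
# RingsInDistress are hypothetical labels chosen for this sketch.
COLUMNS = ["RingsAtRisk", "RingsInDistress", "Temperature", "Pressure", "FlightOrder"]

def parse_rows(text):
    """Turn whitespace-separated integer rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        rows.append(dict(zip(COLUMNS, map(int, parts))))
    return rows

# Made-up sample rows for illustration only (assumed file layout).
sample = """\
6 0 65 50 1
6 1 60 100 2
"""

rows = parse_rows(sample)
print(rows[0])
```

With pandas, the same idea would typically be a single `read_csv` call with a whitespace separator and explicit column names.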

Data cleaning/Preparation

Let’s look at the dataset:

The given dataset doesn’t have any missing values. Next we’ll drop the columns whose values do not vary across flights, to get the final data frame:

Our target variable is RingsInDistress, which is the number of rings that failed (0, 1 or 2).
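The cleaning step can be sketched in a few lines. This is a dependency-free illustration, assuming that "dropping columns" means removing columns that hold the same value on every flight; the rows are toy values and all column names except RingsInDistress are hypothetical.

```python
# Toy rows for illustration only, NOT the real dataset.
rows = [
    {"RingsAtRisk": 6, "RingsInDistress": 0, "Temperature": 65, "Pressure": 50},
    {"RingsAtRisk": 6, "RingsInDistress": 1, "Temperature": 60, "Pressure": 100},
    {"RingsAtRisk": 6, "RingsInDistress": 2, "Temperature": 55, "Pressure": 200},
]

def count_missing(rows):
    """Count None values per column."""
    return {col: sum(r[col] is None for r in rows) for col in rows[0]}

def drop_constant_columns(rows):
    """Remove columns that hold the same value in every row."""
    keep = [c for c in rows[0] if len({r[c] for r in rows}) > 1]
    return [{c: r[c] for c in keep} for r in rows]

missing = count_missing(rows)
cleaned = drop_constant_columns(rows)
print(missing)
print(list(cleaned[0].keys()))
```

In pandas, `df.isna().sum()` and `df.loc[:, df.nunique() > 1]` express the same two checks.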

Exploratory data analysis

Let’s look at the distribution plots for all features:

Plots of RingsInDistress versus Temperature and Pressure are below:

Let’s see the trend among features using a pair plot:

pair plots for dependent features

We can further explore the correlations between features as heatmaps:

heat map

RingsInDistress is strongly correlated with Launch Temperature and Leak-Check Pressure, while there is less correlation between temperature and pressure themselves.
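Each cell of such a heatmap is a pairwise correlation coefficient. As a rough sketch, assuming the heatmap shows Pearson correlation (the usual default), one cell can be computed by hand as below. The numbers are toy values, not the real flights.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Toy values for illustration only: colder launches, more distress.
temperature = [53, 57, 63, 70, 75, 81]
rings_in_distress = [2, 1, 1, 0, 0, 0]

r = pearson(temperature, rings_in_distress)
print(round(r, 3))
```

A value near -1 or +1 signals a strong linear relationship; the sign tells the direction, so a negative value here means distress goes down as temperature goes up.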

Predictions Using Logistic Regression

After separating the data into a training set and a test set, we run MinMaxScaler to normalize the predictor features. We then fit a Binomial Regression.
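For readers unfamiliar with min-max scaling, here is a small sketch of what it does. It assumes the scaling range is learned from the training set only and then reused on the test set, which is the usual practice but is my assumption about the workflow; the temperatures are toy values.

```python
def fit_min_max(column):
    """Learn the scaling range from the training column."""
    return min(column), max(column)

def apply_min_max(column, lo, hi):
    """Map values so that lo -> 0.0 and hi -> 1.0."""
    span = hi - lo
    if span == 0:  # constant column: nothing to scale
        return [0.0 for _ in column]
    return [(v - lo) / span for v in column]

# Toy temperatures for illustration only.
train_temp = [53, 66, 70, 81]
test_temp = [57, 75]

lo, hi = fit_min_max(train_temp)
train_scaled = apply_min_max(train_temp, lo, hi)
test_scaled = apply_min_max(test_temp, lo, hi)
print(train_scaled)
print(test_scaled)
```

Note that in this sketch a value below the training minimum, such as the 31 degrees F in the problem statement, maps to a negative number, so predicting at that temperature means extrapolating outside the range the model was trained on.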

Linear Regression stats report
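To show what the regression is estimating, here is a dependency-free sketch of a one-feature logistic model fitted by gradient descent. It is not the fitting routine used for the report above. It assumes the target has been reduced to a binary label (1 = at least one ring in distress), which is my simplification, and the data points are toy values on a scaled temperature axis.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit p = sigmoid(b0 + b1 * x) by batch gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # prediction error for this point
            grad0 += err
            grad1 += err * x
        b0 -= lr * grad0 / n
        b1 -= lr * grad1 / n
    return b0, b1

# Toy data for illustration only: scaled temperature and a binary label
# (assumption: 1 means at least one ring in distress).
x_train = [0.0, 0.15, 0.35, 0.45, 0.5, 0.6, 0.8, 1.0]
y_train = [1, 1, 1, 0, 0, 1, 0, 0]

b0, b1 = fit_logistic(x_train, y_train)
p_cold = sigmoid(b0 + b1 * 0.0)  # coldest scaled temperature
p_warm = sigmoid(b0 + b1 * 1.0)  # warmest scaled temperature
print(b0, b1, p_cold, p_warm)
```

A negative slope `b1` means the estimated probability of distress falls as temperature rises, which is the pattern the exploratory plots suggest.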

Let us find the optimum cutoff probability by plotting sensitivity, accuracy and precision:

plot sensitivity, accuracy and precision

As can be seen, the optimal cutoff is somewhere around 0.2.
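The three curves come from recomputing the metrics at each candidate cutoff. A minimal sketch of that sweep follows; the labels and predicted probabilities are toy values rather than the model’s actual output, and the function name is my own.

```python
def metrics_at_cutoff(y_true, probs, cutoff):
    """Compute accuracy, sensitivity and precision at one probability cutoff."""
    preds = [1 if p >= cutoff else 0 for p in probs]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }

# Toy labels and probabilities for illustration only.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
probs = [0.8, 0.1, 0.35, 0.25, 0.05, 0.15, 0.4, 0.3]

# Sweep cutoffs 0.1, 0.2, ..., 0.9 and collect the metrics for plotting.
sweep = {c / 10: metrics_at_cutoff(y_true, probs, c / 10) for c in range(1, 10)}
for cutoff, m in sweep.items():
    print(cutoff, m)
```

Lowering the cutoff raises sensitivity at the cost of precision, so the chosen cutoff is usually near the point where the curves cross.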

With this information, we can make predictions on our test set as below:
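A minimal sketch of this step: apply the chosen cutoff to the predicted probabilities and compare the resulting labels with the true ones. The probabilities and labels are toy values, not the actual test set, and the helper names are hypothetical.

```python
def predict_labels(probs, cutoff=0.2):
    """Label a flight 1 (distress) when its probability reaches the cutoff."""
    return [1 if p >= cutoff else 0 for p in probs]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Toy test-set values for illustration only.
test_probs = [0.45, 0.10, 0.25, 0.05, 0.30, 0.15, 0.60]
test_labels = [1, 0, 0, 0, 1, 1, 0]

test_preds = predict_labels(test_probs, cutoff=0.2)
test_accuracy = accuracy(test_labels, test_preds)
print(test_preds, test_accuracy)
```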

Conclusion:

With logistic regression we can predict O-ring failure with 71.4% accuracy on the test set.

In the next segment, we’ll discuss how this can be further improved.

My workbook version can be found here, where I have explored prediction with Linear Regression and Logistic Regression.
