Bayesian Analysis for the Generalized Rayleigh Distribution

 

Shankar Kumar Shrestha1, Vijay Kumar2

1Public Youth Campus, Tribhuvan University, Kathmandu, NEPAL.

2Department of Mathematics and Statistics, DDU Gorakhpur University, Gorakhpur-273009, Uttar Pradesh, INDIA.

Corresponding Addresses:

1[email protected], 2[email protected]

Research Article

 


Abstract: In this paper, the estimation problem for the generalized Rayleigh distribution is considered. The parameters are estimated using likelihood-based inferential procedures, classical as well as Bayesian. We have computed the MLEs and the Bayes estimates under gamma priors, along with their asymptotic confidence, bootstrap and HPD intervals. The Bayesian estimates of the parameters of the generalized Rayleigh distribution are obtained using the Markov chain Monte Carlo (MCMC) simulation method. We have obtained probability intervals for the parameters and for the hazard and reliability functions. The posterior predictive check method has been applied for evaluating the model fit. We have also discussed Bayesian estimation and prediction for Type-II censored data. All the computations are performed in OpenBUGS and R software. A real data set is analyzed for illustration of the proposed inferential procedures.

 

Keywords:  Generalized Rayleigh distribution, Markov chain Monte Carlo, Bayesian estimation, Bootstrap, OpenBUGS.

 

1.  Introduction

     The Rayleigh distribution is a special case of the Weibull distribution and provides a population model useful in several areas of statistics, including life testing and reliability. Its failure rate is a linear function of time; this linearly increasing failure rate makes it a suitable model for the lifetime of components that age rapidly with time.

    In recent years, new classes of models have been proposed based on modifications of existing models. Several exponentiated distributions have been studied quite extensively since the work of Mudholkar and Srivastava (1993) on the exponentiated Weibull distribution. Gupta and Kundu (1999) introduced the exponentiated form of the exponential distribution and named it the generalized exponential distribution. Along the same lines, Surles and Padgett (2005) introduced the two-parameter Burr Type X distribution, which has been named the generalized Rayleigh distribution, Kundu and Raqab (2005). Nadarajah and Kotz (2006) proposed several exponentiated-type distributions extending the Fréchet, gamma, Gumbel and Weibull distributions. The two-parameter generalized Rayleigh distribution is a member of the exponentiated Weibull family originally proposed by Mudholkar and Srivastava (1993). Kundu and Raqab (2005) and Raqab and Kundu (2006) have discussed different methods of estimation of the parameters and other properties of the generalized Rayleigh distribution. Raqab and Madi (2009) studied the exponentiated Rayleigh distribution in a Bayesian framework. In fact, there exist many density functions that may be considered generalizations of the Rayleigh distribution.

    Recently, two versions of the generalized Rayleigh distribution (GRD) have appeared in the literature, Voda (2007, 2009). The probability density function of the first one has the following form:

$$f(x;\alpha,\lambda)=\frac{2\,\lambda^{\alpha+1}}{\Gamma(\alpha+1)}\,x^{2\alpha+1}e^{-\lambda x^{2}},\qquad x>0,\ \alpha>-1,\ \lambda>0. \qquad (1)$$

With $\alpha=0$ we obtain the usual Rayleigh probability density function (pdf).

            Initially, this family was proposed by Voda (1976a, 1976b). We shall call it the generalized Rayleigh distribution and denote it by $GR(\alpha,\lambda)$. The two-parameter generalized Rayleigh distribution provides a rich family of specific distributions that have widespread application in many disciplines. Members of this family include the Rayleigh distribution itself, the Half-Normal distribution, the Maxwell distribution, and the Chi-distribution.

The main objective of this paper is to explore the inferential procedures, classical as well as Bayesian, for the generalized Rayleigh distributions.

   The Rayleigh distribution itself has had many applications in life testing. Clearly, the GRD family is quite broad and lends itself to widespread application. In the case of reliability modeling, the GRD family is more flexible than the widely used Weibull model, as the latter includes only the Rayleigh distribution as a special case, while the GRD also encompasses the Maxwell and Chi-distributions. In addition, other members of the GRD family, such as the Half-Normal distribution, are quite widely applied in the social sciences and elsewhere.

   It is to be noted that most of the cited literature is confined to classical developments, and systematic Bayesian results are rarely seen for the generalized Rayleigh distribution. Bayesian methods are equally applicable to small sample sizes and to censored data problems, two common features of reliability data analyses.

   The advent of Markov chain Monte Carlo (MCMC) sampling has caused Bayesian statistics to flourish. The freely available software package known as Bayesian inference Using Gibbs Sampling (BUGS) has been at the forefront of this proliferation since the mid-1990s. More recent advances in this software, leading first to WinBUGS and now to the open-source version OpenBUGS, Thomas et al. (2006), Thomas (2010) and Lunn et al. (2013), including interfaces to the open-source statistical package R (R Development Core Team, 2013), have brought MCMC to a wider audience. We shall use the OpenBUGS and R software in the present study.

   For Bayesian analysis, we also need to assume a prior distribution for the model parameters. In this paper, the Bayesian analysis has been performed under different loss functions assuming independent priors for the parameters.

   A major difficulty in the implementation of the Bayesian procedure is obtaining the posterior distribution. The process often requires integrations that are very difficult to evaluate analytically, not only for high-dimensional complex models but even for low-dimensional ones. In such situations, Markov chain Monte Carlo (MCMC) methods are very useful to simulate deviates from the posterior density and produce good approximate results.

   The rest of the paper is organized as follows. The generalized Rayleigh distribution and its properties are discussed in Section 2. The point estimation procedures for the parameters of the considered model under classical set-up and the confidence/bootstrap intervals have been constructed in Section 3. In Section 4, we have developed the Bayesian estimation procedure under independent gamma priors for the parameters. To check the applicability of the proposed methodologies, a real data set has been analysed in Section 5. In this section, the ML estimators of the parameters, approximate confidence intervals are presented. We cover Bayesian analysis using the MCMC simulation in Section 6. In this section, the Bayes estimates and credible intervals of parameters, hazard and reliability functions are presented.  In Section 7 we have applied the predictive check method in order to give an assessment of the performance of the model for the given data. We have addressed the censored data problem in Section 8. Finally, the conclusions have been given in Section 9.

 

2. The model

 

The cumulative distribution function (cdf) of the generalized Rayleigh distribution (GRD), corresponding to the pdf given in (1), is given by

$$F(x;\alpha,\lambda)=\frac{1}{\Gamma(\alpha+1)}\int_{0}^{\lambda x^{2}}t^{\alpha}e^{-t}\,dt,\qquad x>0, \qquad (2)$$

where $\alpha>-1$ and $\lambda>0$ are the parameters.

   Define the incomplete gamma function as

$$I_{y}(a)=\frac{1}{\Gamma(a)}\int_{0}^{y}t^{a-1}e^{-t}\,dt,\qquad y>0,\ a>0, \qquad (3)$$

then we can write

$$F(x;\alpha,\lambda)=I_{\lambda x^{2}}(\alpha+1). \qquad (4)$$

This family includes several important probability distributions as special cases.

  • For example, for $\alpha=0$ and $\lambda=1/(2\sigma^{2})$ we obtain the one-parameter Rayleigh distribution with density function

$$f(x;\sigma)=\frac{x}{\sigma^{2}}\,e^{-x^{2}/(2\sigma^{2})},\qquad x>0.$$

  • For $\alpha=1/2$ and $\lambda=1/(2\sigma^{2})$ we obtain the one-parameter Maxwell distribution with the density function

$$f(x;\sigma)=\sqrt{\frac{2}{\pi}}\,\frac{x^{2}}{\sigma^{3}}\,e^{-x^{2}/(2\sigma^{2})},\qquad x>0.$$

  • For $\alpha=a/2-1$ and $\lambda=1/2$ we obtain the Chi-distribution with $a$ degrees of freedom, whose density function is

$$f(x;a)=\frac{x^{a-1}e^{-x^{2}/2}}{2^{a/2-1}\,\Gamma(a/2)},\qquad x>0,$$

    where $a\in N$ and $N$ denotes the set of natural numbers.

  

   The reliability/survival function is

$$R(t;\alpha,\lambda)=1-I_{\lambda t^{2}}(\alpha+1),\qquad t>0. \qquad (5)$$

   The hazard rate function is

$$h(t;\alpha,\lambda)=\frac{f(t;\alpha,\lambda)}{R(t;\alpha,\lambda)}=\frac{2\,\lambda^{\alpha+1}t^{2\alpha+1}e^{-\lambda t^{2}}}{\Gamma(\alpha+1)\left\{1-I_{\lambda t^{2}}(\alpha+1)\right\}},\qquad t>0. \qquad (6)$$

   The quantile function is given by

$$x_{p}=\left\{\frac{I_{p}^{-1}(\alpha+1)}{\lambda}\right\}^{1/2},\qquad 0<p<1, \qquad (7)$$

where $I_{p}^{-1}(\alpha+1)$ denotes the inverse of the incomplete gamma function defined in (3).

   A random deviate can be generated from $GR(\alpha,\lambda)$ by

$$x=\left\{\frac{I_{u}^{-1}(\alpha+1)}{\lambda}\right\}^{1/2},$$

where $u$ has the $U(0,1)$ distribution.

The $r$th moment of $X\sim GR(\alpha,\lambda)$ is given by

$$E(X^{r})=\frac{\Gamma(\alpha+1+r/2)}{\lambda^{r/2}\,\Gamma(\alpha+1)}.$$

Therefore,

$$E(X)=\frac{\Gamma(\alpha+3/2)}{\sqrt{\lambda}\,\Gamma(\alpha+1)}\quad\text{and}\quad Var(X)=\frac{\alpha+1}{\lambda}-\left\{\frac{\Gamma(\alpha+3/2)}{\sqrt{\lambda}\,\Gamma(\alpha+1)}\right\}^{2}.$$

The GRD is unimodal and asymmetric. The mode is

$$x_{\text{mode}}=\left(\frac{2\alpha+1}{2\lambda}\right)^{1/2}.$$
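For reference, a minimal R sketch of the pdf, cdf, reliability, hazard, quantile and random-number functions implied by (1)-(7) is given below. The function names (dgenray, pgenray, etc.) are illustrative assumptions, not functions of the paper or of the reliaR package cited later; the key fact used is that $\lambda X^{2}\sim \mathrm{Gamma}(\alpha+1,1)$, so the cdf and quantiles reduce to pgamma and qgamma.

# Sketch of GR(alpha, lambda) helper functions (hypothetical names)
dgenray <- function(x, alpha, lambda) {    # pdf, equation (1)
  2 * lambda^(alpha + 1) * x^(2 * alpha + 1) * exp(-lambda * x^2) / gamma(alpha + 1)
}
pgenray <- function(x, alpha, lambda) {    # cdf, equation (4): incomplete gamma ratio
  pgamma(lambda * x^2, shape = alpha + 1)
}
sgenray <- function(t, alpha, lambda) {    # reliability, equation (5)
  1 - pgenray(t, alpha, lambda)
}
hgenray <- function(t, alpha, lambda) {    # hazard, equation (6)
  dgenray(t, alpha, lambda) / sgenray(t, alpha, lambda)
}
qgenray <- function(p, alpha, lambda) {    # quantile, equation (7)
  sqrt(qgamma(p, shape = alpha + 1) / lambda)
}
rgenray <- function(n, alpha, lambda) {    # random deviates via lambda*X^2 ~ Gamma(alpha + 1, 1)
  sqrt(rgamma(n, shape = alpha + 1) / lambda)
}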

3. Maximum likelihood estimation (MLE)

   The nice properties, such as consistency, asymptotic unbiasedness, asymptotic efficiency, and asymptotic normality, make maximum likelihood estimation the most popular and attractive estimation method. In this section, we discuss the maximum likelihood estimators (MLEs) of the parameters of the distribution and use their asymptotic properties to obtain approximate confidence intervals based on the MLEs.

   Let $\underline{x}=(x_{1},\ldots,x_{n})$ be a sample of size $n$ from $GR(\alpha,\lambda)$; then the log-likelihood function $\ell(\alpha,\lambda\mid\underline{x})$ can be written as

$$\ell(\alpha,\lambda)=n\log 2+n(\alpha+1)\log\lambda-n\log\Gamma(\alpha+1)+(2\alpha+1)\sum_{i=1}^{n}\log x_{i}-\lambda\sum_{i=1}^{n}x_{i}^{2}. \qquad (8)$$

   To obtain the MLEs of $\alpha$ and $\lambda$, we can maximize (8) directly with respect to $\alpha$ and $\lambda$, or we can solve the following system of non-linear equations:

$$\frac{\partial\ell}{\partial\alpha}=n\log\lambda-n\,\psi(\alpha+1)+2\sum_{i=1}^{n}\log x_{i}=0,$$

$$\frac{\partial\ell}{\partial\lambda}=\frac{n(\alpha+1)}{\lambda}-\sum_{i=1}^{n}x_{i}^{2}=0, \qquad (9)$$

   where $\psi(\cdot)$ is the digamma function.

   Note that the MLEs $\hat\alpha$ and $\hat\lambda$ of $\alpha$ and $\lambda$ cannot be obtained in closed form. Numerical iteration techniques, such as the Newton-Raphson algorithm, are therefore adopted to solve these equations.
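As an illustration, the direct maximization of (8) can be sketched in R as follows. This is only a sketch under stated assumptions: the data vector is assumed to be named x, the starting values are arbitrary, and optim (a quasi-Newton routine) is used here in place of the Newton-Raphson iterations described above.

# Negative log-likelihood of GR(alpha, lambda), equation (8)
negloglik <- function(par, x) {
  alpha <- par[1]; lambda <- par[2]
  n <- length(x)
  -(n * log(2) + n * (alpha + 1) * log(lambda) - n * lgamma(alpha + 1) +
      (2 * alpha + 1) * sum(log(x)) - lambda * sum(x^2))
}
# Box-constrained maximization keeping alpha > -1 and lambda > 0
fit <- optim(c(0.5, 0.5), negloglik, x = x, method = "L-BFGS-B",
             lower = c(-0.99, 1e-6), hessian = TRUE)
mle <- fit$par   # (alpha.hat, lambda.hat)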

 

3.1 Approximate confidence intervals

The exact distribution of the MLEs cannot be obtained explicitly. Therefore, the asymptotic properties of the MLEs can be used to construct confidence intervals for the parameters, Lawless (2003). The asymptotic inference for the parameter vector $\theta=(\alpha,\lambda)^{T}$ can be based on the normal approximation of the MLE $\hat\theta=(\hat\alpha,\hat\lambda)^{T}$. Under some regularity conditions, $\hat\theta$ is approximately $N_{2}\!\left(\theta,\,I^{-1}(\theta)\right)$ for large $n$, where $I(\theta)$ is the expected (Fisher) information matrix. The asymptotic behaviour remains valid if $I(\theta)$ is replaced by the observed information matrix evaluated at $\hat\theta$, i.e. $O(\hat\alpha,\hat\lambda)$. We have

$$O(\alpha,\lambda)=-\begin{pmatrix}\dfrac{\partial^{2}\ell}{\partial\alpha^{2}} & \dfrac{\partial^{2}\ell}{\partial\alpha\,\partial\lambda}\\[2mm] \dfrac{\partial^{2}\ell}{\partial\lambda\,\partial\alpha} & \dfrac{\partial^{2}\ell}{\partial\lambda^{2}}\end{pmatrix},$$

whose elements are

$$\frac{\partial^{2}\ell}{\partial\alpha^{2}}=-n\,\psi'(\alpha+1);\qquad \frac{\partial^{2}\ell}{\partial\alpha\,\partial\lambda}=\frac{n}{\lambda}\qquad\text{and}\qquad \frac{\partial^{2}\ell}{\partial\lambda^{2}}=-\frac{n(\alpha+1)}{\lambda^{2}},$$

where $\psi'(\cdot)$ is the trigamma function.

We can use the normal approximation of the maximum likelihood estimator of $\theta$ for constructing approximate confidence intervals as well as for testing hypotheses on the model parameters. For example, $100(1-\gamma)\%$ asymptotic confidence intervals for $\alpha$ and $\lambda$ are given, respectively, by $\hat\alpha\mp z_{\gamma/2}\,SE(\hat\alpha)$ and $\hat\lambda\mp z_{\gamma/2}\,SE(\hat\lambda)$, where $SE(\cdot)$ is the square root of the corresponding diagonal element of $O^{-1}(\hat\alpha,\hat\lambda)$ and $z_{\gamma/2}$ is the upper $(\gamma/2)$th quantile of the standard normal distribution.
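Continuing the R sketch above, the observed information matrix can be approximated by the numerical Hessian of the negative log-likelihood returned by optim, from which the standard errors and 95% intervals of the kind reported in Table 1 follow; again, this is an illustrative sketch rather than the authors' exact code.

# Observed information = Hessian of the negative log-likelihood at the MLE
obs.info <- fit$hessian
se <- sqrt(diag(solve(obs.info)))       # standard errors of (alpha.hat, lambda.hat)
z  <- qnorm(0.975)
ci <- cbind(lower = mle - z * se, upper = mle + z * se)
rownames(ci) <- c("alpha", "lambda")
ci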

 

3.2 Bootstrap confidence intervals

        In this section we propose confidence intervals based on bootstrapping. Bootstrap methods are widely used to improve estimators or to build confidence intervals for the parameters. We have used the percentile bootstrap (Boot-p) method, proposed by Efron and Tibshirani (1986), to construct confidence intervals for the parameters as well as for the reliability and hazard functions. To construct the Boot-p confidence interval, we proceed as follows, Soliman et al. (2012):

Step 1. From the original data $\underline{x}=(x_{1},\ldots,x_{n})$ compute the ML estimates $\hat\alpha$ and $\hat\lambda$ of the parameters $\alpha$ and $\lambda$ by solving the nonlinear equations (9).

Step 2. Generate a bootstrap sample $\underline{x}^{*}$ of size $n$ from (1) using $\hat\alpha$ and $\hat\lambda$. As in Step 1, compute the bootstrap estimates $\hat\alpha^{*}$ and $\hat\lambda^{*}$ of $\alpha$ and $\lambda$ from the bootstrap sample.

Step 3. Repeat Step 2 $B$ times to obtain the bootstrap estimates $\hat\alpha^{*}_{1},\ldots,\hat\alpha^{*}_{B}$ and $\hat\lambda^{*}_{1},\ldots,\hat\lambda^{*}_{B}$.

Step 4. Let $\hat\alpha^{*}_{(1)}\le\hat\alpha^{*}_{(2)}\le\cdots\le\hat\alpha^{*}_{(B)}$ be the ordered values of the estimates $\hat\alpha^{*}_{b}$. The two-sided $100(1-\gamma)\%$ boot-p confidence interval (BCI) for $\alpha$ is given by $\left(\hat\alpha^{*}_{([B\gamma/2])},\ \hat\alpha^{*}_{([B(1-\gamma/2)])}\right)$, where $[z]$ denotes the largest integer less than or equal to $z$. Similarly, we can obtain the BCI for $\lambda$. An R sketch of this procedure is given below.
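The following is a minimal R sketch of the Boot-p steps, reusing the hypothetical rgenray and negloglik helpers introduced earlier; B = 1000 resamples are assumed, matching the number of bootstrap samples reported in Section 6.

# Percentile bootstrap (Boot-p) intervals for (alpha, lambda)
boot.p <- function(x, mle, B = 1000, level = 0.95) {
  est <- matrix(NA, B, 2)
  for (b in 1:B) {
    xb <- rgenray(length(x), mle[1], mle[2])           # Step 2: parametric resample
    est[b, ] <- optim(mle, negloglik, x = xb, method = "L-BFGS-B",
                      lower = c(-0.99, 1e-6))$par       # re-estimate on the resample
  }
  probs <- c((1 - level) / 2, (1 + level) / 2)
  apply(est, 2, quantile, probs = probs)                # Step 4: percentile interval
}
boot.p(x, mle)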

 

4.     Bayesian model formulation

   The Bayesian model is constructed by specifying the prior distributions for the model parameters $\alpha$ and $\lambda$, and then multiplying by the likelihood function $L(\underline{x}\mid\alpha,\lambda)$ for the given data $\underline{x}$ to obtain the posterior distribution using Bayes' theorem. The likelihood function is given by

$$L(\underline{x}\mid\alpha,\lambda)=\left\{\frac{2\,\lambda^{\alpha+1}}{\Gamma(\alpha+1)}\right\}^{n}\left(\prod_{i=1}^{n}x_{i}^{2\alpha+1}\right)\exp\left(-\lambda\sum_{i=1}^{n}x_{i}^{2}\right).$$

   Denote the prior distribution of $\alpha$ and $\lambda$ as $p(\alpha,\lambda)$. The joint posterior is

$$p(\alpha,\lambda\mid\underline{x})\propto L(\underline{x}\mid\alpha,\lambda)\,p(\alpha,\lambda).$$

 

Priors for the parameters

   We have assumed independent informative priors for the parameters $\alpha$ and $\lambda$. Let us suppose gamma priors for $\alpha$ and $\lambda$ as

$$p(\alpha)\propto\alpha^{a_{1}-1}e^{-b_{1}\alpha},\qquad \alpha>0,\ a_{1},b_{1}>0,$$

and

$$p(\lambda)\propto\lambda^{a_{2}-1}e^{-b_{2}\lambda},\qquad \lambda>0,\ a_{2},b_{2}>0.$$

Thus, we have

$$p(\alpha,\lambda)\propto\alpha^{a_{1}-1}\lambda^{a_{2}-1}\exp\left(-b_{1}\alpha-b_{2}\lambda\right).$$

 

Posterior distribution

The expression for the posterior, up to proportionality, can be written as

$$p(\alpha,\lambda\mid\underline{x})\propto\frac{\alpha^{a_{1}-1}\,\lambda^{n(\alpha+1)+a_{2}-1}}{\left\{\Gamma(\alpha+1)\right\}^{n}}\left(\prod_{i=1}^{n}x_{i}^{2\alpha+1}\right)\exp\left\{-b_{1}\alpha-\lambda\left(b_{2}+\sum_{i=1}^{n}x_{i}^{2}\right)\right\}. \qquad (10)$$

The posterior is analytically intractable and no closed-form inferences are possible. We therefore propose to use MCMC methods to simulate samples from the posterior so that sample-based inferences can be easily drawn. To implement MCMC calculations, Markov chains are constructed whose stationary distribution is the posterior, and there are many ways to construct such chains. Several Monte Carlo (MC) based sampling methods for evaluating high-dimensional posterior integrals have been developed: MC importance sampling, Metropolis-Hastings sampling, Gibbs sampling, and other hybrid algorithms. A landmark work for Gibbs sampling in problems of Bayesian inference is Gelfand and Smith (1990); Gibbs sampling is actually a special case of Metropolis-Hastings sampling, Metropolis et al. (1953) and Hastings (1970).

 

Gibbs Sampler : Algorithm

   Gibbs sampling is currently the most popular MCMC sampling algorithm in the Bayesian inference literature. It is a Markov updating mechanism that follows the philosophy of "divide and conquer": we only need to know the full conditional distributions of the parameters to apply it. To carry out Gibbs sampling, the basic scheme is as follows:

Step 1: Compute the posterior distribution, up to proportionality, and specify the full conditionals of the model parameters $\alpha$ and $\lambda$. The full conditionals of $\alpha$ and $\lambda$, using (10), can be written as

  • full conditional of $\alpha$ given $\lambda$ and $\underline{x}$:
$$p(\alpha\mid\lambda,\underline{x})\propto\frac{\alpha^{a_{1}-1}\,\lambda^{n\alpha}}{\left\{\Gamma(\alpha+1)\right\}^{n}}\left(\prod_{i=1}^{n}x_{i}^{2\alpha}\right)e^{-b_{1}\alpha};$$

  • full conditional of $\lambda$ given $\alpha$ and $\underline{x}$:
$$p(\lambda\mid\alpha,\underline{x})\propto\lambda^{n(\alpha+1)+a_{2}-1}\exp\left\{-\lambda\left(b_{2}+\sum_{i=1}^{n}x_{i}^{2}\right)\right\}.$$

Step 2: Select an initial value $(\alpha^{(0)},\lambda^{(0)})$ to start the chain.

Step 3: Suppose that at the $i$th step $(\alpha,\lambda)$ takes the value $(\alpha^{(i)},\lambda^{(i)})$; then from the full conditionals, generate

                 $\alpha^{(i+1)}$ from $p(\alpha\mid\lambda^{(i)},\underline{x})$ and

                 $\lambda^{(i+1)}$ from $p(\lambda\mid\alpha^{(i+1)},\underline{x})$.

Step 4: This completes a transition from $(\alpha^{(i)},\lambda^{(i)})$ to $(\alpha^{(i+1)},\lambda^{(i+1)})$.

Step 5: Repeat Step 3, N times.
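The full conditionals above are not of standard form (only the conditional of $\lambda$ is a gamma density), so OpenBUGS draws from them with its built-in univariate samplers. Purely for illustration, the same scheme can be sketched in R as a Metropolis-within-Gibbs sampler with random-walk proposals; the prior hyper-parameters, step sizes and iteration count below are assumptions, not the authors' settings.

# Log-posterior up to proportionality, equation (10), with gamma priors
logpost <- function(alpha, lambda, x, a1 = 0.001, b1 = 0.001, a2 = 0.001, b2 = 0.001) {
  if (alpha <= 0 || lambda <= 0) return(-Inf)
  n <- length(x)
  (a1 - 1) * log(alpha) + (n * (alpha + 1) + a2 - 1) * log(lambda) -
    n * lgamma(alpha + 1) + (2 * alpha + 1) * sum(log(x)) -
    b1 * alpha - lambda * (b2 + sum(x^2))
}
# Metropolis-within-Gibbs: random-walk update of each parameter in turn
mwg <- function(x, n.iter = 40000, init = c(0.1, 0.1), step = c(0.2, 0.02)) {
  draws <- matrix(NA, n.iter, 2, dimnames = list(NULL, c("alpha", "lambda")))
  cur <- init
  for (i in 1:n.iter) {
    for (j in 1:2) {
      prop <- cur
      prop[j] <- cur[j] + rnorm(1, 0, step[j])        # symmetric proposal
      if (log(runif(1)) < logpost(prop[1], prop[2], x) - logpost(cur[1], cur[2], x))
        cur <- prop                                   # accept the move
    }
    draws[i, ] <- cur
  }
  draws
}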

 

Posterior sample : MCMC output  

   Monitor the convergence using convergence diagnostics. Suppose that convergence has been reached after 'B' iterations (the burn-in period). The MCMC output is referred to as the sample obtained after removing the initial burn-in iterations and retaining draws at an appropriate lag (or thin interval).

    For the posterior analysis, we then have the MCMC output (posterior sample)

$$\left\{(\alpha_{j},\lambda_{j});\ j=1,\ldots,M\right\},$$

where $M$ is the size of the retained sample.

   The Bayes estimates of $(\alpha,\lambda)$ under the squared error loss function (SELF) are given by

$$\hat\alpha_{B}=\frac{1}{M}\sum_{j=1}^{M}\alpha_{j}\qquad\text{and}\qquad\hat\lambda_{B}=\frac{1}{M}\sum_{j=1}^{M}\lambda_{j}.$$

We shall use the OpenBUGS software to obtain posterior samples. BUGS (Bayesian inference Using Gibbs Sampling) is a piece of computer software for the Bayesian analysis of complex statistical models using MCMC methods. OpenBUGS, its open-source version, implements MCMC algorithms and is able to analyse highly complex problems for the probability models available in it, Thomas et al. (2006). Its modular framework provides an in-depth and interactive analysis of the model with many built-in features, and model extensions can easily be accommodated, which makes it a powerful and flexible tool for Bayesian analysis. However, model implementation is difficult for probability distributions that are not pre-defined in OpenBUGS: each new probability distribution has to be incorporated by building a new module into the software. Several probability distributions useful in the field of reliability studies have been incorporated into OpenBUGS in this way, Kumar et al. (2010) and Lunn (2010).

As the generalized Rayleigh distribution is not available in OpenBUGS, it requires the incorporation of a module in ReliaBUGS, Lunn et al. (2013), which is a subsystem of OpenBUGS. A module dgen.rayleighI_T(alpha, lambda) has been written for the generalized Rayleigh distribution (the corresponding computer program can be obtained from the authors) to perform full Bayesian analysis in OpenBUGS using the method described in Kumar et al. (2010), Kumar (2010) and Shrestha and Kumar (2013).

 

5.    Data                                                              

   A real data set is considered for illustration of the proposed methodology. The data set, extracted from Nichols and Padgett (2006), gives 100 observations on the breaking stress of carbon fibres (in GPa):

 

0.39, 0.81, 0.85, 0.98, 1.08, 1.12, 1.17, 1.18, 1.22, 1.25,

1.36, 1.41, 1.47, 1.57, 1.57, 1.59, 1.59, 1.61, 1.61, 1.69,

1.69, 1.71, 1.73, 1.80, 1.84, 1.84, 1.87, 1.89, 1.92, 2.00,

2.03, 2.03, 2.05, 2.12, 2.17, 2.17, 2.17, 2.35, 2.38, 2.41,

2.43, 2.48, 2.48, 2.50, 2.53, 2.55, 2.55, 2.56, 2.59, 2.67,

2.73, 2.74, 2.76, 2.77, 2.79, 2.81, 2.81, 2.82, 2.83, 2.85,

2.87, 2.88, 2.93, 2.95, 2.96, 2.97, 2.97, 3.09, 3.11, 3.11,

3.15, 3.15, 3.19, 3.19, 3.22, 3.22, 3.27, 3.28, 3.31, 3.31,

3.33, 3.39, 3.39, 3.51, 3.56, 3.60, 3.65, 3.68, 3.68, 3.68,

         3.70, 3.75, 4.20, 4.38, 4.42, 4.70, 4.90, 4.91, 5.08, 5.56

5.1. Computation of MLE  and Model Validation

    The maximum likelihood estimates (MLEs) are obtained by direct maximization of the log-likelihood function $\ell(\alpha,\lambda)$ given in (8) using R software (R Development Core Team, 2013). We use the Newton-Raphson algorithm in R to compute the MLEs. The contour plot of the log-likelihood is displayed in Figure 1; (+) indicates the ML estimates of $\alpha$ and $\lambda$.

Figure 1: Contour plot of log-likelihood

   The value of the maximized log-likelihood is $-141.437$. The Akaike information criterion (AIC) and the Bayesian information criterion (BIC) can be used to determine which model is most appropriate for the given data. For the given data set, AIC = 286.874 and BIC = 292.084. Table 1 shows the ML estimates, standard errors (SE) and 95% confidence intervals for the parameters $\alpha$ and $\lambda$.

Table 1. MLE, standard error and 95% confidence interval (CI)

Parameter    MLE       Std. Error    95% CI
alpha        0.7574    0.22862       (0.3093, 1.2055)
lambda       0.2228    0.03350       (0.1571, 0.2884)

 

   To check the validity of the model, we compute the Kolmogorov-Smirnov (KS) distance between the empirical distribution function and the fitted distribution function evaluated at the maximum likelihood estimates. The value of the KS statistic is 0.052 and the corresponding p-value is 0.95.
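In R, this check can be sketched as follows, reusing the hypothetical pgenray cdf from Section 2 with the ML estimates of Table 1 plugged in; fit and x are the objects assumed in the earlier MLE sketch.

loglik <- -fit$value                                   # maximized log-likelihood
aic <- -2 * loglik + 2 * 2                             # two parameters
bic <- -2 * loglik + 2 * log(length(x))
ks.test(x, pgenray, alpha = 0.7574, lambda = 0.2228)   # KS distance and p-value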

 

Figure 2: Quantile-Quantile (Q-Q) plot using the MLEs as estimates.

The high p-value suggests that the fit is satisfactory. The Q-Q plot for the fitted model is shown in Figure 2, Kumar and Ligges (2011). It can be seen that the fitted generalized Rayleigh distribution provides a reasonable fit to the given data.

 

6.  Bayesian analysis

   We assume independent gamma priors for $\alpha$ and $\lambda$ with hyper-parameter values $a_{1}=b_{1}=a_{2}=b_{2}=0.001$, as specified in Script 1. We first construct the contour plot of the un-normalized joint posterior of $(\alpha,\lambda)$ in Figure 3, where the contour lines are drawn at 90%, 70%, 40%, 10% and 5% of the maximum value of the posterior density over the grid, Albert (2009). It gives an idea about the joint behaviour of the parameters.

 

Script 1 : OpenBUGS code for the Bayesian analysis

model

{

       for( i in 1 : N )

       {

       x[i] ~ dgen.rayleighI_T(alpha, lambda)

       f[i] <- density(x[i], x[i])

       reliability[i] <- R(x[i], x[i])

       }

# Prior distributions of the model parameters

       alpha ~ dgamma(0.001, 0.001)

       lambda ~ dgamma(0.001, 0.001)

}

Data

       list(N = 100, x = c(0.39, ..., 5.56))

    Initial values

     list(alpha= 0.1, lambda= 0.1)   # Chain 1

     list(alpha=2.0, lambda= 2.0)    # Chain 2

    We run Script 1 in OpenBUGS to generate two Markov chains of length 40,000 with different starting points for the parameters. We have chosen initial values widely spread over the parameter space: (alpha = 0.1, lambda = 0.1) for the first chain and (alpha = 2.0, lambda = 2.0) for the second chain. The convergence is monitored using trace, ergodic mean and BGR plots. It can be observed that the Markov chains reach the stationary condition very quickly, within approximately 2000 iterations. Therefore, a burn-in of 5000 samples is more than enough to erase the effect of the starting points (initial values). Finally, samples of size 7000 are formed from the posterior by picking every fifth outcome (to minimize the autocorrelation among the generated deviates), i.e. thin = 5, starting from iteration 5001.

   Therefore, we have the posterior sample $\{(\alpha_{j},\lambda_{j});\ j=1,\ldots,7000\}$ from chain 1 and from chain 2.

Figure 3: Contour plot of the un-normalized joint posterior density of $(\alpha,\lambda)$.

6.1   Convergence diagnostics

The inferences are valid only if the simulated sample provides a reasonable approximation to the posterior distribution. We have checked the convergence of the simulated draws of $(\alpha,\lambda)$ to their stationary distributions using different starting points. We have used graphical diagnostic tools such as the trace, ergodic mean and Brooks-Gelman-Rubin (BGR) plots. Figure 4 shows the trace, cumulative average and BGR plots for the parameters alpha and lambda. The trace plots look like a random scatter about some mean value (represented by the dotted line). The plots of cumulative averages of the sampled values show steady convergence to the mean value (dotted horizontal line). In the BGR plots, the ratio of the variability between chains to the variability within chains is close to one and remains stable (horizontal) across the width of the plot. We may infer that the chains have escaped from their initial values and found the target distribution of the Markov chain. In fact, these plots are hallmarks of rapid MCMC convergence.

 

Figure 4:  The trace plot (on top), cumulative average plot(in middle) and the BGR plot(at bottom) for alpha and lambda.

From Figure 4, we have no evidence that our posterior samples produced by OpenBUGS chains failed to converge, so we can proceed to use posterior samples for Bayesian inference.
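The same diagnostics can also be reproduced in R with the coda package, assuming the monitored draws of the two chains have been exported from OpenBUGS as matrices chain1 and chain2 (columns alpha and lambda); this is an illustrative sketch, not the authors' workflow.

library(coda)
m <- mcmc.list(mcmc(chain1), mcmc(chain2))   # two chains of (alpha, lambda) draws
gelman.diag(m)     # BGR diagnostic: values near 1 indicate convergence
traceplot(m)       # trace plots for each parameter
cumuplot(m[[1]])   # cumulative quantile (ergodic mean) plot for chain 1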

 

6.2   Posterior analysis

        Bayesian analysis is all about the posterior distribution.  All of the statistical inferences of a Bayesian analysis come from summary measures of the posterior distribution, such as point and interval estimates. The posterior analysis is presented via quantitative as well as qualitative methods.

 

6.2.1 Quantitative analysis

   The numerical summary is presented for $(\alpha,\lambda)$ based on the MCMC sample from chain 1. Chain 2 produces similar results.

   We have considered various quantities of interest and their numerical values based on the MCMC sample of posterior characteristics for the generalized Rayleigh distribution. The MCMC estimates of the posterior mean, standard deviation (SD), 2.5th percentile, first quartile, median, third quartile, 97.5th percentile, mode and skewness of the parameters are presented in Table 2. The Bayes estimates under the absolute error and zero-one loss functions are the posterior median and mode, respectively.

   

Table 2. Numerical summaries based on MCMC sample of posterior characteristics

Characteristics             alpha     lambda
Mean                        0.6894    0.2141
Standard Deviation          0.2348    0.0343
2.5th Percentile (P2.5)     0.2528    0.1503
First Quartile (Q1)         0.5307    0.1900
Median                      0.6826    0.2129
Third Quartile (Q3)         0.8455    0.2363
97.5th Percentile (P97.5)   1.1730    0.2840
Mode                        0.6685    0.2123
Skewness                    0.1302    0.2132

                                                                 

    The highest probability density (HPD) credible intervals for $(\alpha,\lambda)$ are constructed using the Chen and Shao (1999) algorithm.

    Let $\alpha_{(1)}\le\alpha_{(2)}\le\cdots\le\alpha_{(M)}$ be the ordered MCMC sample of $\alpha$. Then the $100(1-\gamma)\%$ HPD interval for $\alpha$ is

$$\left(\alpha_{(j^{*})},\ \alpha_{(j^{*}+[(1-\gamma)M])}\right),$$

where $j^{*}$ is chosen so that

$$\alpha_{(j^{*}+[(1-\gamma)M])}-\alpha_{(j^{*})}=\min_{1\le j\le M-[(1-\gamma)M]}\left(\alpha_{(j+[(1-\gamma)M])}-\alpha_{(j)}\right).$$

Here $[z]$ denotes the largest integer less than or equal to $z$. In the same fashion, one can also obtain the Bayes HPD credible interval for $\lambda$.
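A minimal R sketch of the Chen and Shao (1999) HPD computation for a single parameter is given below; the coda function HPDinterval provides an equivalent built-in alternative.

# 95% HPD interval from an MCMC sample (Chen-Shao algorithm)
hpd <- function(sample, level = 0.95) {
  s <- sort(sample)
  M <- length(s)
  m <- floor(level * M)                    # number of draws the interval must cover
  widths <- s[(m + 1):M] - s[1:(M - m)]    # widths of all candidate intervals
  j <- which.min(widths)                   # j* minimizing the width
  c(lower = s[j], upper = s[j + m])
}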

        Table 3 shows the symmetric credible intervals (SCI) and HPD credible intervals for the parameters alpha and lambda. We have also computed the 95% bootstrap confidence intervals (BCI) using the percentile bootstrap algorithm described in Section 3.2, based on 1000 bootstrap samples of the parameters.

 

Table 3. Two-sided 95% intervals

Parameter    SCI               HPD               BCI
alpha        (0.253, 1.173)    (0.253, 1.172)    (0.401, 1.377)
lambda       (0.150, 0.284)    (0.145, 0.283)    (0.170, 0.313)

  

6.2.2 Qualitative analysis

   We have considered various graphs for qualitative analysis of the marginal posteriors of the parameters. These graphs include the boxplot, density strip plot, histogram, marginal posterior density estimate and rug plots for the parameters. We have also superimposed the 95% HPD intervals.

  These graphs provide an almost complete picture of the posterior uncertainty about the parameters. We have used the posterior sample $\{(\alpha_{j},\lambda_{j});\ j=1,\ldots,7000\}$ to draw these graphs.

 

Figure 5(a): Histogram, marginal posterior density and 95% HPD interval of alpha based on posterior sample.

Figure 5(b): Boxplot and density strip plot of alpha based on posterior sample.

 

   Jackson (2008) introduced the density strip plot for a univariate distribution as a shaded rectangular strip, whose darkness at a point is proportional to the probability density. It may be noted from Figures 5(b) and 6(b) that density strip plots are more informative as compared to corresponding boxplot.

   Probability histogram approximates the marginal posterior distribution.  It is the most popular non-parametric method to estimate the density function and gives an idea about skewness, behaviour in the tails, presence of multi-modal behaviour, and data outliers. It may be useful to compare the fundamental shapes associated with standard analytic distributions.

   The kernel density estimates have been drawn using R software with the assumption of a Gaussian kernel and properly chosen values of the bandwidths. It can be seen that both alpha and lambda show positive skewness.

 

Figure 6(a): Histogram, marginal posterior density and 95% HPD interval of lambda based on posterior sample.

 

   Figure 5(a) represents the histogram, marginal posterior density, rug plot and 95% HPD interval for alpha. The boxplot and the density strip plot are displayed in Figure 5(b).

Figure 6(b): Boxplot and density strip plot of lambda based on posterior sample.

   We have plotted similar graphs for lambda in Figures 6(a) and 6(b).

6.2.3 Comparison with MLE

     For comparison with the ML estimates, we have computed the density function at each observed data point for the 7000 posterior samples $\{(\alpha_{j},\lambda_{j});\ j=1,\ldots,7000\}$ as

$$f(x_{i};\alpha_{j},\lambda_{j}),\qquad i=1,\ldots,n;\ j=1,\ldots,7000.$$

In Figure 8 we have plotted quantiles of the estimated density; this can be considered an evaluation of model fit based on the posterior sample.

 

                               Figure 8:   Density estimates

 

   The density corresponding to the MLE has been plotted using the ML estimates of the parameters. We observe in Figure 8 that the MLEs and the Bayes estimates are quite close.

  

6.2.4 Estimation of hazard and reliability functions

    The posterior samples may be used to completely summarize the posterior uncertainty about the functions of parameters e.g. reliability and hazard functions. Suppose we wish to give point and interval estimates for reliability and hazard functions at the mission time t=2.41 (at the 40th observed data point).  

   For the given posterior sample $\{(\alpha_{j},\lambda_{j});\ j=1,\ldots,7000\}$, we can obtain the posterior samples of the reliability and hazard functions at t = 2.41, using (5) and (6), as

$$R_{j}=R(2.41;\alpha_{j},\lambda_{j})\qquad\text{and}\qquad h_{j}=h(2.41;\alpha_{j},\lambda_{j}),\qquad j=1,\ldots,7000.$$
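In R, these posterior samples and their summaries can be sketched as below, reusing the hypothetical sgenray and hgenray helpers from Section 2 and the hpd function from Section 6.2.1; post is assumed to be a data frame holding the 7000 retained draws with columns alpha and lambda.

t0 <- 2.41
R.post <- sgenray(t0, post$alpha, post$lambda)   # posterior sample of R(2.41)
h.post <- hgenray(t0, post$alpha, post$lambda)   # posterior sample of h(2.41)
c(mean(R.post), mean(h.post))                    # Bayes estimates under SELF
quantile(R.post, c(0.025, 0.975))                # 95% symmetric credible interval
hpd(R.post)                                      # 95% HPD interval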

            The MCMC results of the posterior mean, standard deviation (SD), first quartile, median, third quartile, mode, skewness, 95% symmetric credible intervals (SCI) and HPD credible intervals of the reliability and hazard functions are displayed in Table 4.

Table 4. Posterior summary for reliability and hazard functions at t = 2.41

Characteristics        Reliability         Hazard
Mean                   0.5451              0.6979
Standard Deviation     0.0392              0.0736
First Quartile (Q1)    0.5188              0.6470
Median                 0.5453              0.6946
Third Quartile (Q3)    0.5720              0.7447
Mode                   0.5441              0.6859
Skewness               -0.0460             0.2583
95% SCI                (0.4682, 0.6226)    (0.5623, 0.8516)
95% HPD                (0.4662, 0.6199)    (0.5572, 0.8459)

  

   The ML estimates of the reliability and hazard functions at t = 2.41 are computed using the invariance property of the MLE, i.e. $\hat R(2.41)=R(2.41;\hat\alpha,\hat\lambda)$ and $\hat h(2.41)=h(2.41;\hat\alpha,\hat\lambda)$.

  

 

Figure 10: MCMC output of R(t = 2.41) and h(t = 2.41). The dashed line (...) represents the posterior median and the solid lines (-) represent the lower and upper bounds of the 95% probability interval (HPD).

A trace plot is a plot of the iteration number against the value of the draw of the parameter at each iteration. Figure 10 displays 7000 chain values for the hazard  and reliability functions, with their sample median and 95% HPD credible intervals.

   The 95% percentile bootstrap confidence interval (BCIs) for reliability and hazard function at t=2.41, using the algorithm described in section 3.2, based on 1000 bootstrap samples are (0.5378, 0.5693) and (0.6329, 0.7933), respectively.

   Now we shall demonstrate the effectiveness of the proposed methodology for the entire data set. Since we have an effective MCMC technique, we can estimate any function of the parameters. For this, we have computed the reliability function given in (5) at each data point, using the posterior sample $\{(\alpha_{j},\lambda_{j});\ j=1,\ldots,7000\}$:

$$R(x_{i};\alpha_{j},\lambda_{j})=1-I_{\lambda_{j}x_{i}^{2}}(\alpha_{j}+1),\qquad i=1,\ldots,n;\ j=1,\ldots,7000.$$

 

Figure 11:  Reliability function estimate using MCMC and Kaplan-Meier estimate

   Figure 11 exhibits the estimated reliability function (dashed lines: 2.5% and 97.5% posterior quantiles; solid line: posterior median) based on the MCMC output. We have superimposed the Kaplan-Meier estimate of the reliability function to make the comparison more meaningful. Figure 11 shows that the reliability estimate based on MCMC is very close to the empirical reliability estimate.
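A rough R sketch of the comparison in Figure 11 is given below, using survfit from the survival package for the Kaplan-Meier estimate; post, x and the sgenray helper are the objects assumed in the earlier sketches, and all observations are treated as uncensored.

library(survival)
ts <- sort(x)
R.mat <- sapply(ts, function(t) sgenray(t, post$alpha, post$lambda))  # draws x time points
R.q <- apply(R.mat, 2, quantile, probs = c(0.025, 0.5, 0.975))        # pointwise quantiles
km <- survfit(Surv(x, rep(1, length(x))) ~ 1)                         # Kaplan-Meier estimate
plot(km, conf.int = FALSE, xlab = "Breaking stress", ylab = "Reliability")
matlines(ts, t(R.q), lty = c(2, 1, 2), col = 1)                       # posterior median and bands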

 

7.       Posterior predictive analysis

        A Bayesian approach for checking whether the model fits the data is known as posterior predictive checking. To do posterior predictive checking, we generate replicates of the data set from the predictive distribution and compare these replicated data sets to the observed sample. If the replicated data sets and the sample are similar, we conclude that the model fits the data, Gelman (2003) and Gelman et al. (2004). Modern Bayesian computational tools provide straightforward solutions, as one can easily simulate predictive samples if MCMC output from the posterior of the assumed model is available. Let $\underline{x}=(x_{1},\ldots,x_{n})$ be a vector of n observations from the model $GR(\alpha,\lambda)$. We can simulate from the posterior predictive distribution as follows:

 

  • Obtain a posterior sample $\{(\alpha_{j},\lambda_{j});\ j=1,\ldots,M\}$ using MCMC.
  • For each posterior draw $(\alpha_{j},\lambda_{j})$, simulate n data points $x^{rep}_{1j},\ldots,x^{rep}_{nj}$ from $GR(\alpha_{j},\lambda_{j})$.

Thus, one replicated data set $\underline{x}^{rep}_{j}$ is obtained for each sampled value $(\alpha_{j},\lambda_{j})$, giving M replicated data sets in all.

   The predictive analysis is based on 2000 posterior samples. For this purpose, 2000 samples have been drawn from the posterior using the MCMC procedure, and a predictive sample has then been obtained from the model under consideration using each simulated posterior draw. In fact, we have 2000 replicates for each data point $x_{i}$. The graphical method is one of the best ways to assess model adequacy based on posterior predictive distributions. We view model-checking as a comparison of the data with the replicated data given by the model, which includes exploratory graphics, Chaudhary and Kumar (2013).
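Generating the replicated data sets in R can be sketched as follows, reusing the hypothetical rgenray helper and the assumed posterior data frame post; the check on the sample maximum at the end is only one example of a graphical comparison.

M <- 2000
idx <- sample(nrow(post), M)             # pick M posterior draws
x.rep <- t(sapply(idx, function(j) rgenray(length(x), post$alpha[j], post$lambda[j])))
dim(x.rep)                               # M replicated data sets, each of size n
hist(apply(x.rep, 1, max), main = "Replicated maxima")
abline(v = max(x), lwd = 2)              # observed maximum for comparison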

  

Figure 12: Q-Q plot of predictive quantiles versus empirical quantiles

Figure 12 shows the Q-Q plot of the predicted quantiles versus the observed quantiles. The estimate of the CDF based on the replicated data given by the model is displayed in Figure 13, which exhibits a graphical posterior predictive check of model adequacy: the solid line represents the posterior median, the dashed lines (...) represent the lower and upper bounds of the 95% probability intervals, and the empirical distribution function is superimposed.

We, therefore, conclude that the generalized Rayleigh distribution is compatible with the given data set.

 

Figure 13:    Estimate of CDF  based on predicted values

   To obtain further clarity on our conclusion about model compatibility, we have considered plotting the posterior predictive density estimates of the second largest and the largest replicated future observations from the model, with the corresponding observed values superimposed (Figure 14).

Figure 14: Posterior predictive densities of the second largest and the largest replicated observations; vertical lines represent the corresponding observed values.

   As Figure 14 shows, the posterior predictive distributions are centered over the observed values, which indicates a good fit. In general, the distribution of the replicated data appears to match that of the observed data fairly well.

   Table 5 shows the MCMC results of the posterior predictive mean, median, mode and 95% HPD credible intervals for selected observed data points.

 

Table 5. Posterior predictive characteristics

Observed    Mean    Median    Mode    95% HPD
0.81        0.83    0.90      0.85    (0.640, 1.014)
2.00        2.04    2.11      2.06    (1.857, 2.223)
5.08        4.93    5.06      4.88    (4.517, 5.340)
5.56        5.41    5.56      5.35    (4.943, 5.890)

Overall, the results of the posterior predictive simulation indicate that the model fits these data particularly well. Note, however, that model fit assessments based on posterior predictive checks should not be used for model selection, Ntzoufras (2009).

 

8. Bayesian analysis for Type-II censored data

   In this section, we shall address the problem of Bayesian analysis of Type-II censored data from $GR(\alpha,\lambda)$. Let us consider that the last four observations of the data set given in Section 5 are censored, so that only the first 96 observations are available for analysis. The problem we wish to consider is that of estimating the parameters $\alpha$ and $\lambda$, as well as predicting the four future failure times $x_{97:100}, x_{98:100}, x_{99:100}$ and $x_{100:100}$. We assume independent gamma priors for the parameters. A straightforward solution may be obtained by making slight modifications to Script 1.

The OpenBUGS code used to analyze this problem is shown in Script 2. It may be noted that in the model section the likelihood is written separately for the uncensored and the censored observations, and in the data section the censored observations are entered as NA (Not Available).

We used two distinct sets of initial values to start the Markov chains for the model parameters and let OpenBUGS generate the initial values for the censored observations. We have monitored $\alpha$ and $\lambda$ as well as the four future failure times $x_{97:100},\ldots,x_{100:100}$, which appear as x.pred[1], ..., x.pred[4] in Script 2. A total of 30,000 iterations are generated from each chain. We discarded the first 5000 (burn-in) MCMC iterations and used the remaining 25,000. In order to reduce the autocorrelation within the MCMC series, we used every 5th iteration for the posterior computations.

 

Script 2 : OpenBUGS code for the censored case

model

{

for( i in 1 : N - 4 )  # uncensored observations

{

x[i] ~ dgen.rayleighI_T(alpha, lambda)

}

     for(i in N - 3 : N)  # censored observations

    {

     x[i] ~ dgen.rayleighI_T(alpha, lambda) C(x[N - 4], )

     }

   for(i in 1 : 4) # predicted failure times of censored items

     {

       x.pred[i] <- ranked(x[N- 3 : N], i)

}                                                                                                                                                                                          

# Prior distributions of the model parameters

       alpha ~ dgamma(0.001, 0.001)

       lambda ~ dgamma(0.001, 0.001)

}

Data

       list(N = 100, x = c(0.39, ..., 4.70, NA, NA, NA, NA))

    Initial values

     list(alpha= 0.1, lambda= 0.1)   # Chain 1

    list(alpha=2.0, lambda= 2.0)    # Chain 2

Therefore, we have the posterior sample $\{(\alpha_{j},\lambda_{j});\ j=1,\ldots,5000\}$ from chain 1 and from chain 2.

 

We have also checked the convergence of the sequences of $\alpha$ and $\lambda$ to their stationary distributions using different starting values. It was observed that the Markov chains reached the stationary condition very quickly.

We have considered chain 1 for the posterior inference. Figure 15 shows the trace plots of $\alpha$ and $\lambda$ for the 5000 sampled values retained after burn-in and thinning. The plots show good mixing of the Gibbs sampler. The posterior mean is represented by the dashed line, whereas the solid lines represent the 95% HPD intervals for the parameters. The quantitative measures based on the posterior sample are summarized in Table 6. The histogram, marginal posterior density and 95% HPD interval for $\alpha$ and $\lambda$ are plotted in Figure 16 and Figure 17, respectively. Similar graphs for the censored failure times are depicted in Figure 18.

 

Figure 15: Trace plots of $\alpha$ and $\lambda$. The dashed line (...) represents the posterior mean and the solid lines (-) represent the lower and upper bounds of the 95% probability intervals (HPD).

Table 6. Numerical summaries based on MCMC sample of posterior characteristics

Characteristics        alpha             lambda
Mean                   0.6718            0.2072
Standard Deviation     0.2370            0.0379
First Quartile (Q1)    0.5079            0.1830
Median                 0.6627            0.2064
Third Quartile (Q3)    0.8256            0.2319
Mode                   0.6625            0.1965
Skewness               0.2736            0.0125
95% SCI                (0.227, 1.164)    (0.128, 0.284)
95% HPD                (0.215, 1.146)    (0.128, 0.283)

 

Table 7. Posterior summary based on MCMC sample

             Mean    Median    Mode    95% HPD
x97:100      4.84    4.90      4.74    (4.70, 5.11)
x98:100      5.03    5.14      4.86    (4.71, 5.49)
x99:100      5.29    5.47      5.10    (4.76, 5.98)
x100:100     5.76    6.06      5.41    (4.85, 6.87)

 

   Table 7 shows the MCMC results of the posterior mean, median, mode and 95% prediction intervals for the censored failure times $x_{97:100},\ldots,x_{100:100}$.

 

Figure 16: Histogram, marginal posterior density and 95% HPD interval of alpha based on posterior sample.

Figure 17: Histogram, marginal posterior density and 95% HPD interval of lambda based on posterior sample.

Figure 18: Histograms, posterior predictive densities and 95% HPD intervals of the censored failure times $x_{97:100},\ldots,x_{100:100}$ based on posterior sample.

We have observed that the Gibbs sampling technique can be used quite effectively for estimating the posterior predictive density and also for constructing prediction intervals in the case of censored samples.

 

9.     Conclusion

   We have described methods to implement modern computation-based classical as well as Bayesian approaches for the generalized Rayleigh distribution. We have proposed an integrated procedure for Bayesian inference using MCMC methods. We obtain the Bayes estimates and the corresponding credible intervals using the Gibbs sampling procedure for complete as well as for Type-II censored data. We have obtained the estimates and probability intervals for the parameters and for the hazard and reliability functions. We have presented the model compatibility analysis via the posterior predictive check method. We have applied the developed techniques to a real data set.

 

References

    1. Albert, J., Bayesian Computation with R, 2nd edition, Springer, (2009).
    2. Chaudhary, A.K. and Kumar, V., “A Bayesian Analysis of Perks Distribution via Markov Chain Monte Carlo Simulation,” Nepal Journal of Science and Technology, 14(1), 153-166, (2013).
    3. Chen, M. H. and Shao, Q. M., “Monte Carlo estimation of Bayesian credible intervals and HPD intervals,” Journal of Computational and Graphical Statistics, 8(1), 69-92, (1999).
    4. Efron, B. and Tibshirani, R., “Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy,” Statistical Science, vol. 1, no. 1,  54-77, (1986).
    5. Gelfand, A.E. and Smith, A.F.M., “Sampling based approach to calculating marginal densities,” Journal of the American Statistical Association, 85, 398–409. (1990).
    6. Gelman, A., “A Bayesian Formulation of Exploratory Data Analysis and Goodness-of-fit Testing,” International Statistical Review, 71(2), 369-382, (2003). 
    7. Gelman, A., Carlin, J., Stern, H. and Rubin, D., Bayesian Data Analysis, Second Edition, London,  Chapman & Hall, (2004).
    8. Gupta, R. D. and Kundu, D., “Generalized exponential distributions,” Australian and New Zealand Journal of Statistics, 41(2), 173 – 188, (1999).
    9. Hastings, W. K., “Monte Carlo sampling methods using Markov chains and their applications,” Biometrika, 57, 97 –109, (1970).
    10. Jackson, C.H., “Displaying uncertainty with shading,” The American Statistician, 62(4), 340-347, (2008). 
    11. Kumar, V. and Ligges, U., reliaR: A package for some probability distributions, http://cran.r-project.org/web/packages/reliaR/index.html, (2011).
    12. Kumar, V., “Bayesian analysis of exponential extension model,” J. Nat. Acad. Math., 24, 109-128, (2010).
    13. Kumar, V., Ligges, U. and Thomas, A., ReliaBUGS User Manual : A subsystem in OpenBUGS for some statistical models, Ver. 1.0, OpenBUGS 3.2.1, http://openbugs.net/w/downloads, (2010).
    14. Kundu, D. and Raqab, M.Z., “Generalized Rayleigh distribution: different methods of estimation,” Computational Statistics and Data Analysis, 49, 187-200, (2005).
    15. Lawless, J. F., Statistical Models and Methods for Lifetime Data, 2nd ed., John Wiley and Sons, New York, (2003).
    16. Lunn, D., “Recent Developments in the BUGS software,” ISBA Bulletin, 17(3), 16-17, (2010).
    17. Lunn, D.J., Jackson, C., Best, N., Andrew, A. and  Spiegelhalter, D., The BUGS Book :A Practical Introduction to Bayesian Analysis, Chapman & Hall/CRC, London, UK, (2013).
    18. Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. and Teller, E., “Equations of state calculations by fast computing machines,” Journal Chemical Physics, 21, 1087–1091, (1953).
    19. Mudholkar, G.S. and Srivastava, D.K., “Exponentiated Weibull family for analyzing bathtub failure-rate data,” IEEE Transactions on Reliability, 42(2), 299-302, (1993).
    20. Nadarajah, S. and Kotz, S., “The exponentiated type distributions,” Acta Applicandae Mathematicae, 92, 97-111, (2006).
    21. Nichols, M.D., Padgett, W.J., “A bootstrap control chart for Weibull percentiles,” Quality and Reliability Engineering International, 22, 141-151, (2006).
    22. Ntzoufras, I., Bayesian Modeling using WinBUGS, John Wiley & Sons, New York, (2009).
    23. R Development Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria,  (2013).
    24. Raqab, M.Z. and Kundu, D., “Burr type X distribution: Revisited,” Journal of Probability and Statistical Sciences, 4(2), 179-193, (2006).
    25. Raqab, M.Z. and Madi, M.T., “Bayesian analysis for the exponentiated Rayleigh distribution,” Metron - International Journal of Statistics, vol. LXVII, No. 3, 269-288, (2009).
    26. Shrestha, S.K. and Kumar, V., “Bayesian estimation and prediction of exponentiated Weibull model,” International Journal of Statistika and Mathematika, 7(2), 37-48, (2013).
    27. Soliman, A.A., Abd-Ellah, A.H., Abou-Elheggag, N.A. and Ahmed, E.A., “Modified Weibull model: A Bayes study using MCMC approach based on progressive censoring data,” Reliability Engineering and System Safety, 100, 48–57, (2012).
    28. Surles, J.G., and Padgett, W.J., “Some properties of a scaled Burr type X distribution,” Journal of Statistical Planning and Inference, 128, 271-280, (2005).
    29. Thomas, A., O’Hara, B., Ligges, U. and Sturtz, S., “Making BUGS Open,” R News, 6, 12–17, http://mathstat.helsinki.fi/openbugs/, (2006).
    30. Thomas, A., OpenBUGS Developer Manual, version 3.1.2,  http://www.openbugs.info/, (2010).
    31. Voda, V.G., “Inferential procedures on a generalized Rayleigh variate (I),” Applications of Mathematics, 21, 395-412, (1976a).
    32. Voda, V.G., “Inferential procedures on a generalized Rayleigh variate (II),” Applications of Mathematics, 21, 413-419, (1976b).
    33. Voda, V.G., “A new generalization of Rayleigh distribution,” Reliability: Theory & Applications, 2 , No. 2, 47-56, (2007).
    34. Voda, V.G., “A method constructing density functions: the case of a generalized Rayleigh variable,” Applications of Mathematics, 54, No. 5, 417-431, (2009).
