Water Physicochemical Analysis with Factor Analysis and Cluster Analysis of Ahmedabad city

Sweta Patel¹, Chetna Bhavsar²

{¹Research Scholar, ²Professor} Department of Statistics, School of Sciences, Gujarat University, Ahmedabad, Gujarat, INDIA.

Corresponding Addresses:

¹[email protected], ²[email protected]

Research Article

Abstract: Two multivariate statistical techniques, cluster analysis (CA) and factor analysis (FA), were applied to data on the physicochemical parameters of water from Ahmedabad city, Gujarat, India. The study used data from the years 2004, 2005, 2006, 2007 and 2010. It evaluated and interpreted complex water quality data sets and apportioned pollution sources in order to obtain better information about water quality and to help design a monitoring network. Cluster analysis and factor analysis applied to the data yielded clusters and factors for the physicochemical parameters. We conclude that water quality assessment is a major aspect of human health. The government should keep track of water quality and supply pure drinking water to the public. The clusters and factors identified here can help improve water quality analysis in the future.

Keywords: Cluster Analysis, Factor Analysis, Chlorinity, Salinity, Magnesium Hardness, Calcium Hardness.

1. Introduction

Without water, life cannot survive. Water and life are two sides of the same coin: life begins and grows in the lap of water. Water is vital to all forms of life, from very small living creatures to the complex systems of animals and human beings. The purity of water varies from place to place in nature. Rain water, if not contaminated by atmospheric pollutants, is highly pure, while sea water contains a large amount of salt. Water for a variety of uses can be obtained from precipitation in the form of rain, snow and hail, and from surface water in the form of glaciers, streams, rivers and seas. Besides these sources, groundwater is a rich natural source that is complementary to surface water. Due to the steady increase in population, urbanization, deforestation, etc., water resources have been adversely affected both quantitatively and qualitatively. Water pollution is one of the major problems in developing countries like India [1, 2, 3]. Improper policy is one of the most important factors that have caused severe environmental pollution and ecological degradation. Almost all developing countries are experiencing an increase of population, urbanization, depreciation etc. [3, 4]. Pollution has become a major threat to the existence of man on earth. Rapid industrialization, urbanization and human activities cause water pollution, which has brought about a veritable water crisis [5-8]. Sirkantaswamy et al. [8] reported seasonal variation of drinking water quality at Mysore, in Karnataka state. They found higher amounts of chemical (total dissolved solids, alkalinity) and bacteriological parameters and concluded that drinking water quality varied from moderate to severe contamination. Kadam et al. [9] reported values above the permissible limit in borewell drinking water in the Ahmedpur area of Maharashtra.
Agnihotri and Singh [10] reported bacteriologically unfit drinking water in Sagar city of Madhya Pradesh. Papanna and Nagaraju [3] found 97% of the water samples (from 60 bore wells) in Kollegal taluk of Karnataka to be within the desirable to permissible limits of the Bureau of Indian Standards (BIS). Susiladevi et al. [11] studied ground water samples from 30 different sites, including bore wells, tube wells and hand pumps, in and around Cuddalore town in Tamil Nadu state. They found that the water in some places was unfit for human consumption due to industrial waste disposal and sewage. In 2004, Suthar et al. [12] reported total hardness and calcium hardness within the desirable limit, while magnesium hardness, chlorinity and salinity exceeded the desirable limits in some areas of Ahmedabad city. In 2005, Suthar et al. [13] reported higher calcium hardness, magnesium hardness and chlorinity, and an alkaline nature of the water, in the eastern part of Ahmedabad city. In 2006, Suthar et al. [14] found total hardness, magnesium hardness, calcium hardness, chlorinity and salinity above either the desirable limits or the maximum allowable limits of the Gujarat Pollution Control Board (GPCB) standards in samples from Ahmedabad city. In 2007, Suthar et al. [15] observed calcium, magnesium, chlorinity and salinity values above the desirable limits in Ahmedabad city. Suthar et al. [7] have recently found alterations in the physico-chemical characteristics of drinking water collected from 17 areas of Ahmedabad city. In this scenario, providing safe drinking water is a major responsibility of governments.

2. Materials and Methods

Ahmedabad is the largest city in Gujarat state and the sixth largest city (metro city) in India, with a population of almost 5 million. It is located on the bank of the Sabarmati River at an elevation of 55 meters (180 ft), at 23.03° N and 72.58° E. It has a dry climate. Its highest recorded temperature is 48°C and its lowest is 15°C. The average rainfall is 932 mm. The present study evaluates the water quality of Ahmedabad city for the years 2004, 2005, 2006, 2007 and 2010. The water samples were collected and analysed by standard methods by Prof. M. B. Suthar and his students of the Department of Biology, K. K. Shah Jarodwala Maninagar Science College, Maninagar, Ahmedabad, India. In each of these years the samples were collected in the morning by students from the taps in their homes, labeled appropriately, and later brought to the college laboratory. Drinking water quality was assessed by examining chemical characteristics. The parameters analyzed were Total Hardness, Calcium Hardness, Magnesium Hardness, Chlorinity, Salinity and pH. Total hardness and calcium hardness were determined with total hardness tablets and calcium hardness tablets respectively (both EDTA method), and chlorinity by the argentometric method. Magnesium hardness and salinity were calculated from these data. The pH was measured using a Systronic pH meter 324 at 30°C. The data were compared with GPCB drinking water standards.
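The text does not state how magnesium hardness and salinity were calculated. A minimal sketch follows, under two assumptions that are not given in the article: magnesium hardness is taken as total hardness minus calcium hardness, and salinity as 1.80655 times chlorinity. Both function names are illustrative. With the 2004 municipality means reported later in Table 1, these relations give values close to the tabulated ones.

```python
# Assumed relations (not stated in the article):
#   magnesium hardness = total hardness - calcium hardness
#   salinity           = 1.80655 * chlorinity
SALINITY_PER_CHLORINITY = 1.80655


def magnesium_hardness(total_hardness, calcium_hardness):
    """Magnesium hardness as the part of total hardness not due to calcium."""
    return total_hardness - calcium_hardness


def salinity_from_chlorinity(chlorinity):
    """Salinity estimated from chlorinity with the assumed constant factor."""
    return SALINITY_PER_CHLORINITY * chlorinity


# Mean values of the 2004 municipality samples (Table 1).
mg = magnesium_hardness(141.11, 68.27)        # Table 1 reports 72.83
salinity = salinity_from_chlorinity(650.69)   # Table 1 reports 1175
print(round(mg, 2), round(salinity, 1))
```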

3. Statistical analysis

The data cover the years 2004, 2005, 2006, 2007 and 2010. Statistical computations were executed in SPSS 16.0. The water quality data sets were subjected to univariate analysis (mean, maximum and minimum) and to two multivariate techniques, cluster analysis (CA) and factor analysis (FA). These analyses required a preliminary treatment of the data, consisting of normalization of the raw analytical data, so as to avoid misclassifications due to the different orders of magnitude and ranges of variation of the analytical parameters [R. Aruga, D. Gastaldi, G. Negro and G. Ostacoli, Pollution of a river basin and its evolution with time studied by multivariate statistical analysis. Analytica Chimica Acta, 310 (1995) 15-25.]. The multivariate methods are summarized below.
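The article does not say which normalization was used. One common choice, assumed here purely for illustration, is the z-score, which rescales every parameter to mean 0 and standard deviation 1 so that hardness values in the hundreds and pH values near 8 contribute on the same scale. The readings below are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]


# Hypothetical readings on very different scales.
total_hardness = [141.0, 188.0, 207.0, 148.0]
ph = [8.15, 8.04, 8.00, 8.30]

z_hardness = standardize(total_hardness)
z_ph = standardize(ph)
print([round(z, 3) for z in z_hardness])
print([round(z, 3) for z in z_ph])
```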

3.1 Factor analysis (FA)

Factor analysis is a powerful technique for reducing the dimensionality of a dataset consisting of a large number of interrelated variables while retaining as much as possible of the variability present in the dataset, with a minimum loss of information [J. F. Hair, Multivariate data analysis (3rd ed.). New York: Macmillan, (1992).]. This reduction is achieved by transforming the dataset into a new set of variables, the factors, which are orthogonal (uncorrelated) and arranged in decreasing order of importance. FA can also be used to generate hypotheses regarding causal mechanisms or to screen variables for subsequent analysis.

FA can be expressed as:

$$F_i = a_1 x_{1j} + a_2 x_{2j} + \dots + a_m x_{mj}$$

Where F_i = factor

a = loading

x = measured value of variable

i = factor number

j = sample number

m = total number of variables
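As a small illustration of the expression above, the score of one sample on one factor is the sum of each loading multiplied by the corresponding measured (normalized) value. The loadings and sample values below are hypothetical, and `factor_score` is an illustrative name.

```python
def factor_score(loadings, sample_values):
    """Linear combination a_1*x_1j + ... + a_m*x_mj for one sample j."""
    if len(loadings) != len(sample_values):
        raise ValueError("need one loading per variable")
    return sum(a * x for a, x in zip(loadings, sample_values))


# Hypothetical loadings for one factor and normalized values for one sample.
loadings = [0.9, 0.8, 0.1]
sample = [1.2, -0.5, 0.3]
print(round(factor_score(loadings, sample), 3))
```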

There are three basic steps to factor analysis:

1. Computation of the correlation matrix for all variables.

2. Extraction of initial factors.

3. Rotation of the extracted factors to a terminal solution [R. Ho, Handbook of univariate and multivariate data analysis and interpretation with SPSS.].
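The first two steps can be sketched in plain Python. This is not the SPSS procedure used in the study: it computes a Pearson correlation matrix and then extracts only the first principal component by power iteration, and it leaves out rotation entirely. The data are hypothetical and the function names are illustrative.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix for equal-length variable columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        m = sum(col) / n
        centered.append([x - m for x in col])
    norms = [sqrt(sum(v * v for v in c)) for c in centered]
    p = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def first_component(corr, iterations=500):
    """Largest eigenvalue and unit eigenvector of corr by power iteration."""
    p = len(corr)
    # Non-uniform start vector; it must not be orthogonal to the answer.
    v = [1.0 / (i + 1) for i in range(p)]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        eigenvalue = sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    return eigenvalue, v


# Hypothetical data: three variables measured on five samples.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 5.0, 4.0, 5.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
corr = correlation_matrix(columns)
eigenvalue, vector = first_component(corr)
# Unrotated loadings of each variable on the first component.
loadings = [sqrt(eigenvalue) * v for v in vector]
print(round(eigenvalue, 3), [round(a, 3) for a in loadings])
```

The sum of the squared loadings equals the eigenvalue, which is the quantity SPSS reports in the "Total" column of its variance table.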

3.2 Cluster analysis (CA)

Cluster analysis is a major technique for classifying a mountain of information into manageable, meaningful piles. It is a data reduction tool that creates subgroups that are more manageable than individual data points. In cluster analysis there is no prior knowledge about which elements belong to which clusters; the groupings are defined through an analysis of the data. Hierarchical CA, the most common approach, starts with each case in a separate cluster and joins the clusters together step by step until only one cluster remains [J. Lattin, D. Carroll and P. Green, Analyzing multivariate data. New York: Duxbury, (2003). J. McKenna, Environmental Modelling and Software, 18 (2003) 205.]. The Euclidean distance usually gives the similarity between two samples, and a distance can be represented by the difference between the transformed values of the samples [M. Otto, Multivariate methods. In: R. Kellner, J. M. Mermet, M. Otto and H. M. Widmer, (Eds.), Analytical chemistry. Weinheim: Wiley-VCH. (1998).].

There are four basic cluster analysis steps:

1. Data collection and selection of the variables for analysis

2. Generation of a similarity matrix

3. Decision about number of clusters and interpretation

4. Validation of cluster solution
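Steps 2 and 3 can be sketched as a small hierarchical clustering routine. The article does not state the linkage rule used in SPSS, so single linkage (the distance between two clusters is the smallest Euclidean distance between their members) is assumed here only to keep the example short. The points are hypothetical, and validation (step 4) is not covered.

```python
from math import dist


def agglomerate(points):
    """Single-linkage agglomerative clustering on Euclidean distance.

    Returns the merge schedule as (label_a, label_b, distance) tuples.
    A merged cluster keeps the lower of the two labels.
    """
    clusters = {i: [i] for i in range(len(points))}
    schedule = []
    while len(clusters) > 1:
        best = None
        labels = sorted(clusters)
        for pos, a in enumerate(labels):
            for b in labels[pos + 1:]:
                d = min(
                    dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        clusters[a].extend(clusters.pop(b))
        schedule.append((a, b, d))
    return schedule


# Hypothetical objects described by two normalized measurements each.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
for stage, (a, b, d) in enumerate(agglomerate(points), start=1):
    print(stage, a, b, round(d, 3))
```

Each printed row corresponds to one stage of an agglomeration schedule: the two clusters combined and the distance at which they merge.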

4. Results and Discussion

4.1 Data Analysis of Year 2004

A total of 36 samples were collected and analyzed in the laboratory of K. K. Shah Jarodwala Maninagar Science College, Ahmedabad. The sample source has no significant effect on these parameters, as shown in Table 1. All the water samples were colourless, odourless and without any unpleasant taste.

Table 1: Parameters of water samples of year 2004, by source (Municipality and Tube well); mean values, with minimum and maximum values in parentheses

No.	Sample	No. of Samples	Total Hardness	Ca-Hardness	Mg-Hardness	Chlorinity	Salinity
1	Municipality	19	141.11 (28-304)	68.27 (8-212)	72.83 (20-156)	650.69 (127.8-1491)	1175 (231-2692)
2	Tubewell	17	148.31 (80-292)	69.88 (48-144)	78.57 (8-240)	533.59 (35.5-1178.6)	964 (64-2691)
	Drinking water standard (GPCB)	Desirable Limit- Maximum allowable limits	300-600	75-200	30-90	250-1000	450-1800
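The comparison against the GPCB standard can be written as a simple lookup. The limits below are the ones in the last row of Table 1. Treating a value exactly equal to a limit as still within that limit is an assumption, and `classify` is an illustrative name.

```python
# (desirable limit, maximum allowable limit) from the last row of Table 1.
GPCB_LIMITS = {
    "Total Hardness": (300, 600),
    "Ca-Hardness": (75, 200),
    "Mg-Hardness": (30, 90),
    "Chlorinity": (250, 1000),
    "Salinity": (450, 1800),
}


def classify(parameter, value):
    """Place a value relative to the two GPCB limits for that parameter."""
    desirable, maximum = GPCB_LIMITS[parameter]
    if value <= desirable:  # assumption: the limit itself counts as within
        return "within desirable limit"
    if value <= maximum:
        return "within maximum allowable limit"
    return "above maximum allowable limit"


# Mean values of the 2004 municipality samples (Table 1).
municipality_means = {
    "Total Hardness": 141.11,
    "Ca-Hardness": 68.27,
    "Mg-Hardness": 72.83,
    "Chlorinity": 650.69,
    "Salinity": 1175,
}
for name, value in municipality_means.items():
    print(name, value, classify(name, value))
```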

Factor analysis and cluster analysis were run on the 2004 data in SPSS 16.0, with the results below. Table 2 lists the eigenvalues associated with each linear component (factor) before extraction, after extraction and after rotation. Before extraction, SPSS identified 7 components in the data set (there are as many eigenvectors as variables, and so as many components as variables). Factor 1 explains 47.261% of the total variance. Before rotation, factor 1 accounted for considerably more variance than the remaining two (47.261% compared to 17.238% and 14.650%); after rotation it accounts for 44.664% of the variance (compared to 18.616% and 15.869%), as shown in Table 2.

Figure 1 shows the scree plot of the whole data set. Visually, the scree plot suggests that at most four components account for most of the variance in the data, after which the curve levels off.

Table 2: Total variance explained for year 2004 (extraction method: Principal Component Analysis)

Component	Initial Eigenvalues			Extraction Sums of Squared Loadings			Rotation Sums of Squared Loadings
Component	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %
1	3.308	47.261	47.261	3.308	47.261	47.261	3.126	44.664	44.664
2	1.207	17.238	64.499	1.207	17.238	64.499	1.303	18.616	63.280
3	1.025	14.650	79.149	1.025	14.650	79.149	1.111	15.869	79.149
4	.848	12.120	91.269
5	.611	8.731	100.000
6	9.392E-6	.000	100.000
7	1.499E-6	2.141E-5	100.000
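The percentages in Table 2 follow from the eigenvalues: when components are extracted from a correlation matrix, the eigenvalues sum to the number of variables, so each component's share of the variance is its eigenvalue divided by that number. The sketch below recomputes the shares from the rounded eigenvalues in Table 2 (so the last digits differ slightly from the SPSS output) and applies the eigenvalue-greater-than-1 rule.

```python
# Initial eigenvalues for 2004, copied from Table 2.
eigenvalues = [3.308, 1.207, 1.025, 0.848, 0.611, 9.392e-6, 1.499e-6]

n_variables = len(eigenvalues)
percent = [100 * ev / n_variables for ev in eigenvalues]

cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

# Components kept by the eigenvalue-greater-than-1 rule.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1]

for i in range(n_variables):
    print(i + 1, round(percent[i], 3), round(cumulative[i], 3))
print("retained components:", retained)
```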

Figure 1: This figure shows the Scree Plot of the data of year 2004

Table 3 shows the component matrix before rotation, with the parameters divided among three components; the extraction method was Principal Component Analysis. By default SPSS displays all loadings; however, we requested that loadings smaller than 0.5 be suppressed in the output, so many of the entries are blank. The first component (PCA 1) has high loadings on Total Hardness, Magnesium Hardness, Chlorinity and Salinity; these four parameters form the first principal component. PCA 2 includes Source (tube well or municipality), and Calcium Hardness also loads negatively on it (-0.696). PCA 3 includes two parameters, Calcium Hardness and Station (where the sample was collected).

Table 3: Component matrix before rotation, year 2004 (extraction method: Principal Component Analysis)

	Component
	1	2	3
Total Hardness	0.857
Calcium Hardness		-0.696	0.629
Magnesium Hardness	0.809
Chlorinity	0.897
Salinity	0.897
Source		0.550
Station			0.518

Table 4 shows the rotated component matrix, which gives the factor loadings of each variable on each factor. For example, the variable "Total Hardness (TH)" falls into factor 1 because its loading there (0.749) is the largest in that row. Loadings smaller than 0.5 are again suppressed for easier interpretation. This output gives 3 factors.

Factor 1: Total Hardness, Magnesium Hardness, Chlorinity, Salinity

Factor 2: Calcium Hardness

Factor 3: Source, Station

Table 4: Rotated Component Matrix of year 2004

	Component
	1	2	3
Total Hardness	0.749	0.520
Calcium Hardness		0.985
Magnesium Hardness	0.855
Chlorinity	0.913
Salinity	0.914
Source			0.709
Station			0.646
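The assignment rule described for Total Hardness, placing each variable in the factor where its loading is largest, can be applied to every row of Table 4. In this sketch `None` marks a loading that was suppressed in the output because it was below 0.5; `dominant_factor` is an illustrative name.

```python
# Rotated loadings from Table 4; None marks a suppressed loading (< 0.5).
rotated_loadings = {
    "Total Hardness": (0.749, 0.520, None),
    "Calcium Hardness": (None, 0.985, None),
    "Magnesium Hardness": (0.855, None, None),
    "Chlorinity": (0.913, None, None),
    "Salinity": (0.914, None, None),
    "Source": (None, None, 0.709),
    "Station": (None, None, 0.646),
}


def dominant_factor(loadings):
    """1-based index of the factor with the largest absolute shown loading."""
    shown = [
        (abs(value), index)
        for index, value in enumerate(loadings, start=1)
        if value is not None
    ]
    return max(shown)[1]


groups = {}
for variable, loadings in rotated_loadings.items():
    groups.setdefault(dominant_factor(loadings), []).append(variable)

for factor in sorted(groups):
    print("Factor", factor, groups[factor])
```

The resulting groups match the three factors listed above Table 4.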

After the factor analysis we ran cluster analysis on the same data set and obtained the output below. Table 5 shows the agglomeration schedule. It displays the objects or clusters combined at each stage (second and third columns) and the distance at which each merger takes place. For example, in the first stage, objects 4 (Salinity) and 7 (Chlorinity) are merged at a distance of 0.000. From here onward, the resulting cluster is labelled by the first object involved in the merger, which is object 4. The last column on the right tells at which stage of the algorithm this cluster appears next. In this case that is the third stage, where it is merged with cluster 1 (Total Hardness), which was itself formed in the second stage by merging object 1 with object 3 (Magnesium Hardness) at a distance of 10.604. The stage 3 merger takes place at a distance of 29.010, and the resulting cluster is labelled 1 (Total Hardness), and so on.

Table 5: Agglomeration Schedule of data of year 2004

Stage	Cluster Combined		Coefficients	Stage Cluster First Appears		Next Stage
Stage	Cluster 1	Cluster 2	Coefficients	Cluster 1	Cluster 2	Next Stage
1	4	7	0.000	0	0	3
2	1	3	10.604	0	0	3
3	1	4	29.010	2	1	4
4	1	2	34.554	3	0	5
5	1	5	43.915	4	0	6
6	1	6	66.793	5	0	0
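An agglomeration schedule can be replayed to see which objects belong together after each stage. The sketch below uses the rows of Table 5 and the SPSS labelling convention in which a merged cluster keeps the label in the "Cluster 1" column; `replay` is an illustrative name.

```python
# (Cluster 1, Cluster 2, Coefficient) for each stage, copied from Table 5.
schedule = [
    (4, 7, 0.000),
    (1, 3, 10.604),
    (1, 4, 29.010),
    (1, 2, 34.554),
    (1, 5, 43.915),
    (1, 6, 66.793),
]


def replay(schedule, n_objects):
    """Return the cluster membership after every stage of the schedule."""
    clusters = {i: {i} for i in range(1, n_objects + 1)}
    history = []
    for keep, absorb, _distance in schedule:
        # The merged cluster keeps the label from the "Cluster 1" column.
        clusters[keep] |= clusters.pop(absorb)
        history.append(
            {label: sorted(members) for label, members in clusters.items()}
        )
    return history


for stage, membership in enumerate(replay(schedule, 7), start=1):
    print(stage, membership)
```

After stage 3, cluster 1 contains objects 1, 3, 4 and 7, and after stage 6 all seven objects are in a single cluster.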

So, from the multivariate techniques (Factor Analysis and Cluster Analysis) we found three factors that account for the majority of the variance in the data, and clusters that make the data easier to interpret.

4.2 Data Analysis of Year 2005

In the year 2005, a total of 30 water samples were collected and analyzed in the laboratory of the Biology Department, K. K. Shah Jarodwala Maninagar Science College, Ahmedabad. All the water samples were colourless, odourless and devoid of any unpleasant taste. Table 6 shows the mean values of the parameters, with minimum and maximum values in parentheses. Compared to the GPCB drinking water standard, Total Hardness in most of the samples was within either the desirable or the permissible limit. Calcium Hardness was above the desirable limit in most of the samples.

Table 6: Parameters of water samples from Municipality and Tube well sources in the year 2005 (mean values; minimum and maximum values in parentheses). Units of Measurements: Total Hardness (as CaCO₃) mg/l; Calcium (as Ca) mg/l; Magnesium (as Mg) mg/l; Chlorides (as Cl) mg/l; Salinity g/l

No.	Sample	No. of Samples	Total Hardness	Ca- Hardness	Mg- Hardness	Chlorinity	Salinity	pH
1	Municipality	20	188 (116-312)	107.2 (60-204)	80.8 (16-200)	732.7 (35.5-2414)	732.76 (35.5-2414)	8.15 (7.7-8.6)
2	Tube well	10	207.2 (100-380)	96.2 (60-164)	111 (40-240)	823.5 (213.6-1207)	823.5 (213.6-1207)	8.04 (7.8-8.4)
	Drinking Water Standard (GPCB)		300-600	75-200	30-90	250-1000	450-1800	6.5-8.5

We ran factor analysis on the 2005 data, with the results below. Table 7 lists the eigenvalues associated with each linear component (factor) before extraction, after extraction and after rotation. Factor 1 explains 39.548% of the total variance. SPSS extracts all factors with eigenvalues greater than 1, which leaves 3 factors. Before rotation, factor 1 accounted for considerably more variance than the remaining two (39.548% compared to 20.003% and 18.566%); after rotation it accounts for 38.459% of the variance (compared to 20.100% and 19.557%), as shown in Table 7.

Table 7: Total variance explained for the year 2005 (extraction method: Principal Component Analysis)

Component	Initial Eigenvalues			Extraction Sums of Squared Loadings			Rotation Sums of Squared Loadings
Component	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %
1	3.164	39.548	39.548	3.164	39.548	39.548	3.077	38.459	38.459
2	1.600	20.003	59.551	1.600	20.003	59.551	1.608	20.100	58.559
3	1.485	18.566	78.116	1.485	18.566	78.116	1.565	19.557	78.116
4	0.790	9.871	87.988
5	0.625	7.817	95.804
6	0.323	4.040	99.845
7	0.012	0.155	100.000
8	2.382E-17	2.977E-16	100.000

Figure 2 shows the scree plot of the whole data set. Visually, the scree plot suggests that at most four components account for most of the variance in the data, after which the curve levels off.

Figure 2: Scree plot of the data of the year 2005

Table 8 shows the component matrix before rotation, with the parameters divided among three components; the extraction method was Principal Component Analysis. The first component (PCA 1) has high loadings on Total Hardness, Magnesium Hardness, Chlorinity and Salinity; these four parameters form the first principal component. PCA 2 includes Calcium Hardness and pH. PCA 3 includes two parameters, Source (tube well or municipality) and Station (where in Ahmedabad city the sample was collected).

Table 8: Component matrix before rotation, year 2005 (extraction method: Principal Component Analysis)

	Component
	1	2	3
Station			0.583
Source			0.793
Total Hardness	0.877
Calcium Hardness	0.604	-0.664
Magnesium Hardness	0.591		0.539
Chlorinity	0.901
Salinity	0.906
pH		0.865

Table 9 shows the rotated component matrix, which gives the factor loadings of each variable on each factor. Loadings smaller than 0.5 are again suppressed for easier interpretation. This output gives 3 factors.

Factor 1: Total Hardness, Calcium Hardness, Chlorinity, Salinity

Factor 2: Station, pH

Factor 3: Magnesium Hardness, Source

Table 9: Rotated Component Matrix of year 2005

	Component
	1	2	3
Station		-0.589
Source			0.797
Total Hardness	0.829
Calcium Hardness	0.728	-0.539
Magnesium Hardness			0.711
Chlorinity	0.906
Salinity	0.912

After the factor analysis we ran cluster analysis on the same data set and obtained the output below.

Table 10 shows the agglomeration schedule. It displays the objects or clusters combined at each stage (second and third columns) and the distance at which each merger takes place. In the first stage, objects 6 (Chlorinity) and 7 (Salinity) are merged at a distance of 0.815. From here onward, the resulting cluster is labelled 6, after the first object involved in the merger. The last column on the right tells at which stage of the algorithm this cluster appears next (here, stage 4).

Table 10: Agglomeration Schedule of data of year 2005

Stage	Cluster Combined		Coefficients	Stage Cluster First Appears		Next Stage
Stage	Cluster 1	Cluster 2	Coefficients	Cluster 1	Cluster 2	Next Stage
1	6	7	0.815	0	0	4
2	3	5	15.488	0	0	3
3	3	4	21.960	2	0	4
4	3	6	23.188	3	1	5
5	2	3	39.881	0	4	6
6	1	2	42.472	0	5	7
7	1	8	43.175	6	0	0

So, from the multivariate techniques (Factor Analysis and Cluster Analysis) we found three factors that account for the majority of the variance in the data, and clusters that make the data easier to interpret.

4.3 Data Analysis of Year 2006

In the year 2006, a total of 13 samples were collected and analyzed in the laboratory of K. K. Shah Jarodwala Maninagar Science College. Table 11 shows the area-wise analysis of the different physicochemical parameters. All water samples were odourless, colourless and devoid of any unpleasant taste. Compared with drinking water standards (WHO, ICMR and BIS), Total Hardness exceeds the desirable limits in seven samples and the maximum permissible limits in six samples.

Table 11: Parameters of water samples collected from different areas of Ahmedabad city in year 2006 (area-wise mean values; minimum and maximum values in parentheses)

No.	Area	No. of Samples	Total Hardness	Ca- Hardness	Mg- Hardness	Chlorinity	Salinity	pH
	Amraiwadi		541 (312-692)	128 (112-136)	413 (200-556)	317 (178-518)	572 (321-935)	8.0 (7.8-8.4)
	Ghodasar		780 (780)	128 (128)	652 (652)	315 (315)	569 (569)	8.1 (8.1)
	Gomtipur		780 (780)	128 (128)	652 (652)	355 (355)	641 (641)	8.3 (8.3)
	Isanpur		484 (484)	180 (180)	304 (304)	325 (325)	587 (587)	8.0 (8.0)
	Maninagar		684 (280-2088)	114 (100-128)	570 (180-960)	259 (164-355)	469 (296-641)	8.4 (8.3-8.6)
	Raipur		484 (484)	270 (270)	214 (214)	553 (553)	999 (999)	8.3 (8.3)
	Shah-a-alam		448 (448)	(76)	372 (372)	(20)	(36)	8.2 (8.2)
	Thakkarbapa nagar		478 (408-548)	134 (88-180)	344 (228-460)	132 (20-245)	239 (36-442)	7.9 (7.6-8.3)
	Vatva		688 (688)	180 (180)	508 (508)	369 (369)	666 (666)	8.1 (8.1)
	Total		585 (280-1088)	142 (76-270)	444 (180-960)	282 (20-553)	510 (36-999)	8.1 (7.6-8.6)

Drinking water standards (HDL, MPL):
WHO: 200, 600, 200, 150, 200, 600, ---, 7.0-8.5, 6.5-9.5
ICMR: 300, 600, ---, ----, ---, 200, 1000, ---, 7.5-8.5, 6.5-9.2
BIS: 200, 600, 200, ---, 250, 1000, ---, 7.0-8.3, 8.5-9.0

Units of Measurements: Total Hardness (as CaCO₃) mg/l; Calcium (as Ca) mg/l; Magnesium (as Mg) mg/l; Chlorides (as Cl) mg/l; Salinity g/l. Abbreviations: HDL - Highest Desirable Limit, MPL - Maximum Permissible Limit.

We ran factor analysis on the 2006 data, with the results below. In Table 12, factor 1 explains 39.772% of the total variance. Before rotation, factor 1 accounted for considerably more variance than the remaining two (39.772% compared to 31.017% and 12.917%); after rotation it accounts for 32.324% of the variance (compared to 30.285% and 21.097%).

Table 12: Total variance explained for year 2006 (extraction method: Principal Component Analysis)

Component	Initial Eigenvalues			Extraction Sums of Squared Loadings			Rotation Sums of Squared Loadings
Component	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %
1	3.182	39.772	39.772	3.164	39.548	39.548	3.077	38.459	38.459
2	2.481	31.017	70.789	1.600	20.003	59.551	1.608	20.100	58.559
3	1.033	12.917	83.706	1.485	18.566	78.116	1.565	19.557	78.116
4	0.666	8.331	92.037
5	0.445	5.561	97.598
6	0.192	2.402	100.000
7	3.988E-7	4.985E-6	100.000
8	-2.443E-16	-3.054E-15	100.000

Figure 3 shows the scree plot of the whole data set. Visually, the scree plot suggests that at most four components account for most of the variance in the data, after which the curve levels off.

Figure 3: Scree plot of the data of the year 2006

Table 13 shows the component matrix before rotation, with the parameters divided among three components. The first component (PCA 1) includes Station (where in Ahmedabad city the sample was collected), Calcium Hardness, Chlorinity and Salinity; these four parameters form the first principal component. PCA 2 includes Total Hardness, Magnesium Hardness and pH. PCA 3 includes only one parameter, Source (tube well or municipality).

Table 13: Component matrix before rotation, year 2006 (extraction method: Principal Component Analysis)

	Component
	1	2	3
Station	0.670
Source	0.517		0.637
Total Hardness		0.796
Calcium Hardness	0.834
Magnesium Hardness	-0.615	0.702
Chlorinity	0.720	0.598
Salinity	0.720	0.598
pH		0.682

Table 14 shows the rotated component matrix, which gives the factor loadings of each variable on each factor. Loadings smaller than 0.5 are again suppressed for easier interpretation. This output gives 3 factors. Comparing Tables 13 and 14 shows that rotation changes which parameters load on each factor.

Factor 1: Calcium Hardness, Chlorinity, Salinity

Factor 2: Total Hardness, Magnesium Hardness, pH

Factor 3: Station, Source

Table 14: Rotated Component Matrix of year 2006

	Component
	1	2	3
Station			0.669
Source			0.833
Total Hardness		0.980
Calcium Hardness	0.681		0.623
Magnesium Hardness		0.962
Chlorinity	0.985
Salinity	0.985

Factor analysis reduces a large data set to a small number of factors that account for most of the variance in the data. For the year 2006, the parameters can be grouped into the three factors listed above. After the factor analysis we ran cluster analysis on the same data set and obtained the output below. Table 15 shows the agglomeration schedule. It displays the objects or clusters combined at each stage (second and third columns) and the distance at which each merger takes place. For example, in the first stage, objects 6 (Chlorinity) and 7 (Salinity) are merged at a distance of 0.000. From here onward, the resulting cluster is labelled 6, after the first object involved in the merger. The last column on the right tells at which stage of the algorithm this cluster appears next (here, stage 3).

Table 15: Agglomeration Schedule of data of year 2006

Stage	Cluster Combined		Coefficients	Stage Cluster First Appears		Next Stage
Stage	Cluster 1	Cluster 2	Coefficients	Cluster 1	Cluster 2	Next Stage
1	6	7	0.000	0	0	3
2	3	5	0.609	0	0	6
3	4	6	7.395	0	1	4
4	1	4	10.672	0	3	5
5	1	2	12.652	4	0	6
6	1	3	22.822	5	2	0

So, from the multivariate techniques (Factor Analysis and Cluster Analysis) we found three factors that account for the majority of the variance in the data, and clusters that make the data easier to interpret.

4.4 Data Analysis of Year 2007

In the year 2007, a total of 36 samples were collected and analyzed in the laboratory of K. K. Shah Jarodwala Maninagar Science College. Table 16 lists the physicochemical parameters by sample source. The data suggest that most of the samples have Total Hardness, Chlorinity and Salinity within the highest desirable limit of GPCB. Most of the samples have Calcium and Magnesium Hardness above the highest desirable limit but below the maximum permissible limit of the GPCB standards. The Water Quality Index (WQI) was above 100 for almost all the samples, suggesting that the drinking water is unsafe under the GPCB standards adopted.

Table 16: Sample source wise list of physicochemical parameters studied in the year 2007

No.	Sample	No. of Samples	Total Hardness	Ca- Hardness	Mg- Hardness	Chlorinity	Salinity
1	Municipality	30	170.61 (100-233)	106.03 (48-152)	64.70 (28-146)	154.56 (56-312)	279.0 (101-563)
2	Tube well	6	242.36 (172-408)	124.40 (84-168)	118.0 (64-240)	236.78 (56-540)	427.51 (101-974)
	Total	36	182.56 (100-408)	109.56 (48-168)	6.640 (28-240)	168.28 (56-540)	303.77 (101-974)

We ran factor analysis on the 2007 data. Table 17 lists the eigenvalues associated with each linear component (factor) before extraction, after extraction and after rotation. Before extraction, SPSS identified 8 components in the data set. Before rotation, factor 1 accounted for considerably more variance than the remaining two (56.872% compared to 15.697% and 12.946%); after rotation it accounts for 36.427% of the variance (compared to 33.123% and 15.965%).

Table 17: Total variance explained for year 2007 (extraction method: Principal Component Analysis)

Component	Initial Eigenvalues			Extraction Sums of Squared Loadings			Rotation Sums of Squared Loadings
Component	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %
1	4.550	56.872	56.872	4.550	56.872	56.872	2.914	36.427	36.427
2	1.256	15.697	72.568	1.256	15.697	72.568	2.650	33.123	69.550
3	1.036	12.946	85.514	1.036	12.946	85.514	1.277	15.965	85.514
4	0.622	7.777	93.291
5	0.526	6.575	99.866
6	0.011	0.134	100.000
7	6.242E-7	7.802E-6	100.000
8	4.513E-10	5.641E-9	100.000

Figure 4 shows the scree plot of the whole data set. Visually, the scree plot suggests that at most four components account for most of the variance in the data, after which the curve levels off.

Figure 4: Scree plot of the data of year 2007

Table 18 shows the component matrix before rotation, with the parameters divided among three components. The first component (PCA 1) includes Source (tube well or municipality), Calcium Hardness, Chlorinity, Salinity, Total Hardness, Magnesium Hardness and WQI. No parameter has its highest loading on PCA 2 before rotation. PCA 3 includes only one parameter, Station (where in Ahmedabad city the sample was collected).

Table 18: Component matrix before rotation, year 2007 (extraction method: Principal Component Analysis)

	Component
	1	2	3
Station			0.782
Source	0.602
Total Hardness	0.941
Calcium Hardness	0.574	-0.543
Magnesium Hardness	0.789	0.523
Chlorinity	0.819
Salinity	0.820
Water Quality Index	0.927

Table 19 shows the rotated component matrix, which gives the factor loadings of each variable on each factor. Loadings smaller than 0.5 are again suppressed for easier interpretation. This output gives 3 factors. Comparing Tables 18 and 19 shows that rotation changes which parameters load on each factor. After rotation, the second factor contains Calcium Hardness, Chlorinity and Salinity, whereas no parameter had its highest loading on the second component before rotation.

Factor 1: Source, Total Hardness, Magnesium Hardness, WQI

Factor 2: Calcium Hardness, Chlorinity, Salinity

Factor 3: Station

Table 19: Rotated Component Matrix of year 2007

	Component
	1	2	3
Station			0.868
Source	0.665
Total Hardness	0.685	0.548
Calcium Hardness		0.718	0.571
Magnesium Hardness	0.961
Chlorinity		0.902
Salinity		0.902

Factor analysis reduces a large data set to a small number of factors that account for most of the variance in the data. For the year 2007, the parameters can be grouped into the three factors listed above. After the factor analysis we ran cluster analysis on the same data set and obtained the output below. Table 20 shows the agglomeration schedule. It displays the objects or clusters combined at each stage (second and third columns) and the distance at which each merger takes place. For example, in the first stage, objects 6 (Chlorinity) and 7 (Salinity) are merged at a distance of 0.000. From here onward, the resulting cluster is labelled 6, after the first object involved in the merger. The last column on the right tells at which stage of the algorithm this cluster appears next (here, stage 5).

Table 20: Agglomeration Schedule of data of year 2007

Stage	Cluster Combined		Coefficients	Stage Cluster First Appears		Next Stage
Stage	Cluster 1	Cluster 2	Coefficients	Cluster 1	Cluster 2	Next Stage
1	6	7	0.000	0	0	5
2	5	8	2.906	0	0	3
3	3	5	6.530	0	2	4
4	3	4	24.075	3	0	5
5	3	6	24.630	4	1	6
6	2	3	32.630	0	5	7
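
To make the reading of Table 20 concrete, the following Python sketch replays the six merges listed in the table and records the cluster membership after each stage. It assumes that the merged cluster keeps the label in the Cluster 1 column, as the later rows of the table suggest. The function name `replay_schedule` is hypothetical, and the objects are referred to by number only, because the text confirms just two of the labels (6 = Chlorinity, 7 = Salinity).

```python
def replay_schedule(n_objects, merges):
    """Replay (cluster_1, cluster_2) merges and return the partition after each stage."""
    clusters = {label: {label} for label in range(1, n_objects + 1)}
    history = []
    for keep, absorb in merges:
        # The merged cluster keeps the label listed under "Cluster 1".
        clusters[keep] |= clusters.pop(absorb)
        history.append(sorted(sorted(members) for members in clusters.values()))
    return history


# "Cluster Combined" columns of Table 20 (year 2007), stages 1 to 6.
merges_2007 = [(6, 7), (5, 8), (3, 5), (3, 4), (3, 6), (2, 3)]
history_2007 = replay_schedule(8, merges_2007)

for stage, partition in enumerate(history_2007, start=1):
    print(f"Stage {stage}: {partition}")
```

After the six stages shown in the table, object 1 is still a cluster on its own and all other objects form one cluster.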

Thus, Factor Analysis yielded three factors that account for the majority of the variance in the data, and Cluster Analysis grouped the parameters into clusters that make interpretation easier.

4.5 Data Analysis of Year 2010

For the year 2010, the study focuses on drinking water in some areas of Ahmedabad city. Table 21 compares tube well water with municipal supplied water and indicates that municipal supplied water is generally of better quality than tube well water.

Table 21: Source-wise physicochemical parameters studied in the year 2010: mean values, with minimum and maximum values in parentheses

No. of Samples	Total Hardness	Calcium Hardness	Magnesium Hardness	Chlorinity	Salinity	Electrical Conductivity
Municipal	210.25 (108-600)	133.35 (60-320)	76.9 (20-280)	290.5 (144-1494)	524.31 (144-1494)	0.92 (0.25-2.3)
Tube well	235.5 (188-392)	132.50 (64-272)	103.0 (20-264)	376.5 (100-760)	679.441 (180-1371)	1.77 (0.24-6.6)
Total	217.4 (88-600)	133.11 (60-320)	84.36 (20-280)	315.07 (80-828)	568.63 (144-1494)	1.16 (0.4-6.6)

In the present study, samples from tube wells have higher mean chlorinity (376.5 against 290.5) and electrical conductivity (1.77 against 0.92) than municipal samples. This suggests possible ground water pollution, which might be due to sewage or industrial sources, as areas in the eastern part of Ahmedabad city have industrial blocks that might contribute to ground water pollution. Therefore, proper disposal of industrial waste with periodic monitoring of ground water is recommended.
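
The size of the gap between the two sources can be expressed as a relative difference of the means. The short Python sketch below does this for the mean values reported in Table 21. The helper name `percent_difference` is hypothetical, and a relative difference of means is only a descriptive measure, not a test of statistical significance.

```python
# Mean values from Table 21 (year 2010): (municipal, tube well).
means_2010 = {
    "Total Hardness": (210.25, 235.5),
    "Calcium Hardness": (133.35, 132.50),
    "Magnesium Hardness": (76.9, 103.0),
    "Chlorinity": (290.5, 376.5),
    "Salinity": (524.31, 679.441),
    "Electrical Conductivity": (0.92, 1.77),
}


def percent_difference(municipal, tube_well):
    """Tube well mean relative to the municipal mean, in percent."""
    return 100 * (tube_well - municipal) / municipal


relative = {name: percent_difference(municipal, tube_well)
            for name, (municipal, tube_well) in means_2010.items()}

for name, value in relative.items():
    print(f"{name}: {value:+.1f}%")
```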

We ran Factor Analysis on the data of year 2010. The SPSS output in Table 22 lists the eigenvalues associated with each linear component (factor) before extraction, after extraction and after rotation. Before extraction, SPSS identified 7 components within the data set. Factor 1 explains 48.230% of the total variance. The first few factors explain a relatively large amount of variance (especially factor 1), whereas subsequent factors explain only small amounts. SPSS then extracts all factors with eigenvalues greater than 1, which leaves 2 factors. Before rotation, factor 1 accounted for considerably more variance than factor 2 (48.230% compared to 20.350%); after rotation it accounts for only 36.448% of the variance, compared to 32.132% for factor 2.

Table 22: Total Variance Explained for year 2010 (Extraction Method: Principal Component Analysis)

Component	Initial Eigenvalues			Extraction Sums of Squared Loadings			Rotation Sums of Squared Loadings
Component	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %	Total	% of Variance	Cumulative %
1	3.376	48.230	48.230	3.376	48.230	48.230	2.551	36.448	36.448
2	1.424	20.350	68.580	1.424	20.350	68.580	2.249	32.132	68.580
3	0.966	13.801	82.381
4	0.766	10.942	93.323
5	0.467	6.677	100.000
6	2.329E-7	3.327E-6	100.000
7	-4.331E-16	-6.188E-15	100.000
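
The percentages in Table 22 can be checked by hand from the eigenvalues. The Python sketch below is a minimal illustration, assuming the analysis was run on the correlation matrix of the 7 variables, so that each eigenvalue divided by 7 gives its share of the total variance. It also applies the eigenvalue-greater-than-1 rule and checks that rotation redistributes, but does not change, the variance carried by the two retained factors. The variable names are chosen for this example only.

```python
# Initial eigenvalues copied from Table 22 (year 2010).
eigenvalues = [3.376, 1.424, 0.966, 0.766, 0.467, 2.329e-7, -4.331e-16]

# Assumption: correlation-matrix PCA, so the total variance equals the number of variables.
n_variables = len(eigenvalues)

percent = [100 * value / n_variables for value in eigenvalues]

cumulative = []
running = 0.0
for share in percent:
    running += share
    cumulative.append(running)

# Components kept by the eigenvalue-greater-than-1 rule (1-based numbering).
retained = [index + 1 for index, value in enumerate(eigenvalues) if value > 1]

# Rotation sums of squared loadings from Table 22.
rotation_ss = [2.551, 2.249]
extraction_total = sum(eigenvalues[:2])
rotation_total = sum(rotation_ss)

print("Percent of variance:", [round(share, 2) for share in percent[:2]])
print("Retained components:", retained)
print("Extraction total:", round(extraction_total, 3),
      "Rotation total:", round(rotation_total, 3))
```

Small differences from the tabulated percentages are expected, because the eigenvalues in the table are rounded to three decimals.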

Figure 5 shows the Scree Plot for the whole data set. Visually, the plot suggests that at most four factors or components account for most of the variance in the data, after which the curve levels off.

Figure 5: This figure shows the Scree Plot of the data of year 2010

Table 23 shows the Component Matrix, with the parameters distributed over two components. This output is the component matrix before rotation, with Principal Component Analysis as the extraction method. The first component (PCA 1) includes Total Hardness, Magnesium Hardness, Chlorinity, Salinity and Electrical Conductivity. No parameter has its highest loading on PCA 2, although Total Hardness (-0.539), Salinity (0.592) and Electrical Conductivity (0.592) show secondary loadings on it. The loadings of Calcium Hardness are below 0.5 in absolute value, so its row is blank in the table below.

Table 23: Component Matrix before Rotation in the year 2010 (Extraction Method: Principal Component Analysis)

	Component
	1	2
Total Hardness	0.617	-0.539
Calcium Hardness
Magnesium Hardness	0.874
Chlorinity	0.874
Salinity	0.766	0.592
Electrical Conductivity	0.766	0.592

Table 24 shows the Rotated Component Matrix, which gives the factor loadings of each variable on each factor. As before, loadings smaller than 0.5 in absolute value are suppressed to simplify interpretation. The output yields two factors. Comparing the two tables shows that rotation changes which parameters are associated with each component: the second factor now contains Salinity and Electrical Conductivity, whereas no parameter loaded primarily on the second component before rotation.

Factor 1: Total Hardness, Calcium Hardness, Magnesium Hardness, Chlorinity

Factor 2: Salinity, Electrical Conductivity

Table 24: Rotated Component Matrix of year 2010

	Component
	1	2
Total Hardness	0.819
Calcium Hardness	0.608
Magnesium Hardness	0.846
Chlorinity	0.846
Salinity		0.948
Electrical Conductivity		0.948

Factor analysis reduces a large data set to a small number of factors that capture most of the variance in the data. For the year 2010, the parameters can therefore be grouped into the two factors listed above.

After the Factor Analysis, we ran Cluster Analysis on the same data set and obtained the output below. Table 25 shows the Agglomeration Schedule for the year 2010, which displays the objects or clusters combined at each stage (second and third columns) and the distance at which each merger takes place. For example, in the first stage, objects 6 (Salinity) and 7 (Electrical Conductivity) are merged at a distance of 0.000. From there onward, the resulting cluster is labelled by one of the objects involved in the merger (the one listed under Cluster 1). The last column on the right gives the stage of the algorithm at which this cluster appears next.

Table 25: Agglomeration Schedule of data of 2010

Stage	Cluster Combined		Coefficients	Stage Cluster First Appears		Next Stage
Stage	Cluster 1	Cluster 2	Coefficients	Cluster 1	Cluster 2	Next Stage
1	6	7	0.000	0	0	4
2	4	5	0.000	0	0	3
3	2	4	50.221	0	2	4
4	2	6	60.512	3	1	5
5	2	3	62.444	4	0	6
6	1	2	82.663	0	5	0
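
The "Stage Cluster First Appears" and "Next Stage" columns of Table 25 can be derived from the "Cluster Combined" columns alone. The Python sketch below shows one way to do so, assuming that a merged cluster keeps the label listed under Cluster 1. The function name `bookkeeping_columns` is hypothetical and the code is an illustration, not the SPSS implementation.

```python
def bookkeeping_columns(merges):
    """Return rows [stage, cluster_1, cluster_2, first_1, first_2, next_stage]."""
    formed_at = {}  # surviving label -> stage at which its current cluster was formed
    rows = []
    for stage, (keep, absorb) in enumerate(merges, start=1):
        # 0 means the label is still a single object that has not been merged yet.
        rows.append([stage, keep, absorb,
                     formed_at.get(keep, 0), formed_at.get(absorb, 0), 0])
        formed_at[keep] = stage
        formed_at.pop(absorb, None)
    # The next stage is the first later stage that uses the cluster formed here.
    for row in rows:
        for later in rows:
            if later[0] > row[0] and row[0] in (later[3], later[4]):
                row[5] = later[0]
                break
    return rows


# "Cluster Combined" columns of Table 25 (year 2010).
merges_2010 = [(6, 7), (4, 5), (2, 4), (2, 6), (2, 3), (1, 2)]
schedule_2010 = bookkeeping_columns(merges_2010)

for row in schedule_2010:
    print(row)
```

A value of 0 in the last column marks the final stage, at which all objects belong to a single cluster.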

Thus, Factor Analysis yielded two factors that account for the majority of the variance in the data, and Cluster Analysis grouped the parameters into clusters that make interpretation easier.

5. Conclusion

Water is the most common and important resource on the earth. The hydrologic cycle is entirely adequate to meet human needs for fresh water, because it processes several times as much water as we require today. However, the availability of water varies from place to place and from time to time. As a result, there is a persistent scarcity of water in many parts of the world. Exponential growth in population creates an ever-increasing demand for additional water for irrigation, industry and municipal use. This five-year study is an attempt to evaluate the status of the ground water of Ahmedabad city used for drinking purposes. Ground water is a precious natural resource. From the foregoing discussion, it is inferred that the concentrations of most parameters are generally within the highest permissible limits. However, the present study reveals that water in the industrial area is not safe for drinking and is useful only for domestic purposes. People should therefore be made aware of the importance of water quality for sanitation, and economical water treatment methods such as filtration and boiling would prove beneficial in avoiding waterborne and other water-related diseases. We can conclude that an inadequate balance of the physicochemical parameters can lead to severe diseases such as osteoporosis, nephrolithiasis (kidney stones), colorectal cancer, hypertension, stroke, coronary artery disease, insulin resistance, obesity, type II diabetes mellitus and metabolic syndrome. Remedial measures must be taken immediately to safeguard and conserve the precious water resources from pollution for future generations. This is a prime solution to pollution and to imminent future water wars [16-22].
Based on the results of the analysis, it is suggested that further investigations of water quality be carried out in the future. The public should be made aware of drinking water quality and of the careful management of precious natural resources. Government and non-government agencies should set up immediate and long-term quality monitoring programs. Proper water treatment is necessary, and there is a need for continued monitoring of water quality, especially for drinking and other domestic use. The Government of India should ensure that the physicochemical parameters are within limits before water is supplied to citizens, to prevent ill effects on human health.

6. Acknowledgements

The authors would like to thank Dr. R.R. Shah (Principal) and the management and departmental staff (Biology) of K. K. Shah Jarodwala Maninagar Science College, Ahmedabad for facilities and encouragement. The sincere and voluntary help of S. Y. B. Sc (CZ) in the collection and laboratory analysis of water samples is highly appreciated. The authors are especially thankful to Prof. M. B. Suthar for helping throughout the research.

References

[1] Sharma, R.B. and Sharma, R.C.: Biochem. and Cell Arch., 10(2): 267-273 (2010).
[2] Suthar, M.B. and Suthar, T.M.: Biosci. Guardian, 1(1): 1-23 (2010).
[3] Papanna, C. and Nagaraju, D.: Asian J. Environ. Sci., 5(1): 11-13 (2010).
[4] Singh, R., Raghuvanshi, S.P. and Chandra, A.: Ind. J. Environ. Ecoplan., 17(1-2): 39-44 (2010).
[5] Singh, D. and Joshi, B.D.: Indian J. Environ. Ecoplan., 17(1-2): 89-92 (2010).
[6] Shrivastava, B.K. and Kumar, A.: Asian J. Exp. Chem., 4(1-22): 90-91 (2009).
[7] Suthar, M.B., Ravat, N.M., Mesariya, A.R. and Kanjariya, K.V.: Nat. Environ. Poll. Tech., 9(2): 399-408 (2010).
[8] Srikantaswamy, S., Shakunthalata, Bai, Gholami, S. and Mahadev, J.: Asian J. Environ. Sci., 3(2): 104-110 (2008).
[9] Kadam, R.M., Reddy, N.J.M. and Nagpurne, V.S.: Asian J. Environ. Sci., 3(2): 189-190 (2008).
[10] Agnihotri, V.K. and Singh, P.K.: Asian J. Environ. Sci., 5(1): 8-10 (2010).
[11] Sushiladevi, M., Pugazhendy, K., Jayachandran, K. and Jayanthi, C.: Indian J. Environ. Ecoplan., 15(3): 511-517 (2008).
[12] Suthar, M.B., Mesariya, A.R. and Ravat, N.M.: Elect. J. Environ. Sci., 1: 23-27 (2008).
[13] Suthar, M.B., Mesariya, A.R., Ravat, N.M. and Kalola, S.K.: Int. J. Bioscience Reporter, 6(1): 55-59 (2008).
[14] Suthar, M.B., Mesariya, A.R. and Ravat, N.M.: Ind. J. Environ. and Ecoplan., 15(1): 171-176 (2008).
[15] Suthar, M.B., Mesariya, A.R., Kanjariya, K.V. and Ravat, N.M.: Bulletin Environ. Sci., 26(1): 51-56 (2008).
[16] Khandwala, R.V. and Suthar, M.B.: Int. J. Biosci. Reporter, 5(1): 1-6 (2007).
[17] Sushiladevi, M., Pugazhendy, K., Jayachandran, K. and Jayanthi, C.: Indian J. Environ. Ecoplan., 15(3): 511-517 (2008).
[18] Navin, K.C., Twarakavi, and Kalurachchi, J.J.: J. Environ. Managt., 81: 405-419 (2006).
[19] Prabhavathi, M., Reddy, S.K. and Indirani, R.: Ind. J. Environ. Ecoplan., 17(1-2): 343-350 (2010).
[20] Raghuvansi, S., Mishra, A.K. and Arya, M.: The Asian J. Exper. Chem., 5(1): 7-11 (2010).
[21] W.A.I.: Drinking Water and Sanitation. Coverage, Financing & Emerging Concerns. Water Aid. India, New Delhi (2005).
[22] Shah, A.N., Ghariya, A.S., Puranik, A.D. and Suthar, M.B.: Elect. J. Environ. Sci., 1: 49-56 (2008).

Copyrights © statperson consultancy www.statperson.com 2013. All Rights Reserved.