
Abstract: Support vector machine (SVM) methods have become increasingly popular tools for data mining tasks, viz., classification, regression and novelty detection. The present paper deals with the classification of Indian industries using SVM. Of the 50 companies listed on NIFTY, the 32 that remained stable in the index over a one-month period were selected. Twenty-eight key financial ratios of these companies were taken for a period of five financial years (April 2007 to March 2012). Fuzzy clustering and SVM were used to explore the financial data. Principal component analysis (PCA) was applied and reduced the twenty-eight financial ratios to seven components. Thereafter, fuzzy clustering was performed on the PCA scores and formed two groups, which were categorized as high- and low-performing industries based on their mean values. SVM was used as a classifier of the industries and was compared with a well-known, older classification technique, linear discriminant analysis (LDA). The classification accuracy on the training and testing data sets was 97.32% and 100% for SVM, whereas for LDA it was 87.29% and 93.75%, respectively. Therefore, the present study concludes that SVM performed better than LDA in the classification of industries.

Key words: Financial Ratio, Classification, Support Vector Machines, Fuzzy Clustering, Principal Component Analysis.

1. Introduction

People's livelihoods change with the economic development of a country, and in developing countries especially, industries play a vital role in that development. According to [6], one-third of the world's population lived in poverty in 1981, whereas the share was 18% in 2001. This large decline was due to economic development in India and China. Indian gross domestic product (GDP) increased from 3.9 in 2001 to 7.2 in 2011. In this, the contribution of industries was 20.16%, second to the services sector's contribution of 65.22%. The rise of industries in a country boosts employment opportunities, income and savings, economic scale and farm productivity. On the other hand, it reduces poverty, crime, social imbalance, etc. Due to globalization and the liberalization of Indian government policy, many new industries from inside and outside the country have emerged in recent years. This has intensified the competition to survive, which in turn requires industries to monitor their performance regularly, and that is not an easy task. One way to monitor them is through financial ratios, and evaluating the performance of industries is therefore essential.
A support vector machine (SVM) is a training algorithm for learning classification and regression rules from data ([23], [36], [7], [21]). SVM has been applied successfully in many areas, such as face detection ([12], [2], [29], [14]), image classification ([38], [8]), object recognition ([28], [37], [26]), handwritten character and digit recognition ([13], [3], [22], [1], [16]), speaker and speech recognition ([5], [33], [24]), gender classification ([17], [35], [10], [27]) and text classification ([32], [18]). Several recent studies have reported that SVM is capable of higher classification accuracy than other data classification techniques ([11], [34]).
Therefore, the present paper uses SVM as a tool to classify Indian industries and checks whether SVM classifies them better than linear discriminant analysis (LDA). The rest of the paper is organized as follows. Section two deals with the selection of samples, data description and methodology used in the present study. Section three briefly introduces the data analysis techniques, viz., principal component analysis for data reduction, fuzzy clustering for grouping the industries into homogeneous groups, and the SVM classifier. Findings and discussion of the results are presented in section four, and the conclusion in section five.

2. Data and methodology

2.1 Sample selection and data description

The study is analytical in nature and uses the latest available published secondary data, covering April 2007 to March 2012. The units of analysis are the 50 industries listed on Nifty. Thirty-two industries were retained based on the following criteria: i) the industry must be listed on Nifty; ii) the industry must have remained in the Nifty list for a period of one month (1st to 30th September 2012); iii) data on the variables must be available for the industry over the period of study; iv) financial-service-based industries, viz., banks, financial institutions, etc., were excluded. Financial ratios provide a quick and relatively simple means of examining the financial condition of an industry and are very helpful when comparing the financial health of different businesses [19]. Therefore, financial ratios were used to identify the financial performance of Indian industries. After careful examination of previous studies, the 28 most important financial ratios were selected. Financial ratios are obtained from the income statements, balance sheets, cash flow statements, etc., of the industries. For this study, the ratios were extracted from the moneycontrol web page (www.moneycontrol.com) and the required financial ratios were calculated. These ratios were selected to assess profitability, investment value, liquidity, solvency, debt coverage, management efficiency, and profit and loss. The variables (financial ratios) used and their codes are shown in Table 2.1.1.

Financial Ratio	Codes	Financial Ratio	Codes
Operating Profit Per Share	OPPS	Interest Cover	INC
Net Operating Profit Per Share	NOPPS	Total Debt to Owners Fund	TDTOF
Operating Profit Margin	OPM	Financial Charges Coverage Ratio	FCCR
Gross Profit Margin	GPM	Inventory Turnover Ratio	ITR
Cash Profit Margin	CPM	Debtors Turnover Ratio	DTR
Net Profit Margin	NPM	Total Assets Turnover Ratio	TATR
Return On Capital Employed	ROCE	Number of Days In Working Capital	NDWC
Return On Net Worth	RONW	Material Cost Composition	MCC
Return on Assets Including Revaluations	ROAIR	Selling Distribution Cost Composition	SDCC
Return on Long Term Funds	ROLTF	Expenses Total Sales	ETS
Current Ratio	CUR	Dividend Payout Ratio Net Profit	DPRNP
Quick Ratio	QUR	Dividend Payout Ratio Cash Profit	DPRCP
Debt Equity Ratio	DER	Earning Retention Ratio	ERR
Long Term Debt Equity Ratio	LTDER	Cash Earning Retention Ratio	CERR

Kernel functions: Linear, Polynomial, RBF, Sigmoid
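
As an illustration of the four kernel types named above, the following Python sketch writes out the forms most commonly used in SVM software (for example, the LIBSVM parameterization). The parameter names gamma, coef0 and degree, their default values and the sample vectors are assumptions for illustration; the source does not state the exact formulas or parameter values used in the study.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    """K(u, v) = u . v"""
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, coef0=1.0, degree=3):
    """K(u, v) = (gamma * u . v + coef0) ** degree"""
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=1.0):
    """K(u, v) = exp(-gamma * ||u - v||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    """K(u, v) = tanh(gamma * u . v + coef0)"""
    return math.tanh(gamma * dot(u, v) + coef0)


# Hypothetical feature vectors (for example, two component scores per industry).
u, v = [1.0, 2.0], [3.0, 4.0]
for name, kernel in [
    ("linear", linear_kernel),
    ("polynomial", polynomial_kernel),
    ("rbf", rbf_kernel),
    ("sigmoid", sigmoid_kernel),
]:
    print(name, kernel(u, v))
```

The linear kernel is the plain inner product, while the other three add parameters that would normally be tuned on the training data.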

Number of clusters	S(i)	Dunn’s Partition Index	Normalized Dunn’s Partition Index
2	0.31	0.79253	0.75103
3	0.30	0.77579	0.71974
4	0.26	0.74178	0.65571
5	0.16	0.71312	0.56969
6	0.25	0.74510	0.49020
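
For reference, Dunn's partition index of a fuzzy partition is commonly defined as the mean, over all objects, of the sum of squared membership values; it ranges from 1/k (completely fuzzy) to 1 (crisp). The sketch below is a minimal fuzzy c-means implementation in Python that computes this index on hypothetical toy data. The choice of fuzzy c-means, the fuzzifier m = 2 and the data are assumptions for illustration, and the study may have used a different fuzzy clustering procedure.

```python
import random


def fuzzy_c_means(points, k, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Cluster points (lists of floats) into k fuzzy groups.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Start from random memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        u.append([x / total for x in row])
    centers = []
    for _ in range(max_iter):
        # Centers are means weighted by memberships raised to the power m.
        centers = []
        for j in range(k):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum for d in range(dim)]
            )
        # Memberships are inversely related to squared distance from each center.
        change = 0.0
        for i in range(n):
            d2 = [
                max(sum((points[i][d] - c[d]) ** 2 for d in range(dim)), 1e-12)
                for c in centers
            ]
            inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
            inv_sum = sum(inv)
            new_row = [x / inv_sum for x in inv]
            change = max(change, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if change < tol:
            break
    return centers, u


def partition_coefficient(memberships):
    """Mean over points of the sum of squared memberships."""
    n = len(memberships)
    return sum(x * x for row in memberships for x in row) / n


# Hypothetical one-dimensional scores forming two well-separated groups.
points = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
centers, memberships = fuzzy_c_means(points, k=2)
pc = partition_coefficient(memberships)
print(sorted(round(c[0], 2) for c in centers), round(pc, 3))
```

Running the same routine for several values of k and comparing the resulting indices is one way to arrive at a table like the one above.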

Actual	Training: predicted by model			Actual	Testing: predicted by model		
	HIGH	LOW	Total		HIGH	LOW	Total
HIGH	42	2	44	HIGH	16	0	16
LOW	1	67	68	LOW	0	32	32
Total	43	69	112	Total	16	32	48
AR (%)	97.321			AR (%)	100.00		
ER (%)	2.679			ER (%)	0.000		
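
The AR and ER rows can be reproduced from the cell counts, assuming AR is the percentage of correctly classified cases and ER is its complement (the source does not spell out the abbreviations). A short Python check, with the function name `rates` chosen for illustration:

```python
def rates(matrix):
    """Return (accuracy %, error %) for a square confusion matrix.

    matrix[i][j] is the number of cases with actual class i that the
    model assigned to class j.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    accuracy = 100.0 * correct / total
    return accuracy, 100.0 - accuracy


# Rows: actual HIGH, LOW; columns: predicted HIGH, LOW (counts from the table).
training = [[42, 2], [1, 67]]
testing = [[16, 0], [0, 32]]

train_ar, train_er = rates(training)
test_ar, test_er = rates(testing)
print(f"training AR={train_ar:.3f} ER={train_er:.3f}")
print(f"testing  AR={test_ar:.3f} ER={test_er:.3f}")
```

For the training set this gives 109 correct out of 112, that is AR = 97.321 and ER = 2.679; for the testing set all 48 cases are correct, so AR = 100.000 and ER = 0.000.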

References

[1] Aggarwal, A., Rani, R. and Dhir, R., “Recognition of Devanagari handwritten numerals using gradient features and SVM”, International Journal of Computer Applications, Vol. 48, No. 8, pp. 39-44, 2012.
[2] Ai, H., Liang, L. and Xu, G., “Face detection based on template matching and support vector machines”, Proceedings of International Conference on Image Processing, pp. 1006-1009, 2001.
[3] Bezdek, J. C., “Numerical taxonomy with fuzzy sets”, Journal of Mathematical Biology, Vol. 1, pp. 57-71, 1974.
[4] Arora, S., Bhattacharjee, D., Nasipuri, M., Malik, L., Kundu, M. and Basu, D. K., “Performance comparison of SVM and ANN for handwritten Devnagari character recognition”, International Journal of Computer Science Issues, Vol. 7, No. 3, pp. 18-26, 2010.
[5] Chavhan, Y., Dhore, M. L. and Yesaware, P., “Speech emotion recognition using support vector machine”, International Journal of Computer Applications, Vol. 1, pp. 6-9, 2010.
[6] Chen, S. and Ravallion, M., “How have the world’s poorest fared since the early 1980s?”, World Bank Research Observer, Vol. 19, No. 2, pp. 141-170, 2004.
[7] Cristianini, N. and Shawe-Taylor, J., “An introduction to support vector machines”, Cambridge University Press, Cambridge, New York, 2000.
[8] Dhasal, P., Shrivastava, S. S., Gupta, H. and Kumar, P., “An optimized feature selection for image classification based on SVM-ACO”, International Journal of Advanced Computer Research, Vol. 2, No. 5, pp. 123-128, 2012.
[9] Dunn, J. C., “A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters”, Journal of Cybernetics, Vol. 3, pp. 32-57, 1974.
[10] Gaikwad, S., Gawali, B. and Mehrotra, S. C., “Gender identification using SVM with combination of MFCC”, Advance in Computational Research, Vol. 4, Issue 1, pp. 69-73, 2012.
[11] Gokcen, I. and Peng, J., “Comparing linear discriminant analysis and support vector machines”, Advance in Information Systems, Lecture Notes in Computer Science, Vol. 2457, pp. 104-113, 2002.
[12] Guodong, G., Li, S. and Kapluk, C., “Face recognition by support vector machines”, Proceedings of IEEE International Conference on Automatic Face and Gesture Recognition, pp. 196-201, 2000.
[13] Hamdi, R., Bouchareb, F. and Bedda, M., “Handwritten Arabic character recognition based on SVM classifier”, Information and Communication Technologies: From Theory to Applications, 3rd International Conference, Sep. 7-11, Damascus, 2008.
[14] Jia, H. and Martinez, A. M., “Face recognition with occlusions in the training and testing sets”, Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, 2008.
[15] Kaufman, L. and Rousseeuw, P. J., “Finding groups in data: An introduction to cluster analysis”, John Wiley and Sons, New York, 1990.
[16] Kumar, P., Sharma, N. and Rana, A., “Hand written character recognition using different kernel based SVM classifier and MLP neural network (a comparison)”, International Journal of Computer Application, Vol. 53, No. 11, pp. 25-31, 2012.
[17] Lian, H. C. and Lu, B. L., “Multi-view gender classification using multi-resolution local binary patterns and support vector machines”, International Journal of Neural System, Vol. 17, No. 16, pp. 479-487, 2007.
[18] Lewis, D. D., Yang, Y., Rose, T. G. and Li, F., “A new benchmark collection for text categorization research”, Journal of Machine Learning Research, Vol. 5, pp. 361-397, 2004.
[19] Mahmoud, O. H., “A multivariate model for predicting the efficiency of financial performance for property and liability Egyptian insurance companies”, Casualty Actuarial Society, Discussion Paper, 2008.
[20] Manly, B. F. J., “Multivariate statistical methods: A primer”, Chapman and Hall, New York, 1986.
[21] Muller, K. R. and Mika, S., “An introduction to kernel-based learning algorithms”, IEEE Transactions on Neural Networks, Vol. 12, No. 2, pp. 181-201, 2001.
[22] Nasien, D., Haron, H. and Yuhaniz, S. S., “Support vector machine (SVM) for English handwritten character recognition”, Computer Engineering and Applications (ICCEA), Second International Conference, Vol. 1, pp. 249-252, 2010.
[23] Osuna, E., Freund, R. and Girosi, F., “Training support vector machines: An application to face detection”, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 17-19, Puerto Rico, pp. 130-136, 1997.
[24] Pan, Y., Shen, P. and Shen, L., “Speech emotion recognition using support vector machine”, International Journal of Smart Home, Vol. 6, No. 2, pp. 101-108, 2012.
[25] Pearson, K., “On lines and planes of closest fit to a system of points in space”, Philosophical Magazine, Vol. 2, pp. 557-572, 1901.
[26] Petropoulos, P. G., Kalaitzidis, C. and Vadrevu, K. P., “Support vector machines and object-based classification for obtaining land-use/cover cartography from Hyperion hyperspectral imagery”, Computers and Geosciences, Vol. 41, pp. 99-107, 2012.
[27] Ponnarasi, S. S. and Rajaram, M., “Gender classification system derived from fingerprint minutiae extraction”, IJCA Proceedings on International Conference in Recent Trends in Computational Methods, Communication and Controls (ICON3C 2012), pp. 1-6, 2012.
[28] Pontil, M. and Verri, A., “Properties of support vector machines”, Neural Computation, Vol. 10, No. 4, pp. 955-974, 1998.
[29] Romdhani, S., Torr, P. and Scholkopf, B., “Efficient face detection by a cascaded support-vector machine expansion”, Royal Society of London Proceedings, Series A, Vol. 460, pp. 3283-3297, 2004.
[30] Rousseeuw, P., “Fuzzy clustering at the intersection”, Technometrics: A Journal of Statistics for the Physical, Chemical and Engineering Sciences, Vol. 37, pp. 283-286, 1995.
[31] Ruspini, E. H., “A new approach to clustering”, Information and Control, Vol. 15, pp. 22-32, 1969.
[32] Sebastiani, F., “Machine learning in automated text categorization”, ACM Computing Surveys, Vol. 34, No. 1, pp. 1-47, 2002.
[33] Shen, P., Changjun, Z. and Chen, X., “Automatic speech emotion recognition using support vector machine”, Electronic and Mechanical Engineering and Information Technology, International Conference, Aug. 12-14, Vol. 2, pp. 621-625, 2011.
[34] Tsuta, M., Marsy, G. E., Sugiyama, T., Fujita, K. and Sugiyama, J., “Comparison between linear discrimination analysis and support vector machine for detecting pesticide on spinach leaf by hyperspectral imaging with excitation-emission matrix”, European Symposium on Artificial Neural Network - Advance in Computational Intelligence and Learning, Bruges (Belgium), 22-24 April, pp. 337-342, 2009.
[35] Xia, B., Sun, H. and Lu, B. L., “Multi-view gender classification based on local Gabor binary mapping pattern and support vector machines”, IEEE International Joint Conference on Neural Networks, pp. 3388-3395, 2008.
[36] Vapnik, V., “Nature of statistical learning theory”, Springer-Verlag, New York, 1995.
[37] Zhang, J., Marszalek, M., Lazebnik, S. and Schmid, C., “Local features and kernels for classification of texture and object categories: A comprehensive study”, IJCV, Vol. 73, pp. 213-238, 2007.
[38] Zhang, Y. and Wu, L., “Classification of fruits using computer vision and a multiclass support vector machine”, Sensors, Vol. 12, pp. 12489-12505, 2012.
[39] Zadeh, L. A., “Fuzzy sets”, Information and Control, Vol. 8, pp. 338-353, 1965.

Copyright © statperson consultancy www.statperson.com 2013. All Rights Reserved.