Research paper on data mining technology
In the past, associations between different things were discovered through observation, common sense, and simple techniques. Today, growing complexity causes the volume of data to increase dramatically, so these associations are no longer as easy to find or as explicit as they once were. Observation and common sense alone are not an efficient way to analyze large volumes of data. Explaining correlations and predicting future trends in a business can no longer rest on a bare claim that two things are related, or that a particular stimulus triggers a particular response, without any supporting evidence. Consider the case below.
CASE: One Midwest grocery chain found that when men purchased diapers on Thursdays and Saturdays, they also tended to buy beer. Furthermore, these shoppers usually did their weekly grocery shopping on Saturdays; on Thursdays, they purchased only a few items. With this information, the owner can rearrange the displays, for instance by placing the beer racks near the diapers. He can also make sure that both products are always in stock on the days when demand is high.
How did the owner of the store discover this? One possibility is that he patiently observed his customers and memorized what they bought on Thursdays and Saturdays. This is not very plausible: there are seven days in a week and a great many customers, which makes it hard to remember who bought diapers, beer, or both on a specific day. In addition, the store sells far more than diapers and beer. Such an approach would be highly ineffective, and the owner might never have discovered this information at all. What the owner really did was employ data mining.
Aside from the case presented above, data mining may answer business questions such as: How should a new product be introduced to a group of customers? How would the introduction of a product affect clients’ behavior? If such questions can be answered, a company may gain a competitive advantage in the market.
Data mining is defined as the “process of discovering new correlations, patterns, and trends by sifting through large amounts of data in repositories using pattern recognition technologies as well as statistical and mathematical techniques.”
The term “data mining” came into use in the 1990s, but the technology itself is not entirely new. Rather, it evolved from three older disciplines: classical statistics, artificial intelligence, and machine learning. Classical statistics covers regression analysis, standard distribution, standard deviation, variance, discriminant analysis, cluster analysis, and confidence intervals, all of which are used for data analysis and for studying relationships in data. Artificial intelligence, on the other hand, applies human-like reasoning to provide information to the user. Lastly, machine learning consists of computer programs that learn from the data they study.
Data mining has two main kinds of models: the predictive model and the descriptive model. The predictive model, as its name suggests, uses data with known results to develop a model that can be used to explicitly predict values. The descriptive model, on the other hand, describes patterns in existing data. Both kinds of models are abstractions of the data, and both can help users understand business trends, guide decisions, and suggest actions.
There are also two basic reasons why data mining is used: there is either too little information or too much. When there is too little information, adding to it helps the user; when there is too much, reducing it to its vital parts helps as well. Without data mining, achieving either objective would be very difficult.
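To make the distinction concrete, here is a minimal Python sketch. The figures in `history` are made up for illustration, and the function names `describe`, `fit_line`, and `predict` are hypothetical helpers, not part of any data mining product. The descriptive part only summarizes the existing records, while the predictive part fits a least-squares line to records with known results and uses it to estimate an unseen value.

```python
from statistics import mean

# Hypothetical monthly records: (advertising spend, units sold).
history = [(1.0, 12.0), (2.0, 19.0), (3.0, 29.0), (4.0, 37.0), (5.0, 45.0)]


def describe(records):
    """Descriptive model: summarize patterns already present in the data."""
    spend = [x for x, _ in records]
    sales = [y for _, y in records]
    return {
        "mean_spend": mean(spend),
        "mean_sales": mean(sales),
        "min_sales": min(sales),
        "max_sales": max(sales),
    }


def fit_line(records):
    """Predictive model: least-squares line fitted to data with known results."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in records)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Estimate the outcome for a value of x that is not in the records."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    print(describe(history))
    model = fit_line(history)
    print(model, predict(model, 6.0))
```

Under these assumed figures, the fitted line has a slope of about 8.4 and an intercept of about 3.2, so it estimates roughly 53.6 units sold for a spend of 6. A real predictive model would be validated on data that was held back from the fitting step.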
But how does data mining work?
First, there has to be a goal, such as finding how revenue can be increased or how new customers can be obtained. Next, the target data should be assembled, for example the demographic profiles of the customers and their transaction records. Once this is done, data mining techniques can be applied and their results interpreted. One possible problem is that the samples used in the mining may not be representative of the whole population.
One of the several techniques employed in data mining is the association rule, or market basket analysis, which is used to discover connections between the attributes recorded in a database. This technique uses the frequency counts of a certain item or event and measures how strongly another item or event is related to it. A supermarket, for example, can apply association rules to its sales database to find out that a customer who buys garlic usually buys onions as well.
Another technique in data mining is the use of decision trees. This method classifies data using tree models, branching out until it reaches the leaves, which are the class labels. To reach these leaves, the given data passes through a series of tests. This method is therefore a form of directed knowledge discovery, because it is used to determine the value of a specific target field.
Clustering, on the other hand, is a technique that also groups data but is considered undirected knowledge discovery. “That is, there is no target field, and the relationship among the data is identified by bottom-up approach.”1
The last technique is the neural network, which consists of layers of interconnected processors. The processor nodes, called neurodes, have weighted connections to a number of other nodes in adjacent layers, and it is through these connections that input from other layers is received. Output values are computed from these inputs and the connection weights.
So just think about the many uses of data mining. Isn’t it better to have this kind of technology?
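The frequency counting behind association rules can be sketched in a few lines of Python. This is only an illustration: the baskets below are invented, and `support`, `confidence`, `lift`, and `frequent_pairs` are hypothetical helper names. Support is the share of transactions that contain an item set, confidence is how often the consequent appears in transactions containing the antecedent, and lift compares that confidence with how often the consequent appears overall.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets; each set is one customer's transaction.
baskets = [
    {"garlic", "onion", "rice"},
    {"garlic", "onion"},
    {"garlic", "onion", "beer"},
    {"garlic", "bread"},
    {"onion", "bread"},
    {"beer", "diapers"},
    {"beer", "diapers", "bread"},
    {"rice", "bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions with the antecedent, the share that also has the consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


def lift(antecedent, consequent, transactions):
    """Confidence relative to the overall frequency of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


def frequent_pairs(transactions, min_support):
    """All item pairs whose support is at least min_support."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


if __name__ == "__main__":
    print(frequent_pairs(baskets, 0.25))
    print(confidence({"garlic"}, {"onion"}, baskets))
    print(lift({"garlic"}, {"onion"}, baskets))
```

In this made-up data, garlic implies onions with a confidence of 0.75 and a lift of 1.5, meaning onions are 1.5 times as likely in baskets containing garlic as in baskets overall. Counting every pair directly is fine for a small example; larger databases usually need an algorithm that prunes infrequent item sets early.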
I. Benefits of Data Mining
The use of data mining has given its users a competitive advantage. In business, data mining can make your company more profitable by bringing your business intelligence to the next level.
According to IBM, "Data mining offers firms in many industries the ability to discover hidden patterns in their data, patterns that can help them understand customer behavior and market trends. The advent of parallel processing and new software technology enable customers to capitalize on the benefits of data mining more effectively than had been possible previously."
Technically, data mining carries out the long and tedious mathematical and statistical processes on data that can be acquired simply by observing, and translates this data into meaningful ideas that can help increase your profit.
Let’s take a retail store owner named Sir Plus. Through data mining of his store’s data, he found that there is low demand for a certain product. He also found that another product has perfectly inelastic demand; that is, consumers will buy that product even if its price increases by a large amount. As a good businessman, he would surely want to maximize his profit. He can do this only if he maintains the same number of buyers for the second product or, better, increases the demand for both products. Bundling is one good way to do this. Bundling is a business strategy in which you sell two or more products together at a price lower than the total price of the products bought individually. When he bundles these two products, he can increase the demand for the slow-selling product, since buyers of the product with perfectly inelastic demand would also consider buying the bundle: they feel that they might need the other product as well and that they are saving money. In that case, both the consumers and Sir Plus will be very happy. Now imagine that he had not used data mining. He would not have considered bundling those products, so his profit would have stayed at the same level as before. He might even incur a loss on the slow-selling product, especially if it is a perishable good.
This is just one of the benefits data mining can give your business, and there are many more. Here are some of those benefits:
• Provides insight into hidden patterns and relationships in your data
In the example above, data mining helped Sir Plus uncover a pattern in his consumers’ behavior. Another example is when two seemingly unrelated goods turn out to be related (for example, consumers buy the two products together, like ice cream and facial scrub).
• Enables you to exploit these correlations to improve organizational performance
Sir Plus used bundling to increase profit. In the other example, the management can consider placing facial scrubs near the ice cream containers, which would help customers save time when shopping. This is probably the reason why men’s clothing is usually placed on the first floor and ladies’ clothing on the second or a higher floor: men tend not to spend much time shopping, so the department store makes their section more accessible.
• Provides indicators of future performance
Data mining helps companies uncover customer behavior and market trends from past data. In a retail store, it helps the management identify seasonal products, how many units to order, and when these products will sell.
• Enables embedding of recommendations in your applications
Data mining results allow you to display simple summary statements and recommendations within operational applications.
• Enables you to take full advantage of a range of data mining algorithms
A large number of algorithms are available. Data mining tools help you choose the combination of algorithms that suits your business needs.
Here is a list, presented by Data-Mining-Software.net, of specific industries and how data mining helps them:
Retail / Marketing
• Identify buying behavior patterns from customers.
• Find associations among customer demographic characteristics.
• Predict which customers will respond to mailing.
Banking
• Detect patterns of fraudulent credit card usage.
• Identify "loyal" customers.
• Predict customers that are likely to change their credit card affiliation.
• Determine credit card spending by customer groups.
• Find hidden correlations between different financial indicators.
• Identify stock trading rules from historical market data.
Insurance and Health Care
• Claims analysis: determine which medical procedures are claimed together.
• Predict which customers will buy new policies.
• Identify behavior patterns of risky customers.
• Identify fraudulent behavior.
Transportation
• Determine the distribution schedules among outlets.
• Analyze loading patterns.
Medicine
• Characterize patient behavior to predict office visits.
• Identify successful medical therapies for different illnesses.
II. Real Stories
Back in 2003, a group of engineers founded Paperless Trail Inc. (PTI), a privately held company that aimed to make corporate data accessible to business managers to help them make better and more efficient decisions.
After three years of extensive research, Paperless Trail Inc. launched its flagship data mining product suite, called Perspective. This product gave clients access to corporate data, essentially providing a service for mining data electronically. Some of the most prominent clients of Paperless Trail Inc. include multinational companies like Kimberly-Clark, Unilever, and Holcim, along with domestic corporations including Bancnet, LBC, and Jollibee Foods Corporation.
Paperless Trail Inc. made available information on about 7,000 companies in the Philippines from various industries. This data mining solution works by gathering corporate data and integrating geographic information, thus generating reliable and comprehensive information that clients can use to analyze the efficiency of business processes, particularly those of the product distribution division. Users can investigate further by drilling down into the desired areas to view sales and get details of the team handling each area. System requirements for Perspective Professional include a Pentium 4 or higher, Athlon, or Athlon 64 processor; 256 MB RAM; 64 MB video memory; and 2 GB of free hard disk space.
Perspective is also equipped with a Geographic Information System (GIS), a data visualization tool that displays data in map or tabular form to facilitate fast retrieval of information and easy understanding. According to their official product information release, “Perspective Plus provides you with a powerful way to analyze and visualize your data through themes present in the map. Themes present data visually using shades of color, fill patterns, or symbols that makes data evaluation and sales assessment effortless.” Using this visual tool, clients are able to assess the status of their business and evaluate overall efficiency.
In a testimonial released by Kimberly-Clark, Philippines, they said, “We at Kimberly-Clark have a passion for finding new ways to improve the health, hygiene and well-being of people's lives everyday. Through Paperless Trail's Perspective GIS, we are able to reach our customers far and wide more effectively and efficiently. We were able to plot potential and existing customers thus allowing us to manage our supplies well. Proper route planning gave way to faster and easier delivery of consumers' needs. Through the help of Perspective GIS, our overall sales process has improved and in fact, sales figures in the province of Cebu doubled.”
In a case study on Bancnet’s use of Perspective GIS, it was evident that this data mining solution enabled Bancnet to position its ATMs efficiently in strategic areas to better serve the public. Bancnet was able to keep track of the performance of its ATMs nationwide by region, by city, by time, or by day, and in the process to assess and predict specific areas of opportunity and isolate areas that required attention.
This data mining solution is leased to clients at Php 2,000.00 per month. Adopting this solution makes not only data analysis but also data storage more efficient, because Paperless Trail Inc. offers services to digitize all relevant data, thus cutting the cost of reproducing hard copies and freeing up the company’s storage space. "There is an estimate that a document is being photocopied almost seven times in its life. You get rid of this cost when you convert this into digitized format, plus you save up real estate," stated Paperless Trail President Peter N. Morrison in an interview with BusinessWorld.
Data mining has opened up new possibilities and insights in determining relationships within huge piles of data. With its help, we are discovering patterns and essential information that can be used by the corporations that have access to this type of technology. The technology greatly increases what is known about consumers, giving businesses a more vivid picture of what is going on in the consumer’s head. Given huge amounts of data, non-obvious but essential relationships can be seen, and these may prove very important to businesses, corporations, and other sectors. Data mining, especially predictive data mining, grants us a window into the future, which could prove very useful in business applications. Through data mining, we are not only discovering new relationships amid tons of data but also opening new doors in the discovery of business information and intelligence.
But even though data mining benefits from the lower cost of machine learning and can analyze a volume of information beyond the capabilities of human analysis, it still raises many issues and disadvantages regarding its use, what it means for people, and the requirements for operating the technology.
One of the issues raised is the cost that comes with it. Though hardware costs have decreased dramatically, data mining and data warehousing are self-reinforcing: the more useful the information that mining produces, the greater the pressure to collect and store more data, which in turn calls for larger and faster systems that cost more.
Another issue concerns the technical side. It is hotly debated whether a relational database structure or a multidimensional one is better suited to data mining. In a relational database, data is stored in tables, allowing ad hoc queries. In a multidimensional structure, on the other hand, sets of cubes are arranged in arrays, with subsets created according to category. Though multidimensional structures allow for multidimensional data mining, relational databases have performed very well in client/server environments. With the enormous success of the internet, we are looking at a world that is turning into one big client/server environment.
Another issue is the loss of flexibility and creativity in the use of data. The technology could make us dependent on the results and patterns it gives, leaving no room for other speculations and interpretations. People must understand that data mining, though powerful, is of little use if those using it do not have a thorough knowledge of the data they are processing. An individual must still be able to comprehend the data and information to be processed in order to know what to look for.
Another issue of data mining is the rapid increase of information in the database.
There is a risk that the data will grow so large that it becomes very challenging to keep track of everything entered in the database. The technology will also have to be improved further to cope with this rapid growth.
The need for a consolidated, “de-duplicated,” and cleaned data store from which the data is drawn may be considered a noteworthy disadvantage of data mining. In building models, 70-80% of the work goes into cleaning the data stores, which makes the process very time-consuming.
We should also address the integrity of the data. In essence, the technology is only as good as the data it contains: the powerful ability of data mining to analyze huge amounts of data is only as good as the data it analyzes to begin with. Given huge amounts of data, the technology will most likely confront conflicting and redundant data, and it must resolve this redundancy and conflict in order to produce good results. A related issue is the authentication of the data. Without a system to verify the data stored, the authenticity and reliability of the technology’s analysis come into question.
There are also many security issues attached to the use of data mining technology. At the moment, many companies do not have sufficient security systems to safeguard the information they hold. Even though companies have a great deal of information about us that is readily available online, they lack the security measures to protect it.
On the customer side, the disadvantage is the misuse of the information the technology has obtained from customers.
For example, if a client buys frequently from a store, the technology could uncover this through analysis of the client’s purchase history, signaling that the client must be given extra attention ahead of other customers who ordered first. The technology would thus create a bias toward frequent buyers instead of serving everybody in order.
But the disadvantage that stands out above all the issues discussed so far is the set of ethical and privacy issues attached to the use of data mining technology. The technology could uncover information that compromises confidentiality and privacy obligations. This is most clearly felt when the analysis can pinpoint and identify specific individuals contained in the data. Anonymity will disappear. The patterns that data mining uncovers could reveal an individual’s habits and preferences, which ultimately violates the individual’s confidentiality and privacy, especially if the data was supposed to be anonymous to begin with. Data mining’s capability of identifying patterns could also easily reveal details of individuals’ business routines, which could be exploited by those who have this kind of technology. Ethics and morality are clearly in danger with the implementation of this technology, and the information it uncovers will most likely be abused, misused, and exploited if the technology is left unchecked.
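How anonymity can disappear is easy to illustrate. The sketch below, in Python, uses invented records that carry no names, and the helper names `group_sizes` and `unique_records` are hypothetical. It counts how many records share each combination of general attributes; any record whose combination is unique can be singled out by someone who already knows those attributes about a person.

```python
from collections import Counter

# Hypothetical "anonymous" purchase records: no names, only general attributes.
records = [
    {"zip": "6000", "birth_year": 1980, "sex": "F", "item": "vitamins"},
    {"zip": "6000", "birth_year": 1980, "sex": "F", "item": "rice"},
    {"zip": "6000", "birth_year": 1975, "sex": "M", "item": "beer"},
    {"zip": "6015", "birth_year": 1980, "sex": "F", "item": "diapers"},
    {"zip": "6015", "birth_year": 1962, "sex": "M", "item": "garlic"},
    {"zip": "6015", "birth_year": 1962, "sex": "M", "item": "onions"},
]


def group_sizes(rows, attributes):
    """Count how many records share each combination of the given attributes."""
    return Counter(tuple(row[a] for a in attributes) for row in rows)


def unique_records(rows, attributes):
    """Records whose attribute combination appears only once in the data."""
    sizes = group_sizes(rows, attributes)
    return [row for row in rows if sizes[tuple(row[a] for a in attributes)] == 1]


if __name__ == "__main__":
    # With one attribute, every record hides in a group of three.
    print(group_sizes(records, ["zip"]))
    # With three attributes combined, some records stand alone.
    for row in unique_records(records, ["zip", "birth_year", "sex"]):
        print(row)
```

In this invented data set, no record is unique by zip code alone, but two of the six become unique once zip code, birth year, and sex are combined. The purchases attached to those two records are then effectively no longer anonymous.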
The power of data mining might prove to be too great for the present period. In layman’s terms, data mining has the capability to find the needle in a very large haystack. The valuable information that it uncovers through the discovery of patterns in the data will most likely raise questions of ethics and morality toward consumers and customers. Privacy would be in danger if the technology were able to identify specific individuals based on patterns that emerge within the data. The technology’s predictions and powerful analysis would most likely be abused and exploited to gain a competitive advantage in business transactions, which might upset the current state of the business world.
IV. Users of Data Mining in the Philippines
I. Non-Government Organizations
A. Center for Applied Biodiversity Science
The Center for Applied Biodiversity Science (CABS), under Conservation International, aims to guide nature conservation by bringing together science and action. A project under its Marine Management Area Program mines multiple data sets to determine the effectiveness of marine managed areas (MMAs) and other matters pertaining to their maintenance. Danajon Bank in Bohol is the focus of this project.
B. Taytay sa Kauswagan
Taytay sa Kauswagan, Inc. (TSKI) was named by PCFC the Most Outstanding Microfinance Institution in the Philippines in 2005. TSKI worked with dB Wizards on the deployment of the HR* Wizard to provide timely and cost-effective human resource services to its employees. HR* Wizard is powered by Microsoft SQL Server 2005.
II. Business Organizations
A. ABS-CBN Interactive
ABS-CBN Interactive (ABSi) is a media and interactive subsidiary of ABS-CBN. ABSi worked with dB Wizards, the first Microsoft Gold Certified Partner for Business Intelligence in the Philippines, to stay in tune with the market for downloadable mobile phone content such as custom ring tones. Microsoft SQL Server 2000 was used in this endeavor, dubbed Project 88.
B. ChemSynergy Asia
ChemSynergy Asia, a distributor of oleochemicals, is based in the Philippines. It deployed Microsoft Dynamics NAV 5.0 with Microsoft Dynamics Sure Step upon the advice of Microsoft Gold Certified Partner Raffles Solutions in order to improve the management of inventory and accounts. Microsoft SQL Server 2005 was used as the primary database of this solution.
III. Government Organizations
A. Bureau of Internal Revenue
The Bureau of Internal Revenue (BIR) used SAS/Warehouse Administrator and SAS Information Delivery Portal to improve tax collection processes. SAS solutions were initially implemented in the auditing and enforcement areas of the Tax Administration Program.
B. Marikina City Local Government
Marikina City used Salveo for Public Health System (SPHS), recommended by stag Philippines, to improve health services for its constituents. SPHS uses Microsoft products such as Microsoft SQL Server 2005 and Windows XP.
Like ABSi and TSKI, De La Salle Lipa (DLSL) in Batangas turned to dB Wizards. To accelerate the enrollment process, DLSL deployed Microsoft ASP.NET 2.0 and Internet Information Services 6.0. This solution is powered by Microsoft SQL Server 2005.
V. Providers of Data Mining in the Philippines
I. ACRE, Inc.
Asia-Pacific Centre for Research (ACRE), Inc. is a 100% Filipino corporation founded on August 23, 1989. In the Philippines, ACRE manages and distributes SPSS products. SPSS (Statistical Package for the Social Sciences) is a worldwide leader in predictive analytics technologies across various industries.
II. SAS Philippines
SAS is a privately held software company founded in 1976. It is a leader in the field of business analytics software and services. SAS Philippines conducts training in data mining using SAS software.
III. Syntactics, Inc.
Syntactics, Inc. offers services such as accounting software development and data mining. The company is proficient in technologies like PHP, MySQL, PostgreSQL, and MS SQL. It is based in Cagayan de Oro, Philippines.
VI. SWOT Analysis
Taking a more in-depth look at data mining technology, we can see many qualities that set it apart from other technologies of our time. In terms of functionality, its main purpose is to discover relationships, whether obvious or non-obvious, in massive amounts of data. In this sense, data mining can be likened to searching for precious ore in a very big mountain. Its capabilities enable users, especially those in the business sector, to discover patterns and “predict” future outcomes using existing data. These qualities make data mining an essential tool for discovering consumer patterns and constructing effective business strategies.
Though it shows great promise, there are still a few things to consider in using data mining technology. Data mining creates a whole new world of opportunities for discovering essential information, giving us a more vivid picture of consumer behavior and enabling less obvious applications such as discovering crime patterns. But storage costs and the cleaning of data must be addressed for it to produce accurate results. Redundant and conflicting data, as mentioned before, must also be taken care of. A system must be established for authenticating the data being processed, and there must be a solution for managing the rapid growth of that data.
Externally, feelings toward this fairly recent technology have been mixed because of the ethical issues surrounding it. With data mining, there is a fair possibility of identifying specific individuals in anonymous data, and of unearthing information that was not meant to be seen. Finally, security around the use of the technology must be improved, both to protect the data and to keep users from violating legal and privacy obligations. These privacy issues must be addressed immediately to repair the bad reputation this technology is beginning to acquire and to reduce any threat of its being rejected by the general public.
VII. Writ of Habeas Data
“In matters of truth and justice, there is no difference between large and small problems, for issues concerning the treatment of people are all the same.” – Albert Einstein
It has been nine years since the dawn of the twenty-first century, and in that short period of time, technology has rebuilt the Tower of Babel. The progress of technology has transcended the oceans, connecting the world in a place called cyberspace. The wealth of knowledge available in cyberspace enables everyone to know what is happening all over the world. Yet even in a connected world, where human rights are defended and violations are published and condemned, some things remain a mystery. In the Philippines, there are still headlines that are never published.
What is newsworthy these days? Is it the flashy outcries during the Senate inquiries or the riveting reports on Bebe Gandanghari’s sexuality? Underneath the political chest-thumping and media circus, there is a specter in the minds of the people: the extralegal killings and enforced disappearances of our compatriots. This specter is not just a physical injury against another human being. It is not just kidnapping, assault, and murder, but a series of continued and conscious acts committed against the nobility of our national soul.
The silent outcries of families over the sudden loss and dubious deaths of their loved ones were heard when Chief Justice Reynato Puno and his court put the Writ of Habeas Data into effect on February 2, 2008. The Writ of Habeas Data is a remedy for every Filipino whose right to privacy in life, liberty, and security is compromised by data gathered by individuals, whether public or private. According to Chief Justice Reynato Puno:
“This writ entitles the families of disappeared persons to know the totality of circumstances surrounding the fates of their relatives and imposes an obligation of investigation on the part of government. This writ is particularly crucial in cases of political disappearances, which frequently imply secret executions of detainees without trial, followed by the concealment of the bodies for the purpose of erasing all material traces of the crime and securing impunity for the perpetrators”
The writ is, in essence, a right to truth, encompassing the full disclosure of the data gathered on an individual and its purpose. In addition, any and all data gathered on an individual must be updated, rectified, or destroyed if the data compromise his right to life, liberty, and security.
Mr. Paul Rodriguez of Bayan Visayas is listed in the order of battle as a member of the New People’s Army and is considered an enemy of the state, along with other militant leaders such as Jaime Paglinawan, Ramon Patriarca, Sergio Repuela, and Demetrio Carnece. Considering the prevalence of enforced disappearances and extralegal killings in the Philippines and the recent arrest of Ramon Patriarca, being listed in the order of battle and considered an enemy of the state is a threat to one’s right to life, liberty, and security. Mr. Rodriguez insisted that this is all based on erroneous and untrue data. He has filed a petition for a writ of habeas data to have his name removed from the list.
The importance of the writ of habeas data to an individual is noted by Chief Justice Reynato Puno. According to Chief Justice Puno: “The exercise of this right is particularly crucial in disappearances driven by politics because they usually involve secret execution of detainees without any trial, followed by the concealment of the body with the purpose of erasing all material traces of the crime and securing impunity for the perpetrators. Indeed, truth is the bedrock of all legal systems, whether the system follows the common law tradition or the civil tradition. Justice that is not rooted in truth is injustice in disguise. That kind of justice will not stand the test of time, for it is not anchored in reality but on mere images.”
The writ of habeas data is a protection against gathered data that could compromise a Filipino’s basic right to life, liberty, and security by infringing on his right to privacy and truth.
However, according to Police Director Jefferson Soriano of the PNP Directorate for Investigation and Detective Management, the writ of habeas data, with its lack of safeguards against unjustified and malicious suits, will simply reduce the efficiency of the PNP’s investigative operations and increase the deliberate and malicious harassment of police officers by individuals wanting to stall investigations.
Jose “Joey” De Venecia III, the whistleblower in the $329.48-million National Broadband Network (NBN) project, filed a petition for a writ of habeas data against Senate President Juan Ponce Enrile, who had allegedly wiretapped his phone conversations without authorization, thus threatening his right to privacy. But the court dismissed his suit because:
“the meat of petitioner’s cause of action against respondents rests on the charge of wiretapping of his personal and private communications, particularly telephone conversations. In our best light, however, we see the allegation totally unsupported by evidence and brimming with petitioner’s self-serving and unsubstantiated imputations, based on hearsay and general assumptions [and] the claim of ‘reliable information’ being the basis of his allegation is, at best, hearsay and hence inadmissible.”
The writ of habeas data is not an automatic defense against data mining; if it were, misuse and abuse of the writ would be effortless and prevalent. The petitioner must prove the foreseeable threat to his right to privacy and his right to life, liberty and security, just as the respondent must give proof of his right to truth, his right to information, and the absence of any infringement of the petitioner’s right to privacy.
A point raised both for and against the writ of habeas data is its impact on journalism. Investigative journalism gathers information from official, unofficial, and sometimes illegal sources. At a time when corruption is prevalent in the Philippines, lifestyle checks by journalists are not unusual. Politicians could abuse the writ of habeas data by claiming that the data are erroneous and demanding that the investigation be stopped and the gathered data retracted, suppressed and destroyed, with the burden of proof on the shoulders of the journalists. But, according to Harry Roque of the University of the Philippines, the writ of habeas data supports the right to information, and “journalists, or any individual for that matter, can now compel government agencies to release data that otherwise would not have been readily available.” This is especially so since the Supreme Court upholds the defense that lifestyle checks on politicians are a matter of public concern and thus support the right to information and truth.
At the end of the day, the writ of habeas data is like a gun in matters of truth and justice for Filipinos: it can be honorably used or maliciously abused. The writ has already been fully implemented, and the examples above suggest that the line between malicious abuse and just use lies in the level of corruptibility and intellect of the members of the Supreme Court.
Data mining is indeed of great help. Across its uses in different fields of study, patterns are more easily recognized and accessed, and future trends can be predicted. Above all, its applications help minimize business costs while maximizing revenues. Getting the needed information is more efficient than with traditional methods. This is illustrated by the extensive use of Perspective, a data mining suite, by many large domestic and multinational companies in the Philippines, such as Jollibee Foods Corporation and Unilever. The suite provides comprehensive information about business processes, especially in product distribution.
Aside from the private sector, the government also uses data mining, in tax collection and in improving education.
However, data mining raises several issues. One is its cost. Another is that it can reduce flexibility and creativity by making users dependent on it, discouraging them from making other interpretations. Storage is also a concern: the technology depends on having data to interpret, and since data grows exponentially, the capacity to store it is in question.
But the most talked-about issue is privacy. Since customers have a right to anonymity, a company cannot disclose information about them by any means without their consent. When a company uses data mining, however, it can learn information about particular customers that should have remained unknown in the first place, and that information may consequently be exploited.
Sources: Data Mining in the Philippines. 22 March 2009 <http://science.conservation.org/portal/server.pt?open=512&objID=632&&PageID=127611&mode=2&in_hi_userid=127745&cached=true>
ABS-CBN Interactive Doubles Response Rates with SQL Server 2005 Data Mining. 22 March 2009 <http://www.wizardsgroup.com/Media/Pages/ABSCBN.aspx>
Mission Accomplished! The Philippines Bureau of Internal Revenue Reduces Federal Deficit, Improves Tax Collection Processes with SAS, Realizing 400 Percent ROI. 22 March 2009 <http://www.sas.com/success/philippinesbir.html>
City of Marikina: Municipal City Turns to Custom Solution to Improve Public Healthcare Services. 17 January 2007. 22 March 2009 <http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=200921>
ChemSynergy Asia: Chemicals Trader Roll Outs Unified System in Four Weeks with Effective Methodology. 22 April 2008. 22 March 2009 <http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000001844>
Taytay Sa Kauswagan: Microfinance Firm Automates HR Processes with Integrated Data Management Tool. 01 March 2009. 22 March 2009 <http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000003803>
De La Salle Lipa: School Improves Student Enrolment with Integrated Web-based System. 14 July 2008. 22 March 2009 <http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000002397>
Paperless Trail Inc. 22 March 2009 <http://www.paperlesstrail.net/?p=2>
Perspective GIS Community Website – GIS Solutions for SME in the Philippines. 22 March 2009 <http://www.perspective-gis.com/index.html>
Firm Ventures into Offering Data Mining Solution. By Maricel E. Estavillo, Reporter. 22 March 2009 <http://itmatters.com.ph/news2006.php?id=021406a>
A Straight Shooter's Guide to Data Mining. 21 March 2009 <http://www.data-mining-software.net/Data-Mining-Business-Benefits.shtm>
Pentaho Data Mining. 21 March 2009 <http://www.pentaho.com/products/data_mining/>