Using SAS Enterprise Miner to Examine General Education Issues
| Content Provider | Semantic Scholar |
|---|---|
| Author | Cerrito, Patricia B. |
| Copyright Year | 2006 |
| Abstract | General education courses are required to give students a broad background in a variety of disciplines. However, outside of freshman English and possibly history, many different courses are available to satisfy these requirements. It is rare to examine how students navigate through all of the available course offerings to complete general education. All student enrollment data from 2000-2004 were made available for data mining along with student demographics, including ACT scores. With 22,000 students enrolled in any given semester, averaging 15 credit hours each, the database is extensive. Sequential market basket analysis was used to examine the pathways taken by students through general education and to determine which pathways are more successful than others. SAS Text Miner is also used to investigate general education pathways. Enterprise Miner is combined with SAS/STAT to drill down into the data. Results show that course choice does impact student success. INTRODUCTION The number of different combinations of courses that are possible for undergraduate majors is almost unlimited. In order to examine student pathways to success, it is necessary to examine all combinations. The problem is then very similar to that of examining customer purchases when there are thousands of possible purchase items and purchase combinations. However, unlike customer purchases, there is a final outcome variable for students, with success measured by grades and by graduation. When there are thousands of possible combinations available for student choice, transaction counts become so small that the results often become highly questionable, even though choice of major does tend to restrict some combinations. Attempts have been made to address the issue through the development of clusters that can be used to reduce the number of possible course combinations to be examined. However, any time data compression occurs, information is lost. 
Another means of reducing the problem is to reduce the data to a subset of items. This allows intense examination of items within subsets, but does not provide information on how items are related across subsets. Market basket analysis can be combined with other data mining and statistical tools to drill down into student preferences and examine pathways of choice. Sequential path analysis will be used since students enroll in courses sequentially by semester. The purpose of this paper is to discuss an application where student preference can lead to success or failure in outcomes. The first step is to examine the nature of student preferences given choice of major. The initial focus will be on general education courses that are required of all students at the University of Louisville. These include English, History, and Communications. Students are also required to take a mathematics course, although many students must first enroll in remedial mathematics before advancing to the level of general education. Another technique used for compression of course combinations will be text analysis. A text string containing all courses taken will be created for each student, and those text strings will be clustered using SAS Text Miner. The SAS code needed to create the text strings is given in the appendix. It will be shown that students taking specific mathematics general education courses have a greater likelihood of graduating than students taking different courses. Certain pathways are more likely to lead to success. METHOD The dataset studied was the admissions file for all students for the years 2000-2004. All course enrollment information was included in the dataset along with certain admissions information, including ACT and SAT scores. The observational unit was the individual student course enrollment. Therefore, each student is listed in the database for each semester and each enrollment during the study period. 
For 22,000 students enrolled in a given year, the database contained information on 54,000+ students and 780,000+ enrollments. The department was recorded in one field with the course number in a second field. The two fields were concatenated with an underscore between the department name and course number. In that way, the course identity can be maintained while using sequential path analysis in Enterprise Miner, version 5. The SAS function used to combine these two fields, new in Version 9, is CATX('_', course name, course number). The first argument gives the character used to separate the concatenated items that follow. RESULTS A sequencing variable was added to the enrollment dataset, limited to general education courses defined with numbers 100-200. The sequencing was used to investigate the pathways students take to complete their general education requirements and the order in which the courses are taken. A sequential link analysis is given in Figure 1. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://analytics.ncsu.edu/sesug/2006/ST02_06.PDF |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
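
The data preparation described in the abstract (joining department and course number with an underscore, ordering each student's enrollments by semester, and building one text string of courses per student) can be illustrated with a small Python sketch. This is not the paper's SAS code, which is in its appendix and not reproduced here. The `catx` helper only approximates the behaviour of SAS `CATX`, and the sample records, the field layout, the function names, and the 100-299 course-number filter are all assumptions made for illustration.

```python
from collections import defaultdict


def catx(sep, *items):
    """Rough analogue of SAS CATX: trim each item, skip blanks, join with sep."""
    parts = [str(x).strip() for x in items if x is not None and str(x).strip()]
    return sep.join(parts)


def is_gen_ed(course_number):
    # Assumption: general education courses are those numbered 100 through 299.
    return 100 <= int(course_number) <= 299


def build_pathways(records):
    """Return {student: [(semester, course_id), ...]} ordered by semester."""
    by_student = defaultdict(list)
    for student, semester, dept, number in records:
        if is_gen_ed(number):
            by_student[student].append((semester, catx("_", dept, number)))
    # Sorting the tuples orders by semester first, then by course identifier.
    return {student: sorted(items) for student, items in by_student.items()}


def course_strings(pathways):
    """Build one space-separated string of course identifiers per student."""
    return {
        student: " ".join(course for _, course in items)
        for student, items in pathways.items()
    }


# Hypothetical enrollment records: (student id, semester index, department, number).
enrollments = [
    ("s1", 2, "MATH", "111"),
    ("s1", 1, "ENGL", "101"),
    ("s1", 1, "HIST", "101"),
    ("s2", 1, "MATH", "105"),
    ("s2", 2, "ENGL ", "102"),  # trailing blank is trimmed, as CATX would do
    ("s2", 3, "BIOL", "350"),   # outside the assumed general education range
]

pathways = build_pathways(enrollments)
strings = course_strings(pathways)
```

Keeping the semester index beside each course identifier preserves the ordering needed for a sequence analysis, while the joined strings are the kind of per-student input a text clustering step could take.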