Never too old to learn

Ting Sa

Contact Info: E-mail: sa.2@wright.edu

 

 

I was born in Shanghai, a beautiful and fast-growing city in China. I received my B.S. degree in computer science from the Department of Computer Science and Engineering, Shanghai University of Electric Power, in 2005. I came to the U.S. in 2006 for graduate study in computer science and received my master's degree in August 2008 from the Department of Computer Science, Wright State University. In 2010, I began a master's program in Applied Statistics in the Department of Mathematics and Statistics, Wright State University.

I am interested in work related to statistical data analysis, data mining, SAS programming, and data analysis software development. I am currently looking for full-time or internship positions in these areas.

My resume can be found [here].

 

On this webpage, you can also find other information by clicking the links below:  Research Papers   Statistical & Data Mining Projects   Programming Projects   Courses Taken @ WSU   TA Work

Research Papers [back to top]

Title:   Analyzing and Tracking Weblog Communities Using Discriminative Collection Representatives, appeared in SBP10. [Paper][Slides]

Abstract: Analyzing/tracking weblogs by given communities (ATWC) is increasingly important for sociologists and government agencies, etc. This paper introduces an approach to address the needs of ATWC by using concise discriminative weblog collection representatives (DCRs), which are constructed from large collections of blogs by communities of interest. DCRs are aimed at helping users to quickly identify the major themes/trends in such collections, and to quickly identify important shifts/differences in major themes and trends of blogs by given communities over time and space. We propose to use the quality of DCR-based classifiers to measure DCRs' quality. We present algorithms for constructing DCRs, report experimental results to evaluate the efficiency of the algorithms and the quality of the DCRs they construct, and provide real-data examples to demonstrate the usefulness of DCRs for ATWC.

Title:   Object Similarity through Correlated Third-Party Objects, OhioLINK ETD, 2008. [Paper] [Slides][Video Demo]

Abstract: Given a pair of objects, it is of interest to know how they are related to each other and the strength of their similarity. Many previous studies focused on two types of similarity measures: The first type is based on closeness of attribute values of two given objects, and the second type is based on how often the two objects co-occur in transactions/tuples. In this thesis we study a new "behavior-based" similarity measure, which evaluates similarity between two objects by considering how similar their correlated "third-party" object sets are. Behavior-based similarity can help us find pairs of objects that have similar external functions but do not have very similar attribute values or do not co-occur quite often. After introducing and formalizing behavior-based similarity, we give an algorithm to mine pairs of similar objects under this measure. We demonstrate the usefulness of our algorithm and this measure using experiments on several news and medical datasets.

Screenshots:
(1) Loading the dataset; a progress bar indicates the loading progress.
(2) A progress monitor showing the mining of similar object pairs in real time.
(3) The results are displayed in a Java table; all columns can be sorted and extended.
(4) The final results are also automatically generated as an HTML file.
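To make the idea of behavior-based similarity from the abstract above more concrete, here is a minimal Python sketch. It is not the algorithm from the thesis: the co-occurrence threshold `min_cooccur`, the use of Jaccard overlap as the set-similarity measure, and the toy transactions are all assumptions chosen for illustration.

```python
from itertools import combinations

def correlated_sets(transactions, min_cooccur=2):
    """Map each object to the set of "third-party" objects it co-occurs
    with in at least min_cooccur transactions (an assumed definition of
    "correlated")."""
    counts = {}
    for t in transactions:
        for a, b in combinations(sorted(set(t)), 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    related = {}
    for (a, b), n in counts.items():
        if n >= min_cooccur:
            related.setdefault(a, set()).add(b)
            related.setdefault(b, set()).add(a)
    return related

def behavior_similarity(x, y, related):
    """Jaccard overlap of the correlated sets of x and y, ignoring x and y
    themselves so that only third-party objects are compared."""
    sx = related.get(x, set()) - {x, y}
    sy = related.get(y, set()) - {x, y}
    union = sx | sy
    return len(sx & sy) / len(union) if union else 0.0

# Toy data (made up): "A" and "B" never appear together,
# but both appear with "P" and "Q".
transactions = [
    ["A", "P", "Q"],
    ["A", "P", "Q"],
    ["B", "P", "Q"],
    ["B", "P", "Q"],
]
related = correlated_sets(transactions)
print(behavior_similarity("A", "B", related))  # 1.0
print(behavior_similarity("A", "P", related))  # 0.5
```

In this toy example "A" and "B" never co-occur, so a co-occurrence measure would rate them as unrelated, while the behavior-based score is high because their correlated third-party sets coincide.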

Statistical & Data Mining Projects [back to top]

Projects @ Statistical Consulting Center, Wright State University:

1. Help the institutional research department of WSU identify important factors that affect students' retention and graduation.

2. Help the internal audit department perform statistical analyses of school credit card usage.

 

Projects @ Qbase

1. Help non-profit organizations identify potential donors, new donors, and influential donors.

2. Help the Kettering Hospital network use text mining techniques to analyze questionnaires containing patients' comments on the hospitals' services.

 

Projects @ Data Mining Research Lab, Wright State University

 

Title:   OLAP-style Entity Correlation Analysis on Events Data, Lexis-Nexis, 2006-2007. [Video Demo]
In this project I designed and developed tools to perform OLAP-style entity correlation analysis on events data contained in news reports. The aim of the tools is to extract interesting correlations among entities.

The source data is metadata extracted from news reports. The metadata contains a number of attributes such as "company", "organization", "ticker", "person", "city", "country", etc. Each event has several of these attributes and may have several values for each attribute. Within an event, each pair of attribute values, taken from two (possibly identical) attributes, is considered a correlation instance. A user can provide any specific set of events as input to this program.

The frequent correlations are computed from the given set of events and displayed through a user-friendly interface. Users can navigate the display to drill down and roll up correlations.
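As an illustration of how correlation instances could be enumerated and counted, here is a minimal Python sketch. The event layout (a dict from attribute name to a list of values), the `min_count` threshold, and the sample events are assumptions made for this example; they do not come from the project code.

```python
from collections import Counter
from itertools import combinations

def correlation_instances(event):
    """Yield every unordered pair of (attribute, value) items in one event.
    The two items may share the same attribute (e.g. two companies)."""
    items = sorted({(attr, v) for attr, values in event.items() for v in values})
    return combinations(items, 2)

def frequent_correlations(events, min_count=2):
    """Count correlation instances over all events and keep the frequent ones."""
    counts = Counter()
    for event in events:
        counts.update(correlation_instances(event))
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Made-up events for illustration only.
events = [
    {"company": ["Acme", "Globex"], "city": ["Dayton"]},
    {"company": ["Acme"], "city": ["Dayton"], "person": ["P1"]},
    {"company": ["Globex"], "city": ["Columbus"]},
]
frequent = frequent_correlations(events)
print(frequent)
# {(('city', 'Dayton'), ('company', 'Acme')): 2}
```

Sorting the items before pairing gives each pair a single canonical order, so the same correlation is never counted under two different keys.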

At each level of the display, the user interface first provides a list of attributes to give users a schema description of the data. When a user clicks any of the attributes, she/he sees the top-K most frequent entities for the clicked attribute. The default value for K is 100. When the user clicks any of the displayed entities, the list of attributes appears again, allowing the user to drill down further. This process can repeat many times, allowing the user to drill down into the correlations through a number of levels. At any time, the path from the root to the current attribute value is highlighted so that the user can see the history/context of the correlations associated with the current path.
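The drill-down step described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the tool's actual code: the function name `top_k_entities`, the representation of the current path as a list of (attribute, value) pairs, and the sample events are all assumptions.

```python
from collections import Counter

def top_k_entities(events, path, attribute, k=100):
    """Return the k most frequent values of `attribute` among the events
    that contain every (attribute, value) pair on the current path."""
    matching = [
        e for e in events
        if all(value in e.get(attr, ()) for attr, value in path)
    ]
    counts = Counter()
    for e in matching:
        for value in set(e.get(attribute, ())):
            if (attribute, value) not in path:  # skip entities already chosen
                counts[value] += 1
    return counts.most_common(k)

# Made-up events for illustration only.
events = [
    {"company": ["Acme", "Globex"], "city": ["Dayton"]},
    {"company": ["Acme"], "city": ["Dayton"], "person": ["P1"]},
    {"company": ["Globex"], "city": ["Columbus"]},
]

# Level 1: the user clicks "city" at the root, then picks "Dayton".
level1 = top_k_entities(events, [("city", "Dayton")], "company")
print(level1)  # [('Acme', 2), ('Globex', 1)]

# Level 2: drill down again from "Acme" into "person".
level2 = top_k_entities(events, [("city", "Dayton"), ("company", "Acme")], "person")
print(level2)  # [('P1', 1)]
```

Rolling up corresponds to dropping the last pair from `path` and calling the function again, and the path itself is what the interface would highlight as the history of the current view.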

Screenshots:
(1) Root-level display for correlation analysis
(2) Expanded display for each attribute
(3) A detailed level of the display
(4) A more detailed low-level display

 

Programming Projects [back to top]

Projects @ TechEdge, Wright Brothers Institute:

 

1. Open Layer Sensing Test bed project:  

Displays Ohio traffic web cameras on a world-map-based user interface. By clicking any web camera icon on the map, users can watch the real-time traffic video.

 

Screenshots:
(1) Traffic web cameras in Dayton
(2) Traffic web cameras in Cincinnati, Dayton and Columbus
(3) Connecting to the traffic web camera located at I-75/3rd Street
(4) Starting to play the real-time traffic video

2. PocketLST project:

Worked as the team leader of the Android phone development group and developed Android phone applications that can send and receive text and image messages to and from a Google map in real time. [Technical Report] [Slides] [Video Demo]

 

The screenshots below show my part of the Android implementation for the PocketLST project:

 

 

 

Courses Taken @ WSU [back to top]

Computer Science Courses:  

--CS516: Survey of Computer Science Numerical Methods

--CS605: Introduction to Data Management System

--CS634: Concurrent Software Design
--CS666: Introduction to Formal Language

--CEG66: Matrix Computation

--CS680: Comparative Languages
--CS701: Database System & Design
--CS702: Advanced Computer Networks

--CEG720: Computer Architecture

--CS740: Natural Language Processing Techniques

--CEG770: Computer Engineering Mathematics
--CS790: Advanced Data Mining

Applied Statistical Courses:  

--STT611: Applied Time Series
--STT646: Statistical Methods for Engineers

--STT661: Statistical Theory I

--STT662: Statistical Theory II
--STT666: Statistical Methods I

--STT667: Statistical Methods II

--STT669: Introduction to Experimental Design

--STT740: Categorical Data Analysis

--STT761: Theory of Linear Model

--STT767: Applied Regression Analysis

 

An unofficial WSU transcript can be found [here].

 

TA Work [back to top]

TA courses and labs:

CS240 (lab): Programming Language I (Java)

CS241 (lab): Programming Language II (Advanced Java)

CS242 (lab): Programming Language III (C++)

STT264 (lab): Elementary Statistics

STT265 (lab): Elementary Statistics II

MTH126: Intermediate Algebra

© 2008 Ting Sa. All rights reserved. http://www.wright.edu/~sa.2