|
News Apr. 10, 2010 --- The OMOP Cup Competition has finally ended. Our Hawkeye Dorp team, consisting of Lian Duan, Mohammad Khoshneshin, Nick Street, and Si-Chi Chin, took third place in both Challenge 1 and Challenge 2 and won a $5000 prize. We are pleased with the final result, since this was the first competition of this kind for most members of our team. Winning competitions differs from doing research in many ways. Still, it was a rewarding two-month experience for us, and I learned a lot from it. |
|
Correlation Analysis

Finding the most interesting correlations among items is essential for problems in many commercial, medical, and scientific domains. Much previous research focuses on finding correlated pairs rather than correlated itemsets. When we design gift sets, arrange store shelves, or define product categories on a website, we are more interested in correlated itemsets than in correlated pairs. Although some existing methods find high-correlation itemsets, they suffer from both efficiency and effectiveness problems in large databases. Our research in this area is motivated by those problems.

Publication:
Good Papers: |
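To make the difference between correlated pairs and correlated itemsets concrete, here is a minimal sketch that scores every itemset of a given size by lift, that is, the observed support divided by the support expected if the items were independent. Lift is only one common correlation measure and is chosen here for illustration; the function names and the toy transactions are hypothetical and are not taken from our publications. The brute-force enumeration also shows why efficiency becomes a problem in large databases: the number of candidate itemsets grows combinatorially with the number of items.

```python
from itertools import combinations
from math import prod


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def lift(transactions, itemset):
    """Observed support divided by the support expected under independence."""
    expected = prod(support(transactions, [item]) for item in itemset)
    if expected == 0:
        return 0.0
    return support(transactions, itemset) / expected


def top_itemsets(transactions, size, k=3):
    """Brute-force search: score all itemsets of the given size, keep the top k."""
    items = sorted(set().union(*transactions))
    scored = [(lift(transactions, c), c) for c in combinations(items, size)]
    return sorted(scored, reverse=True)[:k]


# Toy market-basket data (hypothetical).
transactions = [frozenset(t) for t in [
    {"tea", "cup", "honey"},
    {"tea", "cup", "honey"},
    {"tea", "cup"},
    {"bread", "milk"},
    {"bread", "milk", "tea"},
    {"milk"},
]]

for score, itemset in top_itemsets(transactions, size=3):
    print(round(score, 2), itemset)
```

Calling `top_itemsets(transactions, size=2)` on the same data gives the pairwise view, which makes it easy to compare the two on a small example.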
|
Recommender System

Recommender systems make use of massive choice data to help individuals more effectively identify content of interest from a potentially overwhelming set of choices. However, there is no clear recipe for providing interesting recommendations to users. Still, at least four key elements should be taken into account: relevance, familiarity, novelty, and diversity. The difficult job of a recommender system is to find the proper level of each of these four elements for each user.

Publication:
Useful resources: Good Papers: |
|
Density-based Clustering and Anomaly Detection

Cluster analysis is a primary method for database mining. Density-based approaches apply a local cluster criterion. Clusters are regarded as regions in the data space in which the objects are dense, and which are separated by regions of low object density (noise). These regions may have an arbitrary shape, and the points inside a region may be arbitrarily distributed. In some KDD applications, finding the outliers, i.e. the rare events, is more interesting and useful than finding the common cases, for example when detecting criminal activity in e-commerce.

Publication:
Useful resources: Good Papers: |
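The density-based description above can be illustrated with a minimal sketch of DBSCAN, one well-known density-based clustering algorithm. It is used here only as an example and is not necessarily the method from our publications. The parameter names `eps` (neighborhood radius) and `min_pts` (minimum number of neighbors for a dense point) and the toy points are assumptions made for this illustration. Points that end up in no dense region keep the noise label, which is how the same procedure doubles as a simple outlier detector.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the region
    return labels


# Two dense groups and one isolated point (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

This version compares every pair of points, so it runs in quadratic time; practical implementations typically use a spatial index for the neighborhood query.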