Research outputs 2012

An efficient web document clustering algorithm for building dynamic similarity profile in similarity-aware web caching

Jitian Xiao

Document Type

Conference Proceeding

Keywords

Similarity profile, Web caching, Web document clustering, Building dynamics, Offline, Prefetching, Similarity threshold, Web Cache, Web content management, Web document, Web usage mining, Clustering algorithms, Cybernetics, Data mining, Information retrieval, Learning systems, World Wide Web

Publisher

IEEE

Faculty

Faculty of Computing, Health and Science

School

School of Computer and Security Science

RAS ID

14852

Comments

Xiao, J. (2012). An efficient web document clustering algorithm for building dynamic similarity profile in similarity-aware web caching. Proceedings of International Conference on Machine Learning and Cybernetics (pp. 1268-1273). Xian, Shaanxi, China. IEEE.

Abstract

Discovering and establishing similarities among web documents has been one of the key research streams in the web usage mining community in recent years. The knowledge obtained from this exercise can be used in many applications, such as optimizing web cache organization and improving the quality of web document pre-fetching. This paper presents an efficient matrix-based method to cluster web documents based on a predetermined similarity threshold. Our preliminary experiments have demonstrated that the new algorithm outperforms existing algorithms. The clustered web documents are then applied to a similarity-aware web content management system, facilitating offline building of the similarity-aware web caches and online updating of the system's similarity profiles.
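
The abstract does not spell out the algorithm, so the following is only a generic sketch of what clustering documents from a similarity matrix with a fixed threshold can look like. It is not the paper's method. The Jaccard measure over term sets, the connected-components grouping, and all function names (`jaccard`, `similarity_matrix`, `cluster_by_threshold`) are assumptions chosen for illustration.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two term sets (an assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_matrix(docs: List[Set[str]]) -> List[List[float]]:
    """Build a symmetric pairwise similarity matrix for the documents."""
    n = len(docs)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = jaccard(docs[i], docs[j])
    return sim


def cluster_by_threshold(sim: List[List[float]], threshold: float) -> List[List[int]]:
    """Group documents that are linked, directly or transitively, by a
    similarity of at least `threshold` (connected components, union-find).
    This grouping rule is an illustrative assumption, not the paper's."""
    n = len(sim)
    parent = list(range(n))

    def find(x: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical documents, each reduced to a set of terms.
    docs = [
        {"web", "cache", "proxy"},
        {"web", "cache", "server"},
        {"cat", "dog"},
        {"dog", "cat", "bird"},
    ]
    sim = similarity_matrix(docs)
    print(cluster_by_threshold(sim, 0.5))
```

Raising the threshold splits clusters apart and lowering it merges them, which is why a predetermined threshold fixes the granularity of the resulting similarity profile in a sketch like this.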

Access Rights

subscription content

Link to publisher version (DOI)

10.1109/ICMLC.2012.6359547
