Methods for Web 2.0 Data
Developing large-scale technologies, e.g., mobile communication systems, requires studying people's acceptance and risk perception. Interdisciplinary collaboration is essential for such research topics. In cooperation with text linguists, we investigate grammatically non-standard Web comments and their suitability as a source for automatic opinion detection in acceptance research. Our research comprises three steps:
- the acquisition of Web comments by means of Web search and focused crawling algorithms,
- the detection and classification of Web comments and their metadata in Web pages, and
- the adaptation of natural language processing methods, e.g., POS tagging, to Web comment language.
The figure above shows the interaction between the different methods.
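As a rough illustration of how the three steps could fit together, the following minimal sketch runs a best-first (focused) crawl over a tiny in-memory link graph, picks out comment texts, and normalizes them before any tagging. Everything in it is an assumption made for illustration and not the method from the publications listed below: the `PAGES` graph, the `TOPIC_TERMS` list, the `Comment:` marker, and the character-repetition rule are all hypothetical stand-ins.

```python
import heapq
import re

# Hypothetical in-memory "Web": URL -> (page text, outgoing links).
PAGES = {
    "a": ("News about mobile network antennas", ["b", "c"]),
    "b": ("Comment: the new antenna mast is sooo ugly!!!", ["d"]),
    "c": ("Recipe for apple cake", ["e"]),
    "d": ("Comment: i dont trust mobile radiation limits", []),
    "e": ("More cake recipes", []),
}

# Hypothetical topic vocabulary used to score page relevance.
TOPIC_TERMS = {"mobile", "antenna", "antennas", "network", "radiation"}


def relevance(text):
    """Fraction of tokens in the text that are topic terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in TOPIC_TERMS for t in tokens) / len(tokens)


def focused_crawl(seed, max_pages, min_relevance=0.1):
    """Step 1: best-first crawl; links found on relevant pages are visited first."""
    frontier = [(-1.0, seed)]  # min-heap on negated priority
    seen = {seed}
    visited = []
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = PAGES[url]
        visited.append(url)
        score = relevance(text)
        if score < min_relevance:
            continue  # do not follow links from off-topic pages
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return visited


def extract_comment(text):
    """Step 2 (toy stand-in): treat text after a 'Comment:' marker as a comment."""
    marker = "Comment:"
    return text[len(marker):].strip() if text.startswith(marker) else None


def normalize(text):
    """Step 3 (toy stand-in): collapse runs of 3+ identical characters to one."""
    return re.sub(r"(.)\1{2,}", r"\1", text)


def build_corpus(seed, max_pages=10):
    """Chain the three steps into a small comment corpus."""
    corpus = []
    for url in focused_crawl(seed, max_pages):
        comment = extract_comment(PAGES[url][0])
        if comment is not None:
            corpus.append(normalize(comment))
    return corpus


print(build_corpus("a"))
```

In this sketch the off-topic page `c` is still fetched, but its link to `e` is never followed, which is the basic idea behind focusing a crawl. A real system would replace each toy step with a trained classifier or tagger.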
Related research topics and publications
Web Search, Focused Crawling, Ontology-based Search
- M. Neunerdt, E. Reimer, M. Reyer, R. Mathar, Enhanced Web Page Cleaning for Constructing Social Media Text Corpora, Proceedings: 6th International Conference on Information Science and Applications (ICISA), Pattaya, Thailand, February 2015.
- M. Neunerdt, B. Trevisan, M. Niermann, R. Mathar, Focused Crawling for Building Web Comment Corpora, Proceedings: The 10th IEEE Consumer Communications & Networking Conference CCNC 2013, Las Vegas, Nevada, USA, January 2013.
- M. Neunerdt, B. Trevisan, T. C. Teixeira, R. Mathar, E. Jakobs, Ontology-based Corpus Generation for Web Comment Analysis, Proceedings: ACM conference on Hypertext and hypermedia (HT 2011), Eindhoven, May 2011.
Webpage Segmentation, Web Comment Detection/Extraction
- M. Neunerdt, M. Reyer, R. Mathar, Automatic Genre Classification in Web Pages Applied to Web Comments, Proceedings: 12th Conference on Natural Language Processing (KONVENS), Hildesheim, Germany, October 2014.
- M. Neunerdt, B. Trevisan, M. Niermann, R. Mathar, Focused Crawling for Building Web Comment Corpora, Proceedings: The 10th IEEE Consumer Communications & Networking Conference CCNC 2013, Las Vegas, Nevada, USA, January 2013.
Natural Language Processing for Noisy Web Comments, Tokenization, POS Tagging, Multi-Level Annotation
- M. Neunerdt, M. Reyer, R. Mathar, Efficient Training Data Enrichment and Unknown Token Handling for POS Tagging of Non-standardized Texts, Proceedings: 12th Conference on Natural Language Processing (KONVENS), Hildesheim, Germany, October 2014.
- B. Trevisan, M. Neunerdt, T. Hemig, E. Jakobs, R. Mathar, Detecting Ironic Patterns in Multi-level Annotated German Web Comments, Proceedings: 12th Conference on Natural Language Processing (KONVENS), Hildesheim, Germany, October 2014.
- M. Neunerdt, B. Trevisan, M. Reyer, R. Mathar, Part-of-Speech Tagging for Social Media Texts, Proceedings: International Conference of the German Society for Computational Linguistics and Language Technology (GSCL), Darmstadt, Germany, September 2013.
- B. Trevisan, M. Neunerdt, E. Jakobs, A Multi-level Annotation Model for Fine-grained Opinion Detection in German Blog Comments, Proceedings: 11th Conference on Natural Language Processing (KONVENS), Vienna, Austria, September 2012.
- M. Neunerdt, B. Trevisan, R. Mathar, E. Jakobs, Detecting Irregularities in Blog Comment Language Affecting POS Tagging Accuracy, International Journal of Computational Linguistics and Applications, vol. 3, no. 1, pp. 71-88, June 2012.
Acceptance Integration into Network Planning Models
- A. Engels, M. Neunerdt, R. Mathar, H. M. Abdullah, Acceptance as a Success Factor for Planning Wireless Network Infrastructure, Proceedings: International Symposium on Wireless Communication Systems 2011 (ISWCS'11), Aachen, Germany, November 2011.
Related projects
- HUMIC - Evaluation of acceptance as an integral element of the development and implementation of complex technical systems, using the example of mobile communication networks.
- WebDisk (Recent) - Web 2.0 large-scale technology discourses: computer-aided acquisition, risk evaluation and focused group-specific communication.
Related student work
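- Student research project: Development of detection and evaluation algorithms for Web comments for focused Web crawling (original title: Entwicklung von Detektions- und Evaluationsalgorithmen für Webkommentare zum fokussierten Webcrawling)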
- Students assisted in developing the evaluation software