Adding Twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets

Bibliographic details
Journal title:	Journal of the Association for Information Science and Technology
Authors and corporations:	Arakawa, Yui, Kameda, Akihiro, Aizawa, Akiko, Suzuki, Takafumi
In:	Journal of the Association for Information Science and Technology, 65, 2014, 7, pp. 1416-1423
Format:	E-Article
Language:	English
Published:	Wiley
Subjects:	Library and Information Sciences, Information Systems and Management, Computer Networks and Communications, Information Systems

author_facet	Arakawa, Yui, Kameda, Akihiro, Aizawa, Akiko, Suzuki, Takafumi, Arakawa, Yui, Kameda, Akihiro, Aizawa, Akiko, Suzuki, Takafumi
author	Arakawa, Yui, Kameda, Akihiro, Aizawa, Akiko, Suzuki, Takafumi
spellingShingle	Arakawa, Yui, Kameda, Akihiro, Aizawa, Akiko, Suzuki, Takafumi, Journal of the Association for Information Science and Technology, Adding Twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets, Library and Information Sciences, Information Systems and Management, Computer Networks and Communications, Information Systems
author_sort	arakawa, yui
spelling	Arakawa, Yui Kameda, Akihiro Aizawa, Akiko Suzuki, Takafumi 2330-1635 2330-1643 Wiley Library and Information Sciences Information Systems and Management Computer Networks and Communications Information Systems http://dx.doi.org/10.1002/asi.23126 Recently, Twitter has received much attention, both from the general public and researchers, as a new method of transmitting information. Among others, the number of retweets (RTs) and user types are the two important items of analysis for understanding the transmission of information on Twitter. To analyze this point, we applied text classification and feature extraction experiments using random forests machine learning with conventional stylistic and Twitter‐specific features. We first collected tweets from 40 accounts with a high number of followers and created tweet texts from 28,756 tweets. We then conducted 15 types of classification experiments using a variety of combinations of features such as function words, speech terms, Twitter's descriptive grammar, and information roles. We deliberately observed the effects of features for classification performance. The results indicated that class classification per user indicated the best performance. Furthermore, we observed that certain features had a greater impact on classification. In the case of the experiments that assessed the level of RT quantity, information roles had an impact. In the case of user experiments, important features, such as the honorific postpositional particle and auxiliary verbs, such as “desu” and “masu,” had an impact. This research clarifies the features that are useful for categorizing tweets according to the number of RTs and user types. Adding Twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets Journal of the Association for Information Science and Technology
doi_str_mv	10.1002/asi.23126
facet_avail	Online
finc_class_facet	General, Computer Science
format	ElectronicArticle
fullrecord	blob:ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTAwMi9hc2kuMjMxMjY
id	ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTAwMi9hc2kuMjMxMjY
institution	DE-D161, DE-Gla1, DE-Zi4, DE-15, DE-Pl11, DE-Rs1, DE-105, DE-14, FID-BBI-DE-23, DE-Ch1, DE-L229, DE-D275, DE-Bn3, DE-Brt1
imprint	Wiley, 2014
imprint_str_mv	Wiley, 2014
issn	2330-1643, 2330-1635
issn_str_mv	2330-1643, 2330-1635
language	English
mega_collection	Wiley (CrossRef)
match_str	arakawa2014addingtwitterspecificfeaturestostylisticfeaturesforclassifyingtweetsbyusertypeandnumberofretweets
publishDateSort	2014
publisher	Wiley
recordtype	ai
record_format	ai
series	Journal of the Association for Information Science and Technology
source_id	49
title	Adding Twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets
title_unstemmed	Adding Twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets
title_full	Adding Twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets
title_fullStr	Adding Twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets
title_full_unstemmed	Adding Twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets
title_short	Adding Twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets
title_sort	adding twitter‐specific features to stylistic features for classifying tweets by user type and number of retweets
topic	Library and Information Sciences, Information Systems and Management, Computer Networks and Communications, Information Systems
url	http://dx.doi.org/10.1002/asi.23126
publishDate	2014
physical	1416-1423
description	Recently, Twitter has received much attention, both from the general public and researchers, as a new method of transmitting information. Among others, the number of retweets (RTs) and user types are the two important items of analysis for understanding the transmission of information on Twitter. To analyze this point, we applied text classification and feature extraction experiments using random forests machine learning with conventional stylistic and Twitter‐specific features. We first collected tweets from 40 accounts with a high number of followers and created tweet texts from 28,756 tweets. We then conducted 15 types of classification experiments using a variety of combinations of features such as function words, speech terms, Twitter's descriptive grammar, and information roles. We deliberately observed the effects of features for classification performance. The results indicated that class classification per user indicated the best performance. Furthermore, we observed that certain features had a greater impact on classification. In the case of the experiments that assessed the level of RT quantity, information roles had an impact. In the case of user experiments, important features, such as the honorific postpositional particle and auxiliary verbs, such as “desu” and “masu,” had an impact. This research clarifies the features that are useful for categorizing tweets according to the number of RTs and user types.
container_issue	7
container_start_page	1416
container_title	Journal of the Association for Information Science and Technology
container_volume	65
format_de105	Article, E-Article
format_de14	Article, E-Article
format_de15	Article, E-Article
format_de520	Article, E-Article
format_de540	Article, E-Article
format_dech1	Article, E-Article
format_ded117	Article, E-Article
format_degla1	E-Article
format_del152	Book
format_del189	Article, E-Article
format_dezi4	Article
format_dezwi2	Article, E-Article
format_finc	Article, E-Article
format_nrw	Article, E-Article
_version_	1792335222422896640
geogr_code	not assigned
last_indexed	2024-03-01T14:41:02.982Z
geogr_code_person	not assigned
openURL	url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=Adding+Twitter%E2%80%90specific+features+to+stylistic+features+for+classifying+tweets+by+user+type+and+number+of+retweets&rft.date=2014-07-01&genre=article&issn=2330-1643&volume=65&issue=7&spage=1416&epage=1423&pages=1416-1423&jtitle=Journal+of+the+Association+for+Information+Science+and+Technology&atitle=Adding+%3Cscp%3ET%3C%2Fscp%3Ewitter%E2%80%90specific+features+to+stylistic+features+for+classifying+tweets+by+user+type+and+number+of+retweets&aulast=Suzuki&aufirst=Takafumi&rft_id=info%3Adoi%2F10.1002%2Fasi.23126&rft.language%5B0%5D=eng