Whistlerlib: a distributed computing library...
URL: https://doi.org/10.1007/s11042-024-19827-z
At least 350k posts are published on X, 510k comments are posted on Facebook, and 66k pictures and videos are shared on Instagram each minute. Datasets of this size require substantial processing power, even if only a fraction is collected for analysis and research. To face this challenge, data scientists can now use computer clusters deployed on various IaaS and PaaS cloud services. However, they still have to master the design of distributed algorithms and be familiar with distributed computing frameworks. It is thus essential to develop tools whose analysis methods leverage computer clusters for processing large amounts of social network text. This paper presents Whistlerlib, a new Python library for conducting exploratory analysis of large text datasets from social networks. Whistlerlib implements distributed versions of various social media, sentiment, and social network analysis methods that can run atop computer clusters. We experimentally demonstrate the scalability of the various Whistlerlib distributed methods when deployed on a public cloud platform. We also present a practical example, the analysis of posts on the social network X about the Mexico City subway, to showcase the features of Whistlerlib in scenarios where social network analysis tools are needed to address issues with a social dimension.
Additional information
Field | Value |
---|---|
Data last updated | October 10, 2025 |
Metadata last updated | October 10, 2025 |
Created | October 10, 2025 |
Format | HTML |
License | No license provided |
Id | 298aa89e-8339-4dd5-95e7-5122f47ae39c |
Package id | b3a9f0bd-728b-4ee5-8a79-24c1069b2785 |
State | active |