
Social • 3 May 2016

Analysing social media language: a legal and ethical conundrum

By Daniel Faggella

We live in an age where people’s daily social media posts and internet search queries are not wholly their own. Though most of us search without much thought for the invisible eyes of others, the value of such data has not gone untapped by advertisers or unnoticed by researchers, who have used it for projects from studying influenza outbreaks to predicting stock market behavior.

With all of this internal dialogue set free in the public sphere, some have also argued that more subjective correlations between social media language, human behavior, and our general well-being have gone understudied. Dr Lyle Ungar is part of a research team from the University of Pennsylvania (UPenn) that was inspired to draw on the wealth of social media expression and study open correlations between our language and behavior.

Using machine learning to analyze data and find patterns, the team sought to uncover potential links between what we ‘tweet and post’ each day and our daily behaviors, and even our physical and mental health. Their initial study, “Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach”, was published in 2013 and immediately caught the attention of the medical field. The team’s research, part of the University of Pennsylvania’s World Well-Being Project, is ongoing.

Yet by its very nature, the study approach raises questions about the legal and ethical ramifications of accessing and analyzing social media language. Is it legal to analyze social media posts without people’s knowledge? And if they are aware, how do they know the information is being kept ‘private’, or at least out of the hands of companies, governments, or other institutions that might use it for their own narrow objectives?

Initially, Ungar and his team used a combination of unobtrusive measures (those that include any data about human behavior that can be collected without the subjects’ knowledge) and obtrusive methods (surveys and interviews are examples) to collect data. The team analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests. The researchers were even able to get 2,000 people to grant access to their health records, allowing the team to perform a micro-study on the potential relationships between language use on Facebook profiles and current or past medical conditions.
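To make the idea of linking language to personality scores concrete, here is a minimal sketch of the kind of calculation involved: measure how often each volunteer uses a word, then correlate that frequency with a test score. It is not the UPenn team's actual method, which this article does not describe in detail. The function names, the toy messages, and the scores below are all invented for illustration.

```python
from collections import Counter
from math import sqrt


def relative_freq(text, word):
    """Share of a person's words that are `word` (naive whitespace tokens)."""
    tokens = text.lower().split()
    return Counter(tokens)[word] / len(tokens) if tokens else 0.0


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series does not vary."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y) if spread_x and spread_y else 0.0


def word_trait_correlation(messages, scores, word):
    """Correlate one word's usage rate with one trait score across people."""
    freqs = [relative_freq(m, word) for m in messages]
    return pearson(freqs, scores)


# Invented toy data: one message per hypothetical volunteer, plus a
# made-up "extraversion" score for each.
toy_messages = [
    "party party tonight",
    "quiet night reading",
    "party with friends",
    "reading alone",
]
toy_scores = [0.9, 0.2, 0.7, 0.1]

r = word_trait_correlation(toy_messages, toy_scores, "party")
```

A positive `r` here would only say that, in this toy set, people who use the word more also score higher. It says nothing about cause, which is part of why the ethical questions raised in this article matter.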

On the surface, the fact that people volunteered their personal information seems acceptable. Beyond Facebook, which has more stringent requirements for data access, the team collected thousands of tweets from Twitter, an easier-to-access platform (tweets are open to everyone and the data is easy to manipulate) and one that allowed the researchers to freely map where tweets came from and the geographical location of tweeters.

All this collecting was done in the name of research and furthering social scientists’ understanding of the behavior and well-being of individuals and the greater populace; however, the sheer availability of information and the relative ease of access to it mean that, with the right resources, any person or institution (we know most governments have the ability to keep a close watch) could do the same.

Maybe platforms like Twitter are fair game; people are aware that the information they post is available to everyone in the Tweetosphere, and we might assume tweeters should be cognizant of and responsible for anything that they put out on publicly visible platforms. But how many people actively consider the far-reaching impacts of comments made on a whim or in the heat of the moment, especially on a platform that has seemingly become another casual place to have a conversation?

Surprisingly, there hasn’t been all that much published about the legal and ethical ramifications of using social media data for research purposes; however, the potential considerations are wide and deep. A U.K.-based publication covered this topic briefly in a 2014 paper titled “Use of Social Media for Research and Analysis”. Both the organisations that “own” the data (like Facebook and Twitter) and researchers are largely ‘building the plane while flying it’ when it comes to handling grey areas.

For example, many social media organisations provide specific technical interfaces for accessing data, called “Application Programming Interfaces” (APIs), which allow them to monitor access and set ‘quotas’ (researchers using Twitter can access only about one percent of the material published on Twitter on any given day).
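As a rough illustration of how a platform might enforce such limits, the sketch below shows two simple mechanisms: a daily request quota and a fixed-percentage sample of posts. This is not a description of how Twitter's API actually works; the class and function names and the hashing scheme are assumptions made for the example.

```python
import hashlib


class DailyQuota:
    """Hypothetical per-client counter that refuses requests past a limit."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def try_consume(self, n=1):
        """Record n requests if they fit in today's quota; else refuse."""
        if self.used + n > self.limit:
            return False
        self.used += n
        return True

    def reset(self):
        """Would be called once a day by the platform."""
        self.used = 0


def in_sample(post_id, percent=1.0):
    """Decide whether a post falls inside a fixed-percentage sample.

    Hashing the ID gives each post a stable, roughly uniform bucket from
    0 to 9999, so the same post is always in or out of the sample.
    """
    digest = hashlib.sha256(post_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 10_000
    return bucket < percent * 100


# Usage: a researcher with a small quota sees only sampled posts.
quota = DailyQuota(limit=3)
visible = [
    post_id
    for post_id in (f"post-{i}" for i in range(1000))
    if in_sample(post_id, percent=1.0) and quota.try_consume()
]
```

The point of the sketch is that both limits are set by the platform, not the researcher, so what a study can see is shaped by decisions made outside the study.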

Smaller social sites, like Pinterest and Instagram, are not as likely to provide API access. In this case, those who wish to extract data from web pages often use a method called “scraping”, in which the user ‘instructs’ a computer to extract and download information. Scraping seems to fall into the grey area category; while the content is free to access, there are usually copyright and intellectual property laws in place to protect this data (but whether anyone reads or abides by these codes is up for debate).
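For readers unfamiliar with scraping, the sketch below shows the basic idea using only Python's standard library: a small parser walks through a page's HTML and keeps the text of the elements it was told to look for. The HTML snippet and the "post" class name are invented for the example; a real page would have its own structure, and its terms of use and the applicable law would still apply.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class PostExtractor(HTMLParser):
    """Collects the text inside every <p> whose class list includes 'post'."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._depth = 0   # greater than 0 while inside a matching <p>
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif tag == "p" and "post" in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.posts.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_posts(html):
    """Return the text of each post-like paragraph found in the HTML."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts


# Invented sample page: two posts and one advert.
sample_page = (
    '<div><p class="post">Feeling <b>great</b> today</p>'
    '<p class="ad">Buy now</p>'
    '<p class="post featured">Stuck in traffic</p></div>'
)
posts = extract_posts(sample_page)
```

Nothing in this process asks the author of a post for permission, which is what places scraping in the grey area described above.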

For many researchers, the idea of informed consent for social media research, traditionally an ironclad component of any valid research study, seems in many cases unreasonable, considering the number of people that could potentially be involved in large studies spanning large regions. Though the 75,000 voluntary participants in the UPenn study seem like a big set, Dr Ungar noted that the amount of usable extracted information, after human and machine learning processing, turned out to be a relatively small data set.

At present, it seems many contemporary researchers ground their approach to social media research ethics in efforts to ensure that people whose data are used do not suffer any directly related negative effects, similar to the steps taken when mitigating legal concerns involving data protection. But there don’t seem to be any black-and-white answers regarding the more abstract legal and ethical standing of social media data in research. There are undoubtedly great potential benefits, and likewise harms, to conducting such studies, but the issues are a reminder that maintaining a voice on the web is not done in a social bubble or vacuum. Whether our posts and tweets will be used predominantly for or against our own well-being is yet to be checked and balanced.
