8:30 a.m., Monday. For the third time in a week, the train is held up in the middle of the tunnel. Try as you might to be on time, there seems always to be a way to undo the head start you have made. Trapped like the rest of the hundreds of commuters on board, heavy-eyed, you think there is nothing you can do except, maybe, tweet furiously about it:
‘Well don, 3rd delay this week. You ppl at the railways had ONE job.’
Besides being a way to let off some steam, this tweet may seem inconsequential. But for Luo Shuli, ramblings on social media like this can make a difference and lead to better transport—given the right tool to process and understand them.
A PhD candidate at the Department of Geography and Resource Management, Shuli has been using social media data to better understand the perception of a city’s public transport system. For her dissertation research, which won her a Best Paper award at the Second International Conference on Urban Informatics, she focused on Shenzhen and collected tens of thousands of public Weibo posts about the city’s metro system. What she was interested in were the sentiments they convey and such metadata as when and where they were published, which might point to areas of the system that needed improvement.
‘A recent trend in urban planning is to complement conventional surveys with big data,’ said Prof. Sylvia He, Shuli’s advisor and co-author of the prize-winning paper. While allowing for more comprehensive social demographics to be collected and for customization, surveys are costly and can only be conducted infrequently. Meanwhile, a city-wide Internet of Things (IoT)—made possible by the explosion of smart phones, GPS-enabled vehicles and, of course, social media users in recent years—provides a steady, voluminous stream of data capable of revealing a great deal about the population’s travel behaviour.
‘A hallmark of smart cities is the widespread use of energy-efficient, zero-emissions electric vehicles, and to promote e-mobility, we need more charging stations. But where should we build them? This is where big data can fill us in,’ Professor He explained, giving another example of how big data is used in urban planning. Whereas normally researchers would have to depend on surveys, Professor He’s team is now exploring a method of using raw data from existing stations to find out where, when and for how long electric vehicles are more likely to be serviced, thereby ensuring a more reasonable distribution of charging facilities across the city.
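To make the idea concrete, here is a minimal sketch of how raw station records might be summarized by place, time and duration. It is not the team's actual method: the session log, the station names and the `summarize` helper are all hypothetical, made up purely for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical session log: (station id, plug-in time, plug-out time).
# These records are invented for illustration, not taken from any study.
SESSIONS = [
    ("A", "2023-05-01 08:10", "2023-05-01 09:10"),
    ("A", "2023-05-01 08:40", "2023-05-01 10:10"),
    ("A", "2023-05-01 18:05", "2023-05-01 18:35"),
    ("B", "2023-05-01 18:20", "2023-05-01 20:20"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M"


def summarize(sessions):
    """Group sessions by (station, start hour); return count and mean minutes."""
    durations = defaultdict(list)
    for station, start, end in sessions:
        t0 = datetime.strptime(start, TIME_FORMAT)
        t1 = datetime.strptime(end, TIME_FORMAT)
        durations[(station, t0.hour)].append((t1 - t0).total_seconds() / 60)
    return {key: (len(mins), sum(mins) / len(mins)) for key, mins in durations.items()}


summary = summarize(SESSIONS)
for (station, hour), (count, mean_minutes) in sorted(summary.items()):
    print(f"station {station}, {hour:02d}:00: {count} session(s), "
          f"{mean_minutes:.0f} min on average")
```

Even a table this simple hints at where demand clusters and when. A real analysis would presumably also need to account for location, seasonality and stations that do not exist yet.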
As many insights as it may contain, though, there is no way we mere mortals can sift through such a massive amount of data. We need a program that, having been taught the rules, automates the task. Better yet, we can have an AI program, which can figure out the rules without needing us to write them out for it. Indeed, when it comes to social media posts, which are often rife with typos, shorthands and other irregularities like the tweet we have seen at the outset, AI is the clear winner in that it spares us from having to teach the program to recognize the numerous special cases there are. All it needs is a good amount of training, through which it can learn the rules and exceptions from samples we provide it with.
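The contrast between writing the rules and learning them can be sketched in a few lines. The example below is only an illustration under stated assumptions: the labelled posts are hypothetical, and the learner is a tiny naive Bayes classifier chosen for brevity, not the model used in Shuli's research.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-labelled posts, invented for illustration only.
SAMPLES = [
    ("train delayed again so annoyed", "negative"),
    ("3rd delay this week well don", "negative"),
    ("stuck in the tunnel again ugh", "negative"),
    ("smooth ride and on time today", "positive"),
    ("new line is fast and clean", "positive"),
    ("staff were helpful thx", "positive"),
]


def rule_based(post):
    """Hand-written rule: catches only the exact keywords we thought of."""
    negative_words = {"delayed", "late", "stuck"}
    return "negative" if negative_words & set(post.lower().split()) else "positive"


def train(samples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(model, post):
    """Naive Bayes with add-one smoothing: return the most probable label."""
    word_counts, label_counts = model
    vocab = {word for counts in word_counts.values() for word in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in post.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(SAMPLES)
post = "another delay this week"
print("rule-based:", rule_based(post))    # misses "delay": only "delayed" is listed
print("learned:   ", predict(model, post))
```

The hand-written rule fails on a spelling it was never told about, while the learned model picks up the pattern from its samples. The flip side is that it knows nothing beyond those samples.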
‘Efficiency and personalization are what usually motivate the use of AI in public administration,’ said Prof. Wilson Wong, director of the Data Science and Policy Studies programme. Aside from unlocking the wealth of data around us, an intelligent automated system can help respond to the many different needs of citizens around the clock. In Japan, for example, chatbots have been employed to give more individualized and accurate information on government services. From e-government to e-governance, public goods provision to policymaking—there is much potential for AI to do social good as many have called for lately.
But AI, too, has its limitations. For one part of her dissertation, Shuli ended up ditching the AI model and going with the classical statistical approach, having compared how the two performed in discerning the sentiments in the Weibo posts. It could be that the model needed more data for training, Professor He suspects, but there might be no way of knowing what went wrong. Indeed, many AI models are what computer scientists call black boxes, which is to say they have such an opaque decision-making mechanism that it is virtually impossible to diagnose the errors they make. At any rate, AI has not had much of an edge over traditional methods to begin with in terms of understanding emotions.
‘The model can categorize a sentiment as positive or negative pretty decently, but it doesn’t tell you how positive or negative it is,’ said Shuli of her experience performing sentiment analysis using AI. ‘And if you ask the model to be more specific and return anything more descriptive than a label that says “positive”, “neutral” or “negative”, you’ll probably get something wildly inaccurate.’
Things get even muddier when you are dealing with circumlocutions, like the sarcasm in our opening tweet. Solutions have been proposed to give AI models an awareness of contexts, as Professor He noted, but for now, machines are often still dependent on human calibration when they run into this kind of problem. Beyond the understanding of social media parlance, this lack of tacit knowledge is a major reason why AI is not playing a more decisive role in public administration.
‘There are many misconceptions of what AI can do,’ said Professor Wong, who has been part of an Association of Pacific Rim Universities (APRU) project exploring AI’s capacity for social betterment. ‘With less controversial stuff like renewing driver’s licenses and handing out consumption vouchers, which are really just matters of verifying the applicant’s eligibility, surely AI can be of help. But how much further can it go?’ Consider university admissions. On top of academic results, the board will look for certain personal qualities: being principled, willingness to communicate, honesty, and so on. They are not exactly subjective, but they are hard to define, understood only through socialization. If we are to replace human admission officers with machines, the challenge will be for them to understand these qualities in mathematical and logical terms. How are we to create an algorithm for that?
‘The same goes for court trials. It’s hard to imagine a formula by which a machine can determine if the defendant is remorseful, however fair it might be to have a robot to be the judge.’
And here AI hits another roadblock: it is rarely even impartial. We have seen that the rules by which an AI model makes judgments stem from the samples we chose to train it on. If the samples are biased, as they often are by their nature of representing only part of the whole truth and by the simple fact that they were selected by humans, so must the model be. With Shenzhen having a predominantly younger population, it is not too problematic to consult an AI model feeding solely on social media data, as Professor He pointed out. In cases like hiring government employees, though, it is probably a bad idea to rely on AI. Using the current workforce as the template, an AI recruiter would miss out on bright minds that do not fit ‘the norm’, miss the opportunity to shake things up and, more dangerously, inherit whatever discriminatory practices characterize the organization in its present state.
‘With all its unreasonable judgements, AI does, after all, serve to expose everything that’s wrong with its teachers, us. Rather than thinking about letting it run our lives, we should take this opportunity to reflect on the human prejudices that have made it the way it is,’ said Professor Wong.
As encouraging as it is, the fact that there have been calls for the use of AI for social good, including even a movement that got itself the catchy abbreviation ‘AI4SG’, is a reminder that things can move—and most certainly have—in the wrong direction. It can be a treasure trove that AI is unlocking, but it can also be a Pandora’s box. We have been talking about AI in tandem with big data, and we have seen how they enable each other, albeit imperfectly. This symbiosis comes at a price, one that we might have given up caring about: privacy.
‘The relationship between data and privacy is forever a contentious one. At one end of the spectrum, you have a society that withholds all its data for privacy’s sake and gives up all the benefits we’ve talked about; at the other end, you have a society that surrenders all its data to the point where even the faintest of facial expressions could, with the right technology, be monitored,’ said Professor Wong. With most people going for the middle ground, the idea of data governance has gained momentum over the past few years.
‘It’s all about creating a mechanism where data users, including the government, can be held accountable,’ Professor Wong explained. Ideally, it will be a legal framework regulating the whats, whens, whos and hows of data collection and use.
‘By making the use of our data transparent and keeping ourselves informed of what’s happening behind the scenes, we might find ourselves closer to a symmetry of information and, therefore, power.’
One particular issue with data use in public administration is its scope. While most people are willing to sacrifice some of their data for whatever benefits they are promised, there will always be those who firmly object to any erosion of their privacy. Though it is getting more and more difficult, they remain free not to use social media to keep the hands of Big Tech away from their information. That is, however, not an option in the face of an intrusive public policy, which by definition applies to everyone in the community, given the ubiquity of data-driven technologies.
‘It’s impossible to go completely off the grid, to be quite frank. What we can think about is how we can minimize the impact for these people,’ said Professor Wong. ‘Some people are really uncomfortable with the ideas of smart cities and IoT, at the thought of a smart refrigerator looking at your snack stash and intervening in your eating habits in the name of health. What we can do is allow opting out as far as possible. With new policies, we can run pilot schemes with those that are more enthusiastic and let the hesitant wait and see.’
At the end of the day, though, no institution is perfect. What is perhaps most needed is an understanding of AI and big data at an individual level, a data literacy.
‘As I often tell my students, data deleted is not deleted. There are many ways in which data can be recovered, so it’s best to think twice before creating it. This is the sort of alertness you get with data literacy,’ said Professor Wong. ‘To be data-literate, ultimately, is to have the knowledge to use data in a way that improves your life while not being enslaved by technology.’
With its blunders and flaws, AI may have a hard time making critical decisions for us; and given the invasion of privacy it enables, it needs more scrutiny, indeed, than it is getting. But no contribution it makes is too small, whether it be adding to traditional statistics when it comes to thinking about a city’s transport, or assisting policymakers in distributing public goods; and with a proper regulatory regime and a keen awareness of the power of data among citizens, it can, after all, do good.
‘With all that we’ve said about smart cities, I believe AI can also encourage civic engagement by making the data the individual citizen generates valuable,’ Professor He added, reminding us how in the world of AI even a throwaway tweet from a disgruntled commuter can contribute to the smooth functioning of a city.
‘I’d like to think it’s here to make lives better.’
Photos by gloriang@cuhkimages and ponyleung@cuhkimages